English Pages XV, 180 [191] Year 2020
Energy Use in Cities A Roadmap for Urban Transitions Stephanie Pincetl · Hannah Gustafson · Felicia Federico · Eric Daniel Fournier · Robert Cudd · Erik Porse
Energy Use in Cities

“The book provides a unique knowledge resource for researchers, practitioners and decision-makers alike. It is based on a detailed and novel methodology to interrogate the socio-spatial features of energy demand across regions and cities, opening the path for ambitious measures to transform our relationship with energy and infrastructure in response to the global climate challenge.”
—Stefan Bouzarovski, Professor, Department of Geography, University of Manchester

“The Energy Atlas provides insightful and detailed information on localized energy consumption for the City of Los Angeles. As LADWP builds the energy grid of the future, the Energy Atlas remains vital in the development and incorporation of new clean technologies into the grid. The Energy Atlas assists LADWP to better understand building energy consumption across the diverse neighborhoods of Los Angeles. Now, as LADWP begins to incorporate distributed energy resources, energy storage, and building electrification technologies into the grid, UCLA’s Energy Atlas has new importance in maintaining reliability and managing customer demand.”
—Steve Baule, Director of Sustainable Projects in the Office of Sustainability at Los Angeles Department of Water and Power

“The team at UCLA have created a wonderfully, comprehensive, knowledge base with the Energy Atlas—an interactive data centric, map of energy consumption that enables us as Strategic Sustainability Consultants to rapidly develop insight into local community level energy usage and GHG emissions, to support the development of climate action planning and energy reduction strategies that we are currently undertaking in Southern California.”
—David Herd, Managing Partner, BuroHappold

“If global society is to overcome climate change, then it needs significant transformation in urban energy systems to occur. With Californian cities leading the way, Dr. Pincetl and her team have constructed and analyzed the most amazing, spatially detailed data-sets. They have found solutions to formidable data access challenges and explored thorny issues of social justice under energy transitions. This book is essential reading for local governments, planners and academics looking to build sustainable cities.”
—Chris Kennedy, Chair of Civil Engineering, University of Victoria, Canada
Stephanie Pincetl Institute of the Environment and Sustainability University of California Los Angeles Los Angeles, CA, USA
Hannah Gustafson Institute of the Environment and Sustainability University of California Los Angeles Los Angeles, CA, USA
Felicia Federico Institute of the Environment and Sustainability University of California Los Angeles Los Angeles, CA, USA
Eric Daniel Fournier Institute of the Environment and Sustainability University of California Los Angeles Los Angeles, CA, USA
Robert Cudd Institute of the Environment and Sustainability University of California Los Angeles Los Angeles, CA, USA
Erik Porse Institute of the Environment and Sustainability University of California Los Angeles Los Angeles, CA, USA
ISBN 978-3-030-55600-6 ISBN 978-3-030-55601-3 (eBook) https://doi.org/10.1007/978-3-030-55601-3 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Cover illustration: © Alex Linch_shutterstock.com This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Acknowledgments
We would like to thank Howard Choy, Dave Freeman, Ken Alex, Steve Baule, Michael Peevey, Eric Stokes, Mark Gold, Peter Kareiva, and Amy Reardon as well as the staff at the Institute of the Environment and Sustainability at UCLA for their help, support and faith in this project. Mary Hardin, Julia Skrovan, and Lauren Strug provided invaluable editorial contributions and Sean Kennedy some invaluable analysis. A special thanks must go to Dan Cheng who spent nearly four years working on the Atlas, developing creative solutions to gnarly problems. Her cheer, and consistently positive attitude made an enormous difference. Alex Ricklefs was a great addition to the team as well. Studio NAND, our website developer based in Berlin, has been an integral partner in this venture, providing beautiful, smooth, and sophisticated web development, making the Atlas what it is. We are also only as far along as the previous Atlas team enabled: special thanks goes to Zoe Elizabeth, one of the first staff and manager of the Atlas 1.0, Jackie Murphy who created the first beta map, Rob Graham, our first programmer, Sinnott Murphy and Deepak Sivaraman who also participated in the initial Atlas development. Finally, this book and our work would not have been possible without the ongoing support of the Anthony & Jeanne Pritzker Family Foundation for the California Center for Sustainable Communities at UCLA, spearheaded by Dr. Pincetl.
Contents

1 Introduction
    About the Chapters
    Bibliography

2 The Larger Context, Cities, Smart and Big Data
    Introduction
    Smart and Big Data
    Building Energy Data
    A Changing Energy Landscape
    Data to Guide the Transformation
    Origins of the Energy Atlas
    Cities, Data, and Sustainability
    Mapping Intentionality and Consequence
    A Roadmap for Transparent, Data-Driven Energy Transitions
    Bibliography

3 Building Energy Data Access and Aggregation Rules
    Introduction
    California’s 2014 Data Access Decision
    University
    Privacy
    Aggregation Rules for Public Disclosure
    Data Access in Practice
    Summary
    Bibliography

4 Building an Energy Atlas
    Database Overview
    Utility Data
    Parcels and Buildings
    Megaparcels
    Administrative Boundaries
    US Census
    CalEnviroScreen
    Data Preparation
    Geocoding and Utility Data Standardization
    Spatial Joins in PostgreSQL with PostGIS
    Monthly Billing Dates
    Security
    Masked vs. Unavailable Data
    Units
    Limitations
    Missing Data for Some Publicly Owned Utilities
    Parcel Data Errors
    Raw Utility Data Errors
    Census Data Errors
    Geocoding Errors
    Compliance with Data Aggregation Rules
    Summary
    Bibliography

5 User Design and Functionality
    Introduction
    Stakeholder Engagement
    Website Components
    Map View
    Profile View
    Data Download
    Visualization, Data Availability, and Privacy Aggregation
    Summary
    Bibliography

6 Data Analytics
    Data-Driven Decision-Making
    How Data Is Transforming Science
    How Data Is Transforming Society
    Utility Customer Data
    Utility Company Internal Customer Data Use Cases
    Barriers to Third-Party Access
    Revisiting Utility Customer Data Aggregation Rules
    Ethical Implications of Utility Customer Data Access Decisions
    Third-Party Utility Customer Data Request Procedures
    Utility Customer Data Analytics
    Working with Utility Customer Data at Scale
    Verifying Utility Customer Data Integrity
    Putting Consumption into Context
    Detecting Significant Changes in Historical Consumption
    Forecasting Future Consumption Levels
    Utility Customer Data Models Versus Descriptions
    Modeling Applications
    Minimum Data Requirements
    Future Research Questions
    Bibliography

7 Case Studies
    Introduction
    Advanced Energy Communities Project
        Overview
        Background
        Role of the Energy Atlas
        Results and Findings
    Solar Prioritization Tool 1.0
        Overview
        Background
        Role of the Energy Atlas
        Results and Findings
    Solar Prioritization Tool 2.0
        Overview
        Background
        Role of the Energy Atlas
        Results and Findings
    Energy Transitions, Natural Gas, and Indoor Air Quality
        Overview
        Background
        Role of the Energy Atlas
        Results and Findings
    Electricity Infrastructure Vulnerabilities Due to Extreme Heat
        Overview
        Background
        Role of the Energy Atlas
        Results and Findings
    LA County Sustainability Plan—GHG Inventory and Business-as-Usual Scenarios
        Overview
        Role of the Energy Atlas
        Results and Findings
    A Building Energy Consumption Database for the California Bay Area
        Overview
        Background
        Role of the Energy Atlas
        Results and Findings
    Annex 70 International Research Collaboration
        Overview
        Background
        Role of the Energy Atlas
    Summary
    Bibliography

8 Conclusion
    Supporting Local Government Progress on the Energy Transition
    Regulatory Impediments
    Insights from Data-Driven Research
    Time-of-Use Pricing
    Interconnection of Distributed Generation
    Building Energy Use
    Summary
    Bibliography

Glossary

Index
List of Figures

Fig. 2.1 California’s energy system and sources of Energy
Fig. 4.1 UCLA Energy Atlas database overview
Fig. 4.2 Map of gas utility coverage in the Energy Atlas
Fig. 4.3 Map of electric utility coverage in the Energy Atlas
Fig. 5.1 Map view of the Energy Atlas website
Fig. 5.2 Footprint at the base of the map view contains longitudinal data for a specific geography when a user clicks the map
Fig. 5.3 Profile views compare up to three municipalities with tabular summaries of consumption data
Fig. 5.4 Profile charts displaying energy consumption by building type for Orange County in 2016
Fig. 5.5 Profile charts displaying energy consumption by building size for Orange County in 2016
Fig. 5.6 Profile charts displaying energy consumption by building vintage for Orange County in 2016
Fig. 5.7 Profile charts displaying residential electricity consumption by CalEnviroScreen Score in Orange County in 2016
Fig. 5.8 Residential electricity consumption by median household income in Orange County in 2016
Fig. 5.9 The data download portal of the website
Fig. 5.10 City-level industrial natural gas consumption is significantly masked on the Atlas website due to privacy rules set by the CPUC. The hashed geographies on this map show the extent of industrial sector masking in Southern California
Fig. 5.11 The spatial distribution and intensity of energy consumers plays an important role in how much data is masked by the 15/15 rule
Fig. 6.1 Results of an experiment using the Energy Atlas data to evaluate the effective masking rates associated with a large number of different potential alternatives to California’s existing 15/15 Rule
Fig. 6.2 Graphical illustration of the use of the simulated historical forecasting technique to evaluate the efficacy of a set of energy efficiency measures implemented in two different households
Fig. 7.1 Overview map of the six census tracts (orange) which comprise the Disadvantaged Communities of Basset and Avocado Heights selected as the study area for the AEC project. Note the number of major freeways intersecting the site as well as the adjacent large industrial facilities (bottom right), decommissioned landfill (bottom center), and gravel quarry (top center) (Basemap data credit: Mapbox)
Fig. 7.2 Overview of analysis methodology for the Solar Prioritization Tool 1.0
Fig. 7.3 Average hourly GHG emissions intensity of grid power, based upon the changing composition of generators in the grid portfolio mix, across the hours in the day for each year between 2010 and 2019
Fig. 7.4 Map illustrating the geographic location of the two project study area zip codes (91732 & 91746), which largely overlap with the AEC project study area (Basemap data credit: Mapbox)
Fig. 7.5 Los Angeles County DPR planning areas
List of Tables

Table 2.1 Major laws, regulations, and rules governing California’s energy system and its transformation
Table 4.1 Utilities included in the Energy Atlas and years of data available
Table 4.2 Building use type categories and descriptions
Table 4.3 Data load periods in the Energy Atlas
Table 4.4 Parcel geocoding match rate by county
Table 5.1 The 15/15 rule
Table 6.1 Summary overview of the four paradigms of scientific discovery as proposed by Hey et al. (2009)
Table 7.1 Overview of procedures and methods used in the Solar Prioritization Tool 1.0
CHAPTER 1
Introduction
This book is about an experiment that we felt ethically and scientifically obligated to conduct. It is a result of finding ourselves in a particular moment in history—one in which humanity needs to draw upon all of our species’ knowledge and capabilities to realize a sustainable future in the coming decades. As academics, we feel it is incumbent upon us to leverage the university’s unique status—legally, ethically, and intellectually—in furtherance of urban sustainability and a just energy transition away from fossil fuels. In doing so, we have discovered an immense gap in the practical availability of the data necessary for informing decision-making, including the implementation of California’s ambitious goals for climate action.

California’s efforts to decarbonize electricity generation and to reduce energy use in buildings have resulted in the development of a multitude of ambitious and well-intentioned targets, benchmarks, and policies pursuant to the state’s overarching sustainability goals. In the course of our research, however, we discovered that a great many of these targets and benchmarks were “data free”—that is, there were no data available to guide their implementation or evaluate their success. Nor were there legislative provisions requiring that any particular kind of data be used to measure or verify progress in meaningful ways. This was not because a record of building-level energy consumption, among other data, did not exist—it is regularly collected by utilities for billing purposes—but rather because utilities have claimed that such data are proprietary. As we shall explain, these proprietary claims seriously limit the use of granular energy consumption data for public decision-making, and are now proving to be a significant obstacle to the achievement of the state’s sustainability goals.

In California, greenhouse gas (GHG) emissions from building stock vary across the state, and by time of day and season, but overall, building energy use accounts for 25% of the state’s emissions (Mahone et al. 2019). Globally, buildings account for nearly 40% of cities’ GHG emissions and 40% of their total energy use (International Energy Agency and the United Nations Environment Program 2018). Consequently, knowledge of building energy usage patterns and an understanding of their underlying drivers are fundamental to developing policy strategies for reduction of energy consumption and GHG emissions.

When we first began our work to document energy use in cities, we found ourselves lacking the data needed to inform the energy transition in a scientifically rigorous manner and help enable the implementation of effective public policy. Not only did we university researchers not have the information we needed to pursue our own research questions, but local governments—charged with saving energy for their residents—could not even obtain energy data on their own municipal operations from the utilities, let alone building energy consumption data for their populations. As the lack of access had become an increasingly significant obstacle to energy research and policy implementation efforts, we moved to participate in public debates about the rights of different entities to access building energy data. We pushed for, and participated in, a state-level regulatory process (described in more depth in subsequent chapters) that eventually gave universities access to building energy consumption data, although extensive challenges remain to this day. What we present in this book is the outcome of that engagement: the UCLA Energy Atlas.

© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_1
The UCLA Energy Atlas is a spatial-temporal record of the energy consumed by and within the built environment. The Atlas links energy consumption from utility billing data to tax assessor’s parcels using addresses, making statistical, geographic, and chronological analyses of energy consumption possible. It consists of two parts: a back-end database, and a public-facing interactive web map and data visualization platform. The back-end database, on which the Energy Atlas website is based, contains hundreds of millions of records of historical consumption at the address level; it enables a wide array of research projects related to decarbonizing building energy use, which in turn serve to inform state and local government policies. The public web map displays building energy use in a multitude of ways—including by use type, square footage, and vintage, as well as by socio-demographics—and aggregates the data to protect customer privacy in accordance with regulatory rules. The Energy Atlas is regularly updated with new data and spatial attributes as we obtain additional research funding, and as local governments request in-depth analyses.

The UCLA Energy Atlas was born from the need to fill an enormous gap at this pivotal juncture in the climate crisis. It is a product of our commitment as scientists to evidence-based energy policy, and a testament to how crucial such data are for understanding building energy use. This book is a collaborative endeavor, weaving together knowledge and contributions from the interdisciplinary team of researchers who have created and maintained the Atlas and expanded its geographical extent and content. The Atlas could not have been developed without this interdisciplinary collaboration.

We are also grateful to a number of people without whom this work would never have been possible. These include Dave Freeman, former General Manager of the Los Angeles Department of Water and Power; Howard Choy, former General Manager of Energy and Environmental Services Internal Services Division of Los Angeles County; Mark Gold, former Acting Director of the UCLA Institute of the Environment and Sustainability; Amy Reardon of the state Public Utilities Commission; Michael Peevey, former Chair of the state Public Utilities Commission; Ken Alex, former Director of the state Office of Planning and Research; and Eric Stokes, Program Manager at the state Energy Commission, who provided us our first round of funding. We cannot thank all the people who have, in one way or another, supported our work, but it is a truism that good work is a collaborative effort, and the Atlas is no different.
About the Chapters

• Chapter 2—Places our initiative in the larger context of the rise of big data and data representation, the energy transition, and California’s energy politics.

• Chapter 3—Addresses building energy data access issues in California (found commonly across the country and beyond) and aggregation rules. We describe the process by which address-level building energy use data became available and the political context of that decision. This chapter, as a result, provides a window into the institutional context of electricity and natural gas regulation in the state, its evolution, and current challenges. Regulation of electricity and natural gas utilities is scattered across state agencies, the state legislature, and federal agencies and regulators. We shed light on the evolving regulatory regime for the energy utilities that then cascades down to the locality. Limitations to using and displaying disaggregated data are discussed through examining the data aggregation rules and the consequences for research and public display of energy use in the Atlas. The chapter also introduces utility data: what data is maintained by the utilities and how researchers have developed processing methods to make the data usable. We also explain our geocoding process for the data and how it enables spatial matches that allow energy use to be correlated with buildings, people, and other attributes of the region, such as solar capacity of rooftops, environmental quality, temperature, or microclimate.

• Chapter 4—Describes how the Atlas was built: the constituent data layers and their characteristics; the data processing steps and methodology for linking the data layers together; the approach to maintaining security and privacy; and the inherent limitations and challenges of working with these data.

• Chapter 5—Discusses the mapping process itself and the making of the interactive Atlas website. Interactive websites must be both functional and beautiful to attract users. They must provide sufficient important data, but in user-friendly, intuitive ways that engage the user because the presentation of data is well thought through. We discuss the selection of the covariate data, which was partly a function of what is publicly available and partly of what seemed to the researchers, after integrating stakeholder suggestions, to be revelatory about building energy use for equity analysis and the implementation of renewable energy resources. This approach is guided by an intent to provide information to support a just energy transition, and to assist local government reporting and program implementation. We also discuss the website’s architecture—the software utilized, and the reasons for its selection. Finally, we detail database protocols that maintain the security and confidentiality of the data.

• Chapter 6—Delves into data analytics. “Big data” may be defined by its size, complexity, and the speed at which additional data is generated. For big data to be useful, its potential to explain phenomena and provide insights must be well understood. We explain our approach to extracting information from the “big” data stored in the Energy Atlas. We also discuss analytics driven by requests for information from stakeholders and/or communities, as well as those which support specific research grants. The examples presented aim to demonstrate the enormous potential inherent in the type, size, and extent of data in the Atlas.

• Chapter 7—Presents a series of case studies made possible by the Energy Atlas database, including a number of grant-funded research projects that used the granular data for cutting-edge analysis without compromising customer privacy. We utilized a hub and spoke approach to the work: the data is the hub, and we have built specific analytics, the many spokes, as a result of additional funding. We hope that these projects advance a just transition in the region by harnessing the power of empirical data to improve our understanding of energy use in disadvantaged and low-income communities, and to support policy recommendations. This chapter also describes the use of the Energy Atlas as the backbone for greenhouse gas emissions accounting for Los Angeles County’s first-ever Sustainability Plan, allowing for disaggregation of the inventory for the County’s 88 cities, and providing localities with the information needed for energy efficiency and conservation investments to upgrade building energy use. The chapter concludes with a discussion of an international research collaboration around building energy data in which we participate, and which highlights the Atlas as a global best practice.

• Chapter 8—Concludes with a discussion of policy insights and implications. Policy insights from the Energy Atlas are both multi-scalar and temporal. For example, the Atlas enables a prospective analysis of the potential impacts of upcoming “Time-of-Use” pricing on different income groups, based on their energy use at different times of the day. Additionally, we have been able to calculate the capacity of the built environment in Los Angeles County to electrify based on roof size and energy use. Grid information matched to regulations about grid penetration of solar adds yet another dimension to insights for future policy, for example, where the grid should be modernized. An important lesson the Atlas provides is that data access is key to cities developing policies and programs for energy use reduction, yet some of the most critical data remains very difficult to obtain. Knowledge of building energy use has enormous power to inform policy and policymakers about the types of interventions that can be the most effective, as well as how to ensure that the buildings in most need of retrofits can be targeted first. Without good data, everybody (regulators, local governments, utilities, researchers, and the public) is flying blind.
Bibliography

Mahone, A., Li, C., Subin, Z., Sontag, M., Mantegna, G., & Karolides, A. (2019). Residential Building Electrification in California: Consumer Economics, Greenhouse Gases and Grid Impacts. www.ethree.com.
CHAPTER 2
The Larger Context, Cities, Smart and Big Data
Introduction

Cities are now the primary habitat for human beings. For the first time in our history, most humans across the globe live in urban areas, and cities are increasingly being recognized as important venues within which to pursue the reduction of humanity’s impacts on the environment (Evans 2019). Cities are accumulations of many of the Earth’s rarest materials. Often mined and processed in distant places, these materials are assembled and embedded in city systems so that their unique physical and chemical properties can be leveraged to provide shelter and jobs and to enable the operation of many types of infrastructure. Cities rely on continuous input flows of energy: electricity, natural gas, and other hydrocarbons, which enable the heating and cooling of buildings, propulsion of vehicles, and functioning of machinery. As humanity has become increasingly urbanized and reliant on infrastructural systems, concomitant advances in computational technology now promise tantalizing solutions to previously intractable infrastructural management problems. However, to “optimize” the functioning of infrastructural systems, we require data from and about those systems. Accordingly, we now exist in an era of “smart” and “sensored” cities (Kitchin 2014a, b; Marvin et al. 2018). Naturally, the type and availability of data affect the character and orientation of a wide range of policy processes, including sustainability policy. Furthermore, the content and form of relevant data (its granularity, whether it is designated as public or private information, the frequency of its collection, and how it is curated) also influence policy processes, especially the evaluation of policy impacts.

© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_2
Smart and Big Data

Smart data typically refers to digital data that are collected from distributed devices (sensors), which monitor activities and provide real-time data and feedback. Examples include traffic counts, mobility technologies, waste collection, dynamic street lighting, or energy data from smart meters. While sensors and control-based devices have existed for many years as part of municipal and utility infrastructure, the emergence of the Internet of Things (IoT), coupled with Smart City applications, has led to an exponential growth in opportunities for data connectivity and collection. This raises important issues about privacy and security. Protocols must be developed to ensure consistent, standards-based approaches, as well as to ensure that only data that create added insights are collected. Data collection and storage are not free, and more data does not always create additional utility. Typically, a city’s data goes into an open-source portal that is made available to planners, decision-makers, and/or the public. It is generally intended to improve government transparency, accountability, public participation, and interdepartmental collaboration. Smart sensor networks are also increasingly marketed to consumers. Domestic sensor and software packages are now available that provide residents with information about electricity consumption, air temperature, the status of groceries in a refrigerator, etc., and allow for remote control of devices.

Boyd and Crawford (2012) define “big data” as resting on an interplay of (1) technology that maximizes computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets; (2) analysis drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims; and (3) mythology that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy (p. 663). It is important to recognize that big data are not neutral and not all big data are the same. What is collected is a choice about what is important, and inquiry is always driven by points of view, or by methods that implicitly reflect the points of view or goals of the investigators. Big data and systems that display the data are therefore a result of a social process of choice reflecting priorities and ideas about what is important. How data are identified, collected, processed, and displayed is not neutral. For our work, we make sure that the collection and analysis of building energy data also provide knowledge about how the existing system and regulatory shifts may impact the most disadvantaged populations.
Building Energy Data

It is both interesting and revealing to assess what city data are collected but not made publicly accessible. Privacy protection has been invoked, most frequently by utilities, to prevent the sharing or release of building energy and water use data. In the United States, flows of energy and water are considered necessary for human well-being in cities, and considerable investment is made in ensuring the reliability and quality of these essential resources. Despite the vital nature of these services, and the extensive public oversight of both public and private utilities, granular data about energy consumption (or water for that matter) in the built environment are not publicly available in California, nor in most of the world.1 This situation persists despite building energy use comprising a large portion of urban energy consumption. While data are usually available at a macro scale (e.g., citywide, or by sector), little is known about energy consumption inside most cities at a granular level—by neighborhood, by building type, or as a function of other salient building characteristics. For example, people use energy to heat, cook, run appliances and equipment, and for illumination. But knowledge of the energy use of each of these activities, of the spatial differences in energy use that exist between different homes and neighborhoods, industries and commercial establishments, and of how patterns of energy use change over time has, historically, been gleaned from either modelling studies or small empirical data samples. Those approaches, we have discovered, are insufficient to determine differences in patterns of energy consumption across building and socioeconomic attributes, between heterogeneous geographies, and over time.
1 There are a few exceptions, such as Gainesville, Florida, and Cambridge, Massachusetts, where property-level energy consumption data are publicly available. The International Energy Agency is working to develop a building energy “epidemiology” knowledge base through its Annex 70 research effort, with the goal of characterizing building energy use, but even in Europe, building energy use reporting is not available at the building scale.
S. PINCETL ET AL.
The lack of data is not the result of its absence; utilities must collect it in order to bill customers. Rather, utilities consider these data proprietary and key to internal operations and decisions. Generally speaking, utilities prefer not to have outside entities (the public, researchers, regulatory bodies) meddle in their internal decision-making processes, which rely on information extracted from millions of customer bills. Utilities, by and large, have been very successful in providing essential services (electricity and natural gas) at affordable prices, with few major breakdowns, blackouts or service interruptions, though this track record is likely to come under increasing stress as climate events accelerate. At the same time, electricity generation and natural gas impose considerable externalities on society, including pollution and greenhouse gas emissions. Moreover, there is a strong correlation between affluence and consumption, which also has societal impacts due to infrastructure requirements. Carefully and parsimoniously shifting to a decarbonized energy system is very much in the public’s interest, but the scale and extraordinary complexity of the infrastructural systems involved require that robust, disaggregated energy consumption data be readily available. That energy consumption data have been by and large proprietary has negatively affected the ability of regulators to deeply understand patterns of electricity and natural gas use by socioeconomic groups, and the relationship of that use to income. Further, there are important issues regulators should better understand for themselves, related to grid infrastructure and capacity, local renewable energy generation potential in any given urban morphology, and the possibilities of storage. Each of these has important implications for the future of energy generation and use, and each remains opaque to most PUCs, other regulators and researchers due to the lack of data.
Further, there are enormous possibilities to better understand commercial and industrial energy use as well, and to target efficiency programs as more is known about the different sectors and how they might compare among themselves. None of this is possible without granular data about building energy use. Therefore, to plan for equitable and “smart” energy transitions, knowing exactly how and how much energy is utilized in buildings, by whom and where, is critical for precise policy interventions, i.e., targeting programs, incentives, initiatives and investments according to the driver (building age, appliances, or other). Distinctions between building types and residents’ sociodemographic characteristics are also important for evaluating the performance of energy efficiency programs, which are
2
THE LARGER CONTEXT, CITIES, SMART AND BIG DATA
key components of California’s broader strategy of decarbonization and electrification. While there is a general belief in the efficacy of energy efficiency investments, what works where, and under what circumstances, remains largely unknown. State agencies and utilities cannot continue to rely almost exclusively on ex ante methods to evaluate the performance of energy efficiency policies if they are serious about achieving significant reductions in the demand for energy. Costly repercussions may occur when programs are poorly designed or implemented, or when they fail to reach the intended groups of consumers. Access to granular energy consumption data makes it far more likely that California will meet its energy targets in an optimal manner, as granular spatial analysis provides invaluable insights into patterns of energy use across the landscape and over time, and the extent to which equity objectives are being achieved. It does no good for the state to set targets if the policies and programs meant to implement them cannot be accurately evaluated, and if course corrections cannot be made when some prove to work and others do not.
A Changing Energy Landscape
The energy sector worldwide is struggling with how to generate and distribute energy while simultaneously reducing greenhouse gas emissions. The emergence of increasingly inexpensive renewable energy generation, changing building technologies, transportation electrification, and increasing demand have disrupted the twentieth-century model, putting incumbent energy companies on the defensive. The highly successful approach of the modernist era, now challenged by the aforementioned developments, was to build centralized electricity generating plants, fueled by hydrocarbons, or sometimes hydropower, and to distribute that energy through an extensive transmission and distribution grid. Today, there are alternatives. Traditional utilities are struggling to keep up with a dynamic new energy world. As the cost of renewable energy supply and storage systems begins to rival traditional generation options, and the operations of conventional generation and distribution assets are increasingly disrupted by the impacts of climate change, utilities must alter their business models and infrastructure to adapt. Regulatory agencies in the United States, such as state Public Utilities Commissions (PUCs), must update the way they regulate the utilities to confront rapid
market and policy changes pertaining to the energy supply and utility operations.2 In California today, the state has developed greenhouse gas emissions reduction mandates for electricity generation, requiring the utilities to purchase increasing increments of power generated by renewable sources of energy, as well as to fund energy efficiency programs to reduce demand. These energy efficiency (EE) programs form an important part of the state’s approach to reducing building energy use. EE incentives are funded by revenue collected from ratepayers, designated by the California PUC (CPUC) for the purpose, and implemented through an array of different programs and partners. The CPUC has constructed a complex set of requirements and regulations to govern each of the interacting aspects of the mandatory regulations from the state, such as the renewable energy portfolio, and program implementation such as the EE incentives. Much attention has been focused on electricity generation, where alternatives are available, but natural gas use (also under the umbrella of the PUC) in homes and industrial processes remains endemic and difficult to supplant. It is an inexpensive energy source and extensive infrastructure exists to distribute it. Most homes, at least in California, have natural gas appliances that rely on that infrastructure. Natural gas also still serves a critical need in the generation of electricity, making up for the intermittency of renewable energy (solar and wind) and supplying nighttime generation, given the state’s lack of renewable energy storage. Shifts toward greater electrification will result in stranded assets for natural gas utilities, and expenses for households that must replace their appliances, as well as potentially higher utility bills. Transportation electrification will lead to yet more changes and impacts.
These are among the complex tradeoffs that PUCs are grappling with, but without the tools and information that would enable them to truly understand the distribution of building energy use and its important explanatory variables. One might ask why, given the importance of consumption data for policy implementation, it has not been made more widely available. In California, access came only due to pressure from local governments, the
2 PUCs were created in the early twentieth century to regulate private monopolies and ensure that those dependent on their services were charged affordable and fair rates. They were to further ensure that the monopoly shareholders received a fair rate of return, and that utility rates could provide the utilities with enough revenue to upgrade and maintain their infrastructure.
Governor’s Office under Governor Brown, and academia. Public Utilities Commissions have historically had the role of ensuring that utilities provide services equitably (everyone can have electricity and natural gas), that rates charged to customers are fair and affordable, and that the utilities invest sufficiently in their infrastructure to ensure service is reliable. The California PUC, like its counterparts, is also involved in regulating utility infrastructure, to make sure it is safe from catastrophe. The oversight role of the CPUC has shifted and grown in response to additional policy goals, for example, the statutory requirement for utilities to develop renewable resources portfolios to reduce greenhouse gas emissions from electricity generation, and additional regulations encouraging energy conservation. Some of the approaches have included requiring the utilities to fund energy efficiency programs (out of ratepayer funds). But access to billing data itself, for program implementation by local governments, energy planning by the state’s Energy Commission, or deeper analyses conducted by academics, is a novel development in the realm of energy management and oversight. Further, the CPUC and the utilities, it must be remembered, have long-standing institutional relationships, going back over a century. The utilities’ contention that they have been the sole keepers of the data, and have effectively protected customer privacy and provided services, has been well taken by the CPUC, and only a concerted lobbying effort at a particular time convinced the Commission that the sharing of utility data with other public sector entities should be considered. In fact, Governor Brown’s Office effectively convinced his appointees to the Commission of the need for greater data transparency and availability. This window of interest and opportunity seems to have closed under the new Governor and his appointees.
This points to how change results from conjuncture, in a sense from all the stars aligning at a point in time. The organizations that provide energy supply services are also evolving rapidly. In California, there are complex new developments in the realm of energy delivery. Long dominated by just Investor-Owned Utilities (IOUs), such as Southern California Edison, or Publicly Owned Utilities (POUs), such as Burbank Water and Power, the energy provision business now has a new type of utility: Community Choice Aggregators (CCAs). Formed by one or more local governments in cooperation, CCAs procure their own electricity and are legally authorized to utilize IOU distribution networks (poles and wires) to deliver it to customers, thereby directly competing with IOUs. CCAs provide an option for local
governments that want more control over their electricity sources, more renewable power than is offered by the default utility, and/or lower electricity prices. These new entrants add an additional layer of complexity to the state’s energy system (Fig. 2.1). The conventional large-scale grid systems that move energy across long distances are owned and operated by the regulated utilities. They cover huge geographic distances and are increasingly vulnerable to the impacts of climate change, such as wildfires. A predicted increase in the number of extreme heat days will strain existing capacity and reduce transmission efficiency (Burillo et al. 2019). Distributed, time-dependent renewable energy production also makes balancing supply and demand more challenging, exemplified by the so-called “duck curve.” Periodic oversupply of electricity has led California to pay other states to take it. Growing defections from the grid made possible by microgrids and localized storage are also impacting the system.
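The “duck curve” mentioned above refers to the shape of net load, that is, demand minus variable renewable output, over the course of a day. A minimal Python sketch, using hypothetical hourly values chosen only to show the shape (they are not California grid data), might look like this:

```python
# Hypothetical values in gigawatts for selected hours of one day.
# They are illustrative only and are not drawn from any grid operator.
hours = [0, 3, 6, 9, 12, 15, 18, 21]
demand = [20, 18, 21, 24, 25, 26, 30, 25]
solar = [0, 0, 1, 9, 13, 10, 1, 0]

# Net load is what conventional generation (or storage) must supply.
net_load = [d - s for d, s in zip(demand, solar)]

# The midday minimum is the "belly" of the duck; the evening maximum
# is the "neck", reached after solar output falls away.
belly_hour = hours[net_load.index(min(net_load))]
peak_hour = hours[net_load.index(max(net_load))]
evening_ramp = max(net_load) - min(net_load)

print(net_load)
print(belly_hour, peak_hour, evening_ramp)
```

In this toy example the gap between the midday minimum and the evening peak is the ramp that dispatchable resources must cover within a few hours, which is why growing solar output makes balancing supply and demand harder.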
Fig. 2.1 California’s energy system and sources of energy
The world of energy generation, transmission and supply is at an important inflection point. Greater insight into how energy is used, where it is used, at what time, and by whom, is vital to a successful transition. However, given the traditional function of utilities (to provide energy regardless of the amount), contextual analysis of building energy use has simply not been part of their business models. Utilities have had a narrow mission: to provide the electricity and natural gas necessary for cities to function within the prevailing paradigm. Now that the paradigm is shifting toward decarbonization, it is time to revisit not only the mission of utilities, but also their practices. Changing mandates and policies, and the urgent need to curtail greenhouse gas emissions and energy use, are undermining the utilities’ justifications for keeping energy consumption data private. As valuable as our work on the Energy Atlas has been, it is still limited in significant ways by the state’s IOUs, who use concerns about customer privacy and their power over the data request process itself to limit any further scrutiny of their operations, scrutiny which could result in additional regulation or reveal structural inequalities in service and/or rates. In subsequent chapters, we will discuss at length how current privacy aggregation rules and the IOUs’ control over the data request process limit the Energy Atlas’s capabilities and power.
Data to Guide the Transformation
California has established important and precedent-setting policies relating to buildings and energy use (Table 2.1). These include: energy efficiency standards and renewable energy technologies for electricity, setting a goal for a 50% reduction of building energy use by 2030, requiring building energy use disclosure for large office buildings, and mandates for new buildings to be zero-net energy and to include solar panels. But the implementation of these rules, regulations, and guidelines requires accurate and detailed information about energy supply and demand, as well as the state of existing infrastructure. Furthermore, these significant changes to the state’s regulatory regime will have far-reaching impacts on California’s economy. Coupling building energy use data with other information can provide much more insight into relative building energy use, vulnerability, and potential for transformation. Variables such as energy efficiency program participation, sociodemographic characteristics, industrial classification
Table 2.1 Major laws, regulations, and rules governing California’s energy system and its transformation

California State Legislation
State legislation, followed by its major provisions

AB 32: Global Warming Solutions Act of 2006 (Nunez, 2006)
• Created a comprehensive program to reduce GHG emissions in California. Strategies include a reduction mandate to achieve 1990 emission levels by 2020 and a cap-and-trade program
• Required ARB to develop a Scoping Plan that describes the approach California will take to reduce GHGs; the plan must be updated every five years

AB 758: Existing Building Efficiency (Skinner, 2009)
• Required the CEC to collaborate with the CPUC and stakeholders to develop a comprehensive program to achieve greater energy and water savings in existing residential and non-residential buildings
• The CEC developed an Existing Buildings Energy Action Plan in August 2015

AB 2514: Energy Storage Systems (Skinner, 2010)
• Required electric utilities to install minimum levels of grid-scale energy storage infrastructure

SB 535: Disadvantaged Community Benefits (De Leon, 2012)
• Required at least 25% of state cap-and-trade revenues to go to projects that benefit disadvantaged communities

SB 350: Clean Energy and Pollution Reduction Act of 2015 (De Leon, 2015)
• Expanded California’s Renewable Portfolio Standard (RPS) goals and requires retail sellers of electricity and local publicly owned electric utilities to increase their procurement of eligible renewable energy resources to 40% by the end of 2024, 45% by the end of 2027, and 50% by the end of 2030
• Required the CEC to establish annual targets for statewide energy efficiency savings in electricity and natural gas final end uses of retail customers by January 1, 2030
• Provided for transformation of the Independent System Operator into a regional organization

AB 802: Energy Use Reporting (Sternberg, 2015)
• Statewide energy use benchmarking and public disclosure program for large commercial and some multifamily buildings. The law requires that owners of commercial buildings larger than 50,000 square feet report annual whole-building energy use data to the CEC
• Reported data is used to develop CEC energy demand forecasts, and is made available to the public

SB 32: State Targets for Climate Pollution (Pavley, 2016)
• Required ARB to ensure that statewide greenhouse gas emissions are reduced to 40% below the 1990 level by 2030

SB 100: California Renewables Portfolio Standard (De León, 2018)
• Accelerated program targets to achieve a target of 50% renewable resources by December 31, 2026, and a 60% target by December 31, 2030
• Required retail sellers and local publicly owned electric utilities to procure a minimum quantity of electricity products from eligible renewable energy resources

AB 802 (Williams, 2015)
• Created the Building Energy Benchmarking Program. The program requires the owners of commercial and large multifamily buildings to disclose the energy consumption of their buildings on an annual basis

California Governor’s executive orders
Executive order, followed by its major provisions

B-30-15
• Established a new interim statewide target to reduce GHGs to 40% below 1990 levels by 2030, to ensure California meets its target of 80% below 1990 levels by 2050

B-48-18
• Set a target of 5 million ZEVs on California roads by 2030, and to streamline ZEV infrastructure

B-55-18
• Established a new statewide goal to achieve carbon neutrality as soon as possible, and no later than 2045, and achieve and maintain net negative emissions thereafter. This goal is in addition to the existing statewide targets of reducing greenhouse gas emissions

N-19-19
• Required redoubling of the state’s “efforts to reduce greenhouse gas emissions and mitigate the impacts of climate change while building a stable, inclusive economy.”

State codes
Code, followed by its major provisions

California Energy Code: CCR Title 24, Part 6
• Ensures that new and existing buildings achieve energy efficiency and preserve outdoor and indoor environmental quality through use of the most energy-efficient technologies and construction
• A component of the California Building Standards Code, published every three years. Promulgated by the CEC in collaboration with the CBSC

Appliance Efficiency Regulations: CCR Title 20, Division 2, Chapter 4, Article 4, §§ 1601 et seq.
• Requires manufacturers of new appliances sold or offered for sale in California to test them using specified test methods
• Covered appliances include refrigerators, air conditioners, heaters, plumbing fitting/fixtures, lighting, washers, dryers, cooking products, electric motors, transformers, power supplies, televisions, and battery charger systems
• Promulgated by the CEC

Abbreviations
AB: Assembly Bill
CARB: California Air Resources Board
CBSC: California Building Standards Commission
CCR: California Code of Regulations
CEC: California Energy Commission
CPUC: California Public Utilities Commission
SB: Senate Bill
codes, building vintage, grid capacity, and distribution networks, climate hazards such as high heat, and other spatial attributes, can help develop insights into causal factors that influence energy use in buildings. With an enhanced and geographically specific understanding, policies and programs can be targeted to the buildings (and populations that inhabit or
use those buildings) that are most in need. For example, we have discovered that affluent neighborhoods use up to 10 times as much energy per capita as less affluent areas. Yet, houses in these affluent neighborhoods are far more efficient per square foot than their counterparts in low-income areas. How can this be? With the Atlas data coupled to building vintage, size and sociodemographics, we discovered that houses in Malibu are much larger than those in South Los Angeles. Size, in this case, has trumped energy efficiency (Fournier et al. 2019). This is a significant finding going forward, as the state continues to invest resources in efficiency as a way to decrease total energy consumption. As California promotes appliance and vehicle electrification to decarbonize its economy, these transitions will impact total building energy consumption as well as timing of use of electricity, indoor and ambient air quality, and household resiliency. For these and other reasons, concurrent and integrated evaluations of building energy use, solar generation capacity, and grid capacity, are necessary for integrated energy planning. Without building-level data, energy planning is less accurate, liable to result in wasteful expenditures, and unable to address equity issues. Considering how important disaggregated energy consumption data is to efforts to reduce greenhouse gas emissions and local air pollution, the lack of granular and specific data on building energy use in cities is of major concern. Inquiry driven by theoretical and critical questions about equity, Earth system impacts, and energy transitions, can elucidate patterns and trends, and perhaps even anticipate the consequences of various changes to the energy system. This question-driven approach is distinct from simply collecting massive amounts of data and combining it for trends. 
A critical approach involves starting with a set of questions motivated by a desire for social and environmental justice that then proceeds toward the discovery of approaches and procedures for energy transitions that do not cause additional hardship or worsen inequality, while also minimizing environmental harm.
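The contrast described earlier in this section, in which larger homes can be more efficient per square foot yet consume far more per person, comes down to simple arithmetic. The sketch below uses entirely hypothetical figures (they are not Energy Atlas values, and the class and field names are our own illustrative choices) to show how the two metrics can point in opposite directions:

```python
from dataclasses import dataclass


@dataclass
class Neighborhood:
    """Hypothetical annual residential totals for one neighborhood."""
    name: str
    annual_kwh: float        # total residential electricity use
    floor_area_sqft: float   # total residential floor area
    residents: int

    @property
    def kwh_per_sqft(self) -> float:
        # Energy intensity: lower means more efficient per unit of area.
        return self.annual_kwh / self.floor_area_sqft

    @property
    def kwh_per_capita(self) -> float:
        # Consumption per person: reflects dwelling size as well as efficiency.
        return self.annual_kwh / self.residents


# Illustrative numbers only: large homes with few residents versus
# smaller homes with many residents.
affluent = Neighborhood("affluent (hypothetical)", 30_000_000, 6_000_000, 3_000)
lower_income = Neighborhood("lower-income (hypothetical)", 20_000_000, 2_500_000, 10_000)

for n in (affluent, lower_income):
    print(n.name, n.kwh_per_sqft, n.kwh_per_capita)
```

With these made-up inputs, the affluent neighborhood is more efficient per square foot while using several times more energy per capita, because each resident occupies far more floor area. This is the sense in which size can outweigh efficiency.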
Origins of the Energy Atlas
By 2010, it was abundantly clear that a lack of building energy baseline data was making it difficult for local and municipal governments to develop programs to comply with state policies, and that verification of progress toward those policies’ goals would be weak at best. Cities
did not know whether their actions were effective at meeting state goals other than at the most aggregated levels, and equity impacts were nearly impossible to assess. Local municipalities, responsible for demonstrating progress toward state mandates and climate goals, were in desperate need of building energy consumption data. Energy efficiency programs, costing up to a billion dollars a year of ratepayer funds, and implemented through an arcane myriad of entities, large, small, public, and private, had not been assessed for performance due to lack of such data. Additionally, cities had not been able to consistently set commercial building energy disclosure thresholds, since they did not know the consumption of energy per unit area (square foot) of buildings. It was clear that a host of entities needed access to disaggregated, geographically specific data in order to integrate state climate goals into daily operations and long-term plans. Concerned with the need to develop such baselines, UCLA researchers proposed to develop an in-depth analysis of energy and resource use for Los Angeles County as part of an urban metabolism research program, but found they were unable to obtain the building energy data needed for the study (Pincetl et al. 2014). Determined to support the needs of local governments, UCLA researchers created a demonstration Atlas from electricity billing data provided by the Los Angeles Department of Water and Power (LADWP), the largest POU in the United States. This pilot showed the power of granular data for policy implementation and evaluation. UCLA then participated with local governments and other energy researchers in a pivotal 2014 CPUC proceeding on building energy data access. The lessons learned from the creation of the prototype Atlas supported our contention that access to parcel billing data was critical to policy implementation.
The CPUC’s final decision, discussed in more detail in Chapter 3, enabled university researchers to request address level billing data from IOUs under non-disclosure agreements and other data security conditions. Any public use of the data would be subject to privacy aggregation rules, as described in subsequent chapters. Aggregation rules were the topic of great discussion and controversy during the proceedings, as the utilities were unconvinced that the data should be available at all, and consumer privacy protection advocates worried about the possible intrusive knowledge about household habits that data could provide. Ultimately, the UCLA team applied for and received address-level data from IOUs for both electricity and natural gas. Building on the pilot
version, Energy Atlas 1.0 was created, which displayed building consumption spatially, and by sociodemographic characteristics and building attributes (vintage, size, and use type), starting with the vast landscape of Los Angeles County, a heterogeneous region of over 10 million people. Atlas 2.0, released in 2019, now covers most of Southern California, a population of approximately 19 million people. The Atlas website is designed to be queried with a series of drop-down menus (e.g., commercial electricity use per square foot, or residential natural gas use per capita) for various geographies (county, city, neighborhood, councils of government). For certain utilities and geographies, the Atlas displays data from 2006 to 2016, a unique longitudinal record that allows observation of consumption changes over time. The database itself is kept separately from the Atlas website, ensuring strict security protection of the address-level records, as explained in detail in Chapters 4 and 5. The Atlas is an example that researchers from other regions may follow once address-level data is available to them. Unfortunately, thus far few, if any, other US states (ACEEE 2020) or countries (Webster 2015; Hamilton et al. 2015) have allowed researchers to access account-level data at the scale available in California; what is available is primarily limited to data from large buildings that fall under cities’ energy benchmarking laws (Kontokosta et al. 2020). Therefore, the Atlas remains a unique contribution to planning for greater sustainability. The back end of the Atlas enables in-depth and complex analytics across millions of customers and the various attributes associated with buildings, and over time. This analytical power, achieved without violating customer privacy, provides invaluable insights about the sociotechnical system that energy encompasses, and its societal, environmental and regulatory attributes and impacts.
Granular data enables a series of questions that cannot be accurately answered with modeled or sampled data, the conventional way of inquiry, as we explain in depth in subsequent chapters. The Energy Atlas design was inspired by, and is consistent with, the concept of Public Participation Geographic Information Systems (PPGIS) (Sieber 2006), in that it has been guided by user and community input, and provides spatial and other data in a form that can assist in community or public decision-making. This is discussed further below. Functions of the Atlas interface include enabling the understanding of neighborhood-level dynamics relative to energy use, improving public sector and utility programs and policies such as energy efficiency retrofits, and assisting the
energy transition toward renewables. The Atlas database, discussed further in Chapter 4, enables researchers to pursue in-depth analyses to answer specific questions about energy transitions using spatial statistical techniques that can be anonymized for publication. The database therefore serves as a rich research repository. The Atlas is circumscribed by constraints on data access and disclosure. Consequently, the address-level utility data contained in the Atlas database is provided to UCLA under nondisclosure agreements. A wide array of measures ensures the security of the data, including, as previously mentioned, maintaining the back-end database on a separate server from the public-facing website. Because privacy aggregations must be done in a secure environment, this regrettably constrains independent third-party verification of our data aggregation and matching techniques. Aggregation compliance rules are clearly defined in the CPUC ruling; however, the application of these rules in practice forces researchers to make consequential decisions about how to display aggregated data on the Atlas website. In some cases, reportable energy use data must be redacted, or “masked,” for an entire geographical area if one user’s consumption is large enough relative to others. In other cases, privacy aggregation rules necessitate the exclusion of certain users from calculations of consumption statistics, thus distorting them. We discuss the complexities of privacy aggregation guidelines in detail in Chapter 4, as well as their implications for the accurate display of energy use by certain categories.
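The general logic of such masking can be sketched in a few lines of Python. This is a simplified illustration, not the Atlas implementation: the function name is our own, and the thresholds `min_accounts` and `max_share` are hypothetical parameters standing in for the values set in the CPUC ruling, which are not reproduced here.

```python
from typing import Optional, Sequence


def aggregate_or_mask(
    usages: Sequence[float],
    min_accounts: int,
    max_share: float,
) -> Optional[float]:
    """Return total usage for an area, or None if the area must be masked.

    min_accounts and max_share are hypothetical thresholds; the real
    values depend on the applicable aggregation rule.
    """
    # Too few accounts: an individual's usage could be inferred.
    if not usages or len(usages) < min_accounts:
        return None
    total = sum(usages)
    # One account dominates the total: mask the whole area.
    if total > 0 and max(usages) / total > max_share:
        return None
    return total


# Example with made-up thresholds (5 accounts, 25% maximum share).
print(aggregate_or_mask([10, 12, 11, 9, 8], min_accounts=5, max_share=0.25))
print(aggregate_or_mask([10, 12, 11, 9, 100], min_accounts=5, max_share=0.25))
print(aggregate_or_mask([10, 12], min_accounts=5, max_share=0.25))
```

In the first call the area passes both checks and its total is reported. In the second, a single large user exceeds the share threshold, so the whole area is masked; in the third, there are too few accounts. Dropping the large user instead of masking would let a total be reported, but that total would understate the area's actual consumption, which is the distortion noted above.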
Cities, Data, and Sustainability
Many theoretical and demonstrative studies describe how data can improve the ways that cities conduct their activities, and how, today, information technology and abundant computational resources may help solve persistent and serious public management problems. Yet, it is important to remember that although cities have used data and various analytical techniques for centuries to make decisions, the effectiveness and competence of city administrators, and the political contexts in which they exist, also determine outcomes. Data can help city managers make decisions about street paving cycles, municipal finance programs, watershed and air quality monitoring, and many other aspects of municipal operations. Implementation of programs will always, in the end, be predicated on funding and staff, political decision-making and popular will, issues that data itself cannot necessarily affect. The protection of utility data as proprietary is
one such example: regulators can determine a great deal regarding data availability. Public opinion also plays a role, for example, in determining the extent to which CCTV is used to monitor public spaces. The twentieth century saw a rise in national development and use of socioeconomic and demographic statistics such as unemployment rates, gross domestic product, etc., to evaluate societal conditions and the effectiveness of public policies (Godin 2003 in Kitchin et al. 2015). In the twenty-first century, cities are using ever-larger data sets to develop indicators, track maintenance issues and operational needs, and task employees with work. Municipal departments use information to track progress, target efficient use of city funds, and devise long-term plans for managing assets. The public simultaneously expects greater access to, and transparency in, city operations, including data about how funds are spent, how choices are made, and the evaluation of impacts of policies and programs. Data offer opportunities for city agencies to be accountable to residents about performance. But, in some instances, the public is apprehensive about data collection for the improvement of public services, even if it is highly germane to city operations. A good example in California is the collection of data on motorists who run red lights or violate speed limits. In some places the public has revolted against cameras that monitor this behavior, and municipalities have removed those sensors. Limited state and federal action on climate change (California notwithstanding) has meant that North American cities are leading the charge toward a sustainable society by default. A network of nonprofit organizations and researchers has arisen to assist municipalities in this endeavor.
For instance, C40 (https://www.c40.org/cities), a nonprofit network of the world’s megacities committed to addressing climate change, supported by Bloomberg Philanthropies among other donors, builds capacity within city governments to implement sustainability and climate action programs. Data are at the core of how it approaches suggested initiatives. Participation in global city initiatives around climate change mitigation has led to the creation of an informal network, supported by a technological infrastructure to help member cities with their emissions accounting. For example, cities participating in the C40 network have access to a pre-developed reporting template for data that includes a list of predetermined indicators and how to track them. Of course, the template contains certain implicit judgements by C40 about which data are important, and how to display them. It also means that all cities are treated similarly across
the globe. While it is possible to customize the database and measurement criteria, doing so requires skill, insight, and additional resources. Other nonprofit organizations have offered additional templates for cities to adopt. Such templates help local government agencies to select data for analysis and organize its presentation. Templates and supportive software provide them a common data sharing platform and conventions for collecting and quantifying the data (e.g., similar values across different agencies). These templates are often designed to provide greater transparency. They may include visualization software to display indicators, and allow residents to query and explore local environmental and economic data, such as progress toward policy goals, where funds are spent, and indicators of sustainability. However, the predetermination of city sustainability metrics may discourage users from thinking critically about local conditions, or how to navigate related issues such as energy inequality and environmental racism. Developers of the templates and software systems also implicitly determine data quality thresholds. The need for data of certain quality can cause distortions in accounting if data is lacking, poor, insufficiently granular, or does not conform to the templates provided. The use of data to track city sustainability performance falls into general categories (Kitchin et al. 2015). First, there are benchmarking applications that allow for comparative evaluations to be made between different areas within cities and set up aspirational and competitive approaches to motivate better performance. Second, real-time dashboards summarize and communicate data through common graphic interfaces (like those of the Carbon Neutral Cities or C40). Dashboards may include live streams, providing detailed, up-to-date information about different aspects of urban systems. 
Most city data are also aggregated into indicators, which serve as recurrent quantified measures that can be tracked over time. They form a kind of barometer of how various aspects of a city are doing. These categories are not exhaustive, but seem to predominate overall. Often, indicators are assumed to reflect the city as it actually is, through “objective, trustworthy, factual data that can be collected, statistically analyzed and visualized to reveal patterns and trends” (Kitchin, ibid.: 13). They are considered neutral and rational, providing a commonsense, evidential basis for programs and policies and their evaluation. As such, they shape how cities are perceived, defining what is important and, in some sense, real. The representation of the city, through the indicators,
is shaped by what kinds of data are available, what is measurable, and what is considered important to measure. Indicators are generally presented as unproblematic: free of value judgements, neutral, and factual representations of real things that can be accurately measured. They are, however, contingent on the many factors that shape them, including their consistent availability, the granularity of the data used to construct them, the type of measurement employed, the context in which they arise, and what the data collectors consider indicative of a condition. They are also a reflection of what data various stakeholders consider public or private.
All of these concerns weighed on our minds while we were working with utility and other data to construct the Atlas. The question of how to present energy consumption (per capita? per square foot? per income?) has many valid answers, but each one leads to a different understanding of energy use across a city. It was therefore essential that the Atlas be designed to offer all of those options, and thus allow for critical insights to be drawn from their comparison.
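The effect of the normalization choice can be illustrated with a minimal sketch. The three neighborhoods and all of their figures below are invented for illustration and are not Atlas data, and the definition of "per income" used here (per-capita kWh per $1,000 of median household income) is only one assumed reading of that option.

```python
# All figures are invented for illustration; they are not Atlas data.
neighborhoods = {
    "A": {"kwh": 9_000_000, "population": 3_000, "sqft": 1_500_000, "median_income": 150_000},
    "B": {"kwh": 8_000_000, "population": 4_000, "sqft": 1_000_000, "median_income": 50_000},
    "C": {"kwh": 6_000_000, "population": 1_500, "sqft": 1_200_000, "median_income": 80_000},
}


def normalize(record):
    """Return three normalized views of one neighborhood's annual kWh."""
    per_capita = record["kwh"] / record["population"]
    return {
        "kwh_per_capita": per_capita,
        "kwh_per_sqft": record["kwh"] / record["sqft"],
        # Assumed definition of "per income": per-capita kWh per $1,000
        # of median household income.
        "kwh_per_capita_per_1k_income": per_capita / (record["median_income"] / 1_000),
    }


metrics = {name: normalize(rec) for name, rec in neighborhoods.items()}


def rank(metric):
    """Neighborhood names ordered from highest to lowest on one metric."""
    return sorted(metrics, key=lambda name: metrics[name][metric], reverse=True)


rankings = {metric: rank(metric) for metric in metrics["A"]}
for metric, order in rankings.items():
    print(f"{metric}: {' > '.join(order)}")
```

In this made-up case each normalization produces a different ordering of the same three neighborhoods, which is the reason for presenting the options side by side rather than choosing one.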
Mapping
The use of maps in our daily lives has become ubiquitous. We navigate the world on the basis of maps and understand information through maps. Maps today are predominantly digital objects, enabling multiple layers to be easily combined to reveal or illustrate geospatial relationships. Meteorological data, roads, traffic conditions, population densities, flood zones, voting patterns: so much information is now displayed on and by digital maps that it is hard to even imagine a world in which this did not occur, when maps were flat pieces of paper. Adding a spatial dimension to benchmarks and dashboards, as the Atlas’ front-end website does, is a powerful way to see and understand data, revealing patterns and insights not otherwise readily observable.
Web-based mapping today makes use of geographic information systems (GIS), an extension of cartography that was pioneered largely by geographers. Today, GIS has become a discipline of its own, one in which specialized programming skills are required. A primary GIS component is the base layer, a map that supports other data layers and to which spatial attributes can be attached. Creating maps requires mapping software. Specialized software has been developed to address many aspects of mapping: from the map server itself, spatial database management systems, web applications for accessing data, to specific
languages designed to handle certain kinds of data and applications. Utilizing these different types of software often requires expert knowledge and considerable computational resources. Maps and their underlying assumptions, embedded in the software, portray information according to the epistemological systems [3] used (consciously or unconsciously) to create them. So even when the maps are interrogatable (the user can click through functions to create the knowledge that the user desires), the underlying architecture of the software often remains opaque, as do the assumptions about the world made by the software’s creators. Furthermore, GIS science itself does not tend toward reflexivity, that is, probing and questioning assumptions embedded in software development.
GIS started in the 1960s as computers became capable of storing, collating, and displaying digital maps, and was initially applied to regional planning. One notable development in its history was Design with Nature (1969), Ian McHarg’s integration of geophysical attributes (such as streams, flood zones, infiltration zones, and biodiversity) with land use planning to determine where development would be the least environmentally harmful. Soon after, proprietary GIS software such as ArcGIS became commercially available. However, the cost and subscription-based business models of the companies that create proprietary programs have led other developers to create open-source alternatives. The variety of GIS software makes replication of mapping methods challenging across different users with differing resources. Furthermore, the ever-increasing size of spatial datasets requires additional software tools and technical knowledge, as well as storage capacity, which can create significant costs. GIS has become a complex and sophisticated enterprise.
Public Participation Geographic Information Systems (PPGIS) evolved as a counterpoint to the exclusivity of knowledge production by GIS experts (Sieber 2006).
“From a collection of tools to increase access in official policy circles, PPGIS has metamorphosed into a coproduced concept composed of multiple disciplinary approaches and actors, rapidly changing technologies, and numerous as well as occasionally transgressive goals” (Ibid: 492).
[3] Epistemology is a theory of knowledge, how you know things. In biophysical science, scientists use experiments, measurements, and models to develop knowledge, based on a set of assumptions and/or laws about how the system they are studying works.
PPGIS has been promoted by members of the public and private sectors who believe that access to computer tools and digital
data forms an essential part of an informationally enabled democracy (Sieber 2006). PPGIS is grounded in an assumed connection between the collection, management, and representation of spatial data, and the empowerment of marginalized communities. By extending the use of spatial information to all relevant stakeholders, it is assumed that better policymaking will result (Sieber 2006). GIS is seen as a valuable tool for communities to gain greater insight into the places they live. It has been widely promulgated in planning and environmental justice circles, but its underlying architecture remains a disciplinary domain, difficult to utilize by nonexperts. Most often, PPGIS initiatives involve the community interacting with experts, informing them of what the community would like to see or know. Conducting PPGIS requires the commitment of resources over time for it to be current and relevant, and for participating experts to be open to working with communities. While the Atlas itself is not an example of PPGIS, it has incorporated comments and suggestions from a wide variety of stakeholders. A supplemental component of the Atlas, the Solar Prioritization Tool, has from its inception been developed in cooperation with under-resourced communities and environmental justice organizations. Another approach to data visualization and mapping is thick mapping, pioneered by experts in digital humanities using state-of-the-art data, information, and computing tools. Thick mapping is a concept derived from an ethnographic approach developed by the anthropologist Clifford Geertz. 
Geertz originally identified the process of collecting and integrating sociological and other information to understand aspects of human societies as “thick description.” In the digital humanities, thick mapping is “… the process of collecting, aggregating, and visualizing ever more layers of geographic or place specific data … [to]… embody temporal and historical dynamics through a multiplicity of … practices” (Presner et al. 2014). Mapping based on thick description is motivated by a critical engagement with how places are represented; it recognizes that data reflect epistemological perspectives and attempts to be transparent about the perspective utilized by the researcher, to acknowledge that the mapping exercise is one of interpretation of complex phenomena. This also includes how the software used in mapping was developed and how its use influences interpretation. The goal is to be critically engaged with the material, utilizing expert knowledge and interpretation to inscribe information in maps tied to concrete social issues and events that affect
people’s lives: the political and economic realities that shape the data being represented. Thick mapping attempts to develop additional spatial information that may contextualize data by including historical, cultural, and political events and contingencies. Thick mapping enriches mapping and visualization of data by enabling the user to access multiple attributes to build an independent interpretation of data. Similar to PPGIS in that way, thick mapping integrates interactive functions so the user can pick and choose attribute clusters to query. Somewhat differently from PPGIS, thick mapping may also include attributes that researchers have found to contextualize the data at levels beyond community-based concerns, for example, a changing regulatory regime that may cascade down to a community, or historical legacies that shape a situation. Thick mapping, to date, has been limited to the digital humanities, and the concept has not been much imported into the quantitative social sciences. It informs, however, how we developed the Energy Atlas.
Intentionality and Consequence
One might ask how thick mapping and PPGIS differ from seemingly positivist social science [4] methods that have utilized geographic information systems to map data and information, and to include indicators in maps. Thick mapping is an approach that acknowledges and embraces the role of interpretation in data, making it explicit. It does so by being transparent about the intent of data collection, and the purpose of its interpretation in mapping curated social information (Dwivedi et al. 2018). It recognizes the social system within which data scientists work and knowledge is applied. Search functions, data cleaning and normalization, data matching, aggregation choices and algorithms, decisions about which factors are significant, and many others, all shape both the state of the science of data management and choices made by the researchers. In thick mapping, clearly articulating methods is critical.
[4] Positivism is an approach to knowledge which holds that knowledge is based on observations, interpreted through reason and logic, and can be verified. It forms the basis of the scientific method. Positivism does not tend to acknowledge the role of values, interpretation or situation in the process of knowledge discovery.
The regulatory and information regimes (such as private software developers or philanthropic organizations) that influence and shape information
systems, and therefore knowledge production, are often invisible to the public. Clarity about methodological decisions, such as the choice of analytical software and its application, however, enables users to evaluate those methods, to determine if what is being represented is arbitrary or haphazard, if it is based on established templates, or whether it is generated with an internal coherence and honesty about intent. Thick mapping involves the explicit description of the process of thinking—through the links between types of data and methods and the choice of what gets represented. It involves layering the interaction among different epistemic communities—in this case, data scientists, urban planners, critical geographers, energy systems engineers—and different user communities, including governmental agencies and cities. In thick mapping, this interaction among communities also drives processes to create relevant and actionable maps. This includes ascertaining the best methods to employ (including whether to use open source or proprietary software), determining what is being represented, what is being left out, how useful the information is, as well as reflecting upon the methods themselves in terms of their accuracy and sufficiency. Additionally, it includes acknowledging when researchers are simply being opportunistic about the process due to the limitations of time and funding. In this book, we take the position, articulated and advanced by many others, that data does not exist independently of the ideas, methods, contexts and their prospective use (Kitchin 2014a, b; Kitchin and Lauriault 2015; Kitchin et al. 2015; Lauriault 2012). Data are generated by people who determine what form of information to gather, how to gather it, manage it, and display it. Data are “epistemological units” (Kitchin and Lauriault 2015: 16). They serve as the basis for shaping a world view, they reflect a certain idea of how we know things. 
Thus, data are not sui generis, nor do data exist just ready for collection; they reflect the inherently normative and political process of their generation. That is why templates created by organizations such as C40 deserve to be examined carefully to make sure, if and when they are adopted, that the data collected and presented are the data that the city finds useful and necessary, and that they reflect its priorities and/or conditions. Data creation and utilization can have real consequences and must be engaged in thoughtfully. Data also reflect history, notably what has been collected in the past and the infrastructure used for data collection and reporting. Bowker et al. (2013) explain that data are never raw; they are already cooked. This means they are not free of epistemological framing. However, the recipe behind the data is often black boxed and hard to unpack.
A Roadmap for Transparent, Data-Driven Energy Transitions
In this book we attempt to contribute to a more transparent, socially informed and transformative use of mapping, with an aim to contribute to knowledge for a just energy transition. We describe the many decisions made along the way that required the application of expert knowledge, so that others wishing to create an energy atlas may consider alternatives and strategies different from ours, or accept the logic we applied. Some may call the web atlas a tool for implementing programs. We are happy if it can serve that public function. The process of obtaining and using the building energy billing data led to greater and greater involvement in California’s energy policy processes, and, eventually, in the use of the data we obtained to inform policies related to the state’s energy transition.
Buildings are connected to grids, and grids are connected to power generation. All are subject to rules and regulations, policy initiatives and economic interests. They all, importantly, form one of the critical contexts in which people live, work and play. The cost of energy, thermal comfort, pollution burden, and the environmental impacts of energy systems are all considerations that motivated the Energy Atlas’s creation. It is a window into an evolving and contentious struggle for the future. How will cities be powered? What will buildings look like? What will be the social, environmental, and economic consequences of the choices? All these are at stake and revolve around building energy use.
Bibliography
American Council for an Energy-Efficient Economy. (n.d.). State and Local Policy Database. Retrieved July 1, 2020. https://database.aceee.org/state/data-access.
Bowker, G. C., Brine, K. R., Gruber, Garvey E., Gitelman, L., Jackson, S. J., Jackson, V., et al. (2013). Raw Data Is an Oxymoron. Cambridge: MIT Press.
Boyd, D., & Crawford, K. (2012). Critical questions for big data. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118x.2012.678878.
Burillo, D., Chester, M. V., Pincetl, S., & Fournier, E. (2019). Electricity Infrastructure Vulnerabilities Due to Long-Term Growth and Extreme Heat from Climate Change in Los Angeles County. Energy Policy, 128, 934–953.
City Carbon Footprints. (2019). Global Gridded Model of Carbon Footprints (GGMCF). Accessed March 5, 2019.
Dwivedi, Y. K., Kelly, G., Janssen, M., Rana, N. P., Slade, E. L., & Clement, M. (2018). Social Media: The Good, the Bad, and the Ugly. Information Systems Frontiers, 20, 419–423. https://doi.org/10.1007/s10796-018-9848-5.
Evans, J. Z. (2019). Governing Cities for Sustainability: A Research Agenda and Invitation. Frontiers in Sustainable Cities. https://doi.org/10.3389/frsc.2019.00002.
Fournier, E. D., Federico, F., Porse, E., & Pincetl, S. (2019). Effects of Building Size Growth on Residential Energy Efficiency and Conservation in California. Applied Energy, 240, 446–452.
Hamilton, I., Oreszczyn, T., Summerfield, A., Steadman, P., Elam, S., & Smith, A. (2015). Co-benefits of Energy and Buildings Data: The Case for Supporting Data Access to Achieve a Sustainable Built Environment. Procedia Engineering, 118, 958–968. https://doi.org/10.1016/j.proeng.2015.08.537.
International Energy Agency and the United Nations Environment Programme. (2018). Global Status Report 2018: Towards a Zero-Emission, Efficient and Resilient Buildings and Construction Sector. Katowice, Poland: International Energy Agency and the United Nations Environment Programme. https://www.unenvironment.org/resources/report/global-status-report-2018.
Kitchin, R. (2013). Big Data and Human Geography: Opportunities, Challenges and Risks. Dialogues in Human Geography, 3(3), 262–267.
Kitchin, R. (2014a). The Real-Time City? Big Data and Smart Urbanism. GeoJournal, 79, 1–14. https://doi.org/10.1007/s10708-013-9516-8.
Kitchin, R. (2014b). The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. London: Sage.
Kitchin, R., & Lauriault, T. (2015). Small Data in the Era of Big Data. GeoJournal, 80(4), 463–475.
Kitchin, R., Lauriault, T., & McArdle, G. (2015). Knowing and Governing Cities Through Urban Indicators, City Benchmarking and Real-Time Dashboards. Regional Studies, Regional Science, 2, 1–28.
Kontokosta, C. E., Reina, V. J., & Bonczak, B. (2020). Energy Cost Burdens for Low-Income and Minority Households: Evidence From Energy Benchmarking and Audit Data in Five U.S. Cities. Journal of the American Planning Association, 86(1), 89–105.
Lauriault, T. P. (2012). Data, Infrastructures and Geographical Imaginations (Ph.D. thesis), Carleton University, Ottawa.
Mahone, A., Li, C., Subin, Z., Sontag, M., Mantegna, G., & Karolides, A. (2019). Residential Building Electrification in California: Consumer Economics, Greenhouse Gases and Grid Impacts. www.ethree.com.
Marvin, S., Bulkeley, H., Mai, L., McCormick, K., & Palgan, Y. V. (Eds.). (2018). Urban Living Labs: Experimenting With City Futures. Routledge. https://doi.org/10.4324/9781315230641.
McHarg, I. (1969). Design with Nature. New York: American Museum of Natural History.
Pincetl, S., Chester, M. K., Circella, G., Fraser, A., Mini, C., Murphy, S., et al. (2014). Enabling Future Sustainability Transitions; An Urban Metabolism Approach to Los Angeles. Journal of Industrial Ecology, 18, 871–882.
Pincetl, S., Chester, M. K., Circella, G., Fraser, A., Mini, C., Murphy, S., et al. (2015). Enabling Future Sustainability Transitions; An Urban Metabolism Approach to Los Angeles. Journal of Industrial Ecology, 18, 871–882.
Presner, T., Shepard, D., & Kawano, Y. (2014). Hypercities, Thick Mapping in the Digital Humanities. Cambridge: Harvard University Press.
Sieber, R. (2006). Public Participation Geographic Information Systems: A Literature Review and Framework. Annals of the Association of American Geographers, 96(3), 491–507. https://doi.org/10.1111/j.1467-8306.2006.00702.x.
U.N. Environment and International Energy Agency. (2017). Toward a Zero Emission Efficient and Resilient Building and Construction Sector. Global Status Report.
U.S. Energy Information Administration. https://www.eia.gov/tools/faqs/faq.php?id=86&t=1. Accessed March 5, 2019.
Webster, J. (2015). Data Issues and Promising Practices for Integrated Community Energy Mapping.
CHAPTER 3
Building Energy Data Access and Aggregation Rules
© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_3
Introduction
As California continues its efforts to reduce building energy use, and as local governments assume greater responsibility for reducing greenhouse gas emissions and municipal energy consumption, it has become increasingly obvious that a readily accessible, spatial record of energy consumption is a necessity (Klass et al. 2016). As mentioned previously, localities have historically lacked consumption data, and have thus been unable to effectively design and implement energy efficiency and conservation programs. Prior to 2014, most utilities did not provide cities or counties with the energy consumption data they needed to design and implement such programs. State policy, however, asks municipalities to implement programs that diminish energy use and GHG emissions, and provides subsidies and grants for cities to do so. The Governor’s Office of Planning and Research, which is responsible for drawing up guidelines for cities and counties to follow in creating general plans, also began to require that greenhouse gas reporting and inventories be included in general plan updates. With these new requirements, and the realization that information about their own energy consumption habits could help reduce their expenditures, localities became increasingly frustrated with the lack of access to energy consumption data for their operations and territories.
Simultaneously, UCLA researchers were funded by the state to study and quantify the urban metabolic flows of Los Angeles County. Urban
metabolic accounting involves quantifying the energy, water and resource flows into an urban area and measuring its waste flows out of it (Zhang et al. 2015). Most urban metabolism studies to date are not spatially specific, nor do many include analyses of social equity. Few if any studies examine resource use along both socioeconomic and spatial dimensions— that is, attempting to understand who uses how much of a particular resource, what they do with it, and where they do it. These issues are fundamental to understanding patterns of consumption and their impacts (Zipper et al. 2019). Examining a sociodemographic and spatial breakdown of resource consumption helps to understand economic patterns that give rise to GHG emissions and other forms of pollution. Disaggregating energy consumption across space, time, and sociodemographic categories can reveal a great deal about the material and energy intensities of people’s lives and the differential impacts of energy consumption. Further, it can suggest how to best address those impacts. For example, examining building energy use by age of building, size, and income, can show whether newer or older buildings use more energy per square foot, and the extent to which occupant income is related to energy use per capita. UCLA researchers quickly realized there existed no energy consumption data suitable for an analysis of urban metabolism other than consumption totals aggregated at the city- or county-levels. With no disaggregated data available, it was therefore impossible to measure building energy use by any other category—public, private, building size, year of construction, or other important factors. Previous work with the Los Angeles Department of Water and Power (LADWP) suggested a way forward. To help LADWP reduce water consumption during drought conditions, UCLA researchers were given account-level water consumption data to determine water use by neighborhood. 
This data was provided to UCLA under a non-disclosure agreement. This consumption data was mapped onto the different neighborhoods of Los Angeles. Census and tax assessor’s data, including parcel size, building size and vintage were then added to the neighborhood water consumption map. Significant differences in consumption between neighborhoods soon became apparent, as did correlations between consumption and income, parcel size, and climate zone. The work with LADWP not only demonstrated that matching disaggregated consumption data to other attributes could be an enormously powerful way to explore the dynamics of resource consumption, but that a similar spatial database could be
built for energy consumption data as well. Subsequently, LADWP also provided researchers with building energy use data, so that a similar analysis could be performed for energy consumption.
As briefly described earlier, in 2012 the Governor’s Office of Planning and Research, local governments, and researchers from UCLA petitioned the Public Utilities Commission to hold a set of proceedings on access to energy data. The proceeding was conducted by an Administrative Law Judge and involved numerous experts presenting different arguments about whether certain bodies and institutions should have access to building energy consumption data. Since the Public Utilities Commission regulates investor-owned utilities in the state, its decision is binding on those utilities. After more than two years of deliberation, the California Public Utilities Commission (CPUC) agreed with the petitioners, and allowed greater access to utility data. Permission was provided under Administrative Law Judge Sullivan’s ruling of May 2014: D. 14-05-016. Decision Adopting Rules to Provide Access to Energy Usage and Usage-Related Data While Protecting Privacy of Personal Data (referred to here as the 2014 Decision) (CPUC 2014).
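The matching step described above, in which account-level consumption is linked to parcel attributes and then summarized by neighborhood, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used to build the Atlas: the field names, parcel identifiers, and all values are hypothetical, and real address matching is considerably messier than a dictionary lookup.

```python
from collections import defaultdict

# Hypothetical assessor records keyed by an invented parcel id.
parcels = {
    "p1": {"neighborhood": "North", "sqft": 2000, "year_built": 1950},
    "p2": {"neighborhood": "North", "sqft": 3000, "year_built": 2005},
    "p3": {"neighborhood": "South", "sqft": 1000, "year_built": 1975},
}

# Hypothetical account-level annual consumption records.
accounts = [
    {"account": "a1", "parcel": "p1", "kwh": 8000},
    {"account": "a2", "parcel": "p2", "kwh": 9000},
    {"account": "a3", "parcel": "p3", "kwh": 5000},
    {"account": "a4", "parcel": "p3", "kwh": 1000},  # second account on one parcel
    {"account": "a5", "parcel": "p9", "kwh": 700},   # no matching parcel record
]


def summarize_by_neighborhood(accounts, parcels):
    """Join accounts to parcels, then total kWh and floor area per neighborhood."""
    kwh = defaultdict(float)
    matched_parcels = defaultdict(set)
    unmatched = []
    for acct in accounts:
        parcel = parcels.get(acct["parcel"])
        if parcel is None:
            # Keep track of records that cannot be matched instead of dropping them silently.
            unmatched.append(acct["account"])
            continue
        hood = parcel["neighborhood"]
        kwh[hood] += acct["kwh"]
        matched_parcels[hood].add(acct["parcel"])
    summary = {}
    for hood, total in kwh.items():
        # Count each parcel's floor area once, even if it has several accounts.
        sqft = sum(parcels[p]["sqft"] for p in matched_parcels[hood])
        summary[hood] = {"kwh": total, "sqft": sqft, "kwh_per_sqft": total / sqft}
    return summary, unmatched


summary, unmatched = summarize_by_neighborhood(accounts, parcels)
print(summary)
print("unmatched accounts:", unmatched)
```

Reporting the unmatched accounts alongside the summary is a deliberate choice in this sketch: how many records fail to match, and why, is itself one of the methodological decisions that shapes the resulting map.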
California’s 2014 Data Access Decision
The debate over access to address-level billing data had a rocky start in the 1990s, during one of the state’s recurring droughts. During the drought, the press published individual water billing records in Silicon Valley, revealing that very wealthy individuals in the tech industry had not reduced their water consumption and were watering large, lush landscapes. Embarrassed by the press coverage, wealthy Silicon Valley residents put pressure on their state legislator, who then authored a bill that made such utility data proprietary (California Senate Rules Committee 1997). The bill covered all utility consumption data, including electricity and natural gas consumption. Clearly, high consumers wanted their consumption data kept private. However, in the early 2000s, as California began enacting ever more aggressive policies to reduce greenhouse gas emissions, local governments and others turned to the Public Utilities Commission for greater transparency in the electricity system. The original proceeding, which began in 2008, was entitled “Order Instituting Rulemaking to Consider Smart Grid Technologies Pursuant to Federal Legislation and on the
Commission’s own Motion to Actively Guide Policy in California’s Development of a Smart Grid System.” The proceeding was expanded in scope in Aug–Nov, 2012 to address issues related to an Energy Data Center proposal put forth by CPUC. This proposal was presented as a briefing paper entitled “Energy Data Center” to the PUC President, in September 2012. This was based in part on a similar proposal put together by university researchers at UCLA. Comments were received on the Energy Data Center proposal starting in December 2012, and the CPUC held related workshops and hearings in January 2013 which addressed use cases, definitions, data aggregation, and model NDAs. It was anticipated that a resolution on these issues might lessen the immediate need for the creation of a data center, a possibility that the utilities strongly and vociferously opposed. It is at these meetings that UCLA presented its prototype Energy Atlas that was based on address-level LADWP electricity consumption data. Following the workshops, a CPUC administrative law judge issued a ruling entitled: Setting Schedule to Establish “Data Use Cases,” Timelines for Provision of Data, and Model Non-Disclosure Agreements, February 27, 2013. This use case ruling proposed eight use cases, a model non-disclosure agreement, definitions that could be used to determine whether certain types of usage information were subject to privacy protections, and the degree of data aggregation necessary to prevent reidentification of utility customers based on their energy consumption. It also set up a working group to develop refinements to the use cases, definitions, and NDAs. A Working Group Report was issued on July 10, 2013, and numerous parties filed comments on the Report between July 29 and Sept 11, 2013. The final Decision was issued in May 2014. 
It included a summary of the various parties’ written comments, workshop discussions, and contained a final determination on data access for various use cases, definitions of terms, and a model non-disclosure agreement. The final decision allowed universities and other nonprofit educational institutions to request account-level energy consumption data from investor-owned utilities under non-disclosure agreements. The 2014 Decision also allowed energy consumption data collected by investor-owned utilities to be used for a number of other use cases, for example:
• Local governments requesting pre-aggregated energy data for their jurisdictions for local energy planning purposes, greenhouse gas emissions inventories, etc.
• The California Department of Community Services and Development (CDCSD), a state agency, requesting building-level energy use and weatherization data to support its state and federal legislative mandates.
The decision also established the Energy Data Access Committee (EDAC), an informal and non-adjudicatory panel composed of uncompensated representatives from each of the utilities, Commission Staff, the ORA, the CEC, voluntary representatives of customer and privacy advocacy groups, university researchers who meet the qualifications outlined in the decision, and any other interested parties, for two years. The goal of the EDAC was to create a forum in which to discuss data access implementation issues and informally mediate any disputes that arise. Since it had no formal powers, the EDAC could issue multiple diverging recommendations on any petition brought to it, and any of the participants was entitled to go directly through the CPUC’s petition process to resolve a data access dispute. The EDAC was required to be established within six months of the May 2014 Decision, to meet quarterly for two years, and then to meet as needed thereafter. There have been no meetings of the EDAC since 2017.
Investor-owned utilities are required to publicly disclose zip code level energy consumption data on a quarterly basis, categorized by four customer segments: Residential, Commercial, Industrial, and Agricultural. However, zip code level consumption data may not be reported by utilities unless privacy aggregation thresholds, also stipulated in the Decision, are met. As we mentioned previously, the aggregation of data is not pure science; it involves judgement and discernment. 
Data aggregation may be done in such a way as to generate a range of results; it can reflect the biases of the analyst doing the aggregating, as well as the scope and purpose of the analysis. Thus, the application of aggregation rules can vary among parties, relative to their goodwill and commitment to transparency. Address level consumption data can be requested by university researchers through the Energy Data Request Process. As we will discuss further, the IOUs determine whether a given data request is valid, and each utility has its own validity criteria.
University Privacy Aggregation Rules for Public Disclosure
While the CPUC ruled in the 2014 Decision that universities can obtain account-level data, any data disclosed to the public by the university must follow strict privacy aggregation rules:
• Residential customer groups must comprise at least 100 accounts in order for consumption totals to be disclosed;
• For non-residential customers (commercial, industrial, etc.), groups must consist of at least 15 accounts, and no single account can represent more than 15% of the group’s total consumption in order for consumption totals to be disclosed.
The effects of the Decision’s privacy aggregation rules, or “masking” rules, are discussed at length in Chapter 5. Account-level data is usually provided at a monthly timestep, consistent with billing protocols, but researchers are also able to request interval data (hourly or 15-min) where such data is available through advanced metering infrastructure.
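The two aggregation thresholds above can be expressed as a simple check. The following is a minimal sketch, not the Energy Atlas implementation: the function name, the constants, and the treatment of groups with zero total consumption are assumptions made for illustration.

```python
from typing import Sequence

# Thresholds as stated in the text; the constant names are hypothetical.
RESIDENTIAL_MIN_ACCOUNTS = 100
NONRES_MIN_ACCOUNTS = 15
NONRES_MAX_SHARE = 0.15  # no single account above 15% of the group total


def can_disclose(account_totals: Sequence[float], residential: bool) -> bool:
    """Return True if a group's consumption total may be published.

    account_totals holds one consumption total per account in the group.
    """
    n = len(account_totals)
    if residential:
        return n >= RESIDENTIAL_MIN_ACCOUNTS
    if n < NONRES_MIN_ACCOUNTS:
        return False
    total = sum(account_totals)
    if total <= 0:
        # Assumption: with no positive consumption the share test is
        # undefined, so the group is treated as not disclosable.
        return False
    return max(account_totals) / total <= NONRES_MAX_SHARE
```

For example, a group of 20 commercial accounts with equal consumption passes (each account is 5% of the total), while the same group with one dominant account would be masked.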
Data Access in Practice
Data access in practice has proven unpredictable: whether to fill a particular data request is the prerogative of the IOUs (each with its own institutional culture), and the Decision’s provisions for creating request protocols and resolving disputes between parties have proven ineffective. The EDAC, established to help create common, consistent procedures for data access, had failed to serve its intended role at the time of this writing, and currently the only established process by which issues related to the Decision may be addressed is the CPUC’s formal legal complaint process, which can take up to a year to reach a resolution. Furthermore, the structure of the complaint process favors the utilities. Complainants must engage in lengthy and potentially expensive arbitration processes with IOUs and their legal teams, and very few likely complainants can match the IOUs in terms of legal capacity. CCSC researchers participated in the EDAC, but found that it was mostly a forum for discussion, and that no substantive decisions or recommendations were rendered by the Committee. The manner in which the IOUs respond to data requests and determine their validity continues to be problematic. UCLA has
found that the investor-owned utilities interpret data requests differently from one another, but generally require that the data be used for funded research. That is, data requests will not be fulfilled for a researcher who simply wants the data to write a paper with no evident funding stream. The utilities also require significant data security assurances, in addition to other provisions, but again, no consistent protocol exists for ensuring secure storage. Without the PUC to hold them accountable, utilities are effectively determining which data requests to fulfill on an ad hoc basis. For example, a data request from CCSC to PG&E was deemed by the utility not to conform to the academic use case, despite clearly meeting all the requirements listed in the 2014 Decision; the utility provided no details to support its determination. Although CCSC was eventually able to receive the data directly from the CPUC, PG&E continued to create roadblocks to the use of the data. These included requirements for CCSC to sign an additional NDA directly with PG&E (although it provided no data to us), as well as additional data security reviews. UCLA’s experience with requesting energy consumption data from utilities bears out the regulatory flaws and oversights that have enabled the utilities to determine who gets data for what purpose. As a result, UCLA found requesting data from the PUC itself, a public entity, to be a more promising strategy. While the PUC does not use the data for analyses such as those conducted by UCLA, it has the authority to obtain billing data, and it exercises that authority. This is generally utilized by consultants to the PUC to verify utility expenditures in the realm of energy efficiency programs. The university was able to obtain monthly electricity and natural gas consumption data to populate the Energy Atlas by requesting data directly from the CPUC. 
The NDA governing the university’s use of the data is also between the CPUC and UCLA, rather than with the utilities themselves. The relationship established between UCLA and the CPUC during the proceedings also helped expedite the data requests, again showing how personal relationships, trust and good faith, as well as compatible goals, remain essential in these endeavours. UCLA’s data requests were processed by the CPUC smoothly for more than four years—until 2019—when some members of the CPUC staff responsible for serving the data requests were replaced over the course of 6 months. The staff turnover resulted in a series of delays in processing a data request for a funded project in the California Bay Area, nearly causing the project to fail. Successful collaboration between institutions requires
that participants have a shared understanding of the needs and objectives of the project. This shared understanding can be time-consuming and expensive to establish, and the replacement of key staff during the course of a project inevitably increases transaction costs. In this instance, the new staff member had little knowledge of UCLA and was possessed of a different epistemological view regarding the use of energy consumption data. From this new person’s perspective, the Energy Atlas initiative seemed reframable as a smart and automated systems data problem. The complexity of the process, the validation challenges and more, eluded the perception of this individual, trained within a different structure of knowledge and motivation. There was and continues to be no established pathway to appeal or discuss the decision of a staff person relative to energy data access. This incident shows the fragility of arrangements predicated on personal relationships, and the need for these processes to be governed by enforceable and transparent policies. These events, among others that have been experienced by the public interest community at large, leave in question the enforcement of the 2014 Decision and the interpretation of its contents. The interactions between UCLA, the CPUC, and the IOUs show how tenuous access to data can be, even when access is recognized as essential for successful policy implementation. It points to the continued reluctance of utilities to change their business models despite their inadequacy with regard to the policy mandates of the state and the needs of local governments. Finally, it demonstrates the utilities’ fundamental wariness of energy policy research. It is interesting to note that during the 2019 summer fires PG&E shut down a vast portion of its grid to avoid further catastrophe. 
It had no ability to determine which parts of its grid were most vulnerable to the threat of fire, and had no insight into how the shutdowns would affect end users, even important ones like hospitals or fire stations. Not only have the IOUs resisted providing granular data to local governments and researchers, and sought to limit the spatial matching of account-level consumption to geospatial data, but they have not developed the capacity to do so themselves, leaving their customers yet more vulnerable to disruptions. In contrast, additional data needed for specific project work, such as California Energy Commission grants, were successfully requested from the applicable IOUs in Southern California: Southern California Edison (SCE) and Southern California Gas (SCG). Examples of these additional
data requests (at the account level) include multiple years of energy efficiency program participation data for LA County, requested from SCE, and one year of hourly natural gas use for two zip codes, requested from SCG. While these data requests proceeded smoothly, it was noted that the utilities’ data request forms include a required field for the grant award or contract number, whereas the 2014 Decision makes no stipulation that data requests be associated with a specific award. Such an interpretation by the utilities limits universities’ scientific freedom.
The 2014 Decision requires investor-owned utilities to maintain a log of data requests and provide access to those logs on the utilities’ websites. Two utility websites were reviewed for this discussion.
• Southern California Edison (SCE): The data request log maintained by SCE1 is provided as a downloadable Excel file. A total of 169 requests were received through October 25, 2019. Of these, 70% were from local governments (cities), 27% were from academic researchers, and 2% were recurring requests from a single state agency (the CDCSD). Seventy-three per cent of requests were categorized as “completed” or “in process”; only two of the remaining 25% were categorized as “denied.” Since 2017, all unfulfilled requests in the SCE log are shown as “cancelled,” which covers multiple reasons, including cases where researchers could not provide a signed NDA, where SCE determined that the request was not covered under the data access program, and where the requester cancelled the request themselves. Across the 123 completed requests within the SCE log, the average time from a completed request to data provision was 45 days, with some more complicated requests (such as for interval data) taking up to 166 days.
• Pacific Gas & Electric (PG&E): The data request log maintained by PG&E2 is provided only as a series of web-viewable PDFs, which are not organized by date (nor in any other discernible order). Furthermore, because many of the data requests included only the date completed, but not the date requested, it was impossible to evaluate the average time from request to data provision. Of the 117 data requests listed in the log as of November 6, 2019, 65% were from local governments (cities), 29% were from academic researchers, and 6% were from state agencies or community organizations. The UCLA request was noted as turned down with no explanation.
This shows, again, the lack of consistency across the utilities, one of the issues that was to be addressed by the EDAC, and a lack of oversight and enforcement by the Public Utilities Commission.
1 The Southern California Edison data request log is located at: https://www.sce.com/regulatory/energy-data---reports-and-compliances.
2 The Pacific Gas & Electric data request log is located at: https://pge-energydatarequest.com.
Summary
Access to consumption data remains contested. Despite widespread recognition of the need for granular data matched to relevant attributes that can inform policy implementation and evaluation, incumbent utilities remain hostile to providing data, and resistant to utilizing the data themselves for policy change, as exemplified by PG&E’s response to the fire season of 2019. The IOUs’ hostility to change is due in part to the path-dependent nature of energy distribution systems. Consequential technological choices made early in energy system development, and the pursuit of economies of scale by utilities, have resulted in systems that are profitable to operate in their current configurations but expensive to change (Liebowitz and Margolis 1995). There is a path-dependent legacy and lag wherein new utility business models are slow to develop and alter utility use of consumption data, and often utilities will only change under regulatory pressure or mandates (David 1987).
There is no question that access to billing data, spatially matched to relevant attributes, could open up a new world of programs and strategies to implement existing ones. For example, data could be used to target energy efficiency programs so they reach consumers who would benefit most from participating (though, of course, more effective energy efficiency programs would reduce consumption, and thus sales). Paradoxically, since the state legislature deregulated electric utilities in 2000 and forced them to sell most of their generation assets, IOUs are mostly in the business of transmitting electricity (Cudahy 2002). That is, their business is to purchase and distribute power. It appears they have not successfully devised a business model based on transmission alone; volumetric sales continue to be the operative basis for generating revenue. Thus the weight of the past seems to constrain innovation, including the more sophisticated use of their own data, as well as entrenching their resistance to anyone else doing so.
Under the administration of Governor Brown, the Public Utilities Commission was acutely concerned with the lack of data availability and transparency. With the change in leadership at the CPUC (which is appointed by the Governor), and staff changes, that level of commitment and awareness seems to have dissipated. Despite the success of the UCLA Atlas, the ability to integrate new energy consumption data across new geographies rests on a fragile basis: the commitment of a few staff members at the PUC and favorable interpretation of data requests by the utilities.
Bibliography
California Public Utilities Commission (CPUC). (2014). Decision [D.14-05-016]. Adopting Rules to Provide Access to Energy Usage and Usage-Related Data While Protecting Privacy of Personal Data.
California Senate Rules Committee. (1997, June 24). Hearing on SB 448. Retrieved July 2, 2020. http://www.leginfo.ca.gov/pub/97-98/bill/sen/sb_0401-0450/sb_448_cfa_19970730_114159_sen_floor.html.
Cudahy, R. D. (2002). Electric Deregulation After California: Down but Not Out. Recent Developments—Deregulation of Energy: Lessons from the West. Administrative Law Review, 54, 333.
David, P. A. (1987). Some New Standards for the Economics of Standardization in the Information Age. In Economic Policy and Technological Performance (pp. 206–239). Cambridge: Cambridge University Press.
Klass, A. B., & Wilson, E. J. (2016). Remaking Energy: The Critical Role of Energy Consumption Data. California Law Review, 104(5), 1095–1158. https://doi.org/10.15779/Z38B55F.
Liebowitz, S. J., & Margolis, S. E. (1995). Path Dependence, Lock-in, and History. Journal of Law, Economics, & Organization, 11, 205–226.
Zhang, Y., Yang, Z., & Yu, X. (2015). Urban Metabolism: A Review of Current Knowledge and Directions for Future Study. https://doi.org/10.1021/acs.est.5b03060.
Zipper, S. C., Whitney, K. C., Deines, J. M., Befus, K. M., Bhatia, U., Albers, S. J., et al. (2019). Balancing Open Science and Data Privacy in the Water Sciences. Water Resources Research, 55(7), 5202–5211. https://doi.org/10.1029/2019WR025080.
CHAPTER 4
Building an Energy Atlas
The UCLA Energy Atlas is a constantly evolving database of building energy consumption that links utility account information to building characteristics, sociodemographic data, and other significant attributes that can be expressed spatially. Perhaps one of the most important aspects of this tool is that it can be continually updated, expanded, and improved upon. The ability to associate energy consumption with spatial data layers and contextual information allows users to ask questions about how energy is being used. For example, what types of buildings in Los Angeles have the highest energy intensity per unit area? How does energy consumption vary spatially, by population density, by income level, or by industry? How does energy consumption compare between single-family homes in coastal Santa Monica and the high desert of Palmdale? These types of fundamental questions are impossible to answer without a spatial database that merges these datasets together.
This chapter will describe the essential data building blocks used to create the UCLA Energy Atlas. We will explore how these datasets are linked together to form a queryable, spatially aware database. We will also discuss data and process limitations that have significant impacts on how this tool can be used. While the details of this chapter are specific to the UCLA Atlas, we share these guidelines in the hope that others may use, and perhaps improve upon, our methods when building similar tools to serve their needs.
© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_4
Database Overview
The Energy Atlas is the product of two separate and distinct databases. The public portion of the Energy Atlas is a front-end website that displays spatially aggregated energy consumption statistics at an annual temporal resolution for most neighborhoods, cities, and counties in Southern California. To develop this public-facing website, researchers collected, processed, and analyzed energy and related data from a variety of sources. The website features interactive energy maps, comparative graphs, and tabular views of community energy profile data. The aggregated information presented on the public front-end website is derived from a separate, confidential, back-end geospatial relational database, which contains almost five billion unique records. These records relate the private, account-level monthly energy consumption data to building characteristics and census information.
• Back-end PostgreSQL database (protected to comply with regulatory requirements): This database is an internal geospatial relational database that contains address-level energy billing data. The PostgreSQL database also contains county parcel data, census data, administrative boundaries, and other relevant data used to analyze and map energy consumption across Southern California. Due to privacy constraints, this database is only accessible to qualifying CCSC researchers under binding non-disclosure agreements.
• Front-end database (Public): The Energy Atlas website is based on separate aggregated and privacy-protected data tables that are exported as a set of query outputs from the back-end PostgreSQL database. Once it has been verified that the aggregated data contained in these tables pass privacy standards, they are made accessible to the public to download on this site. The data tables displayed on the public site are hosted on a separate cloud server (Fig. 4.1). 
Both the back-end and front-end databases are a product of collecting, normalizing, and combining disparate datasets. The following sections describe the key datasets captured by the Energy Atlas.
Fig. 4.1 UCLA Energy Atlas database overview [Diagram labels. Input datasets: Utility Billing Data; County Assessor Parcels; Census/Community Survey; Administrative Boundaries; EE Program Data; Grid Capacity; Weather/Temperature Data; Climate Zones; Solar Generation Capacity; CalEnviroScreen Scores. Processing steps: Pre-processing; Standardization; Geocoding; DB Planning; Aggregation; Statistical Analysis; Privacy Controls. Confidential PostgreSQL DB: relational database organizes account-level energy consumption and spatial relationships. Public Database: stores all aggregated data displayed on website.]
Utility Data
Energy consumption data, collected by utilities for billing purposes, is the most essential component of the Atlas. Table 4.1 summarizes details of the utility data contained in the back-end database. Because the first version of the Atlas only covered Los Angeles County, we have the longest period of record for this area, starting in 2006. Data starts in 2011 for additional counties included in version 2.0: Orange, Imperial, Riverside, San Bernardino, and Ventura. Southern California is served by three Investor-Owned Utilities (IOUs): Southern California Edison (SCE) for electricity, Southern California Gas (SCG) for natural gas, and San Diego Gas & Electric (SDG&E). In the current version of the Atlas, SDG&E’s territory is not included due to funding constraints. Data for SCE and SCG territories were obtained by UCLA through non-disclosure agreements with the California Public Utilities Commission (CPUC). In addition, there are 22 Publicly Owned Utilities (POUs) serving the Southern California region; however, data for only three POUs are included in the Atlas (and only for a limited number of years), again due to funding constraints. For POUs that are included in the Energy Atlas, data sharing agreements were made directly with them. In the case of Long Beach Gas and Oil (LBG&O), data was only available at the zip-code level, rather than at the account level, and therefore gas consumption statistics disaggregated by building attributes are not available for that city (Figs. 4.2 and 4.3).
Table 4.1 Utilities included in the Energy Atlas and years of data available
Utility | Data years available
Southern California Edison (SCE) | LA County: 2006–2010; Orange, Imperial, Riverside, San Bernardino, and Ventura Counties: 2011–2016
Southern California Gas (SCG) | LA County: 2006–2007, 2009–2010, 2011–2016; Orange, Imperial, Riverside, San Bernardino, and Ventura Counties: 2011–2016
Los Angeles Department of Water and Power (LADWP) | 2006–2016
Glendale Water and Power (GW&P) | 2006–2010
Burbank Water and Power (BWP) | 2006–2010
Long Beach Gas and Oil (LBG&O) | 2010 (Zip-Code Level)
Fig. 4.2 Map of gas utility coverage in the Energy Atlas
Fig. 4.3 Map of electric utility coverage in the Energy Atlas
Parcels and Buildings
Building information, including use type, year of construction (year built), and square-footage values, is derived from multiple county parcel dataset sources. Because parcel data is collected and maintained by individual counties for the purpose of assessing property taxes, these datasets can differ from each other in important ways. First, across datasets, the categories of land use types can vary significantly (see Table 4.2). The Southern California Association of Governments (SCAG)1 created use code-standardized 2016 parcel data for Los Angeles, Orange,
1 SCAG is a Council of Governments that encompasses the six-county area around and including Los Angeles County, including Ventura, Orange, San Bernardino, Riverside, and Imperial Counties. Each county and its constituent cities have representatives on the SCAG governing board. SCAG makes transportation investments and coordinates housing policy, though it has no legislative power. Councils of Government exist throughout California and the United States and were created to coordinate transportation systems and funding to improve mobility.
Table 4.2 Building use type categories and descriptions
Building use type category | Description
Single family | Single-family homes
Multifamily | Duplexes to large apartment complexes
Condominiums | Condominium designated parcels
Residential other | Mobile home parks, manufactured homes, nursing homes, rural residential, and unknown other residential use codes that do not clearly fit within single family, multifamily, or condominium categories
Residential uncategorized | Consumption that was categorized as residential from the utility designation, but unable to be linked to its parcel and thus unable to be categorized by parcel use type. Without parcel linkage, this consumption lacks square-footage and building vintage information
Residential (total) | Sum of all residential categories
Commercial | Office buildings, hotels, retail, restaurants, mixed-use commercial, etc.
Industrial | Manufacturing, warehouses, processing facilities, extraction sites, etc.
Institutional | Schools, public hospitals, government-owned facilities, churches, tax-exempt properties, etc.
Agriculture | Farms, agricultural lands, orchards, etc.
Other | Spans diverse range of use types unable to fit within the preset categories, including miscellaneous bus terminals, vacant land, reservoirs, truck terminals, rights-of-way, etc.
Uncategorized | Consumption that was categorized as non-residential from the utility designation, but unable to be linked to its parcel and lacking NAICS/SIC codes and thus unable to be categorized by parcel use type. Without parcel linkage, this consumption lacks square-footage and building vintage information
All (total) | Sum of all individual use types
Ventura, San Bernardino, Riverside, and Imperial Counties, and generously provided it for use in the Energy Atlas. SCAG’s standardized use types are used to generate the building use type categories found in the Atlas. Second, attribute data associated with parcels also varies widely as different counties collect different sets of attributes. Los Angeles County collects building square footage and year built (vintage). Unfortunately, information on building square footage or year built is not provided by
Imperial County, nor is it provided for non-residential parcels in Riverside County. This means that analysis on the basis of these attributes is unavailable for these areas. Inconsistencies within a single parcel dataset are also common. For example, parcels may contain multiple buildings and land use types. They may also represent protected areas or public spaces. Within Los Angeles County, for example, building attribute information is recorded for up to five buildings per parcel. Cases where more than five buildings exist on a parcel are handled at the Assessor’s discretion; sometimes buildings are omitted, while in other instances their attributes may be combined with those of other recorded buildings. In cases where there are multiple buildings on a parcel, the Energy Atlas sums up the consumption for all buildings. UCLA, for the most part, does not have the capacity to comprehensively validate parcel data, so any inaccuracies in the parcel data itself will be present in the Atlas results. However, we have been able to correct some obviously erroneous parcel classifications.
Megaparcels
It is common for an individual building to host multiple ownership parcels, such as in the case of condominiums or individually owned spaces within a commercial office building. The solution developed to address this issue in the Energy Atlas is to aggregate the attributes of any such “stacked parcels” on the basis of shared parcel geometries. For example, a condominium tower with 300 privately owned units would typically have 300 unique parcel records, one for each unit, all sharing the same geographic boundary, rather than a single parcel boundary summarizing the entire building’s information. The identical geometries pose a methodological challenge in linking addresses to their specific parcel attributes, because the relationship from service address to parcel is determined by spatial intersection. 
Researchers solved this issue by aggregating these types of individual parcels into single unified geometries called megaparcels, a derived spatial entity with summed square footage and concatenated use types, building designs, vintages, and identification numbers.
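The megaparcel idea can be illustrated with a short sketch that groups parcel records sharing an identical geometry and combines their attributes. This is an illustration under stated assumptions, not the Atlas code: the field names (`parcel_id`, `geometry_wkt`, `use_type`, `year_built`, `sqft`) are hypothetical, geometries are compared as exact text strings, and use types and vintages are de-duplicated rather than simply concatenated. A production workflow would more likely test geometric equality inside the spatial database.

```python
from collections import defaultdict
from typing import Dict, List


def build_megaparcels(parcels: List[dict]) -> List[dict]:
    """Merge parcel records that share an identical geometry.

    Each input record is a dict with hypothetical keys: parcel_id,
    geometry_wkt, use_type, year_built (may be None), sqft (may be None).
    """
    groups: Dict[str, List[dict]] = defaultdict(list)
    for parcel in parcels:
        # Assumption: identical boundaries have identical WKT text.
        groups[parcel["geometry_wkt"]].append(parcel)

    megaparcels = []
    for wkt, members in groups.items():
        megaparcels.append({
            "geometry_wkt": wkt,
            "parcel_ids": [m["parcel_id"] for m in members],
            "use_types": sorted({m["use_type"] for m in members}),
            "vintages": sorted({m["year_built"] for m in members
                                if m["year_built"] is not None}),
            # Missing square footage is treated as zero in the sum.
            "sqft": sum(m["sqft"] or 0 for m in members),
        })
    return megaparcels
```

With this structure, a service address that intersects the shared boundary links to one megaparcel record instead of to hundreds of stacked unit parcels.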
Administrative Boundaries
The Energy Atlas database contains a number of administrative boundary layers, which allow energy consumption statistics to be calculated and displayed at four primary reference geographies within Southern California: neighborhoods (in LA County only, possible because of high population numbers), cities, councils of government (COGs, also in LA County only, again a result of high numbers of residents), and counties. County boundaries for Southern California are based on a statewide county shapefile from the US Census. City boundaries in Los Angeles County were obtained from the Los Angeles County GIS Data Portal; this source data includes 88 municipalities and nearly 200 unnamed unincorporated areas in the County. Researchers aggregated the county’s unincorporated areas into one geographic area. Because the City of Los Angeles covers such a large area, neighborhood boundaries from The Los Angeles Times were used to provide better geographic differentiation within the city. CCSC internally developed the Councils of Government boundaries using the city boundary shapefile. Though they are not used for front-end display, the back-end database also includes zip-code boundaries, climate zones designated by the CEC, and Census tract and block group boundaries. These layers are used for research purposes, and could be added to the front-end display if Atlas users wanted them.
US Census
Researchers use data from the US Census and the American Community Survey (ACS) to study relationships between energy consumption, demographic characteristics, and income levels. The Energy Atlas includes socioeconomic information from the ACS 2006–2010 and 2010–2014 surveys at the block group level. Consumption data is matched to ACS estimates based on the corresponding year of consumption. 
Block groups were chosen as the unit of analysis because they are the most granular delineation available that includes detailed housing information, such as income estimates and whether a residence is owner or renter occupied. Researchers exported relevant tables using Social Explorer Professional Edition, a robust web interface to Census materials made available through a UCLA professional subscription. All Census information in the Energy Atlas (population, median household income, etc.) is aggregated from block group level statistics to the
administrative boundaries displayed on the site. Population estimates are calculated from the block group values. However, block group boundaries do not align perfectly with the neighborhood, city, and COG boundaries presented in the front-end website. In order to provide the best population estimates for each of the reference geographies, we used a two-step process. First, block groups whose boundaries are completely contained within a geography are assigned to the neighborhood, city, and COG that they are within. Second, for block groups that straddle the boundaries of multiple cities or neighborhoods, we divide the population in proportion to the area within each geography and assign totals based on these proportional distributions. A limitation of this process is the assumption that population is evenly distributed throughout a block group. Renter/owner proportions are assigned using the same method. Finally, the median household income for each geography is derived from the median of all block groups that are wholly within it or intersect it. This process shows that in all data representation, decisions must be made regarding the ways in which data is collected. Data is usually collected for one purpose, and when it is then matched with other data, there may be discrepancies between boundaries or units of measurement. The important thing is to be transparent about these decisions, which allows the user to understand the limitations or constraints of the data.
CalEnviroScreen
CalEnviroScreen (CES)2 is a dataset that ranks every census tract in California based on environmental pollution and socioeconomic health indicators. The data is provided by the California Office of Environmental Health Hazard Assessment (OEHHA) and is largely used to identify disadvantaged communities across the state and prioritize funding and opportunities for those areas. Each census tract is given a percentile score with the most disadvantaged tracts scoring in the 75th–100th percentile. 
The Energy Atlas incorporates CES data in the back-end database as a means to analyze energy consumption and environmental justice. On the front-end website, residential energy consumption totals by CES score quartiles are available for each geography.
2 https://oehha.ca.gov/calenviroscreen/report/calenviroscreen-30.
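As an illustration, the area-proportional allocation of block group populations described earlier in this section can be sketched in a few lines of Python. This is a minimal sketch and not the Atlas implementation: the function name, the input layout, and the sample values are all hypothetical, and the overlap areas are assumed to come from a prior spatial intersection.

```python
from collections import defaultdict


def apportion_population(populations, overlaps):
    """Split block group populations across geographies by overlapping area.

    populations: {block_group_id: population}
    overlaps: list of (block_group_id, geography_id, overlap_area) rows
              (hypothetical layout, e.g. the output of a spatial intersection).
    """
    # Total overlapping area per block group, used as the denominator.
    area_by_block_group = defaultdict(float)
    for bg_id, _geo_id, area in overlaps:
        area_by_block_group[bg_id] += area

    # A block group wholly inside one geography has a single row and a share of 1.
    totals = defaultdict(float)
    for bg_id, geo_id, area in overlaps:
        share = area / area_by_block_group[bg_id]
        totals[geo_id] += populations[bg_id] * share
    return dict(totals)


# Hypothetical example: bg1 lies wholly in City A; bg2 is split 25% / 75%.
populations = {"bg1": 1000, "bg2": 600}
overlaps = [("bg1", "City A", 2.0), ("bg2", "City A", 1.0), ("bg2", "City B", 3.0)]
estimates = apportion_population(populations, overlaps)
```

The sketch inherits the limitation noted above: it treats population as evenly distributed within each block group.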
S. PINCETL ET AL.
Data Preparation
The preparation of the data presented in the Atlas is a time-consuming and resource-intensive process. Utility data, parcel data, census data, and additional relevant datasets must be collected, standardized, mapped and linked spatially, and finally organized into a database capable of processing billions of records. Once the database is complete, aggregated statistics for each geography are queried, masked for privacy, and exported to the online host that generates all maps, tables, and data points for the public Energy Atlas website. The following sections describe each of the main steps in this process.
Geocoding and Utility Data Standardization
Geocoding refers to the process by which an address string can be converted into a set of geographically referenced coordinates (Bolstad 2005). Geocoding the account addresses is fundamental to linking account locations with other reference geographies such as cities, census blocks, and parcels. This process assigns each customer street address its corresponding latitude and longitude coordinates. Geocoding is an iterative process, and the success (match) rate varies based on the geocoding method, the quality of the parcel reference data, and completeness of utility account addresses. Customer addresses have many internal formatting inconsistencies and are organized differently from utility to utility. CCSC researchers process the addresses through SQL and regular expression batch “cleaning” in order to make them readable by geocoding tools. This process converts heterogeneous address strings into a standardized record format suitable for geocoding, as well as loading into the database. Account addresses are assigned spatial locations in a hierarchical process, from most precise (building/parcel boundaries) to least precise (zip code centroid).
We first attempt to geocode all account addresses to their corresponding parcel locations, so that consumption data can be spatially linked to building and property information. For addresses that cannot be mapped to a parcel location, we assign them to the closest location possible, which may be street or zip code locations. These addresses may be incomplete (missing a street number or name) or inaccurate. The following list describes the levels of geocoding precision and order of geocoding processing.
4
BUILDING AN ENERGY ATLAS
1. Parcel centroid: This process matches an address to the centroid of the parcel in which it is located. This is the most precise method of geocoding, but also the one with the lowest match rate relative to other methods. Match rates vary by county, utility, sector, and method used to do the geocoding. CCSC researchers have found that the Google Geocoding API is the most efficient and accurate method for mapping utility accounts to their parcels. This step is essential when analyzing consumption alongside parcel information such as built square footage or building vintage and use type.
2. Street centerline: This process uses street-level locations to match addresses to their street centerline coordinates when they cannot be mapped to their parcel locations. With street-based geocoding, accounts and their consumption values cannot be associated with parcel information like building type or size.
3. Utility-provided locations: Some utilities include coordinate locations for billing records that range in precision from meter- to street-level. However, there is no indication as to how accurate these coordinates are, and so they are only used to geolocate accounts when other methods have been exhausted.
4. ZIP code centroid: For addresses that cannot be located using any of the previous methods, an additional process seeks to locate them to their ZIP code centroid. This location information is the most complete, with a 99% match rate; however, it is not nearly as precise as the other levels and is more difficult to aggregate to appropriate larger geographical boundaries like neighborhoods and cities.
When an address/account cannot be mapped to the parcel level, additional data from the utility is used to try to assign a use type to the account’s consumption.
If a particular account is designated as “residential” by a utility, its consumption is added to the “uncategorized residential” use type and to the total residential consumption for the neighborhood or city in which it is located. If the account is designated as non-residential by the utility, industrial classification codes (NAICS and SIC) are used, when available, to assign consumption to commercial or industrial categories as appropriate. It is important to note that because associated building information such as square footage or building vintage is not available for accounts classified in this way, such accounts are excluded from calculations of energy consumption statistics that reference building characteristics. Thus, the aggregated statistics involving building size and vintage are calculated only for parcel-geocoded accounts.
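The hierarchical geocoding order described above, from most to least precise, can be sketched as a simple fallback cascade. The sketch below is illustrative only: the lookup tables, coordinates, record layout, and function names are hypothetical stand-ins for real geocoding services and reference data, and the address cleaning is far simpler than the SQL and regular expression processing the chapter describes.

```python
import re

# Hypothetical reference data standing in for real geocoding services.
PARCEL_CENTROIDS = {"123 MAIN ST LOS ANGELES CA 90012": (34.0537, -118.2428)}
ZIP_CENTROIDS = {"90012": (34.0614, -118.2385)}


def normalize_address(raw):
    """Upper-case, drop punctuation, and collapse whitespace (illustrative only)."""
    cleaned = re.sub(r"[^\w\s#]", " ", raw.upper())
    return re.sub(r"\s+", " ", cleaned).strip()


def by_parcel(account):
    return PARCEL_CENTROIDS.get(normalize_address(account["address"]))


def by_street(account):
    return None  # stub: no street centerline data in this sketch


def by_utility(account):
    return account.get("utility_coords")  # coordinates supplied with the billing record


def by_zip(account):
    match = re.search(r"\b(\d{5})$", normalize_address(account["address"]))
    return ZIP_CENTROIDS.get(match.group(1)) if match else None


# Ordered as in the list above: most precise first, ZIP centroid last.
CASCADE = [("parcel", by_parcel), ("street", by_street),
           ("utility", by_utility), ("zip", by_zip)]


def geocode_with_fallback(account, cascade=CASCADE):
    """Return (precision_level, coordinates) from the first level that matches."""
    for level, geocoder in cascade:
        coords = geocoder(account)
        if coords is not None:
            return level, coords
    return None  # could not be located at any level
```

Recording the precision level alongside the coordinates matters because, as noted above, only parcel-level matches can be linked to building size and vintage.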
Spatial Joins in PostgreSQL with PostGIS
The key to the Energy Atlas is the back-end relational database and the ability to maintain spatial relationships among all datasets. The confidential back end uses a PostgreSQL database system with a PostGIS extension to manage these spatial relationships. Geocoded addresses are spatially joined to associated parcel datasets, census data, and any other municipal or administrative boundaries. Parcel boundaries, however, do not always fit completely within census or municipal boundaries. In cases where they do not, we assign a parcel to a specific census tract or city if at least 80% of the parcel’s area lies within that geography. If the parcel is split by a geographic boundary and falls short of the 80% rule, we assign its geography based on the location of the parcel centroid.
Monthly Billing Dates
Energy consumption billing data is organized around billing dates that are specific to individual customers. Some accounts have mid-month billing cycles, or billing periods that span multiple months (or even multiple years) for one charge. In these cases, UCLA attempts to calendarize consumption, that is, to assign consumption to specific months. This process is done programmatically. First, for billing cycles under 35 days, consumption is assigned to the dominant month of the cycle. For billing cycles above 35 days, we identify the start and end months in which there are at least 15 billing days. Then the consumption is divided evenly across the total number of months between these billing start and end dates. On some occasions, billing cycles span multiple years, in which case the same logic is used to divide consumption across months. These strategies have been developed over time and involve tedious steps to ensure that all billing data is treated in a similar way.
Because the Energy Atlas website only shows annual data, this process is mostly invisible; however, a billing cycle that runs from December to January may have its consumption attributed to one of these months, and therefore to one year and not the other. Further, we have the ability to query monthly billing data, which enables us to understand seasonal use, for example, and to compare among different years.
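The calendarization rules above can be sketched as follows. This is one possible reading of the text rather than the Atlas code: we assume that billing days are counted inclusively, that a cycle of exactly 35 days is treated as short, and that for longer cycles the first and last months are kept only if at least 15 billing days fall within them, with all months in between always kept. The function names are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def days_per_month(start, end):
    """Count billing days falling in each (year, month), inclusive of both ends."""
    counts = Counter()
    day = start
    while day <= end:
        counts[(day.year, day.month)] += 1
        day += timedelta(days=1)
    return counts


def calendarize(start, end, consumption):
    """Assign one bill's consumption to calendar months as {(year, month): amount}."""
    counts = days_per_month(start, end)
    months = sorted(counts)

    # Short cycle: everything goes to the month holding the most billing days.
    if sum(counts.values()) <= 35:
        dominant = max(months, key=lambda m: counts[m])
        return {dominant: consumption}

    # Long cycle: keep the first/last month only if it has at least 15 billing days
    # (assumed reading of the rule), then split evenly across the kept months.
    first, last = months[0], months[-1]
    kept = [m for m in months if m not in (first, last) or counts[m] >= 15]
    return {m: consumption / len(kept) for m in kept}


short_bill = calendarize(date(2015, 1, 10), date(2015, 2, 8), 300.0)
long_bill = calendarize(date(2015, 1, 20), date(2015, 4, 10), 900.0)
```

In the second call, January and April each contain fewer than 15 billing days, so the 900 units are split evenly between February and March. Because the keys carry the year, a cycle spanning December and January follows the same logic.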
Security
Researchers gained access to the data through negotiated non-disclosure agreements with utilities and the CPUC. Per these agreements, CCSC worked with UCLA Information Technology Services to develop strict privacy and security protocols for accessing the data and displaying it on the Energy Atlas website. The underlying disaggregated data is stored in a secure environment that has no interface with the website. Researcher access to the disaggregated data is limited, and each authorized researcher must comply with security protocols. The data is also stored in a secure location that is heavily monitored and to which physical access is limited. The data on the website is aggregated for analysis and displayed in such a way that no individual customer’s consumption can be identified. Specific information about data security measures is also confidential.
Naturally, such precautions are costly, and as the volumes of data increase over time, so do the storage charges. Yet this longitudinal dataset serves an essential research purpose: it provides the only means of evaluating the efficacy of state and local energy policies at a large scale. Outside of this database, there is no historical record of consumption of the Atlas’s scale and length. In many cases, the utilities themselves jettison their data after three to seven years (depending on the utility), and thus do not use it to inform their operations or investments.
Masked vs. Unavailable Data
In some instances, energy consumption data for certain cities or neighborhoods is omitted from the Energy Atlas’s public site and data tables. Municipalities are omitted from publicly released datasets either because they are served by utilities other than those included in the Atlas (as discussed earlier in the chapter) or because of the CPUC’s privacy aggregation guidelines.
Geographies that are omitted because of lack of utility data are labeled as not available; geographies that are omitted due to privacy guidelines are referred to as masked. In California, individual electricity and natural gas account information is protected as personally identifiable information (PII). Through non-disclosure agreements, UCLA gained access to account-level information with the provision that the raw data can never be revealed publicly. To ensure no individual customer’s information is revealed, CCSC masks data which does not meet minimum aggregation thresholds to protect
privacy. The Energy Atlas follows the guidelines set by the California Public Utilities Commission in 2014 (D.14-05-016).³ For non-residential consumption, aggregated data must include a minimum of 15 customers, with no single account’s consumption exceeding 15% of the group’s total energy use. For residential consumption, there must be at least 100 customers. If these conditions are not met, the aggregated consumption in a geography will be masked for privacy on the website. Challenges associated with these aggregation thresholds are discussed in Chapter 5.
Units
Account-level consumption data from the utilities is reported in units of kilowatt-hours (kWh) for electricity and therms for natural gas. In order to provide information on the total combined energy and associated greenhouse gas emissions, these values are first converted to British Thermal Units (BTUs).⁴ Total consumption is simply the sum of equivalent BTUs for both electricity and natural gas. Greenhouse gas emissions (MTCO2) are calculated as the product of total consumption and utility-specific GHG emissions factors reported annually. There are four main units displayed on the UCLA Energy Atlas:
• Electricity: Kilowatt-hour (kWh)
• Natural Gas: Therm
• Combined electricity and natural gas consumption: British Thermal Unit (BTU)
• Greenhouse Gas Emissions (GHG, MTCO2): derived from emissions factors based on annual utility portfolio, sum of both gas and electricity emissions.
Because BTUs and GHGs are each a sum over electricity and natural gas consumption, if either one is masked or not available, the combined BTU or GHG total cannot be released, in order to prevent revealing data by reverse calculation.
3 http://docs.cpuc.ca.gov/PublishedDocs/Published/G000/M090/K845/90845985.PDF.
4 1 kWh = 3412.141633 BTU and 1 US Therm = 99,976.129 BTU.
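The unit conversions in this section can be illustrated with a short sketch. The BTU constants are those given in footnote 4; everything else is an assumption made for illustration, including the function names, the use of None to represent a masked or unavailable value, and the emissions factors in the example, which are placeholders and not the utility-specific factors used in the Atlas.

```python
BTU_PER_KWH = 3412.141633   # from footnote 4
BTU_PER_THERM = 99976.129   # from footnote 4


def combined_btu(kwh, therms):
    """Total energy in BTU, or None if either input is masked or unavailable."""
    if kwh is None or therms is None:
        return None  # releasing the sum would allow reverse calculation
    return kwh * BTU_PER_KWH + therms * BTU_PER_THERM


def ghg_emissions(kwh, therms, mtco2_per_kwh, mtco2_per_therm):
    """Sum of electricity and gas emissions in MTCO2, or None if either is masked."""
    if kwh is None or therms is None:
        return None
    return kwh * mtco2_per_kwh + therms * mtco2_per_therm


# Placeholder factors for illustration only (not Atlas values).
total = combined_btu(1000, 10)
emissions = ghg_emissions(1000, 10, mtco2_per_kwh=0.0003, mtco2_per_therm=0.0053)
masked = combined_btu(None, 10)
```

Returning None when either input is missing mirrors the rule above that a combined total is withheld whenever one of its components is masked or not available.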
Limitations
With the challenge of standardizing and combining large and heterogeneous datasets, the Atlas is constantly evolving and improving. As technology improves and researchers refine the calculation of various energy consumption and intensity statistics, data on the Atlas website is updated. We strive to produce the most accurate results possible, but certain intrinsic sources of error are unavoidable. Being cognizant of the data’s value and role in consequential decision-making processes, we are constantly working to improve the Atlas. The subsequent sections detail some of the most significant challenges we have encountered while developing the database, and the limitations they impose.
Missing Data for Some Publicly Owned Utilities
As discussed earlier, utility data from most Publicly Owned Utilities are not available for inclusion in the energy database, due to time and resource constraints. The Energy Atlas does not account for electricity or natural gas consumption from these POUs, which limits analysis and also detracts from the reported countywide energy figures for most counties. In Los Angeles County, for example, roughly 6% of the population lives in utility territory that is not included in the Atlas database. In areas served by a variety of utilities, collecting the necessary data to complete a regional energy database can be extremely difficult. This is especially true when utility boundaries do not adhere to municipal or administrative boundaries, as is often the case. With data sharing privacy constraints, decentralized data collection, and project funding restrictions, acquiring a complete record of energy consumption can be nearly impossible in some instances.
Parcel Data Errors
Much of the analysis presented in the Energy Atlas is predicated upon the assumption that parcel data for building use, size, vintage, and design information are both accurate and complete.
Parcel data is collected and maintained by individual counties, and some level of error is to be expected. Tax assessors’ data are known to undercount square footage and contain incomplete information for nontaxable properties such as churches, government buildings, schools, and nonprofit organizations
(deemed “institutional” use type in the Energy Atlas). For these buildings, consumption statistics relating to building size and year built sometimes reflect these errors. Parcel data can also contain errors in land use types, which will be reflected in the reported energy consumption statistics by building use type, since parcels are the underlying source of information for designating these categories. An additional limitation is that while the years of energy data span 2006–2016, the parcel data for the region corresponds to the year 2016, so changes to construction or land use during the study time frame may not be reflected at the precise moment of consumption. In addition, Riverside County (non-residential parcels) and Imperial County (all parcels) do not provide attribute information for building square footage or construction vintage, and we were unable to obtain these attributes for this analysis. Thus, the Atlas will only display total energy consumption by use type, without additional vintage and size information. These parcel-level limitations suggest that state policy should be developed that requires counties to collect similar data with a consistent set of attributes.
Raw Utility Data Errors
UCLA receives raw utility billing data from multiple sources. This raw data comes in a variety of formats, structures, and file types, and requires extensive cleaning and preprocessing by the UCLA team before it is included in the Atlas database. Researchers usually discover raw data errors, such as missing or incomplete data, when piecing together different batches of account-level data over time. Researchers spend a great deal of time attempting to identify and remedy these errors when possible, but certain limitations still persist. Over the past eight years of the Energy Atlas project, UCLA has received utility data in four main batch acquisitions, each covering different billing periods.
Although these data are collected and maintained by individual utilities, some have been processed by consultants prior to being sent to us, and some have not. Because of differences in data organization, structure, timing, and sources, ensuring data completeness across the region is difficult. These data batches are referred to as “data load periods.” These load periods are defined with respect to the date that the raw data was received and the years of billing data they contain (Table 4.3). Despite our best efforts to ensure data is comparable across all of the available time periods, there are observable differences between raw utility data across data load periods. For example, we have observed customer accounts that appear in some data periods and not others, which suggests the presence of incomplete records over time. Additionally, utilities can cycle through or reuse account numbers, or change them over time. One of the ways researchers have addressed this issue is by comparing customer addresses to their associated parcel number across data load periods to check for consistency. While a number of quality control checks have been implemented to improve data accuracy, there may be underlying errors that are impossible to correct once the data has been provided to us. Such issues are artifacts of the historic evolution of the utilities and the regulatory supervision by the Public Utilities Commission. Only in recent years has building energy use data become of interest to the public, since an energy transition toward renewables depends on the information it contains. In the past, the emphasis was on energy use reduction, guided by a sense that providing energy efficiency programs would suffice to ensure that energy use did not increase. However, our work shows that this assumption is not correct, and that a sophisticated, targeted, equitable, and parsimonious approach to an energy transition requires far more data, which needs to be analyzable in systematized, comparable ways. The regulatory infrastructure of the twentieth century is simply inadequate for twenty-first century needs.
Table 4.3 Data load periods in the Energy Atlas
Data load period   Years of billing data                   Year UCLA received data
1                  2006–2010 (LADWP/SCE/SCG/BW&P/GW&P)     2012–2013
2                  2011–2013 (SCE/SCG)                     2015–2016
3                  2014–2016 (SCE/SCG)                     2018
4                  2011–2016 (LADWP)                       2019
Census Data Errors
Demographic analysis of energy consumption relies on information from the US Census and American Community Surveys that sample a small percentage of the population each year and report estimates based on
those samples. Their margins of error vary based on survey location, response rates, and other statistical factors. These errors cannot be corrected for, and are present in the Energy Atlas. Furthermore, population and income statistics for neighborhoods and cities are based on census block group level statistics. In some cases, boundaries of block groups cross neighborhood and city lines, making precise aggregation difficult. In these cases, population numbers are calculated based on the proportion of block group areas within each of the geographies they intersect, as described in detail earlier in this chapter.
Geocoding Errors
Geocoding physical addresses to their most precise spatial location is an imperfect science. Constructing accurate linkages between account-level consumption, building, and demographic data depends on the accuracy with which service addresses can be mapped to the corresponding parcels where they are physically located. This process requires developing an address locator that translates a street address to its geographic coordinates, standardizing addresses, and using a geocoding method to match them to their locations on Earth’s surface. As described earlier in this chapter, the data contained within the back-end database has been geocoded to the following levels: parcel, street centerline, and ZIP code, each with distinct success rates and geospatial accuracy. Geocoding to the parcel level, which is a necessary step for matching consumption to building information, is especially challenging and error-prone because it requires the most precision. Street-level geocoding is sufficient for block group analysis and is generally more complete: 97–99% of addresses can be geocoded at the street level. ZIP code level matches are used where more precise locations are not available. About 99% of the remaining addresses that cannot be street geocoded can be geocoded to their ZIP code.
The statistics that link consumption to land use are heavily dependent on the accuracy of the parcel records database. Parcel data varies in completeness across counties. Unfortunately, this results in different geocoding match rates across counties, as well as by utility, and use type. Residential account geocoding match rates tend to be higher than non-residential account geocoding match rates, likely because residential addresses are more standardized. Table 4.4 shows parcel geocoding match rates by county and energy type.
Table 4.4 Parcel geocoding match rate by county
                                   Electricity account          Natural gas account
                                   parcel match rate (%)        parcel match rate (%)
County           Sector            2011–2013    2014–2016       2011–2013    2014–2016
Los Angeles      Residential       91           92              94           94
Los Angeles      Non-residential   76           80              83           83
Orange           Residential       74           78              84           8
Orange           Non-residential   49           53              65           56
Riverside        Residential       88           89              92           92
Riverside        Non-residential   61           62              65           67
Imperial         Residential       NA           NA              84           84
Imperial         Non-residential   NA           NA              54           56
San Bernardino   Residential       97           97              97           97
San Bernardino   Non-residential   66           67              69           71
Ventura          Residential       93           94              93           93
Ventura          Non-residential   75           75              84           84
Compliance with Data Aggregation Rules
The Energy Atlas must adhere to data privacy guidelines from the California Public Utilities Commission 2014 Decision 14-05-016.⁵ As such, geographies are masked if there are fewer than 15 non-residential customers, or if one account consumes more than 15% of the total energy in a geography. For the residential use type, there must be at least 100 customers. These privacy rules are triggered more often for some use types than for others. For example, industrial consumption is masked for a majority of geographies because individual industrial customers (e.g., oil refineries and large production facilities) consume a disproportionate share of the energy in an area. Additionally, if only one use type is masked for an area, a second use type must also be masked so that the masked value cannot be reverse calculated from the total. We choose to mask the “uncategorized” use type consumption in such cases. Such extreme masking was probably not anticipated when the aggregation rules were developed, and is only now evident. Masking challenges are further explored in Chapter 5.
5 http://docs.cpuc.ca.gov/PublishedDocs/Published/G000/M090/K845/90845985.PDF.
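The aggregation thresholds and the secondary masking step described in this section can be sketched as follows. This is a hedged illustration of the stated rules and not the Atlas code: the function names and data layout are hypothetical, None is used to represent a masked value, and the case where “uncategorized” is itself the only masked use type is left unchanged because the text does not specify how it is handled.

```python
def passes_privacy_threshold(account_consumptions, residential):
    """Check one group of accounts against the aggregation thresholds.

    Residential: at least 100 customers.
    Non-residential: at least 15 customers, none above 15% of the group total.
    """
    count = len(account_consumptions)
    if residential:
        return count >= 100
    if count < 15:
        return False
    total = sum(account_consumptions)
    if total > 0 and max(account_consumptions) / total > 0.15:
        return False
    return True


def apply_secondary_masking(totals_by_use_type, complementary="uncategorized"):
    """If exactly one use type is masked (None), also mask a second one.

    Without this step, the single masked value could be recovered by
    subtracting the published use types from the published total.
    """
    masked = [use for use, value in totals_by_use_type.items() if value is None]
    result = dict(totals_by_use_type)
    if len(masked) == 1 and masked[0] != complementary:
        result[complementary] = None
    return result


# Hypothetical geography: one large industrial customer dominates its group.
industrial_ok = passes_privacy_threshold([100.0] + [1.0] * 19, residential=False)
published = apply_secondary_masking(
    {"commercial": 5.0, "industrial": None, "uncategorized": 1.0}
)
```

In this example the industrial group has 20 customers but fails the 15% test, which is the situation the section describes for refineries and large production facilities.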
Summary
As this discussion shows, every data layer in the Atlas is subject to its own idiosyncrasies and inaccuracies, despite originating from regulated utilities or public agencies with long histories of collecting this information. It reveals that even in the era of “big data,” information about land use, the built environment, and the functioning of infrastructural systems remains fragmented and disparate. Researchers must spend significant effort in cleaning and processing data to improve accuracy and consistency. Decisions about data processing must be made in a transparent and explicable manner so that those using an Atlas-like database to make decisions or pursue research questions may be apprised of its strengths and weaknesses.
These data limitations further highlight the remarkable gap between the state’s aspirations for the energy transition, and the data required and available to achieve those goals. Codification of collection and reporting protocols is essential in order for these data to inform targeted and effective programs. No amount of data processing can remedy the lack of building vintage in a parcel database. But rather than requiring county tax assessors to individually take on the labor- and resource-intensive responsibility of improving records and information collection, this effort should be funded by the state, with guidelines or mandates about what parcel data should be in county assessor databases. Likewise, utilities should be required to use consistent data fields, category definitions, and other measures to improve data quality and facilitate comparative analysis. Going forward, the ability to effectively map and analyze energy use to reveal underlying relationships and drivers will require structural change probably only possible with legislative intervention. The Atlas has been instrumental in revealing these needed changes, while making the best use of the currently available data to support policymaking.
Bibliography
Bolstad, P. (2005). GIS Fundamentals: A First Text on Geographic Information Systems. White Bear Lake, MN: Eider Press.
CHAPTER 5
User Design and Functionality
Introduction
Web-based tools which support interactive data visualizations and queries have become increasingly important within the environmental sciences (Vitolo et al. 2015). It is incumbent on the creators of these tools to think carefully about user experience and the visual presentation of data (Grainger et al. 2016). In the current age of Big Data, many tools which have been created with the best of intentions fail to meet the needs of their intended users, as they are unable to overcome the challenge of presenting large and heterogeneous datasets in a clear way that supports decision-making (Simon 2014).
During the development stage of the Energy Atlas prototype, design and functionality were of the utmost importance, and have remained so in subsequent versions of the Atlas. The ultimate purpose of the Atlas front-end website is to provide energy consumption data to local governments and allow them (and other users) to explore how energy is being used within their communities. With this end goal in mind, engaging the public sector stakeholders who would become primary users was an obvious first step in the development process. For the build-out of the website, we were fortunate to partner with Studio NAND, an innovative and sophisticated web development and data visualization firm based in Berlin. The Studio NAND team worked closely with our team and the stakeholders to develop the prototype
© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_5
website and refine the website’s features based on our collective feedback. We felt it was important to have an attractive site, as well as a highly functional one. This chapter describes the development and current functionalities of the public-facing Energy Atlas website, from stakeholder engagement through the features available on the site today, demonstrating the extent to which the Atlas is a product of a public participatory GIS development process (Sieber 2006). The final section of this chapter examines the challenges of visualizing data with strict privacy aggregation protocols and the effects these rules have on data access.
Stakeholder Engagement
In the early stages of the Energy Atlas project, a key group of interested officials from the Los Angeles (LA) region was gathered for a series of engagement meetings. We invited officials and staff from various local governments and nonprofits, including:
• The LA County Geographic Information Services (GIS) Department
• LA County Internal Services Division
• Strategic Growth Council
• LA Department of Water and Power
• Local Governments for Sustainability (ICLEI)
• City of Los Angeles Mayor’s Office
• Better Buildings Challenge
• Climate Resolve
• Regional Councils of Government
Multiple workshops were held during the initial phase in 2013. The first workshop outlined the goals of the project, and solicited feedback on the challenges that stakeholders faced related to energy data access, the types of information that would help them accomplish their work, and the specific scope of data they needed. The subsequent workshops provided input on the tool design, website features, and explicit datasets that would aid their efforts. Stakeholders provided guidance on the geographic aggregation scales they found most useful: neighborhoods, cities, councils of government, and the county as a whole. Their input led directly
to the design of the website, including the map view capabilities, and the geography-specific energy profile that can be compared across the region. Officials wanted a tool that could easily answer basic questions about how much energy is being consumed in their cities. What types of buildings are the most energy intensive and where are they located? Are consumption trends increasing or decreasing over time? Where might energy efficiency measures be best deployed to help meet climate action goals? How does their city’s consumption compare to neighboring cities? Without data, these relatively simple questions are nearly impossible to answer. The group universally noted the basic lack of transparency and access to energy consumption data within their jurisdictions. For some local governments, even accessing the total electricity or natural gas usage for their cities was not possible because of utility protocols which mandate the aggregation of consumption data to protect customer privacy—and which remain an issue for many jurisdictions even today. This data gap made it exceedingly difficult to develop a realistic energy efficiency targeting plan or an accurate GHG inventory for a climate action plan. While still limited to state-established aggregation rules, a product such as the Atlas would give municipalities access to standardized data for the entire region, as well as important statistics that could only be generated through the use of account-level data. Participants also identified other contextual datasets that would be helpful to include in the Energy Atlas, like population density, land use characteristics, building size, and median household income variation. 
These discussions helped decide the units of energy calculated and displayed on the website: kilowatt-hours (kWh) for electricity, therms for natural gas, British thermal units (BTUs) for combined electricity and natural gas consumption, and metric tons of carbon dioxide equivalent (MTCO2e) for greenhouse gas emissions. Stakeholders voiced the need for additional metrics aside from energy totals, requesting median per-square-foot values as well as residential consumption per capita. This feedback became the basis for the initial Energy Atlas. In developing a tool like the Energy Atlas, engaging relevant stakeholders has been, and continues to be, a key aspect through all stages of the work. The intended user base should be consulted early in the development process so their input can shape the end result (Schlossberg et al. 2005). The following sections will highlight the key website components that were generated as a direct result of stakeholder feedback.
Website Components
The fundamental features of the Energy Atlas website¹ include an interactive map, visualizations of tabular and graphical data, and the ability to download the underlying public data. These interactive web visualizations display summaries of energy consumption along with sociodemographic and land use characteristics for several levels of geographies, including cities and counties, in Southern California. The sociodemographic information on the website, such as population and income levels, is sourced from the American Community Survey from 2006–2010 and 2010–2014. All building attribute information, such as building use type, square footage, and year built, is taken from Tax Assessor parcel data for each county from 2016.
The user experience design expertise of our web developers was invaluable in determining the best ways to visualize the public version of the Atlas data. Their input also helped determine the list of analytical functionalities available to the website’s users, and how to manage downloads of energy data. The website itself is built using React JavaScript and Mapbox, with Amazon Web Services S3 servers hosting the summarized data. All data on the website are aggregated statistical summaries of the energy consumption data. No confidential data are hosted on the public server; only the statistics for the geographies that pass the customer privacy threshold are hosted. As described in Chapter 4, confidential account-level energy data never leaves the separate and secure offline database environment, which ensures that data security and individual customer privacy are protected. The subsequent sections provide a tour of the site’s core data visualizations, all of which were created with the intended users’ needs in mind.
Map View
The map view is the primary visual element of the Energy Atlas. A choropleth map of the six-county region displays energy consumption summary statistics for the most recent year of data available.
This type of map uses colors to depict the range of values represented among the set of geographies being mapped. In this way, each color, usually within an ordinal color ramp, depicts a binned range of attribute values. A series of interactive drop-down menus at the top of the page allows users to select the 1 www.energyatlas.ucla.edu.
5
USER DESIGN AND FUNCTIONALITY
type of energy consumption data they wish to view: natural gas, electricity, combined BTUs, or GHG emissions. They can also select which building use type classification to display: single-family residential, multifamily residential, commercial, industrial, institutional, or all buildings combined. In addition to energy totals, users can access median consumption values per square foot, as a way to view energy intensity by building type (Fig. 5.1).

The mapping interface is dynamic. Users can zoom to specific geographies, click on cities or neighborhoods of interest, and compare values visually across the region. For example, users can compare the median electricity consumed per square foot by single-family homes in Malibu with that of homes in less affluent Bell Gardens or Palmdale. To contextualize the energy consumption statistics displayed in the Map View, the map also displays selected sociodemographic information alongside energy consumption, including population density and median household income. Where energy consumption statistics cannot be shown for a particular geography because of the 15/15 privacy guidelines, the area is masked and displayed in a greyed-out hash pattern (Fig. 5.2).

A small “footprint” with longitudinal data appears at the bottom of the map when a user clicks on a geography. Each color in the time-series chart represents a building type, and users can hover over each block to see the total energy consumption of that building type over time. On the right half of the footprint is a summary of key socioeconomic statistics for that geography, including total population and median household income.

Profile View

The Profile View shows users a more comprehensive set of energy consumption statistics for the geographies included in the Atlas. Up to three geographies can be compared at one time. The profiles summarize energy consumption of each city, county, or neighborhood.
The Profile View also displays charts per geography, showing energy consumption by building type, building age, and residential median household income (Fig. 5.3). In response to feedback from users, we added an additional chart breaking down residential energy consumption by CalEnviroScreen (CES) score quartiles.2 CES scores are a percentile-based ranking tool,
2 https://oehha.ca.gov/calenviroscreen/report/calenviroscreen-30.
Fig. 5.1 Map view of the Energy Atlas website
Fig. 5.2 Footprint at the base of the map view contains longitudinal data for a specific geography when a user clicks the map
Fig. 5.3 Profile views compare up to three municipalities with tabular summaries of consumption data
developed by the State of California, that measures the level of environmental and social disadvantage for every census tract in the state. These scores are used by the State to prioritize funding opportunities for communities with the highest levels of burden (those with scores in the 75th percentile and above) (Figs. 5.4, 5.5, and 5.6). The top chart displays the amount of electricity consumed by residential buildings within CES score quartiles. The bottom chart shows the relative population size within each CES quartile. Querying the data can reveal important differences in, for example, energy use among different sociodemographic groups. In the example in Fig. 5.7, Orange County residential buildings in the most advantaged areas (0–25% CES score) consume nearly six times as much electricity as those in the most disadvantaged areas (75%+ CES score). This is despite the 0–25% ranked areas containing the smallest population of any quartile. Similar trends can be found in residential consumption by median household income quintiles, in Fig. 5.8. Residential buildings in the
Fig. 5.4 Profile charts displaying energy consumption by building type for Orange County in 2016
Fig. 5.5 Profile charts displaying energy consumption by building size for Orange County in 2016
Fig. 5.6 Profile charts displaying energy consumption by building vintage for Orange County in 2016
Fig. 5.7 Profile charts displaying residential electricity consumption by CalEnviroScreen Score in Orange County in 2016
Fig. 5.8 Residential electricity consumption by median household income in Orange County in 2016
wealthiest quintile of Orange County consume more than twice as much electricity per capita as those in the least wealthy areas. The profile view provides a simple visual presentation of a set of key statistics, showing how energy is being used in communities across Southern California. The visualizations allow users to answer the important questions posed in the initial stakeholder meetings. These charts are flexible, and their appearance and content have evolved with user feedback. It is important to note again that data selection is not a neutral process. Because we are motivated by assisting a just energy transition, one that ensures the most disadvantaged populations are included, we incorporated data such as CalEnviroScreen to enable a better understanding of the disparities in energy use across different populations in the region, and we developed drop-down menus aimed at that understanding.
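The quartile comparisons described above amount to grouping tract-level records by CES percentile and summing consumption and population within each group. A minimal sketch of that grouping follows; the record layout (percentile, kWh, population) and the function names are hypothetical and are not drawn from the Atlas codebase.

```python
from collections import defaultdict

# Labels follow the quartile bins shown in the profile charts.
QUARTILE_LABELS = ["0-25%", "25-50%", "50-75%", "75%+"]


def ces_quartile(percentile: float) -> str:
    """Map a CES percentile (0 to 100) to its quartile label."""
    index = min(int(percentile // 25), 3)  # a score of 100 falls in the top bin
    return QUARTILE_LABELS[index]


def summarize_by_quartile(tracts):
    """Sum electricity and population per CES quartile.

    `tracts` is an iterable of (ces_percentile, electricity_kwh, population)
    tuples, one per census tract (a hypothetical layout for illustration).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for percentile, kwh, population in tracts:
        entry = sums[ces_quartile(percentile)]
        entry[0] += kwh
        entry[1] += population
    return {
        label: {
            "kwh": kwh,
            "population": pop,
            "kwh_per_capita": kwh / pop if pop else None,
        }
        for label, (kwh, pop) in sums.items()
    }
```

Reporting both the total and the per capita value for each quartile matters because, as the Orange County example shows, the quartile with the highest total consumption can also be the one with the smallest population.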
Data Download

One of the most important functions of the front-end website is its role as a publicly accessible data portal. The aggregated energy consumption data published on the site may be downloaded by anyone who visits. Free and unencumbered access is imperative, since the primary goal of the Energy Atlas project is to support energy systems research and help solve practical problems. All data shown in website visualizations are available in CSV format via the Data page of the website. Further, as of this writing, we do not require users to register or identify themselves. This has been the subject of considerable discussion, as some of our funders would like such tracking so they can justify their funding of the Atlas. Yet registration may also deter certain classes of users. Thus, so far, we have opted not to require registration in order to obtain data (Fig. 5.9).

Since the launch of the website, Atlas data has been downloaded by many local governments for energy planning, climate action accounting, and informational purposes. City-level Energy Atlas data was used to generate the GHG inventory for all 88 cities and unincorporated areas
Fig. 5.9 The data download portal of the website
in Los Angeles County for the County’s first-ever Sustainability Plan in 2019, discussed further in Chapter 7. The Atlas’s public data has also been used for other academic research projects and by energy-related nonprofit groups such as Elevate Energy and the Natural Resources Defense Council, as well as by smaller local environmental justice organizations.
Visualization, Data Availability, and Privacy Aggregation

A significant challenge in developing this tool has been providing maximum access to data for local governments within the confines of customer privacy aggregation rules. In order for energy consumption data to be published and shared on the Atlas website, it must first be aggregated in accordance with the rules stipulated in CPUC Decision 14-05-016, also known as the 15/15 rule. If data for a particular geography, such as a city or county, do not meet the aggregation rules, the data cannot be shared publicly and must be masked.

Current customer privacy aggregation standards cause significant masking, predominantly of non-residential data. The 15/15 rule requires that, for any statistical summary of energy consumption to be published, there must be at least 15 non-residential customers within a given geography (or category), and no one customer within the group may represent more than 15% of the group’s total energy consumption. For groups containing only residential customers, the rules are less stringent, requiring only that at least 100 customers be within a group (Table 5.1). For dense urban environments and residential groups, the masking rules are rarely triggered. However, industrial and commercial consumption categories often have a few large users, resulting in frequent triggering of the masking rules. This has a significant impact on data access for smaller geographies like cities and neighborhoods.

Table 5.1 The 15/15 rule

Sector            Minimum number of customers    Largest consumer threshold
Non-residential   15                             15% of total energy use
Residential       100                            N/A
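The thresholds in Table 5.1 can be expressed as a short check. The sketch below is one reading of the rule as described in this chapter, not the Atlas's actual implementation; the function name and the input format (one consumption value per customer) are assumptions.

```python
from typing import Sequence

MIN_NONRESIDENTIAL_CUSTOMERS = 15
MIN_RESIDENTIAL_CUSTOMERS = 100
MAX_SINGLE_CUSTOMER_SHARE = 0.15


def passes_privacy_rule(consumption: Sequence[float], residential: bool) -> bool:
    """Return True if a group's summary statistics may be published.

    `consumption` holds one value per customer in the group (for example,
    annual kWh). Residential groups need only meet the customer count;
    non-residential groups must also satisfy the 15% share test.
    """
    count = len(consumption)
    if residential:
        return count >= MIN_RESIDENTIAL_CUSTOMERS
    if count < MIN_NONRESIDENTIAL_CUSTOMERS:
        return False
    total = sum(consumption)
    # No single customer may account for more than 15% of the group total.
    return max(consumption) <= MAX_SINGLE_CUSTOMER_SHARE * total
```

For example, a city with nineteen small industrial accounts and one refinery would fail the share test even though it has more than 15 customers, which is the pattern behind much of the industrial masking discussed in this section.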
As is evident in Fig. 5.10, a majority of cities are unable to access their industrial and other non-residential consumption totals because of the 15/15 rule. In most of these cases, high levels of masking are the result of the requirement that no one customer can use more than 15% of the total consumption for that use type within any geography. Nonresidential use type categories tend to contain a few accounts that use a disproportionately large amount of energy relative to others in the same category. For example, oil refineries or large manufacturers often consume far more than 15% of the total industrial consumption for an entire city. In some cases, these types of large customers can use more than 15% of an entire county’s industrial consumption, masking the total consumption for that county. Note that although many large industrial facilities are required to publicly report GHG emissions under state rules, they are not required to report energy use, and therefore this public disclosure cannot be a substitute for specific electricity and natural gas consumption data. For cities like Torrance, with a population of approximately 150,000, located in LA County’s South Bay region, the presence of a large oil
Fig. 5.10 City-level industrial natural gas consumption is significantly masked on the Atlas website due to privacy rules set by the CPUC. The hashed geographies on this map show the extent of industrial sector masking in Southern California
refinery within the city limits masks nearly all of the city’s data, including its total energy consumption. This is due to another impact of the privacy rules: when one building use type category is masked, at least one other building energy use category within that municipality must also be withheld in order to prevent back-calculation of the masked category. This leads to additional masking, which would otherwise be unnecessary, solely to prevent the accidental release of one sector’s total consumption. Most alarmingly, the total industrial natural gas consumption for each of LA, Riverside, San Bernardino, Ventura, and Imperial counties is masked. This means there is at least one industrial customer that represents more than 15% of all industrial natural gas consumption within the vast region of each of those five counties. These instances of masking call into question the usefulness of the current privacy aggregation rules: how can a countywide total for an entire sector betray the privacy of any one individual customer? The current privacy aggregation rules put the privacy concerns of large consumers of energy ahead of the public’s need for energy consumption data.

Masking is an important consideration in determining which aggregation levels should be displayed on the Atlas’s public website. As a population of energy consumers is divided into smaller subgroups, masking becomes more prevalent. This is due to both the minimum customer number criterion and the largest user threshold criterion. As the extent of geographic areas decreases, from city to census block group, the population of customers decreases, and the chances of triggering the privacy masking rule increase. This is true for dividing data in almost any way, including by customer type (residential vs non-residential), by geography (census block group vs county), or by building characteristics (year built and square-footage buckets) (Fig. 5.11).
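The secondary masking described above can be sketched as follows. When exactly one category in a geography fails the privacy rule and the overall total is published, the masked value could be recovered by subtracting the published categories from the total, so a second category must also be withheld. The choice to withhold the smallest remaining category is an assumption made for illustration (it is a common convention in statistical disclosure control), and the function name and data layout are hypothetical.

```python
from typing import Dict, Optional, Set


def apply_complementary_masking(
    totals: Dict[str, float], failed: Set[str]
) -> Dict[str, Optional[float]]:
    """Return category totals with masked entries replaced by None.

    `totals` maps building use type to consumption for one geography;
    `failed` names the categories that did not pass the privacy rule.
    """
    masked = set(failed)
    if len(masked) == 1:
        # A single masked category could be back-calculated from the total,
        # so withhold one more category (here, the smallest remaining one).
        remaining = [category for category in totals if category not in masked]
        if remaining:
            masked.add(min(remaining, key=lambda category: totals[category]))
    return {
        category: (None if category in masked else value)
        for category, value in totals.items()
    }
```

A geography with only one reportable category besides the masked one would, under this sketch, end up with both categories withheld, which is consistent with the near-total masking observed for cities dominated by a single large consumer.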
The rates of masking are dependent on both customer density within a geography as well as the highest rates of consumption among those customers in the case of non-residential energy use. Complying with data aggregation rules is both an art and a science. It necessitates experimentation—parsing out use types against the aggregation rule—and some degree of interpretation. It is important for the developer to consider how to achieve the greatest degree of transparency given the current privacy aggregation constraints when specifying the geographies or categories for which consumption statistics will be reported, while at the same time developing aggregations the developer is certain cannot be reverse engineered. These will depend on the number of customers, but also the
Spatial Distribution of Energy Consumers and the 15/15 Rule
Imagine each dot represents a single customer and its size corresponds to the relative volume of its consumption. At the broadest geographic scale, we have the boundary of the county which contains these customers. There are large numbers of small customers, distributed in loose clusters throughout the area, with one very high-volume customer located near the center. At the county level, the aggregation of many customers counteracts the presence of the single high-volume customer and thus, the 15/15 rule is satisfied.
What if we would now like to also disclose the consumption for a city whose boundaries nest within that of the county?
In this case, consumption data for the city would be masked because the single large customer violates the 15% consumption fraction portion of the 15/15 rule. If a countywide total was also being reported here, it would have to be minus the consumption of the city to prevent an “N-1” de-anonymization attack.
Fig. 5.11 The spatial distribution and intensity of energy consumers plays an important role in how much data is masked by the 15/15 rule
distribution of these customers and potential large users within a specific area. More frequent masking of energy use types with the highest rates of consumption is also undesirable from an equity standpoint in LA County. The reasons for this are two-fold. First, poorer areas of the county tend to be where major industrial and commercial consumers are located, and their presence often masks total municipal energy consumption. Second, industrial, institutional, and commercial use types are masked more frequently than residential ones, meaning that municipalities with prolific non-residential consumers will find that there is, on average, less energy consumption data available to them. Without publicly available information about the magnitude of consumption, municipal governments cannot devise policies to reduce consumption or emissions from classes of large consumers within their jurisdictions, or push for the adoption of emissions reduction policies at higher levels of government in cases where local, unilateral action is infeasible. Until the California Public Utilities Commission or the state legislature alters or amends their privacy aggregation rules, this data will remain inaccessible to policymakers and the public.
Summary

The design and construction of a tool like the Energy Atlas is an ongoing process. Needs, priorities, and the policy environment change over time, and there is no single tool that would meet the needs of all interested public sector entities and academic researchers at all times. However, by engaging these parties early in the development process, their input can shape the end result into a highly effective and valuable product. Customer privacy aggregation rules, created years ago without data-driven justification, have been revealed through this work to have an enormously deleterious impact on the ability to understand energy consumption. These rules create complex challenges around how best to present aggregated data on the Atlas website in a way that minimizes masking to the greatest extent possible. Finally, analysis of building energy use is ultimately predicated on the data available, including building attributes, socioeconomic characteristics, and more. We realize that in many places this background information may not be available at a local scale, leading to the need to use national-level data that is downscaled,
or other workarounds. Ultimately, however, the more accurate and granular the background data are, the better the understanding of energy use by buildings, and the ability to develop carefully crafted policies, will be.
Bibliography

Grainger, S., Mao, F., & Buytaert, W. (2016). Environmental Data Visualisation for Non-Scientific Contexts: Literature Review and Design Framework. Environmental Modelling and Software. https://doi.org/10.1016/j.envsoft.2016.09.004.
Schlossberg, M., & Shuford, E. (2005). Delineating ‘Public’ and ‘Participation’ in PPGIS. https://scholarsbank.uoregon.edu/xmlui/handle/1794/1343.
Sieber, R. (2006). Public Participation Geographic Information Systems: A Literature Review and Framework. Annals of the Association of American Geographers, 96(3), 491–507. https://doi.org/10.1111/j.1467-8306.2006.00702.x.
Simon, P. (2014). The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions. Wiley and SAS Business Series. www.bullseyeresources.com.
Vitolo, C., Elkhatib, Y., Reusser, D., Macleod, C. J. A., & Buytaert, W. (2015). Web Technologies for Environmental Big Data. Environmental Modelling and Software. https://doi.org/10.1016/j.envsoft.2014.10.007.
CHAPTER 6
Data Analytics
Data-Driven Decision-Making

The UCLA Energy Atlas has been developed during a period of staggering hyperbole about the value of data analytics and machine learning. These days, it seems as though one can’t help encountering yet another news article, business report, or scientific publication espousing the potential for big data or machine learning to completely transform basically everything (Gandomi and Haider 2015; Labrinidis and Jagadish 2012). But why now? And what underlies these claims? Can these technologies ever truly deliver on all of the promises, given the challenges related to data accessibility and consumer privacy concerns?

Data of myriad types and sizes have been around for a long time (Barnes 2013). What is so special about the present day, that all of this data seems to be suddenly brimming with such boundless potential? Four key threads of technological innovation have converged to produce the present state of data nirvana. First has been the advent of smaller, lower power sensors, which have made it possible to collect a greater variety of data, more frequently than ever before. Second has been the linkage of these sensors to larger, higher bandwidth wireless networks, making it possible to move large quantities of data back to centralized locations on a continuous basis. Third has been the decreasing cost and increasing capacity of digital storage media, making it possible to keep all of this data for the foreseeable future. Fourth and finally has
© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_6
been the development of faster, more distributed computer systems which enable users to extract more, and more meaningful, information from these ever-growing data warehouses.

How Data Is Transforming Science

In 2009 a group of prominent data scientists from Microsoft Research argued that this confluence of innovations has been responsible for ushering in an entirely new “4th paradigm” of data-intensive scientific discovery (Hey et al. 2009). Their theory holds that the history of science can be divided into four successive phases, depicted in Table 6.1.

Table 6.1 Summary overview of the four paradigms of scientific discovery as proposed by Hey et al. (2009)

1st Paradigm: Experimental science. First emerged well over a thousand years ago. It is exemplified by the work of figures such as Galileo, Copernicus, van Leeuwenhoek, and Brahe who showed that new knowledge could be discovered through the experimental observation of the natural world
2nd Paradigm: Theoretical science. Began around two hundred years ago. Exemplified by figures such as Newton, Maxwell, Planck, and Bohr, who showed that new knowledge could be discovered through the development of theories which were based upon mathematical models of the natural world
3rd Paradigm: Computational science. Appeared in the late 1950s after the development of programmable computing machines. Exemplified by the work of figures such as Turing, Von Neumann, Lovelace, and Babbage who showed that computational simulations could be used to generate knowledge about systems that were too complex to model analytically
4th Paradigm: Data science. Appeared only in the past twenty years or so as a result of rapid information technology advancements. This paradigm is so new that its standard bearers have yet to be named. However, it has already demonstrated that knowledge can be discovered through the automated collection, processing, and statistical evaluation of large quantities of raw data

During the era of the 1st Paradigm, scientific inquiries began with the statement of a hypothesis about how some aspect of the natural world worked. Following from the formulation of this hypothesis, data would then be collected, usually by conducting an experiment, that could then be used to either confirm or reject the validity of the hypothesis. The 2nd Paradigm accelerated the search for new, more useful
hypotheses through the development of mathematical models. And the 3rd Paradigm accelerated the rate at which these new hypotheses could be tested using simulation environments created within high-performance computer systems. In all of the preceding paradigms, the essential order of precedence between the hypothesis formulation and data collection processes has remained unchanged. Today, with this proposed new 4th Paradigm, we are seeing an inversion in the order in which these activities occur. In an era where data is being collected from such a wide diversity of sources, essentially all of the time, the collective body of information being created can effectively be thought of as the results of one huge integrated experiment, one which captures all of the subtle variations and correlations produced by the entire system’s complex dynamics (Bowker and Gitelman 2013). The process of scientific inquiry now begins with this continuous process of data collection. The formulation and testing of hypotheses now largely occur after the fact, through computationally intensive processes of data mining and statistical analysis. The magnitude of this change in perspective cannot be overstated; it represents a fundamental change in the way that we go about searching for new knowledge in our world, one arguably more dramatic than any of the previous transitions (Kitchin 2014). In essence, this implies a research strategy that does not start with a hypothesis or even a question; it expects the data itself to come up with results.

How Data Is Transforming Society

Information technology and the widespread use of data have significantly altered the order and cadence of daily life for billions of people. Each and every one of us who now lives in a modern industrialized society both draws from and feeds into this new global information system on a continual basis. What is more, our many interactions with this system can no longer be considered to be voluntary.
Rather, interactions with this system are now necessary for subsistence for a great many. Many industrialized societies have now reached the point where living one’s life in a manner that is free from digital observation requires deliberate effort and sustained inconvenience. These essential exchanges of information go well beyond the tracking of credit card purchases, cell phone communications, and online browsing activities. Huge new data streams are coming online from novel sensing
platforms ranging in scale from minuscule, low-power Internet of Things (IoT) devices all the way up to orbiting constellations of satellites. These disembodied data streams can be reduplicated with perfect fidelity and retransmitted around the world at the speed of light. The volume of data that they collect is now causing profound changes within our society: changes to the fundamental processes by which all of our daily decisions are made. The expectation that bits of data, measured, quantized, and encoded into binary format, can and should be used to inform the making of any and all decisions has become utterly pervasive (Lepri et al. 2017). The evidence of this dramatically expanded role of data in our everyday decision-making processes is all around us. At the click of a mouse, enormous volumes of data can be summoned, nearly instantaneously, by an individual consumer seeking to make a decision about which car to purchase, which doctor to visit, or which restaurant to patronize. In this new data-driven reality, where mountains of information are being stored and indexed in support of these types of small-scale, personal decisions, it is perhaps surprising to learn the extent to which similar data are critically lacking to support high-level, macro-scale resource management decisions of the kind commonly undertaken by state and local governments. The stakes for these decisions are comparatively high, having long-term impacts on local and global sustainability. This is where the complex relationships between utility companies, their rate-paying customers, and the public regulatory agencies all come to the fore.
Utility Customer Data

Most utilities, even publicly regulated ones, treat the data which they collect about customer usage and infrastructure operations as private, protected information. Their justifications for doing so include arguments for the protection of customer privacy and the security of the nation’s critical infrastructure. However, this highly conservative approach to data management also has the useful property of shielding their operations from the types of criticism and oversight that would come from granting independent third parties access. Utilities, like many corporations, institutions, and governments, are still grappling with the notion that all of their major decisions should be informed by the unbiased, quantitative measurement of something that actually happened in the world (Frisk and Bannister 2017). This is an
ironic state of affairs, as they were partially responsible for ushering in the era of “Big Data” with the first development of high temporal frequency Advanced Metering Infrastructure (i.e., smart meters) in the mid-1970s. For many utilities however, the deployment of the sensor hardware necessary to collect all of this data has been much easier than enacting the changes to their organizational and management structures that would be necessary to effectively put it to use. As an illustrative example of this situation, consider the case of an electric power utility which decides that it wants to join the big data revolution by installing smart meters on all the service accounts within its territory. Although this is an expensive and time-intensive endeavor, it actually amounts to a fairly straightforward infrastructure procurement and installation process, one that does not fundamentally challenge the organization’s core operational paradigm. But once such a deployment has been completed, what then? With each passing day the volume of data collected by these smart meters will grow. Without a set of automated processes for indexing, filtering, and otherwise extracting meaningful information from this data, it will simply show up as a net cost on the balance sheet. The only way to recover value from it is to develop processes: first, for transforming the data into information and then, second, for directly incorporating this information into key management decisions (Chen et al. 2012). These additional steps, which go beyond the procurement of sensors and storage systems, have been the Achilles heel of the big data revolution within the utility space. 
Utility Company Internal Customer Data Use Cases

At present, utilities’ primary uses for the consumption data they collect are: (1) to accurately bill customers for services rendered, (2) to ensure that distribution infrastructure operates reliably, and (3) to develop long-term procurement and capital investment strategies. Such a limited scope of use reflects a narrow conceptualization of the data’s potential value. Part of the reason for this limited use of data is that utilities understand themselves first and foremost as power providers. Although they have been variously tasked with the administration of programs for building energy efficiency retrofits, planning for and implementing district heating systems, or the use of distributed energy resources to support infrastructure operations, they tend not to perceive these projects as new opportunities for the transformation of their business. Rather,
these projects are frequently treated as inconvenient obligations which have been imposed by regulators and should, ideally, be satisfied with a minimum of cost and effort. Similarly, the lack of any legal requirement that this data be made externally available reflects the fact that governments and regulators have yet to fully appreciate how it could be used to benefit society. Utility consumption data has the potential to be extraordinarily valuable for efforts to reduce greenhouse gas emissions and air pollution, improve building energy performance, reduce energy poverty, and more. The CCSC’s experiences working with large investor-owned and municipally owned utilities, as well as the regulatory bodies which oversee them, have revealed a number of common organizational and operational patterns which support the preceding observations. For example, we have observed that among most large utilities, granular consumption data are not generally retained in readily accessible, “hot” storage media for more than three to seven years following their date of generation. Furthermore, among a smaller number of utilities, following this short retention period, these granular data are not even transferred to less readily accessible, “cold” storage media, but instead are simply flushed from their systems. In both instances, all that utilities retain over the long term are coarsely aggregated statistics which they generate to comply with state and federal reporting requirements and use internally to support long-term capacity planning efforts. The absence of longitudinal energy consumption data makes understanding change over time impossible, and jeopardizes the ability to evaluate the impacts and effects of utility initiatives such as rate changes and energy efficiency programs. In the specific case of California, for example, the state began mandating energy efficiency standards in the building code in 1978 (Title 24).
These standards are updated regularly but there is no granular and specific tracking of their success and/or impact, including with regard to such important factors as the climate zone. At the same time, starting in the 2000s, the PUC has required the utilities to expend billions for energy efficiency programs. These too have no substantively robust ability to be tracked and evaluated due to a lack of sufficiently granular data reporting and matching to the specific program. Another relevant issue is that most utilities lack the technical resources and skilled personnel necessary to undertake complex data analysis projects which require either advanced statistical methods or the integration of data sets collected from multiple independent sources. Part of the problem is the path-dependent nature of technological investment.
Consider, for example, a utility which has been in continuous operation for a century, and whose billing system is based upon a forty-year-old database technology. This database may still be capable of performing the tasks for which it was initially designed; however, its query language, data storage model, and physical hardware systems may all be fundamentally incapable of supporting advanced data analytics operations. Updating to a current database technology requires huge investments in time and labor, including changes in dependent systems such as those responsible for automated billing or compliance reporting. Such changes also entail a degree of risk. Most large-scale technology changes need significant transition time and a great deal of troubleshooting as unanticipated implementation issues arise. Most utilities are not eager to undertake large and expensive technological upgrades, since these almost always require time to “get right.” Examples of these types of technical difficulties abound among US-based utilities, with recent experiences in Seattle and Los Angeles resulting in the largest and most costly billing errors (Lindblom 2016; Los Angeles Times Editorial Board 2019). There are other cases where a utility may have been successful in modernizing its data storage systems but has failed to properly invest in personnel training to support the execution of complex data analysis projects. This situation is arguably more disturbing as it reflects the utility’s belief that such activities are not a part of its core business model. In situations such as this, where data analytics projects are deemed beyond the scope of normal business operations, any project requiring such analytics will typically be subcontracted to one or more third-party consultants. Such efforts typically require numerous project managers and the execution of third-party data access and non-disclosure agreements.
This additional overhead inevitably causes these projects to either run over-budget or, conversely, become reduced in scope, thus limiting both their real and perceived value to the organization. Many utilities are only now coming to terms with the fact that collecting a vast amount of data without the means to analyze it is a losing proposition. This is because the mere collection and storage of data will only ever result in positive and increasing costs. In order to extract value from investments in data collection, efforts must continuously be made to convert the data into new streams of information which are used to direct business activities. Most utilities are publicly regulated entities, and thus the scope of beneficial uses for the data that they collect should not be
limited to their internal business operations. There are many other potential applications for utility customer data beyond the scope of the utilities themselves. The focus of the remainder of this chapter therefore is not how utilities can “unlock the value” of their data to “enhance profitability” or “eliminate inefficiencies.” These topics have been comprehensively addressed elsewhere in the management literature (McAfee et al. 2012). Rather, the focus of the remaining discussions is on the value of utility data for society, as brought about by the work of researchers at academic institutions, program managers within local governments, customer advocacy organizations, state energy planning entities, as well as private companies with product offerings or solutions designed to improve resource efficiency and/or conservation. It could be argued that if utilities implemented better data analytics it would result in heretofore unanticipated benefits to customers such as lower costs or more reliable service, but such activities have not been part of the traditional utility model, especially for the Investor-Owned Utilities (IOUs). Some public utilities are starting to shift toward this approach since their mission is to provide the best service to their customers, rather than to ensure that shareholders are compensated. Their shareholders are the public, residents of the area served.

Barriers to Third-Party Access

Within the United States, account-level utility consumption records are treated as private personally identifiable information (PII). As a practical result, this designation prevents transparency around certain aspects of IOUs’ business operations. The PII designation even, in some instances, limits what regulators who have been tasked with overseeing their operations and business practices may know.
There are various arguments for why account-level consumption of such commodities as electricity, natural gas, or water should be treated as PII; however, all rest on the same central idea: that knowledge of the rate at which these resources are being consumed at a particular time and location can potentially reveal information about separate, but related, private activities. Just as the release of energy consumption data can benefit society—for example, through targeted conservation/efficiency improvement programs—so too can it be used to conduct various types of criminal malfeasance (McKenna et al. 2012).
Classic examples used to illustrate the potential for such malicious use of customer data include the following. In the case of residential accounts, knowledge of a household’s patterns of energy consumption could potentially reveal windows of opportunity for burglars or thieves to stage a break-in during times when the home is known to be typically unoccupied (McKenna et al. 2012). Alternatively, in the case of commercial/industrial accounts, the release of detailed consumption patterns could create opportunities for corporate espionage. Sensitive trade secrets could be reverse engineered from the combination of energy usage data with economic or other output statistics. This tension between the risk to privacy and the benefits of publicly available utility data requires that an ethical determination be made either by regulators (in the case of IOUs) or by utility managers (in the case of MOUs) regarding the balance of obligations to the public vs. the individual. This determination is expressed, quantitatively, by the specification of aggregation rules. Aggregation rules set the minimum number of individual accounts, and the distribution of consumption among them, that must be grouped together in order for energy consumption data to be made public.
Revisiting Utility Customer Data Aggregation Rules

Within the United States, the state of California and its Public Utilities Commission (CPUC) have been among the first entities forced to address the issue of utility customer data privacy. Through a series of commission decisions and court rulings described in Chapter 3, the CPUC established a set of rules in response to a chorus of calls for increased data access and operational transparency. These data access rules define the eligibility requirements for third parties to request data and establish a fixed set of requirements for how individual account-level consumption records must be aggregated prior to any sort of public data disclosure. As we described previously, the established data aggregation rule within California is commonly referred to as “15/15” or “the Rule of 15” and is so called because of the following parallel requirements:
1. Consumption data can only be released for a group of 15 or more customers.¹
2. Consumption data can only be released for a group if no single customer within it accounts for more than 15% of the group’s total usage.

In practice, we have found the second requirement, which relates to the maximum threshold fraction of consumption attributable to any single customer, tends to be the far more restrictive of the two. This is because at the account level, the frequency distribution of consumption magnitudes within a population of customers tends to be log-normally shaped. This distribution, which is characterized by right skew and a long tail, means that consumption levels by a small number of customers are two to three orders of magnitude higher than average. We call these large individual customers “whales” as their individual contribution can dwarf that of hundreds or even thousands of others within a given reporting group. At the level of a city for example, whale accounts tend to be associated with large commercial and industrial facilities engaged in highly resource-intensive production activities: oil refining, metal smelting, chemical synthesis, and others. The presence of just a single whale account can be sufficient to prevent the disclosure of consumption data for regions that are populated by millions of people, an effect that we call “masking.” Masking of the consumption of large groups of customers due to the inevitable presence of large individual accounts fundamentally limits the ability of any interested party to track spatiotemporal changes in resource consumption patterns (electricity, water, natural gas, etc.). The practical impact of these privacy aggregation rules is the introduction of significant inaccuracies into the aggregated data which is then made publicly available.
1 For residential customers, the minimum number of accounts that are allowed to be disclosed is 100. The 15-account minimum applies to all non-residential customer categories.
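The two requirements above can be expressed as a short check. The following is a minimal Python sketch, not the CPUC's or the Energy Atlas's actual implementation; the function name, the parameter names, and the choice to mask groups with zero total usage are assumptions made for illustration.

```python
from typing import Sequence


def passes_aggregation_rule(
    usage: Sequence[float],
    min_accounts: int = 15,   # use 100 for residential groups (see note 1)
    max_share: float = 0.15,  # largest allowed share for any one account
) -> bool:
    """Return True if a group of account-level usage values may be disclosed."""
    # Requirement 1: the group must contain enough accounts.
    if len(usage) < min_accounts:
        return False
    total = sum(usage)
    # Assumption: a group with no recorded consumption is treated as masked.
    if total <= 0:
        return False
    # Requirement 2: no single account may exceed the maximum share of the total.
    return max(usage) / total <= max_share


# Hypothetical groups: twenty similar accounts, and the same group with one "whale".
uniform_group = [100.0] * 20
whale_group = [100.0] * 19 + [10_000.0]
print(passes_aggregation_rule(uniform_group))  # similar accounts can be disclosed
print(passes_aggregation_rule(whale_group))    # a single whale masks the group
```

In the second hypothetical group, one account holds roughly 84% of the total, so the whole group is masked even though it comfortably meets the 15-account minimum.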
Ethical Implications of Utility Customer Data Access Decisions

In light of the severe shortcomings of these types of aggregation rules, it is imperative to revisit the underlying assumptions around the designation of all consumption data as PII. The magnitude of risks associated with data disclosures is strongly dependent upon (1) the temporal frequency at which the data has been collected and (2) the temporal lag between the time of data collection and the time of release. For example, 15-minute interval data made available in near-real time can reveal a significant amount of potentially compromising information about an individual customer. Compare this to the relatively modest threats posed by the release of monthly data made available a year after the time of consumption. The delayed release of low temporal frequency data is enormously valuable for resource planning purposes and incurs little tangible privacy risk to the customers. Each specific utility customer data aggregation rule can be interpreted as an ethical determination made by the regulatory body which created it. This determination reflects the regulator’s rank prioritization of the following two competing responsibilities: protecting the privacy of individual consumers versus making use of data to pursue the public good. California’s specific 15/15 rule reflects a balance of these ethical responsibilities which, we believe, is overly skewed toward the protection of individual privacy. Moreover, the continued protection of this rule from legal challenges has been unduly influenced by the political lobbying of special interest groups who represent the largest individual customers who desire to keep the magnitude of their consumption private for fear of public outcry or potential rate increases. The full extent of the impacts of these aggregation rules is not well understood, nor have they been rigorously studied since their inception.
However, our application of the aggregation rules as part of development of the Energy Atlas has allowed us to observe some of their effects. One of the most serious impacts of the 15/15 rule is the extent to which it undermines the ability of local governments to accurately account for their greenhouse gas emissions by sector. While most utilities report the total consumption occurring within their service territories, these aggregated values do not enable localities to target programs and policies intended to address specific types of energy use or customer classes, nor to track change over time in any specific manner. This significantly
hampers the implementation of emissions reduction policies. For example, as California works to reach the 2020 greenhouse gas emissions targets established in the AB 32 Scoping Plan (see the list of regulations and policies in Chapter 2), local and regional municipalities have begun preparing emissions inventories and setting GHG emissions reduction targets as part of their broader sustainability planning efforts. These forecasts may contain substantial inaccuracies due to the application of the aggregation rules. Moreover, it inhibits the targeting of programs for energy use reductions. Take for example the commercial sector. Clearly within any sector, say tortilla making, there is a range of energy consumption that depends on the firm’s adoption of new or different technologies. Having the ability to conduct in-depth industry energy use analysis by industrial classification code and building attributes (size and vintage) could enable local governments, utilities or the state, to develop appropriate incentive programs for those industries to help them reduce energy use. We used the Energy Atlas to assemble inventories and forecasts of building-related greenhouse gas (GHG) emissions for the LA County Sustainability Plan in the spring of 2019. An important feature of this project was to eliminate the need for cities to create their own GHG emissions inventories and to ensure methodological consistency. But consumption had to first be aggregated. The application of the aggregation rules resulted in extensive masking throughout the County and its 88 cities. The extent of the masking was such that we resorted to generating estimates for some mostly or completely masked cities and areas, based on Energy Usage Intensities (EUIs) derived from other cities that were less masked. This created significant levels of uncertainty in the individual and aggregate municipal GHG inventories and forecasts. 
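The EUI-based gap filling described above amounts to simple arithmetic. The sketch below assumes that an EUI is expressed as annual energy use per unit of building floor area and that floor area is known for the masked city; the function names and all numbers are hypothetical and are not taken from the Energy Atlas.

```python
def energy_usage_intensity(reference_usage_kwh: float, reference_area_sqft: float) -> float:
    """Annual kWh per square foot, derived from less-masked reference cities."""
    return reference_usage_kwh / reference_area_sqft


def estimate_masked_usage(eui_kwh_per_sqft: float, masked_area_sqft: float) -> float:
    """Estimate annual usage for a masked city from its floor area alone."""
    return eui_kwh_per_sqft * masked_area_sqft


# Hypothetical reference data for one building type in unmasked cities.
eui = energy_usage_intensity(reference_usage_kwh=5_000_000.0,
                             reference_area_sqft=1_000_000.0)
# Hypothetical floor area of the same building type in a masked city.
estimate = estimate_masked_usage(eui, masked_area_sqft=200_000.0)
print(f"EUI: {eui} kWh/sq ft, estimated usage: {estimate} kWh")
```

Because the estimate inherits the reference cities' average intensity, it cannot capture a masked city whose building stock or industrial mix differs from theirs, which is one source of the uncertainty noted above.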
We found the range of differences between EUI-based estimates and true consumption values for individual cities within LA County was on the order of −790 to +533 GWh per year for electricity, and −8 to +13 M-therm per year for natural gas. For context, in 2018 the total residential usage in Los Angeles was 21,024 GWh for electricity and 1107 M-therm for gas. Municipalities that are home to industrial plants and other large, nonresidential consumers of energy are the most affected by the current privacy rules. The data masking impacts of the aggregation rule can be measured in two ways: (1) in terms of the percentage of total consumption which must be masked and (2) in terms of the percentage of the total number of geometries which must be masked. We conducted an experiment using the Energy Atlas reporting geographies to determine,
for electricity, natural gas, and water, what these different masking rates would be if the masking rule were set using different values for the individual customer maximum consumption fraction and the minimum number of customers. The results of this experiment are shown in Fig. 6.1, which shows, for a large number of different pairwise combinations of these values, what the effective masking rates would be. The masking rates under the current 15/15 rule are marked on each plot in the figure for context. As the strength of the vertical gradient of these plots shows, small changes in the individual customer maximum consumption fraction have large impacts on the percentage of total consumption and geographies which must be masked. This is something which we directly encountered during our efforts to use the UCLA Energy Atlas to develop the building GHG emissions inventory for the Los Angeles County Sustainability Plan. Due to a small number of very large consumers within the county and in order to abide by the 15/15 Rule, we were limited to the use of estimated values for certain consumption sectors and in certain areas. Furthermore, the cities themselves were unable to conduct more accurate inventories. This example shows how even the most fundamental steps toward climate change mitigation are hampered by limitations on the use of granular building energy data. Errors will be replicated through resulting documents and policies, which undermines their effectiveness.

Fig. 6.1 Results of an experiment using the Energy Atlas data to evaluate the effective masking rates associated with a large number of different potential alternatives to California’s existing 15/15 Rule
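The kind of parameter sweep behind the masking-rate experiment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used to produce Fig. 6.1: the function name, the toy geographies, and the threshold values are all hypothetical.

```python
from typing import Dict, List, Tuple


def masking_rates(
    groups: Dict[str, List[float]], min_accounts: int, max_share: float
) -> Tuple[float, float]:
    """Return (fraction of consumption masked, fraction of geographies masked)."""
    total_usage = 0.0
    masked_usage = 0.0
    masked_count = 0
    for usage in groups.values():
        group_total = sum(usage)
        total_usage += group_total
        disclosable = (
            len(usage) >= min_accounts
            and group_total > 0
            and max(usage) / group_total <= max_share
        )
        if not disclosable:
            masked_usage += group_total
            masked_count += 1
    return masked_usage / total_usage, masked_count / len(groups)


# Hypothetical reporting geographies with account-level annual usage.
geographies = {
    "A": [10.0] * 20,             # many similar accounts
    "B": [10.0] * 19 + [1000.0],  # contains one whale
    "C": [10.0] * 10,             # too few accounts
}

# Sweep the maximum single-customer share while holding the account minimum fixed.
for share in (0.15, 0.50, 0.90):
    usage_masked, geos_masked = masking_rates(geographies, min_accounts=15, max_share=share)
    print(f"max_share={share:.2f}: {usage_masked:.1%} of usage, {geos_masked:.1%} of geographies masked")
```

Even in this toy example, the share of consumption that is masked is far larger than the share of geographies that is masked, because the masked geography containing the whale carries most of the total usage.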
Third-Party Utility Customer Data Request Procedures

Because of the large number of individual customers and the frequency with which their consumption data are collected, complete time series of disaggregated utility consumption have become so large as to be impractical to share with researchers and governments via duplication. Consequently, if a major utility desires to share its data, or is otherwise compelled to do so, the most sensible option would be to allow for an external third party to remotely view or query the utility’s source data repository (Wallis et al. 2013). However, the idea of providing third parties with any sort of direct access to their back-end database systems is deeply unsettling for utility managers. Doing so would place a significant burden upon them to adequately secure such a system so that only authorized parties had access to limited portions of the data. In order to facilitate this type of third-party remote access, it is necessary to develop what are known as Application Programming Interfaces (APIs). An API is a set of software libraries and standards which systemize the structures and procedures by which information can be queried and accessed over a network. API development is a complex technical undertaking which requires significant and ongoing investment on behalf of the owner of the data (Bloch 2006). Today, the majority of utilities have not yet reached the level of technical sophistication required to grant researchers or third-party entities direct access to their internal data storage systems via direct API calls. Consequently, existing processes for the fulfillment of third-party data requests require that detailed information about the nature of the data that is being requested be submitted to the utility (assuming the utility is willing to share the data in the first place).
Once such a request is received, it must be interpreted by the utility’s in-house database administrators and translated into a set of data queries capable of being executed by their internal databases. Once the queries have been successfully executed, copies of the query results must then be physically transferred to the
third-party requester. This entire process carries with it extremely high transaction costs for both the utility and the requesting third party. Additionally, the need to copy and transfer all query results limits the scope of data requests that can be practically fulfilled. Requests that would require scanning over the entire utility database are therefore almost always denied. Within California, the Energy Data Request Program (EDRP) stipulates a protocol by which requests must be formally submitted to energy IOUs for electricity and natural gas data, including limits on the timeframes over which these requests must be processed, and over which data must be made available. The EDRP emerged from the same 2014 CPUC decision as the data access and aggregation rules (CPUC 2014a). The EDRP can be quite restrictive in terms of the requirements that must be satisfied for a requesting party to be considered eligible. For example, for an account-level consumption data request to be considered valid, it must come from researchers working at a not-for-profit academic institution and the methodologies used in their research program must be certified as ethical by an Institutional Review Board (IRB). It must be accompanied by a detailed data security plan with several required layers of redundant software and protections (each set uniquely by the individual utilities), and the researchers must be willing to sign a binding non-disclosure agreement with the granting utility. Ultimately, however, the utility decides whether it considers the request valid and the use of the data legitimate. There are limited options for recourse if the data request is turned down. All of these requirements function to dramatically limit the practical availability of account-level consumption data to local governments and program implementers who have a legitimate interest in creating public benefits from its use.
Utility Customer Data Analytics

Working with Utility Customer Data at Scale

Earlier in this chapter it was mentioned that utility datasets are often so large as to be “infeasible to share via duplication.” This is one of the defining characteristics of “big data”; however, there are others. As we shall see, it is a term whose definition is constantly evolving. Incremental advancements in the price-performance of new information technologies have made datasets which only a few years ago might
have been considered unwieldy in size now seem quite manageable. This extraordinary technological progression has come to be viewed as an almost mundane occurrence because of the unerring consistency with which it has taken place. Recently, however, the pace of innovation in new computer architectures has begun to slow. It is now fairly clear that Moore’s Law, which for the past forty years has described a doubling roughly every two years in the density of transistors in new computer chips, has begun to grind to a halt (Theis and Philip Wong 2017). And while computer performance is still increasing year after year, it is no longer doing so at the exponential rate that it had been previously. Today, architectural features of computer hardware design have emerged as the key determinants of performance and likely will continue to be so for the foreseeable future (Williams 2017). Academic definitions of Big Data often lean heavily upon the three V’s: Volume, Velocity, Variety (Borgman 2015). The term Volume refers to the notion that one of Big Data’s requisite characteristics is a large number of individual records. Velocity refers to the speed with which new data points are generated. This, in some sense, can be thought of as a proxy for a dataset’s potential to become even Bigger in the future. Finally, Variety relates to the concept that many Small datasets can become Big through the process of being integrated with one another. From the perspective of an individual researcher or analyst however, such abstract academic definitions can be of little practical concern. For these practitioners, the following, more operative definition is likely to be of greater relevance:

Any dataset can be described as being “Big” if its size practically exceeds either the storage volume or the computational resources of the largest machine that is available to the person trying to analyze it.
(E. D. Fournier)
With millions of customers and new AMI infrastructure collecting data at 15-minute intervals, there are many utility datasets which qualify as “Big” by any and all of the previous definitions. In order to bring these firehoses of data under control, individual utilities have been forced to choose between two approaches to the management of their customers’ data.
Either (1) expend substantial resources to upgrade their in-house information technology infrastructure and build out the systems and expertise necessary to store it on premises or (2) enlist the help of third-party consultants to store the information off-site in one of a handful of major cloud service providers’ managed data centers. The preceding, more practical definition of Big Data is suggestive of some common strategies used by researchers and analysts when faced with a Big Data problem:

• Do nothing and wait, until the performance of the best single machine that you have access to catches up to the size of your dataset. This may sound like a silly suggestion, but in years past, when the semiconductor industry’s pursuit of Moore’s Law was in full effect, this was actually a quite reasonable strategy. It was low cost and did not require learning how to use any fundamentally new tools or systems. Instead, it relied simply upon the assumption that the computer which you owned five years from now would cost the same amount of money to buy, would operate in the same essential way, but would have increased in performance by a factor of 10x.
• Find another, larger machine, one which is capable of handling the size of your dataset. The challenge with this approach is that it forces you to borrow, rent, or buy access to this larger machine, provided that one of a suitable size even exists. Typically, this process also entails overcoming some additional difficulties in terms of transporting not only the data itself but also the software environment intended to be used for its analysis, to the location of the new remote host.
• Use multiple, small machines, and embrace the challenges of distributing both your data and the computations you intend to perform over them. Disaggregating Big datasets into smaller component pieces and orchestrating a coherent set of analyses across these data shards entails a hugely complex set of software and networking operations. Until very recently, this approach was generally the only viable option for large technology companies whose core business operations relied upon generating Big data insights. This competitive business environment supplies sufficient financial incentives for these firms to hire a large number of dedicated computer scientists to architect, build, and make productive use of an entirely new set of distributed computing platforms and tools.
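The core idea behind the third strategy, computing partial results on separate shards and then combining them, can be illustrated on a single machine. In the Python sketch below, worker threads stand in for separate machines, and the shard contents, region labels, and function names are hypothetical; real distributed platforms add data movement, fault tolerance, and scheduling that this sketch deliberately omits.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Tuple

Record = Tuple[str, float]  # (region identifier, consumption in kWh)


def partial_totals(shard: Iterable[Record]) -> Counter:
    """Sum consumption by region within one shard (the "map" step)."""
    totals: Counter = Counter()
    for region, kwh in shard:
        totals[region] += kwh
    return totals


def combined_totals(shards: List[List[Record]]) -> Counter:
    """Process shards concurrently, then merge the partial sums (the "reduce" step)."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_totals, shards))
    result: Counter = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds values for matching keys
    return result


# Hypothetical shards, e.g. one per billing month or per storage node.
shards = [
    [("A", 1.0), ("B", 2.0)],
    [("A", 3.0)],
]
print(dict(combined_totals(shards)))
```

Sums decompose cleanly across shards, which is why this works; statistics such as medians do not, and they require more careful algorithms in a distributed setting.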
Over the long term, the third option is always going to be the most scalable and effective solution. However, implementing such a solution is not a task for an individual researcher or analyst. This is something which enterprising cloud service providers have recognized and sought to capitalize upon through the development of commercial product offerings which are intended to lower the barriers to the access and use of distributed data storage and analytics. Today, there are a host of different commercial solutions to choose from. They range in complexity from Infrastructure as a Service (IaaS) offerings (where the process entails renting access to distributed computing hardware, and the user installs, manages, and uses the software necessary for their analytics) all the way up to Platform as a Service (PaaS) product offerings, which basically involve handing data over to the solution provider and having them take care of all the myriad intermediate operations and systems necessary to start executing queries upon it (Gibson et al. 2012). When using these types of commercial product offerings, understanding how much data there is, what is required to store it, and the computational complexity of the various types of queries or analytic operations that might be useful to perform are all fundamental to ensuring that costs can be controlled. There are established best practices around the use of these types of systems: using preemptible services, setting hard resource capacity limits, and raising warnings or exceptions when certain resource utilization thresholds have been reached (Blokdijk and Menken 2009). Unfortunately, full exploration of this important topic is beyond the scope of this chapter. However, it bears mentioning that Big data storage systems are large and complex, and their use can exact significant operational costs. These costs must be considered prior to the decision to work with Big data.
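A first-pass estimate of how much data there is can be made with simple arithmetic before committing to any storage system. The Python sketch below is illustrative only: the meter count, the 32 bytes assumed per stored reading, and the 512 GB capacity figure are hypothetical values, not figures from any utility or from the Energy Atlas.

```python
def estimate_rows(n_meters: int, interval_minutes: int, years: int) -> int:
    """Number of interval readings, assuming 365-day years and no gaps."""
    reads_per_day = (24 * 60) // interval_minutes
    return n_meters * reads_per_day * 365 * years


def estimate_bytes(rows: int, bytes_per_row: int) -> int:
    """Raw storage requirement, ignoring indexes, compression, and replication."""
    return rows * bytes_per_row


def exceeds_capacity(required_bytes: int, available_bytes: int) -> bool:
    """Operative test for "Big": the data do not fit on the available machine."""
    return required_bytes > available_bytes


# Hypothetical utility: one million meters reporting every 15 minutes for one year.
rows = estimate_rows(n_meters=1_000_000, interval_minutes=15, years=1)
size = estimate_bytes(rows, bytes_per_row=32)
print(f"{rows:,} rows, about {size / 1e12:.2f} TB of raw readings")
print("Exceeds a 512 GB machine:", exceeds_capacity(size, 512 * 10**9))
```

The same arithmetic shows why monthly data are so much more tractable: replacing 15-minute readings with 12 readings per meter per year reduces the row count by a factor of 2,920.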
The current version of the Energy Atlas back-end database contains monthly consumption data for a combined population of roughly 30 million people. This dataset spans a time period of roughly ten years in some locations. Currently, we operate an on-premises, single-node PostgreSQL database server which is capable of handling this volume of data and the types of queries which we typically apply to it as part of our research. Despite the power of the current generation of hardware and software that we use, it is likely that a significant expansion in the size of our dataset would require a new, more distributed data storage solution. Indeed, this is something that we are actively planning for.
Verifying Utility Customer Data Integrity

When working with any new dataset, Big or otherwise, the first steps that analysts must take always involve a process of familiarization. As part of this process, analysts use a variety of techniques including manual inspection of records, calculation of descriptive statistics, and development of diagnostic plots, to get a better sense of a dataset’s contents. The larger a dataset is, the longer this process of familiarization can take. Indeed, for some truly large sets, it may actually be impossible for any single individual to become completely familiar with its contents. This is in part because of the computational expense associated with generating various types of descriptive statistics and diagnostic plots for datasets with billions or trillions of individual records. While it is common for analysts to extract a subset sample of a big dataset that is more feasible to become familiar with, this approach can often be misleading and even cause an analyst to develop a false intuition about the underlying distribution of important attributes or the completeness of their coverage. Regardless of how it is accomplished, the point of this familiarization process is to inform the development of automated procedures for validating the integrity and completeness of the data. These integrity checks are essential for identifying and reconciling pernicious data quality issues that can crop up later when attempting to execute different types of analytical procedures (Strong et al. 1997). Big datasets present unique challenges from an integrity validation standpoint as they are almost always generated through the use of some computer-automated measurement or reporting system. These systems, when well designed, can be quite reliable. However, when poorly designed or implemented, they can silently introduce systematic errors into a dataset.
A good example of this might be a miscalibration which has erroneously been applied to an entire sensor fleet. In the rare case when this does happen, it can be very difficult for analysts to identify because even though the relationships between sampled values would be correctly retained, the absolute magnitude of all of the values would be off. Fortunately, however, while these kinds of errors may be difficult to initially spot, their consistency usually allows for them to be rapidly corrected using simple rule-based computational methods. It is critically important for the designer of a data collection and reporting system to take into consideration potential sources of corruption and defend against them. This includes limiting the options available
to users for data input, checking against invalid data entries at report time, providing detailed and informative error codes in the event of unexpected hardware or software failures, and others. And while this topic is worth mentioning here, we acknowledge that the intended audience for this work is not necessarily the utility employees who architect big data creation pipelines, but rather the researchers and analysts who will be the ultimate downstream consumers of the data collected by such systems. When working with big datasets, there is often the temptation to simply discard observations that do not pass some sort of integrity check or appear to be erroneous. The temptation to discard these observations must be weighed against the knowledge that not all observations have the same information content, a statement which is particularly true with respect to resource consumption data collected by utilities. This is because, as mentioned previously, consumption values tend to be log-normally distributed. It is therefore advisable to investigate potential causes of error prior to discarding observations, because the appearance of these errors might be correlated in space, time, or relative to some attribute. In the worst case, correlated errors can be systematically biased toward a subset of observations which, though relatively small in number, may have an above average information content. This approach is both methodological and epistemological. That is, it involves implementing data methods to assess these outliers and to ensure they are not automatically discarded. But, perhaps more importantly, it involves a world view that there are Black Swans and unanticipated situations that bear investigation and understanding as they may reveal trends and tendencies or patterns that had not been considered within “normal” expectations and past experience.
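The caution against automatic discarding can be made concrete by flagging suspect records and measuring how much consumption they represent before deciding what to do with them. The following Python sketch uses a hypothetical record layout and a deliberately crude, hypothetical threshold rule; it is not the CCSC's validation procedure.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


def flag_suspect_records(
    records: List[Record], is_suspect: Callable[[Record], bool]
) -> Tuple[List[Record], float, float]:
    """Flag, rather than discard, suspect records.

    Returns the flagged records, their share of the record count,
    and their share of total consumption.
    """
    flagged = [r for r in records if is_suspect(r)]
    total_kwh = sum(r["kwh"] for r in records)
    flagged_kwh = sum(r["kwh"] for r in flagged)
    record_share = len(flagged) / len(records) if records else 0.0
    usage_share = flagged_kwh / total_kwh if total_kwh else 0.0
    return flagged, record_share, usage_share


# Hypothetical data: 99 ordinary accounts and one very large one.
records = [{"kwh": 100.0} for _ in range(99)] + [{"kwh": 100_000.0}]

# Hypothetical rule: anything above 10,000 kWh looks like an outlier.
flagged, record_share, usage_share = flag_suspect_records(
    records, lambda r: r["kwh"] > 10_000.0
)
print(f"{record_share:.0%} of records flagged, holding {usage_share:.0%} of consumption")
```

In this toy case the rule flags only one record in a hundred, yet that record holds most of the total consumption, so dropping it without investigation would discard exactly the kind of high-information observation the text warns about.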
In order to address these issues in the development of the Energy Atlas, CCSC has developed a multistage process for data validation which involves a staged series of integrity checks. The first of these checks references external sources of information. These include aggregated consumption statistics published by the California Energy Commission as well as by individual utilities themselves. The second phase of these integrity checks involves referencing internal sources of information, such as data for the same location in previous years or, when a time series is unavailable, for similar locations in the same year. This process of comparison requires judgments about what degree of change or discrepancy might be considered feasible. Sharp discontinuities in time series are usually a strong indication of geocoding errors, attribute errors,
or gaps in the raw input consumption data. All of these must be checked for and dealt with on a case-by-case basis by the analysts involved in developing the back-end consumption database. This work also involves using experience, intuition, and discernment.
Putting Consumption into Context
On their own, account-level utility consumption data are not especially useful for the design and implementation of programs or policies. Their value to these types of applications is dramatically increased, however, if they can be combined with sociodemographic and geospatial data. The addition of contextual information vastly expands the potential to extract policy-relevant information from a consumption dataset. When thinking about the types of analytical operations which can be applied to a dataset, it is tremendously useful to have a conceptual model of the data structure that is being used to store it. The UCLA Energy Atlas has been built on a relational data model (Paredaens et al. 2012). According to this model, customer identity information is stored in one table, consisting of rows and columns, and the corresponding consumption information is stored in another, separate table. Entries within each table are linked by the values of a common attribute, or “key.” In the customer table, each customer is represented as a row and the unique attributes for that customer are represented by the values within its columns. Similarly, in the consumption table, each row represents the unique combination of a customer and a time period of consumption. Breaking the data down into these types of separate but linked tables, each with its own set of related attributes, significantly reduces the duplication of stored information while also improving the speed with which it can be accessed by the database system. According to this relational data model (Bartusch et al.
2012), adding a contextual dataset to a set of consumption records can be thought of as expanding the number of columns, or the number of attributes, that are associated with each entity in the dataset. In this way, the size of a dataset can be measured both in terms of its “length”—i.e., the number of rows, or unique customers which it contains—but also in terms of its “width”—i.e., the number of columns, or unique attributes that it possesses. It is far more common for datasets to be considered Big because of their length than because of their width. This is partly because the more attributes collected for a set of entities, the closer it comes to their
complete description, and thus, the greater the likelihood that such data collection efforts would run afoul of privacy concerns. Such an approach “thickens” the data around customer energy use. Within the relational data model, contextual enrichment can be achieved through a powerful set of operations that are collectively known as joins. Generally, a join operation is one that creates a linkage between the records in two separate tables on the basis of a shared attribute or satisfied condition. Interestingly, the application of the “join” concept can be extended beyond simple numerical or text fields and be applied to spatial attributes as well. A spatial join is one in which the attribute keys used to join two separate tables are spatial features (i.e., point locations or polygonal areas) that are associated with each record in the dataset. In such cases, the determination as to which records from each table become joined to which other is made not on the basis of equality, but rather when some predefined spatial relationship—i.e., intersection, proximity, adjacency, etc.—is satisfied. Spatial joins are incredibly powerful tools within the context of utility consumption data analysis. They provide a robust mechanism for linking account-level consumption data to a wide variety of other sources of geographic information such as parcel-level building attribute data collected by municipal tax assessor offices, or sociodemographic data from a national census. In order to execute these types of spatial joins however, a geographic identifier must be associated with each utility account’s consumption data. If the utility does not provide an actual spatial reference, in the form of a latitude–longitude coordinate pair or a premise polygon, then it is the task of the analyst to create one using billing address information that the utility does provide. 
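The two kinds of join described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: the tables, field names, planar coordinates, and parcel polygons are invented for illustration, and a production system would normally rely on a spatial database or GIS library rather than the hand-written point-in-polygon test used here.

```python
# Customer table keyed by account_id, with a point location for each account.
customers = {
    "a1": {"sector": "residential", "x": 2.0, "y": 2.0},
    "a2": {"sector": "commercial", "x": 7.0, "y": 3.0},
}

# Consumption table: one row per (customer, billing period).
consumption = [
    {"account_id": "a1", "month": "2016-01", "kwh": 320.0},
    {"account_id": "a1", "month": "2016-02", "kwh": 290.0},
    {"account_id": "a2", "month": "2016-01", "kwh": 4100.0},
]

# Hypothetical assessor parcels: polygon vertices plus a building attribute.
parcels = {
    "parcel_A": {"year_built": 1954, "polygon": [(0, 0), (4, 0), (4, 4), (0, 4)]},
    "parcel_B": {"year_built": 1998, "polygon": [(5, 0), (9, 0), (9, 5), (5, 5)]},
}


def attribute_join(rows, table, key):
    """Equality join: widen each row with the attributes stored under its key."""
    return [{**row, **table[row[key]]} for row in rows if row[key] in table]


def point_in_polygon(x, y, polygon):
    """Ray casting: count polygon edges crossed by a ray running to the right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # this edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(rows, parcel_table):
    """Join on a spatial relationship (containment) instead of key equality."""
    joined = []
    for row in rows:
        for parcel_id, parcel in parcel_table.items():
            if point_in_polygon(row["x"], row["y"], parcel["polygon"]):
                joined.append({**row, "parcel_id": parcel_id,
                               "year_built": parcel["year_built"]})
                break
    return joined


# Each consumption row becomes "wider": customer attributes, then parcel attributes.
wide = spatial_join(attribute_join(consumption, customers, "account_id"), parcels)
```

The number of rows is unchanged by either join; only the number of columns grows, which is the "widening" of the dataset discussed in the text.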
The conversion of an address string into a latitude–longitude coordinate pair is known as geocoding and is discussed in detail in Chapter 4.
Detecting Significant Changes in Historical Consumption
Another common use for utility data is to detect changes in a time series of consumption values. The ability to identify the timing and extent of changes can be extremely useful for evaluating the efficacy of an energy efficiency program or set of conservation measures. An exceedingly common (but naïve) approach to this type of problem is to simply compare consumption levels in the “before” period to those in the “after” period. This simplistic approach suffers from a number of
shortcomings. First, the lengths of time over which the “before” and “after” periods are defined are undetermined, as is the mechanism for aggregating the consumption levels observed within each time period for use in comparison. Second, consumption levels are nearly always changing. These changes reflect cyclical patterns in the demand for these resources due to changing weather patterns, levels of economic activity, and other consumer behaviors. In light of this reality, the question becomes not merely one of detecting “any” change but rather one of detecting “anomalous” change. In order to determine if a behavior is anomalous or not, it needs to be compared to some baseline of expectation.
The canonical methodology for estimating the efficacy of energy efficiency programs is a technique called “difference-in-differences” (Bartusch et al. 2012; Haben et al. 2015). This method works in the following way. First, the consumption time series for a customer account that participated in an efficiency or conservation program is obtained. This time series must span at least one full cycle of any expected periodic changes in consumption (e.g., annual seasonal changes) both in the period before and the period after the measure was applied. Next, one or more control customers must be selected which match the participant customer as closely as possible in as many different ways as possible, save for their lack of participation in the program. There are many different potential ways in which this customer “similarity” can be measured, but all benefit from having the maximum possible number of descriptive attributes available (Donald and Lang 2007). Finally, once a participant customer has been paired with a group of corresponding nonparticipant control customers, the change in the participant’s consumption between the before and after periods is compared with the corresponding change among the controls. This calculation is generally repeated over many pairings to produce an overall estimate of program effect.
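For a single participant and its matched controls, the standard difference-in-differences estimate is the participant's change in mean consumption minus the average change among the controls. The sketch below shows that arithmetic only; the function names and the monthly kWh values are hypothetical, the matching step is assumed to have been done already, and no statistical inference is attempted.

```python
from statistics import mean


def mean_change(before, after):
    """Change in mean consumption from the before period to the after period."""
    return mean(after) - mean(before)


def difference_in_differences(participant, controls):
    """participant: (before, after) series; controls: list of (before, after)."""
    participant_change = mean_change(*participant)
    control_change = mean([mean_change(b, a) for b, a in controls])
    # Whatever change the controls also experienced (weather, prices, and so on)
    # is subtracted out, leaving the change attributed to the program.
    return participant_change - control_change


# Twelve monthly kWh values per period, so each period spans a full seasonal cycle.
participant = ([500.0] * 6 + [700.0] * 6, [450.0] * 6 + [650.0] * 6)
controls = [
    ([480.0] * 6 + [720.0] * 6, [470.0] * 6 + [710.0] * 6),
    ([520.0] * 6 + [720.0] * 6, [515.0] * 6 + [705.0] * 6),
]

effect = difference_in_differences(participant, controls)
```

Here the participant's mean consumption falls by 50 kWh per month while the controls fall by 10 kWh per month on average, so the estimated program effect is a reduction of 40 kWh per month. The quality of that estimate depends entirely on how well the controls stand in for the participant.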
The difference-in-differences methodology is predicated on two important assumptions. The first is that the customer attributes which are available for use in the matching procedure perfectly correlate with their resource consumption. The second is that there is a negligible behavioral component to the potential efficacy of a program or measure. Our experience in the application of this method to the evaluation of program efficacy suggests that neither assumption holds reliably within the context of energy and water consumption. A novel alternative approach that we are currently exploring involves the following procedure: (1) fitting a model to the participant’s individual
consumption using data from the time period before the intervention, (2) using the model to generate forecast predictions of consumption in the period after the intervention, and then (3) assessing the difference between the forecasted and actual consumptions. The forecasted consumption generated from the model—rather than the consumption of another, similar, account—serves as the counterfactual in this method. This approach is superior to the difference-in-difference method if the accuracy of the consumption model predictions exceeds the accuracy with which a program participant customer could be matched to a non-participant control. We call this type of efficacy evaluation method “simulated historical forecasting” and its applicability is currently the subject of active research. Regardless of how program efficacy is measured however, there is always the possibility that the same measure, when applied in two different locations, might be observed to have both positive and negative effects on resource consumption in the “after” period. This possibility stems from the unpredictable nature of customer behavior post-implementation. When the efficiency of an end-use of energy or water increases, the cost associated with engaging in that end-use decreases. This may lead to a situation where consumers are content to spend the same amount of money for a greater quantity of services in the aftermath of an efficiency improvement. This undesirable situation is called a “rebound effect” and has been observed in many different situations (Berkhout et al. 2000; Greening et al. 2000; Sorrell 2009). Figure 6.2 provides a graphical illustration of the results of a recent experiment involving the application of this new method to the evaluation of the efficacy of a residential energy efficiency program implemented within Los Angeles County. 
The two plots show the consumption time series for two anonymized individual households which have received the same energy efficiency intervention. The timing of the intervention is shown with a blue vertical line. Black dots plot metered monthly electricity consumption for a household over a period of 10 years before the measures were implemented. Blue dots show the measured consumption data in the period following the intervention. The blue shaded area shows the predicted consumption values from a time series model which has been fit using only the consumption data in the before period. Negative deviations from the model fit in the after period indicate energy savings (top, green). Conversely, positive deviations indicate the presence of energy rebound effects (bottom, red).
Fig. 6.2 Graphical illustration of the use of the simulated historical forecasting technique to evaluate the efficacy of a set of energy efficiency measures implemented in two different households
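The three steps of the simulated historical forecasting procedure (fit on the before period, forecast the after period, compare forecast with actual) can be sketched as follows. The text does not specify the time series model that was used, so this sketch substitutes a deliberately simple stand-in: the forecast for each calendar month is the mean of that month's consumption in the before period. The function names and all kWh values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_monthly_profile(before):
    """Step 1: fit the stand-in model. before: list of (month 1..12, kwh)."""
    by_month = defaultdict(list)
    for month, kwh in before:
        by_month[month].append(kwh)
    return {month: mean(values) for month, values in by_month.items()}


def cumulative_deviation(before, after):
    """Steps 2 and 3: forecast the after period, then sum actual minus forecast.

    Negative totals suggest savings; positive totals suggest rebound.
    """
    profile = fit_monthly_profile(before)
    return sum(kwh - profile[month] for month, kwh in after)


# A hypothetical seasonal pattern (kWh per month), repeated for two "before" years.
PATTERN = [400, 380, 360, 340, 360, 420, 520, 560, 500, 420, 380, 400]
before = [(m + 1, kwh) for _ in range(2) for m, kwh in enumerate(PATTERN)]

# Two households receive the same measure but respond differently afterwards.
after_saver = [(m + 1, kwh - 40) for m, kwh in enumerate(PATTERN)]
after_rebound = [(m + 1, kwh + 30) for m, kwh in enumerate(PATTERN)]

saver_total = cumulative_deviation(before, after_saver)
rebound_total = cumulative_deviation(before, after_rebound)
```

Because the forecast plays the role of the counterfactual, the usefulness of the result rests on forecast accuracy; a real application would use a model that also captures trend and weather, and would report uncertainty bounds like the shaded band in the figure.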
There are many potential explanations for how such rebound effects can occur. Some of the causes for these unintended outcomes relate to the design or implementation details of the programs in question. To illustrate how this is possible, consider the following hypothetical example. A program is created which provides customers with a financial rebate on the purchase of a new, more energy efficient refrigerator. This rebate is
applied at the point of sale in the form of a discount on the purchase price of the new appliance. Crucially, no effort is made to recover the old inefficient appliance to ensure that it has been replaced. Consequently, when the customer goes home with their new refrigerator, instead of replacing the old, as intended, they simply move it into the garage and use it as a second, supplementary unit. From a program efficacy perspective, the rebate was perceived as being completely successful at reducing the volume of energy consumed by the customer. In fact, exactly the opposite outcome was achieved. The rebate functioned instead to unlock a higher level of energy consumption that was hitherto restricted by the high upfront cost of a new appliance purchase. Another example is in the case of the installation of a new, more efficient air conditioning system. Rather than keeping the cooling temperature the same as before, the customer cools their space more aggressively, hence keeping similar levels of energy use, or increasing it. The ability to access a long historical time series of granular resource consumption data is critical for the successful application of this method. Access to such a time series also helps the analysts better understand the individual behaviors and decisions that might be responsible for any observed savings or rebound effects. If one tries to evaluate the efficacy of an energy efficiency program using geographically aggregated data, they will often find that average savings of the program are close to zero. This is because the aggregation functions to group together the experiences of individual consumers where the program was effective with individual consumers for whom the program led to rebound. As surprising as it may sound, the state of California does not use metered consumption data to evaluate the efficacy of its energy efficiency and conservation programs. 
Rather, program savings are modeled “ex ante” (i.e., before they have been implemented) using a strict set of predefined models and assumptions (California Public Utilities Commission [CPUC] 2014b). Despite the widespread use of this approach, we at the CCSC struggle to understand how modeled savings could ever be more accurate than measured savings.
Forecasting Future Consumption Levels
Another important application of contextually enhanced utility consumption data is in forecasting future resource demands (Huang et al. 2015). Accurate forecasts are essential for planning new capital infrastructure
development, changes to rate tariff structures, and long-term procurement contract negotiations. A common approach to dealing with the inherent uncertainty associated with predicting future behaviors is to develop a set of plausible scenarios (Ghanadan and Koomey 2005). Gaining access to granular resource consumption data and enriching it with contextual attributes is a useful mechanism by which different growth trends can be extracted and different scenarios of future change defined. As has been mentioned previously, many types of resource consumption exhibit strong cyclical patterns of increase and decline. This periodic behavior can either be explicitly accounted for in the structure of a forecasting model or explicitly removed by normalizing consumption values relative to a periodic source of variation—such as temperature or precipitation. In either case, these regular periodic variations often occur amidst other, long-term patterns of increase or decrease in consumption levels due to ongoing structural processes affecting the system. These processes could be economically driven, such as the increasing scarcity of the resource, or technologically driven, such as the increasing efficiency with which the resource is able to be used to render services. In either case, isolating these structural trends—disaggregated by different customer groups—is one of the most important components of any forecasting exercise. Aggregate growth trends computed over a large and heterogeneous customer base, can often lead to misleading interpretations. For example, on average, year over year growth in the resource consumption for all customers in a utility service territory could appear to be negligible overall. However, this zero-growth conclusion could be masking significant, but simultaneous, trends of growth and decline exhibited within specific customer groups. 
If the size or relative contribution of one or another of these groups is expected to increase in future time periods their respective growth trends will come to dominate the behavior of the collective over the long term. Such differences among customer groups may also have significant equity implications. We have observed this phenomenon while forecasting the future penetration of battery electric and plug-in hybrid vehicles in Los Angeles County. Aggregated statistics for this area indicate robust growth in the demand for these vehicles, but at rates which were neither among the highest nor the lowest nationwide. However, when the data were disaggregated by zip code and normalized to population densities, we found
that the rates of growth in EV penetration per capita in LAC’s more affluent communities were among the highest in the nation. Moreover, on the opposite end of the spectrum, many Disadvantaged Community neighborhoods were found to be experiencing zero or near zero growth in BEV/PHEV adoption. This is a troubling outcome from an equity standpoint, and has been the focus of several ongoing projects. Ideally, we believe that growth forecasts should be developed at the most granular level possible, potentially even for individual customers. In this way, forecasting methods can directly account for the individual periodic variations and growth behaviors exhibited by each customer. This type of granular analysis, previously computationally prohibitive, is now very much within the realm of feasibility. Moreover, developing individualized growth forecasts for individual customers allows for these forecasts to be used for other applications, such as with the preceding simulated historical forecasting approach to energy efficiency program efficacy evaluation. As the state of California continues to adopt policies promoting electrification and the improvement of building energy performance, implementation success will be predicated on understanding customer behavior, building performance, and more. Without granular information, these attempts will likely have low success rates and be much costlier than need be. Forecasting at this type of scale requires the use of mathematical models that are robust to missing data, have parameters which can be automatically fit, and, generally, do not require manual interventions by a research analyst to deliver good results. We have found that these requirements are very similar to those encountered by major technology companies interested in forecasting user engagement or customer sales. 
As such, we have experienced good results in efforts to adapt forecasting model architectures that were originally developed for these other applications to customer level utility resource consumption forecasting problems (Taylor and Letham 2018).
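The way an aggregate growth trend can mask opposing trends within customer groups, discussed earlier in this section, is easy to reproduce with a small example. The sketch below fits a least-squares linear trend that simply skips missing observations, which is one elementary way to tolerate gaps without manual intervention. The group names and annual consumption values are hypothetical, and this is not the forecasting architecture cited above.

```python
def trend_slope(series):
    """Least-squares slope per period, skipping missing (None) observations."""
    points = [(t, y) for t, y in enumerate(series) if y is not None]
    n = len(points)
    t_bar = sum(t for t, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    numerator = sum((t - t_bar) * (y - y_bar) for t, y in points)
    denominator = sum((t - t_bar) ** 2 for t, _ in points)
    return numerator / denominator


# Hypothetical annual consumption (GWh) for two customer groups, with gaps.
groups = {
    "group_a": [100.0, 110.0, None, 130.0, 140.0],  # growing
    "group_b": [140.0, 130.0, 120.0, None, 100.0],  # declining
}

# Territory-wide total, defined only for years in which both groups reported.
aggregate = [
    a + b if a is not None and b is not None else None
    for a, b in zip(groups["group_a"], groups["group_b"])
]

group_slopes = {name: trend_slope(series) for name, series in groups.items()}
aggregate_slope = trend_slope(aggregate)
```

The aggregate series shows zero growth, while one group is growing by 10 GWh per year and the other is shrinking by the same amount. If the growing group's share of the customer base increases, a forecast built on the aggregate trend alone would miss the change.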
Utility Customer Data Models Versus Descriptions
A description is a set of observations about the state of a system under a given set of conditions. A model is a mathematical construct which seeks to predict system states. Models are, by definition, simplified representations of reality. They can be made more or less complex depending upon the intended application. Simple models are often useful as teaching aids
because their mechanics are readily understood. However, overly simplified “toy” models tend to suffer from low prediction accuracies. More complex models can deliver much higher prediction accuracies, but they often do so at the expense of their intelligibility (Bender 2000). When descriptions are both accurate and complete, they can be used to inform the development of new models or the refinement of existing ones. High-quality descriptions are necessary prerequisites for the development of high-quality models. Thus, models which have been developed in the absence of accurate or complete descriptions can be completely wrong and lead to grave errors in judgment or planning. Our decision to develop the UCLA Energy Atlas as a purely descriptive platform was highly intentional. The energy domain is filled with a huge variety of different types of models. However, the historical lack of available granular consumption data means that these models have been developed using limited or inferior quality descriptions. One of our primary objectives in developing the UCLA Energy Atlas as a descriptive platform was therefore to support model development efforts by making the highest possible quality descriptions of the energy system available to other researchers and the public at large.
Modeling Applications
In the context of building energy, where so many models have been developed in many parts of the globe, it is important to distinguish models from descriptive platforms and to know when and where models are useful in contrast to ground-up data. Models that are capable of taking publicly available customer attributes as inputs and generating high accuracy predictions of customer level resource consumption as outputs can be useful for a number of applications. These types of models can enable the precision forecasting of future resource consumption demands under different hypothetical growth scenarios.
They can also provide a mechanism for navigating privacy constraints when attempting to share granular consumption information for planning purposes. This is because account-level model predictions can provide insight into the expected resource consumption intensity of a customer of a given type, without actually revealing the values for any specific customer. For these types of applications, the choice of modeling architecture need not be so concerned with scrutability, as prediction accuracy is the main objective.
Thus, the use of complex, non-linear model architectures such as artificial neural networks is quite acceptable. There are other modeling applications for utility data where the core purpose is to develop intuition about the structural determinants of certain patterns of consumption. In these cases, the use of very simple, often linear, model architectures can deliver important insights about which customer attributes are most significant in determining consumption levels. Individual attributes or specific combinations of attributes identified as key determinants of consumption within such a model can then be incorporated into conservation or efficiency policies. These targeted policies have the ability to focus only on the subset of customers whose attributes strongly suggest that their consumption behavior may be problematic or excessive, without actually requiring the knowledge of their individual consumption.
Minimum Data Requirements
The UCLA Energy Atlas contains utility consumption records that have been sampled at a monthly temporal resolution and a parcel-level geographic resolution. While these are impressively granular spatial and temporal scales of observation, even more refined sampling is possible. For example, with the widespread deployment of GPS-located advanced metering infrastructure, it is now technically possible to record electricity consumption at 15-minute intervals all the way down to the individual customer level. These new technical advancements in data collection raise the question, however, of exactly how much data is necessary. In this day and age, when cloud storage and compute options have become so readily available, it is fascinating to note how seldom this issue is raised. Each year new studies highlight the large and growing energy and material footprints associated with cloud computing.
If the ultimate purpose of storing energy consumption data is to enhance energy conservation, should the energy costs of storing all of this data not also be weighed in the balance? Fifteen-minute interval electricity consumption datasets collected for millions of customers within large metropolitan areas are enormous, on the order of hundreds of terabytes per annum. Before we commit to the cost and complexity of storing this information over the long term, utility managers, policymakers, and researchers should think long and hard about the types of questions that this data could potentially be used
to answer. It is possible that data needs could be best satisfied by generating some statistical distillation of the raw data. Such digested products might then also be easier to store and disseminate. Such an exercise can perhaps be analogized as a form of compression, with the goal being the persistence of essential information, and the elimination of repeated or nonessential data values.
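As one illustration of what a "statistical distillation" might look like, the sketch below reduces a month of 15-minute interval readings for a single account to a 24-value mean hourly load profile. This is only one of many possible summaries and is not a product of the Energy Atlas; the function name and the synthetic readings are hypothetical.

```python
from statistics import mean

INTERVALS_PER_DAY = 96  # 24 hours x 4 fifteen-minute intervals


def hourly_profile(days):
    """Distill daily interval readings into mean energy use per hour of day.

    days: list of lists, each holding 96 fifteen-minute readings (kWh).
    Returns 24 values (kWh per hour), averaged across all supplied days.
    """
    for day in days:
        if len(day) != INTERVALS_PER_DAY:
            raise ValueError("each day must contain 96 interval readings")
    profile = []
    for hour in range(24):
        hourly_totals = [sum(day[hour * 4:(hour + 1) * 4]) for day in days]
        profile.append(mean(hourly_totals))
    return profile


# Synthetic account: 0.1 kWh per interval, rising to 0.3 kWh from 18:00 to 22:00.
typical_day = [0.1] * 72 + [0.3] * 16 + [0.1] * 8
month = [list(typical_day) for _ in range(30)]

profile = hourly_profile(month)
raw_values = sum(len(day) for day in month)   # 2880 stored readings
reduction_factor = raw_values / len(profile)  # values stored before vs. after
```

The summary keeps the shape of the daily load curve, including the evening peak, while storing 120 times fewer values. What it discards is day-to-day variation, so whether such a summary is adequate depends on the question being asked, which is the point made in the text.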
Future Research Questions
The preceding ideas are suggestive of some potentially fruitful research questions. One of the most important relates to the challenges of data aggregation, which have been frequently discussed. As an example, as part of one of our most recent research projects we have begun to analyze a large sample of hourly interval, account-level natural gas consumption data. This dataset only corresponds to the residential accounts within two zip codes. Yet in spite of this, its size is already enormous, very nearly crossing the threshold into the realm of “big data.” Additionally, in the process of working with this data we have observed extremely high levels of variation between the consumption profiles of different customer accounts. Given the potential importance of these account-level variations to different types of research questions, we have begun to wonder whether it would be possible to distill such a high-frequency, account-level consumption dataset into a more compact set of descriptions. But what would these descriptions look like? There has not yet been a comprehensive assessment of the essential sources of variation in account-level utility customer data. These sources must be better understood in order to preserve the information which these data contain. Along these lines, we often wonder: would it be possible to develop a “compression algorithm” for utility customer data? How might an approach to such a problem build upon the techniques that are popularly used to reduce the size of digital images or audio and video streams? If the effective compression of a large energy dataset were possible, what would this mean for the data’s spatial and temporal dimensions? What would be the optimal units of aggregation for each?
It is conceivable that such a compression algorithm could be designed with the constraint that all of its output descriptions abide by privacy regulations and thus, that its outputs always be publicly shareable. More research is needed on the subject of data aggregation to determine whether an optimal solution can be found.
The energy consumption which takes place in buildings constitutes a significant component of total urban energy use in California and across the world. The success of policies to reduce energy consumption in an equitable manner, and to decarbonize, is predicated on accurate and specific data about buildings and customers. Consumption data which have been joined to relevant contextual attributes increase understanding and the ability to craft the right interventions. The UCLA Atlas includes California climate information, which is important as there are significant differences between coastal and inland regions, even in the same jurisdiction. The spatial and geographic data needed will depend on local variables, but data matching and mapping enable the use of those variables. What those variables are will be best determined by local experts who are familiar with the place. However, it is also important to have a strong understanding of the promises and limitations of big data, as we hope this chapter has illustrated. Not only is big data in itself not useful without a clear intent for its application, but how it is used for change detection, modeling, or forecasting requires a clear understanding of the methods used and their relationship to big-picture analytical objectives.
Bibliography
Barnes, T. J. (2013). Big Data, Little History. Dialogues in Human Geography, 3(3), 297–302. https://doi.org/10.1177/2043820613514323.
Bartusch, C., Odlare, M., Wallin, F., & Wester, L. (2012). Exploring Variance in Residential Electricity Consumption: Household Features and Building Properties. Applied Energy, 92, 637–643.
Bender, E. A. (2000). An Introduction to Mathematical Modeling. New York: Dover Publications.
Berkhout, P. H., Muskens, J. C., & Velthuijsen, J. W. (2000). Defining the Rebound Effect. Energy Policy, 28(6–7), 425–432.
Borgman, C. L. (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press. https://books.google.com/books?hl=en&lr=&id=gL8vBgAAQBAJ&oi=fnd&pg=PR7&dq=big+data+little+data+no+data&ots=I6a51Fcl51&sig=U6LUx2rjCmEhucLbipOcawzPI9o#v=onepage&q=big+data+little+data+no+data&f=false. April 6, 2020.
Bloch, J. (2006, October). How to Design a Good API and Why It Matters. In Companion to the 21st ACM SIGPLAN Symposium on Object-Oriented Programming Systems, Languages, and Applications (pp. 506–507).
Blokdijk, G., & Menken, I. (2009). Cloud Computing-the Complete Cornerstone Guide to Cloud Computing Best Practices: Concepts, Terms, and Techniques for Successfully Planning, Implementing… Cloud Computing Technology (2nd ed.). London, GBR: Emereo Pty Ltd.
Bowker, G. C., & Gitelman, L. (2013). “Raw Data” Is an Oxymoron. Cambridge, MA: The MIT Press.
California Public Utilities Commission (CPUC). (2014a). Decision [D. 14-05016] Adopting Rules to Provide Access to Energy Usage and Usage-Related Data While Protecting Privacy of Personal Data. Sacramento, CA.
California Public Utilities Commission (CPUC). (2014b). Ex Ante Review Fact Sheet #2: The Commission’s Ex Ante Review Process. http://www.cpuc.ca.gov/WorkArea/DownloadAsset.aspx?id=5329.
Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business Intelligence and Analytics: From Big Data to Big Impact. MIS Quarterly, 36, 1165–1188.
Donald, S. G., & Lang, K. (2007). Inference with Difference-in-Differences and Other Panel Data. The Review of Economics and Statistics, 89(2), 221–233.
Frisk, J. E., & Bannister, F. (2017). Improving the Use of Analytics and Big Data by Changing the Decision-Making Culture. Management Decision, 55(10), 2074–2088.
Gandomi, A., & Haider, M. (2015). Beyond the Hype: Big Data Concepts, Methods, and Analytics. International Journal of Information Management, 35(2), 137–144.
Ghanadan, R., & Koomey, J. G. (2005). Using Energy Scenarios to Explore Alternative Energy Pathways in California. Energy Policy, 33(9), 1117–1142.
Gibson, J., Rondeau, R., Eveleigh, D., & Tan, Q. (2012, November). Benefits and Challenges of Three Cloud Computing Service Models. In 2012 Fourth International Conference on Computational Aspects of Social Networks (CASoN) (pp. 198–205). IEEE.
Greening, L. A., Greene, D. L., & Difiglio, C. (2000). Energy Efficiency and Consumption—The Rebound Effect—A Survey. Energy Policy, 28(6–7), 389–401.
Haben, S., Singleton, C., & Grindrod, P. (2015). Analysis and Clustering of Residential Customers Energy Behavioral Demand Using Smart Meter Data. IEEE Transactions on Smart Grid, 7(1), 136–144.
Hey, T., Tansley, S., & Tolle, K. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery (Vol. 1, T. Hey, Ed.). Redmond, WA: Microsoft Research.
Huang, Z., Yu, H., Peng, Z., & Zhao, M. (2015). Methods and Tools for Community Energy Planning: A Review. Renewable and Sustainable Energy Reviews, 42, 1335–1348. https://doi.org/10.1016/j.rser.2014.11.042.
Kitchin, R. (2014). Big Data, New Epistemologies and Paradigm Shifts. Big Data & Society, 1(1), 2053951714528481. http://journals.sagepub.com/doi/10.1177/2053951714528481. April 3, 2020.
Labrinidis, A., & Jagadish, H. V. (2012). Challenges and Opportunities with Big Data. Proceedings of the VLDB Endowment, 5(12), 2032–2033. http://dl.acm.org/doi/10.14778/2367502.2367572. April 3, 2020.
Lepri, B., Staiano, J., Sangokoya, D., Letouzé, E., & Oliver, N. (2017). The Tyranny of Data? The Bright and Dark Sides of Data-Driven Decision-Making for Social Good. In Transparent Data Mining for Big and Small Data (pp. 3–24). Cham: Springer.
Lindblom, M. (2016). Seattle’s New, Over Budget Computer System Let Utility Customers See Others’ Bills. Seattle Times. https://www.seattletimes.com/seattle-news/politics/city-light-computer-glitch-lets-customers-seeother-users-bills/. April 21, 2020.
Los Angeles Times Editorial Board. (2019). LADWP Still Hasn’t Resolved Its Billing Debacle After Six Years—Los Angeles Times. Los Angeles Times. https://www.latimes.com/opinion/story/2019-10-01/the-dwp-billing-debacle-continues. April 21, 2020.
McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D. J., & Barton, D. (2012). Big Data: The Management Revolution. Harvard Business Review, 90(10), 60–68.
McKenna, E., Richardson, I., & Thomson, M. (2012). Smart Meter Data: Balancing Consumer Privacy Concerns with Legitimate Applications. Energy Policy, 41, 807–814.
Paredaens, J., De Bra, P., Gyssens, M., & Van Gucht, D. (2012). The Structure of the Relational Database Model (Vol. 17). Heidelberg: Springer Science & Business Media.
Sorrell, S. (2009). Jevons’ Paradox Revisited: The Evidence for Backfire from Improved Energy Efficiency. Energy Policy, 37(4), 1456–1469.
Strong, D. M., Yang, W. L., & Wang, R. (1997). Data Quality in Context. Communications of the ACM, 40(5), 103–110.
Taylor, S. J., & Letham, B. (2018). Forecasting at Scale. The American Statistician, 72(1), 37–45.
Theis, T. N., & Philip Wong, H. S. (2017). The End of Moore’s Law: A New Beginning for Information Technology. Computing in Science & Engineering, 19(2), 41–50. http://ieeexplore.ieee.org/document/7878935/. April 6, 2020.
Wallis, J. C., Rolando, E., & Borgman, C. L. (2013). If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology. PLoS One, 8(7). www.plosone.org. April 6, 2020.
Williams, R. S. (2017). What’s Next? [The End of Moore’s Law]. Computing in Science & Engineering, 19(2), 7–13.
CHAPTER 7
Case Studies
© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_7
Introduction
The Atlas and its underlying data support a multitude of applications and projects that address previously unanswerable questions. Using consumption data at the billing address level, correlated with an array of spatial attributes, we are able to perform spatial, temporal, sociodemographic, and building attribute-based queries. Most of this analysis is conducted using the confidential billing address-level data, which reside in our backend relational database system, as described in Chapter 4. This rich collection of attributes creates a thick set of interactive layers, our version of digital thick mapping, as discussed in Chapter 2. It also enables the quantitative evaluation of new energy system designs for more targeted, more appropriate, and more equitable energy transitions.
University researchers are particularly suited to conduct these types of analyses, as they have the specialized training in the computational tools and technical infrastructure systems that are required, whereas state agencies and local governments usually do not have the dedicated staff or funding to engage in such laborious and computationally intensive research. However, our experience demonstrates the benefits and importance of collaboration with public agencies and other stakeholders in all phases of our research to understand conditions on the ground and their implications for a just energy transition. Such collaborations are critical to ensure that policy is not only informed by the appropriate data, but also targeted, effective, and equitable.
The California Center for
Sustainable Communities is one of only a handful of university research teams that have partnered directly with members of community-based organizations representing underserved communities. Our experiences, as we help to pioneer this approach, show that while there can be challenges in communication and integrating feedback, working directly with community-based partners ensures our research is grounded, and that it provides the type of analysis most relevant to their interests and needs.
Federal funding sources (e.g., National Science Foundation) typically do not support applied research projects such as the Atlas, which are intended to provide information for local and state policymakers. Similarly, funding through sources such as the US Department of Energy tends to be restricted to technological innovation projects. As such, our projects are predominantly funded by the state of California and local governments. Researchers in California are fortunate in this regard, as the state remains committed to funding research to advance the state’s energy policies, as well as to ensure a more equitable energy future.
Below we present a number of case studies illustrating the use of the Energy Atlas for applied research projects. First, we describe four state-funded research grants which were made possible by the power of the Atlas’ dataset. Next, we discuss how the Atlas was used to develop a GHG emissions inventory and business-as-usual scenarios for a county-scale sustainability plan. This is followed by a description of another energy consumption database we are creating, modeled on the Atlas, that comprises nine counties in Northern California. We end with a discussion of our participation in an international research collaboration around building energy data, which serves to demonstrate the Atlas’ unique standing on a global scale.
Advanced Energy Communities Project
Overview
In 2016, we began a three-year concept project, formally titled “Accelerating AEC Deployment around Existing Buildings in Disadvantaged Communities through Unprecedented Data Analysis and Comprehensive Community Engagement.” Funding was provided through a grant from the California Energy Commission (CEC), a state regulatory and planning agency that oversees aspects of managing and improving the statewide energy system. The project was developed in response to a
CEC solicitation seeking new and innovative ways to incentivize the wider adoption of solar systems in under-resourced communities. The project aimed to create a replicable “advanced energy community” (AEC) design (see below for a full definition) that would address structural and programmatic barriers to a renewable energy transition within the under-resourced communities of Bassett and Avocado Heights, pictured in Fig. 7.1. The AEC project was a collaboration between university
Fig. 7.1 Overview map of the six census tracts (orange) which comprise the Disadvantaged Communities of Bassett and Avocado Heights selected as the study area for the AEC project. Note the number of major freeways intersecting the site as well as the adjacent large industrial facilities (bottom right), decommissioned landfill (bottom center), and gravel quarry (top center) (Basemap data credit: Mapbox)
and nonprofit organizations. Following a three-year in-depth analysis and the development of a comprehensive plan, including complex public–private funding and partnerships, this project was selected by the CEC for implementation funding starting in 2020, with a $9M grant over five years.
The Advanced Energy Community project shows the power of purposeful data collection and analysis for energy transitions and resilience planning. The data mining and analytics performed during the course of this project demonstrate the depth and richness of the Energy Atlas data, and how information can be generated for planning and policymaking while protecting customer privacy. While this specific project is modest in its funding and extent, the resulting model is highly scalable.
The team assembled for the project included UCLA researchers in collaboration with: The Energy Coalition, a local nonprofit responsible for energy system modelling and design; Day One, a community-based organization that conducted all the outreach and education in the area of the proposed project; and a technical advisory committee consisting of stakeholders from nonprofit, governmental, and university sectors.
Background
An advanced energy community, as defined by the CEC, is one in which energy efficiency and renewable energy technologies are deployed in ways that reduce energy costs and GHG emissions, improve air quality, home comfort, and resilience, and are also financially attractive and replicable. This is an especially challenging task in low-income and under-resourced communities, which face a number of critical barriers to accessing existing energy efficiency programs and new energy technologies. Such barriers include (CEC, 2016):
• More renters, lower-income residents, and limited English-speakers. Under-resourced communities tend to have higher percentages of residents that cannot make substantial energy efficiency upgrades to the buildings they inhabit.
Renters generally lack the agency to upgrade appliances and improve building energy efficiency. This significantly limits their opportunities to adopt efficiency upgrades, rooftop solar, and energy storage systems. For low-income building owners, the initial costs associated with energy efficiency measures
or rooftop solar are often beyond a household’s ability to pay. (California, like most places in the United States, approaches energy efficiency program adoption as a ground-up, household-by-household endeavor that relies on individuals participating and contributing a significant proportion of the cost.) Furthermore, structural inadequacies in rooftops can create additional expenses that must be overcome before on-site solar installation is possible. The percentage of residents who speak limited English in California also tends to be higher in under-resourced communities, creating a barrier to understanding information provided by organizations promoting energy efficiency or solar installation.
• Lack of data on the effectiveness of energy efficiency upgrades. Billions of dollars have been spent on energy efficiency upgrades in California. However, data on energy use before and after implementation of these upgrades, which are needed to evaluate their performance, remain limited, as explained in the previous chapter. The performance of these upgrades is also rarely evaluated across income groups, building age, and other characteristics. This lack of data limits the ability to target locations most in need and to determine where specific programs have been most successful; it also limits opportunities to accurately quantify benefits to ratepayers and investors, thereby creating risks and uncertainties around financing and implementing work. Most importantly, many energy efficiency investments may not be particularly effective. This is troublesome because programs often claim savings even where there may be no actual improvement in building performance and thermal comfort for residents. Overall, this uncertainty intensifies in under-resourced communities, where financial risks can have greater impact.
There also remains a question of whether low-income communities can afford to pay for retrofits at all, and whether there is a public responsibility to undertake such retrofits to improve the building stock for the long term.
• Lack of full community engagement. Homeowners and renters are often isolated in efforts to learn about energy efficiency products and services and to evaluate the competency of contractors and real estate professionals. A lack of community outreach means that only “well-connected” residents with access to resources and expertise can access incentive programs. Even for those individuals, the complexities and uncertainties involved in applying for
help can be disincentives to action. Such issues are amplified in under-resourced communities, where education, language barriers, or both can inhibit knowledge transfer, and where greater portions of residents are renters with fewer connections to traditional pathways for this information. This does not mean, however, that these communities are unaware of their energy use or of the need to conserve. Furthermore, there is a history of predatory companies selling solar PV and/or energy efficiency upgrades in these communities by taking advantage of homeowners’ lack of familiarity with nuanced contracts and program details. This has resulted in skepticism and a reluctance to engage around these programs unless the message is from a highly trusted source, such as a local elected official, a long-standing community organization, or a university (ideally in partnership with the local government and/or community organization).
Role of the Energy Atlas
Statistical analysis made possible by Energy Atlas data allowed the project team to develop critical insights and support the implementation of the project’s four components:
1. Energy use profiles for buildings in the project pilot community: By using account-level consumption data to understand the links between energy consumption and building characteristics, the research team developed highly accurate statistical profiles for baseline consumption, energy efficiency savings potential (see below), and remaining demand to be met through solar PV generation, all while respecting data confidentiality. This detailed level of analysis, made possible by the Atlas, increased confidence in the final energy system design and related financing decisions.
2. Identification of candidate locations for community solar installation: To identify the best locations for a community solar installation, the research team assessed the solar potential of rooftops and paved parking areas for properties in L.A.
County near the pilot site, including properties such as churches and school district campuses. Pursuing the novel strategy of developing community-scale solar generation assets allows renters and those without the
capacity to host rooftop solar to purchase renewable energy through a community solar green tariff.1 The community approach departs from the conventional utility model, which assumes a direct relationship between the customer and the meter. This is not the case for community solar systems, and if such systems are to proliferate, utility billing practices will need to change. New types of electric service providers, such as Community Choice Aggregators, have the flexibility to support these innovations.2 In a related effort, a solar prioritization tool to identify solar capacity was also developed during the course of this project. The tool will be useful for replication of such analysis throughout the region (see item 4 below).
3. Analysis of energy efficiency program effectiveness: Monthly electricity usage and energy efficiency participation data for 10 million accounts in Southern California were analyzed using a difference-in-differences econometric approach. All accounts sampled were from Southern California Edison’s (SCE) service territory, and the analysis controlled for the effects of building vintage, square footage, and use type. The scale of the analysis (more than 160 million monthly observations) ensured credible and generalizable results. The research team also examined the geographic distribution of energy efficiency program participants across Southern California by income level, and quantified the extent of energy efficiency program uptake within the project pilot area (Chuang et al. 2018).3 The results of the difference-in-differences analysis informed the choice of energy efficiency measures incorporated into the AEC model design, as well as assumptions around energy efficiency opportunities and potential energy savings for the pilot site.
1 A community solar green tariff program enables residential customers in DACs (who may be unable to install solar on their roof) to benefit from a local solar project and receive a 20% bill discount. https://www.cpuc.ca.gov/SolarInDACs/#CSGT.
2 Community Choice Aggregators (CCAs) have been created throughout California. They are made up of local governments that band together to purchase electricity, which is then transmitted through the utility grid. In Southern California, the CCA led by the County of Los Angeles is called the Clean Power Alliance (CPA).
3 Yating Chuang, Magali Delmas, Felicia Federico, Eric Fournier, Stephanie Pincetl (2018). UCLA AEC Project, Energy Efficiency Program Effectiveness Analysis, Final Report. https://ucla.app.box.com/s/xp0dkev4qiu9l3qyzokmbvm5mg2ms8d6.
4. Solar prioritization tool to guide investments in under-resourced communities: This tool scales up the analysis described in project component 2, above, and adds further capabilities. The Solar Prioritization tool combines highly detailed consumption data for Los Angeles County from the Energy Atlas with a number of other data sets, including an existing map of rooftop solar generation potential, a map of existing grid infrastructure and available surplus capacity for the local electric grid,4 hourly demand and generation profiles, socioeconomic data from CalEnviroScreen,5 and socioeconomic and population data from the US Census Bureau’s American Community Survey. A novel analysis based upon the combination of these datasets allowed for geographic areas to be prioritized for the development of rooftop solar PV assets, taking into account local, on-site demand and the capacity of local grid circuits to accommodate distributed generation assets.
Results and Findings
Implementation-Ready Advanced Energy Community Design
The selected pilot site for the project is Avocado Heights/Bassett, an unincorporated area of Los Angeles County in the San Gabriel Valley. Three major freeways, a (now decommissioned) landfill, and a major industrial center surround this community on all sides. Residents are affected by high levels of air pollution, including particulate matter, and ongoing concerns about harmful lead and arsenic emissions from a nearby lead battery recycling center.
Regionally downscaled climate projections indicate that this community will experience more than 40 additional days of extreme heat (>95°F) per year by 2050.6
4 At the direction of the California Public Utilities Commission (CPUC), the incumbent electric utility, Southern California Edison, publishes a Distributed Energy Resource Interconnection Map (DERiM) of existing electric grid infrastructure resources and available capacity for new distributed energy interconnections for the entire utility service territory, including many parts of LA County.
5 CalEnviroScreen is a tool that the state of California developed to identify communities that are most impacted by a combination of pollution burden and socioeconomic factors. https://oehha.ca.gov/calenviroscreen/report/calenviroscreen-30 (Accessed January 30, 2020).
6 https://www.latimes.com/local/california/la-me-extreme-heat-20150514-htmlstory.html.
The Advanced Energy Community design provides locally generated, GHG-free electricity from community solar and storage to offset electricity consumption of participants who “opt in” through an enrollment system. The Clean Power Alliance of Southern California, the local community choice aggregator, will sponsor the community solar and storage project and allow the community solar green tariff program to be implemented as a pilot. The design also incorporates electric vehicle (EV) charging infrastructure and a community EV car-share program, a community resilience center located in a large neighborhood church, in-home energy efficiency upgrades, and a rooftop solar pilot program that uses blockchain accounting7 to support peer-to-peer energy trading.
Using applied research to develop implementation plans requires flexibility. As an example, between the completion of the design phase and the award of the implementation phase, some aspects of the design were revised in response to the implementation phase solicitation. The final AEC design will start in 2020, implemented by The Energy Coalition, with the CCSC leading the performance monitoring and evaluation, and conducting a project case study.
Assessment of Energy Efficiency Program Effectiveness
The following are the key findings and recommendations from this analysis (Chuang et al. 2018):
• Between 2010 and 2015, the overall energy efficiency program adoption rate was about 8%, compared to 5.5% among the lowest income quartile.
• Compared to nonparticipants, energy efficiency program participants tend to live in newer houses and are more likely to be homeowners, rather than renters.
• Energy efficiency program participants live in neighborhoods with higher incomes and lower population densities, and are more likely to be white, Asian, and highly educated.
• Energy efficiency program participation among the highest income quartile is nearly twice that of those within the lowest income quartile.
7 Blockchain accounting is a record-keeping technology that stores records of credits and debits (in this context, units of energy generated and consumed) in a distributed manner—among the users in the network—rather than a centralized one.
• Households whose houses were built after 1990 have higher participation than those in houses built before 1950 (around 9% and 5%, respectively). Most of the building stock in Los Angeles dates from between the 1940s and 1970s.
• The most effective programs with respect to electricity savings were those providing incentives to upgrade pool pumps and refrigerators. Pool pump incentives led to, on average, 12–13% savings among participating households, while programs that facilitated energy efficient refrigerators yielded, on average, 6% savings.
• Lighting programs, despite high participation by households, yielded only about 1% savings.
• The analysis showed almost no significant change in electricity consumption among participants who received incentives for heating, ventilation, and air conditioning (HVAC) or whole-building retrofit programs. The analysis was limited by the lack of differentiation between upgrades for heating services and upgrades for cooling services within the utility’s HVAC program data.
Prototype Solar Prioritization Tool
Details of this tool are discussed in the next section.
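As an illustration only, the logic behind the difference-in-differences comparison used in the program effectiveness analysis can be sketched in a few lines of Python. The consumption figures and the function names below are hypothetical and are not drawn from the study data. The published analysis (Chuang et al. 2018) relied on a much larger sample with controls for building vintage, square footage, and use type, all of which this two-group, two-period sketch omits.

```python
from statistics import mean

# Hypothetical monthly electricity use (kWh), for illustration only.
# Each record is (group, period, kwh); "treated" marks program participants.
observations = [
    ("treated", "pre", 620.0), ("treated", "pre", 580.0),
    ("treated", "post", 540.0), ("treated", "post", 520.0),
    ("control", "pre", 610.0), ("control", "pre", 590.0),
    ("control", "post", 600.0), ("control", "post", 580.0),
]


def group_mean(records, group, period):
    """Average kWh for one group in one period."""
    return mean(kwh for g, p, kwh in records if g == group and p == period)


def did_estimate(records):
    """Change among participants minus change among nonparticipants."""
    treated_change = (group_mean(records, "treated", "post")
                      - group_mean(records, "treated", "pre"))
    control_change = (group_mean(records, "control", "post")
                      - group_mean(records, "control", "pre"))
    return treated_change - control_change


effect_kwh = did_estimate(observations)
baseline_kwh = group_mean(observations, "treated", "pre")
print(f"Estimated effect: {effect_kwh:.1f} kWh/month "
      f"({effect_kwh / baseline_kwh:.1%} of participant baseline)")
```

Subtracting the change observed among nonparticipants removes trends that affect both groups, such as weather or rate changes, so the remainder can more plausibly be attributed to the program. Expressing the estimate as a share of baseline use mirrors how the percentage savings in the findings above are reported.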
Solar Prioritization Tool 1.0
Overview
As mentioned in the previous section, the AEC project research team developed a pilot Solar Prioritization Tool to identify promising areas for new rooftop solar energy resources across metropolitan Los Angeles (Porse et al. 2020). The tool is an interactive, web-based platform that visualizes multiple data sets and allows users to dynamically explore net solar potential and underlying data sets using advanced web-mapping tools. The platform aims to meet multiple energy planning goals: local energy system resilience, community-scale zero-net electricity, grid reliability, and prioritized investments in under-resourced communities. The capability to prioritize new solar installations in high detail is new, as grid data and solar potential have only recently become publicly available.
Combined with consumption data from the Energy Atlas and information about the building stock and sociodemographics, these newly available data allowed researchers to create a novel tool to help guide distributed solar investments, and to provide insights into differences among communities.
Background
The solar prioritization tool is an attempt to address some of the most significant contemporary energy planning challenges in California. Utility planners recognize that the electric grid must evolve as reliance on renewable energy increases, necessitating changes in energy generation, procurement, storage, and distribution practices. Conventional utility business models are being challenged by the state’s aggressive targets for the percentage of electricity supplied by renewable sources (50% by 2035) and energy efficiency (doubling by 2030) as part of key legislation, SB 350, signed in 2015. Driven by these mandates as well as market forces, municipal and investor-owned utilities are procuring increasingly larger quantities of renewable energy generated by wind farms and solar arrays. Many, if not most, of these assets are being developed in remote locations, far away from urban areas.
The California Independent System Operator (CAISO) coordinates the wholesale market for the generation of electricity which supplies the majority of the State’s electricity grid.8 Customers of large investor-owned utilities, as well as many smaller municipally owned utilities (MOUs) with their own separate distribution lines, rely on the proper functioning of this multitiered system to ensure reliable electricity in homes, businesses, and schools.
Local generation assets have the potential to reduce or completely eliminate the consumption of grid-supplied electricity. However, when deployed in an ad hoc manner, without proper visibility to the grid operator, they also have the potential to disrupt the functioning of the power system and even cause unwanted infrastructure failures.
To maximize the speed with which these distributed generation
8 CAISO manages the flow of electricity across the high-voltage, long-distance power lines for the grid serving 80 percent of California and a small part of Nevada. The nonprofit public benefit corporation keeps power moving to homes and communities. It grants equal access to nearly 26,000 circuit miles of transmission lines and coordinates competing and diverse energy resources into the grid where it is distributed to consumers. It also operates a competitive wholesale power market designed to promote a broad range of resources at lower prices.
assets can be deployed, attention must be paid to local grid capacity constraints during project planning and development phases. The rapid rise in grid-connected solar and photovoltaic installations in California has exceeded the predictions of even the most optimistic of renewable energy advocates. In 2004, only 60 MW of total solar installations (residential and commercial) existed. Other renewable sources such as wind power, along with traditional sources including hydropower, were more prominent. By 2016, however, over 5200 MW was cumulatively installed. Much of this expansion was supported by major IOU incentive programs; a significant portion of the rapid and unanticipated rise owes to individual homes and building owners installing small panels on rooftops. From 2004 to 2017, nearly 3300 MW of residential distributed solar generation capacity was installed across the state, representing over half of installed capacity.9 Most rooftop solar installations in California are interconnected to the local utility’s distribution network. When these solar arrays produce quantities of electricity that are greater than on-site demand, the excess energy flows back out, onto the grid. The potential for bidirectional power flows has created an additional challenge for statewide electricity grid operators, since the grid has historically been architected assuming a unidirectional flow of power. Without widely available capacity for storing energy, further increases in the amount of solar electricity production can potentially create a significant mismatch in demand and supply across the grid. Signs of this growing mismatch have become particularly acute during spring and summer seasons, as evidenced by a characteristic pattern in the net demand for electricity, now referred to as the “duck curve”10 (Denholm et al. 2015). 
California grid operators have even curtailed hydropower outputs and, on occasion, paid other western grid systems to accept excess generation in attempts to ameliorate this situation (Trabish, 2017). However, in order to mitigate potential detrimental consequences of distributed solar generation on electric grid operations, California utilities with grid management responsibilities are now imposing limits on the
9 https://www.californiadgstats.ca.gov/.
10 In utility-scale generation, the duck curve is a graph of power production over the course of the day that shows the timing imbalance between peak demand and renewable energy production. The term was coined in 2012 by the California Independent System Operator.
size of solar generation systems that can be interconnected to individual distribution circuits. Prior to installation and interconnection to the grid, all solar systems are subject to review by the connecting utility. Through the regulatory process, a utility provider reviews and approves the proposed generation site and asset mix, ensuring that existing circuit lines and substations can handle the flow of power into and out of the site. In California, the California Public Utilities Commission (CPUC) regulates grid interconnection requests under Electric Rule 21 for IOUs, but allows each IOU to establish its own request process, rules, and tariff structure (SCE 2015; Ricklefs et al. 2018).11 The multiple sets of interconnection criteria create a lack of consistency across the different utility service areas. The heterogeneity allowed by Rule 21 with respect to interconnection criteria also means that there is little regulatory recourse for those seeking to build solar generating assets.
Two important factors govern the size of solar systems. First, utilities designate an upper limit on the size of installed solar capacity that may receive favored compensation rates. Net Energy Metering (NEM)12 rates for exported solar are only applied to generation capacity up to 150% of historic annual demand at a site. Second, for NEM participants associated with a local circuit, utilities limit the total amount of distributed solar power exported to the circuit. The utility determinations are based on the capacity of existing circuits.
To ensure grid stability and prevent bidirectional flows of electricity, utilities enforce several screens through Rule 21 which designate upper limits for installations (Ricklefs et al. 2018). Screen L designates that sites where more than 500 kW of solar capacity will be installed must be located in areas where the local grid has no existing stability issues.
Screen M, which is key for projects, designates a 15% penetration capacity upper limit on distributed solar energy installations. Investor-owned utilities publish circuit capacities through Distributed Energy Resource Interconnection Maps (DERiM) that are available for
11 https://www.sce.com/business/generating-your-own-power/Grid-Interconnections/Interconnecting-Generation-under-Rule-21.
12 Net energy metering, or “NEM,” is a special billing arrangement that provides credit to customers with solar PV systems for the full retail value of the electricity their system generates.
each service area.13 For a property or set of properties associated with a local distribution grid circuit, solar generation capacity cannot exceed 15% of the line’s historical peak load. This may significantly disadvantage latecomers seeking to install solar on their roofs.
Given this array of considerations related to siting distributed solar, it is not surprising that municipalities need assistance to determine where to prioritize investments to boost renewable generation, while minimizing grid effects and supporting under-resourced communities. The Solar Prioritization tool is designed to bridge this information gap.
Role of the Energy Atlas
Combining property-level data for energy consumption from the Energy Atlas with other sources offered an opportunity to address gaps in knowledge. As part of the Advanced Energy Community project discussed above, we developed a tool to identify priority locations of rooftop solar electric generation capacity across investor-owned utility territory in Los Angeles County. It required assembling multiple “big data” sources that span public and proprietary data for technical operations, energy use, and sociodemographic information. The data components of the tool include historic electricity demand, on-site solar potential, local electric distribution grid capacity, hourly profiles for demand and solar generation, and sociodemographic indicators (Fig. 7.2).
A series of calculations were made to estimate the net potential for a collection of rooftop solar assets connected to a single circuit to export electricity back to the grid. These calculations compare parcel-level estimates of hourly solar energy generation output to hourly energy demand. These building-level estimates were then aggregated to the level of individual distribution circuits to determine the expected timing and location of net over-generation from the full development of the region’s distributed solar capacity.
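The parcel-to-circuit aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the parcel identifiers, circuit identifiers, four-hour profiles, and function names are all hypothetical, and a real analysis would use full hourly profiles for each parcel.

```python
# Hypothetical inputs, for illustration only:
# parcel id -> (circuit id, hourly demand in kW, hourly solar output in kW)
parcels = {
    "parcel_a": ("circuit_1", [2.0, 1.0, 1.0, 3.0], [0.0, 3.0, 4.0, 1.0]),
    "parcel_b": ("circuit_1", [1.0, 1.0, 2.0, 2.0], [0.0, 2.0, 2.0, 0.0]),
    "parcel_c": ("circuit_2", [4.0, 4.0, 4.0, 4.0], [0.0, 1.0, 2.0, 0.0]),
}


def circuit_net_generation(parcel_data):
    """Sum hourly (solar - demand) over all parcels on each circuit."""
    totals = {}
    for circuit, demand, solar in parcel_data.values():
        net = [s - d for s, d in zip(solar, demand)]
        if circuit in totals:
            totals[circuit] = [a + b for a, b in zip(totals[circuit], net)]
        else:
            totals[circuit] = net
    return totals


def over_generation_hours(net_by_circuit):
    """Hours (by index) in which a circuit would export power to the grid."""
    return {
        circuit: [hour for hour, value in enumerate(net) if value > 0]
        for circuit, net in net_by_circuit.items()
    }


net_by_circuit = circuit_net_generation(parcels)
print(over_generation_hours(net_by_circuit))
```

Working at the hourly level matters because a circuit whose annual solar output is below its annual demand can still export power during midday hours, which is the condition the grid constraints are meant to limit.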
Solar system size (i.e., the total panel area) for each building was calculated using a technical potential value representing the full development of 100% of the available suitable rooftop area on each parcel’s buildings (Lee et al. 2017). In many locations, this technical potential significantly exceeded the volume of generation that would be practically allowed to interconnect to the local distribution circuits based upon existing capacity constraints. The requirements of Rule 21 can therefore be understood as limiting this technical potential to an effective “regulatory potential,” i.e., the volume of distributed rooftop solar generation assets that can be developed given current regulatory constraints. Energy Atlas data allowed researchers to quantify building-level energy consumption, which is the critical first step for an accurate assessment of net solar potential. For each building in Southern California Edison’s Los Angeles County territory (approximately 1.2 million accounts), the annual energy demand can be compared to an estimate of the potential solar electricity generation from its rooftop. However, calculations of net solar potential must account for temporal changes in demand for, and generation of, electricity at the building level (Table 7.1, Steps 3 & 4) in order to understand impacts to the grid. Both electric demand and solar generation vary with time, changing throughout the day. Electric demands in homes spike in mornings and late afternoons, especially in spring and summer months when temperatures are highest. In contrast, solar generation from distributed rooftop arrays peaks in late morning and early afternoon hours. The prioritization tool uses published hourly demand data from the local utility, together with generation values from the California Solar Statistics database, which reports solar generation at 15-minute intervals for a set of metered panels (Go Solar California, 2017).
13 Southern California Edison DERiM aims to connect developers with the SCE system data needed to enable strategic distributed energy resource siting. Details about the attributes provided can be found at https://www.sce.com/sites/default/files/inline-files/DERiM_User_Guide_Final_AA_1.pdf.
7
CASE STUDIES
Fig. 7.2 Overview of analysis methodology for the Solar Prioritization Tool 1.0
Table 7.1 Overview of procedures and methods used in the Solar Prioritization Tool 1.0
Step  Description
1  Calculate Annual Electricity Consumption (kWh) for each property in LA County using the most recently available year in the LA Energy Atlas (2010)
2  Derive the Annual Rooftop Solar Electricity Generation Potential and Maximum Potential Power Output (kWh and kW) for each property based on values reported in the LA Solar Map database, which is integrated into the Energy Atlas database tables
3  Calculate Hourly Consumption and Generation Values (kWh) for each property for the entire year, using best available estimates of hourly load and generation profiles within the SCE service territory
4  Determine on-site electricity use offsets from solar arrays and net exports of solar generation to the grid for each property at each hour in a year, and aggregate to annual totals. On-site solar use, remaining grid demands, and solar exports are calculated based on the difference between hourly generation and consumption. The total amount of building electricity demand that is offset by solar generation is calculated for the whole region by summing values for all properties
5  Attribute Parcels to Electric Grid Circuits (“Circuit Groups”) based on a shortest-distance assessment of the proximity of parcels to local grid circuits
6  Calculate Totals for Each Circuit Group, including annual rooftop electricity generation potential and peak power output potential
7  Estimate Unutilized Net Solar Generation from Rule 21 installation limits for Circuit Groups using SCE-defined grid capacity constraints for exporting power to local grid circuits based on the 15% peak load penetration maximum for distributed generation (“Rule 21”)
8  Rank Grid-Constrained Net Solar Potential for Disadvantaged Community Boundaries based on CalEnviroScreen rankings in LA County
9  Map Net Solar Potential given grid management policies that affect net solar potential (Rule 21), as well as DAC boundaries that provide a geographic “filter” for highlighting areas to prioritize investments
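Steps 3 and 4 of Table 7.1 can be illustrated with a minimal sketch. It is not the tool's actual code: the function name, the data class, and the four-hour toy profile are hypothetical, and the only logic taken from the text is that on-site use, remaining grid demand, and exports follow from the hour-by-hour difference between generation and consumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HourlyBalance:
    onsite_use_kwh: float   # solar generation consumed by the building itself
    grid_demand_kwh: float  # demand that must still be supplied by the grid
    export_kwh: float       # surplus solar generation exported to the grid


def hourly_balance(demand_kwh: Sequence[float],
                   generation_kwh: Sequence[float]) -> HourlyBalance:
    """Compare generation with demand hour by hour and sum the results."""
    if len(demand_kwh) != len(generation_kwh):
        raise ValueError("profiles must cover the same hours")
    onsite = grid = export = 0.0
    for demand, generation in zip(demand_kwh, generation_kwh):
        used = min(demand, generation)  # solar can only offset demand in the same hour
        onsite += used
        grid += demand - used
        export += generation - used
    return HourlyBalance(onsite, grid, export)


# Hypothetical four-hour profile (kWh per hour), for illustration only.
example = hourly_balance(demand_kwh=[2.0, 1.0, 1.0, 3.0],
                         generation_kwh=[0.0, 2.5, 3.0, 0.5])
print(example)
```

Working at the hourly step matters because an annual comparison would let midday surplus offset evening demand, which overstates on-site use. In a full-year run the two input sequences would each hold 8,760 values per property.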
For each hour in the year, combining the estimated demand and generation profiles yields a net demand or net generation in a building. For periods of net generation, total energy (kilowatt-hours) and peak power output (kilowatts) at an instantaneous time are two important values to understand. Power is electricity output at a given moment in time. Electric grid circuits have a capacity range within which they can transmit power reliably. If peak capacity is exceeded at a given time, it could cause instability or even failures such as blackouts (total loss of electricity) and brownouts (periods of low voltage). To evaluate if maximal
solar installations would affect grid operations, the prioritization tool method screened for circuits where net solar output from local properties would exceed the stated limits identified by the local electric utility in DERiM data. During any period when the power output from properties exceeded the 15% penetration rate capacity of the circuit, it was assumed that power output was capped at that limit. Public data was not available to attribute each parcel to a local grid circuit, so proximity assumptions were used to attribute properties to local grid circuits and calculate net potential export of solar electricity to the grid and curtailments (Steps 5–8). Finally, mapping the results through a dynamic web-mapping interface provides a powerful visualization tool to support planning activities (Step 9).¹⁴
Results and Findings
Integrating spatial and temporal data for property-level electricity demands, rooftop solar generation potential, and grid capacity constraints offers the chance to estimate the potential of rooftop solar to meet on-site demands and to supply net exports to the grid. In the study area (Southern California Edison territory within Los Angeles County), across more than 1 million parcels, total retail electricity demand is approximately 25,000 gigawatt-hours (GWh) per year, while the total technical potential of rooftop solar generation output is 23,600 GWh per year (94% of retail demand). Correlating building-scale demand and potential generation yields additional insight into net potential generation at a fine level of detail. Performing the analysis at the building scale and hourly time step yields an estimated 7,200 GWh of rooftop solar generation that could be used for on-site building demands, which is 29% of total retail demand. Note that this estimate does not include other potential sites, such as parking lots, community buildings, and more.
Remaining building demand, which must be supplied by grid sources, is approximately 18,000 GWh. The potential solar output for export to the grid (not used directly on-site) is 16,400 GWh. This would be electricity available for use in neighboring regions or potentially stored.
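The regional totals reported above are mutually consistent, as a short arithmetic check shows. The three input figures come from the text; the variable names are illustrative only.

```python
# Regional totals reported in the text (GWh per year).
retail_demand = 25_000
technical_potential = 23_600
onsite_use = 7_200

# Demand not met by on-site solar: 17,800 GWh, reported as approximately 18,000.
remaining_grid_demand = retail_demand - onsite_use

# Solar output not used on-site and therefore available for export: 16,400 GWh.
export_to_grid = technical_potential - onsite_use

# Shares of retail demand: 28.8% (reported as 29%) and 94.4% (reported as 94%).
onsite_share = onsite_use / retail_demand
potential_share = technical_potential / retail_demand

# Region-wide generation minus demand is negative (-1,400 GWh).
net_generation = technical_potential - retail_demand

print(remaining_grid_demand, export_to_grid, net_generation)
print(f"{onsite_share:.1%} {potential_share:.1%}")
```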
14 http://solar.energyatlas.ucla.edu.
Overall, potential net generation is negative: buildings use more electricity than they could produce, given the current performance of solar generation technologies. Even so, distributed solar arrays could provide substantial cumulative net exports for use in other local buildings, elsewhere throughout the electric grid, or for storage during intermittent high-output periods. Furthermore, this analysis showed that current policies regulating the interconnection of distributed solar generation assets onto the grid restrict the development of an estimated 1,700 MW of the region’s technical solar potential. These limitations result from the 15% penetration rate threshold for distributed solar on grid distribution circuits. For comparison, this value approaches the generation capacity of the Hoover Dam, which is 2,000 MW. Infrastructure upgrades and more detailed assessments of net export potential based on actual conditions would likely lead to increased estimates of solar production capacity. The 15% penetration rate itself has no hard empirical basis that researchers have been able to ascertain; rather, this threshold is likely based on utility experience and caution, in the context of a general aversion to distributed generation assets that are non-dispatchable and whose performance characteristics are more difficult to predict and understand. The average peak net export of a circuit across the study area is 5.9 MW, while the average utility-specified penetration rate limit (15%) is 12.5 MW. Thus, on average across circuits in the LA County grid, extra grid capacity exists. Areas with more solar interconnection capacity, based on penetration rate limits, include rural areas of northern Los Angeles County and dispersed communities within the interior of metropolitan Los Angeles.
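The circuit-level screening behind these figures (Table 7.1, Steps 5 to 7) can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: each circuit is represented by a single hypothetical reference point rather than its real line geometry, coordinates are assumed to be in a projected (planar) system, and all numbers are invented.

```python
import math

PENETRATION_LIMIT = 0.15  # share of a circuit's historical peak load


def nearest_circuit(parcel_xy, circuits):
    """Return the id of the circuit whose reference point is closest to the parcel."""
    return min(circuits, key=lambda cid: math.dist(parcel_xy, circuits[cid]["xy"]))


def capped_exports(parcels, circuits):
    """Aggregate hourly parcel exports by circuit and apply the 15% cap.

    Returns {circuit_id: (delivered_kwh, curtailed_kwh)}. With hourly steps,
    an average of x kW over one hour equals x kWh.
    """
    hourly_by_circuit = {}
    for parcel in parcels:
        cid = nearest_circuit(parcel["xy"], circuits)
        series = hourly_by_circuit.setdefault(cid, [0.0] * len(parcel["export_kw"]))
        for hour, kw in enumerate(parcel["export_kw"]):
            series[hour] += kw

    results = {}
    for cid, series in hourly_by_circuit.items():
        cap_kw = PENETRATION_LIMIT * circuits[cid]["peak_load_kw"]
        delivered = sum(min(kw, cap_kw) for kw in series)
        curtailed = sum(max(kw - cap_kw, 0.0) for kw in series)
        results[cid] = (delivered, curtailed)
    return results


# Hypothetical inputs: two circuits and three parcels with three-hour export profiles.
circuits = {
    "A": {"xy": (0.0, 0.0), "peak_load_kw": 100.0},
    "B": {"xy": (10.0, 0.0), "peak_load_kw": 200.0},
}
parcels = [
    {"xy": (1.0, 1.0), "export_kw": [5.0, 10.0, 0.0]},
    {"xy": (2.0, 0.0), "export_kw": [5.0, 10.0, 0.0]},
    {"xy": (9.0, 1.0), "export_kw": [4.0, 8.0, 2.0]},
]
results = capped_exports(parcels, circuits)
print(results)
```

In this toy case circuit A has a 15 kW cap, so the 20 kW combined export in the second hour is clipped and 5 kWh counts as curtailed, while circuit B stays under its 30 kW cap throughout.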
While solar potential and demand vary significantly throughout Los Angeles County, there are apparent trends in the net solar generation capacity across communities of different socioeconomic status. Correlating the results for net solar potential and curtailed solar generation capacity with data from the CalEnviroScreen index, which evaluates vulnerability to socioeconomic and environmental hazards, shows that “high-risk” communities subject to lower incomes or environmental hazards also have the highest potential for underutilized net solar generation. This means that the technical potential for solar generation in these communities would be more constrained by interconnection rules than in lower-risk communities. However, such communities currently have more capacity in local grid circuits to accept net solar generation beyond rooftops, such as from ground-mounted or canopy parking lot arrays.
Paradoxically, the high relative capacity of these communities is also a reflection of their prevalence in Los Angeles. This granular analysis reveals the complexity embedded in widespread adoption of rooftop solar as a renewable energy source. Findings indicate the extent to which regulatory constraints can hinder the opportunities to maximize renewable energy production, and also reveal differences across socioeconomic groups. CCSC was successful in obtaining additional state grant funding to further develop this tool, and its subsequent development is discussed in the next section.
Solar Prioritization Tool 2.0
Overview
CCSC believed it was important for local community-based organizations (CBOs) committed to a just energy transition, but with little capacity or access to research, to participate in shaping the transition affecting their communities. A grant from the state’s Strategic Growth Council (SGC) provided the opportunity to pursue this goal, building on work previously done under a CEC grant to develop the Solar Prioritization Tool. The SGC is a planning organization, administered through the Governor’s office and funded largely by the state’s carbon cap-and-trade program. SGC has launched a series of funding programs to assist under-resourced communities in transitioning to more energy-efficient and low-carbon futures; many of these programs include research components. The CCSC was awarded a research grant to expand and refine the Solar Prioritization Tool, initially developed with CEC project funding (discussed above). This two-year grant, entitled “Coupling Community Knowledge with Big Data Tools to Facilitate Equitable Energy Transitions,” began in 2018 and allows for further investigation of the structural barriers to energy transformation in under-resourced communities, and for direct participation by CBOs. This work incorporates direct feedback from local communities in the Los Angeles region through a series of meetings and discussion sessions with representatives of interested community-based organizations. We partnered with a Los Angeles-based social and environmental justice foundation, the Liberty Hill Foundation, which has deep connections to community-based organizations in the region. This foundation
facilitated connections and eventual direct involvement by eight separate CBOs, representing a wide range of under-resourced communities throughout the county. We are currently refining and expanding the interactive, web-based tool to help these CBOs achieve the community energy transitions for which they are advocating, while also building their capacity to work on these issues.
Background
SGC funding is guided by the desire for equitable implementation of California’s ambitious goals for the generation of electricity from renewable energy sources, including distributed energy resources (DER) such as rooftop photovoltaics. However, as discussed above, the integration of distributed generation resources into the electric power grid is guided by the implementation of Rule 21. This research involves working with eight community-based organizations (CBOs) to enhance the capacity of disadvantaged neighborhoods to satisfy their energy needs through the increased local deployment of distributed photovoltaic systems. Many of these under-resourced communities are located in parts of LA County that will experience increasing numbers of high heat days (>95°F) due to climate change. Because high heat itself reduces grid capacity, this creates additional impacts and risks (Burillo et al. 2019). The push for electrification of personal vehicles and household appliances further increases the potential for encountering grid-related limitations. Moreover, with utilities moving to a Time-of-Use (TOU) tariff structure in which electricity rates increase in the evening when solar generation wanes, less affluent communities potentially face reduced reliability, reduced comfort, and increased costs.
Role of the Energy Atlas
The Energy Atlas continues to form the basis for this new interactive, web-based tool, designed to support communities as they seek to participate in the transition toward low-carbon, distributed energy generation.
The project used a public participation geographic information system (PPGIS) approach, as a way to ensure the tool is of the greatest value for its intended users. This involves a two-way learning process. CBO participants are being given structured opportunities to communicate their
communities’ goals for the energy transition, and the types of data, functionality, and user interface that would be most valuable in the tool to support their plans. The university is reciprocating by providing a multipart training program on the complexity of the energy system and its regulatory framework. The Atlas and its unique database enabled the following relevant research questions to be explored:
• What are the “official” anticipated/preferred scenarios for climate-smart transitions in under-resourced communities, including vehicle electrification, energy efficiency upgrades, distributed energy generation and storage, and appliance electrification?
• What strategies can be implemented to address existing grid capacity limitations? To what extent can these limitations be overcome through the implementation of community energy systems design, such as: energy storage, energy efficiency, demand-side management, and/or an emphasis on community solar over rooftop systems?
• To what extent will investments in grid upgrades be required? To what degree will such transitions be hampered by existing regulation and financing?
Results and Findings
The following project results are currently in progress:
• Growth forecasts based on historical data for the following energy transitions available from the literature and reported state trends and data:
– Energy efficiency
– EV penetration
– Rooftop solar installations
– Appliance electrification
• A web-based tool that facilitates the interactive exploration of potential sites for community-scale solar facilities, with estimates of the number of local households whose energy demands such systems would be able to offset through their production.
• “Energy 101” modules for CBOs that explain how the energy system works and the regulatory infrastructure behind it, to build organizational knowledge and capacity to advocate on behalf of under-resourced communities. This set of modules can be considered part of our PPGIS approach, a kind of foundational popular education strategy that assists the community in contextualizing and interpreting the information in the tool.
The project probes the potential limits of the energy transition in DAC communities due to infrastructure limitations, and enables community groups to engage in public and regulatory dialogues and decision-making about change.
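The household-offset estimate mentioned in the list of results can be sketched with simple arithmetic. The text does not describe how the tool computes it, so the function below is only one plausible reading; the function name and all three input values are hypothetical placeholders, not figures from the project.

```python
def households_offset(system_capacity_kw: float,
                      annual_yield_kwh_per_kw: float,
                      household_use_kwh: float) -> int:
    """Number of whole households whose annual use the array could offset."""
    if household_use_kwh <= 0:
        raise ValueError("household use must be positive")
    annual_generation_kwh = system_capacity_kw * annual_yield_kwh_per_kw
    return int(annual_generation_kwh // household_use_kwh)


# Placeholder inputs: a 500 kW community array, an assumed yield of
# 1,600 kWh per installed kW per year, and 6,000 kWh per household per year.
n_households = households_offset(500.0, 1600.0, 6000.0)
print(n_households)
```

An annual offset of this kind says nothing about timing; matching hourly generation to hourly household demand would give a lower and more realistic figure.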
Energy Transitions, Natural Gas, and Indoor Air Quality
Overview
This project is focused on developing an improved understanding of the challenges and potential benefits associated with widespread electrification of the residential energy sector as the state considers a transition away from natural gas. Funding comes from the California Energy Commission, the same state agency that funded the AEC project. The overarching goal is to explore the relative benefits to indoor and ambient air quality associated with a range of adoption levels for the following suite of energy system transformation pathways: 1) home energy efficiency, 2) electrification of residential natural gas appliances, 3) vehicle electrification, and 4) distributed renewable generation (e.g., solar photovoltaics) and storage asset penetration (e.g., batteries). This three-year project (2018–2021), officially titled “Using Big Data to Holistically Assess Benefits from Building Energy System Transition Pathways in Disadvantaged Communities,” is focused on communities in the San Gabriel Valley, in eastern Los Angeles County. It integrates novel community-scale energy system modeling techniques with a detailed community outreach survey effort designed in conjunction with leading UCLA public health researchers. The project team also includes The Energy Coalition, a local nonprofit responsible for energy system modeling and design; Active San Gabriel Valley, a community-based organization that conducted all the outreach and education in the area
of the proposed project; and a technical advisory committee consisting of stakeholders from nonprofit, governmental, and university sectors. Project objectives include:
• Identify the combination of energy system transformations that result in the greatest health benefits for under-resourced communities, considering both indoor and ambient air quality.
• Expose any unforeseen health consequences associated with energy system transformations that do not account for both indoor and ambient air quality within under-resourced communities.
• Understand structural drivers of indoor and ambient air quality that may disproportionately impact under-resourced communities.
Background
Over the past several years, as the grid has transitioned to larger and larger percentages of renewably generated power, the average GHG and criteria pollutant emissions intensities of grid-sourced electricity have gone down. At the same time, however, because of continued reliance on natural gas to buffer ramping loads, early evening GHG and criteria pollutant emissions intensities have gone up. This phenomenon is clearly illustrated in Fig. 7.3, which plots the average hourly GHG emissions intensity of grid power (mean kg-CO2/MWh) for each hour in the day over the previous 10 years (Data: CAISO). The persistently high emissions intensities associated with the consumption of grid power during peak evening hours (6–10 PM) have important policy implications during this period of transition toward 100% renewably generated electricity. As additional load is added to the grid, for example from residential appliance electrification, improvements in indoor air quality may accrue to those specific households; however, ambient air quality may get worse if the use of those appliances occurs when the grid’s pollutant emissions intensity is high. A holistic assessment of air quality impacts associated with various energy transition pathways is critical to informing state policies.
Fig. 7.3 Average hourly GHG emissions intensity of grid power, based upon the changing composition of generators in the grid portfolio mix, across the hours in the day for each year between 2010 and 2019
Role of the Energy Atlas
The backend database of energy consumption (electricity and natural gas) and building attribute data (aggregated for privacy) is being used to help develop and calibrate hourly load profiles for the building-scale and community-scale energy models. In addition, researchers have obtained one year of hourly natural gas use for the study area (Fig. 7.4).
Results and Findings
The following project outputs have been completed or are currently in process:
• Over 400 surveys of local community members about the number and type of appliances (natural gas, electric) within their homes, as well as actions taken (or future interest) related to home energy efficiency measures, electric vehicle purchases, appliance electrification, and rooftop solar installation.
• Indoor air pollution monitoring at 64 homes throughout the two study area zip codes. This included real-time measurements of particulate matter (PM) concentrations over a two-week period and seven-day passive NO2 measurements. Further, questionnaires were collected that asked for extensive information on home appliances and behaviors relating to natural gas use and home ventilation.
• Development of relationships between indoor PM concentrations and natural gas use using statistical methods.
Fig. 7.4 Map illustrating the geographic location of the two project study area zip codes (91732 & 91746), which largely overlap with the AEC project study area (Basemap data credit: Mapbox)
• Generation of baseline community electricity and natural gas hourly load profiles for the project community that will be used as the basis of comparison for the project’s subsequent evaluation of different energy system transformation pathway scenarios.
• Evaluation of energy system transformation pathway scenarios and assessment of relative indoor and ambient emissions impacts.
• Communication of study findings to community members and to state policymakers. All communication material for community members is translated into Spanish to make the results accessible to the large number of Spanish speakers in the community.
Only partial findings have been developed to date and these are still preliminary; the project extends through 2021.
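The Background point about timing, that the same added load produces more grid emissions when it falls in high-intensity evening hours, can be made concrete with a small sketch. The hourly intensity profile below is invented for illustration and is not CAISO data; only its shape (lower at midday, higher from 6 to 10 PM) follows the text, and the function names are hypothetical.

```python
def added_emissions_kg(added_load_mwh, intensity_kg_per_mwh):
    """Sum hourly added load (MWh) times hourly grid intensity (kg CO2 per MWh)."""
    if len(added_load_mwh) != len(intensity_kg_per_mwh):
        raise ValueError("profiles must cover the same hours")
    return sum(load * ef for load, ef in zip(added_load_mwh, intensity_kg_per_mwh))


def load_profile(hours, mwh_per_hour=1.0):
    """24-hour profile with a constant load in the given hours and zero elsewhere."""
    return [mwh_per_hour if h in hours else 0.0 for h in range(24)]


# Placeholder intensity profile (kg CO2 per MWh), NOT measured data.
intensity = [250.0] * 24
for hour in range(10, 16):   # midday, when solar output is high
    intensity[hour] = 150.0
for hour in range(18, 22):   # 6-10 PM, when gas plants cover the evening ramp
    intensity[hour] = 300.0

evening_kg = added_emissions_kg(load_profile(range(18, 22)), intensity)
midday_kg = added_emissions_kg(load_profile(range(11, 15)), intensity)
print(evening_kg, midday_kg)
```

With these placeholder values, 4 MWh of added appliance load causes twice the emissions in the evening as at midday, which is why the project pairs indoor measurements with hourly load profiles rather than annual totals.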
Electricity Infrastructure Vulnerabilities Due to Extreme Heat
Overview
Under a CEC grant, CCSC collaborated with a UCLA climate modeler and energy systems engineers at Arizona State University to analyze the impacts of changes in extreme heat (higher temperatures and more frequent high-temperature degree days) on electricity infrastructure in Los Angeles County. Using high-resolution climate projections, infrastructure maps, and forecasts of peak electricity demand based on the Energy Atlas, we estimated vulnerabilities in the electricity infrastructure to 2060. We considered rising air temperatures under the Intergovernmental Panel on Climate Change (IPCC) RCP 4.5 and RCP 8.5 scenarios at 2 km² grid cell resolution, as well as population growth scenarios, different levels of energy efficiency for buildings and air conditioners, and higher air conditioning penetration (Burillo et al. 2019).
Background
No previous studies had considered these factors together, founded on actual energy consumption, grid and infrastructure data, as well as climate projections for a specific region at such a high geographical resolution. Developing a spatially quantitative understanding of electricity infrastructure vulnerabilities is critical for long-term capital planning, and, with the impacts of climate change, should also be considered in planning for further land development. To date, it has been assumed that services such as electricity, natural gas, and/or water could be supplied without taking a changing climate into account. This study shows the increasing importance of taking future climate into account, particularly in these sectors.
Increasing high heat will affect the efficiency and/or availability of critical urban flows (water, for example), and impacts must be analyzed and mapped out. Our research asked three questions:
1. By what amount could capacity be reduced at generator plants, transmission lines, and substations by 2060 due to heatwaves?
2. What could derated load factors be on components with increased peak demand and decreased infrastructure capacities throughout the county?
3. What areas of the county would be at highest risk of shortages?
Role of the Energy Atlas
The historical electricity consumption data contained within the Energy Atlas was used as a key source of data for the calibration of base and peak load forecasts that were developed for this study. Peak loads are often described in utility regulatory reporting documents in terms of “load factors.” A load factor is defined as the average load for some service area divided by the absolute maximum load experienced over a given time period. For this study, using historical Energy Atlas consumption data, average load values were computed at the census tract level throughout the Los Angeles County region. These average loads were then combined with utility-reported load factors to back out estimated ranges for historical peak loads encountered within the study’s different geographic regions. These peak loads were correlated with historical temperature data to model the sensitivity of peak electricity demands to high heat events within the different regions.
Results and Findings
Results showed that generation within LA County would suffer, and the most vulnerable power plants were the combined cycle and combustion natural gas facilities located in the parts of the region expected to experience the greatest impacts from high heat. PV generation plants located in these areas (largely in the high desert) would also suffer reductions in output.
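The load-factor back-calculation described under Role of the Energy Atlas follows directly from the definition: if the load factor is average load divided by peak load, then peak load is average load divided by the load factor. The sketch below is a minimal illustration; the function names and the example values are hypothetical and are not taken from the study.

```python
def peak_load_mw(average_load_mw: float, load_factor: float) -> float:
    """Back out peak load from average load, using load_factor = average / peak."""
    if not 0.0 < load_factor <= 1.0:
        raise ValueError("load factor must be in (0, 1]")
    return average_load_mw / load_factor


def peak_load_range(average_load_mw: float,
                    load_factor_low: float,
                    load_factor_high: float) -> tuple:
    """Range of plausible peak loads for a range of reported load factors.

    A higher load factor implies a flatter profile and hence a lower peak.
    """
    return (peak_load_mw(average_load_mw, load_factor_high),
            peak_load_mw(average_load_mw, load_factor_low))


# Placeholder values: a census tract averaging 15 MW, with reported
# load factors between 0.5 and 0.75.
low_peak, high_peak = peak_load_range(15.0, 0.5, 0.75)
print(low_peak, high_peak)
```

Because utility load factors are typically reported for areas larger than a census tract, applying them at tract level yields a range rather than a point estimate, consistent with the text's reference to "estimated ranges."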
Currently, during times of heatwaves, the county needs to import 1–6 GW of electric power. In the future, this range could increase
to 3.1 GW–15 GW by 2060, depending on the different RCP projections and future local generation. In contrast, generation and substations closer to the coast would be significantly less affected. These results point to the importance of planning factors, most especially land use, to energy outcomes. Improved building and AC energy efficiency, as well as increased densities in the cooler parts of the region, could mitigate outage and load factors during the peak hours of an extreme heatwave. We find that additional supplies could meet increases in peak demand: more fast-ramping gas plants and/or DER. However, gas plants that do not use water for cooling (an increasingly scarce resource in southern California) are more sensitive to rising air temperatures and also less fuel-efficient. At the same time, distributed solar PV has its own requirements, such as space needs, a grid adapted to two-way flows, storage, and new rate structures. Increased high heat also impacts the efficiency of air conditioning units, pointing to a need for new technologies that perform efficiently at much higher temperatures. In sum, the study shows the need for anticipatory planning and investments, and consideration of where urban growth should occur as temperatures increase over time. This is a new frontier for planners, and requires integration between energy suppliers and those responsible for land use. Permitting land development in areas where populations will be increasingly exposed to temperatures of 50 degrees Celsius or more, when cooler areas could be densified, is certainly an issue that should be considered for multiple reasons beyond energy generation alone. Moreover, as the Energy Atlas consumption data analysis shows, single-family housing in the areas expected to be the most affected by increased heat would increase peak demand and energy consumption per capita by almost twice as much as building multifamily housing in cooler parts of the region (Burillo et al. 2018, 2019). An important synergistic benefit is that multifamily land use is generally more walkable, has narrower streets and more shade, and has commercial services that are more accessible, reducing reliance on the automobile. Increasingly, the ability to layer and match the multitude of interacting factors around urban energy consumption, urban morphology, and energy systems is critical to good planning for the future. The thick description of the Atlas is a foundational tool for doing so.
LA County Sustainability Plan: GHG Inventory and Business-as-Usual Scenarios
The LA County Chief Sustainability Office (CSO) is responsible for supporting and coordinating the county government’s pursuit of its sustainability goals. Created in 2016, the CSO provides policy support and guidance to the County Board of Supervisors, city governments, other departments, and the county’s unincorporated areas. The CSO oversees the development, updating, and implementation of the county’s sustainability plan.
Overview
CCSC, as part of a collaborative team,¹⁵ was selected by the County of Los Angeles to lead the development of the county’s first Sustainability Plan. Within the county, this effort was managed by the CSO. In addition to CCSC’s responsibilities related to the visioning process, community outreach, and development of goals, targets, and metrics, we were able to apply Energy Atlas data to the county GHG emissions inventory. Specifically, CCSC was responsible for the building-related inventory and business-as-usual (BAU) projections. The Sustainability Plan’s GHG inventory revised the previous county inventory, bringing sectoral emissions estimates current to 2015 and disaggregating results by the Los Angeles Department of Regional Planning’s (DRP) 11 planning areas¹⁶ (Fig. 7.5), as well as by each of the 88 cities within the county and the unincorporated areas.
15 In 2017, the newly formed CSO began soliciting bids for the development of a countywide sustainability plan. After issuing a request for proposals, the CSO eventually partnered with the project team of BuroHappold (a civil engineering firm), UCLA (including CCSC, the Sustainable LA Grand Challenge, and the UCLA Law School), and Liberty Hill Foundation (a social and environmental justice funding foundation).
16 https://egis3.lacounty.gov/dataportal/2018/11/26/planning-areas/data_portal_uploads-planning_areas/.
Fig. 7.5 Los Angeles County DRP planning areas
Role of the Energy Atlas
Both the GHG inventory and BAU emissions projections were calculated using UCLA Energy Atlas energy consumption data aggregated in compliance with the CPUC’s 15/15 data privacy protection rules. Building areas and other information are co-geolocated with energy consumption in the Energy Atlas’ back-end database, allowing for consumption and associated emissions to be attributed to specific buildings and building use type categories.
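As commonly stated, the 15/15 rule allows an aggregate to be released only if it contains at least 15 customers and no single customer accounts for more than 15% of the group's total consumption. The check below is a minimal sketch under that reading; the function name and example values are hypothetical, and the CPUC's actual rules (discussed in Chapters 4 and 5) may differ in detail by customer class.

```python
def passes_15_15(account_kwh) -> bool:
    """True if an aggregate of account-level usage may be reported.

    Assumed reading of the rule: at least 15 accounts, and no single
    account exceeding 15% of the aggregate total.
    """
    usage = list(account_kwh)
    if len(usage) < 15:
        return False
    total = sum(usage)
    if total == 0:
        return True  # no account can dominate an all-zero aggregate
    return max(usage) <= 0.15 * total


# Hypothetical aggregates (kWh per account).
too_few = [100.0] * 14                 # fails: only 14 accounts
balanced = [100.0] * 15                # passes: each account is about 6.7% of the total
dominated = [100.0] * 14 + [1000.0]    # fails: one account is about 41.7% of the total

print(passes_15_15(too_few), passes_15_15(balanced), passes_15_15(dominated))
```

Aggregates that fail the check are masked, which is why some consumption values have to be estimated rather than read directly from the database.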
The GHG emissions attributable to electricity and natural gas consumption within each of the County’s planning areas are computed on the basis of the historical account-level SCE and SCG metered billing data contained within the UCLA Energy Atlas. Square-footage data for SCE and SCG territories are based on geocoded account-level data contained in the UCLA Energy Atlas. The inventory methodology followed the Global Protocol for Community-Scale Greenhouse Gas Emission Inventories (GPC),¹⁷ the globally recognized standard for city-level GHG accounting and reporting. GPC provides a framework for calculating and reporting community-wide emissions in line with the 2006 Intergovernmental Panel on Climate Change (IPCC) Guidelines for National Greenhouse Gas Inventories. Building-related GHG emissions were calculated using natural gas and electricity consumption totals for cities and DRP planning areas and the corresponding emission factors. CO2, CH4, and N2O emissions from natural gas consumption were calculated by multiplying fuel consumption by the corresponding emission factors. Emissions were estimated for Residential, Commercial, Institutional, and Industrial building use types. The BAU GHG Emissions Forecast is a set of reference case emissions projections for the cities and the DRP planning areas from 2015 to 2050, intended for the evaluation of emissions abatement policies. BAU projections were computed based on the historical rate of change in building square-footage volumes by use type and disaggregated by city, as recorded in 12 consecutive annual versions of the LA County Assessor Parcel Database (2006–2018). Projected GHG emissions were calculated by multiplying 2050 estimated gross square footages by the 2015 energy usage intensity (EUI) for the associated geographic-specific building use type. The county BAU projections estimated the growth in emissions from eight distinct building use types.
In instances where consumption data was missing or masked due to the 15/15 rule, electricity and natural gas consumption were estimated using the median category-specific energy usage intensity (EUI) derived from unmasked usage data and building areas for unincorporated areas.
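A minimal sketch of this gap-filling step is shown below, assuming each record carries a building use type, a floor area, and a consumption value that is `None` when masked. The record layout and function names are hypothetical; only the logic (median EUI per category, multiplied by floor area) follows the description above.

```python
from statistics import median


def median_eui_by_type(records):
    """Median energy usage intensity per use type, from unmasked records only."""
    eui_values = {}
    for rec in records:
        if rec["energy"] is not None and rec["area_sqft"] > 0:
            eui = rec["energy"] / rec["area_sqft"]
            eui_values.setdefault(rec["use_type"], []).append(eui)
    return {use_type: median(vals) for use_type, vals in eui_values.items()}


def fill_masked(records):
    """Return copies of the records with masked consumption estimated."""
    eui = median_eui_by_type(records)
    filled = []
    for rec in records:
        energy = rec["energy"]
        if energy is None and rec["use_type"] in eui:
            # Estimate = category median EUI x this record's floor area.
            energy = eui[rec["use_type"]] * rec["area_sqft"]
        filled.append({**rec, "energy": energy})
    return filled
```

Records whose category has no unmasked observations are left as `None` in this sketch rather than guessed.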
17 https://ghgprotocol.org/greenhouse-gas-protocol-accounting-reporting-standard-cities.
150
S. PINCETL ET AL.
Results and Findings The County Sustainability Plan, published in 2019, included the building-related GHG emissions inventory and BAU projections as part of a comprehensive GHG inventory for Los Angeles County. It was only because of the existence of the Energy Atlas database that LA County was able to conduct GHG accounting disaggregated to the municipal level. But even with such a high-quality spatial record of energy consumption, current CPUC privacy rules make it very difficult to use the data to its full effect to produce accurate estimates of building-related emissions. This is true for any geography, from cities and counties to the state as a whole. A more general discussion of the problems with privacy aggregation can be found in Chapters 4 and 5.
A Building Energy Consumption Database for the California Bay Area Overview CCSC was asked to create an aggregated building energy consumption database for the nine counties of the San Francisco Bay Area, which are home to approximately eight million people. The lead organization for this work is the Bay Area Regional Energy Network (BayREN), a collaboration of the nine Bay Area counties, which provides energy efficiency programs, services, and resources at a regional scale. BayREN is funded through the CPUC by utility ratepayer funds, as well as by member agencies and other sources, and is led by the Association of Bay Area Governments (ABAG). The goal of the project is to create a tool similar to the Energy Atlas, and specific to this area of Northern California. Due to funding constraints and other political considerations at the time the project was initiated, it is being developed as a separate database rather than an expansion of the Energy Atlas. Background The San Francisco Bay Area is served by Pacific Gas and Electric (PG&E) for both electricity and natural gas. Linking PG&E customer account-level energy consumption to parcel information within a spatial database
allows for in-depth analysis of consumption patterns by building characteristics and socio-economic patterns, similar to the Energy Atlas for Southern California. Aggregated energy consumption statistics (following customer privacy guidelines from the CPUC) will provide BayREN with crucial information for assessing building energy consumption and data-driven efficiency targeting programs at the city and county scale. The back-end database will be maintained securely and privately, and will generate the aggregated summary statistics for display on an interactive website, where they can be viewed as maps or tables, or downloaded. Role of the Energy Atlas The Energy Atlas provided the template for the development of the Building Energy Consumption Database. All of the methodologies from the Atlas were applied, including database development, interactive website development, data processing, geocoding, aggregation, QA/QC procedures, and mapping. New challenges arose related to data quality problems specific to PG&E’s consumption data, and to the assessor’s parcel data from the nine Bay Area counties; however, these were largely variations on data quality problems already encountered in the Atlas, and we were able to apply the same tools and procedures to either correct or account for them. Results and Findings This project is still a few months from completion as of the writing of this book. When finished, the interactive website will be public in the same way as the Energy Atlas. Once this milestone is complete, CCSC staff will continue to maintain the security of the back-end database, provide training to local governments on how to use the website, and use the back-end data to support a number of county-specific analysis requests that go beyond what is provided on the website.
Furthermore, CCSC will conduct an analysis of energy efficiency program adoption within the BayREN territory, and evaluate program effectiveness using energy consumption data from before and after program participation. It is worth noting that this project’s implementation was very difficult, as PG&E opposed the creation of the product. PG&E refused to fulfill CCSC’s application through the standard energy data request procedure, then contested the transmission of data from the CPUC to CCSC,
insisted that CCSC sign an additional non-disclosure agreement with ABAG even though ABAG did not provide the data, and required an additional security protocol review. Because BayREN receives program funding from PG&E, a result of CPUC policy requiring utilities to invest in energy efficiency programs as discussed earlier, it became abundantly clear that if CCSC did not comply, BayREN would not receive any of its allocated funding, even that portion of funding unrelated to the construction of the database. CCSC had to add further data protection protocols, including data encryption. This example illustrates the difficulty of conducting public interest energy research in the face of enduring opposition by the utilities.
Annex 70 International Research Collaboration Overview On the basis of the Energy Atlas, UCLA researchers were invited to participate in Annex 70, an International Energy Agency research initiative, under the Energy in Buildings and Communities program (IEA-EBC).18 This effort is an international collaboration of researchers from across the globe, working to develop methods for improving the empirical evidence on energy demand in the building stock. The focus is on identifying, reviewing, evaluating, and producing leading-edge methods for studying and modelling the building stock, including:
• data collection techniques on energy use, building and occupant features, and building morphology
• analysis of smart meter energy data, building systems, and user behavior
• modelling energy demand among sub-national and national building stocks
Much of the motivation behind the Annex 70 work is similar to that of the Energy Atlas.
18 https://energyepidemiology.org/.
Background The Annex is led by University College London (UCL). Partners include research institutions from Australia, Austria, Belgium, China, Denmark, Germany, Ireland, Japan, Portugal, Norway, Sweden, Switzerland, the Netherlands, the United Kingdom, and the United States. In addition to the CCSC, the US participants are the US Department of Energy, Lawrence Berkeley National Laboratory, and the National Renewable Energy Laboratory. The emerging field of Energy Epidemiology (Hamilton et al. 2013) provides the context and underpinnings for the Annex 70 work. The following points summarize the issues and goals that Energy Epidemiology aims to address and advance:
• Provide a strong evidence base to identify associations and establish underlying causes behind outcomes and variations in end-use energy demand.
• Create a new approach to end-use energy demand research, founded on the interdisciplinary health sciences research framework of epidemiology.
• Advance a strong, population-level, empirically based research foundation that provides a methodological framework for interdisciplinary work.
• Strengthen the evidence base to inform policy decisions and evaluate past intervention programs or regulatory actions.
The stated objectives of Annex 70 are to support member countries in the task of developing realistic transition pathways to dramatic reductions in energy use and carbon emissions associated with their buildings by:
• Comparing across the national approaches to developing building stock data sets, building stock models, and to addressing the energy performance gap in order to identify lessons that can be learned and shared;
• Establishing best practice in the methods used for gathering and analysing real building energy use data; and,
• Evaluating the scope for using real building energy use data at scale to inform policymaking and to support industry in the development of low-energy and low-carbon solutions.
Role of the Energy Atlas The Annex work is organized into three main task areas.
• Task A: Data Needs and Uses—This task focuses on stakeholder engagement for existing and prospective data users from government, academia, industry, and the IEA itself. Activities include identifying and mapping user needs through a stakeholder survey, conducting a literature review, and preparing case studies on the use of building energy data. One of the Annex case studies, written by UCLA researchers, is on the Energy Atlas, which has been recognized by the Annex operating agent as an international best practice for the spatial analysis and mapping of granular building energy data.
• Task B: Data Access and Methods—This task involves the development of a framework to describe and classify energy and building stock data. Activities include the development of a classification and registry for energy and building stock data, as well as the development of best practice guides on related topics such as spatial energy planning, smart meter data, data access and aggregation, and metadata, among others. UCLA researchers contributed to the data registry development by providing metadata on the numerous datasets used to populate the Energy Atlas, as well as on the aggregated data outputs from the Atlas 2.0, and on open data sites in California related to building energy data. CCSC is also co-authoring a best practice guide on data access and aggregation.
• Task C: Building Stock Modelling and Analysis—This task is focused on the development and use of building stock models. Activities include developing a new model classification system, constructing a registry of building stock models, conducting model validity testing and uncertainty analysis, and developing high-level performance metrics for stock models. The Atlas was not directly applicable to this specific task because it is an empirical knowledge base, rather than a model.
We were invited to participate in this international endeavor to show how granular building energy data can be used to understand the multiple interactive factors affecting this important part of urban energy consumption. As such, we hope to contribute to the push for greater data availability in this sector. In nearly all parts of the world, building energy use is
treated as confidential information, proprietary to the utilities. Yet granular building energy use data is the basic building block for meeting multiple overlapping goals, including reducing building energy use, targeting efficiency programs, ensuring that no groups are adversely affected, and making certain that the transition to a low- or zero-carbon energy system is carried out with the fewest adverse impacts.
Summary This chapter illustrates the kinds of research and analyses that can be undertaken with granular building energy data, even while respecting customer privacy. In each of our funded projects, we have contributed additional layers of data and attributes to build a thick set of interacting variables that provide contextual information, offering greater insight into this complex sociotechnical system that is spatial in its constitution. Our projects and results have served as an international model for how building energy data and analysis can advance policy goals, including guiding the implementation of community solar projects to address energy poverty and disadvantage in lower income neighborhoods. It is gratifying to see our Advanced Energy Community planning work move to actual implementation, due to the efforts of our research partnership team; it is also a replicable example of how big data can be used to implement change on the ground.
Bibliography
Burillo, D., Chester, M., Pincetl, S., Fournier, E., Walton, D., Fengpent, S., … & Hall, A. (2018). Climate Change in Los Angeles County: Grid Vulnerability to Extreme Heat. California’s Fourth Climate Change Assessment. California Energy Commission, CCCA4-CEC-2018-013.
Burillo, D., Chester, M. V., Pincetl, S., & Fournier, E. (2019). Electricity Infrastructure Vulnerabilities Due to Long-Term Growth and Extreme Heat from Climate Change in Los Angeles County. Energy Policy, 128, 934–953. https://doi.org/10.1016/j.enpol.2018.12.053.
California Energy Commission. (2016). Low Income Barriers Study. file:///Users/felicia/Downloads/TN214830_20161215T184655_SB_350_LowIncome_Barriers_Study_Part_A__Commission_Final_Report.pdf.
Denholm, P., O’Connell, M., Brinkman, G., & Jorgenson, J. (2015). Overgeneration from Solar Energy in California: A Field Guide to the Duck Chart (United States: Dept. of Energy Technical Report). https://doi.org/10.2172/1226167.
Hamilton, I. G., Summerfield, A. J., Lowe, R., Ruyssevelt, P., Elwell, C. A., & Oreszczyn, T. (2013). Energy Epidemiology: A New Approach to End-Use Energy Demand Research. Building Research & Information, 41(4), 482–497. https://doi.org/10.1080/09613218.2013.798142.
Lee, N., Flores-Espino, F., & Hurlbut, D. (2017). Renewable Energy Zone (REZ) Transmission Planning Process: A Guidebook for Practitioners (Technical Report No. NREL/TP-7A40-69043). Golden, CO: National Renewable Energy Laboratory.
Porse, E., Fournier, E., Cheng, D., Hirashiki, C., Gustafson, H., Federico, F., & Pincetl, S. (2020). Net Solar Generation Potential from Urban Rooftops in Los Angeles. Energy Policy, 142, 111461.
Ricklefs, A., Porse, E., Federico, F., & Pincetl, S. (2018). UCLA Advanced Energy Community Project: Local Implementation Recommendation Report. https://ucla.app.box.com/file/282516021868.
Southern California Edison. (2015). 2015 General Rate Case Information Technology (IT) Volume 1—Overview, O&M and Capital. http://www3.sce.com/sscc/law/dis/dbattach5e.nsf/0/15F1898E6F06633088257C2100812546/%24FILE/SCE-05%20Vol.%2001.pdf.
Trabish, H. K. (2017). Prognosis Negative: How California Is Dealing with Below-Zero Power Market Prices. Utility Dive.
CHAPTER 8
Conclusion
This final chapter discusses the policy insights and implications derived from the Energy Atlas and the research projects it has enabled, as well as directions for future research that are supportive of a just energy transition. The Energy Atlas was created in response to the urgent need for cities to curb their consumption of energy and GHG emissions to achieve greater sustainability, and, increasingly, to become more resilient in the face of climate change impacts. At the same time, the transformation of the energy system and climate adaptation must also be managed in such a way as to increase energy equity and democracy, so that everyone is able to thrive.
Supporting Local Government Progress on the Energy Transition Local governments have been at the forefront of policy implementation in the realm of energy use reduction, climate mitigation, and adaptation. Increasingly, there is a need for quantitative analyses, including greenhouse gas emissions inventories, depictions of business as usual, measurement and documentation of energy use reductions, commercial building energy disclosures, etc., to support policy development and implementation. This has put pressure on localities to find sufficiently granular data. Providing that data is a service that the Atlas is uniquely
© The Author(s) 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3_8
qualified to deliver. Whereas the state of California restricts the provision of energy data to local governments to the zip code level only, the Atlas provides much-needed data by a number of relevant building and sociodemographic attributes, thereby supporting policy implementation and quantitative analyses.
Regulatory Impediments The Energy Atlas demonstrates that in the world of big data, information—even information essential for the pursuit of the public good—can be scarce. Electricity and natural gas billing data can make possible the tracing and quantification of energy flows, either fossil fuel-based or renewable. This is valuable information; all sources of energy have different environmental impacts and social tradeoffs: combusting fossil fuels creates greenhouse gas emissions and dangerous co-pollutants that are now found in the air, in soil, in water, and in the bodies of nearly all living beings, both animal and human. Sun and wind technologies, while vastly more benign by comparison, still depend on rare earth minerals that are often mined in places with weak environmental protections. They are also—especially in the case of solar—space intensive. Wind generation causes bird deaths and noise. No energy technologies are impact free, but renewables create fewer toxic byproducts and fewer GHGs. Thus understanding the consumption of energy—quantities, types, and demands—can help better shape the direction of public policy. Information about building energy use, coupled with additional data on material flows, life cycle costs, and social life cycle analysis, can yield insights into the tradeoffs between various energy system investments and transformation pathways. It can also help connect issues such as affluence and consumption levels to Earth systems impacts, such as those of rare earth mineral extraction for smart systems, air pollution and GHG emissions impacts of fossil fuel energy, and more. The Energy Atlas, we hope, provides an additional set of empirical data to help understand how cities work and their impacts on both people and planetary resources. Yet, at the same time, the utilities’ grip on account-level energy data creates substantial obstacles to shifting toward a more benign energy regime.
Relatively little is known about patterns of energy consumption, such as the relationships between consumption and urban morphology, the built environment, income, and sociodemographic factors. Furthermore, the effects of regulation are poorly understood. Though California
state policy calls for the expansion of renewable generation and better building performance, the data needed for the successful implementation of the policies pursuant to those goals is currently inaccessible. Our efforts to provide necessary information in this complex and interconnected realm have been impeded at every step. Resistance to our efforts is due primarily to a state regulatory regime that is seemingly ignorant of the need for actual ground-up data, and that regime’s relationship with the state’s utilities. The regulatory regime is made up of state agencies, legislation, rules and regulations, and interactions between the regulated industries and the public. The current system is mostly a product of the back-and-forth between regulatory agencies, including the CPUC and others, and the utilities—with continued, but minor, input from the state legislature and the public. The long-standing relationships with the utilities—nearly a century of collaboration to establish the current system, as agonistic as it has been at times—make it difficult for the CPUC to radically alter its rules governing the sharing of and access to utility data for “outsiders” to use (including local governments, other agencies in state government, academia, or other stakeholders). As of this writing, there is no indication that the CPUC or state legislature is willing to challenge the idea that the data utilities collect in the course of their operations is proprietary and subject to strict rules relating to its access and use. Although granular building energy data was made available to universities through an administrative judge’s ruling in 2014, this process was highly contested and the outcome was the product of a particular moment in time, as well as of leadership by the Jerry Brown administration. After the decision there was a loss of momentum; the EDAC process did not yield a clearly defined set of rules for the fulfillment of data requests.
Instead, utilities fulfill university data requests according to their own criteria, which are not subject to regulatory oversight. Our research has suffered from the utilities’ capriciousness, as they can choose to deny a data request if the project does not have an identifiable funding source, if they consider the study’s geographical area too large, or if they determine that the cost to produce the data is too high. Each utility can also require its own data security reviews, even if the requester has already passed a security review for another similar utility, and so forth. These interactions are time consuming and satisfying the conditions set by the utilities
may cause projects to stretch beyond their funding windows, potentially causing researchers to lose a grant or to be unable to fulfill their deliverables. Any changes to the State’s utility data privacy rules that might be warranted based on the experiences of universities requesting data from utilities would entail a highly complex legal procedure, and require policymakers at the top to initiate it. It is not at all obvious how university researchers could set such changes in motion. After the departure of Governor Brown, interest in and commitment to greater data transparency seem to have waned; other issues have taken the forefront, including now, as of this writing, the Coronavirus pandemic, and just recently, the imminent bankruptcy of PG&E. And yet, the regulatory and fiscal requirements to which local governments are subject have not changed. Local governments are increasingly desperate to accurately report their GHG emissions under mandates to do so, and must rely on aggregated data from utilities. This data has not been subjected to third-party review or verification, and is simply too aggregated to be of much use for GHG emissions accounting. State agencies or other third parties do not have the capacity and/or authority to verify the accuracy of utility data, and universities remain the only parties who can request address-level billing data and extract meaningful information from it. A data-driven approach requires information about which sectors to target with what interventions. Without this, the goal of cutting building energy by half, mandated by state law, will remain a chimera, an unrealizable dream. In the absence of better data, local governments will inevitably resort to using extrapolations from aggregated data, modeling potential savings, and claiming success without ground truthing or validation. It doesn’t have to be this way.
The Energy Atlas demonstrates that it is possible to take utility billing data and extract valuable insights from it, even while complying with current privacy rules. And we anticipate that one day, the weaponized version of “privacy” advanced by the utilities will lose to compelling public interest.
Insights from Data-Driven Research Despite attempts to constrain the project, the back-end data of the Energy Atlas has allowed researchers to ask questions about the energy system and dynamics of energy consumption heretofore rarely addressed.
Most energy data have been used to estimate possible savings through various mechanisms, such as energy efficiency investments or pricing, but issues of equity impact have been of less prominent concern. Having the ability to probe data with the addition of place-specific layers (our thick description attributes) creates opportunities to ask new questions and develop important findings. Time-of-Use Pricing The state and utilities hope to reduce energy use at peak times by shifting it to other times of the day via time-of-use pricing. Time-of-use pricing is intended to reduce the need for ramping up inefficient, gas-fired “peaker” power plants. These plants are currently needed because of the steep changes in demand between mid-day, with high levels of solar generation, and the early evening hours, when solar generation drops off concurrent with increasing demand by households (this is known as the “Duck Curve”). However, important energy justice issues must be evaluated and addressed as part of the implementation of time-of-use rate schedules. To what extent do households, especially in under-resourced communities, have the capacity to shift their loads? How will this impact household electricity bills? And, more broadly, how much energy is being used per capita in under-resourced households, and how does this compare to wealthier communities? Should load shifting requirements apply equally to all households? Moreover, what are the trade-offs between natural gas peaker power plants and the installation of batteries for the storage of surplus renewable solar electricity generated during the day, to be discharged during times of peak demand? Studies of these costs and benefits, environmental impacts, and generation versus demand seem to be missing. Yet state regulators have continued to approve new gas-fired peaker power plants while hoping to curb demand through time-of-use pricing.
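To make the load-shifting question concrete, the following minimal Python sketch compares a daily bill under a two-period tariff for a household that cannot shift its use with one that can. Everything numeric here is an assumption for illustration: the rates, the peak window (16:00 to 20:59), and the flat usage profile are placeholders, not actual California tariffs or Energy Atlas data.

```python
def tou_bill(hourly_kwh, peak_hours, peak_rate, off_peak_rate):
    """Daily bill for 24 hourly kWh readings under a two-period tariff."""
    return sum(
        kwh * (peak_rate if hour in peak_hours else off_peak_rate)
        for hour, kwh in enumerate(hourly_kwh)
    )


def shift_load(hourly_kwh, peak_hours, fraction, target_hour):
    """Move a fraction of each peak hour's use to a single off-peak hour."""
    if target_hour in peak_hours:
        raise ValueError("target_hour must be outside the peak window")
    shifted = list(hourly_kwh)
    for hour in peak_hours:
        moved = shifted[hour] * fraction
        shifted[hour] -= moved
        shifted[target_hour] += moved
    return shifted


# Hypothetical inputs: 1 kWh in every hour, peak window 16:00-20:59.
PEAK = set(range(16, 21))
usage = [1.0] * 24
bill_no_shift = tou_bill(usage, PEAK, peak_rate=0.50, off_peak_rate=0.25)
bill_full_shift = tou_bill(shift_load(usage, PEAK, 1.0, 12), PEAK, 0.50, 0.25)
```

With these made-up rates, the household that cannot shift pays 7.25 per day while the one that moves all peak use to midday pays 6.00. The gap is the cost borne by households without programmable appliances, storage, or flexible schedules.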
The implementation of time-of-use tariffs involves charging higher rates at peak hours in order to reduce energy use and shift it to earlier or later times of the day, and has potentially significant and disproportionate equity impacts. The least affluent have fewer options for shifting their energy use, such as purchasing programmable appliances or installing energy storage. Aside from cost considerations, they are also often renters, using equipment provided by landlords, which may or may not be energy efficient. Considering the thermal characteristics of their buildings, their lack of access to distributed renewable energy resources, and their sensitivity
to energy price increases, lower income communities are more likely than relatively affluent communities to be severely impacted by time-of-use pricing. Their appliances, as we have observed in our community-based research, are much older, and their housing stock is often in poorer condition. Higher peak use prices will mean difficult choices for these communities. Households lucky enough to have air conditioning (and these are often inefficient window units) will have to decide whether to turn it on upon coming home from work or school and face very high electricity bills, suffer in the heat, or perhaps find a local air-conditioned space, like a mall or library. It is important to remember that many under-resourced households already forgo consumption of energy in order to save on utility bills; are these the families that should now be responsible for solving the “Duck Curve” problem? Time-of-use pricing shifts the burden to households, in contrast to an aggressive mandate for battery storage. It is possible to imagine an alternative where utilities prioritize battery storage to avoid TOU charges for under-resourced neighborhoods, but this would require a fundamental shift in the utility business model. Interconnection of Distributed Generation To mitigate potential detrimental consequences of distributed solar generation on electric grid operations, California utilities can impose limits on the size of solar generation capacity, as well as the number of installations on a circuit (Rule 21), per the Public Utilities Commission. By coupling data on solar rooftop potential, consumption, income characteristics, and grid information, we found that under-resourced communities may be disproportionately impacted by these interconnection policies. In nearly 20% of communities in Los Angeles County, current interconnection policies reduce the potential of net rooftop solar generation. This is due to limits on the size of arrays that can be installed.
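The effect of a consumption-based size limit can be sketched as follows. This is a simplified illustration, assuming the cap is expressed as 150% of a building's historical annual consumption (the ratio cited in this section) and that rooftop potential and consumption are both given in annual kWh. The function names are hypothetical, and the sketch ignores circuit-level limits.

```python
def allowed_generation(rooftop_potential_kwh, historical_use_kwh, cap_ratio=1.5):
    """Annual generation permitted under a cap tied to historical use."""
    return min(rooftop_potential_kwh, cap_ratio * historical_use_kwh)


def curtailed_potential(rooftop_potential_kwh, historical_use_kwh, cap_ratio=1.5):
    """Rooftop potential that cannot be installed because of the cap."""
    allowed = allowed_generation(
        rooftop_potential_kwh, historical_use_kwh, cap_ratio
    )
    return rooftop_potential_kwh - allowed
```

For a building with 12,000 kWh of annual rooftop potential and 4,000 kWh of historical use, the cap allows 6,000 kWh and forgoes the other 6,000 kWh. Because the cap scales with past consumption, buildings that use little energy relative to their roof area lose the largest share of their potential.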
Utilities limit the size of photovoltaic arrays so that they supply no more than 150% of the historical consumption of the building on which they are installed, as they are obliged to pay a feed-in tariff for the power produced by rooftop systems, and do not wish to undermine their own power purchases by having their customers become distributed power providers. Urban rooftop capacity might generate more than is utilized by the building currently, but the
potential of such installations to contribute to energy supply is limited due to concern about grid capacity and utility power purchase preferences. We found that under-resourced communities in Los Angeles generally have greater potential to contribute peak solar generation exports to the electric grid, along with greater excess capacity in local circuits to accept solar from sources other than rooftops (Porse et al. 2020). This means that solar capacity potential is underutilized in situ, pushing purchases of solar to off-site solar arrays, taking up lands that could be used for other purposes. Our findings imply that current interconnection rules aren’t consonant with the goal of a 100% renewable transition; utility interconnection policies forestall maximum generation in the urbanized region, and foreclose on maximum solar generation from rooftops due to inadequate grid configurations. These findings are important, as they suggest how states can develop policy instruments and interconnection rules to increase solar generation in the existing built environment, mitigating possible environmental impact of extensive solar power plants outside of the urban areas. Building Energy Use Comparative analyses using the Atlas data have also revealed substantial shortcomings in California’s attempts to reduce building energy use. Beginning in 1978, the state developed building energy codes to make buildings more energy efficient. These codes have been revised regularly over time, reflecting new building technologies and materials. We found that despite increasingly stringent building codes, energy use per capita has been rising over time, driven by the construction of larger and larger homes. This building trend significantly undermines the state’s sustainability regulations and goals relating to energy use. 
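The interaction between efficiency and home size can be illustrated with a short calculation: a home's total energy use is its energy usage intensity (EUI) multiplied by its floor area, so a drop in EUI can be outweighed by growth in area. The sketch below uses made-up numbers (a 1,500 sq ft older home at an EUI of 40 and a 2,400 sq ft newer home at an EUI of 30, in arbitrary energy units per square foot per year); these are not Energy Atlas results.

```python
def total_energy(eui, floor_area_sqft):
    """Annual energy use = energy usage intensity x floor area."""
    return eui * floor_area_sqft


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Hypothetical homes: the newer one is 25% more efficient per square foot
# but 60% larger.
older_home = total_energy(eui=40.0, floor_area_sqft=1500.0)
newer_home = total_energy(eui=30.0, floor_area_sqft=2400.0)
change = percent_change(older_home, newer_home)
```

In this example the newer home uses 20% more energy in total despite being more efficient per square foot, which is the pattern the paragraph above describes.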
Our analysis revealed that historical energy savings within Los Angeles County, attributable to state-mandated building codes, could have been equivalently achieved by constraining growth in the size of new homes. Put another way, the growth in home size reduced the potential energy savings from building codes by more than half. This is a classic illustration of the Jevons Paradox (Sorrell 2009), which seems entirely unknown to energy policymakers. Nineteenth-century economist William Stanley Jevons observed that increasing the efficiencies of coal-burning machines such as locomotives resulted in more of them being built and used, thus increasing the total consumption of energy. This is the case in California, where the
actual amount of consumption is not the subject of regulation. What is regulated instead is the efficiency of consumption. The promulgation of mere efficiency measures like insulation and double-paned windows, without caps or goals for absolute energy use reductions, means that buildings can become more efficient per square foot while total energy use keeps increasing. Such trends raise fundamental ethical questions. We must consider: how much energy is sufficient for a decent life? Is there a level of consumption above which the externalities of consumption, such as greenhouse gas emissions from peaker power plants or natural gas appliances, and the embedded GHGs in materials required for new construction, outweigh the benefits to those who enjoy them (Fournier et al. 2020)? In addition to the effects of increasing building size and efficiency on total energy consumption, we have also found that energy efficiency incentives (for example, utility rebates for efficient washing machines or home weatherization) are disproportionately taken up by affluent residents who live in newer, more energy efficient homes (per square foot). This cohort can afford to shoulder the cost of the upgrades or new devices and receive partial repayment over time, while the less wealthy, in most cases, cannot (Chuang et al. 2018). It is important to realize that energy efficiency measures merely serve to modify existing conditions, but do not take into consideration structural circumstances that drive energy use, such as the power grid, grid supplier, prices charged, neighborhood context, weather, the proliferation of electronic appliances, and the building itself (Lutzenhiser 2014). Additionally, energy efficiency incentives of the type discussed here assume that individuals privilege energy savings above all else. Such programs implicitly assume that people prioritize financial savings that result from greater efficiencies.
Generally speaking, such programs do not recognize that people have many household priorities, including, for example, caring for the sick, who may need very warm or very cold environments. Another important factor to consider in building energy use is the impact of the anticipated increase in extreme heat days and higher temperatures. Southern California is anticipated to experience 1–4 °C of warming, resulting in higher summertime peak electricity demand. With the data from the Energy Atlas, we were able to ascertain the possible impacts of increased heat load on the grid. Since there may be up to 1 million additional persons living in the parts of the Los Angeles region that will experience the hottest days, according to state population
projections, understanding the possible risk to those populations is important. We estimated that generators, substations, and transmission lines could lose up to 20% of their safe operating capacity, as discussed in the case study chapter, and that 4–32% of additional capacity, distributed energy resources, and/or peak load shifting would be necessary by 2060 (Burillo et al. 2019). By utilizing consumption data from the Atlas for the regions most susceptible to extreme high temperatures, matched to grid network generation capacity and housing types, such projections become possible and are empirically based. The work shows the importance of Atlas-type data in planning where future urban growth may put people at greater heat risk. In Los Angeles County, temperature increases will vary significantly from the coast to the high desert, reflecting the heterogeneity of the regional climate. Analysis of energy system impacts should be used to guide future land use planning and to identify where urban growth should be minimized. The analysis further showed, based on existing energy use, that common-wall buildings, such as apartments or row houses, are less energy intensive than isolated single-family dwellings. This too is an important finding for future planning, given the likelihood of more frequent and hotter heat days.
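The efficiency-versus-size argument made at the start of this section can be illustrated with a small calculation. The Python sketch below treats a home's total annual energy use as its energy intensity per square foot multiplied by its floor area. The intensity and floor-area figures are hypothetical values chosen only for illustration, not results from the Energy Atlas, and the function names are likewise illustrative.

```python
def total_energy(intensity_kwh_per_sqft: float, floor_area_sqft: float) -> float:
    """Annual energy use (kWh) as intensity (kWh per sq ft) times floor area (sq ft)."""
    return intensity_kwh_per_sqft * floor_area_sqft


def break_even_area_growth(old_intensity: float, new_intensity: float) -> float:
    """Fractional floor-area growth at which an intensity improvement is fully cancelled."""
    return old_intensity / new_intensity - 1.0


# Hypothetical homes (illustrative numbers only, not Energy Atlas data).
older_home = total_energy(intensity_kwh_per_sqft=10.0, floor_area_sqft=1500.0)
newer_home = total_energy(intensity_kwh_per_sqft=8.0, floor_area_sqft=2000.0)

print(older_home)                          # 15000.0
print(newer_home)                          # 16000.0
print(break_even_area_growth(10.0, 8.0))   # 0.25
```

In this illustration the newer home is 20% more efficient per square foot but one third larger, so it uses more energy in total; any floor-area growth beyond 25% erases the gain in intensity.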
Summary
California has adopted aggressive electric utility renewable portfolio requirements, GHG emissions reduction targets, building energy performance standards, and other measures intended to reduce GHG emissions and energy consumption. However, existing buildings, their energy use, and their users remain largely mysterious. They are the objects of modeling exercises and suppositions that are based on rarely examined assumptions, yielding policy mandates that are impossible to implement or evaluate. State energy policy is a kind of house of mirrors; to find our way forward we glance carefully at highly distorted images of empirical reality, wandering at times indeterminately and occasionally bumping into obstacles. Current policies emerge from epistemologies based, in part, on the belief that increasing the efficiency of end uses of energy will lead to energy savings, a belief that is supported by some modeled and experimental evidence. While an exhaustive description of these epistemologies (and the beliefs that undergird them) is beyond the scope of this book, it is important to note that the pursuit of efficiency is at the
core of the policy tool kit that is becoming universalized as a set of “best practices.” This tool kit is being adopted by localities and states across the globe that aspire to carbon neutrality. The creators of this tool kit evince a kind of Promethean techno-utopianism that fails to account for the life cycle impacts of technological fixes, such as smart meters and controllers, and they are mostly satisfied with simplistic assumptions about individuals being essentially utilitarian or “economically rational.” Calculations of savings are uniformly modeled, reflecting assumptions about efficiency and the presumption that savings can be compared across types, technologies, and implementations. We have argued that granular, thickly contextualized building energy data is indispensable to a parsimonious and just transition away from fossil energy. We are in an era of feverish measurement and quantification, one which seems qualitatively and ideologically distinct from the past, and in which we discuss the inevitability of “smart” cities, “networked” cities, and “sensored” cities. These terms imply the integration of information gathering into the fiber and infrastructure of the city itself, measuring and gathering information second by second, minute by minute. More and newer technology, so we are told, will solve previously intractable problems: smart roadways and distributed traffic monitoring systems will decrease congestion, lower emissions, and support thriving neighborhoods. Verizon, for example, is marketing a new smart approach to urban infrastructure, offering cities sensor technology that directs people to empty parking spots more efficiently by enabling drivers to access the information on their phones.
The smart city evangelists, however, fail to mention or account for the many tons of GHG emissions that the new technologies and their deployment will create, along with the rare earth minerals and other life cycle costs they require, and the energy costs of collecting, transmitting, and storing the data. Theirs is a truncated view of sustainability that does not adequately address the climate impacts of their approach. Their technologies seem to exist independent of their materiality. The technologies in which smart city technophiles place their faith are essentially one-offs, developed for specific applications (like the management of parking) that are being commodified by a handful of firms (Siemens, AECOM, ABB) as part of competing data collection programs, dashboards, sensors, and implementation strategies. The data collection and analysis platforms marketed by these firms and others are
not alternatives to an integrated, holistic, and interactive set of strategies for dealing with urban sustainability issues and climate change. The creators and boosters of such technology are, at best, agnostic about the underlying causes of environmental problems, and have a tendency to reduce them to engineering problems. This kind of strictly technological approach falls short because it does not involve an understanding of the specificities of a city and the possibilities inherent in a particular place. Cities are different and conditions will always vary, but locally appropriate problem-solving and solutions are potentially foreclosed by “sustainability experts” who wield one-size-fits-all metrics and models. Taking local variation into account means, for example, understanding the potential for local energy generation and what measures might improve building energy performance, such as openable windows that take advantage of cooling breezes. Instead, technophilic solutions tend toward hermetically sealed buildings that control their internal temperatures automatically and “efficiently,” and that are nearly identical to all other smart buildings, even those in very different climates. Their construction is not accompanied by an accounting of the life cycle costs or the impacts of their sensor and control systems, nor is it informed by notions of human comfort outside of the set temperatures included in the building’s control software. These totalizing technical solutions flatten geographies of difference and suppress local potentialities. The Energy Atlas serves as a window, allowing us to see and appreciate the importance of a granular accounting of energy use embedded in the context in which that energy use occurs: the age and size of buildings, the sociodemographic characteristics of inhabitants, weather, urban morphology, and more.
Viewed through a lens of “thick description” that couples building energy use with urban history (such as patterns of urban development and housing types over time), sociological characteristics (income, ethnicity, renter/owner), policy mandates (Title 24 building standards, renewable portfolios, building energy use reduction), and technologies (smart meters, photovoltaics, battery storage, inverters), such data can enable a thicker reading of the urban landscape from an energy and infrastructure perspective. Pursuing urban sustainability requires addressing energy use by cities and developing analytical methods that can assess the patterns and relationships that might explain prevailing energy use. Only with a fine-toothed approach can we begin to “understand” building energy use and the potential for change. The Energy Atlas creates an opening for thinking about the existing built environment and what
that means for sustainability. If the goal is to create more sustainable cities and to mitigate climate change, buildings and urban morphology become a critical part of the agenda. The ubiquitous lack of transparency about building energy use needs to be addressed. Without granular data, efforts to mitigate the consumption of the most intensive energy users in cities will be hit and miss and wasteful, and will potentially lead to greater energy inequality. Finally, building energy use is nested in a set of larger questions about the energy and sustainability future of urban areas: how they are built, rebuilt, and powered. If resources like sand, lithium, and rare earth minerals are becoming scarcer, and the energy required for mining, processing, manufacturing, and transportation to construct and maintain the urban environment is taken into account, it becomes increasingly obvious that our current approach to the built environment cannot be sustained. There needs to be a serious reconsideration of the value of the existing stock of buildings and infrastructure. These are sunk, fixed Earth materials whose extraction, processing, transportation, and transformation into urban morphology have produced greenhouse gases and untold amounts of pollution. They are also materials that may not be replaceable at as high a quality or as low a cost. We should begin to look at the entire urban fabric as representing embedded energy and materials, and to treat it as an investment of expended GHGs and stocks of Earth materials. It therefore needs to be thought of and treated as durable and lasting, if reconfigurable. New large-scale construction that replaces the old simply increases the burden of GHGs and the impacts on Earth resources. Existing building stocks can be used more effectively and efficiently. Densities need to be increased both in new development and in the existing built environment.
Increasing density makes better use of the existing water, sewer, and other fixed infrastructure in areas already built up, and urban areas should maximize solar generation by utilizing all available rooftops and open built spaces. This generation should be routinely coupled with storage technologies at different scales, able to serve needs ranging from short peaks in demand to longer, sustained demand. All of this intensification of use should be accompanied by building energy data to determine how buildings are performing and to make them perform better over the long term. An understanding of planetary limits, and of the ways in which cities must start to think differently, changes how energy efficiency and renewables are integrated into the existing urban fabric. If buildings are expected to endure 100 or 200 years, what retrofits make sense? If we are indeed to
reduce GHGs and pollution emitted from fossil energy, how do we integrate renewables in the built environment and, perhaps more importantly, how do we configure daily life to use less energy overall? These are the questions of the day. Hoarding building energy data to protect the status quo will not get us there. Rather, understanding building energy use and its context through thick description mapping is a foundational step toward addressing these larger and connected issues facing us today. The UCLA Energy Atlas provides an important window for constructing a different energy future that is based on data: data that reflects people, place, and buildings, and that is constructed to help understand the condition of the most disadvantaged so that an equitable energy transition can be created. It enables an understanding of context and of how regulations shape energy type and provision. It helps us think through which changes need to take place, and how, to ensure a just energy transition for the future.
Bibliography
Burillo, D., Chester, M. V., Pincetl, S., & Fournier, E. (2019). Electricity Infrastructure Vulnerabilities Due to Long-Term Growth and Extreme Heat from Climate Change in Los Angeles County. Energy Policy, 128, 943–953.
Chuang, Y., Delmas, M., Federico, F., Fournier, F., & Pincetl, P. (2018). UCLA AEC Project, Energy Efficiency Program Effectiveness Analysis (Final Report). https://ucla.app.box.com/s/xp0dkev4qiu9l3qyzokmbvm5mg2ms8d6.
Fournier, E., Federico, F., Porse, E., & Pincetl, S. (2019). Effects of Building Size Growth on Residential Energy Efficiency and Conservation in California. Applied Energy. https://doi.org/10.1016/j.apenergy.2019.02.072.
Fournier, E. D., Cudd, R., Federico, F., & Pincetl, S. (2020). On Energy Sufficiency and the Need for New Policies to Combat Growing Inequities in the Residential Energy Sector. Elem Sci Anth, 8(1).
Lutzenhiser, L. (2014). Through the Energy-Efficiency Looking Glass. Energy Research and Social Science, 1, 141–151.
Porse, E., Fournier, E. D., Cheng, D., Hirashiki, C., Gustafson, H., Federico, F., et al. (2020). Net Solar Generation Potential from Urban Rooftops in Los Angeles. Energy Policy. https://doi.org/10.1016/j.enpol.2020.111461.
Sorrell, S. (2009). Jevons’ Paradox Revisited: The Evidence for Backfire from Improved Energy Efficiency. Energy Policy, 37(4), 1456–1469.
Glossary
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 S. Pincetl et al., Energy Use in Cities, https://doi.org/10.1007/978-3-030-55601-3
15/15 Privacy Rule: a privacy standard that requires that aggregated data include a minimum of 15 customers, with no one customer’s load exceeding 15 percent of the group’s energy consumption
Advanced Metering Infrastructure (AMI): an integrated system of smart meters, communications networks, and data management systems that enables two-way communication between utilities and customers
American Community Survey (ACS): an ongoing survey of social, economic, housing, and demographic factors, conducted by the federal government to help determine how federal and state funds should be distributed
Application Programmatic Interfaces (API): a set of functions and procedures allowing the creation of applications that access the features or data of an operating system, application, or other service
Artificial Neural Networks: an information processing paradigm that consists of an interconnected group of nodes, inspired by a simplification of neurons in a brain
BayREN: a collaboration of the nine San Francisco Bay Area counties that provides energy efficiency programs, services, and resources at a regional scale. It is funded through the California Public Utilities Commission by utility ratepayer funds, as well as through grants and funding from member agencies and other sources. BayREN is led by the Association of Bay Area Governments (ABAG)
BEV: Battery Electric Vehicle
Big Data: extremely large data sets
Building Vintage: the year the building was built
Business-as-Usual (BAU): an emissions projection based on current emission levels, regulations, technology, and behavior
California Energy Commission (CEC): a state regulatory and planning agency that oversees aspects of managing and improving the statewide energy system. The agency was created in 1974 in response to the 1973 energy crisis. Its mission is to plan for the future energy needs of the state. CEC is in charge of the regulations relating to the improvement of building energy performance (Title 24) and the development of new regulations for building energy, including the recent requirement that new residential construction be net-zero energy (though this has been newly and variously interpreted). The agency is funded by legislative appropriations and surcharges applied to ratepayer utility bills. The CEC directs funding for the research and development of new energy technologies and projects intended to better understand and/or transform various aspects of the energy system, conducts long-term energy forecasting, and approves the siting of power plants
California Independent System Operator (CAISO): a non-profit entity founded in 1997 that oversees the operation of California’s bulk electric power system, transmission lines, and the market for electricity generated and transmitted by its member utilities
Cloud Services: services made available to users on demand via the internet from a cloud computing provider’s servers, as opposed to being provided from a company’s own on-premises servers
Community-Based Organizations (CBOs): nonprofit groups that work on local-level issues
Community Choice Aggregation (CCA): programs and entities that allow local governments to procure power on behalf of their residents, businesses, and municipal accounts from an alternative supplier while still receiving transmission and distribution service from their existing utility
Council of Government (COG): a governmental agency made up of localities, cities, and counties that coordinates transportation and land use infrastructure
Criteria Pollutants: air pollutants with national air quality standards that define acceptable concentrations in ambient air. These include carbon monoxide, lead, nitrogen dioxide, ozone, particulate matter, and sulfur dioxide
Derated Load Factors: operating a system at less than capacity in response to a change in conditions; for example, if the ambient temperature is higher than the nominal temperature above which the system cannot operate at full capacity, the load performance is reduced
Distributed Energy Resources (DER): small-scale power generation or storage facilities
Distributed Resource Interconnection Map (DERiM): a map of the utility grid
Duck Curve: a graph of power production over the course of the day that shows the timing imbalance between peak demand and renewable energy production, a term coined by the CAISO in 2012
Electric Rule 21: a set of regulations that describes the volume of distributed rooftop solar generation assets that can be developed, sets interconnection criteria, and designates upper limits for installations
Energy Data Request Platform (EDRP): a form to request energy data from the California Investor-Owned Utilities
Energy Usage Intensity (EUI): a measure of the energy use of a building, obtained by dividing the total annual energy use of the building by its total gross floor area
Enterprise Cloud Service Providers: the provision of cloud computing services to businesses by outside vendors or service providers, in contrast to consumer cloud computing services
Geocoding: providing geographical coordinates corresponding to a location
Go Solar California 2017: a joint effort of the California Energy Commission and the California Public Utilities Commission to encourage Californians to install 3,000 megawatts of solar energy systems on homes and businesses by the end of 2016, making renewable energy an everyday reality. The program also has a goal to install 585 million therms of gas-displacing solar hot water systems by the end of 2017. https://www.gosolarcalifornia.ca.gov/
Infrastructure as a Service (IAAS): an instant computing infrastructure, provisioned and managed over the internet
Institutional Review Board (IRB): an administrative body established to protect the rights and welfare of human subjects recruited to participate in research activities conducted under the auspices of the institution with which it is affiliated
Investor-Owned Utility (IOU): a privately owned monopoly utility that is regulated by a public utility commission and that provides an essential service such as electricity, natural gas, or water
Intergovernmental Panel on Climate Change (IPCC): an intergovernmental organization with the mission of developing objective, science-based information on the risks of and potential responses to climate change
Load Profile: a graph of the variability of electrical load versus time
Megaparcels: aggregations of types of individual parcels (e.g., part of a building (merely the ground floor footprint), such as in condominiums or individually owned space in office buildings) into single unified geometries
Moore’s Law: the prediction that processor speeds or overall processing power of computers will double every two years
MTCO2: metric tons of carbon dioxide equivalent
Municipally Owned Utility (MOU): a utility owned by a municipality or publicly owned
Net Energy Metering (NEM): the net amount of energy that a building producing its own energy imports from, or exports to, the utility grid
Net Solar Potential: the total estimated solar electricity that can be generated from a solar array
Non-Disclosure Agreement (NDA): a legal agreement by which one or more parties agree not to disclose confidential information that has been shared with each other
North American Industry Classification System (NAICS): a classification system for the collection, analysis, and publication of statistical data related to the US economy. Supersedes the Standard Industrial Classification Codes
Particulate Matter (PM): the sum of microscopic particles of liquid or solid matter that are suspended in air
Personally Identifiable Information (PII): information that can be used to distinguish or trace an individual’s identity either alone or when combined with other personal or identifying information that is linked or linkable to a specific individual
PHEV: Plug-in Hybrid Electric Vehicle, a vehicle equipped with both a battery and a liquid fuel tank, whose battery can be charged electrically
Platform as a Service (PAAS): a category of cloud computing services allowing customers to develop, run, and manage applications without the complexity of building and maintaining the infrastructure to do so
PostgreSQL: an open source relational database management system
Public Participation Geographic Information Systems (PPGIS): an approach that broadens public involvement in the development and use of Geographic Information Systems to promote the goals of nongovernmental organizations, grassroots groups, and community-based organizations
Public Utilities Commission (PUC): a governing body that regulates the rates and services of utilities. In California, its Commissioners are appointed by the Governor and ratified by the State Senate. It is staffed by civil servants
Publicly Owned Utilities (POUs): utilities that are owned by cities or counties, hence public
Ramping Loads: electric loads that increase demand for power over the course of several hours, e.g., demand for electric lighting and plug loads as the evening begins
Rebound Effect: in conservation and energy economics, the reduction of expected gains from new technologies that increase the efficiency of resource use because of behavioral or other responses
Relational Data Model: a data model in which a database is a collection of relations
Representative Concentration Pathway (RCP): adopted by the Intergovernmental Panel on Climate Change (IPCC), the RCP defines a trajectory of greenhouse gas concentration used in climate models
Scrutability: capable of being understood through study and observation
SIC Code: Standard Industrial Classification code, a four-digit numerical code assigned by the US government to business establishments to identify the primary business of the establishment
Solar Capacity: a measure, expressed as a percentage, of the total energy produced during some period of time divided by the amount of energy the plant would have produced if it ran at full output during that time
Strategic Growth Council (SGC): a state-based committee that coordinates state agencies and departments to develop activities that support sustainable communities, strong economies, social equity, and environmental stewardship
Time of Use: a tariff on electricity use designed to incentivize customers to use energy at off-peak times in order to balance demand. The tariffs charge cheaper rates at certain times of night or day when demand is at its lowest, and higher rates at popular times, typically in late afternoon and early evening
Urban Metabolism: the sum total of the technical and socioeconomic processes that occur in cities, resulting in growth, production of energy, and elimination of waste
Virtual Net Metering: a bill crediting system for community solar. It refers to when solar is not used on-site but is externally installed and shared among subscribers. The subscriber receives credits on their bill for excess energy produced by that customer’s share of the community solar system
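Two of the definitions above, the 15/15 Privacy Rule and Energy Usage Intensity, are simple enough to express as code. The following Python sketch is a minimal illustration under stated assumptions: the function names and sample figures are hypothetical, the decision to withhold a group with zero total consumption is an assumption that the glossary definition does not address, and this is not the Energy Atlas implementation.

```python
from typing import Sequence


def passes_15_15(loads: Sequence[float],
                 min_customers: int = 15,
                 max_share: float = 0.15) -> bool:
    """Return True if an aggregate of customer loads satisfies the 15/15 rule.

    The group needs at least `min_customers` customers, and no single
    customer may account for more than `max_share` of the group total.
    """
    if len(loads) < min_customers:
        return False
    total = sum(loads)
    if total <= 0:
        # Assumption: a group with no consumption is withheld, not disclosed.
        return False
    return max(loads) / total <= max_share


def energy_usage_intensity(annual_energy: float, gross_floor_area: float) -> float:
    """Total annual energy use divided by total gross floor area."""
    if gross_floor_area <= 0:
        raise ValueError("gross_floor_area must be positive")
    return annual_energy / gross_floor_area


# Hypothetical customer groups (illustrative values only).
even_group = [100.0] * 20                 # 20 customers, each 5% of the total
small_group = [100.0] * 14                # too few customers
dominated_group = [100.0] * 19 + [1000.0] # one customer is about 34% of the total

print(passes_15_15(even_group))       # True
print(passes_15_15(small_group))      # False
print(passes_15_15(dominated_group))  # False
print(energy_usage_intensity(120000.0, 10000.0))  # 12.0
```

The units of the intensity result follow from the inputs; with annual energy in kWh and floor area in square feet, the example above yields kWh per square foot per year.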
Index
A
AB 32, 96
Active San Gabriel Valley, 140
Administrative Law Judge, 35, 36
Advanced Energy Communities (AEC), 120–122, 125, 127, 128, 132, 140, 155
Advanced Metering Infrastructure (AMI), 38, 89, 100, 114, 171
Aggregated data, 22, 46, 58, 83, 94, 110, 154, 160
Aggregation, 3, 15, 20, 22, 28, 37, 38, 57, 62, 63, 66, 67, 79, 81, 83, 93–96, 99, 110, 115, 150, 151, 154
American Community Survey (ACS), 52, 61, 68, 126, 171
Application Programmatic Interfaces (API), 55, 98, 171
Artificial neural networks, 114, 171
B
Baselines, 19, 20, 107, 124, 143
Benchmarking, 17, 21, 24
BEV/PHEV, 112, 171, 174
Billing cycle, 56
Building vintage, 18, 19, 50, 55, 64, 125, 172
Business-as-usual (BAU), 120, 147–150, 172
Business model, 11, 15, 26, 40, 42, 91, 129, 162
C
C40, 23, 24, 29
CalEnviroScreen (CES), 53, 69, 72, 77, 126, 134, 136
California Energy Commission (CEC), 16–18, 37, 40, 52, 72, 104, 120–122, 137, 140, 144, 172
California Independent System Operator (CAISO), 129, 130, 142, 172
California Office of Environmental Health Hazard Assessment (OEHHA), 53
California Public Utilities Commission (CPUC), 12, 13, 16, 18, 20, 22, 35–40, 43, 48, 57, 58, 63, 79, 83, 93, 99, 110, 126, 131, 148, 150–152, 159
Carbon Neutral Cities Alliance, 24
Cartography, 25
Chief Sustainability Office (CSO), 147
Clean Power Alliance of Southern California (CPASC), 127
Clean Power Authority (CPA), 125
Closed-Circuit Television (CCTV), 23
Cloud services, 101, 172
Community Choice Aggregation (CCA), 13, 125, 172
Conservation, 5, 13, 33, 92, 106, 107, 110, 114
Contextual information, 45, 105, 155
Councils of government (COG), 21, 49, 52, 53, 66, 172
Criteria pollutants, 141, 172
Customer privacy aggregation rules, 79, 83
D
Data aggregation, 4, 22, 36, 37, 81, 93, 95, 115
Data disclosure, 93, 95
Data errors, 60
Data structure, 105
Data visualization, 2, 27, 65, 68
Decarbonization, 11, 15
Description, 29, 106, 112, 113, 115, 120, 165
Difference-in-difference, 107, 108, 125
Digital humanities, 27, 28
Disadvantaged Communities, 16, 53, 112, 120, 134, 140
Distributed Energy Resource Interconnection Maps (DERiM), 126, 131, 132, 135, 173
Distributed Energy Resources (DER), 89, 132, 165, 173
Duck Curve, 14, 130, 161, 162, 173
E
Electric Rule 21, 131, 173
Electric Vehicle (EV), 112, 127, 139, 142
Elevate Energy, 79
Energy Data Access Committee (EDAC), 37, 38, 42, 159
Energy Data Request Program (EDRP), 99, 173
Energy efficiency, 5, 11, 15, 16, 18, 19, 21, 33, 67, 89, 90, 108, 110, 122–125, 127, 129, 139, 140, 142, 144, 146, 161, 164, 168
Energy efficiency programs (EE), 10, 12, 13, 15, 20, 39, 41, 42, 61, 90, 106–108, 110, 112, 122, 123, 125, 127, 150–152
Energy Intensity Units (EUI), 96, 149
Enterprising cloud service providers, 102, 173
Environmental justice, 19, 27, 53, 79, 137, 147
Equity, 4, 11, 19, 20, 34, 83, 111, 112, 157, 161
Errors, 51, 59–62, 91, 98, 103, 104, 113
G
Geocoding, 4, 54, 55, 62, 104, 106, 151, 173
Geographical Information Systems (GIS), 25–27, 52, 66
Glendale Water and Power (GW&P), 47, 61
Governor’s Office of Planning and Research, 33, 35
Granular data, 5, 10, 20, 21, 40, 42, 90, 157, 168
Greenhouse Gas (GHG), 2, 5, 10–13, 15–19, 33–35, 37, 58, 67, 69, 78, 80, 90, 95–97, 120, 122, 127, 141, 147–150, 157, 158, 160, 164–166, 168, 169
Grid penetration, 5
H
Heating, Ventilation, and Air Conditioning (HVAC), 128
I
Independent Review Board (IRB), 99, 173
Information technology infrastructure, 101
Infrastructure as a Service (IAAS), 102, 173
Integrity check, 103, 104
Integrity validation, 103
Internet of Things (IOT), 8, 88
Intuition, 103, 105, 114
Investor-Owned Utility (IOU), 13, 15, 20, 37, 38, 40, 42, 47, 92, 93, 99, 129–131, 174
L
Long Beach Gas and Oil (LBGO), 47, 48
Long term procurement, 89
Los Angeles County Sustainability Plan, 97
Los Angeles Department of Water and Power (LADWP), 3, 20, 34–36, 47, 61
M
Managed data centers, 101
Masking, 38, 63, 79–81, 83, 94, 96, 97, 111
Megaparcels, 51, 174
Models, 11, 26, 36, 42, 86, 87, 91, 92, 105, 107, 108, 111–114, 122, 125, 142, 145, 153–155, 167
Moore’s Law, 100, 101, 174
MTCO2, 58, 67, 174
Municipally Owned Utility (MOU), 90, 93, 129, 174
Municipal tax assessor, 106
N
NAICS code, 50, 55
Natural Resources Defense Council, 79
Net Energy Metering (NEM), 131, 174
Net solar potential, 128, 133, 134, 136, 174
Non-disclosure agreement (NDA), 20, 22, 34, 36, 39, 41, 46, 48, 57, 91, 99, 152, 174
P
Pacific Gas and Electric (PG&E), 39–42, 150–152, 160
Personally Identifiable Information (PII), 57, 92, 174
Platform as a Service (PAAS), 102, 174
Policy implementation, 2, 12, 20, 40, 42, 157, 158
PostgreSQL, 46, 56, 102, 175
Preemptible services, 102
15/15 privacy guidelines, 69, 171
Publicly Owned Utilities (POUs), 13, 20, 48, 59
Public Participation Geographic Information Systems (PPGIS), 21, 26–28, 138, 140, 175
Public Utilities Commission (PUC), 3, 10–13, 35, 36, 39, 42, 43, 61, 90, 162, 175
PV/PV generation/PV Assets, 124, 126, 131, 145, 146
R
Real-time dashboards, 24
Rebound effect, 108–110, 175
Regulatory potential, 133
Relational data model, 105, 106, 175
S
San Diego Gas & Electric (SDG&E), 48
Scrutability, 113, 175
SIC code, 50, 55, 175
Simulated historical forecasting, 108, 112
Smart data, 8
Software, 4, 8, 24–29, 98, 99, 101, 102, 104, 167
Solar capacity, 4, 125, 131, 132, 163, 175
Solar Prioritization tool, 27, 125, 126, 128, 129, 132, 137
Southern California Association of Governments (SCAG), 49, 50
Southern California Edison (SCE), 13, 40, 41, 47, 48, 61, 125, 126, 131–135, 149
Southern California Gas (SCG), 40, 41, 47, 48, 61, 149
Spatial join, 106
Stakeholder engagement, 66, 154
Strategic Growth Council (SGC), 137, 138, 175
Studio NAND, 65
T
Thick mapping/Thick description, 27–29, 119, 146, 161, 167, 169
Time-of-use, 5, 161, 162, 175
U
UCLA Information Technology Services, 57
US Census, 52, 61
Utility Data Standardization, 54
V
Virtual Net Metering, 176
Volume, Velocity, Variety, 100
W
Whales, 94