Energy Sustainability through Retail Electricity Markets: The Power Trading Agent Competition (Power TAC) Experience (Applied Innovation and Technology Management) 3031397061, 9783031397066

The world is moving away from demand-driven electricity markets supplied by centralized generation and distribution of f


English Pages 245 [242] Year 2023

Table of contents :
Foreword
References
Contents
1 Introduction
1.1 Background
1.2 Electric Power Markets
1.3 Power TAC Goals
1.4 The Power TAC Platform
1.5 What Is in This Book
References
2 Modeling a Customer Population in Power TAC: Electric Vehicle Chargers
2.1 Introduction
2.2 Related Work
2.3 Modeling a Population of EV Chargers
2.3.1 Representation
2.3.2 Learning the Charging Behavior of EV Users
2.3.3 Dataset
2.3.4 GMM Results
2.3.5 Model Operation
2.4 Generating Demand
2.5 Results
2.6 Conclusion
References
3 VidyutVanika: AI-Based Autonomous Broker for Smart Grids: From Theory to Practice
3.1 Introduction
3.2 Preliminaries
3.2.1 RL: A Brief Overview
3.2.2 Game Theory: A Brief Overview
3.2.3 VidyutVanika: Overview
3.3 Wholesale Strategies
3.3.1 Bayesian Nash Equilibrium Analysis of Wholesale Strategies
3.3.2 VV18–WS
3.3.3 DDPGBBS for Bidding in Smart Grids
3.3.4 VV21–WS
3.4 Tariff Strategies
3.4.1 VV18-TS
3.4.2 VV21–TS
3.4.3 Game Theoretical (GT) Analysis of VV21 Tariff Strategy
3.5 Results and Discussion
3.5.1 Tournament Results (PowerTAC 2018 and 2021)
3.5.2 Discussion
3.6 Future Directions
References
4 Designing Retail Electricity Tariffs Using Reinforcement Learning
4.1 Introduction
4.2 Smart Electricity Market
4.3 Proposed Broker Strategy
4.3.1 Tariff Design
4.3.2 Transformation of MUBP to ToU Pricing Schemes
4.3.3 Manage Published Consumption Tariffs
Customer Demand Prediction
4.4 Numerical Results and Discussion
4.5 Related Work
4.6 Conclusions and Future Work
References
5 Nudging the Direction of Energy Tariff Selection: Lessons Learned from an Attribute Framing Experiment with Temporal Construal Levels
5.1 Introduction
5.2 Decision Frames and Environmental Sustainability
5.2.1 The Framing of Preferential Choice
5.2.2 Framing Effects, Construal Levels, and Individual Differences
5.2.3 The Present Study
5.3 Data and Methods
5.4 Results
5.4.1 Basic Differences Between Tariff Evaluations
5.5 Discussion and Conclusion
References
6 AgentUDE: A Smart Broker Agent for Autonomous Power Trading
6.1 Introduction
6.2 Related Work
6.2.1 Electricity Demand and Price Forecasting
6.2.2 Strategic Bidding in Wholesale Markets
6.2.3 Tariff Forming in Retail Markets
6.3 Experimental Setup and Resources
6.3.1 Power TAC Tournament Manager
6.3.2 Power TAC Log Analysis Tool
6.4 AgentUDE14: A Champion Agent
6.4.1 Wholesale Market
6.4.2 Retail Market
6.5 AgentUDE15: Utilizing Storage Capacities
6.5.1 Experimental Setup
6.5.2 Results
6.6 AgentUDE17: A State-of-the-Art Broker
6.6.1 AgentUDE17: Smart Bidding in Wholesale Markets
6.6.2 AgentUDE17: Evolutionary Trading in Retail Markets
6.7 Conclusion and Future Work
6.8 AgentUDE Executables and Resources
References
7 Upgrading a Winning Agent to Not Winning: The Case of Agent Mertacor in Power TAC
7.1 Introduction
7.2 Related Work
7.2.1 The Power TAC Environment
7.2.2 Pivotal Broker Designs
7.3 Mertacor: A Winning Power TAC Agent
7.3.1 The Wholesale Market Module
7.3.2 The Retail Market Module
7.3.3 Mertacor Prediction Strategy
7.4 Competition Results (2019, 2020, 2021) and Discussion
7.5 Conclusions
References
8 SPOT: Strategies for Power Trading in Wholesale Electricity Markets
8.1 Introduction
8.2 Background
8.2.1 PowerTAC and Related Agent Strategies
8.2.2 Periodic Double Auctions (PDAs)
8.2.3 Monte Carlo Tree Search (MCTS)
8.3 Learning Prices in Dynamic Wholesale Market
8.3.1 Supervised Price Predictors
8.3.2 Dynamic MDP Price Predictor
8.3.3 Choosing Price Predictors for Bidding
8.4 SPOT's Wholesale Trading Strategies
8.4.1 Heuristic Bidding Strategies
8.4.2 Bidding Using MCTS
8.4.3 Dynamic MCTS Strategy
8.5 Experimental Methods and Benchmark Strategies
8.5.1 Testbed
8.5.2 Benchmark Strategies
8.6 Experimental Results
8.6.1 MCTS Strategy Variations
8.6.2 Candidate Strategy Comparison
8.6.3 Dynamic MCTS Comparison
8.7 Discussion
8.8 Conclusion
References
9 CrocodileAgent: A Decade of Competing in the Power Trading Agent Competition
9.1 Introduction
9.2 CrocodileAgent Design
9.3 Evidence from the Power TAC 2020 Competition
9.3.1 Absolute Scores Analysis
9.3.2 Relative Scores Analysis
9.4 Conclusions
References
10 Incorporating Social Values for Cooperation in Energy Trading and Balancing Research
10.1 Introduction
10.2 Cooperation in Social Dilemma Situations
10.2.1 Social Value Orientation and Cooperation in Social Dilemmas
10.2.2 The Measurement of Social Value Orientation
10.3 Situational Moderators of Social Value Orientation
10.3.1 The Impact of Give-Some vs. Take-Some Games
10.3.2 The Impact of One-Shot vs. Repeated Games
10.3.3 The Impact of Communication
10.3.4 The Impact of Gender Differences
10.3.5 The Impact of Trust
10.4 Incorporating Social Values for Cooperation in the Power TAC Environment
10.4.1 The Power Trading Agent Competition (Power TAC)
10.4.2 Toward Non-competitive Broker Agents Based on Social Values
10.4.3 Addressing Grand Societal Challenges with Social Values
10.5 Conclusion
References
11 Smart Market-Driven Virtual Power Plants of Shared Electric Vehicles
11.1 Introduction
11.2 Background and Related Literature
11.2.1 Balancing the Electrical Grid: Control Reserve Market
11.2.2 Information-Based Sustainable Society: Carsharing with Electric Vehicles
11.3 Data
11.4 Model Description
11.4.1 Virtual Power Plant Decision Support: FleetPower
11.4.2 Endogeneity from Market Participation
11.5 Evidence from a Real-World Setting
11.5.1 Energy Market Data: California ISO
11.6 Analysis and Discussion
11.7 Conclusions
References
12 Power TAC Experiment Manager: Support for Empirical Studies
12.1 Introduction
12.2 Related Work
12.3 Supporting Empirical Research
12.3.1 Simulation Space
12.3.2 Experiments
12.4 Architecture
12.4.1 Simulation Services
12.4.2 Container Virtualization
12.5 Implementation
12.5.1 Game and Experiment Creation
12.5.2 Simulation Automation
12.6 Getting Results
12.7 Conclusion
References

Author / Uploaded
John Collins (editor)
Wolfgang Ketter (editor)
Andreas L. Symeonidis (editor)


Applied Innovation and Technology Management

John Collins, Wolfgang Ketter, Andreas L. Symeonidis (Editors)

Energy Sustainability through Retail Electricity Markets: The Power Trading Agent Competition (Power TAC) Experience

Applied Innovation and Technology Management

Series Editors
Tugrul U. Daim, Department of Engineering & Technology Management, Portland State University, Portland, OR, USA
Marina Dabić, Faculty of Economics & Business, University of Zagreb, Zagreb, Croatia

Technology is not just limited to technology companies. Managing innovation and technology is no longer a luxury and needs to be understood by all sectors around the world and by both technical and non-technical managers. This book series explores existing and emerging technologies that address current challenges within innovation and technology management. Each title is developed to provide a set of frameworks, tools, and methods that can be adopted by researchers, managers, and students in engineering, innovation, and technology fields. Research, policy, and practice-based books in the series cover topics such as roadmapping, portfolio management, technology forecasting, R&D management, health technologies, bio technologies, transportation management, smart cities, and open innovation, among many others.

Editors
John Collins, Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
Wolfgang Ketter, University of Cologne, Cologne, Germany
Andreas L. Symeonidis, Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece

ISSN 2662-9402
ISSN 2662-9410 (electronic)
Applied Innovation and Technology Management
ISBN 978-3-031-39706-6
ISBN 978-3-031-39707-3 (eBook)
https://doi.org/10.1007/978-3-031-39707-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.

Foreword

By now, in 2023, research competitions have become an entrenched fixture of the AI community. Many AI conferences host competitions pertaining to their particular subfield. Entire platforms, such as Kaggle,¹ exist to host and promote machine learning competitions. However, when the Trading Agent Competition (TAC) was introduced more than two decades ago, it was among the first.

¹ https://www.kaggle.com/competitions/

The first TAC was held from June 22 to July 8, 2000, organized by a group of researchers and developers from the University of Michigan and North Carolina State University [5]. Their goals included providing a benchmark problem in the complex and rapidly advancing domain of e-marketplaces [1] and motivating researchers to apply unique approaches to a common task. They devised a fun and creative scenario in which entrants submitted programs to act as travel agents, tasked with securing resources for their clients. Specifically, TAC agents bought flights, hotel rooms, and entertainment tickets in different types of auctions. The TAC server, running at the University of Michigan, maintained the markets and sent price quotes to the agents. The agents sent bids to the server, which updated the markets accordingly and executed transactions. In a 2005 article with many of the same goals as this book (albeit shorter and less ambitious), Amy Greenwald and I summarized the research contributions of the various entrants [3].

From that first iteration in 2000, TAC grew and thrived, attracting a dedicated community of participants and expanding to a variety of scenarios, including supply chain management, market design, ad auctions, and eventually electricity markets. As a frequent competitor (and occasional winner) of these competitions, I gained a deep appreciation for the difficulty of designing such a scenario. It’s one thing to come to a fully formed competition and program an entrant. It’s an entirely different matter to craft a scenario that simultaneously surfaces interesting research issues, supports an interesting and fair competition, and represents some degree of realism. The editors of this book have done just that in the very topical and challenging domain of electric power markets, for which they deserve a lot of kudos and respect!

The first TAC was motivated in some ways by RoboCup, the robot soccer world cup, which was at the time, and remains today, the largest and most influential research competition in the world. Since I was among the earliest participants of both events (I remain actively involved in RoboCup, having recently completed a term as president), I reflected more than two decades ago on the hazards and benefits of research competitions such as RoboCup and TAC [2]. There is no doubt that research competitions have their hazards. They can give rise to an obsession with winning that runs counter to disseminating advances; they can encourage domain-dependent solutions that do not generalize; they can create a high barrier to entry for newcomers to the community who have not participated in previous iterations; and they can lead to invalid evaluation conclusions based on a particular method being embedded in a winning (or losing) entry, when the entry’s result is due in larger part to some other feature.

On the other hand, research competitions also come with many benefits. They provide inspiration for novel algorithms and approaches; they force entrants to embed their algorithms in complete, working agents by a deadline; they provide a common platform and challenge that enables researchers from around the world to easily exchange experiences and ideas; they are attractive and inspiring to (many) students; and they lead to a large pool of working agents that can be used for later benchmarking.

It is incumbent upon the organizers of a research competition to take all steps possible to make sure that the benefits outweigh the hazards. And the editors of this book have done that exceedingly well with Power TAC!

My former student, Daniel Urieli, and I were greatly inspired by the early versions of the competition.
We formalized the Power TAC electricity trading problem as a continuous, high-dimensional Markov Decision Process (MDP), which is computationally intractable to solve exactly, and then introduced effective approximation methods [4] that helped us win the 2015 competition. This research formed an integral part of Daniel’s Ph.D. dissertation, entitled Autonomous trading in modern electricity markets.² Although I have not participated since Daniel graduated, I am thrilled to see that Power TAC continues to thrive.

The 11 articles in this book embody the very best of the benefits of research competitions, presenting interesting research inspired by all aspects of the competition, some focusing on specific subcomponents of the challenge, and others presenting complete, integrated agents. I expect the book to be useful and interesting both to long-time Power TAC participants and to newcomers to the domain interested in ramping up quickly.

Thus, I commend all the authors and editors for their efforts to put together this compelling record of the impacts of Power TAC, and more generally for continuing the legacy of Power TAC and the overall TAC research community. This is what research competitions are meant to look like!

² https://www.cs.utexas.edu/~urieli/thesis/

Peter Stone
The University of Texas at Austin, Austin, TX, USA
Sony AI, New York, NY, USA

References

1. Eisenberg, A.: In online auctions of the future, it’ll be bot vs. bot vs. bot. The New York Times, 17 Aug 2000
2. Stone, P.: Multiagent competitions and research: lessons from RoboCup and TAC. In: Kaminka, G.A., Lima, P.U., Rojas, R. (eds.) RoboCup-2002: Robot Soccer World Cup VI. Lecture Notes in Artificial Intelligence, vol. 2752, pp. 224–237. Springer, Berlin (2003)
3. Stone, P., Greenwald, A.: The first international trading agent competition: autonomous bidding agents. Electron. Commerce Res. 5(2), 229–265 (2005)
4. Urieli, D., Stone, P.: An MDP-based winning approach to autonomous power trading: formalization and empirical analysis. In: Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS) (2016)
5. Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Lin, S.D., Reeves, D., Walsh, W.E.: A trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001)

Contents

1 Introduction
John Collins, Wolfgang Ketter, and Andreas L. Symeonidis

2 Modeling a Customer Population in Power TAC: Electric Vehicle Chargers
John Collins, Philipp Page, and Wolfgang Ketter

3 VidyutVanika: AI-Based Autonomous Broker for Smart Grids: From Theory to Practice
Sanjay Chandlekar, Bala Suraj Pedasingu, Susobhan Ghosh, Easwar Subramanian, Sanjay Bhat, Praveen Paruchuri, and Sujit Gujar

4 Designing Retail Electricity Tariffs Using Reinforcement Learning
Nastaran Naseri, Saber Talari, Wolfgang Ketter and John Collins

5 Nudging the Direction of Energy Tariff Selection: Lessons Learned from an Attribute Framing Experiment with Temporal Construal Levels
Laurens Rook, Jan van Dalen, and Wolfgang Ketter

6 AgentUDE: A Smart Broker Agent for Autonomous Power Trading
Serkan Özdemir and Rainer Unland

7 Upgrading a Winning Agent to Not Winning: The Case of Agent Mertacor in Power TAC
Lampros Makrodimitris and Andreas L. Symeonidis

8 SPOT: Strategies for Power Trading in Wholesale Electricity Markets
Moinul Morshed Porag Chowdhury, Christopher Kiekintveld, Tran Cao Son, and Enrico Pontelli

9 CrocodileAgent: A Decade of Competing in the Power Trading Agent Competition
Demijan Grgic, Jurica Babic, and Vedran Podobnik

10 Incorporating Social Values for Cooperation in Energy Trading and Balancing Research
Laurens Rook, Sudip Bhattacharjee, and Wolfgang Ketter

11 Smart Market-Driven Virtual Power Plants of Shared Electric Vehicles
Micha Kahlen, Karsten Schroer, Wolfgang Ketter, and Alok Gupta

12 Power TAC Experiment Manager: Support for Empirical Studies
Frederik Milkau, John Collins, and Wolfgang Ketter

Chapter 1

Introduction

John Collins, Wolfgang Ketter, and Andreas L. Symeonidis

J. Collins
University of Minnesota, Minneapolis, MN, USA
e-mail: [email protected]

W. Ketter
University of Cologne, Cologne, Germany

A. L. Symeonidis
Aristotle University of Thessaloniki, Thessaloniki, Greece

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. J. Collins et al. (eds.), Energy Sustainability through Retail Electricity Markets, Applied Innovation and Technology Management, https://doi.org/10.1007/978-3-031-39707-3_1

1.1 Background

The climate crisis and the need for sustainability threaten to upend many facets of our global civilization. Much of the uncertainty is focused on our sources and use of energy. As Saul Griffith points out in his book Electrify [3], most of our energy needs can be met by converting from fossil fuels to electricity, which could reduce our overall consumption of energy while dramatically reducing carbon emissions. However, the transition to weather-dependent renewables is likely to upend both wholesale and retail markets for electricity. It will require both significant new investment in storage and a transition from demand-driven grid management toward a supply-driven paradigm. A major goal of our research has been to develop and evaluate market-based approaches to drive the transition to supply-driven grid management and to encourage and compensate demand flexibility that can help drive the transition.

The three of us, John Collins, Wolf Ketter, and Andreas Symeonidis, all have backgrounds that span engineering, computer science, and economics. We have been working together and competing with each other since 2003, when we began participating in the Trading Agent Competition for Supply Chain Management [1]. The idea for a Trading Agent Competition came from Michael Wellman’s group at the University of Michigan [12], who saw it as a way to study and evaluate decision processes in complex market environments. Since 2000, several different complex trading problems have attracted researchers who build autonomous agents to participate and compete directly with each other in related competitive simulations.

In 2009, at a German workshop on renewable energy, Wolf Ketter realized that the Trading Agent paradigm might be a good way to study retail electricity markets that were opening up and seeing significant growth in distributed solar production. The three of us saw an opportunity to use our skills and experience to make useful contributions to the sustainability challenge in the energy sector. After a few dead ends, we came up with a simulation design that seemed to present an interesting range of the issues electricity retailers would face in a market with increasing weather-dependent energy supply, storage capacity, and demand flexibility. The simulation is less abstract, and therefore more complex, than earlier Trading Agent Competitions, using real-world weather and wholesale-market data, and customer models based on real-world pilot studies and engineering manuals. We had enough implemented by summer 2011 to demonstrate it at the International Joint Conference on Artificial Intelligence in Barcelona (IJCAI 2011), and we ran the first competition at the Autonomous Agents and Multi-Agent Systems conference in Valencia in 2012 (AAMAS 2012). This is how the Power TAC journey began.

1.2 Electric Power Markets

Traditionally, electric power has been generated in large centralized hydroelectric and thermal power plants. Operation of these plants is governed by load forecasts. Grid stability depends strongly on real-time balance between demand and supply, since the only “inventory” in the system has been in coal piles and gas storage facilities, and in reservoir levels for hydroelectric plants. Until recently, energy was sold by measuring consumption monthly or even less often. As a result, the only way to manage the grid was to make supply follow demand; the only way to influence demand was to ask people to conserve during expected peak demand periods. However, these assumptions and limitations were quickly erased by a number of trends:

– Widespread introduction of “smart” meters that measure consumption at intervals of 15 minutes or less and that can report not only usage but also power factor and outages in real time.
– Introduction of time-of-use rates, enabled by smart meters, which has caused many energy users to shift portions of their consumption to lower-cost periods.
– Widespread adoption of grid-connected distributed generation, primarily in the form of solar arrays on homes and businesses.
– Addition of batteries to the grid, especially in places such as California, where they are becoming increasingly important for shifting energy from when it is produced to when it is needed.

It is common knowledge that the cost of electricity is a combination of (mainly) capital cost, fuel cost, and operations and maintenance costs. Specifically, the cost of delivered energy includes:

– Energy: The cost of producing the energy. For a thermal plant (coal, gas, nuclear), this includes the cost of fuel to produce the heat needed to spin the turbines, and the cost of operating the plant. Solar and wind have essentially zero fuel and operation costs.
– Capacity: The cost of the infrastructure required to deliver energy to end users. This includes the financing cost for power plants and for solar and wind resources, and the cost of building, managing, and maintaining transmission and distribution grids and all associated components. It also includes the “fixed” portion of operation and maintenance costs that is incurred whether or not the plant is producing energy.
– Customer service: The cost of interacting with energy customers, who may be both producers and consumers of electrical energy.

Until the 1990s, most electricity in the United States was generated, transmitted, and sold by vertically integrated utilities, by government entities, and by cooperatives. Investor-owned utilities were generally treated as regulated monopolies. Trading among these entities was governed primarily through bilateral contracts. Similar arrangements were common in most of the world. Today, in the U.S.A., in most of Europe and Latin America, and in many other parts of the world, electricity is traded through organized markets at the wholesale level, and in some cases at the retail level. Shahidehpour et al. [8] give a good overview of how these markets work.
Wholesale power markets typically trade energy over multiple time horizons, often as far as 36 hours into the future, and in some markets as close as 5 minutes into the future. Most large power plants take time and money to start up and shut down, and so they prefer not to start up unless they have confidence they can run for an extended period. This fact strongly influences their trading behavior. It is therefore obvious that in order to properly plan the next day of power markets, one has to understand their dynamics and limitations at any given time. Electric power market designs must address a number of economic, technical, and policy issues, while also provisioning for crisis situations, such as the California energy crisis in 2000 [4], the Enron disaster [10], and the Texas debacle in 2021 [9].
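To make the effect of interval metering and time-of-use rates described earlier in this section concrete, the following is a minimal sketch of billing a customer from 15-minute meter readings. The peak window, the two rates, and all function names are illustrative assumptions, not values or interfaces taken from Power TAC or from any real tariff.

```python
from datetime import datetime, timedelta

# Hypothetical time-of-use tariff (illustrative numbers only).
PEAK_HOURS = range(17, 21)   # 17:00 through 20:59, assumed peak window
PEAK_RATE = 0.30             # currency units per kWh, assumed
OFF_PEAK_RATE = 0.10         # currency units per kWh, assumed


def tou_rate(ts: datetime) -> float:
    """Price per kWh that applies to the interval starting at ts."""
    return PEAK_RATE if ts.hour in PEAK_HOURS else OFF_PEAK_RATE


def bill(readings) -> float:
    """Total cost of a list of (interval start, kWh) meter readings."""
    return sum(kwh * tou_rate(ts) for ts, kwh in readings)


def make_readings(start: datetime, n_intervals: int, kwh_per_interval: float):
    """Build consecutive 15-minute readings with constant consumption."""
    return [(start + timedelta(minutes=15 * i), kwh_per_interval)
            for i in range(n_intervals)]


day = datetime(2023, 1, 1)
# The same 4 kWh of consumption, once in the evening peak and once overnight.
peak_use = make_readings(day.replace(hour=18), 8, 0.5)
shifted_use = make_readings(day.replace(hour=1), 8, 0.5)

print(f"peak: {bill(peak_use):.2f}, shifted: {bill(shifted_use):.2f}")
```

Under these assumed rates, moving the same consumption out of the peak window cuts the bill to a third, which is the kind of incentive that leads users to shift load. With monthly metering the two consumption patterns would be indistinguishable.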

1.3 Power TAC Goals

Building on the above, power markets depend strongly on the behavior of the stakeholders that participate in them. Individual household customers in retail electric power markets typically lack the knowledge and motivation to monitor markets and adjust their behaviors to minimize their costs; their individual utility often depends more on perceived convenience than on cost. Commercial, industrial, and institutional energy users, especially those for which energy cost is a major factor, are more likely to pay attention to how their behavior affects their costs. But even time-of-use rates are a blunt instrument for adapting energy use to the availability of weather-dependent renewable resources.

Power TAC is a platform for exploring solutions to this problem at the intersection of economics and technology. We believe retail electricity markets are an ideal environment for applying the idea of “smart markets” [5], in which intelligent automation is used to make low-level decisions about energy use in response to market signals and user preferences. Many uses of electricity, such as heating and cooling, laundry, and EV charging, have some flexibility in when the energy is actually consumed, and with appropriate automation such uses can be treated as storage resources in the grid. For example, a domestic water heater does not need to maintain water temperature if there will be no demand for hot water over the next 8 hours. An EV does not need to be charged sooner than the driver intends to unplug it, and if energy cost is high, it may not need to be fully charged. The addition of a simple mixing valve on the output of a hot water heater can greatly increase its energy storage capacity, and the installation of an “ice plant” as part of the HVAC system in a large building can offer considerable flexibility in the timing of energy use.
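The EV example above can be illustrated with a small scheduling sketch: given a price for each hourly slot before the driver unplugs, fill the cheapest slots first. This is a simple greedy heuristic under assumed prices and an assumed per-slot charging limit; the function name and all numbers are hypothetical and do not describe how Power TAC customer models or brokers actually schedule charging.

```python
def schedule_charging(prices, energy_needed_kwh, max_kwh_per_slot):
    """Return the kWh to charge in each slot, filling the cheapest slots first.

    prices: price per kWh for each slot before the planned unplug time.
    Raises ValueError if the slots cannot deliver the requested energy.
    """
    plan = [0.0] * len(prices)
    remaining = energy_needed_kwh
    # Visit slot indices in order of increasing price.
    for slot in sorted(range(len(prices)), key=lambda i: prices[i]):
        if remaining <= 0:
            break
        plan[slot] = min(max_kwh_per_slot, remaining)
        remaining -= plan[slot]
    if remaining > 1e-9:
        raise ValueError("not enough time to deliver the requested energy")
    return plan


# Assumed hourly prices for the six hours before the driver unplugs.
prices = [0.30, 0.25, 0.10, 0.08, 0.12, 0.28]
plan = schedule_charging(prices, energy_needed_kwh=20.0, max_kwh_per_slot=7.0)
flexible_cost = sum(p * q for p, q in zip(prices, plan))

# Baseline: start charging immediately at full rate until done.
immediate_plan = [7.0, 7.0, 6.0, 0.0, 0.0, 0.0]
immediate_cost = sum(p * q for p, q in zip(prices, immediate_plan))

print(plan, round(flexible_cost, 2), round(immediate_cost, 2))
```

For a single vehicle facing fixed prices and a uniform per-slot limit, filling the cheapest slots first gives the lowest cost. The sketch ignores effects a real market would add, such as prices that respond to the aggregate charging load.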

1.4 The Power TAC Platform Our primary research interest is finding ways to address energy sustainability through markets. There has been considerable interest in studying wholesale markets, such as the work of Tesfatsion and her colleagues and students [11]. But, if we are to find a way to shape demand to the availability of sustainable resources, we must focus on end users of energy. Power TAC provides a rich simulation environment to study the interactions among energy resources, markets, users, and retailers. It is designed to support a style of research we call Competitive Benchmarking [6] through: – Development of increasingly realistic simulation models of suppliers, customers, and markets – Periodic open competitions in which research groups around the world are invited to build agents that act as retailers who must attract retail customers with competitive tariff offerings, and then trade in forward wholesale energy markets and real-time flexibility markets to match supply and demand in this simulation environment – Careful analysis of results to discover and characterize effective strategies


Further details are available in a variety of forms. The game specification [7] provides a detailed description of the markets and models, with an emphasis on the various decision problems that must be addressed by customer models and retail brokers. A recent paper by Collins and Ketter [2] describes the research-oriented software architecture of the Power TAC platform.

1.5 What Is in This Book

We have collected a variety of papers for this volume that describe work on the simulation itself, analysis and design of competitive retail broker agents, aspects of behavioral economics that relate to tariff selection and social values that might help with needs to shape demand, management of electric vehicle fleets in response to market opportunities that reward demand flexibility, and support for empirical studies in Power TAC and other complex simulation environments. Several of these papers are follow-ons to doctoral dissertations that focused on specific problems presented by the complex trading environment that Power TAC broker agents must navigate.

Chapter 2: Page and Collins describe in some detail the process of turning a real-world data set into a scalable customer model representing populations of electric vehicle chargers.

Chapter 3: Chandlekar et al. describe aspects of VidyutVanika, their high-performance broker agent, with a focus on machine learning and game theory.

Chapter 4: Naseri et al. detail their use of machine learning techniques in formulating profitable retail tariffs that attract a desired mix of customer types that maximizes revenue while minimizing both energy and capacity cost.

Chapter 5: Rook et al. explore the problem of retail tariff selection from the standpoint of behavioral economics with a focus on how “nudging” and “framing” can affect human behavior around economic decisions related to energy consumption.

Chapter 6: Özdemir and Unland describe the design of their Agent UDE broker agent, providing an overview of a variety of agent design issues. In particular, they describe a genetic algorithm for tariff design, a prediction model used for bidding in the wholesale market, and the empirical research process they used to evaluate alternative approaches.

Chapter 7: Makrodimitris and Symeonidis describe significant features of their high-performing Mertacor broker agent and detail how its performance was degraded in the following tournament by excessive focus on a few performance measures without careful empirical study that could have shown that these improvements reduced overall performance.

Chapter 8: Chowdhury et al. describe the design of their SPOT broker agent with an emphasis on its wholesale-market strategy. Since the earlier publication of a portion of this work, the SPOT wholesale-market trading approach has been adopted by several competitors.


Chapter 9: Grgic et al. describe how the design of their CrocodileAgent has evolved over ten years of Power TAC competitions. Of particular interest is the analysis of its performance with respect to its competitive environment, given that tournament sessions include all combinations of three, five, and eight competing agents.

Chapter 10: Rook et al. discuss an approach based in behavioral economics to the problem of encouraging energy users to modify their demand behavior during peak demand periods. Although most approaches to this problem rely on pricing mechanisms, they point out that social values might also be harnessed. They suggest an approach that could be used to represent social values in Power TAC and similar simulation-based research tools.

Chapter 11: Schoer et al. analyze the potential for operators of shared, autonomous electric vehicle fleets to optimize key performance measures through simultaneously offering mobility services and grid services using a virtual power plant (VPP) model.

Chapter 12: Milkau and Collins describe a platform for empirical study based on Power TAC and broker agents. It supports designing, running, and analyzing experiments. It uses “containers” to overcome a limitation of earlier attempts at addressing this need caused by the increasing complexity of competitive agent designs that are no longer simple programs that can easily be packaged and shared as binary run-time files.

References

1. Collins, J., Ketter, W., Sadeh, N.: Pushing the limits of rational agents: the trading agent competition for supply chain management. AI Mag. 31(2), 63–80 (2010)
2. Collins, J., Ketter, W.: Power TAC: software architecture for a competitive simulation of sustainable smart energy markets. SoftwareX 20, 101217 (2022). https://doi.org/10.1016/j.softx.2022.101217, https://www.sciencedirect.com/science/article/pii/S2352711022001352
3. Griffith, S.: Electrify: An Optimist’s Playbook for Our Clean Energy Future. MIT Press, Cambridge, MA (2021)
4. Joskow, P.L.: California’s electricity crisis. Oxf. Rev. Econ. Policy 17(3), 365–388 (2001). http://oxrep.oxfordjournals.org/content/17/3/365.short
5. Ketter, W., Collins, J., Block, C.A.: Smart grid economics: policy guidance through competitive simulation. Tech. Rep. 1707913 (2010)
6. Ketter, W., Peters, M., Collins, J., Gupta, A.: Competitive benchmarking: an IS research approach to address wicked problems with big data and analytics. MIS Q. 40(4), 1057–1080 (2016)
7. Ketter, W., Collins, J., Weerdt, M.D.: The 2020 Power Trading Agent Competition. SSRN Scholarly Paper ID 3564107, Social Science Research Network, Rochester, NY, Mar 2020. https://doi.org/10.2139/ssrn.3564107, https://papers.ssrn.com/abstract=3564107
8. Shahidehpour, M., Yamin, H., Li, Z.: Market Operations in Electric Power Systems: Forecasting, Scheduling, and Risk Management. Wiley-IEEE Press, Hoboken, NJ, USA (2002)
9. Smead, R.G.: ERCOT - the eyes of Texas (and the World) are upon you: what can be done to avoid a February 2021 repeat. Climate Energy 37(10), 14–18 (2021). Publisher: Wiley Online Library
10. Smith, M.D.: Lessons to be learned from California and Enron for restructuring electricity markets. Electr. J. 15(7), 23–32 (2002). Publisher: Elsevier
11. Sun, J., Tesfatsion, L.: Dynamic testing of wholesale power market designs: an open-source agent-based framework. Comput. Econ. 30(3), 291–327 (2007)
12. Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Reeves, D., Walsh, W.E.: Designing the market game for a trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001). Publisher: IEEE

Chapter 2

Modeling a Customer Population in Power TAC: Electric Vehicle Chargers

John Collins, Philipp Page, and Wolfgang Ketter

J. Collins: University of Minnesota, Minneapolis, MN, USA
P. Page: University of Cologne, Dublin, Ireland
W. Ketter: University of Cologne, Cologne, Germany

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. Collins et al. (eds.), Energy Sustainability through Retail Electricity Markets, Applied Innovation and Technology Management, https://doi.org/10.1007/978-3-031-39707-3_2

2.1 Introduction

Power TAC is a discrete-time competitive simulation intended to stimulate research into future retail electricity markets. We expect those markets to be increasingly supply driven due to increasing penetration of weather-dependent renewables. As a result, demand flexibility will have an increasing value. At the same time, mobility needs will increasingly be met by electric vehicles (EVs) that can offer considerable demand flexibility when they are connected to the grid for longer periods than needed to charge them. Increasingly, EV owners will find charging facilities available at their homes, workplaces, retail businesses, and entertainment venues. Unlike most owners of internal combustion vehicles, most EV owners do not wait for their battery levels to approach “empty” before they fill them by connecting them to convenient chargers. It is common for EV owners to leave their vehicles connected to chargers for much longer than needed to charge them.

Power TAC has a number of customer models available, including households and businesses, cold-storage warehouses, solar arrays, and electric vehicles. Some of these models are “bottom-up” designs that directly model individual appliances, refrigeration systems, battery chargers for warehouse lift trucks, and electric vehicle chargers. These models are very limited in scalability; for example, the current EV model simulates vehicles that drive around and periodically plug in to chargers. Unfortunately, it is very difficult to represent more than a few hundred EVs without causing serious performance problems. We also have a set of “factored” customer models [12] that are driven by statistical distributions modified by time of day, day of week, and weather. But these models are unable to realistically represent systems with significant storage and considerable flexibility in when and how much power they use.

As a result, there is a critical need for a highly scalable model of EV chargers rather than EVs because from the viewpoint of the power grid, we only care about EVs when they are connected to chargers. Such a model must be capable of representing tens of thousands of units with reasonable performance. The design of such a model is the subject of this chapter.

This chapter not only explains how a realistic and highly scalable model of a population of EV chargers might be built but also describes in some detail the responsibilities of a Power TAC customer model and how it must interact with other elements of the Power TAC system. We also show how we use real-world data to derive the statistical formulations to drive the overall behavior of our model.

2.2 Related Work

Customer models are a core component of the Power TAC platform [6]. From the beginning, they have been based on real-world data such as the MeRegio data studied by Gottwalt et al. [5] and used as the foundation for the first household models. Reddy and Veloso [12] developed a statistical framework for modeling populations of energy users as opposed to individuals. This model has been a core part of the Power TAC framework for many years.

The accelerating adoption of electric vehicles, along with their promise as a source of electrical demand flexibility, has motivated studies to learn their characteristics as electricity consumers. Lee et al. [9] describe a dataset collected from chargers on the campus of the California Institute of Technology. They also calculate the conditional distribution of charging duration and electricity demand given the plug-in hour using a Gaussian Mixture Model (GMM) [16]. Flammini et al. [4] analyze the ElaadNL data (https://www.elaad.nl/) using a beta-mixture model to learn the charging behavior, but they do not produce a temporal model conditional on the plug-in hour.

Using such models for simulations requires more than a statistical model; it also requires a process for generating realistic behaviors from the model. Lahariya et al. [8] describe a Synthetic Data Generator based on data from the ElaadNL project. They also fit a GMM to learn a conditional distribution given the plug-in hour. Chung et al. [3] treat the problem as a supervised prediction problem, developing an ensemble model to predict charging duration and electricity demand given some explanatory variables. They do not consider the generation of new data.

2.3 Modeling a Population of EV Chargers

A working Power TAC customer model must represent reasonably well the behaviors and preferences of real-world retail energy consumers and producers, and it must interface correctly with the simulation infrastructure. The Power TAC platform runs a discrete-time simulation in which the basic unit is one hour. So a week in the simulated world is represented by 168 one-hour “timeslots.” In tournaments, where competitors connect to the simulation over the Internet, the standard timeslot duration is 5 seconds of real time, so a typical 2-month “game” takes approximately 2 hours.

Brokers need a reasonable amount of information about customer and market behavior at the beginning of a game. The system uses real-world weather and market price data, and for many customer types, consumption and production patterns are strongly influenced by weather. To provide the information needed by brokers, the system runs a 2-week “bootstrap” session prior to broker login, and brokers are provided with a “bootstrap record” containing configuration information along with two weeks of weather, customer production/consumption behavior, and market data when they connect. Bootstrap records are also very useful as baseline components for empirical studies [14] that seek to evaluate performance of specific broker features or configurations or of market interventions such as taxes and subsidies. This means that any state data needed by a customer model must be serialized into an external format at the end of a bootstrap session and reloaded at the start of a subsequent simulation session.

Modeling a large population is a very different problem from modeling the behavior of people and their devices in a single household or business. Our first attempt was to find a way to add the EV charger model to the factored customer framework originally built by Reddy and Veloso [12] at CMU.
However, that framework was not designed to model devices with significant storage and attempts to add this feature have led to an explosion of complexity. The core of the new model is a “storage state” abstraction that tracks the number of vehicles plugged in, when they expect to unplug, and how much energy still needs to be delivered to them before they unplug. A charger population might consist of a collection of private residential installations, or chargers installed in parking facilities in commercial areas, or fleet chargers for servicing delivery vehicles. These populations might vary in several ways, the most important being their usage patterns and the willingness of their owners to adjust their charging schedules to minimize cost under time-varying prices or to maximize the value of demand flexibility.


Fig. 2.1 Energy–time charge envelope for a single EV charging episode

2.3.1 Representation

Electric vehicle charging can offer demand flexibility when vehicles remain connected to chargers for a longer period than needed to deliver the desired amount of energy. Most chargers have the ability to vary their power levels, the rate in kilowatts at which energy is delivered. Figure 2.1 is a graphic representation of the “charging envelope,” assuming an incoming and desired outgoing charge level in the battery and also assuming we know when the vehicle will be disconnected (or the earliest time at which the user might want to disconnect). In this diagram, the slope represents power and so up-regulation (making more power available on the grid) is done by reducing the slope and moving right within the envelope, while down-regulation (pulling more power from the grid) is done by increasing the slope, moving left in the envelope.

In this model, we want to represent a population rather than an individual charger–vehicle pair. We do that by aggregating the charging sessions into cohorts that are scheduled to disconnect in the same future timeslot $d$. If we assume that all the connected EVs need the same amount of energy by the time they disconnect, then the upper limit on energy use within a timeslot $t$ would be $\min(r_{\max} n, \sum_{i \in n} c_{i,d})$, where $n$ is the number of active chargers in the collection, $r_{\max}$ is the maximum charging rate, and $c_{i,d}$ is the remaining charge needed by vehicle $i$ by the disconnect time $d$. Similarly, the lower limit on energy use in timeslot $t$ would be $\max(0, \sum_{i \in n} c_{i,d} - r_{\max}(d - t - 1))$, or the amount of energy needed in the current timeslot $t$ assuming the chargers can be run at full power for the remaining timeslots $(d - t - 1)$.


Fig. 2.2 Piecewise-linear approximation of energy needed at time $t$, in units of charger capacity, for vehicles that will disconnect at time $t + 6$

However, we cannot make that assumption. The problem is that at any given time some vehicles in the cohort will be almost fully charged, and we cannot allocate their charger capacity to other EVs in the cohort that need more energy. But if we assume all the chargers have the same capacity, say 8 kW, then we can cluster the population within a cohort by the number of full-power hours that would be needed to complete their requirements. For simplicity, we will assume the total demand within a cluster varies linearly, giving us a piecewise-linear approximation of overall cohort energy needs. As charging progresses, up-regulation can shift EVs from lower demand to higher demand clusters. Down-regulation can shift EVs from higher demand to lower demand clusters.

Figure 2.2 shows the situation at time $t$ for a cohort of EVs that plan to disconnect at time $t + 6$. A few of the currently connected vehicles need more than 5 hours to complete their charge, and a few need less than one hour to finish charging. The bulk of the cohort needs less than 3 hours of full-power charging to meet their needs over the next six hours. Given our piecewise-linear approximation, the minimum amount of energy we could supply to this group of chargers would be to fully power group 1 and provide no power to the rest. In this case, all but group 1 would appear again in the following timeslot $t + 1$ still needing all the energy they needed in timeslot $t$. The most we could use would be full power for groups 1–5 and the group average for group 6.

The model can then respond to regulation or to varying prices by shifting the bars in the histogram down to varying degrees (we cannot shift them up unless we extract energy from EV batteries and return it to the grid). At most, a bar can be shifted down by one full charger-hour in one timeslot. If in this process the low end of the linear distribution falls below the next charger-hour level, a portion of the group will be reallocated to the next lower group. For example, if we were to apply $1/2$ charger-hour to group 4, then half of the EVs in group 4 would shift to group 5.
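As a rough illustration of the bookkeeping described above, the following Python sketch computes the minimum and maximum energy for one cohort and applies a partial charge to one group. The function names, the list layout, the sample counts, and the assumption that demand is spread uniformly within each group (so that the last group's average is half a charger-hour) are illustrative choices for this sketch and are not taken from the Power TAC code base.

```python
def energy_bounds(counts, charger_kw):
    """Return (min_kwh, max_kwh) for one cohort in the current timeslot.

    counts[0] is group 1 (EVs that need full power now to finish in time);
    counts[-1] is the last group (less than one charger-hour still needed).
    """
    # Minimum: fully power group 1 and give nothing to the rest.
    min_kwh = counts[0] * charger_kw
    # Assumption: demand within the last group is uniform between 0 and 1
    # charger-hours, so its group average is 0.5 charger-hours per EV.
    max_kwh = sum(counts[:-1]) * charger_kw + counts[-1] * 0.5 * charger_kw
    return min_kwh, max_kwh


def apply_charge(counts, group, charger_hours):
    """Deliver charger_hours (between 0 and 1) to every EV in one group.

    `group` is 1-based. Under the uniform assumption, the same fraction of
    the group drops into the next lower-demand group (group + 1). EVs that
    leave the last group are treated as fully charged.
    """
    if not 0.0 <= charger_hours <= 1.0:
        raise ValueError("at most one charger-hour per timeslot")
    updated = list(counts)
    moved = counts[group - 1] * charger_hours
    updated[group - 1] -= moved
    if group < len(counts):
        updated[group] += moved
    return updated


# Made-up cohort that disconnects in six timeslots, with 8 kW chargers.
cohort = [2, 4, 10, 20, 30, 8]
print(energy_bounds(cohort, 8.0))    # (16.0, 560.0)
print(apply_charge(cohort, 4, 0.5))  # half of group 4 moves to group 5
```

A fuller model would also track the linear shape of demand inside each group rather than only the counts; the sketch keeps counts alone to stay short.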


2.3.2 Learning the Charging Behavior of EV Users

Power TAC is a discrete-time hourly simulation and so our model must be able to generate realistic charging session data with hourly resolution. Because we are interested in the value of flexibility, we are less interested in fast chargers for which charging sessions are normally less than an hour, and very little flexibility is generally on offer. The model we propose is a population of EV chargers, but the aggregate behavior is a large collection of individual charging sessions. Each session contains information like plug-in and plug-out times and energy needs.

A core feature of Power TAC is a level of realism that will support research that is applicable to real-world problems. Therefore, we base our model on real-world data. But using such data directly does not fully explore the space of problems and potential solutions. Rather, we want a representation that is statistically similar to real-world data.

We propose a Gaussian Mixture Model (GMM) based on a large collection of individual EV charging sessions from a dataset $S$ of $N$ charging sessions. Each charging session is identified by the triple $s_n = (d_n, e_n, z_n)\ \forall n \in N$, where $d_n \in \mathbb{R}_{\geq 0}$ is the charging duration in hours, $e_n \in \mathbb{R}_{\geq 0}$ is the charged energy in kilowatt-hours, and $z_n = 0, 1, \ldots, 23$ is the plug-in hour of the day. A GMM models each observation $s_n$ as a sample from a finite mixture of multivariate normal distributions with $c = 1, \ldots, C$ components weighted by a ($C \times 1$) weight vector $\pi$ with elements $\pi_c$ corresponding to the $c$th component in $C$. Each mixture component represents the charging session triples by a trivariate distribution with ($3 \times 1$)-dimensional mean vector $\mu_c$ and a variance-covariance matrix $\Sigma_c$ of dimension ($3 \times 3$). The goal is to estimate the parameters $\theta = (\pi_c, \mu_c, \Sigma_c)_{c=1}^{C}$ such that the underlying data $S$ is well approximated. Given this estimate of $\theta$, the probability density of observing a data point $s_n$ is given by the probability density function (pdf) of the GMM

$$p(s_n \mid \theta) = \sum_{c=1}^{C} \pi_c \, \mathcal{N}(s_n \mid \mu_c, \Sigma_c). \tag{2.1}$$

Note that $\sum_{c \in C} \pi_c = 1$ such that $p(s_n \mid \theta)$ is a valid pdf. We estimate $\theta$ using the Expectation-Maximization algorithm as implemented by the Scikit-learn GMM module [11].

Samples from the joint pdf $p(d_n, e_n, z_n)$ as described in Eq. (2.1) represent charging sessions independent of the plug-in time of a vehicle. However, the behavior of EV owners varies considerably by the hour of day. To capture these temporal dynamics as accurately as possible, it is necessary to condition charging duration and electricity demand on the vehicle’s plug-in time. We use hour of day to match the hourly timeslot resolution of Power TAC.

Mathematically, we formulate a conditional Gaussian Mixture Model to estimate the density $p(d_n, e_n \mid z_n)$ where charging duration and electricity demand are now conditional on the plug-in hour. Note that the parameter estimates $\theta$ are now conditional on the plug-in hour as well. This is because we have changed the mean and covariance matrix dimensions, and $\pi$ is no longer independent. In fact, the mixture component weights $\pi$ might change significantly depending on the plug-in hour. The variation in $\pi$ for different plug-in hours can be interpreted as how strongly the user’s charging behavior depends on time of the day.

To denote the conditional GMM, we partition $s_n$ into $s'_n = (d_n, e_n)$ and $s''_n = (z_n)$ such that the mean vector reads $\mu = (\mu_{s'_n}, \mu_{s''_n})$ and the covariance matrix becomes

$$\Sigma = \begin{pmatrix} \Sigma_{s'_n s'_n} & \Sigma_{s'_n s''_n} \\ \Sigma_{s''_n s'_n} & \Sigma_{s''_n s''_n} \end{pmatrix},$$

where the notation $\Sigma_{xx}$ denotes the variance–covariance matrix of the variable $x$ and $\Sigma_{xy}$ the cross-covariance of the variables $x$ and $y$. Given these definitions, we write the conditional GMM as

$$p(s'_n \mid s''_n, \theta) = \sum_{c=1}^{C} \bar{\pi}_c \, \mathcal{N}(s'_n \mid s''_n, \bar{\mu}_c, \bar{\Sigma}_c), \tag{2.2}$$

where the notation $\bar{x}_c$ refers to the variable’s value in the $c$th mixture component under the conditional model. Based on the joint Expectation-Maximization estimate of $\theta$, we can calculate the updated conditional values of $\theta$. According to the properties of the multivariate normal distribution, the new conditional mean and variance values can be calculated as follows:

$$\bar{\mu}_c = \mu^{(c)}_{s'_n} + \Sigma^{(c)}_{s'_n s''_n} \left(\Sigma^{(c)}_{s''_n s''_n}\right)^{-1} \left(z_n - \mu^{(c)}_{s''_n}\right), \tag{2.3}$$

$$\bar{\Sigma}_c = \Sigma^{(c)}_{s'_n s'_n} - \Sigma^{(c)}_{s'_n s''_n} \left(\Sigma^{(c)}_{s''_n s''_n}\right)^{-1} \Sigma^{(c)}_{s''_n s'_n}. \tag{2.4}$$

To calculate the updated conditional mixture weight vector $\bar{\pi}_c$, we leverage Bayes’ rule [10, Eq. 5]:

$$\bar{\pi}_c = \frac{\pi_c \, p\left(s''_n \mid \mu^{(c)}_{s''_n}, \Sigma^{(c)}_{s''_n s''_n}\right)}{\sum_{c' \in C} \pi_{c'} \, p\left(s''_n \mid \mu^{(c')}_{s''_n}, \Sigma^{(c')}_{s''_n s''_n}\right)}, \tag{2.5}$$

where $p(s''_n \mid \mu^{(c)}_{s''_n}, \Sigma^{(c)}_{s''_n s''_n})$ is the marginal normal pdf of $s''_n$.

Using these new values $\bar{\theta} = (\bar{\pi}_c, \bar{\mu}_c, \bar{\Sigma}_c)_{c=1}^{C}$, it is possible to generate tuples of charging duration and energy demand $s'_n = (d_n, e_n)$ while keeping the plug-in time $s''_n = (z_n)$ of a vehicle fixed.
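The conditioning step in Eqs. (2.3) to (2.5) can be sketched in a few lines of plain Python. Because the conditioning variable is the scalar plug-in hour, the matrix inverse reduces to a division by the hour's variance. The function names, the tuple layout of a component, and the two-component mixture below are made up for illustration; in practice the joint parameters would come from a fitted mixture model such as the Scikit-learn one mentioned above.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def condition_on_hour(components, z):
    """Condition a trivariate GMM over (d, e, z) on the plug-in hour z.

    components: list of (weight, mean, cov), with mean of length 3 and cov a
    3x3 nested list, variables ordered (duration, energy, hour).
    Returns a list of (weight, mean, cov) for the bivariate GMM over (d, e).
    """
    conditioned = []
    for weight, mean, cov in components:
        var_z = cov[2][2]                 # variance of the plug-in hour
        cross = (cov[0][2], cov[1][2])    # cross-covariance of (d, e) with the hour
        shift = (z - mean[2]) / var_z
        # Eq. (2.3): conditional mean
        cond_mean = [mean[i] + cross[i] * shift for i in range(2)]
        # Eq. (2.4): conditional covariance
        cond_cov = [[cov[i][j] - cross[i] * cross[j] / var_z for j in range(2)]
                    for i in range(2)]
        # Numerator of Eq. (2.5): prior weight times marginal density of z
        unnormalized = weight * normal_pdf(z, mean[2], var_z)
        conditioned.append((unnormalized, cond_mean, cond_cov))
    total = sum(w for w, _, _ in conditioned)
    return [(w / total, m, c) for w, m, c in conditioned]


# Made-up two-component mixture, for illustration only.
evening = (0.5, [8.0, 20.0, 18.0],
           [[4.0, 1.0, 2.0], [1.0, 9.0, 0.0], [2.0, 0.0, 4.0]])
midday = (0.5, [3.0, 5.0, 10.0],
          [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]])
for weight, mean, cov in condition_on_hour([evening, midday], z=20):
    print(round(weight, 4), mean, cov)
```

Drawing $(d_n, e_n)$ samples from the conditioned mixture would additionally need a bivariate normal sampler, which is left out of this sketch.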

2.3.3 Dataset

The dataset on which we base our initial model contains 6878 individual charging sessions from a residential housing cooperative in Trondheim, Norway [15]. In December 2018, a new EV charging infrastructure with 3.6 kW and 7.2 kW chargers was installed at the apartment complex. The dataset spans charging sessions from December 2018 to January 2020 by 97 unique registered users. All charging sessions originate from both shared and private charging points associated with individual apartments. Since the dataset has already been cleaned by the authors, we only remove outliers in charging duration by omitting data points with a z-score $\geq 3$. After this outlier removal, 6544 charging sessions remain with a maximum value of 53.75 hours for charging duration. Table 2.1 shows an excerpt of four samples from the original dataset only including the columns used by the statistical model. One row represents one charging session by an individual vehicle.

Table 2.1 Excerpt of the Norway residential dataset. The variable names $z_n$, $d_n$, and $e_n$ map the column names to the model variables used in Sect. 2.3.2

session_ID   Start_plugin_hour ($z_n$)   Duration_hours ($d_n$)   El_kWh ($e_n$)
3            11                          8.22                     29.87
111          19                          11.40                    31.06
161          15                          19.42                    10.21
2960         13                          2.56                     2.01
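The outlier rule described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the z-score is taken as signed (as the text writes it) and computed with the population standard deviation, and the sample durations are made up rather than drawn from the Trondheim data.

```python
from statistics import mean, pstdev


def drop_duration_outliers(durations, threshold=3.0):
    """Keep only durations whose z-score is below the threshold."""
    mu = mean(durations)
    sigma = pstdev(durations)  # assumption: population standard deviation
    if sigma == 0:
        return list(durations)  # all values identical, nothing to remove
    return [d for d in durations if (d - mu) / sigma < threshold]


# Made-up durations in hours: twenty ordinary sessions and one very long one.
sample = [5.0] * 20 + [100.0]
print(len(drop_duration_outliers(sample)))  # 20
```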

2.3.4 GMM Results

Gaussian Mixture Models treat the number of components ($C$) as a hyperparameter which has to be chosen by the scientist. It is important to choose $C$ carefully. While a larger number of components always leads to a better model fit, it is typically not desirable to set $C$ as large as possible because this leads to slower and overly complex models which potentially overfit the training data. A common way to address this problem is to measure the model performance not only by the fit to the data but also by the compactness of the model. Two common information criteria to measure model performance while penalizing for the number of components are the Akaike Information Criterion (AIC) [2] and the Bayesian Information Criterion (BIC) [13]. We perform model selection using the BIC and choose $C$ based on the gradient of the BIC scores for different component sizes. Figure 2.3 visualizes the BIC scores for different model sizes. We select $C = 5$ as the “best” model because the BIC does not improve significantly for higher numbers of components.

Fitting the joint GMM defined in Eq. (2.1) to the Norway residential dataset (see Sect. 2.3.3) shows a good overall fit to the data. Figure 2.4 visualizes the model fit to the data for each variable separately. The results are obtained from a Monte Carlo simulation of 1000 independent runs with different random seeds to guarantee the stochastic stability of the experiments.
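To make the selection rule concrete, the sketch below computes the BIC from a log-likelihood using the standard definition (number of free parameters times the log of the sample size, minus twice the log-likelihood) and picks the smallest model size after which the score stops improving noticeably. The 1% threshold, the function names, and the scores are hypothetical; they are not the values behind Fig. 2.3.

```python
import math


def gmm_param_count(n_components, n_dims):
    """Free parameters of a GMM with full covariance matrices."""
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    weights = n_components - 1  # the weights sum to one
    return means + covariances + weights


def bic(log_likelihood, n_components, n_dims, n_samples):
    """Bayesian Information Criterion; lower is better."""
    k = gmm_param_count(n_components, n_dims)
    return k * math.log(n_samples) - 2.0 * log_likelihood


def pick_components(bic_by_size, min_rel_gain=0.01):
    """Smallest C after which the BIC improves by less than min_rel_gain."""
    sizes = sorted(bic_by_size)
    for current, following in zip(sizes, sizes[1:]):
        gain = (bic_by_size[current] - bic_by_size[following]) / abs(bic_by_size[current])
        if gain < min_rel_gain:
            return current
    return sizes[-1]


print(gmm_param_count(5, 3))  # 49
made_up_scores = {1: 1000.0, 2: 900.0, 3: 850.0, 4: 820.0, 5: 800.0, 6: 799.0}
print(pick_components(made_up_scores))  # 5
```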


Fig. 2.3 BIC scores for model sizes between 1 and 20 components

Fig. 2.4 Kernel density estimates (KDEs) for the joint GMM model fit. The green line represents the true data. Each red line represents one out of $S$ Monte Carlo simulation runs. During each run, we sample $N$ values from the fitted mixture distribution $p(s_n \mid \theta)$, where $N$ is the size of the underlying dataset

The conditional GMM obtained from Eq. (2.2) is able to capture different charging profiles with reasonable accuracy. Figure 2.5 visualizes the model fit as a bivariate contour plot of the estimated conditional distributions $p(d_n, e_n \mid z_n, \theta)$ versus the true data. Each of the 24 subplots visualizes the distribution of charging duration and energy demand for vehicles that plug in during one hour of the day. The results are obtained from a Monte Carlo simulation of 100 independent runs. During each run, we sample $N$ values from the fitted mixture distribution $p(d_n, e_n \mid z_n, \theta)$, where $N$ is the number of data points for that hour in the underlying dataset. For all distributions, the areas with high density are correctly identified by the model. However, for the conditional distributions during night times (hours 0 to 6), the estimated distribution exhibits high uncertainty especially along the x-axis for charging duration. Uncertainty can be identified by the blurriness of the red contours’ borders. The more blurry the border, the more uncertain is the model in its estimate because the different Monte Carlo runs yield estimates which are quite different from each other. This is due to a lack of data for these examples.


Fig. 2.5 Contour plots of kernel density estimates (KDEs) for the conditional GMM. Green contours represent the true data. Red contours are obtained by overlaying results of Monte Carlo runs

2.3.5 Model Operation

Competing brokers offer tariff contracts with a variety of features [7] including time-of-use prices, fixed periodic charges, signup bonuses, discounts for customer willingness to be curtailed, and explicit payments for demand flexibility. The detailed charging behavior of a population can be strongly affected by the tariff they have subscribed to. At any given time, portions of the overall population may be subscribed to different tariffs. As a result, their detailed behavior will vary according to their needs and the incentives offered in the subscribed tariffs.

A Power TAC customer model must perform a series of operations in each timeslot:

1. Sample the statistical model for new charging requests (see Sect. 2.4). Each request specifies some number of chargers that have been activated (or vehicles plugged in) in the current timeslot, an approximate density function or histogram that describes their energy needs, and when they expect to deactivate or unplug. The model is then expected to deliver the specified amount of energy within the specified time period.
2. Adjust the sample data to account for weather conditions because electric vehicles generally need more energy in colder weather.
3. Constraint check: ensure that all requests are realistic given the time and the capacity of the chargers to which they are connected.
4. For each sub-population subscribed to different tariffs:
   (a) Without violating constraints, allocate the energy from any regulation events that were exercised in the previous timeslot (see item d below). Constraints include the maximum charger capacity in current and future timeslots and the amount of energy that is yet to be delivered in each future timeslot.
   (b) Without violating constraints, update the demand histogram by distributing the new demand across cohorts as described in Sect. 2.3.1.
   (c) As described in Sect. 2.3.1, determine the minimum and maximum amount of energy to be delivered in the current timeslot.
   (d) Based on tariff terms, determine how much energy to actually deliver in the current timeslot and how much regulation capacity can be offered to the balancing market. For a tariff that values both up- and down-regulation, we might choose the midpoint between minimum and maximum demand, while for a tariff that only values curtailment, we might choose a value closer to the maximum demand.
   (e) Allocate the energy to be delivered, ensuring first that needs are met for EVs that plan to disconnect in the current timeslot.
5. At least once per day, evaluate tariffs and possibly change subscriptions. In the real world, retail customers would do this much less frequently, but since the simulation is intended to evaluate the competitive behavior of brokers in a realistic timeframe, we ask our customers to look at their options more frequently. In fact, most customers ignore most new offers according to a customer-specific “inertia” value as detailed in the Power TAC specification [7]. The result is that migration toward more attractive tariffs happens over time.
6. Subscription changes are handled by moving portions of the population between subscriptions, carrying with them their current state and needs.
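Step 4(d) above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, the `curtail_bias` value of 0.9, the handling of a tariff that values only down-regulation, and the choice to charge at the maximum when no flexibility is rewarded are all hypothetical and not taken from the Power TAC implementation.

```python
def plan_delivery(min_kwh, max_kwh, pays_up, pays_down, curtail_bias=0.9):
    """Choose energy to deliver now and regulation capacity to offer.

    pays_up:   tariff rewards up-regulation (curtailing consumption)
    pays_down: tariff rewards down-regulation (consuming more)
    curtail_bias is a hypothetical tuning value between 0 and 1.
    """
    if pays_up and pays_down:
        position = 0.5                 # midpoint leaves room in both directions
    elif pays_up:
        position = curtail_bias        # near the maximum, so there is demand to curtail
    elif pays_down:
        position = 1.0 - curtail_bias  # assumption: mirror case, not described in the text
    else:
        position = 1.0                 # assumption: no incentive, charge as soon as possible
    deliver = min_kwh + position * (max_kwh - min_kwh)
    return {
        "deliver_kwh": deliver,
        "up_regulation_kwh": deliver - min_kwh if pays_up else 0.0,
        "down_regulation_kwh": max_kwh - deliver if pays_down else 0.0,
    }


print(plan_delivery(16.0, 560.0, pays_up=True, pays_down=True))
```

The minimum and maximum passed in would come from step 4(c); here they are made-up numbers.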


2.4 Generating Demand

In each timeslot t, the statistical model described in Sect. 2.3.2 is responsible for generating electricity demand for a population of chargers over a time horizon. Samples from this model produce tuples $(d_n, e_n)$ from the GMM, where $e_n$ is the electricity demand of one vehicle over horizon $d_n$ in hours. The number of samples to be generated in each timeslot depends on the hour of day $z_n$ because the number of new plug-ins changes throughout the day. This plug-in probability is naturally provided by the marginal distribution $p(z_n)$. Given a population size P, we evaluate $p(z_n)$ at the current hour of day to get the density for this hour. We multiply the result by the population size P and add a random noise term to avoid getting a deterministic number for the same hours. Note that this approach of determining the number of new plug-ins scales in $O(1)$ since it does not rely on population sampling. The demand in each hour $z_t$ is then generated as follows:

– Set $N = p(z_t) \cdot P$, where P is the configurable population size of EV chargers.
– Update $N = N + \epsilon$, where $\epsilon$ is a Gaussian noise term defined as $\epsilon = \mathcal{N}(0, N \cdot \alpha)$, and $\alpha \in [0, 1]$ controls the level of random noise relative to the deterministic number of new plug-ins.
– Sample N times from $p(d_n, e_n \mid z_t)$ to retrieve tuples of plug-in duration and energy demand $\{(d_1, e_1), (d_2, e_2), \ldots, (d_N, e_N)\}$. For scalability purposes, the number of samples could be less than N but some multiple of the maximum plug-in time.
– Group the population N into cohorts that will unplug within a particular hour given by the integer values $d_n$.
– The energy demand distribution for a cohort is then represented by the array of sampled $e_n$ values for each cohort of integer $d_n$ values.

The resulting energy demand values $e_n$ are then grouped by the minimum number of charger-hours required to produce the piecewise-linear approximation of cohort energy demand (see Fig. 2.2). These charger-hours are the integer values $e_n / \nu$, where $\nu$ is the capacity of the chargers in use.
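The counting and grouping steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Power TAC implementation: the function names are hypothetical, $N \cdot \alpha$ is treated as the standard deviation of the noise term, fractional durations and charger-hours are rounded up, and the sampling step from $p(d_n, e_n \mid z_t)$ is left out (the sampled durations and energies are passed in).

```python
import numpy as np


def new_plugins(p_z, hour, population, alpha, rng):
    """Noisy count of new plug-ins for one hour: N = p(z_t) * P + noise."""
    n = p_z[hour] * population
    # Assumption: N * alpha is used as the standard deviation of the noise.
    noise = rng.normal(0.0, n * alpha)
    return max(0, int(round(n + noise)))


def group_into_cohorts(durations_h, energies_kwh):
    """Group sampled sessions by the integer hour in which they unplug."""
    cohorts = {}
    for d, e in zip(durations_h, energies_kwh):
        # Assumption: fractional durations are rounded up to the next hour.
        cohorts.setdefault(int(np.ceil(d)), []).append(float(e))
    return cohorts


def charger_hours(energy_kwh, capacity_kw):
    """Whole charger-hours needed to deliver energy_kwh at capacity_kw."""
    # Assumption: partial hours are rounded up.
    return int(np.ceil(energy_kwh / capacity_kw))
```

Because the count is computed directly from the marginal probability, its cost does not depend on the population size, which is the $O(1)$ property noted in the text.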

2.5 Results

Electric vehicle charging has the potential to add considerable demand flexibility to a power grid, thereby reducing the need for other resources, such as batteries or gas turbines, that would otherwise be required to keep a grid in balance as the share of weather-dependent renewable resources is increased. Most EV owners leave their vehicles plugged in for longer than needed to charge their batteries. This is especially true for chargers based in residential areas, in which vehicles are often plugged in from early evening until the following morning.

2 Modeling a Customer Population in Power TAC: Electric Vehicle Chargers


Fig. 2.6 Daily demand under three different control scenarios: in the left plot, vehicles are fully charged as quickly as possible; in the right, they are charged as late as possible; and in the center plot, they are charged half as fast as needed

Using the residential charger data from Trondheim, we set up our customer model with a population of 10,000 and ran a few weeks of demand under three simple control regimes. In Fig. 2.6, we see three plots of daily energy consumption under these regimes. In the left plot, vehicles are fully charged as quickly as possible after they plug in. This shows the typical uncontrolled demand of a population of EV chargers in a typical residential area, where most charging is from commuters who plug in when they arrive home and unplug the following morning. The right plot shows the total demand when charging is delayed as long as possible, under the assumption that vehicle disconnect times in our dataset are actually the times EV owners would have chosen at the time they plugged in. In the center plot, we see the midpoint: daily demand is spread out evenly over the charging intervals for each plug-in session. This regime produces maximum demand flexibility in both directions. The model will adjust charging behavior to maximize utility for the EV owners according to three parameters: energy needs for individual charging sessions, the possibly time-varying price of energy, and the value of flexibility. A simple flat-rate tariff with no incentive to offer flexibility will produce the first plot under the assumption that EV owners will generally prefer to have their vehicles charged sooner rather than later. A tariff that offers significant payments for flexibility will produce a profile approximating the middle plot under the assumption that flexibility is exercised equally in both directions. A profile approximating the third plot could be produced by time-varying prices as described in [17].
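To make the three control regimes concrete, the following Python sketch computes the hourly energy delivered to a single plug-in session under each regime. It is a simplified illustration rather than the model's actual code: the function name, the regime labels, and the whole-hour time resolution are assumptions introduced here.

```python
def charging_profile(duration_h, energy_kwh, capacity_kw, regime):
    """Hourly energy (kWh) for one session under 'asap', 'even', or 'late'."""
    if regime not in ("asap", "even", "late"):
        raise ValueError("unknown regime: " + str(regime))
    if energy_kwh > duration_h * capacity_kw:
        raise ValueError("session is too short to deliver the requested energy")
    if regime == "even":
        # Spread the demand evenly over the whole plug-in interval.
        return [energy_kwh / duration_h] * duration_h
    # Fill hours at full charger capacity until the demand is met.
    profile = []
    remaining = energy_kwh
    for _ in range(duration_h):
        step = min(capacity_kw, remaining)
        profile.append(step)
        remaining -= step
    # Charging as late as possible mirrors charging as early as possible.
    return profile if regime == "asap" else profile[::-1]
```

For a 4-hour session needing 10 kWh on a 4 kW charger, the three regimes give [4, 4, 2, 0], [2.5, 2.5, 2.5, 2.5], and [0, 2, 4, 4], respectively. Summing such profiles over all sessions in a day would give aggregate curves of the kind shown in Fig. 2.6.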

2.6 Conclusion

We have shown how a reasonably complex customer population model for the Power TAC simulation platform can be constructed that exhibits behavior that is statistically similar to real-world energy consumption behavior. Electric vehicles are expected to become very important players in electricity markets in the near future, as they consume significant energy while potentially offering significant demand flexibility that can be used to help compensate for the variability of weather-dependent renewable supplies. This model can be configured for multiple categories of EV populations, such as this one for which data come from charging records in a residential area in Norway. Other populations that would likely have significantly different usage patterns might include delivery vehicle fleets and office park or urban parking ramps that are outfitted with chargers. At the time this chapter was written, the implementation of this model was still undergoing testing, and we expected to include it in the 2022 Power TAC tournament. There are a number of enhancements we hope to address in the near future, including:

1. Condition the model on day of week as well as hour of day. It does not appear that the initial dataset from Norway contains enough instances to produce a useful model that shows how weekend behavior differs from weekday behavior.
2. Collect and analyze additional datasets, which should be increasingly available as the population of EVs grows in a number of areas.
3. Consider using more advanced modeling techniques (while being cognizant of the danger of overfitting). These could include Generative Adversarial Networks (GANs), Bayesian Variational Autoencoders (VAEs), or a Bayesian Gaussian Mixture Model with an informative Dirichlet prior. However, in the related literature, GMMs and other mixture models are still considered state of the art [1, 4, 9].

References

1. Adam, R., Qian, K., Brehm, R.: Electric vehicle user behavior prediction using gaussian mixture models and soft information. In: 2021 IEEE PES Innovative Smart Grid Technologies – Asia (ISGT Asia), pp. 1–5 (2021). https://doi.org/10.1109/ISGTAsia49270.2021.9715580
2. Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control 19(6), 716–723 (1974). https://doi.org/10.1109/TAC.1974.1100705
3. Chung, Y.W., Khaki, B., Li, T., Chu, C., Gadh, R.: Ensemble machine learning-based algorithm for electric vehicle user behavior prediction. Appl. Energy 254, 113732 (2019). https://doi.org/10.1016/j.apenergy.2019.113732, https://www.sciencedirect.com/science/article/pii/S0306261919314199
4. Flammini, M.G., Prettico, G., Julea, A., Fulli, G., Mazza, A., Chicco, G.: Statistical characterisation of the real transaction data gathered from electric vehicle charging stations. Electr. Power Syst. Res. 166, 136–150 (2019). https://doi.org/10.1016/j.epsr.2018.09.022, https://www.sciencedirect.com/science/article/pii/S037877961830316X
5. Gottwalt, S., Ketter, W., Block, C., Collins, J., Weinhardt, C.: Demand side management – a simulation of household behavior under variable prices. Energy Policy 39, 8163–8174 (2011)
6. Ketter, W., Collins, J., Reddy, P.: Power TAC: a competitive economic simulation of the smart grid. Energy Econ. 39, 262–270 (2013). https://doi.org/10.1016/j.eneco.2013.04.015, http://www.sciencedirect.com/science/article/pii/S0140988313000959
7. Ketter, W., Collins, J., Weerdt, M.D.: The 2020 power trading agent competition. SSRN Scholarly Paper ID 3564107, Social Science Research Network, Rochester, NY, Mar 2020. https://doi.org/10.2139/ssrn.3564107, https://papers.ssrn.com/abstract=3564107
8. Lahariya, M., Benoit, D.F., Develder, C.: Synthetic data generator for electric vehicle charging sessions: modeling and evaluation using real-world data. Energies 13(16), 4211 (2020). https://doi.org/10.3390/en13164211, https://www.mdpi.com/1996-1073/13/16/4211
9. Lee, Z.J., Li, T., Low, S.H.: ACN-data: analysis and applications of an open EV charging dataset. In: Proceedings of the Tenth International Conference on Future Energy Systems, e-Energy ’19, Phoenix, Arizona (2019)
10. McLachlan, G.J., Rathnayake, S.: On the number of components in a Gaussian mixture model. Wiley Interdisciplinary Rev.: Data Min. Knowl. Disc. 4(5), 341–355 (2014). https://doi.org/10.1002/widm.1135, https://onlinelibrary.wiley.com/doi/10.1002/widm.1135
11. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
12. Reddy, P., Veloso, M.: Factored models for multiscale decision-making in smart grid customers. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-12) (2012). http://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/viewFile/4939/5164
13. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978). https://doi.org/10.1214/aos/1176344136
14. Sodomka, E., Collins, J., Gini, M.: Efficient statistical methods for evaluating trading agent performance. In: AAAI07, pp. 770–775 (2007)
15. Sørensen, Å.L., Lindberg, K.B., Sartori, I., Andresen, I.: Residential electric vehicle charging datasets from apartment buildings. Data Brief 36, 107105 (2021). https://doi.org/10.1016/j.dib.2021.107105, https://www.sciencedirect.com/science/article/pii/S2352340921003899
16. Titterington, D., Smith, A., Makov, U.: Statistical Analysis of Finite Mixture Distributions. Wiley, New York (1985)
17. Valogianni, K., Ketter, W., Collins, J., Zhdanov, D.: Sustainable electric vehicle charging using adaptive pricing. Prod. Oper. Manag. 29(6), 1550–1572 (2020). https://doi.org/10.1111/poms.13179, https://onlinelibrary.wiley.com/doi/abs/10.1111/poms.13179

Chapter 3

VidyutVanika: AI-Based Autonomous Broker for Smart Grids: From Theory to Practice Sanjay Chandlekar, Bala Suraj Pedasingu, Susobhan Ghosh, Easwar Subramanian, Sanjay Bhat, Praveen Paruchuri, and Sujit Gujar

3.1 Introduction

A smart grid is an electricity network based on digital technology that is used to supply electricity to consumers via two-way digital communication, which allows for monitoring, analysis, control, and communication within the supply chain to help improve efficiency, reduce energy consumption and cost, and maximize the transparency and reliability of the energy supply chain [1]. Smart grid systems aim to overcome the weaknesses of conventional grids by enabling customers to get involved using smart meters. A typical smart grid ecosystem consists of various components such as the wholesale market, the tariff market, the distribution utility, and electricity brokers. The wholesale market of a smart grid contains large power generating companies (GenCos) that act as the primary electricity source. The tariff market incorporates different types of customers such as consumers, producers, prosumers, and storage. The distribution utility is responsible for electricity distribution from the wholesale market to customers in the tariff market. Electricity brokers are a crucial element of a smart grid, interacting between the wholesale and tariff

Author Bala Suraj Pedasingu was affiliated with TCS Innovation Labs at the time of this work. Author Susobhan Ghosh was affiliated with IIIT Hyderabad at the time of this work.
S. Chandlekar · S. Ghosh · P. Paruchuri · S. Gujar: International Institute of Information Technology, Hyderabad, India
B. S. Pedasingu · E. Subramanian · S. Bhat: TCS Innovation Labs, Hyderabad, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. J. Collins et al. (eds.), Energy Sustainability through Retail Electricity Markets, Applied Innovation and Technology Management, https://doi.org/10.1007/978-3-031-39707-3_3


S. Chandlekar et al.

markets. In smart grids, electricity brokers are the energy distribution companies that procure energy by bidding in the wholesale market auction and sell the procured energy via tariff contracts.

The smart grid system is still emerging, with some challenges and potential opportunities to improve. The first challenge is the rising contribution of renewable energy resources to energy generation, which makes it harder to manage fluctuating supply–demand scenarios and grid imbalance situations efficiently. Second, problems occur when there is a sudden surge of consumption (e.g., during heat waves) and the demand goes beyond the normal working range of supply, leading to demand peaks. Peak demands put added load on GenCos, which must supply additional energy through fast ramping mechanisms to fulfill the energy requirement, leading to higher costs for distribution companies. Lastly, from the electricity broker’s perspective, managing the supply–demand balance within its portfolio in a smart grid ecosystem remains a challenge.

However, testing strategies on real-world (physical) smart grids is impractical. Thus, one needs an efficient simulator that can mimic the behavior of smart grids and provide helpful feedback. PowerTAC [2–5] is one such platform that replicates the crucial elements of smart grids and provides an efficient simulation of real-world smart grids. In PowerTAC, electricity brokers play a pivotal role in interacting with all three PowerTAC markets, namely the wholesale, tariff, and balancing markets. PowerTAC enables the design of an electricity broker to maximize its profit. An electricity broker procures energy by participating in periodic double auctions (PDAs) of the wholesale market. Then, it serves its customer base in the tariff market by offering attractive tariffs. The broker manages supply–demand imbalances via the balancing market.

PowerTAC conducts an annual tournament where multiple participants compete with each other by designing autonomous broker agents. PowerTAC tests broker agents across several games in varying weather conditions and player configurations. A participating broker agent aims to achieve higher profits than its opponents by maximizing its revenue and minimizing costs and penalties. A broker incurs costs to purchase electricity from the wholesale market and generates revenue from the tariff market by selling tariff contracts to retail customers. Moreover, brokers incur sizeable penalties for contributing to peak-time usage and grid imbalance. Thus, a broker’s objective is to design tariff and wholesale strategies that maximize revenue and minimize costs and penalties.

PowerTAC has a rich literature of exciting strategies, both learning-based and heuristic-based. In the wholesale market, brokers in previous tournaments employed Markov decision process (MDP) [6] and decision tree-based strategies coupled with heuristics [7] to design bidding strategies that minimize electricity purchase costs. In the tariff market, brokers utilized genetic algorithm-based approaches [8], optimization-based techniques [9], and MDP formulations [10] to design effective tariffs. TUC_TAC, a successful broker from the 2020 and 2021 editions of the PowerTAC tournament, employed a retail strategy that aims to acquire half the market share [11]. However, very few brokers design strategies that are interlinked

3 VidyutVanika: AI-Based Autonomous Broker for Smart Grids


across markets and have a principled approach toward broker development. Our design goal in developing VidyutVanika (VV) has been to develop interlinked strategies across markets with a certain theoretical foundation.

We have been participating in PowerTAC tournaments since 2017 with our broker VidyutVanika. VidyutVanika is a Sanskrit word that translates to electricity broker. In five years of participation, VV finished first in 2021 and runner-up in 2018. We have always focused on designing a broker agent as a complete system where strategies interact across markets. We mainly use concepts from Reinforcement Learning (RL) and Game Theory (GT) along with neural network-based prediction techniques to design different components of VV. This chapter aims to present our successful strategies and share our learnings with the community. In this chapter, we describe the wholesale and retail strategies of VV18 [12–14] and VV21 [15–17]. Additionally, we discuss some of the techniques that did not perform well and alternative ways to approach those problems.

Organization of this chapter The remainder of this chapter is organized as follows. First, in Sect. 3.2, we introduce preliminaries to make the reader well-versed in the necessary background. Then, in Sect. 3.3, we describe our most notable strategies in the wholesale market. Specifically, we elaborate on the wholesale strategies of VV18 and VV21 and on a Deep Deterministic Policy Gradient (DDPG)-based wholesale strategy. In Sect. 3.4, we showcase our prominent tariff strategies, specifically, the tariff strategies of VV18 and VV21. Finally, in Sect. 3.5, we summarize VV’s journey in PowerTAC so far and share the learnings from these experiences, followed by future directions of our work in Sect. 3.6.

3.2 Preliminaries In this section, we introduce the necessary concepts used in the chapter. In the following sections, we provide a brief overview of RL, GT, and the necessary definitions and algorithms used in broker development. Finally, we introduce the architecture of our broker VV and review its constituent modules.

3.2.1 RL: A Brief Overview

RL is a branch of machine learning used to solve sequential decision problems. An agent learns an optimal behavior within a specific context to maximize its performance or long-term rewards. As shown in Fig. 3.1, the RL agent interacts with an environment: the environment produces a state $s_t$, and in response to state $s_t$, the agent takes an appropriate action $a_t$ and moves to the next state $s_{t+1}$. For the action $a_t$ in the state $s_t$, the agent receives a reward $r_t$


Fig. 3.1 Reinforcement learning framework

from the environment. This quadruple $(s_t, a_t, r_t, s_{t+1})$ constitutes an experience. The agent collects many such experiences, and based on the reward associated with each experience, it determines the optimal behavior or policy. A policy $\pi(s, a)$ is a mapping from each state to an action that determines how the agent acts in each state. An RL algorithm like Q-Learning helps develop an optimal policy based on such a collection of experiences. The process of using gathered experience to determine an optimal policy is what constitutes training. After learning, the agent is supposed to choose the best action from any given state, where the best action is the one that maximizes long-term reward. We briefly introduce below the notion of the MDP and two popular RL algorithms that are used in the development of VV.

Markov Decision Process (MDP) An MDP is a mathematical framework to model sequential decision-making in partly random environments. It is represented by a tuple $M = (S, A, P, r, \gamma)$, where S is the set of states, A is the set of actions, and P is the transition probability function, with $P(s' \mid s, a) = P(s_{t+1} = s' \mid s_t = s, a_t = a)$ being the probability that taking action a in state s at time t leads to state $s'$ at time t+1. Further, r is the reward function, where $r(s, a)$ is the reward obtained by taking action a in state s, and $\gamma \in [0, 1]$ is the discount factor.

Q-Learning Q-Learning is a popular model-free RL algorithm used to learn the value of an action in a given state, known as the Q-value. Q-values are the state–action values that estimate how good it is to take a given action in a given state. They provide a way to incorporate immediate and future rewards, thus making it possible to consider long-term rewards for the agent in taking action a in the state s.
During the training phase, the agent faces an explore-exploit trade-off when selecting an action: it may choose to explore by selecting a random action or choose to exploit by selecting the current best action in a given state. The Q-values for a state–action pair are updated using the following equation:

$$Q(s_t, a_t) = (1 - \alpha)\, Q(s_t, a_t) + \alpha \left[ r(s_t, a_t) + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}) \right], \tag{3.1}$$


where $Q(s_t, a_t)$ denotes the Q-value of state–action pair $(s_t, a_t)$, while $\alpha \in [0, 1]$ and $\gamma \in [0, 1]$ denote the learning rate and discount factor, respectively.

Deep Deterministic Policy Gradient (DDPG) DDPG [18] is an actor-critic, model-free RL algorithm built upon the concept of deterministic policy gradients [19] for solving continuous stochastic control problems. It has been shown to be very successful in various continuous control problems [18], e.g., the MuJoCo [20] and TORCS [21] environments. The algorithm deploys a parameterized actor function $\mu(s \mid \theta^{\mu})$, which describes the current policy by deterministically mapping states to a specific action. The critic $Q(s, a \mid \theta^{Q})$ is learned using Bellman updates similar to Q-Learning. Here, $\theta^{\mu}$ and $\theta^{Q}$ represent the parameters (weights) of the actor and critic networks, respectively. The algorithm maintains an additional network for both the actor and the critic, which gets updated periodically after some n training epochs to provide stability during the learning process.
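As an illustration of the Q-Learning update in Eq. 3.1 and the explore-exploit choice described above, the following is a minimal tabular sketch in Python. It is not VV's implementation: the array-based Q-table, the function names, and the epsilon-greedy exploration rule are assumptions introduced here for illustration.

```python
import numpy as np


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply the update of Eq. 3.1 to a Q-table indexed as Q[state, action]."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * target
    return Q


def epsilon_greedy(Q, s, epsilon, rng):
    """Explore with probability epsilon; otherwise take the current best action."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[s]))
```

With a learning rate of 0.5 and a discount factor of 0.5, a reward of 1 followed by a next state whose best Q-value is 3 moves a zero-initialized entry to 0.5 · 0 + 0.5 · (1 + 0.5 · 3) = 1.25.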

3.2.2 Game Theory: A Brief Overview

Game theory (GT) is a branch of applied mathematics used to analyze situations involving intelligent, rational players. More formally, GT is the study of mathematical models of conflict and cooperation between intelligent, rational decision-makers. It provides tools for analyzing situations in which players make interdependent decisions. This interdependence between players’ decisions causes each player in a game to consider other players’ possible decisions while formulating its strategy. A strategy is an algorithm or a set of rules that each player follows to decide its action in each situation (state). Nash equilibrium (NE) is one of the most important solution concepts in GT; it defines strategy profiles where no player is better off by unilaterally deviating from its equilibrium strategy. However, there are games where at least one player holds private information about the game that is not accessible to the other players, leading to incomplete information games. Incomplete information games are more realistic and are studied using Bayesian Nash equilibrium (BNE), which is defined below along with other necessary definitions.

Definition 1 (Mixed Strategy) For player i, its mixed strategy $\sigma_i$ is a probability distribution over the strategy set $S_i$, i.e., $\sigma_i(s_i)$, $s_i \in S_i$, indicates the probability with which player i plays $s_i$.

Definition 2 (Mixed Strategy Nash Equilibrium (MSNE)) Given an N-player game $\Gamma = \langle N, (S_i), (u_i) \rangle$, a mixed strategy profile $(\sigma_1^*, \ldots, \sigma_n^*)$ is called a mixed strategy Nash equilibrium if, $\forall i \in N$, $u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*)$, $\forall \sigma_i \in \Delta(S_i)$. Here, $\sigma_{-i}^*$ denotes the mixed strategies of all players except i.
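The condition in Definition 2 can be checked numerically for a two-player game given as payoff matrices. The sketch below is an illustration only; the function name and the matrix representation are assumptions made here. It relies on the standard fact that expected utility is linear in a player's own mixed strategy, so it is enough to test deviations to pure strategies.

```python
import numpy as np


def is_mixed_nash(payoff_row, payoff_col, sigma_row, sigma_col, tol=1e-9):
    """Check the MSNE condition for a two-player game in matrix form."""
    A = np.asarray(payoff_row, dtype=float)
    B = np.asarray(payoff_col, dtype=float)
    x = np.asarray(sigma_row, dtype=float)
    y = np.asarray(sigma_col, dtype=float)
    u_row = x @ A @ y  # expected utility of the row player
    u_col = x @ B @ y  # expected utility of the column player
    best_row = np.max(A @ y)  # best pure deviation for the row player
    best_col = np.max(x @ B)  # best pure deviation for the column player
    return bool(best_row <= u_row + tol and best_col <= u_col + tol)
```

For matching pennies, with row payoffs [[1, -1], [-1, 1]] and column payoffs equal to their negation, the profile in which both players mix uniformly passes this check, while the pure profile in which both play their first strategy does not.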


Definition 3 (Bayesian Nash Equilibrium (BNE)) Let N be a set of players, and let $\theta_i$ and $u_i(\cdot)$ be the true type and utility of player i, respectively. In a Bayesian game (incomplete information game) $\Gamma$, a profile of strategies $(s_1^*, s_2^*, \ldots, s_n^*)$ is a Bayesian Nash equilibrium if, $\forall i \in N$; $\forall s_i : \Theta_i \to S_i$; $\forall \theta_i \in \Theta_i$,

$$u_i((s_i^*, s_{-i}^*) \mid \theta_i) \ge u_i((s_i, s_{-i}^*) \mid \theta_i).$$

That is, $\forall i \in N$; $\forall a_i \in S_i$; $\forall \theta_i \in \Theta_i$,

$$E_{\theta_{-i}}[u_i(\theta_i, \theta_{-i}, s_i^*(\theta_i), s_{-i}^*(\theta_{-i}))] \ge E_{\theta_{-i}}[u_i(\theta_i, \theta_{-i}, a_i, s_{-i}^*(\theta_{-i}))],$$

where $s_{-i}^*$ denotes the BNE strategies of all players except i.

3.2.3 VidyutVanika: Overview In this section, we present a generic architecture of VV, followed by architectures of VV18 and VV21. VV18 and VV21 are the brokers deployed in the 2018 and 2021 PowerTAC tournaments, respectively. Figure 3.2 shows the generic architecture of VV, which consists of a wholesale and a tariff module. The wholesale module is responsible for submitting bids and asks in a PDA of the wholesale market. The tariff module is in charge of publishing or revoking tariffs in the tariff market. Additionally, VV maintains dedicated repositories to store data from the PowerTAC game

Fig. 3.2 An abstract architecture of VidyutVanika


server, including the weather, wholesale, balancing, and retail market information. VV uses this stored information in the decision-making process. The strategies used by VV in the wholesale and tariff modules vary from tournament to tournament, which we highlight in the next paragraph.

VV18 uses dynamic programming (DP) in the wholesale market and RL in the tariff market to solve modified versions of known MDP formulations. As depicted in Fig. 3.3, to aid in the decision-making process, VV18 consists of four sub-modules called Customer Usage Predictor (CUP), MDP-based Limit Price Predictor (LPP), Bid/Ask Quantity Predictor (BAQP), and Dummy Order Quantity And Price Predictor (DOQAP) for the wholesale market and three sub-modules called Net Demand Predictor (NDP), MDP & Q-Learning Model (MDPQLM), and Tariff Designer (TD) for the tariff market. On the other hand, VV21 relies on artificial intelligence (AI) and GT to design its wholesale and tariff strategies. VV21 follows the supply curve of the prominent power generating company (GenCo) in the wholesale market and designs its wholesale bidding strategy based on the learned supply curve model. For this, as depicted in Fig. 3.4, it uses the Demand Predictor (DeP), Supply Curve Follower (SCF), and Bid Generator (BG) sub-modules. VV21 determines the optimal market share using GT analysis and designs tariffs to maintain such market share in the tariff market with the help of the Tariff Enhancer (TE), Tariff Designer (TD), and Tariff Health Checker (THC) sub-modules. These sub-modules are detailed in the sections below.

Fig. 3.3 VidyutVanika18: system architecture


Fig. 3.4 VidyutVanika21: system architecture

3.3 Wholesale Strategies

In this section, we present the wholesale strategies deployed in the VV18 and VV21 brokers. First, we include the theoretical background for the strategies used in the wholesale market. More precisely, we show the BNE analysis for a single buyer, single seller, and single-unit item, where both buyer and seller deploy scale-based bidding strategies. We focus on scale-based bidding strategies for the following reasons: (i) they are well studied in the literature and are the most natural choice when brokers need to submit bids in real time, and (ii) compared to devising strategies by solving differential equations [22] or fictitious play formulations [23], which are complex in nature, scale-based strategies are easy to implement. Next, we show the BNE analysis for the more general case involving multiple units of items for trade. Then, we explain each sub-module of the wholesale modules of VV18 and VV21 shown in Figs. 3.3 and 3.4, respectively. We present some interesting results based on PowerTAC tournaments and offline experiments to showcase the performance of the presented wholesale strategies. Note that PowerTAC follows a k-double auction (with $k = 0.5$) in the wholesale market with a uniform pricing rule for energy trading, which is defined as follows:

Definition 4 (k-Double Auction) If a buyer and a seller participate in a double auction and if the buyer’s bid b is higher than the seller’s ask s, then the clearing price is given by $kb + (1 - k)s$ for some fixed $k \in [0, 1]$.

Definition 5 (Uniform Pricing Rule) In the uniform pricing rule, each buyer with a bid higher than the clearing price is declared the winner and pays the clearing price to the seller.
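Definition 4 can be written as a short Python function for the single-buyer, single-seller case. This is a sketch for illustration rather than the PowerTAC market-clearing code; the function name and the use of None to signal that no trade occurs are assumptions made here.

```python
def k_double_auction_price(bid, ask, k=0.5):
    """Clearing price k*bid + (1-k)*ask, or None if the bid does not exceed the ask."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    if bid <= ask:
        return None  # no trade: the bid is not higher than the ask
    return k * bid + (1 - k) * ask
```

With $k = 0.5$, a bid of 10 and an ask of 6 clear at 8, the midpoint of the two prices, while a bid of 5 against an ask of 6 does not clear.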


3.3.1 Bayesian Nash Equilibrium Analysis of Wholesale Strategies

Single-Unit Auctions Let us consider a single-unit k-double auction, where $k = 0.5$, with a uniform pricing rule, a single buyer, and a single seller. Let us assume the types of the buyer and seller are $\theta_B$ and $\theta_S$, respectively. We consider that both players deploy scaling-based strategies. More precisely, a bid [ask] by a buyer [seller] is $b_B = \alpha_B \theta_B$ [$b_S = \alpha_S \theta_S$], where $\alpha_B$ [$\alpha_S$] is the scale factor by which the buyer [seller] scales its true type while bidding. We assume $\theta_B \sim U[l_B, h_B]$ and $\theta_S \sim U[l_S, h_S]$, and this is common knowledge. We make further assumptions in Eq. 3.2, which says that the buyer’s bid (seller’s ask) at any point will be less (higher) than or equal to the highest (lowest) possible seller’s ask (buyer’s bid).

$$\frac{\alpha_B \theta_B}{\alpha_S} \le h_S; \qquad \frac{\alpha_S \theta_S}{\alpha_B} \ge l_B. \tag{3.2}$$

Thus, the utility of the buyer if its bid gets cleared is the difference between its true valuation and the clearing price. Given that the true types are drawn from a distribution, the expected utility is computed as

$$u_B = \int_{l_S}^{\frac{\alpha_B}{\alpha_S}\theta_B} \left( \theta_B - \frac{\alpha_B \theta_B + \alpha_S \theta_S}{2} \right) \frac{1}{h_S - l_S} \, d\theta_S. \tag{3.3}$$

Now, assuming that the buyer decides to fix its $\alpha_B$ before even seeing its own type, its utility is given by

$$\begin{aligned} U_B &= \int_{l_B}^{h_B} u_B \, \frac{1}{h_B - l_B} \, d\theta_B \\ U_B &= \frac{1}{(h_S - l_S)(h_B - l_B)} \left[ \frac{h_B^3 - l_B^3}{3} \left( \frac{\alpha_B}{\alpha_S} - \frac{3\alpha_B^2}{4\alpha_S} \right) - l_S \left( 1 - \frac{\alpha_B}{2} \right) \frac{h_B^2 - l_B^2}{2} + \frac{\alpha_S}{4} l_S^2 (h_B - l_B) \right]. \end{aligned} \tag{3.4}$$

Now, differentiating with respect to $\alpha_B$ and equating to 0 to find the maximum,

$$\frac{\partial U_B}{\partial \alpha_B} = 0 \;\Rightarrow\; \alpha_B = \frac{2}{3} + \frac{\alpha_S l_S}{2} \cdot \frac{h_B^2 - l_B^2}{h_B^3 - l_B^3}. \tag{3.5}$$

Performing a similar analysis for the seller results in

$$u_S = \int_{\frac{\alpha_S}{\alpha_B}\theta_S}^{h_B} \left( \frac{\alpha_B \theta_B + \alpha_S \theta_S}{2} - \theta_S \right) \frac{1}{h_B - l_B} \, d\theta_B. \tag{3.6}$$


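Now, assuming that the seller decides to fix its $\alpha_S$ before even seeing its own type, its utility is given by $U_S$. Then, differentiating it with respect to $\alpha_S$ and equating to 0 to find the maximum,

$$U_S = \int_{l_S}^{h_S} u_S \, \frac{1}{h_S - l_S} \, d\theta_S. \tag{3.7}$$

$$\frac{\partial U_S}{\partial \alpha_S} = 0 \;\Rightarrow\; \alpha_S = \frac{2}{3} + \frac{\alpha_B h_B}{2} \cdot \frac{h_S^2 - l_S^2}{h_S^3 - l_S^3}. \tag{3.8}$$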
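Equations 3.5 and 3.8 are linear in $(\alpha_B, \alpha_S)$, so they can be solved jointly in closed form. The Python sketch below does this and also evaluates the closed form of $U_B$ from Eq. 3.4; the function names are hypothetical, and the code assumes that the bounds satisfy the conditions of Eq. 3.2. For types uniform on [0, 1], it yields $\alpha_B = 2/3$ and $\alpha_S = 1$.

```python
def bne_scale_factors(l_b, h_b, l_s, h_s):
    """Jointly solve Eqs. 3.5 and 3.8 for (alpha_B, alpha_S)."""
    # Write the two conditions as alpha_B = 2/3 + c_b * alpha_S
    # and alpha_S = 2/3 + c_s * alpha_B.
    c_b = 0.5 * l_s * (h_b**2 - l_b**2) / (h_b**3 - l_b**3)
    c_s = 0.5 * h_b * (h_s**2 - l_s**2) / (h_s**3 - l_s**3)
    denom = 1.0 - c_b * c_s
    alpha_b = (2.0 / 3.0) * (1.0 + c_b) / denom
    alpha_s = (2.0 / 3.0) * (1.0 + c_s) / denom
    return alpha_b, alpha_s


def buyer_utility(alpha_b, alpha_s, l_b, h_b, l_s, h_s):
    """Closed-form expected buyer utility U_B from Eq. 3.4."""
    norm = (h_s - l_s) * (h_b - l_b)
    quad = (alpha_b / alpha_s - 3 * alpha_b**2 / (4 * alpha_s)) * (h_b**3 - l_b**3) / 3
    lin = l_s * (1 - alpha_b / 2) * (h_b**2 - l_b**2) / 2
    const = alpha_s * l_s**2 * (h_b - l_b) / 4
    return (quad - lin + const) / norm
```

As a sanity check, fixing $\alpha_S = 1$ for types uniform on [0, 1] and scanning $\alpha_B$ over a fine grid with `buyer_utility` places the maximum at $\alpha_B = 2/3$, in agreement with Eq. 3.5.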

Theorem 1 For a single-buyer, single-seller, single-unit k-double auction, where $k = 0.5$, if the buyer’s and seller’s true types are drawn from a uniform distribution on [0, 1], and if they deploy scaling-based bidding strategies $b_B$ and $b_S$ that satisfy Eq. 3.2 and fix their scaling factors $\alpha_B$ and $\alpha_S$ before seeing their true types, then $\alpha_S = 1$ and $\alpha_B = \frac{2}{3}$ constitute a BNE.

Suppose we add just one more buyer or seller. In that case, the complexity of the solution increases drastically, and it becomes exceedingly challenging to extend and generalize the above results to real-world markets. Thus, for the PowerTAC wholesale market PDA, we design VV18-WS based on the above theoretical background and experimentally demonstrate that it follows the theoretical results obtained in this section [12].

Multi-unit Auctions Now, instead of increasing the number of players, let us assume that both the buyer and the seller are trading $m = 2$ units of identical and indivisible items. In this scenario, both the buyer and the seller place two bids/asks in the auction by following scale-based bidding strategies $b_B$ and $b_S$, respectively. The strategy $b_B$ [$b_S$] is defined as one in which the buyer [seller] places two bids [asks] $b_{B1} = \alpha_{B1} \theta_B$ and $b_{B2} = \alpha_{B2} \theta_B$ [$b_{S1} = \alpha_{S1} \theta_S$ and $b_{S2} = \alpha_{S2} \theta_S$]. Here too, we analyze the buyer’s expected utility. There are three possible clearing scenarios: $u_B^1$ is the expected utility when two-unit clearance happens, $u_B^2$ is the expected utility for one-unit clearance, and $u_B^3$ is the expected utility in the case of no clearing. In the case of no market clearing, $u_B^3 = 0$. After that, assuming the buyer decides to fix its $\alpha_{B1}$ and $\alpha_{B2}$ before even seeing its own type, $U_B$ can be written as follows:

$$u_B^1 = 2 \int_{l_S}^{\frac{\alpha_{B2}}{\alpha_{S2}}\theta_B} \left( \theta_B - \frac{\alpha_{B2} \theta_B + \alpha_{S2} \theta_S}{2} \right) \frac{1}{h_S - l_S} \, d\theta_S. \tag{3.9}$$

$$u_B^2 = \int_{\frac{\alpha_{B2}}{\alpha_{S2}}\theta_B}^{\frac{\alpha_{B1}}{\alpha_{S1}}\theta_B} \left( \theta_B - \frac{\alpha_{B1} \theta_B + \alpha_{S1} \theta_S}{2} \right) \frac{1}{h_S - l_S} \, d\theta_S. \tag{3.10}$$

$$U_B = \int_{l_B}^{h_B} \left( u_B^1 + u_B^2 \right) \frac{1}{h_B - l_B} \, d\theta_B. \tag{3.11}$$


Equations 3.9 and 3.10 compute the interim utility of the buyer for the cases when two units and one unit of clearance happen, respectively. The integral limits in Eq. 3.9 and 3.10 denote the regions involving two units and one unit of clearance, respectively. The normalizing factors denote the sampling probabilities for our assumed uniform distribution. Equation 3.11 computes the expected utility of the buyer based on the above scenarios. In order to solve Eq. 3.11, we first need to solve Eqs. 3.9 and 3.10,

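$$u_B^1 = \frac{1}{h_S - l_S}\left[(2 - \alpha_{B2})\left(\frac{\alpha_{B2}}{\alpha_{S2}}\theta_B^2 - l_S\theta_B\right) - \frac{\alpha_{S2}}{2}\left(\frac{\alpha_{B2}^2}{\alpha_{S2}^2}\theta_B^2 - l_S^2\right)\right]. \tag{3.12}$$

$$u_B^2 = \frac{\theta_B^2}{h_S - l_S}\left(\frac{\alpha_{B1}}{\alpha_{S1}} - \frac{\alpha_{B2}}{\alpha_{S2}}\right)\left[1 - \frac{\alpha_{B1}}{2} - \frac{\alpha_{S1}}{4}\left(\frac{\alpha_{B1}}{\alpha_{S1}} + \frac{\alpha_{B2}}{\alpha_{S2}}\right)\right]. \tag{3.13}$$

Solving Eq. 3.11 using the results in Eqs. 3.12 and 3.13,

$$U_B = \frac{h_B^2 + h_B l_B + l_B^2}{3(h_S - l_S)}\left[\left(\frac{\alpha_{B1}}{\alpha_{S1}} - \frac{\alpha_{B2}}{\alpha_{S2}}\right)\left(1 - \frac{\alpha_{B1}}{2} - \frac{\alpha_{S1}}{4}\left(\frac{\alpha_{B1}}{\alpha_{S1}} + \frac{\alpha_{B2}}{\alpha_{S2}}\right)\right) + 2\,\frac{\alpha_{B2}}{\alpha_{S2}}\left(1 - \frac{\alpha_{B2}}{2}\right) - \frac{\alpha_{B2}^2}{2\alpha_{S2}}\right] - l_S\left(1 - \frac{\alpha_{B2}}{2}\right)\frac{h_B + l_B}{h_S - l_S} + \frac{\alpha_{S2}\, l_S^2}{2(h_S - l_S)}. \tag{3.14}$$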
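As a sanity check on Eqs. 3.9 to 3.14, the closed form for the buyer's expected utility can be compared against direct numerical integration. The sketch below is only illustrative: the function names, the midpoint-rule integrator, and the parameter values in `PARAMS` are assumptions chosen for the check (with the integration limits lying inside the seller's type range), not values taken from the analysis.

```python
"""Compare the closed form of U_B (Eq. 3.14) with numerical integration of Eqs. 3.9-3.11."""


def midpoint(f, lo, hi, n):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))


def u1_integral(th_b, p):
    """Eq. 3.9: buyer's interim utility when both units clear."""
    def integrand(th_s):
        price = (p["a_b2"] * th_b + p["a_s2"] * th_s) / 2
        return (th_b - price) / (p["h_s"] - p["l_s"])
    return 2 * midpoint(integrand, p["l_s"], p["a_b2"] * th_b / p["a_s2"], 20)


def u2_integral(th_b, p):
    """Eq. 3.10: buyer's interim utility when exactly one unit clears."""
    def integrand(th_s):
        price = (p["a_b1"] * th_b + p["a_s1"] * th_s) / 2
        return (th_b - price) / (p["h_s"] - p["l_s"])
    lower = p["a_b2"] * th_b / p["a_s2"]
    upper = p["a_b1"] * th_b / p["a_s1"]
    return midpoint(integrand, lower, upper, 20)


def big_u_integral(p):
    """Eq. 3.11: expectation of u1 + u2 over the buyer's uniform type."""
    def integrand(th_b):
        return (u1_integral(th_b, p) + u2_integral(th_b, p)) / (p["h_b"] - p["l_b"])
    return midpoint(integrand, p["l_b"], p["h_b"], 2000)


def big_u_closed(p):
    """Eq. 3.14: closed form of the buyer's expected utility."""
    r1 = p["a_b1"] / p["a_s1"]
    r2 = p["a_b2"] / p["a_s2"]
    width_s = p["h_s"] - p["l_s"]
    bracket = ((r1 - r2) * (1 - p["a_b1"] / 2 - p["a_s1"] / 4 * (r1 + r2))
               + 2 * r2 * (1 - p["a_b2"] / 2)
               - p["a_b2"] ** 2 / (2 * p["a_s2"]))
    second_moment = p["h_b"] ** 2 + p["h_b"] * p["l_b"] + p["l_b"] ** 2
    return (second_moment / (3 * width_s) * bracket
            - p["l_s"] * (1 - p["a_b2"] / 2) * (p["h_b"] + p["l_b"]) / width_s
            + p["a_s2"] * p["l_s"] ** 2 / (2 * width_s))


# Illustrative parameters only (assumed, not from the chapter).
PARAMS = dict(a_b1=0.8, a_b2=0.6, a_s1=1.0, a_s2=1.1,
              l_b=0.5, h_b=1.0, l_s=0.1, h_s=1.0)

if __name__ == "__main__":
    print(big_u_closed(PARAMS), big_u_integral(PARAMS))
```

The inner integrands are linear in the seller's type, so the midpoint rule is exact there; only the outer integral over the buyer's type carries a small discretization error.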

The seller's expected utility can be calculated by following similar logic, and we refer the reader to [15] for the complete analysis. Now, we consider the following four possible scenarios based on the scale factors of the buyer and the seller:

CASE-1: Same Scale Factors for Buyer, Same Scale Factors for Seller
CASE-2: Different Scale Factors for Buyer, Same Scale Factors for Seller
CASE-3: Same Scale Factors for Buyer, Different Scale Factors for Seller
CASE-4: Different Scale Factors for Buyer, Different Scale Factors for Seller

The theorem below presents our main result. We refer the reader to the extended version of our paper [15] for the remaining analysis and the BNE scale factors for each case.

Theorem 2 For a single-buyer, single-seller, two-unit k-double auction with k = 0.5, where θB ∼ U[lB, hB] and θS ∼ U[lS, hS], when the players deploy scale-based bidding strategies bB and bS, we get a system of equations whose solution is a unique set of scale factors for the buyer and the seller that constitutes a BNE for each of the four cases.

For this analysis, too, increasing the number of units further makes the analysis increasingly complex, so it becomes exceedingly challenging to extend and generalize the above results to real-world markets. Thus, we propose one more strategy


S. Chandlekar et al.

Algorithm 1 MDPLCPBS(energyReq[1...24])
 1: marketData[0...24] = getMarketStatistics()
 2: if EnoughDataPoints(marketData) then
 3:     bidPrices[1...24] = LPPMDP(marketData)
 4:     bidQty[1...24] = BAQP(energyReq, bidPrices)
 5: else
 6:     bidPrices[1...24] = SampleBiddingPolicy()
 7:     bidQty[1...24] = energyReq[1...24]
 8: end if
 9: sendBids(bidPrices, bidQty)
10: sendDummyBids(bidPrices, marketData)
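The branching in Algorithm 1 can be sketched in Python as follows. This is a minimal sketch of the control flow only: the four helpers are passed in as callables because their internals (LPPMDP, BAQP, the sample bidding policy, and the data-sufficiency test) are not specified here, and the toy lambdas in the usage lines are hypothetical stand-ins rather than the broker's logic. Sending the bids and dummy bids is left out.

```python
def mdplcpbs(energy_req, market_data, *, enough_data, lpp_mdp, baqp, sample_policy):
    """Return (bid_prices, bid_qty) following the if/else structure of Algorithm 1.

    All four keyword arguments are caller-supplied callables (assumed interfaces):
      enough_data(market_data) -> bool
      lpp_mdp(market_data)     -> list of limit prices
      baqp(energy_req, prices) -> list of bid quantities
      sample_policy()          -> list of fallback bid prices
    """
    if enough_data(market_data):
        # Enough observations: use MDP limit prices and split the quantity.
        bid_prices = lpp_mdp(market_data)
        bid_qty = baqp(energy_req, bid_prices)
    else:
        # Too little data: fall back to the sample policy and bid the full need.
        bid_prices = sample_policy()
        bid_qty = list(energy_req)
    return bid_prices, bid_qty


if __name__ == "__main__":
    # Hypothetical stand-ins, used only to exercise both branches.
    need = [1.0] * 24
    prices, qty = mdplcpbs(
        need,
        market_data=[],
        enough_data=lambda data: len(data) >= 10,
        lpp_mdp=lambda data: [sum(data) / len(data)] * 24,
        baqp=lambda req, p: [r / 2 for r in req],
        sample_policy=lambda: [30.0] * 24,
    )
    print(prices[0], qty[0])
```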

based on DDPG, which we call the DDPG-based bidding strategy (DDPGBBS) for PowerTAC PDAs; it approaches the above theoretical equilibrium and generalizes to real-world PDAs. In the following sections, we discuss the strategies designed based on the above theoretical background, namely VV18-WS and DDPGBBS.

3.3.2 VV18–WS
The wholesale module of VV18 (VV18-WS) participated in the wholesale market auctions by placing bids/asks based on the net usage predicted for a future timeslot by a Neural Network (NN)-based Customer Usage Predictor (CUP). VV18-WS comprises four major submodules, as described below.

Customer Usage Predictor (CUP) It is responsible for estimating the broker's net energy requirement for a future target timeslot T by adding up the predicted usage of each subscribed customer. This NN predictor uses two hidden layers with seven neurons each and is trained for ten epochs on the training data. The input vector consists of the weather report, time of day (0–23), and day of the week (1–7), while the target variable is the customer's actual usage. During prediction, the weather forecast serves as a proxy for the weather report to predict the usage for the next 24 timeslots. At the beginning of a new game, a fresh model is initialized and trained on the 336 data points from the bootstrap information. Subsequently, the model is continuously updated via online training throughout the game, as the broker gets more data points from the usage reports for each subscribed customer.

Limit Price Predictor (LPP) It is primarily motivated by the works of [6] and [24]. The novelty of our MDP structure lies in the design of the reward and the way the MDP solution is used in placing bids. The limit prices generated from solving the MDP are used to place bids. The amount of energy to be placed in a bid is decided by the BAQP submodule, and normally the energy requirement for a particular delivery slot is split across multiple opportunities, unlike placing a bid with the total predicted energy in a single auction as proposed in [6]. VV18-WS maintains two MDP instances, one for buying and another for selling energy. Below, we describe


Algorithm 2 LPPMDP(marketData)
1: V(0) = marketData.getAverageBalancingPrice()
2: for s in [1, 2, ..., 24] do
3:     Initialize an empty dictionary V
4:     ac = marketData.getAuctions[s]
5:     for price in [marketData.minPrice, marketData.maxPrice] do
           Σ ac.LastClearedPrice […]

[…] 0 then
4:         energyDist[s] = energyReq(s) / Σ_{j=s}^{24} limitprice[j] […]
5:     else
6:         energyDist[s] = energyReq(s) · limitprice[s] / Σ_{j=s}^{24} limitprice[j]
7:     end if
8: end for
9: return energyDist[1...24]

$$\mathrm{LCP}(s) = \min\left(\mathit{dummybids}_{cleared};\ \mathit{limitprice}[s]_{cleared}\right), \tag{3.15}$$

where $\mathit{dummybids}_{cleared}$ is the set of bid prices of all cleared dummy bids in state s, and $\mathit{limitprice}[s]_{cleared}$ is the limit price of the cleared final bid made by the broker in state s. To estimate LCP for asks, we replace min by max and $\mathit{dummybids}_{cleared}$ by $\mathit{dummyasks}_{cleared}$. Then, LCP is used to update the transition function $p_{cleared}$.
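Eq. 3.15 amounts to taking the smallest cleared price on the bid side (or the largest on the ask side). A minimal sketch, assuming prices are handled as positive magnitudes and that an uncleared final bid is represented by `None`; the function name and argument names are illustrative, not from the broker's code:

```python
def estimate_lcp(cleared_dummy_prices, cleared_limit_price=None, side="bid"):
    """Estimate the LCP of Eq. 3.15 for one state.

    cleared_dummy_prices: prices of the dummy bids (or asks) that cleared.
    cleared_limit_price:  limit price of the broker's final order if it cleared,
                          otherwise None (assumed convention).
    side: "bid" uses min, "ask" uses max, as described in the text.
    Returns None when nothing cleared, since there is nothing to estimate from.
    """
    candidates = list(cleared_dummy_prices)
    if cleared_limit_price is not None:
        candidates.append(cleared_limit_price)
    if not candidates:
        return None
    return min(candidates) if side == "bid" else max(candidates)
```

For example, with cleared dummy bids at 32.0, 30.5, and 35.0 and a cleared final bid at 31.0, the bid-side estimate is 30.5.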

3.3.3 DDPGBBS for Bidding in Smart Grids
We design DDPGBBS using concepts from RL and GT. Below is the MDP formulation of the strategy:
– State space:
  1. Proximity p, where p ∈ {0, 1, 2, ..., 24}
  2. Quantity to buy q, where q ∈ R
  3. Buyer's true type θ, where θ ∈ R. Here, we take θ to be the average unit buying price in the balancing market in a game.
– Action space: Buyer's scale factors αB1, αB2 ∈ [0, 1]
– Transitions:
  1. Proximity p changes from p to p − 1.
  2. Quantity to buy q becomes the remaining quantity q′ after the current auction clearing.
  3. Buyer's true type θ remains unchanged.
– Terminal State: If p = 0 or q = 0
– Rewards:


  1. If s is not a terminal state and no market clearing happens, then reward = 0.
  2. If s is not a terminal state and market clearing happens, then reward = −cp · cq, where cp and cq are the clearing price and the buyer's clearing quantity.
  3. If s is a terminal state, then reward = −q″ · θ, where q″ is the remaining quantity at the end of all 24 auctions.

During the game, the model outputs two scale factors, which are multiplied by θ to create two bids for the auction. At the same time, the required quantity of subscriber demand is distributed equally between the two bids, as our model places only two bids in the auction. We assume the quantity placed in each bid is a single indivisible unit, so the auction can be treated as a two-unit double auction.

We train the model offline by accumulating experiences in the replay buffer; for this, we run two batches of experiments, with 20 games in each batch. In the first batch, DDPGBBS competes against a ZI broker in a two-player configuration, while in the second batch it faces three similar ZI brokers in a four-player configuration. The ZI strategy follows a randomized approach to bidding in a PDA by sampling a price from a uniform distribution between the minimum and maximum bid prices. The hourly demand is distributed equally between all the competing brokers in each batch. The strategy updates the replay buffer after each auction instance in a game. The ZI brokers are preferred because their bidding pattern is random, which improves learning by visiting a wide range of states in the state space, as opposed to some other brokers that make the agent visit only a subset of the state space. After executing both batches, we update our final model using the merged replay buffer of both batches by following the standard DDPG update procedure.

To benchmark the performance of DDPGBBS, we use state-of-the-art wholesale bidding strategies (ZIP [25] and ZI) and successful strategies of previous PowerTAC tournaments (SPOT [26] and VV18 [12]).
We execute two sets of experiments; the first set of experiments is divided into four parts. In each of these four parts, DDPGBBS plays ten two-player games against one of the brokers from the set {SPOT, VV18, ZIP, ZI}. Similarly, in the second set, DDPGBBS plays ten five-player games with all the available brokers in the game. Figure 3.5 summarizes the first set of experiments and displays the other brokers' relative unit clearing prices with respect to DDPGBBS's clearing price. A value greater than 1 implies that the broker had a higher mean clearing price than DDPGBBS after playing ten games. Figure 3.6 shows the result for the second set of experiments, where we compare each broker's average unit clearing price across ten games in a five-player game configuration. In both experiments, DDPGBBS consistently outperforms all the other brokers by a considerable margin.
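The MDP formulation of DDPGBBS (state, action, transition, reward) can be written down as a small environment step. The sketch below is an illustration, not the trained agent: the class and function names are made up here, and it makes one explicit assumption about the rewards, namely that the step which reaches a terminal state is charged both the clearing cost of that auction and the terminal penalty q″ · θ, whereas the formulation lists the two as separate cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    proximity: int    # p: auctions left before delivery, 0..24
    quantity: float   # q: energy still to be bought
    theta: float      # buyer's true type (average balancing price)

    @property
    def terminal(self):
        return self.proximity == 0 or self.quantity == 0


def make_bids(state, alpha_b1, alpha_b2):
    """Turn the two scale factors into two (price, quantity) bids.

    Each bid price is alpha * theta and the required quantity is split equally.
    """
    half = state.quantity / 2
    return [(alpha_b1 * state.theta, half), (alpha_b2 * state.theta, half)]


def step(state, clearing_price, cleared_qty):
    """Apply one auction outcome and return (next_state, reward).

    cleared_qty == 0 means no market clearing for this buyer.
    """
    remaining = max(state.quantity - cleared_qty, 0.0)
    nxt = State(state.proximity - 1, remaining, state.theta)
    reward = -clearing_price * cleared_qty      # zero when nothing clears
    if nxt.terminal:
        # Assumed here: leftover quantity is settled at the balancing price.
        reward -= remaining * state.theta
    return nxt, reward
```

A DDPG actor would map `(proximity, quantity, theta)` to the two scale factors passed to `make_bids`; the replay buffer would store the tuples produced by `step`.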

Fig. 3.5 Relative unit clearing price for 2-player comparison


Fig. 3.6 Mean unit clearing price comparison for 5-player game configuration

3.3.4 VV21–WS
The aim of VV21-WS is to procure the energy of its subscriber base at the lowest possible cost. The sequential decision-making of bid placement for delivery timeslot T is guided using a supply curve constructed using uncleared asks after every auction clearing. This entire pipeline is controlled using a DeP, Supply Curve Follower (SCF), and bid generator (BG). After exhausting all opportunities, any uncertainty in energy procurement is adjusted through transactions in the balancing market. Although the wholesale module is described from a buyer's perspective, it is straightforward to design a similar behavior for brokers wishing to sell energy using the portfolio of production customers.

Algorithm 4 VV21-WS(currentTime)
1: mktUsage[] = NetDemandPredictor(currentTime)
2: askPrices[] = SupplyCurveFollower(currentTime, mktUsage)
3: bids[][] = BidGenerator(currentTime, askPrices)
4: for hour in [1, 2, ..., 23] do
5:     futureTime = currentTime + hour
6:     Fetch M bids from bids[futureTime]
7:     Submit M bids to Auction(currentTime, futureTime)
8: end for
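The SupplyCurveFollower call in Algorithm 4 relies on a supply curve built from uncleared asks. One simple way to read a price off such a curve is to sort the asks by price and walk up until the cumulative quantity covers the outstanding demand. The sketch below shows only that lookup, under the assumption that asks are available as (price, quantity) pairs with positive values; it is not the broker's actual curve-fitting code, and the function name is illustrative.

```python
def min_ask_price_for(demand, uncleared_asks):
    """Return the price of the marginal ask needed to cover `demand`.

    uncleared_asks: iterable of (price, quantity) pairs (assumed format).
    Returns None if the listed asks cannot cover the demand.
    """
    covered = 0.0
    for price, qty in sorted(uncleared_asks):   # cheapest asks first
        covered += qty
        if covered >= demand:
            return price
    return None


if __name__ == "__main__":
    asks = [(30.0, 5.0), (25.0, 10.0), (40.0, 20.0)]
    print(min_ask_price_for(12.0, asks))
```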

Supply Curve Follower (SCF) The SCF helps in determining suitable limit prices for generating bids. In the PowerTAC wholesale market, a GenCo acts as the main supplier, and the relationship between supply and demand (p, q) is quadratic in nature. The core idea involves constructing a supply–demand curve through uncleared asks and following the curve to identify the minimum ask price of GenCo for the given outstanding energy requirements of all buyers. Each auction can be marked using the tuple (t, T), where t ∈ {T − 24, ..., T − 1} are the bidding times and T is the delivery time. VV21-WS does not place any bids at the first opportunity (i.e., at


Algorithm 5 SupplyCurveFollower(currentTime, netDemand)
1: for hour in [1 ... 23] do
2:     futureTime = currentTime + hour
3:     Get UnclearedAsks of Auction(currentTime-1, futureTime)
4:     if UnclearedAsks is empty then
5:         Get ClearedPrices of Auction(t, t+hour) ∀ t […]

[…] 1, else si = 1. Steps 3 and 4 calculate the rate values for each of the 168 hours of the week. Finally, step 5 designs a weekly ToU tariff for the supplied rate values and a given powerType by calling the CreateTariff() method.

Tariff Health Checker (THC) The THC periodically removes unhealthy tariffs from the retail market. Specifically, it queries the tariff repository to get the accounting information of all of VV21's active tariffs, from which it calculates the profit/loss of each tariff and its average subscription rate. The profit obtained by a tariff is the revenue generated by the tariff from its subscribers, less energy procurement costs. Tariffs that are loss-making or under-subscribed are marked as unhealthy and are removed from the market.

Important Results Below, we present some of the crucial results indicating the performance of VV21-TS. First, we examine the efficacy of VV21-TS and TD by performing controlled ablation experiments. For this, we disable VV21-TS and TD individually to create new brokers and report the performance drop observed by playing 50 two-player games between the full VV21 and the new curtailed brokers. Let VV21-WTS be the broker generated by substituting VV21-TS with the sample broker tariff strategy and VV21-WTD be the broker with the TD submodule disabled. We observe that VV21-WTS and VV21-WTD were able to generate only 50.54% and 79.70% of the profits, respectively, indicating the importance of VV21-TS and TD in mitigating capacity transaction charges and improving profits. After that, based on the GT analysis, we chose to keep HB around 60% for the 3-player configuration and 45% for the 5- and 7-player configurations in the PowerTAC 2021 tournament.
The values of MB and LB are set to 60% and 40% of the HB. The details of these calculations are included in Sect. 3.4.3. Based on these calculations, the equilibrium market shares for the 3-, 5-, and 7-player configurations are 48%, 38.55%, and 30%, respectively. As shown in Fig. 3.8 (right), VV21 maintained a mean market share of approximately 55%, 40%, and 38% in the PowerTAC 2021 tournament in the 3-, 5-, and 7-player configurations, respectively, which are approximately in accordance with the equilibrium market shares.

Fig. 3.8 Market share analysis of PowerTAC 2018 (left) and PowerTAC 2021 (right)
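The Tariff Health Checker rule described earlier in this subsection (profit equals subscriber revenue less procurement cost; loss-making or under-subscribed tariffs are removed) can be sketched as a simple filter. The record layout, the function name, and the `min_subscription_rate` threshold below are hypothetical, since the chapter does not state how "under-subscribed" is quantified.

```python
from dataclasses import dataclass


@dataclass
class TariffAccount:
    name: str
    revenue: float                # income from the tariff's subscribers
    procurement_cost: float       # cost of the energy bought for them
    avg_subscription_rate: float  # average share of customers subscribed


def unhealthy_tariffs(accounts, min_subscription_rate=0.05):
    """Return the names of tariffs that are loss-making or under-subscribed.

    min_subscription_rate is an assumed cut-off used only for illustration.
    """
    flagged = []
    for acc in accounts:
        profit = acc.revenue - acc.procurement_cost
        if profit < 0 or acc.avg_subscription_rate < min_subscription_rate:
            flagged.append(acc.name)
    return flagged
```

A broker would call such a check periodically and revoke the flagged tariffs; how often, and with what threshold, is a tuning decision outside this sketch.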

3.4.3 Game Theoretical (GT) Analysis of VV21 Tariff Strategy
In this section, we describe how we model the problem of determining the optimal market share as a GT problem and identify the required parameters, namely HB, MB, and LB used in Algorithm 11, for each player configuration. Here, we only show the analysis for the 5-player configuration. We follow the same method to determine the parameters for the other player configurations as well.

We model the PowerTAC game as a two-player zero-sum game, where the utility u1 for VV21 is defined as the difference between the mean cash positions of VV21 and all opponents. Modeling the game this way helps to maximize the difference between the mean cash positions of VV21 and its opponents, which in turn helps VV21 to yield higher profits than its opponents. The higher difference between profits is reflected in the normalized scores, which decide the winner of the tournament. In contrast, if we aimed only to maximize our own cash position, opponents might earn similar cash positions and achieve similar normalized scores, which is undesirable. Although PowerTAC games are not zero-sum games by design, we model them as two-player zero-sum games to serve our aim of not only achieving the highest cash balance but also maximizing the difference, as mentioned above.

In this analysis, we make VV21 act as the row player and all the opponents act as a (single) column player. The higher bound (HB) on market share is discretized as the strategy set S1 = {0%, 15%, 30%, 45%, 60%, 75%, 100%}. We consider broker agents from past PowerTAC tournaments to generate the column player's strategy set S2, as shown in the columns of Fig. 3.9. Specifically, we use AgentUDE (A), CrocodileAgent (C), TUC_TAC (TT), VidyutVanika18 (VV18), and VidyutVanika20 (VV20) as opponent brokers. Since four of these five brokers are chosen as opponents in a 5-player game, the size of the column player's strategy set is $\binom{5}{4} = 5$. An example of one such game setting (the first cell in Fig. 3.9) is when VV21 plays the 0% strategy and the four brokers TT, VV18, VV20, and C participate as opponents, which together form one of the column player's strategies.


Fig. 3.9 Five player-game analysis (with utility values in millions)

The utility values of VV21 shown in Fig. 3.9 are calculated by playing T games for each possible combination of VV21's and the opponents' strategies. The utility values for all $s_i \in S_1$ and $s_{-i} \in S_2$ are calculated as

$$u_1(s_i, s_{-i}) = \frac{1}{T}\sum_{i=1}^{T} x_i - \frac{1}{n}\,\frac{1}{T}\sum_{i=1}^{T}\left(\sum_{k=1}^{n} y_{ik}\right). \tag{3.19}$$

In Eq. 3.19, xi denotes the final cash balance of VV21 in game i, yik denotes the final cash balance of opponent broker k in game i, and n denotes the number of opponent brokers in the game. For our analysis, we kept T = 5. With the help of Gambit [27], we found that the above game has a unique BNE, where VV21's MSNE proposes randomizing between 30% and 45% market shares with probabilities of 0.43 and 0.57, respectively, which results in an equilibrium market share of 38.55% (0.43 · 30 + 0.57 · 45). Similarly, for 7- and 3-player games, the equilibrium market shares turn out to be 30% and 48%, respectively.

The above analysis presents how VV21 should randomize while targeting market share. Apart from this randomization, it is not easy to maintain a specific market share across different games due to the stochasticity of the PowerTAC simulation and customer models. Hence, we seek to maintain market share within explicit bounds with the help of the HB, MB, and LB hyperparameters, such that VV21's mean market share is near the equilibrium market share, as shown in the above subsection.
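The utility in Eq. 3.19 and the equilibrium market share arithmetic can be reproduced in a few lines. This is a minimal sketch: the function names and the cash balances in the usage lines are made up for illustration, while the mixed strategy {30%: 0.43, 45%: 0.57} is the one reported in the text.

```python
def utility(vv21_cash, opponents_cash):
    """Eq. 3.19: mean VV21 cash minus mean opponent cash over T games.

    vv21_cash:      list of T final balances x_i.
    opponents_cash: list of T lists, each holding the n opponents' balances y_ik.
    """
    num_games = len(vv21_cash)
    num_opponents = len(opponents_cash[0])
    own_mean = sum(vv21_cash) / num_games
    opp_mean = sum(sum(game) for game in opponents_cash) / (num_opponents * num_games)
    return own_mean - opp_mean


def expected_share(mixed_strategy):
    """Expected market share (in %) of a mixed strategy {share_percent: probability}."""
    return sum(share * prob for share, prob in mixed_strategy.items())


if __name__ == "__main__":
    # Illustrative balances only (two games, two opponents).
    print(utility([10.0, 20.0], [[1.0, 3.0], [5.0, 7.0]]))
    # Mixed strategy reported in the text for the 5-player configuration.
    print(expected_share({30: 0.43, 45: 0.57}))
```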

3.5 Results and Discussion
In this section, we first present the results of the PowerTAC 2018 and 2021 tournaments to showcase the performance of VV18 and VV21 in the respective tournaments. After that, we discuss what we learned from participating in PowerTAC tournaments. In particular, we describe the kinds of strategies that might not work well in PowerTAC and discuss why.


3.5.1 Tournament Results (PowerTAC 2018 and 2021)
We first present the leaderboard of the 2018 and 2021 PowerTAC tournament finals, with the normalized scores of each broker in each game configuration, in Fig. 3.10. Instead of unnormalized scores, we indicate the percentage cash position with respect to the winning broker based on unnormalized scores. The leaderboard numbers demonstrate that VV18 was the runner-up in 2018 and that VV21 was the winner of the 2021 PowerTAC tournament. In 2018, VV18 was placed second, behind AgentUDE, despite winning more games than AgentUDE. This is because of the higher normalized cumulative profits of AgentUDE in each configuration across all games in the tournament. In 2021, our modeling of the PowerTAC game as a zero-sum game that maximizes the difference between the cash positions of VV21 and its opponents, along with the ability to procure energy in the wholesale and balancing markets at lower prices, helped VV21 achieve approximately double the score of the second-placed broker in each configuration.

Fig. 3.10 Leaderboard of PowerTAC 2018 (left) and PowerTAC 2021 (right)

Figure 3.11 shows the number of 1st and 2nd ranks of each broker in the 2018 and 2021 PowerTAC tournament finals. As stated previously, VV18 won more games than the winning broker AgentUDE. Still, it ended up as the runner-up due to lower normalized cumulative profits. Specifically, AgentUDE earned high profits in 2-player games, which helped it cement its place as the tournament winner. Upon analyzing the 2021 plots, we notice that both VV21 and TUC_TAC won more than half of their respective games. In fact, TUC_TAC won slightly more games than VV21 but lost considerably in the other games, whereas VV21 managed to curtail its losses in the games that it could not win. The capability to curtail losses also played a decisive role in the victory of VV21 in the 2021 tournament.

Fig. 3.11 Number of first and second ranks of each broker in PowerTAC 2018 (left) and 2021 (right)

To analyze the results further, we show the number of games in which brokers ended up with negative profits in the 2018 and 2021 PowerTAC tournament finals in Fig. 3.12. In 2018, VV18 had the second-fewest games with negative profits after CrocodileAgent. Note that VV18 had four times the average market share of CrocodileAgent and still had a similar number of negative-profit games. Then, in 2021, VV21 ended up with negative profits in fewer games than any other broker in the tournament despite maintaining a considerable market share. Thus, VV consistently managed to make up for its losses and rarely became non-profitable.

Fig. 3.12 Games with negative cash in PowerTAC 2018 (left) and 2021 (right)

An accounting analysis of the PowerTAC tournament finals of 2018 and 2021 is presented in Fig. 3.13, which gives the breakdown of broker revenue in the tariff, wholesale, and balancing markets, along with capacity transaction penalties and the final cash position of each broker. In 2018, VV18 managed to reduce imbalance costs despite having a number of customers almost comparable to that of Bunnie, displaying the efficacy of the CUP module. Moreover, VV18 had one of the best tariff market income-to-cost ratios (1.14), with only CrocodileAgent and AgentUDE (1.32 and 1.43, respectively) having better ratios. Note that both CrocodileAgent and AgentUDE captured a much smaller market share than VV18. In 2021, an effective retail strategy resulted in handsome profits in the retail market and lower capacity transaction charges. Furthermore, a strong wholesale and retail strategy contributed to low balancing costs. Additionally, VV21 performed the best in terms of the income-to-cost ratio (1.67) among the revenue-making brokers of the tournament.

3.5.2 Discussion
We have been participating in the PowerTAC tournaments for five years. During this period, we attempted many different strategies and techniques. We have already discussed the successful strategies; however, some strategies/techniques did not result in satisfactory performance, and we learned that those are not suitable for the PowerTAC environment. Below, we highlight some of the crucial lessons we learned that could be useful for broker development in the future.

Fig. 3.13 Accounting information of brokers in PowerTAC 2018 (left) and 2021 (right)

The first lesson relates to usage prediction during a game in the PowerTAC simulation. To achieve higher prediction accuracy for customer usage and net demand prediction, which is a time series prediction task, we tried various offline learning techniques ranging from statistical analysis (e.g., ARIMA) to machine learning (e.g., feed-forward networks and RNNs such as LSTM and GRU). We observed that even after training on enough data points and utilizing the best-known architectures, our models did not consistently outperform the sample broker's exponential average prediction method during the tournament games. The lesson here is that offline prediction is unsuitable for usage prediction in the PowerTAC simulator because of the simulator's inbuilt randomness and other exogenous elements like weather. In PowerTAC, there is inbuilt randomness to model customer behavior at the start of the game; additionally, the weather location and the season are also selected randomly at the start, leading to high randomness from game to game. Thus, we would need to collect data covering many possible classes for offline training. It would be tough for a prediction model to learn from data with such high variation in usage patterns. Moreover, even if the model learns, it will learn an average response based on all collected data, which may not be suitable for usage prediction in PowerTAC, as even a small imbalance leads to high balancing penalties. The more accurate way is to make online predictions within the game and not rely on data from previous games.

Another important lesson concerns the design of the tariff strategy. For designing any learning-based strategy (e.g., RL), one needs to collect enough data points, possibly millions, to train the model over the largest possible part of the state space. However, the problem occurs while generating data points for any tariff strategy. To design any tariff strategy, one needs to select an appropriate state space containing information about one's own current tariffs, opponents' tariffs, the market situation, etc. Thus, the state space has to be sufficiently large, and we need millions of data points, which is problematic because tariff usage data is generated only once every six timeslots during the game. Thus, there are around 240 data points in a game, which implies that we would need to play thousands of games (roughly 4,200 for one million data points) to generate enough data for training, which is impractical in PowerTAC.
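The online alternative mentioned above, an exponentially averaged usage estimate that is updated inside the game, can be sketched as follows. This is an illustration of the general technique, not the sample broker's code: the class name, the smoothing factor `alpha`, and the choice of one estimate per hour of the week (`slots=168`) are assumptions made for the example.

```python
class ExponentialUsagePredictor:
    """Per-slot exponential moving average of observed usage, updated online."""

    def __init__(self, alpha=0.3, slots=168):
        self.alpha = alpha                  # weight of the newest observation (assumed)
        self.estimates = [None] * slots     # one running estimate per recurring slot

    def update(self, timeslot, usage):
        """Fold one observed usage value into the estimate for its slot."""
        i = timeslot % len(self.estimates)
        old = self.estimates[i]
        if old is None:
            self.estimates[i] = usage       # first observation seeds the estimate
        else:
            self.estimates[i] = (1 - self.alpha) * old + self.alpha * usage

    def predict(self, timeslot):
        """Predicted usage for a future timeslot; 0.0 if the slot was never observed."""
        est = self.estimates[timeslot % len(self.estimates)]
        return 0.0 if est is None else est
```

Because the estimate is rebuilt from the observations of the current game, it adapts to that game's weather location, season, and customer mix without any offline training data.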


3.6 Future Directions
In the future, we aim to design complete learning-based strategies by incorporating information about market share equilibrium in the tariff market and supply curve details in the wholesale market. Such learning-based strategies for the tariff and wholesale markets would adapt from game to game and play appropriately against different brokers and player configurations. In the past, there have been effective learning-based wholesale strategies; however, there has not been an effective learning strategy in the tariff market. Our MDP-based tariff strategy used in VV18 made losses in many games, and we had to resort to a predefined weekly ToU tariff. As discussed in the chapter, strategies based on RL might not do well because of the scarcity of training data. One possible direction is to design an effective online learning strategy for the tariff market. An online learning framework like multi-armed bandits may be more suitable for the task. Along similar lines, one possible extension of VV21's tariff strategy would be to learn the equilibrium market share online, depending on the player configuration and the nature of the players. This would enable us to maintain appropriate market shares against stronger and weaker opponents during the game.

Apart from the wholesale and tariff markets, there has not been much work done on the balancing market. VV21 exploits the balancing market and earns handsome profits in games with unusually high demand but does not earn much profit from the balancing market in games with standard demand. A bidding strategy for the balancing market can be designed by incorporating an accurate predictor of market surplus and deficit. However, such a bidding strategy in the balancing market has to be integrated with the wholesale bidding strategy. This would enable the wholesale strategy to buy less/more during the wholesale auctions based on the predicted market surplus/deficit. At the same time, the remaining/extra energy can be traded in the balancing market.

For any strategic decisions in PowerTAC markets, we need a reliable prediction model, i.e., a customer usage predictor and a net demand predictor. A more robust prediction model incorporating the variability of usage patterns based on offered tariffs (basically a prediction model integrated with demand response) would be more accurate for dynamic usage patterns.

References
1. Techopedia. Smart grid (2022). [Online. Accessed 25 Mar 2022]
2. Ketter, W., Collins, J., de Weerdt, M.: The 2020 power trading agent competition (March 30, 2020). In: ERIM Report Series Reference No. 2020-002 (2020). https://doi.org/10.2139/ssrn.3564107
3. Ketter, W., Collins, J., Reddy, P.: Power TAC: a competitive economic simulation of the smart grid. Energy Econ. 39, 262–270 (2013)
4. Ketter, W., Peters, M., Collins, J., Gupta, A.: A multiagent competitive gaming platform to address societal challenges. MIS Q. 40, 447–460 (2016)
5. Ketter, W., Peters, M., Collins, J., Gupta, A.: Competitive benchmarking: an IS research approach to address wicked problems with big data and analytics. MIS Q. 40, 1057–1080 (2016)
6. Urieli, D., Stone, P.: Tactex'13: a champion adaptive power trading agent. In: Proceedings of the Twenty Eighth AAAI Conference on Artificial Intelligence. Association for the Advancement of Artificial Intelligence (2014)
7. Chowdhury, M.M.P., Kiekintveld, C., Son, T.C., Yeoh, W.: Bidding in periodic double auctions using heuristics and dynamic Monte Carlo tree search. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) (2018)
8. Özdemir, S., Unland, R.: AgentUDE17: a genetic algorithm to optimize the parameters of an electricity tariff in a smart grid environment. In: Advances in Practical Applications of Agents, Multi-Agent Systems, and Complexity: The PAAMS Collection, pp. 224–236 (2018)
9. Urieli, D., Stone, P.: Autonomous electricity trading using time-of-use tariffs in a competitive market. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16). Association for the Advancement of Artificial Intelligence (2016)
10. Serrano, J., González, A.Y.R., de Cote, M.: Fixed-price tariff generation using reinforcement learning. In: Fujita, K., et al. (eds.) Modern Approaches to Agent-based Complex Automated Negotiation. Studies in Computational Intelligence, vol. 674. Springer, Cham (2017)
11. Orfanoudakis, S., Kontos, S., Akasiadis, C., Chalkiadakis, G.: Aiming for half gets you to the top: winning PowerTAC 2020. In: Multi-Agent Systems, 18th European Conference, EUMAS 2021, pp. 144–159 (2021)
12. Ghosh, S., Gujar, S., Paruchuri, P., Subramanian, E., Bhat, S.: Bidding in smart grid PDAs: theory, analysis and strategy. In: Proceedings of the AAAI Conference on Artificial Intelligence AAAI'2020, pp. 1974–1981 (2020)
13. Ghosh, S., Subramanian, E., Bhat, S.P., Gujar, S., Paruchuri, P.: VidyutVanika: a reinforcement learning based broker agent for a power trading competition. In: Proceedings of the AAAI Conference on AI (2019)
14. Ghosh, S., Prakash, K., Chandlekar, S., Subramanian, E., Bhat, S., Gujar, S., Paruchuri, P.: VidyutVanika: an autonomous broker agent for smart grid environment. In: Policy, Awareness, Sustainability and Systems (PASS) Workshop (2019)
15. Chandlekar, S., Subramanian, E., Bhat, S., Paruchuri, P., Gujar, S.: Multi-unit double auctions: equilibrium analysis and bidding strategy using DDPG in smart-grids (2022)
16. Chandlekar, S., Subramanian, E., Bhat, S., Paruchuri, P., Gujar, S.: Multi-unit double auctions: equilibrium analysis and bidding strategy using DDPG in smart-grids. In: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, AAMAS '22, International Foundation for Autonomous Agents and Multiagent Systems, pp. 1569–1571 (2022)
17. Chandlekar, S., Pedasingu, B.S., Subramanian, E., Bhat, S., Paruchuri, P., Gujar, S.: Vidyutvanika21: an autonomous intelligent broker for smart-grids. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, International Joint Conferences on Artificial Intelligence Organization, pp. 158–164 (2022). Main Track
18. Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. In: International Conference on Learning Representations (ICLR) (2016)
19. Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., Riedmiller, M.A.: Deterministic policy gradient algorithms. In: ICML, JMLR Workshop and Conference Proceedings, vol. 32, pp. 387–395. JMLR.org (2014)
20. Todorov, E., Erez, T., Tassa, Y.: MuJoCo: a physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033 (2012)
21. Espié, E., Guionneau, C., Wymann, B., Dimitrakakis, C., Coulom, R., Sumner, A.: Torcs, the open racing car simulator. Software 4(6), 2 (2005)
22. Vetsikas, I., Jennings, N.: Bidding strategies for realistic multi-unit sealed-bid auctions. Auton. Agent. Multi-Agent Syst. 21, 01 (2008)
23. Shi, B., Gerding, E., Vytelingum, P., Jennings, N.: An equilibrium analysis of competing double auction marketplaces using fictitious play. In: 19th European Conference on Artificial Intelligence (ECAI) (2010)
24. Tesauro, G., Bredin, J.L.: Strategic sequential bidding in auctions using dynamic programming. In: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems: Part 2, AAMAS '02, pp. 591–598. Association for Computing Machinery, New York, NY (2002)
25. Cliff, D.: Minimal-intelligence agents for bargaining behaviors in market-based environments. Technical Report HPL-97-91, Hewlett Packard Labs (1997)
26. Chowdhury, M.M.P., Kiekintveld, C., Tran, S., Yeoh, W.: Bidding in periodic double auctions using heuristics and dynamic Monte Carlo tree search. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 166–172 (2018)
27. McKelvey, R.D., McLennan, A.M., Turocy, T.L.: Gambit: software tools for game theory, V16.0.1 (2014). www.gambit-project.org

Chapter 4

Designing Retail Electricity Tariffs Using Reinforcement Learning

Nastaran Naseri, Saber Talari, Wolfgang Ketter, and John Collins

4.1 Introduction

With the deregulation of retail electricity markets, the role of retail aggregators (called “brokers” in this work) who are in charge of electricity trading across markets and customers becomes crucial. A broker aims to increase profits through energy trading while balancing supply and demand in real time. Aggregators are challenged to reach these goals in the face of stochastic supply from weather-dependent renewable resources as well as stochastic demand from end-user consumption behavior. In addition, numerous brokers compete to be selected by customers; therefore, offering competitive tariffs has a significant impact on winning this game. To address these issues, smart grids are evolving in a way that combines technical foundations such as smart metering with new market structures to support sustainability [1]. This work investigates an approach to the design of autonomous retail broker agents operating in a simulated future market.

Tariff definition is important for brokers to maximize profit. Peak demand occurs at the time of highest electricity consumption by customers; peak loads largely determine the grid infrastructure cost. Retailers whose customers are connected to a distribution grid must pay a share of the capital and maintenance cost of the grid in proportion to the share of peak load due to their subscribed customers, which can be 40 percent or more of the total cost for electricity from a broker’s perspective [2]. Brokers can reduce this part of

N. Naseri () · S. Talari · W. Ketter Cologne Institute for Information Systems, University of Cologne, Cologne, Germany e-mail: [email protected]; [email protected]; [email protected] J. Collins College of Science and Engineering, University of Minnesota, Minneapolis, MN, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. Collins et al. (eds.), Energy Sustainability through Retail Electricity Markets, Applied Innovation and Technology Management, https://doi.org/10.1007/978-3-031-39707-3_4


their costs via exercising demand side management such as shifting consumption toward off-peak times [3]. To implement an effective approach to achieve this goal, market segmentation is applied to divide customers into those who heavily contribute to peak demands and those who do not. In this way, the heterogeneous market of consumption gets separated into two smaller, more homogeneous market segments [4]. Targeting the market segment that does not contribute to peak demands yields a more desirable customer portfolio that minimizes capacity costs. Reinforcement learning (RL) helps to adapt tariff design as the broker participates in the market [5, 6]. In this regard, the authors in [7] propose a simple MDP strategy for retail tariff design. To avoid blackouts in the electricity grid, supply and demand must always match in real time. This leads to a balancing cost, which also plays an essential role in brokers’ expenses. A proper prediction service that can forecast the demand and production precisely can reduce this cost significantly. Therefore, a sophisticated time series methodology such as seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) can be applied to treat weather data, which is the primary source of uncertainty in demand and production, as an exogenous variable. Our goal in this work is to design a strategy so that brokers can build a profitable customer portfolio. To this end, we investigate reducing the brokers’ major costs, including capacity costs and balancing costs. Thus, market segmentation of customers is applied for the proposed broker, and its tariff is adapted by using RL while participating in the market. In addition, a prediction service for load helps to reach this goal. The resulting broker addresses the existing deficits in other available literature, which we discuss in Sect. 4.5. 
The business model for our proposed broker incorporates the following features:

– It uses a Markov Decision Process (MDP) formulation that incorporates features of the consumption tariff with the goal of segmenting the market.
– It applies RL to find the best tariff by learning the market behavior while participating in the market.
– It incorporates a Time-of-Use (ToU) price structure to reflect the cost of electricity and shift customer demand toward off-peak times.
– It uses a SARIMAX time series model with autoregressive lags to predict load demand accurately, treating weather data as exogenous variables.

The proposed broker strategy is implemented in a market simulation platform called Power TAC for proof of concept; however, the methodology of the broker in this chapter holds equivalently for general cases. Hence, the designed business model for the broker can be practically applied to the future market.

The remainder of the chapter is organized as follows. Section 4.2 presents the market structure faced by the broker. The broker strategy is described in Sect. 4.3, and the results of implementation on Power TAC are discussed in Sect. 4.4. Related work is presented in Sect. 4.5, and Sect. 4.6 closes with the conclusions and future work.


4.2 Smart Electricity Market

The retail market in this work is designed so that brokers aggregate energy supply and demand to make profits. A broker could be an energy retailer, aggregator, or a commercial/municipal utility who buys and sells energy through tariff contracts with customers such as households and electric vehicle (EV) owners and also through trading in wholesale markets [8]. Brokers compete to attract customers by offering tariff contracts to the end-users. Since consumers must always have access to energy, a distribution utility (DU), known as a distribution system operator who owns network infrastructure such as cables and substations, acts as a default broker. In this case, customers who reject tariffs offered by the competing brokers can be served by the DU with a “default” tariff that is intended as a ceiling for other brokers [9].

The contract can include fixed or varying prices for consumption and production. There might be some incentives for energy conservation, bonuses for sign-up, or penalties for early withdrawal. The tariffs for customers should be designed for different types of demand, including interruptible consumption, customers with storage, retail solar producers, and general customers of different sizes and demand profiles [10]. These tariffs may contain tiered rates, ToU rates, two-part tariffs, sign-up and early-withdrawal payments, variable rates, and regulation rates for energy storage devices [11]. EV tariffs can be offered separately to limit charging during peak hours or to pay EV owners for discharging at certain hours. Variable tariffs can follow a fixed schedule or be entirely dynamic with advance notice of price changes. The maximum and minimum price, notification interval, and expected mean price are specified in a variable tariff offer [12].
Brokers should also compensate for the difference between supply and demand through two available flexibility resources: demand side management and the balancing market. In demand side management, brokers’ hourly committed purchases in the wholesale market must match the net demand (consumption minus production) of their customers [13]. A balancing market operated by the DU resolves imbalances. Brokers can bid in this market by offering the demand/supply flexibility of their customers for prices determined by the regulation rates in the subscribed tariffs or by individual balancing orders [14]. The balancing market also has access to the regulating market within the wholesale ancillary services market. In the absence of broker bids, prices are determined by applying a premium to the wholesale spot market prices, as in the Nord Pool balancing market [15]. Brokers in a tariff with regulation rates should buy up-regulation energy and sell down-regulation energy at a substantial discount. Customers with controllable loads or storage systems who want to participate in regulation should evaluate their expected costs and benefits from these regulation rates. Therefore, brokers interact with several market participants, including the distribution utility (DU), customers, other brokers, the wholesale market, and the balancing market, to maximize their profit.
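The matching requirement between committed purchases and net demand can be made concrete with a short sketch. It computes the hourly imbalance as purchases minus net demand and, purely as an illustration of the fallback pricing rule, charges shortages at the spot price plus a premium. The function names, the multiplicative form of the premium, and its default of 20% are assumptions made for this example; they are not taken from the Power TAC specification.

```python
def hourly_imbalance(purchased, consumption, production):
    """Return purchases minus net demand for each hour.

    Net demand is consumption minus production, as in the text.
    A negative value means the broker is short for that hour.
    """
    net_demand = [c - g for c, g in zip(consumption, production)]
    return [p - d for p, d in zip(purchased, net_demand)]


def shortage_cost(imbalance, spot_prices, premium=0.2):
    """Cost of covering shortages when no broker bids are available.

    Assumption (hypothetical): each short unit is charged at
    spot_price * (1 + premium). Surpluses are ignored here.
    """
    return sum(
        -x * price * (1.0 + premium)
        for x, price in zip(imbalance, spot_prices)
        if x < 0
    )
```

With purchases of 100, 80, and 120 units against consumption of 110, 90, and 100 and production of 5, 20, and 10, the broker is short by 5 units in the first hour and long in the other two, so only the first hour contributes to the shortage cost.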


4.3 Proposed Broker Strategy

In the following, we explain the design of an autonomous agent. The design of the broker includes three strategies in three markets of the simulation platform. First, we develop a SARIMAX time series model to predict customers’ demand, considering weather features as exogenous variables, and bid in the wholesale market to procure energy for customers. Second, we propose an MDP formulation to design consumption tariffs that segment the market and avoid additional capacity costs. Third, we use the MDP output to adopt a ToU pricing scheme that reduces demand during peak hours.
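To illustrate the idea of predicting load from its own history with weather as an exogenous input, the following sketch fits a deliberately reduced model: one autoregressive lag plus one exogenous variable, estimated by ordinary least squares. This is a stand-in for intuition only. It omits the seasonal, differencing, and moving-average components of a full SARIMAX model, and the function names and the choice of temperature as the weather feature are assumptions of this example rather than details of the broker described here.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * solution[k] for k in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_arx(load, temperature):
    """Least-squares fit of load[t] = a + b * load[t-1] + c * temperature[t].

    Returns the coefficients [a, b, c], obtained from the normal equations.
    """
    rows = [[1.0, load[t - 1], temperature[t]] for t in range(1, len(load))]
    targets = load[1:]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def forecast_next(coeffs, last_load, next_temperature):
    """One-step-ahead load forecast from fitted coefficients."""
    a, b, c = coeffs
    return a + b * last_load + c * next_temperature
```

In practice a library implementation of SARIMAX would replace this hand-rolled fit; the sketch only shows how a lagged load value and a weather forecast combine into a one-step prediction.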

4.3.1 Tariff Design

The decision problem of offering brokers’ consumption tariffs will be formalized as a Markov Decision Process (MDP). An MDP consists of states $S$ containing relevant information from the environment for the problem, actions $A$ that can be executed from each state, a reward function $r$ that determines the value of a state–action pair, and a discount factor $\gamma$: $\mathit{MDP} = \langle S, A, r, \gamma \rangle$ [16]. We define the state of our active consumption tariff with two features, namely the mean usage-based price status (MUBPS) and the periodic payment factor (PPF): $S = \langle \mathit{MUBPS}, \mathit{PPF} \rangle$. The mean usage-based price (MUBP) is defined as the average of all rates specified in a tariff. The $\mathit{PPF} = \{4, 6, 8, 10\}$ is the factor to multiply $\mathit{MUBP}$ with, to determine the periodic payment. For evaluating our tariff design relative to our competitors, we define the $\mathit{MUBPS} = \{\text{lowest}, \text{near}, \text{close}, \text{far}\}$ as the mean usage-based price status (MUBPS) compared to the lowest MUBP of all active consumption tariffs. Given $\mathit{MUBP}_{\min} = \min_{c \in C} \mathit{MUBP}(c)$, where $C$ is the set of current active consumption tariffs and $c_0^b$ is the current active tariff of our broker, the MUBPS is classified with

$$\mathit{MUBPS} = \begin{cases} \text{lowest} & \mathit{MUBP}(c_0^b)\ \dots \\ \text{near} & \dots \\ \text{close} & \dots \\ \text{far} & \dots \end{cases}$$
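The quantities that make up the state can be sketched in a few lines. The example below computes the MUBP as the average of a tariff's rates, derives the periodic payment as PPF times MUBP, finds the lowest MUBP among active tariffs, and enumerates the state space as all pairs of MUBPS and PPF values. The function names and the dictionary layout for tariffs are hypothetical. The thresholds that separate lowest, near, close, and far are not assumed here, so the sketch stops short of classifying a tariff into a status.

```python
# Value sets taken from the state definition in the text.
PPF_VALUES = (4, 6, 8, 10)
MUBPS_VALUES = ("lowest", "near", "close", "far")


def mean_usage_based_price(rates):
    """MUBP: the average of all rates specified in a tariff."""
    if not rates:
        raise ValueError("a tariff needs at least one rate")
    return sum(rates) / len(rates)


def periodic_payment(mubp, ppf):
    """Periodic payment: the PPF multiplied with the MUBP."""
    if ppf not in PPF_VALUES:
        raise ValueError(f"PPF must be one of {PPF_VALUES}")
    return ppf * mubp


def lowest_mubp(active_tariffs):
    """MUBP_min over all active tariffs.

    `active_tariffs` maps a (hypothetical) tariff id to its list of rates.
    """
    return min(mean_usage_based_price(rates) for rates in active_tariffs.values())


def state_space():
    """All (MUBPS, PPF) pairs, i.e. the states of the tariff MDP."""
    return [(status, ppf) for status in MUBPS_VALUES for ppf in PPF_VALUES]
```

Because each feature takes four values, the state space contains 16 states, which keeps a tabular RL approach tractable.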

Altruism: SVO° > 57.15°
Prosociality: 22.45° < SVO° < 57.15°
Individualism: −12.04° < SVO° < 22.45°
Competitiveness: SVO° < −12.04°

$$\mathrm{SVO}^{\circ} = \arctan\left(\frac{A_o - 50}{A_s - 50}\right) \tag{10.2}$$

Unlike other alternatives, the SVO° is a continuous measure of social value orientation, in which a low score indicates a competitive value orientation and a high score represents prosociality. Articles show that the SVO° is compatible with the Triple-Dominance Measure of Social Values in terms of outcomes but can be administered faster and more efficiently [50, 58]. As illustrated in Table 10.3, it is even possible to transform the ratio scores into four categorical social value orientations, including altruism, with the help of predetermined cut points.
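As a minimal sketch, the angle in Eq. (10.2) and the cut points of 57.15°, 22.45°, and −12.04° translate into code as follows. Two choices here are assumptions of the example: `atan2` is used instead of a plain arctangent of the ratio so that an allocation to self of exactly 50 does not cause a division by zero (the two agree whenever the allocation to self exceeds 50), and a score that falls exactly on a cut point is assigned to the lower category, a case the strict inequalities leave open.

```python
import math


def svo_angle(alloc_self, alloc_other):
    """SVO angle in degrees from the allocations to self (A_s) and other (A_o)."""
    return math.degrees(math.atan2(alloc_other - 50.0, alloc_self - 50.0))


def svo_category(angle):
    """Map an SVO angle in degrees to one of the four categorical orientations."""
    if angle > 57.15:
        return "altruism"
    if angle > 22.45:
        return "prosociality"
    if angle > -12.04:
        return "individualism"
    return "competitiveness"
```

For example, equal allocations of 85 to self and other give an angle of 45°, which falls in the prosocial range, whereas allocating 85 to self and 15 to the other gives −45°, which falls in the competitive range.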

10.3 Situational Moderators of Social Value Orientation Thus far, the discussion about social value orientation centered on the interplay between self and other and on the theoretical claim that such considerations of payoff allocation reflect personality differences. Largely ignored was the situational grounding of social interactions. Due to the maturity of the research involving social value orientation, many studies in the psychology domain have explored the same research question (whether social value orientation and a given situational variable x jointly influence cooperation–competition). Because of the structural similarity in research design, such studies can be subjected to meta-analysis; cf. [24]. Meta-analysis is a statistical procedure to “estimate the overall effect size in the population by combining effect sizes from different studies that test the same hypothesis” ([25], p. 121). In this section, the results from several meta-analyses on moderators of social value orientation will be summarized.
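To make the notion of combining effect sizes concrete, the sketch below pools study-level effects with inverse-variance weights, which is one common fixed-effect approach. It is an illustration only: the meta-analyses summarized in this section may rely on other models (for example, random-effects models), and the function name and any input values are hypothetical.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate and its standard error.

    Each study is weighted by the inverse of its sampling variance, so
    more precise studies contribute more to the overall effect size.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error
```

With two hypothetical studies reporting effects of 0.2 and 0.5 and variances of 0.04 and 0.01, the more precise second study receives four times the weight of the first, and the pooled estimate lies closer to 0.5 than to 0.2.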

10.3.1 The Impact of Give-Some vs. Take-Some Games Social dilemma research is rooted in game theory, an academic approach in which many different game types have been devised. Psychological research into social dilemmas and cooperation focuses particularly on experimental games that mimic the ways in which people interact with collective goods and resources. From a theoretical point of view, a distinction must be made between public goods and common-pool resources [55].

10 Incorporating Social Value Orientation


First, a public good belongs to the overall community and is shared with all. Within the social psychological literature, public roads, public parks, clean water, and air are considered prominent examples of public goods; cf. [56]. Also, charities, public broadcasting, and governments are viewed as public goods [55]. A key characteristic of a public good is that it needs replenishment when supply is diminishing. Because of this, people must invest time, effort, or money so as to keep the public good intact [55]. Accordingly, the games involving public goods are labeled “give-some” games in social dilemma research [10, 55]. Second, common-pool resources are available to all, and each member of the group or community is free to harvest from it at will. A crucial caveat is that the common resource may run dry or run out, if one/some community member(s) decide(s) to take too much from it. This is famously described as the “tragedy of the commons” [31]. Examples of common-pool resources are natural resources including oil, gas, and electricity [55]. Whereas the best solution for the individual in a commons dilemma would be to take as much from the resource as possible, the best solution for all is to use the resource with moderation only. Because of this aspect, games involving common-pool resources are labeled “take-some” games in social psychology [10, 55]. Research on social value orientation, in general, indicates that prosocials give more than individualistic people in give-some games and take less than individualists in take-some games [13]. Notwithstanding this, social dilemma researchers have argued that give-some games and take-some games trigger fundamentally different psychological responses and mechanisms. One group of authors [22] explains the difference in outcome with the help of prospect theory—the theory that showed the existence of loss aversion (“losses loom larger than gains”) [36, 37]. 
Accordingly, it is reasoned that, in a give-some game, people start with a certain amount of the public good, which they may lose during game play (i.e., a loss frame). In a take-some game, people start with nothing but eventually take some resources out of the game (i.e., a gain frame). A second group of authors argues that the impact of social value orientation is smaller in take-some games because such dilemmas are “strong” from a normative point of view. Under such circumstances, social value orientation may be overruled by a mutual (normative) understanding among the players that equality is important [23]. The results of a meta-analysis of 82 studies into the impact of social value orientation on public good and common-pool dilemmas [10] confirm the general observation that give-some games and take-some games represent different social situations. A larger effect size was revealed for social value orientation in give-some than in take-some games. Whether the underlying mechanism is best explained with the help of prospect theory or normative equality remains unclear [10].

10.3.2 The Impact of One-Shot vs. Repeated Games Another important situational variable in social dilemma research is the presence or absence of iterations in the game. Some games in economics and psychology


L. Rook et al.

are one-shot games, in which the player responds to the gamified information only once. In repeated games, the player responds to the behavior of the other player(s) in the game on multiple occasions. Well-known examples are the repeated prisoner’s dilemma, in which players recurrently decide to cooperate or compete in order to win the game [4], and the Bayesian persuasion (or repeated coordination) game, in which players respond to disruptions against the status quo under incomplete information; cf. [28, 38]. As reviewed in [10], some researchers argue that one-shot games are so poor in information that the game context forces the player to rely on spontaneous—read: dispositional—impulses. Thus, the prosocial individual would respond to the game by playing it in a prosocial manner, whereas individualistic and competitive people would play the game in individualistic or competitive manner (see also [55]). The repeated game, in contrast, allows the player to learn from the actions of the other players. Among others, the repeated game can lead to behavioral assimilation of the actions of the other players in the game (such as mutual trust) [66]. On the other hand, a considerable amount of time and energy must be invested in assessment of social dynamics surrounding the game rather than in game play itself [39]. Applied to social value orientation, this would imply that the effect of trait social value orientation is more pronounced in one-shot games than in repeated games. A meta-analysis of 82 studies was conducted to investigate the impact of one-shot vs. repeated game structure on the relationship between social value orientation and cooperation [10]. No significant effect size was found for social value orientation as a function of one-shot vs. repeated game setup. This seems to suggest that other aspects of the game design may be more important.

10.3.3 The Impact of Communication

Communication is among the oldest situational variables in social dilemma research. The bulk of research has focused on the prisoner’s dilemma, the game in which the players decide whether to cooperate (or not) with the other players based on specified payoffs to be assigned to self and other [55, 66]. From an economic (rational, utility-maximizing) point of view, it makes sense to always act upon self-interest in the game, by consistently choosing the option with the highest payoff for self [6, 10, 13, 55, 65]. Offering participants the opportunity to communicate with each other, however, tends to dramatically increase cooperative behavior (for a review, see [66]; for an early meta-analysis, see [62]). In the laboratory, a similar effect has been observed in public good games and common-pool resource dilemmas for small decision-making groups. Especially when group members are given the opportunity to pledge in favor of cooperation, communication engenders cooperation [16]. As reviewed in [8], researchers have explored the impact of the type of communication (face-to-face vs. mediated), the timing of the communication (prior to vs. during the game), and the size of the decision-making group. A meta-analysis of 45 studies revealed that the


positive effect of communication on cooperation was strongest for face-to-face (vs. mediated) communication and in larger (vs. smaller) groups. Offering players the opportunity to communicate with each other prior to or throughout the game did not have an impact [8]. It should be emphasized that the results from this meta-analysis were produced in the year 2010. At the time, the author argued that face-to-face communication was more effective than computer-mediated communication in expressing and exchanging the social norms that are dominant among the players in a decision-making group [8]. It remains to be seen to what extent this argument still holds in the social media-dominated and gamified digital landscape of today, in which digital natives communicate via likes, scores, and leaderboards [64].

10.3.4 The Impact of Gender Differences It is sometimes argued that women are more cooperative than men. However, the impact of gender differences on social value orientation remains largely unclear. One study exists in which the joint effect of gender and social value orientation was explored in the setting of honesty [30]. This study found that women on average have a higher score on social value orientation—hence, are overall more prosocial than men. Another study, in which this relationship was explored for tax compliance in five countries, however, failed to consistently replicate this effect [21]. The authors suggested that the gender–social value orientation linkage may be context-dependent rather than a universal phenomenon. Independent from the social value orientation construct, it was maintained in a meta-analysis of 272 studies [9] that differences between men and women in cooperative and competitive behavior may exist under particular circumstances. Accordingly, apart from gender differences per se, the meta-analysis evaluated various demographic and cultural variables, social dilemma game types (public good, common-pool resource dilemma, one-shot, repeated), communication (with friends or with strangers), and group characteristics (same-sex vs. mixed-sex group composition). The results showed that men and women do not differ in amount of cooperation displayed in social dilemma games. Women cooperate less than men when part of a same-sex group but cooperate more when in a mixed-sex group. In repeated games, men tend to be more cooperative than women [9].

10.3.5 The Impact of Trust Trust is an important concept in research on social dilemmas and cooperation. The concept is usually defined as “the belief that others are honest and that trusting them is (not) risky” (see [55], p. 131). Whether the act of putting faith in someone else is perceived as risky or not depends on the person’s high or low level of dispositional


trust. Along such lines, the societal benefits of having a functioning punishment–sanctioning system in place are described in the economics literature [1, 53]. A meta-analysis of 83 studies on the relationship between trust, punishment, and cooperation in society shows that the positive effect of such punishment–sanctioning systems on cooperation is most pronounced in societies characterized by high levels of trust [11]. This finding may arise because, in such societies, trust and societal norms jointly enhance cooperation. Interestingly, gossip and information exchange about someone’s reputation should be understood as alternative (informal) punishment–sanctioning systems also aimed at establishing and safeguarding cooperation [75]. An aspect of normative trust is the extent to which someone expects the other to cooperate. A higher overall expectation of cooperation, as expressed in trait social value orientation, has been shown to make people cooperate more [13]. In a recent meta-analysis of 33 studies, the processes underlying this general effect were further investigated [58]. It was found that prosocials, unlike individualistic and competitive people, more strongly expect others to cooperate. Notwithstanding this, even individualists sometimes have such expectations, namely when they are under the impression that the other players in the game will cooperate [58]. Trust and expectations, therefore, seem to provide the fuel for cooperation in social dilemma situations.

10.4 Incorporating Social Values for Cooperation in the Power TAC Environment Social value orientation theory is particularly applicable to the study of environmental sustainability. In general, social value orientation predicts human interaction with collective goods and resources [55]. Prosocial people, for instance, more positively evaluate public transportation modes than privately owned vehicles [35, 69, 70]. Studies further show that prosocial people more likely participate in pro-environmental political activities (such as signing petitions, offering financial support, joining a demonstration) than individualists and competitive people [34]. It therefore seems reasonable to assume that social value orientation can also make a contribution to the study of wicked problems in the energy domain.

10.4.1 The Power Trading Agent Competition (Power TAC) Grand societal challenges such as energy sustainability are so complex that they are labeled wicked problems—sociotechnical problems so complex on so many levels that they are (almost) impossible to solve [60]. It has been proposed to explore


wicked problems in the energy domain in interdisciplinary research endeavors [44] using competitive benchmarking: an approach to addressing real-world wicked problems that is beyond the capacity of a single discipline or research team, by developing a shared paradigm consisting of problem definitions, vocabulary, and research questions; representing it in a tangible, open simulation platform; evaluating potential solutions from a wide range of researchers in direct competition with each other; and an ongoing process that continually updates the paradigm and platform to represent updated understanding of the real-world challenges and platform performances (p. 1058).

This proposal is grounded in research on trading agent competitions that mimic the so-called smart markets [12], in which research groups with various academic backgrounds engage in programming competitions on complex societal and business problems by developing computational solutions based on artificial intelligence and machine learning; cf. [20, 40, 44, 45]. The Power Trading Agent Competition (Power TAC) is a platform infrastructure specifically developed to study wicked problems in the energy domain. In tournament mode, its players (dedicated research teams) are presented with the following scenario: a “liberalized” retail power market in a medium-sized city, in which users and small-scale producers of power may choose among a set of alternative power suppliers or brokers, represented by the competing broker agents. These choices are represented by “subscriptions” to the tariff contracts offered by the brokers. The brokers are self-asserted, autonomous agents (p. 264).

Power TAC, under those conditions, is an energy trading game that allows for competition between players over multiple rounds; cf. [40, 43, 44]. Research points to the importance of autonomous electricity broker agents (read: intelligent software agents) as intermediaries between users and suppliers of energy; cf. [43, 44]. Consistent with the explicit grounding of the trading agent competition in mainstream economics [40], the electricity brokers on the gaming platform are programmed to behave as rational agents in the strict sense of the word. They thus are selfish agents that maximize utility under risky, complex, and uncertain trading conditions; cf. [40]. These rational broker agents have proven capable of providing the stakeholders involved with highly accurate information and monetary recommendations, while at the same time balancing the electricity grid [19, 44].

10.4.2 Toward Non-competitive Broker Agents Based on Social Values

Power TAC, in its present form and shape, simulates a highly complex power grid, in which energy trading and grid balancing are claimed to arise from the behaviors of “millions of self-interested participants” [44] (p. 448). Interestingly, in research mode, Power TAC lends itself to the exploration of possible energy futures. That is, alternative energy landscapes can be modeled and assessed on


opportunities, challenges, and pitfalls before market implementation. Given the evidence in psychological science for a positive impact of social value orientation on pro-environmental human behavior, it makes sense to investigate what the energy market could look like in scenarios in which electricity consumers are supported by non-competitive broker agents grounded in social values.

In the following sections, two possible energy futures with non-competitive (prosocial) energy broker agents will be proposed. It will be assumed that a prosocial energy broker will make payoff allocations away from its own (utility-maximizing) interests. That is, the non-competitive broker agent will take an intermediate position and procure energy such that a compromise is sought between the interests of the broker and those of the other stakeholders involved. This is consistent with the way in which human participants are assessed on prosocial (vs. individualistic or competitive) value orientation in the most commonly used measurement instruments [51, 67]. It follows that the non-competitive broker agent will be less profitable than its utility-maximizing counterpart. The non-competitive broker agent will compensate for this by being more efficient in the promotion of trust between interaction partners. As a consequence, the non-competitive energy broker agent will likely establish structural cooperation between the multiple players in a simulated energy market.

Prosocial Local Energy Cooperatives

First, it makes sense to explore a possible energy future in which local renewable energy cooperatives will gain prominence. Renewable energy cooperatives are legal entities that consist of small groups of “prosumers”, i.e., private individuals who not only consume but also produce power generated from renewable energy sources. They hold great promise for a renewable energy future [41].
Local energy cooperatives self-organize around a shared residential microgrid, which serves as a classic common-pool resource: in principle, individual members benefit equally from the electricity generated by all [26, 32]. The major challenge of these small-scale legal energy entities is to ensure that all members at all times take some but not all from the residential microgrid (a take-some dilemma). Prolonged individualistic behavior by one or some members (to take much more electricity than necessary) typically destabilizes the microgrid and tends to lead to legal disputes that put the future of the cooperative in danger [59]. A competitive (utility-maximizing) broker agent will generate immediate profit for the cooperative. However, this monetary success may serve as a cataclysm for the cooperative, bringing individual(istic) members into temptation to use more electricity during peak periods, at the expense of the cooperative. Interestingly, a non-competitive energy broker can be programmed to yield smaller revenues for the cooperative, but over a longer time span. Not only will this reduce the likelihood that immediate temptation to take something extra from the common pool arises in individual members (social conflict); it will also ensure that the local renewable energy cooperative remains up and running for a longer time, as it provides prolonged temporal opportunity to keep on cooperating for all.

What this non-competitive broker agent essentially does, from a psychological point of view, is to remove grounds for social conflict by providing automated incentives to cooperate over a longer time horizon. These incentives can be interpreted by individual participants as signals of trustworthiness. Also, by providing sustained temporal opportunity, the broker agent reduces the possible tension arising between individual members of the cooperative with regard to seeking immediate vs. delayed gratification. These contextual factors have proven of paramount importance in the establishment of cooperation in small groups. Research shows that prosocial people in a group context keep on cooperating with others as long as they seem trustworthy, whereas individualists and competitors remain motivated to cooperate as long as it seems to pay off [13, 55, 58]. The non-competitive broker agent balances and satisfies the needs of both groups.

Prosocial Industrial Energy Clusters

Second, it makes sense to explore a possible energy future in which industrial energy clusters play a central role. Industrial energy clusters are large partnerships between companies within a specific region with the aim of pooling energy and other resources. They consist of professional organizations in possession of their own (renewable energy) power plant, legally paired with other corporate stakeholders in direct geographical proximity [32]. Industrial renewable energy clusters currently do not exist, but they may have several advantages. The inter-organizational partnership is assumed to reduce the threshold for businesses to rely on (self-generated) renewable energy sources, to enable energy balancing at lower costs due to the pooling of resources, and to lower energy costs for all involved [73]. Documented disadvantages include the lack of a shared identity (owning a power plant, being geographically close to other participants, and willingness to give it a try may suffice for an invitation) [32].
Also, the low intrinsic motivation to participate is a threat to cluster longevity; that is, industrial energy clusters should be considered strategic alliances between pragmatic business partners that wish to cut costs [32]. Industrial energy clusters can be converted into centralized energy platforms supported by intelligent broker agents. Because the entities involved may treat the partnership as an interesting business proposition (a profitable way to gain access to energy at lower costs), it makes sense to pair the industry cluster with a rational, competitive broker agent from the Power TAC tournament repository. Receiving monetary incentives in return for participation will, in that case, surely meet a corporation’s strategic goals. Interestingly, however, the very act of joining an industrial energy cluster is thought to elevate the pro-environmental commitment of the partners involved [73]. If this is true, the prosocial broker agent introduced above for local cooperatives may also become successful. As before, the broker agent will facilitate cooperation within the cluster by preventing the occurrence of social conflict stemming from greed and by keeping business partners committed to the cause of environmental sustainability. Arguably, an industrial energy cluster based on strategic alliance contains more individualists and competitors than an idealistic cooperative run by like-minded pro-environmentalists. To make the non-competitive broker agent a success, actions should therefore also be taken beyond the Power TAC environment. The following psychological principles can be applied in a real industrial energy cluster:

First, the value-based broker agent can be given formal status as an electronic gatekeeper. Human gatekeepers are capable of regulating social interaction and conflict in many social dilemma situations [55]. The intelligent broker agent may play a similar role in the energy partnership, especially when this role is formally recognized by all. Second, the broker agent can be programmed to automatically share trading, balancing, sustainability, and profitability statistics with all stakeholders in the cluster. Transparent communication tends to bolster cooperation, even among people low on trust, as long as individual payoffs are clear [54]. Third, trust and increased cooperation become more likely when the broker agent circulates reports that clearly express the generosity of its trading activities toward all members involved. Generosity is a proven signal of trustworthiness and a great contributor to (inter)group cooperation [55]. In this way, the broker can strengthen cooperation within the energy cluster. Finally, the non-competitive broker agent may be contractually appointed to act on behalf of the industry cluster for several years. Individualistic and competitive people are more willing to cooperate with other group members if they know that they are supposed to do so for a considerable amount of time [68]. In an (inter-)organizational setting, these interventions can soften individualistic and competitive behaviors among the stakeholders involved and make the non-competitive broker agent a success. In modified form, some of these techniques may even feed back into the Power TAC environment and serve as input to explore possible energy futures for industrial renewable energy clusters.

10.4.3 Addressing Grand Societal Challenges with Social Values As illustrated for Power TAC, there is a need and an opportunity to explore micro-level variables relating to social values for cooperation in the study of energy sustainability. This observation resonates with a recently proposed research agenda for the study of grand societal challenges with the help of analytics [42] that builds on the Environmental, Social, and Governance (ESG) framework [74]. In this ESG–ICT framework, the environmental factor describes issues relating to environmental sustainability, the governance factor covers regulatory and policy-related aspects, while the social factor captures psychological themes such as Individual well-being, Community welfare, and economic resilience through Technology (ICT) [42]. Non-competitive energy broker agents grounded in social values clearly belong to the technology-enabled social category of this theoretical framework. As such, they will likely offer fertile ground for future research into cooperation-based energy sustainability. Beyond the Power TAC environment and infrastructure, it remains an open question how individual energy consumers, cooperatives, and strategic alliances will interact with a prosocial autonomous broker agent under natural circumstances.

Research on the serial balancing of supply and demand with experimentally controlled software agents shows that people respond differently to an individualistic–competitive vs. a prosocial software agent. That is, the judgments and forecasting accuracy of human decision-makers are influenced by the cooperative–competitive nature of the software agent they are interacting with [57]. Related to this, the trend in recommender systems research is to tailor recommendations to unique user characteristics and specific circumstances [61, 76]. It follows that some (but not all) energy consumers, cooperatives, and industry clusters will benefit from interaction with a non-competitive electricity broker. Therefore, simulation studies in the Power TAC environment should ideally be complemented with behavioral laboratory studies on (the impact of) human–broker interaction on pro-environmental behavior. The first user studies along those lines are currently emerging [27], suggesting that human–broker interaction will likely become an important avenue of future research in energy informatics.

10.5 Conclusion This book chapter discussed the importance of incorporating social values for cooperation in research on the trading and balancing of energy. A review was provided of the psychological research on cooperation–competition and of social dilemmas arising from human interaction with public goods. The construct of social value orientation was presented as a personality trait that taps the extent to which a person differs in individualistic, competitive, or prosocial focus in social interaction with other people. The two most prominent measurement instruments for social value orientation as well as important situational moderators were discussed. These insights served as input for the description of a non-competitive broker agent based on social values. Illustrations within and beyond the Power TAC environment were provided as to how prosocial broker agents can be utilized in the study of future energy initiatives grounded in cooperation.

References
1. Ahn, T.-K., Ostrom, E., Schmidt, D., Shupp, R., Walker, J.: Cooperation in PD games: fear, greed, and history of play. Public Choice 106(1), 137–155 (2001)
2. Andersen, F.M., Baldini, M., Hansen, L.G., Jensen, C.L.: Households’ hourly electricity consumption and peak demand in Denmark. Appl. Energy 208, 607–619 (2017)
3. Andersson, K.P., Ostrom, E.: Analyzing decentralized resource regimes from a polycentric perspective. Policy Sci. 41(1), 71–93 (2008)
4. Andreoni, J., Miller, J.H.: Rational cooperation in the finitely repeated prisoner’s dilemma: experimental evidence. Econ. J. 103(418), 570–585 (1993)
5. Ariely, D., Bracha, A., Meier, S.: Doing good or doing well? Image motivation and monetary incentives in behaving prosocially. Am. Econ. Rev. 99(1), 544–555 (2009)
6. Au, W.T., Kwong, J.Y.Y.: Measurements and effects of social value orientation in social dilemmas: a review. In: Suleiman, R., Budescu, D., Fischer, I., Messick, D. (eds.) Contemporary Psychological Research on Social Dilemmas, pp. 71–98. Cambridge University Press, Cambridge (2004)
7. Azarova, V., Cohen, J.J., Kollmann, A., Reichl, J.: Reducing household electricity consumption during evening peak demand times: evidence from a field experiment. Energy Policy 144, 111657 (2020)
8. Balliet, D.: Communication and cooperation in social dilemmas: a meta-analytic review. J. Conflict Res. 54(1), 39–57 (2010)
9. Balliet, D., Li, N.P., Macfarlan, S.J., Van Vugt, M.: Sex differences in cooperation: a meta-analytic review of social dilemmas. Psychol. Bull. 137(6), 881–909 (2011)
10. Balliet, D., Parks, C., Joireman, J.: Social value orientation and cooperation in social dilemmas: a meta-analysis. Group Processes Intergroup Relations 12(4), 533–547 (2009)
11. Balliet, D., Van Lange, P.A.M.: Trust, punishment, and cooperation across 18 societies: a meta-analysis. Perspectives Psychol. Sci. 8(4), 363–379 (2013)
12. Bichler, M., Gupta, A., Ketter, W.: Designing smart markets. Inf. Syst. Res. 21(4), 688–699 (2010)
13. Bogaert, S., Boone, C., Declerck, C.: Social value orientation and cooperation in social dilemmas: a review and conceptual model. British J. Soc. Psychol. 47(3), 453–480 (2008)
14. Brief, A.P., Aldag, R.J.: The intrinsic-extrinsic dichotomy: toward conceptual clarity. Acad. Manag. Rev. 2(3), 496–500 (1977)
15. Cheek, N.N., Ward, A.: When choice is a double-edged sword: understanding maximizers’ paradoxical experiences with choice. Personality Individual Differences 143, 55–61 (2019)
16. Chen, X.-P.: The group-based binding pledge as a solution to public goods problems. Organ. Behav. Hum. Decis. Process. 66(2), 192–202 (1996)
17. Coba, L., Rook, L., Zanker, M., Symeonidis, P.: Decision making strategies differ in the presence of collaborative explanations: two conjoint studies. In: Proceedings of the 24th International Conference on Intelligent User Interfaces, pp. 291–302. ACM (2019)
18. Coba, L., Rook, L., Zanker, M., Symeonidis, P.: Choosing between hotels: impact of bimodal rating summary statistics and maximizing behavioral tendency. Inf. Technol. Tour. 22(1), 167–186 (2020)
19. Collins, J., Ketter, W., Gini, M.: Flexible decision support in dynamic inter-organisational networks. Eur. J. Inf. Syst. 19(4), 436–448 (2010)
20. Collins, J., Ketter, W., Sadeh, N.: Pushing the limits of rational agents: the trading agent competition for supply chain management. AI Mag. 31(2), 63–63 (2010)
21. D’Attoma, J.W., Volintiru, C., Malézieux, A.: Gender, social value orientation, and tax compliance. CESifo Econ. Stud. 66(3), 265–284 (2020)
22. De Dreu, C.K.W., McCusker, C.: Gain–loss frames and cooperation in two-person social dilemmas: a transformational analysis. J. Personality Soc. Psychol. 72(5), 1093–1106 (1997)
23. De Kwaadsteniet, E.W., Van Dijk, E., Wit, A., De Cremer, D.: Social dilemmas as strong versus weak situations: social value orientations and tacit coordination under resource size uncertainty. J. Exp. Soc. Psychol. 42(4), 509–516 (2006)
24. Field, A.P., Gillett, R.: How to do a meta-analysis. British J. Math. Stat. Psychol. 63(3), 665–694 (2010)
25. Field, A.: Discovering Statistics Using IBM SPSS Statistics. Sage, London (2018)
26. Fridgen, G., Kahlen, M., Ketter, W., Rieger, A., Thimmel, M.: One rate does not fit all: an empirical analysis of electricity tariffs for residential microgrids. Appl. Energy 210, 800–814 (2018)
27. Fügener, A., Grahl, J., Gupta, A., Ketter, W.: Will humans-in-the-loop become Borgs? Merits and pitfalls of working with AI. Manag. Inf. Syst. Q. 45(3), 1527–1556 (2021)
28. Goldstein, I., Huang, C.: Bayesian persuasion in coordination games. Am. Econ. Rev. 106(5), 592–596 (2016)
29. Gottwalt, S., Ketter, W., Block, C., Collins, J., Weinhardt, C.: Demand side management – a simulation of household behavior under variable prices. Energy Policy 39(12), 8163–8174 (2011)
30. Grosch, K., Rau, H.A.: Gender differences in honesty: the role of social value orientation. J. Econ. Psychol. 62, 258–267 (2017)
31. Hardin, G.: The tragedy of the commons. Science 162(3859), 1243–1248 (1968)
32. Hentschel, M., Ketter, W., Collins, J.: Renewable energy cooperatives: facilitating the energy transition at the Port of Rotterdam. Energy Policy 121, 61–69 (2018)
33. Iyengar, S.S., Wells, R.E., Schwartz, B.: Doing better but feeling worse: looking for the “best” job undermines satisfaction. Psychol. Sci. 17(2), 143–150 (2006)
34. Joireman, J.A., Lasane, T.P., Bennett, J., Richards, D., Solaimani, S.: Integrating social value orientation and the consideration of future consequences within the extended norm activation model of proenvironmental behaviour. British J. Soc. Psychol. 40(1), 133–155 (2001)
35. Joireman, J.A., Van Lange, P.A.M., Van Vugt, M.: Who cares about the environmental impact of cars? Those with an eye toward the future. Environ. Behav. 36(2), 187–206 (2004)
36. Kahneman, D., Tversky, A.: Prospect theory: an analysis of decision under risk. Econometrica 47, 263–291 (1979)
37. Kahneman, D., Tversky, A.: Choice, values and frames. Am. Psychol. 39(4), 341–350 (1984)
38. Kamenica, E., Gentzkow, M.: Bayesian persuasion. Am. Econ. Rev. 101(6), 2590–2615 (2011)
39. Kelley, H.H., Thibaut, J.W.: Interpersonal Relations: A Theory of Interdependence. Wiley, New York (1978)
40. Ketter, W., Collins, J., Reddy, P.: Power TAC: a competitive economic simulation of the smart grid. Energy Econ. 39, 262–270 (2013)
41. Ketter, W., Collins, J., Saar-Tsechansky, M., Marom, O.: Information systems for a smart electricity grid: emerging challenges and opportunities. ACM Trans. Manag. Inf. Syst. (TMIS) 9(3), 1–22 (2018)
42. Ketter, W., Padmanabhan, B., Pant, G., Raghu, T.S.: Addressing societal challenges through analytics: an ESG ICE framework and research agenda. J. Assoc. Inf. Syst. 21(5), 1115–1127 (2020)
43. Ketter, W., Peters, M., Collins, J., Gupta, A.: A multiagent competitive gaming platform to address societal challenges. MIS Q. 40(2), 447–460 (2016)
44. Ketter, W., Peters, M., Collins, J., Gupta, A.: Competitive benchmarking: an IS research approach to address wicked problems with big data and analytics. MIS Q. 40(4), 1057–1080 (2016)
45. Ketter, W., Symeonidis, A.: Competitive benchmarking: lessons learned from the trading agent competition. AI Mag. 33(2), 103–107 (2012)
46. Lewin, K.: Field Theory in Social Science: Selected Theoretical Papers. Harper, New York
47. McClintock, C.G.: Social values: their definition, measurement and development. J. Res. Dev. Educ. 12, 122–137 (1978)
48. McClintock, C.G., Allison, S.T.: Social value orientation and helping behavior. J. Appl. Soc. Psychol. 19(4), 353–362 (1989)
49. Messick, D.M., McClintock, C.G.: Motivational bases of choice in experimental games. J. Exp. Soc. Psychol. 4(1), 1–25 (1968)
50. Murphy, R.O., Ackermann, K.A.: Social value orientation: theoretical and measurement issues in the study of social preferences. Personality Soc. Psychol. Rev. 18(1), 13–41 (2014)
51. Murphy, R.O., Ackermann, K.A., Handgraaf, M.: Measuring social value orientation. Judgment Decis. Making 6(8), 771–781 (2011)
52. Nauta, A., De Dreu, C.K.W., Van Der Vaart, T.: Social value orientation, organizational goal concerns and interdepartmental problem-solving behavior. J. Organ. Behav. 23(2), 199–213 (2002)
53. Ostrom, E., Ahn, T.-K.: The meaning of social capital and its link to collective action. In: Castiglione, D., Van Deth, J.W., Wolleb, G. (eds.) Handbook of Social Capital: The Troika of Sociology, Political Science and Economics, vol. 17, pp. 17–35. Edward Elgar, Cheltenham (2009)
54. Parks, C.D., Henager, R.F., Scamahorn, S.D.: Trust and reactions to messages of intent in social dilemmas. J. Conflict Resolution 40(1), 134–151 (1996)
55. Parks, C.D., Joireman, J., Van Lange, P.A.M.: Cooperation, trust, and antagonism: how public goods are promoted. Psychol. Sci. Pub. Interest 14(3), 119–165 (2013)
56. Peck, J., Kirk, C.P., Luangrath, A.W., Shu, S.B.: Caring for the commons: using psychological ownership to enhance stewardship behavior for public goods. J. Mark. 85(2), 33–49 (2021)
57. Pennings, C.L.P., Van Dalen, J., Rook, L.: Coordinating judgmental forecasting: coping with intentional biases. Omega 87, 46–56 (2019)
58. Pletzer, J.L., Balliet, D., Joireman, J., Kuhlman, D.M., Voelpel, S.C., Van Lange, P.A.M.: Social value orientation, expectations, and cooperation in social dilemmas: a meta-analysis. Eur. J. Person. 32(1), 62–83 (2018)
59. Rieger, A., Thummert, R., Fridgen, G., Kahlen, M., Ketter, W.: Estimating the benefits of cooperation in a residential microgrid: a data-driven approach. Appl. Energy 180, 130–141 (2016)
60. Rittel, H.W.J., Webber, M.M.: Dilemmas in a general theory of planning. Policy Sci. 4(2), 155–169 (1973)
61. Rook, L., Sabic, A., Zanker, M.: Engagement in proactive recommendations. J. Intell. Inf. Syst. 54(1), 79–100 (2020)
62. Sally, D.: Conversation and cooperation in social dilemmas: a meta-analysis of experiments from 1958 to 1992. Ration. Soc. 7(1), 58–92 (1995)
63. Schwartz, D., Keenan, E.A., Imas, A., Gneezy, A.: Opting-in to prosocial incentives. Organ. Behav. Hum. Decis. Process. 163, 132–141 (2021)
64. Tobon, S., Ruiz-Alba, J.L., García-Madariaga, J.: Gamification and online consumer decisions: is the game over? Decis. Support Syst. 128, 113167 (2020)
65. Van Lange, P.A.M., De Cremer, D., Van Dijk, E., Van Vugt, M.: Self-interest and beyond: basic principles of social interaction. In: Kruglanski, A.W., Higgins, E.T. (eds.) Social Psychology: Handbook of Basic Principles, pp. 540–561. Guilford, New York (2007)
66. Van Lange, P.A.M., Joireman, J., Parks, C.D., Van Dijk, E.: The psychology of social dilemmas: a review. Organ. Behav. Hum. Decis. Process. 120(2), 125–141 (2013)
67. Van Lange, P.A.M., De Bruin, E.M.N., Otten, W., Joireman, J.: Development of prosocial, individualistic, and competitive orientations: theory and preliminary evidence. J. Person. Soc. Psychol. 73(4), 733–746 (1997)
68. Van Lange, P.A.M., Klapwijk, A., Van Munster, L.M.: How the shadow of the future might promote cooperation. Group Process. Intergroup Relat. 14(6), 857–870 (2011)
69. Van Vugt, M., Meertens, R.M., Van Lange, P.A.M.: Car versus public transportation? The role of social value orientations in a real-life social dilemma. J. Appl. Soc. Psychol. 25(3), 258–278 (1995)
70. Van Vugt, M., Van Lange, P.A.M., Meertens, R.M.: Commuting by car or public transportation? A social dilemma analysis of travel mode judgements. Eur. J. Soc. Psychol. 26(3), 373–395 (1996)
71. Verhagen, E., Ketter, W., Rook, L., van Dalen, J.: The impact of framing on consumer selection of energy tariffs. In: 1st IEEE International Conference on Smart Grid Technology, Economics and Policies (SG-TEP), pp. 1–5. IEEE (2012)
72. Von Neumann, J., Morgenstern, O.: Theory of Games and Economic Behavior. Princeton University Press, Princeton (1947)
73. Walker, G.: What are the barriers and incentives for community-owned means of energy production and use? Energy Policy 36(12), 4401–4405 (2008)
74. Widyawati, L.: A systematic literature review of socially responsible investment and environmental social governance metrics. Bus. Strategy Environ. 29(2), 619–637 (2020)
75. Wu, J., Balliet, D., Van Lange, P.A.M.: Gossip versus punishment: the efficiency of reputation to promote and maintain cooperation. Sci. Rep. 6(1), 1–8 (2016)
76. Zanker, M., Rook, L., Jannach, D.: Measuring the impact of online personalisation: past, present and future. Int. J. Hum.-Comput. Stud. 131, 160–168 (2019)

Chapter 11

Smart Market-Driven Virtual Power Plants of Shared Electric Vehicles
Micha Kahlen, Karsten Schroer, Wolfgang Ketter, and Alok Gupta

M. Kahlen: Rotterdam School of Management, Erasmus University, Rotterdam, Netherlands
K. Schroer: Faculty of Management, Economics and Social Sciences, University of Cologne, Cologne, Germany
W. Ketter: Faculty of Management, Economics and Social Sciences, University of Cologne, Cologne, Germany; Rotterdam School of Management, Erasmus University, Rotterdam, Netherlands
A. Gupta: Carlson School of Management, University of Minnesota, Minneapolis, MN, USA
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. J. Collins et al. (eds.), Energy Sustainability through Retail Electricity Markets, Applied Innovation and Technology Management, https://doi.org/10.1007/978-3-031-39707-3_11

11.1 Introduction Carsharing is becoming an increasingly popular business model [18]. Unlike traditional car rentals, where renters keep possession of the car even during the long unused hours, the idea of carsharing is to rent the cars for short, one-way trips. Many carsharing companies use electric vehicles (EVs) in their fleets. Clearly, while the cars are parked, they are unproductive, yet incur investment and maintenance costs. While these inefficiencies are unavoidable for traditional fossil-fuel-based cars, fleets with EVs can generate additional revenues by using the EVs as a virtual power plant (VPP), a collection of distributed power sources that are centrally coordinated by an information system (IS) to offset energy imbalances [26]. Fleets of EVs offer an interesting and novel case of multi-product resources, where vehicles can be offered on both the electricity and the mobility markets. In this chapter, we develop and validate a novel generalized model for managing and allocating such multi-product resources across multiple markets. In doing so, our framework offers utilization and profitability benefits for resource operators. Specifically, we develop and validate a method to increase the utilization and, hence, the profits of electric vehicle (EV) carsharing fleets. Our approach allows EVs to charge when there is a surplus of electricity and to discharge to the grid (vehicle-to-grid, V2G) when there is a shortage of electricity. We develop a computational control mechanism for the VPP to decide how the EVs should be allocated over time in terms of charging, discharging, or being available for rental customers. Based on a discrete event simulation model that is calibrated using real data on the availability of vehicles and their movements in three distinct locations, supplemented with data on electricity prices on reserve markets, we show what extra profits can be generated from EVs in addition to the rental business by participating in these electricity auction markets. In essence, we develop smart markets [6, 10, 23] to enable energy sources to be deployed cost-effectively.

Fleets can offer their storage on short-term electricity markets. They can choose between the day-ahead, intraday, and operating reserve markets based on the largest price differentials. In this research, we analyze operation in the control reserve market, which acquires back-up power sources that can consume electricity from or dispatch electricity to the grid within seconds. These back-up sources guarantee an alternative source of power when another power source produces more or less than had been promised (for example, due to technical defects or weather-related issues). The fact that EV batteries can be charged and discharged flexibly makes them very suitable for offering control reserve power at economic rates.
To operationalize our model, we compute trading prices (bids and asks) for every 15-minute time interval of a given week to charge and discharge EVs, and these prices are offset against the opportunity costs of not renting out the EV. We have developed an intelligent software agent [8], called FleetPower, that combines information on the location, the state of charge of the battery (SoC), historical rental transaction data, and historical prices on the control reserve market and uses this information to allocate EVs optimally to either the rental market or a VPP (charging or discharging). The agent considers both the profits from charging and discharging and the effect that withdrawing vehicles from rental might have on the mobility of those wanting to hire cars (sociotechnical implications). This means that our model makes an explicit trade-off between the asymmetric benefits to be gained from either offering cars for rental or using them to balance the grid in real time. To generate profits for the fleet, the system optimizes the allocation of EVs either to those needing to rent an EV (the social part) or to the balancing market that purchases services to balance the grid (the technical part). The focus is on ensuring that the availability of EVs for rental is not compromised, as the opportunity cost of losing a rental customer is very high. While the profits from renting out a vehicle are on average around $15 per transaction, profits from charging and discharging are only a few cents per 15 minutes. As a consequence, FleetPower offers the VPP capabilities conservatively. We use real data from EV carsharing fleets and control reserve markets to test our strategies via a simulation platform that allows us to calibrate the trading prices. The actual data on electric vehicle carsharing fleets were provided by Daimler’s subsidiary, Car2Go. Car2Go is a city carsharing service where customers can rent cars, which they pay for on a per-minute basis. To use the service, users have to register and pay a one-time registration fee. Once registered, they can make use of the free-float service that enables them to pick up and drop off vehicles anywhere within the city boundaries, not necessarily in the same location. The service is available in 30 major cities in North America and Europe, with a total of more than 13,000 Smart ForTwo vehicles. We tracked the location, state of charge, and transactions of 300 EVs in San Diego for 14 months. This city had 100 charging stations at the time. Users get 10 minutes’ free driving if the state of charge is below 20% when they return the EV to the charging station. We also use the prices of the control reserves from the transmission system operator, California ISO.

We show that carsharing fleets can increase their gross profits without compromising the mobility of rental customers. Profitability depends on whether there is an appropriate charging infrastructure and on the level of market demand for the control reserves that are available. We find that the largest profit increases for the VPP come from payments made for charging EVs when there is surplus energy that needs to be removed from the grid. Our data show that the market rarely uses the EV batteries to cover electricity shortages, due to the high cost of batteries. However, both discharging and charging the EVs contribute to the bottom line of carsharing fleets. Even with the relatively low penetration of these vehicles at present, one benefit of the VPP that can be realized immediately is that carsharing fleets do not have to pay anything to fuel their EVs. In the future, these cost reductions may be even more substantial as the demand for back-up power increases, due to the growing adoption of volatile, sustainable energy sources [1].
Our research derives synergies by using EVs in the control reserve market instead of conventional, fossil-fuel-based means of peak-balancing, thereby creating greater efficiencies across the system. Renewable energy sources are weather-dependent, and their production is difficult to forecast. Large-scale penetration of volatile energy sources poses a challenge to the stability of the grid [15]. The grid is the backbone of a highly perishable electricity supply chain, where supply and demand have to be in balance at all times. With the phasing out of power plants based on fossil fuels and a growing number of renewable energy sources, balancing the grid becomes increasingly difficult. In practice, this means that the chance of blackouts increases, with potentially disastrous consequences. Our research provides insights into how electric vehicles (among other energy sources) can be used to mitigate the instability of electric grids caused by the increasing amount of renewable resources. A key contribution of this research is that it increases the supply in the operating reserve markets and therefore decreases the market price. Mak et al. [21] and Avci et al. [3], who have been studying the location of battery-switching stations, consider future work in the area of intelligent charging of EV batteries a crucial step in moving toward a sustainable economy. In the IS research community, [33] recognize the societal importance of the issue addressed in this chapter by formulating an energy informatics framework with the aim of creating an ecologically sustainable society. Their framework formulates the need for IS research to take on the role of managing supply and demand in an energy-efficient way, and we show that we can do this by using EVs as mobile energy resources that are coupled by an information network that monitors the location, charging status, rental demand, and electricity supply and demand to create a real-time decision framework for optimizing resource utilization and profits. While forward-looking, this topic is already receiving attention from the automotive industry. For example, Tesla Motors cars have a function to charge at cheaper night-time tariffs. With the Internet-of-Things, the framework could be used not only in cars but also in individual appliances and devices [25]. An example is Google, which acquired Nest Labs with its programmable smart thermostats that produce energy savings of between 10% and 15% [24]. Creating appropriate IT infrastructure is central to the coordination mechanism for the current industrial examples as well as for the mechanism we propose. To the best of our knowledge, this is the first study that uses real driving, charging, and locational data from more than 300 EVs. Another key contribution of our research (from the perspective of EV balancing research) is that we assume that driving patterns are unknown a priori; this is a key distinction in EV balancing research, as previous work in this area by [32] and [30] was done using stationary batteries and EV fleets with known driving schedules, respectively.

This chapter is structured as follows: We first develop a general model. In the Background and Related Literature Section, we discuss the environment and knowledge base of this chapter. Next, we give an overview of the data that is used to calibrate our model. This entails vehicle location, usage, and transaction data to calculate probabilities of rentals. Afterward, we develop the bidding strategy, a design science artifact, in the Model Description Section.
The challenge with this bidding strategy is that rentals are much more economically beneficial than using the battery for electricity storage, but it is wasteful if vehicles are idle. When building simulations to replicate real-world phenomena, one has to build a model that captures the essential characteristics of the environment without overfitting the data, so that it is generalizable and applicable to other data. We describe the calibration of this simulation with bid and ask prices from a real electricity market in the Evidence from a Real-world Setting Section. Subsequently, we thoroughly evaluate and reflect on the bidding strategy artifact in the Analysis and Discussion Section. Finally, we conclude the chapter in the Conclusions Section.
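The trade-off between rentals and electricity storage described in this introduction can be illustrated with a minimal sketch. It is not the FleetPower agent itself: the class `IntervalEstimate`, the function `allocate`, and the decision rule (commit an EV to the VPP only when the expected VPP profit exceeds the expected rental profit forgone) are assumptions made for illustration. The $15 rental profit comes from the text; the 5 cents of VPP profit per interval is a placeholder for "a few cents".

```python
from dataclasses import dataclass


@dataclass
class IntervalEstimate:
    """Hypothetical inputs for one parked EV in one 15-minute interval."""
    rental_probability: float  # estimated chance that the EV would be rented
    rental_profit: float       # profit if a rental occurs (USD)
    vpp_profit: float          # expected profit from charging/discharging (USD)


def allocate(est: IntervalEstimate) -> str:
    """Return 'vpp' if the VPP profit beats the expected rental loss, else 'rental'."""
    # Opportunity cost of withdrawing the EV from the rental market.
    opportunity_cost = est.rental_probability * est.rental_profit
    return "vpp" if est.vpp_profit > opportunity_cost else "rental"


# Illustrative values only: $15 per rental, 5 cents of VPP profit per interval.
busy = IntervalEstimate(rental_probability=0.01, rental_profit=15.0, vpp_profit=0.05)
quiet = IntervalEstimate(rental_probability=0.001, rental_profit=15.0, vpp_profit=0.05)
print(allocate(busy))   # rental
print(allocate(quiet))  # vpp
```

With these numbers, the break-even rental probability is 0.05 / 15, roughly 0.3% per interval, which illustrates why VPP capacity would be offered conservatively under such a rule.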

11.2 Background and Related Literature This section summarizes relevant, previous research and outlines the general setting of balancing renewable energy sources. First, we describe the electricity market in detail and explain how the trading prices are computed. Subsequently, we will position our work within the information systems literature on EVs, the carsharing context, and sustainability in general.

11.2.1 Balancing the Electrical Grid: Control Reserve Market Electricity is sold on day-ahead and intraday markets as unit commitments hours before it is physically generated. However, when a source cannot meet its commitment (for example, due to technical problems or weather-related issues), control reserve markets (known in the USA as the real-time market) guarantee immediate replacement. These reserve markets require extremely fast reaction times, called ramping rates, from participating generators. For an overview of these markets and how they differ in their ramping rates, see Fig. 11.1. EVs possess large electrical batteries whose energy is almost instantly accessible without ramping cost, making them very suitable for reserve purposes. The present study focuses on the secondary control reserve market, with a required ramp rate of 30 seconds [12]. We focus on this market because the energy prices are higher than those in markets that allow for a slightly longer ramp-up time. From this point on, when we refer to energy markets, we mean the secondary control reserve market. In the control reserve market, power plants are paid to be on standby so that they can produce (or consume) electricity when needed. The market is coordinated by electronic auctions, in which participants make asks or issue bids. The clearing mechanism is a multi-unit, first-price, sealed-bid auction, which is settled on a “pay-as-bid” basis [12]. “Asks” refer to the generation of electricity at short notice (upregulation), while “bids” relate to the consumption of electricity, also at short notice (downregulation). Asks to generate electricity and bids to consume electricity state the price for which they would either generate or consume electricity and the maximum quantity they could generate or consume a week in advance.
The transmission system operator settles these asks and bids as needed 30 seconds before delivery in merit order (the cheapest resources are used first). We assume bids can be placed separately for each 15-minute time interval as recommended by the grid operators [1].
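The merit-order, pay-as-bid settlement described above can be illustrated with a minimal sketch. The `Offer` class, the participant names, and all prices and quantities are hypothetical and chosen only for illustration; the real market also involves a capacity price and minimum lot sizes, which are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    owner: str       # hypothetical participant label
    price: float     # offered energy price in $/kWh
    quantity: float  # maximum quantity in kWh


def settle_pay_as_bid(asks, demand_kwh):
    """Accept the cheapest asks first until demand_kwh is covered.

    Returns a list of (owner, accepted_kwh, payment) tuples. Under
    pay-as-bid, each accepted ask is paid its own offered price.
    """
    accepted = []
    remaining = demand_kwh
    for ask in sorted(asks, key=lambda a: a.price):  # merit order
        if remaining <= 0:
            break
        take = min(ask.quantity, remaining)
        accepted.append((ask.owner, take, take * ask.price))
        remaining -= take
    return accepted


# Illustrative asks only; none of these figures come from the chapter.
asks = [
    Offer("gas_plant", 0.12, 500.0),
    Offer("ev_fleet", 0.05, 8.8),
    Offer("hydro", 0.07, 200.0),
]
result = settle_pay_as_bid(asks, demand_kwh=100.0)
for owner, kwh, payment in result:
    print(owner, round(kwh, 2), round(payment, 3))
```

In this sketch the cheapest ask is exhausted first and the next cheapest covers the remainder, so the most expensive ask is never called.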

Fig. 11.1 The reaction time, which is the time between notification and generation, differs according to the market. With no ramp-up time, EVs are able to comply with the short reaction times on the primary and secondary reserve markets


M. Kahlen et al.

Fig. 11.2 Solar panel electricity output illustrating the erratic behavior in photovoltaic electricity production with extreme variations in output. (Data from Minneapolis, Minnesota, in April 2015)

Increasing levels of intermittent renewable energy and the decommissioning of conventional power plants expose the control reserve market to the risk that at some point the demand may exceed the available supply (for instance, when the sun is suddenly covered by cloud and photovoltaic cells stop producing energy). Figure 11.2 provides an example of electricity output from solar panels, showing how production from solar panels is erratic, with extreme variations in output per minute. Note how the panel produces its maximum output at 1.30 pm, yet only minutes later production drops by more than 50%—in stark contrast to fossil fuel generators that produce electricity at a constant rate. These drops in energy output need to be offset within seconds to avoid blackouts.

11.2.2 Information-Based Sustainable Society: Carsharing with Electric Vehicles Information systems can be both a contributor to climate change and a way of dealing with negative environmental impact. Similar to [20]’s use of information to align individual interests with sustainability, we use information to align organizational goals with sustainability by means of a decision support system. As a result, financial and environmental goals are brought into harmony to foster carbon neutrality [22]. Knowing when and where people rent EVs puts EV fleets in a position to make inferences about the rental patterns of the population and to make sociotechnical trade-offs [11] between their need for transportation and the need for storage on the energy market. We demonstrate this trade-off in a simulation platform similar to the one described by [17] and calibrate it using real-world data.


Charging many EVs in the same neighborhood at the same time can quickly overload transformers and substations [19]. Previous research has addressed this issue by proposing smart charging, meaning that EVs are charged at times when the grid is less congested, which helps to smooth peaks in electricity consumption without creating new ones. With smart charging, EV fleets are given financial incentives to change their charging times, resulting in significant reductions in peaks [31]. An analysis of the departure times of EVs parked at public charging stations in California [14] shows that intelligent scheduling would reduce the monthly energy bill for users by 24.8%. An extension of smart charging is the V2G concept (see footnote 1). A study by [32] considers the savings a household can make with a battery exposed to dynamic pricing on the energy wholesale market and finds that efficient use of the battery would provide savings of 14% in utility costs and 7% in carbon emissions. Similar effects are found by [35] in an industrial setting. Another study relating to EVs finds yearly benefits per EV of $176–203 [28]. Tomic and Kempton [30] show that the profitability depends on the target market: the larger the variations in the electricity price, the higher the profitability. Therefore, we focus on the control reserve market from Sect. 11.2.1. Most studies make the assumption that households or car owners trade on the energy wholesale market. This assumption is not realistic because they do not have a sufficient quantity of electricity to sell or buy to meet the minimum lot sizes required to participate in the market. To address this issue, [16] introduced the notion of electricity brokers (a.k.a. aggregators), which act on behalf of a group of households in order to reach the minimum lot size requirements. Simulations by [7] and [13] show that this is possible to achieve with EVs. On top of the brokers, we also apply the concept of a VPP.
The asks and bids that are accepted constitute a promise to deliver electricity to the market, but which specific source will be used to fulfill that promise is not decided until actual delivery [i.e., whether a commitment will be delivered specifically from EV A or B (or a combination) is decided in real time, based on the availability of those particular vehicles]. We will show that this is a powerful tool that carsharing fleets can use to offer appropriate service levels for rental customers while making additional profits from balancing markets. As the number of charging stations increases, EVs are more likely to be connected to the grid and to be used as part of a VPP. As we want to make a statement about the profitability of VPPs of EVs in the future, we also need to consider the possible density of charging infrastructures in the future. In earlier articles in this journal, [21] and [3] put forward an optimal spatial infrastructure design for battery-swapping stations. This setup has also been studied by [34]. However, we focus on conventional charging stations instead because there were no battery-swapping stations in San Diego at the time of analysis. We

1 Vehicle-to-grid (V2G) discharging is technically possible. Even though not all charging stations support discharging yet, the standard of the International Electrotechnical Commission IEC 62196 supports V2G. For the purpose of this study, and with regard to future infrastructure, we assume that all charging stations have V2G capabilities.


will therefore make recommendations on where additional charging stations should be placed. A shortcoming of the existing studies is that—with the exception of [14]—they all used either small fleets or data from combustion engine vehicles that have a longer range and are not subject to “range anxiety,” the fear of becoming stranded with an empty battery. More importantly, in previous research, trips are assumed to be known in advance. In reality, trips are more spontaneous (nondeterministic) and not always known in advance [29]. This is problematic when an EV is committed to either charge or discharge at the same time as someone needs to drive it [2]. Here a sociotechnical trade-off needs to be made between balancing the grid (technical) and providing mobility to customers (social). As it is impossible to determine precisely what value each individual places on mobility, we approximate its value with the profits from rental transactions. Free float carsharing, where users can pick up and drop off the vehicle anywhere, allows us to specify the value for mobility for all pick-up and drop-off locations and times. Firnkorn and Müller [9] show that free float carsharing has a significantly positive environmental effect, reducing carbon emissions by 6%. For our study, a free float carsharing business model fits very well as rentals are paid on a per-minute basis, and there still is uncertainty about where and for how long people will rent an EV (rentals are not booked in advance, i.e., they are nondeterministic), an issue that has not been covered in previous studies.

11.3 Data We draw on carsharing data from Car2Go’s San Diego fleet that consists of 300 EVs. In addition to a sign-up fee, members pay for the carsharing service on a per use basis (per minute/hour/day, with an extra per mile fee above a threshold of 150 miles). The rental and driving data were retrieved from a private application programming interface which we were given access to by Daimler, Car2Go’s parent company. We retrieved a list of all EVs that were available for rental at the time of the query from the Car2Go website, www.car2go.com. We downloaded the data, added a time stamp, and stored it in a database every 15 minutes from May 1, 2014, to June 29, 2015; we also continued to collect the data for future research. This information contains the unique car name, the geographic coordinates of where the car is parked, the street name and zip code of that location (l), the state of charge of the battery (SoC), the state of the interior and exterior, and whether the EV is currently charging. We infer certain information about the transaction, such as how long the EV was rented, how many kilometers were driven, and how much revenue was earned as rental benefit ($RB$), by looking at the duration and timing of when the EV was unavailable for rent and the difference in the SoC level beforehand and afterward. Even though the number of kilometers that can be covered using average fuel consumption will depend upon individual driving behavior, and this could therefore affect the accuracy of our estimates, we are confident that the differences


will in fact be marginal, since all the journeys take place within the same urban environment. We assume that a fully charged EV will cover a distance of 66 miles (106 km). A drawback of the data set is that there is a chance that a car may be returned and rented again to another customer within the 15-minute time interval. However, for the sake of our analysis, the EV remains unavailable, so this does not have a significant influence on the overall estimation and results. We also observe that on several occasions particular EVs did not feature in the data for more than two days, even though the maximum rental duration is two days. We speculate that these cars were in maintenance, in repair, or not able to drive for some other reason and were therefore not shown as available by Car2Go. We therefore removed from the data set all rentals that we inferred from the data to have lasted more than two days. We infer the location of charging stations based on the GPS coordinates of where cars have been charged at least once in the data set. We assume that if a car is parked at a charging station, it will be connected to the charging station. This is a sound assumption because cars are only allowed to park at a charging station when they are plugged in, and any car that does not comply with this may be towed away.
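The inference of rentals from 15-minute availability snapshots can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the tuple layout, and the sample timestamps are hypothetical, and the distance estimate simply scales the SoC difference by the assumed 66-mile range of a full battery.

```python
from datetime import datetime, timedelta

SNAPSHOT_STEP = timedelta(minutes=15)  # polling interval from the text
MAX_RENTAL = timedelta(days=2)         # longer gaps are treated as downtime
FULL_RANGE_MILES = 66.0                # assumed range of a fully charged EV


def infer_rentals(snapshots):
    """snapshots: time-sorted list of (timestamp, soc) for ONE car, with
    one entry per poll in which the car was listed as available.
    A gap longer than one polling step is read as a rental."""
    rentals = []
    for (t0, soc0), (t1, soc1) in zip(snapshots, snapshots[1:]):
        gap = t1 - t0
        if gap <= SNAPSHOT_STEP:
            continue  # car stayed available, so no rental
        if gap > MAX_RENTAL:
            continue  # presumably maintenance or repair, dropped
        used = max(soc0 - soc1, 0.0)  # ignore cases where SoC rose
        rentals.append({
            "last_seen": t0,
            "seen_again": t1,
            "duration": gap,  # approximate, limited by the polling step
            "miles": used * FULL_RANGE_MILES,
        })
    return rentals


# Hypothetical snapshots for a single car.
t0 = datetime(2014, 5, 1, 8, 0)
snaps = [
    (t0, 0.90),
    (t0 + timedelta(minutes=15), 0.90),
    (t0 + timedelta(minutes=75), 0.80),          # one-hour gap: a rental
    (t0 + timedelta(days=3, minutes=75), 0.80),  # three-day gap: dropped
]
rentals = infer_rentals(snaps)
print(len(rentals), round(rentals[0]["miles"], 1))
```

Because a car can only be observed at polling times, the inferred duration is an approximation of the true rental length, which is the drawback of the data set noted above.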

11.4 Model Description At the core of this research is the development of a decision support system that places bids and asks in a market. The market’s clearing mechanism ultimately decides when to turn EVs into VPPs. The system is evaluated in a simulation environment bootstrapped with real-world carsharing driving data from Car2Go. A discrete-event simulation is most suitable for this purpose, as we are dealing with a complex system that would be prohibitively expensive to build in the real world and where market parameters would be difficult to manipulate. In the following sections, we outline our selected approach and the reasoning behind it.

11.4.1 Virtual Power Plant Decision Support: FleetPower Fleets need to decide how to deploy EVs by deciding which ones should be charged, which should provide V2G services, and which should be made available for rental, and they then bid accordingly. The charging and discharging (V2G) are physically constrained to EVs that are connected to a charging station. Making real-time deployment decisions in this complex environment requires automated decision-making by an intelligent trading agent [8]. We call this intelligent trading agent, which acts on behalf of the fleet, FleetPower. How FleetPower bids for the charging and discharging energy of the electric vehicles is described in the activity diagram in Fig. 11.3. The agent needs to submit asks and bids for whatever price it is willing to charge or discharge and to decide how many EVs it wants to make available. The asks and bids need to be placed before the auction closes, and after the auction, the


Fig. 11.3 Activity diagram showing the decision-making steps involved in FleetPower’s bidding on the secondary control reserve (real-time) market for energy

agent has to provide or consume whatever quantity of energy has been agreed for any accepted bids or asks. The first step ❶ for the agent is to forecast the total amount of energy stored and how much can still be stored in the EVs that are available (parked at charging stations) for the timeslot under consideration. Next ❷, the agent has to determine a price at which it would be willing to sell or buy energy, to at least cover the opportunity cost. Afterward ❸, this information about the price and the SoC of the EVs forms the basis of the asks and bids to be submitted to the auction. After the asks and bids have been submitted to the auction, the market decides which asks and bids to accept and reject according to the “pay-as-bid” mechanism. The fleet needs to make sufficient EVs available to match the quantity of energy that has been agreed. ❹ These are dedicated EVs that deliver or consume energy according to the accepted asks and bids. ❺ If a customer asks to rent one of these particular EVs, either ❻ another car connected to a charging station then replaces that EV in the VPP in order to deliver the agreed amount to the market, or ❼ the customer is told that no car is available and ❽ the potential revenues are written off as opportunity cost. In practice, customers will not be turned away as these cars will not show up on the list of available EVs, so customers would not notice any difference, especially since this is done already for cars that are charging. Note that if another EV was


free in the immediate vicinity, it was assumed that this car would then be rented out instead. Our interpretation of immediate vicinity is that customers are likely to be willing to walk to another car if it is approximately 250 meters away (drawn from a normal distribution with a mean of 250 meters and a standard deviation of 100 meters (d)). This value seems realistic to us, but we have also tested means of 100 and 500 meters with no significant difference in results. The great-circle distance between the coordinates is calculated using the haversine formula [27]. In the next section, we will explain each step of the bidding procedure in more detail.

Determine Ask and Bid Quantity ❶ The first step in the ask and bid submission is to determine the quantity of electricity that should be offered in each 15-minute time interval. While it is important for customers to rent a car at a specific location, the precise location within a city is less relevant for energy markets as long as the car is parked at a charging station on the same distribution grid. Rather than making a decision on how each individual EV should be deployed, we can estimate an overall quantity of energy to charge and discharge, which allows us to harness the “risk pooling effect.” This effect refers to the fact that EV storage potential and energy stored can be predicted more accurately for a whole fleet rather than for each individual EV. For the purposes of this study, we have applied various machine learning algorithms, including neural network regression, support vector machine regression, and random forest regression, in order to forecast the energy storage available for charging ($Q_t^{charge}$) and discharging ($Q_t^{discharge}$) for the whole fleet at a specific time. We chose these regression algorithms because we use many attributes and there is a dependency in the data between the independent and dependent variables.
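The vicinity check mentioned earlier in this section relies on the haversine formula, which can be sketched as below. The function names, the Earth radius constant, and the sample coordinates are assumptions made for illustration; the fixed 250-meter threshold stands in for a value that the text draws from a normal distribution.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points
    given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_walking_distance(car, other, threshold_m=250.0):
    """True if `other` is at most threshold_m away from `car`."""
    return haversine_m(*car, *other) <= threshold_m


# Hypothetical coordinates roughly 220 m and 330 m apart.
committed_car = (32.7157, -117.1611)
near_car = (32.7177, -117.1611)
far_car = (32.7187, -117.1611)
print(within_walking_distance(committed_car, near_car))
print(within_walking_distance(committed_car, far_car))
```

To mimic the sampled threshold described in the text, `threshold_m` could be drawn per customer, for example with `random.gauss(250, 100)`.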
At the end of every week, the market closes for submissions for all the 15-minute intervals of the following week. We are therefore interested in predicting storage availability for up to one week in advance. The capacity to store or discharge for the fleet of EVs as a whole is predicted using the following equations:

$$Q_t^{charge} = \beta_{t,0} + \beta_{t,1} * \text{day\_of\_week}(t) + \beta_{t,2} * \text{hour\_of\_day}(t) \tag{11.1}$$

$$Q_t^{discharge} = \beta_{t,0} + \beta_{t,1} * \text{day\_of\_week}(t) + \beta_{t,2} * \text{hour\_of\_day}(t), \tag{11.2}$$

where $\beta_{t,0}$, $\beta_{t,1}$, and $\beta_{t,2}$ are unknown parameters. The decisive factors determining the availability of storage are the day of the week (day_of_week) and the hour of the day in 15-minute intervals (hour_of_day). To predict the energy storage for each week, we use a fixed two-month period (training period) to learn the daily and weekly patterns. The training period duration is fixed so that all predictions are comparable. The duration of the training period will be discussed in more detail in Sect. 11.5. A random forest regression model had the highest accuracy of prediction during the training period. This model was parametrized with two randomly preselected variables (mtry=2), 1000 randomized trees, and a minimum


sum of weights for splitting of 5. We do not consider time series models because commitments are due one week in advance. In essence, the issue we are dealing with is a classification problem: we have to decide how many EVs we should assign to which class (rental or VPP). However, there is an asymmetric pay-off between assigning EVs to certain classes. Renting earns the carsharing fleet $17.61 on average per transaction, and the VPP earns the fleet $0.09 on average per transaction. In addition, asks and bids on the energy market are binding, and non-delivery will result in very high penalties. We therefore give misclassifications for the rental class proportionally more weight than those for the VPP class. This weight is assigned with the stratified sampling method, where we sample disproportionately to reflect the asymmetric pay-off [5]. This method decreases the likelihood that our model adds a car to a VPP.

Example Assume, for instance, that we are interested in submitting a bid for Sunday, July 6, 2014, for the time interval t from 5.00 pm to 5.15 pm (t = 60 as it is the 60th 15-minute interval). To predict the available storage, we look at the training period from May 1, 2014, to June 30, 2014, the day on which asks and bids can be submitted for auction. Based on the number of EVs and their state of charge for each Sunday in that time period, as well as each t = 60 time period, we predict the availability for the test period July 6, 2014, at t = 60. To account for changes in usage patterns over time, we explicitly include the availability at t = 60 of the previous Sunday as a lagged dependent variable (in this case: June 29, 2014). With more historical data, one could also include the same day from previous years to improve the accuracy of the model.
If there were on average 10 EVs connected to charging stations, each with a state of charge (SoC) of 70% and a 16.5 kWh battery ($\Omega$), the storage available for charging would be $Q_t^{charge} = 10 * 0.3 * 16.5$ kWh $= 49.5$ kWh, and the storage available for discharging would be $Q_t^{discharge} = 10 * 0.7 * 16.5$ kWh $= 115.5$ kWh. Due to physical constraints of the available infrastructure in San Diego, the charging ($\gamma$) and discharging ($\delta$) speed of 3.6 kWh per hour (or 0.9 kWh in 15 minutes) and the charging ($\xi^{charge}$) and discharging ($\xi^{discharge}$) efficiencies of 96% and 97.4% limit the actual values to a maximum of $10 * 0.9 * 0.96 \approx 8.6$ kWh and $10 * 0.9 * 0.974 \approx 8.8$ kWh, rather than 49.5 and 115.5 kWh, for the 15-minute interval.

Determine Ask and Bid Price ❷ The second step in the ask and bid submission process is to determine the price at which the asks and bids should be offered so as to balance out potential gains to be made from the auction versus the likelihood of the offer being accepted. There is a price for capacity (standby fee) and a price for electricity (per unit of energy). We bid at a capacity price of $0 \frac{\$}{MW}$ to ensure that our bids will always be considered by the market. To determine the electricity price, we apply a bottom-up model that estimates the optimal price per EV. For each EV, the fleet has a number of costs that need to be covered. For example, when charging an EV, the agent needs to ask a price $P^{charge}$ that takes into account the industrial electricity tariff, the opportunity cost of being unable to serve a customer while charging, plus a margin. To discharge the EV (V2G), an agent should ask a price $P^{discharge}$ that is based on the energy cost of charging in the first place, the cost


of battery depreciation, the opportunity cost of not being able to serve a customer while discharging, and of not being able to serve customers in the future due to a lower battery SoC, plus a margin. For a table of notation, including measurement units, see Table 11.1.
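The rate-limited fleet capacity from the quantity example earlier in this section (10 EVs at 70% SoC with 16.5 kWh batteries) can be checked with a short sketch. The function and parameter names are hypothetical; the sketch assumes that headroom is $(1 - SoC) * \Omega$ per EV and that the charger speed and efficiency cap what can be moved in one interval.

```python
def fleet_capacity_kwh(n_evs, soc, battery_kwh, rate_kw, dt_hours,
                       eff_charge, eff_discharge):
    """Return (charge_kwh, discharge_kwh) that a fleet of identical EVs
    can absorb or deliver within one interval of dt_hours."""
    headroom = n_evs * (1.0 - soc) * battery_kwh  # energy that still fits
    stored = n_evs * soc * battery_kwh            # energy currently stored
    throughput = n_evs * rate_kw * dt_hours       # charger limit per interval
    charge = min(headroom, throughput * eff_charge)
    discharge = min(stored, throughput * eff_discharge)
    return charge, discharge


# Values taken from the worked example: 3.6 kW chargers, 15-minute interval,
# 96% charging and 97.4% discharging efficiency.
charge_kwh, discharge_kwh = fleet_capacity_kwh(
    n_evs=10, soc=0.7, battery_kwh=16.5, rate_kw=3.6, dt_hours=0.25,
    eff_charge=0.96, eff_discharge=0.974,
)
print(round(charge_kwh, 1), round(discharge_kwh, 1))  # 8.6 8.8
```

In this example the charger throughput, not the battery state, is the binding constraint in both directions.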

Table 11.1 Table of notation

| Variable | Description | Unit |
|---|---|---|
| $\widehat{RB}$ | Expected rental benefit per unit of energy stored | $\frac{\$}{kWh}$ |
| $RB$ | Observed rental benefits | $\$$ |
| $C^{charge}$ | Charging cost, see Equation 11.3 | $\$$ |
| $C^{discharge}$ | Discharging cost, see Equation 11.6 | $\$$ |
| $c$ | Rental probability | $\frac{\%}{100}$ |
| $D$ | Battery depreciation cost | $\frac{\$}{kWh}$ |
| $d$ | Distance $EV_i$ to closest EV available for rent | km |
| $EC$ | Energy cost, based on $P_t^{charge}$ | $\frac{\$}{kWh}$ |
| $ET$ | Industrial electricity tariff (flat price) | $\frac{\$}{kWh}$ |
| $I$ | Total number of EVs | |
| $i$ | Specific EV | ID |
| $l$ | Location | Zip code |
| $P$ | Bid/ask price for buying or selling electricity from reserve market | $\frac{\$}{kWh}$ |
| $Q$ | Bid/ask quantity for buying or selling electricity from reserve market | kWh |
| $Q^*$ | Equilibrium quantity (sign indicates shortage or surplus electricity) | MWh |
| $q$ | State of charge (SoC) ($\Psi/\Omega$) | $\frac{\%}{100}$ |
| $t$ | Time interval | Index |
| $\beta$ | Unknown regression parameter | – |
| $\Delta t$ | Duration of a time interval | 0.25 hours |
| $\gamma$ | Charging speed | kW |
| $\delta$ | Discharging speed | kW |
| $\lambda$ | Dummy to account for opportunity costs from recharging | Boolean vector |
| $\mu$ | Margin on the bid/ask price, to optimize bidding price | $\frac{\$}{kWh}$ |
| $\xi^{charge}$ | Charging efficiency | $\frac{\%}{100}$ |
| $\xi^{discharge}$ | Discharging efficiency | $\frac{\%}{100}$ |
| $\Psi$ | Amount of electricity stored in an EV | kWh |
| $\Omega$ | Maximum battery capacity | kWh |


Determine the price for charging ($P^{charge}$) EVs can be parked anywhere in the city, but only if an EV is parked at a charging station does FleetPower have the option to turn it into a VPP. Where this is the case, FleetPower can bid in the energy market for a cheap electricity rate. Bids submitted to the auction are composed of several components of the bid, together with the bid quantity and price. The first component of the bid price, opportunity benefit, serves as a reference point; FleetPower will not purchase electricity on the energy market if the industrial electricity tariff were cheaper. The second component, the expected gross profits from rental, ensures that EVs are less likely to charge when it is probable that they will be rented out. The industrial electricity tariff and the expected rental profit determine the break-even point at which renting out or turning an EV into a VPP is equally matched financially. The final consideration, the margin, allows the fleet to make a profit, and here a trade-off needs to be made between the pay-off and the likelihood of the bid being accepted. We now describe the bid in more detail. The financial cost of charging ($C^{charge}$) a specific EV i at 15-minute time interval t is determined by the following equation:

$$C^{charge}_{i,l,t} = \frac{\min(SoC_{i,l,t} * \Omega,\ \delta * \Delta t)}{\xi^{charge}} - P^{charge}_{i,l,t}, \tag{11.3}$$

where $\min(SoC_{i,l,t} * \Omega,\ \delta * \Delta t)$ is the amount of electricity that could still be charged to the battery of car i at interval t and $P^{charge}$ is the bid price to charge, which differs per EV i and time interval t. The variable $\xi^{charge}$ accounts for the charging inefficiency. The bidding price $P^{charge}$ for charging is determined as follows:

$$P^{charge}_{i,l,t} = ET - \widehat{RB}_{i,l,t} - \mu_t^{charge}, \tag{11.4}$$

where ET is the opportunity benefit of not having to pay the industrial electricity tariff, $\widehat{RB}$ is the expected rental benefit that we will describe next, and $\mu^{charge}$ is the profit margin that is parametrized to maximize the overall profits for all previous time intervals t (of the training data set). In other words, we take the electricity tariff, we deduct what we could have earned with the EV if it had been available during that period, and we add a margin to arrive at the lower electricity price the carsharing fleet would be willing to accept in return. We use the same machine learning algorithms to predict the rental profits per unit of energy and thus to decide how much energy to offer ❶. In contrast to the quantity prediction, support vector machine regression had the best predictive accuracy for rental profits ❷. The rental profits are predicted with the following equation, similar to Equations 11.1 and 11.2:

$$\widehat{RB}_t = \beta_{t,0} + \beta_{t,1} * \text{day\_of\_week}(t) + \beta_{t,2} * \text{hour\_of\_day}(t). \tag{11.5}$$


The support vector machine regression was parametrized using a radial basis kernel function with the parameters gamma = 2 and cost = 1. The expected profit per unit of energy ($\widehat{RB}$) for renting EV i parked at location l during interval t is determined by a support vector machine regression with four independent variables: rental probability c, state of charge of the battery SoC, interior status, and exterior status. The rental probability c captures the preferences and behaviors of those who rent EVs. The probability has three dimensions: location, the hour of the day, and the day of the week. The locational dimension gives insights into the likelihood that EVs are rented out given that they are parked in a certain district, which is represented by a zip code. The temporal dimension gives insights into the likelihood that EVs will be rented out given a certain hour of the day and a day of the week. We break time down into discrete 15-minute intervals for each day of the week. Based on this information, we create a four-dimensional model that enables us to predict whether a car is likely to be rented out within the next 15-minute interval, based on the day of the week, the specific 15-minute interval, and the zip code of the place where the car is parked. Unlike day and time, zip code is a categorical variable.

Example How we determine the bid price for charging is illustrated by what happens with car ID#5 on the morning of Monday May 11, 2015, between 5.15 am and 5.30 am. The battery is charged (SoC) to 90%, the interior and exterior statuses are “good,” and the car is parked in zip code (l) 1012. Given an industrial electricity tariff of $0.08 \frac{\$}{kWh}$ and an optimal margin of $0.02 \frac{\$}{kWh}$, the bidding price is $P^{charge}_{5,1012,21} = 0.08 \frac{\$}{kWh} - 0.042 \frac{\$}{kWh} - 0.02 \frac{\$}{kWh} = 0.018 \frac{\$}{kWh}$. If the quantity $Q^{charge}_{5,1012,60} = 0.9$ kWh is bought from the market, and an adjustment is made for efficiency losses $\xi^{charge} = 0.02$, the total opportunity cost of charging the EV during that period would be $C^{charge}_{5,1012,60} = (0.9\ \text{kWh}/(0.98)) * (0.08 \frac{\$}{kWh} - 0.018 \frac{\$}{kWh}) = \$0.057$. In order for the charging of this EV to be economical during this 15-minute interval, the carsharing fleet would need to bid at a price not exceeding $0.057. If it paid more than this for electricity, the fleet would be better off charging its vehicles using electricity supplied at the standard flat tariff.
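The arithmetic of the charging example can be retraced with a few lines of code. This is only a sketch: the function names are hypothetical, `charge_bid_price` follows Equation 11.4, and `charge_cost_example` mirrors the calculation as written in the worked example (quantity grossed up for losses, multiplied by the gap between tariff and bid price).

```python
def charge_bid_price(tariff, expected_rental_benefit, margin):
    """Equation 11.4: P_charge = ET - RB_hat - mu, all in $/kWh."""
    return tariff - expected_rental_benefit - margin


def charge_cost_example(quantity_kwh, loss, tariff, bid_price):
    """Calculation used in the worked example: the quantity divided by
    (1 - loss), times the difference between tariff and bid price."""
    return quantity_kwh / (1.0 - loss) * (tariff - bid_price)


# Figures from the example for car ID#5.
price = charge_bid_price(tariff=0.08, expected_rental_benefit=0.042,
                         margin=0.02)
cost = charge_cost_example(quantity_kwh=0.9, loss=0.02, tariff=0.08,
                           bid_price=price)
print(round(price, 3), round(cost, 3))  # 0.018 0.057
```

Changing `expected_rental_benefit` shows how a higher chance of rental lowers the price the fleet is willing to bid for charging.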

Determine the price for discharging/V2G ($P^{discharge}$) An EV can also contribute to a VPP by discharging if it is parked at a charging station with V2G. FleetPower then has the option to sell electricity through V2G by submitting an ask to the energy market. The first component of the ask price (energy cost) is the sum needed to reimburse the fleet for the cost of charging the EV in the first place. The second component, battery depreciation, compensates the fleet for wear and tear on the battery. The third component, the expected rental gross profit, ensures that the maximization takes into account that an EV is less likely to discharge using V2G when it is probable that it will be rented out, and this calculation includes an allowance for the time needed to recharge the EV to its previous charge state. Even though rental gross profits also include an element to cover the costs of battery depreciation, we explicitly include this as a separate part of the ask price because there are substantial differences between discharging and driving in terms of the


battery depreciation depending on the volume of activity. For example, at night the expected battery depreciation costs from rentals are close to zero because it is unlikely that someone would rent an EV, whereas if the ask is accepted, the battery depreciation costs associated with discharging and subsequent recharging will be incurred in full. Also, in this case, the electricity cost, the battery depreciation cost, and the expected rental profits are combined to determine the break-even price at which the rental and VPP are of equal value to the carsharing fleet. The last consideration, the margin, allows the fleet to make a profit in the “pay-as-bid” market, though a trade-off needs to be made between the potential gains and the likelihood of the ask being accepted. We describe the ask in more detail below. The financial cost of discharging ($C^{discharge}$) a specific EV i at 15-minute time interval t is determined by the following equation:

$$C^{discharge}_{i,l,t} = \frac{Q^{discharge}_{i,l,t}}{1 - \xi^{discharge}} - P^{discharge}_{i,l,t}, \tag{11.6}$$

where $Q^{discharge}$ is the electricity stored in EV i that can be accessed within time interval t. $P^{discharge}$ is the price at which the electricity is being offered for sale, as defined in Equation 11.7. $\xi^{discharge}$ is the discharging inefficiency that accounts for energy conversion losses. The asking price for discharge ($P^{discharge}$) is determined as follows:

$$P^{discharge}_{i,l,t} = -D - EC - \sum_{j=1}^{1+h} \left(\widehat{RB}_{i,l,(t+j)} * \lambda_{i,l,(t+j)}\right) - \mu_t^{discharge}, \tag{11.7}$$

where the cost for wear on the battery is depreciated (D) for each kWh of energy used. Also, the energy costs for charging EC, based on the asks accepted during the training period, are taken into account. The summation term refers to recharging the EV after V2G. $h = \text{round}\left(\frac{Q^{discharge}_{i,l,t}}{\delta * \Delta t}\right)$ is the time it takes to recharge the EV, rounded to the nearest time interval (15 minutes). $\widehat{RB}$ are the opportunity costs of not being able to rent out the EV due to it being committed to a VPP during the current interval t and the costs of recharging it subsequently. The dummy variable $\lambda$ states that opportunity costs only apply if the next person to rent the vehicle cannot complete the expected trip $\widehat{RB}_{i,l,t+j}$ with the remaining capacity from V2G. $\mu^{discharge}$ is the margin that maximizes the overall profits for the time intervals t in the training data set in a similar way to the margin in Equation 11.4.

Example To see how we determine the bid price for discharging, take the example of car ID#5 on the morning of Monday May 11, 2015, from 5.15 am to 5.30 am at a given zip code. The same conditions apply as in the example provided above


for charging. Assume that the battery depreciation is $D = 0.1\ \$/\text{kWh}$, the quantity in question is $Q^{\text{discharge}}_{5,1012,60} = 0.9$ kWh, the discharging speed is $\delta = 3.6$ kW per EV, the rental benefit for the next time interval (5.30–5.45 am) is $RB_{5,l,61} = 0.055\ \$/\text{kWh}$, and $\lambda_{5,l,61} = 1$ as there are rental costs in $t = 60$, but as the battery SoC is completely full, it is unlikely that in period $t = 61$ the EV will have too little battery power left to be used for another rental. Under these circumstances, the price for discharging can be expressed as follows:

$$P^{\text{discharge}}_{5,1012,60} = -0.1\ \tfrac{\$}{\text{kWh}} - 0.08\ \tfrac{\$}{\text{kWh}} - \sum_{j=1}^{2} \left( RB_{5,l,(60+j)} * \lambda \right) - 0.02\ \tfrac{\$}{\text{kWh}} = 0.018\ \tfrac{\$}{\text{kWh}}$$

If the quantity $Q^{\text{discharge}}_{5,1012,60} = 0.9$ kWh is bought from the market, and an adjustment is made for efficiency losses $\eta^{\text{discharge}} = 0.98$, the total opportunity cost of discharging the EV during that time period would be

$$P^{\text{discharge}}_{5,1012,60} = \frac{0.9\ \text{kWh}}{0.98} * \left( \left( 0.1\ \tfrac{\$}{\text{kWh}} + 0.08\ \tfrac{\$}{\text{kWh}} \right) - 0.018\ \tfrac{\$}{\text{kWh}} \right) = \$0.149$$

In order for the discharging of this EV to be economical, the fleet would need to ask a price of at least $0.149.
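The last step of this example can be checked with a short Python sketch. It takes the per-kWh figures quoted in the example as given (including the price of 0.018 $/kWh) and reproduces only the recharge horizon $h$ and the efficiency-adjusted total. The function and parameter names are illustrative assumptions and are not part of FleetPower.

```python
def recharge_intervals(quantity_kwh, speed_kw, interval_h=0.25):
    """Number of 15-minute intervals needed to recharge, rounded to the nearest one."""
    return round(quantity_kwh / (speed_kw * interval_h))


def discharge_cost(quantity_kwh, efficiency, depreciation, energy_cost, price):
    """Efficiency-adjusted total from the example: (Q / eta) * ((D + EC) - P)."""
    return (quantity_kwh / efficiency) * ((depreciation + energy_cost) - price)


# Values quoted in the example for EV ID#5 in interval t = 60.
h = recharge_intervals(quantity_kwh=0.9, speed_kw=3.6)
cost = discharge_cost(
    quantity_kwh=0.9, efficiency=0.98,
    depreciation=0.1, energy_cost=0.08, price=0.018,
)
print(h, round(cost, 3))
```

With $h = 1$, the upper limit of the sum in Equation 11.7 is $1 + h = 2$, which matches the two-term sum used in the example.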

Place Ask and Bid ❸ The third step is to combine the quantities and prices as asks and bids, respectively, and submit them to the market. To do this, the agent chooses the EVs $i$ with the lowest cost for charging $C^{\text{charge}}_{i,l,t}$ and discharging $C^{\text{discharge}}_{i,l,t}$ until the respective overall quantities $Q^{\text{charge}}_{t}$ and $Q^{\text{discharge}}_{t}$ are reached. Each quantity is submitted to the energy market at the average price from Equations 11.4 and 11.7, weighted by the amounts bought or sold. We only submit one ask and one bid for each time interval due to the minimum lot size of 1 MW. We do not consider submitting multiple asks and bids for the same auction, even though this would increase the profits, because substantially larger fleets would be required to meet the minimum lot size. To reach the 1 MW threshold, one would need to collaborate with an aggregator.

Example Take, for example, the following situation where the costs of charging EVs ID#1 and ID#2 are $C^{\text{charge}}_{1,1012,59} = 0.036\ \$$ and $C^{\text{charge}}_{2,1012,59} = 0.09\ \$$, and the corresponding bidding prices are $P_{1,1012,59} = 0.04\ \$$ and $P_{2,1012,59} = -0.02\ \$$ ($t = 59$ means 4.45–5.00 pm). The negative price for EV ID#2 means that in the time interval $t = 59$ the market needs to pay Car2Go for the charging to be economically worthwhile. The states of charge of the batteries of the EVs are $SoC_1 = 0.3$ and $SoC_2 = 0.4$, respectively. FleetPower has determined the optimal quantity that should be offered to the market to be $Q^{\text{charge}}_{t} = 1.5$ kWh. We also assume a battery capacity of $\Omega = 16.5$ kWh and a charging speed ($\gamma$) of 3.6 kW (or 0.9 kWh in 15 minutes) per EV. In this case, FleetPower offers to provide 1.5 kWh at a price of $0.016\ \$/\text{kWh}$, as 0.9 kWh (depending on the amount that can be discharged with the infrastructure in the time constraint, 0.9 kWh, and what SoC the battery is in, $0.6 * 16.5$ kWh) can be provided from EV ID#1, which has the lowest cost, and the remaining 0.6 kWh will be provided from EV ID#2.
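The selection and pricing rule in this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FleetPower implementation: the `Offer` class, the function names, and the per-EV limit (the smaller of what the charger can transfer in one interval and the free battery capacity $(1 - SoC) * \Omega$) are choices made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    ev_id: int
    cost: float   # cost of using this EV in the interval ($)
    price: float  # bidding price for this EV
    soc: float    # state of charge, between 0 and 1


def select_offers(offers, target_kwh, capacity_kwh=16.5, speed_kw=3.6, interval_h=0.25):
    """Pick EVs in order of increasing cost until the target quantity is covered."""
    remaining = target_kwh
    picked = []
    for offer in sorted(offers, key=lambda o: o.cost):
        if remaining <= 1e-9:
            break
        # Assumed per-EV limit: charger throughput and free battery capacity.
        headroom = (1.0 - offer.soc) * capacity_kwh
        quantity = min(remaining, speed_kw * interval_h, headroom)
        if quantity > 0:
            picked.append((offer.ev_id, quantity, offer.price))
            remaining -= quantity
    return picked


def weighted_price(picked):
    """Average price weighted by the quantity taken from each EV."""
    total = sum(quantity for _, quantity, _ in picked)
    return sum(quantity * price for _, quantity, price in picked) / total


offers = [
    Offer(ev_id=1, cost=0.036, price=0.04, soc=0.3),
    Offer(ev_id=2, cost=0.09, price=-0.02, soc=0.4),
]
picked = select_offers(offers, target_kwh=1.5)
print(picked, round(weighted_price(picked), 3))
```

For the figures above, EV ID#1 contributes 0.9 kWh and EV ID#2 the remaining 0.6 kWh, so the weighted price is $(0.9 * 0.04 + 0.6 * (-0.02)) / 1.5 = 0.016$.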


11.4.2 Endogeneity from Market Participation

By participating in the market, we may have an influence on market equilibrium, and this might in turn lead other market participants to behave differently. However, we argue that there is no endogeneity problem from reactions to our market participation as the asks and bids of other participants are aligned with their preferences. Discriminatory-price multi-unit auctions are not incentive-compatible, but our approach will work with any mechanism. For example, the uniform-price multi-unit auction can be designed to be posterior regret-free (i.e., even though the mechanism is not incentive-compatible a priori, no one could benefit from not bidding their true valuation when evaluating allocation ex post) [4]. Under these mechanisms, other market participants have no incentive to alter their behavior in response to new market entrants. Our methodology will also work well with this kind of mechanism. While the revenues may be different, the structural results will not change.

11.5 Evidence from a Real-World Setting

For the evaluation, we consider the 14-month period from May 1, 2014, to June 29, 2015. We train our model on the first two months, from May 1, 2014, to June 29, 2014, and test it on the first subsequent week of auctions (the bids and asks for all the 15-minute intervals in a week are always submitted for the full week in advance, Monday to Sunday). Consequently, we use a rolling two-month time window as the training period for each week of bidding. We test the algorithm for each week in the period from June 30, 2014, to June 29, 2015. From these training sets, values for the rental likelihood model, expected driven kilometers, rental time, and rental profits are used to train the model. Based on this training period, we evaluate the trained model over all one-week bidding blocks in that time period. There is no need to simulate the distribution of trips as we have an immense number of real driving transactions with which we can test our results, and a calibration of the driving data is therefore not necessary (as it is real data). The test period is given externally by the market, while the training period is a constraint from the data collection perspective (we are limited to 14 months of collected data).
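The rolling evaluation scheme can be made concrete with a small Python sketch. It assumes that "two months" means the 60 days immediately before each test week (which matches May 1 to June 29, 2014, for the first week) and that only complete Monday-to-Sunday weeks are tested; the function name and parameters are illustrative.

```python
from datetime import date, timedelta


def rolling_windows(first_test_day, last_day, train_days=60, test_days=7):
    """List (train_start, train_end, test_start, test_end) for each full test week."""
    windows = []
    test_start = first_test_day
    while test_start + timedelta(days=test_days - 1) <= last_day:
        train_start = test_start - timedelta(days=train_days)
        train_end = test_start - timedelta(days=1)
        test_end = test_start + timedelta(days=test_days - 1)
        windows.append((train_start, train_end, test_start, test_end))
        test_start += timedelta(days=test_days)  # slide forward by one week
    return windows


windows = rolling_windows(date(2014, 6, 30), date(2015, 6, 29))
print(len(windows))
print(windows[0])
```

Under these assumptions the first window trains on May 1 to June 29, 2014, and tests on the week starting Monday, June 30, 2014; each later window shifts both periods forward by seven days.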

11.5.1 Energy Market Data: California ISO

As illustrated in Sect. 11.2.1, EV storage is particularly suited to real-time market operation due to the fast response times required (dispatch occurs within seconds of order acceptance). We therefore use auction data from these markets to determine the prices for balancing (charging as well as discharging) at each point in time.


Fig. 11.4 Average regulation prices (June 30, 2014, to June 29, 2015), with standard deviation illustrating the extreme price volatility in the evening in San Diego

We use the data from the energy market operator in San Diego, California ISO. Only the clearing prices are published, though this still allows us to infer which bids and asks are accepted (the ones below the market price). The violin plot in Fig. 11.4 shows the average regulation prices and their standard deviations for San Diego. The prices for regulation reserves are quite variable. The high renewable energy content in the energy mix in California leads to generally higher prices and to large fluctuations in price in San Diego. This increases the revenues for VPPs because they can sell small amounts of energy at extremely high prices during the highly volatile evening hours.

11.6 Analysis and Discussion

In this section we will discuss the evaluation of the business model in terms of the profits to the carsharing fleets and the implications for the grid. We will also describe the sensitivity analysis that we have conducted to show the robustness of our model. The proposed model creates a new business model for carsharing fleets, which is a natural extension to the traditional carsharing business. We are interested in

Table 11.2 Decision outcome results over a one-year period

Description                                 San Diego
Discharged (V2G)
  Discharged energy sold (MWh)              7
  Number of lost rentals                    9
  Increase in gross profit (%)              0.6
  Increase in gross profit (in 1000 $)      2
Charged
  Quantity of energy purchased (MWh)        136
  Number of lost rentals                    50
  Increase in gross profit (%)              2.4
  Increase in gross profit (in 1000 $)      9.5

Fig. 11.5 Shows the VPP output over the year

whether this VPP business can increase the profits of carsharing fleets. Table 11.2 shows how VPPs influence the gross (variable) profits, excluding overhead cost. Creating a VPP using an EV fleet provides a sound business case for a carsharing fleet. However, it is also beneficial to the grid and thereby society because it provides additional reserve power to help keep the grid in balance at all times. This is already beneficial for the operation of the grid, but it becomes essential when a large proportion of weather-dependent renewable energy sources come onto the market. The VPP supports the grid by providing and consuming electricity on demand within seconds. The capacity that Car2Go provided to the market is displayed in Fig. 11.5. While Car2Go consumes a substantial amount of surplus energy (a negative value on the y-axis means the total quantity charged), it discharges its EVs only infrequently. Due to the low discharging prices, the cost of battery wear cannot be covered, with the consequence that asks to discharge EVs are accepted infrequently.


11.7 Conclusions

We have proposed and evaluated the FleetPower decision support system, which enables EV fleets to participate in the energy market as well as to continue their traditional rental business. We do this by using an intelligent agent that decides whether an EV at a specific location should be made available for rent, or whether it should be charged or discharged in the form of a virtual power plant, providing an ancillary service. The system makes this decision based on forecasted rental transactions, charging, and discharging. Our tests show that using EVs for ancillary services consistently enhances gross profits for the EV fleet by 2.4%. V2G currently accounts for only a small proportion of these additional profits, as 90% of the profits come from electricity savings. However, we show that V2G has a strong impact on the gross profits of carsharing fleets when the demand for reserve power increases. With this decision support system, it is possible to replace carbon-intensive back-up capacity with clean energy storage, but as there are not yet enough EVs on the street, they need to be combined with other fast-response technologies such as biogas or hydropower in order to balance volatile renewable energy sources such as wind or solar.

References

1. Agricola, A.E.A.: DENA ancillary services study 2030. Security and reliability of a power supply with a high percentage of renewable energy. Technical report, German Energy Agency, Berlin (2014)
2. Ahadi, R., Ketter, W., Collins, J., Daina, N.: Cooperative learning for smart charging of shared autonomous vehicle fleets. Transp. Sci. (2022). https://doi.org/10.1287/trsc.2022.1187
3. Avci, B., Girotra, K., Netessine, S.: Electric vehicles with a battery switching station: Adoption and environmental impact. Manag. Sci. 61(4), 772–794 (2015)
4. Bapna, R., Goes, P., Gupta, A.: Pricing and allocation for quality-differentiated online services. Manag. Sci. 51(7), 1141–1150 (2005)
5. Berk, R., Sherman, L., Barnes, G., Kurtz, E., Ahlman, L.: Forecasting murder within a population of probationers and parolees: a high stakes application of statistical learning. J. R. Stat. Soc. 172(1), 191–211 (2009)
6. Bichler, M., Gupta, A., Ketter, W.: Designing smart markets. Inf. Syst. Res. 21(4), 688–699 (2010)
7. Brandt, T., Wagner, S., Neumann, D.: Evaluating a business model for vehicle-grid integration: evidence from Germany. Transp. Res. Part D: Transp. Environ. 50, 488–504 (2017)
8. Collins, J., Ketter, W., Sadeh, N.: Pushing the limits of rational agents: the trading agent competition for supply chain management. AI Magazine 31(2), 63 (2010)
9. Firnkorn, J., Müller, M.: What will be the environmental effects of new free-floating car-sharing systems? The case of car2go in Ulm. Ecol. Econ. 70(8), 1519–1528 (2011)
10. Gallien, J., Wein, L.: A smart market for industrial procurement with capacity constraints. Manag. Sci. 51(1), 76–91 (2005)
11. Geels, F.: From sectoral systems of innovation to socio-technical systems: insights about dynamics and change from sociology and institutional theory. Res. Policy 33(6), 897–920 (2004)
12. International Grid Control Cooperation: Information on grid control cooperation and international development. IGCC price model—transmission system operator market information, pp. 1–19 (2014)
13. Kahlen, M., Ketter, W., van Dalen, J.: Balancing with electric vehicles: a profitable business model. In: 22nd European Conference on Information Systems, Tel Aviv, pp. 1–16 (2014)
14. Kara, E., Macdonald, J., Black, D., Berges, M., Hug, G., Kiliccote, S.: Estimating the benefits of electric vehicle smart charging at non-residential locations: a data-driven approach. Appl. Energy 155, 515–525 (2015)
15. Kassakian, J.G., Schmalensee, R.: The future of the electric grid: an interdisciplinary MIT study. Technical report, Massachusetts Institute of Technology (2011). ISBN 978-0-98280086-7
16. Ketter, W., Collins, J., Reddy, P.: Power TAC: a competitive economic simulation of the smart grid. Energy Econ. 39(0), 262–270 (2013)
17. Ketter, W., Peters, M., Collins, J., Gupta, A.: Competitive benchmarking: an IS research approach to address wicked problems with big data and analytics. MIS Q. 40(4), 1–53 (2016)
18. Ketter, W., Schroer, K., Valogianni, K.: Information systems research for smart sustainable mobility: a framework and call for action. Inf. Syst. Res. (2022). https://doi.org/10.1287/isre.2022.1167
19. Kim, E., Tabors, R., Stoddard, R., Allmendinger, T.: Carbitrage: utility integration of electric vehicles and the smart grid. Electricity J. 25(2), 16–23 (2012)
20. Loock, C., Staake, T., Thiesse, F.: Motivating energy-efficient behavior with green IS: an investigation of goal setting and the role of defaults. Manag. Inf. Syst. Q. 37(4), 1313–1332 (2013)
21. Mak, H., Rong, Y., Shen, Z.M.: Infrastructure planning for electric vehicles with battery swapping. Manag. Sci. 59(7), 1557–1575 (2013)
22. Malhotra, A., Melville, N., Watson, R.: Spurring impactful research on information systems for environmental sustainability. Manag. Inf. Syst. Q. 37(4), 1265–1274 (2013)
23. McCabe, K., Rassenti, S., Smith, V.: Smart computer-assisted markets. Science 254(5031), 534–538 (1991)
24. Nest Labs: Energy savings from the Nest learning thermostat: energy bill analysis results. White paper, Palo Alto (2015)
25. Porter, M.E., Heppelmann, J.E.: How smart, connected products are transforming competition. Harvard Bus. Rev. 92, 11–64 (2014)
26. Pudjianto, D., Ramsay, C., Strbac, G.: Virtual power plant and system integration of distributed energy resources. Renew. Power Gen. IET 1(1), 10–16 (2007)
27. Robusto, C.: The cosine-haversine formula. Am. Math. Mon. 64(1), 38–40 (1957)
28. Schill, W.: Electric vehicles in imperfect electricity markets: the case of Germany. Energy Policy 39(10), 6178–6189 (2011)
29. Schroer, K., Ketter, W., Lee, T.Y., Gupta, A., Kahlen, M.: Data-driven competitor-aware positioning in on-demand vehicle rental networks. Transp. Sci. 56(1), 182–200 (2022). https://doi.org/10.1287/trsc.2021.1097
30. Tomic, J., Kempton, W.: Using fleets of electric-drive vehicles for grid support. J. Power Sour. 168(2), 459–468 (2007)
31. Valogianni, K., Ketter, W., Collins, J., Zhdanov, D.: Enabling sustainable smart homes: an intelligent agent approach. In: 35th International Conference on Information Systems (ICIS), pp. 1–20 (2014)
32. Vytelingum, P., Voice, T., Ramchurn, S., Rogers, A., Jennings, N.: Theoretical and practical foundations of large-scale agent-based micro-storage in the smart grid. Artif. Intell. Res. 42, 765–813 (2011)
33. Watson, R.T., Boudreau, M., Chen, A.J.: Information systems and environmentally sustainable development: energy informatics and new directions for the IS community. Manag. Inf. Syst. Q. 34(1), 4 (2010)
34. Wolfson, A., Tavor, D., Mark, S., Schermann, M., Krcmar, H.: Better place: a case study of the reciprocal relations between sustainability and service. Serv. Sci. 3(2), 172–181 (2011)
35. Zhou, Y., Scheller-Wolf, A., Secomandi, N., Smith, S.: Electricity trading and negative prices: storage vs. disposal. Manag. Sci. 62(3), 880–898 (2016)

Chapter 12

Power TAC Experiment Manager: Support for Empirical Studies

Frederik Milkau, John Collins, and Wolfgang Ketter

12.1 Introduction

In this chapter, we describe the Power TAC Experiment Manager, a tool for setting up and running “designed experiments” [2] that can systematically test hypotheses about agent design, competitive performance, effects of weather, penetration of electric vehicles or distributed solar production, policy options such as subsidy or tax for certain energy resources, market design options, or other configuration options. Individual Power TAC simulation sessions or “games” typically take at least an hour to simulate just two months of activity, and experiments require multiple runs, often with much longer simulations. The simulation server and each of the competing agents must run in separate processes, possibly in separate machines. Without a tool such as the Experiment Manager, it can be very time-consuming to set up, run, gather, and analyze data from even a fairly simple experiment.

Power TAC is a competitive simulation platform [8, 10] that supports an annual tournament, where developers of broker agents compete to test their trading strategies against each other. But the annual competition is just one phase of a research model we call “Competitive Benchmarking” [9] (CB). The CB model starts with an important and complex problem domain that may resist serious study by individuals and small groups; Power TAC addresses the problem of finding market mechanisms that can help improve energy sustainability. The CB process involves a shared Platform (the Power TAC simulation) and a periodic cycle consisting of:

F. Milkau · W. Ketter
University of Cologne, Cologne, Germany

J. Collins
University of Minnesota, Minneapolis, MN, USA

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Collins et al. (eds.), Energy Sustainability through Retail Electricity Markets, Applied Innovation and Technology Management, https://doi.org/10.1007/978-3-031-39707-3_12


– Alignment: Update the simulation to address new challenges and to add new models that are important sources of demand or supply at the retail level. Examples include loads with interesting or unique supply or demand patterns, such as fleets of electric forklift trucks, and loads that offer significant demand flexibility that can help offset weather-dependent production resources, such as EV chargers or large thermal loads with significant thermal inertia (for example, cold-storage warehouses).
– Competition: Teams from around the world are invited to develop or update trading agents to compete with each other, testing and validating competitive strategies, and learning the value of different types of demand flexibility or customer types with different demand patterns.
– Analysis: Tournaments typically involve several hundred games with different numbers of competitors and different weather conditions. Detailed logs¹ from these games contain all significant messages between the simulation server and competing brokers, as well as state changes within the simulation. A number of analysis tools have been built and shared among the community that extract and process interesting data from these logs. Teams share their working agents to allow other participants to run experiments using them. Experiments can help improve their own trading agents in preparation for future competitions, and test hypotheses about the effects of server and agent configuration.
– Dissemination: Teams write up their results and share them through conference and journal papers.

The Experiment Manager (EM) aims to support all phases of this CB process. The Alignment phase often involves building new models of customers and energy resources, which must be validated before release. In the Competition phase, participants need tools for validating and analyzing the performance of their agents. The EM supports the Analysis phase by giving users an easily accessible and manageable way to run experiments and by providing access to the generated data. It also supports the Dissemination phase through experiments that explore boundary conditions and alternative scenarios.

12.2 Related Work

Simulation has long been used to model complex environments for the purpose of studying everything from production lines to climate and nuclear reactions [11]. It supports empirical study of phenomena that are too expensive, disruptive, or dangerous to study directly. Of course, simulations are abstractions of reality, and care must be taken to include features that might affect the factors being studied, while managing complexity.

1 See https://powertac.org/tournament/#section-6


The use of simulations to study markets and automated trading agents through competitions was popularized by Wellman et al. [13] with the original Trading Agent Competition (TAC), which modeled a travel-agent scenario that required trading in multiple markets in order to satisfy clients. The TAC supply chain competition (TAC-SCM) [1, 5] modeled a supply chain in which the agents had to purchase components in a wholesale market, manage inventory and assembly, and sell end products to customers. Jordan et al. [6] designed an experiment to show dominance relationships among published TAC-SCM agents using an approach they called “empirical game theory” that required thousands of games to be run and analyzed. The MinneTAC team at Minnesota developed an experiment-management system [4] for TAC-SCM that used over 20 machines to run multiple games and collect data for analysis. Sodomka et al. [12] showed how careful management of initial conditions such as random-number seeds could significantly reduce the number of games required to achieve a desired confidence interval. Ketter et al. [7] used this system extensively in their study of economic regimes. An interesting outcome of that work was to show that improving the trading performance of one agent could improve the utility of all agents in the market.

12.3 Supporting Empirical Research

A Power TAC simulation, like virtually all simulations, can be thought of as a large collection of stochastic processes, driven by statistical models and pseudorandom number sequences. A typical Power TAC simulation contains more than 1800 such sequences. As in the real world, behaviors of individual customers are quite random, while large collections of customers are more predictable. Pseudorandom sequences are initialized with 48-bit integer “seeds,” allowing them to be repeated. Simulation models are designed to ensure that the random value sequences are unaffected by per-game situations, such as whether a customer subscribes to a time-of-use versus a flat-rate tariff. This feature allows random behavior in the simulation to be separated from the varying behavior of external broker agents.

Unlike most simulations, model behaviors also depend on weather forecasts and hourly weather reports. Power TAC uses real-world data collected over several years from a number of sites around the world. Behavior of customer models also depends on the behavior of brokers as they publish new tariffs and adjust prices. A fourth source of potential variability comes from the “bootstrap record” that sets the initial conditions for a simulation session; see the game specification [8] for details.

These four sources of variability mean that if we run multiple simulations using the same competing brokers, the detailed results will be different. If our goal is to evaluate two variations of a decision process in our broker, we cannot make a credible claim about the result unless we can reject the null hypothesis, in which our variation had no impact, at a significance level of 5% or even 1%.
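The repeatability property described in this section can be illustrated with a small Python sketch. It is not Power TAC code: the stream names, the way per-model seeds are derived from a master seed, and the toy "game" are assumptions made purely for illustration. The point is that when each model draws from its own seeded sequence, the number of draws one model makes (for example, because brokers published more tariffs) does not disturb the values another model sees.

```python
import random


def make_streams(master_seed, names):
    """Give every named model its own generator, seeded from one master seed."""
    master = random.Random(master_seed)
    # 48-bit seeds mirror the seed width mentioned in the text; Python itself
    # accepts integer seeds of any size.
    return {name: random.Random(master.getrandbits(48)) for name in names}


def weather_noise(master_seed, customer_draws):
    """Return three noise values after the customer model has made some draws."""
    streams = make_streams(master_seed, ["customer_choice", "weather_noise"])
    for _ in range(customer_draws):
        streams["customer_choice"].random()  # consumption varies with broker behavior
    return [streams["weather_noise"].gauss(0.0, 1.0) for _ in range(3)]


same_seed_a = weather_noise(master_seed=42, customer_draws=1)
same_seed_b = weather_noise(master_seed=42, customer_draws=50)
other_seed = weather_noise(master_seed=43, customer_draws=1)
print(same_seed_a == same_seed_b)
print(same_seed_a == other_seed)
```

The two runs with the same master seed yield identical noise values even though the customer stream was consumed differently, while a different master seed yields a different sequence.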


Power TAC is a discrete-time simulation, which means that every “tick” of the simulation clock represents a fixed one-hour time interval or “timeslot” in the simulated “world.” In a tournament environment, where the competitors are communicating over the Internet from all around the world, the normal tick interval is five seconds, which means a two-month simulation takes about two hours to complete. In a local environment using the EM, where the platform and the competing retail brokers are all running on the same machine, the tick interval can be shortened to two seconds or less.
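The relation between tick interval and running time is simple arithmetic, sketched below in Python. The sketch assumes a two-month simulation of 60 simulated days and ignores startup and bootstrap time; the function name is illustrative.

```python
def wall_clock_hours(simulated_days, seconds_per_tick):
    """Wall-clock duration of a game when each tick represents one simulated hour."""
    ticks = simulated_days * 24
    return ticks * seconds_per_tick / 3600


tournament_hours = wall_clock_hours(simulated_days=60, seconds_per_tick=5)
local_hours = wall_clock_hours(simulated_days=60, seconds_per_tick=2)
print(tournament_hours, local_hours)
```

Under these assumptions, 1440 timeslots at five seconds each take two hours, and the same game at two seconds per tick takes 48 minutes.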

12.3.1 Simulation Space

Power TAC simulates a wide variety of customer models and wholesale energy providers, along with wholesale and retail market entities and mechanics. Most of these aspects are configurable, creating a very large space of potential simulations. For clarity, we partition this simulation space into profile, market, and environment spaces, expanding on the partition introduced by Sodomka et al. [12].

Profile space The profile space is made up of brokers, autonomous agents whose primary goal is to maximize profits by interacting with the simulation environment through wholesale, tariff, and balancing markets. Brokers include those provided by the research teams participating in the annual Power TAC tournaments that are published on the Power TAC broker repository.² Many of them are configurable, so broker configurations are also points in the profile space. Broker developers who want to experiment with variations on their own brokers are also operating in the profile space.

Market space Power TAC simulations are centered around three different markets that operate under different rules. From a broker’s view, the wholesale and tariff markets accept messages (bids, tariffs) and respond immediately. Interactions with the balancing market may be direct or indirect: tariffs that offer payment for demand flexibility are automatically registered with the balancing market, which then exercises that flexibility as needed, paying or charging customers while crediting or debiting broker accounts as specified in their tariff contracts, and notifying brokers by sending transaction messages. The mechanics of these markets can be influenced by setting parameters such as publication fees, interest rates, auction margins, and, in the case of the primary wholesale market model, price curve parameters.

Environment space The environment space primarily consists of customer models, wholesale provider models, and real-world weather data. Populations and behavior elements of customer models, such as sensitivity to price or weather, can be configured. Weather data come from the weather server given a location

2 https://www.powertac.org/wiki/index.php/Category:Brokers


and start date/time for which data are available, and consist of hourly weather reports (temperature, precipitation, sky cover, wind direction, and speed) and 24-hour forecasts collected from a number of real locations over a period of several years.

Depending on research or development requirements, one or more of these spaces may be the focus of an experiment. A computer scientist might focus on broker development and therefore the profile space. An economist might be interested in the effects of market parameters. Policy and sustainability research might benefit from experiments in the environment space. The basic use case stays the same, however: testing a hypothesis against a meaningful data set.

Many of the processes in the profile and environment spaces are driven by random-number sequences. For example, deviations between mean and actual price for a given settlement quantity in the wholesale market are partly determined by time-of-day and day-of-week, partly by weather, and partly by a residual random factor with a distribution that is roughly normal. Sodomka et al. [12] showed that the number of games that must be run to achieve a given confidence interval can be significantly reduced by running treatment games using the same random-number seeds as a given baseline game. As a result, we consider the initial settings of the various random-number sequences within the simulation to be an additional space that defines a simulation session.
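Why shared seeds help can be seen in a toy Python sketch. The profit model below is invented for illustration and has nothing to do with the actual simulation: profit is a fixed level plus seed-driven noise, and the treatment adds a constant effect and slightly amplifies the noise. Comparing each treatment game with the baseline game that used the same seed cancels most of the shared noise, so the spread of the estimated differences is much smaller than when baseline and treatment games use unrelated seeds.

```python
import random
import statistics


def game_profit(seed, treated):
    """Toy stand-in for a game outcome; all numbers are assumptions."""
    noise = random.Random(seed).gauss(0.0, 10.0)
    if treated:
        return 100.0 + 5.0 + 1.2 * noise  # constant effect, slightly amplified noise
    return 100.0 + noise


seeds = range(30)
# Treatment and baseline share the seed: most of the noise cancels in the difference.
paired = [game_profit(s, True) - game_profit(s, False) for s in seeds]
# Treatment uses unrelated seeds: the noise of both games adds up.
unpaired = [game_profit(s + 1000, True) - game_profit(s, False) for s in seeds]

print(statistics.stdev(paired), statistics.stdev(unpaired))
```

A smaller spread of the differences means fewer games are needed to reach a given confidence interval. In real games only part of the variability is shared between baseline and treatment, so the reduction will be less dramatic than in this toy model.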

12.3.2 Experiments

The Experiment Manager supports definition and operation of experiments composed of baselines and treatments as visualized in Fig. 12.1. If $S$ is the space of possible simulation configurations including server configuration, broker agent configuration, weather data, and random-sequence seeds, then $s \in S$ represents a single game configuration. An experiment is designed to test a hypothesis that proposes a measurable difference between a set $B \subset S$ of “baseline” configurations and one or more treatments, each defined by a modifier $f : S \to S$ that maps baseline configurations to treatment configurations $T_{B,f}$ with $T_{B,f} = \{ s' \mid s' = f(s) \wedge s \in B \}$. Instances in $B$ differ only in the random-sequence seeds.

At this time, there are three classes of modifiers available. Broker modifiers are changes in the profile space, replacing a specific broker with another broker and/or broker version, or adding or removing a broker. Parameter set modifiers are changes in the market and environment spaces specified by one or more changes in simulation server parameters. Weather modifiers are changes in the environment space resulting from the weather sequences used, as specified by locations or date ranges. An experiment then consists of a set of baseline games and a set of treatment games defined by the selected modifiers. Commonly, treatments are defined by varying a single parameter, such as the population of small solar producers.
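The relationship between baselines, modifiers, and treatments can be sketched in Python as follows. This is a minimal illustration of the definitions in this section and not the Experiment Manager's data model: the `GameConfig` fields, the broker and parameter names, and the location label are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class GameConfig:
    """One point s in the configuration space S (hypothetical fields)."""
    brokers: Tuple[str, ...]
    params: Tuple[Tuple[str, float], ...]
    weather_location: str
    seed: int


Modifier = Callable[[GameConfig], GameConfig]  # f : S -> S


def broker_modifier(old: str, new: str) -> Modifier:
    """Replace one broker with another (a change in the profile space)."""
    return lambda s: replace(s, brokers=tuple(new if b == old else b for b in s.brokers))


def parameter_modifier(name: str, value: float) -> Modifier:
    """Set one server parameter (a change in the market or environment space)."""
    def apply(s: GameConfig) -> GameConfig:
        params = dict(s.params)
        params[name] = value
        return replace(s, params=tuple(sorted(params.items())))
    return apply


def treatment_set(baselines: List[GameConfig], f: Modifier) -> List[GameConfig]:
    """T_{B,f}: apply the modifier f to every baseline configuration in B."""
    return [f(s) for s in baselines]


# Baseline set B: identical configurations that differ only in the seed.
baselines = [
    GameConfig(("broker_a", "broker_b"), (("solar_population", 100.0),), "site_a", seed)
    for seed in (1, 2, 3)
]
treatments = treatment_set(baselines, parameter_modifier("solar_population", 200.0))
print([(t.seed, dict(t.params)) for t in treatments])
```

Because each treatment configuration keeps the seed of the baseline it was derived from, baseline and treatment games can be compared pairwise.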


Fig. 12.1 Experiment dimensions. Each circle is a game. Larger numbers of instances can reduce uncertainty

While investigating the impact of changes to one or more aspects of a broker design, developers might set up several different broker configurations as treatments and use the standard configuration as a baseline. To study the effects of changing certain environment or market parameters, researchers might construct treatments along a single dimension by changing a parameter value in steps, creating a trajectory. This allows researchers to evaluate system behavior along a path in the configuration space. It is also common to investigate the effect of customer or wholesale market configuration changes under different weather conditions, in which case we would set up an experiment set with baselines in the individual experiments differing among chosen weather sequences, but using identical treatments.

12.4 Architecture

Power TAC simulations operate as distributed systems in which brokers and the simulation server communicate with each other through TCP connections. Modern high-end desktop machines often have the capacity to run multiple simulations with their associated brokers simultaneously, but running more than one simulation on a single machine requires careful assignment of port numbers to avoid conflict. To simplify the process of packaging and running multiple servers and to free broker developers from the need to provide source code and detailed instructions for building and running their brokers, we have designed the EM around the Docker³ container virtualization engine.
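The port-assignment problem mentioned above can be illustrated with a short Python sketch that reserves a disjoint block of TCP ports for each concurrent game. This is a generic illustration, not how the EM or Docker assigns ports: the base port, the block size, and the service names are hypothetical.

```python
def assign_ports(game_ids, base_port=60000, ports_per_game=10):
    """Reserve a disjoint block of TCP ports for each game on one machine."""
    assignments = {}
    for index, game_id in enumerate(game_ids):
        start = base_port + index * ports_per_game
        if start + ports_per_game - 1 > 65535:
            raise ValueError("not enough TCP ports for the requested games")
        # Hypothetical services; each game gets its own ports within its block.
        assignments[game_id] = {
            "message_broker": start,
            "visualizer": start + 1,
        }
    return assignments


ports = assign_ports(["game-1", "game-2", "game-3"])
print(ports)
```

Keeping this bookkeeping by hand is error-prone when many games run in parallel, which is one reason to delegate it to a container engine that maps ports per container.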

12.4.1 Simulation Services

Each Power TAC game involves several independent services that interact with each other, as shown in Fig. 12.2. A single game requires a simulation server, one or more brokers, and a weather server. The simulation server is described in more

3 https://docs.docker.com/


Fig. 12.2 Service setup for a single game

detail by Collins and Ketter [3]. Brokers are autonomous agents who trade energy on wholesale and balancing markets and offer tariffs to customers. Both simulation server and brokers use real-world weather forecasts and reports provided by a weather server, which can serve multiple simulations simultaneously.

All services are run as independent programs communicating via TCP connections. The simulation server and brokers interact through the Apache ActiveMQ asynchronous messaging framework, included in the simulation server. The weather server provides a REST API to access its data. This interface is used by the simulation server, which in turn passes weather data to brokers. There is no direct connection between brokers and the weather server. As a system of independent services, Power TAC simulations can be run on single or multiple machines. During tournaments, brokers run on machines spread around the world, while the current implementation of the Experiment Manager runs all services on a single machine, as described in Sect. 12.5.

The simulation and weather servers are implemented in Java. Broker developers are free to use any desired technology for building their broker agents, as long as they can interact correctly with a Java-based core that handles communication with the server. While many brokers have been written as plain Java applications, competitors are increasingly using languages such as Python or R to provide access to a broader set of tools. Python, for example, provides access to a rich ecosystem of data analytics tools and machine learning algorithms. It is also more widely known among students outside Computer Science. The increasingly diverse nature of broker dependencies requires a different approach to building and deploying broker instances. Another core driver of our design is to minimize administrative overhead. Both factors have led us to select the container virtualization engine Docker as a tool to orchestrate the required services.


F. Milkau et al.

12.4.2 Container Virtualization

Docker is a widely used tool for “container virtualization.” Virtualization in general is a process by which a system’s physical resources are allocated and represented to subsystems as separate virtual resources. In the context of Docker, these subsystems are called containers. Docker effectively creates a distinct environment for each container and exposes only a subset of the host’s resources, e.g., CPU, memory, files, or network connections, to it. Containers interface with the host’s kernel functions and do not directly access hardware features, as shown in Fig. 12.3. With this architecture, a container’s processes run in isolation from all other processes and system resources.

Containers are primarily used to run standardized units of software that are distributed in the form of container images. A container image is composed of all files, file structure, and metadata required to run an application and its dependencies on any given host, provided that the host offers the required kernel interfaces. This means, for example, that Linux container images can only be run on Windows or MacOS systems if a Linux kernel is provided (footnote 4). Container images are made up of read-only layers, with each layer building on the previous one. By adding a writable layer on top of an existing image and providing host-specific configuration information such as port mappings, file path bindings, or resource quotas, container images become containers at runtime.

Fig. 12.3 Container host architecture

4 With Windows 10 introducing the Windows Subsystem for Linux (WSL) and MacOS using LinuxKit, the latest versions of both platforms are capable of running Linux containers.
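The layering idea described above can be illustrated with a toy model. The sketch below is not how Docker stores files; it only assumes that a lookup searches the writable layer first and then the read-only image layers from top to bottom, and that all writes land in the writable layer. File names and contents are invented for illustration.

```python
from collections import ChainMap

# Read-only image layers, lowest first (illustrative file names and contents).
base_layer = {"/bin/sh": "shell", "/etc/os-release": "linux"}
app_layer = {"/app/broker.jar": "broker v1", "/etc/os-release": "linux (patched)"}


def start_container(image_layers):
    """Return a container view: a fresh writable layer over the image layers."""
    writable = {}
    # ChainMap searches its mappings left to right, so the topmost layer wins
    # on lookup, and every write goes to the first mapping (the writable layer).
    return ChainMap(writable, *reversed(image_layers))


image = [base_layer, app_layer]
c1 = start_container(image)
c2 = start_container(image)

c1["/app/log.txt"] = "game 1 output"  # stored in c1's writable layer only

print(c1["/etc/os-release"])   # the upper image layer shadows the base layer
print("/app/log.txt" in c2)    # a second container does not see c1's writes
print(app_layer)               # the shared image layers stay unchanged
```

Because both containers share the same read-only layers, starting a second container from the same image costs only one additional writable layer in this model.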

12 Power TAC Experiment Manager: Support for Empirical Studies


We use Docker to package all simulation services as container images, enabling the EM to set up and run the services required for a single game based solely on a game’s configuration. On one hand, this approach allows users to control the space of possible simulations more precisely by using exact copies, meaning container images, of simulation services. On the other hand, it reduces the administrative overhead for EM users, since the burden of managing binaries and dependencies is with the server and broker developers and not with the users themselves.
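As a rough illustration of deriving a container from a configuration, the following sketch turns a small, hypothetical container specification into a `docker run` command line. The `ContainerSpec` class, the image name, and the paths are assumptions made for this example, and the EM's actual mechanism may differ; only standard Docker CLI flags are used.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContainerSpec:
    """Hypothetical description of one service container for a game."""
    name: str
    image: str
    network: str
    volumes: dict = field(default_factory=dict)  # host path -> container path
    ports: dict = field(default_factory=dict)    # host port -> container port
    memory: Optional[str] = None                 # resource quota, e.g. "2g"


def docker_run_args(spec):
    """Build the argument list for a detached container that is removed on exit."""
    args = ["docker", "run", "--detach", "--rm",
            "--name", spec.name, "--network", spec.network]
    for host_path, container_path in spec.volumes.items():
        args += ["--volume", f"{host_path}:{container_path}"]
    for host_port, container_port in spec.ports.items():
        args += ["--publish", f"{host_port}:{container_port}"]
    if spec.memory:
        args += ["--memory", spec.memory]
    args.append(spec.image)
    return args


# Illustrative values only; none of these names come from the Power TAC project.
server = ContainerSpec(
    name="game42-server",
    image="example/powertac-server:latest",
    network="game42",
    volumes={"/data/game42": "/powertac/log"},
    memory="2g",
)
print(" ".join(docker_run_args(server)))
```

Keeping the specification as data and generating the command from it means that the same game configuration always yields the same container setup.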

12.5 Implementation

In its current implementation, the EM consists of two main components, the orchestrator and the web-based Graphical User Interface (GUI). Additionally, a MySQL database is used for metadata persistence. The orchestrator is a Java application that manages the game, baseline, and treatment metadata and runs the specified games. The orchestrator directly interfaces with the host’s Docker engine to manage containers. The GUI is an independent web-based service that gives users the option to manage the orchestrator’s resources and receive updates about experiment status. Since the weather service is able to serve multiple simulation servers at once, it is included as part of the EM setup and not created on a per-game basis. To keep the EM as a whole as independent as possible from the underlying host OS, all services can be set up within Docker containers as well, as shown in Fig. 12.4, making Docker the only dependency required on host systems.

Fig. 12.4 Experiment Manager architecture


12.5.1 Game and Experiment Creation

A standalone game is created by manually specifying all parts of the game configuration. Using the GUI, users can define the parameters of a new game by selecting one or more brokers from the set of available brokers, setting a simulation date and location, choosing values for server parameters, and providing a name for the game, as shown in Fig. 12.5. This game configuration is sent to the orchestrator. After being validated, the specification is given a unique ID and is persisted in the MySQL database.

Fig. 12.5 User interface for the creation of a single game

A similar process is currently used for the creation of baseline games. In this case, a single game configuration is defined and used for a certain number of games (footnote 5), with the important exception that no seeds are specified for these games. Therefore, all baseline games differ only in their random-number sequences.

5 The number of games required to reach statistically significant results may vary depending on experiment design.

As outlined in Sect. 12.3.2, treatment configurations are defined by applying modifiers to baseline game configurations. In practice, this means that treatment games are created by: (a) changing either the broker set or server parameters and (b) using the baseline games’ state logs as seed files. This relationship naturally requires that the respective baseline games have finished before the treatment games are started.
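The relationship between baseline and treatment games can be sketched as follows. This is a minimal illustration under stated assumptions, not the orchestrator's data model: the `GameConfig` class, the parameter key `solar.population`, and the `.state` file naming are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class GameConfig:
    """Hypothetical game configuration record."""
    name: str
    brokers: tuple
    server_params: dict
    seed_file: Optional[str] = None  # None: the server draws fresh random seeds


def make_baseline(config, n_games):
    """Copy one configuration n times; no seeds are specified for baseline games."""
    return [replace(config, name=f"{config.name}-{i}") for i in range(1, n_games + 1)]


def make_treatment(baseline_games, label, brokers=None, param_overrides=None):
    """Apply a modifier to every baseline game and seed it from that game's state log."""
    treatment = []
    for game in baseline_games:
        treatment.append(replace(
            game,
            name=f"{game.name}-{label}",
            brokers=tuple(brokers) if brokers is not None else game.brokers,
            server_params={**game.server_params, **(param_overrides or {})},
            seed_file=f"{game.name}.state",  # hypothetical state-log file name
        ))
    return treatment


base = GameConfig("CJ-P1", ("default", "broker_a"), {"solar.population": 1})
baseline = make_baseline(base, 3)
treatment = make_treatment(baseline, "P7000", param_overrides={"solar.population": 7000})
for game in treatment:
    print(game.name, game.server_params, game.seed_file)
```

Each treatment game points at exactly one baseline game's state log, which is why the baseline set must be complete before the treatment set can start.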

12.5.2 Simulation Automation

To run games according to their configurations, the EM translates the configurations into the container layouts required to run games, including virtual networks, and composes sets of configuration files for individual containers. When running a game, the EM interfaces with the Docker daemon to create containers for the simulation server and the brokers according to their respective configurations. These containers are one-off instances, meaning that they will be removed after the game run has finished. In contrast, the weather server container can be shared across game runs and by multiple simultaneous games and is not created on a per-game basis. The resulting container layout is described in Sect. 12.4.1.

The current EM implementation uses a first-in first-out queue to schedule games, meaning that the games are run in order of their creation. Depending on the resources available on the host machine, the EM provides one or more execution slots; the number of slots determines how many games can be run in parallel. The EM periodically checks whether there are any games that have not yet been started. If there are, they are handled as follows:

1. If there is an available execution slot, a new thread is created for the game.
2. Within this thread, the EM creates files, containers, and virtual network endpoints based on the game configuration.
3. After all resources are created and configured, the containers are started.
4. Once a game is running, the EM periodically checks the status of running containers, which can be one of {running, completed, crashed}.
5. Once a game run has finished, the EM will update the game’s metadata and check several post conditions to determine whether the game has completed successfully or has failed. Then it will either:
   (a) re-run the game for a limited number of retries if one of the post conditions is not met, or
   (b) complete the run and open up the execution slot.
6. After a game has finished, the resulting data are made available on the host’s file system.

The EM can be used for a variety of purposes, and the standard error-checking methods may not detect problems that are specific to the purpose of a given experiment. Therefore, it is possible, especially when using the EM for broker development, that errors will occur in the simulation or in a broker that will require active user involvement.
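The scheduling steps above can be sketched in a few lines. This is a simplified stand-in rather than the EM's implementation: a thread pool replaces the periodic polling loop, its worker count plays the role of the execution slots, and the retry limit, the `launch` callback, and the post-condition check are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_RETRIES = 2  # assumed value; the text only says the number of retries is limited


def run_game(game_id, launch, check_postconditions):
    """Run one game and re-run it while it crashes or fails its post conditions."""
    for attempt in range(1, MAX_RETRIES + 2):
        status = launch(game_id, attempt)  # returns "completed" or "crashed"
        if status == "completed" and check_postconditions(game_id):
            return game_id, "completed", attempt
    return game_id, "failed", attempt


def run_queue(game_ids, slots, launch, check_postconditions):
    """Start games in creation order, with at most `slots` games running at once."""
    # The executor hands out queued work in submission order (first in, first out).
    with ThreadPoolExecutor(max_workers=slots) as pool:
        futures = [pool.submit(run_game, g, launch, check_postconditions)
                   for g in game_ids]
        return [f.result() for f in futures]


def fake_launch(game_id, attempt):
    """Stand-in for creating and starting containers: g2 crashes once, g4 always."""
    if game_id == "g4" or (game_id == "g2" and attempt == 1):
        return "crashed"
    return "completed"


results = run_queue(["g1", "g2", "g3", "g4"], slots=2,
                    launch=fake_launch, check_postconditions=lambda g: True)
for row in results:
    print(row)
```

In this run, g2 succeeds on its second attempt, while g4 is reported as failed after three attempts and would need user attention.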

12.6 Getting Results

To demonstrate the system, we created a small experiment to examine variation along one of the environment space dimensions. Treatments varied the number of distributed small-scale solar installations. The standard configuration includes 7000 units of 5 kW each, which is almost enough to cover total energy consumption in the middle of a sunny day, as long as the weather is not hot enough to create a large air-conditioning load.

Due to its configuration architecture, the Power TAC simulation server is deployed with a default configuration. When designing games and experiments, users are only required to explicitly set the parameters that relate to the research question at hand, which then override the default configuration. For the experiment, we modified the number of small-scale solar installations in four steps: the population parameter was set to only one unit for the baseline and to 7000, 14,000, and 21,000 units for the respective treatments.

We used real-world weather data for both Rotterdam, NL, with simulation start date December 25th, 2009 (RD), and Cheyenne, Wyoming, USA, with simulation start date June 23rd, 2014 (CJ). The games were therefore partitioned into eight sets, as shown in Table 12.1. All games ran with the same set of eight brokers, including the default broker: AgentUDE17, CrocodileAgent16, IS3 (footnote 6), Maxon16, SPOT17, SPOT19, and TUC_TAC2020 (footnote 7). Each set consisted of 10 games, for a total of 80 games. The only difference between games within a given set was the random seed used for a particular game. The random-sequence seeds were created while running baseline games and subsequently re-used for treatment games.

Table 12.1 Game set partition

Solar pop.    Weather: CJ     Weather: RD
1             CJ – P1         RD – P1
7000          CJ – P7000      RD – P7000
14,000        CJ – P14000     RD – P14000
21,000        CJ – P21000     RD – P21000

6 Development snapshot for the 2022 tournament; the Docker image can be found on Docker Hub: https://hub.docker.com/r/is3cologne/is3-broker/tags (tag: exp-solar-leasing).

7 Broker binaries are available on the Power TAC broker repository. Utilities to build broker images can be found on GitHub: https://github.com/powertac/broker-images.

Fig. 12.6 Daily market price contours in the Cheyenne baseline. The wide variability in afternoon-evening prices is likely due to air conditioning during hot spells

We wanted to test two hypotheses:

1. Increased population of distributed solar might correlate with lower wholesale market prices.
2. Increased population of distributed solar might result in higher supply/demand imbalance values for individual brokers.

For the first hypothesis, we found no clear effect. Wholesale prices in the treatments differed little from the baseline pattern, seen for the Cheyenne simulation in Fig. 12.6. Only three of the brokers, the default broker, TUC_TAC, and AgentUDE, offered solar tariffs. There should have been plenty of room for profits between the observed wholesale market pricing and the low price of 0.01/kWh offered by the default broker for distributed generation. It is likely that the developers of these brokers did not put much effort into optimizing their pricing for solar, given the low amount of solar generation in the default configuration. At the same time, the default broker does not have a sophisticated approach to bidding in the wholesale market, as we shall see in the next example.

For the second hypothesis, we saw significant effects on broker balancing behavior. Brokers must match hourly supply and demand, with any imbalance being supplied from one of several sources: (1) storage capacity or demand flexibility among their subscribed customers; (2) imbalance of the opposite sign from other brokers; or (3) regulation capacity in the wholesale market, which is much more expensive than wholesale power. With the baseline configuration using the Cheyenne summer weather (Fig. 12.7), we see that some agents need additional energy or “up-regulation” (positive values), while others typically are oversupplied and need “down-regulation.”


Fig. 12.7 Broker imbalance with 1 solar producer

Fig. 12.8 Broker imbalance with 21,000 solar producers

At the other extreme (Fig. 12.8), we see that the default broker is oversupplied, and several of the other brokers, including IS3, TUC_TAC, and Maxon16, are experiencing much wider variations in their hourly imbalance values.
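To make the imbalance measure concrete, the sketch below computes hourly imbalance per broker and the net imbalance that remains after opposite-sign imbalances cancel. It assumes the sign convention used in the text (positive values mean a broker is short and needs up-regulation); the broker names and all numbers are invented and are not simulation results.

```python
from statistics import mean, pstdev


def hourly_imbalance(demand_kwh, supply_kwh):
    """Positive: the broker is short (up-regulation); negative: oversupplied."""
    return [d - s for d, s in zip(demand_kwh, supply_kwh)]


def summarize(imbalance):
    """Mean, spread, and counts of hours needing up- or down-regulation."""
    return {
        "mean": mean(imbalance),
        "spread": pstdev(imbalance),
        "hours_up": sum(1 for x in imbalance if x > 0),
        "hours_down": sum(1 for x in imbalance if x < 0),
    }


# Invented demand and supply for two hypothetical brokers over four hours (kWh).
brokers = {
    "broker_a": hourly_imbalance([100, 120, 90, 110], [90, 100, 95, 100]),
    "broker_b": hourly_imbalance([80, 85, 70, 75], [90, 90, 80, 80]),
}

# Imbalances of opposite sign cancel across brokers; only the net amount
# has to be covered by regulation capacity.
net = [sum(hour) for hour in zip(*brokers.values())]

for name, series in brokers.items():
    print(name, series, summarize(series))
print("net", net)
```

A wider spread in a broker's hourly values, as reported for several brokers in the high-solar treatment, would show up here as a larger `spread` entry.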


This experiment used two baseline sets, one using Cheyenne summer weather and one using Rotterdam winter weather. Unsurprisingly, we find that the effect of additional solar capacity in the Rotterdam winter is not significant.
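The experiment design used in this section (two weather settings, four solar populations, ten games per set) can be enumerated programmatically. The sketch below is only an illustration of that design; the dictionary keys and the `seed_source` naming are hypothetical and do not reflect the EM's configuration format.

```python
from itertools import product

# Weather settings and start dates as stated in the text.
WEATHER = {"CJ": ("Cheyenne", "2014-06-23"), "RD": ("Rotterdam", "2009-12-25")}
SOLAR_POPULATIONS = [1, 7000, 14000, 21000]  # a population of 1 is the baseline
GAMES_PER_SET = 10


def build_game_sets():
    """Return a mapping from set name to the list of games in that set."""
    sets = {}
    for weather, population in product(WEATHER, SOLAR_POPULATIONS):
        location, start_date = WEATHER[weather]
        baseline_set = f"{weather}-P1"
        sets[f"{weather}-P{population}"] = [
            {
                "location": location,
                "start_date": start_date,
                "solar_population": population,
                # Baseline games draw new seeds; treatment game i reuses the
                # seeds produced by baseline game i of the same weather setting.
                "seed_source": None if population == 1 else f"{baseline_set}/game-{i}",
            }
            for i in range(1, GAMES_PER_SET + 1)
        ]
    return sets


game_sets = build_game_sets()
print(len(game_sets), sum(len(games) for games in game_sets.values()))  # 8 80
```

Pairing treatment game i with baseline game i keeps the random sequences identical across treatments, so differences between sets can be attributed to the solar population rather than to seed variation.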

12.7 Conclusion

Power TAC is an environment for supporting policy research in retail electricity markets, for studying a variety of approaches to trading in these complex markets, and for extrapolating into a future that may have much higher penetrations of solar/wind capacity, electric vehicles, heat pumps, and other factors that we expect to see in our transition to a more sustainable energy environment. It is designed to support the Competitive Benchmarking research model [9], in which a platform is used for periodic competitions and empirical study and is periodically extended to maintain its relevance to evolving ideas about how electricity should be generated, distributed, and managed.

The Power TAC experiment manager is a new tool for supporting empirical study using the Power TAC simulation environment, addressing two important limitations that have made the system less than ideal for conducting empirical study. The first is the need to set up and run tens to hundreds of multi-hour simulation sessions while keeping track of the configuration and logs from each session and of the relationships between hundreds of GB of logs and the experimental conditions that produced them. The second is the fact that many of the more capable broker designs are not simple programs, but rather fairly complex setups involving code in multiple programming languages, along with data sets resulting from machine learning processes. Many of the competitive brokers are built by students who have neither the time nor the experience to package them up and support them for use by others. Encapsulated as Docker container images, these packages can be shared without exposing sources and without requiring specific configurations of tools in the experimental environment.

While the EM already works as intended for the core concern of designing and running game series, we plan to expand on this foundation. Because some EM features are still at an early stage of development, we will first and foremost focus on fixing stability issues and adding quality-of-life features, for example, file storage management. One limiting factor we observed during the EM’s development process was the hardware resources provided by some host systems. Games with a large number of brokers sometimes stopped unexpectedly because the simulation server container was not able to send or receive messages in time. Therefore, we would like to investigate setups that include multiple container hosts. We hope that by balancing the load between several hosts we can both avoid this class of errors and enable more resource-intensive experiments. Additionally, we would like to collect physical resource data for both broker and server containers to create usage profiles. By having these profiles available, we hope to be able to create data-driven game schedules and to provide more precise physical resource requirements with respect to experiment design.

The Power TAC ecosystem already provides a set of analysis tools to process and visualize simulation data (footnote 8). Since analysis will always be a part of experimentation, we would like to add interfaces for these tools and integrate them into the EM, further automating existing workflows. It is our hope that by unifying all Power TAC tools into one integrated platform, Power TAC will become more accessible for new users and users without the skills currently required to run Power TAC via a command-line interface. This will make the Power TAC platform more attractive for use in less technical domains such as economics and sustainability research.

8 Power TAC tools on GitHub: https://github.com/powertac/powertac-tools

References

1. Arunachalam, R., Sadeh, N.M.: The supply chain trading agent competition. Electron. Commerce Res. Appl. 4(1), 66–84 (2005). Elsevier
2. Cohen, P.R.: Empirical Methods for Artificial Intelligence, vol. 139. MIT Press, Cambridge (1995). http://opencoursesfree.org/archived_courses/seas.harvard.edu/~parkes/cs286r/spring08/reading6/CohenTutorial.pdf
3. Collins, J., Ketter, W.: Power TAC: software architecture for a competitive simulation of sustainable smart energy markets. SoftwareX 20, 101217 (2022). https://doi.org/10.1016/j.softx.2022.101217, https://www.sciencedirect.com/science/article/pii/S2352711022001352
4. Collins, J., Ketter, W., Pakanati, A.: An experiment management framework for TAC SCM agent evaluation. In: TADA 09 Workshop on Trading Agent Design and Analysis. IJCAI, Pasadena (2009)
5. Collins, J., Ketter, W., Sadeh, N.: Pushing the limits of rational agents: the Trading Agent Competition for Supply Chain Management. AI Mag. 31(2), 63–80 (2010)
6. Jordan, P.R., Kiekintveld, C., Wellman, M.P.: Empirical game-theoretic analysis of the TAC Supply Chain game. In: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS’07, pp. 193:1–193:8. ACM, New York (2007). https://doi.org/10.1145/1329125.1329359
7. Ketter, W., Collins, J., Gini, M., Gupta, A., Schrater, P.: Detecting and forecasting economic regimes in multi-agent automated exchanges. Decis. Support Syst. 47(4), 307–318 (2009). https://doi.org/10.1016/j.dss.2009.05.012, http://www.sciencedirect.com/science/article/pii/S0167923609001262
8. Ketter, W., Collins, J., Weerdt, M.D.: The 2020 Power Trading Agent Competition. SSRN Scholarly Paper ID 3564107, Social Science Research Network, Rochester (2020). https://doi.org/10.2139/ssrn.3564107, https://papers.ssrn.com/abstract=3564107
9. Ketter, W., Peters, M., Collins, J., Gupta, A.: Competitive benchmarking: an IS research approach to address wicked problems with big data and analytics. MIS Q. 40(4), 1057–1080 (2016). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2700333
10. Ketter, W., Peters, M., Collins, J., Gupta, A.: A multiagent competitive gaming platform to address societal challenges. MIS Q. 40(2), 447–460 (2016)
11. Law, A.M., Kelton, W.D.: Simulation Modeling and Analysis, vol. 3. McGraw-Hill, New York (2007)
12. Sodomka, E., Collins, J., Gini, M.: Efficient statistical methods for evaluating trading agent performance. In: AAAI07, pp. 770–775 (2007)
13. Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Reeves, D., Walsh, W.E.: Designing the market game for a trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001). IEEE