Artificial Intelligence Enabled Computational Methods for Smart Grid Forecast and Dispatch (ISBN 9819907985, 9789819907984)

Table of contents:
Foreword
Preface
Acknowledgments
Contents
1 Introduction for Smart Grid Forecast and Dispatch
1.1 Smart Grid Forecast
1.2 Smart Grid Dispatch
1.2.1 Problem Statement
1.2.2 Problem Properties
References
2 Review for Smart Grid Forecast
2.1 Introduction
2.2 The Load and Netload Forecasting
2.2.1 The Representative Patterns of Load Forecasting
2.2.2 The Statistical Model of Load/Net Load Forecasting
2.2.3 The Machine Learning Model of Load and Netload Forecasting
2.3 The Electrical Price Forecasting
2.3.1 The Mathematical Method for Electrical Price Forecasting
2.3.2 The Learning Method for Electrical Price Forecasting
2.4 The Electrical Vehicle Charging Station Charging Power Forecasting
2.4.1 Model-Based Electrical Vehicle Charging Station Charging Power Forecasting Method
2.4.2 Data-Driven Electrical Vehicle Charging Station Charging Power Forecasting Method
References
3 Review for Smart Grid Dispatch
3.1 Introduction
3.2 Real-World Applications
3.2.1 Distribution Network
3.2.2 Microgrid Network
3.2.3 Electric Vehicles
3.2.4 Integrated Energy System
3.3 The Methods for Smart Grid Dispatch
3.3.1 Mathematical Programming
3.3.2 Evolutionary Algorithms
3.3.3 AI-Enabled Methods
References
4 Deep Learning-Based Densely Connected Network for Load Forecast
4.1 Introduction
4.2 Residual Architecture
4.3 Unshared Convolution
4.4 Densely Connected Network
4.4.1 Overall Framework
4.4.2 Densely Connected Block
4.4.3 Clipped L2-norm
4.4.4 Smooth Loss
4.4.5 Smooth Quantile Regression
4.5 Case Study
4.5.1 Data Description
4.5.2 Case 1: Methods Validation
4.5.3 Case 2: Deterministic Forecasting
4.5.4 Case 3: Probabilistic Forecasting
4.6 Conclusion
References
5 Reinforcement Learning Assisted Deep Learning for Probabilistic Charging Power Forecasting of EVCS
5.1 Introduction
5.2 Framework
5.2.1 Problem Formulation
5.2.2 The Probabilistic Forecast Framework of EVCS Charging Power
5.3 Data Transformer Method
5.4 Reinforcement Learning Assisted Deep Learning Algorithm
5.4.1 Long Short-Term Memory
5.4.2 The Modeling of LSTM Cell State Variation
5.4.3 Proximal Policy Optimization
5.5 Adaptive Exploration Proximal Policy Optimization
5.6 Case Study
5.6.1 Data Description and Experiential Initialization
5.6.2 The Performance of Probabilistic Forecasting Obtained by LSTM-AePPO
5.6.3 Metrics Comparison Among Different Algorithms
5.6.4 The Effectiveness of AePPO
5.7 Conclusion
References
6 Dense Skip Attention-Based Deep Learning for Day-Ahead Electricity Price Forecasting with a Drop-Connected Structure
6.1 Introduction
6.2 Structure of the Proposed Framework
6.2.1 Data Preprocessing
6.2.2 Feature Extraction
6.2.3 Autoweighting of Features
6.2.4 Target Regression
6.3 Drop-Connected UCNN-GRU
6.3.1 Advanced Residual UCNN Block
6.3.2 Drop-Connected Structure
6.4 Dense Skip Attention Mechanism
6.4.1 Dense Skip Connection
6.4.2 Feature-Wise Attention Block
6.5 Case Study
6.5.1 Data Description
6.5.2 Implementation Details
6.5.3 Case 1: Model Effectiveness Evaluation
6.5.4 Case 2: Comparison with Statistical Techniques
6.5.5 Case 3: Comparison with Conventional DL Techniques
6.6 Conclusion
6.7 Quantile Regression
6.8 Formulation of the Evaluation Index
6.9 PReLU: A Solution to the Neuron Inactivation
References
7 Uncertainty Characterization of Power Grid Net Load of Dirichlet Process Mixture Model Based on Relevant Data
7.1 Introduction
7.2 A Bayesian Framework Based on the Dirichlet Mixture Model of Data Association
7.2.1 Net Load Time-Series Correlation
7.2.2 Bayesian Framework Based on the Dirichlet Mixture Model of Data Association
7.2.3 Dirichlet Process and Folded Stick Construction Representation
7.2.4 Nonparametric Dirichlet Mixture Model
7.3 The Dirichlet Mixture Model Based on VBI for Data Association
7.3.1 Nonparametric Dirichlet Mixture Model
7.3.2 Variational Posterior Distribution Considering Data Association
7.4 Example Analysis
7.4.1 Description of the Algorithm
7.4.2 DDPMM Convergence Analysis
7.4.3 Analysis of DDPMM Fitting Effect
7.4.4 DDPMM Interval Indicator Analysis
7.5 Conclusion
References
8 Extreme Learning Machine for Economic Dispatch with High Penetration of Wind Power
8.1 Introduction
8.1.1 Background and Motivation
8.1.2 Literature Review
8.1.3 Contribution of This Paper
8.2 Multi-objective Economic Dispatch Model
8.2.1 Formulations of Economic Dispatch
8.2.2 Multi-objective Economic Dispatch Model
8.3 Extreme Learning Machine Assisted Group Search Optimizer with Multiple Producers
8.3.1 Group Search Optimizer with Multiple Producers
8.3.2 ELM Assisted GSOMP
8.4 Simulation Studies
8.4.1 Simulation Settings
8.4.2 Simulation Results
8.5 Conclusion
References
9 Multi-objective Optimization Approach for Coordinated Scheduling of Electric Vehicles-Wind Integrated Power Systems
9.1 Introduction
9.2 Operation Models of EV and Wind Power
9.2.1 Operational Model of EV Charging Station
9.2.2 Model of Uncertain Wind Power
9.2.3 Wind Power Curtailment Based on Probability Model
9.3 Coordinated Scheduling Model Integrated EV and Wind Power
9.3.1 Objective Functions
9.3.2 Decision Variables
9.3.3 Constraints
9.4 Solution of Coordinated Stochastic Scheduling Model
9.4.1 The Parameter Adaptive DE Algorithm
9.4.2 Decision-Making Method
9.4.3 Solution Procedure
9.5 Case Study
9.5.1 Case Description
9.5.2 Result Analysis
9.5.3 Algorithm Performance Analysis
9.6 Conclusion
References
10 Many-Objective Distribution Network Reconfiguration Using Deep Reinforcement Learning-Assisted Optimization Algorithm
10.1 Introduction
10.2 Many-Objective Distribution Network Reconfiguration Model
10.2.1 Problem Formulations
10.2.2 Objectives
10.3 Deep Reinforcement Learning-Assisted Multi-objective Bacterial Foraging Optimization Algorithm
10.3.1 Multi-objective Bacterial Foraging Optimization Algorithm
10.3.2 Deep Reinforcement Learning
10.3.3 Multi-objective Bacterial Foraging Optimization Algorithm Based on Deep Reinforcement Learning
10.4 Simulation Studies
10.4.1 Simulation Settings
10.4.2 Simulation Result and Analysis
10.4.3 Comparison with Other Algorithms
10.5 Conclusion
References
11 Federated Multi-agent Deep Reinforcement Learning for Multi-microgrid Energy Management
11.1 Introduction
11.2 Theoretical Basis of Reinforcement Learning
11.3 Decentralized Multi-microgrid Energy Management Model
11.3.1 Isolated Microgrid Energy Management Model
11.3.2 Isolated MG Energy Management Model via MDP
11.3.3 Decentralized Multi-microgrid Energy Management Model
11.4 Federated Multi-agent Deep Reinforcement Learning Algorithm
11.4.1 Proximal Policy Optimization
11.4.2 Federated Learning
11.4.3 Federated Multi-agent Deep Reinforcement Learning Algorithm
11.5 Case Study
11.5.1 Experiment Setup
11.5.2 Analysis of the F-MADRL Algorithm
11.5.3 Performance Comparison
11.6 Conclusion
References
12 Prospects of Future Research Issues
12.1 Smart Grid Forecast Issues
12.1.1 Challenges
12.1.2 Future Research Directions
12.2 Smart Grid Dispatch Issues
12.2.1 Challenges
12.2.2 Future Research Directions

Engineering Applications of Computational Methods 14

Yuanzheng Li · Yong Zhao · Lei Wu · Zhigang Zeng

Artificial Intelligence Enabled Computational Methods for Smart Grid Forecast and Dispatch

Engineering Applications of Computational Methods Volume 14

Series Editors Liang Gao, State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China Akhil Garg, School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China

The book series Engineering Applications of Computational Methods addresses the numerous applications of mathematical theory and latest computational or numerical methods in various fields of engineering. It emphasizes the practical application of these methods, with possible aspects in programming. New and developing computational methods using big data, machine learning and AI are discussed in this book series, and could be applied to engineering fields, such as manufacturing, industrial engineering, control engineering, civil engineering, energy engineering and material engineering. The book series Engineering Applications of Computational Methods aims to introduce important computational methods adopted in different engineering projects to researchers and engineers. The individual book volumes in the series are thematic. The goal of each volume is to give readers a comprehensive overview of how the computational methods in a certain engineering area can be used. As a collection, the series provides valuable resources to a wide audience in academia, the engineering research community, industry and anyone else who is looking to expand their knowledge of computational methods. This book series is indexed in both the Scopus and Compendex databases.

Yuanzheng Li · Yong Zhao · Lei Wu · Zhigang Zeng

Artificial Intelligence Enabled Computational Methods for Smart Grid Forecast and Dispatch

Yuanzheng Li School of Artificial Intelligence and Automation Huazhong University of Science and Technology Wuhan, Hubei, China

Yong Zhao School of Artificial Intelligence and Automation Huazhong University of Science and Technology Wuhan, Hubei, China

Lei Wu Department of Electrical and Computer Engineering Stevens Institute of Technology Hoboken, NJ, USA

Zhigang Zeng School of Artificial Intelligence and Automation Huazhong University of Science and Technology Wuhan, Hubei, China

ISSN 2662-3366 ISSN 2662-3374 (electronic) Engineering Applications of Computational Methods ISBN 978-981-99-0798-4 ISBN 978-981-99-0799-1 (eBook) https://doi.org/10.1007/978-981-99-0799-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Foreword

With the increasing penetration of renewable energy and flexible loads in smart grids, a more complicated power system with high uncertainty is gradually taking shape, which accordingly brings great challenges to smart grid forecast and dispatch. Traditional methods usually require accurate mathematical models, and they cannot deal well with the growing complexity and uncertainty. Fortunately, the widespread adoption of advanced meters makes it possible for smart grids to collect massive data, which offers opportunities for data-driven artificial intelligence (AI) methods to address the forecast and dispatch issues. In fact, big data and AI-enabled computational methods are widely deployed nowadays. People from different industries try to apply AI-enabled techniques to solve practical yet challenging problems. The power and energy industry is no exception. AI-enabled computational methods can be utilized to fully explore the value behind these historical data and enhance electric services such as power forecast and dispatch.

This book explores and discusses the applications of AI-enabled forecast and dispatch techniques in smart grids. The contents are divided into three parts. The first part (Chaps. 1–3) provides a comprehensive review of recent developments in smart grid forecast and dispatch, respectively. Then, the second part (Chaps. 4–7) investigates AI-enabled forecast approaches for smart grid applications, such as load forecast, electricity price forecast and charging power forecast of electric vehicle charging stations. On this basis, the smart grid dispatch issues are introduced in the third part (Chaps. 8–11). This part introduces the application of extreme learning machines, data-driven Bayesian-assisted optimization algorithms, multi-objective optimization approaches, deep reinforcement learning and federated learning. Finally, the future research directions of smart grid forecast and dispatch (Chap. 12) are presented. This book presents model formulations, novel algorithms, in-depth discussions and comprehensive case studies.

One author of this book, Prof. Zhigang Zeng, is an internationally established researcher in the area of AI. He has also conducted extensive work on the application of AI in smart grids. Moreover, another author, Prof. Lei Wu, is an expert in smart grid dispatch and serves as an associate editor of several top-tier international journals. Prof. Yong Zhao, one of the coauthors, has engaged in several practical projects and accumulated valuable experience in smart grid research. The first author, Prof. Yuanzheng Li, has conducted research on smart grids for more than 15 years, published a variety of academic papers and completed many practical projects. It is a book worth reading, and readers will benefit much from its AI perspective and from seeing how AI-enabled computational methods are used in smart grid forecast and dispatch.

Prof. Yang Shi
Fellow of IEEE, ASME, CSME, Fellow of Engineering Institute of Canada
University of Victoria
Victoria, Canada
[email protected]

Preface

As the next generation of the power system, the smart grid is devoted to achieving sustainable, secure, reliable and flexible energy delivery through decarbonization, decentralization and digitization. In order to realize the modernization of the power system, an increasing penetration of renewable energy is integrated into the smart grid, which also challenges the reliability, stability and flexibility of the power and energy system. Furthermore, a large number of distributed energy resources such as photovoltaics, wind power and electric vehicles make the smart grid more decentralized and complicated. Meanwhile, data acquisition devices such as advanced meters are gaining popularity, which enables an immense amount of fine-grained electricity data to be collected. To this end, the modern smart grid calls for making the best use of these historical data to improve power system operation. Against this background, data-driven artificial intelligence (AI) computational approaches are applied in the power and energy system to address the forecast and dispatch issues. In fact, AI-enabled computational methods and machine learning techniques such as deep learning, reinforcement learning and federated learning have developed considerably in recent years. It seems natural to figure out how to apply these state-of-the-art techniques to uncertainty forecast and energy dispatch. However, the power industry faces a predicament: even though an increasing and huge amount of smart meter data is collected, these data are not yet fully utilized due to the complexity, uncertainty and privacy concerns of the power system. As a result, our book aims to take full advantage of these numerous data and advanced AI techniques to present some successful applications and also inspire more valuable thoughts, which is quite important for both academia and industry.

This book is a monograph about AI-enabled computational methods for smart grid forecast and dispatch, which consists of 12 chapters. It begins with an overview of the basic concepts of smart grid forecast and dispatch in terms of problem statement and properties. Since uncertainty forecast is the basis of further smart grid dispatch and its applications, three issues on AI-enabled forecast approaches, i.e., electrical load forecast, electricity price forecast and electrical vehicle charging station charging power forecast, are subsequently studied. On this basis, the following works depict the increasingly dynamic and complicated smart grid dispatch issues. Specific techniques introduced in this book include reinforcement learning, federated learning, machine learning as well as AI-assisted evolutionary algorithms. Finally, prospects of future research issues on smart grid forecast and dispatch are provided at the end of this book. To help readers gain a better understanding of what we have done, we briefly review the 12 chapters in the following.

Chapter 1 conducts a brief introduction to smart grid forecast and dispatch issues, including the concept of smart grid and an application-oriented review of forecast and dispatch techniques. Following the three stages of analytics, namely descriptive, predictive and prescriptive analytics, the key problem statements and properties are identified in this chapter.

Chapter 2 provides a comprehensive review of smart grid forecast and decomposes the key application areas into three aspects from the perspective of consumers: the load and netload forecast, the electricity price forecast as well as the electrical vehicle charging station charging power forecast. On this basis, the research framework for smart grid forecast is established in this chapter.

Chapter 3 offers an application-oriented survey of dispatch techniques and methodologies in the smart grid. Some real-world applications regarding smart grid dispatch are introduced in this chapter, including the distribution network, microgrid network, electric vehicles and the integrated energy system. After that, the classical methods for smart grid dispatch are divided into three categories, i.e., mathematical programming, evolutionary algorithms and AI-enabled approaches, which are discussed in detail, respectively.

Chapter 4 develops a novel deep learning model for deterministic and probabilistic load forecasting. In this model, an unshared convolutional neural network is selected as the backbone, which is the first application of unshared convolution to load forecasting. By reconstructing the unshared convolution layers into a densely connected structure, this architecture has a good nonlinear approximation capability and can be trained in an end-to-end fashion.

Chapter 5 proposes a reinforcement learning assisted deep learning probabilistic forecast framework for the charging power of EVCS. This framework contains a data transformer method to preprocess the charging session data and a probabilistic forecast algorithm, termed LSTM-AePPO. In this framework, the LSTM is used to forecast the mean value of the forecast distribution, and the variation of its cell state is modeled as an MDP. Then, a reinforcement learning algorithm, AePPO, is applied to solve the MDP model and calculate the variance of the forecast distribution.

Chapter 6 presents an effective DL-based DAEPF model for deterministic and interval forecasting. Recognizing that temporal variability exists in electricity price datasets, a coherently aggregating structure of unshared convolutional neural network and gated recurrent unit is proposed to extract multi-term dependency features. Considering the feature-wise variability, a feature-wise attention block is proposed for autoweighting in the feature dimension.

Chapter 7 introduces a Dirichlet mixture model based on data association and improves the posterior distribution by a variational inference method, so that the posterior distribution takes more information on net load data association into account. The improved evidence lower bound is then constructed so that the DDPMM obtains a suitable variational distribution through this lower bound, and its convergence is proved by combining it with the EM algorithm.

Chapter 8 proposes a multi-objective ED (MuOED) model with uncertain wind power. In this model, the expected generation cost, the upside potential and the downside risk are taken into account at the same time. The MuOED model is then formulated as a tri-objective optimization problem, and an extreme learning machine assisted group search optimizer with multiple producers is used to solve it. Afterward, a fuzzy decision-making method is used for choosing the final dispatch solution.

Chapter 9 depicts a coordinated stochastic scheduling model of an electric vehicle and wind power integrated smart grid to conduct a comprehensive investigation among wind power curtailment, generator cost and pollution emission. Specifically, the proposed model considers the uncertainties of wind power and calculates wind power curtailment by probabilistic information. Besides, a parameter-adaptive differential evolution algorithm is proposed to solve the above optimization problem in an efficient way.

Chapter 10 presents a many-objective distribution network reconfiguration model with stochastic photovoltaic power. In this model, the objective function involves the photovoltaic power curtailment, voltage deviation, power loss, static voltage stability and generation cost. Then, a deep reinforcement learning assisted multi-objective bacterial foraging optimization algorithm is proposed to solve the above many-objective distribution network reconfiguration model.

Chapter 11 designs a federated multi-agent deep reinforcement learning algorithm for multi-microgrid energy management. A decentralized multi-microgrid model is built first, which includes numerous isolated microgrids, and an agent is used to control the dispatchable elements of each microgrid for its energy self-sufficiency. Then, the federated learning mechanism is introduced to build a global agent that aggregates the parameters of all local agents on the server and replaces the local microgrid agents with the global one.

Chapter 12 discusses some research trends in smart grid forecast and dispatch, such as big data issues, novel machine learning technologies, new distributed models, the transition of smart grids, and data privacy and security concerns. On this basis, a relatively comprehensive understanding of the challenges of current forecast and dispatch approaches, potential solutions and future directions is depicted in this chapter.

In conclusion, this book provides various applications of state-of-the-art AI-enabled forecast and dispatch techniques for smart grid operation. We hope this book can inspire readers to define new problems, apply novel methods and obtain interesting results with the massive historical data in power systems.

Wuhan, China — Yuanzheng Li
Wuhan, China — Yong Zhao
Hoboken, USA — Lei Wu
Wuhan, China — Zhigang Zeng

Acknowledgments

This book summarizes our research on smart grid forecast and dispatch achieved in recent years. These works were carried out in the School of Artificial Intelligence and Automation, Ministry of Education Key Laboratory of Image Processing and Intelligence Control, Huazhong University of Science and Technology, Wuhan, China. Many people contributed to this book in various ways. The authors are indebted to Mr. Zhixian Ni, Mr. Jingjing Huang, Mr. Chaofan Yu, Mr. Shangyang He, Mr. Guokai Hao, Mr. Tianle Sun, Mr. Yizhou Ding, Mr. Fushen Zhang and Mr. Jun Zhang from Huazhong University of Science and Technology, who have contributed materials to this book; we thank them for their assistance in pointing out typos and checking the whole book. In addition, we appreciate the staff at Springer for their assistance and help in the preparation of this book. This work is supported in part by the National Natural Science Foundation of China (Grant 62073148), in part by the Key Project of the National Natural Science Foundation of China (Grant 62233006) and in part by the Key Scientific and Technological Research Project of State Grid Corporation of China (Grant No. 1400-202099523A-0-0-00). The authors really appreciate their support.

Yuanzheng Li
Yong Zhao
Lei Wu
Zhigang Zeng



Chapter 1

Introduction for Smart Grid Forecast and Dispatch

As a novel generation of power systems, smart grid is devoted to achieving sustainable, secure, reliable and flexible energy delivery through bidirectional power and information flows. In general, the smart grid mainly possesses the following features.

(1) Smart grid offers a more efficient way to ensure optimal dispatch with a lower generation cost and higher power quality via the integration of distributed sources and flexible loads, such as renewable energy and electric vehicles [1–5].
(2) Smart grid achieves the secure and stable operation of the power system via the deployment of effective operational control technologies, including automatic generation control, autonomous voltage control and load frequency control [6–9].
(3) Smart grid provides a transaction platform for customers and suppliers affiliated with different entities, thus enhancing the interactions between suppliers and customers, which facilitates the development of the electricity market [10–12].
(4) Smart grid is equipped with numerous advanced infrastructures including sensors, meters and controllers, which also gives rise to some emerging issues, such as network security and privacy concerns [13–16].

On this basis, the typical architecture of smart grid is depicted in Fig. 1.1, which illustrates that the operation of smart grid involves four fundamental segments, i.e., power generation, transmission, distribution and customers. As for the generation part, traditional thermal energy is converted into electrical power, and large-scale renewable energy integration is a trend in smart grid. After that, the electrical energy is delivered from the power plant to the power substations via high-voltage transmission lines. Then, substations lower the transmission voltage and distribute the energy to individual customers such as residential, commercial and industrial loads. During the transmission and distribution stages, numerous smart meters are deployed in the smart grid to ensure secure and stable operation. Besides, the prevalence of these advanced infrastructures also brings about some emerging issues that traditional power systems seldom encounter, e.g., network security and privacy concerns.

Fig. 1.1 Typical architecture of smart grid

Among the various smart grid operation issues, forecast and dispatch are regarded as the most critical segments. On the one hand, smart grid forecast offers valuable information on the uncertain future status, which significantly assists the smart grid in preventing and mitigating potential risks. On the other hand, smart grid dispatch contributes to the optimal operation of the power system, which promotes the efficiency of energy utilization as well as the stability of the whole system. In this way, extensive previous research has been devoted to investigating these two directions and has already achieved quite successful applications. In the rest of this chapter, a comprehensive introduction to smart grid forecast and dispatch is presented as follows.

1.1 Smart Grid Forecast

Forecasting techniques are essential to the operation of the smart grid, as they provide crucial references, such as load and electricity price, for the scheduling and planning of the power system [17–19]. The precision of forecasting highly influences the decision performance of the smart grid [17]. Generally, a forecasting technique can be represented as follows:

$$
Y = F_\theta(X) \tag{1.1}
$$

where $Y$ is the forecast value, which normally stands for the load, electricity price or demand; $F_\theta$ denotes the forecast model with parameter $\theta$; and $X$ is the input. The parameter $\theta$ is usually determined by the experience of the algorithm designer or from historical data. Besides, forecasting often applies autoregression, that is, $Y = X_t$ and $X = X_{[t-1, t-2, \ldots, t-T]}$, where $T$ indicates the order of the regression model [20, 21]. It should be noted that forecasting can be categorized into three types on the basis of the time interval [22].

• The long-term technique focuses on forecasts about 1 to 10 years ahead. The values are mainly used for the long-term planning of the smart grid, including the future direction and the assessment of a smart grid [23, 24].
• The medium-term forecasting technique mainly considers predicted values about 1 month to 1 year ahead. The economic efficiency, security guarantee and maintenance of the power system are the chief topics over this time interval [25].
• Recently, both industry and academia have concentrated on forecasting over a shorter time interval, namely about 1 hour to 1 day ahead, which is described as short-term forecasting. This is because the optimal economic dispatch in the smart grid, the optimum unit commitment and the evaluation of contracts between various companies rely on precise forecast values to achieve efficient performance [24, 26].

Traditionally, the above forecasts in the smart grid can be made through statistical methods such as the Box–Jenkins basic models, Kalman filtering (KF) [27], the gray method (GM) [28] and exponential smoothing (ES) [29]. The Box–Jenkins basic models include autoregressive (AR) [30], moving average (MA) [31], autoregressive moving average (ARMA) [32] and autoregressive integrated moving average (ARIMA) models [33]. In AR, the forecast value is expressed as a linear combination of previous data. The MA method mimics the moving average process; it is a linear regression model that forecasts future values through the white noise of one or more past values. The ARMA model combines both AR and MA, and ARIMA further enhances the ability of the algorithm on non-stationary data [34]. Besides, the KF method is efficient, especially in long-term forecasting, and is capable of dealing with errors from multiple inputs. Therefore, the numerous elements that may influence the forecast performance should be considered, such as weather, time, economy, random disturbances and other customer factors. Moreover, GM is widely applied in scenarios with limited past data, and ES can be derived from the exponentially weighted average of past observations.
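As a brief, hedged illustration of the autoregressive setting just described (and of the AR member of the Box–Jenkins family), the following minimal Python sketch fits an AR model of order $T$ by ordinary least squares on a synthetic hourly load series and issues a one-step-ahead forecast. The synthetic data, the order T = 24 and all function names are assumptions made purely for this example; none of this is taken from the book's own implementations.

```python
import numpy as np

def fit_ar(y, order):
    """Fit an AR(order) model y_t ≈ c + sum_i a_i * y_{t-i} by least squares."""
    rows = [y[t - order:t][::-1] for t in range(order, len(y))]
    X = np.hstack([np.ones((len(rows), 1)), np.array(rows)])  # intercept + lagged values
    coef, *_ = np.linalg.lstsq(X, y[order:], rcond=None)
    return coef

def forecast_next(y, coef, order):
    """One-step-ahead forecast Y = F_theta(X) with X = y_{[t-1, ..., t-T]}."""
    x = np.hstack([1.0, y[-order:][::-1]])
    return float(x @ coef)

# Toy hourly load with daily periodicity plus noise (synthetic, for illustration only).
rng = np.random.default_rng(0)
t = np.arange(24 * 30)
load = 1000 + 200 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 20, t.size)

order = 24                      # assumed regression order T
coef = fit_ar(load, order)
print("next-hour load forecast (MW):", round(forecast_next(load, coef, order), 1))
```

In practice the order $T$, any exogenous inputs (weather, calendar features) and the model family itself would be selected from historical data, as noted above.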

With the development of machine learning techniques, support vector machines (SVMs) [28], artificial neural networks (ANNs) [35], extreme learning machines (ELMs) [36] and wavelet neural networks (WNNs) [37] are emerging for the forecast problem in the smart grid. The SVM model deploys a hyperplane that separates data mapped into a higher-dimensional feature space through a nonlinear mapping function. In this way, the algorithm is capable of modeling the nonlinear relationship between the input and the forecast value. The ANN has gained huge popularity in recent decades because of the development of big data and advanced computation hardware [38]. Basically, it is capable of fitting the nonlinear relationship when conducting forecasting, and the burgeoning recurrent neural network (RNN) and Transformer-family models further endow the forecast with temporal and spatial dependencies, which further improves the forecasting accuracy of the model [29, 39]. The ELM is a special case of a feedforward neural network that contains only a single hidden layer. By analytically solving the corresponding least-squares problem, the weights of the ELM can be simply determined. The WNN takes advantage of the wavelet function and thus can perform feature extraction without much prior information. This means the algorithm is robust for approximating nonlinear functions [40].
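The ELM training step mentioned above (random, fixed hidden weights plus analytic least-squares output weights) can be sketched in a few lines. The synthetic data, the tanh activation and the 50 hidden neurons below are illustrative assumptions, not choices prescribed in this book.

```python
import numpy as np

def train_elm(X, y, n_hidden=50, seed=0):
    """Single-hidden-layer ELM: random input weights, output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, fixed input weights
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # analytic least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: predict the next hourly load from the previous 24 values (synthetic data).
rng = np.random.default_rng(1)
t = np.arange(24 * 60)
load = 1000 + 200 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 20, t.size)
X = np.array([load[i:i + 24] for i in range(len(load) - 24)])
y = load[24:]

W, b, beta = train_elm(X[:-100], y[:-100])
err = np.abs(predict_elm(X[-100:], W, b, beta) - y[-100:]).mean()
print("mean absolute error on held-out hours (MW):", round(float(err), 1))
```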

Despite the above significant progress in the field of smart grid forecasting, the related fields are still developing. Nowadays, the main research focuses on the following aspects:

1. Increasing the accuracy of forecasting techniques. Although current methods have achieved adequate performance, forecasting accuracy is still insufficient for numerous reasons. First, a forecasting method may underfit due to an inappropriate model or training procedure. Besides, the performance of these methods decreases when forecasting peaks or when emergencies happen. Overall, forecast values can be fully trusted only if their accuracy rises to a higher level.
2. Tackling the distributed training and application of forecasting methods. With the development of renewable energy techniques and the electricity market, more and more distributed microgrids are being developed. Since microgrids have become a main subject of the smart grid, their distributed characteristic requires distributed forecasting techniques. Different from traditional grids with central operators, distributed microgrids have their own management centers for power dispatch and energy transactions. Accordingly, forecasting techniques, especially ML methods, should develop new approaches for this change.
3. Raising the explainability of ML techniques. The black-box nature of current ML techniques limits their wide application in industry. On the one hand, the experience of human experts cannot accelerate the training of ML methods. On the other hand, the knowledge of ML methods trained on numerous data is difficult for humans to interpret, which reduces the credibility of the methods. Therefore, exploring the explainability of ML is necessary, which could lead to more transparent ML methods and thus inspire wider applications.

1.2 Smart Grid Dispatch

1.2.1 Problem Statement

Power dispatch is a pivotal problem that must be addressed in achieving the smart grid promise [41]. In order to decide the optimal strategy for power generation, transmission and even consumption, smart grid dispatch connects different components within the whole power system. To be specific, power dispatch aims at reasonably assigning generation schemes to each generator and determining the operation states of transformers and other power equipment, so as to optimize certain performance indicators while satisfying the constraint conditions at the same time. Generally, smart grid dispatch can be converted into an optimization problem, which is expressed as follows:

$$
\min_{x} \; f(x) \quad \text{s.t.} \quad g(x) = 0, \; h(x) \le 0 \tag{1.2}
$$

where x denotes the decision variables of smart grid (e.g., outputs of generators,) and f (x) represents the objective function (e.g., the fuel cost, the voltage deviation, transmission loss and etc.). Besides, g(x) and h(x) are equality constraints (e.g., power balance constraints) and inequality constraints (e.g., output limits of equipment), respectively. In fact, smart grid dispatch problems will have different formulations under different requirements or assumptions. Therefore, some popular formulations of smart grid dispatch are summarized in this section. (a) Economic Dispatch (ED) ED is one of the fundamental problems in the smart grid, which allocates generation among different generation units to achieve the minimum operation cost without considering the transmission network constraints [42]. In general, ED is the simplest formulation of smart grid dispatch that is usually utilized for real-time operation as follows. min c (PG )   (1.3) G PG − D PD = 0 s.t. PGmin ≤ PG ≤ PGmax where G represents the set of power generators and D is the set of load demands. PG and PD denote the outputs of generator and load demands, respectively. Hence, c(PG ) is the total cost function, which could be calculated by linear function (i.e., c(PG ) =  (α + β PG )) or nonlinear quadratic function c(PG ) = (α + β PG + γ PG2 ). In addition, the power balance constraint is presented by the first constraint, without considering the power flow through transmission lines and the second constraint depicts the limits of generator outputs. (b) Optimal Power Flow (OPF) Despite ED achieves quite successful applications in power system, it only finds the optimal dispatch for generators, which are constrained within their output limits and results in a balance between total generation and load demands. However, the ED calculation ignores the effect that the dispatch of generation has on the loading of transmission lines or the effect it has on bus voltages. In fact, the dispatch solution of generators does have a significant affects on power flows, which should be taken into account under some circumstances. To this end, the optimal power flow is proposed as an extension of classic ED model, which couples the power flow calculation with


The original formulation of OPF is expressed as follows:

$$
\begin{aligned}
\min\ & \sum_{Gi \in G} c(P_{Gi}) \\
\text{s.t.}\ & P_{Gi} - P_{Di} - \sum_{j \in i} V_i V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right) = 0, \\
& Q_{Gi} - Q_{Di} - \sum_{j \in i} V_i V_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right) = 0, \\
& P_{Gi}^{\min} \le P_{Gi} \le P_{Gi}^{\max}, \\
& Q_{Gi}^{\min} \le Q_{Gi} \le Q_{Gi}^{\max}, \\
& V_i^{\min} \le V_i \le V_i^{\max}, \\
& |S_{ij}| \le S_{ij}^{\max},
\end{aligned}
\tag{1.4}
$$

where $P_{Gi}$ and $Q_{Gi}$ are the active and reactive power of generator i. $P_{Di}$ and $Q_{Di}$ denote the active and reactive power demand of bus i. $V_i$ represents the voltage magnitude of bus i, and $\theta_{ij}$ denotes the voltage phase difference between buses i and j. $G_{ij}$ and $B_{ij}$ are the real and imaginary parts of the mutual admittance, respectively. Besides, $S_{ij}$ is the power flow and $S_{ij}^{\max}$ represents the transmission capacity of the branch connecting buses i and j. The power flow equations are given by the first two constraints, while the remaining four constraints depict the limits on generators, buses and branches, respectively.

(c) Energy Management

With the increasing penetration of highly fluctuating renewable energy, the power system is confronted with severe challenges, mainly due to the imbalance between power supply and demand. A shortage or excess in the consumption or generation of power may perturb the smart grid and create serious problems such as voltage deviations and even blackouts under severe conditions. Therefore, energy management is applied to maintain the balance between supply and demand in an efficient way and to reduce the peak load during unexpected periods. Generally, energy management can be divided into two main categories. The first category takes the perspective of the electricity supply and uses energy management to define an adequate and efficient schedule of the generation units, which is also known as unit commitment. A classical formulation of this type of energy management is presented as follows:

$$
\begin{aligned}
\min\ & \sum_{t} \left[ c_1\left(P_{G,t}\right) + c_2\left(u_{G,t}\right) + c_3\left(su_{G,t}\right) + c_4\left(sd_{G,t}\right) \right] \\
\text{s.t.}\ & \sum_{G} P_{G,t} - \sum_{D} P_{D,t} = 0, \\
& u_{G,t} P_G^{\min} \le P_{G,t} \le u_{G,t} P_G^{\max}, \\
& u_{G,t} = u_{G,t-1} + su_{G,t} - sd_{G,t}, \\
& \sum_{t} su_{G,t} \ge SU_G, \\
& \sum_{t} sd_{G,t} \ge SD_G,
\end{aligned}
\tag{1.5}
$$


where $c_1(\cdot)$, $c_2(\cdot)$, $c_3(\cdot)$ and $c_4(\cdot)$ represent the fixed cost, variable cost, startup cost and shutdown cost of the generation units, respectively. $u_{G,t}$, $su_{G,t}$ and $sd_{G,t}$ denote the unit commitment, startup and shutdown decisions. As mentioned before, the power balance is given by the first constraint, while the second constraint presents the limits on generator outputs. Besides, the status of the generation units is described by the third constraint, while the last two address the minimum startup and shutdown time constraints. The second category of energy management is on the consumer side, where consumers manage their energy consumption in order to match the available power from the generation side; this is also called demand response. More specifically, consumer-side energy management gives users the opportunity to play an important role in the operation of the smart grid by shifting or reducing their energy usage during peak periods in response to time-based rates or other forms of financial incentives.

(d) Network Reconfiguration

Network reconfiguration can be defined as altering the network topology by changing the open/close status of tie switches while satisfying operation constraints [44]. This process can improve the performance of the smart grid according to different objectives and constraints. The original network reconfiguration problem is formulated as follows:

$$
\begin{aligned}
\min\ & P_{\text{loss}} + VD \\
\text{s.t.}\ & P_{Gi} - P_{Di} - \sum_{j \in i} V_i V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right) = 0, \\
& Q_{Gi} - Q_{Di} - \sum_{j \in i} V_i V_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right) = 0, \\
& V_i^{\min} \le V_i \le V_i^{\max}, \\
& 0 \le I_i \le I_i^{\max}, \\
& \text{radial topological constraints},
\end{aligned}
\tag{1.6}
$$

where the fourth constraint addresses the limits on line currents, and the last constraint requires that the radial network structure be maintained and all loads be served after reconfiguration.
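To make the preceding formulations concrete, the following minimal sketch solves a small economic dispatch instance of the form (1.3) with quadratic generator costs. It is only an illustration: the three-unit cost coefficients, limits and load level are hypothetical, and SciPy's general-purpose SLSQP solver is used in place of a dedicated dispatch tool.

```python
import numpy as np
from scipy.optimize import minimize

# Quadratic generator cost coefficients (hypothetical three-unit system)
alpha = np.array([100.0, 120.0, 90.0])   # fixed cost
beta  = np.array([20.0, 18.0, 22.0])     # linear cost ($/MWh)
gamma = np.array([0.04, 0.06, 0.05])     # quadratic cost
p_min = np.array([10.0, 10.0, 10.0])     # MW
p_max = np.array([150.0, 120.0, 100.0])  # MW
demand = 250.0                           # MW, total load to be served

def total_cost(p):
    # c(P_G) = sum(alpha + beta*P_G + gamma*P_G^2), cf. (1.3)
    return np.sum(alpha + beta * p + gamma * p ** 2)

constraints = [{"type": "eq", "fun": lambda p: np.sum(p) - demand}]  # power balance
bounds = list(zip(p_min, p_max))                                     # output limits

result = minimize(total_cost, x0=np.full(3, demand / 3),
                  bounds=bounds, constraints=constraints)
print("dispatch (MW):", np.round(result.x, 1), "cost:", round(result.fun, 1))
```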

1.2.2 Problem Properties

The objective functions and operation constraints of smart grid dispatch give this problem the following characteristics [45]:

(a) Multi-objective

It should be noted that the optimization objectives of smart grid dispatch are diverse and depend on the perspective taken. For instance, the owner of a renewable power plant prefers to promote the utilization of renewable energy in order to gain more revenue. However, the large-scale integration of renewable energy may threaten the secure operation of the power system, which conflicts with the optimization objective of the smart grid. Therefore, smart grid dispatch cannot consider only one optimization objective. In a real smart grid, the dispatcher usually needs to consider several objectives, such as the generation cost, the voltage deviation and the power loss, which makes this a multi-objective optimal dispatch problem of the power system.

(b) Multi-constraint

Due to the particularities of the power system, the smart grid dispatch problem is a multi-constraint one. First, power generation and load demand must be balanced in real time since electricity cannot be stored on a large scale. Second, considering the effect of generation dispatch on transmission lines, the classic energy conservation condition must be extended to power flow constraints, which determine the power distribution of the smart grid. In addition, the outputs of electrical equipment must also be constrained as a result of their physical limitations. Finally, the security constraints of the smart grid need to be satisfied, including limits on the apparent power on transmission lines, the voltage amplitude of power buses, etc. Consequently, the smart grid dispatch problem is formulated as an optimization model with multiple constraints.

(c) Multi-variable

To achieve the economic and secure operation of the smart grid, numerous decision variables must be dispatched, including the power outputs of generators, the terminal voltage amplitudes of generators, the tap positions of transformers and the controllable status of electrical equipment. Therefore, the real-world smart grid dispatch problem involves high-dimensional decision variables owing to the extensive scale of the power system. For example, the dimension of the decision variables in the IEEE 118-bus power system is up to 238, while the total number of decision variables in the China Southern Power Grid is more than ten thousand. Worse still, some variables of smart grid dispatch are continuous while others are discrete, which makes the problem hard to solve.

(d) Strong uncertainty

With large-scale renewable energy integrated into the power system, its strong uncertainty brings serious challenges to the dispatch of the smart grid. First of all, the outputs of renewable energy sources such as solar power and hydroelectricity are intermittent, which makes peak load regulation difficult. Secondly, the randomness of renewable energy may threaten the secure and stable operation of the smart grid, e.g., through voltage deviations, power losses or even congestion. Finally, the generation of renewable energy is, to some extent, uncontrollable, which aggravates the dispatch burden of the smart grid. In addition, the consumption behavior of load users is also random, which leads to uncertainty on the demand side.

(e) Computational complexity

Taking the aforementioned four aspects into account, we can conclude that smart grid dispatch is a complicated optimization problem that is multi-objective, multi-constrained, multi-variable and subject to strong uncertainty.


This is why the smart grid dispatch problem has attracted much attention in recent years. To handle this complex problem and meet the requirements of practical applications, several methods have been proposed; they will be introduced in detail in Chap. 3.
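To illustrate the mixed continuous/discrete nature noted in (c) and (e), the sketch below encodes a toy unit-commitment instance of the form (1.5) as a mixed-integer linear program. It is only a sketch under assumptions: the two-unit data are hypothetical, minimum up/down-time constraints are omitted, and the open-source PuLP modeler with its bundled CBC solver is assumed to be available.

```python
import pulp

T = range(4)                      # four dispatch periods (toy horizon)
demand = [90, 130, 160, 110]      # MW, hypothetical load
units = {
    "G1": {"pmin": 20, "pmax": 100, "c_var": 20, "c_fix": 50, "c_su": 80},
    "G2": {"pmin": 10, "pmax": 80,  "c_var": 35, "c_fix": 30, "c_su": 40},
}

prob = pulp.LpProblem("toy_unit_commitment", pulp.LpMinimize)
p  = {(g, t): pulp.LpVariable(f"p_{g}_{t}", lowBound=0) for g in units for t in T}
u  = {(g, t): pulp.LpVariable(f"u_{g}_{t}", cat="Binary") for g in units for t in T}
su = {(g, t): pulp.LpVariable(f"su_{g}_{t}", cat="Binary") for g in units for t in T}

# Objective: variable + fixed (commitment) + startup costs, cf. (1.5)
prob += pulp.lpSum(units[g]["c_var"] * p[g, t] + units[g]["c_fix"] * u[g, t]
                   + units[g]["c_su"] * su[g, t] for g in units for t in T)

for t in T:
    # Power balance in every period
    prob += pulp.lpSum(p[g, t] for g in units) == demand[t]
for g in units:
    for t in T:
        # Output limits are active only when the unit is committed
        prob += p[g, t] >= units[g]["pmin"] * u[g, t]
        prob += p[g, t] <= units[g]["pmax"] * u[g, t]
        # Startup logic: su >= u_t - u_{t-1} (units assumed off before t = 0)
        prev = u[g, t - 1] if t > 0 else 0
        prob += su[g, t] >= u[g, t] - prev

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for t in T:
    print(t, {g: round(p[g, t].value(), 1) for g in units})
```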

References 1. F. Alex Navas, J.S. Gómez, J. Llanos, E. Rute, D. Sáez, M. Sumner, Distributed predictive control strategy for frequency restoration of microgrids considering optimal dispatch. IEEE Trans. Smart Grid 12(4):2748–2759 (2021) 2. Z. Chen, J. Zhu, H. Dong, W. Wanli, H. Zhu, Optimal dispatch of WT/PV/ES combined generation system based on cyber-physical-social integration. IEEE Trans. Smart Grid 13(1), 342–354 (2022) 3. T. Wu, C. Zhao, Y.J.A. Zhang, Distributed AC-DC optimal power dispatch of VSC-based energy routers in smart microgrids. IEEE Trans. Power Syst. 36(5), 4457–4470 (2021) 4. Z. Zhang, C. Wang, H. Lv, F. Liu, H. Sheng, M. Yang, Day-ahead optimal dispatch for integrated energy system considering power-to-gas and dynamic pipeline networks. IEEE Trans. Indus. Appl. 57(4), 3317–3328 (2021) 5. M.R. Islam, H. Lu, M.R. Islam, M.J. Hossain, L. Li, An IoT- based decision support tool for improving the performance of smart grids connected with distributed energy sources and electric vehicles. IEEE Trans. Indus. Appl. 56(4), 4552–4562 (2020) 6. X. Sun, J. Qiu, Hierarchical voltage control strategy in distribution networks considering customized charging navigation of electric vehicles. IEEE Trans. Smart Grid 12(6), 4752–4764 (2021) 7. L. Xi, L. Zhang, X. Yanchun, S. Wang, C. Yang, Automatic generation control based on multiple-step greedy attribute and multiple-level allocation strategy. CSEE J. Power Energy Syst. 8(1), 281–292 (2022) 8. K.S. Xiahou, Y. Liu, Q.H. Wu, Robust load frequency control of power systems against random time-delay attacks. IEEE Trans. Smart Grid 12(1), 909–911 (2021) 9. K.D. Lu, G.Q. Zeng, X. Luo, J. Weng, Y. Zhang, M. Li, An adaptive resilient load frequency controller for smart grids with DoS attacks. IEEE Trans. Vehicular Technol. 69(5), 4689–4699 (2020) 10. B. Hu, Y. Gong, C.Y. Chung, B.F. Noble, G. Poelzer, Price-maker bidding and offering strategies for networked microgrids in day-ahead electricity markets. IEEE Trans. Smart Grid 12(6), 5201–5211 (2021) 11. H. Haghighat, H. Karimianfard, B. Zeng, Integrating energy management of autonomous smart grids in electricity market operation. IEEE Trans. Smart Grid 11(5), 4044–4055 (2020) 12. A. Paudel, L.P.M.I. Sampath, J. Yang, H.B. Gooi, Peer-to-peer energy trading in smart grid considering power losses and network fees. IEEE Trans. Smart Grid, 11(6), 4727–4737 (2020) 13. P. Zhuang, T. Zamir, Hao Liang, Blockchain for cybersecurity in smart grid: a comprehensive survey. IEEE Trans. Indus. Inf. 17(1), 3–19 (2021) 14. Y. Ding, B. Wang, Y. Wang, K. Zhang, H. Wang, Secure metering data aggregation with batch verification in industrial smart grid. IEEE Trans. Indus. Inf. 16(10), 6607–6616 (2020) 15. K. Kaur, G. Kaddoum, S. Zeadally, Blockchain-based cyber-physical security for electrical vehicle aided smart grid ecosystem. IEEE Trans. Intell. Transp. Syst. 22(8), 5178–5189 (2021) 16. M.B. Gough, S.F. Santos, T. AlSkaif, M.S. Javadi, R. Castro, J.P.S. Catalão, Preserving privacy of smart meter data in a smart grid environment. IEEE Trans. Indus. Inf. 18(1), 707–718 (2022) 17. T. Hong, S. Fan, Probabilistic electric load forecasting: a tutorial review. Int. J. Forecasting 32(3), 914–938 (2016)


18. H.J. Feng, L.C. Xi, Y.Z. Jun, Y.X. Ling, H. Jun, Review of electric vehicle charging demand forecasting based on multi-source data. In 2020 IEEE Sustainable Power and Energy Conference, pp. 139–146(2020) 19. L. Liu, F. Kong, Y.X. Liu, Peng, Q. Wang, A review on electric vehicles interacting with renewable energy in smart grid. Renew. Sustain. Energy Rev., 51, 648–661 (2015) 20. G.F. Savari, V. Krishnasamy, J. Sathik, Z.M. Ali, S.H.E. Abdel Aleem, Internet of things based real-time electric vehicle load forecasting and charging station recommendation. ISA Trans., 97, 431–447 (2020) 21. W. Kong, Z.Y. Dong, D.J. Hill, F. Luo, Y. Xu, Short-term residential load forecasting based on resident behaviour learning. IEEE Trans. Power Syst., 33(1), 1087–1088 (20170 22. T. Hong et al., Energy forecasting: past, present, and future. Foresight: Int. J. Appl. Forecasting, 32, 43–48 (2014) 23. D. Solyali, A comparative analysis of machine learning approaches for short-/long-term electricity load forecasting in Cyprus. Sustainability, 12(9), 3612 (2020). Publisher: MDPI 24. W. Song, S. Fujimura, Capturing combination patterns of long- and short-term dependencies in multivariate time series forecasting. Neurocomputing 464, 72–82 (2021) 25. B. Hayanga, M. Stafford, C.L. Saunders, L. Bécares, Ethnic inequalities in age-related patterns of multiple long-term conditions in England: analysis of primary care and nationally representative survey data. medRxiv (2022). Publisher: Cold Spring Harbor Laboratory Press 26. L. Cheng, H. Zang, X. Yan, Z. Wei, G. Sun, Probabilistic residential load forecasting based on micrometeorological data and customer consumption pattern. IEEE Trans. Power Syst. 36(4), 3762–3775 (2021) 27. Z. Zheng, H. Chen, X. Luo, A kalman filter-based bottom-up approach for household short-term load forecast. Appl. Energy 250, 882–894 (2019) 28. G. Fan, M. Yu, S. Dong, Y. Yeh, W. Hong, Forecasting short-term electricity load using hybrid support vector regression with grey catastrophe and random forest modeling. Utilities Policy 73, 101294 (2021) 29. S. Smyl, G. Dudek, P. Pelka, ES-dRNN: A Hybrid Exponential Smoothing and Dilated Recurrent Neural Network Model for Short-Term Load Forecasting. arXiv preprint arXiv:2112.02663 (2021) 30. T. Ahmad, H. Chen, Nonlinear autoregressive and random forest approaches to forecasting electricity load for utility energy management systems. Sustain. Cities Soc., 45, 460–473 (2019). Publisher: Elsevier 31. M. Alamaniotis, D. Bargiotas, N.G. Bourbakis, L.H. Tsoukalas, Genetic optimal regression of relevance vector machines for electricity pricing signal forecasting in smart grids. IEEE Trans. Smart Grid 6(6), 2997–3005 (2015) 32. T. Jonsson, P. Pinson, H.A. Nielsen, H. Madsen, T.S. Nielsen, Forecasting electricity spot prices accounting for wind power predictions. IEEE Trans. Sustain. Energy 4(1), 210–218 (2013) 33. R. Wang, L. Yao, Y. Li, A hybrid forecasting method for day-ahead electricity price based on GM(1,1) and ARMA. In: 2009 IEEE International Conference on Grey Systems and Intelligent Services (GSIS 2009), pp. 577–581 (2009) 34. J.P. González, A.M.S. Muñoz San Roque, E.A. Pérez, Forecasting functional time series with a New Hilbertian ARMAX model: application to electricity price forecasting. IEEE Trans. Power Syst., 33(1), 545–556 (2018) 35. T. Lin, B. G. Horne, P. Tino and C. L. Giles. Learning long-term dependencies in NARX recurrent neural networks. IEEE Transactions on Neural Networks, 7(6), 1996 36. G. Huang, Q. Zhu, C. 
Siew, Extreme learning machine: theory and applications. Neurocomputing, 70(1) (2006) 37. A.J. Conejo, M.A. Plazas, R. Espinola, A.B. Molina, Day-ahead electricity price forecasting using the wavelet transform and ARIMA models. IEEE Trans. Power Syst. 20(2), 1035–1042 (2005) 38. S.G. Patil, M.S. Ali, Review on analysis of power supply and demand in Maharashtra state for load forecasting using ANN. 2022


39. S. Li, X. Jin, Y. Xuan, X. Zhou, W. Chen, Y.X. Wang, X. Yan, Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. Adv. Neural Inf. Process. Syst., 32 (2019) 40. N.M. Pindoriya, S.N. Singh, S.K. Singh, An adaptive wavelet neural network-based energy price forecasting in electricity markets. IEEE Trans. Power Syst. 23(3), 1423–1432 (2008) 41. Y. Li, P. Wang, H.B. Gooi, J. Ye, L. Wu, Multi-objective optimal dispatch of microgrid under uncertainties via interval optimization. IEEE Trans. Smart Grid. 10(2), 2046–2058 (2017) 42. B.H. Chowdhury, S. Rahman, A review of recent advances in economic dispatch. IEEE Trans. Power Syst. 5(4), 1248–1259 (1990) 43. F. Zohrizadeh, C. Josz, M. Jin, R. Madani, J. Lavaei, S. Sojoudi, A survey on conic relaxations of optimal power flow problem. Euro. J. Oper. Res. 287(2), 391–409 (2020) 44. Y. Liu, J. Li, W. Lei, Coordinated optimal network reconfiguration and voltage regulator/der control for unbalanced distribution systems. IEEE Trans. Smart Grid 10(3), 2912–2922 (2019) 45. Y. Liu, Y. Li, H.B. Gooi, Y. Jian, H. Xin, X. Jiang, J. Pan, Distributed robust energy management of a multimicrogrid system in the real-time energy market. IEEE Trans. Sustain. Energy 10(1), 396–406 (2019)

Chapter 2

Review for Smart Grid Forecast

2.1 Introduction

Accurate, effective and reliable forecasting techniques are essential for the development of the smart grid. They underpin the intelligent dispatch of electrical equipment and the construction of advanced electricity markets and the smart city, for which the load demand, the electricity price and emerging infrastructure such as electric vehicles are indispensable. This has attracted numerous researchers to advance the forecasting techniques related to the smart grid. Over the past ten years, the grid has become more and more complex. Numerous types of equipment for power generation and load demand have been integrated, including renewable energy generators, energy storage systems and electric vehicle charging stations (EVCS). The emerging loads cause high uncertainties on both the generation and demand sides, which can bring about a high peak-to-average ratio, high operation costs and dangerous situations for the smart grid. In particular, load demands are difficult to forecast due to their complex composition. In addition, with intermittent renewable energy generation, the concept of netload has drawn increasing attention. Defined as the difference between the load and the outputs of renewable generators such as wind turbines and solar panels, the netload is naturally influenced by the uncertainties of renewable energy. Therefore, netload forecasting is especially valuable for smart grids incorporating renewable generators. Moreover, the construction of the power market creates the need for electricity price forecasting, both for the decision-making of market participants and for the risk management of the market operator. Bidding strategies, power transmission and distribution rely on price forecasting to ensure the economic and secure operation of the power market. [1] analyzed the impact of price forecasting errors on the economics of the electricity market; the numerical results reveal that accurate price forecasting increases the overall economic performance, whereas, with inaccurate forecasts, the planner may carry out inappropriate purchase plans and cause losses for market participants.


Fig. 2.1 The framework of this chapter

Moreover, the emerging load types of the future smart grid have gradually drawn more attention in recent years. For instance, increasing electric vehicle capacity is being deployed to reduce carbon emissions from the transportation sector. Expanding EV adoption has also brought challenges to the operation of EVCSs, which are increasingly being constructed and connected to the modern smart grid. It should be noted that the charging power of an EVCS is uncertain because of its high correlation with user behavior, which is intrinsically random. Since EVCSs are among the most important loads of the smart grid, fluctuations of the EVCS charging power can threaten the safety of the grid. Therefore, it is essential to forecast the EVCS charging power. In this chapter, the forecasting techniques for (1) the electrical load and netload, (2) the electricity price, and (3) the charging power of electric vehicle charging stations are reviewed and categorized in Sects. 2.2, 2.3 and 2.4, respectively. The framework of this chapter is illustrated in Fig. 2.1.

2.2 The Load and Netload Forecasting

Due to the uncertain nature of the load and netload, feature engineering for their forecasting is always a challenging task [2], and selecting an appropriate feature combination is time-consuming. For instance, the residential load is influenced by temperature, relative humidity and other meteorological data [3], calendar variables [4], economic indicators [5] and even holidays [6]. Researchers spend much of their time studying the impacts of these features on the load demand. Moreover, the uncertainties of renewable energy also enter netload forecasting, which demands even richer features and is thus more difficult. Khuntia et al. reviewed a large body of literature and showed that forecast quality depends strongly on historical, seasonal and economic data [7].


Besides, forecasting methods are essential for power system operation tasks such as maintenance scheduling for industrial consumers. According to how the features are handled, numerous load and netload forecasting models have been proposed, and they can be categorized into three main types: representative patterns, statistical models and machine learning models.

2.2.1 The Representative Patterns of Load Forecasting

Considering the daily cyclic characteristics of the load demand, numerous studies apply representative patterns to the load data to increase forecasting accuracy. For instance, in [8], a hierarchical algorithm based on unsupervised learning was adopted to cluster the patterns present in the load data. [9] applied a two-stage clustering algorithm, i.e., K-means followed by a hierarchical method, for the segmentation of household electricity consumption. In addition, the robustness of K-means was exploited in [10], where an adaptive clustering mechanism was used for residential load profiles. Hsiao proposed an agglomerative hierarchical cluster tree method combined with Ward's linkage distance measure in [5], which proved efficient for load demand clustering. [11] characterized energy usage behavior with theoretical analysis and practice theory. Wang et al. focused on handling the massive amount of smart meter data in [12] and developed an adaptive K-means clustering framework for big data applications; several techniques such as a novel clustering metric and dynamic time warping were incorporated, which decreased the number of representative groups significantly. Considering demand response applications, a data-driven segmentation approach using individual loads was presented in [13]; experiments showed that forecasting with consumption patterns has potential for achieving a wide range of demand reductions. Dalal et al. proposed a methodical approach to representative patterns under different scenarios of the power system, such as seasons and atypical days [14]; a cluster validation metric was applied, and the results demonstrated its potential. Representative scenarios were also investigated by de Paula et al., who presented an unsupervised clustering algorithm, a modified iterative self-organizing data analysis technique, for short-term nonchronological power system load scenarios [15]. Capozzoli et al. characterized building energy consumption patterns with an enhanced symbolic aggregate approximation process [16]; benefiting from unequal time windows and classification, anomalies could be detected within the daily time window with the support of graphical visualization. Besides, [17] introduced ensemble learning with a representative subset for load forecasting, in which the weights of the representative methods change dynamically according to a distance measure; the comparisons demonstrated the superior performance of the proposed algorithm.
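As a minimal illustration of the representative-pattern idea, the sketch below clusters synthetic daily load profiles with K-means and uses the cluster centroids as representative patterns. The data are randomly generated stand-ins for smart-meter measurements and scikit-learn is assumed to be available; this is not a reproduction of any specific method reviewed above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hours = np.arange(24)

# Synthetic daily load profiles: a morning-peak and an evening-peak pattern,
# each with random noise (stand-ins for real smart-meter data).
morning = 1.0 + 0.8 * np.exp(-((hours - 8) ** 2) / 8.0)
evening = 1.0 + 0.9 * np.exp(-((hours - 19) ** 2) / 8.0)
profiles = np.vstack([
    morning + 0.05 * rng.standard_normal((100, 24)),
    evening + 0.05 * rng.standard_normal((100, 24)),
])

# Cluster the daily profiles; the cluster centroids serve as representative patterns.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
representative_patterns = kmeans.cluster_centers_
print("cluster sizes:", np.bincount(kmeans.labels_))
print("peak hour of each representative pattern:",
      representative_patterns.argmax(axis=1))
```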


2.2.2 The Statistical Model of Load/Net Load Forecasting

Statistical models such as the least absolute shrinkage and selection operator (LASSO) [6], Markov models [11, 18] and Gaussian process-based models [19, 20] are widely used. Besides, linear regression, autoregressive integrated moving averages (ARIMA) and Kalman filtering (KF) are popular in this field [21]. For instance, [22] forecasted the peak load through an ARIMA model that applies an expert system together with the time-series information; experiments revealed the accuracy of the algorithm, especially on the peak load. In [6], LASSO is applied to capture the sparsity in the historical load data and to leverage the relationships between users, and the experiments demonstrated its superior performance compared with well-known support vector and linear regression methods. Sultana et al. presented a statistical model for short-term load forecasting with a Bayesian optimization algorithm [23]; the study hybridized the seasonal ARIMA with a nonlinear autoregressive model and achieved stable performance in the experiments. [24] provided novel angles on the factors of data identification and model parameters, using correlation analysis and hypothesis tests for the forecasting together with the wavelet transform. [25] combined ARIMA and exponential smoothing forecasting on real-world data covering 11 regions, and case studies demonstrate the efficiency of the combined method. Similarly, [26] applied the Box–Jenkins seasonal ARIMA through the R-Studio package software and achieved sufficient performance on the actual load data of a hospital facility. Taking into consideration the daily operation of the smart grid, [27] developed a prediction interval-based forecasting method combined with a statistical mean-variance model; quantile regression and a risk assessment index were combined to obtain the risk of the load demand profile, and experiments showed the accuracy of the method. By clustering the features according to their importance, [28] presented a day-ahead load forecasting method based on an aggregated algorithm; statistical experiments revealed that clustering enhances the performance of the algorithm on data from a residential distribution network. Pirbazari et al. constructed an ensemble-based algorithm on top of four different feature selection methods, i.e., F-regression, mutual information, recursive feature elimination and elastic net [29]; the case studies considered load profiles with different consumption behaviors, and the proposed method achieved considerable improvements in training speed as well as in performance. To defend against cyberattacks, Zhao et al. proposed an adaptive regression for load forecasting with a mixed pattern that generalizes well under cyberattack [30]; the experiments illustrated that the mixed method remains robust under both positive and negative cyberattacks. Similarly, [31] also addressed extreme outliers in the load data, and Huber's robust statistical method showed superior performance in large-scale simulation experiments. Hadri et al. reviewed statistical methods for load forecasting such as ARIMA, SARIMA, XGBoost and random forest (RF) [32]; relying on IoT and big data platforms, the accuracies of these predictive methods were verified on a real-world dataset. [33] reviewed MA, ARMA and the Kalman filter (KF) on the Andhra Pradesh State electricity demand data using the mean absolute percentage error, and the results showed that once bad data are filtered out, these statistical methods achieve sufficient performance.
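To give a flavor of the statistical approach, the sketch below fits a seasonal ARIMA model with a 24-hour period to a synthetic hourly load series and issues a day-ahead forecast. The series is artificially generated, the model order is chosen for illustration only, and statsmodels is assumed to be available.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(1)
idx = pd.date_range("2022-01-01", periods=21 * 24, freq="H")

# Synthetic hourly load: a daily cycle plus noise (a stand-in for real data).
daily = 100 + 20 * np.sin(2 * np.pi * idx.hour / 24)
load = pd.Series(daily + 3 * rng.standard_normal(len(idx)), index=idx)

train, test = load[:-24], load[-24:]

# Seasonal ARIMA with a 24-hour seasonal period, as commonly used for load data.
model = SARIMAX(train, order=(1, 0, 1), seasonal_order=(1, 1, 1, 24)).fit(disp=False)
forecast = model.forecast(steps=24)

mape = np.mean(np.abs((test.values - forecast.values) / test.values)) * 100
print(f"day-ahead MAPE on the synthetic series: {mape:.2f}%")
```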

2.2.3 The Machine Learning Model of Load and Netload Forecasting

Machine learning-based algorithms such as artificial neural networks [10, 34] have received much attention in recent years. Owing to the development of cloud services, massive amounts of fine-grained electricity consumption data have become accessible [35]. [36] conducted numerous experiments on both long-term and short-term load forecasting and found that the support vector machine (SVM) and the ANN are more competitive, providing more reliable and precise outcomes with smaller errors. [37] designed two types of error functions oriented toward power costs and applied SVM for the fitting; real-world data from New South Wales in Australia were used to show the superiority of the proposed method. Because of their strong performance, data-driven machine learning algorithms are becoming increasingly essential and widespread. For instance, the long short-term memory (LSTM) algorithm proposed in [38] is capable of representing the time-varying behavior present in load data. Similarly, [39] applied the LSTM model to attain deeper and wider insight into the big load data of the smart grid, and the experiments demonstrated that machine learning techniques can exploit the potential of big data. [40] used an improved sparrow search algorithm to enhance the accuracy of the LSTM for safer and more economical power system operation, and the results showed decreases in the error indexes after applying the algorithm. In addition, recurrent neural network (RNN) models have revealed that deep learning models with a recurrent form are more efficient at establishing complex temporal correlations. Smyl et al. established an RNN model with hybrid exponential smoothing and a dilated mechanism to estimate prediction intervals for short-term load forecasting [41]; compared with statistical methods, the proposed RNN model is more accurate. Focusing on a particular 33/11 kV substation, Veeramsetty et al. presented an artificial neural network-based estimation method for the electric load, and the proposed architecture proved more accurate than a multi-layer perceptron [42]. Besides, the RNN variants proposed in Refs. [43–45] also show advantages in improving load forecasting accuracy. Moreover, one of the most popular algorithms in the field of computer vision, the convolutional neural network (CNN), has also been introduced to load forecasting [2, 46]. Likewise, the graph neural network was successfully used by Lin et al. in Ref. [47] to capture non-Euclidean pair-wise correlations and spatial information. The temporal relationships in load data are usually considered during algorithm design. For instance, [48] presented a temporal convolutional network together with a light gradient boosting machine, and its effectiveness was demonstrated through extensive experiments on datasets from China, Australia and Ireland.


A local linear neurofuzzy model was presented in [49], in which binary tree learning and sigmoid validity functions are applied for the parameter updates of the model. Hoori et al. applied a multi-column radial basis function neural network to short-term load forecasting [50]; the study used a k-d tree to split the dataset and lower the sensitivity to weather and seasonal effects, thereby increasing the convergence speed and the generalization of the method. To save energy in heating, ventilation and air conditioning systems, [51] used an ensemble learning algorithm for step-ahead building cooling load forecasting, revealing that accurate forecasts benefit the handling of unexpected schedules. Oreshkin et al. addressed mid-term electricity load forecasting with a deep stack of fully connected layers connected by forward and backward residual links, which demonstrated state-of-the-art performance on the related problem [52]. Additionally, for real-time cooling load forecasting, a meta-learning strategy was introduced into a machine recommendation system, showing high generalizability in the forecasting task [53]. Based on dynamic programming and machine learning techniques, [54] realized anomaly detection for load forecasting to detect cyberattacks that could lead to unsuitable operational decisions; the robustness of the proposed machine learning-based anomaly detection (MLAD) was verified through numerous cyberattack scenarios. In [55], an adaptive structure was applied to the neural network to ensure convergence and stability during training, and the study also provided an upper bound on the learning factor based on convergence theory. Mohammad et al. compared deep feed-forward and deep recurrent neural networks under different model architectures and multiple datasets [56]; the simulations also tested numerous activation functions to find the setting that attains the best generalization. Xu et al. focused on probabilistic peak load forecasting, using an artificial neural network to quantify the probabilistic occurrence as well as the magnitude of the peak load [57]. In [58], a deep CNN was proposed for weekly forecasts on real-world data from Australia; the deep CNN is better able to handle the highly volatile, non-stationary and nonlinear behavior of electricity load data. An accurate deep neural network algorithm for short-term load forecasting was presented in [59], and comparisons with five other commonly used algorithms demonstrate its high performance. Similarly, using the gated recurrent unit (GRU) network, [60] aimed to tackle the complexity and nonlinearity of the load data and conducted short-term forecasting on real-world load data; the comparisons showed that the GRU forecasts better than the LSTM model. Gao et al. proposed a novel ensemble deep random vector functional link network for load forecasting together with an empirical wavelet transformation method [61]; data from the Australian energy market over the whole year of 2020 were used for simulation and demonstrated the superior performance of the model over eleven comparison methods. The reinforcement learning algorithm Q-Learning was introduced for the deterministic and probabilistic forecasting of the electrical load [62]; through the agents trained by Q-Learning and the second-step model, the numerical studies showed improvements of around 50% and 60% in deterministic and probabilistic forecasting, respectively.


[63] discussed machine learning and deep learning algorithms for load forecasting in smart grids that integrate distributed renewable energy sources, and concluded that larger historical datasets and more data storage are needed to provide sufficient resources for forecasting.
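As a minimal sketch of the recurrent models discussed above, the following code trains a small LSTM in PyTorch to predict the next hour of a synthetic load series from the previous 24 hours. The data, window length and network size are illustrative assumptions, not settings taken from any of the cited works.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic hourly load with a daily cycle; windows of the past 24 h predict the next hour.
t = np.arange(2000)
load = 100 + 20 * np.sin(2 * np.pi * t / 24) + 3 * np.random.default_rng(0).standard_normal(t.size)
X = np.stack([load[i:i + 24] for i in range(len(load) - 24)])
y = load[24:]
X = torch.tensor(X, dtype=torch.float32).unsqueeze(-1)   # (samples, 24, 1)
y = torch.tensor(y, dtype=torch.float32).unsqueeze(-1)

class LSTMForecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        # Regress the next-hour load from the last hidden state
        return self.head(out[:, -1, :])

model = LSTMForecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(5):            # a few full-batch epochs only, for demonstration
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: MSE = {loss.item():.2f}")
```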

2.3 The Electrical Price Forecasting

There are various studies focusing on electricity price forecasting (EPF), and they can be categorized into two main types. The first type, termed mathematical methods, forecasts through predefined models with fixed parameters and considers the day-ahead price together with other features such as temperature and humidity. The second type, termed learning methods, determines the parameters of the forecasting model through a learning mechanism. The following sections review the literature in these two categories.

2.3.1 The Mathematical Method for Electrical Price Forecasting

One of the most representative benchmark models for EPF is the naive forecast [64], which reuses historical values from days with similar patterns to forecast the future price. However, this kind of method suffers from poor accuracy because of its simple use of input features. Therefore, to achieve more accurate forecasting, more handcrafted features have been adopted in regression models, such as the autoregression (AR), autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) techniques. For instance, Weron et al. applied AR models to Nord Pool data, estimating the model parameters by minimizing the forecasting error [65]. In [66], Liu et al. used ARMA to model the hour-ahead electricity price, with the model selected by optimizing the Akaike information criterion, and the experiments demonstrated the efficiency of the model. It should be noted that both the AR and ARMA models perform best on stationary time series [67], whereas the ARIMA model is not subject to this restriction. For instance, Contreras et al. applied an ARIMA model with a logarithmic transformation of the price data to achieve a more stable forecasting variance, thereby addressing the drawbacks of non-stationary data. Moreover, considering the uncertainties of the load and of the renewable energy sources integrated into the power system, Gaussian process regression (GPR) has drawn more attention [68]; compared with AR, ARMA and ARIMA, GPR achieves more accurate forecasts. However, the above methods are typically linear models that can be fitted by the least-squares approach, and the nonlinear patterns that are widespread in EPF are difficult for them to capture.


Therefore, nonlinear approaches are also widely applied within the mathematical methods. For instance, Kristiansen presented a regression model for spot price forecasting on the Nord Pool market that achieved an R-squared of 0.97, outperforming the Myopic and Futures models [69]. To construct a new hybrid intelligent system, [70] proposed a real-coded genetic algorithm for the day-ahead prediction of the market-clearing price, and the experiments show its superior performance over ARIMA, wavelet-ARIMA, fuzzy neural network and generalized autoregressive conditional heteroskedastic methods. [71] took into consideration the changes in Eurozone electricity markets and proposed a short-term forecasting method to relieve the risks in market investments; numerous aspects are included in the model, such as constants, regressors, moving averages, weekly/seasonal dummies, and autoregressive and heteroscedastic variables, and an analysis based on the single national price dataset was also performed and discussed. Abedinia et al. proposed a novel feature selection criterion to select a minimal subset for the forecasting process [72]; in addition, a real-coded genetic algorithm was combined with a hybrid filter-wrapper method for the price forecast, considering relevancy, redundancy and interaction at the same time. [73] developed a forecasting strategy based on the wavelet transform and ARIMA to handle the nonlinear, non-stationary and time-variant characteristics of the forecast data from the electricity market of mainland Spain. [74] considered a functional response in the regression model for the daily curves of the electricity price, and the comparative studies showed that the functional response enhances the forecasting performance compared with the scalar one. Bessa et al. presented a statistical forecasting method for the prices used in day-ahead bidding [75]; further studies on electric vehicle aggregation verified the potential of the proposed method in the future smart grid paradigm.
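For orientation, the sketch below implements one common variant of the similar-day naive benchmark mentioned above: Mondays, Saturdays and Sundays reuse the hourly price profile of the same weekday one week earlier, while the remaining weekdays reuse the previous day. The price series is synthetic, and the rule is a simplified assumption rather than the exact benchmark of any cited study.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
idx = pd.date_range("2022-01-01", periods=8 * 7 * 24, freq="H")

# Synthetic hourly day-ahead prices with weekday/weekend structure (stand-in data).
base = 40 + 10 * np.sin(2 * np.pi * idx.hour / 24) + 5 * (idx.dayofweek < 5)
price = pd.Series(base + 2 * rng.standard_normal(len(idx)), index=idx)

def naive_day_ahead(series, target_day):
    """Similar-day naive benchmark: Mondays, Saturdays and Sundays reuse the profile
    of the same weekday one week earlier; other weekdays reuse yesterday."""
    weekday = target_day.dayofweek
    lag_days = 7 if weekday in (0, 5, 6) else 1
    ref_day = target_day - pd.Timedelta(days=lag_days)
    return series.loc[ref_day.strftime("%Y-%m-%d")].to_numpy()

target = pd.Timestamp("2022-02-21")       # a Monday inside the synthetic horizon
forecast = naive_day_ahead(price, target)
actual = price.loc[target.strftime("%Y-%m-%d")].to_numpy()
print("MAE of the naive benchmark:", np.round(np.abs(forecast - actual).mean(), 2))
```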

2.3.2 The Learning Method for Electrical Price Forecasting

Compared with the mathematical methods, the learning methods for EPF are not limited to linear models and perform better at recognizing nonlinear relationships, especially the artificial intelligence models [76]. For instance, through an iterative training process, the support vector machine proposed in [77] is employed for point forecasting of electricity prices, with the parameters estimated by the maximum likelihood estimation (MLE) algorithm. For interval forecasting, which provides more useful information than a point value, neural network (NN) models have become more and more popular for capturing the nonlinear relationships in EPF. For instance, an artificial neural network (ANN)-based algorithm was proposed by Mandal et al., with experiments conducted across multiple seasons [78]. [79] reported that learning methods such as ANNs and support vector machines outperform the mathematical ones. Moreover, [80] presented a single-hidden-layer feed-forward NN termed the extreme learning machine (ELM); theoretical and experimental analyses demonstrated its efficiency in training speed. An ELM model was developed by Chen et al. for point and interval electricity price forecasting, and the accuracy of the prediction intervals was further improved by incorporating a bootstrapping method.


Experiments illustrated the enhanced uncertainty estimation capability of the model. However, even though ELM-based methods have been verified to be efficient, they still rely on highly random assignments of the model weights and biases; this randomness limits the performance of the ELM and may lead to lower accuracy in some cases. Therefore, researchers have proposed improved algorithms to address the randomness, for instance, wavelet-based and ensemble algorithms [81]; the decomposition and reconstruction mechanism provided by the wavelet transform enhances the forecasting performance [81]. However, high-frequency information may be lost when the wavelet components are adopted. To address this issue, Amjady et al. adopted a fuzzy neural network (FNN) for electricity price forecasting, which combines nonlinear functions through fuzzy logic [82]. Moreover, Pindoriya et al. adopted an adaptive wavelet-ANN in [83], and the results show that the wavelet-ANN outperforms the multi-layer perceptron [84] and the radial basis function NN [85]. [86] presented an ANN model based on similar days, and numerical studies based on the forecast mean square error and the mean absolute percentage error revealed the efficiency and accuracy of the proposed model. Azevedo et al. trained a neural network with probabilistic information from past years to forecast the minimum and maximum marginal prices during certain periods [87]. In [88], fully connected, convolutional and recurrent networks were analyzed together with three widely used optimization techniques, and the experiments demonstrated that a single-layer recurrent neural network with the root mean square propagation optimizer performs best on S&P 500 index forecasting. Aiming at short-term price forecasting, [89] presented an ANN trained by an improved Levenberg–Marquardt algorithm; in the experiments, the ANN obtained the best mean absolute percentage error among the compared methods, especially during peak hours. Amjady et al. combined a neural network and an evolutionary algorithm to forecast the price in both the time and wavelet domains [90]. [91] combined the advantages of the LSTM model with two popular approaches, AdaGrad and RMSProp, to perform stochastic gradient-based optimization; the merit of the proposed model was illustrated by experiments with data from New South Wales, Australia. Focusing on the cost of cables in electrical engineering, [92] used an RNN model to analyze the value of copper and cables of other specifications, and the effectiveness of the model was demonstrated in the paper. Singhal et al. [93] presented short-term market-clearing price forecasting with fuzzy C-means and a three-layer RNN based on chaos theory; price spike forecasting is specifically considered in the experiments, and the comparisons verified the benefit of introducing fuzzy C-means. Similarly, the chaotic characteristics of the electricity price are also considered in [94] to represent the trends of adjacent phase points.
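To make the ELM idea tangible, the sketch below builds a bare-bones extreme learning machine in NumPy for a synthetic next-hour price regression task: the input-to-hidden weights are drawn at random and left untrained, and the output weights are obtained analytically by least squares. The data and the hidden-layer size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic regression task standing in for price forecasting:
# map the last 24 hourly prices to the next-hour price.
t = np.arange(3000)
price = 40 + 10 * np.sin(2 * np.pi * t / 24) + 2 * rng.standard_normal(t.size)
X = np.stack([price[i:i + 24] for i in range(len(price) - 24)])
y = price[24:]
X_train, X_test, y_train, y_test = X[:2500], X[2500:], y[:2500], y[2500:]

n_hidden = 200
# ELM: the input-to-hidden weights and biases are drawn at random and never trained...
W = rng.standard_normal((X.shape[1], n_hidden))
b = rng.standard_normal(n_hidden)
H_train = np.tanh(X_train @ W + b)
# ...and the hidden-to-output weights are obtained analytically by least squares.
beta, *_ = np.linalg.lstsq(H_train, y_train, rcond=None)

H_test = np.tanh(X_test @ W + b)
pred = H_test @ beta
print("test MAE:", np.round(np.abs(pred - y_test).mean(), 3))
```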


2.4 The Electrical Vehicle Charging Station Charging Power Forecasting

Driven by the need to reduce carbon emissions, the electrification of the transportation system, i.e., the electric vehicle, has become one of the most important concerns in recent years. Current works on EVCS or EV charging power forecasting can be divided into two main categories, namely model-based and data-driven approaches. Different from load forecasting and price forecasting, the EVCS charging power depends mainly on the charging behavior of EV users rather than on the natural environment.

2.4.1 Model-Based Electrical Vehicle Charging Station Charging Power Forecasting Method

The first category focuses on modeling the behavior of EV users. For instance, [95] employed trip chains to establish a spatial-temporal behavior model of EVs, with Monte Carlo simulation and Markov decision process theory employed to forecast the charging power; temperature, traffic conditions and the willingness of EV users in different scenarios are considered in the model to obtain the forecast distribution. Considering the impact of traffic on charging behavior, [96] provided a spatial-temporal distribution approach for EVCS charging power forecasting, in which a dynamic urban road network model is applied and a real-time Dijkstra dynamic path search algorithm is introduced to model the charging behavior. Based on the case of Shenzhen, China, which operates the largest electric bus and electric taxi fleet in the world, the charging behaviors of EVs were considered for systematic forecasting of the EV charging power [97]. [98] derived statistical driving behavior patterns and EVCS charging habits by simulating EV operation; private EVs, electric taxis and electric buses are categorized separately for prediction in this study, and the results show that the maximum value of the charging profile is 1760 MW at 21:30 within a day. Guo et al. presented a load forecasting method based on the number of vehicles [99]; the distribution of the charging stations and the location map were considered in the forecasting, and comparisons with linear regression and gray prediction showed the superior performance of the proposed method.
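As a toy illustration of the model-based (behavioral simulation) approach, the sketch below runs a Monte Carlo simulation of one day of station charging: arrival times and required energies are drawn from assumed distributions, and repeated realizations give an expected charging profile together with a simple uncertainty band. All distributions and parameters are hypothetical and far simpler than the trip-chain or road-network models cited above.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_evcs_day(n_ev=200, charger_kw=7.0):
    """One Monte Carlo realisation of a day of charging at a station:
    arrival times and required energies are drawn at random (assumed distributions)."""
    load = np.zeros(24 * 60)                          # minute-resolution charging power (kW)
    arrivals = rng.normal(loc=18 * 60, scale=90, size=n_ev).astype(int)   # evening peak
    energy_needed = rng.uniform(5, 30, size=n_ev)     # kWh still required per EV
    for t0, e in zip(arrivals, energy_needed):
        minutes = int(e / charger_kw * 60)            # charging duration at rated power
        t0 = np.clip(t0, 0, load.size - 1)
        load[t0:t0 + minutes] += charger_kw
    return load[: 24 * 60]

# Averaging many realisations yields an expected charging-power profile,
# and the spread across realisations gives a simple uncertainty band.
runs = np.stack([simulate_evcs_day() for _ in range(500)])
hourly = runs.reshape(runs.shape[0], 24, 60).mean(axis=2)
peak_hour = hourly.mean(axis=0).argmax()
print("expected peak hour:", peak_hour)
print("90% band at peak hour (kW):",
      np.round(np.percentile(hourly[:, peak_hour], [5, 95]), 1))
```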

2.4.2 Data-Driven Electrical Vehicle Charging Station Charging Power Forecasting Method

The development of cloud services and Internet of Things technologies makes massive charging process data, such as the charging time and energy of EVs, available for collection. Thus, data-driven algorithms have become another popular category in the field of EVCS charging power forecasting [100].


[100] reviewed numerous data-driven approaches such as ANN, SVM and K-nearest neighbors (KNN); with such forecasts, intelligent transport systems can better guide EVs. Some classical data-driven forecasting algorithms are widely employed. For instance, [101] applied an autoregressive integrated moving average (ARIMA) model to forecast the aggregated EVCS charging load of 2400 charging stations distributed across Washington State and San Diego. [102] combined the least squares support vector machine algorithm with fuzzy clustering for the EVCS charging power forecasting task; the wolf pack algorithm was also applied to construct a hybrid model that achieves high prediction accuracy and good stability. Based on historical data from Nebraska, USA, the effectiveness of extreme gradient boosting for charging power forecasting was validated in [103], and numerous studies demonstrated that the algorithm is capable of modeling the charging behavior of plug-in EVs. The above algorithms mainly depend on input data with specific human-defined features, which normally require cumbersome engineering effort [104]. Hence, methods such as the deep learning approach presented in [105] are becoming increasingly popular in this field. For instance, empowered by an unprecedented ability to learn from extensive data, deep learning has been applied to the super-short-term stochastic load forecasting of plug-in EVs [104]; the experiments showed that the long short-term memory method could reduce the forecasting error by about 30% compared with other ANN and RNN models on this task. Moreover, considering the temporal dependency of the data, [106] implemented a dilated convolutional network with a mean square loss to capture the peaks of the charging power, where a spatiotemporal convolutional architecture and a temporal convolutional network work together to achieve effective results. [107] introduced the LSTM network as well as XGBoost for the analysis and forecasting of the electrical charging load, and experiments with data from charging stations in Jiangsu were used to demonstrate the efficiency of the algorithm. Besides, focusing on variables at macro- and micro-scale geographical levels, [108] illustrated the superior performance of the LSTM on the seasonal trend of EV charging volumes; through abundant experiments based on the number of charging stations and the charging volume at different levels, the advantages of the deep learning method were illustrated. Moreover, research on the probabilistic forecasting of EVCS charging is emerging. The above-mentioned algorithms are point forecasting methods, which only provide the expected future charging power, whereas probabilistic forecasting provides more information, such as the uncertainty of the forecast value [109, 110]. Only a few works exist on this topic. Buzna et al. proposed an ensemble method for the probabilistic forecasting of the EVCS using quantile regression forests and a neural network [109]; compared with non-hierarchical frameworks, the presented approach increases the forecast accuracy by up to 9.5%. Besides, Zhang et al. proposed a convolutional neural network to deal with the non-stationary features of the traffic flow and to construct prediction intervals for the EVCS charging power [110]; the experiments indicated the potential of modeling driver behaviors, such as the arrival rate and the charging process, through a mixture model. [111] applied a Markov chain cyclical model to forecast the status of charging stations.


The method accounts for the vehicle distribution, the average plug-in time and the amount of energy withdrawn. Buzna et al. compared time-series and machine learning approaches in [112], with comparative analyses conducted on a dataset containing 1700 charging stations in the Netherlands; random forest, gradient boosting (XGBoost) and ANN were used for comparison, and the results demonstrated that the seasonal autoregressive integrated moving average model with exogenous regressors outperformed the others. Considering plug-in hybrid EVs, [113] introduced a reinforcement learning algorithm, Q-Learning, for smart, uncoordinated and coordinated scenarios; simulations based on the open-source Keras software then demonstrated the effectiveness of the algorithm. Jahromi et al. presented an end-to-end deep learning algorithm integrated with kernel density estimation, which takes both spatial and temporal features into consideration [114]; numerical results on a practical set of 348 residential electric vehicles and comparisons with deep and shallow methods illustrated the effectiveness of the proposed approach. To capture correlation information, [115] introduced a temporal graph convolution model for both short- and long-term forecasting; complex spatio-temporal information was modeled through a raster map, and the results demonstrated superior performance in capturing the spatial and temporal correlations of the charging station network.
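To illustrate the probabilistic, data-driven direction, the sketch below fits one gradient-boosting model per quantile (pinball loss) to a synthetic hourly EVCS charging-power dataset, yielding a prediction interval rather than a point forecast. The data generator, features and quantile levels are illustrative assumptions, and scikit-learn is assumed to be available; the cited works use more elaborate models and real data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)

# Synthetic hourly EVCS charging power with an evening peak and skewed noise.
hours = rng.integers(0, 24, size=5000)
dist = np.minimum(np.abs(hours - 19), 24 - np.abs(hours - 19))   # circular distance to 19:00
power = 50 * np.exp(-dist ** 2 / 18.0) + rng.gamma(2.0, 3.0, size=5000)
X = hours.reshape(-1, 1).astype(float)

# One gradient-boosting model per quantile yields a prediction interval
# instead of a single point forecast.
quantiles = [0.1, 0.5, 0.9]
models = {q: GradientBoostingRegressor(loss="quantile", alpha=q,
                                       n_estimators=200, max_depth=3).fit(X, power)
          for q in quantiles}

X_eval = np.arange(24).reshape(-1, 1).astype(float)
preds = {q: models[q].predict(X_eval) for q in quantiles}
peak = int(preds[0.5].argmax())
print(f"peak hour {peak}: median {preds[0.5][peak]:.1f} kW, "
      f"80% interval [{preds[0.1][peak]:.1f}, {preds[0.9][peak]:.1f}] kW")
```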

References 1. H. Zareipour, C.A. Canizares, K. Bhattacharya, Economic impact of electricity market price forecasting errors: a demand-side analysis. IEEE Trans. Power Syst., 25(1), 254–262 (2009) Publisher: IEEE 2. L. Cheng, H. Zang, X. Yan, Z. Wei, G. Sun, Probabilistic residential load forecasting based on micrometeorological data and customer consumption pattern. IEEE Trans. Power Syst. 36(4), 3762–3775 (2021) 3. C.N. Yu, P. Mirowski, T.K. Ho, A sparse coding approach to household electricity demand forecasting in smart grids. IEEE Trans. Smart Grid 8(2), 738–748 (2016) 4. P. Lusis, K.R. Khalilpour, L. Andrew, A. Liebman, Impact of calendar effects and forecast granularity, Short-term residential load forecasting. Appl. Energy 205, 654–669 (2017) 5. Y.-H. Hsiao, Household electricity demand forecast based on context information and user daily schedule analysis from meter data. IEEE Trans. Indus. Inf. 11(1), 33–43 (2014) 6. P. Li, B. Zhang, Y. Weng, R. Rajagopal, A sparse linear model and significance test for individual consumption prediction. IEEE Transactions on Power Systems 32(6), 4489–4500 (2017) 7. S.R. Khuntia, J.L. Rueda, M. AMM van Der Meijden, Forecasting the load of electrical power systems in mid-and long-term horizons: a review. IET Gener. Transmission Distribut., 10(16), 3971–3977 (2016). Publisher: Wiley Online Library 8. M. Chaouch, Clustering-based improvement of nonparametric functional time series forecasting: application to intra-day household-level load curves. IEEE Trans. Smart Grid 5(1), 411–419 (2013) 9. J. Kwac, J. Flora, Ram Rajagopal, Household energy consumption segmentation using hourly data. IEEE Trans. Smart Grid 5(1), 420–430 (2014)


10. F.L. Quilumba, W.J. Lee, H. Huang, D.Y. Wang, R.L. Szabados, Using smart meter data to improve the accuracy of intraday load forecasting considering customer behavior similarities. IEEE Trans. Smart Grid 6(2), 911–918 (2014)
11. B. Stephen, X. Tang, P.R. Harvey, S. Galloway, K.I. Jennett, Incorporating practice theory in sub-profile models for short term aggregated residential load forecasting. IEEE Trans. Smart Grid 8(4), 1591–1598 (2015)
12. Y. Wang, Q. Chen, C. Kang, Q. Xia, Clustering of electricity consumption behavior dynamics toward big data applications. IEEE Trans. Smart Grid 7(5), 2437–2447 (2016)
13. M. Afzalan, F. Jazizadeh, Residential loads flexibility potential for demand response using energy consumption patterns and user segments. Appl. Energy 254, 113693 (2019)
14. D. Dalal, A. Pal, P. Augustin, Representative scenarios to capture renewable generation stochasticity and cross-correlations. arXiv preprint arXiv:2202.03588 (2022)
15. A.N. de Paula, E. José de Oliveira, L. de Mello Honorio, L.W. de Oliveira, C.A. Moraes, m-ISODATA: unsupervised clustering algorithm to capture representative scenarios in power systems. Int. Trans. Electric. Energy Syst. 31(9), e13005 (2021)
16. A. Capozzoli, M.S. Piscitelli, S. Brandi, D. Grassi, G. Chicco, Automated load pattern learning and anomaly detection for enhancing energy management in smart buildings. Energy 157, 336–352 (2018)
17. J. Che, F. Yuan, S. Zhu, Y. Yang, An adaptive ensemble framework with representative subset based weight correction for short-term forecast of peak power load. Appl. Energy 328, 120156 (2022)
18. T. Teeraratkul, D. O’Neill, S. Lall, Shape-based approach to household electric load curve clustering and prediction. IEEE Trans. Smart Grid 9(5), 5196–5206 (2017)
19. G. Xie, X. Chen, Y. Weng, An integrated gaussian process modeling framework for residential load prediction. IEEE Trans. Power Syst. 33(6), 7238–7248 (2018)
20. D.W. Van der Meer, M. Shepero, A. Svensson, J. Widén, J. Munkhammar, Probabilistic forecasting of electricity consumption, photovoltaic power generation and net demand of an individual building using gaussian processes. Appl. Energy 213, 195–207 (2018)
21. A. Rajagukguk, I. Mado, A. Triwiyatno, A. Fadllullah, Short-term electricity load forecasting model based dsarima. Electric. Eng. Depart. Faculty Eng. Universitas Riau 5, 6–11 (2022)
22. N. Amjady, Short-term hourly load forecasting using time-series modeling with peak load estimation capability. IEEE Trans. Power Syst. 16(3), 498–505 (2001)
23. N. Sultana, S.M.Z. Hossain, S.H. Almuhaini, D. Düştegör, Bayesian optimization algorithm-based statistical and machine learning approaches for forecasting short-term electricity demand. Energies 15(9), 3425 (2022)
24. M. Mustapha, M.W. Mustafa, S. Salisu, I. Abubakar, A.Y. Hotoro, A statistical data selection approach for short-term load forecasting using optimized ANFIS, in IOP Conference Series: Materials Science and Engineering, vol. 884, Issue 1 (IOP Publishing, 2020), p. 012075
25. A.S. Nair, M. Campion, D. Hollingworth, P. Ranganathan, Two-stage load forecasting for residual reduction and economic dispatch using PJM datasets, in 2018 IEEE International Conference on Electro/Information Technology (EIT) (IEEE, 2018), pp. 0691–0695
26. H. Matsila, P. Bokoro, Load forecasting using statistical time series model in a medium voltage distribution network, in IECON 2018-44th Annual Conference of the IEEE Industrial Electronics Society (IEEE, 2018), pp. 4974–4979
27. H. Aprillia, H.T. Yang, C.M. Huang, Statistical load forecasting using optimal quantile regression random forest and risk assessment index. IEEE Trans. Smart Grid 12(2), 1467–1480 (2020)
28. N. Huang, W. Wang, S. Wang, J. Wang, G. Cai, L. Zhang, Incorporating load fluctuation in feature importance profile clustering for day-ahead aggregated residential load forecasting. IEEE Access 8, 25198–25209 (2020)
29. A.M. Pirbazari, A. Chakravorty, C. Rong, Evaluating feature selection methods for short-term load forecasting, in 2019 IEEE International Conference on Big Data and Smart Computing (BigComp) (IEEE, 2019), pp. 1–8
30. S. Zhao, Q. Wu, Y. Zhang, J. Wu, X.A. Li, An asymmetric bisquare regression for mixed cyberattack-resilient load forecasting. Expert Syst. Appl. 210, 118467 (2022)
31. J. Jiao, Z. Tang, P. Zhang, M. Yue, C. Chen, J. Yan, Ensuring cyberattack-resilient load forecasting with a robust statistical method, in 2019 IEEE Power and Energy Society General Meeting (PESGM) (IEEE, 2019), pp. 1–5
32. S. Hadri, Y. Naitmalek, M. Najib, M. Bakhouya, Y. Fakhri, M. Elaroussi, A comparative study of predictive approaches for load forecasting in smart buildings. Procedia Comput. Sci. 160, 173–180 (2019)
33. S.D. Haleema, Short-term load forecasting using statistical methods: a case study on load data
34. M. Beccali, M. Cellura, V. Lo Brano, A. Marvuglia, Short-term prediction of household electricity consumption: assessing weather sensitivity in a mediterranean area. Renew. Sustain. Energy Rev. 12(8), 2040–2065 (2008)
35. Y. Wang, Q. Chen, T. Hong, C. Kang, Review of smart meter data analytics: applications, methodologies, and challenges. IEEE Trans. Smart Grid 10(3), 3125–3148 (2018)
36. D. Solyali, A comparative analysis of machine learning approaches for short-/long-term electricity load forecasting in Cyprus. Sustainability 12(9), 3612 (2020)
37. J. Wu, Y.G. Wang, Y.C. Tian, K. Burrage, T. Cao, Support vector regression with asymmetric loss for optimal electric load forecasting. Energy 223, 119969 (2021)
38. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
39. S. Mujeeb, N. Javaid, M. Akbar, R. Khalid, O. Nazeer, M. Khan, Big data analytics for price and load forecasting in smart grids, in International Conference on Broadband and Wireless Computing, Communication and Applications (Springer, 2018), pp. 77–87
40. G.C. Liao, Fusion of improved sparrow search algorithm and long short-term memory neural network application in load forecasting. Energies 15(1), 130 (2021)
41. S. Smyl, G. Dudek, P. Pelka, ES-dRNN: a hybrid exponential smoothing and dilated recurrent neural network model for short-term load forecasting. arXiv preprint arXiv:2112.02663 (2021)
42. V. Veeramsetty, R. Deshmukh, Electric power load forecasting on a 33/11 kV substation using artificial neural networks. SN Appl. Sci. 2(5), 1–10 (2020)
43. Y. Wang, D. Gan, M. Sun, N. Zhang, L. Zongxiang, C. Kang, Probabilistic individual load forecasting using pinball loss guided lstm. Appl. Energy 235, 10–20 (2019)
44. H. Shi, X. Minghao, R. Li, Deep learning for household load forecasting-a novel pooling deep rnn. IEEE Trans. Smart Grid 9(5), 5271–5280 (2018)
45. W. Kong, Z.Y. Dong, Y. Jia, D.J. Hill, Y. Xu, Y. Zhang, Short-term residential load forecasting based on lstm recurrent neural network. IEEE Trans. Smart Grid 10(1), 841–851 (2019)
46. A. Estebsari, R. Rajabi, Single residential load forecasting using deep learning and image encoding techniques. Electronics 9(1), 68 (2020)
47. W. Lin, W. Di, B. Boulet, Spatial-temporal residential short-term load forecasting via graph neural networks. IEEE Trans. Smart Grid 12(6), 5373–5384 (2021)
48. Y. Wang, J. Chen, X. Chen, X. Zeng, Y. Kong, S. Sun, Y. Guo, Y. Liu, Short-term load forecasting for industrial customers based on TCN-LightGBM. IEEE Trans. Power Syst. 36(3), 1984–1997 (2020)
49. Z. Tavassoli-Hojati, S.F. Ghaderi, H. Iranmanesh, P. Hilber, E. Shayesteh, A self-partitioning local neuro fuzzy model for short-term load forecasting in smart grids. Energy 199, 117514 (2020)
50. A.O. Hoori, A. Al Kazzaz, R. Khimani, Y. Motai, A.J. Aved, Electric load forecasting model using a multicolumn deep neural networks. IEEE Trans. Indus. Electron. 67(8), 6473–6482 (2019)
51. L. Wang, E.W.M. Lee, R.K.K. Yuen, Novel dynamic forecasting model for building cooling loads combining an artificial neural network and an ensemble approach. Appl. Energy 228, 1740–1753 (2018)
52. B.N. Oreshkin, G. Dudek, P. Pelka, E. Turkina, N-BEATS neural network for mid-term electricity load forecasting. Appl. Energy 293, 116918 (2021)
53. W. Li, G. Gong, H. Fan, P. Peng, L. Chun, Meta-learning strategy based on user preferences and a machine recommendation system for real-time cooling load and COP forecasting. Appl. Energy 270, 115144 (2020)
54. M. Cui, J. Wang, M. Yue, Machine learning-based anomaly detection for load forecasting under cyberattacks. IEEE Trans. Smart Grid 10(5), 5724–5734 (2019)
55. H. Li, Short term load forecasting by adaptive neural network, in IOP Conference Series: Materials Science and Engineering, vol. 449, Issue 1 (IOP Publishing, 2018), p. 012028
56. F. Mohammad, Y.C. Kim, Energy load forecasting model based on deep neural networks for smart grids. Int. J. Syst. Assur. Eng. Manage. 11(4), 824–834 (2020)
57. L. Xu, S. Wang, R. Tang, Probabilistic load forecasting for buildings considering weather forecasting uncertainty and uncertain peak load. Appl. Energy 237, 180–195 (2019)
58. S. Khan, N. Javaid, A. Chand, A.B.M. Khan, F. Rashid, I.U. Afridi, Electricity load forecasting for each day of week using deep CNN, in Workshops of the International Conference on Advanced Information Networking and Applications (Springer, 2019), pp. 1107–1119
59. P.H. Kuo, C.J. Huang, A high precision artificial neural networks model for short-term energy load forecasting. Energies 11(1), 213 (2018)
60. G. Xiuyun, W. Ying, G. Yang, S. Chengzhi, X. Wen, Y. Yimiao, Short-term load forecasting model of gru network based on deep learning framework, in 2018 2nd IEEE Conference on Energy Internet and Energy System Integration (EI2) (IEEE, 2018), pp. 1–4
61. R. Gao, L. Du, P.N. Suganthan, Q. Zhou, K.F. Yuen, Random vector functional link neural network based ensemble deep learning for short-term load forecasting. Expert Syst. Appl. 206, 117784 (2022)
62. C. Feng, M. Sun, J. Zhang, Reinforced deterministic and probabilistic load forecasting via Q-learning dynamic model selection. IEEE Trans. Smart Grid 11(2), 1377–1386 (2019)
63. S. Aslam, H. Herodotou, S.M. Mohsin, N. Javaid, N. Ashraf, S. Aslam, A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids. Renew. Sustain. Energy Rev. 144, 110992 (2021)
64. F.J. Nogales, J. Contreras, A.J. Conejo, R. Espinola, Forecasting next-day electricity prices by time series models. IEEE Trans. Power Syst. 17(2), 342–348 (2002)
65. R. Weron, A. Misiorek, Forecasting spot electricity prices: a comparison of parametric and semiparametric time series models. Int. J. Forecasting 24(4), 744–763 (2008)
66. M. Alamaniotis, D. Bargiotas, N.G. Bourbakis, L.H. Tsoukalas, Genetic optimal regression of relevance vector machines for electricity pricing signal forecasting in smart grids. IEEE Trans. Smart Grid 6(6), 2997–3005 (2015)
67. T. Haida, S. Muto, Regression based peak load forecasting using a transformation technique. IEEE Trans. Power Syst. 9(4), 1788–1794 (1994)
68. D. Cao et al., Robust deep gaussian process-based probabilistic electrical load forecasting against anomalous events. IEEE Trans. Indus. Inf. 18(2), 1142–1153 (2022)
69. T. Kristiansen, A time series spot price forecast model for the Nord Pool market. Int. J. Electric. Power Energy Syst. 61, 20–26 (2014)
70. N. Amjady, M. Hemmati, Day-ahead price forecasting of electricity markets by a hybrid intelligent system. Euro. Trans. Electric. Power 19(1), 89–102 (2009)
71. A. Cervone, E. Santini, S. Teodori, D.Z. Romito, Electricity price forecast: a comparison of different models to evaluate the single national price in the Italian energy exchange market. Int. J. Energy Econ. Policy 4(4), 744–758 (2014)
72. O. Abedinia, N. Amjady, H. Zareipour, A new feature selection technique for load and price forecast of electrical power systems. IEEE Trans. Power Syst. 32(1), 62–74 (2016)
73. O. Abedinia, N. Amjady, Day-ahead price forecasting of electricity markets by a new hybrid forecast method. Model. Simul. Electric. Electron. Eng. 1(1), 1–7 (2015)
74. G. Aneiros, J. Vilar, P. Raña, Short-term forecast of daily curves of electricity demand and price. Int. J. Electric. Power Energy Syst. 80, 96–108 (2016)
75. R.J. Bessa, M.A. Matos, Global against divided optimization for the participation of an EV aggregator in the day-ahead electricity market. Part I: theory. Electric Power Syst. Res. 95, 309–318 (2013)
76. Y. Li, S. He, Y. Li, L. Ge, S. Lou, Z. Zeng, Probabilistic charging power forecast of EVCS: reinforcement learning assisted deep learning approach. IEEE Trans. Intell. Vehicles, in press
77. J.H. Zhao, Z.Y. Dong, Z. Xu, K.P. Wong, A statistical approach for interval forecasting of the electricity price. IEEE Trans. Power Syst. 23(2), 267–276 (2008)
78. P. Mandal, T. Senjyu, N. Urasaki, T. Funabashi, A.K. Srivastava, A novel approach to forecast electricity price for PJM using neural network and similar days method. IEEE Trans. Power Syst. 22(4), 2058–2065 (2007)
79. R. Zhang, G. Li, Z. Ma, A deep learning based hybrid framework for day-ahead electricity price forecasting. IEEE Access 8, 143423–143436 (2020)
80. G. Huang, Q. Zhu, C. Siew, Extreme learning machine: theory and applications. Neurocomputing 70(1) (2006)
81. S. Li, P. Wang, L. Goel, A novel wavelet-based ensemble method for short-term load forecasting with hybrid neural networks and feature selection. IEEE Trans. Power Syst. 31(3), 1788–1798 (2016)
82. N. Amjady, Day-ahead price forecasting of electricity markets by a new fuzzy neural network. IEEE Trans. Power Syst. 21(2), 887–896 (2006)
83. N.M. Pindoriya, S.N. Singh, S.K. Singh, An adaptive wavelet neural network-based energy price forecasting in electricity markets. IEEE Trans. Power Syst. 23(3), 1423–1432 (2008)
84. L. Zhang, P.B. Luh, Neural network-based market clearing price prediction and confidence interval estimation with an improved extended Kalman filter method. IEEE Trans. Power Syst. 20(1), 59–66 (2005)
85. J. Guo, P.B. Luh, Improving market clearing price prediction by using a committee machine of neural networks. IEEE Trans. Power Syst. 19(4), 1867–1876 (2004)
86. P. Mandal, T. Senjyu, N. Urasaki, T. Funabashi, A.K. Srivastava, A novel approach to forecast electricity price for PJM using neural network and similar days method. IEEE Trans. Power Syst. 22(4), 2058–2065 (2007)
87. F. Azevedo, Z.A. Vale, Short-term price forecast from risk management point of view, in Proceedings of the 13th International Conference on Intelligent Systems Application to Power Systems (IEEE, 2005), pp. 111–116
88. F. Kamalov, L. Smail, I. Gurrib, Stock price forecast with deep learning, in 2020 International Conference on Decision Aid Sciences and Application (DASA) (IEEE, 2020), pp. 1098–1102
89. M.K. Kim, A new approach to short-term price forecast strategy with an artificial neural network approach: application to the Nord Pool. J. Electric. Eng. Technol. 10(4), 1480–1491 (2015)
90. N. Amjady, F. Keynia, Day ahead price forecasting of electricity markets by a mixed data model and hybrid forecast method. Int. J. Electric. Power Energy Syst. 30(9), 533–546 (2008)
91. Z. Chang, Y. Zhang, W. Chen, Effective adam-optimized LSTM neural network for electricity price forecasting, in 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS) (IEEE, 2018), pp. 245–248
92. Y. Fu, Q.F. Zhang, C.Y. Zhang, Cable price forecast based on neural network models, in International Conference on Signal and Information Processing, Networking and Computers (Springer, 2022), pp. 1254–1260
93. D. Singhal, K.S. Swarup, Electricity price forecasting using artificial neural networks. Int. J. Electric. Power Energy Syst. 33(3), 550–555 (2011)
94. H. Yang, M. Lai, Chaotic characteristics of electricity price and its forecasting model. Australian J. Electric. Electron. Eng. 2(2), 117–125 (2005)
95. S. Cheng, Z. Wei, D. Shang, Z. Zhao, H. Chen, Charging load prediction and distribution network reliability evaluation considering electric vehicles’ spatial-temporal transfer randomness. IEEE Access 8, 124084–124096 (2020)
96. L. Chen, F. Yang, Q. Xing, S. Wu, R. Wang, J. Chen, Spatial-temporal distribution prediction of charging load for electric vehicles based on dynamic traffic information, in 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (2020), pp. 1269–1274
97. Y. Zheng, Z. Shao, Y. Zhang, L. Jian, A systematic methodology for mid-and-long term electric vehicle charging load forecasting: the case study of Shenzhen, China. Sustain. Cities Soc. 56, 102084 (2020)
98. F. Bizzarri, F. Bizzozero, A. Brambilla, G. Gruosso, G.S. Gajani, Electric vehicles state of charge and spatial distribution forecasting: a high-resolution model, in IECON 2016-42nd Annual Conference of the IEEE Industrial Electronics Society (IEEE, 2016), pp. 3942–3947
99. G. Chunlin, Q. Wenbo, W. Li, D. Hang, H. Pengxin, X. Xiangning, A method of electric vehicle charging load forecasting based on the number of vehicles, in International Conference on Sustainable Power Generation and Supply (SUPERGEN 2012) (IET, 2012), pp. 1–5
100. H.J. Feng, L.C. Xi, Y.Z. Jun, Y.X. Ling, H. Jun, Review of electric vehicle charging demand forecasting based on multi-source data, in 2020 IEEE Sustainable Power and Energy Conference (2020), pp. 139–146
101. H.M. Louie, Time-series modeling of aggregated electric vehicle charging station load. Electric Power Components Syst. 45(14), 1498–1511 (2017)
102. X. Zhang, Short-term load forecasting for electric bus charging stations based on fuzzy clustering and least squares support vector machine optimized by wolf pack algorithm. Energies 11(6), 1449 (2018)
103. A. Almaghrebi, F. Aljuheshi, M. Rafaie, K. James, M. Alahmad, Data-driven charging demand prediction at public charging stations using supervised machine learning regression methods. Energies 13(16), 4231 (2020)
104. J. Zhu, Z. Yang, M. Mourshed, Y. Guo, Y. Zhou, Y. Chang, Y. Wei, S. Feng, Electric vehicle charging load forecasting: a comparative study of deep learning approaches. Energies 12(14), 2692 (2019)
105. Z. Li, Y. Li, Y. Liu, P. Wang, R. Lu, H.B. Gooi, Deep learning based densely connected network for load forecasting. IEEE Trans. Power Syst. 36(4), 2829–2840 (2021)
106. G. Guo, W. Yuan, Y. Lv, W. Liu, J. Liu, Traffic forecasting via dilated temporal convolution with peak-sensitive loss. IEEE Intelligent Transportation Systems Magazine, in press
107. M. Xue, L. Wu, Q.P. Zhang, J.X. Lu, X. Mao, Y. Pan, Research on load forecasting of charging station based on XGBoost and LSTM model. J. Phys.: Conf. Series 1757, 012145 (2021)
108. Y. Kim, S. Kim, Forecasting charging demand of electric vehicles using time-series models. Energies 14(5), 1487 (2021)
109. L. Buzna, P.D. Falco, G. Ferruzzi, S. Khormali, D. Proto, N. Refa, M. Straka, G. van der Poel, An ensemble methodology for hierarchical probabilistic electric vehicle load forecasting at regular charging stations. Appl. Energy 283, 116337 (2021)
110. X. Zhang, K.W. Chan, H. Li, H. Wang, J. Qiu, G. Wang, Deep-learning-based probabilistic forecasting of electric vehicle charging load with a novel queuing model. IEEE Trans. Cybernetics 51(6), 3157–3170 (2021)
111. G. Gruosso, A. Mion, G.S. Gajani, Forecasting of electrical vehicle impact on infrastructure: Markov chains model of charging stations occupation. eTransportation, in press
112. L. Buzna, P.D. Falco, S. Khormali, D. Proto, M. Straka, Electric vehicle load forecasting: a comparison between time series and machine learning approaches, in 2019 1st International Conference on Energy Transition in the Mediterranean Area (SyNERGY MED) (IEEE, 2019), pp. 1–5
113. M. Dabbaghjamanesh, A. Moeini, A. Kavousi-Fard, Reinforcement learning-based load forecasting of electric vehicle charging station using Q-learning technique. IEEE Trans. Indus. Inf. 17(6), 4229–4237 (2020)
114. A.J. Jahromi, M. Mohammadi, S. Afrasiabi, M. Afrasiabi, J. Aghaei, Probability density function forecasting of residential electric vehicles charging profile. Appl. Energy 323, 119616 (2022)
115. F.B. Hüttel, I. Peled, F. Rodrigues, F.C. Pereira, Deep spatio-temporal forecasting of electrical vehicle charging demand. arXiv preprint arXiv:2106.10940 (2021)

Chapter 3

Review for Smart Grid Dispatch

3.1 Introduction

As a new generation of power system, the smart grid is devoted to achieving sustainable, secure, reliable and flexible energy delivery through bidirectional power and information flows. On this basis, the smart grid offers a more efficient way to achieve optimal dispatch with lower generation cost and higher power quality via the integration of distributed sources and flexible loads, such as renewable energy and electric vehicles [1]. Optimal dispatch is one of the most pivotal issues in the smart grid and has been extensively investigated in previous research [2]. Smart grid dispatch problems have various real-world applications under different assumptions or requirements, including microgrid economic dispatch, optimal power flow, electric vehicle energy management and distribution network reconfiguration. As a result, smart grid dispatch is a computationally complicated optimization problem with multiple objectives, multiple constraints, high-dimensional variables and strong uncertainty, which is difficult to solve. To this end, several classes of methods have been developed to deal with this complex problem and meet real-world application requirements. Generally, the common solutions for smart grid dispatch can be divided into three categories, i.e., mathematical programming, evolutionary algorithms and AI-enabled computational methods.

The rest of this chapter is organized as follows. The real-world applications of smart grid dispatch are presented in Sect. 3.2, covering the distribution network, the microgrid network, electric vehicles and the integrated energy system. On this basis, existing solution methods are summarized in Sect. 3.3 from three aspects, i.e., mathematical programming, evolutionary algorithms and AI-enabled computational methods.

3.2 Real-World Applications

Compared with the traditional power system, the smart grid integrates more distributed renewable energy to promote sustainability. Under this circumstance, conventional centralized high-voltage power transmission is not economical, since renewable energy sources are usually dispersed and far away from the load center. As a result, the distribution network, the self-sufficient microgrid and the integrated energy system are gradually becoming more independent from the transmission network, which is also regarded as the developing trend of the smart grid [3–6]. In addition, recent years have witnessed the rapid development of electric vehicles, which have already become a critical component of the smart grid [7–9], as shown in Fig. 3.1. To this end, real-world applications regarding optimal dispatch of the distribution network, the microgrid, the integrated energy system and electric vehicles are summarized as follows.

Fig. 3.1 Optimal dispatch issues of smart grid operation

3.2.1 Distribution Network

In recent years, the operation of the distribution network (DN) has faced significant challenges, mainly due to the increasing deployment of distributed energy resources (DERs) and electric vehicles. Specifically, the uncertain RE power output affects the distribution and direction of DN power flows, which may further increase power losses and voltage fluctuations [10]. Traditional methods based on mathematical optimization may not effectively deal with this highly uncertain environment. More importantly, these traditional methods depend heavily on accurate distribution network parameters, which are difficult to acquire in practice.

3.2.2 Microgrid Network

A microgrid is a local electric power system with DERs, an energy storage system (ESS) and flexible loads [11]. Various objectives have been proposed in the field of microgrid optimal dispatch, such as maximizing the revenue of the operator, minimizing the operational cost, improving the satisfaction of users, reducing the delivery power loss, increasing RE utilization and promoting system stability. The decision variables of microgrid dispatch mainly include the electricity transaction price, the power allocation plan, the device operation states, etc. Therefore, a common model of microgrid optimal dispatch is expressed as follows:

\[
\min \sum_{t=1}^{T}\left[\sum_{k\in m}\pi_{dg}\!\left(P_{k}^{dg}(t)\right)+\eta_{m}\,\pi(t)\,P_{m}^{grid}(t)+\eta_{ess}\left|SOC(t)-SOC(t-1)\right|+\sum_{z=1}^{Z}\pi_{mz}\,q_{mz}(t)\,u_{mz}(t)\right]\qquad(3.1)
\]

where Eq. (3.1) denotes the operational cost of the mth microgrid over the dispatch period T. The first part represents the generation cost of the DERs, in which \(\pi_{dg}(P_{k}^{dg}(t))\) is a quadratic polynomial of the electricity generation \(P_{k}^{dg}\). The second part is the cost of the microgrid purchasing electricity from the main grid, where \(\eta_{m}\) denotes the loss coefficient of power delivery, and \(\pi(t)\) and \(P_{m}^{grid}(t)\) represent the electricity sale price and the energy purchased from the main grid, respectively. The third part denotes the loss cost of the ESS, where \(\eta_{ess}\) is the loss coefficient and \(SOC\) represents the energy state of the ESS. The last part of Eq. (3.1) is the internal dispatch cost of the microgrid, where \(\pi_{mz}\) denotes the dispatch cost of the zth response block, \(q_{mz}\) represents the response generation of the zth response block and \(u_{mz}\) is a binary variable denoting the participation status of demand response.

To ensure the secure operation of the microgrid, the following constraints should be taken into account. For example, the energy stored in the ESS during \(\Delta t\) should equal the difference between the charged and discharged energy:

\[
SOC(t)=SOC(t-1)+\eta_{ch}\,P_{ch}^{ESS}\,\Delta t-\frac{P_{dc}^{ESS}\,\Delta t}{\eta_{dc}}\qquad(3.2)
\]

where Eq. (3.2) describes the energy balance of the ESS. \(\eta_{ch}\) and \(\eta_{dc}\) denote the charging and discharging coefficients of the ESS, respectively, while \(P_{ch}^{ESS}\) and \(P_{dc}^{ESS}\) represent the amounts of ESS charging and discharging power. In addition, \(\Delta t\) is the duration of the tth time interval. In microgrid operation, the active power should also be balanced between the supply and demand sides:

\[
P_{m}^{grid}(t)+\sum_{k\in m}P_{k}^{dg}(t)+P_{dc}^{ESS}(t)+\sum_{z=1}^{Z}q_{mz}(t)\,u_{mz}(t)=P_{m}^{load}(t)+P_{ch}^{ESS}(t)\qquad(3.3)
\]

where \(P_{m}^{load}(t)\) represents the power load of the mth microgrid.

The managed objects of microgrid optimal dispatch can be divided into three categories, i.e., DERs, the ESS and user loads, as shown in Fig. 3.2. The management of DERs dispatches their generation, which mainly consists of PV, wind power, diesel generators (DG), fuel cells and so on [12]. As for the ESS, optimal management is achieved by dispatching its charging and discharging actions, in order to coordinate the supply and demand sides of the microgrid [13]. Besides, demand response is widely applied to dispatch the user loads of the microgrid, aiming to reduce the operational cost and enhance service reliability [14]. However, the microgrid dispatch model is difficult to optimize because of the uncertainty of renewable energy and user loads. Moreover, the high-dimensional variables and nonlinear constraints make the problem hard to solve.
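To make the dispatch model concrete, the sketch below implements a simplified single-microgrid version of Eqs. (3.1)–(3.3), assuming the cvxpy package is available; the demand-response blocks are treated as given (so the binary variables u_mz disappear) and all parameter values are invented purely for illustration.

```python
# Minimal sketch of the microgrid dispatch model of Eqs. (3.1)-(3.3);
# demand-response blocks are fixed and all data are illustrative only.
import numpy as np
import cvxpy as cp

T = 24                                   # dispatch horizon (hours)
price = 0.10 + 0.05 * np.random.rand(T)  # main-grid electricity price pi(t)
load = 3.0 + np.random.rand(T)           # microgrid load P_m^load(t) (MW)
dr_block = 0.5 * np.ones(T)              # fixed demand-response power q_mz(t)*u_mz(t)
a, b = 0.02, 0.08                        # quadratic DER generation cost coefficients
eta_m, eta_ess = 1.05, 0.01              # delivery-loss and ESS-loss coefficients
eta_ch, eta_dc = 0.95, 0.95              # ESS charging/discharging coefficients
pi_dr = 0.06                             # demand-response dispatch cost pi_mz

p_dg = cp.Variable(T, nonneg=True)       # DER generation P_k^dg(t)
p_grid = cp.Variable(T, nonneg=True)     # purchase from the main grid P_m^grid(t)
p_ch = cp.Variable(T, nonneg=True)       # ESS charging power
p_dc = cp.Variable(T, nonneg=True)       # ESS discharging power
soc = cp.Variable(T + 1)                 # ESS energy state SOC(t)

cost = cp.sum(a * cp.square(p_dg) + b * p_dg           # DER generation cost
              + eta_m * cp.multiply(price, p_grid)     # energy purchase cost
              + eta_ess * cp.abs(soc[1:] - soc[:-1])   # ESS loss cost
              + pi_dr * dr_block)                      # demand-response cost

constraints = [
    soc[0] == 2.0, soc >= 0.5, soc <= 4.0,                  # ESS energy limits
    p_dg <= 2.0, p_ch <= 1.0, p_dc <= 1.0,                  # device power limits
    soc[1:] == soc[:-1] + eta_ch * p_ch - p_dc / eta_dc,    # Eq. (3.2), Delta t = 1 h
    p_grid + p_dg + p_dc + dr_block == load + p_ch,         # Eq. (3.3) power balance
]

problem = cp.Problem(cp.Minimize(cost), constraints)
problem.solve()
print("operational cost:", round(problem.value, 2))
```

Fixing the demand-response decision keeps the problem convex; reintroducing u_mz as boolean variables would turn the sketch into a mixed-integer program requiring a MIP-capable solver.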

Fig. 3.2 Microgrid optimal dispatch method

3.2.3 Electric Vehicles

Electric vehicles (EVs) have developed rapidly worldwide in the past decade due to their low environmental impact [15]. In particular, reducing the charging cost by dispatching charging and discharging behaviors is a research hot spot. Owing to the flexibility of EV charging/discharging, some literature focuses on the coordinated dispatch of EVs and RE, which is devoted to promoting the utilization of RE by EVs. However, the uncertainty of renewable energy and user loads makes model construction difficult. At the same time, the massive number of EVs makes the optimization variables high dimensional, so the problem is hard to solve. On the one hand, traditional methods tend to estimate the randomness of EV charging behaviors before optimization and decision-making. On the other hand, multi-stage optimization is introduced to handle the problem caused by high-dimensional variables. Nevertheless, the optimization results of these methods depend heavily on the predictive accuracy.

3.2.4 Integrated Energy System

In order to address sustainable energy supply and environmental pollution, the integrated energy system (IES) has attracted extensive attention all over the world. It regards the electric power system as the core platform, integrates renewable energy on the source side and achieves the combined operation of cooling, heating and electric power on the load side [16]. However, the high penetration of renewable energy and flexible loads makes the IES a complicated dynamic system with strong uncertainty, which poses huge challenges to its secure and economic operation. Moreover, conventional optimization methods often rely on accurate mathematical models and parameters, which makes them unsuitable for the IES optimal dispatch problem under strong randomness.

3.3 The Methods for Smart Grid Dispatch

As mentioned above, smart grid dispatch is a complex optimization problem that is difficult to solve in practice. To this end, various solution techniques have been developed to handle this intractable problem, which can be divided into three categories, i.e., mathematical programming, evolutionary algorithms and AI-enabled methods. In the rest of this section, the applications of these solution approaches to smart grid dispatch are summarized comprehensively.

3.3.1 Mathematical Programming

Various mathematical optimization methods have been proposed for smart grid dispatch, which can be divided into four categories: linear programming (LP), mixed-integer programming (MIP), dynamic programming and nonlinear programming (NLP). In general, the smart grid dispatch problem is transformed into a mathematical programming model by relaxation or conversion. On this basis, mathematical programming approaches are applied to solve the transformed problems, which significantly reduces the complexity and ensures optimality. In the rest of this subsection, detailed applications of mathematical programming methods to smart grid dispatch are discussed as follows.

(a) Economic Dispatch

For instance, a mixed-integer linear programming model for the economic dispatch problem with LMP-dependent load is proposed in [17], and the equilibrium solution simultaneously offers the dispatch strategy and LMPs. Case studies demonstrate the difficulties of traditional approaches and the effectiveness of the proposed method. In [18], a method is presented to convert the nonlinear economic dispatch program into a mixed-integer linear program, which is reformulated as a two-stage linear program. Although this approximation does not guarantee optimality, more than 98% of the presented empirical results, based on the IEEE 118-bus and Polish systems, achieved global optimality. In the case of suboptimal solutions, the savings were still significant and the solution time was dramatically reduced. Reference [19] develops a feasible-region-projection-based approach to equivalently reformulate the robust dynamic economic dispatch as a single-level linear programming model that can be solved effectively while guaranteeing solution optimality. Numerical studies show the proposed approach is one order of magnitude faster than CCG. In addition, a data-driven two-stage day-ahead dispatch model for an island microgrid is proposed in [20] to reduce the intraday dispatch pressure. The first-stage dispatch model considers multiple demand responses and applies phase space reconstruction and machine learning to predict renewable energy output and power load. Since the deviation between the predicted value and the actual value can lead to renewable energy curtailment or load loss, the role of the second stage is to predict the renewable energy curtailment and load loss after the first-stage dispatch. Simulation results indicate that the proposed two-stage dispatch method using regulation reserve capacity not only reduces dispatch cost, but also improves system efficiency and reliability. Moreover, [21] develops a robust energy and reserve dispatch model to implement reliable and economical operation, in which the operating decisions are divided into predispatch and redispatch. On this basis, a cutting plane algorithm is established to solve the associated optimization problems. The proposed model and method are applied to a five-bus system as well as a realistic provincial power grid in China, and numerical experiments demonstrate that the proposed methodology is effective and efficient.

In [22], a dynamic constrained economic dispatch model is presented to achieve the following objectives: (1) better frequency performance under high renewable penetration; (2) optimal scheduling between energy, regulation reserve and regulation mileage while respecting the system and generator physical constraints; (3) better incentives for fast ramping units such as batteries or flywheels to provide high-quality regulation reserve. In comparison with the traditional economic dispatch model, the proposed approach can optimally schedule and allocate regulation reserve to cover the moment-to-moment generation-load imbalance, thereby improving frequency performance and economic efficiency. The simulation results show a significant improvement in the control performance standard score with the developed model while keeping the increase in total production cost moderate. Besides, a novel chance-constrained economic dispatch model is proposed in [23], where the generation of conventional units and curtailment strategies of renewable energy are co-optimized to minimize total operational cost and restrict operational risks. An efficient solution method is developed to schedule generation and curtailment sequentially, in which tractable optimization models are formulated for each step. Numerical tests demonstrate that the proposed model can significantly reduce the probability of emergency curtailment control compared with conventional models. Furthermore, this solution method also outperforms the scenario-based method with significantly improved computational efficiency and higher-quality solutions.

(b) Optimal Power Flow

For example, [24] presents three strong second-order cone programming (SOCP) relaxations for the AC OPF problem. These three relaxations are incomparable to each other, and two of them are incomparable to the standard SDP relaxation of OPF. Extensive computational experiments show that these relaxations have numerous advantages over existing convex relaxations in the literature. For example, one of the proposed SOCP relaxations together with IPOPT produces a feasible solution for the largest instance in the IEEE test cases (the 3375-bus system) and also certifies that this solution is within 0.13% of global optimality, all computed within 157.20 s on a modest personal computer. Overall, the proposed strong SOCP relaxations provide a practical approach to obtain feasible OPF solutions of very good quality within a time frame compatible with real-time operation in current industry practice. In [25], a sequential linear programming (SLP) approach is proposed that consists of a sequence of carefully constructed supporting hyperplanes and halfspaces, with the aim of leveraging the advantages of LP while retaining the accuracy of NLP interior point methods. The algorithm is numerically demonstrated to converge on 138 test cases with up to 3375 buses to feasible high-quality solutions (i) without AC feasibility restoration (i.e., using LP solvers exclusively), (ii) in computation times generally within the same order of magnitude as those from a state-of-the-art NLP solver and (iii) with robustness against the choice of starting point. Besides, a novel bi-level formulation based on a smoothing technique is proposed in [26], where any price-affecting strategic player can be modeled in the upper level, while the market-clearing problem in the lower level uses a convex quadratic transmission AC OPF, with the goal of achieving accuracy close to that of the exact nonlinear formulations.
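As a concrete illustration of the second-order cone relaxation idea used in several of the works above, the sketch below relaxes the branch-flow (DistFlow) equations of a single feeder line, again assuming the cvxpy package; the line and load data are illustrative only, and the relaxation step is the inequality replacing the quadratic flow definition.

```python
# Minimal SOCP relaxation of the branch-flow (DistFlow) model for a single
# line feeding one load bus; all data are illustrative only.
import cvxpy as cp

r, x = 0.02, 0.04            # line resistance and reactance (p.u.)
p_load, q_load = 0.8, 0.3    # active/reactive load at the receiving bus (p.u.)
c = 30.0                     # cost of power imported at the substation

P = cp.Variable()            # sending-end active power flow
Q = cp.Variable()            # sending-end reactive power flow
l = cp.Variable(nonneg=True) # squared line current
v0 = cp.Variable()           # squared substation voltage
v1 = cp.Variable()           # squared load-bus voltage

constraints = [
    v0 == 1.0,                                            # substation voltage fixed
    P - r * l == p_load,                                  # active power balance
    Q - x * l == q_load,                                  # reactive power balance
    v1 == v0 - 2 * (r * P + x * Q) + (r**2 + x**2) * l,   # voltage drop equation
    v1 >= 0.9**2, v1 <= 1.1**2,                           # voltage magnitude limits
    # relaxed flow definition: P^2 + Q^2 <= l * v0 (an SOC constraint)
    cp.quad_over_lin(cp.hstack([P, Q]), v0) <= l,
]

problem = cp.Problem(cp.Minimize(c * P), constraints)
problem.solve()
print("imported active power:", round(float(P.value), 4), "p.u.")
```

For radial feeders of this kind the relaxed inequality is typically tight at the optimum, which is what makes the SOCP relaxation attractive as a tractable surrogate for the non-convex AC OPF.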

Furthermore, in [27], a specific algorithm for the solution of a nonapproximated, non-convex AC OPF problem in radial distribution systems is designed to overcome the limitations of recent approaches for the solution of the OPF problem. It is based on the method of multipliers, as well as on a primal decomposition of the OPF problem, and both a centralized version and a distributed asynchronous version of the algorithm are provided. The performance of the proposed algorithm is evaluated on both small-scale electrical networks and a modified IEEE 13-bus test feeder. Reference [28] presents a solution for balanced radial networks. It exploits recent results that suggest solving for a globally optimal solution of OPF over a radial network through the second-order cone program relaxation. The distributed algorithm is based on the alternating direction method of multipliers (ADMM), but unlike standard ADMM-based distributed OPF algorithms that require solving optimization subproblems with iterative methods, the decomposition in that work allows closed-form solutions to be derived for these subproblems, greatly speeding up each ADMM iteration. Numerical experiments illustrate the scalability of the proposed algorithm by simulating it on a real-world 2065-bus distribution network. In addition, a fully decentralized OPF algorithm for multi-area interconnected power systems based on the distributed interior point method is proposed in [29], where solving the regional correction equation is converted into solving a parametric quadratic programming problem during each Newton–Raphson iteration. Simulation results on a three-bus test system, four IEEE test systems and a real four-area 6056-bus interconnected system show the benefits of the proposed method.

(c) Energy Management

For instance, [30] presents a new control strategy called the adaptive equivalent consumption minimization strategy (A-ECMS). This real-time energy management for HEVs is obtained by adding to the ECMS framework an on-the-fly algorithm that estimates the equivalence factor according to the driving conditions. The main idea is to periodically refresh the control parameter according to the current road load, so that the battery state of charge is maintained within its boundaries and the fuel consumption is minimized. The results obtained with A-ECMS show that the achievable fuel economy is only slightly suboptimal and the operation is charge sustaining. In [31], the real-time energy management problem of energy hubs is formulated in a dynamic pricing market. The interaction of the energy hubs is modeled as an exact potential game to optimize each energy hub's payments to the electricity and gas utilities, as well as the customers' satisfaction from energy consumption. The potential game approach offers a chance to study the existence and uniqueness of the Nash equilibrium and to design an online distributed algorithm to achieve that equilibrium. Simulation results show that the proposed algorithm can increase the energy hubs' average payoff by 18.8%. Besides, a distributed algorithm for online energy management in networked microgrids with a high penetration of distributed energy resources is presented in [32]. To address the high uncertainty in the networked microgrids, the online energy management is based on the ADMM algorithm with past power generation information from the DERs. The online algorithm provides a less conservative schedule than the robust optimization-based approach.

The effectiveness of the proposed algorithm is verified by various numerical examples. Reference [33] proposes a traffic-data-enabled predictive energy management framework for a power-split PHEV. Compared with conventional model predictive control, an additional supervisory SoC planning level is constructed based on real-time traffic data. Numerical results using real-world traffic data illustrate that the proposed strategy successfully incorporates dynamic traffic flow data into the PHEV energy management algorithm to achieve enhanced fuel economy. Besides, in [34], a decentralized energy management framework is developed to coordinate the power exchange between the distribution system and microgrids based on the alternating direction method of multipliers in a fully decentralized fashion. Moreover, the robust model is solved by a column-and-constraint generation algorithm, where cutting planes are introduced to ensure the exactness of the second-order cone relaxation. Numerical results on a modified IEEE 33-bus system with three microgrids validate the effectiveness of the proposed method. Reference [35] designs a distributed energy management strategy for the optimal operation of microgrids with consideration of the distribution network and the associated constraints. The simulation results demonstrate the effectiveness and fast convergence of the proposed distributed energy management strategy. Moreover, a novel exact algorithm is developed in [36] to solve the optimal reserve management of an EV aggregator, and its finite convergence is proved. Comprehensive case studies demonstrate the economic merits of the proposed model to both the aggregator and the EV owners, and the solution optimality relative to the state-of-the-art approach validates the effectiveness of the proposed exact algorithm. Reference [37] proposes an online energy management strategy with the ability to mimic the optimal solution but without using prior road information. Simulation results indicate that the proposed strategy exhibits similar behavior to an optimal solution obtained from dynamic programming. Gains in fuel economy primarily arise from engine stop/start and energy recovered during regenerative braking; this recovered energy is preferably used for pure electric propulsion with the internal combustion engine switched off.

(d) Network Configuration

For instance, [38] presents a mixed-integer conic programming formulation for the minimum-loss distribution network reconfiguration problem. This formulation has two features. Firstly, it employs a convex representation of the network model based on the conic quadratic format of the power flow equations. Secondly, it optimizes the exact value of the network losses. The use of a convex model in terms of the continuous variables is particularly important because it ensures that an optimal solution obtained by a branch-and-cut algorithm for mixed-integer conic programming is global. On this basis, good-quality solutions with a relaxed optimality gap can be obtained very efficiently. In [39], the distribution network reconfiguration problem considering uncertainties of loads and distributed generation is formulated as a two-stage robust optimization model solved by a nested column-and-constraint generation algorithm. Illustrative cases show that distributed generation curtailments can be significantly reduced by a small number of critical switches that operate only several times in intraday network reconfiguration. Furthermore, a dynamic and multi-objective stochastic mixed-integer linear programming model is developed in [40], which jointly takes the optimal deployment of RES-based DGs and ESSs into account in coordination with distribution network reinforcement and/or reconfiguration. Numerical simulation on the IEEE 119-bus power system clearly shows the capability of ESS deployment to dramatically increase the level of renewable DGs integrated in the system. In [41], the total supply capability of the distribution system considering network reconfiguration and daily load curves is investigated, and the evaluation model is formulated as a mixed-integer second-order cone programming problem. Numerical results show that the presented model is more accurate than previously published models. In addition, [42] designs an optimal contingency assessment model using two-stage stochastic linear programming including wind power generation and a generic ESS. The optimization model is applied to find the best radial topology by determining the best switching sequence to solve contingencies. The proposed model is applied to a 69-node distribution system, and the results of all possible contingencies in the network are examined considering three different case studies with several scenarios. In [43], a new convex formulation of the network reconfiguration strategy is incorporated, which guarantees that the components of the same VPP remain connected and further improves the performance of VPPs. Numerical simulation on the 13-bus and 70-bus distribution networks justifies the effectiveness of the proposed approach. Moreover, a novel linear programming model is presented in [44], which precisely assesses reliability and considers post-fault network reconfiguration strategies involving operational constraints. This model can also formulate the influences of demand variations, uncertainty of distributed generation and protection failures on the reliability indices. Experimental results indicate that the proposed model yields the same results as the simulation-based algorithm. Specifically, the system average interruption duration indices are reduced when considering post-fault network reconfiguration strategies in all tested systems. More importantly, the proposed model is suitable for inclusion in reliability-constrained operational and planning optimization models for power distribution systems.
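To close this subsection with the simplest possible instance of the mathematical-programming viewpoint, the sketch below casts a toy three-unit economic dispatch with a single demand-balance constraint as a linear program and solves it with SciPy; the cost and limit data are invented purely for illustration.

```python
# Toy three-unit economic dispatch posed as a linear program (SciPy);
# linear costs, box limits and one demand-balance constraint, data illustrative.
import numpy as np
from scipy.optimize import linprog

cost = np.array([20.0, 35.0, 50.0])     # marginal cost of units 1-3 ($/MWh)
p_min = np.array([10.0, 10.0, 5.0])     # minimum outputs (MW)
p_max = np.array([100.0, 80.0, 50.0])   # maximum outputs (MW)
demand = 150.0                          # system load (MW)

result = linprog(
    c=cost,                                 # minimize total generation cost
    A_eq=np.ones((1, 3)), b_eq=[demand],    # total output must equal demand
    bounds=list(zip(p_min, p_max)),
    method="highs",
)
print("dispatch (MW):", np.round(result.x, 1))
print("hourly cost ($):", round(result.fun, 1))
```

Real dispatch models add network, ramping and unit-commitment constraints, which is where the MILP and conic formulations surveyed above come in.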

3.3.2 Evolutionary Algorithms

Although mathematical methods have been applied successfully in smart grid dispatch, they mainly concentrate on the convex optimization problem obtained after relaxation or conversion. However, for common smart grid dispatch issues, traditional mathematical approaches cannot efficiently solve the underlying non-convex multi-objective optimization problem [45]. Fortunately, evolutionary algorithms are able to deal with this problem with strong robustness and adaptability [46–48]. For instance, various evolutionary algorithms have been introduced to solve multi-objective optimization problems, including the commonly used PSO [49], NSGA-II [50] and differential evolution algorithm [51]. In the rest of this subsection, detailed applications of evolutionary algorithms to smart grid dispatch are discussed as follows.

(a) Economic Dispatch

For instance, [52] develops a novel dual-population adaptive differential evolutionary algorithm to solve the large-scale multi-fuel economic dispatch with valve-point effects. Simulations on an extremely large-scale non-convex economic dispatch problem with more than 1000 units demonstrate the effectiveness of the proposed evolutionary algorithm in terms of accuracy and robustness. A self-adaptive differential evolution and a real-coded genetic algorithm are proposed in [53] to handle the dynamic dispatch problem. The effectiveness of the designed approaches is verified on a number of dynamic economic dispatch problems over a cycle of 24 h, which reveals their superiority regarding solution quality and reliability. In [54], a powerful, robust and hybrid evolutionary-based algorithm, namely a fuzzy-based hybrid particle swarm optimization-differential evolution algorithm, is presented for solving the multi-objective economic dispatch problem. Numerical simulation based on 10-unit, 40-unit and 160-unit test systems proves the superior performance of the developed method compared with the outcomes of other algorithms. In addition, an enhanced multi-objective differential evolutionary algorithm is proposed in [55], in which an elitist archive technique is adopted to retain the non-dominated solutions obtained during the evolutionary process. The feasibility and effectiveness of the proposed method are demonstrated on a test power system; compared with other methods, the designed algorithm obtains higher-quality solutions by synthetically reducing the fuel cost and the emission effects. Moreover, [56] presents a novel hybrid algorithm connecting the interior point method and differential evolution for solving the economic load dispatch problem with valve-point effects. This algorithm involves two stages: the first stage employs the interior point method to minimize the cost function without considering the valve-point effect, and the second stage considers the valve-point effect and minimizes the cost function using the differential evolution algorithm. Extensive tests verify that the proposed method outperforms other existing techniques for economic load dispatch considering the valve-point effect (a small illustrative sketch of a valve-point dispatch solved by differential evolution is given at the end of this subsection). In [57], an improved genetic algorithm with multiplier updating is proposed to solve power economic dispatch problems of units with valve-point effects and multiple fuels. In particular, the proposed algorithm is demonstrated to be highly promising for large-scale practical economic dispatch. Besides, an improved quantum-inspired evolutionary algorithm based on population diversity information is developed in [58]. The results for the benchmark problem show that the proposed approach provides promising results when compared to various methods available in the literature.

(b) Optimal Power Flow

For example, [59] designs an efficient modified differential evolution algorithm to solve optimal power flow with non-smooth and non-convex generator fuel cost curves.
Simulation results demonstrate that the modified differential evolution algorithm provides remarkable results compared to those reported recently in the literature. In [60], a new efficient evolutionary algorithm is proposed to solve optimal power flow, which utilizes the concept of an incremental power flow model based on sensitivities. The potential of this approach is verified on the IEEE 30-bus, 118-bus and 300-bus systems, and the simulation results indicate that the developed approach is generic and can be applied to any evolutionary algorithm-based OPF. Besides, a robust and efficient method for solving transient stability constrained OPF is presented in [61], based on a new branch of evolutionary algorithms with a strong ability to search for globally optimal solutions of highly nonlinear and non-convex problems. Specifically, several strategies are proposed for the initialization, assessment and selection of solution individuals in the evolution process of differential evolution, in order to reduce the computational burden. Numerical tests on the three-generator nine-bus power system and the New England ten-generator 39-bus system have demonstrated the robustness and effectiveness of the proposed approach. Furthermore, in [62], an improved strength Pareto evolutionary algorithm is proposed to solve the multi-objective optimal power flow problem, which mainly involves three improvements. Firstly, the external archive population is composed only of a variable number of non-dominated individuals in the environmental selection operator. Secondly, the Euclidean distance between the elite individual and its k-th neighboring individual is adopted to update the external archive population. Thirdly, a local search strategy is embedded into the strength Pareto evolutionary algorithm. The performance of the designed method has been tested on the IEEE 30-bus and IEEE 57-bus systems. Reference [63] presents a multi-objective differential evolutionary algorithm based on forced initialization to solve the OPF problem, which takes fuel cost minimization, power loss minimization, voltage profile improvement and voltage stability enhancement into account. For solving the multi-objective OPF, the proposed approach combines a new variant of DE (DE/best/1) with the ε-constraint approach. This combination guarantees high convergence speed and good diversity of Pareto solutions without the computational burden of Pareto ranking and updating or additional efforts to preserve the diversity of the non-dominated solutions. In addition, an enhanced adaptive differential evolution is designed for OPF in [64], where four improvements are introduced to obtain promising results. Firstly, a crossover rate sorting mechanism is introduced to allow individuals to inherit more good genes. Secondly, parameters are rerandomized to maintain the search efficiency and diversity. Thirdly, a dynamic population reduction strategy is adopted to accelerate convergence. Finally, a self-adaptive penalty constraint handling technique is integrated to deal with the constraints. To verify the effectiveness of the proposed method, it is applied to the OPF problem on a modified IEEE 30-bus test system, which combines stochastic wind energy and solar energy with conventional thermal power generators. The simulation results demonstrate that the proposed approach can be an effective alternative for the OPF problem.

(c) Energy Management

For instance, a new algorithmic environment for the multi-objective optimization of an energy management system in plug-in hybrid electric vehicles (PHEVs) is presented in [65]. A surrogate-assisted strength Pareto evolutionary algorithm is developed to optimize the power-split control parameters, guided by data from the physical PHEV and its digital twins. Driven by the developed method, the optimized energy management system surpasses other systems by saving more than 4.8% energy. In [66], a generic framework of an online energy management system (EMS) is proposed for PHEVs, which includes several control strategies for managing the battery state of charge (SOC). Extensive simulation validation and evaluation using real-world traffic data indicate that the different SOC control strategies of the proposed online EMS all outperform the conventional control strategy. Reference [67] focuses on simultaneously optimizing the capacity of a hybrid energy storage system and the parameters of the EMS. In detail, in order to minimize the cost of super-capacitors and the capacity degradation of batteries at the same time, the multi-objective optimization problems are solved by the multi-objective evolutionary algorithm based on decomposition (MOEA/D). Experimental results illustrate that the proposed method can improve system efficiency and enhance the life span of the battery. In addition, [68] develops two control strategies of power and energy management for synchronous microgrid operation based on an evolutionary algorithm. The first strategy reduces power and energy losses, thus improving the overall efficiency of microgrid systems, while the second one minimizes the operation cost. In [69], a combined microgrid sizing and energy management methodology is proposed, formulated as a leader-follower problem. The leader problem focuses on sizing and aims at selecting the optimal size for the microgrid components, which is solved by a genetic algorithm. The follower problem, i.e., the energy management issue, is formulated as a unit commitment problem and is solved with a mixed-integer linear program. Simulation experiments validate the effectiveness of the proposed method, especially with respect to a simple rule-based strategy. Besides, an evolutionary algorithm is presented to optimize the integrated usage of multiple residential energy resources, including local generation, shiftable loads, thermostatically controlled loads and a storage system. Results have shown that significant savings can be achieved mainly through demand response actions implemented over thermostatically controlled loads.

(d) Network Configuration

For example, [70] proposes a tree encoding and two genetic operators to improve evolutionary algorithm performance for network reconfiguration problems. Simulation on a large-scale system indicates that the developed methodology can provide an efficient alternative for reconfiguration problems. In [71], an evolutionary algorithm is designed to efficiently solve the distribution network reconfiguration problem for loss reduction. Specifically, this algorithm provides a novel way of implementing the recombination operator that guarantees, at all times, the production of new radial topologies. More importantly, this approach is presented and verified on a real distribution system, showing excellent results and computational efficiency. Besides, a customized evolutionary algorithm has been introduced and applied to power distribution network reconfiguration in [72]. The recombination operators of the algorithm are designed to preserve the feasibility of solutions (the radial structure of the network), thus considerably reducing the size of the search space. Consequently, improved repeatability of results and lower overall computational complexity of the optimization process have been achieved. Comprehensive benchmarks demonstrate this approach to be superior to state-of-the-art methods from the literature. In addition, [73] develops an effective method, variable scaling hybrid differential evolution, for solving the network reconfiguration for power loss reduction and voltage profile enhancement of distribution systems. Numerical results show that the performance of the proposed method is better than other methods such as the genetic algorithm and simulated annealing. A Pareto-based multi-objective distribution network reconfiguration method using a discrete PSO algorithm is investigated in [74], in which probabilistic heuristics and graph theory techniques are employed to self-adaptively improve the stochastic random search of the algorithm during the optimization process. Numerical experiments demonstrate the effectiveness of the proposed method in solving multi-objective network reconfiguration problems by obtaining a Pareto front with great diversity, high quality and a proper distribution of non-dominated solutions in the objective space. Besides, in [75], a hybrid evolutionary algorithm based on the combination of a new fuzzy adaptive PSO and the Nelder–Mead simplex search method is proposed to solve the distribution network reconfiguration problem. The proposed algorithm is validated on two distribution networks, and simulation results show that this algorithm is very powerful and obtains the global optimum. Reference [76] introduces a novel hybrid algorithm based on the shuffled frog leaping algorithm and PSO to solve the network reconfiguration problem. The proposed algorithm has been applied to a complex multimodal benchmark function as well as two different distribution networks, including 33-bus and 95-bus test systems. In addition, the network reconfiguration problem has been solved by an enhanced gravitational search algorithm in [77], in order to improve the transient stability index and decrease losses as well as operation cost in a distribution test system with multiple microturbines. The effectiveness of the presented method is studied based on a typical 33-bus test system. Reference [78] designs a novel multi-objective invasive weed optimization algorithm to solve the optimal network reconfiguration, which simultaneously considers power loss, voltage deviation, switching operations and the load balancing index. The performance of the proposed algorithm is compared with results available in the recent literature, and it is observed that the proposed method produces a high-quality Pareto solution set and finds a globally optimal configuration. In [79], a genetic algorithm with two-network encoding is developed, which is capable of representing only radially connected solutions without demanding a planar topology or any specific genetic operator.
Besides, a hybrid market-based distribution network reconfiguration methodology is presented in [80], in order to concurrently obtain both the optimal configuration and the locational marginal prices at buses with connected distributed generation. The effectiveness of this proposed methodology is evaluated on a practical distribution system.


3.3.3 AI-Enabled Methods

Although traditional methods based on mathematical optimization and evolutionary algorithms have achieved success, they depend heavily on accurate models and parameters, which are difficult to acquire in practice. To address these limitations, AI-enabled methods such as deep learning (DL), reinforcement learning (RL) and their combination, deep reinforcement learning (DRL), have been proposed to solve the complicated dispatch problem in the smart grid. At present, AI-enabled methods are gradually becoming effective tools for operators to make decisions during the operation of the smart grid. In the rest of this subsection, detailed applications of AI-enabled methods to smart grid dispatch are discussed as follows.

(a) Economic Dispatch

For instance, [81] proposes a novel DRL approach based on a deep LSTM for microgrid economic dispatch by dispatching the generation of DERs and ESS, which shows better results compared to Q-learning. The asynchronous advantage actor-critic (A3C) algorithm is introduced to dispatch energy while considering risk in [82]. Experimental results show that the proposed risk-aware model achieves more accurate energy scheduling than traditional methods. Besides, a finite-horizon DDPG (FH-DDPG)-based DRL algorithm is proposed in [83] for the energy dispatch problem with diesel generators, PV panels and a battery. A case study using real isolated MG data shows that the designed approach can make efficient decisions even with partially available state information. Moreover, in [84], a multi-agent TD3 (MATD3) is developed for ESS energy management. Simulation results demonstrate its efficiency and scalability when handling high-dimensional problems with a continuous action space. [85] introduces the DDPG algorithm to derive ESS energy dispatch policies without fully observable state information. Case studies show that the proposed algorithm derives better energy dispatch policies for the ESS. In addition, curriculum learning is integrated into A2C to improve sample efficiency and accelerate the training process in [86], which significantly speeds up the convergence during the training of DRL and increases the overall profits. In addition, [87] proposes a Monte Carlo DRL (MCDRL) approach for demand side management, which is verified to have a strong exploration ability and to protect the privacy of consumers. In [88], a prioritized experience replay DDPG (PER-DDPG) is applied in a microgrid dispatch model considering demand response. Simulation studies indicate its advantage in significantly reducing operational cost compared with traditional dispatch methods. Besides, the A2C algorithm is developed to address the demand response problem in [89], which not only shows superiority and flexibility, but also has the ability to preserve privacy.

(b) Optimal Power Flow

Moreover, DRL can provide more flexible control decisions to promote the operation of the DN while considering the optimal power flow, such as voltage regulation. For


instance, Refs. [90, 91] propose a multi-agent DDPG-based approach for DN voltage regulation with a high penetration of PVs, which shows better utilization of PV resources and better control performance. In Refs. [92, 93], a novel DRL algorithm named constrained soft actor-critic (SAC) is proposed to solve Volt-Var control problems in a model-free manner. Comprehensive numerical studies demonstrate the efficiency and scalability of the proposed DRL algorithm, compared with state-of-the-art DRL and conventional optimization algorithms. Refs. [94, 95] propose a two-stage real-time Volt-Var control method, in which model-based centralized optimization and the DQN algorithm are combined to mitigate voltage violations in the DN.

(c) Energy Management

On the one hand, traditional methods tend to estimate before optimization and decision-making when addressing the randomness of EV charging behaviors. On the other hand, multi-stage optimization is introduced to handle the problem caused by high-dimensional variables. Nevertheless, the optimization results of these methods are heavily dependent on the predictive accuracy. Therefore, DRL is applied to deal with the optimal dispatch problem of EVs, since it is a data-driven method and is insensitive to the accuracy of prediction. For instance, [96] proposes a novel approach based on DQN to dispatch EV charging and recommend appropriate traveling routes for EVs. Simulation studies demonstrate its effectiveness in significantly reducing the charging time and origin-destination distance. In [97], a DRL approach with embedding and attention mechanisms is developed to handle the EV routing problem with time windows. Numerical studies show that it is able to efficiently solve problems of large size, which are not solvable with other existing methods. Besides, a charging control DDPG algorithm is introduced in [98] to learn the optimal strategy for satisfying the requirements of users while minimizing the charging expense. The SAC algorithm is applied to deal with the congestion control problem in [99], and it proves to outperform other decentralized feedback control algorithms in terms of fairness and utilization. Taking security into account, [100] proposes a constrained policy optimization (CPO) approach based on safe DRL to minimize the charging cost, which does not require any domain knowledge about the randomness. Numerical experiments demonstrate that this method can adequately satisfy the charging constraints and reduce the charging cost. A novel multi-agent DDPG algorithm for traffic light control is proposed to reduce traffic congestion in [101]. Experimental results show that this method can significantly reduce congestion in various scenarios. [102] develops a multi-agent DQN (MA-DQN) method to model the pricing game in the transportation network and determine the optimal charging price for the EVCS. Case studies are conducted to verify the effectiveness and scalability of the proposed approach. In [103], a DQN-based EV charging navigation framework is proposed to minimize the total travel time and the charging cost at the EVCS. Experimental results demonstrate the effectiveness and necessity of the coordination of the smart grid and the intelligent transportation system. In addition, the continuous SAC algorithm is applied to tackle the


EV charging dispatch problem considering dynamic user behaviors and electricity prices in [104]. Simulation studies show that the proposed SAC-based approach can learn the dynamics of electricity prices and driver behavior in different locations.

(d) Network Configuration

Furthermore, DRL algorithms are also applied to determine the optimal network configuration of the DN. For example, [105] develops a many-objective distribution network reconfiguration model to assess the trade-off relationships for better operation of the DN, in which a DQN-assisted evolutionary algorithm (DQN-EA) is proposed to improve the searching efficiency. Similarly, an online distribution network reconfiguration (DNR) scheme based on deep Q-learning is introduced in [106] to determine the optimal network topology. Simulation results indicate that the computation time of the proposed algorithm is low enough for practical applications. Besides, [107] develops a data-driven batch-constrained SAC algorithm for dynamic DNR, which can learn the network reconfiguration control policy from historical datasets without interacting with the DN. In [108], federated learning and the AC algorithm are combined to solve the demand response problem in the DN, which simultaneously considers privacy protection, uncertainties as well as the power flow constraints of the DN. In addition, a DRL framework based on the A2C algorithm is proposed in [109], which aims at enhancing the long-term resilience of the DN using hardening strategies. Simulation results show its effectiveness and scalability in promoting the resilience of the DN compared with traditional mathematical methods.

References 1. Y. Li, Z. Ni, T. Zhao, T. Zhong, Y. Liu, W. Lei, Y. Zhao, Supply function game based energy management between electric vehicle charging stations and electricity distribution system considering quality of service. IEEE Trans. Indus. Appl. 56(5), 5932–5943 (2020) 2. Y. Li, J. Huang, Y. Liu, Z. Ni, Y. Shen, W. Hu, L. Wu, Economic dispatch with high penetration of wind power using extreme learning machine assisted group search optimizer with multiple producers considering upside potential and downside risk. J. Modern Power Syst. Clean Energy, pp 1–13 (2021) 3. A.M. Fathabad, J. Cheng, K. Pan, F. Qiu, Data-driven planning for renewable distributed generation integration. IEEE Trans. Power Syst. 35(6), 4357–4368 (2020) 4. K. Utkarsh, D. Srinivasan, A. Trivedi, W. Zhang, T. Reindl, Distributed model-predictive real-time optimal operation of a network of smart microgrids. IEEE Trans. Smart Grid 10(3), 2833–2845 (2019) 5. Y. Liu, L. Guo, C. Wang, A robust operation-based scheduling optimization for smart distribution networks with multi-microgrids. Appl. Energy 228, 130–140 (2018) 6. C. Guo, F. Luo, Z. Cai, Z.Y. Dong, Integrated energy systems of data centers and smart grids: state-of-the-art and future opportunities. Appl. Energy 301, 117474 (2021) 7. Z.J. Lee, G. Lee, T. Lee, C. Jin, R. Lee, Z. Low, D. Chang, C. Ortega, S.H. Low, Adaptive charging networks: a framework for smart electric vehicle charging. IEEE Trans. Smart Grid 12(5), 4339–4350 (2021)


8. C. Li, Z. Dong, G. Chen, B. Zhou, J. Zhang, Y. Xinghuo, Data-driven planning of electric vehicle charging infrastructure: a case study of Sydney, Australia. IEEE Trans. Smart Grid 12(4), 3289–3304 (2021) 9. B. Zhou, K. Zhang, K.W. Chan, C. Li, X. Lu, S. Bu, X. Gao, Optimal coordination of electric vehicles for virtual power plants with dynamic communication spectrum allocation. IEEE Trans. Indus. Inf. 17(1), 450–462 (2021) 10. D. Cao, H. Weihao, J. Zhao, G. Zhang, B. Zhang, Z. Liu, Z. Chen, F. Blaabjerg, Reinforcement learning and its applications in modern power and energy systems: a review. J. Modern Power Syst. Clean Energy 8(6), 1029–1042 (2020) 11. Y. Li, T. Zhao, P. Wang, H.B. Gooi, L. Wu, Y. Liu, J. Ye, Optimal operation of multimicrogrids via cooperative energy and reserve scheduling. IEEE Trans. Indus. Inf. 14(8), 3459–3468 (2018) 12. M. Mahmoodi, P. Shamsi, B. Fahimi, Economic dispatch of a hybrid microgrid with distributed energy storage. IEEE Trans. Smart Grid 6(6), 2607–2614 (2015) 13. Y. Shi, S. Dong, C. Guo, Z. Chen, L. Wang, Enhancing the flexibility of storage integrated power system by multi-stage robust dispatch. IEEE Trans. Power Syst. 36(3), 2314–2322 (2021) 14. Y. Li, J. Huang, Y. Liu, T. Zhao, Y. Zhou, Y. Zhao, C. Yuen, Day-ahead risk averse market clearing considering demand response with data-driven load uncertainty representation: A Singapore electricity market study. Energy 254, 123923 (2022) 15. Y. Li, Z. Ni, T. Zhao, Y. Minghui, Y. Liu, W. Lei, Y. Zhao, Coordinated scheduling for improving uncertain wind power adsorption in electric vehicles-wind integrated power systems by multiobjective optimization approach. IEEE Trans. Indus. Appl. 56(3), 2238–2250 (2020) 16. E.A. Martínez Ceseña, E. Loukarakis, N. Good, P. Mancarella, Integrated electricity-heat-gas systems: Techno-economic modeling, optimization, and application to multienergy districts. Proc. IEEE, 108(9), 1392–1410 (2020) 17. Z. Shen, W. Wei, L. Wu, M. Shafie-khah, J.P.S. Catalão, Economic dispatch of power systems with lmp-dependent demands: a non-iterative milp model. Energy 233, 121015 (2021) 18. M. Sahraei-Ardakani, K.W. Hedman, A fast lp approach for enhanced utilization of variable impedance based facts devices. IEEE Trans. Power Syst. 31(3), 2204–2213 (2016) 19. Y. Liu, W. Lei, J. Li, A fast lp-based approach for robust dynamic economic dispatch problem: a feasible region projection method. IEEE Trans. Power Syst. 35(5), 4116–4119 (2020) 20. H. Hou, Q. Wang, Z. Xiao, M. Xue, W. Yefan, X. Deng, C. Xie, Data-driven economic dispatch for islanded micro-grid considering uncertainty and demand response. Int. J. Electric. Power Energy Syst. 136, 107623 (2022) 21. W. Wei, F. Liu, S. Mei, Y. Hou, Robust energy and reserve dispatch under variable renewable generation. IEEE Trans. Smart Grid 6(1), 369–380 (2015) 22. G. Zhang, J. McCalley, Q. Wang, An agc dynamics-constrained economic dispatch model. IEEE Trans. Power Systems 34(5), 3931–3940 (2019) 23. Y. Yang, W. Wenchuan, B. Wang, M. Li, Chance-constrained economic dispatch considering curtailment strategy of renewable energy. IEEE Trans. Power Syst. 36(6), 5792–5802 (2021) 24. B. Kocuk, S.S. Dey, X.A. Sun, Strong socp relaxations for the optimal power flow problem. Oper. Res., 64(6):1177–1196 (2016) 25. S. Mhanna, P. Mancarella, An exact sequential linear programming algorithm for the optimal power flow problem. IEEE Trans. Power Syst. 37(1), 666–679 (2022) 26. K. Šepetanc, H. Pandži´ci, T. 
Capuder, Solving bilevel ac opf problems by smoothing the complementary conditions—part i: model description and the algorithm. IEEE Trans. Power Syst., pp. 1–10 (2022) 27. K. Christakou, D.-C. Tomozei, J.-Y. Le Boudec, M. Paolone, Ac opf in radial distribution networks-part ii: an augmented lagrangian-based opf algorithm, distributable via primal decomposition. Electric Power Syst. Res. 150, 24–35 (2017)


28. J.F. Marley, D.K. Molzahn, I.A. Hiskens, Solving multiperiod opf problems using an ac-qp algorithm initialized with an socp relaxation. IEEE Trans. Power Syst. 32(5), 3538–3548 (2017) 29. L. Wentian, M. Liu, S. Lin, L. Li, Fully decentralized optimal power flow of multi-area interconnected power systems based on distributed interior point method. IEEE Trans. Power Syst. 33(1), 901–910 (2018) 30. C. Musardo, G. Rizzoni, Y. Guezennec, B. Staccia, A-ecms: an adaptive algorithm for hybrid electric vehicle energy management. Euro. J. Control 11(4–5), 509–524 (2005) 31. S. Bahrami, M. Toulabi, S. Ranjbar, M. Moeini-Aghtaie, A.M. Ranjbar, A decentralized energy management framework for energy hubs in dynamic pricing markets. IEEE Trans. Smart Grid 9(6), 6780–6792 (2018) 32. W.-J. Ma, J. Wang, V. Gupta, C. Chen, Distributed energy management for networked microgrids using online admm with regret. IEEE Trans. Smart Grid 9(2), 847–856 (2018) 33. C. Sun, S.J. Moura, X. Hu, J.K. Hedrick, F. Sun, Dynamic traffic feedback data enabled energy management in plug-in hybrid electric vehicles. IEEE Trans. Control Syst. Technol. 23(3), 1075–1086 (2015) 34. H. Gao, J. Liu, L. Wang, Z. Wei, Decentralized energy management for networked microgrids in future distribution systems. IEEE Trans. Power Syst. 33(4), 3599–3610 (2018) 35. W. Shi, X. Xie, C.-C. Chu, R. Gadh, Distributed optimal energy management in microgrids. IEEE Trans. Smart Grid 6(3), 1137–1146 (2015) 36. W. Liu, S. Chen, Y. Hou, Z. Yang, Optimal reserve management of electric vehicle aggregator: Discrete bilevel optimization model and exact algorithm. IEEE Trans. Smart Grid 12(5), 4003–4015 (2021) 37. John T.B.A. Kessels, M.W.T. Koot, P.P.J. van den Bosch, D.B. Kok, Online energy management for hybrid electric vehicles. IEEE Trans. Vehicular Technol., 57(6), 3428–3440 (2008) 38. R.A. Jabr, R. Singh, B.C. Pal, Minimum loss network reconfiguration using mixed-integer convex programming. IEEE Trans. Power Syst. 27(2), 1106–1115 (2012) 39. S. Lei, Y. Hou, F. Qiu, J. Yan, Identification of critical switches for integrating renewable distributed generation by dynamic network reconfiguration. IEEE Trans. Sustain. Energy 9(1), 420–432 (2018) 40. S.F. Santos, D.Z. Fitiwi, M.R.M. Cruz, C.M.P. Cabrita, J.P.S. Catalão, Impacts of optimal energy storage deployment and network reconfiguration on renewable integration level in distribution systems. Appl. Energy 185, 44–55 (2017) 41. K. Chen, W. Wenchuan, B. Zhang, S. Djokic, G.P. Harrison, A method to evaluate total supply capability of distribution systems considering network reconfiguration and daily load curves. IEEE Trans. Power Syst. 31(3), 2096–2104 (2016) 42. P. Meneses de Quevedo, J. Contreras, M.J. Rider, J. Allahdadian, Contingency assessment and network reconfiguration in distribution grids including wind power and energy storage. IEEE Trans. Sustain. Energy 6(4), 1524–1533 (2015) 43. L. Wang, W. Wenchuan, L. Qiuyu, Y. Yang, Optimal aggregation approach for virtual power plant considering network reconfiguration. J. Modern Power Syst. Clean Energy 9(3), 495– 501 (2021) 44. Z. Li, W. Wenchuan, B. Zhang, X. Tai, Analytical reliability assessment method for complex distribution networks considering post-fault network reconfiguration. IEEE Trans. Power Syst. 35(2), 1457–1467 (2020) 45. Y. Li, Y. Cai, T. Zhao, Y. Liu, J. Wang, W. Lei, Y. Zhao, Multi-objective optimal operation of centralized battery swap charging system with photovoltaic. J. Modern Power Syst. Clean Energy 10(1), 149–162 (2022) 46. N.O. 
Aljehane, R.F. Mansour, Optimal allocation of renewable energy source and charging station for PHEVs. Sustain. Energy Technol. Assess. 49, 101669 (2022)


47. F. Abukhodair, W. Alsaggaf, A.T. Jamal, S.A. Khalek, R.F. Mansour, An intelligent metaheuristic binary pigeon optimization-based feature selection and big data classification in a mapreduce environment. Mathematics 9(20), 2627 (2021) 48. A. Althobaiti, A.A. Alotaibi, S. Abdel-Khalek, E.M. Abdelrahim, R.F. Mansour, D. Gupta, S. Kumar, Intelligent data science enabled reactive power optimization of a distribution system. Sustain. Comput.: Inf. Syst., p. 100765 (2022) 49. D. Wang, D. Tan, L. Liu, Particle swarm optimization algorithm: an overview. Soft Comput. 22(2), 387–408 (2018) 50. K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE Trans. Evolution. Comput. 6(2), 182–197 (2002) 51. J. Liu, J. Lampinen, A fuzzy adaptive differential evolution algorithm. Soft Comput. 9(6), 448–462 (2005) 52. X. Chen, Novel dual-population adaptive differential evolution algorithm for large-scale multifuel economic dispatch with valve-point effects. Energy 203, 117874 (2020) 53. M.F. Zaman, S.M. Elsayed, T. Ray, R.A. Sarker, Evolutionary algorithms for dynamic economic dispatch problems. IEEE Trans. Power Sys. 31(2), 1486–1495 (2016) 54. E. Naderi, A. Azizivahed, H. Narimani, M. Fathi, M.R. Narimani, A comprehensive study of practical economic dispatch problems by a new hybrid evolutionary algorithm. Appl. Soft Comput., 61, 1186–1206 (2017) 55. L. Youlin, J. Zhou, H. Qin, Y. Wang, Y. Zhang, Environmental/economic dispatch problem of power system by using an enhanced multi-objective differential evolution algorithm. Energy Convers. Manage. 52(2), 1175–1183 (2011) 56. N. Duvvuru, K.S. Swarup, A hybrid interior point assisted differential evolution algorithm for economic dispatch. IEEE Trans. Power Syst. 26(2), 541–549 (2011) 57. C.-L. Chiang, Improved genetic algorithm for power economic dispatch of units with valvepoint effects and multiple fuels. IEEE Trans. Power Syst. 20(4), 1690–1699 (2005) 58. J.X.V. Neto, D.L. de Andrade Bernert, L. dos Santos Coelho, Improved quantum-inspired evolutionary algorithm with diversity information applied to economic dispatch problem with prohibited operating zones. Energy Convers. Manage., 52(1), 8–14 (2011) 59. S. Sayah, K. Zehar, Modified differential evolution algorithm for optimal power flow with non-smooth cost functions. Energy Convers. Manage. 49(11), 3036–3042 (2008) 60. S.S. Reddy, P.R. Bijwe, A.R. Abhyankar, Faster evolutionary algorithm based optimal power flow using incremental variables. Int. J. Electric. Power Energy Syst. 54, 198–210 (2014) 61. H.R. Cai, C.Y. Chung, K.P. Wong, Application of differential evolution algorithm for transient stability constrained optimal power flow. IEEE Trans. Power Syst. 23(2), 719–728 (2008) 62. X. Yuan, B. Zhang, P. Wang, J. Liang, Y. Yuan, Y. Huang, X. Lei, Multi-objective optimal power flow based on improved strength pareto evolutionary algorithm. Energy 122, 70–82 (2017) 63. A.M. Shaheen, R.A. El-Sehiemy, S.M. Farrag, Solving multi-objective optimal power flow problem via forced initialised differential evolution algorithm. IET Gener. Transmission. Distribut. 10(7), 1634–1647 (2016) 64. S. Li, W. Gong, L. Wang, X. Yan, H. Chengyu, Optimal power flow by means of improved adaptive differential evolution. Energy 198, 117314 (2020) 65. J. Li, Q. Zhou, H. Williams, X. Hongming, D. Changqing, Cyber-physical data fusion in surrogate- assisted strength pareto evolutionary algorithm for phev energy management optimization. IEEE Trans. Indus. Inf. 
18(6), 4107–4117 (2022) 66. X. Qi, W. Guoyuan, K. Boriboonsomsin, M.J. Barth, Development and evaluation of an evolutionary algorithm-based online energy management system for plug-in hybrid electric vehicles. IEEE Trans. Intell. Transp. Syst. 18(8), 2181–2191 (2017) 67. L. Wang, M. Li, Y. Wang, Z. Chen, Energy management strategy and optimal sizing for hybrid energy storage systems using an evolutionary algorithm. IEEE Trans. Intell. Transp. Syst. 23(9), 14283–14293 (2022)


68. M. Parol, T. Wójtowicz, K. Ksie˛˙zyk, C. Wenge, S. Balischewski, B. Arendarski, Optimum management of power and energy in low voltage microgrids using evolutionary algorithms and energy storage. Int. J. Electric. Power Energy Syst. 119, 105886 (2020) 69. B. Li, R. Roche, A. Miraoui, Microgrid sizing with combined evolutionary algorithm and milp unit commitment. Appl. Energy 188, 547–562 (2017) 70. A.C.B. Delbem, A.C.Pd.L.F. de Carvalho, N.G. Bretas, Main chain representation for evolutionary algorithms applied to distribution system reconfiguration. IEEE Trans. Power Syst., 20(1), 425–436 (2005) 71. E.M. Carreno, R. Romero, A. Padilha-Feltrin, An efficient codification to solve distribution network reconfiguration for loss reduction problem. IEEE Trans. Power Syst. 23(4), 1542– 1551 (2008) 72. A. Landeros, S. Koziel, M.F. Abdel-Fattah, Distribution network reconfiguration using feasibility-preserving evolutionary optimization. J. Modern Power Syst. Clean Energy 7(3), 589–598 (2019) 73. J.-P. Chiou, C.-F. Chang, S. Ching-Tzong, Variable scaling hybrid differential evolution for solving network reconfiguration of distribution systems. IEEE Trans. Power Syst. 20(2), 668– 674 (2005) 74. M.-R. Andervazh, J. Olamaei, M.-R. Haghifam, Adaptive multi-objective distribution network reconfiguration using multi-objective discrete particles swarm optimisation algorithm and graph theory. IET Gener. Transmission Distribut. 7(12), 1367–1382 (2013) 75. T. Niknam, E. Azadfarsani, M. Jabbari, A new hybrid evolutionary algorithm based on new fuzzy adaptive pso and nm algorithms for distribution feeder reconfiguration. Energy Convers. Manage. 54(1), 7–16 (2012) 76. A. Azizivahed, H. Narimani, E. Naderi, M. Fathi, M.R. Narimani, A hybrid evolutionary algorithm for secure multi-objective distribution feeder reconfiguration. Energy 138, 355– 373 (2017) 77. E. Mahboubi-Moghaddam, M.R. Narimani, M.H. Khooban, A. Azizivahed et al., Multiobjective distribution feeder reconfiguration to improve transient stability, and minimize power loss and operation cost using an enhanced evolutionary algorithm at the presence of distributed generations. Int. J. Electric. Power Energy Syst. 76, 35–43 (2016) 78. D.S. Rani, N. Subrahmanyam, M. Sydulu, Multi-objective invasive weed optimization-an application to optimal network reconfiguration in radial distribution systems. Int. J. Electric. Power Energy Syst. 73, 932–942 (2015) 79. H.D. de Macedo Braz, B.A. de Souza, Distribution network reconfiguration using genetic algorithms with sequential encoding: subtractive and additive approaches. IEEE Trans. Power Syst., 26(2), 582–593 (2011) 80. E. Azad-Farsani, I.G. Sardou, S. Abedini, Distribution network reconfiguration based on lmp at dg connected busses using game theory and self-adaptive fwa. Energy 215, 119146 (2021) 81. A. Dridi, H. Afifi, H. Moungla, J. Badosa, A novel deep reinforcement approach for IIoT microgrid energy management systems. IEEE Trans. Green Commun. Network., pp. 1–1 (2021) 82. M.S. Munir, S.F. Abedin, N.H. Tran, Z. Han, E.N. Huh, C.S. Hong, Risk-aware energy scheduling for edge computing with microgrid: a multi-agent deep reinforcement learning approach. IEEE Trans. Netw. Service Manage. 18(3), 3476–3497 (2021) 83. L. Lei, Y. Tan, G. Dahlenburg, W. Xiang, K. Zheng, Dynamic energy dispatch based on deep reinforcement learning in IoT-driven smart isolated microgrids. IEEE Internet Things J. 8(10), 7938–7953 (2021) 84. T. Chen, S. Bu, X. Liu, J. Kang, F.R. Yu, Z. 
Han, Peer-to-peer energy trading and energy conversion in interconnected multi-energy microgrids using multi-agent deep reinforcement learning. IEEE Trans. Smart Grid, pp. 1–1 (2021) 85. F.S. Gorostiza, F.M.G. Longatt, Deep reinforcement learning-based controller for SOC management of multi-electrical energy storage system. IEEE Trans. Smart Grid 11(6), 5039–5050 (2020)


86. H. Hua, Z. Qin, N. Dong, Y. Qin, M. Ye, Z. Wang, X. Chen, J. Cao, Data-driven dynamical control for bottom-up energy internet system. IEEE Trans. Sustain. Energy, p. 1 (2021) 87. D. Yan, F. Li, Intelligent multi-microgrid energy management based on deep neural network and model-free reinforcement learning. IEEE Trans. Smart Grid 11(2), 1066–1076 (2020) 88. Y. Li, R. Wang, Z. Yang, Optimal scheduling of isolated microgrids using automated reinforcement learning-based multi-period forecasting. IEEE Trans. Sustain. Energy, p. 1 (2021) 89. Z. Qin, D. Liu, H. Hua, J. Cao, Privacy preserving load control of residential microgrid via deep reinforcement learning. IEEE Trans. Smart Grid 12(5), 4079–4089 (2021) 90. D. Cao, H. Weihao, J. Zhao, Q. Huang, Z. Chen, F. Blaabjerg, A multi-agent deep reinforcement learning based voltage regulation using coordinated PV inverters. IEEE Trans. Power Syst. 35(5), 4120–4123 (2020) 91. P. Kou, D. Liang, C. Wang, W. Zihao, L. Gao, Safe deep reinforcement learning-based constrained optimal control scheme for active distribution networks. Appl. Energy 264, 114772 (2020) 92. W. Wang, Y. Nanpeng, Y. Gao, J. Shi, Safe off-policy deep reinforcement learning algorithm for volt-VAR control in power distribution systems. IEEE Trans. Smart Grid 11(4), 3008–3018 (2020) 93. H. Liu, W. Wenchuan, Two-stage deep reinforcement learning for inverter-based volt-VAR control in active distribution networks. IEEE Trans. on Smart Grid 12(3), 2037–2047 (2021) 94. X. Sun, J. Qiu, Two-stage volt/var control in active distribution networks with multi-agent deep reinforcement learning method. IEEE Trans. Smart Grid 12(4), 2903–2912 (2021) 95. Q. Yang, G. Wang, A. Sadeghi, G.B. Giannakis, J. Sun, Two-timescale voltage control in distribution grids using deep reinforcement learning. IEEE Trans. Smart Grid 11(3), 2313– 2323 (2020) 96. C. Zhang, Y. Liu, W. Fan, B. Tang, W. Fan, Effective charging planning based on deep reinforcement learning for electric vehicles. IEEE Trans. Intell. Transp. Syst. 22(1), 542–554 (2021) 97. B. Lin, B. Ghaddar, J. Nathwani, Deep reinforcement learning for the electric vehicle routing problem with time windows. IEEE Trans. Intell. Transp. Syst., pp. 1–11 (2021) 98. F. Zhang, Q. Yang, D. An, CDDPG: a deep-reinforcement-learning-based approach for electric vehicle charging control. IEEE Internet Things J. 8(5), 3075–3087 (2021) 99. A.A. Zishan, M.M. Haji, O. Ardakanian, Adaptive congestion control for electric vehicle charging in the smart grid. IEEE Trans. Smart Grid 12(3), 2439–2449 (2021) 100. H. Li, Z. Wan, H. He, Constrained EV charging scheduling based on safe deep reinforcement learning. IEEE Trans. Smart Grid 11(3), 2427–2439 (2020) 101. T. Wu, P. Zhou, K. Liu, Y. Yuan, X. Wang, H. Huang, D.O. Wu, Multi-agent deep reinforcement learning for urban traffic light control in vehicular networks. IEEE Trans. Vehicular Technol. 69(8), 8243–8256 (2020) 102. M. Shahidehpour, T. Qian, C. Shao, X. Li, X. Wang, Z. Chen, Multi-agent deep reinforcement learning method for EV charging station game. IEEE Trans. Power Syst., p. 1 (2021) 103. T. Qian, C. Shao, X. Wang, M. Shahidehpour, Deep reinforcement learning for EV charging navigation by coordinating smart grid and intelligent transportation system. IEEE Trans. Smart Grid 11(2), 1714–1723 (2020) 104. L. Yan, X. Chen, J. Zhou, Y. Chen, J. Wen, Deep reinforcement learning for continuous electric vehicles charging control with dynamic user behaviors. IEEE Trans. Smart Grid 12(6), 5124– 5134 (2021) 105. Y. Li, G. 
Hao, Y. Liu, Y. Yu, Z. Ni, Y. Zhao, Many-objective distribution network reconfiguration via deep reinforcement learning assisted optimization algorithm. IEEE Trans. Power Delivery, p. 1 (2021) 106. S.H. Oh, Y.T. Yoon, S.W. Kim, Online reconfiguration scheme of self-sufficient distribution network based on a reinforcement learning approach. Appl. Energy 280, 115900 (2020) 107. Y. Gao, W. Wang, J. Shi, Y. Nanpeng, Batch-constrained reinforcement learning for dynamic distribution network reconfiguration. IEEE Trans. Smart Grid 11(6), 5357–5369 (2020)


108. S. Bahrami, Y.C. Chen, V.W.S. Wong, Deep reinforcement learning for demand response in distribution networks. IEEE Trans. Smart Grid 12(2), 1496–1506 (2021) 109. N.L. Dehghani, A.B. Jeddi, A. Shafieezadeh, Intelligent hurricane resilience enhancement of power distribution systems via deep reinforcement learning. Appl. Energy 285, 116355 (2021)

Chapter 4

Deep Learning-Based Densely Connected Network for Load Forecast

4.1 Introduction

Load forecasting plays an important role in various power system decision-making problems, such as unit commitment and economic dispatch. Inaccurate forecasting results directly influence the economic and technical performance of a power system. Therefore, it is necessary to develop precise and reliable forecasting methods, which can be used to enhance the operation of the power grid [1–4].

Numerous studies have been published on load forecasting. The majority of previous work is about deterministic forecasting, which mainly consists of two categories, i.e., statistical and machine learning-based methods. For the former, forecasting results are obtained from hand-crafted features without learning, such as in the approaches of Kalman filtering [5] and Box–Jenkins models [6]. On the contrary, machine learning-based methods extract the features during the training process. Some commonly adopted strategies for load forecasting include support vector machines [7], neural networks [8, 9], fuzzy logic [10], multiple kernel learning [11], additive models [12–15], etc. Nevertheless, in recent years, it has become difficult to obtain accurate load forecasts due to grid modernization, which makes the electricity load less predictable. To accommodate the uncertainty of the electricity load, some probabilistic forecasting methods have been developed. A well-behaved one is quantile regression [16], which can be applied to various prediction architectures. Although machine learning methods have made promising achievements, more precise and stable methods are still required to handle the increasing unpredictability of power systems [17].

Recently, much attention has been paid to deep learning due to its well-behaved nonlinear approximation capability. A series of deep learning methods for load forecasting are based on convolutional neural networks and recurrent neural networks [18]. Nevertheless, they still have some restrictions. For instance, a prior assumption behind the convolutional kernel is space invariance [19], which is not fully satisfied in electricity data (this issue is shown in Sect. 4.5 of this paper).


Another well-behaved model for load forecasting is the fully connected network [20]. However, due to its numerous parameters, it suffers from overfitting and slow convergence.

To address the above challenges, we develop a novel deep learning model named the densely connected network (DCN). This model is established in a densely connected structure, and its backbone is based on the unshared convolutional neural network (UCNN) [21]. By constructing the identity mapping, this structure performs better on nonlinear approximation and converges more rapidly than the fully connected network. In addition, considering the possibility of overfitting, a novel regularization method is developed in this paper. Unlike commonly used regularization methods (i.e., the classic L2-norm), our regularization method does not worsen the searching surface. It only works when the parameters attempt to go out of range. Furthermore, a smooth loss function is adopted to stabilize the parameter updating process. In addition, by combining this smooth loss function and quantile regression, our method can be recast to solve probabilistic forecasting. To demonstrate the effectiveness of this architecture, the DCN is validated on datasets from Australia and Germany. Several classic deep learning benchmarks are selected for comparison. Through these verifications, our model outperforms the benchmarks and achieves better performance. In general, this paper makes the following contributions:

(1) A novel deep learning model is proposed for load forecasting. Its backbone is based on the unshared convolutional neural network. To the best of our knowledge, this is the first time unshared convolution has been used for load forecasting.
(2) By reconstructing the unshared convolutional layers into the densely connected fashion, a densely connected block is designed. This advanced structure can address the gradient vanishing problem efficiently.
(3) In this work, a novel and efficient regularization method named clipped L2-norm is developed to address overfitting. Unlike previous regularization methods, our method avoids considering overfitting until some parameters attempt to overstep the limited range.
(4) A smooth loss function is utilized to stabilize the parameter updating process. By combining this smooth loss function with quantile regression, our proposed model can be reconstructed to solve probabilistic forecasting without much modification.

The remainder of this paper is organized as follows: Sect. 4.2 mainly introduces the residual architecture, which is the basis of the DCN. Section 4.3 shows the structure of the unshared convolution. Section 4.4 presents the theoretical basis and implementation details of the DCN. A case study is conducted in Sect. 4.5 for verification. Finally, the conclusion is drawn in Sect. 4.6.


4.2 Residual Architecture

In recent years, deep learning has shown remarkable performance in various fields, such as object detection [22], semantic segmentation [23] and speech recognition [24]. Meanwhile, numerous studies have also applied deep learning to solve the load forecasting problem [3, 17, 20, 25]. It is essentially a regression task which aims to establish a suitable model Φ_p(·) approximating the values Y (actual load) with input X (historical data). We denote Y* as the output of such a model, which is shown as follows.

$$
Y^* = \Phi_p(X) \quad (4.1)
$$

We train the model to eliminate the gap between Y and Y* by updating the parameters p of Φ_p(·). After several iterations, p should converge to p* satisfying the following condition

$$
p^* = \arg\min_{p} L(Y, Y^*) \quad (4.2)
$$

where L(·) denotes the loss function. During the iterations of training, the parameters are updated by optimization algorithms such as stochastic gradient descent [26] and Adagrad [27]. However, these algorithms are usually based on back-propagating gradients, which can vanish or explode as the network deepens [28]. This is one of the main obstacles in deep learning [29]. Several attempts have been made to address this challenge [30]. Remarkably, a residual neural network, also called Resnet, was developed by He et al. to alleviate this hindrance [31]. It obtained outstanding performance, and its authors won the best paper award at the IEEE Conference on Computer Vision and Pattern Recognition 2016 (CVPR 2016), which is one of the most influential meetings on computer vision and deep learning. In the following, we briefly introduce the residual architecture used in Resnet.

As shown in Fig. 4.1a, a classic convolutional block consists of three parts: a convolutional layer, a batch normalization layer and a ReLU layer. First, the convolutional layer extracts the features of the data by sliding convolutional kernels. Next, the batch normalization layer transforms the feature map into a normal-like distribution, which promotes the stability of the training model. Furthermore, a ReLU layer is added to enhance the nonlinear approximation capability [32]. The whole convolutional block can be denoted as a mapping function M(x). For the residual structure, this mapping function is recast into another style, M(x) + x, by adding a "short connection", which is shown in Fig. 4.1b. It has been verified that the reconstructed mapping function can alleviate gradient vanishing to some extent [31]. Furthermore, in order to deal with this problem more efficiently, Densenet was established to develop an even more densely connected structure, and its authors won the best paper award at CVPR 2017 [34]. However, both Resnet and Densenet are based on convolutional neural networks (CNNs). A prior assumption behind CNNs is that the training data satisfy spatial invariance [21], which does not fully accord with electricity load data. To address this restriction, we develop a novel deep learning model named the DCN, based on unshared convolutional networks in a residual form; the unshared convolution is introduced in the following section.


Fig. 4.1 The transformation from the classic block to the residual block, reprinted from Ref. [33], copyright 2022, with permission from IEEE
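To make the "short connection" concrete, the following is a minimal PyTorch-style sketch of the residual mapping M(x) + x described above. It is an illustration rather than the authors' implementation; the use of a 1-D convolution, the channel count and the sequence length are assumptions chosen for a load-like input.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Classic block M(x) (conv -> batch norm -> ReLU) recast as M(x) + x."""
    def __init__(self, channels: int):
        super().__init__()
        self.mapping = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The identity term keeps a direct path for the back-propagated gradient.
        return self.mapping(x) + x

# Example: a batch of 8 sequences with 16 channels and 24 time steps.
x = torch.randn(8, 16, 24)
print(ResidualBlock(16)(x).shape)  # torch.Size([8, 16, 24])
```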

4.3 Unshared Convolution

Previous deep learning methods for load forecasting are mainly based on fully connected networks (FCN) [8, 35], CNNs [36] and recurrent neural networks (RNN) [18]. However, these networks have some restrictions. For instance, a CNN requires the data for load forecasting to be space invariant [21], but they are actually space variant. This issue is presented in Sect. 4.5. For an RNN, the outputs are determined by the input data and the previous states of the network [3]. The forecasting deviation can accumulate gradually if there exist some errors. An FCN is another choice for load forecasting [35]. Nevertheless, it usually suffers from a large number of parameters and is prone to overfitting [37].

In order to avoid the above problems, we adopt the UCNN [21] as our backbone. Compared to an FCN, it has far fewer parameters and is less likely to overfit. Meanwhile, unlike the classic CNN, the parameters of its convolution kernels corresponding to different locations of the feature map are not shared, which means the UCNN does not impose the space-invariance requirement on forecasting. In detail, Fig. 4.2 shows the difference between classic convolution and unshared convolution. In this figure, the whole black rectangle represents a feature map, and the colorful squares denote convolution kernels. Kernels of the same color are identical. It can be observed from Fig. 4.2a that in classic convolution, the kernels in different places are the same, which means the parameters are shared.


Fig. 4.2 The sketches of classic convolution and unshared convolution, reprinted from Ref. [33], copyright 2022, with permission from IEEE

On the contrary, as shown in Fig. 4.2b, the unshared convolution kernels corresponding to different positions of the feature map are unique. This characteristic guarantees that the input data do not have to be space invariant. To the best of our knowledge, this is the first time that unshared convolution has been applied to load forecasting.
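As a concrete illustration of this difference, the sketch below implements a simple 1-D unshared ("locally connected") layer in PyTorch, where every output position owns its own kernel instead of sliding one shared kernel over the input. The shapes, stride and initialization are illustrative assumptions, not the book's code.

```python
import torch
import torch.nn as nn

class Unshared1d(nn.Module):
    """1-D unshared convolution: a separate kernel for every output position."""
    def __init__(self, in_len: int, kernel_size: int, stride: int = 1):
        super().__init__()
        self.k, self.s = kernel_size, stride
        self.n_out = (in_len - kernel_size) // stride + 1
        # One weight vector and one bias per output position; a classic CNN
        # would instead share a single kernel of shape (kernel_size,).
        self.weight = nn.Parameter(torch.randn(self.n_out, kernel_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(self.n_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_len) -> patches: (batch, n_out, kernel_size)
        patches = x.unfold(dimension=1, size=self.k, step=self.s)
        # Each output position i is produced by its own kernel weight[i].
        return (patches * self.weight).sum(dim=-1) + self.bias

x = torch.randn(32, 24)          # e.g., 24 hourly load values per sample
layer = Unshared1d(in_len=24, kernel_size=3)
print(layer(x).shape)            # torch.Size([32, 22])
```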

4.4 Densely Connected Network

In this section, the description of the DCN is dissected into two steps. First, the overall framework and implementation process are depicted. In the second step, we present four novel components used in the proposed model.

4.4.1 Overall Framework

The novel DCN model developed in this paper is a flexible deep learning architecture designed for both deterministic and probabilistic load forecasting and is trained in an end-to-end fashion. Its backbone is based on the UCNN, whose parameters are far fewer than those of an FCN. The overall framework of the DCN is illustrated in Fig. 4.3. It consists of three parts: preprocessing, feature extraction and target regression. In the following, the processes of all three parts are introduced.

In the preprocessing part, the whole dataset is normalized to a normal-like distribution, which helps stabilize the convergence process [38]. Treating the whole dataset as a matrix, the normalization process can be formulated as


Fig. 4.3 Overall architecture of the Densely Connected Network, reprinted from Ref. [33], copyright 2022, with permission from IEEE. (DCB: Densely Connected Block, FCL: fully connected layer)

$$
\begin{cases}
\mu_j = \dfrac{1}{N}\sum\limits_{i=1}^{N} x_{ij} \\[6pt]
\sigma_j^2 = \dfrac{1}{N}\sum\limits_{i=1}^{N}\left(x_{ij}-\mu_j\right)^2 \\[6pt]
x_{ij} = \dfrac{x_{ij}-\mu_j}{\sqrt{\sigma_j^2+\epsilon}}
\end{cases} \quad (4.3)
$$

where N, x_ij, μ_j, σ_j^2 and ε denote the total number of data, the element in the ith row and jth column of the dataset, the mean and the variance of the jth column, and a tiny number, respectively. Meanwhile, a similar operation is conducted on the actual load values:

$$
y_i^* = \frac{y_i^* - \nu}{\kappa} \quad (4.4)
$$

where y_i^* is the ith element, and ν and κ are two constants preset empirically. The normalization of labels can stabilize the learning process and reduce forecasting deviations. Next, training data batches are generated by randomly sampling from the normalized dataset. More specifically, the data belonging to the same timestamp can be treated as a vector, and a training data batch comprises several vectors sampled from the dataset.

In the feature extraction part, the processed data batch is fed to a series of densely connected blocks (DCB) to extract information. The weights in each layer of a DCB are regularized by the clipped L2-norm to avoid overfitting. With a similar intention, a dropout layer is added after the final DCB. The dropout layer randomly discards a preset proportion of the features flowing from the previous layer. This operation can strengthen the robustness of the network [21]. Furthermore, a fully connected layer without an activation function is added after the dropout layer to forecast the load. During the training process, the parameters of the DCN are updated with the Adam algorithm [39]. By training with two different forms of loss functions in the target regression part, deterministic prediction and probabilistic prediction can be realized, respectively.
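For illustration, a minimal NumPy sketch of the preprocessing step in Eqs. (4.3) and (4.4) is given below. The values of ε, ν and κ used in the example are arbitrary assumptions, not values taken from the book.

```python
import numpy as np

def normalize_features(X: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Column-wise z-score normalization of the data matrix, as in Eq. (4.3)."""
    mu = X.mean(axis=0)                  # mean of each column (feature)
    var = X.var(axis=0)                  # variance of each column
    return (X - mu) / np.sqrt(var + eps)

def normalize_labels(y: np.ndarray, nu: float, kappa: float) -> np.ndarray:
    """Label scaling of Eq. (4.4); nu and kappa are preset empirically."""
    return (y - nu) / kappa

# Example with random data standing in for historical load features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 24))
y = rng.normal(loc=500.0, scale=50.0, size=1000)
Xn = normalize_features(X)
yn = normalize_labels(y, nu=500.0, kappa=50.0)
```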


4.4.2 Densely Connected Block

By introducing the densely connected residual architecture into unshared convolutional layers, we obtain the densely connected block (DCB), which is the basis of the DCN. In the following, the implementation details of unshared convolution are presented first. Given a training dataset

$$
D = \{(x_t, y_t)\}_{t=1}^{n} \quad (4.5)
$$

where x_t represents the tth training sample from the input domain X and y_t represents the corresponding label from the target space Y. After iterations, the output of the network Φ_p(x_t) should converge to y_t. Considering that Φ_p(·) is a UCNN model and each small rectangle in Fig. 4.2 is called a convolution area, the feature transformation process in the kth hidden layer can be formulated as

$$
p_i^k = \varphi(h_i^{k-1}) \quad (4.6)
$$

where h_i^{k-1}, p_i^k and φ(·) are the ith convolution area of the (k-1)th layer, the output corresponding to the input h_i^{k-1} and the unshared convolution operator, respectively. Then, the outputs corresponding to different convolution areas in the last layer are concatenated as a new feature map. Furthermore, the newly generated feature map can be separated into several convolution areas of different sizes for the next convolution operation. The unshared convolution operator φ(·) applied to a specific convolution area is similar to the process of a fully connected layer and is formulated as

$$
d_{ij} = \sum_{t=1}^{h} g_k\left(w_{it} \cdot e_{tj} + b_{ij}\right) \quad (4.7)
$$

where e_tj is the element whose coordinates are (t, j) in the convolution area, and d_ij, w_it and b_ij are the corresponding output, weight and bias. Besides, h and g_k(·) are the height of this convolution area and the corresponding activation function, respectively. In this work, to avoid the information loss in ReLU [32], we choose Leaky ReLU [40] as our activation function instead. In order to strengthen the nonlinear approximation capability of the network, we modify the unshared convolutional layers into the residual structure. Reconstructed as an identity mapping, the gradient can back-propagate in a more "straight" way. Specifically, this is conducted by adding a "short connection". The original feature transformation process can be recast into

$$
p_i^k = \varphi(h_i^{k-1}) + h_i^{k-1} \quad (4.8)
$$


Fig. 4.4 Densely Connected Block (DCB). Each layer in the same block owns the same number of nodes, reprinted from Ref. [33], copyright 2022, with permission from IEEE. (BN: Batch normalization. UCL: Unshared convolutional layer)

Following the above idea, the network can be reconstructed into a denser fashion. In each block of the network, the output of the kth hidden layer is connected to every previous hidden layer in the same block through a short connection, which is formulated as follows.

$$
p_i^k = \varphi(h_i^{k-1}) + \sum_{l=1}^{k-1} h_i^{l} \quad (4.9)
$$

Then, the DCB is formed. It is the basic component of our proposed method, and its structure is shown in Fig. 4.4. Built in this dense fashion, the DCB can extract features from load data efficiently.
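The sketch below illustrates the dense skip connections of Eq. (4.9) in PyTorch. For brevity, a fully connected layer stands in for the unshared convolutional layer of the actual DCB, so this is an assumption-laden illustration rather than the authors' implementation; the width and number of layers are also assumed.

```python
import torch
import torch.nn as nn

class DenselyConnectedBlock(nn.Module):
    """Eq. (4.9): the k-th layer output is phi(h^{k-1}) plus the sum of all
    previous hidden states h^1, ..., h^{k-1} of the same block."""
    def __init__(self, width: int, n_layers: int = 4):
        super().__init__()
        # Stand-in for "unshared convolutional layer + BN + Leaky ReLU";
        # every layer keeps the same width so the skip sums are well defined.
        self.layers = nn.ModuleList([
            nn.Sequential(nn.Linear(width, width),
                          nn.BatchNorm1d(width),
                          nn.LeakyReLU())
            for _ in range(n_layers)
        ])

    def forward(self, h0: torch.Tensor) -> torch.Tensor:
        hidden = []                 # stores h^1, ..., h^{k-1}
        prev = h0                   # h^0 is the block input
        for layer in self.layers:
            p = layer(prev)                          # phi(h^{k-1})
            if hidden:                               # + sum_{l=1}^{k-1} h^l
                p = p + torch.stack(hidden).sum(dim=0)
            hidden.append(p)
            prev = p
        return prev

x = torch.randn(16, 48)                              # batch of 16 feature vectors
print(DenselyConnectedBlock(width=48)(x).shape)      # torch.Size([16, 48])
```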

4.4.3 Clipped L2-norm

Overfitting is one of the main challenges in deep learning [41]. Some parameter regularization methods have been designed to address this problem, such as the L2-norm [21]. It relieves overfitting by directly adding a weight penalty to the training loss function. By selecting the regularization factor of the L2-norm, a balance between avoiding overfitting and achieving good forecasting performance on the training dataset can be obtained. Nevertheless, this balance-searching process can worsen the forecasting accuracy even when the network is not overfitted [21]. A possible solution is to clip the parameters into a certain range. Using this method, the network does not consider overfitting until some parameters attempt to overstep the preset range. Nevertheless, limiting the range of parameters may prevent the network from arriving at the globally optimal solution.


To address the above challenges, we propose a regularization method named clipped L2-norm. In this method, a parameter range is preset initially. Similar to the weight-clipping method, overfitting is not considered if all parameters are within the parameter range. If some parameters overstep the parameter range, the L2-norm is applied to these parameters to alleviate overfitting. The implementation details of the clipped L2-norm are shown in Algorithm 1, and its effectiveness is verified in Sect. 4.5.

Algorithm 1 The training process of the deterministic DCN, reprinted from Ref. [33], copyright 2022, with permission from IEEE
1: Initialize the parameters ω in the DCN Φ(·)
2: Set the range of parameters as (V_min, V_max)
3: Data normalization
4: for epoch = 1 to M do
5:   for i = 1 to N do
6:     Sample input X and load values Y* from the dataset randomly
7:     Y = Φ(X)  (calculate the output of the DCN)
8:     W_status = φ_check(ω)  (check whether some parameters have overstepped the preset range)
9:     L = L_smooth(Y, Y*) + W_status · ||ω||  (calculate the smooth loss and regularization loss)
10:    Δω = O_adam(L)  (calculate the change of parameters with the Adam optimizer)
11:    ω ← ω + Δω  (update the parameters)
12:   end for
13: end for
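A minimal PyTorch-style sketch of the clipped L2-norm penalty used in steps 8–9 of Algorithm 1 is shown below. The interpretation that the penalty is applied only to the overstepping parameters follows the description in the text above (Algorithm 1's W_status · ||ω|| could also be read as penalizing the full norm), and the range (−1, 1) in the usage comment is an assumed example; model, criterion and optimizer are hypothetical names.

```python
import torch

def clipped_l2_penalty(parameters, v_min: float, v_max: float) -> torch.Tensor:
    """Clipped L2-norm: no penalty while every weight stays inside
    [v_min, v_max]; otherwise an L2 penalty on the overstepping weights."""
    terms = []
    for p in parameters:
        outside = (p < v_min) | (p > v_max)   # elementwise "overstepped" flag
        if outside.any():
            terms.append((p[outside] ** 2).sum())
    return torch.stack(terms).sum() if terms else torch.tensor(0.0)

# Sketch of one training step (model, criterion, optimizer assumed to exist):
#   loss = criterion(model(x), y) + clipped_l2_penalty(model.parameters(), -1.0, 1.0)
#   optimizer.zero_grad(); loss.backward(); optimizer.step()
```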

4.4.4 Smooth Loss

The loss function defines the optimization target of supervised training. The most commonly used loss functions for load forecasting are the L1 loss and the L2 loss [3]. However, they still have some limitations. For instance, the L1 loss measures the difference between the real values and the network's output with a first-order function, which is given as follows.

$$
L_1 = |Y - Y^*| \quad (4.10)
$$

where Y is the regression output and Y* is the real value. Nevertheless, when Y is close to Y*, the gradient of the L1 loss fluctuates severely. In comparison, the L2 loss adopts the second-order form, which is

$$
L_2 = (Y - Y^*)^2 \quad (4.11)
$$

For the L2 loss, although the gradient approaches zero when Y is close to Y*, it can explode when there is a huge gap between Y and Y*. To address the restrictions mentioned above, we adopt a smoother loss function similar to the one in [22], which is given as follows.


Fig. 4.5 The diagram of the smooth loss, reprinted from Ref. [33], copyright 2022, with permission from IEEE

$$
L_{smooth} =
\begin{cases}
0.5\,(Y - Y^*)^2 & |Y - Y^*| < 1 \\
|Y - Y^*| - 0.5 & \text{otherwise}
\end{cases} \quad (4.12)
$$

As shown in Fig. 4.5, the gradient of the smooth loss is a constant when there is a huge gap between the output of the network and the real value. This characteristic guarantees the stability of back-propagation. Meanwhile, when the output is close to the actual value, the gradient is close to zero, which supports the stability of convergence.
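As an illustration, the smooth loss of Eq. (4.12) can be written in a few lines of PyTorch; with the unit threshold it coincides with the standard smooth-L1 (Huber-type) loss. This sketch is for clarity and is not taken from the authors' code.

```python
import torch

def smooth_loss(y_pred: torch.Tensor, y_true: torch.Tensor) -> torch.Tensor:
    """Eq. (4.12): quadratic for small errors, linear for large ones."""
    diff = torch.abs(y_pred - y_true)
    per_sample = torch.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return per_sample.mean()

# Equivalent built-in form (default unit threshold, mean reduction):
#   torch.nn.functional.smooth_l1_loss(y_pred, y_true)
```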

4.4.5 Smooth Quantile Regression

Recently, quantile regression has been used to reconstruct deterministic forecasting models into a probabilistic form [4]. To better stabilize the regression process, we develop a smooth quantile regression by combining quantile regression and the smooth loss function, which is introduced as follows. In our method, smooth quantile regression is realized by adding imbalance to the smooth deterministic loss function. More specifically, this can be conducted by simply weighting the smooth loss function. Assuming the original smooth loss function is L(Y, Y*), the newly constructed loss function for smooth quantile regression can be defined as

$$
L_q = \sum_{i:\,Y_i > Y_i^*} \alpha\, L(Y_i, Y_i^*) + \sum_{i:\,Y_i \le Y_i^*} (1-\alpha)\, L(Y_i, Y_i^*) \quad (4.13)
$$
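To make Eq. (4.13) concrete, the following sketch weights the smooth loss of Eq. (4.12) by α where the prediction exceeds the target and by 1 − α elsewhere; training one model per chosen α (e.g., 0.05 and 0.95) would then yield lower and upper quantile estimates. This is an illustrative reading of the formula, not the authors' implementation.

```python
import torch

def smooth_quantile_loss(y_pred: torch.Tensor, y_true: torch.Tensor,
                         alpha: float) -> torch.Tensor:
    """Eq. (4.13): alpha-weighted smooth loss where the prediction exceeds
    the target, (1 - alpha)-weighted where it does not."""
    diff = torch.abs(y_pred - y_true)
    base = torch.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)  # Eq. (4.12)
    weight = torch.where(y_pred > y_true,
                         torch.full_like(base, alpha),
                         torch.full_like(base, 1.0 - alpha))
    return (weight * base).sum()
```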

where L_rws^i stands for the RWS regarding the ith element of the prediction results. U_i and L_i are the ith elements of the predicted upper and lower boundaries, respectively. y_i represents the ith real load value. δ_i is an index for describing the obtained prediction range, and its formulation is shown as follows.

$$
\delta_i = 2 \times \frac{U_i - L_i}{U_i + L_i} \quad (4.15)
$$

Therefore, the RWS can be formulated as

$$
L_{rws} = \frac{1}{N}\sum_{i=1}^{N} L_{rws}^{i} \quad (4.16)
$$

where N is the total number of prediction results. The simulation results are shown in Table 4.3. Meanwhile, part of the forecasting results of the DCN is presented in Fig. 4.10.

Fig. 4.10 Probabilistic forecasting results of the DCN. (Both figures cover data from July 1, 2016 to July 3, 2016 with one-hour resolution)

As observed in the 7th column of Table 4.3, the forecasting RWS of the DCN on the Australia and Germany load data is 0.051 and 0.047, respectively. Therefore, it is the best-behaved method compared to the other methods. For the classic convolution-based methods, the forecasting RWS of CNN, CNN+FCN, Resnet and Densenet are 0.093, 0.079, 0.091 and 0.086 on the Australia dataset and 0.084, 0.076, 0.075 and 0.075 on the Germany dataset. It is worth mentioning that, as seen in Fig. 4.10, the widths of the prediction ranges rise sharply at the peaks and valleys. This phenomenon reveals that the main difficulties of load forecasting stem from these critical periods. To address this challenge, a possible solution is to pay more attention to the data in the peak and off-peak periods during the training process. This can be done by increasing the weights of the loss function on the forecasting results belonging to these periods. In addition, instead of predicting the actual load demand directly, predicting the leap value from normal periods to peak/off-peak periods may also alleviate this challenge. These two strategies are promising directions for future study.

4.6 Conclusion

This paper has developed a novel deep learning model for deterministic and probabilistic load forecasting. In this model, the UCNN was selected as the backbone, and this is the first time that unshared convolution has been applied to load forecasting. By reconstructing the unshared convolution layers into the densely connected structure, this architecture has a good nonlinear approximation capability and can be trained in an end-to-end fashion. Meanwhile, an efficient regularization method named clipped L2-norm was utilized to address overfitting. Compared to previous classic regularization methods, this method does not worsen the searching surface. In addition, a smooth loss function was used to stabilize the parameter updating process. By combining the smooth loss function and quantile regression, our method can conduct probabilistic load forecasting. Using load data from Australia and Germany, we have conducted three cases to validate the outperformance of the proposed DCN. Case 1 verified that the densely connected structure contributes the most to the forecasting precision of the DCN. In Case 2, further studies were conducted to show the effectiveness of the DCN on deterministic load forecasting, compared with other popular deep learning models, such as FCN, CNN, CNN+FCN, LSTM, Resnet and Densenet. In addition, Case 3 presented the outperformance of the proposed DCN in terms of probabilistic load forecasting. Finally, we indicate that there exist some promising directions for future work, such as accommodating unobservable influential factors and reducing parameter redundancy in the DCN.



Chapter 5

Reinforcement Learning Assisted Deep Learning for Probabilistic Charging Power Forecasting of EVCS

5.1 Introduction

Carbon emission reduction is one of the most essential actions for addressing climate change. Therefore, renewable energy sources, such as wind and solar energy, are widely deployed to substitute fossil fuels [1–5]. Besides, the electrification of transportation systems is another effective approach to reducing carbon emissions. Owing to advances in manufacturing and related key technologies, the reliability and efficiency of electric vehicles (EVs) have improved significantly. Therefore, many countries aim to achieve a high EV penetration in the near future. It is predicted that the number of EVs worldwide will increase continually and may reach approximately 35 million in 2022 [6]. However, EV charging, one of the most important factors in expanding EV adoption, has brought great challenges to the secure and economic operation of power systems. Because EV charging behavior is intrinsically random, the charging power of an EV charging station (EVCS) is uncertain [7]. Consequently, it is considered a volatile and uncertain load, and fluctuations of such a load may threaten the operational security of power systems [8, 9]. To prevent such stressed conditions, it is crucial to accurately predict the EVCS charging power.

Current works on EVCS or EV charging power forecasting can be divided into two main categories, namely model-based and data-driven approaches. For the first category, [10] employed the trip chain to establish a spatial-temporal behavior model of EVs, which is beneficial for forecasting their charging power. Considering the impact of traffic on charging behavior, [11] provided a reliable approach for EVCS charging power forecasting. Based on the case of Shenzhen, China, the charging behaviors of EVs are considered for systematic forecasting of EV charging power [12]. Thanks to the advent of cloud services and the Internet of things, massive charging process data, such as the charging time and energy change of EVs, can be collected to forecast the charging power of EVCS [13].


Some classical data-driven forecast algorithms have been widely employed to forecast the charging power of EVCS. For instance, [14] applied an autoregressive integrated moving average (ARIMA) model to forecast the charging power of massive EVCSs distributed in Washington State and San Diego. Reference [15] combined the least squares support vector machine (LSSVM) algorithm and fuzzy clustering for the EVCS charging power forecasting task. Based on the historical data of Nebraska, USA, the effectiveness of extreme gradient boosting (XGBoost) on charging power forecasting was validated in [16]. The above algorithms require structured input data with specific human-defined features, which involves cumbersome feature engineering [17]. Hence, deep learning methods have become more popular in this field. For instance, [17] reviewed numerous deep learning methods for EV charging load forecasting and concluded that the long short-term memory (LSTM) model could reduce the forecasting error by about 30% compared with the conventional artificial neural network (ANN) model. Reference [18] introduced the LSTM network to build a hybrid forecast model for EVCS charging power, and the experimental results demonstrated its effectiveness. The advantage of LSTM over ARIMA and the ANN was also shown in [19].

Note that the above algorithms have been used to conduct point forecasts, which only provide an expected charging power of the EVCS in the future. Therefore, the probabilistic forecast is introduced. It forecasts the probabilistic distribution of the future charging power and thus provides more information, i.e., the expected value and the forecast uncertainties [20, 21]. Consequently, we conduct the probabilistic forecast of EVCS charging power in this paper. To the best of our knowledge, there exist only a few reported works on this topic. For instance, [20] applied four quantile regression algorithms to probabilistic EV load forecasting. In addition, a deep learning method is used in [21] to capture the uncertainties of the probabilistic forecast.

However, the above quantile regression and deep learning-based methods have disadvantages. Regarding the first method, we need to take several steps under different quantiles to obtain the forecast distribution, which brings about more computational burden [20]. For deep learning, the forecast uncertainty is mainly caused by the model training and the input data [21, 22]. Indeed, stochastic parameters are normally introduced in the deep learning model, and the uncertainty is estimated through statistical indicators of its outputs. Therefore, the uncertainty of the probabilistic forecast is usually obtained by using adequate samples, which requires repeated running of the deep learning model over the input data. In other words, traditional approaches may lead to a huge computational complexity.

Consequently, we propose an approach to solve the above issue: a reinforcement learning assisted deep learning forecast approach is adopted to directly obtain the forecast uncertainty. This approach is designed to calculate the uncertainty in a single pass, instead of repeated runs. In detail, the LSTM is used to obtain the point prediction of the EVCS charging power, which is taken as the expected value of the forecast probabilistic distribution. Furthermore, the cell state is one of the core inherent parameters of the LSTM, and it can represent important


information of the model and the input data simultaneously. Therefore, the forecast uncertainty can be obtained from the cell state (this issue is explained in Sect. 5.4). As this state varies with the input data, the forecast uncertainty also changes with the time-series data. This change can then be modeled as a Markov decision process (MDP). In this way, we innovatively introduce a reinforcement learning algorithm to assist the LSTM in obtaining the forecast uncertainty. The reason for selecting reinforcement learning is that it is a powerful technique for improving a designed target through autonomous learning, without using any prior knowledge. Here, the target is to obtain an effective probabilistic forecast, and reinforcement learning is used to learn the forecast uncertainty by observing the cell state of the LSTM.

This paper makes the following contributions:

(1) We propose a reinforcement learning assisted deep learning framework for probabilistic EVCS charging power forecasting. To the authors' best knowledge, this is the first paper that uses a reinforcement learning algorithm to obtain the forecast uncertainty in the field of EV charging power forecasting.
(2) We model the variation of the LSTM cell state as an MDP, which is solved by proximal policy optimization (PPO) to obtain the forecast uncertainty. In this case, the expected value of the EVCS charging power is forecasted by the LSTM, while the forecast uncertainty is yielded by the PPO.
(3) An adaptive exploration PPO (AePPO) is further proposed. It adaptively balances the exploration and the exploitation during the training of PPO, which improves its performance and prevents it from getting trapped in local optima.
(4) We propose a data transformer method to obtain the information of the EVCS from distributed charger recordings, i.e., charging sessions, for the LSTM training. It aggregates the charging sessions and transforms them into the time-series format that includes the information of the charging process, i.e., the charging power, utilization time and demand satisfaction rate.

The remainder of this paper is organized as follows. Section 5.2 presents the reinforcement learning assisted deep learning framework for EVCS charging power forecasting. In Sect. 5.3, a data transformer method is proposed to preprocess the charging data. Section 5.4 introduces the LSTM, the MDP model of the cell state variation and the original PPO. In Sect. 5.5, the AePPO is proposed, and the case studies are conducted in Sect. 5.6. Finally, Sect. 5.7 concludes this paper.

5.2 Framework

5.2.1 Problem Formulation

The target of the studied probabilistic forecast problem is modeled by the following equations:


$$\hat{P}_{t+1} = \hat{E}_{t+1} + \epsilon \tag{5.1}$$

$$\epsilon \sim N(0, \delta_f) \tag{5.2}$$

$$\hat{E}_{t+1} = f(X_t) \tag{5.3}$$

$$\delta_f = g(f, X_t) \tag{5.4}$$

where $\hat{P}_{t+1}$ is the probabilistic forecast of the EVCS charging power at time $t+1$, $\epsilon$ represents the noise which stands for the uncertainty between the forecast and the real value, and $\hat{E}_{t+1}$ is the expectation of the forecast, obtained by the function $f$ [22]. In this paper, we assume the noise is Gaussian distributed with variance $\delta_f$, which is obtained by the function $g$. Note that a time-series model based on the Gaussian distribution assumption can achieve a satisfactory performance [23]. Besides, $X_t = \{x_{t-T}, \ldots, x_{t-1}, x_t\}$ is the vector of input features of the two functions. It stands for the features of the $T$ timestamps before $t+1$, where $x_t \in \mathbb{N}^{N_f}$ ($\mathbb{N}^{N_f}$ denotes the $N_f$-dimensional integer space). Note that $f$ and $g$ represent the point forecast model realized by the LSTM and the variance learning model realized by our proposed AePPO, respectively.

Generally, $f$ provides a point forecast which represents the mean value of our target [24]. During the training, the inherent parameters of $f$ are gradually learned. This may cause inflexibility of the algorithm, especially when confronted with a stochastic situation such as EV charging behavior, and thus the results provided by $f$ might be unreliable. In this case, it is necessary to quantify the uncertainty of the EVCS charging power, and introducing a noise term parameterized by the predicted variance $\delta_f$ is one of the most effective approaches [23, 25]. To achieve this, $g$ is used as the variance learning model in this paper. It determines the variance $\delta_f$ arising from the uncertainty of $f$ and the input data $X_t$.

The uncertainty of $f$ normally comes from two aspects: one is its black-box nature, and the other is its need for vast historical data [25]. Since the structure of $f$ is designed via the experience of researchers, it works like a black box because of the lack of physical models. On the other hand, $f$ is data-driven, and its inherent parameters are determined by a stochastic gradient descent optimization method such as Adam [21]; more training data would increase the generalization capability of the model and thus improve its accuracy. In addition, the uncertainty of the input data $X_t$ can be caused by the charging behavior and the precision of the sensors. Specifically, $X_t$ relates to the features of the charging process, which is stochastic and highly variable, and the limitation of the sensor precision leads to uncertain errors in the collected data. In summary, the variance caused by the uncertainty of $f$ and $X_t$ should be jointly considered. Usually, the variance is estimated through statistical indicators, which requires massive repeated runs of $f(X_t)$ and thus leads to a huge computational complexity [22]. However, as the core parameter of the LSTM, the cell state is also affected by the input data. This means the cell state changes because of the uncertainty that comes from the model and the input data at the same time.


Consequently, in this paper, we model this change as an MDP, wherein the cell state of the LSTM is denoted as the state and the action is defined as the variance of the probabilistic forecast distribution. In this way, the predicted distribution can be constructed from the action of the MDP and the prediction of the LSTM, and the reward of the MDP is denoted as the evaluation index of the distribution. With this definition of the MDP, a reinforcement learning model, AePPO, is proposed as $g$ to obtain the variance of the probabilistic forecast distribution.
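To make the formulation concrete, the following minimal sketch shows how the forecast distribution defined by Eqs. (5.1)–(5.4) can be consumed once a point model has produced $\hat{E}_{t+1}$ and a variance model has produced $\delta_f$: a central prediction interval is simply a pair of Gaussian quantiles. The helper name and the example numbers are illustrative assumptions, not values from the chapter.

```python
# Minimal sketch: turning the Gaussian probabilistic forecast N(E_hat, delta_f)
# of Eqs. (5.1)-(5.2) into a central prediction interval.
import numpy as np
from scipy.stats import norm

def prediction_interval(e_hat, delta_f, q):
    """Return the lower/upper bounds of the central q-probability interval
    of the forecast distribution P_hat ~ N(e_hat, delta_f)."""
    sigma = np.sqrt(delta_f)          # delta_f is a variance
    half = (1.0 - q) / 2.0
    lower = norm.ppf(half, loc=e_hat, scale=sigma)
    upper = norm.ppf(1.0 - half, loc=e_hat, scale=sigma)
    return lower, upper

# Example: a 90% interval around a hypothetical forecast of 42 kWh.
print(prediction_interval(e_hat=42.0, delta_f=9.0, q=0.90))
```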

5.2.2 The Probabilistic Forecast Framework of EVCS Charging Power

As discussed above, the LSTM and AePPO are trained and utilized to obtain the mean and variance values of the predicted probabilistic distribution of EVCS charging power. To implement this procedure, we propose a probabilistic forecast framework, as shown in Fig. 5.1. It contains three parts, pretreatment, training and utilization, which are presented by the three subfigures in Fig. 5.1, denoted as (a), (b) and (c), respectively.

As illustrated in Fig. 5.1a, an EVCS has numerous chargers which record the information about charging power in the period between the arrival and departure of an EV user [20]. Since all chargers may work simultaneously, the charging information of the EVCS is obtained from the recordings of the chargers. In this paper, such a recording is termed a charging session. It consists of the period of the EV charging process, i.e., the arrival and departure time of the EV, and the related energy information, for instance, the demand and remaining energy of the corresponding EV. Note that one charging session only contains partial charging information of the EVCS, and the periods of different charging sessions may overlap. The information in the charging sessions during the overlapping periods should be gathered to derive the EVCS data for LSTM training. In this case, we pretreat the charging session data to aggregate the information and transform it into the time-series format with three features, i.e., the charging power, utilization time and demand satisfaction rate. Then, the training and validation sets are determined for algorithm training and utilization.

Afterward, the learning process is shown in Fig. 5.1b. The LSTM is trained on the training set. After that, the AePPO interacts with the LSTM according to its state and produces the related action during each training iteration. Then, the reward is calculated based on the state and action, and it is used to update the parameters of the AePPO. Subsequently, the adaptive exploration mechanism is applied to balance the exploration and the exploitation of AePPO for the next iteration. In this way, the LSTM and AePPO are jointly trained to obtain the mean and variance of the forecasts.

Finally, the last part of this framework is the utilization, as shown in Fig. 5.1c. The validation set is used to verify the framework's performance, and the LSTM


Fig. 5.1 Overall structure of the proposed forecast framework, reprinted from Ref. [26], copyright 2022, with permission from IEEE


and AePPO models are utilized here. That is, the LSTM provides the mean value of the probabilistic forecast distribution $\hat{E}_{t+1}$, while AePPO is used to determine the variance $\delta_f$ based on the cell state of the LSTM. In this way, the predicted probabilistic distribution $\hat{P}_{t+1}$ can be obtained by Eq. (5.1).

5.3 Data Transformer Method

In this section, the data transformer method for the charging sessions is introduced to aggregate the charging information of the EVCS, including the charging power, utilization time and demand satisfaction rate. This information is then transformed into the time-series format used for the training of the LSTM.

As previously mentioned, a charging session contains the information of a charging period, which starts when an EV is connected to an EVCS charger at time $t_{arr}$ and finishes when it departs at $t_{de}$. Furthermore, the charging session also includes the energy information. For instance, an arriving EV has $e_{arr}$ energy remaining in its battery at time $t_{arr}$, and its energy demand is $e_{user}$. During the charging process, the charger provides power to the EV until charging finishes at $t_{dc}$ (the time when the EV is done charging), at which point the energy of the EV is denoted by $e_{dc}$. Finally, when the EV leaves the charger at $t_{de}$, the above information of the charging session is recorded.

Depending on whether $e_{user}$ is satisfied, each charging session falls into one of two scenarios, as illustrated in Fig. 5.2. Figure 5.2a represents Scenario 1, where $e_{user}$ is satisfied, while Fig. 5.2b stands for Scenario 2, where $e_{user}$ is not fulfilled. In Scenario 1, $t_{dc} < t_{de}$ and $e_{user} = e_{dc}$, which means that the EV finishes its charging before its departure time. In Scenario 2, $t_{dc} = t_{de}$ and $e_{user} > e_{dc}$; i.e., the energy in the battery is less than the demand when the EV stops charging. This scenario may happen when the user is confronted with an emergency, in which case the EV has to leave the charger before the demand is satisfied.

Fig. 5.2 Two scenarios with respect to charging sessions


Fig. 5.3 Transformation of the charging sessions on separate chargers to the characteristic time-series data of the charging station

In order to summarize the characteristics of the charging sessions in the two scenarios for the further training of the LSTM, we use three features, i.e., the delivered energy $E_{deli}$, the demand satisfaction rate $D$ and the utilization time $T_{util}$. They are formulated as follows:

$$E_{deli} = e_{dc} - e_{arr} \tag{5.5}$$

$$D = \frac{E_{deli}}{E_{req}} \times 100\% \tag{5.6}$$

$$T_{util} = t_{dc} - t_{arr} \tag{5.7}$$

where $E_{req}$ is computed as the difference between the energy demand and the remaining energy when the EV arrives. It is defined as follows:

$$E_{req} = e_{user} - e_{arr} \tag{5.8}$$

Note that the difference between the two scenarios is captured by $D$, formulated in Eq. (5.6). It is equal to 100% in Scenario 1 because the demand of the user is satisfied, whereas in Scenario 2 the value of $D$ is lower than 100% since the demand is not fulfilled.

Normally, an EVCS contains numerous chargers, which work simultaneously. The chargers record the time and energy information when an EV is connected and save it as a charging session. Figure 5.3a shows an example of the charging situation of an EVCS which contains $N_c$ chargers. To represent the delivered energy, utilization time and demand satisfaction rate of the EVCS in the time-series format, we count the corresponding information on the same timestamp. For instance, Fig. 5.3a illustrates the charging sessions at chargers 1, $k$ and $N_c$ during timestamps $T_0$ to $T_6$.


For charger 1, an EV is connected from $T_0$ to $T_4$. According to the timestamps, the features defined in Eqs. (5.5)–(5.7) can be split. During $T_0$, the utilization time $T^1_{util,0}$ denotes the interval between $t_{arr}$ and the end of $T_0$, indicating the working time of this charger; the energy delivered between $e_{arr}$ and the end of $T_0$ is termed $E^1_{deli,0}$. Accordingly, the delivered energy and utilization time of the other timestamps are denoted as $E^1_{deli,t}$ and $T^1_{util,t}$, where $t \in \{0, 1, 2, 3, 4\}$. Note that $E^1_{deli,3}$, $E^1_{deli,4}$ and $T^1_{util,3}$, $T^1_{util,4}$ are equal to 0 because the charger is not working during $T_3$ and $T_4$. Since $E_{req}$ does not vary with time, the demand satisfaction rate of each timestamp can be calculated by $D^1_t = E^1_{deli,t} / E_{req}$. In a similar fashion to charger 1, the delivered energy, utilization time and demand satisfaction rate of charger $k$ at timestamp $T_i$ can be calculated and are termed $E^k_{deli,i}$, $T^k_{util,i}$ and $D^k_i$, respectively. It is seen from Fig. 5.3a that at charger $k$ the EV is connected at $T_2$ and leaves at $T_5$, while for charger $N_c$ the EV stays from $T_2$ to $T_6$. Besides, it is observed that the three charging sessions overlap in timestamps $T_2$, $T_3$ and $T_4$, over which the charging sessions are discontinuous. Here, the information that is partially contained in the charging sessions should be gathered to construct the time-series data of the EVCS for the LSTM training. To achieve this goal, the features split by timestamp are aggregated by the following equations:

$$E_t = \sum_{i=1}^{N_c} E^i_{deli,t} \tag{5.9}$$

$$T_t = \sum_{i=1}^{N_c} T^i_{util,t} \tag{5.10}$$

$$D_t = \sum_{i=1}^{N_c} D^i_t \tag{5.11}$$

where $N_c$ represents the number of chargers and $t \in [0, N_t]$ is the timestamp index, while $N_t$ stands for the length of the time series. To describe this process more clearly, an instance at timestamp $T_4$ is shown in Fig. 5.3c: the value of each feature at $T_4$ is obtained by aggregating the split charging session data. After aggregation, the time-series data of the EVCS with three features are obtained for the training of the LSTM. Therefore, the number of features $N_f$ is 3 in this paper, and the input feature vector at timestamp $t$ is denoted by $x_t = [E_t, T_t, D_t]$.
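The sketch below illustrates the splitting-and-aggregation step just described: charging sessions recorded per charger are split over hourly timestamps and summed into the EVCS-level features $(E_t, T_t, D_t)$ of Eqs. (5.9)–(5.11). The session fields follow Sect. 5.3; the assumption that energy is delivered at a uniform rate between $t_{arr}$ and $t_{dc}$ is an illustrative simplification, not stated in the chapter, and the example numbers are hypothetical.

```python
# Minimal sketch of the data transformer: split sessions by timestamp, then
# aggregate E_t, T_t, D_t over chargers as in Eqs. (5.9)-(5.11).
import numpy as np

def transform_sessions(sessions, n_timestamps, dt=1.0):
    """sessions: list of dicts with keys t_arr, t_dc, e_arr, e_dc, e_user
    (times in hours on a common axis, energies in kWh).
    Returns arrays E, T, D of length n_timestamps."""
    E = np.zeros(n_timestamps)   # delivered energy per timestamp, Eq. (5.9)
    T = np.zeros(n_timestamps)   # utilization time per timestamp, Eq. (5.10)
    D = np.zeros(n_timestamps)   # demand satisfaction rate,       Eq. (5.11)
    for s in sessions:
        duration = s["t_dc"] - s["t_arr"]
        e_req = s["e_user"] - s["e_arr"]                 # Eq. (5.8)
        power = (s["e_dc"] - s["e_arr"]) / duration      # uniform-rate assumption
        for t in range(n_timestamps):
            start, end = t * dt, (t + 1) * dt
            overlap = max(0.0, min(end, s["t_dc"]) - max(start, s["t_arr"]))
            if overlap <= 0.0:
                continue
            e_split = power * overlap                    # split of E_deli, Eq. (5.5)
            E[t] += e_split
            T[t] += overlap                              # split of T_util, Eq. (5.7)
            D[t] += e_split / e_req                      # split of D,      Eq. (5.6)
    return E, T, D

# Example with two overlapping sessions on a 6-hour axis (hypothetical numbers).
sessions = [
    {"t_arr": 0.5, "t_dc": 3.5, "e_arr": 10.0, "e_dc": 28.0, "e_user": 28.0},
    {"t_arr": 2.0, "t_dc": 5.0, "e_arr": 5.0,  "e_dc": 20.0, "e_user": 25.0},
]
E, T, D = transform_sessions(sessions, n_timestamps=6)
```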


5.4 Reinforcement Learning Assisted Deep Learning Algorithm

This section presents the reinforcement learning assisted deep learning algorithm. In the first subsection, the LSTM is introduced to forecast the mean value of the charging power from the aggregated time-series data. Then, the variation of the LSTM cell state is modeled as an MDP in the second subsection. In the last subsection, the PPO is presented to solve the MDP and obtain the variance of the forecasts.

5.4.1 Long Short-Term Memory

In order to forecast the probabilistic distribution of the EVCS charging power, its mean and variance values should be obtained. In this paper, we apply the LSTM to predict the mean value of the charging power. Compared with other deep learning algorithms, such as the fully connected network and the convolutional neural network, the LSTM is more capable of learning the long-term dependencies inherent in time-series data [25]. Overall, the LSTM is formulated as follows:

$$y_{t+1} = A(X_t, y_t, c_t) \tag{5.12}$$

where $y_{t+1}$ represents the output of the LSTM at time $t+1$, which stands for the predicted mean value of the EVCS charging power in this paper. The LSTM model $A$ performs its calculation using $X_t$, $y_t$ and $c_t$, which are the input data, the output and the cell state of the LSTM at time $t$, respectively. In the LSTM, $y_{t+1}$ is associated with the data and the cell state, as can be observed from the following formulations:

$$o_{t+1} = \sigma(W_o[y_t, X_t] + b_o) \tag{5.13}$$

$$y_{t+1} = o_{t+1} \odot \tanh(c_{t+1}) \tag{5.14}$$

where $o_{t+1}$ is the scaling parameter used to scale the cell state, and $\odot$ indicates elementwise multiplication. $W_o$ and $b_o$ are learnable parameters of the output process which are determined by training, and $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions, whose output ranges are $(0, 1)$ and $(-1, 1)$, respectively. In addition, $c_{t+1}$ stands for the cell state at $t+1$, which is updated from $c_t$ according to the following equation:

$$c_{t+1} = f_{t+1} \odot c_t + i_{t+1} \odot \tilde{c}_{t+1} \tag{5.15}$$

where $f_{t+1}$ stands for the forgetting degree of the previous cell state $c_t$ in $c_{t+1}$, and $i_{t+1}$ denotes the storing degree of $\tilde{c}_{t+1}$, the candidate input information for $c_{t+1}$. They are formulated by the following equations:

$$f_{t+1} = \sigma(W_f \cdot [y_t, X_t] + b_f) \tag{5.16}$$

$$i_{t+1} = \sigma(W_i \cdot [y_t, X_t] + b_i) \tag{5.17}$$

$$\tilde{c}_{t+1} = \tanh(W_c \cdot [y_t, X_t] + b_c) \tag{5.18}$$

where $W_f$ and $b_f$ stand for the weight and bias of the forget gate, $W_i$ and $b_i$ are the parameters of the input gate, and $W_c$ and $b_c$ are the weight and bias that determine $\tilde{c}_{t+1}$. Note that these parameters are obtained through training. $f_{t+1}$ ranges in $(0, 1)$, where 0 means that the information of $c_t$ should be completely ignored in $c_{t+1}$ while 1 represents that it should be fully kept. Likewise, the range of $i_{t+1}$ is $(0, 1)$: $i_{t+1} = 0$ indicates that the candidate information should not be stored in $c_{t+1}$, whereas $i_{t+1} = 1$ represents full storage of the candidate information. From these equations, we can see that the cell state is the core parameter of the LSTM because it contains the relevant information stored inside the LSTM memory, as shown in Eq. (5.14). Therefore, the LSTM determines its output according to the observation of the cell state. During LSTM training, all the learnable parameters are determined by a stochastic gradient descent optimization method, and the randomness of such a method in determining these parameters leads to the uncertainty of the model. Besides, since $f_{t+1}$ and $i_{t+1}$ are obtained from the input data $X_t$, the input also influences the cell state. Since $X_t$ contains the uncertainty of the users' behavior, this uncertainty is reflected in the cell state as well.
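As a concrete illustration of Eqs. (5.13)–(5.18), the following minimal NumPy sketch performs one LSTM step: the forget and input gates and the candidate state update the cell state, and the output gate scales $\tanh(c_{t+1})$ to give the new output. The weight shapes and random initialization are illustrative assumptions.

```python
# Minimal sketch of one LSTM step following Eqs. (5.13)-(5.18).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, y_t, c_t, params):
    """x_t: input features, y_t: previous output, c_t: previous cell state.
    params holds (W, b) pairs for the forget, input, candidate and output parts."""
    z = np.concatenate([y_t, x_t])                        # [y_t, X_t]
    f = sigmoid(params["W_f"] @ z + params["b_f"])        # Eq. (5.16), forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])        # Eq. (5.17), input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # Eq. (5.18), candidate
    c_next = f * c_t + i * c_tilde                        # Eq. (5.15), cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])        # Eq. (5.13), output gate
    y_next = o * np.tanh(c_next)                          # Eq. (5.14), output
    return y_next, c_next

# Example with a hidden size of 4 and 3 input features (E_t, T_t, D_t).
rng = np.random.default_rng(0)
h, n_f = 4, 3
params = {k: rng.normal(size=(h, h + n_f)) for k in ("W_f", "W_i", "W_c", "W_o")}
params.update({k: np.zeros(h) for k in ("b_f", "b_i", "b_c", "b_o")})
y, c = lstm_step(rng.normal(size=n_f), np.zeros(h), np.zeros(h), params)
```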

5.4.2 The Modeling of LSTM Cell State Variation

As described above, the input of the LSTM, $X_t$, is in time-series format and reflects the behavior of the users. Besides, because of the recurrent structure of the LSTM, the cell state is influenced by its previous value. Therefore, the variation of the LSTM cell state is caused by the uncertainty that comes from both the model and the data. In this case, if we want to extract the variance of the probabilistic forecast distribution from the LSTM cell state, the variation of the cell state should be modeled. Since the cell state is only determined by its previous state and $X_t$, we can model its variation as an MDP.

Normally, an MDP is represented by a tuple $\langle S, A, P, R \rangle$. $S$ is the state space that stands for all possible states of the environment. $A$ represents the action space; i.e., the agent interacts with the environment to produce an action $a \in A$ that guides the state transition. $P = \{p(s_{t+1} \mid s_t, a_t)\}$ stands for the set of transition probabilities, and $R = r(s_t, a_t)$, with $r: S \times A \to \mathbb{R}$, depends on the state and action and is termed the reward. In our problem, the MDP can be formulated as follows:

(1) Environment: The environment produces the state $s_t$ and computes the reward $r_t$ based on the action $a_t$ of the agent. Since we aim to obtain the variance of the probabilistic forecast distribution from the change of the cell state, a well-trained LSTM is used to produce the state from $X_t$.


(2) Agent: The agent stands for a policy that produces the action from observing the state. Besides, the agent performs self-learning from its experience to gradually improve the policy.

(3) State: The state represents the parameters that define the environment. Since the cell state determines the output of the LSTM, it is used as the state, termed $s_t$ at time $t$.

(4) Action: As the uncertainty of the model and input data is contained in $s_t$, the agent is expected to capture this uncertainty and produce the variance of the forecast distribution, i.e., the action $a_t = \delta_t$.

(5) Reward: The reward $r(s_t, a_t)$ indicates the evaluation index of $a_t$ on the basis of $s_t$. It gives feedback to the agent about the performance of its action, which is used to update the agent. Our target is to obtain the probabilistic distribution of the future load, which is constructed from the output of the environment, i.e., the LSTM, and the agent's action. Hence, the reward is designed to evaluate the performance of the predicted distribution. To quantify it, the coverage width-based criterion (CWC) is introduced, a widely used criterion to measure the performance, i.e., the coverage, of the predicted probabilistic distribution [27]:

$$\mathrm{CWC} = \sum_{q \in Q_n} \frac{U_q - L_q}{nR}\left(1 - \gamma e^{-\eta(C_q - \mu)}\right) \tag{5.19}$$

where $U_q = \inf\{y \in \mathbb{R}: P\{y \ge U_q\} > \frac{q}{2}\}$ and $L_q = \sup\{y \in \mathbb{R}: P\{y \le L_q\} < \frac{q}{2}\}$ stand for the upper and lower quantiles, and $q \in Q_n$ represents the probability. $Q_n$ contains the $n$ probabilities used to calculate the CWC. $R = E_{max} - E_{min}$ represents the range of charging power in the dataset. Besides, $\mu \in (0, 1)$ and $\eta \in (0, 1)$ are two manually determined hyperparameters, and $C_q$ and $\gamma$ are calculated by the following formulations:

$$C_q = \begin{cases} 1, & \text{if } y_t \in [L_q, U_q] \\ 0, & \text{if } y_t \notin [L_q, U_q] \end{cases} \tag{5.20}$$

$$\gamma = \begin{cases} 0, & C_q = 1 \\ 1, & C_q = 0 \end{cases} \tag{5.21}$$

It can be learned from Eqs. (5.19) and (5.20) that $U_q - L_q$ stands for the sharpness of the distribution and $C_q$ represents its coverage. Here the CWC is concerned more with the quality of the predicted distribution, which is highly affected by the action. A distribution with better coverage and sharpness, i.e., $C_q = 1$ and a lower $U_q - L_q$, corresponds to a lower CWC. In this case, the reward is defined as $r = -\mathrm{CWC}$, which should be maximized by the agent.
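The reward just defined can be computed directly from the Gaussian predictive distribution; the sketch below evaluates Eqs. (5.19)–(5.21) over a set of central intervals and returns $r = -\mathrm{CWC}$. The probability set, $\mu$, $\eta$ and the observed value are illustrative assumptions.

```python
# Minimal sketch of the CWC-based reward of Eqs. (5.19)-(5.21) for a
# Gaussian forecast N(y_hat, variance).
import numpy as np
from scipy.stats import norm

def cwc_reward(y_true, y_hat, variance, q_set, power_range, mu=0.5, eta=0.5):
    sigma = np.sqrt(variance)
    n = len(q_set)
    cwc = 0.0
    for q in q_set:
        # Upper/lower quantiles of the central q-probability interval.
        u = norm.ppf(1.0 - (1.0 - q) / 2.0, loc=y_hat, scale=sigma)
        l = norm.ppf((1.0 - q) / 2.0, loc=y_hat, scale=sigma)
        c_q = 1.0 if l <= y_true <= u else 0.0          # Eq. (5.20)
        gamma = 0.0 if c_q == 1.0 else 1.0              # Eq. (5.21)
        cwc += (u - l) / (n * power_range) * (1.0 - gamma * np.exp(-eta * (c_q - mu)))
    return -cwc                                         # reward r = -CWC

# Example: three intervals (30%, 60%, 90%) on a hypothetical forecast.
print(cwc_reward(y_true=40.0, y_hat=42.0, variance=9.0,
                 q_set=[0.3, 0.6, 0.9], power_range=100.0))
```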


5.4.3 Proximal Policy Optimization

Based on the MDP, a recent reinforcement learning algorithm, called PPO, is introduced to train the agent for generating the optimal policy $\pi(a_k \mid s_k)$. The PPO contains two types of deep neural networks, termed the actor and the critic. The actor works as the agent and produces the action through a policy $\pi$ with parameters $\theta^{\pi}$. The critic, parameterized by $\theta^{Q}$, is used to evaluate the performance of the actor by approximating the value function $V^{\pi_k}(s_k)$ for the training of the actor, where $k$ indicates the training iteration. Since $V^{\pi_k}(s_k)$ is able to indicate the long-term influence of the policy, it is used for the training of the PPO.

The main target of the PPO is to train the actor and critic by learning from the experience tuples $\langle s_k, a_k, r_k, s_{k+1} \rangle$, which are obtained by the interaction between the actor and the environment. The actor generates a normal distribution $N(a_{k,mean}, a_{k,var})$ after observing $s_k$, and the action $a_k$ is sampled from this distribution, i.e., $a_k \sim N(a_{k,mean}, a_{k,var})$. This introduces randomness into the produced action and thus creates more diverse actions when observing the same state, which in turn leads to a diversity of rewards since they are related to the action. This mechanism is termed the exploration of the PPO and is used to enrich the experience tuples in order to prevent the actor from getting trapped in local optima. Then, based on the mean value $y_k$ predicted by the LSTM, the predicted probabilistic distribution $N(y_k, a_k)$ is constructed. Afterward, $r_k$ is calculated by the reward function to evaluate $N(y_k, a_k)$, and the state at the next time, $s_{k+1}$, is also stored. With the experience tuples, the parameters of the two networks are updated by the following equations:

$$\theta^{\pi}_{k+1} = \theta^{\pi}_{k} + \eta^{\pi} \nabla_{s,a\sim\pi_k} L^{CLIP} \tag{5.22}$$

$$\theta^{Q}_{k+1} = \theta^{Q}_{k} + \eta^{Q} \nabla_{\theta^{Q}_{k}} L^{Q} \tag{5.23}$$

where $\eta^{\pi}$ and $\eta^{Q}$ denote the learning rates of the actor and critic, respectively. $L^{CLIP}$ and $L^{Q}$ stand for the loss functions of the two networks, which are given by

$$L^{CLIP} = \mathbb{E}_{s,a\sim\pi_k}\left[\min\left(\frac{\pi_k(a \mid s)}{\pi_{k-1}(a \mid s)} A^{\pi_k}_{s,a},\ \operatorname{clip}\left(\frac{\pi_k(a \mid s)}{\pi_{k-1}(a \mid s)},\, 1-\epsilon,\, 1+\epsilon\right) A^{\pi_k}_{s,a}\right)\right] \tag{5.24}$$

$$L^{Q} = \mathbb{E}_{s,a\sim\pi_k}\left[\left(\gamma V^{\pi_k}(s_{k+1}) + r(s_k, a_k) - V^{\pi_k}(s_k)\right)^2\right] \tag{5.25}$$

where $\pi_{k-1}(a \mid s)$ denotes the policy of the $(k-1)$th iteration, and $\gamma$ is the discount rate. $\operatorname{clip}(t, t_{min}, t_{max})$ is the clip function: it returns $t_{max}$ if $t > t_{max}$, and $t_{min}$ if $t < t_{min}$. $A^{\pi_k}_{s,a}$ is the advantage of action $a$ under state $s$, represented by the difference between the value of $a$ and the averaged performance of the actor, formulated as follows:

$$A^{\pi_k}_{s,a} = Q(s_k, a_k) - V^{\pi_k}(s_k) \tag{5.26}$$

In Eq. (5.24), $\epsilon$ is a hyperparameter which is manually set to penalize larger $A^{\pi_k}_{s,a}$. The Q-function $Q(s_k, a_k)$ represents the discounted rewards, i.e., the long-term influence of the action, which is shown as follows:

$$Q(s_k, a_k) = r(s_k, a_k) + \sum_{i=1}^{t} \gamma^{t-i} r(s_{k-i}, a_{k-i}) \tag{5.27}$$
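To make the two objectives tangible, the following minimal NumPy sketch evaluates the clipped surrogate loss and the critic loss of Eqs. (5.24)–(5.25) on a small batch of experience. The array contents and hyperparameter values are illustrative assumptions, and gradient updates of Eqs. (5.22)–(5.23) would normally be handled by a deep learning framework.

```python
# Minimal sketch of the PPO losses in Eqs. (5.24)-(5.25).
import numpy as np

def ppo_losses(ratio, advantage, v_s, v_s_next, reward, eps=0.1, gamma=0.99):
    """ratio:      pi_k(a|s) / pi_{k-1}(a|s) for each sample
    advantage:  A^{pi_k}_{s,a} = Q(s,a) - V^{pi_k}(s), Eq. (5.26)
    v_s, v_s_next: critic values V(s_k), V(s_{k+1}); reward: r(s_k, a_k)."""
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    l_clip = np.mean(np.minimum(ratio * advantage, clipped * advantage))  # Eq. (5.24)
    td_error = gamma * v_s_next + reward - v_s
    l_q = np.mean(td_error ** 2)                                          # Eq. (5.25)
    return l_clip, l_q

# Example batch of four transitions (hypothetical numbers).
ratio = np.array([0.9, 1.2, 1.05, 0.8])
adv = np.array([0.5, -0.2, 0.1, 0.3])
print(ppo_losses(ratio, adv, v_s=np.array([1.0, 0.8, 1.1, 0.9]),
                 v_s_next=np.array([1.1, 0.7, 1.0, 1.2]),
                 reward=np.array([-0.4, -0.6, -0.3, -0.5])))
```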

5.5 Adaptive Exploration Proximal Policy Optimization

The actor of the PPO generates $a_{k,mean}$ and $a_{k,var}$ based on the state $s_k$. As discussed in the above section, the action $a_k$ is sampled from the distribution $N(a_{k,mean}, a_{k,var})$. The value of $a_{k,var}$ influences the exploration of the PPO. With the same $a_{k,mean}$, a larger $a_{k,var}$ represents a wider distribution: the choice of actions is broadened, and the diversity of rewards increases with it. In this case, the degree of exploration is enlarged as well, because a greater variety of experiences is imported. On the contrary, a smaller $a_{k,var}$ cannot provide richer experience because the action distribution is more concentrated and the probability of $a_k$ deviating from $a_{k,mean}$ is lower. In this situation, the PPO trains its actor and critic according to its previously generated experience, which is termed the process of exploitation.

Note that exploration and exploitation are usually contradictory. Ideally, a reinforcement learning algorithm focuses on exploration in the early stage of training to obtain abundant experiences. However, if the algorithm keeps a high level of exploration, it is hard for it to converge because of its high randomness. Therefore, the exploration should shrink as training proceeds, and the degree of exploitation should increase accordingly, so that the explored experiences are fully utilized to help the actor and critic converge.

However, the training process of the PPO differs from this ideal one because its exploration is uncontrollable. The $a_{k,var}$ of the PPO is generated by the actor, which is a neural network. Therefore, it depends on both the initial condition (set manually) and the inherent parameters of the actor (determined by the stochastic optimization algorithm). This means the determination of exploration is uncertain: the value of $a_{k,var}$ may shrink prematurely and cause higher exploitation based on the current experiences. Because of the lack of sufficient exploration, the diversity of the experiences may then be low, which may lead to the PPO getting trapped in local optima.

To tackle this issue, we propose an adaptive exploration mechanism for the PPO to dynamically balance its exploration and exploitation; the resulting algorithm is termed the adaptive exploration proximal policy optimization (AePPO). In AePPO, the action is sampled from $N(a_{k,mean}, a^{AePPO}_{k,var})$, where $a_{k,mean}$ is generated by the actor and $a^{AePPO}_{k,var}$ is determined by our proposed adaptive exploration mechanism.


Fig. 5.4 a Sampling of reward. b Changes of parameter with respect to iterations, reprinted from Ref. [26], copyright 2022, with permission from IEEE

Using such a mechanism, we aim to increase the exploration in the earlier training stage of AePPO and gradually shrink it to enhance the exploitation of the experiences. It is inadvisable to achieve this target by simply reducing the value of $a^{AePPO}_{k,var}$ along a fixed training schedule, because different actions may produce the same performance, i.e., the same reward, which may limit the expansion of the experiences. Instead, $a^{AePPO}_{k,var}$ should be adjusted according to the performance of the reward.

Consider $N_a$ actions sampled from $N(a_{k,mean}, a^{AePPO}_{k,var})$ under the same $s_k$; their corresponding rewards may vary because their values are related to the action. Since $a_k \sim N(a_{k,mean}, a^{AePPO}_{k,var})$ follows a normal distribution, we assume the variation of the rewards is subject to a normal distribution $N(r_k^{mean}, r_k^{var})$ as well, where $r_k^{mean}$ and $r_k^{var}$ represent the mean and variance of this distribution, respectively. As shown in Fig. 5.4a, the two parameters can be estimated by the following equations:

$$r_k^{mean} = \frac{1}{N_a}\sum_{i=1}^{N_a} r(s_k, a_k^i) \tag{5.28}$$

$$r_k^{var} = \frac{1}{N_a}\sum_{i=1}^{N_a}\left(r(s_k, a_k^i) - r_k^{mean}\right)^2 \tag{5.29}$$

$$a_k^i \in \left(a_k^1, a_k^2, \ldots, a_k^i, \ldots, a_k^{N_a}\right) \tag{5.30}$$

where $(a_k^1, a_k^2, \ldots, a_k^i, \ldots, a_k^{N_a})$ indicates the $N_a$ actions sampled from $N(a_{k,mean}, a^{AePPO}_{k,var})$.

Note that $r_k^{var}$ represents the exploration of the PPO and is influenced by $a^{AePPO}_{k,var}$: if $r_k^{var}$ is enlarged by changing $a^{AePPO}_{k,var}$, the diversity of experiences is enriched, i.e., the exploration ability of the PPO is enhanced. Besides, since $r_k^{mean}$ evaluates the performance of the PPO, a higher $r_k^{mean}$ also means improved convergence. In this case, the focus on exploration and exploitation should be changed during the training.


This process can be represented by the following equation:

$$f_k^{exp} = b_{k,1} r_k^{mean} + b_{k,2} r_k^{var} \tag{5.31}$$

where $b_{k,1} < 1$ and $b_{k,2} = 1 - \sqrt{1 - (b_{k,1} - 1)^2}$ denote two parameters which represent the focus on $r_k^{mean}$ and $r_k^{var}$, respectively. As shown in Fig. 5.4b, the values of these parameters vary with the training iteration $k$. In the early stage of the training, $b_{k,2}$ is larger than $b_{k,1}$, which means $f_k^{exp}$ is highly influenced by $r_k^{var}$. Then, $b_{k,1}$ increases as the training proceeds, and thus $r_k^{var}$ gradually decreases. This change represents the shifting of focus. When $b_{k,1} \le b_{k,2}$, the value of $f_k^{exp}$ is more influenced by $r_k^{var}$, which means that a larger $f_k^{exp}$ corresponds to a higher $r_k^{var}$ and thus strengthens the exploration. On the other hand, when $b_{k,1} > b_{k,2}$, raising $f_k^{exp}$ mainly increases $r_k^{mean}$; the focus of $f_k^{exp}$ has changed from raising the exploration to increasing the average performance of the PPO, i.e., enhancing the exploitation. In this case, $f_k^{exp}$ should be maximized by adjusting $a^{AePPO}_{k,var}$ during the training of AePPO; i.e., $-f_k^{exp}$ should be minimized.

Since the relationship between $a^{AePPO}_{k,var}$ and $f_k^{exp}$ is difficult to model, we introduce particle swarm optimization (PSO), a well-known optimization algorithm, to search for the $a^{AePPO}_{k,var}$ that minimizes $-f_k^{exp}$ [28]. The PSO contains $N_p$ particles, each of which stands for a possible $a^{AePPO}_{k,var}$. The position of the $l$th particle at the $t$th iteration is denoted as $x_l^t$ and represents a candidate value of the search target. The particles are updated according to the following equations:

$$v_l^{t+1} = \omega v_l^t + e_1 \times \mathrm{rand}_1 \times (p^{best,t} - x_l^t) + e_2 \times \mathrm{rand}_2 \times (g^{best} - x_l^t) \tag{5.32}$$

$$x_l^{t+1} = x_l^t + v_l^{t+1} \tag{5.33}$$

where $\omega \in (0, 1)$, and $e_1$ and $e_2$ denote the acceleration weights. $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are random numbers sampled from the standard normal distribution $N(0, 1)$. $p^{best,t}$ indicates the local optimal position at the $t$th iteration and $g^{best}$ stands for the global optimal position. After $N_{var}$ iterations, the global optimal position is taken as the variance of the action, i.e., $a^{AePPO}_{k,var} = g^{best}$. The pseudocode of AePPO is provided in the Appendix.
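The sketch below illustrates this adaptive-exploration search: PSO looks for the action variance that maximizes $f_k^{exp}$ of Eq. (5.31) (equivalent to minimizing $-f_k^{exp}$), with $r_k^{mean}$ and $r_k^{var}$ estimated from $N_a$ sampled actions as in Eqs. (5.28)–(5.30) and the particle updates of Eqs. (5.32)–(5.33). The reward function, search bounds and the fixed coefficients $b_1$, $b_2$ are illustrative assumptions, not values from the chapter.

```python
# Minimal sketch of the PSO-based adaptive exploration of Eqs. (5.28)-(5.33).
import numpy as np

rng = np.random.default_rng(0)

def f_exp(a_var, a_mean, reward_fn, b1, b2, n_a=20):
    actions = rng.normal(a_mean, np.sqrt(a_var), size=n_a)   # Eq. (5.30)
    rewards = np.array([reward_fn(a) for a in actions])
    # Eq. (5.31), with mean/variance estimates of Eqs. (5.28)-(5.29).
    return b1 * rewards.mean() + b2 * rewards.var()

def pso_search(fitness, n_particles=20, n_iter=100, lo=1e-3, hi=10.0,
               w=0.7, e1=2.0, e2=2.0):
    x = rng.uniform(lo, hi, n_particles)      # candidate values of a_var
    v = np.zeros(n_particles)
    p_best, p_val = x.copy(), np.array([fitness(xi) for xi in x])
    g_best = p_best[np.argmax(p_val)]
    for _ in range(n_iter):
        # Random factors (the chapter samples them from N(0, 1)).
        r1 = rng.standard_normal(n_particles)
        r2 = rng.standard_normal(n_particles)
        v = w * v + e1 * r1 * (p_best - x) + e2 * r2 * (g_best - x)  # Eq. (5.32)
        x = np.clip(x + v, lo, hi)                                   # Eq. (5.33)
        val = np.array([fitness(xi) for xi in x])
        improved = val > p_val
        p_best[improved], p_val[improved] = x[improved], val[improved]
        g_best = p_best[np.argmax(p_val)]
    return g_best                              # used as a_var^{AePPO}

# Example with a toy reward that prefers actions near 1.0 and b1 = b2 = 0.5.
toy_reward = lambda a: -abs(a - 1.0)
best_var = pso_search(lambda av: f_exp(av, a_mean=1.0, reward_fn=toy_reward,
                                       b1=0.5, b2=0.5))
```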


Fig. 5.5 Interconnection between LSTM and AePPO, reprinted from Ref. [26], copyright 2022, with permission from IEEE

In conclusion, based on Sects. 5.4 and 5.5, the interconnection between the LSTM and the proposed AePPO is described as follows. As illustrated in the upper part of Fig. 5.5, the LSTM takes the input data obtained by the data transformer and performs its internal calculation according to Eqs. (5.12)–(5.18). The output of the LSTM is the predicted mean value $\hat{E}_{k+1}$ of the EVCS charging power. Then, the cell state of the LSTM is extracted and serves as the state $s_k$ of AePPO. The lower part of Fig. 5.5 shows the process of an AePPO training iteration. The square with a yellow background in this figure represents the update mechanism of the actor and critic according to Eqs. (5.22)–(5.26), and the area with a violet background represents the procedure of adaptive exploration. After the initialization of the PSO, the fitness of each particle is calculated by Eq. (5.31). Then, the particles adjust their positions according to Eqs. (5.32) and (5.33), aiming to minimize the fitness $-f_k^{exp}$. After reaching the stopping condition, the $g^{best}$ of the PSO is taken as $a^{AePPO}_{k,var}$, representing the adaptive exploration of AePPO. Afterward, the variance of the predicted distribution $\delta_{k+1}$ is sampled from $N(a_{k,mean}, a^{AePPO}_{k,var})$, where $a_{k,mean}$ is determined by the actor of AePPO. Finally, based on Eq. (5.1), the probabilistic prediction distribution of the EVCS charging power is represented by $\hat{P}_{t+1} = N(\hat{E}_{t+1}, \delta_f)$: the LSTM determines the mean value of this distribution, and AePPO provides its variance. Then, the probabilistic prediction of the EVCS charging power is obtained.


5.6 Case Study

5.6.1 Data Description and Experiential Initialization

In this part, we conduct a case study to verify the effectiveness of the proposed algorithm, i.e., LSTM-AePPO. It is based on the charging session data collected from the ACN dataset [29], an open dataset for EVCS charging research. It contains data collected from two EVCSs, one at Caltech, Pasadena, and the other located at the Jet Propulsion Laboratory (JPL), La Canada, USA. The EVCS at Caltech includes 54 chargers, and the one at JPL includes 50 in total. In our experiments, the charging session data from June 1st, 2018, to June 1st, 2020, are selected as the historical data to train LSTM-AePPO; 25916 and 22128 charging sessions are included in the cases of Caltech and JPL during this period, respectively.

In order to demonstrate the superior performance of LSTM-AePPO, state-of-the-art algorithms are introduced for comparison, i.e., support vector quantile regression (QRSVM) [30], linear quantile regression (QR) [31] and gradient boosting quantile regression (GBQR) [32]. Since the charging behavior varies across seasons, the performance and metric comparisons are conducted on the four seasons separately. Moreover, to further verify the advantage of the proposed AePPO, the original PPO is used to replace it in LSTM-AePPO for comparison, and the resulting algorithm is termed LSTM-PPO.

To compare the performance of the above-mentioned algorithms, error metrics need to be used. The evaluation of probabilistic forecast results should consider the coverage, variation and reliability. Therefore, we adopt three metrics, i.e., Winkler [25], CWC [27] and Pinball [33]. In this paper, the features of the previous $N_h$ hours $[x_{i-1}, \ldots, x_{i-N_h}]$ are used in the probabilistic forecasting of the EVCS charging power at hour $i$; $N_h$ is set to 12 in the following experiments. The corresponding hyperparameters of LSTM-AePPO are given in Table 5.1. The experiments are conducted on a computer with 8 GB RAM and an Intel i5-8265U CPU, and implemented in Python.

5.6.2 The Performance of Probabilistic Forecasting Obtained by LSTM-AePPO

In this section, we compare the performance of LSTM-AePPO with QR, QRSVM, GBQR and LSTM-PPO on the charging session data collected from Caltech and JPL. Since the charging session data are not continuous in time, the data transformer is applied to convert them into a time-series format; the interval between two adjacent timestamps is 1 h, and the time series lasts for 17525 h. The probabilistic forecast distributions of EVCS charging power obtained by LSTM-AePPO are shown in Figs. 5.6 and 5.7, which illustrate the results for Caltech and JPL on 3 test days (72 h).


Table 5.1 Hyperparameters of the proposed LSTM-AePPO, reprinted from Ref. [26], copyright 2022, with permission from IEEE

| Symbol | Description | Value |
|---|---|---|
| η^π, η^Q | The learning rate of actor and critic | 0.0001 |
| η_lstm | The learning rate of LSTM | 0.001 |
| ε | The clip value of actor loss | 0.1 |
| γ | The discount factor | 0.99 |
| N_T | The maximum training number of AePPO | 40000 |
| N_pso | The population of PSO | 20 |
| N_var | The maximum iteration of PSO | 100 |
| e_1, e_2 | The learning factor of PSO | 2 |
| ω | The inertia weight of PSO | 0.7 |

Fig. 5.6 Probabilistic forecasting of Caltech, obtained by the proposed LSTM-AePPO framework, reprinted from Ref. [26], copyright 2022, with permission from IEEE

In both figures, subfigures (a), (b), (c) and (d) show the results of Spring, Summer, Autumn and Winter, respectively. Specifically, we use the q% prediction interval (PI) to denote the interval between the lower and upper quantiles at probability q. A darker color denotes a PI with a smaller q; consequently, the 30% PI is shown with the darkest color while the 90% PI corresponds to the lightest. Note that the real charging power is represented by the red line in these figures.

In the case of Caltech, the peak demand hour of a typical day varies across the seasons. From Fig. 5.6a, c, we learn that the peak demands emerge around the 18th and 42nd hours.


Fig. 5.7 Probabilistic forecasting of JPL obtained by the proposed LSTM-AePPO framework, reprinted from Ref. [26], copyright 2022, with permission from IEEE

However, as shown in Fig. 5.6b, d, the peak charging demand during Summer and Winter appears at the 12th, 42nd and 65th hours of the 3 test days, which is more frequent than in Spring and Autumn. Besides, as shown in Fig. 5.7, the occurrence of the peak demand is more irregular in the case of JPL. In Spring, the charging demand increases and reaches its maximum value at the 15th and 40th hours, whereas this phenomenon only happens at the 63rd hour in Summer. On the contrary, the peak demand emerges earlier in Autumn, i.e., at the 1st and 20th hours. In addition, the peak demand in the Winter case of JPL is more frequent, occurring 3 times during the 3 test days, namely at the 15th, 40th and 62nd hours. The above description explains the irregularity of the peak demand, which is a challenge for forecasting. Nevertheless, LSTM-AePPO achieves an excellent result under this circumstance: the fluctuation of the probabilistic forecast distribution captures the EVCS charging power. As shown in Figs. 5.6 and 5.7, most of the values on the red lines are covered by the 90% PI, which means that the probabilistic forecast distribution can represent the variation of the real charging power.

5.6.3 Metrics Comparison Among Different Algorithms

To verify the effectiveness of LSTM-AePPO, we have introduced CWC, Pinball and Winkler as metrics for comparison, and the results are shown in Table 5.2. Since the width of the PI affects these metrics, it is meaningless to compare the metrics across different PIs. Therefore, we compare the algorithms under the same PI to demonstrate the effectiveness of the proposed algorithm.

Table 5.2 The seasonal metric comparison of two sites under different PI, reprinted from Ref. [26], copyright 2022, with permission from IEEE

CWC

| Location | Season | PI(%) | QR | QRSVM | GBQR | LSTM-PPO | LSTM-AePPO |
|---|---|---|---|---|---|---|---|
| Caltech | Spring | 30 | 5.380 | 5.050 | 5.710 | 5.107 | 4.584 |
| Caltech | Spring | 60 | 12.990 | 15.390 | 17.426 | 14.162 | 12.592 |
| Caltech | Spring | 90 | 27.937 | 40.199 | 25.844 | 17.563 | 12.660 |
| Caltech | Summer | 30 | 18.780 | 11.125 | 14.890 | 12.319 | 10.044 |
| Caltech | Summer | 60 | 45.033 | 32.828 | 33.886 | 11.591 | 9.123 |
| Caltech | Summer | 90 | 55.510 | 46.330 | 47.081 | 30.303 | 22.875 |
| Caltech | Autumn | 30 | 3.412 | 5.983 | 10.488 | 8.747 | 5.232 |
| Caltech | Autumn | 60 | 14.719 | 19.360 | 29.290 | 6.666 | 6.302 |
| Caltech | Autumn | 90 | 29.461 | 50.743 | 38.936 | 26.390 | 17.199 |
| Caltech | Winter | 30 | 7.200 | 5.722 | 6.245 | 4.994 | 4.868 |
| Caltech | Winter | 60 | 21.863 | 15.868 | 16.061 | 8.440 | 8.349 |
| Caltech | Winter | 90 | 37.367 | 32.714 | 27.038 | 32.327 | 15.527 |
| JPL | Spring | 30 | 4.988 | 5.108 | 6.246 | 8.777 | 10.784 |
| JPL | Spring | 60 | 40.591 | 32.068 | 32.143 | 15.832 | 15.008 |
| JPL | Spring | 90 | 78.246 | 78.695 | 52.858 | 37.460 | 19.382 |
| JPL | Summer | 30 | 5.782 | 2.921 | 3.984 | 7.234 | 5.647 |
| JPL | Summer | 60 | 54.670 | 42.148 | 29.222 | 12.567 | 10.515 |
| JPL | Summer | 90 | 93.164 | 64.690 | 50.851 | 29.462 | 15.922 |
| JPL | Autumn | 30 | 1.468 | 3.310 | 5.946 | 9.721 | 4.973 |
| JPL | Autumn | 60 | 24.339 | 40.695 | 23.516 | 21.823 | 20.835 |
| JPL | Autumn | 90 | 57.711 | 64.316 | 49.832 | 47.496 | 27.844 |
| JPL | Winter | 30 | 7.115 | 3.700 | 4.568 | 6.995 | 2.591 |
| JPL | Winter | 60 | 48.499 | 33.461 | 29.447 | 13.151 | 10.170 |
| JPL | Winter | 90 | 98.002 | 80.627 | 58.370 | 27.185 | 19.457 |

Winkler

| Location | Season | PI(%) | QR | QRSVM | GBQR | LSTM-PPO | LSTM-AePPO |
|---|---|---|---|---|---|---|---|
| Caltech | Spring | 30 | 101.111 | 82.002 | 83.173 | 28.210 | 22.761 |
| Caltech | Spring | 60 | 49.890 | 42.770 | 41.644 | 21.203 | 18.757 |
| Caltech | Spring | 90 | 19.440 | 18.530 | 18.257 | 9.817 | 8.429 |
| Caltech | Summer | 30 | 160.830 | 124.271 | 146.475 | 49.833 | 20.463 |
| Caltech | Summer | 60 | 59.479 | 55.384 | 61.368 | 34.910 | 28.665 |
| Caltech | Summer | 90 | 22.033 | 22.807 | 22.818 | 15.493 | 12.888 |
| Caltech | Autumn | 30 | 102.738 | 55.510 | 68.693 | 34.364 | 28.028 |
| Caltech | Autumn | 60 | 44.214 | 21.009 | 24.376 | 23.205 | 21.851 |
| Caltech | Autumn | 90 | 15.845 | 11.978 | 11.375 | 8.941 | 6.040 |
| Caltech | Winter | 30 | 103.079 | 106.389 | 100.884 | 22.920 | 21.999 |
| Caltech | Winter | 60 | 43.839 | 51.392 | 44.749 | 16.482 | 12.933 |
| Caltech | Winter | 90 | 11.602 | 12.615 | 12.922 | 7.178 | 4.129 |
| JPL | Spring | 30 | 205.529 | 164.939 | 163.276 | 13.540 | 8.448 |
| JPL | Spring | 60 | 73.123 | 66.488 | 56.467 | 5.287 | 4.958 |
| JPL | Spring | 90 | 18.393 | 15.832 | 16.386 | 6.830 | 8.867 |
| JPL | Summer | 30 | 128.946 | 124.318 | 110.119 | 12.716 | 11.039 |
| JPL | Summer | 60 | 41.654 | 45.414 | 39.932 | 8.927 | 5.961 |
| JPL | Summer | 90 | 17.674 | 16.356 | 16.626 | 2.816 | 2.292 |
| JPL | Autumn | 30 | 73.408 | 64.361 | 45.189 | 4.913 | 3.340 |
| JPL | Autumn | 60 | 18.543 | 18.798 | 17.435 | 3.430 | 3.026 |
| JPL | Autumn | 90 | 14.978 | 14.993 | 14.993 | 3.085 | 3.056 |
| JPL | Winter | 30 | 292.522 | 282.652 | 258.627 | 13.900 | 11.533 |
| JPL | Winter | 60 | 113.864 | 140.090 | 117.323 | 10.050 | 4.568 |
| JPL | Winter | 90 | 16.970 | 28.998 | 27.530 | 4.673 | 8.403 |

Pinball

| Location | Season | PI(%) | QR | QRSVM | GBQR | LSTM-PPO | LSTM-AePPO |
|---|---|---|---|---|---|---|---|
| Caltech | Spring | 30 | 2.480 | 2.002 | 2.116 | 1.558 | 1.501 |
| Caltech | Spring | 60 | 2.332 | 1.877 | 1.979 | 1.820 | 1.524 |
| Caltech | Spring | 90 | 2.002 | 1.638 | 1.683 | 2.389 | 1.297 |
| Caltech | Summer | 30 | 5.176 | 3.572 | 4.470 | 2.901 | 2.821 |
| Caltech | Summer | 60 | 4.607 | 3.344 | 3.970 | 2.957 | 2.650 |
| Caltech | Summer | 90 | 3.822 | 2.844 | 3.299 | 3.390 | 2.764 |
| Caltech | Autumn | 30 | 2.389 | 1.661 | 2.400 | 1.046 | 0.796 |
| Caltech | Autumn | 60 | 2.252 | 1.729 | 2.434 | 1.001 | 0.967 |
| Caltech | Autumn | 90 | 1.911 | 1.558 | 2.070 | 3.048 | 2.036 |
| Caltech | Winter | 30 | 2.741 | 2.582 | 2.514 | 1.388 | 1.263 |
| Caltech | Winter | 60 | 2.639 | 2.366 | 2.298 | 2.002 | 1.615 |
| Caltech | Winter | 90 | 2.195 | 1.979 | 1.900 | 2.741 | 1.956 |
| JPL | Spring | 30 | 4.748 | 3.745 | 3.894 | 6.201 | 4.763 |
| JPL | Spring | 60 | 4.688 | 3.595 | 3.834 | 5.018 | 3.894 |
| JPL | Spring | 90 | 3.939 | 3.100 | 3.235 | 3.685 | 1.573 |
| JPL | Summer | 30 | 3.100 | 2.696 | 2.621 | 4.044 | 4.044 |
| JPL | Summer | 60 | 3.774 | 2.936 | 2.966 | 3.640 | 3.145 |
| JPL | Summer | 90 | 3.190 | 2.606 | 2.561 | 4.958 | 0.704 |
| JPL | Autumn | 30 | 1.498 | 1.393 | 1.183 | 9.916 | 3.670 |
| JPL | Autumn | 60 | 1.962 | 1.842 | 1.827 | 9.017 | 3.610 |
| JPL | Autumn | 90 | 1.902 | 1.752 | 1.693 | 9.481 | 1.333 |
| JPL | Winter | 30 | 6.830 | 6.291 | 5.856 | 3.026 | 2.396 |
| JPL | Winter | 60 | 6.426 | 5.976 | 5.437 | 3.175 | 2.247 |
| JPL | Winter | 90 | 5.287 | 4.988 | 4.538 | 2.591 | 1.917 |

Bold metric indicates the strongest performance among all the algorithms


Fig. 5.8 Comparisons of CWC, Pinball and Winkler in the case of Caltech among the five mentioned algorithms

For example, in the Summer case of JPL, the CWC, Winkler and Pinball of GBQR at 90% PI are 50.851, 16.626 and 2.561, which are about 3.2, 7.3 and 3.7 times those of LSTM-AePPO (CWC = 15.922, Winkler = 2.292, Pinball = 0.704). Besides, when comparing them at 60% PI, the CWC of LSTM-AePPO, i.e., 10.515, is lower than 54.670, 42.148, 29.222 and 12.567, which are the values for QR, QRSVM, GBQR and LSTM-PPO, respectively. This means that the forecast distribution of LSTM-AePPO has better sharpness. In addition, the Winkler and Pinball of LSTM-AePPO are 5.961 and 3.145 at 60% PI, which are smaller than the values obtained by the other algorithms, indicating better variation and reliability of the forecast distribution. The effectiveness is also verified in the Winter case at Caltech: in addition to the 90% PI, the metrics of LSTM-AePPO are lower than those of the comparison algorithms at 60% PI, and under 30% PI the CWC and Winkler of LSTM-AePPO are 4.868 and 21.999, which are lower than the corresponding metrics of GBQR, i.e., 6.245 and 100.884.

Because of the large amount of comparison data, we mark the best result of each case by bolding the corresponding number in Table 5.2. It can be seen that LSTM-AePPO outperforms the other algorithms on CWC, Winkler and Pinball in most cases. Note that LSTM-AePPO fails to perform the best in only a few rare cases for each PI. For instance, as shown in Table 5.2, nearly all metrics of LSTM-AePPO outperform the others except two cases under 90% PI, i.e., the Winkler of the JPL Winter case (8.403 for LSTM-AePPO but 4.673 for LSTM-PPO) and the Pinball of the Caltech Winter case (1.956 for LSTM-AePPO but 1.900 for GBQR). As the three metrics evaluate different characteristics of the forecast distribution, a numerical comparison on only one metric may not be comprehensive. In this case, we use a three-axis radar chart to comprehensively compare the performance of these algorithms, in which each axis represents one metric under 90% PI, as shown in Figs. 5.8 and 5.9.


Fig. 5.9 Comparisons of CWC, Pinball and Winkler in the case of JPL among the five mentioned algorithms

We use the relative area of each algorithm for comparison [34]. Since the area of the colored triangle is determined by the three metrics, the relative area can assess the coverage, variation and reliability of the probabilistic forecast distribution simultaneously. From the above results, we can see that LSTM-AePPO performs the best at 90% PI compared with the other algorithms, which demonstrates the effectiveness of the proposed LSTM-AePPO.

5.6.4 The Effectiveness of AePPO

In order to illustrate the effectiveness of the proposed adaptive exploration mechanism in LSTM-AePPO, we compare the reward curves of PPO and AePPO during the 40,000 training iterations in the two cases, as shown in Fig. 5.10a, b. In Fig. 5.10a, the reward of PPO increases during the first 4000 iterations before decreasing dramatically. Similarly, as shown in Fig. 5.10b, the reward of PPO gradually decreases after 2500 iterations. On the contrary, the reward of AePPO increases and surpasses that of PPO after around the 25,300th and the 28,000th iterations in the cases of Caltech and JPL, as shown in Fig. 5.10a, b, respectively. In the case of Caltech, the reward of PPO increases from around −1.5 to −0.9 during the first 5000 iterations, while the reward of AePPO stays at a lower level, rising from −6 to −5. The same situation also appears in the case of JPL: the reward of PPO increases from −1.25 to 0.85 in the early training, whereas the corresponding value of AePPO is much lower.


Fig. 5.10 Comparison between the reward of PPO and AePPO during the training in the case of a Caltech and b JPL

This is because the adaptive exploration mechanism focuses on exploration in the early stage of the training: the action obtained by AePPO has higher randomness in order to enrich the experience. Moreover, the exploration of PPO decreases due to the premature convergence of its actor. Since the diversity of its experience is not plentiful enough, PPO may fall into local optima because of inadequate exploration. Therefore, the adaptive exploration mechanism strengthens the accumulation of experience of AePPO in the early stage, makes full use of this gathered experience during the training, and finally achieves a higher reward than PPO. In addition, the fluctuation of the reward curves obtained by LSTM-AePPO is also smaller in the two cases. This illustrates that the proposed AePPO provides a more stable and superior performance than the original PPO.

5.7 Conclusion

This chapter has proposed a reinforcement learning assisted deep learning probabilistic forecast framework for the charging power of EVCS. The framework contains a data transformer method to preprocess the charging session data and a probabilistic forecast algorithm, termed LSTM-AePPO. In this framework, the LSTM is used to forecast the mean value of the forecast distribution, and the variation of its cell state is modeled as an MDP. Then, a reinforcement learning algorithm, AePPO, is applied to solve the MDP model and calculate the variance of the forecast distribution. In addition, to balance exploration and exploitation, we further propose an adaptive exploration mechanism to enrich the diversity of the experiences and prevent the premature convergence of AePPO. We have conducted case studies to demonstrate the superior performance of LSTM-AePPO. The experiments use the charging session data collected from the EVCSs of Caltech and JPL, respectively. Compared with the other algorithms, namely QR, QRSVM and GBQR, the CWC, Winkler and Pinball metrics demonstrate that LSTM-AePPO is more effective, yielding a probabilistic


forecast distribution with better coverage, variation and reliability. Moreover, the comparison between the reward curves of PPO and AePPO indicates that the adaptive exploration mechanism is effective in balancing the exploration and the exploitation of reinforcement learning. That is, our proposed AePPO can reach a higher reward than the PPO during the training.

References

1. M.J. Sanjari, H.B. Gooi, Probabilistic forecast of PV power generation based on higher order Markov chain. IEEE Trans. Power Syst. 32(4), 2942–2952 (2017)
2. T. Hong, P. Pinson, Y. Wang, R. Weron, D. Yang, H. Zareipour, Energy forecasting: a review and outlook. IEEE Open Access J. Power Energy 7, 376–388 (2020)
3. C. Wan, J. Lin, J. Wang, Y. Song, Z. Yang Dong, Direct quantile regression for nonparametric probabilistic forecasting of wind power generation. IEEE Trans. Power Syst. 32(4), 2767–2778 (2017)
4. J. Yan, H. Zhang, Y. Liu, S. Han, L. Li, L. Zongxiang, Forecasting the high penetration of wind power on multiple scales using multi-to-multi mapping. IEEE Trans. Power Syst. 33(3), 3276–3284 (2018)
5. W. Liu, C. Ren, X. Yan, PV generation forecasting with missing input data: a super-resolution perception approach. IEEE Trans. Sustain. Energy 12(2), 1493–1496 (2021)
6. L. Liu, F. Kong, X. Liu, Y. Peng, Q. Wang, A review on electric vehicles interacting with renewable energy in smart grid. Renew. Sustain. Energy Rev. 51, 648–661 (2015)
7. Y. Shi, H. Duong Tuan, A.V. Savkin, T.Q. Duong, H. Vincent Poor, Model predictive control for smart grids with multiple electric-vehicle charging stations. IEEE Trans. Smart Grid 10(2), 2127–2136 (2019)
8. K. Chaudhari, N.K. Kandasamy, A. Krishnan, A. Ukil, H.B. Gooi, Agent-based aggregated behavior modeling for electric vehicle charging load. IEEE Trans. Industr. Inform. 15(2), 856–868 (2019)
9. B. Wang, P. Dehghanian, S. Wang, M. Mitolo, Electrical safety considerations in large-scale electric vehicle charging stations. IEEE Trans. Industr. Appl. 55(6), 6603–6612 (2019)
10. S. Cheng, Z. Wei, D. Shang, Z. Zhao, H. Chen, Charging load prediction and distribution network reliability evaluation considering electric vehicles' spatial-temporal transfer randomness. IEEE Access 8, 124084–124096 (2020)
11. L. Chen, F. Yang, Q. Xing, S. Wu, R. Wang, J. Chen, Spatial-temporal distribution prediction of charging load for electric vehicles based on dynamic traffic information, in 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (2020), pp. 1269–1274
12. Y. Zheng, Z. Shao, Y. Zhang, L. Jian, A systematic methodology for mid-and-long term electric vehicle charging load forecasting: the case study of Shenzhen, China. Sustain. Cities Soc. 56, 102084 (2020)
13. H.J. Feng, L.C. Xi, Y.Z. Jun, Y.X. Ling, H. Jun, Review of electric vehicle charging demand forecasting based on multi-source data, in 2020 IEEE Sustainable Power and Energy Conference (2020), pp. 139–146
14. H.M. Louie, Time-series modeling of aggregated electric vehicle charging station load. Electr. Power Components Syst. 45(14), 1498–1511 (2017)
15. X. Zhang, Short-term load forecasting for electric bus charging stations based on fuzzy clustering and least squares support vector machine optimized by wolf pack algorithm. Energies 11(6), 1449 (2018)
16. A. Almaghrebi, F. Aljuheshi, M. Rafaie, K. James, M. Alahmad, Data-driven charging demand prediction at public charging stations using supervised machine learning regression methods. Energies 13(16), 4231 (2020)
17. J. Zhu, Z. Yang, M. Mourshed, Y. Guo, Y. Zhou, Y. Chang, Y. Wei, S. Feng, Electric vehicle charging load forecasting: a comparative study of deep learning approaches. Energies 12(14), 2692 (2019)
18. M. Xue, L. Wu, Q.P. Zhang, J.X. Lu, X. Mao, Y. Pan, Research on load forecasting of charging station based on XGBoost and LSTM model. J. Phys. Conf. Ser. 1757, 012145 (2021)
19. Y. Kim, S. Kim, Forecasting charging demand of electric vehicles using time-series models. Energies 14(5), 1487 (2021)
20. L. Buzna, P. De Falco, G. Ferruzzi, S. Khormali, D. Proto, N. Refa, M. Straka, G. van der Poel, An ensemble methodology for hierarchical probabilistic electric vehicle load forecasting at regular charging stations. Appl. Energy 283, 116337 (2021)
21. X. Zhang, K.W. Chan, H. Li, H. Wang, J. Qiu, G. Wang, Deep-learning-based probabilistic forecasting of electric vehicle charging load with a novel queuing model. IEEE Trans. Cybern. 51(6), 3157–3170 (2021)
22. L. Zhu, N. Laptev, Deep and confident prediction for time series at Uber, in 2017 IEEE International Conference on Data Mining Workshops (2017), pp. 103–110
23. C. Wan, Z. Xu, P. Pinson, Z.Y. Dong, K.P. Wong, Probabilistic forecasting of wind power generation using extreme learning machine. IEEE Trans. Power Syst. 29(3), 1033–1044 (2014)
24. A. Khosravi, S. Nahavandi, D. Creighton, A.F. Atiya, Comprehensive review of neural network-based prediction intervals and new advances. IEEE Trans. Neural Netw. 22(9), 1341–1356 (2011)
25. M. Sun, T. Zhang, Y. Wang, G. Strbac, C. Kang, Using Bayesian deep learning to capture uncertainty for residential net load forecasting. IEEE Trans. Power Syst. 35(1), 188–201 (2020)
26. Y. Li, S. He, Y. Li, L. Ge, S. Lou, Z. Zeng, Probabilistic charging power forecast of EVCS: reinforcement learning assisted deep learning approach. IEEE Trans. Intell. Vehic. 1 (2022)
27. A. Khosravi, S. Nahavandi, D. Creighton, A.F. Atiya, Lower upper bound estimation method for construction of neural network-based prediction intervals. IEEE Trans. Neural Netw. 22(3), 337–346 (2011)
28. S. Kachroudi, M. Grossard, N. Abroug, Predictive driving guidance of full electric vehicles using particle swarm optimization. IEEE Trans. Vehic. Technol. 61(9), 3909–3919 (2012)
29. Z.J. Lee, T. Li, S.H. Low, ACN-data: analysis and applications of an open EV charging dataset, in Proceedings of the Tenth International Conference on Future Energy Systems (2019), pp. 139–149
30. Y. He, R. Liu, H. Li, S. Wang, L. Xiaofen, Short-term power load probability density forecasting method using kernel-based support vector quantile regression and copula theory. Appl. Energy 185, 254–266 (2017)
31. T. Hong, P. Wang, H. Lee Willis, A naïve multiple linear regression benchmark for short term load forecasting, in 2011 IEEE Power and Energy Society General Meeting (2011), pp. 1–6
32. Y. Wang, N. Zhang, Y. Tan, T. Hong, D.S. Kirschen, C. Kang, Combining probabilistic load forecasts. IEEE Trans. Smart Grid 10(4), 3664–3674 (2019)
33. Y. Wang, D. Gan, M. Sun, N. Zhang, L. Zongxiang, C. Kang, Probabilistic individual load forecasting using pinball loss guided LSTM. Appl. Energy 235, 10–20 (2019)
34. S. Jie Wang, L. Hou, L. Jay, B.U. Xiangjian, Evaluating wheel loader operating conditions based on radar chart. Autom. Constr. 84, 42–49 (2017)

Chapter 6

Dense Skip Attention-Based Deep Learning for Day-Ahead Electricity Price Forecasting with a Drop-Connected Structure

6.1 Introduction

The fluctuation of electricity prices affects the allocation and dispatch of power resources in the electricity market [1–3]. Therefore, decision-makers often rely significantly on day-ahead electricity price forecasting (DAEPF) in order to implement optimal strategies in the market and maximize profits [3–5]. Furthermore, with the gradual expansion of the liberalized electricity market, the power exchange between bidding areas will rise continuously. It would then bring economic losses to the market if the forecasting accuracy of the day-ahead electricity price is not further improved. Thus, more accurate DAEPF models are still needed.

Indeed, various studies focus on DAEPF, and they can be classified into two main categories, i.e., statistical and artificial intelligence techniques. The former category forecasts the day-ahead price by using a mathematical combination of previous prices and other price drivers (e.g., temperature, humidity, etc.) without a learning process [6]. A benchmark statistical model for DAEPF is the naive forecast (NF) [7], which is mainly based on resembling similar historical values to forecast future prices. To predict the electricity price more accurately, regression models are further adopted. They mainly search for the best linear relationship between the input variables and the output predicted values based on least squares [8]. However, a nonlinear pattern usually manifests itself in DAEPF, thus limiting the performance of statistical techniques [9]. On the contrary, artificial intelligence techniques extract the nonlinear relationship between the inputs and outputs during an iterative training process. For instance, Chai et al. proposed a hybrid method based on an ensemble extreme learning machine (ELM) for DAEPF [10]. Although the ELM-based method is verified to be an efficient approach, it still involves a high level of randomness, which limits its predictive accuracy. Apart from ELM, Amjady et al. presented a neural network-based method, the fuzzy neural network (FNN), for DAEPF. It is verified to be more accurate than other DAEPF methods, such as the autoregressive integrated moving average model [8], the multilayer perceptron [11], and radial basis function neural networks [12].


Recently, DL-based methods have attracted more interest in DAEPF for their promising nonlinear approximation ability. The basic DL model is the deep neural network (DNN) [13], and Lago et al. concluded via comparative experiments that the DNN yields better accuracy in DAEPF than statistical models. Slightly more complex than the DNN is the recurrent neural network (RNN) [14], which is also a suitable choice for DAEPF. In addition, Ugurlu et al. proposed multilayer long short-term memory (LSTM) and gated recurrent units (GRU) for DAEPF, which outperform the RNN [15]. To further improve the GRU-based model, Bottieau et al. proposed an advanced GRU model based on an encoder-decoder structure for DAEPF [16]. Apart from LSTM and GRU, the convolutional neural network (CNN) is also a widely used method in DAEPF [17–19]. To alleviate the overfitting and improve the performance of the CNN, Li et al. used the unshared convolutional neural network (UCNN) as the backbone of their framework [20].

It can be observed that existing methods have achieved good performance in DAEPF. However, in an expanding liberalized electricity market, electricity prices exhibit temporal and feature-wise variabilities. Note that such variabilities make it challenging to obtain high forecasting precision for the day-ahead electricity price. These challenges are elucidated as follows.

(1) From the perspective of one individual feature derived from the dataset, a trait of temporal variability is revealed. This trait can be interpreted as multi-term dependencies, which consist of long-term and short-term dependencies. They further reveal the nonlinear relationship between the inputs and outputs. For instance, an illustrative plot of the electricity price in Sweden from January 23rd 00:00 Central European Time (CET) to March 14th 00:00 CET in 2018 is shown in Fig. 6.1. Similar to the studies in [10, 21], a strong correlation is also observed in our case. To clearly show this correlation, we randomly select a two-day electricity price curve from Sweden for observation and magnify it in the upper left panel of Fig. 6.1. It is observed that the electricity price on February 6th rises sharply during 00:00–08:00 and falls during 20:00–24:00 on the same day. A similar rise-and-fall trend with respect to this correlation can also be detected on each day in the dataset, which is regarded as representative of the short-term dependency. Electricity prices as long as months ago may also largely affect the predictive results, which is referred to as the long-term dependency. For instance, in Fig. 6.1, the spike in black is correlated with the peak spike in purple, although the time gap between them is approximately a month.

(2) Given all available features that concern the electricity price in the dataset, not all of them can boost the performance of DAEPF, even if they are highly correlated with the predicted target. To be more specific, adding extra features with high correlation might blur the effect of one specific input feature due to excessive redundancy and instead result in a less accurate model [4]. This phenomenon explains the feature-wise variability of the electricity price. With multiple highly correlated features, it is rather challenging to automatically identify high-quality features from the enormous dataset and assign them high weights so that they contribute to the DAEPF.


Fig. 6.1 Schematic diagram of temporal variability

To address the problems listed above, it is necessary to find a DAEPF method with high accuracy that considers the temporal and feature-wise variabilities. Nevertheless, previous studies based on classic artificial intelligence methods have the following disadvantages:

(1) Considering the temporal variability, the UCNN is good at extracting short-term dependencies owing to its window-like convolution strategy. However, the UCNN cannot show its advantages when encountering long-term features [20, 22]. Another widely used feature extraction method is the LSTM, which is designed to overcome the long-term dependency extraction problem that the RNN cannot solve [15, 23], but it lacks the ability of short-term feature extraction.

(2) Regarding the challenging task of dealing with feature-wise variability, previous studies [10, 21] have adopted all highly correlated features and trained them with the same attention. However, due to the feature-wise variability, treating all available correlated features equally can reduce the model's performance. Thus, an attention-based model is recommended to abate such reduction by assigning each feature a different weight to optimize the forecasting performance [24]. Nevertheless, attention is likely to lose expressive power with respect to the network depth, which is a significant disadvantage if we choose to adopt very deep models [25].

(3) One of the most challenging tasks in very deep DL models is tackling the overfitting and degradation [26, 27], and stacked UCNN or CNN cannot solve this problem well. Resnet has been designed to alleviate the degradation to some extent [26]. However, the backbone of Resnet is still the shared convolution, which is not qualified for the DAEPF task because electricity data are space-variant [20]. Therefore, the overfitting still remains in Resnet. In addition, containing multiple rectified linear unit (ReLU) activation layers in its deep structure, Resnet suffers from the potential loss caused by ReLU, which worsens the degradation.


To solve the mentioned issues, we conduct the following work, which also constitutes the contributions of this chapter.

(1) An effective framework is proposed for both deterministic and interval DAEPF.¹ The backbone of this framework is the dense skip attention-based DL model with a drop-connected structure.

(2) We have developed a drop-connected UCNN-GRU by aggregating UCNN and GRUs coherently to solve the problem of temporal variability. Also, the overfitting is well alleviated.

(3) A dense skip attention mechanism is proposed to assign weights to each feature, which offers a solution to the feature-wise variability. Owing to the dense skip connection, this enhanced mechanism can reduce the loss of the DL model and further improve its robustness.

(4) Moreover, for the drop-connected UCNN-GRU in our proposed framework, we introduce PReLU and residual connections into the UCNN, and thus the advanced residual UCNN block is developed. This advanced block can reduce the degradation with negligible extra cost. With PReLU deeply embedded in the residual architecture, the detrimental neuron inactivation can be avoided, which is proved in Sect. 6.9 of this chapter.

The remainder of this chapter is organized as follows. Section 6.2 introduces the structure of our proposed framework for deterministic and interval DAEPF. Section 6.3 demonstrates the drop-connected UCNN-GRU, which is the first important part of our proposed model, while the second influential one, named the dense skip attention mechanism, is explicated in Sect. 6.4. Then, three case studies are conducted in Sect. 6.5 to validate the performance of our proposed framework. Finally, conclusions are drawn in Sect. 6.6.

6.2 Structure of the Proposed Framework

The proposed framework in this work is an effective DL-based approach for DAEPF in both deterministic and interval forecasting. The overall structure is shown in Fig. 6.2. The main procedure is demonstrated as follows. We first use the ② data preprocessing to split the train set and test set in the ① original dataset and then normalize them, respectively. Secondly, the preprocessed training data are fed to the ③ drop-connected UCNN-GRU for feature extraction. In the third step, the data are processed with the ④ dense skip attention mechanism to automatically weight each feature in the dataset. Finally, we can obtain the results of deterministic or interval forecasting through the corresponding regression methods in ⑤.

¹ Deterministic forecasting predicts the point value of the day-ahead electricity price, while interval forecasting predicts the possible range to quantify its uncertainty. Both are informative tools for decision-makers in DAEPF.


Fig. 6.2 Overall framework of the proposed method. (The specific execution sequence is ① Original data → ② Data preprocessing → ③ Drop-connected UCNN-GRU → ④ Dense skip attention mechanism → ⑤ Target regression.) Reprinted from Ref. [28], copyright 2022, with permission from IEEE

In the facet of functionality, the overall framework can be summarized into four parts, i.e., the data preprocessing, feature extraction, autoweighting of features and target regression. Each part is described as follows.

6.2.1 Data Preprocessing

In this part, the whole dataset is normalized using the Min-Max approach [29]. The data in each feature are scaled between 0 and 1, which could help stabilize the convergence process of the loss function [30]. Given a training dataset $X \in \mathbb{R}^{n \times m}$ with $m$ features and $n$ timestamps, the normalization process can be formulated as $x_{ij}^* = (x_{ij} - \min x_j)/(\max x_j - \min x_j)$, where $x_{ij}$ denotes the element in the $i$th row and the $j$th column of the dataset $X$, and $x_{ij}^*$ denotes the normalized element of $x_{ij}$. Note


that $\min x_j$ and $\max x_j$ represent the global minimum and maximum of all the elements in the $j$th column, respectively. Afterward, we pack the normalized data to generate data batches for training.
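A minimal NumPy sketch of this Min-Max step is given below; the column-wise normalization follows the formula above, while the window length, batch size and the small epsilon guard are illustrative assumptions rather than the exact preprocessing settings of the chapter.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each feature (column) of X (n timestamps x m features) into [0, 1]."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min + 1e-12)  # eps guards constant columns

def make_batches(X_norm, window=24, batch_size=32):
    """Pack consecutive observation windows into training batches."""
    windows = np.stack([X_norm[i:i + window]
                        for i in range(len(X_norm) - window)])
    return [windows[i:i + batch_size] for i in range(0, len(windows), batch_size)]

X = np.random.rand(1000, 15)          # hypothetical dataset: 1000 hours, 15 features
batches = make_batches(min_max_normalize(X))
```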

6.2.2 Feature Extraction In this part, the normalized data batches are fed to our proposed DL architecture (called drop-connected UCNN-GRU) for feature extraction. This enhanced architecture allows UCNN and GRU to share their strength in short-term and long-term feature extraction in an attempt to overcome the temporal variability. In addition, a drop-connected structure is deployed to accommodate UCNN and GRUs, in order to alleviate the overfitting. Furthermore, the PReLU combined with the residual connection is adopted in the advanced residual UCNN blocks to reduce the degradation.

6.2.3 Autoweighting of Features

After the feature extraction, the features within the time dimension can be well captured. In this stage, a dense skip attention mechanism is developed to automatically weight the features in the data batches and obtain the optimal weights for the best forecasting performance through the recursive training process. Furthermore, applying the dense skip connection enables the model to maintain the ability of autoweighting while reducing the risk of losing expressive capability exponentially with respect to the network depth. By this means, the feature-wise variability can be well addressed.

6.2.4 Target Regression

Target regression is the core of time-series forecasting using DL methods. The model uses the Adam optimizer [31] to update all parameters iteratively, so as to obtain the optimal forecasting model. The deterministic DAEPF is realized using the mean absolute error (MAE) as the loss function, while the interval one uses the smooth quantile loss function. The details of the smooth quantile loss function are given in Sect. 6.7. Via the above operations, we are capable of achieving deterministic and interval DAEPF.
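To make the two regression objectives concrete, the sketch below implements the MAE loss and a quantile-weighted smooth loss of the kind described in Sect. 6.7; the threshold of 1 in the smooth term is an assumption taken from the standard smooth-L1 form, not a value stated by the authors.

```python
import numpy as np

def mae_loss(y_true, y_pred):
    """Deterministic DAEPF objective: mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def smooth_loss(y_true, y_pred):
    """Smooth (Huber-like) loss: quadratic near zero, linear in the tails."""
    e = y_true - y_pred
    return np.where(np.abs(e) < 1.0, 0.5 * e ** 2, np.abs(e) - 0.5)

def smooth_quantile_loss(y_true, y_pred, alpha):
    """Interval DAEPF objective: quantile-weighted smooth loss (cf. Sect. 6.7)."""
    base = smooth_loss(y_true, y_pred)
    weight = np.where(y_true > y_pred, alpha, 1.0 - alpha)
    return np.mean(weight * base)

y, y_hat = np.array([30.0, 35.0, 40.0]), np.array([29.0, 36.5, 38.0])
print(mae_loss(y, y_hat), smooth_quantile_loss(y, y_hat, alpha=0.95))
```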


Fig. 6.3 Transformation from the a conventional UCNN block to the b advanced residual UCNN block. (BN: batch normalization, UCov: unshared convolution) Reprinted from Ref. [28], copyright 2022, with permission from IEEE

6.3 Drop-Connected UCNN-GRU

In this section, to address the temporal variability, the drop-connected UCNN-GRU is proposed to implement the feature extraction. Note that this constitutes the first important part of our proposed DL model. Specifically, we use an advanced residual UCNN block for short-term feature extraction before the GRU for the long-term one. Furthermore, to alleviate the overfitting, we rearrange the UCNN and GRU into a drop-connected structure, termed the drop-connected UCNN-GRU. The details are presented as follows.

6.3.1 Advanced Residual UCNN Block

To further enhance the short-term feature extraction capability, we have developed an advanced residual UCNN block inside the drop-connected UCNN-GRU. In Fig. 6.3a, the conventional UCNN block consists of three parts, i.e., the unshared convolution layer, batch normalization and ReLU activation. Different from the conventional UCNN block, a zero-padding layer is applied so that the size of the tensors remains unchanged after each convolution. Moreover, our advanced residual UCNN block (shown in Fig. 6.3b) substitutes ReLU with PReLU to further enhance the approximation ability. Structurally, instead of stacking each block directly, we explicitly adopt a residual mapping termed a "shortcut connection". It helps ease the degradation, which enables our advanced residual UCNN block to outperform the conventional one. In the following, we will explain the residual architecture and PReLU, respectively.

The residual architecture is designed to solve the noteworthy issue called degradation, which exists in very deep models, since a deeper model might not achieve better performance once the network depth exceeds a certain level. By introducing the "shortcut connection", the model is able to skip some layers without adding either extra parameters or computational complexity. When degradation occurs,


the model can at least retain the performance of its shallower counterpart and therefore alleviate the degradation.

PReLU functions as an advanced activation layer. Compared with the widely used ReLU, PReLU improves the approximation ability of the forecasting model and further alleviates the model degradation in the meantime. As a nonlinear neuron activation, the function of ReLU $f_R$ is obtained as

$y = f_R(x) = \max(0, x),$  (6.1)

where $x$ and $y$ denote the input and output of ReLU. However, ReLU can be fragile due to the risk of neuron inactivation, which may lead to an irreversibly non-updatable status of gradients and consequently intensify the degradation. Therefore, to address this issue, PReLU is developed. Formally, the formulation of PReLU $f_{PR}$ is defined as

$y_i = f_{PR}(x_i) = \max(0, x_i) + \beta_i \min(0, x_i),$  (6.2)

where $x_i$ denotes the input of PReLU in channel $i$ and $y_i$ denotes its corresponding output. Tellingly, $\beta_i$ depicts the trainable parameter ranging from 0 to 1 and is updated jointly with the other layers to contribute to the convergence of the loss function. In contrast to ReLU, PReLU has a trainable and nonzero slope on the negative axis, which avoids the risk of neuron inactivation and enhances the approximation ability in the meantime. Moreover, PReLU introduces a negligible number of extra parameters into the model, which will not further intensify the overfitting. To summarize, the above statements indicate that applying PReLU in the residual architecture can enhance the performance of the advanced residual UCNN block without any extra computational cost. By introducing the residual architecture to the block, we can alleviate the degradation. Besides, we replace ReLU with a trainable activation layer, PReLU, to enhance the approximation ability and avoid neuron inactivation in the meantime (see Sect. 6.9 for details of neuron inactivation).
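The NumPy sketch below illustrates the PReLU activation of Eq. (6.2) and the shortcut connection of the advanced residual block; the unshared "convolution" is reduced to a per-position linear map, the batch normalization is replaced by a simple standardization, and all shapes are hypothetical, so this is only a structural illustration rather than the authors' implementation.

```python
import numpy as np

def prelu(x, beta):
    """PReLU (Eq. 6.2): identity for x >= 0, trainable slope beta for x < 0."""
    return np.maximum(0.0, x) + beta * np.minimum(0.0, x)

def advanced_residual_ucnn_block(x, weights, bias, beta):
    """Advanced residual UCNN block: unshared (position-wise) linear map,
    a stand-in for batch normalization, PReLU, and a shortcut connection."""
    h = x * weights + bias                       # unshared: one weight per position
    h = (h - h.mean()) / (h.std() + 1e-8)        # stand-in for batch normalization
    y = prelu(h, beta)
    return x + y                                 # shortcut (identity) connection

x = np.random.randn(24)                          # hypothetical 24-step window
w, b = np.random.randn(24) * 0.1, np.zeros(24)
out = advanced_residual_ucnn_block(x, w, b, beta=0.25)
```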

6.3.2 Drop-Connected Structure

Since deep models are prone to overfitting, the drop-connected structure is adopted to mitigate this problem. Specifically, dropout is applied at the most overfitting-prone position in the proposed DL model, i.e., the junction between the advanced residual UCNN blocks and the GRU blocks.

There are various means of dealing with overfitting, among which the mainstream methods are the $L_1$ norm, the $L_2$ norm and dropout [26]. The $L_1$ norm tends to sparsify the input data by assigning zero weights to unimportant features, while the $L_2$ norm treats all features with the same importance by minimizing the weights of all features. However, due to the feature-wise variability, it is uncertain whether all features will contribute to the prediction for a given dataset. Hence, the $L_1$ and $L_2$ norms are not sensible choices for our model, because they contradict the objective of autoweighting of features and thus lessen the generalization ability. In contrast, instead of modifying the loss function to minimize the weights, dropout simply retains some of the neurons at random and drops the others, which reduces the overfitting and thus enhances the robustness of the model.

Assume that the proposed DL model without dropout has $L$ hidden layers indexed by $l \in \{1, 2, \ldots, L\}$. Let $g_i^l$ denote the $i$th input neuron of the GRU block positioned in the $l$th hidden layer of the model, and $g_i^{l+1}$ its output. This block is located at the junction of the advanced residual UCNN blocks and the GRU blocks inside the drop-connected UCNN-GRU. The operation of the GRU in training can be described as

$g_i^{l+1} = G^l(g_i^l),$  (6.3)

where $G^l(\cdot)$ represents the function of the GRU block in the $l$th layer. After the transformation into the drop-connected structure, the operation of this GRU in training becomes

$g_i^{l+1} = G^l(r^l * g_i^l), \quad r_i^l \sim \mathrm{Bernoulli}(p),$  (6.4)

where $*$ denotes the element-wise product and $r^l$ is a vector of independent Bernoulli random variables with $\Pr(r_i^l = 1) = p$. Consequently, this brings randomness to the model by introducing an arbitrary probability of dropping neurons during the training process, and thus reduces the overfitting by randomly simplifying the model.
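A minimal sketch of Eq. (6.4) follows: a Bernoulli mask is applied to the UCNN output before it enters a GRU-style transformation, which is how dropout at the UCNN-GRU junction can be realized during training. The GRU itself is abstracted as a simple placeholder function here, so the snippet shows only the masking mechanics.

```python
import numpy as np

def dropout_mask(shape, p=0.5, rng=np.random.default_rng(0)):
    """r ~ Bernoulli(p): keep each neuron with probability p (training mode)."""
    return rng.binomial(1, p, size=shape).astype(float)

def gru_block_placeholder(g):
    """Stand-in for G^l(.): any recurrent transformation of the kept activations."""
    return np.tanh(g)

g_l = np.random.randn(64)                 # output of the advanced residual UCNN blocks
r_l = dropout_mask(g_l.shape, p=0.5)      # drop-connected structure at the junction
g_next = gru_block_placeholder(r_l * g_l) # Eq. (6.4): G^l(r^l * g^l)
```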

6.4 Dense Skip Attention Mechanism

The second influential part of our proposed DL model is the dense skip attention mechanism, which is designed to solve the problem of feature-wise variability. A more detailed illustration of the dense skip attention mechanism is shown in Fig. 6.4. It mainly consists of two parts, i.e., the feature-wise attention block (marked by the red dashed box) and the dense skip connection (marked by the orange dashed box). The former stresses autoweighting in the feature dimension, and the latter is designed to ease the potential burden caused by the attention. We will illustrate these two parts in the following two subsections.

6.4.1 Dense Skip Connection

The dense skip connection is an efficient approach to reducing the loss of the model. As shown in Fig. 6.4, it skips the feature-wise attention block with a single dense and PReLU activation layer. Then, it directly connects to the output of the feature-wise attention block to form a dense skip connection.


Fig. 6.4 Structure of the dense skip attention mechanism. (FCL: fully connected layer, Seq2Seq: sequence-to-sequence) Reprinted from Ref. [28], copyright 2022, with permission from IEEE

Although attention can achieve great performance in most cases, directly stacking attention in very deep models might cause the model to lose expressive power [25], which could undermine the performance of the model. Considering $\phi(\cdot)$ as the nonlinear function of the feature-wise attention block and $g_i$ as its input, the output of this block $Y_i^*$ can be formulated as

$Y_i^* = \phi(g_i) = P_0\big[W_C\, A[G_0(g_i)]\big].$  (6.5)

Nevertheless, the dense skip connection can provide an extra route to the output with a much simpler structure than the feature-wise attention block. Defining $\rho(\cdot)$ as the nonlinear function of the dense skip connection, the output of the dense skip attention mechanism is obtained as

$Y_i^* = \phi(g_i) + \rho(g_i),$  (6.6)

where

$\rho(g_i) = P_1\big[W_D\, G_1(g_i)\big].$  (6.7)

Note that $G_0(\cdot)$, $G_1(\cdot)$, $A(\cdot)$, $P_0(\cdot)$ and $P_1(\cdot)$ stand for the functions of the GRU$_0$, GRU$_1$, attention, PReLU$_0$ and PReLU$_1$ activation layers, respectively. Evidently, comparing Eqs. (6.5) and (6.7), the complexity of $\rho(g_i)$ is much lower than that of $\phi(g_i)$, indicating that a nearly costless route is developed for our model. In this way, when the loss caused by the attention is greater than the positive effect it exerts, the model can adaptively choose the dense skip connection and retain the performance of the non-attention model. Therefore, the dense skip connection can reduce the possible loss of the model caused by the attention and further improves the robustness of the model.
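The toy sketch below mirrors Eqs. (6.5)–(6.7): the output is the sum of the (more complex) attention branch φ(g) and the (cheaper) dense-skip branch ρ(g). The branch internals are simplified placeholders chosen only to show the two-route structure, not the exact layers of the model.

```python
import numpy as np

rng = np.random.default_rng(1)
W_C, W_D = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))

def prelu(x, beta=0.25):
    return np.maximum(0.0, x) + beta * np.minimum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_branch(g):
    """phi(g): GRU -> attention -> dense -> PReLU (all reduced to placeholders)."""
    h = np.tanh(g)                       # stand-in for GRU_0
    a = h * softmax(h)                   # stand-in for the attention weighting A(.)
    return prelu(W_C @ a)

def dense_skip_branch(g):
    """rho(g): GRU -> dense -> PReLU, a cheaper route around the attention block."""
    return prelu(W_D @ np.tanh(g))       # stand-in for GRU_1 followed by W_D

g = rng.standard_normal(8)
y = attention_branch(g) + dense_skip_branch(g)   # Eq. (6.6)
```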


Fig. 6.5 Structure of the attention. (DR: dimension reduction, dot: dot multiply, concat: concatenation, magnify: the attention is magnified for display, so that its specific operations can be elucidated in the blue dashed box in the lower part of the figure) Reprinted from Ref. [28], copyright 2022, with permission from IEEE

6.4.2 Feature-Wise Attention Block

Structurally, for the feature-wise attention block shown in Fig. 6.4, a permutation layer is added at the front to transpose two axes of the input data; i.e., if the shape of the input tensor is [batch, temporal axis, feature-wise axis], the output of the permutation will be [batch, feature-wise axis, temporal axis]. Dimensionally, this enables the attention to process the information in the feature dimension instead of the temporal one. Then, the attention is added behind it to autoweight each feature; i.e., the attention assigns an initial weight to each input feature, which is continuously updated in subsequent iterations to obtain the best results.

Manifestly, the pillar of the feature-wise attention block is the attention, owing to its autoweighting technique. The attention contains two important vectors, i.e., the attention vector for weighting each feature and the context vector for averaging (Fig. 6.5). Define the input vector of the attention as $h^{(i)} = \{h_1^{(i)}, h_2^{(i)}, \ldots, h_f^{(i)}\}$, $h_t^{(i)} \in \mathbb{R}^w$, where $h_f^{(i)}$ represents the last feature channel $f$ of the input $h^{(i)}$ at time $i$, and $\mathbb{R}^w$ denotes the real vector space of the $W \times 1$ observation window matrix. Formally, the attention vector at time $i$ is defined as $\alpha_i$, and the weight of the $j$th feature $\alpha_{ij}$ can be calculated as

$\alpha_{ij} = \dfrac{\exp\big(\mathrm{score}(h_{lt}^{(i)}, h_j^{(i)})\big)}{\sum_{k=1}^{f} \exp\big(\mathrm{score}(h_{lt}^{(i)}, h_k^{(i)})\big)},$  (6.8)

where $h_{lt}$ stands for the last hidden state in channel $j$ after dimension reduction and $h_j$ represents the current state of channel $j$. In this way, we can limit the weight of each feature to the range from 0 to 1. Additionally, the score function can be described as

$\mathrm{score}\big(h_{lt}^{(i)}, h_j^{(i)}\big) = h_{lt}^{(i)\top} W_a\, h_j^{(i)}.$  (6.9)

After weighting, a context vector $c_i$ is then computed as the weighted sum of all the feature states, as shown in Eq. (6.10):

$c_i = \sum_{j=1}^{f} \alpha_{ij}\, h_j^{(i)}.$  (6.10)

Finally, as shown in Eq. (6.11), we can derive the output of the attention $\mathrm{ATO}_i$ with a concatenation layer:

$\mathrm{ATO}_i = \tanh\big(W_b\,[c_i ; h_{lt}^{(i)}]\big).$  (6.11)

To be clear, $W_a$ and $W_b$ are both learnable parameters implemented with different dense layers. Then, after adding a dense and a PReLU activation layer, the feature-wise attention block is formed, as presented in Fig. 6.4. It should be remarked that, from Eqs. (6.8)–(6.9), the learnable parameter $W_a$ in the weight calculation process contributes most to the autoweighting of features. In detail, $W_a$ enables the weights $\alpha_{ij}$ to be updated recursively so as to pursue the optimal performance by minimizing the loss function $L$. Then, the optimized weight of each feature is obtained as

$\alpha_{ij} = \arg\min_{\alpha} L.$  (6.12)

A schematic sketch of this attention computation is given below.
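The compact NumPy sketch below traces Eqs. (6.8)–(6.11) for one time step; $W_a$ and $W_b$ are randomly initialized here only to show the data flow, whereas in the model they are learned jointly with the other layers, and the channel/window sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
f, w = 6, 4                      # number of feature channels, window length
H = rng.standard_normal((f, w))  # h_j: current state of each feature channel
h_lt = rng.standard_normal(w)    # last hidden state after dimension reduction
W_a = rng.standard_normal((w, w))
W_b = rng.standard_normal((2 * w, w))

# Eq. (6.9): score(h_lt, h_j) = h_lt^T W_a h_j
scores = np.array([h_lt @ W_a @ H[j] for j in range(f)])

# Eq. (6.8): softmax over the feature channels -> attention weights alpha_j in (0, 1)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()

# Eq. (6.10): context vector as the weighted sum of the feature states
c = (alpha[:, None] * H).sum(axis=0)

# Eq. (6.11): attention output via concatenation and tanh
ato = np.tanh(np.concatenate([c, h_lt]) @ W_b)
```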

By this means, the autoweighting of features can be well implemented, and thus our proposed feature-wise attention block can serve as an optimized solution to the feature-wise variability.

At the end of Sect. 6.4, we give an overview of the DL model proposed in this chapter. In Fig. 6.6, the normalized electricity price dataset (array marked in blue) on the left is first fed to the advanced residual UCNN blocks (red cube) for short-term feature extraction. Then, after dropout (dark gray cube) to contain the overfitting, the data are further fed to the GRU blocks (light gray cube) for long-term feature extraction. The above procedure is called the drop-connected UCNN-GRU (black dashed box), which well addresses the issue of temporal variability. Next, the processed data are fed to the dense skip attention (blue dashed box) for feature autoweighting, thus solving the problem of feature-wise variability. Finally, after recursive training in an end-to-end fashion, we can derive the output of the dense skip attention as the DAEPF results. The deterministic and interval prediction learning algorithm of our model is shown in Algorithm 1.


Algorithm 1 Deterministic and interval forecasting algorithm for the proposed DL model, reprinted from Ref. [28], copyright 2022, with permission from IEEE

1: Input: input and target price time series $\{X_t\}_{t=1}^{T}$ and $\{Y_t\}_{t=1}^{T}$.
2: Output: predicted day-ahead electricity price $Y^*$.
3: Data preprocessing.
4: for epoch = 1 to M do
5:   procedure Drop-Connected UCNN-GRU
6:     function Advanced Residual UCNN Block(W, b, β)
7:       Initialize: weights W, bias b, and slope β.
8:       for k = 1 to 4 do
9:         Normalize the batch x with normal distribution.
10:        h_Ucov^(k) = x ⊛ W^(k) + b^(k)    ▷ Unshared convolution
11:        y = f_PR(h_Ucov^(k), β^(k))    ▷ Activate with PReLU
12:        x ← x + y    ▷ Construct the identity mapping
13:      end for
14:    end function
15:    Randomly drop neurons with Bernoulli probability.    ▷ Drop-connected structure
16:    function GRU Block(p_G)
17:      Initialize: all the trainable parameters in GRU p_G.
18:      Fetch the input data g^1 from dropout.
19:      for k = 1 to 4 do
20:        g^(k+1) = G_k(g^k, p_G)    ▷ Extract the long-term feature
21:      end for
22:      Generate Seq2Seq and non-Seq2Seq series g_S^5 and g_N^5.
23:      return g_S^5 and g_N^5
24:    end function
25:  end procedure
26:  procedure Dense Skip Attention(α, β_D)
27:    Initialize: weights of features α, slope β_D.
28:    Y* = φ(g_S^5, α)    ▷ Via the feature-wise attention block
29:    Y* ← Y* + ρ(g_N^5, β_D)    ▷ Construct the dense skip connection
30:  end procedure
31:  procedure Target Regression(deterministic or interval forecast)
32:    Choose one regression method.
33:    if current regression → deterministic forecast then
34:      Calculate the loss function L_MAE = (1/m) Σ_{i=1}^{m} |y − y*|.
35:      Update all parameters by means of back-propagation,
36:      assuming Θ represents all the learnable parameters:
37:      Θ ← Θ − η ∂L_MAE/∂Θ
38:      if loss function L_MAE stops decreasing then
39:        η ← η × 10⁻¹    ▷ Adaptively update the learning rate
40:      end if
41:    end if
42:    if current regression → interval forecast then
43:      for α = 0.05, 0.5, 0.95 do    ▷ Choose different quantile coefficients
44:        Calculate the smooth quantile loss function L_QR(α).
45:        Update all parameters Θ by means of back-propagation:
46:        Θ ← Θ − η ∂L_QR(α)/∂Θ
47:        if quantile loss function L_QR(α) stops decreasing then
48:          η ← η × 10⁻¹
49:        end if
50:      end for
51:    end if
52:  end procedure
53: end for


Fig. 6.6 Overall structure of the DL model. (Part of the dataset is indicated by an ellipsis because the time axis is too long to display completely)

6.5 Case Study

This section is devoted to verifying the effectiveness of our proposed model, a dense skip attention-based DL model with a drop-connected structure for DAEPF, for both deterministic and interval forecasting. Case 1 demonstrates the contribution of each proposed component of our model, including the advanced residual UCNN block, the GRU blocks and the dense skip attention mechanism. Case 2 and Case 3 present the superiority of the proposed model over statistical techniques and conventional DL techniques, respectively.

6.5.1 Data Description

In this chapter, we use electricity price data from four different countries, i.e., Sweden, Denmark, Finland and Norway, between 2018/01/01 00:00 CET and 2019/12/31 23:00 CET with a one-hour resolution. For each country, we choose the most correlated features using the Pearson coefficient [32]; i.e., all the features used in our work are highly correlated with the target electricity price, with a Pearson coefficient no less than 0.5. In this way, we can test the ability of our proposed DL model to handle the feature-wise variability. Through this selection of highly correlated features, we obtain the final selected features in Table 6.1. All the data in Table 6.1 are available on the Nord Pool website² and the Finnish energy website.³ In the case study, we divide the electricity price target dataset of each country at the ratio of 7:3, where the first 70% is used for training and the last 30% is adopted for testing.

² Nord Pool website: https://www.nordpoolgroup.com/.
³ Finnish energy website: https://energia.fi/.
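A hedged sketch of the described feature screening and chronological 70/30 split is given below; only the Pearson threshold of 0.5 and the split ratio come from the text, while the array shapes and variable names are placeholders.

```python
import numpy as np

def select_features(X, y, threshold=0.5):
    """Keep columns of X whose absolute Pearson correlation with y is >= threshold."""
    keep = []
    for j in range(X.shape[1]):
        r = np.corrcoef(X[:, j], y)[0, 1]
        if abs(r) >= threshold:
            keep.append(j)
    return X[:, keep], keep

def train_test_split_70_30(X, y):
    """Chronological split: first 70% for training, last 30% for testing."""
    cut = int(0.7 * len(y))
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

X = np.random.rand(17520, 15)          # hypothetical two years of hourly features
y = np.random.rand(17520)              # hypothetical target: day-ahead price
X_sel, kept = select_features(X, y)
(train_X, train_y), (test_X, test_y) = train_test_split_70_30(X_sel, y)
```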


Table 6.1 Selected features for each country, reprinted from Ref. [28], copyright 2022, with permission from IEEE

Historical data
  Price: Domestic market-clearing price (EUR/MWh); Regulating price (EUR/MWh); Imbalanced price production purchase (EUR/MWh); Imbalanced price production sale (EUR/MWh); Imbalanced price consumption (EUR/MWh)
  Exchange power: Domestic net power import/export (MWh); Spot power flow (MWh); Spot power sell and buy (MWh)
  Gross power consumption: Domestic net power consumption (MWh)
  Gross power production: Domestic net power production (MWh)
  Transmission capacity: Power transmission capacity between spot (MWh)
  Wind power production*: Domestic wind power production (MWh)
Prognosis data
  Gross power consumption: Domestic net power consumption (MWh)
  Gross power production: Domestic net power production (MWh)
  Wind power production*: Domestic wind power production prognosis (MWh)

* Among the datasets of the four countries, wind power data are only available in Denmark and Finland during 2018 and 2019

6.5.2 Implementation Details

In this subsection, we explain the hyperparameters of the model and introduce the specific training environment. After the input training data are normalized, they are fed into our proposed dense skip attention-based DL model. Concretely, for the drop-connected UCNN-GRU, we first use four advanced residual UCNN blocks for short-term feature extraction. The number of convolution channels is doubled from block to block; i.e., the channels of the blocks are 15, 30, 60 and 120, respectively. In addition, the kernel size of each unshared convolution is set to 3. The dropout rate is set to 0.5, because this yields the largest number of randomly generated network structure combinations and therefore improves the generalization ability. Next, we use 5 directly stacked GRU layers for long-term dependency extraction. Each of the GRU layers is equipped with a sequence-to-sequence structure, and the unit size is set to 50.


Finally, for the dense skip attention, we first permute the input data, and then the attention is added to automatically assign and update the weight of each feature. In addition, we use a dense layer and PReLU activation layers to skip the attention and form the dense skip connection. With the above components, we can conduct the iterative training in an end-to-end fashion and derive the deterministic or interval DAEPF results with the corresponding regression method.
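For reference, the hyperparameters stated above can be collected into a single configuration object, as in the sketch below; the dictionary keys are our own naming for readability and are not identifiers from the authors' code.

```python
# Hypothetical configuration mirroring the implementation details of Sect. 6.5.2
MODEL_CONFIG = {
    "ucnn_blocks": {
        "num_blocks": 4,
        "channels": [15, 30, 60, 120],   # doubled in each successive block
        "kernel_size": 3,
        "activation": "PReLU",
        "residual_shortcut": True,
    },
    "dropout_rate": 0.5,                 # at the UCNN-GRU junction
    "gru": {"num_layers": 5, "units": 50, "seq2seq": True},
    "attention": {"type": "feature-wise", "dense_skip_connection": True},
    "training": {
        "optimizer": "Adam",
        "deterministic_loss": "MAE",
        "interval_loss": "smooth quantile",
        "quantiles": [0.05, 0.5, 0.95],
    },
}
```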

6.5.3 Case 1: Model Effectiveness Evaluation

In this subsection, we use comparative experiments to verify the effectiveness of each important component of our proposed model. There are three important components in our model, i.e., the advanced residual UCNN block, the GRU and the dense skip attention. Hence, we construct four models for comparison, i.e., our proposed model and the models obtained after removing the advanced residual UCNN block, the GRU and the dense skip attention, respectively. These models are abbreviated as Proposed, Proposed−ARUCNN, Proposed−GRU and Proposed−DSAttention. To further verify the superiority of PReLU over ReLU, we also construct an extra model in which ReLU is used for all the activation layers (Proposed−PReLU+ReLU). The deterministic forecasting results are shown in Table 6.2, where the improvements are manifested in the reduction of the mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE).

From Table 6.2, the advanced residual UCNN block is verified to be a promotive component that enhances the forecasting skill on the four datasets. We can quantify this enhancement by subtracting the evaluation results (e.g., MAE, MSE and MAPE) of the Proposed model from the corresponding ones of the Proposed−ARUCNN model in each dataset. After adding the advanced residual UCNN block, the MAPE decreases by 1.751%, 0.16%, 0.46% and 1.433% in Sweden, Norway, Denmark and Finland, respectively. Other indexes such as MSE and MAE are improved accordingly. Using PReLU instead of ReLU improves the prediction performance of the model, and this improvement can be quantified using the MAPE of Proposed minus that of Proposed−PReLU+ReLU. Specifically, using PReLU instead of ReLU reduces the MAPE by 0.793% and 1.582% in Norway and Denmark, respectively. This verifies that replacing ReLU with PReLU can improve the approximation ability of the model. Accordingly, this replacement also proves to be effective in Sweden and Finland. Also, the drop-connected UCNN-GRU is verified to have a positive impact on DAEPF. Compared with the Proposed−GRU model, our Proposed model, which aggregates the advanced residual UCNN blocks and GRU blocks, reduces the MAPE by 1.619%, 0.278%, 0.715% and 1.038% on the four datasets, respectively. This indicates that the structure can well capture the multi-term dependency in the day-ahead electricity price and tackles the temporal variability.


Table 6.2 Effectiveness evaluation on deterministic forecasting, reprinted from Ref. [28], copyright 2022, with permission from IEEE

Sweden
  Proposed:               MSE 6.875,   MAE 1.487,  MAPE 5.24%
  Proposed − ARUCNN:      MSE 7.476,   MAE 1.803,  MAPE 6.99%
  Proposed − GRU:         MSE 8.279,   MAE 1.873,  MAPE 6.86%
  Proposed − PReLU+ReLU:  MSE 7.699,   MAE 1.532,  MAPE 5.57%
  Proposed − DSAttention: MSE 8.455,   MAE 1.708,  MAPE 6.12%
Norway
  Proposed:               MSE 0.2517,  MAE 0.3697, MAPE 1.01%
  Proposed − ARUCNN:      MSE 0.3678,  MAE 0.4552, MAPE 1.27%
  Proposed − GRU:         MSE 0.395,   MAE 0.4621, MAPE 1.29%
  Proposed − PReLU+ReLU:  MSE 0.6324,  MAE 0.6502, MAPE 1.80%
  Proposed − DSAttention: MSE 0.3308,  MAE 0.4231, MAPE 1.15%
Denmark
  Proposed:               MSE 8.5049,  MAE 2.236,  MAPE 4.64%
  Proposed − ARUCNN:      MSE 11.8689, MAE 2.4282, MAPE 5.10%
  Proposed − GRU:         MSE 11.7249, MAE 2.6122, MAPE 5.36%
  Proposed − PReLU+ReLU:  MSE 17.4169, MAE 3.1284, MAPE 6.22%
  Proposed − DSAttention: MSE 9.4793,  MAE 2.6059, MAPE 5.17%
Finland
  Proposed:               MSE 8.8576,  MAE 2.1588, MAPE 7.91%
  Proposed − ARUCNN:      MSE 10.1765, MAE 2.1981, MAPE 9.34%
  Proposed − GRU:         MSE 11.7544, MAE 2.5347, MAPE 8.95%
  Proposed − PReLU+ReLU:  MSE 15.8463, MAE 2.5694, MAPE 7.95%
  Proposed − DSAttention: MSE 13.3327, MAE 2.598,  MAPE 9.13%

Furthermore, after adding the dense skip attention, the MAPE on the four datasets is reduced by 0.886%, 0.139%, 0.533% and 1.225%, respectively. To be clear, these results are obtained by using the MAPE of Proposed minus that of Proposed−DSAttention. This enhancement of forecasting accuracy can mostly be attributed to the dense skip attention. As shown in Fig. 6.7, compared with the model Proposed−DSAttention, which applies the same attention weight to each feature, our proposed model can autoweight each feature to solve the feature-wise variability problem. In Fig. 6.7, the historical domestic market-clearing price is a key feature, for it ranks 1st or 2nd in all four datasets regarding the attention weights. Moreover, Denmark and Finland place high weights on wind power production and power exchange (import), respectively. This shows that assigning higher weights to these key features can improve the forecasting performance, which verifies the effectiveness of the proposed dense skip attention mechanism.

Turning to interval forecasting, the results are shown in Table 6.3, where a lower relative Winkler score (RWS) indicates narrower intervals and greater coverage of the


Fig. 6.7 Attention weights of each feature in four countries, reprinted from Ref. [28], copyright 2022, with permission from IEEE

Table 6.3 Effectiveness evaluation on interval forecasting (RWS)

  Framework                Sweden   Norway   Denmark  Finland
  Proposed − PReLU+ReLU    0.1982   0.1497   0.1657   0.2158
  Proposed − DSAttention   0.1468   0.0907   0.1115   0.2298
  Proposed − GRU           0.1528   0.0916   0.1558   0.2289
  Proposed − ARUCNN        0.2357   0.2637   0.2908   0.3584
  Proposed                 0.1248   0.0890   0.1105   0.2032

Bold metric indicates the strongest performance among all the algorithms (the Proposed model achieves the lowest RWS in every country)

actual price, which means better performance. The formulation of RWS is demonstrated in Sect. 6.8. In our implementation, we set the quantile coefficient α to 0.05 and 0.95 as the upper and lower boundaries of interval prediction. It can be seen from Table 6.3 that the components we proposed also have similar improvement effects in interval forecasting. For instance, by using the RWS of Proposed minus the one of Proposed-ARUCNN, we can obtain the improvement of advanced residual UCNN block. Specifically, when the advanced residual UCNN block is added, the interval forecasting performance can be improved by 0.1109, 0.1747, 0.1803 and 0.1552 in four datasets, respectively. The reduction rates are 47.5%, 66.2%, 62.0% and 43.3%, respectively, proving that the improved block has excellent feature extraction capability. Accordingly, the addition of other components including dense skip attention and PReLU also improves the interval forecasting performance of the model in four countries, indicating that each component has a positive impact on the interval prediction.
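Anticipating the formal definition given later in Sect. 6.8 (Eqs. 6.15–6.17), the NumPy sketch below computes the RWS used throughout these interval comparisons for a toy forecast; the interval bounds and prices are illustrative values only.

```python
import numpy as np

def relative_winkler_score(lower, upper, y):
    """RWS (cf. Eqs. 6.15-6.17): interval-width index plus a relative miss penalty."""
    delta = 2.0 * (upper - lower) / (upper + lower)           # index of the prediction range
    below = y < lower
    above = y > upper
    score = delta.copy()                                      # covered: L_i <= y_i <= U_i
    score[below] += (lower[below] - y[below]) / y[below]      # miss below the interval
    score[above] += (y[above] - upper[above]) / y[above]      # miss above the interval
    return score.mean()                                       # average over all timestamps

y  = np.array([30.0, 42.0, 55.0, 38.0])    # real prices
lo = np.array([28.0, 40.0, 50.0, 40.0])    # lower interval boundary
hi = np.array([33.0, 47.0, 54.0, 45.0])    # upper interval boundary
print(relative_winkler_score(lo, hi, y))   # lower is better
```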

6.5.4 Case 2: Comparison with Statistical Techniques In this subsection, we compare our model with two conventional statistical techniques, which include naive forecast (NF) and Auto-ARIMA. NF is a primitive


Table 6.4 Comparison with statistical techniques on deterministic forecasting, reprinted from Ref. [28], copyright 2022, with permission from IEEE

Sweden
  NF:          MSE 80.805,  MAE 6.0206, MAPE 47.887%
  Auto-ARIMA:  MSE 67.642,  MAE 5.079,  MAPE 14.089%
  Proposed:    MSE 6.875,   MAE 1.487,  MAPE 5.236%
Norway
  NF:          MSE 18.580,  MAE 3.286,  MAPE 10.221%
  Auto-ARIMA:  MSE 7.547,   MAE 2.391,  MAPE 6.459%
  Proposed:    MSE 0.2517,  MAE 0.3697, MAPE 1.009%
Denmark
  NF:          MSE 140.209, MAE 9.792,  MAPE 20.098%
  Auto-ARIMA:  MSE 49.003,  MAE 5.1775, MAPE 12.187%
  Proposed:    MSE 8.5049,  MAE 2.2360, MAPE 4.641%
Finland
  NF:          MSE 96.887,  MAE 9.559,  MAPE 27.424%
  Auto-ARIMA:  MSE 35.806,  MAE 6.627,  MAPE 15.289%
  Proposed:    MSE 8.8576,  MAE 2.1588, MAPE 7.909%

but basic forecasting method, which can reveal the difficulty of the forecasting task. Auto-ARIMA is a benchmark method, which is appropriate for a basic comparison. Therefore, we use NF and Auto-ARIMA for comparison with our proposed framework. For NF and Auto-ARIMA, we select the first 70% of the electricity price data in each dataset for training and the rest for testing. Then, we compare our proposed model with NF and Auto-ARIMA on the four datasets, and the results are given in Table 6.4. The MAPE of NF on the four datasets is 47.887%, 10.221%, 20.098% and 27.424%, respectively, indicating that the DAEPF task is challenging. It can be observed that our proposed model achieves a great improvement on each dataset over NF and Auto-ARIMA. For instance, the MAPE of Auto-ARIMA on the four datasets is 14.089%, 6.459%, 12.187% and 15.289%, respectively, while our proposed framework reduces the MAPE to 5.236%, 1.009%, 4.641% and 7.909%, respectively. This indicates that the proposed DL model has a stronger nonlinear approximation ability than NF and Auto-ARIMA and thus obtains better forecasting performance.

6.5.5 Case 3: Comparison with Conventional DL Techniques

To further verify the superiority of our proposed model over conventional DL techniques, we use UCNN, unshared Resnet (UResnet), GRU, UCNN+GRU and Transformer for comparison on both deterministic and interval forecasting. UCNN and GRU are two benchmark feature extraction techniques focusing on short-term and long-term dependencies, respectively, which makes them suitable for comparison. Furthermore,


Table 6.5 Comparison with DL techniques on deterministic forecasting, reprinted from Ref. [28], copyright 2022, with permission from IEEE

Sweden
  GRU:          MSE 8.455,  MAE 1.721, MAPE 6.786%
  UCNN:         MSE 8.163,  MAE 1.921, MAPE 6.929%
  UCNN + GRU:   MSE 7.476,  MAE 1.708, MAPE 6.122%
  UResnet:      MSE 7.944,  MAE 1.801, MAPE 5.784%
  Transformer:  MSE 7.078,  MAE 1.525, MAPE 6.861%
  Proposed:     MSE 6.875,  MAE 1.487, MAPE 5.236%
Norway
  GRU:          MSE 0.335,  MAE 0.414, MAPE 1.150%
  UCNN:         MSE 0.409,  MAE 0.526, MAPE 1.446%
  UCNN + GRU:   MSE 0.330,  MAE 0.423, MAPE 1.148%
  UResnet:      MSE 0.354,  MAE 0.472, MAPE 1.339%
  Transformer:  MSE 0.279,  MAE 0.425, MAPE 1.195%
  Proposed:     MSE 0.251,  MAE 0.369, MAPE 1.009%
Denmark
  GRU:          MSE 11.654, MAE 2.383, MAPE 5.017%
  UCNN:         MSE 10.903, MAE 2.750, MAPE 5.547%
  UCNN + GRU:   MSE 9.479,  MAE 2.605, MAPE 5.174%
  UResnet:      MSE 11.153, MAE 2.395, MAPE 4.931%
  Transformer:  MSE 13.650, MAE 2.457, MAPE 5.175%
  Proposed:     MSE 8.504,  MAE 2.236, MAPE 4.641%
Finland
  GRU:          MSE 11.083, MAE 2.451, MAPE 9.396%
  UCNN:         MSE 16.603, MAE 3.112, MAPE 10.758%
  UCNN + GRU:   MSE 10.332, MAE 2.598, MAPE 9.134%
  UResnet:      MSE 12.098, MAE 2.739, MAPE 9.721%
  Transformer:  MSE 11.302, MAE 2.462, MAPE 8.793%
  Proposed:     MSE 8.857,  MAE 2.158, MAPE 7.909%

UCNN+GRU is the combination of UCNN and GRU in a sequential structure. Meanwhile, Resnet is a basic convolution-based method for target regression; in our case, we substitute all the convolution layers in the traditional Resnet with unshared convolution layers for enhancement, which forms UResnet. The Transformer is also selected, as it has been a very popular method in recent years.

The deterministic forecasting results are shown in Table 6.5 and Fig. 6.8. Among the benchmarks, UCNN and GRU show worse skill than the others due to their limited feature extraction abilities in the long term and short term, respectively. Combining UCNN and GRU improves the predictive performance over UCNN and GRU alone. For instance, the MAPE of UCNN+GRU on the four datasets is 6.122%, 1.148%, 5.174% and 9.134%, respectively, which outperforms UCNN by 0.807%, 0.298%, 0.373% and 1.624%, respectively. Accordingly, the combination of UCNN and GRU improves upon GRU in MAPE, MSE and MAE as well. This may be because combining UCNN and GRU takes both the long-term


Fig. 6.8 Deterministic forecasting results of proposed model. (2019.10.12 00:00 CET–2019.10.14 00:00 CET)

and short-term dependency features into consideration. UResnet gives a better performance than UCNN on the four datasets thanks to its residual architecture, which addresses the problem of degradation. In contrast, our proposed model not only has the feature extraction capabilities of UCNN and GRU, but also has a residual structure, which enables it to outperform UCNN+GRU and UResnet. Specifically, the MAPE of our proposed model on the four datasets is 5.236%, 1.009%, 4.641% and 7.909%, respectively, while UResnet achieves 5.784%, 1.339%, 4.931% and 9.721%, correspondingly. Meanwhile, our proposed model achieves a lower MAPE than the Transformer, since it contains a dense skip connection over the attention and mitigates the loss brought by the attention. Considering all the models for comparison, it can be observed that our proposed model takes the lead in the


Table 6.6 Comparison with DL techniques on interval forecasting (RWS), reprinted from Ref. [28], copyright 2022, with permission from IEEE

  Model         Sweden   Norway   Denmark  Finland
  GRU           0.1973   0.1754   0.2639   0.4187
  UCNN          0.1982   0.1497   0.1657   0.2158
  UCNN + GRU    0.1468   0.0907   0.1115   0.2198
  UResnet       0.1772   0.1219   0.1660   0.2104
  Transformer   0.1374   0.0937   0.1295   0.2273
  Proposed      0.1248   0.0890   0.1105   0.2032

forecast even in the highly volatile datasets of Denmark and Finland, proving its robustness in dealing with temporal and feature-wise variabilities. To better present the deterministic forecasting results, 48 h of forecasts and real values are shown in Fig. 6.8. For clarity of presentation, we only show the results of GRU and the proposed model compared with the actual electricity price. In these figures, the blue dashed lines stand for the real price values and the red dotted lines represent the forecasting results of our proposed model; the forecasting results of GRU are plotted as well to stress the superiority of our proposed model. It can be observed in Fig. 6.8 that our proposed model approximates the real values better than GRU around the peaks of the electricity price. For instance, during 16:00–20:00 on 2019/10/13 in Sweden, 10:00–20:00 on 2019/10/12 in Norway, 6:00–16:00 on 2019/10/13 in Finland and 17:00–24:00 on 2019/10/13 in Denmark, the predictions generated by our proposed model are closer to the real values than those of GRU.

From the perspective of interval forecasting, our proposed model also achieves better performance than the other DL models. As shown in Table 6.6, the RWSs of our proposed model are 0.1248, 0.089, 0.1105 and 0.2032 on the four datasets, respectively, which evidently outperforms UResnet, Transformer, UCNN+GRU, UCNN and GRU individually. It is worth mentioning that the quantile factor α is set as 0.05 for the upper bound and 0.95 for the lower bound. Furthermore, the interval forecasting results are visualized in Fig. 6.9, where the performance of our proposed model is also promising. For datasets with relatively flat fluctuations (Sweden and Norway), the 90% interval contains most of the actual prices, which verifies the great interval forecasting performance. Meanwhile, for datasets with large spikes (Denmark and Finland), the severe volatility makes it more challenging to forecast precisely. Thus, fewer real price points are covered by the 90% interval in Denmark and Finland. These phenomena might be attributed to the internal differences among these countries. Specifically, the electricity markets in Sweden and Norway are rather stable; i.e., they have large quantities of consumption per capita while maintaining abundant electricity production. However, Finland cannot transform its hydro reservoir resources into electricity as fully as Sweden and Norway do. Therefore, Finland relies largely on importing electricity,


Fig. 6.9 Interval forecasting results of our proposed model. (2019.06.01 00:00 CET–2019.06.08 03:00 CET) Reprinted from Ref. [28], copyright 2022, with permission from IEEE

and thus the uncertainty of the price fluctuation arises. It can be seen from Fig. 6.9 that the day-ahead price rises and falls significantly over time, which makes it harder to predict. Denmark has a large share of wind power, which increases the difficulty of DAEPF due to the temporal and spatial characteristics of wind. Generally, the changing trends of the intervals are similar to the actual trend of the electricity price, as shown in Fig. 6.9, which verifies the performance of our proposed model. Likewise, we only show the results of the 90% interval for clear presentation.

To summarize, in Case 1, by comparing the deterministic and interval forecasting performance of the model before and after component removal, the effectiveness


of each proposed component could be verified. Besides, Case 2 has demonstrated the superior nonlinear approximation ability of our proposed model, which far surpasses the statistical methods, including NF and Auto-ARIMA. Furthermore, Case 3 has shown the superiority of our proposed model over other DL models in both deterministic and interval forecasting.

6.6 Conclusion

An effective DL-based DAEPF model has been proposed in this chapter for deterministic and interval forecasting. Recognizing that temporal variability exists in electricity price datasets, a coherently aggregating structure of UCNN and GRU is proposed to extract multi-term dependency features. Moreover, by reconstructing the UCNN and GRU blocks into a drop-connected structure, referred to as the drop-connected UCNN-GRU, overfitting is also well alleviated. PReLU activation combined with residual connection is further integrated into our model to avoid degradation. Considering the feature-wise variability, the feature-wise attention block is proposed for autoweighting in the feature dimension, and introducing the dense skip connection into this block reduces the potential information loss brought by attention. Lastly, by aggregating the drop-connected UCNN-GRU and the dense skip attention, we derive the final DL model, which is verified to outperform other conventional DL and statistical models. To further enhance the performance of our DAEPF model, future work will emphasize more informative forecasting. For instance, density forecasting of the day-ahead electricity price is a feasible enhancement, since it generates full distributional information on the future uncertainties of the electricity price, which may provide more utility to decision-makers.

6.7 Quantile Regression

Quantile regression is a convenient method to transform a deterministic forecast into an interval forecast. In this way, we can reconstruct the loss function into a quantile loss to achieve interval prediction. The smooth loss function is verified to guarantee the stability of back-propagation [20]. The formulation of the smooth loss function $L_{\mathrm{Sm}}$ is

$$L_{\mathrm{Sm}}(Y, \hat{Y}) = \begin{cases} 0.5\,(Y - \hat{Y})^2 & |Y - \hat{Y}| < 1 \\ |Y - \hat{Y}| - 0.5 & |Y - \hat{Y}| \ge 1 \end{cases} \tag{6.13}$$

where $Y$ and $\hat{Y}$ stand for the real price series and the predicted price series, respectively. In this way, the quantile loss function $L_{\mathrm{QR}}(\cdot)$ can be realized by weighting the smooth loss function, which can be formulated as

$$L_{\mathrm{QR}} = \sum_{i:\, Y_i > \hat{Y}_i} \alpha\, L_{\mathrm{Sm}}(Y_i, \hat{Y}_i) + \sum_{i:\, Y_i \le \hat{Y}_i} (1 - \alpha)\, L_{\mathrm{Sm}}(Y_i, \hat{Y}_i) \tag{6.14}$$

where $\alpha$ is a quantile coefficient in the range (0, 1). We can then derive the upper and lower bounds of the prediction interval by changing the value of $\alpha$.
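To make the construction concrete, the following is a minimal NumPy sketch of Eqs. (6.13)–(6.14). The function names and the unit threshold in the smooth (Huber-style) loss are illustrative assumptions for this sketch, not the chapter's reference implementation.

```python
import numpy as np

def smooth_loss(y, y_hat):
    """Smooth (Huber-style) loss of Eq. (6.13), computed elementwise."""
    err = np.asarray(y) - np.asarray(y_hat)
    abs_err = np.abs(err)
    return np.where(abs_err < 1.0, 0.5 * err ** 2, abs_err - 0.5)

def quantile_smooth_loss(y, y_hat, alpha):
    """Quantile-weighted smooth loss of Eq. (6.14)."""
    base = smooth_loss(y, y_hat)
    # One branch (y > y_hat) is weighted by alpha, the other by (1 - alpha).
    weight = np.where(np.asarray(y) > np.asarray(y_hat), alpha, 1.0 - alpha)
    return np.sum(weight * base)

# Varying alpha between 0 and 1 yields the different quantiles whose
# regressions are used as the upper and lower interval bounds.
y = np.array([30.0, 32.5, 28.0])
print(quantile_smooth_loss(y, y + 1.0, alpha=0.05),
      quantile_smooth_loss(y, y - 1.0, alpha=0.95))
```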

6.8 Formulation of the Evaluation Index

In the interval prediction, the relative Winkler score (RWS) is adopted for model evaluation. Taking $U_i$, $L_i$ and $y_i$ $(i = 1, 2, \ldots, m)$ as the upper boundary, the lower boundary of the interval prediction results and the real electricity price, respectively, the formulation of the RWS is shown in Eq. (6.15):

$$\mathrm{RWS} = \frac{1}{m} \sum_{i=1}^{m} L_i^{\mathrm{RWS}} \tag{6.15}$$

$$L_i^{\mathrm{RWS}} = \begin{cases} \delta_i & L_i \le y_i \le U_i \\ \delta_i + \dfrac{L_i - y_i}{y_i} & y_i < L_i \\ \delta_i + \dfrac{y_i - U_i}{y_i} & y_i > U_i \end{cases} \tag{6.16}$$

where $\delta_i$ is an index of the obtained prediction range:

$$\delta_i = 2 \times \frac{U_i - L_i}{U_i + L_i}. \tag{6.17}$$
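As a concrete reading of Eqs. (6.15)–(6.17), a short NumPy sketch follows; the array names are illustrative and the code simply mirrors the piecewise definitions above.

```python
import numpy as np

def relative_winkler_score(y, lower, upper):
    """RWS of Eqs. (6.15)-(6.17) for real prices y and interval [lower, upper]."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    delta = 2.0 * (upper - lower) / (upper + lower)        # Eq. (6.17)
    score = delta.copy()
    below = y < lower
    above = y > upper
    score[below] += (lower[below] - y[below]) / y[below]   # Eq. (6.16), y_i < L_i
    score[above] += (y[above] - upper[above]) / y[above]   # Eq. (6.16), y_i > U_i
    return score.mean()                                    # Eq. (6.15)

# Example: a 90% interval that misses one of three observations.
print(relative_winkler_score(y=[40.0, 55.0, 38.0],
                             lower=[35.0, 45.0, 40.0],
                             upper=[50.0, 60.0, 52.0]))
```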

6.9 PReLU: A Solution to the Neuron Inactivation

In this part, we explain the neuron inactivation brought by ReLU and then elucidate the adaptive solution that PReLU provides. Consider a neuron with an unshared convolution followed by a ReLU activation. Mathematically, take $x_j^{(i)}$ as the output of the jth neuron in the ith layer and $L$ as the loss function. The input of the (i + 1)th layer, $x_j^{(i+1)}$, is obtained as

$$x_j^{(i+1)} = f_R\!\left(h_{\mathrm{Ucov}}^{(i)}\right) \tag{6.18}$$

$$h_{\mathrm{Ucov}}^{(i)} = x_j^{(i)} W_j^{(i)} + b_j^{(i)}, \tag{6.19}$$

where $W_j^{(i)}$, $b_j^{(i)}$ and $h_{\mathrm{Ucov}}^{(i)}$ stand for the corresponding weight, bias and output of the unshared convolution, respectively.


Then, according to Eqs. (6.1) and (6.19), the derivative of the loss with respect to the weight in the ith layer is obtained by the chain rule as

$$\frac{\partial L}{\partial W_j^{(i)}} = \frac{\partial L}{\partial x_j^{(i+1)}} \frac{\partial x_j^{(i+1)}}{\partial W_j^{(i)}} = \frac{\partial L}{\partial x_j^{(i+1)}} \frac{\partial x_j^{(i+1)}}{\partial h_{\mathrm{Ucov}}^{(i)}} \frac{\partial h_{\mathrm{Ucov}}^{(i)}}{\partial W_j^{(i)}} = \frac{\partial L}{\partial x_j^{(i+1)}} \frac{\partial f_R\!\left(h_{\mathrm{Ucov}}^{(i)}\right)}{\partial h_{\mathrm{Ucov}}^{(i)}} \frac{\partial h_{\mathrm{Ucov}}^{(i)}}{\partial W_j^{(i)}}, \tag{6.20}$$

where the gradient⁴ of $f_R\!\left(h_{\mathrm{Ucov}}^{(i)}\right)$ is

$$\nabla f_R\!\left(h_{\mathrm{Ucov}}^{(i)}\right) = \begin{cases} 0 & h_{\mathrm{Ucov}}^{(i)} \le 0 \\ 1 & h_{\mathrm{Ucov}}^{(i)} > 0 \end{cases}. \tag{6.21}$$

Consider the update function of the weight $W_j^{(i)}$,

$$W_j^{(i)} := W_j^{(i)} - \Delta W_j^{(i)}, \tag{6.22}$$

where the weight update $\Delta W_j^{(i)}$ with the learning rate $\eta$ is represented by

$$\Delta W_j^{(i)} = \eta \frac{\partial L}{\partial W_j^{(i)}} = \eta \frac{\partial L}{\partial x_j^{(i+1)}} \nabla f_R\!\left(h_{\mathrm{Ucov}}^{(i)}\right) x_j^{(i)}. \tag{6.23}$$

Then, we can make the following inference according to Eqs. (6.20)–(6.23).

Inference 1: If $h_{\mathrm{Ucov}}^{(i)} \le 0$, then according to Eq. (6.21), $\nabla f_R(h_{\mathrm{Ucov}}^{(i)}) = 0$. Using Eq. (6.23), we obtain

$$\Delta W_j^{(i)} = 0. \tag{6.24}$$

Thus, the weight $W_j^{(i)}$ stops updating. As the bias is not modified by the input, if the bias update is negative, i.e., $\Delta b_j^{(i)} < 0$, then the output of the unshared convolution will be stuck at a negative value, considering the update of $h_{\mathrm{Ucov}}^{(i)}$

⁴ Mathematically, $f_R$ is not differentiable at its zero-singularity, but software implementations usually return its left derivative instead of null; hence, 0 is taken as the derivative at the zero-singularity point of $f_R$.

$$h_{\mathrm{Ucov}}^{(i)} := h_{\mathrm{Ucov}}^{(i)} - \Delta h_{\mathrm{Ucov}}^{(i)} = h_{\mathrm{Ucov}}^{(i)} + \Delta b_j^{(i)}, \tag{6.25}$$

subject to $\Delta W_j^{(i)} = 0$. Then, the output of ReLU in the ith layer will remain 0, as the input $h_{\mathrm{Ucov}}^{(i)}$ is negative and updates decrementally. Back and forth, the weight will no longer update in this neuron, i.e.,

$$\Delta W_j^{(i)} \equiv 0, \tag{6.26}$$

which indicates a non-updating status of the neuron: it can no longer contribute to training, which causes the neuron inactivation. In contrast, taking PReLU as the substitute for ReLU, i.e., replacing the $f_R$ function with $f_{PR}$, the derivative in Eq. (6.23) is rewritten as Eq. (6.27) based on Eq. (6.2):

$$\Delta \hat{W}_j^{(i)} = \eta \frac{\partial L}{\partial \hat{W}_j^{(i)}} = \eta \frac{\partial L}{\partial x_j^{(i+1)}} \nabla f_{PR}\!\left(h_{\mathrm{Ucov}}^{(i)}\right) x_j^{(i)} \tag{6.27}$$

Note that $\Delta \hat{W}_j^{(i)}$ and $\partial \hat{W}_j^{(i)}$ denote the update and the partial derivative of the weight after the modification by PReLU, respectively. Likewise,

$$\nabla f_{PR}\!\left(h_{\mathrm{Ucov}}^{(i)}\right) = \frac{\partial f_{PR}\!\left(h_{\mathrm{Ucov}}^{(i)}\right)}{\partial h_{\mathrm{Ucov}}^{(i)}} = \begin{cases} \beta & h_{\mathrm{Ucov}}^{(i)} \le 0 \\ 1 & h_{\mathrm{Ucov}}^{(i)} > 0 \end{cases} \tag{6.28}$$

Knowing that $\beta \in (0, 1)$, we can make the following inference.

Inference 2: If $h_{\mathrm{Ucov}}^{(i)} \le 0$, then

$$\nabla f_{PR}\!\left(h_{\mathrm{Ucov}}^{(i)}\right) = \beta > 0. \tag{6.29}$$

Consequently, according to Eq. (6.27),

$$\Delta \hat{W}_j^{(i)} \not\equiv 0, \quad \text{subject to } \eta > 0,\ \frac{\partial L}{\partial x_j^{(i+1)}} \not\equiv 0,\ x_j^{(i)} \not\equiv 0. \tag{6.30}$$

By this means, the weight update continues without stagnation, which prevents the neuron from inactivation. Comparing Inferences 1 and 2, we can conclude that PReLU can well address the neuron inactivation.
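The inactivation argument can also be illustrated numerically. The plain-NumPy sketch below uses illustrative values rather than the chapter's network; it compares the weight update that a single neuron receives through ReLU and through PReLU when its pre-activation is negative, mirroring Inferences 1 and 2.

```python
import numpy as np

def relu_grad(h):
    return np.where(h > 0, 1.0, 0.0)      # Eq. (6.21)

def prelu_grad(h, beta=0.25):
    return np.where(h > 0, 1.0, beta)     # Eq. (6.28)

x = 0.8            # neuron input x_j^(i)          (illustrative value)
dL_dx_next = -0.4  # upstream gradient dL/dx_j^(i+1)
h = -1.3           # negative pre-activation h_Ucov^(i)
eta = 0.1          # learning rate

# Weight update of Eq. (6.23): eta * dL/dx * f'(h) * x
dw_relu = eta * dL_dx_next * relu_grad(h) * x    # 0.0 -> the neuron stops updating
dw_prelu = eta * dL_dx_next * prelu_grad(h) * x  # nonzero -> training continues
print(dw_relu, dw_prelu)
```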


References 1. M. Sun, T. Zhang, Y. Wang, G. Strbac, C. Kang, Using Bayesian deep learning to capture uncertainty for residential net load forecasting. IEEE Trans. Power Syst. 35(1), 188–201 (2020) 2. Y. Wang, N. Zhang, C. Kang, M. Miao, R. Shi, Q. Xia, An efficient approach to power system uncertainty analysis with high-dimensional dependencies. IEEE Trans. Power Syst. 33(3), 2984–2994 (2018) 3. C. Wan, Z. Xu, Y. Wang, Z.Y. Dong, K.P. Wong, A hybrid approach for probabilistic forecasting of electricity price. IEEE Trans. Smart Grid 5(1), 463–470 (2014) 4. O. Abedinia, N. Amjady, H. Zareipour, A new feature selection technique for load and price forecast of electrical power systems. IEEE Trans. Power Syst. 32(1), 62–74 (2017) 5. L. Wang, Z. Zhang, J. Chen, Short-term electricity price forecasting with stacked denoising autoencoders. IEEE Trans. Power Syst. 32(4), 2673–2681 (2017) 6. R. Weron, Electricity price forecasting: a review of the state-of-the-art with a look into the future. HSC Res. Rep. 30(4), 1030–1081 (2014) 7. F.J. Nogales, J. Contreras, A.J. Conejo, R. Espinola, Forecasting next-day electricity prices by time series models. IEEE Trans. Power Syst. 17(2), 342–348 (2002) 8. J. Contreras, R. Espinola, F.J. Nogales, A.J. Conejo, ARIMA models to predict next-day electricity prices. IEEE Trans. Power Syst. 18(3), 1014–1020 (2003) 9. J.P. González, A.M.S. Muñoz San Roque, E.A. Pérez, Forecasting functional time series with a new Hilbertian ARMAX model: application to electricity price forecasting. IEEE Trans. Power Syst. 33(1), 545–556 (2018) 10. S. Chai, Z. Xu, Y. Jia, Conditional density forecast of electricity price based on ensemble ELM and logistic EMOS. IEEE Trans. Smart Grid 10(3), 3031–3043 (2019) 11. L. Zhang, P.B. Luh, Neural network-based market clearing price prediction and confidence interval estimation with an improved extended Kalman filter method. IEEE Trans. Power Syst. 20(1), 59–66 (2005) 12. J. Guo, P.B. Luh, Improving market clearing price prediction by using a committee machine of neural networks. IEEE Trans. Power Syst. 19(4), 1867–1876 (2004) 13. J. Lago, F. De Ridder, B. De Schutter, Forecasting spot electricity prices: deep learning approaches and empirical comparison of traditional algorithms. Appl. Energy 221, 386–405 (2018) 14. A. Graves, A.R. Mohamed, G. Hinton, Speech recognition with deep recurrent neural networks, in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013, pp. 6645–6649 15. U. Ugurlu, I. Oksuz, O. Tas, Electricity price forecasting using recurrent neural networks. Energies 11(5), 1–23 (2018). (May) 16. J. Bottieau, L. Hubert, Z. De Grève, F. Vallée, J. Toubeau, Very-short-term probabilistic forecasting for a risk-aware participation in the single price imbalance settlement. IEEE Trans. Power Syst. 35(2), 1218–1230 (2020) 17. M. Zahid, F. Ahmed, N. Javaid, R.A. Abbasi, H.S. Zainab Kazmi, A. Javaid, M. Bilal, M. Akbar, M. Ilahi, Electricity price and load forecasting using enhanced convolutional neural network and enhanced support vector regression in smart grids. Electronics 8(2), 122 (2019) 18. H.Y. Cheng, P.H. Kuo, Y. Shen, C.J. Huang, Deep Convolutional Neural Network Model for Short-Term Electricity Price Forecasting. arXiv:2003.07202 (2020) 19. M. Adil, N. Javaid, N. Daood, M. Asim, I. Ullah, M. 
Bilal, Big data based electricity price forecasting using enhanced convolutional neural network in the smart grid, in Workshops of the International Conference on Advanced Information Networking and Applications, 2020, pp. 1189–1201 20. Z. Li, Y. Li, Y. Liu, P. Wang, R. Lu, H.B. Gooi, Deep learning based densely connected network for load forecasting. IEEE Trans. Power Syst. 36(4), 2829–2840 (2021) 21. F. Ziel, Forecasting electricity spot prices using lasso: on capturing the autoregressive intraday structure. IEEE Trans. Power Syst. 31(6), 4977–4987 (2016)


22. C. Liu, W.H. Hsaio, T. YaoChung, Time series classification with multivariate convolutional neural network. IEEE Trans. Industr. Electron. 66(6), 4788–4797 (2019) 23. K. Greff, R.K. Srivastava, J. Koutnik, B.R. Steunebrink, J. Schmidhuber, LSTM: a search space odyssey. IEEE Trans. Neural Netw. 28(10), 2222–2232 (2017) 24. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, in Proceedings of the 31st International Conference on Neural Information Processing Systems, vol. 30, 2017, pp. 5998–6008 25. Y. Dong, J.B. Cordonnier, A. Loukas, Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth. arXiv: Learning (2021) 26. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting. J. Machine Learn. Res. 15(1), 1929–1958 (2014) 27. K. He, X. Zhang, S. Ren, J. Sun, Identity Mappings in Deep Residual Networks. arXiv:1603.05027 (2016) 28. Y. Li, Y. Ding, Y. Liu, T. Yang, P. Wang, J. Wang, W. Yao, Dense skip attention based deep learning for day-ahead electricity price forecasting. IEEE Trans. Power Syst. 1–19 (2022) 29. I. Goodfellow, Y. Bengio, A. Courville, Deep Learning (2016) 30. Z. Li, M. Dong, S. Wen, H. Xiang, P. Zhou, Z. Zeng, CLU-CNNs: object detection for medical images. Neurocomputing 350, 53–59 (2019) 31. D.P. Kingma, J. Ba, Adam: A Method for Stochastic Optimization (2017) 32. G.U. Yule, M.G. Kendall, An Introduction to the Theory of Statistics (Griffin, London, 1968)

Chapter 7

Uncertainty Characterization of Power Grid Net Load of Dirichlet Process Mixture Model Based on Relevant Data

7.1 Introduction

The net load of the power grid is the active power difference between the electricity demand of power users and renewable energy generation. With the rapid development of renewable energy, large-scale integration of wind and photovoltaic power contributes to reducing fossil energy consumption and environmental pollution. However, with an increasing proportion of fluctuating and intermittent renewable generation, accurate prediction of the net load becomes very difficult, which challenges the safe and economic operation of the power system [1, 2]. Therefore, it is very important to effectively characterize the uncertainty of the net load so that power reserve resources can be allocated rationally. Recent research on the characterization of net load uncertainty in power grids falls into two categories. The first characterizes the uncertainty of the net load by probabilistic or interval forecasting [3]. The second studies the prediction errors of deterministic forecasts and analyzes their statistical properties [4]. Since deterministic forecasts are easier to produce than probabilistic or interval forecasts, net load uncertainty characterization methods of the second category have received much attention from domestic and international scholars. Rajbhandari et al. used the logistic distribution to characterize the prediction error of the net load, which describes its skewness better than the Gaussian distribution [5, 6]. However, when the penetration of distributed renewables is high, the logistic distribution cannot describe the multi-peak characteristics of the prediction error [7]. To address this issue, the studies in [8–10] use a Gaussian mixture model (GMM) to characterize the uncertainty of load and renewables, and the peak characteristics can be better described by fitting the net load prediction error with the GMM. However, the GMM requires a prior specification of the number of probability distribution components, which currently relies mainly on empirical settings and generally leads to large errors. To address the abovementioned drawbacks of the GMM, Sun et al. used the Dirichlet process mixture model


(DPMM) to characterize the net load uncertainty [11]. However, this work assumes that the observed data are independent. In fact, the net load is a time series with a strong temporal correlation [12], and the traditional DPMM does not consider this correlation, which leads to bias and an inability to fully exploit the net load information. To address this shortcoming of the DPMM, this chapter proposes a Bayesian framework based on the data-relevance Dirichlet process mixture model (DDPMM), which uses an improved posterior probability distribution considering data association and combines it with variational Bayesian inference to obtain the optimal number of mixture components and their parameters, so as to better characterize the net load uncertainty. Firstly, a mixture model based on the Dirichlet process is constructed, which mainly includes the weight of each mixture component and its corresponding parameters. Secondly, an improved posterior distribution is proposed based on the variational inference method to describe the relationship between each component and the observed data, exploiting the fact that the net load data are correlated. Moreover, the mixture model is solved iteratively using an improved expectation-maximization (EM) algorithm to obtain the marginal probability distribution of the prediction error under different prediction values, thus realizing the uncertainty characterization of the net load. Finally, experimental validation is performed on data from the Belgian grid. The main contributions are as follows:
(1) A data-correlation-based mixture model of the Dirichlet process is developed, which considers the temporal correlation of the net load to improve the ability of the posterior probability distribution of the DPMM to describe its uncertainty. Compared with the DPMM algorithm, the DDPMM can utilize the net load data more effectively.
(2) A new evidence lower bound is constructed based on the improved posterior probability distribution of the DDPMM. Based on the EM algorithm, the convergence of the DDPMM is proved using the update information of the newly constructed evidence lower bound.
(3) The effectiveness of the DDPMM is verified on real data from the Belgian grid. The results show that the DDPMM, which takes the correlation information of the net load data into account, can significantly accelerate convergence and further improve the accuracy of the obtained mixture model compared with the traditional DPMM.
The remainder of this chapter is organized as follows: Sect. 7.2 introduces the Bayesian framework of the Dirichlet mixture model based on data association. Section 7.3 introduces the variational Bayesian inference of the Dirichlet mixture model for data association. Case studies are conducted in Sect. 7.4 for verification. Finally, the conclusion is drawn in Sect. 7.5.


7.2 A Bayesian Framework Based on the Dirichlet Mixture Model of Data Association

In this section, firstly, the data correlation of the net load is analyzed to reveal the correlation of the net load time series; secondly, the traditional DPMM is improved using this data correlation, and the Bayesian framework of the Dirichlet mixture model considering data correlation is proposed.

7.2.1 Net Load Time-Series Correlation

The net load data form a time series whose current state is correlated with historical moments; thus, this section uses the lagged correlation to describe the degree of data correlation of the net load [13]. The specific equation is as follows:

$$\mathrm{corr}(\Xi, T) = \frac{\sum_{t=1}^{N-T} \zeta_t \cdot \zeta_{t+T}}{\sqrt{\sum_{t=1}^{N-T} \zeta_t^2} \cdot \sqrt{\sum_{t=1}^{N-T} \zeta_{t+T}^2}} \tag{7.1}$$

In Eq. (7.1), the net load observations $\Xi = \{\zeta_1, \zeta_2, \ldots, \zeta_t, \ldots, \zeta_N\}$ of Belgium for the whole year 2019 are used as an example, where $\zeta_t$ is the tth observation, the temporal resolution of the dataset is 15 min, and $N = 96 \times 365 = 35040$ is the amount of data for one year [14]. $T$ is the lag of the net load time series in steps, and $\zeta_{t+T}$ is the $(t+T)$th observation. Figure 7.1 shows the correlation degree of the net load: the horizontal and vertical coordinates represent the time lag on a scale of ten steps, each step being 15 min, and the color scale on the right indicates the magnitude of the data correlation. In Fig. 7.1, it can be observed that the degree of data correlation of the net load decreases as the lag increases. Taking a lag of ten steps ($T = 10$), the value of the color scale at the coordinate (0, 10) represents the degree of association of the current data $\zeta_{11}, \zeta_{12}, \ldots, \zeta_N$ with the net load $\zeta_1, \zeta_2, \ldots, \zeta_{N-10}$ from 150 min earlier, i.e., 0.82. This indicates that the current data are still strongly associated with the net load from 150 min before. Therefore, the influence of historical data on the current moment needs to be considered when fitting the net load data with a GMM or DPMM. However, traditional GMMs and DPMMs assume independence between individual observations when dealing with time series and do not consider the correlation of the net load data, which may lead to probabilistic mixture models containing more components and can easily cause overfitting, thus affecting the accuracy of the characterization of the net load uncertainty [15].
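A minimal NumPy sketch of the lagged correlation in Eq. (7.1) is given below; the synthetic series is an illustrative stand-in for the Belgian net load data, and the summation limits follow the reconstruction above.

```python
import numpy as np

def lagged_correlation(series, lag):
    """corr(Xi, T) of Eq. (7.1) for a 1-D net load series and a lag of T steps."""
    zeta = np.asarray(series, dtype=float)
    a = zeta[: len(zeta) - lag]   # zeta_t,     t = 1 .. N - T
    b = zeta[lag:]                # zeta_{t+T}
    return np.sum(a * b) / (np.sqrt(np.sum(a ** 2)) * np.sqrt(np.sum(b ** 2)))

# Synthetic quarter-hourly net load with a daily pattern (96 steps per day).
t = np.arange(96 * 30)
net_load = 1000 + 300 * np.sin(2 * np.pi * t / 96) + 50 * np.random.randn(t.size)
print(lagged_correlation(net_load, lag=10))   # correlation at a 150-min lag
```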


Fig. 7.1 Data relevance of net load, reprinted from Ref. [15], copyright2022, with permission from Acta Automatica Sinica

To this end, this chapter improves the traditional DPMM by considering the characteristics of net load data association and proposes a Dirichlet mixed model Bayesian framework that considers data association.

7.2.2 Bayesian Framework Based on the Dirichlet Mixture Model of Data Association

The proposed Bayesian framework of the Dirichlet mixture model based on data association consists of three stages, as shown in Fig. 7.2. In the first stage, the net load data are normalized. The observed and predicted values of the net load are denoted by x and y, respectively, and their values are given in the actual Belgian grid dataset studied in this chapter; the normalization operation is then performed for x and y separately [16]. In the second stage, the joint probability distribution of x and y is obtained using the proposed DDPMM. Firstly, a stick-breaking process [17] is used to represent the Dirichlet process and obtain the mixture model of this joint probability distribution; secondly, an improved variational Bayesian inference considering data association is used to calculate the mean vectors and covariance matrices of the mixture model and the number of mixture components. Finally, in the third stage, the marginal probability distribution P(z|y) of the prediction error given the predicted values is obtained from the joint probability distribution of x and y, where z = x − y is the prediction error of the net load. The uncertainty of the net load is described by the above nonparametric Bayesian framework based on the DDPMM, which mainly includes stick breaking, the Dirichlet process and the improved variational inference; these are described separately below.


Fig. 7.2 Framework of DDPMM variational Bayes, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

7.2.3 Dirichlet Process and Stick-Breaking Construction

DDPMM is an improvement based on DPMM, where DPMM is a nonparametric Bayesian soft-clustering method, i.e., a way to attribute a data point $\lambda_n$ to a certain Gaussian distribution with a certain probability [18], where $E = \{\lambda_1, \lambda_2, \ldots, \lambda_N\}$ denotes the dataset. DPMM can obtain from the data the optimal probability distribution describing this dataset $E$ without specifying the parameters of the mixture model and the number of mixture components. Consider a continuous probability distribution $H(\Theta)$: each of its sample points $\Theta_k = \{\mu_k, \sigma_k\}$ represents a Gaussian distribution, where $\mu_k$ and $\sigma_k$ are its mean and variance; thus $H(\Theta)$ defines a mixture model, i.e., one obtained by weighting an infinite number of Gaussian distributions, and each data point is assigned to some component of $H(\Theta)$.


Fig. 7.3 Process of stick-breaking, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

However, when $H(\Theta)$ is a continuous distribution, the probability of drawing two identical Gaussian distributions from $H(\Theta)$ is 0; i.e., any two different data points $\lambda_i, \lambda_j$ $(i \ne j)$ would be classified into different Gaussian distributions, and clustering would lose its meaning. Therefore, it is necessary to discretize $H(\Theta)$ so that different data points $\lambda_i, \lambda_j$ $(i \ne j)$ can be classified into the same class. This discretization process is called the Dirichlet process, a stochastic process used for Bayesian modeling and analysis [19]. Define $H(\Theta)$ as the base distribution and denote the distribution after discretization as $G$. The distribution $G$ obtained by sampling from the Dirichlet process (DP) has the following representation:

$$G \sim \mathrm{DP}(\varphi, H(\Theta)) \tag{7.2}$$

where $\varphi$ is the concentration parameter, which represents the degree of discretization of $H(\Theta)$. When $\varphi \to 0$, the distribution $G$ reaches its most discrete state, i.e., it becomes a discrete distribution with only one sample in the sample space; when $\varphi \to \infty$, the distribution $G$ equals $H$ and becomes a continuous distribution, i.e., $G = H$. Between these two extreme cases, the larger the concentration parameter $\varphi$, the closer $G$ is to the base distribution $H(\Theta)$ [20]. The Dirichlet process is mainly constructed through the stick-breaking construction, which gives an intuitive picture of how the mixture model is built: a stick of unit length is broken repeatedly into sticks of different lengths, and this process is shown to obey the Dirichlet process [21]. The stick-breaking construction can be obtained using the following two expressions:

$$v_m \sim \mathrm{Beta}(v_m \mid 1, \varphi) \tag{7.3}$$

$$\pi_m = v_m \prod_{s=1}^{m-1} (1 - v_s) \tag{7.4}$$

Consider a unit-length stick, as shown in Fig. 7.3. Each time, a certain proportion $v_m$ of the current length is selected, where $v_m$ obeys $\mathrm{Beta}(v_m \mid 1, \varphi)$ and $v_m \in (0, 1)$, as defined by Eq. (7.3). The length $\pi_m$ of the stick broken off each time is the weight of the corresponding sample in $G$, i.e., the proportion of each mixture component, and these weights form a probability measure satisfying the following condition:

$$\sum_{m=1}^{\infty} \pi_m = 1 \tag{7.5}$$

Fig. 7.4 Base distribution, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

Therefore, the lengths of the sticks broken off sum to 1, i.e., the weights of the mixture components of the distribution $G$ sum to 1. The distribution $G$ is obtained by sampling one Gaussian distribution at a time from $H(\Theta)$ and using the stick-breaking construction to calculate its weight coefficient; the expression of $G$ is

$$G = \sum_{m=1}^{\infty} \pi_m \delta_{\Theta_m} \tag{7.6}$$

where $\delta_{\Theta_m}$ is the Kronecker function with the following expression:

$$\delta_{\Theta_m}(\omega) = \begin{cases} 1 & \omega = \Theta_m \\ 0 & \text{otherwise} \end{cases} \tag{7.7}$$

Figure 7.4 shows a schematic diagram of the base distribution: the base distribution $H(\Theta)$ is located above the horizontal axis, and the discretized distribution $G$ is below it, where $\Theta_i$ represents a Gaussian distribution sampled from $H(\Theta)$ and $\pi_i$ represents the weight of each Gaussian distribution. Figure 7.5 illustrates the formation process of the distribution $G$: first, $\Theta_i$ is sampled from $H(\Theta)$, and subsequently its weight $\pi_i$ is obtained by the stick-breaking construction. Repeating this operation yields the discretized distribution $G$, and Eq. (7.6) is its corresponding mathematical expression.
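The stick-breaking construction of Eqs. (7.3)–(7.6) can be sketched directly in NumPy. The truncation level, the concentration parameter and the Gaussian base distribution below are illustrative choices for the sketch, not values used in the chapter.

```python
import numpy as np

def stick_breaking_gmm(phi, truncation=50, rng=None):
    """Draw weights pi_m (Eqs. 7.3-7.4) and atoms Theta_m = (mu_m, sigma_m) from a
    Gaussian base distribution H, approximating G in Eq. (7.6) by truncation."""
    rng = np.random.default_rng(rng)
    v = rng.beta(1.0, phi, size=truncation)        # v_m ~ Beta(1, phi)
    remaining = np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    weights = v * remaining                        # pi_m = v_m * prod_{s<m}(1 - v_s)
    mus = rng.normal(0.0, 1.0, size=truncation)    # atoms sampled from the base H
    sigmas = rng.gamma(2.0, 1.0, size=truncation)
    return weights, mus, sigmas

weights, mus, sigmas = stick_breaking_gmm(phi=2.0, truncation=50, rng=0)
print(weights[:5], weights.sum())   # truncated weights sum to almost exactly 1
```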


Fig. 7.5 Dirichlet process mixture model, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

7.2.4 Nonparametric Dirichlet Mixture Model

In fact, the DPMM obtained by the above stick-breaking construction, combined with the statistical properties of the data, is able to update the optimal number of mixture components, as well as the weight $\pi_m$ and parameters $\Theta_m$ of each mixture component, without prior knowledge. For a two-dimensional net load data point $\lambda_n = [x_n, y_n]$, the probability density function of $\lambda_n$ can be expressed using the DPMM as follows:

$$p(\lambda_n \mid \pi, \Theta) = \sum_{m=1}^{\infty} \pi_m P_m(\lambda_n \mid \Theta_m) \tag{7.8}$$

where $x_n$ and $y_n$ denote the actual net load observation and the predicted net load of the nth data point, respectively, $\pi = \{\pi_m\}_{m=1}^{\infty}$ and $\Theta = \{\Theta_m\}_{m=1}^{\infty}$ denote the sets of $\pi_m$ and $\Theta_m$, respectively, and $P_m(\cdot)$ denotes the mth mixture component of the mixture model. The components are generally considered to be Gaussian distributed a priori [22]; therefore, the DPMM can be expressed as a combination of multivariate Gaussian distributions as follows:

$$p(\lambda_n \mid \pi, \Theta) = \sum_{m=1}^{\infty} \pi_m \mathcal{N}_m(\lambda_n \mid \mu_m, \Lambda_m^{-1}) \tag{7.9}$$

where $\mathcal{N}_m(\lambda_n \mid \mu_m, \Lambda_m^{-1})$ is the mth component of the mixture model and is Gaussian distributed, with $\mu_m$ and $\Lambda_m^{-1}$ being its mean vector and covariance matrix. Equation (7.9) represents the conditional probability that the data point $\lambda_n$ is generated by the mixture model. On this basis, consider the dataset $E = \{\lambda_n\}_{n=1}^{N}$, where, for $\forall i, j \in \{1, 2, \ldots, N\}$, $\lambda_i$ and $\lambda_j$ are independent of each other. Thus, the DPMM based on the dataset $E$ can be expressed as


Fig. 7.6 DPMM probability graph model, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

$$p(E \mid \pi, \Theta) = \prod_{n=1}^{N} \sum_{m=1}^{\infty} \pi_m \mathcal{N}_m(\lambda_n \mid \mu_m, \Lambda_m^{-1}) \tag{7.10}$$

To associate each data point with each mixture component, a set of hidden variables $Z = \{\{z_m^{[n]}\}_{n=1}^{N}\}_{m=1}^{\infty}$ is introduced: if the nth data point $\lambda_n$ is associated with the mth mixture component, then $z_m^{[n]} = 1$; otherwise $z_m^{[n]} = 0$. Thus, the DPMM can be further expressed as

$$p(E \mid Z, \Theta) = \prod_{n=1}^{N} \prod_{m=1}^{\infty} \left[\mathcal{N}_m(\lambda_n \mid \mu_m, \Lambda_m^{-1})\right]^{z_m^{[n]}} \tag{7.11}$$

Figure 7.6 shows the relationship between the variables of the DPMM. First, the mixture weights $\pi$ are generated from the concentration coefficient $\varphi$ and represent the weight of each mixture component; $\Theta_k$ represents the Gaussian distribution sampled from the base distribution $H(\Theta)$. In addition, the mixture weights $\pi$ are indirectly related to the net load data through the hidden variable $Z$, since they indicate the proportion of each mixture component in the mixture model. Therefore, the hidden variable $Z$ is generated from $\pi$, as shown in Fig. 7.6, and the conditional probability distribution of $Z$ given $\pi$ can be expressed in the following form:

$$p(Z \mid \pi) = \prod_{n=1}^{N} \prod_{m=1}^{\infty} \pi_m^{z_m^{[n]}} \tag{7.12}$$

The optimal parameters $\pi$ and $\Theta$ can be estimated by variational Bayesian inference (VBI) together with the above nonparametric Dirichlet mixture model, which in turn leads to the final DDPMM. VBI is described in detail in the next section.


7.3 The Dirichlet Mixture Model Based on VBI for Data Association

In this section, a Dirichlet mixture model considering data association is proposed. The posterior distribution accounting for data association is obtained by using the data-association information to improve the variational inference. First, the variational inference problem is transformed into an optimization problem by approximating the likelihood function of the net load data with a variational distribution; then, a new variational posterior distribution is established by considering the characteristics of the net load data association; finally, the DDPMM is proved to be convergent by constructing new upper and lower bounds.

7.3.1 Nonparametric Dirichlet Mixture Model

VBI is used to estimate the parameters $\theta = \{\pi, \Theta\}$ and the posterior probability density of the hidden variable $Z$, where $\pi = \{\pi_m\}_{m=1}^{\infty}$. Given the data $E$, the objective is to solve for the mixture model $p$ such that the likelihood $p(E \mid \theta)$ of the data $E$ given the parameters $\theta$ is maximized:

$$\arg\max_{\theta}\, p(E \mid \theta) \tag{7.13}$$

Since it is difficult to compute an analytic expression for $p(E \mid \theta)$, this chapter uses VBI to approximate the original distribution $p$ with a variational distribution $q$. VBI assumes that the variational distribution $q$ comes from a tractable family of distributions $L$ and approximates the posterior distribution by minimizing the Kullback–Leibler (KL) divergence. Thus, VBI transforms the Bayesian inference problem into an optimization problem. By definition, the log-likelihood $\ln p(E \mid \theta)$ can be decomposed as follows:

$$\begin{aligned} \ln p(E \mid \theta) &= \int \ln p(E \mid \theta) \cdot q(Z)\,\mathrm{d}Z \\ &= \int \ln \frac{p(E, Z \mid \theta)}{p(Z \mid E, \theta)} \cdot q(Z)\,\mathrm{d}Z \\ &= \int \ln \frac{p(E, Z \mid \theta)}{q(Z)} \cdot q(Z)\,\mathrm{d}Z + \int \ln \frac{q(Z)}{p(Z \mid E, \theta)} \cdot q(Z)\,\mathrm{d}Z \\ &= \mathrm{ELBO} + \mathrm{KL}(q(Z) \parallel p(Z \mid E, \theta)) \end{aligned} \tag{7.14}$$

where ELBO and $\mathrm{KL}(q(Z) \parallel p(Z \mid E, \theta))$ can be expressed in the following form:

$$\mathrm{ELBO} = \int \ln \frac{p(E, Z \mid \theta)}{q(Z)} \cdot q(Z)\,\mathrm{d}Z \tag{7.15}$$

Fig. 7.7 DPMM probability model graph considering data relevance, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

$$\mathrm{KL}(q(Z) \parallel p(Z \mid E, \theta)) = \int \ln \frac{q(Z)}{p(Z \mid E, \theta)} \cdot q(Z)\,\mathrm{d}Z \tag{7.16}$$

Therefore, maximizing $p(E \mid \theta)$ requires optimizing the two quantities $Z$ and $\theta$ in two steps. The first step optimizes $Z$ so that the variational distribution $q(Z)$ is close to $p$; this is called the expectation step (E step), i.e., the expected variational distribution is obtained. The second step, called the maximization step (M step), optimizes the parameter $\theta$ to maximize $\ln p(E \mid \theta)$. In this chapter, we focus on the data correlation of the net load. Figure 7.7 shows the DDPMM probability graph considering data correlation, which differs from Fig. 7.6 in that the data at the current moment refer to the information of the previous moment, i.e., part of the information of $Z_i$ comes from $Z_{i-1}$. This part is determined by the variational distribution $q(Z)$, so the DDPMM mainly improves the variational distribution so that $q(Z)$ approximates $p(E \mid \theta)$. The improved variational distribution $q(Z)$ can be obtained by using the correlation properties of the net load, i.e., by minimizing $\mathrm{KL}(q(Z) \parallel p(Z \mid E, \theta))$, which corresponds to the E step of the EM algorithm, i.e., obtaining the desired variational distribution. Here, $\mathrm{KL}(q(Z) \parallel p(Z \mid E, \theta))$ is the Kullback–Leibler divergence, which can be regarded as a distance between two distributions: the smaller the KL divergence, the more similar the two distributions are. From the non-negativity of the KL divergence, it is known that $\ln p(E \mid \theta) \ge \mathrm{ELBO}$, where ELBO is the evidence lower bound. When the observed and predicted data are known and the mixture model is fixed, $p(E \mid \theta)$ is a constant, so the problem of minimizing the KL divergence can be transformed into the following problem:

$$\begin{aligned} \arg\min \mathrm{KL}(q(Z) \parallel p(Z \mid E, \theta)) &\Leftrightarrow \arg\max \mathrm{ELBO} \\ &= \int \ln \frac{p(E, Z \mid \theta)}{q(Z)} \cdot q(Z)\,\mathrm{d}Z = E_{q(Z)}\!\left[\ln \frac{p(E, Z \mid \theta)}{q(Z)}\right] \\ &= E_{q(Z)}[\ln p(E \mid Z, \Theta)] + E_{q(Z)}[\ln p(Z \mid \pi)] - E_{q(Z)}[\ln q(Z)] \end{aligned} \tag{7.17}$$


VBI usually assumes that the variational distribution comes from a mean-field family in which the hidden variables are independent of each other. In order to better describe the variational distribution, the stick-breaking construction is truncated in this chapter, and a slightly larger value is taken as the upper limit of the number of mixture components:

$$q(Z) = \prod_{i=1}^{K} q_i(z_i) \tag{7.18}$$

where $K$ is the upper bound on the number of mixture components. The update formula for each hidden variable is obtained by cyclic coordinate optimization. For the kth hidden variable, all quantities that are not related to $z_k$ are treated as constants. Thus, the objective function of Eq. (7.14) can be transformed into the following form:

$$\begin{aligned} E_{q(Z)}\!\left[\ln \frac{p(E, Z \mid \theta)}{q(Z)}\right] &= -\int \prod_{j=1}^{K} q_j \ln \frac{\prod_{j=1}^{K} q_j}{p}\,\mathrm{d}Z \\ &= -\int \prod_{j=1}^{K} q_j \ln \prod_{j=1}^{K} q_j\,\mathrm{d}Z + \int \prod_{j=1}^{K} q_j \ln p\,\mathrm{d}Z \\ &= -\int \prod_{j=1}^{K} q_j (\ln q_k - \ln p)\,\mathrm{d}Z - \int \prod_{j=1}^{K} q_j \sum_{i \ne k} \ln q_i\,\mathrm{d}Z \\ &= -\int \prod_{j=1}^{K} q_j (\ln q_k - \ln p)\,\mathrm{d}Z_{-k}\,\mathrm{d}z_k - \sum_{i \ne k} \ln q_i \\ &= \int q_k \left\{E_{q(z_{-k})} \ln p - \ln q_k\right\}\mathrm{d}z_k + \text{constant} \end{aligned} \tag{7.19}$$

The update equation of $q_k$ is obtained according to Eq. (7.17) as follows:

$$\frac{\partial E_{q(Z)}\!\left[\ln \frac{p(E, Z \mid \theta)}{q(Z)}\right]}{\partial q_k} = 0 \;\Rightarrow\; \ln q_k = E_{q(z_{-k})} \ln p - 1 + \text{constant} \;\Rightarrow\; q_k^* \propto \exp\!\left\{E_{q(z_{-k})} \ln p(E, Z \mid \theta)\right\} \tag{7.20}$$

Therefore, the optimal variational distribution is $q_k^*$, where $E_{q(z_{-k})}$ denotes the expectation with respect to all hidden variables except the kth one. In addition, the second-order partial derivative of $E_{q(Z)}\!\left[\ln \frac{p(E, Z \mid \theta)}{q(Z)}\right]$ with respect to $q_k$ can be expressed as

$$\frac{\partial^2 E_{q(Z)}\!\left[\ln \frac{p(E, Z \mid \theta)}{q(Z)}\right]}{\partial q_k^2} = \frac{\partial\left\{\int E_{q(z_{-k})} \ln p - \ln q_k - 1\,\mathrm{d}z_k\right\}}{\partial q_k} = -\int \frac{1}{q_k}\,\mathrm{d}z_k \tag{7.21}$$

The optimal variational distribution can be obtained by iteratively updating the variational distribution of each hidden variable using Eq. (7.20). In addition, it is known from Eq. (7.21) that the second-order partial derivative $-\int \frac{1}{q_k}\,\mathrm{d}z_k$ of $E_{q(Z)}\!\left[\ln \frac{p(E, Z \mid \theta)}{q(Z)}\right]$ is negative for each $q_k$, so the objective is concave in each $q_k$, and the update of Eq. (7.20) guarantees that the ELBO is monotonically non-decreasing. The logarithm $\ln q_k^*(Z)$ of the optimal variational distribution can be expressed in the following form:

$$\ln q_k^*(Z) = \sum_{j \ne k} E_{q_j} \ln p(E, Z \mid \theta) + \text{constant} = \sum_{j \ne k} E_{q_j}\{\ln p(E \mid Z, \Theta) + \ln p(Z \mid \pi)\} + \text{constant} \tag{7.22}$$

Substituting Eqs. (7.11) and (7.12) into Eq. (7.22) and absorbing the variables unrelated to $Z$ into the constant term, Eq. (7.23) is obtained as follows:

$$\ln q^*(Z) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_k^{[n]} \ln \rho_k^n + \text{constant} \tag{7.23}$$

where $\rho_k^n$ denotes the probability density value of the kth mixture component at the data point $\lambda_n$. By simplifying Eq. (7.23), the optimal variational distribution $q^*(Z)$ can be obtained as follows:

$$q^*(Z) = \prod_{n=1}^{N} \prod_{k=1}^{K} (r_k^n)^{z_k^{[n]}} \tag{7.24}$$


where $r_k^n$ is the responsibility of the kth mixture component for the nth data point $\lambda_n$ and represents the probability that $\lambda_n$ belongs to the kth mixture component, which is calculated by normalizing $\rho_k^n$:

$$r_k^n = p(z_k^{[n]} = 1) = \frac{\rho_k^n}{\sum_{j=1}^{\infty} \rho_j^n} \tag{7.25}$$

7.3.2 Variational Posterior Distribution Considering Data Association

The traditional VBI assumes that the hidden variables are independent of each other; however, the net load is a time series, and thus consecutive data points are correlated. Therefore, the optimal variational distribution considering this correlation can be expressed in the following form:

$$q^*(z_k) = q(z_k \mid z_{k-1}) \;\Rightarrow\; \tilde{p}\!\left(z_k^{[n]} = 1\right) = p\!\left(z_k^{[n]} = 1 \,\middle|\, z_{k'}^{[n-1]} = 1\right) = \frac{p\!\left(z_k^{[n]} = 1,\, z_{k'}^{[n-1]} = 1\right)}{p\!\left(z_{k'}^{[n-1]} = 1\right)} \tag{7.26}$$

where $k'$ denotes the mixture component with the maximum responsibility for the (n − 1)th data point. Considering the information of the nth data point together with the mixture component of the (n − 1)th data point, $p(z_k^{[n]} = 1, z_{k'}^{[n-1]} = 1)$ can be defined in the following form:

$$p\!\left(z_k^{[n]} = 1,\, z_{k'}^{[n-1]} = 1\right) = \left[w\, p\!\left(z_k^{[n]} = 1\right) + w'\, p\!\left(z_{k'}^{[n]} = 1\right)\right] \times \left[w\, p\!\left(z_k^{[n-1]} = 1\right) + w'\, p\!\left(z_{k'}^{[n-1]} = 1\right)\right] = \left(w\, r_k^n + w'\, r_{k'}^n\right)\left(w\, r_k^{n-1} + w'\, r_{k'}^{n-1}\right) \tag{7.27}$$

where $w = \frac{\pi_k}{\pi_k + \pi_{k'}}$ and $w' = \frac{\pi_{k'}}{\pi_k + \pi_{k'}}$ denote the proportions of the kth and k'th components. Thus, the improvement of the variational distribution through the improved responsibility values can be expressed in the following form:

$$\tilde{\rho}_k^n = \frac{p\!\left(z_k^{[n]} = 1,\, z_{k'}^{[n-1]} = 1\right)}{p\!\left(z_{k'}^{[n-1]} = 1\right)} = \frac{\left(w\, r_k^n + w'\, r_{k'}^n\right)\left(w\, r_k^{n-1} + w'\, r_{k'}^{n-1}\right)}{r_{k'}^{n-1}} = \left(w' + w\,\frac{r_k^{n-1}}{r_{k'}^{n-1}}\right)\left(w\, r_k^n + w'\, r_{k'}^n\right) \tag{7.28}$$

Normalizing $\tilde{\rho}_k^n$ gives the normalized responsibility $\tilde{r}_k^n$:

$$\tilde{r}_k^n = \frac{\tilde{\rho}_k^n}{\sum_{j=1}^{\infty} \tilde{\rho}_j^n} = \frac{\left(w' + w\,\frac{r_k^{n-1}}{r_{k'}^{n-1}}\right)\left(w\, r_k^n + w'\, r_{k'}^n\right)}{\sum_{k=1}^{K}\left(w' + w\,\frac{r_k^{n-1}}{r_{k'}^{n-1}}\right)\left(w\, r_k^n + w'\, r_{k'}^n\right)} \tag{7.29}$$

The difference between the improved responsiveness rkn and rkn can be obtained by Eq. (7.25) as shown by Eq. (7.31). w  · min(rkn , rkn ) rkn − rkn ≥  K − rkn = LBnk n n max(r , r )  k k k=1 max(rkn , rkn ) n n rk − rk ≤ − rkn = HBnk K w  · k=1 min(rkn , rkn )

(7.31)

Therefore, the difference between the improved modified evidence lower bound (MELBO) and the original ELBO can be obtained according to Eqs. (7.15) and (7.28) as follows:

146

7 Uncertainty Characterization of Power Grid …

MELBO-ELBO = E q(Z ) [ln p(E | Z , )] + E q(Z ) [ln p(Z | π )] − E q(Z ) [q  (Z )] − E q(Z ) [ln p(E | Z , )] − E q(Z ) [ln p(Z | π )] + E q(Z ) [q(Z )]  N K  N  K   n n n n n n = (rk − rk ) · ln πk ) + ( (rk lnrk − rk ln rk ) n=1 k=1

(7.32)

n=1 k=1

N  K  = (rkn − rkn )(ln πk + 1 + ln ξkn ) n=1 k=1

where ξkn is between rkn and rkn . Thus, the upper and lower bounds for the MELBOELBO of Eq. (7.32) can be obtained as follows: N  K  (rkn − rkn )(ln πk + 1 + ln ξkn ) n=1 k=1



N  K 

H Bkn (1 + ln[πk · max(rkn , rkn )])

(7.33)

n=1 k=1

= MHB

N  K  (rkn − rkn )(ln πk + 1 + ln ξkn ) n=1 k=1



N  K 

L Bkn (1 + ln[πk · min(rkn , rkn )])

(7.34)

n=1 k=1

= MLB Therefore, the upper and lower bounds of MELBO-ELBO are modified high bound (MHB) and modified low bound (MLB), where MHB and MLB denote the improved upper and improved lower bounds, respectively. Then the optimal q ∗ (Z ) is found using the E M algorithm considering the characteristics of the net load data association. Specifically, the specific form of the variational distribution is first solved. Subsequently, the q ∗ (Z )is fixed so that the change of MELBO is only related to the parameter θ . In turn, the optimal log-likelihood p(E, Z | θ ) q(Z )dZ can be obtained by optimizing θ , and thus the optifunction q(Z ) mal ln p(E | θ ) is calculated. The specific expressions and the proof of convergence can be found in Appendix A. The iterative process corresponding to the improved posterior distribution of the EM algorithm combined with the correlation of the net load data is shown in Fig. 7.8. First, the log-likelihood value and the evidence lower bound ELBO are obtained by

7.4 Example Analysis

147

Fig. 7.8 EM iterative graph considering data relevance, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

Eq. (7.14), and KL(q  p) denotes the distance between the variational distribution q(Z ) and p(Z | E, θ ), as shown in Fig. 7.8a. After step E, the lower bound ELBO is updated. Figure 7.8b indicates that the monotonically increasing ELBO leads to a decrease in KL(q  p) and a shortening of the distance between the variational distribution q(Z ) and the target distribution p(Z | E, θ ). After obtaining ELBO, the DPMM posterior distribution is calculated considering the characteristics of the net load data association, so the modified variational posterior, Modified q(Z ), is used. This posterior distribution causes a change in the ELBO of Fig. 7.8b, so the improved upper bound MHB and the improved lower bound MLB are introduced to obtain the range of change in the lower bound MELBO of the improved evidence relative to the ELBO, where MELBO lies between MHB and MLB, as shown in Fig. 7.8c. After the improvement of the posterior distribution, the final MELBO can be obtained. Finally, the log-likelihood function is maximized by optimizing the parameter θ in M-step, as shown in Fig. 7.8d. It should be noted that since the MELBO obtained by the improved EM algorithm lies between MHB and MLB, it will lead to a difference between it and the ELBO before the improvement, and then oscillation will occur during the iteration. However, the DDPMM with data association considered can improve the convergence speed, and detailed results will be shown in Chap. 4.

7.4 Example Analysis 7.4.1 Description of the Algorithm To verify the superiority of the proposed data-linked Dirichlet hybrid model, this chapter validates it against the net load data of the Belgian grid from 2019 to 2020, where the installed capacities of wind and PV are 3157 MW and 3369 MW, account-

148

7 Uncertainty Characterization of Power Grid …

Fig. 7.9 Net load and load curve on July 1, 2019, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

ing for 16.0% and 17.0% of the total installed capacity, respectively. Figure 7.9 shows the net load versus net load forecast curve and the load versus load forecast curve for July 1, 2020, where it should be noted that the forecast values are given directly from the dataset. It can be seen that the load prediction error is small, while the net load error is large because it includes renewable energy sources with high randomness. To verify the superiority of the proposed DDPMM, this chapter uses the net load data of 2019 as the training set and the net load data of 2020 as the test set with a temporal resolution of 15 min. In the nonparametric Bayesian framework based on DDPMM proposed in this chapter, the observed data [z, y]T are first normalized, where z = x − y. Then, the joint probability distribution of[z, y] is obtained using DDPMM. Finally, the marginal probability density of the prediction error at each prediction value is obtained based on the prediction value of the net load, and the merit of DDPMM is evaluated by the related index. The results section first verifies the convergence of DDPMM; further, the fitting effect of DDPMM is obtained by fitting the net load forecasting error; finally, the overall characterization effect of DDPMM on the net load forecasting error is evaluated by interval metrics.

7.4.2 DDPMM Convergence Analysis The DDPMM makes use of the information of time-series data correlation to make the convergence faster. Figure 7.10 represents the gradual convergence of DDPMM and DPMM as the number of iterations increases, where the vertical coordinate represents the change value p(E | θ ) for each iteration, and if the change value tends

7.4 Example Analysis

149

to 0, it is considered to have converged. In order to compare the convergence speed of DDPMM and DPMM in detail, the logarithmic vertical axis is used in this chapter, and the logarithmic operation with a base of 10 is performed on the vertical axis. From Fig. 7.10, it can be observed that DDPMM and DPMM converge gradually after 3000 generations, where for DPMM the variation of p(E | θ ) per generation is slightly greater than 1010 , while for DDPMM the variation is smaller. It is also observed that the DDPMM oscillates repeatedly during convergence. This is because the DDPMM takes into account the characteristics of net load data correlation when updating, which can lead to oscillations as the changes of p(E | θ ) corresponding to the DDPMM are uneven compared to the DPMM. In addition, the DDPMM algorithm is improved based on DPMM, so in the early part of the iteration, the change trend of DDPMM is approximately the same as that of the DPMM counterpart, as shown in the subplot in Fig. 7.10. As the number of iterations increases, the DDPMM updates faster as more information on the net load data is obtained. At roughly 3000 iterations, the variation of p(E | θ ) of DDPMM gradually decreases, while DPMM remains basically the same, which means that DDPMM converges faster compared to DPMM. Therefore, from the convergence curve in Fig. 7.10, we can see that DDPMM sacrifices some stability to find more excellent solutions to speed up the iterations when the first iteration is performed.

7.4.3 Analysis of DDPMM Fitting Effect In order to compare the effectiveness of the conditional probability distribution error fitting of DDPMM, this chapter uses GMM and DPMM to compare them, where the number of groups of GMM is determined by the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) [22, 23], and the specific formulas of AIC and BIC are as follows. AIC = 2k − 2 ln(L)

(7.35)

BIC = k ln(n) − 2 ln(L)

(7.36)

where k denotes the number of model parameters, L denotes the likelihood function of the model, and n is the number of net load samples. The AIC and BIC are indicators of the goodness of fit of the statistical model. By calculating the AIC and BIC values when the number of GMM mixture groups is 1, 2, . . . m (where m denotes a larger number), m 1 and m 2 are obtained so that their corresponding AIC and BIC values are the smallest, which are denoted as GMM-AIC and GMM-BIC. The number of groups of GMM-AIC is calculated to be 20, and the number of groups of GMM-BIC is calculated to be 13. In addition, the optimal number of groups obtained from DPMM is 15, while the optimal number of groups obtained

150

7 Uncertainty Characterization of Power Grid …

Fig. 7.10 PDF of net load forecast error conditions under different forecast values, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

from DDPMM is 5. This is because DDPMM fits the net load data [x, y]T by considering more information and thus merges similar Gaussian distributions. However, the number of mixed groups of DPMM and GMM is more than that of DDPMM, which indicates that it considers more local information and tends to lead to overfitting. In this chapter, according to the number of data samples, they are equally divided into 10, and the average value of each sample entry is used as the predicted value of net load for each conditional probability distribution. Figure 7.11 shows the conditional probability distributions of DDPMM, DPMM and GMM at each prediction value, and the fitting effect of each model on the test set is obtained by comparing it with the histogram of net load data. It can be observed through Fig. 7.11 that when the net load forecast is small or large, as shown in subfigures (a) and (j), the forecast error of net load is large. the GMM model cannot characterize its uncertainty well, while DDPMM and DPMM perform better in this aspect. In order to further quantitatively compare the fitting effect of net load prediction errors of different models, this chapter uses two indicators, log-likelihood (Log-L) and chi-square goodness of fit (Gof), where log-likelihood is the value of ln p(E | θ ), and the larger the value of this indicator represents. The larger the value of this indicator means the better the corresponding model fit; Gof measures the difference between two probability distributions, and the smaller the value of Gof means the smaller the difference of the fit effect. Table 7.1 shows the log-likelihood values of the four models, and DDPMM has a larger log-likelihood value on the 2020 net load data, indicating that DDPMM fits better. Table 7.2 shows the Gof values for the four models, and it can be observed that DDPMM performs relatively well, with the smallest Gof in subplots (a), (d), (g) and (j), while performing slightly weaker than DDPMM in the other cases. This is because Gof is used in this chapter to evaluate the fit of the conditional probability distribution under the ten forecast values in Fig. 7.11, which mainly describes local information.

7.4 Example Analysis

151

Fig. 7.11 Interval radar chart with 0.95 confidence, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Table 7.1 Comparison of log-likelihood, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Model Log-L/103(test) DDPMM (5) DPMM (15) GMM-AIC (20) GMM-BIC (13)

1.639 1.625 1.612 1.611

7.4.4 DDPMM Interval Indicator Analysis Therefore, in order to better measure the ability of DDPMM to characterize the overall uncertainty of net load, this chapter further uses interval metrics to evaluate the overall information of net load forecast error. Specifically, we first set the forecast error quantile α and derive the upper and lower quantile xa/2 and x−a/2 corresponding to the forecast values according to the conditional probability distributions of forecast errors belonging to different forecast values in Fig. 7.11, so as to obtain the upper and lower bounds of the net load forecast. These predicted upper and lower bounds are

152

7 Uncertainty Characterization of Power Grid …

Table 7.2 Comparison of goodness of fit of chi-square, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Test (a) (b) (c) (d) (e) DDPMM DPMM GMM-AIC GMM-BIC Test DDPMM DPMM GMM-AIC GMM-BIC

4.36 4.87 5.32 5.64 (f) 1.96 1.79 2.01 2.53

3.03 3.15 3.54 3.46 (g) 1.59 1.98 2.42 2.07

3.58 4.46 4.20 3.95 (h) 1.96 1.31 1.95 2.84

1.63 2.46 2.87 2.63 (i) 5.32 4.92 5.94 4.99

1.54 1.37 2.01 2.34 (j) 5.11 7.08 5.58 5.98

then compared with the actual observed data xi to obtain the overall information of the net load, such as interval width, interval coverage and other indicators. The upper and lower interval bounds U α(xi ) and L α (xi ) of the predicted values are expressed as follows: U α (xi ) = xi + X α/2 L α (xi ) = xi + X −α/2

(7.37)

To evaluate the quality of the interval obtained by the probability distribution of the net load prediction error, five metrics are introduced in this chapter for evaluation, namely Winkler score, predict interval coverage probability (PICP), coverage widthbased criterion (CWC), average interval score (AIS) and mean predict interval center deviation (MPICD) [24]. PICP measures the coverage of the interval; Winkler, CWC and AIS describe the relationship between interval coverage and interval width; MPICD describes the position of the predicted value in the interval, and if it is close to the middle line of the interval, it means better results. The smaller the Winkler, CWC and MPICD the better, and the larger the PICP and AIS the better. Please refer to Appendix B for the specific mathematical expressions of these five indicators. In this chapter, a confidence level of 0.95 is chosen to construct the confidence interval, March, June, September and December are used as typical months for testing, and the results obtained for each model are shown in Fig. 7.11. It can be observed that at a confidence level of 0.95, the indicators of DDPMM perform well and its indicators are basically better than the other three models. Table 7.3 lists the values of the specific metrics corresponding to Figure (a), and it can be concluded that the Winkeler scores of DDPMM improve by 9.7%, 14.2% and 8.9% compared to DPMM, GMM-AIC and GMM-BIC, respectively, which is a significant improvement relative to DPMM. It is also observed that GMM-AIC has the largest Winkler score,

7.4 Example Analysis

153

Table 7.3 Interval index for March 2020 with 0.95 confidence level, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Model Winkler/102 PICP CWC AIS MPICD/102 DDPMM DPMM GMM-AIC GMM-BIC

33.57 37.18 39.11 36.86

0.80 0.70 0.63 0.69

4.14 22.58 78.71 26.93

−1.03 −1.56 −1.84 −1.51

6.03 6.05 6.07 6.08

Table 7.4 Interval index for March 2020 with 0.95 confidence level, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Model Winkler/102 PICP CWC AIS MPICD/102 DDPMM DPMM GMM-AIC GMM-BIC

29.84 30.91 33.19 31.46

0.93 0.86 0.75 0.84

0.50 1.18 6.68 1.56

−0.58 −0.68 −1.02 −0.77

4.59 4.61 4.65 4.63

Table 7.5 Interval index for September 2020 with 0.95 confidence level, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Model Winkler/102 PICP CWC AIS MPICD/102 DDPMM DPMM GMM-AIC GMM-BIC

30.60 32.59 34.90 32.85

0.89 0.80 0.72 0.79

0.93 3.71 13.76 3.68

−0.70 −0.96 −1.32 −1.01

5.02 5.03 5.03 5.02

while its number of mixed groups is 20, which indicates that GMM-AIC undergoes overfitting, while the other three models fit relatively well, with DDPMM fitting the best. In addition, CWC and AIS combine the prediction interval coverage and the width of the prediction interval, and it can be seen that DDPMM improves by 81.7% and 34.0%, respectively, compared with DPMM, which is a significant improvement. pICP is the interval coverage, and DDPMM reaches 0.80 at a confidence level of 0.95, while DPMM is 0.70, indicating that DDPMM can better characterize the net load uncertainty. MPICD describes whether the model is homogeneous, and it can be seen that the difference between DDPMM and the other three models is not very significant in this metric, but still better than the remaining three models. Tables 7.4, 7.5 and 7.6 gives the interval-specific metric values in Fig. 7.11b–d. It is clear that DDPMM performs best in Winkler scores, while the GMM-BIC model has better Winkler scores than DPMM. In addition, the superiority of DDPMM is more obvious in the PICP metric, which improves relative to DPMM in June, September and December by 8.1%, 11% and 18%, respectively. In the CWC and AIS indicators, DDPMM outperformed the remaining three models in September and December. In

154

7 Uncertainty Characterization of Power Grid …

Table 7.6 Interval index for December 2020 with 0.95 confidence level, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica Model Winkler/102 PICP CWC AIS MPICD/102 DDPMM DPMM GMM-AIC GMM-BIC

39.18 44.51 46.02 43.31

0.72 0.61 0.59 0.64

20.4 138.23 173.46 82.32

−1.97 −2.77 −3.00 −2.59

7.45 7.44 7.42 7.42

Table 7.7 Interval index for June 2020 with 0.8 confidence level, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

Model     Winkler/10²   PICP   CWC    AIS     MPICD/10²
DDPMM     32.76         0.77   0.41   −0.91   7.45
DPMM      34.70         0.68   1.34   −1.2    7.44
GMM-AIC   37.53         0.56   9.86   −1.66   7.42
GMM-BIC   35.41         0.66   1.90   −0.32   7.42

Table 7.8 Interval index for June 2020 with 0.5 confidence level, reprinted from Ref. [15], copyright 2022, with permission from Acta Automatica Sinica

Model     Winkler/10²   PICP   CWC    AIS     MPICD/10²
DDPMM     39.84         0.49   0.16   −1.93   7.45
DPMM      41.73         0.38   0.73   −2.3    7.44
GMM-AIC   44.26         0.31   2.19   −2.73   7.42
GMM-BIC   42.33         0.38   0.66   −2.4    7.42

In addition, in the comparison of the MPICD metric, DDPMM performs better in March and September, indicating that DDPMM can better characterize the uncertainty in the net load and that the resulting upper and lower bounds of the net load forecast are more evenly distributed on both sides of the observations. From Tables 7.7 and 7.8, it can be observed that DDPMM performs well at different confidence levels, and its PICP metric is closest to the nominal confidence level, indicating that the prediction intervals of DDPMM have good coverage. Further, the difference between the DDPMM and GMM-BIC metrics is small, and DPMM and GMM-BIC obtain a similar number of mixture components under the current dataset, with 15 for DPMM and 13 for GMM-BIC; DDPMM, however, describes more information using only five mixture components, which illustrates its advantage. The above results verify the superiority of DDPMM in characterizing the net load uncertainty. Compared with DPMM, GMM-AIC and GMM-BIC, DDPMM obtains a smaller number of mixture model components, extracts more effective information, and reduces the adverse effect of overfitting on characterizing the net load uncertainty.


7.5 Conclusion

In this chapter, we propose a Dirichlet process mixture model based on data association (DDPMM) and improve the posterior distribution by the variational inference method, so that the posterior distribution takes more information on the net load data association into account. An improved evidence lower bound is then constructed so that the DDPMM obtains a suitable variational distribution, and its convergence is proved by combining it with the EM algorithm. Finally, using the Belgian grid as a numerical example, different metrics are used to measure the local and global information of the DDPMM in characterizing the net load uncertainty. The validation results show that DDPMM converges faster, is more accurate, and can better characterize the net load uncertainty of the grid.

References

1. B.-N. Huang, Y. Wang, Y.-S. Li, X.-R. Liu, C. Yang, Multi-objective optimal scheduling of integrated energy systems based on distributed neurodynamic optimization. Acta Automatica Sinica 46, 1–19 (2020)
2. H. Tang, C. Liu, M. Yang, T. Xu Tang, B.-Q. Dan, K. Lv, Learning-based optimization of active distribution system dispatch in industrial park considering the peak operation demand of power grid. Acta Automatica Sinica 45, 1–15 (2019)
3. M. Sun, T. Zhang, Y. Wang, G. Strbac, C. Kang, Using Bayesian deep learning to capture uncertainty for residential net load forecasting. IEEE Trans. Power Syst. 35(1), 188–220 (2019)
4. Y. Wang, N. Zhang, Q. Chen, D.S. Kirschen, P. Li, Q. Xia, Data-driven probabilistic net load forecasting with high penetration of behind-the-meter PV. IEEE Trans. Power Syst. 33(3), 3255–3264 (2017)
5. N. Rajbhandari, W. Li, P. Du, S. Sharma, B. Blevins, Analysis of net-load forecast error and new methodology to determine non-spin reserve service requirement, in 2016 IEEE Power and Energy Society General Meeting, pp. 1–6
6. L. Alvarado-Barrios, R. Alvaro, B. Valerino et al., Stochastic unit commitment in microgrids: influence of the load forecasting error and the availability of energy storage. Renew. Energy 146, 2060–2069 (2020)
7. C. Tang, X. Jian, Y. Sun et al., A versatile mixture distribution and its application in economic dispatch with multiple wind farms. IEEE Trans. Sustain. Energy 8(4), 1747–1762 (2017)
8. B. Dong, Z. Li, S. Rahman et al., A hybrid model approach for forecasting future residential electricity consumption. Energy Build. 117, 341–351 (2016)
9. S. Zhao, T. Zhang, Z. Li et al., Distribution model of day-ahead photovoltaic power forecasting error based on numerical characteristic clustering. Autom. Electr. Power Syst. 43(13), 36–45 (2019)
10. W. Sun, M. Zamani, M.R. Hesamzadeh, H. Zhang, Data-driven probabilistic optimal power flow with nonparametric Bayesian modeling and inference. IEEE Trans. Smart Grid 11(2), 1077–1090 (2020)
11. P.J. Brockwell, R.A. Davis, J.O. Berger et al., Time Series: Theory and Methods (Springer, 2015)
12. K. Park, Fundamentals of Probability and Stochastic Processes with Applications to Communications (Springer, 2018)
13. L. Han, H. Jing, R. Zhang et al., Wind power forecast based on improved long short-term memory network. Energy, 116300 (2019)


14. D.M. Blei, M.I. Jordan, Variational inference for Dirichlet process mixtures. Bayesian Anal. 1(1), 121–143 (2006)
15. Y. Liu, Y. Zhao, Y. Li, T. Sun, Z. Zeng, Uncertainty characterization of power grid net load of Dirichlet process mixture model based on relevant data. Acta Automatica Sinica 48(3), 747–761 (2022)
16. Z. Li, Y. Li, Y. Liu, P. Wang, R. Lu, H.B. Gooi, Deep learning based densely connected network for load forecasting. IEEE Trans. Power Syst. 36(4), 2829–2840 (2021)
17. P. Deb, Finite Mixture Models 39(4), 521–541 (2000)
18. C.M. Bishop, Pattern recognition and machine learning. J. Electron. Imag. 16(4), 140–155 (2006)
19. N. Bouguila, D. Ziou, A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling. IEEE Trans. Neur. Netw. 21(1), 107–122 (2010)
20. J. Sethuraman, A constructive definition of the Dirichlet prior. Statistica Sinica 4(2), 639–650 (1994)
21. H. Dong, B. Dwk, C. Csp, Prior selection method using likelihood confidence region and Dirichlet process Gaussian mixture model for Bayesian inference of building energy models. Energy Build. 224(11029), 3 (2020)
22. H. Akaike, A new look at the statistical model identification. IEEE Trans. Autom. Control 19(6), 716–723 (1976)
23. Y. Wang, Q. Liu, Comparison of Akaike information criterion and Bayesian information criterion in selection of stock-recruitment relationships. Fisheries Res. 77(2), 220–225 (2006)
24. A. Khosravi, S. Nahavandi, D. Creighton et al., Lower upper bound estimation method for construction of neural network-based prediction intervals. IEEE Trans. Neur. Netw. 22(3), 337–346 (2011)

Chapter 8

Extreme Learning Machine for Economic Dispatch with High Penetration of Wind Power

8.1 Introduction

8.1.1 Background and Motivation

In recent years, renewable energy has received increasing attention and has been utilized more extensively throughout the world [1]. Taking wind power as an example, its installed capacity in China reached 140 million kilowatts by June 2016, accounting for 9.2% of all power generation capacity, and its penetration is projected to be as high as 33.8% by 2050, which shows that a power system with highly penetrated wind power is gradually taking shape [2]. Evidently, wind power brings significant economic benefits. However, it is challenging to address the uncertainties introduced by highly penetrated wind energy in power system operations such as economic dispatch (ED) [3]. Many efforts have been made so far, and the operational risks brought by wind power uncertainties have been studied [4, 5]. Nevertheless, both the potential and the risk deserve to be considered in ED with highly penetrated uncertain wind power. To be specific, for a given dispatch solution, the actual generation cost fluctuates around its expectation under different wind power scenarios. An actual cost higher than the expected one gives rise to the downside risk, which expresses the potential economic losses [6]. On the contrary, the upside potential focuses on the potential economic gain and is manifested by an actual cost that is lower than the expected one [7]. Therefore, we attempt to consider the upside potential and downside risk together, in order to properly study the ED problem with highly penetrated uncertain wind power. This is the motivation of our work.


8.1.2 Literature Review

At present, four methods are mainly used for dealing with wind power uncertainty in the context of ED, i.e., fuzzy, robust, interval, and stochastic optimization methods. In the fuzzy optimization method, the uncertain wind power is represented as a fuzzy variable, and fuzzy memberships are set to establish the fuzzy ED model. However, the values of such memberships are subjectively specified by power system dispatchers, so the obtained optimal ED solution may be affected by this subjectivity [8, 9].

The robust optimization method is an alternative technique for dealing with the ED, in which the uncertain wind power is represented by an interval variable. In [10], a robust dispatch method is proposed for optimizing power system operations, i.e., minimizing the generation cost while supporting transient stability under high-level wind power integration. In order to minimize the long-term operational cost considering operation and service constraints, a robust two-stage optimization approach is presented in [11] to manage energy generation in a grid-connected microgrid. Moreover, in [12], a robust day-ahead ED method is proposed considering highly penetrated wind power, and a formulation for the worst-case scenario is established. Consequently, the robust ED optimization method can well immunize against uncertainties; however, its drawback is conservativeness, due to its inherent nature of handling the worst-case scenario.

In comparison, the stochastic optimization method is the most widely used for solving ED with uncertain wind power, for two main reasons: (i) it uses probabilistic information, which is more comprehensive than that of interval and fuzzy variables; (ii) it adopts sampling analysis, which is more convenient for studying the risk effect of uncertain wind power in the ED. First, the uncertain wind speed/power and their forecast errors are assumed to follow certain probability distributions, e.g., the Weibull [13, 14] or Gaussian distribution [15, 16], which can be derived from statistics of historical data. Then, various wind power samples are generated from the probability distribution, so that the uncertainty can be represented [17]. Afterward, the stochastic optimization approach is used to establish the formulation of ED with uncertain wind power, which aims to find the optimal ED solution with respect to the minimal expected generation cost [18]. It is evident that this ED solution achieves the best average economic performance under the uncertain wind power environment. However, the financial risk is not taken into account. This kind of risk originated in the area of portfolio optimization and is formulated as the variance of economic benefits [19]; it thus describes the extent of fluctuations between actual gains and their expectations under uncertainty. As the ED problem aims to find the best solution considering uncertain wind power, it is similar to the portfolio optimization problem. That is, although the integrated wind power contributes economic benefits to power system operations, the financial risk also exists.


Therefore, the mean-variance (MV) model proposed by the economist Markowitz [19] has been applied to ED considering wind power integration [20]. In this model, the variance of the generation cost over multiple wind power samples is used to formulate the financial risk. However, the variance index does not distinguish between higher and lower economic benefits.¹ In other words, minimizing the variance suppresses the higher and lower benefits simultaneously, whereas it is natural to try to reduce only the lower benefits, which are a "pain" to decision-makers [21]. Therefore, a financial risk index called the downside risk has been introduced, which minimizes the lower semi-deviation so as to avoid lower benefits as much as possible [22]. For its application, Ref. [23] formulates a bi-objective optimization model for ED with wind power while considering downside risk constraints. Likewise, the relationship between the expectation and the downside risk of the total fuel cost is investigated in Ref. [24], considering the integrated wind power. However, in the context of economic power system operation, pursuing higher economic benefits is also encouraged. This is related to the concept of upside potential. Compared with the downside risk, it focuses on higher benefits rather than lower ones and can be formulated as the upper semi-deviation [25]. In other words, the downside risk is used to minimize potential losses, while the upside potential tries to maximize the gains. Therefore, we aim to consider the upside potential and downside risk simultaneously, in order to obtain more economic gains while reducing potential losses.

8.1.3 Contribution of This Paper

In this paper, we propose a multi-objective economic dispatch (MuOED) model with highly penetrated wind power, in which the upside potential and downside risk are both taken into account. The potential gains and losses, as well as the expected generation cost of ED with respect to dispatch solutions, are then obtained simultaneously. The main contributions of the paper are threefold.

(1) A MuOED model is formulated for assessing the expected generation cost, the downside risk, and the upside potential at the same time, under the environment of highly penetrated uncertain wind power;
(2) We develop a multi-objective optimization algorithm, i.e., the extreme learning machine (ELM) assisted group search optimizer with multiple producers (GSOMP), for solving the proposed MuOED model;
(3) Numerical case studies based on the Midwestern US power system are conducted, which verify the effectiveness of the proposed model and the good performance of the developed algorithm.

¹ A higher or lower benefit refers to a benefit sample that is greater or less than its expectation.


The rest of the paper is organized as follows. Section 8.2 presents the MuOED model. Section 8.3 shows details of the developed algorithm. Finally, numerical case studies are conducted in Sect. 8.4, and conclusions are drawn in Sect. 8.5.

8.2 Multi-objective Economic Dispatch Model

8.2.1 Formulations of Economic Dispatch

The ED problem aims to minimize the generation cost of a power system by optimizing the decision variables while satisfying a set of equality and inequality constraints related to secure power system operation. The decision variables include the generator active powers, generator voltages, transformer tap settings, and the power outputs of reactive power sources. The equality constraints are the power flow equations, which ensure the active and reactive power balance of the power system. The inequality constraints include the operating limits of the generation units, power transformers and reactive power sources, as well as power system security constraints such as the limits on bus voltages and branch apparent power flows. The ED problem is therefore formulated as

\min \; \sum_{m=1}^{N_G} \left[ c_m P_{G_m}^2 + b_m P_{G_m} + a_m + \left| d_m \sin\left( e_m \left( P_{G_m}^{\min} - P_{G_m} \right) \right) \right| \right]    (8.1)

s.t.

P_{G_i} - P_{D_i} = V_i \sum_{j=1}^{N_i} V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right)    (8.2)

Q_{G_i} - Q_{D_i} = V_i \sum_{j=1}^{N_i} V_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right)    (8.3)

P_{G_m}^{\min} \le P_{G_m} \le P_{G_m}^{\max}    (8.4)

Q_{G_m}^{\min} \le Q_{G_m} \le Q_{G_m}^{\max}    (8.5)

V_i^{\min} \le V_i \le V_i^{\max}    (8.6)

Q_{C_l}^{\min} \le Q_{C_l} \le Q_{C_l}^{\max}    (8.7)

T_k^{\min} \le T_k \le T_k^{\max}    (8.8)

S_n^{\min} \le S_n \le S_n^{\max}    (8.9)

The objective function (8.1) minimizes the total generation cost of the thermal units. For the mth unit, a_m, b_m and c_m are the cost coefficients, d_m and e_m are the coefficients of the sinusoidal term that models the valve-point effect [26], and P_{G_m}^{min} is its minimum power output; m = 1, 2, ..., N_G, where N_G is the total number of thermal units. Constraints (8.2)–(8.3) enforce the active and reactive power balance of the power system. P_{G_i} and Q_{G_i} are the injected active/reactive power at the ith bus, and P_{D_i} and Q_{D_i} are the corresponding active/reactive power demands, where i = 1, 2, ..., N_B and N_B is the total number of buses; N_i is the number of buses adjacent to the ith bus. Constraints (8.4)–(8.9) impose the operational limits. P_{G_m} and Q_{G_m} are the active and reactive power outputs of the mth unit. V_i, Q_{C_l}, T_k and S_n denote the voltage at the ith bus, the output of the lth reactive power compensation device, the tap ratio of the kth power transformer, and the apparent power flow in the nth branch, respectively, with l = 1, 2, ..., N_C, k = 1, 2, ..., N_T and n = 1, 2, ..., N_S, where N_C, N_T and N_S are the total numbers of reactive power compensation devices, power transformers, and branches in the grid. The superscripts min and max denote the corresponding lower and upper bounds. Therefore, the compact formulation of the ED problem is as follows.

\min \; C(x, u)    (8.10)

s.t. \; g(x, u) = 0    (8.11)

h(x, u) \le 0    (8.12)

where C(x, u) is the generation cost function shown in (8.1). x denotes the state variables, including the load-bus voltages V_L, the generator reactive powers Q_G and the apparent power flows S_n, i.e., x^T = [P_{G_1}, V_{L_1}, ..., V_{L_{N_L}}, Q_{G_1}, ..., Q_{G_{N_G}}, S_1, ..., S_{N_S}], where N_L is the total number of load buses. The vector of decision variables u comprises the generator active powers P_G, the generator voltages V_G, the transformer tap settings T and the outputs of the reactive power sources Q_C, i.e., u^T = [P_{G_2}, ..., P_{G_{N_G}}, V_{G_1}, ..., V_{G_{N_G}}, T_1, ..., T_{N_T}, Q_{C_1}, ..., Q_{C_{N_C}}]. g(x, u) and h(x, u) represent the equality constraints (8.2)–(8.3) and the inequality constraints (8.4)–(8.9), respectively.
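To make the cost model concrete, the sketch below evaluates the total fuel cost of the thermal units as in (8.1), including the valve-point term; the coefficient values are hypothetical, and the absolute value on the sinusoidal term follows the usual valve-point formulation.

```python
import numpy as np

def generation_cost(p_g, a, b, c, d, e, p_min):
    """Total fuel cost of the thermal units as in (8.1), valve-point effect included."""
    p_g, a, b, c, d, e, p_min = map(np.asarray, (p_g, a, b, c, d, e, p_min))
    quadratic = c * p_g**2 + b * p_g + a
    valve_point = np.abs(d * np.sin(e * (p_min - p_g)))
    return float(np.sum(quadratic + valve_point))

# Hypothetical two-unit example (coefficients are illustrative only)
cost = generation_cost(
    p_g=[55.0, 30.0],                       # dispatched active power (MW)
    a=[100.0, 120.0], b=[2.0, 2.5], c=[0.01, 0.012],
    d=[30.0, 25.0], e=[0.04, 0.05],
    p_min=[20.0, 15.0],
)
print(f"total generation cost: {cost:.2f} $/h")
```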

8.2.2 Multi-objective Economic Dispatch Model

To well describe the upside potential and downside risk brought by the uncertain wind power, we define the economic benefit as

B(x, u, W) = C(x_0, u_0) - C(x, u, W)    (8.13)

where C(x_0, u_0) is the minimal generation cost of the power system without wind power integrated, obtained by solving (8.10)–(8.12); x_0 and u_0 are the corresponding optimal state and decision variables. W denotes the integrated wind power, including the active power P_W and reactive power Q_W.


Then the expected value of the economic benefit is to be maximized:

\max \; B(x, u, W)_{\exp} = E_{P_W, Q_W}\left[ B(x, u, W) \right] = \sum_{p=1}^{N_S} (C_0 - C_p) P(C_p)    (8.14)

where E is the expectation operator, C_0 = C(x_0, u_0) and C_p = C(x, u, W_p). W_p is the pth wind power sample, which can be obtained by the Monte Carlo method [27], and N_S is the total number of samples. In detail, W_p = {P_W, Q_W} = {(P_{W_1}, P_{W_2}, ..., P_{W_M}), (Q_{W_1}, Q_{W_2}, ..., Q_{W_M})}, where M is the number of wind farms, and P_{W_q} and Q_{W_q} are the active and reactive power outputs of the qth wind farm. C_p is the generation cost corresponding to W_p, and P(C_p) is the probability that C_p occurs, i.e., the probability that W_p happens.

Note that \sum_{p=1}^{N_S} C_0 P(C_p) in (8.14) is simply the constant C_0, and thus (8.14) can be converted into (8.15), which minimizes the expected generation cost (EGC) in the uncertain wind power environment:

\min \; C(x, u, W)_{\exp} = E_{P_W, Q_W}\left[ C(x, u, W) \right] = \sum_{p=1}^{N_S} C_p P(C_p)    (8.15)

The downside risk (DSR) represents the potential losses, i.e., the case in which the actual economic benefit falls below its expected value. The lower semi-deviation has been used to describe this kind of risk [28]. The uncertain wind power may lead to benefits lower than the expected value in (8.14); therefore, we adopt the DSR in the ED problem in order to minimize the potential losses as much as possible. The DSR is formulated as

\mathrm{DSR}(x, u, W) = E_{P_W, Q_W}\left[ \left| B(x, u, W) - B(x, u, W)_{\exp} \right|_{-} \right] = \sum_{p=1}^{N_S} \left| C_p - C_{\exp} \right|_{+} P(C_p)    (8.16)

where C_exp is the expected value of C_p, |a|_- = max{0, −a}, and |a|_+ = max{0, a}. It shall be mentioned that the DSR is transformed into the upper semi-deviation of the generation cost, since generation cost is the opposite of economic benefit.


Fig. 8.1 Varying EGC, DSR and USP with the different penetrations of wind power

The upside potential (USP) aims to capture the potential gains under the uncertain environment, which is usually expressed in the form of the upper semi-deviation [25]. That is, the actual economic benefits higher than their expectation are used to describe the USP:

\mathrm{USP}(x, u, W) = E_{P_W, Q_W}\left[ \left| B(x, u, W) - B(x, u, W)_{\exp} \right|_{+} \right] = \sum_{p=1}^{N_S} \left| C_p - C_{\exp} \right|_{-} P(C_p)    (8.17)

Here, the USP is reformulated into the lower semi-deviation of the generation cost as well. Note that different penetration levels of wind power have different impacts on the EGC, DSR and USP. As shown in Fig. 8.1, the EGC decreases with increasing wind power integration, as wind power contributes more economic benefit. However, when the wind power penetration level is low, the actual generation cost deviates from the EGC only to a small extent, because low-penetration wind power cannot significantly affect power system operation; the DSR and USP are therefore also low in this situation. On the contrary, with highly penetrated wind power, the generation cost can deviate substantially from its expected value, which means the corresponding DSR and USP are relatively high. A large DSR indicates substantial potential economic losses, which should be reduced as much as possible. On the other hand, when the USP is high, some generation costs are much lower than the EGC, which means this kind of "risk" is worth pursuing as it contributes to more economic benefits.
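As a small numerical illustration of (8.15)–(8.17), the sketch below computes the EGC, DSR and USP from a set of generation cost samples and their probabilities; the sample values are hypothetical.

```python
import numpy as np

def egc_dsr_usp(costs, probs):
    """Expected generation cost, downside risk and upside potential, following (8.15)-(8.17)."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    egc = np.sum(costs * probs)                              # (8.15)
    dsr = np.sum(np.maximum(costs - egc, 0.0) * probs)       # (8.16): upper semi-deviation of cost
    usp = np.sum(np.maximum(egc - costs, 0.0) * probs)       # (8.17): lower semi-deviation of cost
    return egc, dsr, usp

# Hypothetical cost samples ($/h) under equally probable wind power scenarios
c_samples = [590.0, 605.0, 640.0, 560.0, 615.0]
p_samples = [0.2] * 5
print(egc_dsr_usp(c_samples, p_samples))
```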


Therefore, it is necessary to take both the DSR and the USP into account when solving the ED problem with highly penetrated wind power. In this way, the proposed MuOED model considers three objectives simultaneously, i.e., the expected economic benefit, the DSR and the USP, and is formulated as the following tri-objective optimization problem:

\left[ \; \min_{u} \; C(x, u, W)_{\exp}, \quad \min_{u} \; \mathrm{DSR}(x, u, W), \quad \max_{u} \; \mathrm{USP}(x, u, W) \; \right]    (8.18)

s.t. \; g(x, u, W) = 0    (8.19)

h(x, u, W) \le 0    (8.20)

Note that the equality constraint (8.19) differs from (8.11) because wind power is integrated into the power system, which changes the power flow distribution. The details of (8.19) are given below. It has been verified that the forecast error of wind speed approximately follows a Gaussian distribution over a short time horizon [29]; this distribution has therefore been widely used to represent the uncertain wind speed v in the ED problem. The actual wind speed is then

v = v_f + N(0, \sigma_v^2)    (8.21)

where v_f denotes the forecast wind speed, σ_v is the standard deviation of the forecast error, and N denotes the Gaussian distribution. Traditionally, the Monte Carlo sampling (MCS) method is adopted to generate a large number of forecast errors, and thus various wind speeds, to represent the uncertainty. However, large sample sets lead to a heavy computational burden for ED. Therefore, we adopt the point estimation method to obtain wind speed samples, which yields comparable results while requiring far fewer samples to represent the uncertainty [30]. Due to the limited space of this paper, the details of the point estimation method are not included here; readers can refer to [30]. The pth active wind power sample P_{W_p} with respect to the wind speed v_p is then determined by

P_{W_p} =
  0,                                                  if v_p < v_{ci} or v_p > v_{co}
  P_{ra} (v_p^3 - v_{ci}^3) / (v_{ra}^3 - v_{ci}^3),  if v_{ci} \le v_p < v_{ra}
  P_{ra},                                             if v_{ra} \le v_p \le v_{co}    (8.22)

where v_p, v_{ci}, v_{ra} and v_{co} stand for the pth, cut-in, rated and cut-out wind speed, respectively, and P_{ra} is the rated active power of a wind turbine. If a wind farm is composed of N_W wind turbines, the pth active wind power of the farm is P_{W_p} × N_W. In our work, we select wind turbines with a constant power factor cos φ;


therefore, the active and reactive power outputs of a wind farm are formulated as

P_p = P_{W_p} \times N_W    (8.23)

Q_p = \frac{P_p}{\cos\varphi} \sqrt{1 - \cos^2\varphi}    (8.24)
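A minimal sketch of (8.21)–(8.24) is given below: it samples a wind speed around its forecast, maps it through the piecewise power curve, scales the result by the number of turbines and derives the reactive output from the constant power factor. The default parameter values mirror the settings reported in Sect. 8.4 but are otherwise illustrative.

```python
import numpy as np

def wind_farm_output(v_forecast, sigma_v, n_turbines=14, p_rated=2.5,
                     v_ci=4.0, v_ra=12.5, v_co=20.0, cos_phi=0.95, rng=None):
    """Sample one wind speed (8.21), apply the power curve (8.22) and scale to farm P, Q (8.23)-(8.24)."""
    rng = rng or np.random.default_rng()
    v = v_forecast + rng.normal(0.0, sigma_v)                 # (8.21)
    if v < v_ci or v > v_co:                                  # (8.22): below cut-in or above cut-out
        p_turbine = 0.0
    elif v < v_ra:
        p_turbine = p_rated * (v**3 - v_ci**3) / (v_ra**3 - v_ci**3)
    else:
        p_turbine = p_rated
    p_farm = p_turbine * n_turbines                           # (8.23)
    q_farm = p_farm / cos_phi * np.sqrt(1.0 - cos_phi**2)     # (8.24)
    return p_farm, q_farm

# Forecast speed 9.0 m/s with a 20% standard deviation, as in the case study settings
print(wind_farm_output(9.0, 0.2 * 9.0, rng=np.random.default_rng(0)))
```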

The power flow equations are then affected by the injected wind power sample P_p at the bus (e.g., bus i) where a wind farm is located. The updated equations are

P_{G_i} = P_{D_i} - P_p + V_i \sum_{j \in N_i} V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right)    (8.25)

Q_{G_i} = Q_{D_i} - Q_p + V_i \sum_{j \in N_i} V_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right)    (8.26)

8.3 Extreme Learning Machine Assisted Group Search Optimizer with Multiple Producers

As discussed in the above section, we aim to obtain the minimum EGC, the minimum DSR and the maximum USP in the proposed MuOED model. Evidently, such a single solution may not exist, as the three objectives conflict with each other. Therefore, the Pareto front that manifests the trade-off among these objectives is to be obtained for decision-making.

8.3.1 Group Search Optimizer with Multiple Producers

Indeed, the weighting method can be introduced to convert the MuOED model into a single-objective optimization problem [31]. However, it is difficult to set reasonable values for the weights, which are usually prespecified to weigh the different objectives. Furthermore, the Pareto front that manifests the trade-off among multiple objectives must then be obtained by sweeping over various combinations of weights, which is very time-consuming. Above all, the weakness of the weighting method is that it cannot properly solve non-convex multi-objective problems, as not all of the Pareto optimal solutions can be found [32]. In our work, the proposed MuOED model is non-convex due to (8.19); consequently, we use a multi-objective optimization algorithm to deal with it.

In recent years, various evolutionary algorithms have been developed to address non-convex optimization problems [33].


Fig. 8.2 Illustration of GSOMP a The searching mechanism of GSO; b non-dominated sorting technique; c the obtained Pareto front

Examples include particle swarm optimization (PSO) [34] and differential evolution (DE) [35], as well as improved versions of such algorithms [35, 36]. They have also been utilized effectively in solving multi-objective problems [37, 38]. Among them, the group search optimizer with multiple producers (GSOMP) [39], which incorporates the single-objective group search optimizer (GSO) [40] and the non-dominated sorting technique [41], has shown excellent performance in multi-objective optimization. The outstanding performance of GSOMP is mainly attributed to its searching mechanism based on multiple producers [39]. The producer is one of the three kinds of members in GSO and acts as the leader in seeking the global optimum. The other two kinds of members are scroungers and rangers: scroungers follow the producer during the search and try to refine solutions near it, while rangers perform random walks in the search area. The searching process of GSOMP, together with the non-dominated sorting technique, is shown in Fig. 8.2. The searching mechanism of GSO is presented in Fig. 8.2a: each producer attempts to find the best value of its corresponding objective function and is followed by scroungers, while the rangers randomly roam and search in the variable space.


When dealing with the MuOED model via GSOMP, multiple producers are initially selected, one for each objective in (8.18). The scroungers then follow these producers, and the rangers disperse randomly; in this way, the members of GSOMP are updated. Next, the non-dominated sorting technique is adopted to obtain the Pareto (non-dominated) solutions, as shown in Fig. 8.2b. After several iterations, the Pareto front corresponding to the Pareto optimal solutions can be obtained, as presented in Fig. 8.2c.
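The non-dominated sorting step can be illustrated with a simple Pareto filter over the members' objective vectors, with all objectives written as minimization (USP negated). This is a generic sketch rather than the exact routine of [41], and the function and variable names are hypothetical.

```python
import numpy as np

def non_dominated(objectives):
    """Return a boolean mask of the non-dominated rows of an (n_members, n_objectives) array.

    All objectives are assumed to be minimized; a row is dominated if another row
    is no worse in every objective and strictly better in at least one.
    """
    f = np.asarray(objectives, dtype=float)
    n = f.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(f[j] <= f[i]) and np.any(f[j] < f[i]):
                mask[i] = False
                break
    return mask

# Members evaluated on [EGC, DSR, -USP]; the second member is dominated by the first
members = np.array([[596.9, 4.89, -2.19],
                    [640.0, 5.10, -2.00],
                    [503.7, 15.92, -10.81]])
print(non_dominated(members))   # -> [ True False  True]
```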

8.3.2 ELM Assisted GSOMP

However, it shall be mentioned that although GSOMP has demonstrated excellent global searching performance, its local searching ability is modest [39]. One promising direction for improving the local searching efficiency is to preupdate the poorly performing members, i.e., the dominated members, during the searching process. Therefore, we preprocess the dominated members using a local random walk strategy [42]. The dominated members can then update their positions and may obtain better fitnesses, i.e., objective values of (8.18). This can enhance the local searching ability and accelerate the convergence of GSOMP.

It should be mentioned that reinforcement learning (RL) is widely applied in power system operations. For instance, [43] proposes an RL-based building energy optimization of a central HVAC system, and [44, 45] adopt RL to tackle power grid security problems such as unexpected large power surges and cyber-attacks. In our work, we aim to improve the performance of GSOMP with a preupdate strategy based on a local random walk. Nevertheless, before the dominated members are preupdated, the whole group of members usually has to be assessed by calculating their objective values; in our work this requires power flow computations, which leads to a high time complexity. RL, in turn, usually requires a long training time to learn a prior strategy and might further burden the computation time of GSOMP. Therefore, we introduce a neural network with high learning efficiency, i.e., the extreme learning machine (ELM) [46], to approximately assess the group members.

As a matter of fact, the ELM is a single-hidden-layer feedforward neural network (SLFN) with excellent learning ability. The main advantages of the ELM are threefold.

(1) Little human intervention. The input weights and biases associated with the hidden layer of the ELM are randomly set without additional manual adjustment. Only a few parameters need to be determined manually, such as the number of hidden-layer nodes [46].
(2) Fast learning speed. The training time of an SLFN is much shorter than that of deep multi-hidden-layer neural networks. Furthermore, the ELM does not iteratively update parameters such as the input weights and biases of the hidden layer, which further reduces the computational time [47].


(3) High learning accuracy. Different from gradient-descent-based methods, which are highly sensitive to the learning rate, the ELM adopts least squares to train the output weights. The whole learning process is completed through a single mathematical transformation without iterations. It has been verified that the ELM has a good approximation capability and superior generalization ability [47].

Therefore, we take advantage of the ELM and propose the ELM assisted GSOMP (ELM-GSOMP) to deal with the MuOED model in this paper. The ELM topology and the ELM-based preupdate mechanism of ELM-GSOMP are shown in Fig. 8.3a, b, respectively. Figure 8.3a shows the training and approximating loops. In the training loop, the historical group members and their corresponding fitnesses are used to train the ELM. Then, at the beginning of the nth iteration, the ELM is used to approximate the fitnesses of the current group members in the approximating loop. Afterward, the non-dominated sorting method is used to determine the non-dominated and dominated solutions, and the local random walk is applied to the dominated solutions to preupdate the group members, thereby enhancing the local searching of GSOMP.

Specifically, in Fig. 8.3a, for a training sample (u_m, t_m), u_m is the I-dimensional group member representing a dispatch solution in our work, and t_m is the O-dimensional vector of objective values of the mth sample. The ELM with L hidden nodes and activation function g(·) is modeled as

t_m = \sum_{l=1}^{L} \beta_l \, g(h_l(u_m)) = \sum_{l=1}^{L} \beta_l \, g(w_l u_m + b_l), \quad m = 1, \ldots, M    (8.27)

where w_l and b_l represent the input weight and bias of the lth hidden node, respectively. The connection between the input nodes and the lth hidden node is

h_l(u_m) = w_l u_m + b_l    (8.28)

In the ELM, the input weights and hidden-layer biases are randomly assigned, and the hidden-layer output matrix H is obtained through the activation function g(·). The output T of the ELM, i.e., the approximated fitness values for the input members, can then be expressed as

T = H\beta    (8.29)

where β_l stands for the weight connecting the lth hidden node and the output nodes. Applying the least-squares approach, the training of the ELM is converted into

\min_{\beta} \; \| T - \hat{T} \|    (8.30)

where T̂ is the vector of actual fitness values of the input members. Then, the optimal solution of (8.30) is derived as follows.


Fig. 8.3 a ELM neural network topology; b ELM-based preupdate process

\hat{\beta} = H^{\dagger} \hat{T}    (8.31)

where H† represents the Moore–Penrose generalized inverse of the matrix H. Therefore, the ELM network parameters are directly calculated through matrix computation, and the fitnesses of the group members can then be approximated by the trained ELM network. The advantage of ELM-GSOMP will be further verified via the simulation studies in Sect. 8.4. The computing procedure for solving the MuOED model is presented in Fig. 8.4 and summarized in the following steps.
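A minimal ELM regression sketch following (8.27)–(8.31) is given below: the hidden-layer weights and biases are drawn at random, and the output weights are obtained in one shot with the Moore–Penrose pseudoinverse. The sigmoid activation, the shapes and the usage example are illustrative assumptions.

```python
import numpy as np

class ELM:
    """Single-hidden-layer ELM trained by least squares, as in (8.27)-(8.31)."""

    def __init__(self, n_inputs, n_hidden=100, n_outputs=3, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(-1.0, 1.0, size=(n_hidden, n_inputs))   # random input weights
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)               # random hidden biases
        self.beta = np.zeros((n_hidden, n_outputs))                  # output weights (trained)

    def _hidden(self, U):
        # h_l(u_m) = w_l u_m + b_l, eq. (8.28), passed through a sigmoid activation g(.)
        return 1.0 / (1.0 + np.exp(-(U @ self.w.T + self.b)))

    def fit(self, U, T_hat):
        # beta_hat = H^+ T_hat, eq. (8.31): one matrix computation, no iterative training
        H = self._hidden(U)
        self.beta = np.linalg.pinv(H) @ T_hat
        return self

    def predict(self, U):
        # T = H beta, eq. (8.29): approximated fitness values for the input members
        return self._hidden(U) @ self.beta

# Hypothetical usage: 40 dispatch solutions with 24 decision variables and 3 objectives
rng = np.random.default_rng(1)
U = rng.random((40, 24))
T_hat = rng.random((40, 3))
elm = ELM(n_inputs=24).fit(U, T_hat)
print(elm.predict(U[:2]).shape)   # -> (2, 3)
```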


Fig. 8.4 Flowchart of the algorithm of ELM-GSOMP

Step-1: Initialization
Input the power system data, such as the data of traditional generators, load demand, wind turbines, the forecast wind speeds and their errors, and the number of hidden nodes of the ELM. Generate K wind power samples by the point estimation method. Then, randomly initialize the ELM-GSOMP members u_s (s = 1, 2, ..., S), which represent dispatch solutions. The number of initialized members and the size of the external archive are both S.

Step-2: Preupdate ELM-GSOMP members
In the first iteration, the group members are not preupdated because of the shortage of training data. Afterward, before each iteration, the dominated members are preupdated and classified by non-dominated sorting after the ELM network approximation. The network parameters are trained with the group members and fitnesses from previous iterations of ELM-GSOMP.


Step-3: Update Pareto optimal solutions
Evaluate each solution u_s by (8.18), i.e., calculate the generation cost samples with respect to the individual wind power scenarios. In this way, the EGC, USP and DSR are obtained and saved into the external archive together with u_s. Compare the objective values of the solutions u_s (s = 1, 2, ..., S) and determine the Pareto optimal solutions by non-dominated sorting. In addition, new members will be generated at the next iteration to update the obtained solutions.

Step-4: Update ELM-GSOMP members
At the next iteration, the group members are updated. That is, the producers update their positions by performing the producing behavior; the scroungers and rangers are then chosen and updated according to their own behaviors of following and ranging.

Step-5: Stopping criterion
When the number of iterations n reaches N, the procedure terminates and the final Pareto optimal solutions are obtained. Otherwise, go back to Step-2 and continue the iterative procedure.

It shall be mentioned that the obtained Pareto optimal solutions reflect the trade-off among the EGC, DSR and USP. The final dispatch solution can be determined by a fuzzy decision-making method, which balances the interests of gains and losses in the uncertain wind power environment. The details of this method are as follows. A fuzzy membership function is defined as

\mu_i = \begin{cases} 0, & f_i \ge f_i^{\max} \\ \dfrac{f_i^{\max} - f_i}{f_i^{\max} - f_i^{\min}}, & f_i^{\min} < f_i < f_i^{\max} \\ 1, & f_i \le f_i^{\min} \end{cases}    (8.32)

where μ_i is the membership value corresponding to the ith objective function f_i, and f_i^max and f_i^min are the maximum and minimum values over the obtained Pareto optimal solutions, respectively. In this way, for each alternative k, the normalized membership value is derived as

\mu[k] = \dfrac{\sum_{i=1}^{n} \omega_i \mu_i[k]}{\sum_{k=1}^{m} \sum_{i=1}^{n} \omega_i \mu_i[k]}    (8.33)

where m and n represent the numbers of Pareto solutions and objectives, respectively, and ω_i stands for the weight of the ith objective. The solution with the largest value of μ[k] is then selected as the final dispatch solution.
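The fuzzy selection in (8.32)–(8.33) can be sketched as follows. The Pareto objective values and weights are hypothetical, and the USP is negated so that all three objectives are treated as minimization, which is an assumption about how the membership function is applied to the maximized objective.

```python
import numpy as np

def fuzzy_select(F, weights):
    """Pick the compromise Pareto solution via (8.32)-(8.33).

    F: (m_solutions, n_objectives) array, all objectives to be minimized
       (e.g. [EGC, DSR, -USP]).  weights: length-n array of objective weights.
    """
    F = np.asarray(F, dtype=float)
    w = np.asarray(weights, dtype=float)
    f_min, f_max = F.min(axis=0), F.max(axis=0)
    # eq. (8.32): linear membership, clipped to [0, 1]
    mu = np.clip((f_max - F) / np.where(f_max > f_min, f_max - f_min, 1.0), 0.0, 1.0)
    # eq. (8.33): weighted membership of each solution, normalized over all solutions
    scores = (mu * w).sum(axis=1)
    mu_k = scores / scores.sum()
    return int(np.argmax(mu_k)), mu_k

# Three hypothetical Pareto solutions and equal weights for EGC, DSR and -USP
F = [[503.7, 15.4, -10.5], [596.9, 4.9, -2.2], [648.0, 1.4, -3.3]]
best, memberships = fuzzy_select(F, weights=[1/3, 1/3, 1/3])
print(best, memberships.round(3))
```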


8.4 Simulation Studies

8.4.1 Simulation Settings

In order to verify the effectiveness of the proposed MuOED model and the ELM-GSOMP algorithm, we conduct simulation studies based on the Midwestern US power system [48]. Its topology is shown in Fig. 8.5, and 5 wind farms are located at buses 7, 11, 16, 24 and 30, respectively. Each wind farm comprises 14 wind turbines, and the rated active power of each turbine is set to 2.5 MW. Each wind turbine is a double-fed induction generator with a constant power factor of 0.95; its rated, cut-in and cut-out wind speeds are set to 12.5 m/s, 4 m/s and 20 m/s, respectively. In addition, the forecast wind speeds of the 5 wind farms are assumed to be 9.0 m/s, 7.6 m/s, 9.3 m/s, 8.7 m/s and 11.7 m/s, respectively. With this setup, the total forecast wind power is 84.34 MW, representing 30% of the total generation in this system. The forecast errors of the wind speeds are all set to 20% of the corresponding forecast wind speed.

In order to better show the performance of the proposed ELM-GSOMP, we compare it with several multi-objective algorithms, namely the multi-objective differential evolution (MODE) [38], the multi-objective hybrid-adaptive differential evolution (MOHyDE) and the multi-objective particle swarm optimization (MOPSO); the standard GSOMP is also compared. For all the algorithms, the population size and the maximum number of iterations are set to 40 and 300, respectively. Regarding MODE, the mutation amplification factor F and the crossover rate Cr are set to 0.8 and 0.1, the probability factors indicating the changing thresholds of F and Cr are both set to 0.1, and the lower and upper limits of F are F_l = 0.1 and F_u = 0.9. In addition, the personal and global learning coefficients are set to 0.1 and 0.2, and the size of the external repository is 300. For GSOMP and ELM-GSOMP, the number of rangers is 20% of all members, and the number of hidden nodes of the ELM in ELM-GSOMP is set to 100. All simulations are run in MATLAB R2016a on a 64-bit PC with an Intel® Core(TM) i7 CPU @ 2.20 GHz and 8 GB RAM.

As the proposed model considers the EGC, DSR and USP, we first compare the corresponding Pareto fronts obtained by the above algorithms to verify the superiority of the proposed ELM-GSOMP. Then, we compare the dispatch performance of different obtained solutions, in order to verify the necessity of considering both the USP and the DSR in ED with highly penetrated uncertain wind power.

8.4.2 Simulation Results

Based on the above simulation settings, we conduct 50 independent runs, and the results of the best run of each of the five algorithms are used for comparison.


Fig. 8.5 Topology of wind integrated Midwestern US power system

In order to compare the convergence speed and performance of these algorithms, the hypervolume (HV) [41] is adopted in this paper as the evaluation index: the larger the HV, the better the convergence of the algorithm. Since the USP is an objective to be maximized, it is handled as min_u{−USP} for simplicity and ease of comparison, and the reference point for the HV computation is set to (900, 20, 0). The HV values at each iteration are shown in Fig. 8.6, illustrating the convergence of MODE, MOHyDE, MOPSO, GSOMP and ELM-GSOMP. It is easily observed that ELM-GSOMP achieves the largest HV at the final iteration. Moreover, several significant jumps of the HV index occur during the search, e.g., around the 88th iteration, which also shows the advantage of incorporating the ELM: it helps the members "jump out" of local optima. Furthermore, several evaluation indicators, namely the spacing index (SI), the mean distance (MD), the number of Pareto solutions (NPS) and the computational time (CT), are also adopted to make quantitative comparisons among the five algorithms [49].


Fig. 8.6 Comparison of hypervolume (HV) with different optimization algorithms

Table 8.1 Comparisons of SI, MD, NPS and CT among Pareto fronts obtained by five algorithms

Algorithm     SI      MD       NPS    CT (s)
MODE          87.40   240.42   40     610.21
MOHyDE        93.01   245.35   40     597.15
MOPSO         84.42   278.66   300    2708.21
GSOMP         51.82   324.93   939    675.32
ELM-GSOMP     49.55   337.08   1119   693.85

Bold metric indicates that our proposed method achieves the strongest performance among all the algorithms

The SI calculates the average distance among adjacent points in the obtained Pareto front, which indicates how evenly the points are distributed. MD is the mean distance between all Pareto solutions and the reference point (900, 20, 0). The NPS is the number of Pareto solutions found by each multi-objective optimization algorithm. Table 8.1 lists the values of these indexes, and the proposed method outperforms the other algorithms in terms of SI, MD and NPS. To be specific, the SI value of ELM-GSOMP is 49.55, the smallest among all algorithms, which means that ELM-GSOMP finds more evenly distributed Pareto solutions. In terms of MD, the Pareto front found by ELM-GSOMP is the most convergent, as the mean distance between its points and the reference point is the largest, i.e., 337.08. In addition, ELM-GSOMP obtains an NPS of 1119, more than the 939 solutions obtained by GSOMP. It shall be mentioned that the NPS of MODE and MOHyDE is determined by the population size of 40 [38], while that of MOPSO is 300, limited by the size of the external repository [37]. As for the CT, all algorithms except MOPSO complete the whole iterative process within 12 min, which is acceptable for intra-day dispatch with a time window of 1 h in our paper. Note that the CTs of GSOMP and ELM-GSOMP show no significant difference; in fact, the total training and approximation time of the ELM during the search is less than 20 s.
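For reproducibility, one common way of computing spacing-type and distance-type indicators is sketched below; the exact definitions of SI and MD used in [49] and Table 8.1 may differ in detail, so this is an illustrative assumption.

```python
import numpy as np

def spacing_index(front):
    """Average nearest-neighbour distance between Pareto points (evenness of the front)."""
    f = np.asarray(front, dtype=float)
    d = np.linalg.norm(f[:, None, :] - f[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return float(d.min(axis=1).mean())

def mean_distance(front, reference=(900.0, 20.0, 0.0)):
    """Mean Euclidean distance between the Pareto points and a fixed reference point."""
    f = np.asarray(front, dtype=float)
    return float(np.linalg.norm(f - np.asarray(reference), axis=1).mean())

# Hypothetical three-point front in (EGC, DSR, -USP) space
front = [[503.7, 15.4, -10.5], [596.9, 4.9, -2.2], [648.0, 1.4, -3.3]]
print(spacing_index(front), mean_distance(front))
```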



Fig. 8.7 Pareto fronts of MuOED obtained by GSOMP and ELM-GSOMP

Fig. 8.8 Parallel axis plot of 3 normalized objective values

After that, the best Pareto front obtained by ELM-GSOMP over the 50 independent runs is presented in Fig. 8.7 and compared with that of GSOMP. However, it is hard to identify the Pareto solutions with a good trade-off among the three objectives from Fig. 8.7. We therefore use the parallel axis plot, a visualization tool that displays each objective on a separate axis whose scale is normalized to the range of the corresponding objective values [50]. The plot is shown in Fig. 8.8. The lines crossing each other clearly describe the trade-off relationships among the different Pareto solutions; in particular, the EGC has a trade-off with the DSR. Therefore, we shall choose a final dispatch solution from these Pareto solutions. Usually, the fuzzy decision-making method is used to make such a selection. However, the relationship among the objectives should be further analyzed, and thus relative weights need to be set for the EGC, DSR and USP; evidently, the larger the value of a weight, the more important the corresponding objective. In this paper, to investigate the effectiveness of the MuOED considering DSR and USP, we set different weight combinations [ω1, ω2, ω3] = (0.90, 0.05, 0.05), (0.70, 0.15, 0.15), (0.50, 0.25, 0.25), (0.30, 0.35, 0.35), (0.10, 0.45, 0.45), where ω1, ω2 and ω3 are the weights for the EGC, DSR and USP, respectively.


Table 8.2 Values of EGC, DSR and USP for alternative dispatch solutions

Solution   EGC ($/h)   DSR ($/h)   USP ($/h)
a1         503.71      15.43       10.48
a2         508.38      14.69       9.89
a3         594.62      4.94        2.29
a4         596.90      4.89        2.19
a5         648.01      1.36        3.34
a6         596.90      4.89        2.19

Note: Detailed values of the decision variables for a1, a2, ..., a6 are listed in Table 8.4

Table 8.3 Values of EGC, DSR and USP with respect to solutions a6–a8

Solution   EGC ($/h)   DSR ($/h)   USP ($/h)
a6         596.90      4.89        2.19
a7         503.70      15.92       10.81
a8         639.11      1.84        4.94

Note: Detailed values of the decision variables for a6, a7 and a8 are listed in Table 8.4

On this basis, five final solutions are obtained and denoted as a1, a2, ..., a5, and their corresponding values of EGC, DSR and USP are shown in Table 8.2. Moreover, another solution a6 is derived with [ω1, ω2, ω3] = (0.33, 0.33, 0.33) for a balanced consideration of EGC, DSR and USP; its values are also presented in this table. It can be observed that the EGC increases as its weight becomes smaller. For a1, the EGC is as low as 503.71 $/h and the USP is as high as 10.48 $/h, which is beneficial for economic power system operation. However, its DSR of 15.43 $/h is the highest among a1, ..., a6, meaning that a1 carries the largest potential losses. On the contrary, a5 has the highest EGC, i.e., 648.01 $/h; this solution may not be good for economic operation, although its DSR is as low as 1.36 $/h. Therefore, it is necessary to consider the EGC, DSR and USP in a balanced way, and we obtain a6, which corresponds to equal weights on the three objectives.

To further verify the effectiveness of a6, we calculate the generation costs with respect to 200 wind power samples. Note that these samples are obtained from the forecast wind speeds and errors using a sampling method based on Latin hypercube sampling and Cholesky decomposition [27]. Figure 8.9 shows the generation costs of a6, compared with those of a1 and a5. We can see that the varying range of the samples of a1 is the largest, although their average value is the lowest; the values of some samples are even higher than those of a6 and a5, because the DSR and USP of a1 are the largest among the three solutions. We also find that the generation costs of a5 vary within a very small range, which shows that this solution is more robust to the uncertain wind power environment; however, a5 yields the highest generation costs, indicating that it is not good for economic operation. On the contrary, a6 balances the EGC and the risks well: the varying range of its samples is small, while the values of the generation costs are not high.


Fig. 8.9 Generation cost samples for solutions a1, a5 and a6

In addition, to further verify the necessity of including the DSR and USP, we obtain two more solutions a7 and a8, which correspond to the weights [0.999, 0.0005, 0.0005] and [0.4975, 0.4975, 0.005]. It is easily seen that a7 puts almost all its emphasis on minimizing the EGC while nearly neglecting the effects of DSR and USP, as their weights are merely 0.0005. Similarly, a8 attempts to optimize the EGC and DSR, but the USP is almost not taken into account. The objective values of a7 and a8, together with those of a6, are listed in Table 8.3. It is seen that a7 has the lowest EGC, i.e., 503.70 $/h. However, as a8 takes the DSR into account, its DSR is 1.84 $/h, much smaller than that of a7. To demonstrate the necessity of considering the DSR, Fig. 8.10 shows the 200 generation cost samples for a7 and a8. Although the expectation of the samples of a7 is smaller, the values of some samples are very high, which indicates larger potential economic losses; to avoid such losses, it is beneficial to consider the DSR. Similarly, to demonstrate the necessity of considering the USP, Fig. 8.11 shows the 200 generation cost samples for a8 and a6, since a6 considers the EGC, DSR and USP while a8 almost neglects the USP. It is easily observed that the generation cost samples of a6 are lower than those of a8, as the USP is pursued by a6. Therefore, we can conclude that it is valuable to consider the DSR and USP for power system economic dispatch in the uncertain wind power environment (Table 8.4).

Fig. 8.10 Generation cost samples for solutions a7 and a8

Fig. 8.11 Generation cost samples for solutions a6 and a8

Table 8.4 Final solutions obtained by fuzzy decision-making method

Variable      a1        a2        a3        a4        a5        a6        a7        a8        Min     Max
PG2 (MW)      35.4238   45.6567   55.9104   57.9811   67.5113   57.9811   29.0937   74.0119   20      80
PG3 (MW)      18.6140   20.7862   25.4473   23.4091   29.2994   23.4091   18.6634   24.1635   15      50
PG4 (MW)      11.6694   13.3232   29.3969   26.9856   23.8330   26.9856   10.0150   29.8145   10      35
PG5 (MW)      10.4157   11.2826   25.0199   18.6624   24.0487   18.6624   10.0808   17.8835   10      30
PG6 (MW)      13.7138   13.3456   16.5874   25.1457   20.9738   25.1457   12.3907   16.9152   12      40
V1 (p.u.)     1.0332    1.0490    1.0398    0.9973    1.0277    0.9973    1.0068    1.0406    0.95    1.05
V2 (p.u.)     1.0499    1.0451    1.0323    1.0226    1.0214    1.0226    1.0237    1.0430    0.95    1.10
V3 (p.u.)     1.0422    1.0296    1.0187    0.9969    1.0415    0.9969    0.9960    1.0359    0.95    1.10
V4 (p.u.)     1.0225    1.0185    1.0375    1.0191    1.0064    1.0191    0.9784    1.0360    0.95    1.10
V5 (p.u.)     1.0771    1.0855    1.0921    0.9531    1.0485    0.9531    1.0399    1.0630    0.95    1.10
V6 (p.u.)     1.0562    1.0774    1.0531    1.0062    1.0634    1.0062    1.0389    1.0881    0.95    1.10
T1            1.1000    1.0500    0.9750    1.1125    0.9875    1.1125    1.0125    1.0375    0.90    1.10
T2            0.9250    0.9625    0.9875    0.9500    0.9750    0.9500    1.0500    1.0125    0.90    1.10
T3            1.1000    1.0000    1.0875    1.0625    0.9750    1.0625    1.1000    1.0000    0.90    1.10
T4            1.0250    1.0250    1.0000    1.0875    0.9750    1.0875    0.9750    0.9250    0.90    1.10
QC1 (MVAr)    0.04      0.04      0.02      0.03      0.02      0.03      0.03      0.01      0.00    0.05
QC2 (MVAr)    0.04      0.02      0.03      0.01      0.03      0.01      0.04      0.02      0.00    0.05
QC3 (MVAr)    0.02      0.02      0.03      0.04      0.03      0.04      0.04      0.00      0.00    0.05
QC4 (MVAr)    0.04      0.02      0.02      0.05      0.01      0.05      0.03      0.03      0.00    0.05
QC5 (MVAr)    0.04      0.01      0.03      0.02      0.04      0.02      0.04      0.03      0.00    0.05
QC6 (MVAr)    0.04      0.04      0.01      0.03      0.02      0.03      0.03      0.01      0.00    0.05
QC7 (MVAr)    0.03      0.00      0.00      0.04      0.02      0.04      0.04      0.02      0.00    0.05
QC8 (MVAr)    0.02      0.02      0.05      0.00      0.00      0.00      0.03      0.03      0.00    0.05
QC9 (MVAr)    0.04      0.01      0.04      0.05      0.04      0.05      0.03      0.02      0.00    0.05

8.5 Conclusion In this paper, we have proposed a multi-objective ED (MuOED) model with uncertain wind power. In this model, the expected generation cost, the upside potential, and the downside risk are taken into account at the same time. Then the MuOED model is formulated as a tri-objective optimization problem, and we use an extreme learning machine assisted group search optimizers with multiple producers (ELM-GSOMP) to solve the problem. Afterward, a fuzzy decision-making method is used for choosing the final dispatch solution. Simulation on a modified wind integrated Midwestern US power system results verify that ELM-GSOMP can find more convergent Pareto solutions with a faster speed, compared with traditional GSOMP. Moreover, we have also used sample analysis to verify that it is necessary to consider the downside risk and upside potential for economic dispatch in power system with uncertain wind power. It shall be mentioned that some works investigate the power system operation of real-time optimization, such as [51, 52]. However, the MuOED model proposed in this paper focused on the intra-day optimization which is addressed by the ELMGSOMP. However, due to the limitation of computational time, it is difficult to conduct MuOED for real-time application. Since the MuOED considering uncertain wind power is verified to be effective, it would be significant to adopt it in real-time application, if the real-time optimization algorithm could be proposed.

References 1. N. Liu, J. Wang, L. Wang, Distributed energy management for interconnected operation of combined heat and power-based microgrids with demand response. J. Mod. Power Syst. Clean Energy 5(3), 478–488 (2017) 2. Q. Liu, A Study on the Development Situation and Path of Chinese 2050 Highly Penetrated Renewable Energy (Energy Research Institute, Chinese Development and Reform Commission, 2015) 3. Y.Z. Li, Q.H. Wu, M.S. Li, J.P. Zhan, Mean-variance model for power system economic dispatch with wind power integrated. Energy 72, 510–520 (2014) 4. M. Negnevitsky, D.H. Nguyen, M. Piekutowski, Risk assessment for power system operation planning with high wind power penetration. IEEE Trans. Power Syst. 30(3), 1359–1368 (2015) 5. L. Ji, G.H. Huang, L.C. Huang, Y.L. Xie, D.X. Niu, Inexact stochastic risk-aversion optimal dayahead dispatch model for electricity system management with wind power under uncertainty. Energy 109, 920–932 (2016) 6. Y.X. Liu, Z.F. Qin, Mean semi-absolute deviation model for uncertain portfolio optimization problem. J. Uncertain Syst. 6(4), 299–307 (2012) 7. L.M. Du, Y.N. He, Extreme risk spillovers between crude oil and stock markets. Energy Econ. 51, 455–465 (2015) 8. C. Liu, A. Botterud, Z. Zhou, P.W. Du, Fuzzy energy and reserve co-optimization with high penetration of renewable energy. IEEE Trans. Sustain. Energy 8(2), 782–791 (2017) 9. X. Die, H. Dong, D. Wang. The multi-objective fuzzy optimization scheduling of the combined system of wind-photovoltaic-pumping considering electricity equity, in 2020 4th International Conference on HVDC (HVDC), 2020, pp. 382–387

8.3 Extreme Learning Machine Assisted Group Search Optimizer …

181

10. Y. Xu, M. Yin, Z.Y. Dong, R. Zhang, D.J. Hill, Y. Zhang, Robust dispatch of high wind powerpenetrated power systems against transient instability. IEEE Trans. Power Syst. 33(1), 174–186 (2018) 11. W. Hu, P. Wang, H.B. Gooi, Toward optimal energy management of microgrids via robust two-stage optimization. IEEE Trans. Smart Grid 9(2), 1161–1174 (2018) 12. J. Xu, B. Wang, Y. Sun, Q. Xu, J. Liu, H. Cao, H. Jiang, R. Lei, M. Shen, A day-ahead economic dispatch method considering extreme scenarios based on wind power uncertainty. CSEE J. Power Energy Syst. 5(2), 224–233 (2019) 13. V.C. Pandey, N. Gupta, K.R. Niazi, A. Swarnkar, A scenario-based stochastic dynamic economic load dispatch considering wind uncertainty, in 2019 8th International Conference on Power Systems (ICPS), 2019, pp. 1–5 14. U. Güvenç, E. Kaymaz, Economic dispatch integrated wind power using coyote optimization algorithm, in 2019 7th International Istanbul Smart Grids and Cities Congress and Fair (ICSG), 2019, pp. 179–183 15. Y. Wang, S. Lou, Y. Wu, S. Wang, Flexible operation of retrofitted coal-fired power plants to reduce wind curtailment considering thermal energy storage. IEEE Trans. Power Syst. 35(2), 1178–1187 (2020) 16. T. Ding, Q. Yang, X. Liu, C. Huang, Y. Yang, M. Wang, F. Blaabjerg, Duality-free decomposition based data-driven stochastic security-constrained unit commitment. IEEE Trans. Sustain. Energy 10(1), 82–93 (2019) 17. Z. Lu, M. Liu, W. Lu, Z. Deng, Stochastic optimization of economic dispatch with wind and photovoltaic energy using the nested sparse grid-based stochastic collocation method. IEEE Access 7, 91827–91837 (2019) 18. H. Wu, I. Krad, A. Florita, B. Hodge, E. Ibanez, J. Zhang, E. Ela, Stochastic multi-timescale power system operations with variable wind generation. IEEE Trans. Power Syst. 32(5), 3325– 3337 (2017) 19. H. Markowitz, Portfolio selection. J. Finance 7(1), 77–91 (1952) 20. M.S. Li, Q.H. Wu, T.Y. Ji, H. Rao, Stochastic multi-objective optimization for economicemission dispatch with uncertain wind power and distributed loads. Electr. Power Syst. Res. 116, 367–373 (2014) 21. C. James, P. Michael, Measuring risk for cost of capital: the downside beta approach. J. Corporate Treasury Manage. 4(4), 346–347 (2012) 22. A.D. Roy, Safety first and the holding of assets. Econometrica 20(3), 431–449 (1952) 23. Y.Z. Li, Q.H. Wu, Downside risk constrained probabilistic optimal power flow with wind power integrated. IEEE Trans. Power Syst. 31(2), 1649–1650 (2016) 24. Z.Q. Xie, T.Y. Ji, M.S. Li, Q.H. Wu, Quasi-Monte Carlo based probabilistic optimal power flow considering the correlation of wind speeds using copula function. IEEE Trans. Power Syst. 33(2), 2239–2247 (2018) 25. D. Cumova, D. Nawrocki, Portfolio optimization in an upside potential and downside risk framework. J. Econ. Bus. 71, 68–89 (2014) 26. Y.Z. Li, M.S. Li, Q.H. Wu, Energy saving dispatch with complex constraints: prohibited zones, valve point effect and carbon tax. Int. J. Electr. Power Energy Syst. 72, 510–520 (2014) 27. H. Yu, C.Y. Chung, K.P. Wong, H.W. Lee, J.H. Zhang, Probabilistic load flow evaluation with hybrid Latin hypercube sampling and Cholesky decomposition. IEEE Trans. Power Syst. 24(2), 661–667 (2009) 28. X. Zhang, G.Y. He, S.Y Lin, W.X. Yang, Economic dispatch considering volatile wind power generation with lower-semi-deviation risk measure, in 2011 4th International Conference on Electric Utility Deregulation and Restructuring and Power Technologies (DRPT) (IEEE, 2011), pp. 140–144 29. M. 
Lange, On the uncertainty of wind power predictions-analysis of the forecast accuracy and statistical distribution of errors. J. Solar Energy Eng. 127, 177–284 (2005) 30. C. Su, Probabilistic load-flow computation using point estimate method. IEEE Trans. Power Syst. 20(4), 1843–1851 (2005)

182

8 Extreme Learning Machine for Economic …

31. A.J. Conejo, F.J. Nogales, J.N. Jose, R.G. Bertrand, Risk-constrained self-scheduling of a thermal power producer. IEEE Trans. Power Syst. 3(3), 1569–1574 (2004) 32. K. Miettinen, Nonlinear Multiobjective Optimization (Springer Science & Business Media, 2012) 33. J. Soares, T. Pinto, F. Lezama, H. Morais, Survey on complex optimization and simulation for the new power systems paradigm. Complexity 2018, 1–32 (2018). (August) 34. M. Clerc, J. Kennedy, The particle swarm: Explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002). (Feb.) 35. F. Lezama, J. Soares, Z. Vale, J. Rueda, S. Rivera, I. Elrich, 2017 IEEE competition on modern heuristic optimizers for smart grid operation: Testbeds and results, in Swarm and Evolutionary Computation (2019) 36. F. Lezama, J. Soares, R. Faia, T. Pinto, Z. Vale, A new hybrid-adaptive differential evolution for a smart grid application under uncertainty, in 2018 IEEE Congress on Evolutionary Computation (CEC) (2018) 37. L. Wang, C. Singh, Balancing risk and cost in fuzzy economic dispatch including wind power penetration based on particle swarm optimization. Electr. Power Syst. Res. 78, 1361–1368 (2008) 38. M. Varadarajan, K.S. Swarup, Solving multi-objective optimal power flow using differential evolution. IET Gener. Transmission Distrib. 2(5), 720–730 (2008) 39. C.X. Guo, J.P. Zhan, Q.H. Wu, Dynamic economic emission dispatch based on group search optimizer with multiple producers. Electr. Power Syst. Res. 86, 8–16 (2012) 40. S. He, Q.H. Wu, J.R. Saunders, Group search optimizer: an optimization algorithm inspired by animal searching behavior. IEEE Trans. Evol. Comput. 13(5), 973–990 (2009) 41. H.L. Liao, Q.H. Wu, Multi-objective optimization by learning automata. J. Glob. Optim. 1–29 (2013) 42. Y.Z. Li, M.S. Li, Z. Ji, Q.H. Wu, Optimal power flow using group search optimizer with intraspecific competition and levy walk, in IEEE Symposium Series on Computational Intelligence (Singapore, 2013), pp. 1–7 43. J. Hao, D.W. Gao, J.J. Zhang, Reinforcement learning for building energy optimization through controlling of central HVAC system. IEEE Open Access J. Power Energy 7, 320–328 (2020) 44. C. Chen, M. Cui, F. Li, S. Yin, X. Wang, Model-free emergency frequency control based on reinforcement learning. IEEE Trans. Industr. Inform. 17(4), 2336–2346 (2021) 45. C. Chen, M. Cui, X. Fang, B. Ren, Y. Chen, Load altering attack-tolerant defense strategy for load frequency control system. Appl. Energy 280, 116015 (2020) 46. S.F. Ding, H. Zhao, Y. Zhang, X.Z. Xu, R. Nie, Extreme learning machine. Artif. Intell. Rev. (2015) 47. Y. Yang, Y. Wang, X. Yuan, Bidirectional extreme learning machine for regression problem and its learning effectiveness. IEEE Trans. Neural Netw. Learn. Syst. 23(9), 1498–1505 (2012) 48. H.L. Liao, Q.H. Wu, Y.Z. Li, L. Jiang, Economic emission dispatching with variations of wind power and loads using multi-objective optimization by learning automata. Energy Convers. Manage. 87, 990–999 (2014) 49. M.A.C. Silva, C.E. Klein, V.C. Mariani, L.S. Coelho, Multiobjective scatter search approach with new combination scheme applied to solve environmental/economic dispatch problem. Energy 53, 14–21 (2013) 50. H. Julian, W. Daniel, State of the art of parallel coordinates. STAR Proc. Eurographics 95–116, 2013 (2013) 51. H. Liu, K. Huang, Y. Yang, H. Wei, S. Ma, Real-time vehicle-to-grid control for frequency regulation with high frequency regulating signal. Prot. Control Mod. Power Syst. 
3(1) (2018) 52. S. Fan, G. He, X. Zhou, M. Cui, Online optimization for networked distributed energy resources with time-coupling constraints. IEEE Trans. Smart Grid 12(1), 251–267 (2021)

Chapter 9

Multi-objective Optimization Approach for Coordinated Scheduling of Electric Vehicles-Wind Integrated Power Systems

9.1 Introduction

It is well recognized that renewable energy and electric vehicles (EVs) are being widely deployed to make our society more environmentally friendly [1-4]. On the one hand, wind power represents one of the main forms of renewable energy, and the total installed wind capacity in China had reached 171.6 GW by 2018 [5]. Specifically, it has been reported that integrating wind power into power systems could reduce greenhouse gas emissions by 797.84 g/kWh [6, 7]. On the other hand, people are using more EVs, whose number is predicted to reach 100 million worldwide by 2030 [8-10]. Indeed, some studies have considered the impacts of EVs on the energy portfolio, and their benefits for energy conservation and emission reduction have been verified [11]. However, it should be mentioned that the uncertainties of renewable energy (RE) resources bring significant challenges to the operation of power systems and can lead to wind power curtailment [12]. In order to reduce curtailment, two approaches have been studied, i.e., supply-side dispatching and demand-side management [13]. The first uses the flexibility of thermal generators to adjust their power outputs for enhancing wind power utilization, while the second relies on the flexibility of demand-side resources. Aggregating EVs as a storage asset of the power system is an efficient way to deal with wind curtailment [14]. However, under coordinated scheduling of thermal generators and EVs, the impacts of wind power curtailment reduction on other operational objectives of power systems, such as energy conservation and emission reduction, have not been comprehensively investigated. Evidently, solving this issue would facilitate the decision-making of system operators for wind-integrated power systems.


Dispatching thermal generators on the supply side is a common approach to enhance wind power utilization. For instance, a robust thermal generator dispatching framework is proposed for maximizing wind power integration in [15]. Besides, [16] presents a thermal generator scheduling model considering uncertainties to minimize wind power curtailment and uses a chance-constrained two-stage stochastic approach to determine the day-ahead schedule. Furthermore, some studies consider the risks of system operation. For instance, [17] proposes a stochastic, risk-aware thermal generator scheduling scheme in order to improve wind power utilization. Similarly, [18] introduces a risk-constrained two-stage stochastic programming model to make optimal decisions for thermal generators, enhancing wind penetration in a wind-thermal power system. In addition, an uncertain wind power integrated dynamic thermal rating system is proposed in [19] to ensure reliability and improve wind power utilization. It is noted that applying demand-side management could also improve wind power utilization [20]. In this aspect, [21] proposes a source-grid-load coordinated planning model, which utilizes demand response to reduce wind power curtailments. Furthermore, [22] introduces a combined heat and power operational model via price-response-based demand-side management, and the results verify its effectiveness in reducing wind power curtailments. In addition, [23] proposes a stochastic programming model for the optimal scheduling of multiple energy resources and loads, which achieves better wind power integration than the case without demand-side management. Besides, [24] formulates an operation model for an integrated energy system to improve wind power utilization by properly managing thermal loads in buildings. In addition, considering energy-intensive loads as possible demand resources, a direct control scheme to manage the power consumption of energy-intensive loads is proposed in [25], in order to smooth wind power fluctuations and enhance wind integration. Specifically, [26] conducts an overview of the definition and classification of different types of demand resources, especially EVs, and emphasizes their valuable services. Significantly, EVs are potential demand resources for enhancing wind power utilization, due to their high penetration level and large discharging capacity [27]. For instance, [27] studies the contribution of EVs to making use of wind power in an independent building power system by scheduling charging and discharging loads. Similarly, the benefit of EVs in enhancing wind power utilization is also investigated using a two-layer optimization approach considering wind power uncertainties [28]. Besides, [29] focuses on the coordinated scheduling of EVs and the power system to reduce wind power curtailment and mitigate the impact of wind variation on the power grid via a robust approach. Proposing a comprehensive mathematical model, [30] considers EVs as energy storage devices and analyzes their coordinated operation with wind power to enhance its utilization. Considering the challenges of electric power-traffic networks, a coordinated solution for the security-constrained thermal generators-traffic assignment problem is proposed in [31], in order to manage the schedules of EVs and thermal generators for enhancing wind power utilization.
Furthermore, EVs are integrated to serve as battery storages in the vehicle-to-grid mode in [32], which effectively contributes to the integration of wind power.


Existing studies clearly show the importance and benefits of minimizing wind power curtailment. However, how about the comprehensive relationship among multi-objectives, i.e., energy conservation, emission reduction and wind curtailments? What are the quantitative impacts of introducing EVs for enhancing wind penetration? Evidently, addressing above questions would be helpful for power system operators to make proper scheduling decisions. To this end, (i) we think investigating the comprehensive relationship among wind power curtailment, energy conservation and emission reduction is necessary; (ii) it is essential to quantify impacts of EVs and wind power on multi-objectives. These are the major motivations and objective of our work. In order to solve this issue, (i) we propose a coordinated scheduling model for EV-wind integrated power systems, in which wind power curtailments are calculated based on probabilistic information, and reveal the relationship among multiple objective; (ii) we design an improved multi-objective optimization algorithm based on differential evolution to deal with this complex coordinated scheduling model. These two points are considered as main contributions of our work. The rest of the chapter is organized as follows. In Sect. 9.2, we first discuss the model of EVs and wind power and then establish a probability-based mathematical model to quantify the wind power curtailment. Section 9.3 formulates the coordinated scheduling model via the multi-objective optimization approach, in which the objective functions, constraints and decision variables are introduced. Section 9.4 includes the solution algorithm and corresponding computational procedure, respectively. Finally, Sect. 9.5 analyzes scheduling results of the proposed model, and the conclusion is drawn in Sect. 9.6.

9.2 Operation Models of EV and Wind Power

9.2.1 Operational Model of EV Charging Station

The charging load of the EV charging station (EVCS) is scheduled to enhance wind power utilization in this chapter. The electric energy can flow bidirectionally between the EVCS and each EV, as well as between the power grid and the EVCS, as shown in Fig. 9.1. Besides, the power grid arranges the day-ahead, hourly schedule of charging loads for the EVCS. In the $k$th ($k = 1, 2, \ldots, 24$) period, the total charging load of the EVCS $P_{EVC}^k$ is limited by the number of EVs $N_{EV}^k$, and by the charging power $P_{c,i}$ and discharging power $P_{d,i}$ of the $i$th EV battery:

$$\sum_{i=1}^{N_{EV}^k} P_{d,i} \le P_{EVC}^k \le \sum_{i=1}^{N_{EV}^k} P_{c,i} \qquad (9.1)$$


Fig. 9.1 EV charging system, reprinted from [4], copyright 2019, with permission from IEEE

Additionally, $N_{EV}^k$ can be calculated from the number of EVs that arrived (i.e., $N_{EV,arr}^i$) in the past $k$ periods and departed (i.e., $N_{EV,dep}^i$) in the past $k-1$ periods [33]:

$$N_{EV}^k = \sum_{i=1}^{k} N_{EV,arr}^i - \sum_{i=1}^{k-1} N_{EV,dep}^i \qquad (9.2)$$

Besides, the power demand of the EVCS $P_{req}^k$ is determined by the number and behavior of the EVs connected to the station. Hence, the charging power requirement can be expressed as follows:

$$P_{req}^k = \sum_{i=1}^{N_{EV}^k} \frac{D_i \cdot P_{huk,i}}{\eta_c} \qquad (9.3)$$

where $D_i$ is the travel distance of the $i$th EV, $P_{huk,i}$ stands for the energy consumption per 100 km, and $\eta_c$ represents the charging efficiency, respectively. Therefore, the available energy of the EVCS that can be injected into the grid in the $k$th period can be measured as

$$S_{EV}^k = S_{EV}^{k-1} + P_{EVC}^k + \sum_{i=1}^{N_{EV}^k} \frac{C_i - D_i \cdot P_{huk,i}}{\eta_c} \qquad (9.4)$$

where $S_{EV}^k$ and $C_i$ stand for the available energy and the $i$th EV battery capacity, respectively. However, the schedule of charging power in the EVCS needs to ensure the energy demands of EV users. That is, the available energy of the EVCS in the $k$th time period should be more than that of the departing EVs, while also being limited by the energy of the schedulable EVs, as follows:

$$\sum_{i=1}^{N_{EV,dep}^k} \frac{C_i}{\eta_c} \le S_{EV}^k \le \sum_{i=1}^{N_{EV}^k} \frac{C_i}{\eta_c} \qquad (9.5)$$

Besides, the total charging load of the EVCS should be equal to the EV user demands over the entire day, i.e.,

$$\sum_{k=1}^{24} P_{EVC}^k = \sum_{k=1}^{24} P_{req}^k \qquad (9.6)$$
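To make the bookkeeping in (9.1)-(9.6) concrete, the following minimal Python sketch builds the hourly EV count, the station demand and the available energy for a toy 24-hour horizon. All numbers (arrivals, departures, per-EV distance, battery and charger data) are invented placeholders rather than values from this chapter, and a uniform fleet is assumed.

```python
import numpy as np

# Hypothetical hourly arrivals/departures for a 24-h horizon (illustration only)
rng = np.random.default_rng(0)
n_arr = rng.integers(0, 20, size=24)
n_dep = rng.integers(0, 15, size=24)

# Eq. (9.2): EVs connected in period k = arrivals up to k minus departures up to k-1
n_ev = np.cumsum(n_arr) - np.concatenate(([0], np.cumsum(n_dep)[:-1]))
n_ev = np.maximum(n_ev, 0)

D, P_100, eta_c, C_bat = 40.0, 15.0, 0.9, 30.0   # km, kWh/100 km, efficiency, kWh (assumed uniform fleet)
P_c, P_d = 7.0, 7.0                              # per-EV charging/discharging power limits, kW (assumed)

# Eq. (9.3): charging power requirement of the station in each period
P_req = n_ev * (D / 100.0) * P_100 / eta_c

# Eq. (9.4): available EVCS energy under a candidate schedule P_EVC
P_EVC = P_req.copy()                             # trivial schedule: charge exactly what is required
S_EV = np.zeros(24)
for k in range(24):
    prev = S_EV[k - 1] if k > 0 else 0.0
    S_EV[k] = prev + P_EVC[k] + n_ev[k] * (C_bat - (D / 100.0) * P_100) / eta_c

# Eqs. (9.1), (9.5), (9.6): feasibility of the schedule (discharging treated as a negative bound)
ok_power  = np.all((-n_ev * P_d <= P_EVC) & (P_EVC <= n_ev * P_c))
ok_energy = np.all((n_dep * C_bat / eta_c <= S_EV) & (S_EV <= n_ev * C_bat / eta_c))
ok_demand = np.isclose(P_EVC.sum(), P_req.sum())
print(ok_power, ok_energy, ok_demand)
```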

9.2.2 Model of Uncertain Wind Power

It is noted that wind speed is difficult to predict exactly, and such forecast inaccuracy introduces uncertainty into the available wind power. The mathematical relationship between wind speed and wind power is expressed as follows:

$$P_{WP}^k = \begin{cases} 0, & 0 \le v < v_{ci} \\ a + b v^3, & v_{ci} \le v < v_{ra} \\ P_{ra}, & v_{ra} \le v < v_{co} \\ 0, & v > v_{co} \end{cases} \qquad (9.7)$$

In (9.7), $P_{WP}^k$ is the available wind power; $v$, $v_{ci}$, $v_{ra}$ and $v_{co}$ represent the actual wind speed, cut-in wind speed, rated wind speed, and cut-out wind speed, respectively. Besides,

$$a = \frac{P_{ra}\, v_{ci}^3}{v_{ci}^3 - v_{ra}^3}, \qquad b = \frac{P_{ra}}{v_{ra}^3 - v_{ci}^3} \qquad (9.8)$$

In addition, the actual wind power integrated into the power grid can be scheduled, and the schedulable power $P_{WP,sch}^k$ is expressed as

$$0 \le P_{WP,sch}^k \le P_{WP}^k \qquad (9.9)$$

It is worth mentioning that the uncertain wind speed can be simulated via the forecast value $v_{fore}$ plus its deviation $\Delta v$ [34]. It has been verified that such a deviation follows a Gaussian distribution [35]. Therefore, $v$ and $\Delta v$ can be formulated as follows:

$$v = v_{fore} + \Delta v \qquad (9.10)$$


Fig. 9.2 Probability model of wind power, reprinted from [4], copyright 2019, with permission from IEEE

$$\Delta v \sim N(0, \sigma_v^2) \qquad (9.11)$$
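A short sketch of the wind power model in (9.7)-(9.11) is given below: the piecewise power curve plus a Gaussian deviation added to the forecast speed. The cut-in/rated/cut-out speeds, the forecast value and the deviation level are illustrative assumptions.

```python
import numpy as np

v_ci, v_ra, v_co, P_ra = 3.0, 12.0, 25.0, 1.0     # assumed curve parameters (m/s, p.u.)

# Eq. (9.8): curve coefficients so that the output is 0 at v_ci and P_ra at v_ra
a = P_ra * v_ci**3 / (v_ci**3 - v_ra**3)
b = P_ra / (v_ra**3 - v_ci**3)

def wind_power(v):
    """Eq. (9.7): available wind power for wind speed v."""
    if v < v_ci or v >= v_co:
        return 0.0
    return a + b * v**3 if v < v_ra else P_ra

# Eqs. (9.10)-(9.11): actual speed = forecast + zero-mean Gaussian deviation
rng = np.random.default_rng(1)
v_fore, sigma_v = 9.0, 0.8                        # assumed forecast speed and deviation level
v_samples = v_fore + rng.normal(0.0, sigma_v, size=1000)
p_samples = np.array([wind_power(v) for v in v_samples])
print(p_samples.mean(), p_samples.std())
```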

9.2.3 Wind Power Curtailment Based on Probability Model

In order to quantify the uncertainties of wind power, we (i) sample $N_{sim}$ forecast errors based on (9.11) via the Latin hypercube sampling approach [35] and calculate the corresponding actual wind speeds $v_i^k$ ($i = 1, 2, \ldots, N_{sim}$) by (9.10); then we calculate the segmental probability $P(v_{i-1}^k < v_{act}^k \le v_i^k)$ based on the cumulative probability $P(v_{act}^k \le v_i^k)$; (ii) we use (9.7)-(9.8) to compute the power outputs of the wind turbines according to the corresponding $v_i^k$. Therefore, we obtain the probability relationship between $P_{WP,i}^k$ and $P(v_{i-1}^k < v_{act}^k \le v_i^k)$, based on which we can calculate the necessary parameters as follows.

During each scheduling period, wind power curtailment (i.e., $P_{WP,cur}$) occurs when the scheduled wind power is less than the actual wind power output, which is determined by the deviation and its related probability. To this end, we can draw the probability distribution of $P_{WP,i,act} - P_{WP,sch}$ (i.e., $P_{WP,i,err}$), which is shown as the shaded area in Fig. 9.2. The mathematical expectation conditioned on $P_{WP,i,act} > P_{WP,sch}$ represents the wind power curtailment:

$$P_{WP,cur} = \sum_{i=1}^{N_{sim}} P\left(P_{WP,i,err} > 0\right) \cdot P_{WP,i,err} \qquad (9.12)$$

Besides, the curtailment cost is expressed as


Fig. 9.3 Coordinated stochastic EV-wind integrated power scheduling system, reprinted from [4], copyright 2019, with permission from IEEE

$$C_{WP,cur} = \alpha \cdot P_{WP,cur} \qquad (9.13)$$

in which $\alpha$ is the cost coefficient of wind power curtailment, which is set as 0.2 in our study. In addition, the required reserve cost of wind power $C_{WP,res}$ when $P_{WP,i,act} < P_{WP,sch}$ is

$$C_{WP,res} = -\beta \cdot \sum_{i=1}^{N_{sim}} P\left(P_{WP,i,err} < 0\right) \cdot P_{WP,i,err} \qquad (9.14)$$

where β is the cost coefficient of reserve, also set as 0.2 in this paper.
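The expected curtailment and reserve terms (9.12)-(9.14) can be estimated as sketched below, with one-dimensional Latin hypercube samples of the speed deviation pushed through the power curve of (9.7)-(9.8). The curve parameters, the forecast speed, the scheduled power and the deviation level are assumptions for illustration; the 0.2 values of alpha and beta follow the text.

```python
import numpy as np
from scipy.stats import norm

v_ci, v_ra, v_co, P_ra = 3.0, 12.0, 25.0, 1.0
a = P_ra * v_ci**3 / (v_ci**3 - v_ra**3)
b = P_ra / (v_ra**3 - v_ci**3)

def wind_power(v):
    """Vectorized form of the power curve (9.7)."""
    p = np.where(v < v_ra, a + b * np.clip(v, 0.0, None)**3, P_ra)
    return np.where((v < v_ci) | (v >= v_co), 0.0, p)

def lhs_normal(n, sigma, rng):
    """One-dimensional Latin hypercube samples of a zero-mean Gaussian deviation."""
    u = (np.arange(n) + rng.random(n)) / n          # one stratified point per probability interval
    return norm.ppf(rng.permutation(u)) * sigma

rng = np.random.default_rng(2)
N_sim = 200
dv = lhs_normal(N_sim, sigma=0.7, rng=rng)          # sampled speed deviations, Eq. (9.11)
v_fore, P_sch = 9.0, 0.35                           # assumed forecast speed (m/s) and scheduled power (p.u.)

err = wind_power(v_fore + dv) - P_sch               # P_WP,i,err for each scenario
prob = 1.0 / N_sim                                  # equiprobable LHS scenarios

P_cur = np.sum(prob * err[err > 0])                 # Eq. (9.12): expected curtailed power
C_cur = 0.2 * P_cur                                 # Eq. (9.13)
C_res = -0.2 * np.sum(prob * err[err < 0])          # Eq. (9.14): reserve cost for shortfalls
print(P_cur, C_cur, C_res)
```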

9.3 Coordinated Scheduling Model Integrating EV and Wind Power

The proposed coordinated scheduling model of the EV-wind integrated power system is shown in Fig. 9.3, which consists of thermal and wind generators, the power grid and the EVCS. In order to comprehensively explore the relationship among wind power curtailment, energy conservation (represented by generation cost) and pollution emission, the coordinated scheduling model based on the formulations discussed in Sect. 9.2


is proposed, which optimizes the three objective functions mentioned above while satisfying the prevailing operational constraints. Therefore, the coordinated scheduling model of the EV-WP integrated power system is formulated in (9.15)-(9.33) as follows:

$$\min J_{obj}(x_{des}), \quad x_{des} \in D_{cons} \qquad (9.15)$$

$$J_{obj1} = P_{WP,cur} \qquad (9.16)$$

$$J_{obj2} = F_{TL} \qquad (9.17)$$

$$J_{obj3} = E_{TP} \qquad (9.18)$$

$$F_{TL} = F_{WP} + F_{TP} + C_{WP,cur} + C_{WP,res} \qquad (9.19)$$

$$F_{WP} = \sum_{i=1}^{N_W} \omega_i P_{W,i} \qquad (9.20)$$

$$F_{TP} = \sum_{i=1}^{N_T} \left[ a_i + b_i P_{T,i} + c_i P_{T,i}^2 + \left| e_i \sin\left(f_i \left(P_{T,i,min} - P_{T,i}\right)\right) \right| \right] \qquad (9.21)$$

$$E_{TP} = \sum_{i=1}^{N_T} \left[ \alpha_i + \beta_i P_i + \gamma_i P_i^2 + \lambda_i \exp(\delta_i P_i) \right] \qquad (9.22)$$

$$x_{des} = (x_{TP}, x_{WP}, x_{EVCS}) \qquad (9.23)$$

$$\begin{cases} x_{TP} = \left\{ x_{TP}^k \right\} \\ x_{WP} = \left\{ x_{WP}^k \right\} \\ x_{EVCS} = \left\{ x_{EVCS}^k \right\} \end{cases}, \quad k = 0, 1, \cdots, T-1 \qquad (9.24)$$

$$\begin{cases} P_{T,i,min}^k = \max\left(P_{T,i,min},\; P_{T,i}^{k-1} - D_i\right) \\ P_{T,i,max}^k = \min\left(P_{T,i,max},\; P_{T,i}^{k-1} + U_i\right) \end{cases} \qquad (9.25)$$

$$U_{i,min}^k \le U_i^k \le U_{i,max}^k, \quad i \in N_{LN} \qquad (9.26)$$

$$TL_i^k \le TL_{i,max}^k, \quad i \in N_{bus} \qquad (9.27)$$

$$\begin{cases} S_{up}^k = \sum_{i=1}^{N_T} \left(P_{T,i,max}^k - P_{T,i}^k\right) \\ S_{up}^k \ge \beta_r P_{local}^k + w_p \sum_{i=1}^{N_W} P_{W,i}^k \end{cases} \qquad (9.28)$$


$$\begin{cases} S_{dn}^k = \sum_{i=1}^{N_T} \left(P_{T,i}^k - P_{T,i,min}^k\right) \\ S_{dn}^k \ge w_n \sum_{i=1}^{N_W} P_{W,i}^k \end{cases} \qquad (9.29)$$

$$P_{Gi} = P_{Di} - s_{farm} P_{farm} + s_{EVC} P_{EVC} + \sum_{j \in N_i} B_{ij} \left(\theta_i - \theta_j\right) \qquad (9.30)$$

$$s_{farm} = \begin{cases} 1, & \text{wind farm located} \\ 0, & \text{no wind farm located} \end{cases} \qquad (9.31)$$

$$s_{EVC} = \begin{cases} 1, & \text{EVCS located} \\ 0, & \text{no EVCS located} \end{cases} \qquad (9.32)$$

$$P_{farm} = P_{WP} \times N_W \qquad (9.33)$$

9.3.1 Objective Functions

This model has three objective functions, namely the wind power curtailment, the generation cost, and the emission of contaminants, especially NOx in this chapter. These objective functions are expressed as (9.16)-(9.18), respectively. In (9.17), $F_{TL}$ is the total cost of the generation system, which is represented by (9.19)-(9.21), where $F_{TP}$ and $F_{WP}$ are the generation costs, and $N_T$ and $N_W$ the numbers, of thermal generators and wind power turbines, respectively. Furthermore, $a_i$, $b_i$, $c_i$, $e_i$, $f_i$ and $P_{T,i,min}$ stand for the fuel cost coefficients and the minimum power output of the $i$th thermal generator. Moreover, $\omega_i$ is the wind power cost coefficient, which is set as 20 in our case. Besides, $E_{TP}$ in (9.18) is the NOx emission and can be formulated as (9.22), in which $\alpha_i$, $\beta_i$, $\gamma_i$, $\lambda_i$, $\delta_i$ are the NOx emission coefficients.
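As a concrete reading of (9.19)-(9.22), the sketch below evaluates the fuel cost with the valve-point term, the wind cost and the NOx emission for one period of a hypothetical three-unit system. The coefficient values are placeholders in the spirit of Table 9.1 rather than the exact table entries, and the curtailment/reserve costs in (9.19) are simply set to zero.

```python
import numpy as np

# Hypothetical coefficients for three thermal units (placeholders)
a, b, c = np.array([150.0, 25.0, 0.0]), np.array([200.0, 250.0, 300.0]), np.array([16.0, 100.0, 250.0])
e, f = np.array([50.0, 40.0, 0.0]), np.array([6.3, 9.8, 0.0])          # valve-point terms
P_min = np.array([0.5, 0.2, 0.1])
alpha = np.array([4.091, 2.543, 4.258]) * 1e-2                         # NOx emission coefficients
beta  = np.array([-5.554, -6.047, -5.094]) * 1e-2
gamma = np.array([6.490, 5.638, 4.586]) * 1e-2
lam, delta = np.array([2.0e-4, 5.0e-4, 1.0e-6]), np.array([2.875, 3.333, 8.0])

P_T = np.array([1.2, 0.5, 0.3])          # candidate thermal outputs (p.u.)
P_W = np.array([0.4])                    # scheduled wind power (p.u.)
omega = 20.0                             # wind cost coefficient given in the text

F_TP = np.sum(a + b * P_T + c * P_T**2 + np.abs(e * np.sin(f * (P_min - P_T))))   # Eq. (9.21)
F_WP = np.sum(omega * P_W)                                                        # Eq. (9.20)
E_TP = np.sum(alpha + beta * P_T + gamma * P_T**2 + lam * np.exp(delta * P_T))    # Eq. (9.22)
F_TL = F_WP + F_TP + 0.0 + 0.0                                                    # Eq. (9.19), zero curtailment/reserve cost
print(F_TL, E_TP)
```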

9.3.2 Decision Variables

This model considers the thermal power, wind power and power of the EVCS as scheduling variables. Therefore, the decision variables are formulated as (9.23). In addition, the total scheduling horizon is T = 24 h, with each time window being 1 h. Consequently, the decision variables are also expressed as (9.24).


9.3.3 Constraints

In order to guarantee secure operation of the EV-wind integrated power system, constraints with respect to the thermal generators, wind power turbines, power grid, EVCS and system reserve are presented as follows.

(i) Due to the slow change of steam pressure, the power output of a thermal generator is limited not only by its upper and lower output limits but also by ramping constraints. As shown in (9.25), $P_{T,i,min}^k$ and $P_{T,i,max}^k$ are the minimum and maximum power outputs of the $i$th thermal generator in the $k$th time period. Moreover, $D_i$ and $U_i$ stand for the ramping-down and ramping-up limits of the $i$th generator between two adjacent time periods.

(ii) In order to guarantee the operational security of the power grid, the voltage of each bus among the $N_{LN}$ load nodes should satisfy its limits in each period, which is formulated as (9.26). In detail, $U_i^k$ is the voltage of the $i$th load node in the $k$th period, and its maximum and minimum values are $U_{i,max}^k$ and $U_{i,min}^k$, respectively. Furthermore, $TL_i^k$ denotes the transmission power of the $i$th bus among the $N_{bus}$ buses, which is not allowed to exceed its upper limit (i.e., $TL_{i,max}^k$), as expressed in (9.27).

(iii) The EVCS is expected to meet the EV user demands, including the charging requirement and power constraints formulated in (9.5) and (9.6).

(iv) The whole generating system has its reserve constraints. The up-reserve (i.e., $S_{up}^k$) constraint is shown in (9.28), in which $P_{local}^k$ is the local power demand in the $k$th period; $w_p$ and $\beta_r$ stand for the up-reserve coefficients of wind power and local power demand, which are set as 5% and 15%, respectively. The down-reserve (i.e., $S_{dn}^k$) constraint is formulated as (9.29), and $w_n$ is the down-reserve coefficient of wind power, which is set the same as the up-reserve parameter.

(v) The integration of wind power and the EVCS affects the power flows. We employ $s_{farm}$ and $s_{EVC}$ to represent the integration state of the wind farm and the EVCS at the $i$th bus, respectively, as shown in (9.31)-(9.32). Hence the power flow equations can be expressed as (9.30), where $B_{ij}$ is the transfer susceptance between buses $i$ and $j$. Besides, $P_{Gi}$ and $P_{Di}$ are the injected active power and demanded active power of bus $i$, respectively, and $N_i$ is the set of buses adjacent to bus $i$. Furthermore, $P_{farm}$ denotes the active power output of a wind farm that has $N_W$ generators, which can be calculated by (9.33).
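The sketch below illustrates how the period-wise limits of (9.25) and the reserve requirements of (9.28)-(9.29) can be checked for a candidate dispatch; all numerical data are illustrative assumptions rather than the case-study values.

```python
import numpy as np

P_min, P_max = np.array([0.5, 0.2, 0.1]), np.array([2.0, 0.8, 0.5])   # static limits (p.u.)
RU, RD = np.array([0.5, 0.3, 0.15]), np.array([0.5, 0.3, 0.15])       # ramp limits per period
P_prev = np.array([1.0, 0.4, 0.3])                                    # outputs in period k-1
P_now  = np.array([1.3, 0.5, 0.2])                                    # candidate outputs in period k

# Eq. (9.25): effective limits in period k combine static and ramping limits
lo = np.maximum(P_min, P_prev - RD)
hi = np.minimum(P_max, P_prev + RU)
ramp_ok = np.all((lo <= P_now) & (P_now <= hi))

# Eqs. (9.28)-(9.29): system up/down reserves against local demand and wind
P_wind, P_local = 0.4, 2.2
w_p = w_n = 0.05                     # 5% wind reserve coefficients
beta_r = 0.15                        # 15% up-reserve coefficient for local demand

S_up = np.sum(hi - P_now)
S_dn = np.sum(P_now - lo)
print(ramp_ok, S_up >= beta_r * P_local + w_p * P_wind, S_dn >= w_n * P_wind)
```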

9.4 Solution of Coordinated Stochastic Scheduling Model

9.4.1 The Parameter Adaptive DE Algorithm

As the proposed coordinated stochastic scheduling model is a complex, non-convex and multi-objective problem, we employ a multi-objective differential evolution (DE)


Fig. 9.4 Operation process of PADE algorithm, reprinted from [4], copyright 2019, with permission from IEEE

algorithm, due to its effective, convenient and reliable performance in dealing with complex optimization problems. However, the efficiency of the traditional DE algorithm is indeed affected by the selection of the mutation operator $F_r$ and crossover operator $C_r$. In this chapter, we propose a parameter adaptive DE (PADE) algorithm to deal with this issue, by which the control parameters are updated along with the population of DE. It is known that the traditional DE algorithm contains a vector of population, which can be represented as

$$x_{i,g} = \left\{ x_{i,g,j} \right\}, \quad i = 0, 1, \cdots, N_p - 1;\; g = 0, 1, \cdots, Iter - 1;\; j = 0, 1, \cdots, D - 1 \qquad (9.34)$$

where $N_p$, $Iter$ and $D$ stand for the scale of the population, the maximum number of generations and the dimension of the population, respectively. In general, Fig. 9.4 shows the operation process of PADE, which consists of control parameter determination, evaluation and non-dominated sorting for each individual, and other evolution operations. Specially, PADE introduces the assistant control parameters $F_{i,g}^{ass}$ and $C_{i,g}^{ass}$ to form the control parameters. Therefore, the population vector $x_{i,g}$ of PADE is reformulated as

$$x_{i,g} = \left( x_{i,g}^{p}, x_{i,g}^{c} \right), \quad x_{i,g}^{c} = \left( F_{i,g}^{ass}, C_{i,g}^{ass} \right) \qquad (9.35)$$

Additionally, $F_r$ and $C_r$ are expected to be determined by the assistant control parameters in a reasonable way. The control parameters should inherit information from their parents


to accelerate the speed of evolution. In detail, the mutation and crossover operations of the control parameters are formulated as follows:

$$u_{i,g}^{c} = \begin{cases} x_{r0,g}^{c} + F_{i,g}^{ass} \left( x_{r1,g}^{c} - x_{r2,g}^{c} \right), & \mathrm{rand}(0,1) \le C_{i,g}^{ass} \\ x_{i,g}^{c}, & \text{else} \end{cases} \qquad (9.36)$$

$$C_{i,g}^{r} = \begin{cases} C_{minr} + \mathrm{rand}(0,1) \left( C_{maxr} - C_{minr} \right), & \mathrm{rand} \le C_{i,g}^{ass} \\ C_{i,g-1}^{r}, & \text{else} \end{cases} \qquad (9.37)$$

$$F_{i,g}^{r} = \begin{cases} F_{minr} + \mathrm{rand}(0,1) \left( F_{maxr} - F_{minr} \right), & \mathrm{rand} \le F_{i,g}^{ass} \\ F_{i,g-1}^{r}, & \text{else} \end{cases} \qquad (9.38)$$

where $u_{i,g}^{c}$ is the population of assistant control parameters. Besides, non-dominated sorting is commonly used to obtain the Pareto front [36]. Hence, PADE adopts the non-dominated sorting method to sort the individuals in the population and obtain the optimal Pareto front. Similar to the traditional DE algorithm, the mutation population and crossover population can be expressed as

$$v_{i,g}^{p} = x_{r0,g}^{p} + F_{i,g}^{r} \left( x_{r1,g}^{p} - x_{r2,g}^{p} \right) \qquad (9.39)$$

$$u_{i,g,j}^{p} = \begin{cases} v_{i,g,j}^{p}, & \mathrm{rand}(0,1) \le C_{i,g}^{r} \\ x_{i,g,j}^{p}, & \text{else} \end{cases} \qquad (9.40)$$

Using the above operations, the algorithm generates the next population according to the fitness and objective values:

$$x_{i,g+1}^{p} = \begin{cases} u_{i,g}^{p}, & \mathrm{fit}(u_{i,g}^{p}) \ge \mathrm{fit}(x_{i,g}^{p}) \\ x_{i,g}^{p}, & \text{else} \end{cases}, \quad N_{cons}^{p} = N_p \qquad (9.41)$$

$$x_{i,g+1}^{p} = \begin{cases} u_{i,g}^{p}, & \mathrm{fit}(u_{i,g}^{p}) \le \mathrm{fit}(x_{i,g}^{p}) \\ x_{i,g}^{p}, & \text{else} \end{cases}, \quad N_{cons}^{p} \ne N_p \qquad (9.42)$$

where $\mathrm{fit}(x)$ is the fitness function of population $x$, and $N_{cons}^{p}$ represents the number of individuals that satisfy all the constraints.
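To convey the mechanics of (9.34)-(9.42), the following is a deliberately simplified, single-objective Python sketch: every individual carries its own F and Cr, these parameters are occasionally regenerated before being used in the DE/rand/1 mutation and binomial crossover, and a greedy selection keeps the better trial. The regeneration rule is a jDE-style stand-in for the assistant-parameter scheme of (9.36)-(9.38), and the non-dominated sorting and constraint bookkeeping of the full PADE are omitted.

```python
import numpy as np

rng = np.random.default_rng(3)
Np, D, Iter = 30, 10, 200
F_min, F_max, C_min, C_max = 0.1, 0.9, 0.1, 0.9

def objective(x):                 # placeholder objective standing in for the dispatch model
    return np.sum(x**2, axis=-1)

pop = rng.uniform(-5, 5, size=(Np, D))
F, Cr = rng.uniform(F_min, F_max, Np), rng.uniform(C_min, C_max, Np)
fit = objective(pop)

for g in range(Iter):
    for i in range(Np):
        r0, r1, r2 = rng.choice([j for j in range(Np) if j != i], size=3, replace=False)
        # self-adapted control parameters, cf. (9.37)-(9.38)
        F_i  = F_min + rng.random() * (F_max - F_min) if rng.random() < 0.1 else F[i]
        Cr_i = C_min + rng.random() * (C_max - C_min) if rng.random() < 0.1 else Cr[i]
        # DE/rand/1 mutation and binomial crossover, cf. (9.39)-(9.40)
        v = pop[r0] + F_i * (pop[r1] - pop[r2])
        mask = rng.random(D) < Cr_i
        mask[rng.integers(D)] = True
        u = np.where(mask, v, pop[i])
        # greedy selection, cf. (9.41)-(9.42)
        fu = objective(u)
        if fu <= fit[i]:
            pop[i], fit[i], F[i], Cr[i] = u, fu, F_i, Cr_i

print(fit.min())
```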

9.4.2 Decision-Making Method

The Pareto front obtained by the optimization algorithm contains a set of feasible solutions, from which one solution must be determined for scheduling.


Here, we use the fuzzy decision (FD) method [36] in our work. As shown in (9.43)-(9.44), FD employs a normalized weighted membership approach to calculate the eclectic value (i.e., $EC(i)$) of each individual, based on the objective importance weight factors $w_q$ ($q = 1, \ldots, D$). In addition, $J_q^{min}$ and $J_q^{max}$ are the acceptable limits of objective function $q$. Furthermore, we determine the eclectic solution $x_{EC}$ via (9.45), according to the eclectic values calculated by (9.44).

$$f_{J_q}(x) = \begin{cases} 1, & J_q(x) \le J_q^{min} \\ \dfrac{J_q^{max} - J_q(x)}{J_q^{max} - J_q^{min}}, & J_q^{min} \le J_q(x) \le J_q^{max} \\ 0, & J_q(x) \ge J_q^{max} \end{cases} \qquad (9.43)$$

$$EC(i) = \frac{\sum_{q=1}^{3} w_q f_{J_q}(x_i)}{\sum_{i=1}^{N_p} \sum_{q=1}^{3} w_q f_{J_q}(x_i)}, \quad i = 0, \ldots, N_P - 1 \qquad (9.44)$$

$$EC(i_{max}) = \max\left(EC(0), \ldots, EC(i)\right), \quad x_{EC} = x(i_{max}), \quad i = 0, \ldots, N_P - 1 \qquad (9.45)$$
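The fuzzy decision of (9.43)-(9.45) reduces to a few lines of array arithmetic, sketched below for the three Case 1 solutions listed in Table 9.2. The equal importance weights are an assumption; the chapter does not state them.

```python
import numpy as np

# Rows = Pareto solutions A, B, C of Case 1; columns = (curtailment/MWh, cost/$, emission/ton)
front = np.array([[504.62, 30414.20, 13.20],
                  [ 21.19, 46359.18, 11.20],
                  [ 73.06, 40402.02, 11.75]])
w = np.array([1/3, 1/3, 1/3])                               # assumed equal importance weights

J_min, J_max = front.min(axis=0), front.max(axis=0)
mu = np.clip((J_max - front) / (J_max - J_min), 0.0, 1.0)   # Eq. (9.43): linear memberships
ec = (mu @ w) / np.sum(mu @ w)                              # Eq. (9.44): normalized weighted membership
best = int(np.argmax(ec))                                   # Eq. (9.45): eclectic solution
print(best, front[best])
```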

9.4.3 Solution Procedure

The solution procedure is described in Fig. 9.5. Based on the wind power and EV information, the optimization model is established according to the relationships proposed in Sects. 9.2 and 9.3. Furthermore, the PADE algorithm, which includes parameter adjustment, initialization, mutation, crossover and selection operations, is used to solve the coordinated optimization problem and obtain the Pareto front. Finally, we adopt the FD method to obtain the eclectic solution.

9.5 Case Study

9.5.1 Case Description

In order to verify the performance of the proposed scheduling model and the modified optimization algorithm, we employ a modified Midwestern US power system to conduct simulation studies. In this system, a wind farm with 100 turbines is integrated at bus 22 and an EVCS is located at bus 10, as shown in Fig. 9.6. Furthermore, the forecast values of the power requirements of the EVCS, together with the power outputs of the wind farm, are set as shown in Fig. 9.7 for 24 time intervals. Considering the uncertainties of wind power, its forecast error is set as 8% of the forecast values.


Fig. 9.5 Solution procedure of the proposed problem, reprinted from [4], copyright 2019, with permission from IEEE

Furthermore, the coefficients of the thermal generators are shown in Table 9.1, which includes the fuel cost coefficients, NOx emission coefficients, and the limits of power output and ramping, respectively. In order to investigate the performance of the PADE algorithm and the proposed scheduling model, we set $N_p = 50$, $Iter = 1800$, and conduct the following three cases for simulation studies:

Table 9.1 Coefficients of thermal generators, reprinted from [4], copyright 2019, with permission from IEEE

BUS   a/p.u.   b/p.u.  c/p.u.  d/p.u.  e/p.u.  α/0.01  β/0.01  γ/0.01  ξ/0.000001  λ      Pmin/p.u.  Pmax/p.u.  RU/p.u.  RD/p.u.
1     150.0    200.0   16.00   50.00   6.300   4.091   −5.554  6.490   200.0       2.875  0.80       2.00       0.50     0.50
2     25.00    250.0   100.0   40.00   9.800   2.543   −6.047  5.638   500.0       3.333  0.30       0.80       0.30     0.30
5     −5.554   6.490   200.0   2.875   0.000   4.258   −5.094  4.586   1.000       8.000  0.20       0.50       0.15     0.15
8     0.000    325.0   83.00   0.000   0.000   5.326   −3.550  3.380   2000        2.000  0.15       0.35       0.20     0.20
11    0.000    300.0   250.0   0.000   0.000   4.258   −5.049  4.586   1.000       8.000  0.10       0.30       0.15     0.15
13    0.000    300.0   250.0   0.000   0.000   6.131   −5.555  5.151   10.00       6.667  0.12       0.40       0.15     0.15

(a-e: fuel cost coefficients; α, β, γ, ξ, λ: emission coefficients; Pmin, Pmax: power limits; RU, RD: ramping limits)



Fig. 9.6 Topology of a modified Midwestern US power system, reprinted from [4], copyright 2019, with permission from IEEE

Case 1: 100 wind power turbines, 9000 EVs;
Case 2: 50 wind power turbines, 9000 EVs;
Case 3: 50 wind power turbines, 1000 EVs.

9.5.2 Result Analysis

In this section, the PADE algorithm and the FD method are used to obtain the optimal Pareto solutions, and the typical solutions for the three cases are shown in Table 9.2. The Pareto front of Case 1 is shown in Fig. 9.8, and Fig. 9.9 presents the corresponding 2D views. In Fig. 9.8, we mark three typical solutions A, B and C. Solutions A and B are extreme ones, as they correspond to the minimum generation cost and


Fig. 9.7 Forecast values of wind power and charging demand of EVCS, reprinted from [4], copyright 2019, with permission from IEEE

Table 9.2 Result of typical solutions, reprinted from [4], copyright 2019, with permission from IEEE

Case    Solution  Wind power curtailment/MWh  Generation cost/$  Pollution emission/ton
Case 1  A         504.62                      30414.20           13.20
        B         21.19                       46359.18           11.20
        C         73.06                       40402.02           11.75
Case 2  A'        195.47                      29374.29           13.15
        B'        0.61                        38466.22           11.10
        C'        32.42                       33885.88           12.31
Case 3  A''       354.37                      20593.52           11.25
        B''       57.50                       29835.87           10.12
        C''       132.29                      25664.06           10.59

wind power curtailment, respectively. On the other hand, Solution C is obtained by using the FD method, via a balanced consideration of the multiple objectives. As detailed in Table 9.2, Solution A attains the minimum generation cost, i.e., $30414.20. However, it corresponds to the maximum pollution emission and wind curtailment, i.e., 13.20 ton and 504.62 MWh, respectively. On the contrary, Solution B minimizes the wind power curtailment and pollution emission compared with Solutions A and C. Therefore, it is evident that


Fig. 9.8 Pareto front of Case 1, reprinted from [4], copyright 2019, with permission from IEEE

Fig. 9.9 2D Pareto fronts of Case 1, reprinted from [4], copyright 2019, with permission from IEEE


Fig. 9.10 Pareto fronts of Case 1 and Case 2, reprinted from [4], copyright 2019, with permission from IEEE

selecting A or B as the final scheduling solution is not reasonable, and we should comprehensively consider the multiple objectives. Therefore, the FD method is adopted to obtain the eclectic Solution C, whose objective values are 73.06 MWh, $40402.02, and 11.75 ton, respectively. Consequently, it is concluded that a higher utilization of wind power corresponds to less pollution emission and higher generation cost. This is because more adoption of wind power means less usage of thermal power, which leads to less pollution emission. On the other hand, more integration of wind power contributes more wind generation cost, as wind power has a high generation cost coefficient $\omega_i$, which increases the total generation cost of the power system. In addition, in order to investigate the impact of the wind power scale on the scheduling solution, we compare the typical solutions of Case 1 and Case 2, i.e., A, B, C, A', B' and C', as shown in Fig. 9.10. In general, the solutions of Case 1 have larger wind power curtailments than those of Case 2. Specifically, the wind power curtailment of A is larger than that of A', because the wind power absorption is limited by the generation cost, as the increase in wind power integration raises the generation cost. The objective values of these six typical solutions are also listed in Table 9.2. Moreover, in order to further investigate the impacts of wind power, the wind absorption power, thermal power outputs and EVCS power demands of the typical solutions are shown in Table 9.3. Besides, the thermal and wind power outputs of these two cases are shown in Fig. 9.11. It can be observed from both Table 9.3 and Fig. 9.11 that the wind power absorption increases by 40.25 MWh, 493.53 MWh and 348.16 MWh for the pairs A~A', B~B' and C~C', respectively. Specifically, the power requirements of the EVCS in these two


Table 9.3 Wind absorption power, thermal output and EVCS power of Case 1, Case 2 and Case 3, reprinted from [4], copyright 2019, with permission from IEEE

Solution  Wind absorption power/MWh  Thermal output/MWh  EVCS power/MWh
A         292.88                     6654.55             1653.30
A'        252.63                     6697.40             1653.30
A''       20.41                      5365.26             183.69
B         1194.68                    5692.83             1653.30
B'        701.15                     6195.76             1653.30
B''       514.37                     4838.95             183.69
C         857.73                     6039.76             1653.30
C'        509.57                     6401.76             1653.30
C''       226.14                     5098.09             183.70

Fig. 9.11 Thermal power and wind power output when NWP=50 and NWP=100, reprinted from [4], copyright 2019, with permission from IEEE

cases are both 1653.30 MWh, because Case 1 and Case 2 have the same EV scale. In order to satisfy the requirement of increased wind power absorption, thermal generators reduce almost the same power outputs as the increase of wind power. Note that the generation cost is mainly determined by thermal and wind power outputs. Therefore, the generation cost of Case 1 is $ 6516.14 more than that of Case 2, as the wind power generation cost coefficient is high. The above analyses indicate that the scheduling system, which considers the minimum generation cost, presents limited ability to integrate wind power. Furthermore, we compare Case 2 and Case 3 to illustrate the impact of EV scale on scheduling results. Figure 9.12 shows the Pareto solutions of these two cases, among


Fig. 9.12 Pareto fronts when NEV=1000 and NEV=9000, reprinted from [4], copyright 2019, with permission from IEEE

Fig. 9.13 Thermal power and wind power output when NEV=1000 and NEV=9000, reprinted from [4], copyright 2019, with permission from IEEE

which the typical solutions A', B', C', A'', B'' and C'' are marked, respectively. Obviously, the wind power curtailments of Case 3 are larger than those of Case 2, since the increased EV load in Case 2 enables the system to utilize more wind power to substitute thermal power. For more details, the objective values and other measured indexes of these six typical solutions are also shown in Tables 9.2 and 9.3, respectively. It is noted that the thermal power outputs of Case 2 are larger than those of Case 3, as shown in Fig. 9.13.


Table 9.4 HV, MED and spacing values obtained by DE and PADE, reprinted from [4], copyright 2019, with permission from IEEE

Algorithm       HV       MED     Spacing
Traditional DE  6.3×10⁶  4525.1  6352.6
PADE            6.7×10⁶  2545.5  6709.4

Specifically, the thermal output of the eclectic solution increases from 5098.09 MWh to 6401.76 MWh when the number of EVs increases from 1000 to 9000, because the increase in wind absorption power is less than the rising demand of the EVCS. Therefore, the generation cost and pollution emission of C'' and C' increase from $25664.06 to $33885.88 and from 10.59 ton to 12.31 ton, respectively, due to the increased power outputs of the thermal generators. Besides, the wind power curtailment decreases from 132.29 MWh to 32.42 MWh, which is caused by the increase in EVCS load from 183.70 MWh to 1653.30 MWh. Consequently, wind power absorption is limited by the energy requirements of the EV load. Therefore, the scale of EVs is expected to be matched to a certain wind power scale to lower the generation cost and increase the absorption of wind power.

9.5.3 Algorithm Performance Analysis

The hypervolume (HV) indicator is widely used to evaluate the performance of algorithms for multi-objective optimization problems [37]. It measures both the convergence and the diversity of the Pareto front; therefore, a better Pareto front is reflected by a larger HV value. In order to evaluate the search efficiency and stability of the PADE algorithm, we use Case 3 to conduct 20 independent tests, and the HV values are shown in Fig. 9.14. The HV values fluctuate during approximately the first 90 iterations and keep increasing subsequently, which manifests that PADE satisfies the constraints and converges gradually. The 20 final HV values all lie within [6.665×10⁶, 6.854×10⁶], which means that PADE presents good robustness and stability. We also conduct 20 independent tests for the traditional DE, and its best Pareto front, together with the one obtained by PADE, is shown in Fig. 9.15. As can be seen, it is not easy to tell which front is better. Therefore, two other indexes, the spacing index and the mean Euclidean distance (MED) [37], are calculated together with HV to conduct a comprehensive comparison, as shown in Table 9.4. It is observed that the HV value obtained by PADE is 6.7×10⁶, which is larger than that of the traditional DE, i.e., 6.3×10⁶, indicating that PADE obtains a better Pareto front. Besides, the MED and spacing values for PADE are 2545.5 and 6709.4, which are superior to those of DE. This clearly shows that the proposed algorithm is advantageous in obtaining more convergent and evenly distributed Pareto solutions.
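For reference, the HV indicator discussed above can be approximated by the Monte Carlo sketch below, which measures the fraction of a reference box dominated by a minimization front. The reference point and the use of the three Case 1 solutions as a toy front are assumptions for illustration; this is an estimator, not the exact HV computation used in the chapter.

```python
import numpy as np

def hv_monte_carlo(front, ref, n=200_000, seed=0):
    """Monte Carlo estimate of the hypervolume of a minimization front w.r.t. a reference point."""
    rng = np.random.default_rng(seed)
    lo = front.min(axis=0)
    pts = rng.uniform(lo, ref, size=(n, front.shape[1]))
    dominated = np.zeros(n, dtype=bool)
    for f in front:                       # a sample is dominated if some front member is <= it in every objective
        dominated |= np.all(f <= pts, axis=1)
    return dominated.mean() * np.prod(ref - lo)

front = np.array([[504.62, 30414.20, 13.20],
                  [ 21.19, 46359.18, 11.20],
                  [ 73.06, 40402.02, 11.75]])
ref = np.array([600.0, 50000.0, 14.0])    # assumed reference point
print(hv_monte_carlo(front, ref))
```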


Fig. 9.14 HV values of 20 independent runs, reprinted from [4], copyright 2019, with permission from IEEE

Fig. 9.15 Pareto front of the traditional and improved DE algorithms, reprinted from [4], copyright 2019, with permission from IEEE

Furthermore, the evolution of the HV values for the above Pareto fronts is presented in Fig. 9.16. After 1800 iterations, the HV value of the PADE algorithm reaches 6.7×10⁶ while that of the traditional DE is only 6.3×10⁶. Moreover, the population of the PADE algorithm satisfies all constraints after around 90 iterations, whereas the HV value of the traditional DE still varies before approximately the 150th iteration. This indicates that PADE satisfies the constraints more rapidly while finally deriving more convergent Pareto solutions.


Fig. 9.16 HV values of the traditional DE and PADE, reprinted from [4], copyright 2019, with permission from IEEE

9.6 Conclusion

This chapter proposes a coordinated stochastic scheduling model of the EV-WP integrated power system to comprehensively investigate the relationship among wind power curtailment, generation cost and pollution emission. Specifically, the proposed model considers the uncertainties of wind power and calculates the wind power curtailment from probabilistic information. Besides, we propose the parameter-adaptive DE algorithm to solve the above optimization problem in an efficient way. Simulations based on a modified Midwestern US power system are conducted. The results indicate that a higher utilization of wind power corresponds to less pollution emission and higher generation cost, because of both the decrease in thermal power outputs and the high generation cost coefficient of wind power. Moreover, we compare multiple cases with different scales of wind power and EVs. It is noted that wind power absorption is restricted by the generation cost, which indicates that a larger wind scale could correspond to higher wind power curtailment and generation cost. In addition, we also explore the impacts of the EV scale, and the results show that wind power absorption is constrained by the energy requirements of EV loads. Therefore, the scale of EVs is expected to be matched to a certain wind power scale to improve the performance of the power system. Additionally, the results obtained by the traditional DE and PADE reveal that PADE has better performance. The effect of the price coefficients on the system scheduling performance is left for further study.

References


References 1. L. Zhang, W. Lee, Z. Ding, Y. Lu, D. Chen, A stochastic resource-planning scheme for phev charging station considering energy portfolio optimization and price-responsive demand. 54, 5590–5598, IEEE (2018) 2. A. Kulvanitchaiyanunt, V.C. Chen, J. Rosenberger, P. Sarikprueck, W.J. Lee, Novel hybrid market price forecasting method with data clustering techniques for ev charging station application. IEEE Trans. Indus. Appl. 51(3), 1987–1996 (2015) 3. A.A. Stoorvogel, D. Wu, T. Yang, J. Stoustrup, Distributed optimal coordination for distributed energy resources in power systems. IEEE Trans. Autom. Sci. Eng. 14(2), 414–424 (2017) 4. T. Zhao, M. Yu, Y. Liu, Y. Li, Z. Ni, L. Wu, Coordinated stochastic scheduling for improving wind power adsorption in electric vehicles-wind integrated power systems by multi-objective optimization approach. 2019 IEEE Industry Applications Society Annual Meeting, pp. 1–10 (2019) 5. X. Zhao, D. Luo, Driving force of rising renewable energy in China: environment, regulation and employment. In Renew. Sustain. Energy Rev. 68, 48–56 (2017) 6. S. Zhang, K. Luo, X. Zhao, Q. Cai, The substitution of wind power for coal-fired power to realize China’s CO2 emissions reduction targets in 2020 and 2030. Energy 120, 164–178 (2017) 7. V.J. Karplus, T. Qi, X. Zhang, The energy and CO2 emissions impact of renewable energy development in China. Energy Policy 68, 60–69 (2014) 8. S.T. Revankar, M.S. Kumar, Development scheme and key technology of an electric vehicle: an overview. Renew. Sustain. Energy Rev. 70, 1266–1285 (2017) 9. K. Ahn, C. Fiori, H.A. Rakha, Power-based electric vehicle energy consumption model: model development and validation. Appl. Energy 168, 257–268 (2016) 10. D. Ouyang, J. Du, Progress, of Chinese electric vehicles industrialization in 2015: a review. Appl. Energy 188, 529–546 (2017) 11. Z. Lukszo, M. Weijnen, Y. Li, C. Davis, Electric vehicle charging in China’s power system: energy, economic and environmental trade-offs and policy implications. Appl. Energy 173, 535–554 (2016) 12. V. Modi, M. Waite, Modeling wind power curtailment with increased capacity in a regional electricity grid supplying a dense urban demand. Appl. Energy 183, 299–317 (2016) 13. Y. Tan, J. Yan, H. Li, P.E. Campana, Feasibility study about using a stand-alone wind power driven heat pump for space heating. Appl. Energy 228, 1486–1498 (2018) 14. Y. Mu, C. Xu, Y. Jiang, H. Jia, X. Li, Coordinated control for ev aggregators and power plants in frequency regulation considering time-varying delays. Appl. Energy 210, 1363–1367 (2018) 15. B. Zhang, Z. Li, W. Wu, B. Wang, Adjustable robust real-time power dispatch with large-scale wind power integration. IEEE Trans. Sustain. Energy 6, 357–368 (2015) 16. X. Zhang, Z. Wu, P. Zeng, Q. Zhou, A solution to the chance-constrained two-stage stochastic program for unit commitment with wind energy integration. IEEE Trans. Power Syst. 31(6), 4185–4196 (2016) 17. M. He, S. Abedi, D. Obadina, Congestion risk-aware unit commitment with significant wind power generation. IEEE Trans. Power Syst. 33, 6861–6869 (2018) 18. H. Saboori, R. Hemmati, Stochastic risk-averse coordinated scheduling of grid integrated energy storage units in transmission constrained wind-thermal systems within a conditional value-at-risk framework. Energy 113, 762–775 (2016) 19. J. Teh, I. Cotton, Reliability impact of dynamic thermal rating system in wind power integrated network. IEEE Trans. Reliabil. 65(12), 1081–1089 (2016) 20. M. Humayun, A. Safdarian, M. Ali, M.Z. 
Degefa, M. Lehtonen, Increased utilization of wind generation by coordinating the demand response and real-time thermal rating. IEEE Trans. Power Syst. 31(5), 3737–3746 (2016) 21. B. Shen, S. Dang, J. Zhang, Y. Zhou, N. Zhang, Z. Hu, A source-grid-load coordinated power planning model considering the integration of wind power generation. Appl. Energy 168, 13–24 (2016)


22. Y. Sun, Y. Jiang, J. Xu, Day-ahead stochastic economic dispatch of wind integrated power system considering demand response of residential hybrid energy system. Appl. Energy 190, 1126–1137 (2017) 23. E. Heydarian-Forushani, M.D. Somma, G. Graditi, Stochastic optimal scheduling of distributed energy resources with renewables considering economic and environmental aspects. Renew. Energy 116, 272–287 (2018) 24. S. Lu, Z. Luo, C. Wu, W. Gu, J. Wang, Optimal operation for integrated energy system considering thermal inertia of district heating network and buildings. Appl. Energy 199, 234–246 (2017) 25. Y. Sun, Y. Bao, S. Liao, J. Xu, B. Tang, Control of energy-intensive load for power smoothing in wind power plants. IEEE Trans. Power Syst. 33(6), 6142–6154 (2018) 26. G. Graditi, M.C. Falvo, P. Siano, Electric vehicles integration in demand response programs. 2014 International Symposium on Power Electronics, Electrical Drives, Automation and Motion, pp. 548–553 (2014) 27. G. Deconinck, X. Guan, Z. Qiu, Y. Yang, Q. Jia, Z. Hu, Distributed coordination of ev charging with renewable energy in a microgrid of buildings. IEEE Trans. Smart Grid 9, 6253–6264 (2018) 28. M. Aliakbar-Golkar, M. Fowler, A. Elkamel, A. Ahmadian, M. Sedghi, Two-layer optimization methodology for wind distributed generation planning considering plug-in electric vehicles uncertainty: a flexible active-reactive power approach. Energy Convers. Manage. 124, 231– 246 (2016) 29. Q. Jia, Q. Huang, X. Guan, Robust scheduling of ev charging load with uncertain wind power integration. IEEE Trans. Smart Grid 9(2), 1043–1054 (2018) 30. P. Meibom, J. Kiviluoma, Influence of wind power, plug-in electric vehicles, and heat storages on power system investments. Energy 35(3), 1244–1255 (2010) 31. Z. Li, W. Tian, Y. Sun, Z. Chen, M. Shahidehpour, Ev charging schedule in coupled constrained networks of transportation and power system. IEEE Trans. Smart Grid 10(5), 4706–4716 (2019) 32. G. Krajacic, N. Duic, P. Prebeg, G. Gasparovic, Long-term energy planning of croatian power system using multi-objective optimization with focus on renewable energy and integration of electric vehicles. Appl. Energy 184, 1493–1507 (2016) 33. M. Wietschel, D. Dallinger, Grid integration of intermittent renewable energy sources using price-responsive plug-in electric vehicles. Renew. Sustain. Energy Rev. 16(5), 3370–3382 (2012) 34. F. Bouffard, F.D. Galiana, Stochastic security for operations planning with significant wind power generation. IEEE Trans. Power Syst. 23(2), 306–316 (2008) 35. M. Lange, On the uncertainty of wind power predictions-analysis of the forecast accuracy and statistical distribution of errors. J. Solar Energy Eng. 127(2), 177–284 (2005) 36. M.R. Narimani, T. Niknam, R. Azizipanah-Abarghooee, An efficient scenario-based stochastic programming framework for multi-objective optimal micro-grid operation. Appl. Energy 99, 455–470 (2012) 37. M.S. Li, J.P. Zhan, Y.Z. Li, Q.H. Wu, Mean-variance model for power system economic dispatch with wind power integrated. Energy 72, 510–520 (2014)

Chapter 10

Many-Objective Distribution Network Reconfiguration Using Deep Reinforcement Learning-Assisted Optimization Algorithm

10.1 Introduction

In recent years, energy and environmental crises have been significant obstacles to the sustainable development of our society. In order to lessen these crises, renewable energy (RE) has received increasing attention and been utilized widely worldwide. For instance, at the end of 2019, the overall installed capacity of RE generation reached 2537 GW in the world, and the capacities of wind and solar power were 623 GW and 586 GW, respectively [1]. This shows that highly penetrated RE power systems are gradually taking shape. Although RE contributes significant economic benefits for mitigating the energy crisis and climate warming, the high penetration of RE brings great challenges to the operation of power systems, such as the distribution network (DN). For example, uncertain, highly penetrated RE affects the distribution and direction of power flow in the DN [2]. Therefore, it may lead to increased power loss [3], voltage deviations [4], degradation of power quality [5], etc. To overcome these challenges, one of the strategies is the curtailment of RE, to a certain extent [6]. For instance, Chalise et al. [7] propose a method of active power curtailment of RE for overvoltage prevention, which is verified to be adequate for RE penetration in the DN. Martin-Martinez et al. [8] study the influence of RE curtailments on the security and reliability of the power system and further analyze different RE curtailment scenarios. Besides, considering that fast-varying solar power could result in unexpected voltage violations, Li et al. [9] propose an efficient distributed online voltage control algorithm adopting RE curtailments to maintain the bus voltages within the acceptable ranges in the DN. It should be noted that curtailment severely restricts the utilization and development of RE [10]. As one of the most crucial tools, distribution network reconfiguration (DNR) has been widely used to improve the operation of the DN. Takenobu et al. [11] propose a DNR model for annual reconfiguration scheduling, which determines the reconfiguration period to achieve the minimum annual power loss. Asrari


et al. [12] take the time-varying nature of loads into account and propose an adaptive fuzzy-based parallel genetic algorithm to solve the DNR problem. This aims to optimize the power loss, voltage deviation and the number of switching. Furthermore, in [13], the load balancing and power loss are optimized simultaneously. The optimal configurations on different test networks are obtained. Besides, Capitanescu et al. [14] analyzed the potential of DNR in improving the hosting capacity of wind and power generation. The above study verifies the significant potential of DNR. However, with the increasing RE integration, how to well utilize the DNR in improving the operations of DN and absorbing more RE is worthy of investigated. The main reason is that RE is expected to be utilized as much as possible according to the requirements of various countries (some even decreed laws), and the complete absorption should not worsen the operations of DN. Therefore, an issue is thus naturally inspired. Is it possible to improve the operation of the DN while increasing the absorption of RE via DNR? In other words, the voltage deviation, network power loss, statistic voltage stability, generation cost, etc., should not be significantly affected by the high penetration of RE. This means the above indexes that manifest operations of DN, together with the amount of RE curtailment, shall be simultaneously optimized to achieve a balance. The issue refers to the many-objective optimization problem. It is worth noting that, unlike multi-objective optimization, the many-objective optimization problem contains no less than four objectives. Accordingly, as the number of objectives increases, the number of non-dominated solutions [15] also increases substantially, which makes traditional multi-objective optimization algorithms lose the selection driver [16]. Therefore, it significantly weakens the searching ability of conventional algorithms when confronted with many objectives, such as non-dominated sorting genetic algorithm II (NSGA-II) [15], multi-objective particle swarm optimization (MOPSO) [17], and the multi-objective bacterial foraging optimization algorithm (MBFO) [18]. Note that MBFO has been proposed in recent years, and its performance is verified to be better than NSGA-II and MOPSO [18]. In order to solve the drawback that the multi-objective optimization algorithm performs poorly in dealing with many-objective optimization problems, this paper proposes a deep reinforcement learning (DRL)-assisted MBFO (DRL-MBFO). The novelty of DRL-MBFO is the combination of DRL and MBFO, i.e., using Deep-QNet (DQN) to assist population evolution. It effectively solves the problem of slow population evolution due to reduced selection driver and dramatically improves the searchability of algorithms. To this end, this paper aims to establish a many-objective distribution network reconfiguration (MDNR) model for considering uncertain RE, and the five objectives are investigated as follows: (1) the RE curtailments, (2) the voltage deviation, (3) the power loss, (4) the statistic voltage stability (L) and (5) the generation cost. Then, we propose a novel algorithm of DRL-MBFO to address the MDNR model efficiently. Thereby, a Pareto front is obtained, which manifests the trade-off among these objectives to be achieved for decision-making. The main contributions of the paper are twofold.


(1) An MDNR model with highly penetrated uncertain RE is proposed for assessing the many objectives of DN operation. This model can be used to promote the absorption of RE while guaranteeing power quality and economic benefits. At the same time, the additional objectives also bring the model closer to real-world scenarios. (2) We propose a deep reinforcement learning-assisted MBFO, which can find more convergent Pareto solutions with a faster searching speed in the many-objective optimization problem. The remainder of this chapter is organized as follows. Section 10.2 presents the MDNR model considering the integration of uncertain RE. Section 10.3 shows the details of the DRL-MBFO. Numerical simulations are conducted in Sect. 10.4. In Sect. 10.5, conclusions are drawn.

10.2 Many-Objective Distribution Network Reconfiguration Model

10.2.1 Problem Formulations

The objective functions are minimized by adjusting the decision variables of the MDNR problem under the premise that the equality and inequality requirements are satisfied. We use PV power as an example to analyze the RE curtailments without loss of generality. The objective functions in this chapter are: (1) PV power curtailment, (2) voltage deviation, (3) power loss, (4) statistic voltage stability and (5) generation cost. On the other hand, the related constraints include the power flow equations, the output power of the generators, the limitations of branch capacity and node voltage, and the constraints of network topology. The status of the tie switches of the DN serves as the decision variable. Therefore, the MDNR is presented in the form of the following formulations:

$$\min_{u} F(x, u, S) \qquad (10.1)$$

$$P_i = P_{\xi_s,i} + V_i \sum_{j=1}^{N_B} V_j \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right) \qquad (10.2)$$

$$Q_i = Q_{\xi_s,i} + V_i \sum_{j=1}^{N_B} V_j \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right) \qquad (10.3)$$

$$V_{i,min} \le V_i \le V_{i,max} \qquad (10.4)$$

$$S_k \le S_{k,max} \qquad (10.5)$$


$$P_{g,min} \le P_g \le P_{g,max} \qquad (10.6)$$

$$Q_{g,min} \le Q_g \le Q_{g,max} \qquad (10.7)$$

$$g \in G \qquad (10.8)$$

where (10.1) is the vector of objective functions and x and u denote the DN system state and decision variable. S stands for the set of sampling scenarios of PV power, which are usually to represent uncertainties and could be obtained by the two-point estimation method [19]. (10.2)–(10.3) represent the power flow equations, which are non-convex. Pi and Q i are denoted as injected active and reactive power at the ith bus (i = 1, 2, . . . N B ), and N B is the number of DN buses. Pξs,i and Q ξs,i represent the integrated active and reactive PV power at bus i with respect to the sample ξs (s = 1, 2, . . . , N S ), and N S is the number of the sampling scenarios. Vi and V j are the voltage at the ith and the jth bus. G i j and Bi j stand for the conductance and susceptance between buses i and j, respectively. θi j denotes the voltage angle difference between bus i and j. Inequality constraints (10.4)–(10.7) stand for the operational limits of the DN. Vi,min and Vi,max are the minimal and maximum voltage magnitudes at bus i. Sk denotes the transmission capacity of branch k, corresponding, Sk,max is the maximum transmission capacity of branch k (k = 1, 2, . . . , N K ), and N K is the number of the branch. Pg and Q g stand for the active and reactive power output of the generator g (g = 1, 2, . . . , NG ), and NG denotes the total number of generators. Besides, Pg,min , Pg,max , Q g,min and Q g,max are their corresponding lower and upper bounds, respectively. Finally, (10.8) is the topology constraint, i.e., there is only one possible path between each bus and the generator [20], g represents the topology of DN after reconfiguration, and G is the set of topology that satisfies the radial structure.
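The operational limits (10.4)-(10.7) that every PV scenario must respect can be verified with a few array comparisons, as in the sketch below; the bus, branch and generator data are invented placeholders, and the power flow solution of (10.2)-(10.3) is assumed to come from an external solver.

```python
import numpy as np

V = np.array([1.01, 0.99, 0.96, 1.03])            # solved bus voltage magnitudes (p.u.)
S_branch = np.array([0.8, 1.1, 0.4])              # solved branch loadings (p.u.)
P_g, Q_g = np.array([1.5]), np.array([0.3])       # generator outputs (p.u.)

V_min, V_max = 0.95, 1.05
S_max = np.array([1.0, 1.2, 1.0])
P_lim, Q_lim = (0.0, 2.0), (-0.5, 0.5)

ok = (np.all((V_min <= V) & (V <= V_max))                      # Eq. (10.4)
      and np.all(S_branch <= S_max)                            # Eq. (10.5)
      and np.all((P_lim[0] <= P_g) & (P_g <= P_lim[1]))        # Eq. (10.6)
      and np.all((Q_lim[0] <= Q_g) & (Q_g <= Q_lim[1])))       # Eq. (10.7)
print(ok)
```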

10.2.2 Objectives

(1) PV Power Curtailment

As previously stated, PV power is expected to be absorbed as much as possible. However, the operation of the DN is seriously threatened by high PV power penetration, which directly leads to the violation of several limitations in (10.2)–(10.7). Therefore, this work employs a gradual curtailment strategy to ensure PV absorption while satisfying the restrictions; the specific procedure is depicted in Fig. 10.1. In Fig. 10.1, the integrated PV power γ and its absorption α are shown by the points on the left and right vertical lines, respectively. When the integrated PV power γ does not violate the operating limitations (10.2)–(10.7), the PV power absorption is


Fig. 10.1 Gradual curtailment strategy, reprinted from Ref. [21], copyright 2022, with permission from IEEE

equal to the integrated power, for example, at the points p and p'. However, in order to ensure the normal operation of the DN, the curtailment method must be activated if the integrated PV power violates some limitations in (10.2)–(10.7). For instance, γ will gradually drop through a number of curtailments until the constraints (10.2)–(10.7) are satisfied, as illustrated by the point q. In this case, the integrated PV power q does not equal the power absorption q', and the difference between q and q' is the PV power curtailment. The detailed steps of the gradual curtailment strategy are as follows; a code sketch follows Eq. (10.9).

Step 1: Initialize the curtailment counter N_C = 0.
Step 2: Solve the power flow equations (10.2)–(10.3). If the power flow converges, the power system state x is obtained; then check constraints (10.4)–(10.7). If constraints (10.4)–(10.7) are satisfied simultaneously, go to Step 4.
Step 3: If the power flow does not converge or any of (10.4)–(10.7) is violated, curtail the PV power by C_i, update N_C = N_C + 1 and γ = γ − C_i, and return to Step 2.
Step 4: Calculate the total curtailment from the counter N_C, i.e., C = Σ_{i=1}^{N_C} C_i.

Finally, taking the stochastic PV power into account, each PV power scenario has a corresponding probability, and the PV power curtailment of the DN can be calculated by the following formula, which is minimized to support PV absorption:

$$\min_{u} \mathrm{Cur}(x, u, S) = \sum_{s=1}^{N_S} p_s \cdot C_s \tag{10.9}$$

where p_s represents the occurrence probability of scenario s and C_s is the PV power curtailment under this scenario.
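A minimal sketch of the gradual curtailment loop described in Steps 1–4 is given below. The helper functions `solve_power_flow` and `limits_satisfied`, the curtailment step sizes `C`, and the scenario data are hypothetical placeholders, not part of the original model.

```python
def gradual_curtailment(gamma, C, solve_power_flow, limits_satisfied):
    """Curtail integrated PV power gamma step by step (Steps 1-4).

    gamma: integrated PV power for one scenario
    C: list of curtailment amounts C_i applied at each curtailment
    solve_power_flow(gamma) -> (converged, state x)
    limits_satisfied(x) -> True if constraints (10.4)-(10.7) hold
    Returns the total curtailment C_s for this scenario.
    """
    n_c = 0                                       # Step 1: curtailment counter
    while True:
        converged, x = solve_power_flow(gamma)    # Step 2: run power flow
        if converged and limits_satisfied(x):
            break                                 # go to Step 4
        gamma -= C[n_c]                           # Step 3: curtail C_i and retry
        n_c += 1
    return sum(C[:n_c])                           # Step 4: total curtailment


def expected_curtailment(p, C_s):
    """Scenario-weighted curtailment of Eq. (10.9)."""
    return sum(ps * cs for ps, cs in zip(p, C_s))
```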


(2) Voltage Deviation

Voltage deviation (VD) is a crucial indicator of power quality. However, the voltage deviation at the DN buses may become worse because of the highly integrated RE, which not only endangers the secure operation of the DN but also adversely impacts the service life of electrical equipment. In order to decrease the voltage deviation, we minimize

$$\min_{u} VD(x, u, S) = \sum_{s=1}^{N_S} p_s \cdot \frac{1}{N_B}\sum_{i=1}^{N_B} \left| V_i - V_{ref} \right| \tag{10.10}$$

where V_ref is the reference voltage value (usually set as 1.0). In (10.10), we first calculate the average VD over all N_B buses in the DN and then obtain the expected VD over the PV scenarios using their probabilities p_s.

(3) Power Loss

One crucial metric to assess the economics of DN operation is the power loss (PL). The distribution and direction of the power flow are affected by RE with high penetration, potentially leading to a considerable rise in PL. The PL is calculated as follows:

$$\min_{u} PL(x, u, S) = \sum_{s=1}^{N_S} p_s \cdot \sum_{k=1}^{N_K} R_k \frac{P_k^2 + Q_k^2}{U_k^2} \tag{10.11}$$

where R_k is the resistance of branch k and U_k stands for the bus voltage at the end of this branch. In (10.11), we first compute the sum of PL over all branches and then obtain the expectation of PL over the PV scenarios.

(4) Static Voltage Stability

The L-index can indicate the static voltage stability: the lower the voltage stability, the higher the L-index. It intuitively reflects how far the current operating point of a load bus is from the voltage collapse point. The voltage stability level of the system can be assessed using the maximum L-index over all load buses, and it can also be applied as a quantitative index of the voltage stability margin of the current operating point. The L-index objective is formulated as:

$$\min_{u} L(x, u, S) = \max_{i \in N_B} \sum_{s=1}^{N_S} p_s \cdot \left| 1 - \frac{\sum_{g=1}^{N_G} F_{i,g} V_g}{V_i} \right| \tag{10.12}$$

where V_g and F_{i,g} are the voltage at generator g and the element of matrix F, respectively, with

$$F = -Y_{LL}^{-1} Y_{LG} \tag{10.13}$$

where Y_{LL} represents the admittance matrix among the load buses, while Y_{LG} stands for the admittance matrix between the load buses and the generators.
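The per-bus L-index of (10.12)–(10.13) can be sketched as below. The partition of the bus admittance matrix into Y_LL and Y_LG and the scenario weighting are written in a generic form; the array shapes and names are illustrative assumptions.

```python
import numpy as np

def l_index(Y_LL, Y_LG, V_load, V_gen):
    """L-index of each load bus, per Eqs. (10.12)-(10.13).

    Y_LL: admittance matrix among load buses (n_L x n_L, complex)
    Y_LG: admittance matrix between load buses and generators (n_L x n_G, complex)
    V_load, V_gen: complex bus voltages of the load and generator buses
    """
    F = -np.linalg.inv(Y_LL) @ Y_LG              # Eq. (10.13)
    return np.abs(1.0 - (F @ V_gen) / V_load)    # |1 - sum_g F_ig V_g / V_i|

def expected_max_l_index(p_s, l_per_scenario):
    """Scenario-weighted L-index objective: max over buses of the expectation."""
    # l_per_scenario: array of shape (N_S, n_L), one L-index vector per PV scenario
    expected = np.tensordot(p_s, l_per_scenario, axes=1)   # sum_s p_s * L_i(s)
    return expected.max()
```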


(5) Generation Cost

Additionally, reducing the cost of generation significantly impacts the DN's economic gains. The generation cost (GC) is computed as follows:

$$\min_{u} GC(x, u, S) = \sum_{s=1}^{N_S} p_s \cdot \sum_{g=1}^{N_G} \left[ c_g P_{G_g}^2 + b_g P_{G_g} + a_g + \left| d_g \sin\!\left(e_g \left(P_{G_g}^{\min} - P_{G_g}\right)\right) \right| \right] \tag{10.14}$$

where a_g, b_g and c_g are the cost coefficients, and d_g and e_g are the coefficients of the sinusoidal term that models the valve-point effect [22]. P_{G_g} and P_{G_g}^{min} represent the power output of generator g and its minimum value, respectively.
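As a small illustration, the scenario-weighted generation cost of (10.14), including the valve-point term, might be evaluated as follows; the coefficient values are hypothetical.

```python
import math

def generation_cost(P, a, b, c, d, e, P_min):
    """Cost of one generator with valve-point effect, inner term of Eq. (10.14)."""
    return c * P**2 + b * P + a + abs(d * math.sin(e * (P_min - P)))

def expected_generation_cost(p_s, dispatch_per_scenario, coeffs):
    """Sum the generator costs per scenario and weight by scenario probability p_s."""
    total = 0.0
    for ps, dispatch in zip(p_s, dispatch_per_scenario):
        scenario_cost = sum(
            generation_cost(P, *coeffs[g]) for g, P in enumerate(dispatch)
        )
        total += ps * scenario_cost
    return total

# Hypothetical example: one scenario, two generators with (a, b, c, d, e, P_min)
coeffs = [(10.0, 2.0, 0.01, 1.5, 0.06, 0.5), (12.0, 1.8, 0.012, 2.0, 0.05, 0.4)]
print(expected_generation_cost([1.0], [[1.2, 0.9]], coeffs))
```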

10.3 Deep Reinforcement Learning-Assisted Multi-objective Bacterial Foraging Optimization Algorithm

As discussed in the previous section, the proposed model is a non-convex many-objective optimization problem. It could be converted into a single-objective optimization problem via the weighted sum method; however, this approach is unsuitable for the proposed MDNR model, as it cannot handle the non-convex optimization problem well. Multi-objective evolutionary algorithms such as MOPSO, NSGA-II and MBFO provide an alternative way to deal with this problem. However, although these algorithms perform excellently on problems with two or three objectives, they encounter great difficulties in many-objective optimization. The primary reason is that, as the number of objectives increases, most candidate solutions in the population become non-dominated. Traditional algorithms then lose their selection pressure, i.e., candidate solutions can no longer find suitable evolutionary directions toward better optimization performance. To solve this problem, we introduce the DRL technique to assist the algorithm in finding suitable evolutionary directions. Specifically, DRL conducts offline learning on the basis of training data and then provides references for the actions of candidate solutions to obtain better optimization performance. In this paper, we select DRL to assist MBFO, as MBFO is a recently proposed algorithm and performs better than NSGA-II and MOPSO.


10.3.1 Multi-objective Bacterial Foraging Optimization Algorithm

MBFO is an emerging evolutionary algorithm that simulates the foraging behavior of bacteria. It solves optimization problems via the following operations, i.e., chemotaxis, reproduction and dispersal, as shown in Fig. 10.2a. The chemotaxis operation has two basic movements, swimming and tumbling: bacteria tumble randomly to change the direction of their movement and move forward by swimming. The reproduction operation simulates the evolutionary process of bacteria and keeps the good individuals. Afterward, some bacteria may mutate, causing dispersal, and new individuals are randomly generated in the feasible region. Through these operations, the bacteria evolve over iterations, as shown in Fig. 10.2b, and the Pareto front can be obtained after several iterations, as presented in Fig. 10.2c.

Fig. 10.2 Operations of MBFO, reprinted from Ref. [21], copyright 2022, with permission from IEEE
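A minimal sketch of the chemotaxis movement (tumble plus swim) described above is given below; the step size, swim length and toy fitness function are illustrative assumptions rather than the parameter settings used in this chapter.

```python
import numpy as np

def chemotaxis_step(position, fitness, step_size=0.1, max_swim=4):
    """One chemotaxis operation of a bacterium: tumble to a random direction,
    then keep swimming in that direction while the fitness keeps improving."""
    direction = np.random.uniform(-1.0, 1.0, size=position.shape)
    direction /= np.linalg.norm(direction)          # tumble: random unit direction
    best_pos, best_fit = position, fitness(position)
    for _ in range(max_swim):                        # swim while improving
        candidate = best_pos + step_size * direction
        cand_fit = fitness(candidate)
        if cand_fit < best_fit:                      # minimization assumed
            best_pos, best_fit = candidate, cand_fit
        else:
            break
    return best_pos

# Hypothetical usage on a toy objective
sphere = lambda x: float(np.sum(x**2))
print(chemotaxis_step(np.array([1.0, -2.0]), sphere))
```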


Fig. 10.3 Procedure of DQN, reprinted from Ref. [21], copyright 2022, with permission from IEEE

10.3.2 Deep Reinforcement Learning

DRL is an artificial intelligence technique that combines deep learning and reinforcement learning (RL). DQN [23] is one of the pioneering works of DRL, built on deep neural networks (DNN) and RL. Therein, RL is used to achieve autonomous learning driven by the agent's goal, and the DNN is adopted to address the problem of agent perception. The procedure of DQN is shown in Fig. 10.3. There are three parts in Fig. 10.3, i.e., the agent, the environment and the replay memory. From an overall view, the agent first obtains the state s from the environment, processes it via the neural network Q_eval_net and the action selection module, and outputs an action a to the environment. After that, the environment updates the agent's state to the next state s' through the state update module, calculates the reward r by the reward module, stores (s, a, r, s') in the replay memory and finally feeds the next state s' back to the agent. Here, s is the current state of the agent, a stands for the action of the agent, r denotes the reward obtained when the agent takes action a in state s, and s' represents the next state after the agent carries out action a. Inside the agent, there are two neural networks with the same structure but different parameters: Q_eval_net and Q_target_net, where θ represents the weight parameters of Q_eval_net and θ⁻ those of Q_target_net. Q_eval_net is used to predict the Q-values of all actions in state s. Similarly, Q_target_net is used to predict the Q-values of all actions in state s', which assists in updating the Q-value. When the number of samples in the replay memory reaches a certain size, the agent randomly extracts a fixed number of samples from the replay memory to train Q_eval_net. Assume the training sample is (s, a, r, s'). First, s is input to Q_eval_net to obtain the Q-values of all actions in state s, and a determines which Q-value Q(s, a; θ) will be updated; here Q(s, a; θ) is the Q-value output by Q_eval_net for state s and action a. Similarly, s' is input to Q_target_net to obtain the Q-values of all actions in state s', and then the


maximum Q-value max_{a'} Q(s', a'; θ⁻) is combined with r to form TargetQ. Therein, max_{a'} Q(s', a'; θ⁻) is the maximum of all Q-values output by Q_target_net for state s', and a' represents the action corresponding to this maximum. The update function of the Q-value is shown in (10.15). At the same time, the weight parameter θ is updated according to the loss function at each iteration, which is shown in (10.16), and (10.17) gives the calculation of TargetQ.

$$Q(s, a; \theta) \leftarrow Q(s, a; \theta) + \alpha \left[ r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta) \right] \tag{10.15}$$

$$L(\theta) = \mathbb{E}\left[ \left( TargetQ - Q(s, a; \theta) \right)^2 \right] \tag{10.16}$$

$$TargetQ = r + \gamma \max_{a'} Q(s', a'; \theta^{-}) \tag{10.17}$$

where γ and α stand for the discount factor and the learning rate, respectively. In the training process, after every C episodes, the weight parameters θ are copied to θ⁻ to complete the update of Q_target_net. The above loop is repeated until the termination condition is satisfied. The trained DQN can then output the Q-values of all actions for the current state s; the larger the Q-value of an action, the more likely an attractive reward will be obtained.
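A minimal tabular sketch of the update in (10.15)–(10.17) is given below; the tiny state/action spaces are illustrative assumptions, and in this chapter the Q-values are produced by neural networks rather than a table.

```python
import numpy as np

def dqn_style_update(Q_eval, Q_target, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Tabular analogue of Eqs. (10.15)-(10.17):
    TargetQ = r + gamma * max_a' Q_target(s', a'), then move Q_eval toward it."""
    target_q = r + gamma * np.max(Q_target[s_next])        # Eq. (10.17)
    td_error = target_q - Q_eval[s, a]                     # inside Eq. (10.16)
    Q_eval[s, a] += alpha * td_error                       # Eq. (10.15)
    return td_error ** 2                                   # one sample of the loss (10.16)

# Hypothetical toy problem: 3 states, 2 actions
Q_eval = np.zeros((3, 2))
Q_target = np.zeros((3, 2))
loss = dqn_style_update(Q_eval, Q_target, s=0, a=1, r=1.0, s_next=2)
# every C updates the target parameters are refreshed: Q_target[:] = Q_eval
```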

10.3.3 Multi-objective Bacterial Foraging Optimization Algorithm Based on Deep Reinforcement Learning

The multi-objective bacterial foraging optimization algorithm has strong randomness, which makes its search efficiency poor and its convergence slow; nevertheless, it has a strong ability to find the global optimum while avoiding local optima. MBFO can effectively solve various multi-objective optimization problems, but it is not computationally friendly, especially in high-dimensional objective spaces. Deep reinforcement learning (DRL), a combination of reinforcement learning (RL) and deep learning (DL), has excellent learning capabilities. In recent years, it has achieved remarkable results in various fields such as speech [24], visual identification [25], autonomous driving [26] and games [27], and is attracting growing attention. This paper combines DRL with MBFO, using a deep Q-network to assist the bacteria in their chemotactic strategy. With the assistance of DQN, the population evolves in a better direction. The virtue of the proposed DRL-MBFO is that it greatly improves the search efficiency while ensuring the global search capability, even in many-objective optimization. To solve the MDNR problem more effectively through the combination of DQN and MBFO, we improve the traditional DQN algorithm to make it more suitable for high-dimensional discrete optimization problems. The procedure of the improved DQN is detailed in Algorithm 1.


Algorithm 1 Improved Deep Q-learning, reprinted from Ref. [21], copyright 2022, with permission from IEEE
1: Initialize replay memory D with capacity N
2: Initialize Q_eval_net with random weights θ
3: Initialize Q_target_net with random weights θ⁻
4: Initialize other parameters
5: for episode = 1 to M do
6:   Randomly initialize the state s_0
7:   for t = 1 to T do
8:     for i = 1 to n do
9:       Input s_{t,i} into DNN_i to get the Q-values of all actions
10:      Choose action a_{t,i} according to ε-greedy
11:      Update the state to s'_{t,i} according to s_{t,i} and a_{t,i}
12:      Calculate the fitness value f_{t,i} of s'_{t,i}
13:      Set r_{t,i} ← −1 if ∃ j, A_j ≺ f_{t,i}; otherwise r_{t,i} ← 1
14:      if r_{t,i} = 1 then
15:        Store f_{t,i} in A
16:      end if
17:      Store the transition (s_{t,i}, a_{t,i}, r_{t,i}, s'_{t,i}) in D_i
18:      Set s_{t,i+1} ← s'_{t,i}
19:      Sample a random minibatch of transitions from D_i
20:      Set y_{k,i} ← r_{k,i} + γ max_{a'_{k,i}} Q⁻(s'_{k,i}, a'_{k,i}; θ_i⁻)
21:      Perform a gradient descent step on (y_{k,i} − Q(s_{k,i}, a_{k,i}; θ_i))² with respect to the network parameters θ_i
22:      Every C steps reset θ_i⁻ ← θ_i
23:    end for
24:    Set s_{t+1,1} ← s'_{t,n}
25:  end for
26: end for

where D = [D_1, D_2, ..., D_n]^T denotes the replay memory, N = [N_1, N_2, ..., N_n]^T represents the capacity of D, and n is the dimension of the decision variable u. s_{t,i} = [s^1_{t,i}, s^2_{t,i}, ..., s^i_{t,i}, ..., s^n_{t,i}]^T stands for the state of the agent at iteration indices t and i; in this paper, the state also represents the decision variable u. a_{t,i} is the action applied to the ith dimension of the state space at iteration indices t and i. f_{t,i} and A represent the fitness of s'_{t,i} and the non-dominated archive, respectively. Notice that θ = [θ_1, θ_2, ..., θ_n]^T; accordingly, the number of DNNs in Q_eval_net is n, i.e., Q_eval_net = [DNN_1, DNN_2, ..., DNN_n]^T. Analogously, Q_target_net = [DNN⁻_1, DNN⁻_2, ..., DNN⁻_n]^T.

First, D, N, θ, θ⁻ and the other parameters are initialized. When the outermost loop starts, the state s_0 is initialized randomly. After the innermost loop starts, s_{t,i} is input to Q_eval_net, which outputs the Q-values of all actions in state s_{t,i}; then ε-greedy chooses an action a_{t,i} to execute, and s'_{t,i} is calculated according to the following formula:

$$s'_{t,i} = s_{t,i} + a_{t,i} = [s^1_{t,i}, \ldots, s^i_{t,i} + a_{t,i}, \ldots, s^n_{t,i}] \tag{10.18}$$


Next, the fitness f_{t,i} of s'_{t,i} is estimated and the reward r_{t,i} is calculated as follows:

$$r_{t,i} = \begin{cases} -1, & \text{if } \exists j,\; A_j \prec f_{t,i} \\ 1, & \text{otherwise} \end{cases} \tag{10.19}$$

where A_j is a non-dominated solution in A, and A_j ≺ f_{t,i} means that f_{t,i} is dominated by A_j. If r_{t,i} = 1, f_{t,i} is stored in A. Afterward, the tuple (s_{t,i}, a_{t,i}, r_{t,i}, s'_{t,i}) is stored in D_i, and s_{t,i+1} = s'_{t,i} is set. So far, most of the loop has been completed, and the next step is to train Q_eval_net. First, a fixed number of samples is randomly extracted from D_i to train DNN_i, updating θ_i by gradient descent. During training, after every C episodes, θ_i is copied to θ_i⁻. After the innermost loop ends, s_{t+1,1} = s'_{t,n} is set for the next cycle. Finally, the above loop is repeated until the termination condition is satisfied.

In the above algorithm, since the number of DNNs in Q_eval_net is n, Q_eval_net can output the Q-values of all actions in each dimension, which are used to select actions dimension by dimension. Therefore, the problem that a single DNN can only output a limited number of Q-values and is difficult to combine with high-dimensional optimization algorithms is solved. In the proposed DRL-MBFO, Q_eval_net is first trained according to Algorithm 1, which is called offline learning. Only after Q_eval_net completes its training can it be used to assist the bacterial population to evolve, and this process is referred to as online assistance in this paper. In online assistance, the decision variables u correspond to the state s of the bacteria. In the chemotaxis operation, a bacterium inputs its state s_t into DNN_1, corresponding to the first dimension, and DNN_1 outputs the Q-values of all actions Q_all in this dimension. The action to be executed is then chosen through roulette wheel selection; before this, in order to prevent the state of the bacterium from getting worse, the ReLU function is used to filter out Q-values less than 0. Therefore, only actions with a Q-value greater than 0 are applied to the state, updating it as in (10.18). This process is repeated until the states of all dimensions have been updated, yielding s_{t+n}; finally, the fitness f of the state s_{t+n} and the dominance relations among the bacteria are computed. All non-dominated solutions are preserved in a non-dominated archive Ψ as in (10.20), and Ψ is updated with (10.21):

$$\Psi = \Psi \cup X_{best} \tag{10.20}$$

$$\Psi = \Psi - B \tag{10.21}$$

where X_best is the newly found non-dominated solution and B represents the set of solutions that previously belonged to Ψ but are now dominated by X_best. After the chemotaxis operation, the crowding distance of each bacterium is calculated. First, according to the dominance relation, the bacteria are divided into two parts, N_D and N_nonD, where N_D contains the dominated bacteria and N_nonD the non-dominated ones. Then the two parts of bacteria are sorted separately according to the crowding distance:


Fig. 10.4 Multi-objective bacterial foraging optimization algorithm based on DQN, reprinted from Ref. [21], copyright 2022, with permission from IEEE

the smaller the crowding distance, the higher the ranking. After that, part N_D is placed after part N_nonD, and the half of the bacteria at the bottom is eliminated. The remaining half of the bacteria cross over to generate new bacteria in order to keep the population size constant. This is the reproduction operation of DRL-MBFO. Finally, bacteria are eliminated and dispersed with a certain probability: when a bacterium satisfies the elimination-dispersal probability, it is eliminated and a new individual is randomly generated in the feasible region. The above process is repeated until the termination conditions are met, where N_c, N_re and N_ed are the numbers of cycles of the chemotaxis, reproduction, and elimination-dispersal operations, respectively. In summary, although most optimization algorithms severely lose their selection pressure in many-objective optimization, with the assistance of DQN the bacteria still move in a better direction, which makes DRL-MBFO more effective than other algorithms in many-objective optimization. The flowchart of the proposed DRL-MBFO is shown in Fig. 10.4. A sketch of the dominance test and archive update used above is given below.
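A minimal sketch of the Pareto-dominance test, the reward rule (10.19) and the archive update (10.20)–(10.21) is given below; it assumes minimization of all objectives and uses illustrative data structures rather than the exact ones of the original implementation.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (all objectives minimized)."""
    return all(a <= b for a, b in zip(f_a, f_b)) and any(a < b for a, b in zip(f_a, f_b))

def reward(archive, f_new):
    """Eq. (10.19): -1 if some archive member dominates the new fitness, else 1."""
    return -1 if any(dominates(f_a, f_new) for f_a in archive) else 1

def update_archive(archive, f_new):
    """Eqs. (10.20)-(10.21): add the new non-dominated solution and drop members it dominates."""
    if reward(archive, f_new) == 1:
        archive = [f_a for f_a in archive if not dominates(f_new, f_a)]
        archive.append(f_new)
    return archive

# Hypothetical usage with two objectives
archive = [(0.2, 0.7), (0.5, 0.3)]
archive = update_archive(archive, (0.1, 0.4))
print(archive)   # (0.2, 0.7) is dominated by (0.1, 0.4) and removed
```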


Fig. 10.5 Modified IEEE-33 system, reprinted from Ref. [21], copyright 2022, with permission from IEEE

10.4 Simulation Studies

10.4.1 Simulation Settings

In order to test the effectiveness of the proposed MDNR model and DRL-MBFO, we conduct case studies on a modified IEEE-33 system, whose topology is shown in Fig. 10.5. The slack generator is located at bus 1, and four PV power systems are placed at buses 13, 21, 24 and 31, respectively. The rated active power of each PV power system is 2.0 MW, and a constant power factor of 0.95 is used. The standard irradiance of the PV power systems is 1000 W/m², and the irradiance at which the PV conversion efficiency reaches its maximum is 300 W/m². In addition, the forecast irradiances of the four PV systems are 288 W/m², 464 W/m², 392 W/m² and 1200 W/m², respectively. Under these conditions, the forecast integrated PV power is approximately 2.16 MW, i.e., 58.1% of the total system demand.

10.4.2 Simulation Result and Analysis

After solving the MDNR model with DRL-MBFO, a series of non-dominated solutions is obtained. However, the number of objective functions in the MDNR model is 5, which is difficult to visualize in a traditional Cartesian coordinate system. Therefore, this paper draws the Pareto front with a parallel axis plot, as shown in Fig. 10.6. In Fig. 10.6, the crossing lines reflect the trade-offs between different non-dominated solutions. Moreover, high curtailment of PV power is usually accompanied by low VD, PL and L. On the contrary, when the amount of PV power curtailment decreases, indices such as VD and L rise, which creates hidden dangers for the safe operation of the distribution network.


Fig. 10.6 Parallel axis plot of five objective values, reprinted from Ref. [21], copyright 2022, with permission from IEEE

Table 10.1 Values of objectives for the alternative solutions

Solution   Cur      VD       PL       L        GC
S1         0        0.1947   0.7391   0.5456   12.3889
S2         0.1078   0.1677   0.6049   0.4703   12.2278
S3         0.2155   0.1727   0.4957   0.3198   12.2187
S4         0.4311   0.0948   0.2725   0.1735   12.1718
S5         0.6466   0.0199   0.0553   0.0373   12.1616

Usually, a final solution must be chosen from the non-dominated set. Therefore, the Preference Ranking Organization Method for Enrichment of Evaluations (PROMETHEE) is used for the final solution selection, for which the relative weights of Cur, VD, PL, L and GC must be set. The weight value represents the importance of the objective: the greater the relative weight, the more important the objective. In this paper, the relative weights are set to different combinations, and the results are given in Table 10.1, where S1, S2, ..., S5 represent different alternative solutions whose relative weights are set to [w1, w2, w3, w4, w5] = (0.999, 0.001, 0.001, 0.001, 0.001), (0.75, 0.0625, 0.0625, 0.0625, 0.0625), (0.5, 0.125, 0.125, 0.125, 0.125), (0.25, 0.1875, 0.1875, 0.1875, 0.1875) and (0.001, 0.25, 0.25, 0.25, 0.25), respectively. The distribution of the alternative solutions over the objectives is shown in Fig. 10.7. It is easy to observe from Table 10.1 that the value of Cur increases as its relative weight decreases. For S1, the relative weight of Cur is close to 1 and Cur reaches its minimum value 0; however, the other objective functions show overall higher values, which poses a threat to the safety and economy of the DN. For S3, the relative weight of Cur still reaches 50%, but PL and L decrease significantly compared with S1 and S2.


Fig. 10.7 Values of objectives for alternative solutions

Concurrently, since the weights of the objectives of S4 are similar, a good balance among the objectives is reached. Finally, for S5, the weight of Cur is reduced to 0.001; therefore, the value of Cur reaches its maximum, 0.6466. In reality, the distribution network is a complicated large-scale system, and we cannot take only Cur or VD into account, since that would cause the negatively correlated indicators to deteriorate. Therefore, compared with multi-objective or single-objective optimization, many-objective optimization is more practical. Meanwhile, in an actual power system, the weighting among the objectives is very complicated, and even a slight disturbance may change the final selected scheme. Hence, in this paper, we adopt w0 = [0.2, 0.2, 0.2, 0.2, 0.2] as the final choice of relative weights. The corresponding solution is S0 = [0.5388, 0.0536, 0.1615, 0.1033, 12.1521]. The decision variables corresponding to S0–S5 are given in Table 10.2.

10.4.3 Comparison with Other Algorithms

To benchmark the proposed method, we compare DRL-MBFO with MBFO, MOPSO and NSGA-II. Note that the number of swims in traditional MBFO is not constrained, which makes its computational scale difficult to determine. Over 30 independent runs, the ratio of the average computational scale of DRL-MBFO to MBFO is 1:1.036 when DRL-MBFO and MBFO run 200 and 90 iterations, respectively. Thereby, in this paper, the number of iterations of MBFO is set to 90, and those of the other algorithms are identically set to 200. Meanwhile, for easier comparison, the iterations of DRL-MBFO are converted to 200. The population size N is set to 100, and the capacity of the non-dominated archive is 1000, so as to accommodate all historical non-dominated solutions.


Table 10.2 Decision variables of each solution

Solution   Decision variables (open switches)
S0         8–21   13–14   10–11   9–15    3–4
S1         3–4    11–12   8–21    30–31   24–25
S2         4–5    11–12   8–21    30–31   24–25
S3         2–3    13–14   8–21    18–33   25–29
S4         2–3    9–15    10–11   29–30   28–29
S5         7–8    9–15    10–11   17–18   28–29

Table 10.3 Comparison of HV, SP, MD and NPS metrics

Algorithm   HV       SP       MD      NPS
DRL-MBFO    1.0760   0.0174   2.285   212
MBFO        1.0749   0.0181   2.287   184
MOPSO       1.0724   0.0183   2.282   221
NSGA-II     1.0495   0.0299   2.258   77

In order to compare the performance of each algorithm more clearly, the hypervolume (HV), the spacing (SP) index, the mean distance (MD) and the number of Pareto solutions (NPS) are used as comparison metrics [28, 29]. HV assesses both the diversity and the convergence of the obtained Pareto front; SP evaluates the distribution of the Pareto front in the objective space; MD represents the mean Euclidean distance between the non-dominated solutions and a reference point; and NPS assesses the searching efficiency of the algorithm. We investigate the results of 30 independent runs to further compare the above algorithms. First, the run with the best HV of each algorithm is chosen for comparison, and Table 10.3 shows the results. DRL-MBFO has the largest HV, which indicates that its convergence is the best, and its small SP implies that the non-dominated solutions obtained by DRL-MBFO are distributed more uniformly. Meanwhile, the MD and NPS values of DRL-MBFO are very close to the best values. To compare the HV index further, Fig. 10.8 shows the evolution of the HV index over the iterations. It is observed that DRL-MBFO obtains a superb HV value within 104 iterations. Moreover, compared with MBFO, DRL-MBFO needs only 99 iterations to achieve a larger HV value; analogously, compared with MOPSO and NSGA-II, the numbers of iterations are 36 and 6, respectively. This indicates the superb convergence rate of DRL-MBFO. The reason why DRL-MBFO obtains a superb HV value within so few iterations is the assistance of DQN: the bacteria know how to reach better states and simply move there. In order to reflect the efficiency of DRL-MBFO, we collect statistics on the iterations at which the non-dominated solutions are obtained, with the result shown in Fig. 10.9.


Fig. 10.8 Evolution of HV for the five algorithms with increasing iterations

Fig. 10.9 Distribution of non-dominated solutions

The number of non-dominated solutions obtained by DRL-MBFO is concentrated in the first 40 iterations. In particular, 43 non-dominated solutions are obtained in the first ten iterations, 6 times more than MBFO and 2 times more than MOPSO and NSGA-II, which indicates the attractive searching performance of DRL-MBFO. Then, we study the HV values of each algorithm over the 30 independent runs; the distribution is shown in Fig. 10.10. The results show that DRL-MBFO has the largest mean, 1.0741, and the smallest variance, 1.85×10⁻⁶, which indicates that DRL-MBFO not only has the best convergence but also excellent stability. Meanwhile, NSGA-II has the smallest mean, 1.0361, and the largest variance, 6.18×10⁻⁵, while the means and variances of MBFO and MOPSO lie between those of DRL-MBFO and NSGA-II.


Fig. 10.10 Distribution of HV value

This signifies that DRL-MBFO is superior to MBFO, MOPSO and NSGA-II in convergence and stability for many-objective optimization. In conclusion, based on the above analysis, DRL-MBFO has a notable advantage over MBFO, MOPSO and NSGA-II in convergence rate, search performance and stability in many-objective optimization.

10.5 Conclusion

This paper proposes a many-objective distribution network reconfiguration (MDNR) model with stochastic photovoltaic power. In this model, the objectives involve the photovoltaic power curtailment, voltage deviation, power loss, static voltage stability and generation cost. Then, a deep reinforcement learning-assisted multi-objective bacterial foraging optimization algorithm (DRL-MBFO) is proposed to solve the MDNR model.


Case studies are conducted on a modified IEEE-33 system. The simulation results demonstrate the necessity of building a many-objective optimization model. Meanwhile, the attractive performance of DRL-MBFO in terms of convergence rate, search performance and stability in many-objective optimization is confirmed.

References 1. A. Whiteman, S. Rueda, D. Akande, N. Elhassan, G. Escamilla, I. Arkhipova,Renewable Capacity Statistics 2020. International Renewable Energy Agency (2019) 2. D. Motyka, M. Kajanova, P. Bracinik, The impact of embedded generation on distribution grid operation. International Conference on Renewable Energy Research and Applications, pp. 360–364 (2018) 3. S. Choi, Practical coordination between day-ahead and real-time optimization for economic and stable operation of distribution systems. IEEE Trans. Power Syst. 33(4), 4475–4487 (2018) 4. Y. Xu, Z.Y. Dong, R. Zhang, D.J. Hill, Multi-timescale coordinated voltage/var control of high renewable-penetrated distribution systems. IEEE Trans. Power Syst. 32(6), 4398–4408 (2017) 5. D. Jin, H. Chiang, P. Li, Two-timescale multi-objective coordinated volt/var optimization for active distribution networks. IEEE Trans. Power Syst. 34(6), 4418–4428 (2019) 6. B. Bletterie, S. Kadam, R. Bolgaryn, A. Zegers, Voltage control with PV inverters in low voltage networks?in depth analysis of different concepts and parameterization criteria. IEEE Trans. Power Syst. 32(1), 177–185 (2017) 7. S. Chalise, H.R. Atia, B. Poudel, R. Tonkoski, Impact of active power curtailment of wind turbines connected to residential feeders for overvoltage prevention. IEEE Trans. Sustain. Energy 7(2), 471–479 (2016) 8. S. Martín-Martínez, E. Gómez-Lazaro, A. Molina-Garcia, A.H onrubia-Escribano, Impact of wind power curtailments on the Spanish power system operation, in IEEE PES General Meeting/Conference Exposition, pp. 1–5 (2014) 9. J. Li, Z. Xu, J. Zhao, C. Zhang, Distributed online voltage control in active distribution networks considering PV curtailment. IEEE Trans. Ind. Inf. 15(10), 5519–5530 (2019) 10. Y. Wang, D. Zhang, Q. Ji, X. Shi, Regional renewable energy development in china: a multidimensional assessment. Renew. Sustain. Energy Rev. 124, 109797 (2020) 11. Y. Takenobu, N. Yasuda, S. Kawano, S. Minato, Y. Hayashi, Evaluation of annual energy loss reduction based on reconfiguration scheduling. IEEE Trans. Smart Grid 9(3), 1986–1996 (2018) 12. A. Asrari, S. Lotfifard, M. Ansari, Reconfiguration of smart distribution systems with time varying loads using parallel computing. IEEE Trans. Smart Grid 7(6), 2713–2723 (2016) 13. Q. Peng, Y. Tang, S.H. Low, Feeder reconfiguration in distribution networks based on convex relaxation of OPF. IEEE Trans. Power Syst. 30(4), 1793–1804 (2015) 14. F. Capitanescu, L.F. Ochoa, H. Margossian, N.D. Hatziargyriou, Assessing the potential of network reconfiguration to improve distributed generation hosting capacity in active distribution systems. IEEE Trans. Power Syst. 30(1), 346–356 (2015) 15. K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-ii. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002) 16. J. Cheng, G.G. Yen, G. Zhang, A many-objective evolutionary algorithm with enhanced mating and environmental selections. IEEE Trans. Evol. Comput. 19(4), 592–605 (2015) 17. C.A. Coello Coello, M.S. Lechuga, Mopso: a proposal for multiple objective particle swarm optimization, in Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No.02TH8600), vol. 2, pp. 1051–1056 (2002) 18. B. Niu, H. Wang, J. Wang, L. Tan, Multi-objective bacterial foraging optimization. Neurocomputing 116, 336–345 (2013)


19. M. Alhazmi, P. Dehghanian, S. Wang, B. Shinde, Power grid optimal topology control considering correlations of system uncertainties. IEEE Trans. Ind. Appl. 55(6), 5594–5604 (2019) 20. A. Asrari, T. Wu, S. Lotfifard, The impacts of distributed energy sources on distribution network reconfiguration. IEEE Trans. Energy Convers. 31(2), 606–613 (2016) 21. Y. Li, G. Hao, Y. Liu, Y. Yaowen, Z. Ni, Y. Zhao, Many-objective distribution network reconfiguration via deep reinforcement learning assisted optimization algorithm. IEEE Trans. Power Deliv. 37(3), 2230–2244 (2022) 22. Y.Z. Li, M.S. Li, Q.H. Wu, Energy saving dispatch with complex constraints: prohibited zones, valve point effect and carbon tax. Int. J. Electr. Power Energy Syst. 63, 657–666 (2014) 23. Y. Li, Deep reinforcement learning: an overview. arXiv preprint arXiv:1701.07274 (2017) 24. H. Lee, P. Chung, Y. Wu, T. Lin, T. Wen, Interactive spoken content retrieval by deep reinforcement learning. IEEE/ACM Trans. Audio Speech Lang. Process. 26(12), 2447–2459 (2018) 25. X. Cheng, J. Lu, B. Yuan, J. Zhou, Identity-preserving face hallucination via deep reinforcement learning. IEEE Trans. Circ. Syst. Video Technol. (2019) 26. S. Chen, M. Wang, W. Song, Y. Yang, Y. Li, M. Fu, Stabilization approaches for reinforcement learning-based end-to-end autonomous driving. IEEE Trans. Vehic. Technol. 69(5), 4740–4750 (2020) 27. C. Yeh, C. Hsieh, H. Lin, Automatic bridge bidding using deep reinforcement learning. IEEE Trans. Games 10(4), 365–377 (2018) 28. K. Shang, H. Ishibuchi, L. He, L.M. Pang, A survey on the hypervolume indicator in evolutionary multi-objective optimization. IEEE Trans. Evol. Comput. (2020) 29. M. de Athayde Costa e Silva, C. Eduardo Klein, V. Cocco Mariani, L. dos Santos Coelho, Multiobjective scatter search approach with new combination scheme applied to solve environmental/economic dispatch problem. Energy 53, 14–21 (2013)

Chapter 11

Federated Multi-agent Deep Reinforcement Learning for Multi-microgrid Energy Management

11.1 Introduction

In recent years, renewable energy (RE), such as wind power and photovoltaic, has been widely deployed. Unlike traditional energy, RE resources are usually distributed. Therefore, microgrids (MGs) have received much attention as a means to utilize RE. An MG is usually located in a small area and provides the required electricity for an entity such as a school, a hospital or a community [1]. Normally, the main target of an MG is to achieve energy self-sufficiency via the utilization of RE. However, due to its limited capacity, the MG faces the risk of power shortage. Specifically, since the user demand and RE depend on user behavior and weather conditions, the power demand may exceed the capacity of the MG while the RE generation is insufficient [2]. For this reason, numerous adjacent MGs are interconnected to form a multi-microgrid (MMG) system [3]. Compared with an isolated MG, the MMG system is more capable of utilizing RE because of its larger capacity. Besides, energy is allowed to be traded among different MGs, i.e., each MG can actively sell its surplus power when its generation exceeds its demand, or purchase power from other MGs when its generation is insufficient [4]. Therefore, the MMG is more promising for achieving energy self-sufficiency than an isolated MG. However, because of the complexity of MMG energy management, it is essential to adopt an effective scheme.

The present studies of MMG energy management can be mainly categorized into two types, i.e., centralized and decentralized schemes. The former is based on a centralized energy management center, which has access to the corresponding energy information of the MMG system [5]; the center can then make decisions to achieve the energy self-sufficiency of the MMG system. However, the multiple MGs usually belong to different entities, and it is difficult for the centralized management center to acquire the operation data of all MGs due to the increasing awareness of privacy protection. Therefore, a more popular research direction is the decentralized MMG management scheme. For instance, Ng et al. have proposed the multi-agent approach to


achieve the decentralized control of the MMG system [6]. In addition, Yang et al. have adopted multiple self-decision agents to realize the energy self-sufficiency of the participating MGs [7]. Liu et al. have treated the MMG system as a fully distributed optimization model, which is solved by a robust optimal scheduling algorithm [8]. The aforementioned literature focuses on building accurate optimization models, which can be summarized as the model-based approach. However, such an approach has an essential drawback: it only provides a predetermined scheduling solution and is not suited for real-time decisions. In other words, when emergencies occur in the MMG system, such as a rapid decrease of the load demand, the predetermined schedule may not work. To tackle this problem, the learning-based approach has been developed in recent years, thanks to the wide application of artificial intelligence technology. One of the representative approaches, multi-agent deep reinforcement learning (MADRL), is widely deployed in the field of decentralized MMG energy management [9]. For instance, [10] studies various deep reinforcement learning algorithms for the energy management of an MG, and experiments demonstrate the convergence of these algorithms and emphasize the outperformance of the actor-critic algorithm. Reference [11] proposes an energy management approach based on a multi-agent model-free reinforcement learning algorithm; this distributed and hierarchical decision mechanism effectively increases the energy self-sufficiency of the MMG. However, since MADRL requires massive data to train the MG agents, the concern of user privacy is raised: the data of users can be utilized to analyze their habits and even their life tracks. In the case of MMG energy management, to train an effective agent with high generalization, massive energy operation data should be collected from different MGs. Such a mechanism has two defects. On the one hand, some MGs may not be willing to submit their operation data because of privacy awareness [12]. On the other hand, the security of the data during transmission cannot be guaranteed. To tackle the above issues, we introduce an emerging distributed learning approach, namely federated learning (FL), for the training of MADRL in MMG energy management [13]. In other words, we apply FL to protect user privacy and guarantee data security while ensuring the generalization of each MG agent in the MMG system. Specifically, each MG is controlled by an agent, which deploys a recent deep reinforcement learning model, namely proximal policy optimization (PPO) [14]. Each agent first executes self-training according to the local energy operation data of its MG. Then, the agents upload their learned model parameters to a server. After that, these parameters are aggregated by the server to construct a global model, which is broadcast to each MG and replaces the locally learned model. In this way, user privacy and data security can be guaranteed because only model parameters are transmitted. Moreover, agents share their experiences through the FL approach, which thus enhances their generalization compared with purely local training.¹

¹ Since the MGs in the MMG belong to different kinds of entities, which have distinct habits of using electricity, the local MG operation data manifest the preference of the local user. Therefore, the generalization of the agent would decrease.


In this chapter, we develop an MMG system model for the deployment of FL, where each MG contains conventional generators, batteries, renewable energy generators, load and the energy management center. Besides, we propose a federated multi-agent deep reinforcement learning algorithm (F-MADRL) for the energy management of the MMG system. Case studies demonstrate that our proposed F-MADRL algorithm effectively manages the operation of the MMG system under different demand and renewable energy scenarios. Moreover, we verify that F-MADRL outperforms other state-of-the-art DRL algorithms under the distributed MMG model. The remainder of this chapter is organized as follows. Section 11.2 introduces the theoretical basis of the reinforcement learning. In Sect. 11.3, a decentralized MMG model is built. Section 11.4 proposes the F-MADRL algorithm and provides its overall structure and technical details. In Sect. 11.5, comprehensive case studies are conducted. Finally, Sect. 11.6 concludes this chapter.

11.2 Theoretical Basis of Reinforcement Learning

Normally, the Markov decision process (MDP) is defined by a tuple ⟨S, A, P, R⟩, where S is the finite state space containing all valid states and A represents the finite set of actions. P = {p(s_{t+1} | s_t, a_t)} stands for the set of transition probabilities from state s_t to s_{t+1}, and R = r(s_t, a_t), with R: S × A → ℝ, is the reward function, which is normally the metric used to evaluate an action. To solve the MDP, a policy π should be developed to provide the probability of executing action a when observing state s, i.e., π(a|s) = P[A_t = a | S_t = s]. The aim of π is to maximize the discounted cumulative reward over the finite horizon T, termed the return function:

$$U(s_0, s_1, \ldots, s_T) = \sum_{t=0}^{T} \gamma^{t} R(s_t) \tag{11.1}$$

where γ ∈ [0, 1] is the discount factor, representing the importance of the present reward. Then, two kinds of value functions are defined based on U to help the policy make decisions. The first is the state value function V_π(s) and the other is the action value function Q_π(a, s), which are formulated as follows:

$$V_\pi(s) = \mathbb{E}_\pi\left[U_t \mid S_t = s\right] = \sum_{a} \pi(a|s) \sum_{s'} P_{ss'}^{a} \left[ r(s, a) + \gamma V_\pi(s') \right] \tag{11.2}$$

$$Q_\pi(a, s) = \mathbb{E}_\pi\left[U_t \mid S_t = s, A_t = a\right] = \sum_{s'} \sum_{a'} P_{ss'}^{a} \left[ R(s, s'|a) + \gamma Q_\pi(a', s') \right] \tag{11.3}$$


where V_π(s) stands for the expected future reward at state s, and Q_π(a, s) represents the expected future reward when selecting action a at state s. s' and a' stand for a possible successor state and the action taken there, and P^a_{ss'} is the transition probability from s to s' under a. In fact, V_π(s) and Q_π(a, s) are used to evaluate the quality of the state s and of the action-state pair (a, s), respectively. They are updated according to the above two equations and help the policy π decide whether to reach a state or execute an action.
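As a small illustration of the return in (11.1) and the one-step evaluation behind (11.2)–(11.3), a sketch is given below; the reward sequence and value estimates are hypothetical.

```python
def discounted_return(rewards, gamma=0.95):
    """Return U of Eq. (11.1): sum_t gamma^t * R(s_t)."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def one_step_value(r, gamma, v_next):
    """One-step evaluation used inside (11.2)-(11.3): r(s, a) + gamma * V(s')."""
    return r + gamma * v_next

# Hypothetical 24-step reward trace of an MG agent
rewards = [-0.5, -0.2, -0.1] + [0.0] * 21
print(discounted_return(rewards))
print(one_step_value(-0.2, 0.95, v_next=-1.0))
```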

11.3 Decentralized Multi-microgrid Energy Management Model

The decentralized MMG system includes numerous MGs that are connected to a distribution power network. Usually, an energy management center is set up in each MG, which acts as an agent that conducts self-training and controls the dispatchable elements, such as conventional generators (CGs) and batteries (BAs). In this section, to describe the energy management model of the MMG system more clearly, we first introduce the isolated MG model in the MDP format before developing the MMG model.

11.3.1 Isolated Microgrid Energy Management Model

Figure 11.1 illustrates the structure of an isolated MG, which is composed of five types of elements: renewable power generators (RPGs), BAs, CGs, conventional load

Fig. 11.1 Structure of an isolated MG, reprinted from [15], copyright2023, with permission from IEEE.


(CL) and the energy management center. Note that the BAs and CGs are dispatchable, since their outputs are controlled by the management center. On the contrary, because of the high uncertainty of RE, the outputs of the RPGs cannot be controlled. Additionally, the energy management center is termed the agent, which controls the dispatchable elements by observing the state of MG operation. The details of each element are given as follows.

1. Conventional generator: As seen in Fig. 11.1, the CGs include the diesel engine generator and the microturbine, which generate power from fossil fuels. The cost function of a CG can be represented as follows:

$$C(P_{CG,i}) = a_{CG,i} P_{CG,i}^2 + b_{CG,i} P_{CG,i} + c_{CG,i} \tag{11.4}$$

$$P_{CG,i}^{\min} \le P_{CG,i} \le P_{CG,i}^{\max} \tag{11.5}$$

where C(P_{CG,i}) represents the generation cost of the ith CG and P_{CG,i} is its generation power. a_{CG,i}, b_{CG,i} and c_{CG,i} denote the cost coefficients of CG_i, and P^{min}_{CG,i} and P^{max}_{CG,i} are the lower and upper bounds of CG_i.

2. Renewable power generator: Figure 11.1 presents two kinds of RPGs, namely the wind turbine and the photovoltaic unit. The generation of an RPG normally depends on the natural environment, such as wind speed, temperature, weather and solar irradiance. Since the RPGs do not consume any fossil fuels, their generation costs are not considered in this chapter.

3. Battery: As one of the most commonly used energy storage devices, the BA can store energy generated by the CGs and RPGs and release it when needed. Thus, the BA has two operation states, namely charging and discharging, which are represented by the transition of its state of charge and can be formulated by

$$SOC_{t+1} = (1 - \delta)SOC_t - \frac{P_{BA}^{t}\,\eta_{ch}}{E_{BA}} \tag{11.6}$$

$$SOC_{t+1} = (1 - \delta)SOC_t - \frac{P_{BA}^{t}}{\eta_{dch}\,E_{BA}} \tag{11.7}$$

where SOC_t and SOC_{t+1} denote the charging states of the BA at time t and t + 1, and P_BA is the charging–discharging power of the BA. Here, we assume P_BA > 0 when the BA is charging and P_BA < 0 when it is discharging. η_ch and η_dch are the charging and discharging efficiencies, δ denotes the discharging rate, which is set as 0.2%, and E_BA represents the capacity of the BA. Similar to the CG, the operation of the BA brings about a generation cost, which is formulated by the following equations:

$$C(P_{BA,j}) = a_{BA,j}\left(P_{BA,j} + 3P_{BA,j}^{\max}(1 - SOC)\right)^2 + b_{BA,j}\left(P_{BA,j} + 3P_{BA,j}^{\max}(1 - SOC)\right) + c_{BA,j} \tag{11.8}$$

$$P_{BA,j}^{\min} < P_{BA,j} < P_{BA,j}^{\max} \tag{11.9}$$

where C(P_{BA,j}) represents the cost of the jth BA, a_{BA,j}, b_{BA,j} and c_{BA,j} are the cost coefficients of the jth BA, and P^{max}_{BA,j} and P^{min}_{BA,j} are the upper and lower bounds of the BA output power.

4. Network power loss of the MG: Practically, power losses exist during the transmission of energy in the MG. The power loss usually corresponds to the active generation power [16], which can be formulated as

$$\lambda_{CG} = \frac{\partial P_{loss}}{\partial P_{CG}}, \quad \lambda_{RPG} = \frac{\partial P_{loss}}{\partial P_{RPG}}, \quad \lambda_{BA} = \frac{\partial P_{loss}}{\partial P_{BA}} \tag{11.10}$$

where λ_CG, λ_RPG and λ_BA represent the power loss coefficients of the CG, RPG and BA, respectively. In this chapter, λ_CG = λ_RPG = λ_BA is set as 0.02. Then, the power loss P_loss can be given by the following equation [16]:

$$P_{loss} = \sum_{i=1}^{n_{CG}} \lambda_{CG} P_{CG,i} + \sum_{j=1}^{n_{RPG}} \lambda_{RPG} P_{RPG,j} + \sum_{k=1}^{n_{BA}} \lambda_{BA} P_{BA,k} \tag{11.11}$$

where n CG , n RPG and n BA are the numbers of CGs, RPGs and BAs in the isolated MG, respectively.
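A minimal sketch of the battery state-of-charge transition (11.6)–(11.7) and the linear network loss (11.11) is given below; the parameter values are illustrative only.

```python
def soc_transition(soc, p_ba, e_ba, eta_ch=0.95, eta_dch=0.95, delta=0.002):
    """State-of-charge update following Eqs. (11.6)-(11.7).
    p_ba > 0 means charging, p_ba < 0 means discharging (sign convention of the chapter)."""
    if p_ba >= 0:                                        # charging, Eq. (11.6)
        return (1 - delta) * soc - p_ba * eta_ch / e_ba
    return (1 - delta) * soc - p_ba / (eta_dch * e_ba)   # discharging, Eq. (11.7)

def network_loss(p_cg, p_rpg, p_ba, lam=0.02):
    """Linear power loss of Eq. (11.11) with identical loss coefficients."""
    return lam * (sum(p_cg) + sum(p_rpg) + sum(p_ba))

# Hypothetical one-hour snapshot of an MG
print(soc_transition(soc=0.5, p_ba=0.2, e_ba=2.0))
print(network_loss(p_cg=[0.8, 0.5], p_rpg=[0.6], p_ba=[0.2]))
```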

11.3.2 Isolated MG Energy Management Model via MDP

Since the energy management center of the MG is an agent trained by a DRL algorithm, the above isolated MG model should be reformulated as an MDP model. The state, action and reward are defined as follows.

1. State: In this chapter, we consider the 24-hour scheduling of the MG, and each hour is denoted by t ∈ {1, 2, ..., 24}. The state of the MG at time t includes the energy operation information, defined as follows:

$$s_t = \{P_L^{t-1}, P_{RPG,1}^{t-1}, \ldots, P_{RPG,n_{RPG}}^{t-1}, SOC^{t-1}, E_\lambda^{t-1}\} \tag{11.12}$$

where s_t indicates the state of the MG at time t; P_L^{t-1} and P_{RPG,i}^{t-1} stand for the load demand and the output of the ith RPG at time t − 1. In addition, E_λ^{t-1} is the electricity price in the transaction between the MG and the distribution power network.

2. Action: The action is generated by the agent and stands for the power outputs of the CGs and BAs, defined as follows:

$$a_t = \{P_{CG,1}^{t}, \ldots, P_{CG,n_{CG}}^{t}, P_{BA,1}^{t}, \ldots, P_{BA,n_{BA}}^{t}\} \tag{11.13}$$


3. Reward: We expect the MG to achieve the balance between its power demand and generation, i.e., energy self-sufficiency. In other words, the deviation between the load demand and the real power generation should be minimized. Therefore, the reward function is defined as follows:

$$r_t = -\zeta \times \mathrm{abs}(P_{de}^{t}) \tag{11.14}$$

where r_t is the reward value at time t and ζ is a manually defined shrinkage coefficient. abs(·) stands for the absolute value function. P_de^t evaluates the deviation between the load demand and the real generation, which is formulated by

$$P_{de}^{t} = P_L^{t} - \left( \sum_{i=1}^{n_{CG}} P_{CG,i}^{t} + \sum_{j=1}^{n_{RPG}} P_{RPG,j}^{t} + \sum_{k=1}^{n_{BA}} P_{BA,k}^{t} - P_{loss}^{t} \right) \tag{11.15}$$

By incorporating Eqs. (11.11) and (11.15), the term P_de^t can be rewritten as follows:

$$P_{de}^{t} = P_L^{t} - \left[ \sum_{i=1}^{n_{CG}} (1 - \lambda_{CG}) P_{CG,i}^{t} + \sum_{j=1}^{n_{RPG}} (1 - \lambda_{RPG}) P_{RPG,j}^{t} + \sum_{k=1}^{n_{BA}} (1 - \lambda_{BA}) P_{BA,k}^{t} \right] \tag{11.16}$$

Since the outputs of the CG, RPG and BA are bounded, we have

$$P_L^{t} - P_{MG}^{\max} \le P_{de}^{t} \le P_L^{t} - P_{MG}^{\min} \tag{11.17}$$

where $P_{MG}^{\max} = \sum_{i=1}^{n_{CG}} P_{CG,i}^{\max} + \sum_{j=1}^{n_{RPG}} P_{RPG,j}^{\max} + \sum_{k=1}^{n_{BA}} P_{BA,k}^{\max}$ and $P_{MG}^{\min} = \sum_{i=1}^{n_{CG}} P_{CG,i}^{\min} + \sum_{j=1}^{n_{RPG}} P_{RPG,j}^{\min} + \sum_{k=1}^{n_{BA}} P_{BA,k}^{\min}$ are the maximum and minimum outputs of the MG, and $P_{RPG,j}^{\max}$ and $P_{RPG,j}^{\min}$ are the upper and lower bounds of the jth RPG output power. In addition, the power demand cannot be infinite, being naturally bounded by $P_L^{\min}$ and $P_L^{\max}$, namely $P_L^{\min} \le P_L^{t} \le P_L^{\max}$. Therefore, the upper and lower bounds of the reward function are given by

$$-\max\left(\left|P_L^{\max} - P_{MG}^{\min}\right|, \left|P_L^{\min} - P_{MG}^{\max}\right|\right) \le r_t \le 0 \tag{11.18}$$

where max(a, b) is the maximum function, returning the bigger value of a and b.
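A minimal sketch of the deviation (11.16) and the reward (11.14) for one MG at one hour is shown below; the shrinkage coefficient and the power values are hypothetical.

```python
def deviation(p_load, p_cg, p_rpg, p_ba, lam=0.02):
    """P_de of Eq. (11.16): demand minus loss-adjusted generation."""
    supplied = (sum((1 - lam) * p for p in p_cg)
                + sum((1 - lam) * p for p in p_rpg)
                + sum((1 - lam) * p for p in p_ba))
    return p_load - supplied

def reward(p_de, zeta=0.1):
    """Reward of Eq. (11.14): the closer generation matches demand, the higher the reward."""
    return -zeta * abs(p_de)

# Hypothetical hour: 2.0 MW demand, outputs of the CGs, RPGs and the battery
p_de = deviation(2.0, p_cg=[0.8, 0.5], p_rpg=[0.6], p_ba=[0.2])
print(p_de, reward(p_de))
```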

11.3.3 Decentralized Multi-microgrid Energy Management Model

As shown in Fig. 11.2, a decentralized MMG model that contains n_p MGs is considered in this chapter. These MGs are connected to the distribution power network,


Fig. 11.2 Structure of MMG system, reprinted from [15], copyright2023, with permission from IEEE

and energy transactions between MGs are also allowed. Each MG is controlled by an agent, which observes the state s_t of the MG and provides the action a_t. Since each MG is encouraged to maximize its reward r_t to achieve energy self-sufficiency, the target of the MMG should be the maximization of the systematic reward r_{sys,t}, which is represented by the sum of the rewards obtained by all the MG agents. The r_{sys,t} is given by

$$r_{sys,t} = \sum_{i=1}^{n_p} r_t^{i} = \sum_{i=1}^{n_p} -\zeta_i \times \mathrm{abs}(P_{i,de}^{t}) \tag{11.19}$$

where r_t^i represents the reward obtained by the ith MG agent at time t, and ζ_i and P_{i,de}^t are the shrinkage coefficient and the deviation of MG i. Besides, since the load demand of an MG cannot be known in advance, excessive or insufficient power generation of an isolated MG is unavoidable, and thus energy transactions in the MMG system are inevitable. Therefore, an energy transaction mechanism between different MGs is developed, as given below. An MG is allowed to conduct energy transactions with the distribution power network and with other MGs, as shown in Fig. 11.2. If the generated power of MG i exceeds its load demand at time t, the excess energy is sold to other MGs at a price E_i(t). If the demand of MG i cannot be satisfied, the MG purchases electricity from the MG j that has the lowest price among all participating MGs:

$$j = \arg\min_{l} E_l(t) \times L_l, \quad l \in [1, 2, \ldots, n_p] \tag{11.20}$$

where L_l indicates whether the generation of MG l exceeds its demand: L_l is set to 1 if MG l has surplus generation and to infinity otherwise. The MGs preferentially purchase the surplus power generated by other MGs. When the surplus generation of the MGs is fully consumed, the distribution power network provides power at a price E_dpn(t), which is usually higher than E_l(t), l ∈ [1, 2, ..., n_p].
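The seller-selection rule (11.20) might be sketched as follows; the price list and surplus flags are hypothetical.

```python
import math

def cheapest_seller(prices, has_surplus):
    """Eq. (11.20): pick the MG with the lowest price among those with surplus power.
    Returns the index j, or None if no MG has surplus (buy from the distribution network)."""
    costs = [p if surplus else math.inf          # L_l = 1 if surplus, else infinity
             for p, surplus in zip(prices, has_surplus)]
    j = min(range(len(costs)), key=costs.__getitem__)
    return j if math.isfinite(costs[j]) else None

# Hypothetical 3-MG system: MG0 has no surplus, MG1 and MG2 do
print(cheapest_seller(prices=[0.10, 0.12, 0.09], has_surplus=[False, True, True]))  # -> 2
```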


Fig. 11.3 Process of PPO algorithm, reprinted from [15], copyright2023, with permission from IEEE

11.4 Federated Multi-agent Deep Reinforcement Learning Algorithm

As discussed above, the MMG is decentralized, and thus each MG agent has high autonomy. However, the decentralized structure threatens the generalization performance of the agent, because the diversity of the data in an isolated MG is limited, which may trap the agent in a local optimum. To tackle this issue, we propose a federated multi-agent deep reinforcement learning (F-MADRL) algorithm, in which FL is used to improve the generalization of the agents during training while ensuring data privacy. In this part, we first introduce the proximal policy optimization (PPO) algorithm, a well-known deep reinforcement learning algorithm that is used for the self-training of the MG agents. Then, FL is developed and the proposed F-MADRL algorithm is introduced.

11.4.1 Proximal Policy Optimization

In this chapter, each MG agent performs self-training with the PPO algorithm to obtain the optimal policy π. PPO defines two types of deep neural networks, namely the actor and the critic. The actor π^θ is parameterized by θ and aims to produce the action. The critic is denoted as V^μ and is parameterized by μ. The overall training process of PPO during an episode is illustrated in Fig. 11.3. First of all, the experience tuples T are sampled, T = {s_0, a_0, r_0, s_1, s_1, a_1,


r_1, s_2, ..., s_U, a_U, r_U, s_{U+1}}, where U indicates the length of T. Then, PPO calculates the loss function of the actor in the kth episode, which is defined as follows:

$$L_C = \mathbb{E}_{s,a\sim T}\left[ \min\!\left( \frac{\pi^{\theta}_{k}(a|s)}{\pi^{\theta}_{k-1}(a|s)} A^{\pi_k}_{s,a},\; \mathrm{clip}\!\left(\frac{\pi^{\theta}_{k}(a|s)}{\pi^{\theta}_{k-1}(a|s)}, 1-\epsilon, 1+\epsilon\right) A^{\pi_k}_{s,a} \right) \right] \tag{11.21}$$

where E_{s,a∼T}[·] represents the empirical average over the sampled experience tuples T. π_{k−1} and π_k stand for the previous and the new policy, respectively. clip(t, t_min, t_max) is the clip function, which returns t_max if t > t_max and t_min if t < t_min. ε is the clip parameter. A^{π_k}_{s,a} stands for the advantage estimator, which is calculated via the following equation:

$$A^{\pi_k}_{s,a} = \delta_0^{V} + (\gamma\lambda)\delta_1^{V} + (\gamma\lambda)^2 \delta_2^{V} + \cdots + (\gamma\lambda)^{U-t+1}\delta_{U-1}^{V} \tag{11.22}$$

where γ ∈ [0, 1] and λ ∈ [0, 1] represent the discount factor and a manually defined hyperparameter, respectively. δ_k^V is calculated by

$$\delta_k^{V} = r_k + \gamma V_k^{\mu}(s_{t+1}) - V_k^{\mu}(s_t) \tag{11.23}$$

where V_k^{\mu}(s_{t+1}) and V_k^{\mu}(s_t) are given by the critic, which is trained with the loss function L_V:

$$L_V = \mathbb{E}_{s,a\sim T}\left[ \left( \gamma V_k^{\mu}(s_{t+1}) + r(s_t, a_t) - V_k^{\mu}(s_t) \right)^2 \right] \tag{11.24}$$

With the above equations, the parameters θ and μ of the actor and the critic can be updated by the following equations:

$$\theta_{k+1} = \theta_k + \eta_\pi \nabla_{\theta_k} L_C \tag{11.25}$$

$$\mu_{k+1} = \mu_k + \eta_V \nabla_{\mu_k} L_V \tag{11.26}$$

where η_π and η_V are the learning rates of the actor and the critic, respectively.
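A numerical sketch of the advantage estimate (11.22)–(11.23) and the clipped surrogate objective (11.21) is shown below; the trajectory, probability ratios and hyperparameter values are hypothetical.

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Advantage estimates built from the TD errors of Eq. (11.23),
    accumulated with (gamma*lambda) weights as in Eq. (11.22)."""
    deltas = rewards + gamma * values[1:] - values[:-1]
    adv = np.zeros_like(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        adv[t] = running
    return adv

def clipped_surrogate(ratio, adv, eps=0.2):
    """Clipped objective of Eq. (11.21); to be maximized (or its negative minimized)."""
    return np.mean(np.minimum(ratio * adv, np.clip(ratio, 1 - eps, 1 + eps) * adv))

# Hypothetical 4-step rollout: rewards, critic values for 5 states, new/old policy ratios
rewards = np.array([-0.3, -0.1, 0.0, -0.2])
values = np.array([-1.0, -0.8, -0.5, -0.4, -0.3])
adv = gae_advantages(rewards, values)
print(clipped_surrogate(ratio=np.array([1.1, 0.9, 1.3, 0.7]), adv=adv))
```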

11.4.2 Federated Learning

Federated learning is a distributed learning mechanism that protects individual privacy while ensuring the training performance [17]. There are two roles in FL: one is called the participant and the other is termed the collaborator. Participant j, j ∈ [1, n_p], is denoted as a deep learning model f^j_{w_j}. It conducts self-training locally and uploads its parameters w_j to the collaborator periodically, where n_p is the number of participants, which are processed in parallel. Constrained by data privacy, the participant f^j_{w_j} only trains on the local dataset, which may cause insufficient training since the capacity and diversity of the data are limited.

11.4 Federated Multi-agent Deep Reinforcement Learning Algorithm

241

ited. The FL could tackle this problem through the following steps. First, at the j training epoch e, e ∈ [1, Ne ], the model of jth participant is defined as f we , which j conducts self-training to obtain the parameters w ej . Ne is the total number of training epochs. Then, each participant to the collaborator and   uploads its parameters constructs a parameter list we = w1e , w2e , ..., wne p . The collaborator calculates the weight average of we to estimate a global model f Ge+1 with parameters w e+1 G . After e+1 aggregation, the collaborator broadcasts w G to all the participants and replaces their e+1 own parameters, i.e., w e+1 = w2e+1 ... = wne+1 . The aggregating mechanism G = w1 p of FL is formulated by the following equations: = w eG − η∇ F j (w ej ), ∀ j w e+1 j

(11.27)

 1 w e+1 j n j=1 p

(11.28)

np

w e+1 G =

where η and F j (·) are the learning rate and local loss function of the jth participant, respectively.
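The aggregation mechanism of Eqs. (11.27)–(11.28) can be sketched in a few lines of PyTorch: each participant copies the global parameters, takes one local gradient step on its own data, and the collaborator averages the uploaded parameters. This is a generic FedAvg-style illustration under assumed models and data (the linear model, the random batches and the function names are invented for the example), not the exact procedure used later in this chapter.

```python
# Federated averaging sketch for Eqs. (11.27)-(11.28); models, data and names are illustrative.
import copy
import torch
import torch.nn as nn

n_p, eta = 3, 0.01                                   # number of participants, local learning rate
global_model = nn.Linear(8, 1)                       # global model f_G with parameters w_G

def local_update(global_model, data, target):
    """Participant j: copy w_G, take one local gradient step (Eq. 11.27), return w_j^{e+1}."""
    model = copy.deepcopy(global_model)
    loss = nn.functional.mse_loss(model(data), target)   # local loss F_j(w)
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= eta * p.grad                             # w_j <- w_G - eta * grad F_j(w_G)
    return model.state_dict()

def aggregate(param_list):
    """Collaborator: element-wise average of the uploaded parameters (Eq. 11.28)."""
    avg = copy.deepcopy(param_list[0])
    for key in avg:
        avg[key] = torch.stack([p[key] for p in param_list]).mean(dim=0)
    return avg

for epoch in range(5):                                # global training epochs
    uploads = [local_update(global_model, torch.randn(16, 8), torch.randn(16, 1))
               for _ in range(n_p)]                   # each participant trains on its own data
    global_model.load_state_dict(aggregate(uploads))  # broadcast w_G^{e+1} back to participants
```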

11.4.3 Federated Multi-agent Deep Reinforcement Learning Algorithm

In this chapter, the federated learning mechanism is introduced for multi-agent training in the MMG presented earlier in this chapter, and the F-MADRL algorithm is proposed. In F-MADRL, the participants are the agents of the individual MGs, and the collaborator is a server that takes responsibility for aggregating and broadcasting the parameters. F-MADRL aims to solve the following distributed optimization model:

$$
\min_{w^e_G} F(w_G) = \sum_{j=1}^{n_p} p_j F_j(w^e_j)
\tag{11.29}
$$

where $F(\cdot)$ is the global loss function and $p_j$ represents the weight of each MG in the global model, with $p_j > 0$ and $\sum_{j=1}^{n_p} p_j = 1$. Note that $F(\cdot)$ cannot be directly computed without information sharing among the participants. F-MADRL includes two parts: one is executed on the server and the other is executed on the MG agents. The procedures of the server part and the MG agent part are provided in Algorithms 1 and 2, respectively. At the beginning of a training epoch of F-MADRL, the server builds a global agent with parameters $w^0_G$, which are broadcast to each MG agent for self-training. Since the agents update their parameters in parallel, the server aggregates the parameter list $w^e = [w^e_1, w^e_2, \ldots, w^e_{n_p}]$ by Eq. (11.28).


Algorithm 1 The federated multi-agent deep reinforcement learning algorithm on the server, reprinted from Ref. [15], copyright 2023, with permission from IEEE
1: Execute on the server:
2: Initialize the model parameters $w^0_G$ and broadcast them to the MG agents.
3: for global epoch $e = 1$ to $N_e$ do
4:   for MG agent $j = 1$ to $n_p$ in parallel do
5:     Update the MG parameters $w^e_j$ at the local agent.
6:     Store $w^e_j$.
7:     Upload $w^e_j$ to the server.
8:   end for
9:   Receive the parameters from each MG agent and construct
10:    $w^e \leftarrow [w^e_1, w^e_2, \ldots, w^e_{n_p}]$
11:  Aggregate the parameters to form the global model:
12:    $w^{e+1}_G = \sum_{j=1}^{n_p} \frac{1}{n_p} w^{e+1}_j$
13:  Broadcast $w^{e+1}_G$ to the MG agents.
14: end for

Furthermore, the aggregated parameters $w^{e+1}_G$ are used to update the global model parameters and are broadcast to the MG agents for the training of epoch $e+1$. On the other hand, Algorithm 2 describes the self-training procedure of the MG agents, which cooperates with the server. When the MG agents receive the parameters $w^e_G$ from the global model at epoch $e$, their parameters are replaced by $w^e_G$, i.e., $w^e_j = w^e_G$. Then, each MG agent executes $N_i$ individual self-training epochs in parallel, where PPO is applied to conduct the self-training. Afterward, the parameters of the MG agent at the last self-training epoch, namely $\theta_{N_i}$ and $\mu_{N_i}$, are stored and uploaded to the server.

The overall structure of F-MADRL is illustrated in Fig. 11.4. At epoch $e$, the agents in the three MGs are first replaced by the global agent of the $(e-1)$th epoch. Then, the three MG agents conduct self-training to obtain parameters, which are uploaded to the server for aggregation. Next, the global agent is built on the server, and its parameters are broadcast to the MG agents for the $(e+1)$th epoch.

Furthermore, the theoretical convergence analysis of the proposed F-MADRL is provided below. To evaluate the convergence of F-MADRL, we make the following assumptions on the functions $F_k$, $k \in [1, n_p]$, by referring to [18].

Assumption 1 $F_k$ is $L$-smooth: $\forall w, w'$, $F_k(w) \le F_k(w') + \nabla F_k(w')^T (w - w') + \frac{L}{2}\|w - w'\|_2^2$.

Assumption 2 $F_k$ is $\mu$-strongly convex: $\forall w, w'$, $F_k(w) \ge F_k(w') + \nabla F_k(w')^T (w - w') + \frac{\mu}{2}\|w - w'\|_2^2$.

Based on the above assumptions, we have the following lemmas.

Lemma 1 $F$ is $\mu$-strongly convex and $L$-smooth.


Algorithm 2 The federated multi-agent deep reinforcement learning algorithm on each MG agent, reprinted from Ref. [15], copyright 2023, with permission from IEEE
1: Execute on each MG agent:
2: Run in parallel on agent $j$, $j \in [1, n_p]$, at global epoch $e$
3: Receive the parameters from the server: $w^e_j \leftarrow w^e_G$
4: for individual training epoch $i = 1$ to $N_i$ do
5:   Collect the experience tuple $T = \{s_0, a_0, r_0, s_1, a_1, r_1, s_2, \ldots, s_U, a_U, r_U, s_{U+1}\}$
6:   Compute the temporal-difference error:
7:     $\delta_k^V \leftarrow r_k + \gamma V^{\mu}_k(s_{t+1}) - V^{\mu}_k(s_t)$
8:   Estimate the advantage:
9:     $A^{\pi_{\theta_k}}_{s,a} \leftarrow \delta_0^V + (\gamma\lambda)\delta_1^V + (\gamma\lambda)^2\delta_2^V + \cdots + (\gamma\lambda)^{U-t+1}\delta_{U-1}^V$
10:  Calculate the loss function of the actor:
11:    $L^C \leftarrow \mathbb{E}_{s,a\sim T}\left[\min\left(\frac{\pi_{\theta_i}(a|s)}{\pi_{\theta_{i-1}}(a|s)} A^{\pi_{\theta_i}}_{s,a},\ \mathrm{clip}\left(\frac{\pi_{\theta_i}(a|s)}{\pi_{\theta_{i-1}}(a|s)}, 1-\epsilon, 1+\epsilon\right) A^{\pi_{\theta_i}}_{s,a}\right)\right]$
12:  Update the actor parameters:
13:    $\theta_{i+1} \leftarrow \theta_i + \eta_\pi \nabla_{\theta_i} L^C$
14:  Calculate the loss function of the critic:
15:    $L^V \leftarrow \mathbb{E}_{s,a\sim T}\left[\left(\gamma V^{\mu}_i(s_{t+1}) + r(s_t, a_t) - V^{\mu}_i(s_t)\right)^2\right]$
16:  Update the critic parameters:
17:    $\mu_{i+1} \leftarrow \mu_i + \eta_V \nabla_{\mu_i} L^V$
18: end for
19: Store the network parameters: $w^e_j \leftarrow \{\theta_{N_i}, \mu_{N_i}\}$
20: Upload the parameters $w^e_j$ to the server.
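Putting Algorithms 1 and 2 together, the overall F-MADRL loop alternates between parallel local self-training at the MG agents and federated aggregation at the server. The Python sketch below shows only this control flow; the local PPO routine and the agent constructor are hypothetical placeholders (`ppo_self_train`, `make_agent`) standing in for the PPO updates of Algorithm 2, and the network sizes and the value of $N_i$ are illustrative assumptions rather than the book's settings.

```python
# Control-flow sketch of F-MADRL (Algorithms 1 and 2). The agent constructor and the
# local PPO routine are hypothetical placeholders, not the book's implementation.
import copy
import torch
import torch.nn as nn

N_e, N_i, n_p = 900, 10, 3            # global epochs, local PPO epochs, MG agents (illustrative)

def make_agent():
    """Actor-critic parameter container {theta, mu}; layer sizes are illustrative."""
    return {"actor": nn.Linear(6, 3), "critic": nn.Linear(6, 1)}

def ppo_self_train(agent, n_epochs):
    """Placeholder for Algorithm 2, steps 4-18: local PPO updates on the MG's own data.
    A real implementation would collect trajectories from the MG environment and apply
    the clipped-surrogate and critic updates of Eqs. (11.21)-(11.26)."""
    return agent

def average(agent_list):
    """Server aggregation, Eq. (11.28): element-wise average of the uploaded parameters."""
    avg = copy.deepcopy(agent_list[0])
    for net in avg:
        sd = avg[net].state_dict()
        for key in sd:
            sd[key] = torch.stack([a[net].state_dict()[key] for a in agent_list]).mean(0)
        avg[net].load_state_dict(sd)
    return avg

global_agent = make_agent()                                   # server builds the global agent
for e in range(N_e):
    uploads = []
    for j in range(n_p):                                      # in practice the agents run in parallel
        local_agent = copy.deepcopy(global_agent)             # w_j^e <- w_G^e (broadcast)
        local_agent = ppo_self_train(local_agent, N_i)        # N_i local self-training epochs
        uploads.append(local_agent)                           # upload {theta_Ni, mu_Ni} to the server
    global_agent = average(uploads)                           # w_G^{e+1} by Eq. (11.28)
```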

Fig. 11.4 Proposed federated multi-agent deep reinforcement learning algorithm, reprinted from [15], copyright 2023, with permission from IEEE


Proof It follows straightforwardly from Assumption 1 and Assumption 2 and the definition of convexity: $F$ is a finite sum of the $F_k$, thus it is $\mu$-strongly convex and $L$-smooth as well.

Lemma 2 For all $w, w' \in \mathbb{R}^n$, let $w_t = w' + t(w - w')$ for $t \in [0, 1]$. Then,

$$
F(w) - F(w') = \int_0^1 \nabla F(w_t)^T (w - w')\, dt
\tag{11.30}
$$

and

$$
F(w) - F(w') - \nabla F(w')^T (w - w') = \int_0^1 \left(\nabla F(w_t) - \nabla F(w')\right)^T (w - w')\, dt
\tag{11.31}
$$

Proof Equation (11.30) follows from the fundamental theorem of calculus. Equation (11.31) follows from Eq. (11.30) by subtracting $\nabla F(w')^T (w - w')$ from both sides.

Lemma 3 If $F$ is smooth and $\mu$-strongly convex for $\mu > 0$, then for $w^* = \arg\min_w F(w)$,

$$
\frac{1}{2\mu}\,\|\nabla F(w)\|_2^2 \;\ge\; F(w) - F(w^*) \;\ge\; \frac{\mu}{2}\,\|w - w^*\|_2^2
\tag{11.32}
$$

Proof By the $\mu$-strong convexity of $F$ (Lemma 1), we have

$$
F(w) \ge F(w^*) + \nabla F(w^*)^T (w - w^*) + \frac{\mu}{2}\,\|w^* - w\|_2^2
\tag{11.33}
$$

Using the fact that $\nabla F(w^*) = 0$, we obtain

$$
F(w) - F(w^*) \ge \frac{\mu}{2}\,\|w^* - w\|_2^2
\tag{11.34}
$$

which is the right-hand inequality of (11.32). For the left-hand inequality, minimizing both sides of the strong-convexity inequality over $y$ gives

$$
F(w^*) = \min_{y} F(y) \ge \min_{y}\left\{ F(w) + \nabla F(w)^T (y - w) + \frac{\mu}{2}\,\|w - y\|_2^2 \right\}
\tag{11.35}
$$

Since $y = w - \frac{1}{\mu}\nabla F(w)$ minimizes the right-hand side of the above inequality, we obtain

$$
\min_{y}\left\{ F(w) + \nabla F(w)^T (y - w) + \frac{\mu}{2}\,\|w - y\|_2^2 \right\} = F(w) - \frac{1}{2\mu}\,\|\nabla F(w)\|_2^2
\tag{11.36}
$$


Combining (11.35) and (11.36), we obtain

$$
\frac{1}{2\mu}\,\|\nabla F(w)\|_2^2 \ge F(w) - F(w^*)
\tag{11.37}
$$

Theorem 1 Suppose $F$ is $L$-smooth and $\mu$-strongly convex, and let $w^* = \arg\min_w F(w)$. Then, at the $k$th iteration,

$$
F(w_k) - F(w^*) \le \left(1 - \frac{\mu}{L}\right)^k \left(F(w_0) - F(w^*)\right)
\tag{11.38}
$$

Consequently, it requires $\frac{L}{\mu}\log\!\left(\frac{F(w_0) - F(w^*)}{\epsilon}\right)$ iterations to find an $\epsilon$-optimal solution.

Proof Applying Lemma 2, we have

$$
\left|F(w) - F(w') - \nabla F(w')^T (w - w')\right|
\le \left|\int_0^1 \left(\nabla F(w_t) - \nabla F(w')\right)^T (w - w')\, dt\right|
\le \int_0^1 \left\|\nabla F(w_t) - \nabla F(w')\right\| \left\|w - w'\right\| dt
\tag{11.39}
$$

Based on Assumption 1, $\|\nabla F(w_t) - \nabla F(w')\| \le L\,\|w_t - w'\|$, and note that $w_t - w' = t(w - w')$. Then, we have

$$
\left|F(w) - F(w') - \nabla F(w')^T (w - w')\right|
\le \int_0^1 L\,t\,\|w - w'\|^2\, dt
= \frac{L}{2}\,\|w - w'\|_2^2
\tag{11.40}
$$

Note that the gradient descent update is

$$
w_{k+1} = w_k - \eta \nabla F(w_k)
\tag{11.41}
$$

where $\eta > 0$ represents the learning rate. Then, applying (11.40) with $w = w_{k+1}$ and $w' = w_k$ yields

$$
F(w_{k+1}) - \left(F(w_k) - \eta\,\|\nabla F(w_k)\|_2^2\right) \le \frac{\eta^2 L}{2}\,\|\nabla F(w_k)\|_2^2
\tag{11.42}
$$

If we pick $\eta = \frac{1}{L}$, then $F(w_{k+1}) \le F(w_k) - \frac{1}{2L}\,\|\nabla F(w_k)\|_2^2$. From Lemma 3, $\|\nabla F(w)\|_2^2 \ge 2\mu\left(F(w) - F(w^*)\right)$. Putting these together,

$$
\begin{aligned}
F(w_{k+1}) - F(w^*) &\le F(w_k) - F(w^*) - \frac{1}{2L}\,\|\nabla F(w_k)\|_2^2 \\
&\le F(w_k) - F(w^*) - \frac{\mu}{L}\left(F(w_k) - F(w^*)\right) \\
&= \left(1 - \frac{\mu}{L}\right)\left(F(w_k) - F(w^*)\right)
\end{aligned}
\tag{11.43}
$$


Repeatedly applying this bound yields

$$
F(w_k) - F(w^*) \le \left(1 - \frac{\mu}{L}\right)^k \left(F(w_0) - F(w^*)\right)
\tag{11.44}
$$

Using the fact that $1 + x \le e^x$, the stated iteration complexity follows by picking $k \ge \frac{L}{\mu}\log\!\left(\frac{F(w_0) - F(w^*)}{\epsilon}\right)$.
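As a rough numerical illustration of the bound in Theorem 1 (the values below are hypothetical and chosen only to indicate the scale of the result), suppose the condition number is $L/\mu = 10$, the initial optimality gap is $F(w_0) - F(w^*) = 100$ and the target accuracy is $\epsilon = 0.1$. Then

$$
k \;\ge\; \frac{L}{\mu}\,\log\!\left(\frac{F(w_0) - F(w^*)}{\epsilon}\right) = 10\,\ln(1000) \approx 69,
$$

so roughly seventy iterations suffice to reach an $\epsilon$-optimal solution; the required number of iterations grows only logarithmically in the accuracy but linearly in the condition number $L/\mu$.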

11.5 Case Study

In this section, we conduct case studies to demonstrate the effectiveness of the proposed F-MADRL algorithm. First, the parameter settings of the MMG system and F-MADRL are introduced. Then, the F-MADRL algorithm is deployed and its performance is analyzed. Moreover, we compare F-MADRL with the original PPO and other well-known DRL algorithms, such as advantage actor-critic (A2C) [19] and trust region policy optimization (TRPO) [20], to demonstrate its efficiency.

11.5.1 Experiment Setup

Without loss of generality, three MGs belonging to different entities are set in the MMG system for the case study. Each MG includes an RPG, a CG and a BA. The detailed parameters of each MG are presented in Table 11.1. Besides, the load demands, the electricity selling price and the RE generation of the MMG system are provided in Fig. 11.5a–c, respectively. It can be learnt from Table 11.1 and Fig. 11.5a that during 10:00–13:00 and 18:00–22:00, the power demand of MG1 surpasses its maximum capacity, which means MG1 operates in the energy self-insufficient state during these periods. On the contrary, MG2 and MG3 operate in the energy self-sufficient state.

Table 11.1 Parameter settings of each MG, reprinted from [15], copyright 2023, with permission from IEEE

            a ($/h)   b ($/kWh)   c ($/kWh)   Pmin (kW)   Pmax (kW)
  MG1  CG   0.0081      5.72         63           0          200
       BA   0.0153      5.54         26         -50           50
  MG2  CG   0.0076      5.68        365           0          280
       BA   0.0163      5.64         32         -50           50
  MG3  CG   0.0095      5.81        108           0          200
       BA   0.0173      5.74         38         -50           50


Fig. 11.5 Load demand, electricity selling price and RE generation of the three MGs

As for F-MADRL, the number of training epochs is set to 900. Moreover, γ and λ are set to 0.99 and 0.95, respectively. A well-known neural network optimizer, Adam [14], is used to update F-MADRL, and the learning rates of the actor ηπ and the critic ηV are set to 0.0001 and 0.001, respectively. Simulation studies are conducted using Python 3.6.8 with PyTorch 1.7.1.
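For concreteness, the reported hyperparameters can be gathered into a single configuration object, as in the sketch below. Only the numeric values (900 training epochs, γ = 0.99, λ = 0.95, Adam with actor and critic learning rates of 0.0001 and 0.001) come from the setup above; the dictionary layout, the placeholder networks and their sizes are assumptions made for illustration.

```python
# Illustrative training configuration mirroring the reported setup (Python 3.6.8, PyTorch 1.7.1).
# The dictionary layout and network sizes are assumptions; the numeric values follow the text.
import torch
import torch.nn as nn

config = {
    "n_mg_agents": 3,        # three MGs, each with an RPG, a CG and a BA
    "global_epochs": 900,    # F-MADRL training epochs
    "gamma": 0.99,           # discount factor
    "lam": 0.95,             # GAE parameter lambda
    "lr_actor": 1e-4,        # eta_pi
    "lr_critic": 1e-3,       # eta_V
}

actor = nn.Linear(6, 3)      # placeholder actor network (sizes are illustrative)
critic = nn.Linear(6, 1)     # placeholder critic network
opt_actor = torch.optim.Adam(actor.parameters(), lr=config["lr_actor"])
opt_critic = torch.optim.Adam(critic.parameters(), lr=config["lr_critic"])
```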

11.5.2 Analysis of the F-MADRL Algorithm

In this section, the proposed F-MADRL is applied to the MMG system and its performance evaluation is reported. Figure 11.6 presents the reward curve, the actor loss and the critic loss of the three MG agents in subplots (a), (b) and (c), respectively. It can be seen from Fig. 11.6a that the rewards of the three MG agents increase during training. The initial rewards are –138, –140 and –136 for the three MG agents, and they gradually rise as the training iterations increase. Besides, as shown in the annotations highlighted in yellow in Fig. 11.6a, the rewards of the MG agents are almost at the same level at the end of training, namely –103.1, –105.2 and –103.2, respectively. This phenomenon benefits from the introduction of the FL mechanism: the experiences of the three MG agents are shared during training, which increases the generalization of each agent. This point will be further demonstrated by comparing F-MADRL with other RL algorithms in the next section. Additionally, to illustrate the convergence of the F-MADRL algorithm, the critic and actor losses during training are shown in Fig. 11.6b, c. From Fig. 11.6b, we can see that the critic networks are well trained, since the losses of the three MG agents gradually decrease from 150, 72 and 90 to 20, 0.12 and 0.1, respectively. Moreover, Fig. 11.6c illustrates that each MG achieves an optimal policy, because the actor loss of the three MGs is close to 0 after training.

Then, the policies obtained by F-MADRL are applied to determine the scheduling of each MG. Specifically, Figs. 11.7, 11.8 and 11.9 show the scheduling of MG1, MG2 and MG3, respectively. Each figure contains two graphs.


Fig. 11.6 Reward (a), actor loss (b) and critic loss (c) curves of each MG agent during training

The upper graph shows the scheduling solution, and the lower graph shows the unbalanced demand, namely the difference between the generation and the load demand of the MG. A positive unbalanced demand indicates that the generation of the MG surpasses its load demand, which means the demand is satisfied, while a negative one means the demand is unsatisfied. As shown in Fig. 11.7, since the demand of MG1 is higher than its capacity, MG1 operates in the energy self-insufficient state. Therefore, the demands of MG1 are unsatisfied throughout the 24 h. Moreover, the scheduling of the BA obtained by the agent mainly considers its charging and discharging power balance.


Fig. 11.7 Scheduling of MG1 obtained by the F-MADRL algorithm, reprinted from [15], copyright 2023, with permission from IEEE

Fig. 11.8 Scheduling of MG2 obtained by the F-MADRL algorithm, reprinted from [15], copyright 2023, with permission from IEEE

In the first hour, the power demand of MG1 is 180 kW, which is lower than that of the other time periods; thus, the agent chooses to charge the battery and limits its output to a small range. Since MG1 operates in the energy self-insufficient state, it requires external power from the other MGs and the distribution power system, which is obtained through the transaction mechanism in the MMG system. Even though the unbalanced demands of MG1 are as high as –182 kW at 12:00 and –198 kW at 20:00, these power shortages can be supplied by the other MGs and the distribution power network. This is why the MG1 agent does not fully operate its CG and BA all the time: the agent learns that the energy transaction is more cost-effective than generating power by itself.


Fig. 11.9 Scheduling of MG3 obtained by the F-MADRL algorithm, reprinted from [15], copyright 2023, with permission from IEEE

The scheduling policies of the MG2 and MG3 agents are similar to each other but different from that of MG1. As illustrated in Figs. 11.8 and 11.9, since MG2 and MG3 work in the energy self-sufficient state, the CG generation obtained by the two agents can follow their demands. It should be noted that although the demands of MG2 and MG3 in the first hour are only 71 kW and 72 kW, their CG outputs are as high as 110 kW and 102 kW, which is due to the additional charging of their BAs, i.e., the BAs in MG2 and MG3 fully store energy in the first hour and make use of it when needed. However, the power demand in the following 23 h does not exceed the capacity of MG2 and MG3, so their batteries are not discharged. Moreover, in the 11th and 16th hours, the generation and demand of MG2 are nearly equal, causing only –0.2 kW of unbalanced demand in these two hours. Besides, the absolute values of the unbalanced demands during MG3 operation are not lower than 25 kW. The unbalanced demand can be eliminated through the transaction mechanism in the MMG: the excess power is sold to other MGs that work in the self-insufficient state, and the power shortage can be supplied by power from other MGs or the distribution power network. These phenomena demonstrate that the agents trained by the F-MADRL algorithm can produce an efficient scheduling whether the MG operates in the energy self-sufficient or self-insufficient state.

11.5.3 Performance Comparison

In this section, we demonstrate the effectiveness of introducing the FL mechanism into the MADRL algorithm by comparing the performance of F-MADRL and PPO-MADRL.


Fig. 11.10 Test rewards of the MG agents trained by the four algorithms in the (a) energy self-sufficient state and (b) energy self-insufficient state

PPO-MADRL indicates that PPO is used for the self-training of the MG agents but without the FL mechanism. Besides, we also compare the performance of F-MADRL with other well-known deep reinforcement learning algorithms: A2C and TRPO are used in place of PPO to implement the self-training of the MG agents, and the two comparison algorithms are termed A2C-MADRL and TRPO-MADRL, respectively. Since the training of these MG agents is based only on local operation data, the agents may easily become trapped in local optima. Therefore, to compare the generalization of the MG agents trained by each algorithm, we use the test reward value of the MG agents as the comparison metric, obtained by testing the agents in both the energy self-sufficient and self-insufficient states. This setup is intended to verify the generalization of F-MADRL. Figure 11.10 compares the test reward values obtained by the MG agents under the four training algorithms, where the size of each slice stands for the performance of the corresponding MG agent. Since the test reward values are negative, a smaller slice means better performance.

Figure 11.10a, b shows the test rewards of each MG agent in the energy self-sufficient and self-insufficient states, respectively. In the two plots, the slices that represent the best test rewards of the three MG agents are outlined in red. As shown in Fig. 11.10a, the test rewards obtained by F-MADRL are –4.5, –4.2 and –4.7 for the three MG agents, which surpass those of the comparison algorithms. This means the MG agents trained by F-MADRL perform better in the energy self-sufficient state. Besides, as shown in Fig. 11.10b, the three MG agents trained by F-MADRL are also the best in the energy self-insufficient state, with test rewards of –47.0, –46.9 and –47.0, respectively. Since the performance of F-MADRL is better than that of the comparative algorithms in both the energy self-sufficient and self-insufficient states, it can be concluded that F-MADRL has better generalization performance.


The main difference between F-MADRL and the other three algorithms is the introduction of the FL mechanism, which leads to the differences in their performance. The reason for this phenomenon is analyzed as follows. The MG agents trained by PPO-MADRL, A2C-MADRL and TRPO-MADRL rely only on the local operation data of each MG due to privacy limitations, which means the data sources are limited and the diversity of the training data is low. Consequently, the decision-making ability of these MG agents is weakened. The introduction of the FL mechanism alleviates this drawback. Using FL, the experiences of the MG agents can be shared without threatening user privacy or data security. In this way, the generalization of each agent is improved, as verified in the above experiments. Therefore, the comparisons conducted in this section demonstrate the effectiveness of introducing the FL mechanism into the MADRL algorithm and also reveal the better generalization of F-MADRL.

11.6 Conclusion

This chapter proposes a federated multi-agent deep reinforcement learning algorithm for multi-microgrid energy management. A decentralized MMG model is built first, which includes multiple isolated MGs, and an agent is used to control the dispatchable elements of each MG to pursue its energy self-sufficiency. For privacy protection and data security, F-MADRL is implemented to train the agents, with PPO adopted for the self-training of each agent. Then, the FL mechanism is introduced to build a global agent that aggregates the parameters of all local agents on the server and replaces the local MG agents with the global one. Therefore, the experiences of each agent can be shared without threatening privacy and data security. The case studies are conducted on an MMG with three isolated MGs. The convergence and the performance of F-MADRL are illustrated first. Then, compared with PPO-MADRL, A2C-MADRL and TRPO-MADRL, F-MADRL exhibits better generalization and achieves higher test rewards. The comparison indicates the validity of introducing the FL mechanism into MADRL and also demonstrates the effectiveness of the proposed F-MADRL.

References

1. F. Luo, Y. Chen, X. Zhao, Y.G. Liang, Zheng, J. Qiu, Multiagent-based cooperative control framework for microgrids' energy imbalance. IEEE Trans. Indus. Inf. 13(3), 1046–1056 (2017)
2. W. Liu, G. Wei, J. Wang, Y. Wenwu, X. Xi, Game theoretic non-cooperative distributed coordination control for multi-microgrids. IEEE Trans. Smart Grid 9(6), 6986–6997 (2018)
3. Y. Li, T. Zhao, P. Wang, H.B. Gooi, L. Wu, Y. Liu, J. Ye, Optimal operation of multimicrogrids via cooperative energy and reserve scheduling. IEEE Trans. Indus. Inf. 14(8), 3459–3468 (2018)


4. H. Farzin, R. Ghorani, M. Fotuhi-Firuzabad, M. Moeini-Aghtaie, A market mechanism to quantify emergency energy transactions value in a multi-microgrid system. IEEE Trans. Sustain. Energy 10(1), 426–437 (2019)
5. S.A. Arefifar, M. Ordonez, Y. Abdel-Rady, I. Mohamed, Energy management in multimicrogrid systems development and assessment. IEEE Trans. Power Syst. 32(2), 910–922 (2017)
6. E.J. Ng, R.A. El-Shatshat, Multi-microgrid control systems (MMCS), in IEEE PES General Meeting, pp. 1–6 (2010)
7. X. Yang, H. He, Y. Zhang, Y. Chen, G. Weng, Interactive energy management for enhancing power balances in multi-microgrids. IEEE Trans. Smart Grid 10(6), 6055–6069 (2019)
8. Y. Liu, Y. Li, H.B. Gooi, Y. Jian, H. Xin, X. Jiang, J. Pan, Distributed robust energy management of a multimicrogrid system in the real-time energy market. IEEE Trans. Sustain. Energy 10(1), 396–406 (2019)
9. Y. Du, F. Li, Intelligent multi-microgrid energy management based on deep neural network and model-free reinforcement learning. IEEE Trans. Smart Grid 11(2), 1066–1076 (2020)
10. T.A. Nakabi, P. Toivanen, Deep reinforcement learning for energy management in a microgrid with flexible demand. Sustain. Energy Grids Netw. 25, 100413 (2021)
11. E. Samadi, A. Badri, R. Ebrahimpour, Decentralized multi-agent based energy management of microgrid using reinforcement learning. Int. J. Electric. Power Energy Syst. 122, 106211 (2020)
12. X. Fang, Q. Zhao, J. Wang, Y. Han, Y. Li, Multi-agent deep reinforcement learning for distributed energy management and strategy optimization of microgrid market. Sustain. Cities Soc. 74, 103163 (2021)
13. S. Lee, D.-H. Choi, Federated reinforcement learning for energy management of multiple smart homes with distributed energy resources. IEEE Trans. Indus. Inf. 18(1), 488–497 (2022)
14. J. Schulman, F. Wolski, P. Dhariwal, A. Radford, O. Klimov, Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
15. Y. Li, S. He, Y. Li, Y. Shi, Z. Zeng, Federated multiagent deep reinforcement learning approach via physics-informed reward for multimicrogrid energy management. IEEE Trans. Neural Netw. Learn. Syst., 1–13 (2023)
16. R. Mudumbai, S. Dasgupta, B.B. Cho, Distributed control for optimal economic dispatch of a network of heterogeneous power generators. IEEE Trans. Power Syst. 27(4), 1750–1760 (2012)
17. Y. Shuai, X. Chen, Z. Zhou, X. Gong, W. Di, When deep reinforcement learning meets federated learning: intelligent multitimescale resource management for multiaccess edge computing in 5G ultradense network. IEEE Internet Things J. 8(4), 2238–2251 (2021)
18. S. Wang, T. Tuor, T. Salonidis, K.K. Leung, C. Makaya, T. He, K. Chan, Adaptive federated learning in resource constrained edge computing systems. IEEE J. Sel. Areas Commun. 37(6), 1205–1221 (2019)
19. V. Mnih, A. Puigdomènech Badia, M. Mirza, A. Graves, T.P. Lillicrap, T. Harley, D. Silver, K. Kavukcuoglu, Asynchronous methods for deep reinforcement learning, in Proceedings of the 33rd International Conference on Machine Learning, vol. 48, pp. 1928–1937, New York, PMLR (2016)
20. J. Schulman, S. Levine, P. Moritz, M.I. Jordan, P. Abbeel, Trust region policy optimization, in Proceedings of the 32nd International Conference on Machine Learning, vol. 37, pp. 1889–1897, Lille, PMLR (2015)

Chapter 12

Prospects of Future Research Issues

We have mentioned that the difficulty of smart grid forecast and dispatch mainly lies in the strong uncertainty, the curse of dimensionality and the trouble of establishing an accurate model. Fortunately, as a model-free approach, RL is able to deal with stochastic renewable energy and user power demand by interacting with the environment in the absence of prior knowledge. In addition, the curse of dimensionality can be well handled with the help of DNNs. Therefore, DRL shows great potential in addressing the relevant issues of smart grid forecast and dispatch and has achieved successful applications in practice. However, current AI-enabled computational methods such as DRL still have limitations, mainly due to their dependence on handcrafted reward functions. In reality, it is not easy to design a reward function that encourages the desired behaviors while still being learnable. Even worse, even the most reasonable reward function cannot avoid local optimality, which belongs to the typical exploration–exploitation dilemma and has puzzled AI-enabled computational methods for a long time. To this end, a relatively comprehensive discussion of the challenges of current AI-enabled computational approaches, potential solutions and future directions is provided in this chapter.

12.1 Smart Grid Forecast Issues

As discussed in Chap. 1, forecasting techniques are essential to the operation of the smart grid. Therefore, this section provides a relatively comprehensive summary of the current challenges and possible future research directions for deep learning-based forecasting techniques related to the smart grid.



12.1.1 Challenges

12.1.1.1 Weather Dependencies

The weather has a tremendous impact on forecasting tasks in the smart grid field. Not only does the temperature affect the operation of power equipment; humidity, wind speed, wind direction and solar irradiation all influence the output of renewable energy generators. However, as weather conditions are difficult to predict, biases can arise in numerical weather reports, on which the forecasting methodologies in the smart grid depend. Furthermore, weather conditions may vary across the geographically distributed locations of large power infrastructure, which also affects forecasting techniques.

12.1.1.2 Influence of User Behavior

One of the most important functions of the smart grid is satisfying user needs such as power demand and energy trading. As a consequence, forecasting load, net load and electricity prices means, in the final analysis, determining the patterns of user behavior that drive these quantities. Unfortunately, user behavior patterns vary and may differ among users. People's personal price decisions and needs may change abruptly, making them difficult for forecasting algorithms to capture precisely. As a result, numerical forecasting cannot achieve perfect accuracy, as there is always a discrepancy between the real and forecasted values.

12.1.1.3 Disturbance of the Power System

Unexpected events such as fluctuations of renewable energy power, damage to transmission lines, natural disasters, blackouts, temporary policy adjustments of the power market and abrupt demand increases bring disturbances to the power system. Such events are detrimental to forecasting systems, since the model tends to predict outputs that follow regular patterns, resulting in lower forecasting accuracy. Furthermore, these unexpected events degrade the quality of the dataset, which is then more likely to produce an inaccurate forecasting model.

12.1.2 Future Research Directions

12.1.2.1 Advanced Neural Network Model

In the last few decades, neural network paradigms have been evolving. In contrast to the multi-layer perceptron, the recurrent neural network architecture takes time dependencies into account.


Although these models are commonly used in a variety of smart grid-related forecasting tasks, their performance is still far from reliable. Using more advanced neural network models could be an efficient way to resolve this dilemma. For instance, a graph neural network can automatically take the smart grid topology into account, increasing both the interpretability of the model and the forecast accuracy. Furthermore, transformer and diffusion models have shown significant success in image and natural language processing. In addition, the spiking neural network has received a lot of attention as a next-generation network model, which is expected to boost the intelligence of neural network inference.

12.1.2.2 Distributed Learning Technique

The evolution of distribution networks and of the topology of the modern smart grid is inevitable. Especially with the widespread adoption of distributed renewable energy generators and the deployment of microgrids, forecasting techniques must adapt from the centralized mode to the distributed mode. For instance, considering the emerging privacy concerns over distributed entity data, a distributed learning mechanism is required for forecasting. Federated learning is one of the most representative methods: to protect data security, only the neural network model parameters that are separately trained on each distributed sector are shared, which allows the experience of each model to be shared. In addition to privacy, autonomy and achieving globally optimal operation are important research directions.

12.1.2.3 Event-Driven Forecasting

As discussed above, unexpected events such as disasters can strongly influence forecasting accuracy. Therefore, if the forecasting method is capable of detecting the occurrence of these events, the forecasting model can not only recognize the regular patterns in the data but also determine whether these events would influence the accuracy. This technique is termed event-driven forecasting. Beyond the value forecasting methods widely deployed in the smart grid, distinguishing unexpected events and rectifying the forecast values in time would benefit real-time energy management, energy trading and other related tasks.

12.2 Smart Grid Dispatch Issues

12.2.1 Challenges

12.2.1.1 Sample Efficiency

Despite the success of AI-enabled computational methods, they usually need at least thousands of samples to gradually learn useful policies, even for a simple task.


However, real-world or real-time interactions between the agent and the environment are usually costly, and even simulation platforms require considerable time and energy. This raises a critical problem for AI-enabled computational methods, i.e., how to design more efficient algorithms that learn faster with fewer samples. At present, most AI algorithms have such low learning efficiency that they require unbearable training times under current computational power. The situation is even worse for real-world interactions, where potential security concerns, the risks of failure cases and time consumption all place higher requirements on the learning efficiency of AI algorithms in practice.

12.2.1.2 Learning Stability

Unlike stable supervised learning, present AI algorithms are volatile to a certain extent, which means there are huge differences in learning performance over time and in horizontal comparisons across multiple runs. Specifically, instability over time generally manifests as large local variances or non-monotonicity in a single learning curve, while unstable learning manifests as significant performance differences between runs during training, leading to large variances in horizontal comparisons. What is more, the endogenous instability and unpredictability of DNNs aggravate the deviation of the value function approximation, which further introduces noise into the gradient estimators and destabilizes the learning performance. Moreover, AI-enabled computational methods are sensitive to hyperparameters and initialization, which also strongly affects the behavior of the learning algorithms.

12.2.1.3 Exploration

Beyond the classical exploration–exploitation trade-off in RL, exploration itself is another main challenge for AI-enabled computational methods. The difficulty of exploration in RL may stem from sparse reward functions, large action spaces and unstable environments, as well as the security issues of exploration in the real world. Firstly, sparse rewards might cause the value function and policy networks to be optimized on hyper-surfaces that are non-convex, non-smooth or even discontinuous. Secondly, a large action space also raises the difficulty of exploration for AI agents. For instance, it is extremely hard to explore an optimal policy in the well-known StarCraft II game due to the large action space and long control sequences. Thirdly, an unstable environment also makes exploration more difficult; for example, multi-player settings make the opponents part of the game environment to some extent, which weakens the exploration capacity of the agent.

12.2.1.4 Interpretability

In the field of the smart grid, the optimal policy is devoted to assisting users in making the ultimate decision. To this end, it is important for users to understand the decisions made by the agents, i.e., the interpretability of AI-enabled computational methods. In this sense, RL is regarded as having strong interpretability and logicality owing to the rigorous mathematical derivation based on values and policies. However, the introduction of DNNs impacts the explainability of DRL, for example when DL is used to approximate the policy and value functions or to estimate uncertainty.

12.2.2 Future Research Directions

12.2.2.1 Model-Based AI Methods

Different from the aforementioned model-free methods, model-based AI-enabled computational methods generally indicate that an agent not only learns a policy to choose its actions but also learns a model of the environment to assist its action planning, thereby accelerating policy learning. In fact, learning an accurate model of the environment provides additional information for better evaluating the agent's current policy, which can make the entire learning process more efficient. In principle, a good model can handle a wide range of problems, as AlphaGo has demonstrated. Therefore, it is meaningful to investigate model-based AI algorithms to improve sample efficiency in the smart grid.

12.2.2.2 Transfer Learning Combined AI Methods

As one of the machine learning methods, transfer learning focuses on storing the knowledge gained while solving one problem and applying it to a different but related one, the core of which is finding similarities between existing and new knowledge. Since it is too expensive to learn the target domain directly from scratch, transfer learning uses existing knowledge to learn new knowledge as quickly as possible. The promise of transfer learning lies in leveraging knowledge from previous tasks to speed up the learning of new ones. Consequently, combining DRL with transfer learning is a potential solution to handle the data dependency in smart grid dispatch problems.

12.2.2.3 Meta-learning-Based AI Methods

Apart from transfer learning, meta-learning is also devoted to improving the learning performance on a new task by using previous experience from the smart grid, rather than considering each task as an independent one.


More specifically, meta-learning makes full use of the obtained smart grid knowledge to guide a new learning task by adjusting the neural networks adaptively. In this way, it is viable to apply meta-learning to a previously unseen smart grid dispatch scenario and complete this new task through the self-tuning of the meta-model. In addition, the combination of meta-learning and RL gives rise to meta-RL methods, which can reduce the sensitivity to network parameters and enhance the robustness of the algorithm. On this basis, it is interesting to extend meta-RL to its deep version and apply it to future smart grid dispatch problems.

12.2.2.4 Multi-agent AI Methods

With the development of AI-enabled computational methods, multi-agent deep reinforcement learning (MADRL) has been proposed and has attracted much attention. In fact, MADRL is regarded as a promising direction worth exploring, which provides a novel way to investigate unconventional DRL situations, including swarm intelligence, environments that are non-stationary for each agent, and the innovation of the agents themselves. MADRL not only makes it possible to explore distributed intelligence in multi-agent environments, but also contributes to learning a near-optimal agent policy in the complicated large-scale smart grid. To this end, it is necessary to analyze the interactions among multiple agents and promote the application of MADRL in the smart grid dispatch domain.

12.2.2.5 Federated Learning Combined AI Methods

Concerns regarding smart grid security and privacy are among the main obstacles to smart grid forecast and dispatch. However, most previous research on applications of DRL in the smart grid belongs to centralized methods, which are vulnerable to cyberattacks and privacy leakage. To this end, federated learning is combined with DRL to meet the requirements of privacy preservation and network security. Federated learning enables multiple agents to coordinately learn a shared decision model while keeping all the training data on the device, thus preventing the risk of privacy leakage. What is more, the decentralized structure of federated learning offers a promising way to reduce the pressure of centralized data storage. Therefore, it is meaningful to investigate the combination of federated learning and other AI methods in the smart grid dispatch field.