English Pages 441 [434] Year 2022
Springer Optimization and Its Applications 181
Maude Josée Blondin João Pedro Fernandes Trovão Hicham Chaoui Panos M. Pardalos Editors
Intelligent Control and Smart Energy Management Renewable Resources and Transportation
Springer Optimization and Its Applications Volume 181
Series Editors
Panos M. Pardalos, University of Florida
My T. Thai, University of Florida

Honorary Editor
Ding-Zhu Du, University of Texas at Dallas

Advisory Editors
Roman V. Belavkin, Middlesex University
John R. Birge, University of Chicago
Sergiy Butenko, Texas A & M University
Vipin Kumar, University of Minnesota
Anna Nagurney, University of Massachusetts Amherst
Jun Pei, Hefei University of Technology
Oleg Prokopyev, University of Pittsburgh
Steffen Rebennack, Karlsruhe Institute of Technology
Mauricio Resende, Amazon
Tamás Terlaky, Lehigh University
Van Vu, Yale University
Michael N. Vrahatis, University of Patras
Guoliang Xue, Arizona State University
Yinyu Ye, Stanford University
Aims and Scope

Optimization has continued to expand in all directions at an astonishing rate. New algorithmic and theoretical techniques are continually developing and the diffusion into other disciplines is proceeding at a rapid pace, with a spotlight on machine learning, artificial intelligence, and quantum computing. Our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in areas including, but not limited to, applied mathematics, engineering, medicine, economics, computer science, operations research, and other sciences. The series Springer Optimization and Its Applications (SOIA) aims to publish state-of-the-art expository works (monographs, contributed volumes, textbooks, handbooks) that focus on theory, methods, and applications of optimization. Topics covered include, but are not limited to, nonlinear optimization, combinatorial optimization, continuous optimization, stochastic optimization, Bayesian optimization, optimal control, discrete optimization, multi-objective optimization, and more. New to the series portfolio are works at the intersection of optimization and machine learning, artificial intelligence, and quantum computing. Volumes from this series are indexed by Web of Science, zbMATH, Mathematical Reviews, and SCOPUS.
More information about this series at https://link.springer.com/bookseries/7393
Editors Maude Josée Blondin Department of Electrical and Computer Engineering Université de Sherbrooke Sherbrooke, QC, Canada Hicham Chaoui Department of Electronics Carleton University Ottawa, ON, Canada
João Pedro Fernandes Trovão Department of Electrical and Computer Engineering Université de Sherbrooke Sherbrooke, QC, Canada Panos M. Pardalos Department of Industrial & Systems Engineering University of Florida Gainesville, FL, USA
ISSN 1931-6828 ISSN 1931-6836 (electronic) Springer Optimization and Its Applications ISBN 978-3-030-84473-8 ISBN 978-3-030-84474-5 (eBook) https://doi.org/10.1007/978-3-030-84474-5 Mathematics Subject Classification: 93C05, 93C10, 93C35, 93C73, 93C95, 74G65, 90B50, 68T27, 90B06, 68T05, 68T01, 68T20, 74P99, 34H99 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
No doubt, the world is facing several crises. We want to say that changes must take place to preserve the Earth’s healthy state, but many observations force us to say that changes are needed to stop the Earth’s destruction. From environmental disasters happening at a higher frequency to the increasing rate of biodiversity loss, we are in an era that calls for doing things with an environmental sustainability mindset. In other words, a shift towards greener technologies and more ecological management of life’s commodities such as cars, trains, and electricity networks is needed. Researchers have been working diligently to develop environmental technologies and lessen the damaging effects of the commodities that make our lives easier in many ways. Even though the development is rapid, it is not yet enough to meet the “earthly needs.” For instance, according to Climate Action Tracker, many countries, including Canada, the USA, and France, are not taking sufficient action to meet the goal of the Paris Agreement, which is to keep global warming under 2 °C, and preferably below 1.5 °C. This critical environmental goal is one example of the challenges that we are facing. While the Earth is growling, new research holds great promise for solving those issues. This book contains some of these promising new solutions and provides future research directions. It is particularly about intelligent techniques in control and energy management for renewable resources and transportation, and it covers theoretical aspects along with practical applications. The major topics are distributed energy resources, hybrid batteries’ energy management, control of autonomous vehicles, wind systems, and maritime transport scheduling. Each chapter is self-contained, meaning readers can enjoy each chapter individually.
Chapter “Predictive Energy Management for Fuel Cell Hybrid Electric Vehicles” presents predictive energy management strategies for fuel cell hybrid electric vehicles, focusing on driving prediction information combined with a real-time optimization framework. The following chapter examines hierarchical energy management strategies at the fleet level for managing total cost of ownership. The third chapter introduces the readers to stochastic optimization methods, which apply to smart power systems and grids. Chapter “Challenges for a Massive Integration of Flexible Resources in LV Networks” reviews the challenges that low-voltage electricity distribution networks face in being converted into smart grids. The fifth chapter provides a comprehensive overview of electrical railway power supply systems for high-speed lines. Chapter “Energy-Efficient Scheduling of Intraterminal Container Transport” proposes bi-objective mixed-integer linear programming models in maritime transport operations to ensure economic sustainability while reaching ecological goals. The following chapter studies the power limitations that battery packs impose on electric vehicles and investigates new hybrid battery management systems to address some of the identified limitations. In Chapter “Robust, Resilient, and Energy-Efficient Satellite Formation Control”, the authors propose distributed network methods for coordinating satellites with energy and fuel efficiency objectives. Chapter “A Methodology for the Assessment of Efficiency in Systems Under Transient Conditions: Case Study for Hybrid Storage Systems in Elevators” discusses energy management systems of hybrid energy storage systems for commercial building elevators and proposes strategies to peak-shave grid power. Chapter “A Holistic Approach to the Energy-Efficient Smoothing of Traffic via Autonomous Vehicles” examines the stop-and-go wave problem in traffic and proposes an approach that uses autonomous vehicles to dissipate these waves, reduce fuel consumption, and increase safety. The next chapter presents a new methodology for a multi-objective optimal energy management strategy for hybrid energy storage systems; the method is applied to the energy management of the eCommander electric vehicle. Chapter “Transient Stability and Protection Evaluation of Distribution Systems with Distributed Energy Resources” analyses the impact of integrating distributed energy resources into power distribution systems. The last chapter proposes expertise-based intelligent controllers for electric vehicle speed control.
Sherbrooke, Québec, Canada Sherbrooke, Québec, Canada Ottawa, Ontario, Canada Gainesville, Florida, USA
Maude J. Blondin João Pedro F. Trovão Hicham Chaoui Panos M. Pardalos
Acknowledgements
We want to thank everyone who participated in this book. Without you, authors, this book would not exist. And without you, reviewers, the quality of this book would not be the same. We are very grateful for your effort and contributions. Finally, we want to thank the Springer editorial team, especially Elizabeth Loew and Saveetha Balasundaram, for help and support in publishing this contributed book. Sherbrooke, Québec, Canada Sherbrooke, Québec, Canada Ottawa, Ontario, Canada Gainesville, Florida, USA
Maude J. Blondin João Pedro F. Trovão Hicham Chaoui Panos M. Pardalos
Contents
Predictive Energy Management for Fuel Cell Hybrid Electric Vehicles
Yang Zhou, Alexandre Ravey, and Marie-Cécile Péra

Plug-in Hybrid Electric Buses with Different Battery Chemistries Total Cost of Ownership Planning and Optimization at Fleet Level Based on Battery Aging
Jon Ander López-Ibarra, Haizea Gaztañaga, Josu Olmos, Andoni Saez-de-Ibarra, and Haritza Camblong

Stochastic Optimization Methods for the Stochastic Storage Process Control
Pavel Knopov and Vladimir Norkin

Challenges for a Massive Integration of Flexible Resources in LV Networks
Pablo Arboleya, Lucía Suárez, Rubén Medina, and Alberto Méndez

Electrical Railway Power Supply Systems for High-Speed Lines: From Traditional Grids to Smart Grids
Daniel Serrano-Jimenez, Sandra Castano-Solis, Eneko Unamuno, and Jon Andoni Barrena

Energy-Efficient Scheduling of Intraterminal Container Transport
S. Mahdi Homayouni and Dalila B. M. M. Fontes

Learning-Based Control for Hybrid Battery Management Systems
Jonas Mirwald, Ricardo de Castro, Jonathan Brembeck, Johannes Ultsch, and Rui Esteves Araujo

Robust, Resilient, and Energy-Efficient Satellite Formation Control
Sean Phillips, Christopher Petersen, and Rafael Fierro

A Methodology for the Assessment of Efficiency in Systems Under Transient Conditions: Case Study for Hybrid Storage Systems in Elevators
Jorge García, Cristina González-Morán, Pablo García, and Pablo Arboleya

A Holistic Approach to the Energy-Efficient Smoothing of Traffic via Autonomous Vehicles
Amaury Hayat, Xiaoqian Gong, Jonathan Lee, Sydney Truong, Sean McQuade, Nicolas Kardous, Alexander Keimer, Yiling You, Saleh Albeaik, Eugene Vinistky, Paige Arnold, Maria Laura Delle Monache, Alexandre Bayen, Benjamin Seibold, Jonathan Sprinkle, Dan Work, Benedetto Piccoli

Optimal Energy Management of Electric Vehicles Supplied by Battery and Supercapacitors: A Multi-Objective Approach
Bảo-Huy Nguyễn and João Pedro F. Trovão

Transient Stability and Protection Evaluation of Distribution Systems with Distributed Energy Resources
Guilherme S. Morais, Mariana Resener, Bibiana M. P. Ferraz, Ana P. Zanatta, Maicon J. S. Ramos, and Younes Mohammadi

Fuzzy Logic Control for Motor Drive Performance Improvement in EV Applications
Minh C. Ta, Binh-Minh Nguyen, and Thanh Vo-Duy
Contributors
Alberto Méndez Plexigrid, Gijón, Spain Alexandre Bayen University of California at Berkeley, Berkeley, CA, USA Alexander Keimer University of California at Berkeley, Berkeley, CA, USA Alexandre Ravey FEMTO-ST (UMR CNRS 6174), FCLAB (USR CNRS 2007), Univ. Bourgogne Franche-Comté, UTBM, Belfort, France Ana P. Zanatta Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil Amaury Hayat Ecole des Ponts Paristech, Marne-la-Vallée, France Andoni Saez-de-Ibarra Energy Storage and Management, IKERLAN Technology Research Centre, BRTA, Arrasate, Mondragón, Spain Bảo-Huy Nguyễn e-TESC Lab., University of Sherbrooke, Sherbrooke, QC, Canada CTI Lab. for EVs, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam Benedetto Piccoli Rutgers University Camden, Camden, NJ, USA Benjamin Seibold Temple University, Philadelphia, PA, USA Bibiana M. P. Ferraz Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil Binh-Minh Nguyen Department of Advanced Science and Technology, Toyota Technological Institute, Nagoya, Japan Christopher Petersen Space Vehicles Directorate, Air Force Research Laboratory, Kirtland, NM, USA Cristina González-Morán LEMUR Group, Department of Electrical Engineering, University of Oviedo, Gijon, Spain Dan Work Vanderbilt University, Nashville, TN, USA
Daniel Serrano-Jimenez ETS de Ingeniería de Minas y Energía, Universidad Politécnica de Madrid, Madrid, Spain Eneko Unamuno Faculty of Engineering of Mondragon Univertsitatea (EPS-MU), Arrasate, Spain Eugene Vinistky University of California at Berkeley, Berkeley, CA, USA Dalila B. M. M. Fontes LIAAD - INESC TEC, Porto, Portugal Faculdade de Economia da Universidade do Porto, Porto, Portugal Guilherme S. Morais Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil Haizea Gaztañaga Energy Storage and Management, IKERLAN Technology Research Centre, BRTA, Arrasate, Mondragón, Spain Haritza Camblong Department of Systems Engineering and Control, University of the Basque Country (UPV/EHU), Faculty of Engineering of Gipuzkoa, Plaza de Europa and Univ. Bordeaux, Estia Institute of Technology, Bidart, France João Pedro F. Trovão e-TESC Lab., University of Sherbrooke, Sherbrooke, QC, Canada Canada Research Chair in Efficient Electric Vehicles with Hybridized Energy Storage Systems, University of Sherbrooke, Sherbrooke, QC, Canada INESC Coimbra, University of Coimbra, DEEC, Polo II, Coimbra, Portugal Polytechnic of Coimbra, IPC-ISEC, DEE, Coimbra, Portugal Johannes Ultsch Institute of System Dynamics and Control, German Aerospace Center (DLR), Weßling, Germany Jon Ander López-Ibarra Engineering Department, Jema Energy, Lasarte-Oria, Spain Jon Andoni Barrena Faculty of Engineering of Mondragon Univertsitatea (EPSMU), Arrasate, Spain Jonas Mirwald Institute of System Dynamics and Control, German Aerospace Center (DLR), Weßling, Germany Jonathan Brembeck Institute of System Dynamics and Control, German Aerospace Center (DLR), Weßling, Germany Jonathan Lee University of California at Berkeley, Berkeley, CA, USA Jonathan Sprinkle University of Arizona, Tucson, AZ, USA Jorge García LEMUR Group, Department of Electrical Engineering, University of Oviedo, Gijon, Spain Josu Olmos Energy Storage and Management, IKERLAN Technology Research Centre, BRTA, Arrasate, Mondragón, 
Spain
Lucía Suárez Telemanagement System Infrastructure Dept. ERedes Electrical Distribution, EDP Group, Lisbon, Portugal S. Mahdi Homayouni LIAAD - INESC TEC, Porto, Portugal Maicon J. S. Ramos Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil Marie-Cécile Péra FEMTO-ST (UMR CNRS 6174), FCLAB (USR CNRS 2007), Univ. Bourgogne Franche-Comté, UTBM, Belfort, France Maria Laura Delle Monache INRIA Grenoble—Rhône Alpes, Montbonnot-Saint-Martin, France
Mariana Resener Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil Minh C. Ta e-TESC Lab., University of Sherbrooke, Sherbrooke, QC, Canada CTI Lab. for EVs, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam Nicolas Kardous University of California at Berkeley, Berkeley, CA, USA Pablo Arboleya Universidad de Oviedo, Oviedo, Spain Lemur Research Group, Gijon, Spain LEMUR Group, Department of Electrical Engineering, University of Oviedo, Gijon, Spain Pablo García LEMUR Group, Department of Electrical Engineering, University of Oviedo, Gijon, Spain Paige Arnold Rutgers University Camden, Camden, NJ, USA Pavel Knopov V.M. Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, Kyiv, Ukraine Rafael Fierro Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA Ricardo de Castro Department of Mechanical Engineering, University of California, Merced, CA, USA Rubén Medina Plexigrid, Stockholm, Sweden Rui Esteves Araujo INESC TEC and Faculty of Engineering, University of Porto, Porto, Portugal Saleh Albeaik University of California at Berkeley, Berkeley, CA, USA Sandra Castano-Solis ETS de Ingeniería y Diseño Industrial, Universidad Politécnica de Madrid, Madrid, Spain Sean McQuade Rutgers University Camden, Camden, NJ, USA
Sean Phillips Space Vehicles Directorate, Air Force Research Laboratory, Kirtland, NM, USA Sydney Truong Rutgers University Camden, Camden, NJ, USA Thanh Vo-Duy CTI Lab. for EVs, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam Vladimir Norkin V.M. Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, Kyiv, Ukraine Faculty of Applied Mathematics of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine Xiaoqian Gong Arizona State University, Tempe, AZ, USA Yang Zhou School of Automation, Northwestern Polytechnical University, Xi’an, China FEMTO-ST (UMR CNRS 6174), FCLAB (USR CNRS 2007), Univ. Bourgogne Franche-Comté, Belfort, France Yiling You University of California at Berkeley, Berkeley, CA, USA Younes Mohammadi Lulea University of Technology, Lulea, Sweden
Predictive Energy Management for Fuel Cell Hybrid Electric Vehicles Yang Zhou, Alexandre Ravey, and Marie-Cécile Péra
Y. Zhou
School of Automation, Northwestern Polytechnical University, Xi’an, China

A. Ravey · M.-C. Péra
FEMTO-ST (UMR CNRS 6174), FCLAB (USR CNRS 2007), Univ. Bourgogne Franche-Comté, UTBM, Belfort, France

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_1

1 Introduction

With global warming and the depletion of fossil fuels, fuel cell technologies have gained growing attention in the field of transportation electrification [1]. Fuel cell hybrid electric vehicles (FCHEVs) offer many appealing properties, such as high efficiency, zero emissions, and rapid refueling, which make them competitive substitutes for internal combustion engine-based hybrid electric vehicles (HEVs) [2]. Nevertheless, the high ownership cost of the vehicle and the limited lifespan of fuel cell systems greatly hinder the mass commercialization of FCHEVs. To reduce the FCHEV’s operating cost for better economic performance, one practical solution at the current stage is to develop and implement reliable energy management strategies (EMSs) [3]. EMSs are vehicular system-level control strategies dedicated to coordinating the outputs of multiple energy sources within the hybrid powertrain so as to meet the requested propulsion power demand.

Existing EMSs for FCHEVs can be roughly categorized into rule-based and optimization-based strategies. Rule-based strategies consist of a series of predefined deterministic (or fuzzy) rules. These power allocation rules are largely designed based on human intuition, engineering experience, or expert knowledge [4]. Generally, the advantages of rule-based strategies lie in their simplicity, real-time suitability, and robustness against discrepancies in driving cycles. However, their performance optimality (i.e., fuel economy) cannot be guaranteed by the calibrated parameters and preset rules, which is the major deficiency of rule-based strategies. In contrast, global optimization-based strategies derive the optimal power distribution from a priori knowledge of the complete driving cycle [5], which can dramatically improve performance optimality compared with rule-based strategies. Nevertheless, such optimal control cannot be directly applied to real-time scenarios because it requires the entire driving cycle information beforehand.

Alternatively, to optimally distribute power demand in real time, a variety of control algorithms have been successfully applied to vehicular energy management problems, among which model predictive control (MPC) is one of the most representative approaches [6]. Specifically, MPC anticipates the vehicle’s upcoming behavior over a finite time horizon and takes control actions accordingly. With the advancement of onboard electronic control units and real-time optimization techniques, MPC-based strategies can be adopted for real-time power allocation. Nevertheless, the vehicle’s future behavior could be heavily affected by a variety of unpredictable traffic factors, such as the stochastic distribution of traffic lights and the unexpected movement of pedestrians [3]. Such driving disturbances cause the actual vehicle behavior to deviate from the estimated one, thus degrading the precision of the predictive control framework. Therefore, how to properly characterize future driving uncertainties has become a crucial aspect of MPC performance enhancement [3]. Nowadays, the maturation of modern telematics systems and advances in driving prediction techniques (DPTs) make it possible to acquire previewed information about the vehicle’s future driving states, such as the traffic flow speed and the road slope profile.
Benefiting from the previewed information, predictive energy management strategies (PEMSs) have more opportunities than traditional non-predictive EMSs to further enhance vehicle performance. Accordingly, this chapter focuses on the development of a PEMS for FCHEVs, so as to underline the potential performance improvement brought by integrating driving prediction. The rest of this chapter is organized as follows: Section “Data-Driven Speed Forecasting Approach” presents the development and validation of a layer recurrent neural network (LRNN) predictor. Moreover, to enhance the EMS’s adaptability under multiple driving patterns, a Markov chain-based driving pattern recognizer is devised and verified in section “Markov chain-based driving pattern recognizer”. Combining the driving prediction techniques with model predictive control, section “Energy management strategy via model predictive control” presents a multimode PEMS for FCHEVs, whose functionality and real-time suitability are validated via Software-in-the-Loop (SIL) testing against benchmark strategies. The major findings and future research directions are summarized in section “Conclusion”.
2 Data-Driven Speed Forecasting Approach

Under the framework of model predictive control (MPC), the control actions are computed based on the estimation of future disturbances. In the vehicular energy management field, the vehicle’s upcoming propulsion power requests are normally regarded as the system disturbances. Therefore, the prediction quality of the vehicle’s future power demand has a profound impact on the performance of MPC-based energy management strategies. Moreover, because the propulsion power request depends on the vehicle speed, it is necessary to study how to precisely estimate the vehicle’s upcoming speed profile. According to our previous study [3], data-driven approaches, such as artificial neural networks, Markov chains, and support-vector machines, are widely applied to forecast the vehicle’s speed. As one of the representative methods, neural networks can learn predictive knowledge from a historical dataset via training and reproduce similar behaviors in future speed prediction tasks. However, conventional backpropagation neural networks (BPNNs) suffer from two drawbacks, namely, a slow convergence rate and the risk of being trapped in local optima during training, which would eventually deteriorate the speed forecasting performance [3]. To this end, we propose in this section a novel speed forecasting approach based on a layer recurrent neural network (LRNN), with the purpose of enhancing the forecast precision by means of an improved network structure.
2.1 Layer Recurrent Neural Network Speed Predictor

LRNN is one type of recurrent neural network (RNN), which is a connectionist model with multiple self-connected hidden layers [7]. The biggest advantage of the recurrent connection is that a “memory” of previous inputs remains in the network’s internal state [3]. The structure of the proposed LRNN speed predictor is depicted in Fig. 1. As can be seen, the LRNN comprises an input layer, multiple middle layers, and an output layer. The output of each middle layer is fed back to itself with a time delay. Such a recurrent network structure helps the LRNN store historical temporal information and thus better capture the dynamics of a time series. Mathematically, the LRNN receives the historical speed samples and projects the future ones, which can be expressed as follows:

$$\left[v^*_{k+1}, v^*_{k+2}, \ldots, v^*_{k+H_p}\right] = f_{\mathrm{LRNN}}\left(v_{k-H_q+1}, \ldots, v_{k-1}, v_k\right) \tag{1}$$

where $H_q$ and $H_p$ are the lengths of the input and output speed sequences, respectively. As reported in [8, 9], it is suggested to set the length of the forecast horizon ($H_p$) to 1–10 s (with a sampling period of 1 s) in vehicular energy management problems.
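To make the input/output relation of Eq. (1) concrete, the following minimal sketch slices a speed trace into pairs of past samples (length Hq) and future samples (length Hp). The function name `make_windows` and the toy trace are illustrative assumptions, not part of the chapter.

```python
def make_windows(speed, h_q, h_p):
    """Slice a speed trace into (past, future) pairs in the spirit of Eq. (1).

    Each input holds the h_q most recent samples v_{k-h_q+1}, ..., v_k and each
    target holds the next h_p samples v_{k+1}, ..., v_{k+h_p}.
    The function name and interface are hypothetical (illustration only).
    """
    if h_q + h_p > len(speed):
        raise ValueError("speed trace is shorter than h_q + h_p")
    n_pairs = len(speed) - h_q - h_p + 1
    inputs = [list(speed[i:i + h_q]) for i in range(n_pairs)]
    targets = [list(speed[i + h_q:i + h_q + h_p]) for i in range(n_pairs)]
    return inputs, targets


# Toy trace sampled at 1 s: 0, 1, ..., 9 (m/s)
speed = [float(v) for v in range(10)]
X, Y = make_windows(speed, h_q=3, h_p=3)
print(X[0], Y[0])    # first pair: past [0, 1, 2], future [3, 4, 5]
print(X[-1], Y[-1])  # last pair:  past [4, 5, 6], future [7, 8, 9]
```

Pairs of this kind could serve as training samples for any predictor that maps past speeds to future speeds, whether a neural network or a Markov chain.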
Fig. 1 Schematic diagram of the LRNN predictor (input layer, middle layers with time-delayed feedback, output layer)

Fig. 2 Multiple standard driving cycles extracted from ADVISOR simulator (offline driving database; velocity in m/s versus time in s; cycles shown: CYC_HHDDT65, CYC_Highway, CYC_INDIA_HW, CYC_WVSUB, CYC_Nuremberg36, CYC_NewYorkBUS, CYC_NYCC, CYC_BUSRTE)
Therefore, in this study, the upper limit of the velocity prediction horizon is set to 10 s, while the length of the input speed sequence is set equal to the prediction horizon ($H_p = H_q$). The LRNN should be trained on a comprehensive database. To guarantee satisfactory forecast precision, this driving database should contain abundant driving cycles covering a wide range of driving scenarios. To this end, eight standard driving cycles with different driving patterns (urban/suburban/highway) are aggregated to establish the offline driving database, as shown in Fig. 2. Please note that these standard driving cycles are taken from the vehicular simulator ADVISOR [10]. Thereafter, another standard driving cycle, the Urban Dynamometer Driving Schedule (UDDS), is picked from ADVISOR to verify the performance of the LRNN predictor. The root-mean-square error (RMSE) is used as the evaluation metric for forecast precision [9].
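As an illustration of the evaluation metric, the sketch below computes the RMSE over a single prediction horizon and then averages it over several horizons. How the chapter aggregates the per-horizon errors is not spelled out here, so the plain arithmetic mean used in `average_rmse` is an assumption, and both function names are hypothetical.

```python
import math


def horizon_rmse(predicted, actual):
    """RMSE between predicted and actual speeds over one prediction horizon."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("horizons must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def average_rmse(predicted_windows, actual_windows):
    """Arithmetic mean of the per-horizon RMSE (assumed aggregation rule)."""
    errors = [horizon_rmse(p, a) for p, a in zip(predicted_windows, actual_windows)]
    return sum(errors) / len(errors)


# Two toy horizons of length 2 (km/h); values are made up for illustration
predicted = [[10.0, 12.0], [20.0, 20.0]]
actual = [[10.0, 10.0], [20.0, 24.0]]
avg = average_rmse(predicted, actual)
print(round(avg, 4))
```

Here the two per-horizon errors are sqrt(2) and sqrt(8), so the reported average is their mean.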
Impact of Percentage of Training Sample and Middle Layer Configuration

Before online implementation, a sensitivity analysis is conducted to explore how the forecast precision of the LRNN is affected by the percentage of network training samples and by the node combination in the LRNN middle layers, so as to further improve the quality of speed prediction. Firstly, to study the impact of the training sample ratio on prediction accuracy, the LRNN is trained with seven different percentages of driving data, and its performance is then tested under the UDDS driving cycle. In the NN training phase, once the percentage x% of training samples is set, the remaining (100 − x)% of the driving data is used for validation, since no testing samples are set aside (0%). The middle layer configuration of the LRNN is {3,4,6}; the reason for using this configuration is explained afterward. Table 1 details the prediction results. As can be seen, as the training ratio increases from 35% to 85%, the forecast accuracy of the LRNN improves overall (although not monotonically) for $H_p$ = 3, 5, and 10 s. This is mainly because, with a higher ratio of training samples, the LRNN can learn predictive knowledge from a wider range of driving scenarios, thereby increasing its forecast precision under newly encountered driving conditions. Nevertheless, too many training samples (e.g., 95%) degrade the prediction accuracy to some extent, since an excessively high training ratio compromises the generalization capacity of the LRNN. As a result, the ratio of training samples is set to 85%, since it improves the prediction accuracy without overly degrading the network’s generalization capacity. In the following, 85% of the driving data (8227 out of 9479 speed samples) is used as the training sample for the LRNN.
Thereafter, the LRNN predictor is tested under the UDDS driving cycle while keeping the three-middle-layer structure unchanged, holding the total number of middle nodes constant (13 in our case), and altering the node numbers in the first two middle layers. The average prediction error (RMSE) is listed in Table 2. As can be observed, for $H_p$ = 3, 5, and 10 s, the highest prediction accuracy is achieved under middle layer configuration III, namely, {3,4,6}.

Table 1 Average RMSE (km/h) under different training data percentages

Training data     35%    45%    55%    65%    75%    85%    95%
$H_p$ = 3 s       1.75   1.78   1.82   1.74   1.72   1.67   1.70
$H_p$ = 5 s       3.00   2.98   3.06   2.96   2.91   2.85   2.97
$H_p$ = 10 s      6.31   6.31   6.43   6.29   6.20   6.09   6.28
Table 2 Average RMSE (km/h) under different node combinations of LRNN middle layer

$H_p$    Config. I   Config. II   Config. III   Config. IV   Config. V   Config. VI
         {1,6,6}     {2,5,6}      {3,4,6}       {4,3,6}      {5,2,6}     {6,1,6}
3 s      2.66        1.81         1.67          2.08         3.04        2.89
5 s      3.46        2.92         2.85          3.12         3.76        3.59
10 s     6.72        6.21         6.09          6.22         6.75        6.70
Y. Zhou et al.
Therefore, based on the results of sensitivity analysis, for LRNN predictor, 85% of data in offline driving database is used for network training, while the remaining 15% is for performance validation, and the hidden layer node configuration is set to {3,4,6} for online implementation.
Benchmark Speed Predictor Description
Two commonly used speed predictors are introduced as benchmarks, namely, a multistep Markov chain (MSMC) predictor and a backpropagation neural network (BPNN) predictor. MSMC predicts the vehicle's acceleration via multistep transition probability matrices (TPMs), with 50 Markov states and a Markov chain of order 1. More details regarding the establishment of MSMC are available in [9]. The BPNN predictor has a three-layer structure, consisting of an input layer, a hidden layer, and an output layer. It is a multi-input-multi-output mapping function, where the number of nodes in the BPNN hidden layer equals the sum of the nodes in the LRNN middle layers, 13 in our case. More details regarding the BPNN predictor are also available in [9]. Furthermore, the estimation of the TPMs and the training of the BPNN are accomplished using the driving data in Fig. 2.
Prediction Results and Discussions
Figure 3 compares the performance of the three predictors on the UDDS testing cycle, where the blue and red curves denote the real speed and the speed forecasted over each prediction horizon, respectively, with Hp = 10 s and a sampling interval ΔT = 1 s. As shown in Fig. 3(a–c), owing to the stochastic nature of the Markov chain, the speed predicted by MSMC tends to diverge heavily from the actual speed traces, causing the largest prediction error among the three approaches. Moreover, since the order of the Markov chain is set to 1, MSMC forecasts the upcoming speeds from the current driving state only, which limits its ability to characterize complex and blended driving behaviors. Because the BPNN approach uses more past speed samples for forecasting, its prediction quality is better than that of MSMC. In addition, thanks to the "memory" effect of the recurrent network structure, the speed forecasted by LRNN lies closer to the actual speed trajectories than that of the BPNN predictor, implying improved forecast accuracy. Furthermore, as highlighted by the dashed regions in Fig. 3(d–f), the LRNN predictor reconverges faster overall after the inflection points of the speed trajectory than the benchmark predictors, indicating that it adapts more promptly to recent driving changes. Table 3 lists the average RMSE of the three predictors with different Hp on the UDDS testing cycle, where the percentage is the RMSE decrement brought by the LRNN predictor. Specifically, compared with the benchmark approaches, the proposed LRNN approach reduces the average forecast error by at least 16.23% (versus MSMC) and 6.16% (versus BPNN). Hence, the effectiveness of enhancing the prediction quality through an improved network structure is verified.

Predictive Energy Management for Fuel Cell Hybrid Electric Vehicles
Fig. 3 Comparative speed forecasting results on UDDS driving cycle (Hp = 10 s): (a)–(c) global view of MSMC, BPNN, and LRNN predictors; (d)–(f) local view of MSMC, BPNN, and LRNN predictors. [Plots omitted; axes: velocity (m/s) versus time (s); curves: real velocity and predicted velocity.]
Table 3 Average RMSE and prediction accuracy improvement on UDDS driving cycle
             Hp = 3 s                        Hp = 5 s                        Hp = 10 s
Predictor    RMSE (km/h)   Error reduction   RMSE (km/h)   Error reduction   RMSE (km/h)   Error reduction
MSMC         2.34          28.63%            3.98          28.39%            7.27          16.23%
BPNN         1.97          15.23%            3.18          10.38%            6.49          6.16%
LRNN         1.67          N/A               2.85          N/A               6.09          N/A
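The error-reduction percentages in Table 3 can be reproduced with a few lines of Python. This is only an illustrative sketch: it assumes the reduction is defined as (RMSE_benchmark − RMSE_LRNN) / RMSE_benchmark, a definition that is inferred here because it matches the tabulated values, and the names `rmse` and `error_reduction` are hypothetical.

```python
# Average RMSE values (km/h) copied from Table 3, keyed by prediction horizon in seconds.
rmse = {
    "MSMC": {3: 2.34, 5: 3.98, 10: 7.27},
    "BPNN": {3: 1.97, 5: 3.18, 10: 6.49},
    "LRNN": {3: 1.67, 5: 2.85, 10: 6.09},
}


def error_reduction(benchmark: str, horizon: int) -> float:
    """Relative RMSE decrement (percent) of LRNN versus a benchmark predictor.

    Assumed definition: 100 * (RMSE_benchmark - RMSE_LRNN) / RMSE_benchmark.
    """
    base = rmse[benchmark][horizon]
    return 100.0 * (base - rmse["LRNN"][horizon]) / base


for name in ("MSMC", "BPNN"):
    for hp in (3, 5, 10):
        print(f"{name}, Hp = {hp} s: {error_reduction(name, hp):.2f}%")
```

Under this assumed definition, the smallest reductions occur at Hp = 10 s, which is where the "at least" figures quoted in the text come from.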
3 Markov Chain-Based Driving Pattern Recognizer
A driving pattern is an overall characterization of the combination of vehicle states and road environment [3], with urban (flowing or congested), suburban, and highway being the representative patterns. Usually, the speed and power demand profiles under different classes of driving pattern exhibit different features. This greatly challenges the adaptability and optimality of energy management strategies (EMSs) under changeable driving scenarios. Yet, most existing EMSs for FCHEVs still focus on optimizing the power split on specific driving cycles (e.g., [5]) and do not fully account for the impact of different driving patterns on control performance. In this context, a driving pattern-conscious EMS with strengthened adaptability to changeable driving scenarios should be further investigated, which in turn raises a challenging task: driving pattern recognition (DPR). To address this issue, this section develops a DPR approach based on the Markov chain and a moving window technique [11], which can effectively identify the pattern of the online driving fragment. The design process is detailed in the following parts.
3.1 Working Principle of the Markov Chain-Based DPR Approach In this section, the velocity-acceleration (v-a) transition behavior is deemed as the characteristic of each driving pattern, and it is quantified by the transition probability matrix (TPM) of Markov chain. The workflow of the devised DPR method is illustrated in Fig. 4, including four working stages: (a) offline benchmark TPM building stage, (b) online TPM estimation stage, (c) resemblance quantification stage, and (d) DPR precision-enhancing stage, where stage (a) is accomplished offline, whereas others are fulfilled in real time. The principle of each working stage will be detailed in the rest of this section.
Fig. 4 Flowchart of the Markov driving pattern recognition (DPR) method. [Diagram omitted; stages: (a) the urban, suburban, and highway driving databases feed a conventional MC model that yields the benchmark TPMs for each pattern; (b) the online measured driving data (v, a) feed a self-learning MC model that yields the online identified multi-step TPMs; (c) similarity degree quantification by the 2-D correlation coefficient, giving the similarity vector SD(N) = [sd1(N), sd2(N), sd3(N)]; (d) complementary rules based on the number of stops and the average/maximum speed, giving the DPR result.]
3.2 Design of Markov-Based Driving Pattern Recognizer
As shown in Fig. 4, a conventional Markov model is leveraged to estimate the offline benchmark TPMs for three representative driving patterns, that is, urban, suburban, and highway. In contrast, the TPMs related to recent driving changes are generated via the self-learning Markov model. Finally, the pattern recognition results are attained by measuring the resemblance between the offline benchmark TPMs and the online-recognized TPMs. For DPR purposes, the Markov state is defined as the discrete (v-a) pair, that is, the velocity and acceleration at time k, denoted by x(k) = (v(k), a(k)). Therefore, the (i, j)th element of the l-step TPM can be derived by [12]:

$$[T_l]_{ij} = \Pr\{x(k+l) = x_j \mid x(k) = x_i\} \approx \frac{\mathrm{Num}^l_{ij}}{\mathrm{Num}^l_{oi}} \tag{2}$$

$$\mathrm{Num}^l_{oi} = \sum_{j=1}^{s} \mathrm{Num}^l_{ij}, \quad i, j \in \{1, 2, \ldots, s\}, \quad l \in \{1, \ldots, N_T\} \tag{3}$$

where $\mathrm{Num}^l_{ij}$ denotes the number of state transition events $x_i \to x_j$ in l steps ahead, $\mathrm{Num}^l_{oi}$ the number of state transitions originating from $x_i$, and $N_T$ the time-scale range of the conventional Markov model. To identify the TPM group from real-time measurements, the number of state transitions Num should be replaced by the frequency of state transitions Fre. Hence, the original estimation model of the transition probability (2) is reformulated as [12]:
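$$[T_l(L)]_{ij} \approx \frac{\mathrm{Num}^l_{ij}(L)/L}{\mathrm{Num}^l_{oi}(L)/L} = \frac{\mathrm{Fre}^l_{ij}(L)}{\mathrm{Fre}^l_{oi}(L)} \tag{4a}$$

$$\mathrm{Fre}^l_{ij}(L) = \frac{\mathrm{Num}^l_{ij}(L)}{L} = \frac{1}{L}\sum_{t=1}^{L} \mathrm{flag}^l_{ij}(t) \tag{4b}$$

$$\mathrm{Fre}^l_{oi}(L) = \frac{\mathrm{Num}^l_{oi}(L)}{L} = \frac{1}{L}\sum_{t=1}^{L} \mathrm{flag}^l_{oi}(t) \tag{4c}$$

$$\mathrm{flag}^l_{oi}(t) = \sum_{j=1}^{s} \mathrm{flag}^l_{ij}(t) \tag{4d}$$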
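where L is the length of observation, and flag indicates the occurrence of the corresponding transition event, with i, j ∈ {1, . . . , s} and l ∈ {1, . . . , Hp}. For instance, $\mathrm{flag}^l_{ij}(t) = 1$ only when the state transition $x_i \to x_j$ occurs at time step t (t ∈ [1, L]), while $\mathrm{flag}^l_{oi}(t) = 1$ only when a state transition originates from state $x_i$ at time step t. If the corresponding transition events do not occur, both flags are zero. Furthermore, the transition frequencies ($\mathrm{Fre}^l_{ij}$ and $\mathrm{Fre}^l_{oi}$) can be expanded recursively as follows [12]:

$$\begin{aligned}
\mathrm{Fre}^l_{ij}(L) &= \frac{1}{L}\sum_{t=1}^{L} \mathrm{flag}^l_{ij}(t) = \frac{1}{L}\left[(L-1)\,\mathrm{Fre}^l_{ij}(L-1) + \mathrm{flag}^l_{ij}(L)\right] \\
&= \mathrm{Fre}^l_{ij}(L-1) + \frac{1}{L}\left[\mathrm{flag}^l_{ij}(L) - \mathrm{Fre}^l_{ij}(L-1)\right] \\
&\approx \mathrm{Fre}^l_{ij}(L-1) + \phi\left[\mathrm{flag}^l_{ij}(L) - \mathrm{Fre}^l_{ij}(L-1)\right]
\end{aligned} \tag{5a}$$

$$\begin{aligned}
\mathrm{Fre}^l_{oi}(L) &= \frac{1}{L}\sum_{t=1}^{L} \mathrm{flag}^l_{oi}(t) = \frac{1}{L}\left[(L-1)\,\mathrm{Fre}^l_{oi}(L-1) + \mathrm{flag}^l_{oi}(L)\right] \\
&= \mathrm{Fre}^l_{oi}(L-1) + \frac{1}{L}\left[\mathrm{flag}^l_{oi}(L) - \mathrm{Fre}^l_{oi}(L-1)\right] \\
&\approx \mathrm{Fre}^l_{oi}(L-1) + \phi\left[\mathrm{flag}^l_{oi}(L) - \mathrm{Fre}^l_{oi}(L-1)\right]
\end{aligned} \tag{5b}$$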
To help the TPMs adapt to recent driving changes, the varying decay coefficient 1/L is replaced by a constant forgetting coefficient ϕ (0 < ϕ < 1) in (5a) and (5b), so that the influence of old measurements on the transition probabilities is removed step by step. A larger ϕ denotes a higher updating rate of the TPM. In particular, the measurements $\mathrm{flag}^l_{ij}(1), \ldots, \mathrm{flag}^l_{ij}(L)$ and $\mathrm{flag}^l_{oi}(1), \ldots, \mathrm{flag}^l_{oi}(L)$ are assigned a group of exponentially declining weights $[\phi(1-\phi)^{L-1}, \ldots, \phi(1-\phi), \phi]$, whose sum, $1 - (1-\phi)^L$, approaches 1 as L grows. Thus, the probability $[T_l(L)]_{ij}$ can be updated online by [12]:

$$[T_l(L)]_{ij} \approx \frac{\mathrm{Fre}^l_{ij}(L-1) + \phi\left[\mathrm{flag}^l_{ij}(L) - \mathrm{Fre}^l_{ij}(L-1)\right]}{\mathrm{Fre}^l_{oi}(L-1) + \phi\left[\mathrm{flag}^l_{oi}(L) - \mathrm{Fre}^l_{oi}(L-1)\right]}, \quad i, j \in \{1, \ldots, s\}, \quad l \in \{1, \ldots, H_p\} \tag{6}$$

By (6), the TPM is gradually renewed as driving information is incrementally acquired.
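The recursive update in (5a), (5b), and (6) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the class name `OnlineTPM`, the 0-based state indices, and the way the frequencies are initialised (uniform, so that every TPM element starts at 1/s) are assumptions made here.

```python
import numpy as np


class OnlineTPM:
    """Self-learning TPM for one time scale l, with a constant forgetting coefficient phi."""

    def __init__(self, s: int, phi: float):
        self.s, self.phi = s, phi
        self.reset()

    def reset(self) -> None:
        # Assumed initialisation: uniform frequencies, so every TPM element equals 1/s.
        self.fre_ij = np.full((self.s, self.s), 1.0 / self.s**2)
        self.fre_oi = self.fre_ij.sum(axis=1)

    def update(self, i: int, j: int) -> None:
        """Register one observed transition x_i -> x_j (0-based indices)."""
        flag_ij = np.zeros((self.s, self.s))
        flag_ij[i, j] = 1.0
        flag_oi = flag_ij.sum(axis=1)                          # Eq. (4d)
        self.fre_ij += self.phi * (flag_ij - self.fre_ij)      # Eq. (5a)
        self.fre_oi += self.phi * (flag_oi - self.fre_oi)      # Eq. (5b)

    def tpm(self) -> np.ndarray:
        """Current transition probability matrix, Eq. (6)."""
        return self.fre_ij / self.fre_oi[:, None]


model = OnlineTPM(s=3, phi=0.1)
for i, j in [(0, 1), (1, 2), (2, 0), (0, 1)]:
    model.update(i, j)
T = model.tpm()
print(np.round(T, 3))
```

Because the same recursion is applied to numerator and denominator, each row of the resulting matrix keeps summing to 1, and transitions observed recently dominate older ones at a rate set by ϕ.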
Offline Benchmark Transition Probability Matrix Building Stage
Figure 5 gives the flowchart of the offline benchmark TPM building stage. The workflow of this stage is detailed as follows:
• Step 1: Multiple standard driving cycles are selected from ADVISOR [10], including Cruise3, HWFET, ARTEMIS_HW, HHDDT65, ARTEMIS_UB, US06_HW, BUSRTE, Manhattan, AQMDRTC2, NurembergR36, ARTEMIS_SUB, WVUINTER, WVUSUB, UNIF01, and IM240. The driving cycles with identical patterns are aggregated to generate the related sub-database.
• Step 2: Within each sub-database, the collected speed profiles are discretized into (v-a) pairs. These time index-labelled driving samples are then imported into the velocity-acceleration plane. Samples that fall within the same rectangular zone are given the same Markov state index.
• Step 3: According to the measurements within the velocity-acceleration plane, the TPM groups of each driving pattern are generated by (2). These TPM groups are taken as the offline references for real-time resemblance quantification. Furthermore, as seen from the attained 3-D bar diagrams, every driving scenario has its own velocity-acceleration transition feature. Hence, the generated multistep TPM groups can be adopted to represent the corresponding driving scenarios.
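Steps 2 and 3 above can be sketched as follows. The sketch rests on several assumptions that are not taken from the chapter: the helper names `to_state_index` and `multistep_tpms`, a 6 × 6 grid over 0–120 km/h and −3 to 3 m/s² (chosen only to match the s = 36 example of Fig. 5), acceleration obtained by finite differences, and rows of never-visited states left as zeros.

```python
import numpy as np


def to_state_index(v, a, v_edges, a_edges):
    """Map (v, a) samples to 0-based Markov state indices on a rectangular grid."""
    vi = np.clip(np.digitize(v, v_edges) - 1, 0, len(v_edges) - 2)
    ai = np.clip(np.digitize(a, a_edges) - 1, 0, len(a_edges) - 2)
    return ai * (len(v_edges) - 1) + vi


def multistep_tpms(states, s, n_t):
    """Estimate the l-step TPMs, l = 1..n_t, by counting transitions (Eqs. 2 and 3)."""
    tpms = []
    for l in range(1, n_t + 1):
        num = np.zeros((s, s))
        np.add.at(num, (states[:-l], states[l:]), 1.0)    # Num_ij^l
        num_o = num.sum(axis=1, keepdims=True)            # Num_oi^l
        # Assumption: rows of states that were never visited stay at zero.
        tpms.append(np.divide(num, num_o, out=np.zeros_like(num), where=num_o > 0))
    return tpms


# Toy speed trace in km/h sampled at 1 s; acceleration in m/s^2 by finite differences.
v = np.array([0.0, 5.0, 12.0, 25.0, 31.0, 28.0, 15.0, 4.0, 0.0])
a = np.gradient(v / 3.6)
v_edges = np.linspace(0.0, 120.0, 7)    # hypothetical grid: 6 velocity bins
a_edges = np.linspace(-3.0, 3.0, 7)     # hypothetical grid: 6 acceleration bins
states = to_state_index(v, a, v_edges, a_edges)
tpms = multistep_tpms(states, s=36, n_t=3)
```

In practice the same counting would be run once per sub-database (urban, suburban, highway) to obtain one group of N_T benchmark matrices per pattern.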
Real-Time Transition Probability Matrix Recognition Stage
Figure 6 presents the workflow of the real-time TPM recognition stage. As shown in Fig. 6, the online learning Markov model (6) is applied to each driving fragment within the moving horizon, where Lu and Ls represent the lengths of the updating and sampling windows, respectively. Based on the (v-a) samples, the Markov transition probabilities are renewed at each sampling time instant, so that the real-time TPMs evolve from the initial status to the terminal status. At the end of each sampling phase, the similarity between the online-recognized TPMs and the offline benchmark TPMs is quantified. The quantification results remain unchanged throughout the updating phase (Lu seconds). Thereafter, to promptly remove the negative impacts of old measurements, all the elements in the online-recognized TPMs are set to 1/s at every initialization moment (denoted by the red line in Fig. 6). Hence, the quantification results are renewed every Lu seconds. More details regarding the TPM resemblance quantification are given in subsection "Quantification of Similarity Degree".

Fig. 5 Flowchart of offline scenario-based benchmark TPM estimation phase (e.g., s = 36 and NT = 3): Step 1: Establishment of the offline scenario-based driving database. Step 2: Discretization and projection of speed samples into the v-a plane. Step 3: Estimation of offline benchmark TPM groups in different driving patterns. [Diagram omitted; Step 1 shows example speed profiles (ARTEMIS_UB, Manhattan, ARTEMIS_SUB, AQMDRTC2, HHDDT65, HWFET; velocity in km/h versus time in s) grouped into the urban (a), suburban (b), and highway (c) driving databases; Step 2 shows the measured (v, a) samples projected onto the v-a plane (acceleration in m/s² versus velocity in km/h) with states x1 to x36; Step 3 shows the one-, two-, and three-step-ahead TPMs for urban, suburban, and highway patterns obtained with the conventional Markov model.]
Fig. 6 Flowchart of online multiscale TPM identification phase. [Diagram omitted; it shows the sampling window (length Ls) and the updating window (length Lu) moving along the driving segment, the initialization and update time instants of the Nth to (N+5)th phases, and the self-learning MC model that evolves the initial TPMs into the terminal TPMs, which are compared with the offline benchmark TPMs through the 2-D correlation coefficient; the quantification results are updated every Lu seconds.]
To ensure that the online-recognized TPMs fully reflect the driving features of the current segment, the Markov effective memory depth Dϕ is set equal to the length of the sampling window Ls. A larger Ls covers a wider range of historical driving conditions, yet an overly large Ls may include redundant information and lead to a heavier computational burden. As stated in [13], the driving period of HEVs is typically about 3 minutes, so, as a reasonable tradeoff, Ls is set close to this value. In addition, the updating window length Lu should be short enough to keep pattern recognition up to date, yet long enough to avoid frequent pattern switching. In light of these considerations, Lu and Ls are set to 50 s and 150 s, respectively. These settings were obtained through numerous cross-validations.
Quantification of Similarity Degree
The 2-D correlation factor r ∈ [0, 1] is adopted in this section to quantify the resemblance between the online-recognized TPMs and the offline benchmark TPMs. Given two matrices $A, B \in \mathbb{R}^{m \times n}$, r(A, B) indicates the resemblance between them, as calculated by:

$$r(A, B) = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n}\left([A]_{i,j} - \bar{A}\right)\left([B]_{i,j} - \bar{B}\right)}{\sqrt{\left(\sum_{i=1}^{m}\sum_{j=1}^{n}\left([A]_{i,j} - \bar{A}\right)^2\right)\left(\sum_{i=1}^{m}\sum_{j=1}^{n}\left([B]_{i,j} - \bar{B}\right)^2\right)}} \tag{7}$$

where $[\cdot]_{i,j}$ represents the (i, j)th element of a matrix, and $\bar{A}$ and $\bar{B}$ represent the averages of the elements in A and B, respectively. A larger r(A, B) means a higher degree of similarity between the examined matrices. In addition, let N represent the index of the updating window. Hence, at t = k, N = fix(k/Lu), wherein Lu = 50 s and the function fix outputs the integer part of k/Lu. At the Nth updating time step, the online-recognized TPMs, denoted by $T_l(N)$, are compared to the offline benchmark TPMs, denoted by $T_l^i$, l = 1, 2, . . . , NT, wherein i is the driving pattern index (1, urban; 2, suburban; 3, highway). Hence, the quantification results are expressed by a similarity vector SD(N) = [sd1(N), sd2(N), sd3(N)], wherein sdi(N) ∈ [0, 1], i = 1, 2, 3, measures the average similarity of the online-recognized TPMs to each type of offline benchmark TPMs, as denoted by:

$$sd_i(N) = \frac{1}{N_T}\sum_{l=1}^{N_T} r\left(T_l(N), T_l^i\right), \quad i = 1, 2, 3. \tag{8}$$
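Equations (7) and (8) translate almost directly into NumPy. The sketch below is illustrative only: the function names `corr2` and `similarity_vector` and the dictionary layout for the benchmark TPMs are assumptions, and the random matrices merely stand in for real TPMs.

```python
import numpy as np


def corr2(a: np.ndarray, b: np.ndarray) -> float:
    """2-D correlation coefficient of two equally sized matrices, Eq. (7).

    Note: the result is undefined (division by zero) if either matrix is constant.
    """
    da, db = a - a.mean(), b - b.mean()
    return float((da * db).sum() / np.sqrt((da**2).sum() * (db**2).sum()))


def similarity_vector(online_tpms, benchmark_tpms):
    """Average correlation over the N_T time scales for each driving pattern, Eq. (8).

    online_tpms: list of N_T matrices; benchmark_tpms: dict pattern -> list of N_T matrices.
    """
    return {
        pattern: float(np.mean([corr2(t, ref) for t, ref in zip(online_tpms, refs)]))
        for pattern, refs in benchmark_tpms.items()
    }


rng = np.random.default_rng(0)
online = [rng.random((4, 4)) for _ in range(3)]            # stand-in for T_l(N), N_T = 3
benchmarks = {
    "urban": online,                                        # identical matrices
    "suburban": [rng.random((4, 4)) for _ in range(3)],     # unrelated matrices
}
sd = similarity_vector(online, benchmarks)
print(sd)
```

As a general property of this expression, it can return negative values for anti-correlated matrices, so an implementation that relies on the [0, 1] range stated in the text may want to check that assumption on its own data.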
Besides, we define ΔSDmax(N) ∈ [0, 1] as the difference between the largest and the second largest elements of SD(N); Imax(N), Imax−2(N) ∈ {1, 2, 3} as the indices of the largest and the second largest elements of SD(N), respectively; and εSD ∈ (0, 1) as the similarity threshold. On this basis, two cases can occur within the Nth sampling horizon:
• Case I (ΔSDmax(N) > εSD): the difference in resemblance is deemed sufficient to separate the driving patterns. Hence, the pattern identification result can be confidently derived from the largest element of SD(N), namely, P(N) = Imax(N). This case tends to occur if the (v-a) samples originate from a single driving pattern, like the kth and the rth phases shown in Fig. 7(a).
• Case II (ΔSDmax(N) ≤ εSD): such an insignificant similarity difference is not a persuasive basis for differentiating driving patterns. This case tends to occur within either the pattern-shifting phases (e.g., see the qth phase of Fig. 7a) or the confusion phases (e.g., see the sth phase of Fig. 7a).
The similarity threshold εSD should balance the overall pattern recognition accuracy against the sensitivity toward (v-a) transitions. In particular, a larger εSD may restrict the driving pattern updating rate, thus degrading the real-time suitability under rapidly changing driving conditions. Conversely, an overly small εSD would reduce the reliability of DPR through frequent pattern switching. As a result, εSD is set to 0.05 after extensive trial and error.

Fig. 7 Representation of (a) similarity quantification results in different driving scenarios; (b) real driving pattern-shifting phases (e.g., urban to suburban); (c) confusion phases (e.g., urban vs. suburban); (d) proposed solution to separate pattern-shifting phases from confusion phases (e.g., urban vs. suburban)
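The Case I / Case II test can be written compactly as below. The function name `classify` and the numerical similarity vectors are hypothetical; only the threshold εSD = 0.05 and the 1-based pattern indices (1, urban; 2, suburban; 3, highway) come from the text.

```python
import numpy as np


def classify(sd, eps_sd=0.05):
    """Return (i_max, i_max2, confident) for a similarity vector SD(N).

    i_max and i_max2 are the 1-based indices of the largest and second largest
    elements; confident is True in Case I and False in Case II.
    """
    sd = np.asarray(sd, dtype=float)
    order = np.argsort(sd)[::-1]               # indices sorted by decreasing similarity
    delta = sd[order[0]] - sd[order[1]]        # Delta SD_max(N)
    return int(order[0]) + 1, int(order[1]) + 1, bool(delta > eps_sd)


# Hypothetical similarity vectors, for illustration only.
case_one = classify([0.80, 0.40, 0.10])        # clear winner: Case I
case_two = classify([0.6052, 0.6000, 0.2000])  # gap of 0.0052: Case II
print(case_one, case_two)
```

When the third return value is False, the decision between the two leading candidates is deferred to the complementary rules rather than taken from Imax alone.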
The TPM resemblance quantification results in Case II make it impossible to discriminate between two conflicting patterns, so additional rules are required to enhance the precision of pattern recognition. In fact, different pattern identification decisions should be taken for two different driving conditions. In particular, although ΔSDmax(N) ≤ εSD in pattern-shifting phases (e.g., see Fig. 7(b1), where more driving samples are derived from the "urban" pattern, and Fig. 7(b2), where more driving samples are derived from the "suburban" pattern), it is rational to recognize the upcoming pattern as "suburban," since the pattern-shifting moment (see the purple dashed curve) lies within the current sampling horizon. In contrast, to prevent misrecognitions in confusion phases (Fig. 7(c1) and (c2)), the current pattern should be recognized as "urban," since the real driving pattern does not change.
Complementary Rule Development
To separate pattern-shifting phases from confusion phases, the workflow of the proposed solution (see Fig. 7d) is detailed below. Provided that (a) P(N − 1) = 1, (b) Imax(N), Imax−2(N) ∈ {1, 2}, and (c) ΔSDmax(N) ≤ εSD, the Nth pattern recognition result P(N) is picked from two candidates: Imax(N) and Imax−2(N). To this end, the Nth sampling horizon is split into two fragments of equal length. If the driving fragment in the second half of the sampling window has sufficient supplementary driving features related to the "suburban" scenario, then P(N) = 2; otherwise, P(N) = 1. Likewise, if "highway" and "urban" or "highway" and "suburban" are the conflicting pattern pairs, the same strategy can be adopted to finalize the pattern identification result. To implement this solution, the supplementary driving features are extracted over the second half of the sampling window whenever ΔSDmax(N) ≤ εSD. With the extracted features, the corresponding complementary rules determine whether the target driving segment can be categorized into the upcoming driving pattern. Hereafter, the selection of supplementary driving features in urban and suburban scenarios and the design of the complementary rules are briefly illustrated to detail the criterion for separating conflicting driving patterns. The number of stop incidents (NoS) and the average speed (vmean) are chosen as the supplementary features when "suburban" and "urban" are the conflicting driving patterns. To explore the statistical distributions of the selected features, numerous driving data fragments with a fixed sampling length (0.5Ls = 75 s) are extracted from the offline driving database (as shown in Fig. 5). With the extracted driving samples, Fig. 8 depicts the statistical distribution of the selected features, and Table 4 summarizes several key figures of the distribution. Note that Pr(•) is the probability of the related incident.
Given these statistics, the complementary rule for separating urban and suburban scenarios is depicted in Fig. 9a. Likewise, the complementary rules for other situations can be finalized, as shown in Fig. 9b and c, where the detailed design procedure is omitted to avoid repetition.
Fig. 8 Histogram on NoS and vmean of driving samples (per 75 s) under urban and suburban patterns

Table 4 Statistical distributions (per 75 s) for the supplementary driving features
Feature                  Urban     Suburban
Pr(NoS = 0)              3.07%     86.01%
Pr(NoS = 1)              42.55%    13.15%
Pr(NoS > 1)              54.38%    0.84%
Pr(vmean > 20 km/h)      4.57%     95.10%
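To make the idea of a complementary rule concrete, the sketch below extracts NoS and vmean from the second half of a sampling window and applies a simple urban/suburban decision. The actual rule of Fig. 9a is not reproduced in the text, so the rule used here (suburban if no stop occurs and vmean exceeds 20 km/h, urban otherwise) is a hypothetical one motivated only by the statistics in Table 4; the function names and the definition of a stop incident are likewise assumptions.

```python
import numpy as np


def count_stops(v_kmh, stop_speed=0.0):
    """Number of stop incidents (assumed definition): entries into standstill,
    counting a window that already starts at standstill as one stop."""
    stopped = np.asarray(v_kmh, dtype=float) <= stop_speed
    entries = int(np.sum(stopped[1:] & ~stopped[:-1]))
    return entries + (1 if stopped[0] else 0)


def urban_vs_suburban(v_window_kmh, v_mean_threshold=20.0):
    """Hypothetical complementary rule applied to the second half of the window.

    Returns 2 (suburban) if no stop occurs and the mean speed exceeds the
    threshold, otherwise 1 (urban).
    """
    v = np.asarray(v_window_kmh, dtype=float)
    second_half = v[len(v) // 2:]
    if count_stops(second_half) == 0 and second_half.mean() > v_mean_threshold:
        return 2
    return 1


cruise = np.full(150, 50.0)                                 # steady 50 km/h for Ls = 150 s
stop_and_go = np.tile([0.0, 10.0, 20.0, 10.0, 0.0], 30)     # repeated stops
print(urban_vs_suburban(cruise), urban_vs_suburban(stop_and_go))
```

The 20 km/h threshold mirrors the Pr(vmean > 20 km/h) column of Table 4; a different threshold or a different combination of the two features would be equally easy to plug in.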
3.3 Driving Pattern Recognition Performance Validation The devised Markov pattern recognizer is verified under combined testing cycles, wherein the Markov time scale NT and the Markov state number s are set to 5 and 16, respectively.
Pattern Identification Results and Discussions
To test the pattern-recognition performance, eight driving cycles are aggregated to generate the combined testing cycle, as depicted in Fig. 10a. In Fig. 10b, the red, blue, and green curves represent the degrees of resemblance (sd1, sd2, sd3) to the three predefined modes, respectively, and the black curve represents the index trajectory of the largest similarity element (Imax). The DPR results are plotted in Fig. 10c and d. In general, the proposed method can properly recognize the driving pattern according to Imax if a stable driving condition is encountered. Yet, as displayed in Fig. 10c, DPR errors tend to appear when ΔSDmax ≤ εSD. In contrast, after the complementary rules are applied, the DPR accuracy improves greatly (e.g., phases I and II in Fig. 10d). Note that the pattern identification result is set to "unrecognized (0)" in the first 150 s owing to the lack of historical data for pattern recognition in this period. Overall, as depicted in Fig. 10a–d, the proposed method can separate real-time driving patterns with high credibility in complex driving scenarios.
Fig. 9 Complementary rules for (a) urban/suburban, (b) highway/suburban, and (c) urban/highway
Figure 10(e–h) details the similarity quantification results in phases I and II on test cycle III. With the adoption of the complementary rules, the pattern recognition performance is improved in two respects: (1) the risk of misrecognition is reduced; (2) the latency before the upcoming pattern is correctly identified is shortened. For instance, as seen in Fig. 10e, sd1(29) is larger than sd2(29), but their difference (0.0052) is less than the threshold εSD (0.05). Thus, without the complementary rules, the current pattern is recognized as "urban" since Imax(29) = 1, which leads to the DPR errors depicted in phase I of Fig. 10c. In contrast, as depicted in Fig. 10f, no vehicle stop is detected in the second half of the 29th sampling window. Based on the complementary rules given in Fig. 9a, the current pattern is therefore recognized as "suburban," which avoids the misrecognition, as depicted in phase I of Fig. 10d. Likewise, as seen in Fig. 10g, since Imax(58) = 3, the 58th pattern would be recognized as "highway" without the complementary rules, delaying the pattern identification, as depicted in phase II of Fig. 10c. In contrast, the vehicle stops three times (NoS > 1) in the second half of the 58th sampling window (Fig. 10h). Based on the complementary rules shown in Fig. 9c, the 58th pattern identification result is set to "urban," thus reducing the latency of pattern identification, as depicted in phase II of Fig. 10d. Besides the testing cycle in Fig. 10a, two other testing cycles are adopted to validate the presented DPR method, with the average identification precision given in Table 5. Specifically, the proposed method attains a recognition precision of at least 92.32% even without the complementary rules. On this basis, an additional DPR precision improvement (ranging from 2.65 to 4.61 percentage points) is obtained with the complementary rules. This shows that the complementary rules can compensate for DPR performance losses when ΔSDmax ≤ εSD. To sum up, the proposed DPR method can reasonably differentiate online driving patterns, with a precision of 94.97%–98.16%.

Fig. 10 DPR results on test cycle III: (a) speed profile of the testing cycle; (b) similarity quantification results (sd1, sd2, sd3, and Imax); (c) and (d) DPR results without and with complementary rules, respectively (phases I and II are marked)

Fig. 10 (e)–(h) DPR results on test cycle III: similarity quantification results in phases I and II
Table 5 DPR accuracy comparison with/without complementary rules (s = 16 and NT = 5)
                  Without complementary rules    With complementary rules    Accuracy improvement
Test cycle I      92.32%                         94.97%                      +2.65%
Test cycle II     93.55%                         98.16%                      +4.61%
Test cycle III    92.89%                         95.55%                      +2.66%

Table 6 DPR accuracy comparison with different parameter configurations
Parameter settings    Test cycle I    Test cycle II    Test cycle III
s = 16, NT = 1        91.64%          88.19%           86.98%
s = 16, NT = 2        92.31%          92.90%           92.89%
s = 16, NT = 3        92.32%          94.87%           94.87%
s = 16, NT = 4        92.98%          97.49%           95.52%
s = 16, NT = 5        94.97%          98.16%           95.55%
s = 36, NT = 1        91.66%          86.87%           85.88%
s = 36, NT = 2        91.66%          89.50%           89.81%
s = 36, NT = 3        90.99%          93.44%           92.23%
s = 36, NT = 4        90.99%          93.44%           92.23%
s = 36, NT = 5        90.99%          94.09%           93.54%
Impacts on Pattern Identification Accuracy Imposed by s and NT
The settings of s and NT influence the precision of the presented pattern recognizer. Hence, a sensitivity analysis is conducted to explore the impact of different settings of s and NT on recognition precision. The evaluation results on three driving cycles are listed in Table 6, which shows that the highest DPR precision is attained when s = 16 and NT = 5. If the Markov state space grows, more observations are required to ensure the completeness of the online-recognized TPMs, whereas the limited number of driving samples in the sampling window makes it hard for the expanded TPMs to fully capture recent driving changes, thus degrading the precision of DPR. Moreover, a larger NT contributes to a higher DPR accuracy in most cases, since more online-recognized TPMs are then available for resemblance quantification. Through this averaging effect, the sensitivity toward improper quantification results is reduced, thus enhancing the precision of DPR. Yet, when NT exceeds five steps, the accuracy gain becomes negligible.
Performance Comparison with Existing DPR Approaches
In pattern identification tasks, the recognition accuracy and the computation burden are two major concerns for real applications. In this subsection, the proposed Markov-based DPR approach is compared to existing DPR approaches on these two issues, with the comparison results listed in Table 7, where SVM refers to support-vector machine, MLPNN means multilayer perceptron neural network, and LVQNN stands for learning vector quantization neural network. Note that the average DPR accuracy of the proposed method on the three testing cycles (s = 16 and NT = 5) is adopted for comparison. Besides, the proposed DPR method uses five feature parameters, namely, the velocity, the acceleration, the number of stops, and the average and maximal speed. Overall, the pattern identification accuracy of the proposed method is comparable to those of existing studies [13–16]. Although the DPR method in [16] achieves slightly higher accuracy than this work, it adopts 19 feature parameters for pattern identification, nearly four times as many as in this study. Using too many feature parameters increases the complexity of the NN structure, thus lengthening the offline training time and raising the risk of overfitting. To sum up, compared to existing DPR approaches, the proposed method achieves a good balance between identification accuracy and online computation burden. In conclusion, the major advances of the proposed method over previous DPR methods are summarized as follows:
• The velocity-acceleration (v-a) transition behaviors are, for the first time, used as the driving feature parameters for DPR problems, in contrast to the stationary feature parameters used by traditional DPR approaches. This permits a more accurate description of each type of driving pattern.
• The proposed complementary rules effectively compensate for DPR accuracy losses during the pattern-shifting phases, thus improving the reliability of pattern identification compared with traditional DPR approaches.

Table 7 DPR performance comparison results
DPR method               Number of feature parameters    Average DPR accuracy
Proposed                 5                               96.22%
SVM-based [13]           4                               95.20%
MLPNN-based [14]         6                               95.82%
Clustering + SVM [15]    6                               95.00%
LVQNN-based [16]         19                              98.00%
Validation results demonstrate that the proposed method can identify real-time driving pattern with an average precision of 96.22%, where the periodically updated DPR results can facilitate the realization of multimode EMS framework under changeable driving scenarios.
4 Energy Management Strategy Via Model Predictive Control Sections “Data-driven speed-forecasting approach” and “Markov Chain based driving pattern recognizer” focus on the development of driving prediction techniques. On this basis, this section provides an example of embedding the predictive knowledge into the real-time optimization framework of model predictive control (MPC), leading to the birth of an integrated predictive energy management strategy (PEMS) for FCHEVs. The rest of this section is organized as follows: Subsection “Powertrain architecture and system modeling” presents the modelling of vehicle’s hybrid powertrain. Subsection “Multi-mode predictive energy management strategy” illustrates the design and verification of a multimode PEMS, which can allocate power demand in the face of changeable driving patterns.
4.1 Powertrain Architecture and System Modelling

This subsection presents the modelling of the vehicle’s hybrid powertrain system.

Vehicle Model and Powertrain Architecture
As depicted in Fig. 11a, this chapter focuses on a midsize vehicle model, which is taken from the database of the vehicular simulator ADVISOR [10]. Figure 11b shows the topology of the studied hybrid propulsion system, where the fuel cell system (FCS), attached to the DC bus via a unidirectional DC/DC converter, and the battery, directly linked to the DC bus, work cooperatively to respond to the power request from the electric machine. The sizes of the PEMFC and the battery are determined by the component sizing method presented in our previous work [17]. The key specifications of the studied vehicle model are listed in Table 8.
Fig. 11 Studied vehicle model: (a) midsize vehicle outline and (b) topology of the hybrid powertrain
Y. Zhou et al.
Table 8 Powertrain specifications of the vehicle model

Category                        Item                          Specification
Vehicle structural parameters   Vehicle mass                  1360 kg
                                Vehicle front surface         1.746 m²
                                Tire radius                   0.32 m
                                Aerodynamic coefficient       0.3
                                Rolling coefficient           0.0135
                                Driveline efficiency          0.91
                                Gravitational acceleration    9.81 m/s²
PEMFC system                    Rated power                   30 kW
                                Maximum efficiency            50.3%
Lithium-ion battery pack        Nominal energy capacity       6.4 kWh
Electric machine                Maximum power                 150 kW
                                Maximum torque                220 N·m
                                Maximum rotation speed        11,000 rpm
Others                          DC/DC converter efficiency    0.90
                                DC/AC converter efficiency    0.95
Moreover, the propulsion power (Ptra) needed by a vehicle in motion can be calculated as a function of its mass (M) and speed (v), as denoted by (9) [18]. Accordingly, the output power of the FCS (PFC) and the battery (PBAT) together satisfy the DC bus power demand (Pd), as denoted by (10).

$$P_{tra} = v \cdot F_{tra} = v \cdot \Big[\underbrace{c_r M g \cos(\theta)}_{F_r} + \underbrace{0.5\,\rho_{air} S_f c_d v^2}_{F_a} + M\dot{v}\Big] \tag{9}$$

$$P_d = \frac{P_{tra}}{\eta_{drive} \cdot \eta_{DC/AC} \cdot \eta_{EM}} = P_{BAT} + P_{FC} \cdot \eta_{DC/DC} \tag{10}$$
where cr is the rolling resistance coefficient, ρair the air density (1.21 kg/m³), Sf the front surface area, cd the aerodynamic drag coefficient, g the gravitational acceleration, ηdrive the driveline efficiency, ηDC/DC and ηDC/AC the power converters’ efficiencies, and ηEM the EM efficiency. Since a horizontal vehicle model is used in this study, the road slope θ is set to 0.
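As a minimal sketch of (9) and (10), the following Python snippet evaluates the traction power and the resulting DC bus demand with the parameters of Table 8. The constant EM efficiency `ETA_EM` is an assumption made only for illustration (the chapter reads it from an efficiency map), and the function names are hypothetical. The sketch covers the traction case (positive power) only.

```python
import math

# Parameters taken from Table 8 and the text around Eqs. (9)-(10).
M = 1360.0         # vehicle mass, kg
G = 9.81           # gravitational acceleration, m/s^2
C_R = 0.0135       # rolling resistance coefficient
RHO_AIR = 1.21     # air density, kg/m^3
S_F = 1.746        # front surface area, m^2
C_D = 0.3          # aerodynamic drag coefficient
ETA_DRIVE = 0.91   # driveline efficiency
ETA_DCAC = 0.95    # DC/AC converter efficiency
ETA_EM = 0.90      # ASSUMPTION: constant EM efficiency (map-based in the chapter)


def traction_power(v, a, theta=0.0):
    """Eq. (9): propulsion power in W for speed v (m/s) and acceleration a (m/s^2)."""
    f_roll = C_R * M * G * math.cos(theta)         # rolling resistance F_r
    f_aero = 0.5 * RHO_AIR * S_F * C_D * v ** 2    # aerodynamic drag F_a
    f_inertia = M * a                              # inertial force
    return v * (f_roll + f_aero + f_inertia)


def dc_bus_demand(p_tra):
    """Eq. (10), left equality: DC bus power demand in W (traction case)."""
    return p_tra / (ETA_DRIVE * ETA_DCAC * ETA_EM)


p_tra = traction_power(v=20.0, a=0.5)   # 72 km/h with mild acceleration
p_d = dc_bus_demand(p_tra)
print(f"P_tra = {p_tra / 1000:.2f} kW, P_d = {p_d / 1000:.2f} kW")
```

For this operating point the traction power is roughly 19.7 kW, and the DC bus demand is larger because of the drivetrain losses.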
Fuel Cell Model
A proton-exchange membrane fuel cell (PEMFC) is used in the studied hybrid powertrain. As the core of the PEMFC system, the fuel cell stack converts hydrogen energy into useful electrical power (Pstack) via a series of electrochemical reactions. A fraction of the electrical power generated by the stack is used in the auxiliaries around
the stack (PAUX) (e.g., the air compressor) to ensure the normal operation of the entire system. In this case, the actual (net) output power of the PEMFC system (PFC) equals the difference between Pstack and PAUX. During the operation of the PEMFC, the hydrogen mass consumption (MH2) can be calculated by [19]:

$$M_{H_2} = \int_0^t \frac{P_{FC}(\tau)}{\eta_{FCS} \cdot LHV_{H_2}}\, d\tau \tag{11}$$

where LHVH2 is the lower heating value of hydrogen. Moreover, let PH2 denote the theoretical power supplied by hydrogen; the efficiency of the fuel cell system can be expressed as:

$$\eta_{FCS} = \frac{P_{FC}}{P_{H_2}} = \frac{P_{stack} - P_{AUX}}{P_{H_2}} \tag{12}$$
More details about the modelling of the auxiliaries’ power consumption can be found in [18]. As a result, the efficiency curve of the studied fuel cell system (FCS) is given in Fig. 12. To enhance the operating efficiency of the FCS, the FCS net power with the highest system efficiency ($\eta_{max}$) is defined as the most efficient operating point, marked as $P_{\eta}^{max}$. Besides, the operating range $P_{FC} \in [P_{\eta}^{LOW}, P_{\eta}^{HIGH}]$, where the PEMFC system efficiency ($\eta_{FCS}$) is higher than 47%, is defined as the FCS’s high efficiency area.
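The integral in (11) can be approximated on a sampled power profile, as sketched below in Python. The efficiency samples in `EFF_POINTS` are hypothetical values chosen only so that the peak matches the 50.3% maximum efficiency of Table 8; they are not the curve of Fig. 12. The lower heating value of about 120 kJ/g is a rounded textbook figure, and the function names are illustrative.

```python
LHV_H2 = 120_000.0  # J/g, approximate lower heating value of hydrogen

# ASSUMPTION: hypothetical (net power in W, system efficiency) samples.
EFF_POINTS = [
    (0.0, 0.10),
    (3_000.0, 0.45),
    (8_000.0, 0.503),
    (20_000.0, 0.47),
    (30_000.0, 0.42),
]


def fcs_efficiency(p_fc):
    """Piecewise-linear interpolation of the assumed efficiency curve."""
    p = min(max(p_fc, EFF_POINTS[0][0]), EFF_POINTS[-1][0])
    for (p0, e0), (p1, e1) in zip(EFF_POINTS, EFF_POINTS[1:]):
        if p <= p1:
            return e0 + (e1 - e0) * (p - p0) / (p1 - p0)
    return EFF_POINTS[-1][1]


def hydrogen_mass(p_fc_profile, dt=1.0):
    """Rectangle-rule approximation of Eq. (11); returns grams of hydrogen."""
    return sum(
        p * dt / (fcs_efficiency(p) * LHV_H2)
        for p in p_fc_profile
        if p > 0.0  # no hydrogen is counted while the net power is zero
    )


# One hour at a constant 8 kW net power (the assumed peak-efficiency point).
m_h2 = hydrogen_mass([8_000.0] * 3600)
print(f"{m_h2:.1f} g")
```

Under these assumed numbers, one hour at 8 kW consumes roughly 477 g of hydrogen.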
Battery Model
A simple yet sufficiently accurate internal resistance (R-int) model is adopted to represent the behavior of the battery, with the equivalent circuit depicted in Fig. 13a.
Fig. 12 Efficiency curve of the 30-kW PEMFC system, with the high efficiency area marked

Fig. 13 Modelling of battery: (a) equivalent circuit of the R-int model and (b) relationship of the internal resistance and OCV of a single cell with respect to its SoC
The state of charge (SoC) is a percentage indicator of the remaining battery capacity (in Ah) relative to its nominal one, as computed by:

$$SoC(t) = SoC_0 - \frac{\int_0^t \eta_{BAT} \cdot I_{BAT}(\tau)\, d\tau}{Q_{BAT}} \tag{13}$$

where QBAT is the nominal battery capacity, SoC0 the initial SoC, IBAT the battery current, and ηBAT the battery efficiency (1 for discharge and 0.95 for charge). According to Kirchhoff’s voltage law, the DC bus voltage (Ud) can be calculated as:

$$U_d = U_{OC} - I_{BAT} \cdot R_{BAT} \tag{14}$$

where RBAT is the battery internal resistance and UOC the battery open-circuit voltage (OCV). Combining (14) with the expression of the battery output power PBAT = Ud · IBAT, the battery current IBAT can be given as:

$$I_{BAT} = \frac{U_{OC}(SoC) - \sqrt{U_{OC}(SoC)^2 - 4 \cdot R_{BAT}(SoC) \cdot P_{BAT}}}{2 \cdot R_{BAT}(SoC)} \tag{15}$$
According to [20], UOC and RBAT can each be expressed as a function of SoC. Figure 13b depicts how the OCV and the internal resistance change with SoC. Please note that the displayed battery characteristics are extracted from the experimentally validated lithium-ion battery model of ADVISOR [10].
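Equations (13)–(15) can be combined into a one-step battery update, sketched below in Python. The OCV and resistance functions, as well as the pack capacity, are hypothetical placeholders (the chapter takes them from the ADVISOR battery data shown in Fig. 13b), and the function names are illustrative.

```python
import math

Q_BAT = 20.0 * 3600.0   # ASSUMPTION: 20 Ah nominal capacity, expressed in A*s


def u_oc(soc):
    """ASSUMPTION: hypothetical linear pack OCV in V (stand-in for Fig. 13b)."""
    return 300.0 + 40.0 * soc


def r_bat(soc):
    """ASSUMPTION: hypothetical constant pack resistance in ohm."""
    return 0.10


def battery_current(p_bat, soc):
    """Eq. (15): battery current in A (positive when discharging)."""
    u, r = u_oc(soc), r_bat(soc)
    disc = u * u - 4.0 * r * p_bat
    if disc < 0.0:
        raise ValueError("requested power exceeds the battery capability")
    return (u - math.sqrt(disc)) / (2.0 * r)


def soc_step(soc, p_bat, dt=1.0):
    """One rectangle-rule step of Eq. (13)."""
    i_bat = battery_current(p_bat, soc)
    eta = 1.0 if i_bat >= 0.0 else 0.95   # 1 for discharge, 0.95 for charge
    return soc - eta * i_bat * dt / Q_BAT


i = battery_current(10_000.0, 0.7)
# Consistency check with Eq. (14): U_d * I_BAT should reproduce P_BAT.
u_d = u_oc(0.7) - i * r_bat(0.7)
print(f"I_BAT = {i:.2f} A, U_d * I_BAT = {u_d * i:.1f} W")
```

The smaller root of the quadratic is used in (15) because it corresponds to the physically meaningful low-current operating point.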
Fig. 14 Efficiency map of the studied electric machine (rated power @150 kW); axes: motor speed (rpm), motor torque (N·m), and efficiency
Electric Machine Model
The electric machine (EM) supplies the vehicular propulsion power. Considering the maximum power/torque demands of the mission profiles, a 150-kW model is selected from the ADVISOR database [10], with its rotation speed and torque operating ranges listed in Table 8. Moreover, the EM efficiency map derived from ADVISOR, as shown in Fig. 14, is adopted to compute ηEM once the torque and speed requests from the wheel side are specified.
4.2 Multimode Predictive Energy Management Strategy

In the face of changeable driving conditions in practice, energy management strategies (EMS) should be able to allocate power demands in multiple driving patterns. To this end, a multimode predictive EMS for FCHEVs is devised, which can adapt to rapidly changing driving scenarios. Specifically, based on the Markov driving pattern recognizer (DPR) and the layer recurrent neural network (LRNN) predictor proposed in previous sections, model predictive control (MPC) is leveraged to derive the optimal power-allocating decisions at each sampling period.

Figure 15 depicts the control framework of the proposed multimode EMS, which comprises a Markov DPR and a multimode MPC. Specifically, the Markov recognizer, in the upper level, periodically refreshes the pattern recognition results. Each identified pattern corresponds to one group of pre-optimized MPC control parameters. Based on the LRNN speed forecasts and the adopted control parameters, the MPC, in the lower level, takes the optimal control actions by solving a constrained optimization problem over each receding (prediction) horizon Hp. The sampling period ΔT is set to 1 s. The following parts detail the development of the MPC-based EMS.

Fig. 15 Control framework of the multimode energy management strategy

Fig. 16 Illustration of model predictive control framework
Model Predictive Control: A Brief Introduction
Model predictive control is one of the most widely used advanced control methods in multiple industrial sectors [21]. It comprises the following three elements, as shown in Fig. 16.

(a) Predictive Model: the term “model” in MPC refers to the control-oriented model (plant model), which is capable of representing the future dynamic behaviors of the real system (plant) according to the input information. The plant model is typically given in the form of a state-space representation or a transfer function, and the precision of system modelling can greatly affect the performance of MPC.

(b) Rolling Optimization: MPC takes control actions by optimizing the performance index (quantified by the cost function) over a finite time horizon. Specifically, with the plant state sampled at time instant t = k, MPC optimizes the performance index over the time horizon [k, k + Hp − 1], where Hp > 1 is the length of the prediction horizon. At the next time instant, the time horizon shifts forward to [k + 1, k + Hp], and the optimization is performed again. In this way, the optimization is repeatedly conducted online.

(c) Feedback Correction: after obtaining the optimal control sequence, containing Hp elements, at time instant k, MPC implements only the first element on the real system and discards the others. This measure limits the control performance losses caused by model mismatch or environmental disturbances.

To sum up, as displayed in Fig. 17, the MPC working flow includes three steps: (i) estimation of the future system state trajectory, (ii) optimization of the MPC performance index over a finite time horizon, and (iii) application of the first optimal control element to the real system. Once the plant states are updated, steps (i)–(iii) are carried out sequentially. Afterward, the prediction horizon moves forward, the system states are resampled, and the calculation (steps (i)–(iii)) is repeated starting from the new states.

Fig. 17 Representation of MPC working principle [6]
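Steps (i)–(iii) can be illustrated with a deliberately small receding-horizon loop. The Python sketch below is a toy, not the chapter's controller: the scalar plant x(k+1) = x(k) + u(k), the control grid, the horizon length, and the weight are all assumed values, and the optimization is an exhaustive search instead of a QP solver.

```python
from itertools import product

U_GRID = (-1.0, -0.5, 0.0, 0.5, 1.0)  # ASSUMPTION: admissible control moves
HP = 3                                 # ASSUMPTION: toy prediction horizon
R_U = 0.1                              # ASSUMPTION: weight on control effort


def predict(x, u_seq):
    """Step (i): predicted states of the toy plant x(k+1) = x(k) + u(k)."""
    states = []
    for u in u_seq:
        x = x + u
        states.append(x)
    return states


def cost(x, u_seq, ref):
    """Quadratic performance index over the prediction horizon."""
    states = predict(x, u_seq)
    return sum((xi - ref) ** 2 + R_U * u ** 2 for xi, u in zip(states, u_seq))


def mpc_step(x, ref):
    """Step (ii): optimize over the horizon; only the first move is returned."""
    best = min(product(U_GRID, repeat=HP), key=lambda seq: cost(x, seq, ref))
    return best[0]


def run(x0, ref, n_steps):
    """Closed loop: re-optimize from the newly sampled state at every step."""
    x, history = x0, [x0]
    for _ in range(n_steps):
        u = mpc_step(x, ref)   # steps (i) and (ii)
        x = x + u              # step (iii): apply the first element to the plant
        history.append(x)
    return history


trajectory = run(x0=0.0, ref=2.0, n_steps=5)
print(trajectory)
```

Because only the first element of each optimized sequence is applied, the remaining elements are recomputed at the next step from the resampled state, which is the feedback correction described in item (c).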
Multimode Model Predictive Controller
This subsection presents the design of the multimode predictive controller.
Control-Oriented Model
Let x, u, y, w, and r, respectively, denote the state, control (input), output, disturbance, and reference vectors; a linear discrete-time system model is adopted, with the state-space representation and the system variable definitions given below:

$$\begin{cases} x(k+1) = A(k)\,x(k) + B_u(k)\,u(k) + B_w(k)\,w(k) & (a) \\ y(k) = C\,x(k) & (b) \end{cases} \tag{16}$$

$$\text{with} \quad \begin{cases} x(k) = [\,SoC(k) \;\; P_{FC}(k-1)\,]^T \\ u(k) = \Delta P_{FC}(k) = \dfrac{P_{FC}(k) - P_{FC}(k-1)}{\Delta T} \\ y(k) = [\,SoC(k) \;\; P_{FC}(k-1)\,]^T \\ w(k) = P_d(k) \end{cases} \tag{17}$$
Besides, the reference vector $r(k) = [\,SoC_{ref} \;\; P_{FC}^{ref}(k)\,]^T$ includes the reference values of the battery SoC and the fuel cell power. Additionally, a first-order differential approximation of the SoC dynamics [5] and the discrete form of the DC bus power balance can be denoted by (18) and (19), respectively:

$$SoC(k+1) = SoC(k) - \frac{\Delta T \cdot \eta_{BAT} \cdot P_{BAT}(k)}{U_d(k) \cdot Q_{BAT}} \tag{18}$$

$$P_d(k) = P_{FC}(k) \cdot \eta_{DC/DC} + P_{BAT}(k) \tag{19}$$
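Combining (16)–(19), that is, substituting $P_{BAT}(k) = P_d(k) - \eta_{DC/DC} \cdot P_{FC}(k)$ from (19) and $P_{FC}(k) = P_{FC}(k-1) + \Delta T \cdot u(k)$ from (17) into (18), the system matrices A(k), Bu(k), Bw(k), and C can be specified as:

$$A(k) = \begin{bmatrix} 1 & \dfrac{\Delta T \cdot \eta_{DC/DC} \cdot \eta_{BAT}}{U_d(k) \cdot Q_{BAT}} \\ 0 & 1 \end{bmatrix}, \quad B_u(k) = \begin{bmatrix} \dfrac{\Delta T^2 \cdot \eta_{DC/DC} \cdot \eta_{BAT}}{U_d(k) \cdot Q_{BAT}} \\ \Delta T \end{bmatrix}, \quad B_w(k) = \begin{bmatrix} -\dfrac{\Delta T \cdot \eta_{BAT}}{U_d(k) \cdot Q_{BAT}} \\ 0 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{20}$$

With ΔT = 1 s, the two entries of Bu(k) reduce numerically to $\Delta T \cdot \eta_{DC/DC} \cdot \eta_{BAT} / (U_d(k) \cdot Q_{BAT})$ and 1.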
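As a quick consistency check, the state-space step (16) can be compared against a direct evaluation of (17)–(19). In the Python sketch below, the matrix entries are obtained by substituting (19) and the definition of u(k) in (17) into (18); the DC bus voltage and the battery capacity are assumed values, and the function names are illustrative.

```python
DT = 1.0                 # sampling period, s
ETA_DCDC = 0.90          # DC/DC converter efficiency (Table 8)
ETA_BAT = 1.0            # battery efficiency while discharging
U_D = 320.0              # ASSUMPTION: constant DC bus voltage, V
Q_BAT = 20.0 * 3600.0    # ASSUMPTION: nominal capacity, A*s


def matrices(u_d=U_D):
    """System matrices obtained by substituting (17) and (19) into (18)."""
    g = DT * ETA_BAT / (u_d * Q_BAT)
    a = [[1.0, g * ETA_DCDC], [0.0, 1.0]]
    b_u = [g * ETA_DCDC * DT, DT]
    b_w = [-g, 0.0]
    return a, b_u, b_w


def step_state_space(x, u, w):
    """Eq. (16a) with x = [SoC(k), P_FC(k-1)], u = dP_FC(k), w = P_d(k)."""
    a, b_u, b_w = matrices()
    return [a[i][0] * x[0] + a[i][1] * x[1] + b_u[i] * u + b_w[i] * w
            for i in range(2)]


def step_direct(x, u, w):
    """The same step written out with Eqs. (17)-(19)."""
    soc, p_fc_prev = x
    p_fc = p_fc_prev + DT * u                               # from (17)
    p_bat = w - ETA_DCDC * p_fc                             # from (19)
    soc_next = soc - DT * ETA_BAT * p_bat / (U_D * Q_BAT)   # (18)
    return [soc_next, p_fc]


x0 = [0.7, 5_000.0]
x_ss = step_state_space(x0, u=200.0, w=12_000.0)
x_direct = step_direct(x0, u=200.0, w=12_000.0)
print(x_ss, x_direct)
```

Both routes give the same next state up to floating-point rounding, which confirms that the matrix form and the scalar equations describe the same model.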
Cost Function and Constraints
The devised EMS intends to achieve three objectives: (a) saving H2 consumption, (b) extending the fuel cell lifespan, and (c) regulating the battery SoC. Note that the second objective is transformed into limiting the power transients of the FCS: the steadier the fuel cell power, the milder the operating conditions, which mitigates the degradation of the FCS and thus contributes to a longer service time. Hence, at time step t = k, the objective function J(k) is expressed as below:
$$J(k) = \sum_{i=1}^{H_p} \big[\rho_1(k) \cdot C_1(k+i) + \rho_2(k) \cdot C_2(k+i-1) + \rho_3(k) \cdot C_3(k+i)\big]$$

$$\text{with} \quad \begin{cases} C_1(k+i) = \left(\dfrac{P_{FC}(k+i-1) - P_{ref}(k)}{P_{FC}^{max}}\right)^2 \\[2mm] C_2(k+i-1) = \left(\dfrac{\Delta P_{FC}(k+i-1)}{\Delta P_{FC}^{max}}\right)^2 \\[2mm] C_3(k+i) = \left(\dfrac{SoC(k+i) - SoC_{ref}}{SoC_{max} - SoC_{min}}\right)^2 \end{cases} \tag{21}$$

where C1 to C3 are the cost terms related to the three EMS objectives, and $P_{FC}^{max}$ = 30 kW, $\Delta P_{FC}^{max}$ = 1 kW/s, SoCmax = 0.8, and SoCmin = 0.6. The major functions of C1 to C3 are specified as below:
• C1 is adopted to drive the fuel cell toward the reference point Pref.
• C2 is leveraged to limit harsh power transients, so as to mitigate the FCS performance degradation imposed by overlarge load dynamics [22].
• C3 is deployed to guarantee the SoC regulation performance. Since a non-plug-in vehicle configuration is adopted, the reference SoC value (SoCref) is identical to the initial SoC value (SoC0), so as to avoid battery overdischarge or overcharge, namely, SoCref = SoC0 = 0.7.

In addition, ρ1, ρ2, ρ3 are three penalty coefficients reflecting the weights on the cost terms C1, C2, C3. The determination of ρ1, ρ2, ρ3 and Pref under different driving patterns is detailed below. Moreover, the control horizon length of the MPC is set identical to the prediction horizon length, where Hp is set to five steps. Within each receding horizon, the following constraints should be enforced:

$$\begin{cases} \underline{SoC} \le SoC(k+i) \le \overline{SoC} & (a) \\ \underline{P_{FC}} \le P_{FC}(k+i-1) \le \overline{P_{FC}} & (b) \\ \underline{\Delta P_{FC}} \le \Delta P_{FC}(k+i-1) \le \overline{\Delta P_{FC}} & (c) \\ \underline{P_{BAT}} \le P_{BAT}(k+i) \le \overline{P_{BAT}} & (d) \\ w(k+i) = P_d^*(k+i), \; i \ge 1 & (e) \end{cases} \tag{22}$$

where constraint (22a) guarantees the battery operation safety, with $\underline{SoC}$ = 0.55 and $\overline{SoC}$ = 0.85. If SoC < 0.6 or SoC > 0.8, the EMS emergency working mode is activated to force the SoC back into the desired operating range [0.6, 0.8] as soon as possible. In addition, the operating boundaries for PFC, ΔPFC, and PBAT are indicated by constraints (22b)–(22d), where $\underline{P_{FC}}$ = 0 W, $\overline{P_{FC}}$ = 30 kW, $\overline{\Delta P_{FC}} = -\underline{\Delta P_{FC}}$ = 1 kW/s, $\underline{P_{BAT}}$ = −50 kW, and $\overline{P_{BAT}}$ = 100 kW. Constraint (22e) assigns the kth disturbance as the forecasted power demand $[P_d^*(k+1), \ldots, P_d^*(k+H_p)]$, which is derived from the LRNN speed forecast results $V_k^* = [v_{k+1}^*, \ldots, v_{k+H_p}^*]$ and the vehicle dynamics (9) and (10).
It should be noted that, since a quadratic performance index J(k) is adopted as the MPC cost function, the kth control sequence $U^*(k) = [u_1^*(k), \ldots, u_{H_p}^*(k)]$ can be derived by minimizing (21) subject to the linear constraints (22). Such a problem can be converted into a quadratic programming (QP) problem and thus can be solved using the open-source solver qpOASES [23]. Thereafter, only the first element of U∗(k) is applied to the vehicle model, while the others are discarded.
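To make the indexing in (21) concrete, the Python sketch below evaluates J(k) for one candidate control sequence by rolling the model (18)–(19) forward. It only evaluates the objective and does not solve the QP (the chapter uses qpOASES for that). The bus voltage and battery capacity are assumed values, the function name is hypothetical, and the demand list is taken as the current demand followed by forecasts.

```python
DT = 1.0
ETA_DCDC = 0.90
ETA_BAT = 1.0
U_D = 320.0              # ASSUMPTION: constant DC bus voltage, V
Q_BAT = 20.0 * 3600.0    # ASSUMPTION: nominal capacity, A*s
P_FC_MAX = 30_000.0      # W
DP_FC_MAX = 1_000.0      # W/s
SOC_MIN, SOC_MAX, SOC_REF = 0.6, 0.8, 0.7


def horizon_cost(soc, p_fc_prev, dp_seq, pd_seq, rho, p_ref):
    """Evaluate J(k) of Eq. (21) for a candidate sequence dP_FC(k..k+Hp-1).

    pd_seq holds the DC bus demand at steps k, ..., k+Hp-1.
    """
    rho1, rho2, rho3 = rho
    j = 0.0
    p_fc = p_fc_prev
    for dp, p_d in zip(dp_seq, pd_seq):
        p_fc = p_fc + DT * dp                               # P_FC(k+i-1)
        p_bat = p_d - ETA_DCDC * p_fc                       # Eq. (19)
        soc = soc - DT * ETA_BAT * p_bat / (U_D * Q_BAT)    # Eq. (18): SoC(k+i)
        c1 = ((p_fc - p_ref) / P_FC_MAX) ** 2
        c2 = (dp / DP_FC_MAX) ** 2
        c3 = ((soc - SOC_REF) / (SOC_MAX - SOC_MIN)) ** 2
        j += rho1 * c1 + rho2 * c2 + rho3 * c3
    return j


# Holding the fuel cell at its reference while it exactly covers the demand
# leaves the SoC at its reference, so every cost term vanishes.
demand = [ETA_DCDC * 6_800.0] * 5
j_hold = horizon_cost(0.7, 6_800.0, [0.0] * 5, demand, (1.0, 1.0, 60.0), 6_800.0)
j_ramp = horizon_cost(0.7, 6_800.0, [500.0] * 5, demand, (1.0, 1.0, 60.0), 6_800.0)
print(j_hold, j_ramp)
```

Ramping the fuel cell away from the reference is penalized through all three terms, which is why `j_ramp` exceeds `j_hold` in this example.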
Multiple Working Modes and Parameter Design of Energy Management Strategy
To adapt to changeable driving scenarios, multiple EMS working modes are defined, and the working mode is switched by using different sets of MPC control parameters. Specifically, we consider the following EMS working modes:

Normal Working Stage
Based on the power demand under urban/suburban/highway scenarios, three groups of MPC control parameters are tuned offline. Then, with the periodically renewed DPR results, one group of well-tuned parameters is picked for real-time control to deal with the related driving conditions. The offline parameter tuning process is introduced afterward.

SoC Emergency Stage
If SoC < 0.6 or SoC > 0.8, ρ3 is increased to ten times its normal value to force the SoC back into [0.6, 0.8]. If an SoC emergency occurs, the control parameters are switched to the “SoC emergency” mode and remain unchanged until the next pattern updating moment.

Start-Up Stage
Without available recent driving information, the pattern identification result is set to “unrecognized” over the first sampling window (t ∈ [1, 150]). In the start-up stage, the MPC control parameters are tuned to make the battery the primary energy provider, whereas the fuel cell only starts to work in case SoC < 0.6.

• Flowchart of Parameter Tuning for Model Predictive Control. The performance of MPC relies highly on its control parameter settings, namely, (ρ1, ρ2, ρ3) and Pref. To obtain a suitable parameter setting for each driving pattern, the parameter tuning flowchart is shown in Fig. 18 and includes four steps:
(i) Dynamic programming (DP) is executed over each type of mission profile to extract the global-optimal results.
(ii) The related Pref is obtained from the statistical distributions of the DP-extracted fuel cell working points.
(iii) Given the fuel cell reference power and the weighting coefficient candidates, several performance metrics of the MPC-based EMS (e.g., hydrogen consumption, final SoC value, etc.) over the identical driving cycles are compared against the DP-optimized results.
(iv) According to the performance discrepancies, ρ1, ρ2, ρ3 are tuned by trial and error.
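The mode switching described above amounts to a lookup of pre-tuned parameters followed by an emergency override. The Python sketch below illustrates this under stated assumptions: the suburban and highway weights and all reference powers follow the values reported in this chapter, the urban weights (1, 2, 100) are one assumed reading of the corresponding table entry, the start-up values are hypothetical placeholders (the chapter does not list them), and the names `MpcParams` and `select_params` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MpcParams:
    rho: tuple     # penalty coefficients (rho1, rho2, rho3)
    p_ref: float   # fuel cell reference power, W


PARAMS = {
    "urban": MpcParams((1.0, 2.0, 100.0), 1_780.0),    # weights: ASSUMED reading
    "suburban": MpcParams((1.0, 1.0, 60.0), 6_800.0),
    "highway": MpcParams((1.0, 0.2, 54.0), 14_400.0),
}
# ASSUMPTION: placeholder for the start-up ("unrecognized") stage.
STARTUP = MpcParams((1.0, 1.0, 1.0), 0.0)


def select_params(pattern, soc, emergency_latched=False):
    """Pick the parameter group for the current pattern and SoC.

    emergency_latched lets the caller keep the emergency setting active
    until the next pattern update, as described in the text.
    """
    base = PARAMS.get(pattern, STARTUP)
    if emergency_latched or soc < 0.6 or soc > 0.8:
        rho1, rho2, rho3 = base.rho
        return MpcParams((rho1, rho2, 10.0 * rho3), base.p_ref)
    return base


print(select_params("highway", 0.70))
print(select_params("suburban", 0.58))
```

Keeping the selection stateless and passing the latch flag in from the caller is just one possible design; it keeps the pattern-update timing outside this function.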
Fig. 18 Flowchart of MPC control parameter tuning process
• Selection of Fuel Cell Reference Working Points. With the entire driving cycle information known beforehand, DP can obtain the globally optimal power allocation, which offers a benchmark for selecting the fuel cell reference working points. Considering the various power demand features, the objective functions of DP vary under different driving scenarios. Specifically, the optimization objective in urban regions is to limit the power transients of the fuel cell (the summation of the fuel cell power variation along the trip) against the fast dynamic power requests, so as to prolong the lifespan of the FCS. In comparison, under suburban and highway regions, the objective is to improve the average fuel cell working efficiency to save H2 consumption. Additionally, during the optimization, the constraints for the DP- and MPC-based strategies are the same. Figure 19a–c shows the DP-optimized results over three driving patterns, with the related distributions of fuel cell power depicted in Fig. 19d. In the urban pattern, the optimal fuel cell working points are located in the range of [1.5, 2.3] kW. The optimal fuel cell working range is [6.0, 7.0] kW in the suburban pattern and [13.5, 15.5] kW in the highway pattern. Thus, the reference fuel cell power (Pref) is set to the corresponding average values, specifically 1.78 kW for urban, 6.80 kW for suburban, and 14.40 kW for highway.
• Tuning Results of Penalty Coefficients. According to the Pref obtained previously, Fig. 20 depicts the tuning results of the MPC penalty coefficients. Please note that the non-tuned MPC uses the initial penalty coefficient setting (e.g., ρ1 = ρ2 = 1, ρ3 = 1000), which intends to keep the battery working in charge-sustaining mode. As shown in Fig. 20a, with the tuned penalty coefficients (red curve), the power transients of the fuel cell in the urban pattern are greatly reduced versus the MPC with non-tuned coefficients (green curve). Likewise, as given in Fig. 20b and c, with tuned coefficients, the fuel cell power transients are limited to a relatively narrow range.
Fig. 19 DP-based optimization results under three driving patterns: (a1)–(a3) DP results under urban scenarios; (b1)–(b3) DP results under suburban scenarios; (c1)–(c3) DP results under highway scenarios; (d) distribution of fuel cell working points under three driving patterns
Compared to the DP benchmark, the performance gaps of the MPC-based strategies are given in Table 9. The acronyms “MPC-N” and “MPC-T” represent MPC with non-tuned and tuned penalty coefficients, respectively, and mequ,H2 is the equivalent hydrogen mass consumption, which takes the final SoC deviation against the initial value (0.7 in our case) into consideration, as calculated by:

$$m_{equ,H_2} = \int_0^t \frac{P_{FC}(\tau)}{\eta_{FCS} \cdot LHV_{H_2}}\, d\tau + \frac{\Delta SoC_N \cdot E_{BAT} \cdot 3600}{\eta_{FC} \cdot LHV_{H_2}} \tag{23}$$

where EBAT is the nominal energy capacity of the battery pack (6.4 kWh), ΔSoCN = SoC0 − SoCN, and ηFC the average fuel cell working efficiency. As depicted in Fig. 20d, with the tuned penalty coefficients, the largest performance deviation on mequ,H2 versus the DP benchmark is merely 0.14%. In addition, as depicted in Fig. 20e, the average fuel cell power transients caused by the MPC-N strategy are 6.259–26.999 times the DP benchmark. In contrast, with the tuned parameters, this ratio drops sharply to 1.008–1.111 times the DP basis. Hence, it is confirmed that the penalty coefficients of the MPC are well tuned, with the optimized parameters listed in Table 9.

Fig. 20 EMS performance comparison before/after MPC penalty factor tuning: (a) urban scenarios; (b) suburban scenarios; (c) highway scenarios; (d) fuel economy discrepancy vs. DP benchmark; (e) fuel cell power dynamics discrepancy vs. DP benchmark
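The correction term in (23) can be sketched in a few lines of Python. The lower heating value of about 120 kJ/g is a rounded textbook figure, the average efficiency of 0.5 in the example is an illustrative assumption, and the function name is hypothetical.

```python
LHV_H2 = 120.0   # kJ/g, approximate lower heating value of hydrogen
E_BAT = 6.4      # kWh, nominal battery energy capacity (Table 8)


def equivalent_h2(m_h2, soc_0, soc_n, eta_fc_avg):
    """Eq. (23): add the hydrogen equivalent of the net battery energy used.

    m_h2 is the directly consumed hydrogen in g (first term of Eq. (23));
    E_BAT * 3600 converts kWh to kJ so that the units match LHV_H2.
    """
    delta_soc = soc_0 - soc_n
    return m_h2 + delta_soc * E_BAT * 3600.0 / (eta_fc_avg * LHV_H2)


# A trip that ends 1% below the initial SoC is charged extra hydrogen,
# while a trip that ends above it is credited.
m_low = equivalent_h2(480.0, soc_0=0.70, soc_n=0.69, eta_fc_avg=0.5)
m_high = equivalent_h2(480.0, soc_0=0.70, soc_n=0.71, eta_fc_avg=0.5)
print(f"{m_low:.2f} g, {m_high:.2f} g")
```

When the final SoC equals the initial one, the correction vanishes and the equivalent consumption equals the direct hydrogen consumption.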
Table 9 MPC performance gaps versus DP benchmark before/after parameter tuning

Pattern          Strategy   SoCN     mequ,H2 (g)   |ΔPfc| (W/s)
Urban (UB)       DP         0.7000   135.10        1.41
                 MPC-T      0.7021   135.20        1.42
                 MPC-N      0.7032   135.71        35.29
Suburban (SUB)   DP         0.7000   418.30        13.32
                 MPC-T      0.7123   418.90        13.43
                 MPC-N      0.7102   422.29        83.39
Highway (HW)     DP         0.7000   1302.50       11.01
                 MPC-T      0.7021   1302.71       12.23
                 MPC-N      0.7131   1313.62       106.42

Opt. MPC parameters
Pattern          (ρ1, ρ2, ρ3)   Pref
Urban (UB)       (1,2100)       1.78 kW
Suburban (SUB)   (1,1,60)       6.80 kW
Highway (HW)     (1,0.2,54)     14.40 kW

Energy Management Strategy Performance Validation
In this subsection, Software-in-the-Loop (SIL) simulation is used as a tool to verify the performance of the proposed multimode energy management strategy.

Description of the Software-in-the-Loop Testing Platform
SIL testing is an important simulation-based tool for initial prototyping before the integration of any actual hardware; it is used to further validate the performance of control strategies [24, 25]. The major task of SIL testing is to verify the behavior of the generated code (functionality) and to give further proof that the generated code can run on the embedded target (embeddability). In this study, a SIL testing platform is set up (see Fig. 21), which allows the proposed strategy to be tested on the dSPACE hardware (MicroAutoBox II 1401/1511 [26]), thereby further verifying its functionality and real-time suitability. The SIL platform comprises hardware and software subsystems, wherein the hardware subsystem includes a DC power supply, a host PC, and a dSPACE MicroAutoBox II real-time system. The software subsystem contains the vehicular powertrain model and the control algorithms (PEMS) developed in the MATLAB/Simulink environment, which are compiled into C code and downloaded into the MicroAutoBox II. Besides, the dSPACE ControlDesk software is installed on the host PC as the human-machine interface (HMI) to calibrate the model parameters and to capture the experimental data during the online simulation. The host PC and the MicroAutoBox II are connected via a network cable through the Ethernet interface, and the data communication between them is enabled by the dSPACE real-time interface (RTI) module. In SIL testing, the proposed control strategy has been successfully executed in real time under three sampling period settings, namely, 1.0 s, 0.5 s, and 0.2 s. This means that the required computational effort is far from reaching the upper limits of the target CPU, so the resulting computation burden is acceptable for online applications. To avoid repetitive illustrations, only the testing results at the sampling period of 1.0 s are presented in the following parts.

Fig. 21 (a) Block diagram and (b) real picture of the SIL testing platform
Description of the Benchmark Energy Management Strategies
Upper benchmark: as expressed by (24), the global optimal result is derived by dynamic programming (DP), which minimizes the H2 consumption over the given driving cycles. The global optimal search can only be completed offline because it requires the full cycle information beforehand:
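$$\min_{\Delta P_{FC} \in \mu_{FC}} \; \sum_{k=0}^{N-1} \left[ \frac{P_{FC}(k) \cdot \Delta T}{\eta_{FCS}(P_{FC}) \cdot LHV_{H_2}} \right]$$

$$\text{subject to} \quad \begin{cases} 0.6 \le SoC(k) \le 0.8 & (a) \\ 0 \le P_{FC}(k) \le 30\ \text{kW} & (b) \\ -1\ \text{kW/s} \le \Delta P_{FC}(k) \le 1\ \text{kW/s} & (c) \\ -50\ \text{kW} \le P_{BAT}(k) \le 100\ \text{kW} & (d) \\ SoC_0 = 0.7, \; P_{FC0} = 0\ \text{W} & (e) \\ SoC_N = 0.7 & (f) \end{cases} \tag{24}$$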
where ΔPFC is deemed the manipulated variable and μFC is the feasible region of ΔPFC in the discretized domain, with a grid resolution of 1 W/s. Constraints (24a)–(24d) respectively indicate the operating boundaries for SoC, PFC, ΔPFC, and PBAT. In addition, (24e) defines the initial status of the SoC and the FC power, and (24f) poses a terminal constraint on the battery SoC.

Lower benchmark: a single-mode MPC-based strategy, wherein the reference fuel cell power is set to the most efficient system working point ($P_{ref} = P_{\eta}^{max}$, as shown in Fig. 12). Furthermore, to deal with unpredictable driving scenarios, the penalty coefficients are set to the initial configuration (the same as the non-tuned MPC controller in subsection “Multiple working modes and parameter design of energy management strategy”, ρ1 = ρ2 = 1, ρ3 = 1000), with the purpose of keeping the battery working in charge-sustaining mode to the utmost extent, so as to ensure the operation safety of the hybrid propulsion system.
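The backward recursion behind the upper benchmark can be illustrated on a deliberately tiny problem. The Python sketch below is not the chapter's DP: it uses the fuel cell power level (not ΔPFC) as the decision, a lossless battery counted in integer energy units, a hypothetical convex fuel cost, and an assumed demand profile. It keeps only two features of (24), namely the state bounds and the terminal constraint, and it checks the DP result against brute-force enumeration.

```python
from itertools import product

DEMAND = [2, 3, 1, 0, 2]       # ASSUMPTION: toy DC bus demand per step (kW)
FC_LEVELS = [0, 1, 2, 3]       # ASSUMPTION: admissible fuel cell powers (kW)
E_MIN, E_MAX, E_0 = 0, 4, 2    # battery energy bounds and initial value (kW*s)
INF = float("inf")


def fuel_rate(p_fc):
    """ASSUMPTION: hypothetical convex hydrogen cost per 1-s step."""
    return p_fc + 0.2 * p_fc ** 2


def dp_solve():
    n = len(DEMAND)
    # value[k][e]: minimal fuel cost from step k onward with stored energy e
    value = [[INF] * (E_MAX + 1) for _ in range(n + 1)]
    policy = [[None] * (E_MAX + 1) for _ in range(n)]
    value[n][E_0] = 0.0                         # terminal constraint e_N = e_0
    for k in range(n - 1, -1, -1):              # backward recursion
        for e in range(E_MIN, E_MAX + 1):
            for p_fc in FC_LEVELS:
                e_next = e - (DEMAND[k] - p_fc)  # battery covers the remainder
                if E_MIN <= e_next <= E_MAX:
                    c = fuel_rate(p_fc) + value[k + 1][e_next]
                    if c < value[k][e]:
                        value[k][e], policy[k][e] = c, p_fc
    # Forward pass: recover the optimal fuel cell power sequence.
    e, seq = E_0, []
    for k in range(n):
        p_fc = policy[k][e]
        seq.append(p_fc)
        e -= DEMAND[k] - p_fc
    return value[0][E_0], seq


def brute_force():
    """Reference solution by enumerating every feasible power sequence."""
    best = INF
    for seq in product(FC_LEVELS, repeat=len(DEMAND)):
        e, feasible = E_0, True
        for d, p in zip(DEMAND, seq):
            e -= d - p
            if not E_MIN <= e <= E_MAX:
                feasible = False
                break
        if feasible and e == E_0:
            best = min(best, sum(fuel_rate(p) for p in seq))
    return best


dp_cost, dp_seq = dp_solve()
print(dp_cost, dp_seq)
```

The convex fuel cost makes the optimum spread the fuel cell power as evenly as the battery bounds allow, which mirrors the few transient loadings reported for DP in the chapter.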
Evaluation Results on Multi-Pattern Testing Cycles
Five combined testing cycles are adopted for performance validation, with the evaluation results over testing cycle I detailed in Fig. 22. As depicted in Fig. 22a, testing cycle I comprises urban, suburban, and highway scenarios, with the real driving pattern marked by a black solid line and the pattern recognition result displayed as a red dashed line. On this testing cycle, 97.05% DPR precision is attained, with the errors largely caused by the latency in the pattern-shifting stages. Figure 22b shows the SoC trajectories of the three strategies, where, in urban scenarios, DP charges the battery to respond to the peak power requests in the upcoming suburban and highway scenarios. The multimode MPC strictly restricts the battery SoC to vary around 0.7 in urban scenarios, whereas the battery power is used in a more flexible way in other scenarios. Additionally, the single-mode MPC maintains the SoC firmly at 0.7 over the entire trip. As shown in Fig. 22c, DP uses the fuel cell power at different levels in multiple driving patterns with few transient loadings. Likewise, the multimode MPC keeps the fuel cell operating stably at different reference points. In contrast, the single-mode MPC leads to more fuel cell transient loadings and start-stop cycling. Figure 22d shows how DPR errors affect the output power of the fuel cell. The outcome of a multimode MPC assisted by real driving pattern information (100% precision) is plotted as a black dashed line. For the multimode MPC with pattern recognition results (red curve), the fuel cell power-switching latency can be seen at each pattern-shifting moment. The deviation between the two in fuel cell power is negligible if the external driving conditions are stable.

Fig. 22 Evaluation results over testing cycle I: (a) speed and driving pattern information; (b) battery SoC trajectories; (c) fuel cell power comparison; (d) influences of pattern recognition errors on fuel cell power

Table 10 lists the numerical results on five testing cycles. The acronym “MPC-S” represents the single-mode MPC, while “MPC-R” and “MPC-M” respectively
stand for the multimode MPC assisted by the real driving pattern and by the pattern recognition results. As the upper benchmark, DP leads to the least H2 consumption mequ,H2 and the smallest fuel cell power transients |ΔPFC|. Compared with the MPC-S strategy on the five testing cycles, the MPC-M strategy reduces mequ,H2 by 2.07%–3.26% and |ΔPFC| by 87.75%–88.98%. This implies improved fuel efficiency and a decreased risk of fuel cell degradation caused by harsh transient loadings. In addition, comparing the outcomes of the MPC-R and MPC-M strategies shows that the DPR errors can increase mequ,H2 by 0.06%–1.30%. In conclusion, in contrast to the single-mode strategy, the presented multimode strategy leads to (i) at least 87.00% reduction of the fuel cell power transients and (ii) over 2.07% reduction of the H2 consumption. Hence, under changeable driving scenarios, the FCS’s operating and maintenance costs could be substantially reduced via the proposed strategy. This should be deemed the major benefit concerning the actual implementation of the devised multimode energy management strategy.
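The reported reduction ranges follow directly from the Table 10 entries, as the short Python check below shows. The data are the mequ,H2 and |ΔPfc| values of the MPC-M and MPC-S strategies on cycles I to V; the function name is illustrative.

```python
# Table 10 values for combined cycles I..V.
M_EQU = {
    "MPC-M": [486.02, 561.85, 501.63, 539.66, 460.73],   # g
    "MPC-S": [501.72, 575.03, 512.25, 553.18, 476.25],   # g
}
DP_FC = {
    "MPC-M": [9.99, 9.63, 10.59, 8.95, 10.53],           # W/s
    "MPC-S": [89.71, 87.40, 86.48, 79.27, 93.09],        # W/s
}


def reductions(candidate, baseline):
    """Relative reduction in percent of the candidate versus the baseline."""
    return [100.0 * (b - c) / b for c, b in zip(candidate, baseline)]


h2_red = reductions(M_EQU["MPC-M"], M_EQU["MPC-S"])
dp_red = reductions(DP_FC["MPC-M"], DP_FC["MPC-S"])

print(f"m_equ reduction: {min(h2_red):.2f}% to {max(h2_red):.2f}%")
print(f"|dP_fc| reduction: {min(dp_red):.2f}% to {max(dp_red):.2f}%")
```

The smallest hydrogen saving occurs on cycle III and the largest on cycle V, while the transient reduction stays within a narrow band on all five cycles.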
Table 10 Numerical control performance on five combined testing cycles

Cycle and road information                     Metric          DP        MPC-R     MPC-M     MPC-S
Combined cycle I (CYC_I)                       SoCN            0.7000    0.6998    0.6844    0.7010
Type: “UB + SUB + HW + UB”                     mH2 (g)         474.30    479.21    480.50    502.10
DPR accuracy = 97.05%                          mequ,H2 (g)     474.30    479.50    486.02    501.72
                                               |ΔPfc| (W/s)    9.07      9.87      9.99      89.71
Combined cycle II (CYC_II)                     SoCN            0.7000    0.7149    0.7133    0.7030
Type: “UB + SUB + HW + SUB”                    mH2 (g)         552.10    566.10    566.51    576.1
DPR accuracy = 96.26%                          mequ,H2 (g)     552.10    560.84    561.85    575.03
                                               |ΔPfc| (W/s)    8.89      9.58      9.63      87.40
Combined cycle III (CYC_III)                   SoCN            0.7000    0.7067    0.7086    0.7012
Type: “UB + SUB + HW + SUB + UB”               mH2 (g)         488.90    503.7     504.60    512.70
DPR accuracy = 96.24%                          mequ,H2 (g)     488.90    501.34    501.63    512.25
                                               |ΔPfc| (W/s)    9.85      10.03     10.59     86.48
Combined cycle IV (CYC_IV)                     SoCN            0.7000    0.7055    0.7066    0.7012
Type: “UB + SUB + HW + UB”                     mH2 (g)         527.02    541.10    542.05    553.62
DPR accuracy = 94.95%                          mequ,H2 (g)     527.02    539.14    539.66    553.18
                                               |ΔPfc| (W/s)    8.27      8.83      8.95      79.27
Combined cycle V (CYC_V)                       SoCN            0.7000    0.6956    0.6966    0.7011
Type: “UB + SUB + HW + UB”                     mH2 (g)         450.40    458.50    459.50    476.60
DPR accuracy = 96.61%                          mequ,H2 (g)     450.40    460.08    460.73    476.25
                                               |ΔPfc| (W/s)    9.89      10.41     10.53     93.09

For DP, mequ,H2 coincides with mH2 because the final SoC equals the initial value, so the correction term in (23) vanishes.

5 Conclusion

This chapter develops a predictive energy management strategy (PEMS) for fuel cell hybrid electric vehicles (FCHEVs). Compared to traditional control strategies, the proposed one concentrates on combining driving predictive information with a real-time optimization framework, so as to further improve the vehicle’s economic and durability performance. To this end, two driving prediction techniques are first designed: (i) a layer recurrent neural network speed predictor is proposed to estimate the vehicle’s upcoming speed profiles over each receding horizon; (ii) a Markov driving pattern recognizer is devised to differentiate real-time driving patterns, which establishes a solid basis for the realization of the multimode energy management framework. On this basis, combining the driving prediction techniques with the model predictive control (MPC) framework, a multimode PEMS is developed for a midsize sedan powered by a fuel cell and a battery, aiming at splitting the power demand under changeable driving patterns. In order to verify the effectiveness of the proposed strategy, a Software-in-the-Loop (SIL) platform is established based on the dSPACE MicroAutoBox II real-time system. Validation results show that the proposed control strategy can be properly embedded into and correctly executed on the target hardware with the predefined objectives achieved, thus verifying the EMS’s functionality and real-time suitability. This also supports the possibility of integrating the proposed strategy into onboard electronic control units for real implementations. It should be mentioned that the proposed predictive energy management framework can also adapt to changes of vehicle models or
mission profiles. For example, with the help of an online learning enhanced Markov speed predictor [27] and an adaptive battery SoC reference generator, a PEMS has been designed for a midsize plug-in FCHEV, which can optimally deplete battery energy under multiple driving patterns for better fuel economy [28]. Moreover, combine the fuzzy C-means clustering enhanced Markov predictor and the battery energy planning approach, an integrated PEMS is devised for a light-duty plugin FCHEV dedicated to postal delivery [29], where the GPS-collected real-world driving data in urban routes has been used to verify the effectiveness of the proposed strategy. To sum up, the devised predictive energy management framework has good versatility, making it capable of adapting to multiple application scenarios. Despite the progresses regarding the energy management strategies for fuel cell electric vehicles in this chapter, further intensive studies should be conducted to improve the energy distribution performance. Specifically, future works would concentrate on the following perspectives: • This chapter only focuses on retarding the fuel cell degradation imposed by harsh power transients, whereas other factors that may compromise the durability of fuel cell systems are not considered, such as working at extremely high/low loadings, frequent start-stop cycling, etc. In future works, it is expected to systematically consider these degrading factors by quantifying them within the cost function when making power-allocating decisions [30]. • Powertrain component sizing plays an important role in the vehicle’s drivability and economic performance. In future works, a co-optimization framework for fuel cell hybrid electric vehicles considering the component degradations will be developed, which can simultaneously optimize the sizing parameters and the vehicle’s total ownership cost given the desired driving profiles. 
• Due to the abundant historical driving database of the postal delivery vehicles, the past driving experience is useful in guiding future energy distributions. Therefore, it is expected in future works to develop a data-driven approach (e.g., deep neural networks) to plan the future usage of onboard electricity energy for further improving the fuel economy performance when charge-depleting mode is involved.
References

1. T. Wang, Q. Li, X. Wang, Y. Qiu, M. Liu, X. Meng, J. Li, W. Chen, An optimized energy management strategy for fuel cell hybrid power system based on maximum efficiency range identification. J. Power Sources 445, 227333 (2020)
2. Z. Hua, Z. Zheng, M.-C. Péra, F. Gao, Remaining useful life prediction of PEMFC systems based on the multi-input echo state network. Appl. Energy 265, 114791 (2020)
3. Y. Zhou, A. Ravey, M.-C. Péra, A survey on driving prediction techniques for predictive energy management of plug-in hybrid electric vehicles. J. Power Sources 412, 480–495 (2019)
4. D. Tran, M. Vafaeipour, M.E. Baghdadi, R. Barrero, J.V. Mierlo, O. Hegazy, Thorough state-of-the-art analysis of electric and hybrid vehicle powertrains: Topologies and integrated energy management strategies. Renew. Sustain. Energy Rev. 119, 109596 (2020)
5. A. Ravey, B. Blunier, A. Miraoui, Control strategies for fuel-cell-based hybrid electric vehicles: From offline to online and experimental results. IEEE T. Veh. Technol. 61(6), 2452–2457 (2012)
6. Y. Huang, H. Wang, A. Khajepour, H. He, J. Ji, Model predictive control power management strategies for HEVs: A review. J. Power Sources 341, 91–106 (2017)
7. A. Graves, M. Liwicki, S. Fernández, R. Bertolami, H. Bunke, J. Schmidhuber, A novel connectionist system for unconstrained handwriting recognition. IEEE T. Pattern. Anal. 31(5), 855–868 (May 2009)
8. K. Liu, Z. Asher, X. Gong, M. Huang, I. Kolmanovsky, Vehicle velocity prediction and energy management strategy Part 1: Deterministic and stochastic vehicle velocity prediction using machine learning. SAE Technical Paper, 2019-01-1051 (2019)
9. C. Sun, X. Hu, S.J. Moura, F. Sun, Velocity predictors for predictive energy management in hybrid electric vehicles. IEEE T. Contr. Syst. T. 23(3), 1197–1204 (May 2015)
10. ADVISOR Advanced Vehicle Simulator. http://adv-vehicle-sim.sourceforge.net/
11. Y. Zhou, A. Ravey, M.-C. Péra, Multi-mode predictive energy management for fuel cell hybrid electric vehicles using Markov driving pattern recognizer. Appl. Energy 258, 114057 (2020)
12. D.P. Filev, I. Kolmanovsky, Generalized Markov models for real-time modeling of continuous systems. IEEE T. Fuzzy Syst. 22(4), 983–998 (Aug. 2014)
13. X. Huang, Y. Tan, X. He, An intelligent multifeature statistical approach for the discrimination of driving conditions of a hybrid electric vehicle. IEEE T. Intell. Transp. 12(2) (Jun. 2011)
14. R. Zhang, J. Tao, H. Zhou, Fuzzy optimal energy management for fuel cell and supercapacitor systems using neural network based driving pattern recognition. IEEE T. Fuzzy Syst. 27(1) (Jan. 2019)
15. Z. Chen, L. Li, B. Yan, C. Yang, C.M. Martinez, D. Cao, Multimode energy management for plug-in hybrid electric buses based on driving cycles prediction. IEEE T. Intell. Transp. 17(10) (Oct. 2016)
16. Q. Zhang, W. Deng, G. Li, Stochastic control of predictive power management for battery/supercapacitor hybrid energy storage systems of electric vehicles. IEEE T. Ind. Inform. 14(7) (Jul. 2018)
17. A. Ravey, N. Watrin, B. Blunier, D. Bouquain, A. Miraoui, Energy-source-sizing methodology for hybrid fuel cell vehicles based on statistical description of driving cycles. IEEE T. Veh. Technol. 60(9) (Nov. 2011)
18. L. Guzzella, A. Sciarretta, Vehicle propulsion systems: Introduction to modeling and optimization (Springer-Verlag, Berlin, 2005)
19. M.C. Péra, D. Hissel, H. Gualous, C. Turpin, Electrochemical components (John Wiley & Sons, Inc, 2013)
20. V.H. Johnson, Battery performance models in ADVISOR. J. Power Sources 110(2), 321–329 (2002)
21. Y. Zhou, Predictive energy management for fuel cell hybrid electric vehicle. Other. Université Bourgogne Franche-Comté, 2020. English. NNT: 2020UBFCA020. tel-03080574
22. C.H. Zheng, G.Q. Xu, Y.I. Park, W.S. Lim, S.W. Cha, Prolonging fuel cell stack lifetime based on Pontryagin's Minimum Principle in fuel cell hybrid vehicles and its economic influence evaluation. J. Power Sources 248 (2014)
23. H.J. Ferreau et al., qpOASES User's Manual, Version 3.2 [Online], Apr. 2017. Available at: https://github.com/coin-or/qpOASES/blob/master/doc/manual.pdf
24. C. Liu, H. Bai, S. Zhuo, X. Zhang, R. Ma, F. Gao, Real-time simulation of power electronic systems based on predictive behavior. IEEE T. Ind. Electron. 67(9), 8044–8053 (Sept. 2020)
25. H. Bai, C. Liu, R. Ma, D. Paire, F. Gao, Device-level modelling and FPGA-based real-time simulation of the power electronic system in fuel cell electric vehicle. IET Power Electron. 12(13), 3479–3487 (2019)
26. MicroAutoBox II: Compact and robust prototyping system for in-vehicle applications [Online]. Available at: https://www.dspace.com/en/pub/home/products/hw/micautob/microautobox2.cfm
27. Y. Zhou, A. Ravey, M.-C. Péra, A velocity prediction method based on self-learning multi-step Markov Chain (45th Annual Conference of the IEEE Industrial Electronics Society, Lisbon, Portugal, 2019), pp. 2598–2603
28. Y. Zhou, A. Ravey, M.-C. Péra, Multi-objective energy management for fuel cell electric vehicles using online-learning enhanced Markov speed predictor. Energ. Conver. Manage. 213, 112821 (2020)
29. Y. Zhou, H. Li, A. Ravey, M.-C. Péra, An integrated predictive energy management for light-duty range-extended plug-in fuel cell electric vehicle. J. Power Sources 451, 227780 (2020)
30. Y. Zhou, A. Ravey, M.-C. Péra, Real-time cost-minimization power-allocating strategy via model predictive control for fuel cell hybrid electric vehicles. Energ. Conver. Manage. 229, 113721 (2021)
Plug-in Hybrid Electric Buses with Different Battery Chemistries Total Cost of Ownership Planning and Optimization at Fleet Level Based on Battery Aging

Jon Ander López-Ibarra, Haizea Gaztañaga, Josu Olmos, Andoni Saez-de-Ibarra, and Haritza Camblong
J. A. López-Ibarra: Engineering Department, Jema Energy, Lasarte-Oria, Spain
H. Gaztañaga · J. Olmos · A. Saez-de-Ibarra: Energy Storage and Management, IKERLAN Technology Research Centre, BRTA, Arrasate, Mondragón, Spain
H. Camblong: Department of Systems Engineering and Control, University of the Basque Country (UPV/EHU), Faculty of Engineering of Gipuzkoa, Plaza de Europa and Univ. Bordeaux, Estia Institute of Technology, Bidart, France
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_2

1 Introduction

In the past few years, clear evidence of climate change has emerged, such as the rise in global mean temperature. The rising awareness of citizens has led to new measures and a mind shift in road transport toward more sustainable solutions. The transport decarbonization challenge is seen to be spearheaded by public transport [1]. Among the possible technological solutions, hybrid electric buses (HEBs) are the most mature technology, with costs closest to those of conventional diesel buses; their environmental footprint, however, is similar to that of conventional buses [2]. Larger battery (BT) capacities and the additional degree of flexibility offered by charging allow plug-in hybrid electric buses (PHEBs) to cover longer distances in zero-emission zones [3]. The lower manufacturing cost of PHEBs compared with battery electric buses has positioned them as the main competitor [4–6]. The initial investment cost of a PHEB is higher than that of a conventional diesel bus [7–10]. Despite this fact, the long distances driven yearly and the lower total cost of ownership (TCO) [9, 11–14] make PHEBs an attractive solution for the upcoming integration of zero-emission buses. Therefore, the TCO, as an economic performance indicator, plays a key role in PHEB integration [9]. The TCO calculation and the energy management strategy (EMS) design are commonly carried out under fixed conditions. These conditions, however, vary throughout the bus lifetime and generate uncertainties in the TCO, as evidenced in the literature [9, 15].

The continuous implementation of telematics and the development of new smart devices have resulted in a new information source. Traditionally, telematics have been used for vehicle positioning. However, smart devices now make it possible to acquire, store, compute, and analyze data in the cloud, a process known as digitalization [16, 17]. The correct exploitation of the acquired data will allow vehicle operation to be monitored, reducing the uncertainties of the TCO [18]. Fleet operation information, together with the required advance knowledge of the BT, will allow the BT lifetime to be managed, going a step further in the EMS. In this regard, new techniques for managing BT aging are needed, because BT replacements are directly related to the TCO [19]. Some studies have pointed out the influence of the BT price on the total bus price, reaching values of 39% [20]. Since BTs also have a shorter lifetime than power electronic systems, the BT system is identified as a bottleneck in the lifetime of the bus.

Lithium-ion batteries are classified as rechargeable or secondary batteries. In lithium-ion batteries, electricity is stored in chemical form and is produced through an electrochemical reaction. These batteries belong to the electrochemical energy storage systems (ESSs). Lithium-ion batteries are basically composed of two electrodes (anode and cathode), an electrolyte, a separator, and a case [21]. Most vehicle applications require high energy densities. As a result, researchers have explored new combinations of the primary functional components (cathode, anode, and electrolyte) in order to obtain higher energy densities. "Lithium-ion batteries" is therefore an umbrella term for a variety of material combinations used to form batteries [22]. The anode is mostly composed of graphite carbon (C), which allows higher potentials to be reached. However, for applications that require longer lifetimes and higher safety levels, lithium titanate (LTO) is used instead of C anodes [22, 23]. The key elements for reaching higher energy densities are the electrode materials: the cathode (positive active material) and the anode (negative active material). Lithium-ion batteries are therefore classified according to their cathode and anode materials. The cathode materials usually contain manganese (LMO), cobalt (LCO), iron phosphate (LFP), or mixtures such as LiNiMnCo (NMC) and LiNiCoAlO2 (NCA). This classification is shown in Table 1, with an overview of the main characteristics.

In this chapter, the hierarchical energy management strategy design methodology for TCO management at fleet level presented in [24] is used to study two different fleets, with LTO and NMC BT chemistries. The fleet is reorganized, and the online operation energy management strategy is updated throughout the bus lifetime. These decisions are made based on the evaluated BT lifetime of the fleet, so as to meet the planned TCO requirements.
PHEBs with Different BT Chemistries TCO Planning and Optimization at Fleet Level
Table 1 Different lithium-ion chemistries

| Cathode | Anode | Optimized for | Nominal voltage [V] | Specific energy [Wh/kg] | Specific power [W/kg] | Safety risk |
|---|---|---|---|---|---|---|
| LCO | C | Energy | 3.6 | 175–240 | 1000 | Highest |
| LMO | C | Power | 3.7 | 100–150 | 4000 | Moderate |
| NCA | C | Energy | 3.65 | 175–240 | 1000 | Serious |
| LFP | C | Power | 3.2–3.3 | 60–120 | 4000 | Unreactive |
| NMC | C | Energy | 3.6–3.7 | 150–220 | 1000 | Moderate |
| LMO | LTO | Cycle life | 2.3 | 70–75 | 4000 | Moderate |
2 Case Study

The analysis in this chapter is based on a fleet of ten buses running on the routes shown in Fig. 1. The urban routes have been generated from a database of standardized driving cycles and adapted to bus driving behavior [25]. Each driving profile represents one round-trip, whose length is shown next to the route reference. The daily operation of each bus is considered to be 16 h; this duration varies so that the bus completes its route before finishing the day. Two buses have been modeled in this chapter: an LTO-based PHEB and an NMC-based PHEB. The component models are described in Sect. 3. The same traction element, the electric motor (EM), is used in both models. Regarding the power sources, the powertrain elements have been designed according to commercial models [24, 26, 27]. The LTO BT pack has been chosen for bus model 1 to evaluate a power-oriented BT pack. This BT chemistry has a longer lifetime than the NMC chemistry, as shown in Fig. 10. Its technical characteristics are given in Table 2. To study an energy-oriented BT pack, bus model 2 has been modeled with an NMC BT; its technical characteristics are given in Table 3.
Fig. 1 Speed profiles

3 Bus Model

The electrical model of the powertrain elements has been developed for the quasi-static simulation method. This is a noncausal, discrete method in which the signals flow one way, from the drive cycle through the powertrain elements [28]. The formulation is therefore based on the backward or "effect-cause" approach: the power is calculated at each discrete step from a predefined speed profile, going upstream through the vehicle components [29]. To standardize the power flow direction, power is taken as positive when there is an electrical power demand or mechanical traction and as negative when there is electrical power absorption or mechanical braking. The powertrain configuration is shown in Fig. 2, and each powertrain element is described below.
Table 2 Bus model 1: LTO BT-based PHEB

| Elements | Parameters | Units |
|---|---|---|
| Electric motor power | 196.5 | kW |
| Auxiliary demand minimum/maximum | 12/18 | kW |
| Genset power | 160 | kW |
| Battery pack: chemistry | LTO | – |
| Battery pack: series cells | 260 | – |
| Battery pack: cell branches | 2 | – |
| Battery pack: mass | 266 | kg |
| Battery pack: power charge/discharge | 168/192 | kW |
| Battery pack: energy | 23.92 | kWh |

Table 3 Bus model 2: NMC BT-based PHEB

| Elements | Parameters | Units |
|---|---|---|
| Electric motor power | 196.5 | kW |
| Auxiliary demand minimum/maximum | 12/18 | kW |
| Genset power | 160 | kW |
| Battery pack: chemistry | NMC | – |
| Battery pack: series cells | 162 | – |
| Battery pack: cell branches | 1 | – |
| Battery pack: mass | 166 | kg |
| Battery pack: power charge/discharge | 96/192 | kW |
| Battery pack: energy | 23.98 | kWh |

Fig. 2 PHEB powertrain scheme

3.1 Bus Dynamics

In the quasi-static simulation, the inputs to the vehicle model are the speed $v_{cyc}(k)$ [m/s], the acceleration $a_{cyc}(k)$ [m/s²], and the slope angle $\alpha(k)$ [°] of the predefined route [29]. From these speed profiles, the backward simulation is applied, starting from the calculation of the force acting on the wheels ($F_T$), which at each discrete state $k$ is defined as follows [29, 30]:

\[ F_T(k) = F_a(k) + F_g(k) + F_i(k) + F_r(k) \tag{1} \]
where $F_a(k)$ [N] is the aerodynamic force, $F_g(k)$ [N] the gravitational force, $F_i(k)$ [N] the inertial force, and $F_r(k)$ [N] the rolling resistance force (depicted in Fig. 3). At each discrete state $k$, they are defined as follows:

\[ F_a(k) = 0.5 \cdot \rho_{air} \cdot A_f \cdot c_x \cdot v_{cyc}^2(k) \tag{2} \]

\[ F_g(k) = m_{tot} \cdot g \cdot \sin(\alpha(k)) \tag{3} \]

\[ F_i(k) = m_{tot} \cdot a_{cyc}(k) \tag{4} \]

\[ F_r(k) = c_{rf} \cdot m_{tot} \cdot g \cdot \cos(\alpha(k)) \tag{5} \]

where $\rho_{air}$ [kg/m³] is the density of air, $A_f$ [m²] the frontal area of the vehicle, $c_x$ [-] the aerodynamic drag coefficient, $m_{tot}$ [kg] the total vehicle mass, $g$ [m/s²] the gravitational acceleration, and $c_{rf}$ [-] the road rolling coefficient [30]. These constants are summarized in Table 4.

Fig. 3 Forces acting on the bus during driving

Table 4 Bus model characteristics

| Parameter | Symbol | Value | Unit |
|---|---|---|---|
| Air density | $\rho_{air}$ | 1.1 | kg/m³ |
| Curb weight | $m_{veh}$ | 12,500 | kg |
| Drag coefficient | $c_x$ | 0.8 | − |
| Frontal area | $A_f$ | 8.67 | m² |
| Gravity constant | $g$ | 9.81 | m/s² |
| Rolling coefficient | $c_{rf}$ | 0.008 | − |
| Wheel radius | $r_{wh}$ | 0.487 | m |

The total mass of the vehicle is defined as follows:

\[ m_{tot} = m_{veh} + m_{BT} + m_{pass} \cdot n_{pass} \tag{6} \]

where $m_{veh}$ [kg] is the empty bus weight, $m_{BT}$ [kg] is the BT weight, $m_{pass}$ [kg] is the average weight per person (assumed to be 75 kg), and $n_{pass}$ is the number of passengers (the reference case considers 35 passengers out of a maximum of 70) [30]. The outputs of the bus dynamic model are the wheel rotational speed $w_{wh}(k)$ [rad/s], the angular acceleration $dw_{wh}(k)$ [rad/s²], and the required torque at the wheel $T_{wh}(k)$ [Nm], calculated as follows:
\[ w_{wh}(k) = \frac{v_{cyc}(k)}{r_{wh}} \tag{7} \]

\[ dw_{wh}(k) = \frac{a_{cyc}(k)}{r_{wh}} \tag{8} \]

\[ T_{wh}(k) = F_T(k) \cdot r_{wh} \tag{9} \]

where $v_{cyc}(k)$ [m/s] is the cycle speed, $a_{cyc}(k)$ [m/s²] is the cycle acceleration, and $r_{wh}$ [m] is the wheel radius.
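The backward calculation of Eqs. 1–9 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the constants are taken from Table 4, and the operating point (10 m/s, 0.5 m/s², flat road) is an arbitrary example.

```python
import math

# Constants from Table 4.
RHO_AIR = 1.1     # air density [kg/m^3]
A_F = 8.67        # frontal area [m^2]
C_X = 0.8         # drag coefficient [-]
G = 9.81          # gravitational acceleration [m/s^2]
C_RF = 0.008      # rolling coefficient [-]
R_WH = 0.487      # wheel radius [m]
M_VEH = 12_500.0  # empty bus weight [kg]
M_PASS = 75.0     # average weight per passenger [kg]


def total_mass(m_bt, n_pass=35):
    """Eq. 6: total vehicle mass for a given BT mass and passenger count."""
    return M_VEH + m_bt + M_PASS * n_pass


def wheel_demand(v_cyc, a_cyc, alpha_deg, m_tot):
    """Eqs. 1-5 and 7-9: wheel force, speed, acceleration and torque at one step."""
    alpha = math.radians(alpha_deg)
    f_a = 0.5 * RHO_AIR * A_F * C_X * v_cyc ** 2  # aerodynamic force (Eq. 2)
    f_g = m_tot * G * math.sin(alpha)             # gravitational force (Eq. 3)
    f_i = m_tot * a_cyc                           # inertial force (Eq. 4)
    f_r = C_RF * m_tot * G * math.cos(alpha)      # rolling resistance (Eq. 5)
    f_t = f_a + f_g + f_i + f_r                   # total force (Eq. 1)
    return {
        "F_T": f_t,
        "w_wh": v_cyc / R_WH,   # Eq. 7
        "dw_wh": a_cyc / R_WH,  # Eq. 8
        "T_wh": f_t * R_WH,     # Eq. 9
    }


# Illustrative operating point with the LTO pack mass of Table 2 (266 kg).
m_tot = total_mass(m_bt=266.0)
demand = wheel_demand(v_cyc=10.0, a_cyc=0.5, alpha_deg=0.0, m_tot=m_tot)
print(m_tot, demand)
```

At this operating point the inertial term dominates the wheel force, which is typical of urban stop-and-go profiles.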
3.2 Transmission Model

The transmission consists of the elements placed between the motor and the drive wheel axle. In the PHEB configuration, traction is provided by the EM. Since the EM operates flexibly over a wider range of rotational speeds than the internal combustion engine (ICE), there is no need for a gearbox. The EM is connected to the transmission through a final drive, whose ratio converts one rotational speed into another so as to make the most of the EM efficiency [29]. The inputs are the outputs of the dynamic model, $w_{wh}(k)$, $dw_{wh}(k)$, and $T_{wh}(k)$. As a result, the rotational speed of the driveshaft $w_{drsft}(k)$ [rad/s], the acceleration of the driveshaft $dw_{drsft}(k)$ [rad/s²], and the required torque at the driveshaft $T_{drsft}(k)$ [Nm] are recalculated as follows:

\[ w_{drsft}(k) = w_{wh}(k) \cdot \gamma(k) \tag{10} \]

\[ dw_{drsft}(k) = dw_{wh}(k) \cdot \gamma(k) \tag{11} \]

\[ T_{drsft}(k) = \frac{T_{wh}(k)^{[+]}}{\gamma(k) \cdot \eta} + \frac{T_{wh}(k)^{[-]} \cdot \eta}{\gamma(k)} \tag{12} \]

where $\gamma$ is the final drive ratio and $\eta$ the efficiency of the transmission model [29]; the superscripts $[+]$ and $[-]$ denote the positive (traction) and negative (braking) parts of the wheel torque. As mentioned above, traction is provided only by the EM, with torque $T_{EM}(k)$ [Nm]. In the PHEB, the EM power supply is divided between the BT and a genset (GS), which together have to cover the total power demand $P_{dem}(k)$ [W]:

\[ T_{EM}(k) = T_{drsft}(k) + dw_{drsft}(k) \cdot J_{EM} \tag{13} \]

\[ P_{dem}(k) = w_{drsft}(k) \cdot T_{EM}(k) \tag{14} \]
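A compact sketch of Eqs. 10–14 is given below. It is illustrative only: the function name is hypothetical, and the final drive ratio, transmission efficiency, and EM inertia are passed as arguments because the chapter does not list their values; the numbers in the usage lines are placeholders, not data from the source.

```python
def driveshaft_demand(w_wh, dw_wh, t_wh, gamma, eta, j_em):
    """Map wheel-side quantities to the driveshaft and EM (Eqs. 10-14).

    gamma: final drive ratio [-], eta: transmission efficiency (0-1),
    j_em: EM inertia [kg m^2]; all three are caller-supplied assumptions.
    """
    w_drsft = w_wh * gamma    # Eq. 10
    dw_drsft = dw_wh * gamma  # Eq. 11
    t_pos = max(t_wh, 0.0)    # traction part of the wheel torque
    t_neg = min(t_wh, 0.0)    # braking part of the wheel torque
    t_drsft = t_pos / (gamma * eta) + t_neg * eta / gamma  # Eq. 12
    t_em = t_drsft + dw_drsft * j_em                       # Eq. 13
    p_dem = w_drsft * t_em                                 # Eq. 14
    return w_drsft, dw_drsft, t_drsft, t_em, p_dem


# Placeholder values: gamma = 5, eta = 0.9, no acceleration.
traction = driveshaft_demand(20.0, 0.0, 1000.0, gamma=5.0, eta=0.9, j_em=0.5)
braking = driveshaft_demand(20.0, 0.0, -1000.0, gamma=5.0, eta=0.9, j_em=0.5)
print(traction[4], braking[4])
```

Splitting the torque into its positive and negative parts makes the losses act in the physically expected direction: the EM has to deliver more than the wheel requires in traction, and recovers less than the wheel provides in braking.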
3.3 Split Factor

The information obtained from the transmission model is used to set the required tractive demand in the backward model. Each vehicle has to satisfy this demand by combining its power sources as efficiently as possible, both energetically and economically. The share of each power source is determined by the split factor $U(k)$. As mentioned above, the PHEB configuration is driven only by the EM. In this case, the split factor acts on the power demand $P_{dem}(k)$ [W], since the series configuration is electrically coupled through the DC bus. In the PHEB, the demand is divided between the BT power $P_{BT}(k)$ [W] and the GS power $P_{GS}(k)$ [W], as represented in Eq. 15. The GS is composed of an ICE (speed controlled) and an electric generator (torque controlled):

\[ P_{dem}(k) = \begin{cases} P_{GS}(k) = P_{dem}(k) \cdot (1 - U(k)) \\ P_{BT}(k) = P_{dem}(k) \cdot U(k) \end{cases} \tag{15} \]
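Eq. 15 translates directly into code. The short sketch below uses a hypothetical function name and an arbitrary 100 kW demand to show how the two power targets move as the split factor changes; it makes no assumption about how the EMS chooses $U(k)$.

```python
def split_power(p_dem, u):
    """Eq. 15: divide the power demand between the BT and the GS."""
    p_bt = p_dem * u          # share covered by the battery
    p_gs = p_dem * (1.0 - u)  # share covered by the genset
    return p_bt, p_gs


# Illustrative sweep for a 100 kW demand.
for u in (0.0, 0.3, 1.0):
    p_bt, p_gs = split_power(100e3, u)
    print(f"U = {u:.1f}: P_BT = {p_bt / 1e3:.1f} kW, P_GS = {p_gs / 1e3:.1f} kW")
```

By construction the two targets always add up to the demand, so the split factor is the single degree of freedom handed to the EMS at each step.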
3.4 Electric Motor

The PHEB powertrain configurations use the EM for traction purposes. The efficiency of the EM $\eta_{EM}(k)$ [%] is obtained from $w_{drsft}(k)$ and $T_{EM}(k)$, based on the efficiency map shown in Fig. 4. The EM model output is the required electric power $P_{EM}(k)$ [kW], defined as follows [30].

Fig. 4 EM efficiency map [26]

When $w_{drsft}(k) > 0$ and $T_{EM}(k) > 0$ (traction mode),

\[ P_{EM}(k) = \frac{w_{drsft}(k) \cdot T_{EM}(k)}{10^3 \cdot \eta_{EM}(k)(w_{drsft}(k), T_{EM}(k))} \tag{16} \]

When $w_{drsft}(k) > 0$ and $T_{EM}(k) < 0$ (regenerative mode),

\[ P_{EM}(k) = \frac{w_{drsft}(k) \cdot T_{EM}(k) \cdot \eta_{EM}(k)(w_{drsft}(k), T_{EM}(k))}{10^3} \tag{17} \]
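The two operating modes of Eqs. 16 and 17 can be illustrated as follows. The sketch assumes the efficiency is supplied as a fraction between 0 and 1 by a caller-provided function; the constant 0.9 used here is a placeholder for the efficiency map of Fig. 4, not data from the chapter, and the function names are hypothetical.

```python
def em_electric_power_kw(w_drsft, t_em, efficiency):
    """Required electric power of the EM in kW (Eqs. 16 and 17).

    `efficiency` is a callable (w, T) -> value in (0, 1]; it stands in for
    the efficiency map lookup and is an assumption of this sketch.
    """
    eta = efficiency(w_drsft, t_em)
    mech_kw = w_drsft * t_em / 1e3  # mechanical power [kW]
    if t_em >= 0.0:
        return mech_kw / eta        # traction mode (Eq. 16)
    return mech_kw * eta            # regenerative mode (Eq. 17)


def constant_eff(w, t):
    """Placeholder for the efficiency map: a flat 90%."""
    return 0.9


p_traction = em_electric_power_kw(100.0, 500.0, constant_eff)
p_regen = em_electric_power_kw(100.0, -500.0, constant_eff)
print(p_traction, p_regen)
```

For the same mechanical power magnitude, the electrical power drawn in traction is larger than the power returned in regeneration, which reflects the sign convention adopted in Sect. 3.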
3.5 Genset

The GS is a power source of the PHEB and is made up of an ICE and an electric generator (EG), mechanically connected by a clutch. The GS has been modeled based on efficiency and consumption maps; its dynamics are not addressed in this chapter. The GS model has been obtained from a commercial VOLVO diesel engine used in hybrid bus applications and a commercial EG for transport applications, whose efficiency maps are shown in Figs. 5 and 6b, respectively. By interpolating the fuel map shown in Fig. 6a, the instantaneous fuel mass flow $m_{f_{ICE}}(k)$ [kg/s] consumed at each discrete state $k$ is calculated as follows [29, 30]:

\[ m_{f_{ICE}}(k) = f(w_{drsft}(k), T_{ICE}(k)) \tag{18} \]

The GS operation has been previously optimized in order to identify the most efficient operating points over the whole power range of the GS, as shown in Fig. 7 [30].

Fig. 5 EG efficiency map [26]

Fig. 6 ICE fuel consumption and efficiency maps [26]

Fig. 7 GS optimal operation curve and efficiency map [26]

The GS model input is the power target $P_{GS}(k)$ [kW] determined by the split factor of Eq. 15. The curve of Fig. 7 has been included in the model to obtain, depending on the power demanded from the GS, the instantaneous targets for the ICE rotational speed and the corresponding mechanical torque [30].
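The lookup on the optimal operation curve can be sketched as a one-dimensional linear interpolation. The curve points below are purely hypothetical placeholders (the real curve is the one of Fig. 7, which is not tabulated in the chapter), and the function name is an assumption made for illustration.

```python
from bisect import bisect_right

# Hypothetical optimal-operation curve:
# (GS power [kW], ICE speed [rad/s], ICE torque [Nm]).
OPTIMAL_CURVE = [
    (0.0, 80.0, 0.0),
    (40.0, 110.0, 400.0),
    (80.0, 130.0, 680.0),
    (120.0, 150.0, 880.0),
    (160.0, 170.0, 1030.0),
]


def gs_targets(p_gs_kw):
    """Interpolate ICE speed and torque targets for a GS power target."""
    powers = [row[0] for row in OPTIMAL_CURVE]
    # Clamp the request to the range covered by the curve.
    p = min(max(p_gs_kw, powers[0]), powers[-1])
    hi = min(bisect_right(powers, p), len(powers) - 1)
    lo = hi - 1
    p_lo, w_lo, t_lo = OPTIMAL_CURVE[lo]
    p_hi, w_hi, t_hi = OPTIMAL_CURVE[hi]
    frac = (p - p_lo) / (p_hi - p_lo)
    return w_lo + frac * (w_hi - w_lo), t_lo + frac * (t_hi - t_lo)


print(gs_targets(60.0))
```

Storing the pre-optimized curve as a table keeps the online model cheap: each step needs only one interpolation instead of a search over the full efficiency map.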
3.6 Auxiliary Loads

The auxiliary loads (air conditioning, air compressor, cooling pump, power steering, and lights) are represented as a constant mean consumption, with a minimum of 12 kW and a maximum of 18 kW for the PHEB. In the PHEB, the auxiliary loads can be powered by both the BT and the GS.
Fig. 8 A BT pack electric model

3.7 Battery

The BT cell is represented by an ideal open-circuit voltage source ($V_{OC_{BT}}$ [V]) in series with the internal resistance ($R_{BT_{cell}}$ [Ω]). It has been assumed that a string contains $n_{BT}$ BT cells in series and that the BT pack groups $m_{BT}$ strings in parallel [30], as shown in Fig. 8. The state of charge (SOC) of the BT, $SOC_{BT}(k)$, is estimated with the coulomb counting method [31]. In this model, the BT current $I_{BT}(k)$ [A] is calculated at each sample $k$ as follows:

\[ I_{BT}(k) = \frac{U_{BT}(SOC_{BT}(k))}{2 \cdot R_{BT}} - \frac{\sqrt{U_{BT}(SOC_{BT}(k))^2 - 4 \cdot R_{BT} \cdot P_{BT}(k)}}{2 \cdot R_{BT}} \tag{19} \]

where $U_{BT}(k)$ [V] is the equivalent open-circuit voltage and $R_{BT}(k)$ [Ω] is the equivalent internal resistance of the BT, both at pack level. The BT model input is the power target $P_{BT}(k)$ [W] generated by the EMS split factor. The SOC is updated at each sample as follows:

\[ SOC_{BT}(k+1) = SOC_{BT}(k) - \frac{I_{BT}(k+1) \cdot 100}{C_{BT} \cdot 3600} \tag{20} \]

where $C_{BT}$ [Ah] is the BT nominal capacity. The parameters used for the cell modeling are shown in Table 5. The NMC and LTO chemistries have been chosen because of the application characteristics. NMC chemistry is mostly used for energy applications, since its specific energy is high (149 Wh/kg). LTO chemistry, in contrast, is used for power applications and has a lower specific energy (in this case 90 Wh/kg).
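The pack-level model of Eqs. 19 and 20 can be illustrated with the LTO cell data of Table 5 and the pack layout of Table 2. The sketch rests on stated assumptions: the nominal cell voltage is used as a constant open-circuit voltage (in the chapter it depends on the SOC), the sample time is 1 s, and all function names are hypothetical.

```python
import math


def pack_parameters(v_cell, r_cell, c_cell, n_series, n_parallel):
    """Scale cell voltage [V], resistance [ohm] and capacity [Ah] to pack level."""
    u_pack = n_series * v_cell
    r_pack = n_series * r_cell / n_parallel
    c_pack = n_parallel * c_cell
    return u_pack, r_pack, c_pack


def battery_current(u_bt, r_bt, p_bt):
    """Eq. 19: pack current [A] for a power target p_bt [W]."""
    disc = u_bt ** 2 - 4.0 * r_bt * p_bt
    if disc < 0.0:
        raise ValueError("power target exceeds what the pack can deliver")
    return (u_bt - math.sqrt(disc)) / (2.0 * r_bt)


def soc_step(soc, i_bt, c_bt, dt=1.0):
    """Eq. 20: coulomb-counting SOC update in percent (dt in s is an assumption)."""
    return soc - i_bt * dt * 100.0 / (c_bt * 3600.0)


# LTO pack: 260 cells in series, 2 branches; cell data from Table 5.
u_bt, r_bt, c_bt = pack_parameters(2.3, 0.53e-3, 20.0, 260, 2)
i_bt = battery_current(u_bt, r_bt, p_bt=50e3)  # 50 kW discharge target
soc_next = soc_step(50.0, i_bt, c_bt)
print(u_bt * c_bt / 1e3, i_bt, soc_next)       # pack energy [kWh], current, SOC
```

As a consistency check, the pack energy obtained from the cell data (nominal voltage times capacity) reproduces the 23.92 kWh listed in Table 2.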
Table 5 Electrical parameters of BT cells

| Parameter | Nickel manganese cobalt oxide (NMC) [19, 32, 33] | Lithium titanate oxide (LTO) [26, 34] |
|---|---|---|
| Nom. voltage | 3.7 V | 2.3 V |
| Nom. capacity | 40 Ah | 20 Ah |
| Int. resistance | 0.8 mΩ | 0.53 mΩ |
| Max C-rate disch/ch | 8/7 C-rate | 8/3 C-rate |
| Specific energy | 149 Wh/kg | 90 Wh/kg |
| Calendar lifetime | 8 years | 15 years |
4 BT Lifetime Estimation Models

In this section, the calculation of the BT lifetime ($\Psi$) and of the number of replacements is described. Both BT calendar degradation and BT cycling degradation are taken into account:

\[ \Psi = \min(\Psi_{cal}, \Psi_{cyc}) \tag{21} \]

where $\Psi_{cal}$ is the lifetime in years according to the calendar degradation and $\Psi_{cyc}$ the lifetime according to the degradation caused by BT operation. The calendar degradation is a fixed value for each chemistry, given in Table 5. The BT cycling degradation, however, has to be evaluated; its calculation is introduced below.

BT cycling degradation is calculated based on a rainflow cycle counting algorithm [35] and a Wöhler curve-based method [30]. The Wöhler curve-based method is a fatigue analysis commonly used for BT aging estimation [36–38]. It relies on the number of events $NE_{i_{evt}}$, in this case events of a given depth of discharge (DOD), that can occur until the BT reaches its end of life (EOL). The lifetime lost ($LL_{i_{evt}}$) is the ratio between the counted events ($NE_{i_{evt}}$) and the maximum number of events ($NE_{i_{evt}}^{max}$) that the BT can withstand, expressed as follows:

\[ LL_{i_{evt}} = \frac{NE_{i_{evt}}}{NE_{i_{evt}}^{max}} \tag{22} \]

The $NE_{i_{evt}}$ are counted by means of the rainflow algorithm (Fig. 9), in steps of 1% of DOD. The $NE_{i_{evt}}^{max}$ are extracted from the Wöhler curves of the NMC and LTO chemistries shown in Fig. 10, taken from [39]. To determine the total lifetime loss ($LL$) over the whole range of DODs (from 0 to 100%), the sum over all the events in the evaluated cycling period is calculated as follows [30]:
\[ LL = \sum_{i_{evt}} LL_{i_{evt}} \tag{23} \]

Fig. 9 Rainflow charging/discharging cycle counting algorithm

Fig. 10 LTO and NMC Wöhler curves

Finally, taking into account the time period of the evaluated SOC profile and inverting $LL$, the total cycling lifetime $\gamma_{cyc}$ (the cycling term $\Psi_{cyc}$ of Eq. 21), typically expressed in years, can be calculated:

\[ \gamma_{cyc} = \frac{1}{\sum_{i_{evt}=1}^{100} \dfrac{NE_{i_{evt}}}{NE_{i_{evt}}^{max}}} \tag{24} \]
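The damage accumulation of Eqs. 21–24 can be sketched once the rainflow count is available. The example below does not implement the rainflow algorithm itself; the event counts and the Wöhler curve points are hypothetical placeholders, the function names are illustrative, and scaling by the length of the evaluated period is an assumption about how the result is converted into years.

```python
def lifetime_loss(events, max_events):
    """Eqs. 22 and 23: sum of NE / NE_max over the counted DOD bins."""
    return sum(n / max_events[dod] for dod, n in events.items())


def cycling_lifetime_years(events, max_events, period_years=1.0):
    """Eq. 24, scaled by the length of the evaluated SOC profile (assumption)."""
    return period_years / lifetime_loss(events, max_events)


def bt_lifetime_years(calendar_years, cycling_years):
    """Eq. 21: the BT lifetime is the shorter of the two estimates."""
    return min(calendar_years, cycling_years)


# Hypothetical rainflow count for one year: {DOD [%]: number of events}.
events = {10: 3000, 20: 1000, 50: 200}
# Hypothetical Woehler curve: {DOD [%]: maximum number of events until EOL}.
max_events = {10: 100_000, 20: 40_000, 50: 8_000}

psi_cyc = cycling_lifetime_years(events, max_events, period_years=1.0)
psi_lto = bt_lifetime_years(15.0, psi_cyc)  # LTO calendar lifetime (Table 5)
psi_nmc = bt_lifetime_years(8.0, psi_cyc)   # NMC calendar lifetime (Table 5)
print(psi_cyc, psi_lto, psi_nmc)
```

With these placeholder numbers the cycling estimate limits the first case and the calendar estimate limits the second, which is exactly the situation the minimum in Eq. 21 is meant to capture.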
5 Total Cost of Ownership

The PHEB has higher investment costs than conventional buses. However, as mentioned above, the long distances driven yearly by buses and the lower operational costs help to compensate for the extra BT and manufacturing costs [2, 14, 40, 41]. The feasibility of the solution has to be determined from the TCO, the economic performance indicator that includes investment cost, insurance, infrastructure, maintenance, driver, operation, carbon taxes, and EOL [9, 26, 42]. From the point of view of energetic manageability, these factors have been divided into two groups: fixed and manageable costs. Figure 11 shows both groups. Once the bus is in operation, investment, insurance, infrastructure, maintenance, and driver costs are taken as constant, since they cannot be managed from an energetic point of view. Operation costs, on the other hand, can be managed, with a direct impact on the respective carbon taxes. As shown in Fig. 11, the previously analyzed degrees of freedom of the PHEB [19] are the fuel consumption, the charging cost, and the ESS utilization with its consequent number of replacements.

Regarding the proposed TCO management, two techniques have been differentiated. The bus-to-route approach is a short-term improvement aimed at the efficiency goal of minimizing fuel consumption; the optimization adapts the bus to the route. Once the bus has been operated for a defined period of time, operation data become available, and the initial bus conditions have changed. At this point, the route-to-bus long-term improvement takes place.
Fig. 11 TCO fixed and manageable costs
The compiled historical data are exploited to improve the optimization process. In addition, the bus conditions are updated, and decisions are taken according to the new state of health (SOH) of the BTs and the TCO planning. These decisions aim to keep the operation aligned with the TCO plan and can imply a fleet reorganization or an EMS update, in order to reduce either the fuel consumption or the BT utilization. The TCO calculation for the PHEB is defined below. The fuel consumption $f_{ICE}$ [l/day] is obtained from the following equation:

\[ f_{ICE} = \frac{\sum_{k=1}^{p} m_{f_{ICE}}(k) \cdot k_{cs}}{\rho_{fuel}} \cdot n_{round\text{-}trips\text{-}day} \tag{25} \]

where $p$ is the driving cycle length in seconds, $k_{cs}$ [−] is the global factor accounting for ICE cold starts, $\rho_{fuel}$ [kg/l] is the volumetric density of the fuel, and $n_{round\text{-}trips\text{-}day}$ [−] is the number of round-trips completed in a day. The energy absorbed from the grid $E_{cha}$ [kWh/day] is calculated from the recharged power consumption, and the power of the charger $P_{cha}(k)$ [kW] is calculated based on the mean recharged power. Finally, the number of BT replacements is calculated following the methodology presented in Sect. 4. The TCO calculation has therefore been carried out based on the aforementioned manageable costs, as expressed in Eq. 26 [14, 41]:

\[ TCO_{PHEB} = \sum_{t=1}^{T} \frac{(f_{ICE} \cdot C_{fuel/t} \cdot Dev_{fuel} + E_{cha} \cdot C_{kWh/t}) \cdot Op_{year} + C_{kW/t} \cdot P_{cha}}{(1 + dr)^{T}} + \frac{C_{BT} \cdot E_{BT} \cdot DR_{BT/t}}{(1 + dr)^{T}} \tag{26} \]

where $T$ is the scenario duration (years), $t$ is the current year, $f_{ICE}$ [l/day] is the fuel consumption, $C_{fuel/t}$ [€/l] is the annual fuel price, $Dev_{fuel}$ is the diesel fuel price development, $E_{cha}(k)$ [kWh/day] is the energy absorbed from the grid, $C_{kWh/t}$ [€/kWh] is the referential annual energy cost of the grid, $Op_{year}$ [day/year] is the number of yearly operation days, $C_{kW/t}$ [€/kW/year] is the annual cost of the power, $P_{cha}(k)$ [kW] is the power of the charger, $C_{BT}$ [€/kWh] is the BT initial price, $E_{BT}$ [kWh] is the BT energy, $DR_{BT/t}$ [%/year] is the depreciation rate of the BT per year, and $dr$ [%] is the discount rate. Table 6 shows the techno-economic parameters applied in the TCO calculations [2, 7, 26, 43–50].
J. A. López-Ibarra et al.
Table 6 TCO parameters

| Parameter | Acronym | Value | Unit | Reference |
|---|---|---|---|---|
| Global factor to cold starts | k_cs | 1.15 | − | [30] |
| Volumetric density of diesel | ρ_fuel | 0.832 | kg/l | [30] |
| Fuel cost | C_fuel | 0.95 | €/l | [46] |
| Diesel fuel price development | Dev_fuel | 2.3 | %/year | [46] |
| Energy electricity cost | C_kWh | 0.139 | €/kWh | [46] |
| Power electricity cost | C_kW | 25.9 | €/kW/year | [2] |
| Electricity price development | Dev_Elect | 3.7 | %/year | [46] |
| Yearly operation days | Op_year | 330 | day/year | [45]a |
| Bus service life | P_FleetEOL | 12 | year | [2] |
| NMC BT low-cost scenario | C_BT | 550 | €/kWh | [48] |
| NMC BT medium-cost scenario | C_BT | 800 | €/kWh | [26, 44] |
| NMC BT high-cost scenario | C_BT | 100 | €/kWh | [47] |
| LTO BT low-cost scenario | C_BT | 700 | €/kWh | [48] |
| LTO BT medium-cost scenario | C_BT | 1000 | €/kWh | [26, 44] |
| LTO BT high-cost scenario | C_BT | 1500 | €/kWh | [47] |
| BT depreciation rate | DR_BT/t | −20 | % in 12 years | [43, 48]a |
| Discount rate | dr | 2.2 | % | [46] |

a Own estimation
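The fuel term of Eq. 25 and the yearly operating term in the numerator of Eq. 26 can be sketched as follows. This is only an illustrative sketch: the function names, the fuel-mass trace, the number of round-trips, and the values of E_cha and P_cha are hypothetical, and treating Dev_fuel as a multiplicative price factor (set to 1.0 here) is an assumption. Discounting and the BT term of Eq. 26 are deliberately left out.

```python
def daily_fuel_l(m_f_kg, k_cs, rho_fuel_kg_per_l, n_round_trips_day):
    """Eq. 25: daily fuel volume [l/day] from the fuel-mass trace of one round-trip."""
    return sum(m_f_kg) * k_cs / rho_fuel_kg_per_l * n_round_trips_day


def yearly_manageable_cost(f_ice_l_day, e_cha_kwh_day, p_cha_kw,
                           c_fuel, dev_fuel, c_kwh, c_kw, op_year):
    """Numerator of the first term of Eq. 26 for a single year (undiscounted)."""
    energy_cost_day = f_ice_l_day * c_fuel * dev_fuel + e_cha_kwh_day * c_kwh
    return energy_cost_day * op_year + c_kw * p_cha_kw


# Prices and factors are taken from Table 6; everything else is made up.
trace_kg = [0.002] * 1000  # hypothetical trace: 2 g of fuel per 1 s step
f_ice = daily_fuel_l(trace_kg, k_cs=1.15, rho_fuel_kg_per_l=0.832,
                     n_round_trips_day=8)
cost = yearly_manageable_cost(f_ice, e_cha_kwh_day=40.0, p_cha_kw=150.0,
                              c_fuel=0.95, dev_fuel=1.0, c_kwh=0.139,
                              c_kw=25.9, op_year=330)
print(f"{f_ice:.2f} l/day, {cost:.0f} EUR/year")
```

Keeping the two terms in separate functions makes it easy to see which inputs the EMS can influence (fuel and charged energy) and which are fixed prices.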
6 Hierarchical Fleet Energy Management Strategy
The main contribution of this chapter builds on a previously proposed approach for the energy management of a whole fleet, based on a hierarchical decision-maker and management structure [24]. In this chapter, the fleet management methodology is thoroughly described. As shown in Fig. 12, it is divided into several levels and stages. In a first classification, two offline levels and one online level are distinguished. The goal of the fleet upper level is to manage and improve the operational energy efficiency of the whole fleet, taking decisions based on the TCO picture of the whole fleet. This wider view offers additional degrees of freedom to further optimize the TCO. The whole methodology is explained and validated in the following lines. As shown in Fig. 12, the methodology has been divided into four stages.
Stage 1: Bus-to-Route Design
In the first stage, the bus-to-route optimization scenario is defined, and the expected urban route profiles of the fleet are analyzed. Once the fleet structure and operation are defined, the optimization itself is performed, and the ANFIS learning-based EMS is designed. The optimization and the EMS design are customized for each bus, in order to optimize the operation at vehicle level on each route.
Fig. 12 Fleet management methodology
Stage 1.1: Expected Urban Route Profiles
In stage 1.1, the expected urban route profiles are analyzed, and the optimization scenario is defined. According to the available data, the mean auxiliary consumption and the mean number of passengers are updated for the optimization scenario if the studied tendency has changed. A crucial point of the scenario definition is the determination of the final SOC, since this variable allows the BT lifetime to be managed. The final SOC has been designed based on the level of demand of the route, with the goal of recharging the BT back to the initial SOC within the recharging time. The recharging time $t_{cha}$ has been determined according to the route distance: the recharging time in seconds is nine times the length of each specific route in kilometers (an empirically obtained rule). This decision has been established to spread the BT depletion along the whole route. Once the recharging time is set, the final SOC, $x_{Final}$, is calculated as follows:

$$ E_{cha} = P_{cha} \cdot 1000 \cdot \frac{t_{cha}}{3600} - P_{max_{aux}} \cdot 1000 \cdot \frac{t_{cha}}{3600} \tag{27} $$

where $P_{cha}$ [kW] is the charging power and $P_{max_{aux}}$ [kW] is the maximum auxiliary consumption, used to calculate the energy to be charged $E_{cha}$ [kWh] in the worst-case scenario.
The usable energy $E_{usable}$ [kWh] in the BT is calculated based on the BT energy $E_{BT}$ [kWh], the current BT SOH, $SOH_{BT}$ [%], the BT utilization constant, $\gamma_{BT}$, and the initial SOC, $x_0$ [%], as follows:

$$ E_{usable} = E_{BT} \cdot \left( \frac{SOH_{BT}}{100} \cdot \gamma_{BT} \cdot \frac{x_0}{100} \right) \tag{28} $$
Finally, the maximum discharged energy in the BT is calculated as the difference between Eqs. 28 and 27, and, with the current energy, the final SOC $x_{Final}$ [%] is calculated as follows:

$$ x_{Final} = \frac{E_{usable} - E_{cha}}{E_{BT} \cdot \frac{SOH_{BT}}{100}} \cdot 100 \tag{29} $$
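The chain from recharging time to final SOC target (Eqs. 27 to 29) can be illustrated with a short sketch. The function names and all numeric inputs (route length, charger power, auxiliary power, BT energy) are hypothetical. The sketch keeps powers in kW and energies in kWh so that E_cha and E_BT share one unit, and therefore does not apply the factor 1000 printed in Eq. 27; that unit choice is an assumption.

```python
def recharge_time_s(route_km, seconds_per_km=9.0):
    """Empirical rule from stage 1.1: nine seconds of charging per route kilometer."""
    return seconds_per_km * route_km


def final_soc_pct(e_bt_kwh, soh_pct, gamma_bt, x0_pct,
                  p_cha_kw, p_max_aux_kw, t_cha_s):
    """Final SOC target [%] following Eqs. 27-29 (energies kept in kWh)."""
    # Eq. 27: worst-case energy that can be recharged within t_cha
    e_cha = (p_cha_kw - p_max_aux_kw) * t_cha_s / 3600.0
    # Eq. 28: usable energy, scaled by SOH, utilization constant and initial SOC
    e_usable = e_bt_kwh * (soh_pct / 100.0) * gamma_bt * (x0_pct / 100.0)
    # Eq. 29: remaining energy relative to the aged BT capacity
    return (e_usable - e_cha) / (e_bt_kwh * soh_pct / 100.0) * 100.0


# Hypothetical 10 km route, 150 kW charger, 10 kW auxiliaries, 35 kWh BT.
t_cha = recharge_time_s(10.0)
x_nominal = final_soc_pct(35.0, 100.0, 1.00, 100.0, 150.0, 10.0, t_cha)
x_protect = final_soc_pct(35.0, 100.0, 1.05, 100.0, 150.0, 10.0, t_cha)
print(x_nominal, x_protect)
```

Under these assumptions, raising gamma_bt above 1 raises the final SOC target, so the optimization discharges the BT less, which matches the way the constant is described in stage 3.4.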
According to the fleet status, two pathways are identified to determine the scenario to optimize. The first path applies when a fleet is commissioned. In this case, there is no compiled fleet operation data, and the SOH of the bus BTs is at 100%. At this point, $\gamma_{BT}$ and the $SOC_{ref}$ weight $w_{SOC}$ of the cost function (Eq. 30, explained below) are both set to 1 for the PHEB optimization scenario. For both powertrains, this scenario definition allows the whole BT capacity to be used in the optimization, which exploits the BT and minimizes the fuel consumption. The second path applies when the fleet has been operating for a period of time and operation data is available. In this case, the first step is to update the bus BT SOH. Then, $\gamma_{BT}$ and the $SOC_{ref}$ weight $w_{SOC}$ for the PHEB are set according to the decisions taken in stages 3.4 and 4 (further explained below).
Stage 1.2: Dynamic Programming Operation Optimization
Once the optimization scenario is defined, in stage 1.2 the operation of each bus is optimized for each route and for the determined mean auxiliary consumptions, according to the following cost function:

$$ J = \min_{u_k \in U_k} \sum_{k=0}^{N-1} m_{f_{ICE}}(U(k)) \cdot (x_{ref} \cdot w_x) \cdot T_s \tag{30} $$

where $m_{f_{ICE}}(U(k))$ is the fuel mass consumption (determined by the torque power split factor $U$), $x_{ref}$ is the difference between the current SOC and a reference SOC, and $w_x$ is the weight applied to the SOC difference factor. All the calculations are carried out at each time step ($T_s$ = 1 s) within the urban route length ($N$). The dynamic system has been modeled with a single state variable, the BT SOC, while still maintaining the accuracy [51]. For both PHEB powertrains, the database is generated with the power demand, BT SOC profile, length ratio, and GS output power.
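To make the single-state dynamic programming idea more concrete, the following toy sketch runs a backward recursion over an integer SOC grid. It is not the chapter's implementation and simplifies it in three labeled ways: the fuel model is a hypothetical linear one (`fuel_g_per_kwh`), the final SOC target is enforced as a hard terminal constraint rather than through the $x_{ref} \cdot w_x$ weighting of Eq. 30, and the control is a coarse set of GS power shares (`gs_shares`). All names and numbers are illustrative assumptions.

```python
import numpy as np


def dp_min_fuel(p_dem_kw, e_bt_kwh, soc0_pct, soc_final_pct, ts_s=1.0,
                gs_shares=(0.0, 0.5, 1.0), fuel_g_per_kwh=250.0):
    """Backward DP over one state (BT SOC on a 1 % grid); returns fuel [g] and policy."""
    n = len(p_dem_kw)
    soc_grid = np.arange(101)  # SOC states 0..100 %
    # Terminal cost: infeasible (inf) below the final SOC target, zero otherwise
    cost = np.where(soc_grid >= soc_final_pct, 0.0, np.inf)
    policy = np.zeros((n, soc_grid.size))
    for k in range(n - 1, -1, -1):
        e_dem = p_dem_kw[k] * ts_s / 3600.0  # demanded energy in this step [kWh]
        new_cost = np.full(soc_grid.size, np.inf)
        for i in soc_grid:
            for u in gs_shares:  # u = share of the demand covered by the GS
                soc_next = i - (1.0 - u) * e_dem / e_bt_kwh * 100.0
                j = int(round(soc_next))  # snap to the nearest grid state
                if j < 0 or j > 100:
                    continue
                c = fuel_g_per_kwh * u * e_dem + cost[j]
                if c < new_cost[i]:
                    new_cost[i] = c
                    policy[k, i] = u
        cost = new_cost
    return cost[int(round(soc0_pct))], policy


# Hypothetical case: 10 steps of 72 kW, 1 kWh BT, SOC allowed to drop from 100 % to 90 %.
fuel_g, policy = dp_min_fuel([72.0] * 10, e_bt_kwh=1.0, soc0_pct=100, soc_final_pct=90)
print(fuel_g)
```

In this toy case the demand totals 0.2 kWh while the BT may only deliver 0.1 kWh, so the recursion has to assign the other half of the demand to the GS. The returned `policy` table (control per time step and SOC state) plays the role of the optimization database from which a learning-based EMS could later be trained.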
Stage 1.3: ANFIS Learning-Based EMS Design
In stage 1.3, the EMS is designed by applying the ANFIS learning technique. The database generated in stage 1.2 is first processed and divided into data sets according to the length ratio. The power demand, BT SOC profile, and length ratio are used as input variables. The output variable is the GS output power for the PHEB.
Stage 1.4: Fleet Fuzzy Logic Update and Implementation
Finally, in stage 1.4, the developed FL EMS for each bus is implemented. This last stage is the bridge between the offline optimization and strategy design at vehicle level and the online operation level. In the initial bus-to-route design there is no compiled data, so the offline data exploitation and decision-making at fleet level is not an input. Throughout the bus lifetime, in contrast, the offline data exploitation and decision-making at fleet level is used as an input for the bus-to-route design stage.
Stage 2: Fleet Operation
In this stage, the previously developed EMSs are integrated into each bus, and the buses are operated on their corresponding lines. However, the correct design and implementation of the developed EMSs have to be evaluated and corrected if required. New digitalization techniques allow continuous monitoring of the bus operation. The most important variables to register from a bus are the speed, auxiliary consumption, passenger flow, power demand, fuel mass flow, BT power demand, and BT SOC. Speed monitoring makes it possible to analyze the driving cycle and behavior of each driver of the fleet on every route. The driving cycle information allows the energy efficiency to be improved without changing the driving behavior of the driver. Bus auxiliary consumption and passenger flow are the variables that most affect the power demand [19]. These variables are used to generate the optimization database.
The fuel mass flow is a key factor for the TCO and a good indicator for evaluating the EMS efficiency. The BT power demand and SOC are also good indicators and are crucial for evaluating the BT lifetime.
Stage 3: Fleet Data Exploitation
In stage 3, the data obtained from the fleet operation is processed for the subsequent fleet data analysis. Based on this analysis, the EMS update and/or the route-to-bus fleet decision-making approach is performed.
Stage 3.1: Fleet Operation Data
In this stage, a time period has to be set to collect enough data for the processing stage. This observation period can range from weeks to years, depending on the type of data analysis to be carried out. This stage is the bridge, for data acquisition, between the online operation level and the offline data exploitation and decision-making at fleet level of the hierarchical decision-maker and management structure. The
evaluated period is known as the period until the evaluation point. The evaluation point is described below, in stage 4.
Stage 3.2: Data Processing
The data collected in the previous stage has to be processed to obtain valuable information. The information extracted from the fleet is the one monitored in stage 2: speed, auxiliary consumption, passenger flow, power demand, fuel mass flow, BT power demand, and BT SOC. This data is processed to acquire additional information for the analysis and decision-making stages. The speed, auxiliary consumption, passenger flow, and power demand are used to define the new optimization scenario. The fuel mass flow, BT power demand, and BT SOC, in contrast, are used to take decisions for the new optimization scenario. The new information obtained from the data processing is crucial for the following stages. From the speed cycle, the mean speed, maximum speed, acceleration, and route distance are obtained. Mean speed and route distance are used as indicators to evaluate the demand level of the route. Finally, the fuel consumption and BT lifetime are calculated as explained in the TCO economic model in Sect. 5.
Stage 3.3: Data Analysis
At this stage, an analysis of the whole fleet is performed to facilitate the decision-making process. At this point, the fulfillment of the planned TCO goals has to be checked. The first goal of the TCO plan is the fuel consumption limit set by the fleet operator. The second goal is the bus BT lifetime planned in the TCO design, which allows decisions to be made based on the bus BT aging estimation.
Stage 3.4: Energy Management Decision-Making
Based on the data analysis, the new updated EMS design is decided in the energy management decision-making stage. This stage receives inputs from stage 3.3 and stage 4 to make decisions for the EMS updates. From stage 3.3, the new bus conditions are updated.
These new conditions are the ones that define the new scenario to be optimized, such as the bus BT SOH at the evaluation point and the decisions taken for the new optimization. From stage 4, based on the fleet management procedure, the new route reorganization is obtained, and the bus re-optimization target is set for the definition of the new optimization scenario. The re-optimization target is set based on the final SOC, which is modified and determined with the constant $\gamma_{BT}$ in Eqs. 28 and 29. The constant $\gamma_{BT}$, introduced in Eq. 28, is used to manage the BT lifetime. The optimization decisions are made based on the TCO plan, increasing or decreasing the estimated bus BT lifetime by means of $\gamma_{BT}$. On the one hand, in order to increase the BT lifetime of a bus, the BT utilization is decreased by increasing $\gamma_{BT}$ ($\gamma_{BT} > 1$), which consequently increases the final SOC target. In case the final SOC matches the initial SOC, as explained for Eq. 30, the BT utilization is constrained with $SOC_{ref}$ by means of $w_{SOC}$. On the other hand, for the
bus BT lifetime to be decreased, $\gamma_{BT}$ is decreased ($\gamma_{BT} < 1$) so that the BT utilization is increased, which consequently lowers the final SOC target of the optimization. The decisions made at this stage are the inputs for the bus-to-route design, which re-optimizes and redesigns the EMS of each bus accordingly.
Stage 4: Route to Bus
The route-to-bus stage deals with the fleet management, and it takes place at the evaluation point. It receives inputs from the data processing stage 3.3, and it calculates an output for the energy management decision-making of stage 3.4. At this point, based on the current lifetime of the BTs of the whole fleet and on the bus BT lifetime estimation, it is decided whether an EMS update is enough or a reorganization of the fleet is required. In this subsection, the lifetime of the BTs of all buses in the fleet has been estimated, and a bus BT lifetime plan for the whole fleet has been developed. The BT lifetime plan is crucial for making the most of each element of the powertrain. If the bus BT lifetime is above the fleet lifetime horizon, the BT is used less than initially forecast, which causes more fuel consumption than required. In addition, as shown in Fig. 13b, the BT SOH at the fleet EOL point ($P_{FleetEOL}$) is around the half. Therefore, an EMS update is needed at the defined evaluation point $P_{eval}$ to make the BT EOL fit the fleet EOL point, maximizing the BT utilization and minimizing the GS use and, consequently, the fuel consumption. The evaluation point definition is a crucial step and is further explained below. The other case occurs when the bus BT lifetime is below the fleet lifetime horizon, as shown in Fig. 13c. In this case, the GS is underused and the BT
Fig. 13 (a) Fleet BT aging scenarios. (b) BT lifetime above expected scenario. (c) BT lifetime below expected scenario
is overused. This operation affects the TCO by increasing the planned number of BT replacements. In the case shown in Fig. 13c, operation with the non-updated EMS requires four replacements, whereas updating the EMS at the evaluation point reduces them to two. As mentioned above, the definition of the fleet bus BT aging evaluation point is crucial, since the route-to-bus approach is carried out at this point. The starting point of the definition is the fleet BT aging scenario shown in Fig. 14a, which identifies the bus lifetime, the minimum and maximum bus BT lifetime, and the fleet EOL point. The first step of the fleet management is to develop the BT lifetime plan and define the fleet evaluation point methodology. For this, the fleet bus BT lifetime estimation picture is necessary. Fig. 14 shows the estimated BT lifetime, the BT SOH at the evaluation point, the cycle and daily driven distance, and the mean route speed, as mentioned in stage 3.2. Based on this picture, the routes are grouped into three groups: least demanding, most demanding, and average demand. The least and most demanding routes are those on which the bus BT lifetime is above or below the BT lifetime horizon plan and cannot be managed to fulfill it. The rest of the routes are grouped in the average demand group. To define the evaluation point based on the bus years of the fleet, the bus with the minimum BT aging (min Ψ) has to be identified. The evaluation point for the whole fleet is defined as half of that minimum, as shown in Fig. 14b. The EOL of the BT is reached at 80% of BT SOH or through calendar degradation, as explained in Sect. 4. Once the route-to-bus decisions and the evaluation point have been defined, all the routes are re-optimized with DP, and the bus BT lifetimes are estimated again, in order to stick to the bus BT lifetime goal. The route-to-bus EMS design decision is taken to meet the predefined BT aging targets and improve the TCO of the whole fleet.
These decisions are made based on the BT lifetime target and the obtained fleet bus BT lifetime picture, shown in Fig. 15. The routes are grouped at this point, according to the route characteristics. The routes have been grouped into three classes, the least demanding, the most demanding,
Fig. 14 (a) Fleet BT aging scenarios. (b) Evaluation point definition based on years evaluation
Fig. 15 Fleet bus BT lifetime picture
and the average demanding, as shown in Fig. 15. At this point, the fleet management consists of updating the bus EMS and reorganizing the routes. The routes that are reorganized are the most or least demanding ones on which the bus BT lifetime cannot be managed to fit the BT lifetime horizon of the TCO plan. The reorganization is executed as follows. For the routes that require a reorganization, the buses with the best SOH are swapped to the most demanding lines, and the buses with the worst SOH are swapped to the least demanding lines. This decision is used as an input for stage 3.4. For the remaining routes, grouped in the average demand group, the EMS is updated according to the new conditions defined in stage 3.4. This fleet management decision has been made to balance the BT lifetime of the buses with the best and the worst SOH. In the following sections, each stage is evaluated and validated for the case scenario described in Sect. 2.
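The route indicators named in stage 3.2 (mean speed, maximum speed, acceleration, and route distance) can be derived from a logged speed trace. The sketch below is a minimal illustration under stated assumptions: the function name and dictionary keys are hypothetical, the trace is sampled at a fixed step `ts_s`, and the distance uses a simple rectangle-rule sum.

```python
import numpy as np


def route_indicators(speed_kmh, ts_s=1.0):
    """Derive basic route-demand indicators from a speed trace sampled every ts_s seconds."""
    v = np.asarray(speed_kmh, dtype=float) / 3.6  # convert km/h to m/s
    acc = np.diff(v) / ts_s                       # finite-difference acceleration [m/s^2]
    return {
        "mean_speed_kmh": float(v.mean() * 3.6),
        "max_speed_kmh": float(v.max() * 3.6),
        "max_accel_ms2": float(acc.max()) if acc.size else 0.0,
        "distance_km": float(v.sum() * ts_s / 1000.0),  # rectangle-rule integration
    }


# Hypothetical six-sample trace, only to show the call.
ind = route_indicators([0.0, 18.0, 36.0, 36.0, 18.0, 0.0])
print(ind)
```

The resulting distance is also the quantity that the empirical recharging-time rule of stage 1.1 takes as input.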
7 Fleet Management Decision-Maker for Different Battery Chemistries
The BT lifetime plan, the evaluation point definition, and the TCO improvement evaluation explained above are validated in this section. The NMC BT-based and LTO BT-based bus fleet models introduced in Tables 2 and 3 are used, driving on the routes described in Fig. 1.
7.1 Battery Lifetime Plan and Evaluation Point Definition for Different Chemistries
The fleet bus BT lifetime evaluation plan has been developed using the evaluation point definition based on years. This technique sets the evaluation point at half of the estimated minimum bus BT lifetime, and it is applied to the BT lifetime plans of both fleets. Focusing on the long-term operation and the fleet bus BT lifetime, the fleet route reorganization is performed in this section. This route reorganization is pursued to manage the buses with the most critical BT SOH, that is, the buses whose BT lifetime estimation is far above or below the bus BT lifetime horizon. For these buses, an EMS update is not enough to correct the BT lifetime, and a route reorganization has to be applied.
7.2 Fleet of Buses with LTO Chemistry Evaluation Point and Battery Plan Definition
The BT lifetime picture of the fleet with LTO chemistry is shown in Fig. 16. The least demanding, most demanding, and average demanding routes have been grouped using the information on the cycle and daily distance, the mean route speed, and the route grouping. Buses with a BT lifetime estimation more than 20% above (BT lifetime estimation > 14.4 years) or below (BT lifetime estimation < 9.6 years) the defined BT lifetime horizon of 12 years have been grouped in the least or most demanding routes, respectively. The rest of the lines, within the 20% threshold, have been grouped as average demanding. The least demanding routes, in this case lines 3 and 10, match the routes that have the shortest cycle and daily driven distance and the lowest average speed. The most demanding routes are the routes that have the longest cycle and daily
Fig. 16 LTO BT based bus fleet BT SOH evaluation and lifetime estimation
Table 7 Buses and routes exchanging of the LTO-based fleet

| Group | Bus number | Current line | Exchanged for line |
|---|---|---|---|
| Best SOH | 3 | 3 | 7 |
| Best SOH | 10 | 10 | 8 |
| Worst SOH | 7 | 7 | 3 |
| Worst SOH | 8 | 8 | 10 |
distance and the highest average speed; routes 7 and 8 fall in this group, line 7 being the most demanding one. The main reason why line 7 is more demanding than line 8 is its longer cycle length, which depletes the BT more and consequently degrades it more. Finally, the remaining routes, with an average BT degradation within the 20% threshold of the BT lifetime horizon, have been grouped in the average demanding class. Table 7 summarizes the route rescheduling. Bus 3, which was operating on line 3, has been moved to the most demanding line 7. Bus 7, which was operating on the most demanding route, has been moved to the least demanding route 3. The same decision has been applied to the next most and least demanding lines, 8 and 10, respectively. For the buses driving on the average demanding routes, the EMS has been updated, but they do not require a route reorganization to achieve the bus BT lifetime target of 12 years.
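The grouping and exchange rules described above (the 20% threshold around the lifetime horizon, rank-based pairing of least and most demanding lines, and an evaluation point at half of the minimum estimated lifetime) can be sketched as follows. The function names are hypothetical, the per-line lifetime estimates are invented for illustration and are not the values behind Fig. 16, and the sketch assumes that bus i initially operates on line i, as in Table 7.

```python
def classify_routes(lifetime_years, horizon_years, threshold=0.20):
    """Group lines by estimated BT lifetime relative to the lifetime horizon."""
    low = horizon_years * (1.0 - threshold)
    high = horizon_years * (1.0 + threshold)
    groups = {"least": [], "most": [], "average": []}
    for line, life in lifetime_years.items():
        if life > high:
            groups["least"].append(line)    # BT barely aged: least demanding line
        elif life < low:
            groups["most"].append(line)     # BT aging too fast: most demanding line
        else:
            groups["average"].append(line)  # an EMS update alone is considered enough
    return groups


def plan_swaps(lifetime_years, groups):
    """Pair the least demanding lines with the most demanding ones, by rank."""
    least = sorted(groups["least"], key=lambda l: lifetime_years[l], reverse=True)
    most = sorted(groups["most"], key=lambda l: lifetime_years[l])
    swaps = {}
    for easy, hard in zip(least, most):
        swaps[easy] = hard  # bus with the best SOH moves to the demanding line
        swaps[hard] = easy  # bus with the worst SOH moves to the easy line
    return swaps


def evaluation_point(lifetime_years):
    """Evaluation point at half of the minimum estimated BT lifetime."""
    return 0.5 * min(lifetime_years.values())


# Hypothetical lifetime estimates [years] per line.
lifetimes = {1: 12.0, 2: 11.0, 3: 16.5, 4: 13.0, 5: 10.5,
             6: 10.0, 7: 8.0, 8: 9.0, 9: 12.5, 10: 15.0}
groups = classify_routes(lifetimes, horizon_years=12.0)
swaps = plan_swaps(lifetimes, groups)
print(groups, swaps, evaluation_point(lifetimes))
```

With these invented estimates, the resulting bus-to-line mapping has the same structure as Table 7: buses 3 and 10 move to lines 7 and 8, and buses 7 and 8 move to lines 3 and 10.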
Fleet of Buses with NMC Chemistry Evaluation Point and Battery Plan Definition
For the fleet with NMC chemistry, the detailed BT lifetime is shown in Fig. 17; it differs from that of the previous LTO-based fleet. The bus reorganization is performed based on the bus BT lifetime estimation in Fig. 17. To make the most of the BT, the median of the fleet bus BT lifetime, 4 years, has been set as the BT lifetime horizon. In Fig. 17, the least demanding, most demanding, and average demanding routes have been grouped. In contrast to the LTO bus fleet, and due to the FEC constraint applied in the NMC BT Wöhler curve, the least demanding line in the NMC bus fleet is the tenth route and not the third one. The least demanding routes match the routes that have the shortest cycle and daily driven distance and the lowest average speed, in this case lines 3, 4, and 10. Only route 5 has a lower mean speed and a shorter driven distance than line 4. However, line 5 has a lower BT lifetime, because it is more aggressive than the other routes. The most demanding routes are those with a longer cycle, a longer daily distance, and a higher average speed; routes 6, 7, and 8 fall in this class. The main reason why line 7 is more demanding than lines 6 and 8 is its longer cycle length, which depletes the BT more and consequently degrades it more. Finally, the remaining routes, which have an average BT degradation, have been grouped in the average demanding class. For the buses that operate on these routes,
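Fig. 17 NMC BT-based bus fleet BT SOH evaluation and lifetime estimation

Table 8 Buses and routes exchanging of the NMC-based fleet

| Group | Bus number | Current line | Exchanged for line |
|---|---|---|---|
| Best SOH | 3 | 3 | 8 |
| Best SOH | 4 | 4 | 6 |
| Best SOH | 10 | 10 | 7 |
| Worst SOH | 6 | 6 | 4 |
| Worst SOH | 7 | 7 | 10 |
| Worst SOH | 8 | 8 | 3 |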
the EMS has been updated, but they do not take part in the route reorganization process, since their current BT lifetime is within the 20% threshold of the 4-year bus BT lifetime target. Table 8 summarizes the route rescheduling. Bus number 10, which was operating on line 10, has been moved to the most demanding line 7. Bus number 7, which was operating on the most demanding route, has been moved to the least demanding route 10. The same decision has been applied to the second and third most and least demanding lines, 3, 4, 6, and 8: bus number 3 has been moved to line 8, bus number 8 to line 3, bus number 4 to line 6, and bus number 6 to line 4.
7.3 Route-to-Bus Updated Energy Management Strategy TCO Evaluation
To demonstrate the need for updating the EMS, the fleet-level TCO of the NMC BT bus fleet and of the LTO BT bus fleet has been compared with and without the update. The TCO has been evaluated from the SOH evaluation point to the bus BT EOL. For the TCO evaluation of the whole
fleet, the medium BT cost scenario has been applied. After this, the three cost scenarios defined in Table 6 have been analyzed for the route-to-bus and fleet TCO improvement rate analysis.
Fleet of Buses with LTO Chemistry Evaluation Point and BT Plan Definition
The TCO of the LTO-based bus fleet at vehicle level has been improved by between 1.14% and 31.84%, as shown in Fig. 18. This improvement has been obtained by applying the proposed fleet management methodology. The buses that have been moved to the most demanding lines (buses 3 and 10), however, have increased their TCO at vehicle level. This increase is compensated, by means of the fleet manager, by avoiding the BT replacement on the buses that operated on the most demanding lines in the initial period. The operation modification for bus 7 must be emphasized. To reach the service life target, its operation has been changed to start and finish at nearly the same SOC, which explains its low energy and power costs. To analyze the improvement and demonstrate the need to exchange buses, an evaluation of the exchanged and non-exchanged buses has been carried out, as shown in Fig. 19. For this analysis, low, medium, and high BT cost scenarios (the LTO cost scenarios described in Table 6) have been studied in the four buses where
Fig. 18 LTO BT based bus fleet total cost of ownership of non-updated and updated buses
Fig. 19 LTO BT-based bus fleet total cost of ownership of non-updated and updated buses
exchanging has been applied. It is important to acknowledge that in the three cost scenarios, an improvement is achieved, ranging from 4.19% to 5.62%. Regarding the fleet total TCO improvement analysis, the three price scenarios have also been evaluated. With the proposed approach, the bus BT lifetime requirements are met, and a TCO improvement has been achieved. In the low, medium, and high BT price scenario, a fleet TCO improvement of 3.85%, 4.38%, and 5.62% has been achieved, respectively.
Fleet of Buses with NMC Chemistry TCO Evaluation
The scenario from the evaluation point until the bus EOL, defined at 12 years, has been evaluated. For the NMC-based fleet, an improvement of between 0.4% and 29.27% has been obtained, as shown in Fig. 20. As happened for the LTO-based fleet, the buses that have been moved to the most demanding lines (buses 3, 4, and 10) have increased their TCO. The TCO of the rest of the buses has been improved by managing their BT lifetimes. As for the LTO bus fleet, to analyze the improvement and demonstrate the need to exchange some buses, an evaluation of the exchanged and non-exchanged buses has been carried out and is depicted in Fig. 21. For this analysis, low, medium, and high BT cost scenarios for NMC have been studied with the six buses that have been exchanged. It is important to underscore that an improvement is achieved in the three cost scenarios, ranging from 2.72% to 3.06%.
Fig. 20 NMC BT-based bus fleet TCO of non-updated and updated buses
Fig. 21 NMC BT-based bus fleet TCO evaluation for swapped and non-swapped buses
In the three BT price scenarios, a TCO improvement of the whole fleet has been achieved. The fleet TCO improvement for the low, medium, and high BT price scenarios is 1.84%, 2.1%, and 2.3%, respectively.
8 Conclusions
In this chapter, a hierarchical energy management strategy design methodology for total cost of ownership management at fleet level is proposed. The proposed approach is based on a hierarchical decision-maker and management structure composed of three levels. The fleet is reorganized, and the online operation energy management strategy is updated throughout the bus lifetime. These decisions are made based on the evaluated BT lifetime of the fleet, to meet the planned total cost of ownership requirements. The main findings are summarized as follows. The total cost of ownership of the LTO BT-based bus fleet has been improved by up to 5.62%. For the NMC BT-based bus fleet, the total cost of ownership has been improved by up to 3.06% compared with not updating the bus energy management strategies. When managing a fleet of vehicles, the key factor is not only the minimization of fuel consumption but the improvement of the whole total cost of ownership. Managing and sticking to the developed total cost of ownership plan makes it possible to increase the fleet operation savings. In addition, the upper fleet-level point of view and the TCO plan facilitate the decisions when updating the EMS and give an additional degree of freedom for managing the fleet, namely its reorganization. Comparing both fleets, the higher improvement in the LTO fleet stands out. In this case study, all the replacements are avoided. For the NMC fleet, in contrast, the buses swapped to the most demanding lines increase the number of replacements in the updated scenario. However, as explained in stage 4, a BT whose SOH is still above its EOL threshold at the end of the fleet lifetime is a sign of poor management of the available resources. The fleet management improvement is shown by the fleet TCO improvement of 2.3% for the NMC bus fleet. The improvements obtained in both fleets demonstrate the need for the EMS update and the fleet reorganization, which further improve the TCO of the whole fleet.
Although the minimization of fuel consumption is an important target, the TCO picture gives clearer and more reliable information in economic terms. The case in which the BT of a bus reaches its EOL before the years planned in the TCO is worth mentioning. This implies an additional economic penalty for the operating company. This penalty has not been taken into account for the buses that do not reach the planned bus BT lifetime. Our future work and research will focus on the validation of a zero-emission fleet composed of battery electric and hydrogen buses.
References
1. M. Rogge, E. van der Hurk, A. Larsen, D.U. Sauer, Electric bus fleet size and mix problem with optimization of charging infrastructure. Appl. Energy 211, 282–295 (2018)
2. A. Lajunen, Lifecycle costs and charging requirements of electric buses with different charging methods. J. Clean. Prod. 172, 56–67 (2018). https://doi.org/10.1016/j.jclepro.2017.10.066
3. H. Zhang, X. Li, X. Liu, J. Yan, Enhancing fuel cell durability for fuel cell plug-in hybrid electric vehicles through strategic power management. Appl. Energy 241, no. March, 483–490 (2019). https://doi.org/10.1016/j.apenergy.2019.02.040
4. M.F.M. Sabri, K.A. Danapalasingam, M.F. Rahmat, A review on hybrid electric vehicles architecture and energy management strategies. Renew. Sust. Energ. Rev. 53, 1433–1442 (2016). http://dx.doi.org/10.1016/j.rser.2015.09.036
5. Y. Hu, W. Li, K. Xu, T. Zahid, F. Qin, C. Li, Energy management strategy for a hybrid electric vehicle based on deep reinforcement learning. Appl. Sci. 8(2), 187 (2018)
6. H. Tian, S.E. Li, X. Wang, Y. Huang, G. Tian, Data-driven hierarchical control for online energy management of plug-in hybrid electric city bus. Energy 142, 55–67 (2018)
7. A. Lajunen, Energy consumption and cost-benefit analysis of hybrid and electric city buses. Transport. Res. C Emerg. Technol. 38, 1–15 (2014). http://dx.doi.org/10.1016/j.trc.2013.10.008
8. J.P. Ribau, C.M. Silva, J.M. Sousa, Efficiency, cost and life cycle CO2 optimization of fuel cell hybrid and plug-in hybrid urban buses. Appl. Energy 129, 320–335 (2014)
9. M. Mahmoud, R. Garnett, M. Ferguson, P. Kanaroglou, Electric buses: a review of alternative powertrains. Renew. Sust. Energ. Rev. 62, 673–684 (2016)
10. T. Randall, Here's how electric cars will cause the next oil crisis (2016). https://www.bloomberg.com/features/2016-ev-oil-crisis. Accessed 26.04.2018
11. M. Glotz-Richter, H. Koch, Electrification of public transport in cities (Horizon 2020 ELIPTIC Project). Transport. Res. Proc. 14, 2614–2619 (2016). https://doi.org/10.1016/j.trpro.2016.05.416
12. S. Bakker, R. Konings, The transition to zero-emission buses in public transport – the need for institutional innovation. Transport. Res. D Transp. Environ. No. March, 0–1 (2017). https://doi.org/10.1016/j.trd.2017.08.023
13. B. Sen, T. Ercan, O. Tatari, Does a battery-electric truck make a difference? – life cycle emissions, costs, and externality analysis of alternative fuel-powered Class 8 heavy-duty trucks in the United States. J. Clean. Prod. 141, 110–121 (2017). https://doi.org/10.1016/j.jclepro.2016.09.046
14. K. Palmer, J.E. Tate, Z. Wadud, J. Nellthorp, Total cost of ownership and market share for hybrid and electric vehicles in the UK, US and Japan. Appl. Energy 209, No. July 2017, 108–119 (2018). https://doi.org/10.1016/j.apenergy.2017.10.089
15. M. Ranta, M. Pihlatie, A. Pellikka, J. Laurikko, P. Rahkola, J. Anttila, Analysis and comparison of energy efficiency of commercially available battery electric buses, in IEEE Vehicle Power and Propulsion Conference (VPPC) (2017)
16. M. Mesgarpour, D. Landa-Silva, I. Dickinson, Overview of telematics-based prognostics and health management systems for commercial vehicles. Commun. Comput. Inform. Sci. 395, 123–130 (2013)
17. C. Marina Martinez, X. Hu, D. Cao, E. Velenis, B. Gao, M. Wellers, Energy management in plug-in hybrid electric vehicles: recent progress and a connected vehicles perspective. IEEE Trans. Veh. Technol. PP(99), 1–1 (2016). http://ieeexplore.ieee.org/document/7496906/
18. C. Manzie, H. Watson, S. Halgamuge, Fuel economy improvements for urban driving: hybrid vs. intelligent vehicles. Transport. Res. C Emerg. Technol. 15(1), 1–16 (2007)
19. J.A. López-Ibarra, N. Goitia-Zabaleta, V.I. Herrera, H. Gaztañaga, H. Camblong, Battery aging conscious intelligent energy management strategy and sensitivity analysis of the critical factors for plug-in hybrid electric buses. eTransportation 5(2016), 100061 (2020). https://linkinghub.elsevier.com/retrieve/pii/S2590116820300187
20. Z. Bi, L. Song, R. De Kleine, C.C. Mi, G.A. Keoleian, Plug-in vs. wireless charging: life cycle energy and greenhouse gas emissions for an electric bus system. Appl. Energy 146, No. February, 11–19 (2015). https://doi.org/10.1016/j.apenergy.2015.02.031
76
J. A. López-Ibarra et al.
21. M.A. Hannan, M.M. Hoque, A. Mohamed, A. Ayob, Review of energy storage systems for electric vehicle applications: issues and challenges. Renew. Sust. Energ. Rev. 69, No. November 2016, 771–789 (2017). https://doi.org/10.1016/j.rser.2016.11.171 22. H. Budde-Meiwes, J. Drillkens, B. Lunz, J. Muennix, S. Rothgang, J. Kowal, D.U. Sauer, A review of current automotive battery technology and future prospects. Proc. Instit. Mech. Eng. D J. Automob. Eng. 227(5), 761–776 (2013) 23. D.H. Doughty, E.P. Roth, A general discussion of Li ion battery safety. Interface Mag. 21(2), 37–44 (2012). http://interface.ecsdl.org/cgi/doi/10.1149/2.F03122if 24. J.A. López-Ibarra, H. Gaztañaga, A. Saez-de Ibarra, H. Camblong, Plug-in hybrid electric buses total cost of ownership optimization at fleet level based on battery aging. Appl. Energy 280, No. March, 115887 (2020). https://doi.org/10.1016/j.apenergy.2020.115887 25. Emission test cycles, worldwide engine and vehicle test cycles (2017). https://www.dieselnet. com/standards/cycles/index.php. Accessed 26.04.2018 26. J.A. López-Ibarra, V.I. Herrera, H. Camblong, A. Milo, H. Gaztañaga, Energy management improvement based on fleet digitalization data exploitation for hybrid electric buses, in Computational Intelligence and Optimization Methods for Control Engineering Energy, ed. by B. Maude Josée, P.M. Pardalos, J. Sanchis Sáez (Springer Nature, Cham, 2019), ch. 14, pp. 321–355. http://link.springer.com/10.1007/978-3-030-25446-9_14 27. J.A. López-ibarra, N. Goitia-zabaleta, A. Milo, H. Camblong, H. Gaztañaga, Battery and fuel cell aging conscious intelligent energy management battery and fuel cell aging conscious intelligent energy management strategy for hydrogen hybrid electric buses. Trans. Res. Arena, No. April, 0–10 (2020). https://www.traficom.fi/sites/default/files/media/publication/ TRA2020-Book-of-Abstract-Traficom-research-publication.pdf 28. O. Sundström, L. Guzzella, P. 
Soltic, Optimal hybridization in two parallel hybrid electric vehicles using dynamic programming, in Proceedings of the 17th IFAC World Congress, vol. 1 (2008), pp. 4642–4647. http://www.nt.ntnu.no/users/skoge/prost/proceedings/ifac2008/ data/papers/2452.pdf 29. L. Guzzella, A. Sciarretta, Vehicle Propulsion Systems (Springer, Berlin, 2005) 30. V. Herrera, Optimized energy management strategies and sizing of hybrid storage systems for transport applications. Doctoral Thesis (2017). https://addi.ehu.es/handle/10810/25887 31. V.I. Herrera, A. Milo, H. Gaztañaga, I. Etxeberria-Otadui, I. Villarreal, H. Camblong, Adaptive energy management strategy and optimal sizing applied on a battery-supercapacitor based tramway. Appl. Energy 169, 831–845 (2016) 32. Kokam li-ion/polymer cell. http://kokam.com/data/Kokam_Cell_Brochure_V.4.pdf. Accessed 26.04.2018 33. S. Jenu, I. Deviatkin, A. Hentunen, M. Myllysilta, S. Viik, M. Pihlatie, Reducing the climate change impacts of lithium-ion batteries by their cautious management through integration of stress factors and life cycle assessment. J. Energy Storage 27, No. November 2019, 101023 (2020). https://doi.org/10.1016/j.est.2019.101023 34. N. Takami, H. Inagaki, Y. Tatebayashi, H. Saruwatari, K. Honda, S. Egusa, High-power and long-life lithium-ion batteries using lithium titanium oxide anode for automotive and stationary power applications. J. Power Sources 244, 469–475 (2013). https://doi.org/10.1016/ j.jpowsour.2012.11.055 35. R. Dufo, Dimensionamiento y control optimo de sistemas híbridos aplicando algorítmos evolutivos. Doctoral Thesis (2007). https://dialnet.unirioja.es/servlet/tesis?codigo=19604 36. V. I. Herrera, A. Milo, H. Gaztanaga and H. Camblong, Multi-Objective Optimization of Energy Management and Sizing for a Hybrid Bus with Dual Energy Storage System, IEEE Vehicle Power and Propulsion Conference (VPPC), (2016), pp. 1–6, https://doi.org/10.1109/ VPPC.2016.7791731 37. W.A. 
Facinelli, Modeling and simulation of lead-acid batteries for photovoltaic systems. Doctoral Thesis (1983). https://www.osti.gov/biblio/6132982-modeling-simulation-lead-acidbatteriesphotovoltaic-systems
PHEBs with Different BT Chemistries TCO Planning and Optimization at Fleet Level
77
38. D.U. Sauer, H. Wenzl, Comparison of different approaches for lifetime prediction of electrochemical systems-Using lead-acid batteries as example. J. Power Sources 176(2), 534–546 (2008) 39. M. Mabrey, Advantages and marine applications of various lithium ion battery chemistries, in Battery Propulsion Conference (IEEE, Piscataway, 2016). www.maritime.dot.gov/sites/marad. dot.gov/files/docs/innovation-research/meta/3376/spear-lithiumionchemistries.pdf 40. M. Rogge, E. van der Hurk, A. Larsen, D.U. Sauer, Electric bus fleet size and mix problem with optimization of charging infrastructure. Appl. Energy 211, No. November 2017, 282–295 (2018). https://doi.org/10.1016/j.apenergy.2017.11.051 41. L. Nurhadi, S. Borén, H. Ny, A sensitivity analysis of total cost of ownership for electric public bus transport systems in Swedish medium sized cities. Transport. Res. Proc. 3, No. July, 818–827 (2014). https://doi.org/10.1016/j.trpro.2014.10.058 42. M. Pihlatie, S. Kukkonen, T. Halmeaho, V. Karvonen, N.O. Nylund, Fully electric city buses – the viable option, in 2014 IEEE International Electric Vehicle Conference, IEVC 2014 (2015) 43. B. Nykvist, M. Nilsson, Rapidly falling costs of battery packs for electric vehicles. Nat. Clim. Change 5(4), 329–332 (2015) 44. H. Ding, Z. Hu, Y. Song, Value of the energy storage system in an electric bus fast charging station. Appl. Energy 157, 630–639 (2015). https://doi.org/10.1016/j.apenergy.2015.01.058 45. O. Topal, S. Nakir, Total cost of ownership based economic analysis of diesel, CNG and electric bus concepts for the public transport in Istanbul City. Energies 11(9) (2018) 46. I. Mareev, J. Becker, D.U. Sauer, Battery dimensioning and life cycle costs analysis for a heavyduty truck considering the requirements of long-haul transportation. Energies 11(1) (2018) 47. Y. Miao, P. Hynan, A. Von Jouanne, A. Yokochi, Current li-ion battery technologies in electric vehicles and opportunities for advancements. Energies 12(6), 1–20 (2019) 48. 
F. Meishner, B. Satvat, D.U. Sauer, Battery electric buses in European cities: economic comparison of different technological concepts based on actual demonstrations, in Proceedings of the 2017 IEEE Vehicle Power and Propulsion Conference, VPPC 2017, vol. 2018 (2018), pp. 1–6 49. H.J. Undertaking, Strategies for joint procurement of fuel cell buses a study for the Fuel Cells and Hydrogen Joint Undertaking. Fuel Cells and Hydrogen Joint Undertaking (FCH JU). Technical Report (2018) 50. B. Emonts, M. Reuß, P. Stenzel, L. Welder, F. Knicker, T. Grube, K. Görner, M. Robinius, D. Stolten, Flexible sector coupling with hydrogen: a climate-friendly fuel supply for road transport. Int. J. Hydrogen Energy 44(26), 12 918–12 930 (2019) 51. O. Sundström, D. Ambühl, L. Guzzella, On implementation of dynamic programming for optimal control problems with final state constraints. Oil Gas Sci. Technol. Revue de l’Institut Français du Pétrole 65(1), 91–102 (2009)
Stochastic Optimization Methods for the Stochastic Storage Process Control
Pavel Knopov and Vladimir Norkin
1 Introduction
The paper investigates a stochastic programming approach to solving stochastic optimal control problems. The idea is to search for the optimal control policy in a parametric form, i.e., as a function of the current state of the system under consideration and of some unknown parameters. Substituting this control policy into the original optimal control problem yields a stochastic programming problem in which the unknown parameters are the variables. For the latter problem, a variety of solution techniques has been developed in stochastic optimization [1–3]. The approach is discussed in depth in [4, Ch. 7]. However, two questions have to be answered: (i) Do the employed parametric policy forms include the true optimal control policy? (ii) Are the available stochastic optimization techniques sufficient for solving the resulting stochastic optimization problems? In the present paper, we discuss both of these questions. As to the first question, in the second and third sections we survey some general solution existence results for discrete-time Markovian and semi-Markovian control problems, in particular the existence of optimal controls for systems with arbitrary sets of states and controls. For the Markovian systems, we
P. Knopov
V. M. Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, Kyiv, Ukraine
V. Norkin
V. M. Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, Kyiv, Ukraine
Faculty of Applied Mathematics of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_3
build on the papers [5, 6]; for the semi-Markovian ones, on [6, 7]. An exposition of general results of stochastic control theory can be found in [8–10]. Applications to stochastic storage control can be found in [11–14]. In risk theory, optimal dividend strategies often have a barrier form: dividends are paid when the accumulated reserves exceed some barrier [15, 16]. Related results for Markovian controlled processes with finite sets of states and controls are available in [17, 18]; for models with a finite number of states and a compact control set, see [19]; for control models with discounting, consult [20, 21]. As to the second question, we remark that the related stochastic optimization problems may turn out to be nonconvex, multi-extremal, nonsmooth, and even discontinuous [4, Ch. 7]. Discontinuity can also be introduced artificially when discontinuous penalty functions are used to eliminate complex constraints on the parameters. For solving such difficult problems, we suggest applying the so-called successive stochastic smoothing method [22] and demonstrate its relevance to the optimization of an inventory system and an energy accumulation system.
2 Controlled Markov Processes
In this section, we provide the background from the theory of controlled stochastic processes that is needed in what follows. Consider a discrete-time stochastic system that is controlled in some way. Let us introduce some basic definitions [8, 9]. We consider a system subject to random perturbations that is controlled at time moments $n \in \mathbb{N}$ in order to minimize the costs associated with its operation. Let $X$ and $A$ be complete separable metric spaces equipped with their Borel $\sigma$-algebras (the one on $X$ is denoted by $\aleph$); $X$ serves as the phase space and $A$ as the decision space. Denote by $\{X_n \in X,\ n = 0, 1, \ldots\}$ the sequence of states of the system and by $\{D_n \in A_{X_n} \subset A,\ n = 0, 1, \ldots\}$ the sequence of decisions (controls), where $A_x$ is the set of feasible decisions in state $x \in X$. The random evolution of the system is governed by a set of transition probabilities:
\[
P\{B \mid x_n, a_n\} = P\{X_{n+1} \in B \mid X_0 = x_0, D_0 = a_0, \ldots, X_n = x_n, D_n = a_n\}.
\]
If the system is in state $x$ at the beginning of a time period and a decision $a \in A_x$ is made, then a cost $r(x, a)$ is associated with this period. We assume that $r$ is bounded, $|r(x, a)| \le C$, on the set $\{(x, a) : x \in X,\ a \in A_x\}$ of feasible state-decision pairs for some constant $C$, $0 \le C < \infty$. A sequence $\delta = \{\delta_0, \delta_1, \ldots, \delta_n, \ldots\}$ of probability distributions, where the probability measure $\delta_n$ is defined on $A_{x_n}$, is called a general admissible control strategy. The strategy $\delta$ is called Markov if $\delta_n(\cdot \mid x_0, a_0, \ldots, x_n) = \delta_n(\cdot \mid x_n)$, and stationary Markov if $\delta_n(\cdot \mid x_0, a_0, \ldots, x_n) = \delta(\cdot \mid x_n)$, $n = 0, 1, \ldots$. A stationary Markov strategy is deterministic if, for each $x \in X$, the measure $\delta(\cdot \mid x)$ is concentrated at a single point of $A_x$; this point is denoted by $\delta(x)$.
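We consider the class of all admissible strategies and its subclass of stationary deterministic strategies, together with two optimality criteria. The first is the average cost of the chosen strategy $\delta$:
\[
\varphi(x, \delta) = \limsup_{n \to \infty} \frac{1}{n+1}\, E_x^{\delta} \sum_{k=0}^{n} r(X_k, D_k), \quad X_0 = x;
\]
the second is the expected total discounted cost of $\delta$:
\[
\psi_\beta(x, \delta) = E_x^{\delta} \sum_{k=0}^{\infty} \beta^k r(X_k, D_k), \quad X_0 = x, \quad \beta \in (0, 1).
\]
A strategy $\delta^*$ is called optimal with respect to these criteria iff
\[
\varphi(x, \delta^*) = \inf_{\delta} \varphi(x, \delta) \quad (\varphi\text{-optimal})
\]
or
\[
\psi_\beta(x, \delta^*) = \inf_{\delta} \psi_\beta(x, \delta) \quad (\psi_\beta\text{-optimal}),
\]
where the infimum is taken over all admissible strategies.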
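As a rough illustration of the two criteria, the following Python sketch estimates both of them by Monte Carlo simulation for a stationary deterministic strategy. The function names (`estimate_criteria`, `policy`, `step`, `cost`) and the two-state repair example are hypothetical and are not taken from the source; the limit in the average-cost criterion and the infinite sum in the discounted criterion are truncated at a finite `horizon`, which is an approximation.

```python
import random


def estimate_criteria(x0, policy, step, cost, beta, horizon, n_paths, seed=0):
    """Monte Carlo estimates of (average cost, discounted cost) from state x0.

    policy(x)        -> decision a (a stationary deterministic strategy)
    step(x, a, rng)  -> random next state
    cost(x, a)       -> one-period cost r(x, a)
    Both criteria are truncated at k = 0, ..., horizon (an approximation).
    """
    rng = random.Random(seed)
    avg_sum, disc_sum = 0.0, 0.0
    for _ in range(n_paths):
        x, total, discounted, weight = x0, 0.0, 0.0, 1.0
        for _ in range(horizon + 1):
            a = policy(x)
            r = cost(x, a)
            total += r
            discounted += weight * r   # weight equals beta**k at step k
            weight *= beta
            x = step(x, a, rng)
        avg_sum += total / (horizon + 1)
        disc_sum += discounted
    return avg_sum / n_paths, disc_sum / n_paths


# Hypothetical two-state example: state 0 = working, state 1 = failed.
# Decision 0 = wait, decision 1 = repair (chosen only in the failed state).
def toy_policy(x):
    return 1 if x == 1 else 0


def toy_step(x, a, rng):
    if a == 1:                                # repair restores the working state
        return 0
    return 1 if rng.random() < 0.2 else x     # a working unit fails w.p. 0.2


def toy_cost(x, a):
    return 5.0 if a == 1 else 1.0             # repair costs 5, operation costs 1


if __name__ == "__main__":
    avg, disc = estimate_criteria(0, toy_policy, toy_step, toy_cost,
                                  beta=0.9, horizon=500, n_paths=200)
    print(f"average cost ~ {avg:.3f}, discounted cost ~ {disc:.3f}")
```

Fixing the seed makes the estimates reproducible, which is convenient when comparing several strategies on the same random inputs.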
The first problem that arises is to find conditions for the existence of $\varphi$-optimal and $\psi_\beta$-optimal control strategies in a particular class of controls. To prove the corresponding theorems, conditions must be imposed on the sets of states and controls. The simplest case is when both sets are finite. In this case, the existence of an optimal strategy in the class of deterministic strategies was proved in [17] by directly verifying the optimality of a constructed solution, using the optimality equation derived by the dynamic programming method. For compact or countable sets, the existence of optimal strategies in the class of stationary nonrandomized strategies was established, e.g., in [5, 6] and in other works. A detailed bibliography can be found, for example, in [9, 10, 23].

Theorem 1 ([5]) Let the control space $A$ be compact and the mapping $x \mapsto A_x$ be upper semicontinuous. Let there exist a measure $\mu$ on $(X, \aleph)$, $\mu(X) > 0$, such that the transition probability satisfies
\[
P\{B \mid x, a\} \ge \mu(B) > 0, \quad x \in X,\ a \in A_x,\ B \in \aleph.
\]
Assume also that the following conditions are satisfied:
(1) The cost function $r(x, a)$ is continuous (lower semicontinuous) in $(x, a)$.
(2) The transition probability $P(\cdot \mid x, a)$ is weakly continuous in $(x, a)$.
Then there exists a stationary deterministic $\varphi$-optimal strategy with the minimum cost $W = \int V(x)\, \mu(dx)$, where $V(x)$ is determined by the optimality equation
\[
V(x) = \inf_{a \in A_x} \left\{ r(x, a) + \int V(y)\, P'(dy \mid x, a) \right\}, \quad x \in X,
\]
where $P'(B \mid x, a) = P(B \mid x, a) - \mu(B)$, $B \in \aleph$.

Remark 1 The conditions on the sets $X$ and $A$ can be varied, which yields various versions of the existence theorems for optimal strategies. For example, we can assume that $X$ is compact and $A$ is finite, or that $X$ is countable and $A$ is compact; in these cases, the statement on the existence of an optimal strategy in the class of stationary nonrandomized strategies remains true. Other sets of conditions for the existence of optimal strategies are also available.

Remark 2 Similar results on the existence of an optimal strategy for the $\psi_\beta$ criterion were obtained in [5, 6] and elsewhere; for this criterion, the possibility of using the optimality equation plays a significant role. In particular, the following statement holds:

Theorem 2 ([5]) Let $A \subset [0, Q]$ be compact, and let the following conditions be satisfied: (a) $r(x, a)$ is lower semicontinuous in $(x, a)$; (b) the transition probability $P(\cdot \mid x, a)$ is weakly continuous in $(x, a)$. Then for $\beta \in (0, 1)$, there exists the function $\psi_\beta(x) = \min_{\delta} \psi_\beta(x, \delta) = \psi_\beta(x, \delta^*)$, which is the unique solution (in the Banach space of bounded Borel-measurable functions on $X$) of the following optimality equation:
\[
\psi_\beta(x) = \inf_{a \in [0, Q - x]} \left\{ r(x, a) + \beta \int \psi_\beta(y)\, P(dy \mid x, a) \right\}.
\]
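Since the right-hand side of the discounted optimality equation is a contraction with modulus $\beta$ in the sup norm, its fixed point can be approximated by successive approximations (value iteration). The sketch below does this for a finite model, which is an assumption made here purely for illustration: Theorem 2 covers a continuous state space, and in practice one would discretize $[0, Q]$. All names and the small discretized inventory example (demand probabilities, cost coefficients) are hypothetical.

```python
def q_value(x, a, psi, cost, trans, beta):
    """r(x, a) + beta * sum_y psi(y) * P(y | x, a) for a finite model."""
    return cost(x, a) + beta * sum(p * psi[y] for y, p in trans(x, a).items())


def value_iteration(states, actions, cost, trans, beta, tol=1e-10, max_iter=100000):
    """Successive approximations for the discounted optimality equation."""
    psi = {x: 0.0 for x in states}
    for _ in range(max_iter):
        new_psi = {
            x: min(q_value(x, a, psi, cost, trans, beta) for a in actions(x))
            for x in states
        }
        gap = max(abs(new_psi[x] - psi[x]) for x in states)
        psi = new_psi
        if gap < tol:
            break
    # A stationary deterministic strategy that is greedy with respect to psi.
    policy = {
        x: min(actions(x), key=lambda a, x=x: q_value(x, a, psi, cost, trans, beta))
        for x in states
    }
    return psi, policy


# Hypothetical discretized inventory model: stock levels 0..Q, integer demand.
Q = 3
DEMAND = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}   # P(demand <= Q) = 0.9 < 1


def f(y):
    # Hypothetical convex storage-plus-shortage cost at stock level y.
    return 1.0 * y + 4.0 * sum(p * max(0, d - y) for d, p in DEMAND.items())


def cost(x, a):
    return f(x) if a == 0 else f(x + a) + 2.0 + 1.0 * a


def actions(x):
    return range(0, Q - x + 1)


def trans(x, a):
    out = {}
    for d, p in DEMAND.items():
        y = max(0, x + a - d)          # next stock level (x + a - demand)^+
        out[y] = out.get(y, 0.0) + p
    return out


if __name__ == "__main__":
    psi, policy = value_iteration(range(Q + 1), actions, cost, trans, beta=0.95)
    for x in range(Q + 1):
        print(x, round(psi[x], 4), policy[x])
```

The greedy strategy is extracted only once, after convergence; keeping the model as plain functions (`actions`, `cost`, `trans`) makes it easy to swap in another finite model.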
3 Controlled Semi-Markov Processes
As before, we assume that $X$ and $A$ are complete separable metric spaces. Let a measurable mapping $x \mapsto A_x$ associate with each $x \in X$ a nonempty closed set $A_x \subset A$. Assume that the set $\{(x, a) : x \in X,\ a \in A_x\}$ of feasible state-decision pairs is Borel-measurable in the product space $X \times A$. If in the state $x \in X$ the decision $a \in A_x$ is made, then:
(1) The next state of the system is selected according to the transition probability $P(\cdot \mid x, a)$.
(2) Given that the next state of the system is $y \in X$, the residence time in the state $x$ is a random variable with the distribution function $\Phi(\cdot \mid x, a, y)$.
We assume that $P(\cdot \mid x, a)$ and $\Phi(\cdot \mid x, a, y)$ are Borel-measurable functions of $(x, a)$ and $(x, a, y)$, respectively. Let $X_n$ be the state of the system after $n$ transition steps, $D_n$ the decision selected in this state, and $\tau_n$ the residence time in this state ($n = 0, 1, 2, \ldots$). An admissible strategy $\delta$ for the controlled system is defined as a sequence $\delta = \{\delta_0, \delta_1, \ldots, \delta_n, \ldots\}$ of transition kernels such that the probability measure $\delta_n(\cdot \mid h_n)$ is defined on $A_{X_n}$ and depends measurably on the history $h_n = (X_0, D_0, \tau_0, \ldots, X_{n-1}, D_{n-1}, \tau_{n-1}, X_n)$ of the controlled system up to the moment of the $n$-th transition. If in the state $x \in X$ the decision $a \in A_x$ is made and the time spent in the state $x$ equals $t$, then the expected costs incurred up to time $s$ ($s \le t$) are equal to $r(s \mid x, a)$. The function $r(s \mid x, a)$ is assumed to be Borel-measurable in $(s, x, a)$, $s \in [0, +\infty)$. Consider the following optimality criterion, namely, the average expected cost of the strategy $\delta$:
\[
\varphi(x, \delta) = \limsup_{n \to \infty} \frac{E_x^{\delta} \sum_{k=0}^{n} r(\tau_k \mid X_k, D_k)}{E_x^{\delta} \sum_{k=0}^{n} \tau_k}, \tag{1}
\]
where $X_0 = x$ and $E_x^{\delta}$ is the mathematical expectation corresponding to the process controlled by the strategy $\delta$, provided that $X_0 = x$. A strategy $\delta^*$ is optimal with respect to this criterion iff $\varphi(x, \delta^*) = \inf_{\delta} \varphi(x, \delta)$, $x \in X$, where the infimum is taken over all admissible strategies. Let us introduce the following quantities:
\[
\tau(x, a) = \int_X \int_0^\infty t \, d\Phi(t \mid x, a, y)\, P(dy \mid x, a), \qquad
r(x, a) = \int_X \int_0^\infty r(t \mid x, a) \, d\Phi(t \mid x, a, y)\, P(dy \mid x, a).
\]
We assume that $\tau(x, a)$ and $r(x, a)$ exist and are finite for all $x \in X$, $a \in A_x$, and that $|r(x, a)| \le C < \infty$. Since criterion (1) depends only on $P(\cdot \mid x, a)$ and on the averaged characteristics $\tau(x, a)$ and $r(x, a)$, we restrict ourselves to controlled processes for which
\[
\Phi(t \mid x, a, y) = \begin{cases} 1, & t \ge \tau(x, a), \\ 0, & t < \tau(x, a), \end{cases} \qquad
r(t \mid x, a) = \begin{cases} 0, & t < \tau(x, a), \\ r(x, a), & t \ge \tau(x, a). \end{cases}
\]
We denote by $M(X)$ the Banach space of bounded Borel-measurable functions on $X$ with the norm $\|u\| = \sup_{x \in X} |u(x)|$. The following result holds:
Theorem 3 ([6, 7]) Let the space $A$ of control actions be compact and the mapping $x \mapsto A_x$ be upper semicontinuous. In addition, let the following assumptions hold for all $x \in X$, $a \in A_x$:
(1) $0 < l < \tau(x, a) \le L < \infty$.
(2) There is a nonnegative measure $\mu$ on $X$ such that
(a) $\mu(B) \le P(B \mid x, a)$, $B \in \aleph$;
(b) $\mu(X) > 0$.
(3) The function $r(x, a)$ is lower semicontinuous, and $\tau(x, a)$ is continuous in $(x, a)$.
(4) The transition probability is weakly continuous in $(x, a)$.
Then there exists a stationary deterministic optimal strategy $\delta^*$ with the minimal cost
\[
W = \frac{1}{L} \int_X v(x)\, \mu(dx),
\]
where the function $v(x)$ is the unique solution in the space $M(X)$ of the optimality equation
\[
v(x) = \inf_{a \in A_x} \left\{ r(x, a) + \int_X v(y)\, P'(dy \mid x, a) \right\}, \quad x \in X,
\]
with
\[
P'(B \mid x, a) = P(B \mid x, a) - \frac{1}{L}\, \mu(B)\, \tau(x, a), \quad B \in \aleph.
\]
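As a consistency check (a worked special case added here, not a statement from the source), suppose that every residence time equals one, $\tau(x, a) \equiv 1$, so that one may take $L = 1$ in Theorem 3, with any $l \in (0, 1)$. Under the restriction adopted above, $r(\tau_k \mid X_k, D_k) = r(X_k, D_k)$, and criterion (1), the modified kernel, and the optimal cost reduce to the corresponding objects of the Markov case in Section 2:

```latex
% Special case \tau(x,a) \equiv 1, L = 1: the semi-Markov model reduces to the Markov one.
\begin{align*}
\varphi(x,\delta)
  &= \limsup_{n\to\infty}
     \frac{E_x^{\delta}\sum_{k=0}^{n} r(\tau_k \mid X_k, D_k)}
          {E_x^{\delta}\sum_{k=0}^{n} \tau_k}
   = \limsup_{n\to\infty} \frac{1}{n+1}\,
     E_x^{\delta}\sum_{k=0}^{n} r(X_k, D_k), \\
P(B \mid x,a) - \frac{1}{L}\,\mu(B)\,\tau(x,a)
  &= P(B \mid x,a) - \mu(B), \\
W &= \frac{1}{L}\int_X v(x)\,\mu(dx) = \int_X v(x)\,\mu(dx).
\end{align*}
```

The first line is the average-cost criterion of Section 2, and the last two lines coincide with the kernel and the optimal cost appearing in Theorem 1.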
Remark 3 This theorem holds for the cost function with values in [0, +∞). In [6, 7], the conditions for maximizing the one period reward (income) r(x, a), x ∈ X, a ∈ Ax are presented. Applications of the semi-Markovian stochastic optimal control model to the storage control are presented in [14, 24, 25].
4 Applied Problems of the Theory of Controlled Stochastic Processes
In this section, we focus on some applied problems of the theory of controlled stochastic processes that arise in inventory theory, reliability theory, and the optimal service of queuing systems. The unifying idea of the problems considered below is that the search for optimal strategies can be reduced to parametric optimization problems, which can then be solved by stochastic optimization methods. This approach was considered in [4, 12, 26] and other works.
4.1 Markov Model of Inventory Management
Consider a Markov system for managing the inventory of a single product that can be continuously replenished. The maximum stock level is $Q$, so the product stock takes values in the interval $[0, Q]$. Periodically, at random moments of time, the stock is checked, and depending on the actual stock of the product, a decision is made on an additional order: if at the $n$-th check the stock level $X_n$ is less than or equal to some $x \in [0, Q]$, then an order $D_n \in A_{X_n} = [0, Q - X_n]$ is placed. At the time of the check ($n+$), a random claim $\xi_n$ arrives; $\xi = (\xi_n,\ n \in \mathbb{N})$ is a sequence of independent random variables with the distribution function $G(x)$, $x \ge 0$. We assume that $\xi_n$ does not depend on the system history up to and including the $n$-th check, that $G(Q) < 1$, and that the function $G(\cdot)$ is continuous. The evolution equation of the stock level process has the form
\[
X_{n+1} = (X_n + D_n - \xi_n)^+, \quad n \in \mathbb{N},
\]
where $(a)^+ = \max\{0, a\}$ is the positive part of $a \in \mathbb{R}$. It is further assumed that the random variables are defined on a common basic probability space $(\Omega, \mathcal{F}, P)$. The state space of the system is $X = [0, Q]$, and the set of decisions is $A = [0, Q]$, where $d_a$ denotes the decision to order an additional amount $a$. In the state $x$, the set of allowable actions is $A_x = [0, Q - x]$. Consider the following optimality criteria: minimization of the average expected costs, i.e.,
\[
\varphi(x, \delta) = \liminf_{n \to \infty} \frac{1}{n+1}\, E_x^{\delta} \sum_{k=0}^{n} r(X_k, D_k), \quad X_0 = x, \qquad \varphi(x, \delta^*) = \inf_{\delta} \varphi(x, \delta),
\]
and minimization of the total discounted costs with the discount coefficient β, i.e.,
\[
\psi_\beta(x, \delta) = E_x^{\delta} \sum_{k=0}^{\infty} \beta^k r(X_k, D_k), \quad X_0 = x; \qquad \psi_\beta(x, \delta^*) = \inf_{\delta} \psi_\beta(x, \delta). \tag{2}
\]
Let us now consider the problem of determining the structure of the optimal strategy under the conditions of the existence theorem, which are assumed to be satisfied from now on. It is known that for many inventory management systems, the optimal strategy is characterized by a base stock level $S$ such that the stock after ordering equals $S$. Since any quantity of the product can be ordered, the level $S$ is attained exactly. Moreover, the optimality of the $(s, S)$-strategy is usually proved: an order for replenishment is placed only when the stock level $x$ is less than $s$, and the order size is $S - x$. As an example, let us take a set of conditions under which the optimal strategy is an $(s, S)$-strategy. We assume that the replenishment cost function is $c(x) = c_0 + c x$, that the stock is replenished immediately with probability one, and that the function $f(x)$ of costs associated with shortage and storage of the stock in the amount $x$ is convex and continuous. As before, the demand $\xi_n$ does not depend on the system history up to and including the $n$-th check, $G(Q) < 1$, and $G(\cdot)$ is continuous. Thus, the cost function has the form
\[
r(x, 0) = f(x), \qquad r(x, a) = f(x + a) + c_0 + c a, \quad a \in (0, Q - x].
\]
The random evolution of the system is governed by the set of transition probabilities:
\[
P\{\{0\} \mid x, a\} = 1 - G\big((x + a)-\big), \quad x \in [0, Q],\ a \in [0, Q - x];
\]
\[
P\{[y_1, y_2) \mid x, a\} = G(x + a - y_1) - G(x + a - y_2), \quad x \in [0, Q],\ a \in [0, Q - x],\ 0 < y_1 < y_2 \le x + a,
\]
where $G(z-)$ denotes the left limit of $G$ at $z$.
Theorem 4 ([12, 13]) Let the function $c \cdot x + f(x)$ be decreasing in $x \in [0, Q]$. Then, for the model described above, there is a threshold $x^* \in [0, Q]$ such that
\[
\delta^*(x) = \begin{cases} Q - x, & x \le x^*, \\ 0, & x > x^*. \end{cases}
\]
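Theorem 4 reduces the search for an optimal strategy to a one-parameter problem, namely the choice of the threshold $x^*$. A minimal Python sketch of this reduction is given below under assumptions that are not made in the source: exponentially distributed demand, a hypothetical convex cost `f`, truncation of the discounted sum at a finite horizon, and a plain grid search standing in for a stochastic optimization method. The example parameters are not checked against the hypothesis of Theorem 4, and all function names are illustrative.

```python
import random


def threshold_policy(x, x_star, Q):
    """Order up to Q when the stock is at or below the threshold, else order nothing."""
    return Q - x if x <= x_star else 0.0


def discounted_cost(x0, x_star, Q, f, c0, c, beta, demand,
                    horizon=150, n_paths=200, seed=1):
    """Monte Carlo estimate of the discounted cost under the threshold policy.

    One-period cost: r(x, 0) = f(x), r(x, a) = f(x + a) + c0 + c*a for a > 0.
    Stock dynamics:  X_{n+1} = max(0, X_n + D_n - xi_n).
    The infinite sum is truncated after `horizon` periods (an approximation).
    """
    rng = random.Random(seed)   # fixed seed: the same demand samples for every x_star
    total = 0.0
    for _ in range(n_paths):
        x, weight, acc = x0, 1.0, 0.0
        for _ in range(horizon):
            a = threshold_policy(x, x_star, Q)
            acc += weight * (f(x + a) + (c0 + c * a if a > 0 else 0.0))
            x = max(0.0, x + a - demand(rng))
            weight *= beta
        total += acc
    return total / n_paths


if __name__ == "__main__":
    Q = 10.0
    f = lambda y: 0.5 * y + 20.0 / (1.0 + y)          # hypothetical convex cost
    demand = lambda rng: rng.expovariate(1.0 / 3.0)   # hypothetical demand, mean 3
    grid = [0.5 * i for i in range(21)]               # candidate thresholds in [0, Q]
    costs = [discounted_cost(5.0, s, Q, f, 2.0, 1.0, 0.95, demand) for s in grid]
    best = min(range(len(grid)), key=costs.__getitem__)
    print(f"estimated best threshold ~ {grid[best]}, cost ~ {costs[best]:.2f}")
```

Reusing the same seed for every candidate threshold (common random numbers) reduces the noise in the comparison between thresholds, although the estimated cost remains a discontinuous function of the threshold.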
4.2 Optimal Planning of Repair Work of Complex Technical Systems
Let the working state of the system be described by a uniformly monotonic nonincreasing Markov process $(\xi(t),\ t \ge 0)$ with values in $[0, \infty)$ and transition probabilities $P(x, t, B)$, $x \in [0, \infty)$, $t \ge 0$, where $B$ is a Borel subset of $[0, \infty)$.
The state $\{0\}$ corresponds to the good state of the system, and a state $\xi = x > 0$ characterizes some level of failure. The current state of the system is not observed directly but can be determined by a check at a cost $r_1 > 0$. System operation is costly: the operational cost intensity in state $x$ equals $r(x) \ge 0$, where the function $r(x)$ is monotonic, nondecreasing, and bounded. Depending on the state of the system, a decision must be made after each check: either to do nothing and carry out the next check after time $T$ (this action is denoted by $a_T$) or to perform a complete repair that lasts $m > 0$ time units and returns the system to its good state $\{0\}$, with the next check carried out at time $T$ after the system resumes operation in the state $\{0\}$ (this action is denoted by $(a_0, a_T)$). It is assumed that the length $T$ of the time interval between checks belongs to some set $Z \subseteq [T_1, T_2]$, $0 < T_1 < T_2$, which is either finite or the entire interval $[T_1, T_2]$. The repair cost, of intensity $r_2$ per unit of repair time, includes the actual cost of the repair as well as losses due to the system failure. It is assumed that the limit $\lim_{x \to \infty} r(x) > r_2$ exists. Let us introduce the quantity $c = \min\{x > 0 : r(x) \ge r_2\}$. The states $x \ge c$ are called the fault states of the system. If at the moment of a check the state of the process satisfies $x \ge c$, then the repair decision $(a_0, a_T)$ is always made. These assumptions reduce our model to a simple controlled semi-Markov model with criterion (1), the state space $X = [0, c) \cup \{c\}$, the control space
\[
A = \{a_T, (a_0, a_T)\}, \qquad A_x = \begin{cases} a_T, & 0 \le x < c, \\ (a_0, a_T), & x \ge c, \end{cases}
\]
and the following transition probabilities:
\[
\begin{aligned}
P\{B \mid x, a_T\} &= P(x, T, B), && x \in [0, c),\ B \subset [0, c), \\
P\{c \mid x, a_T\} &= P(x, T, [c, \infty)), && x \in [0, c), \\
P\{B \mid c, (a_0, a_T)\} &= P(0, T, B), && B \subset [0, c), \\
P\{c \mid c, (a_0, a_T)\} &= P(0, T, [c, \infty)).
\end{aligned}
\]
The average useful cost is given by
\[
r(x, a_T) = -\left[ r_1 + E_x \int_0^T r(\xi(t))\, dt \right], \quad x \in [0, c),
\]
\[
r(c, (a_0, a_T)) = -\left[ r_1 + E_0 \int_0^T r(\xi(t))\, dt + r_2 m \right],
\]
where $E_x$ is the conditional expectation with respect to the measure induced by the process $\xi = (\xi(t),\ t \in \mathbb{R}_+)$, provided that $\xi(0) = x$. Suppose further that the transition probabilities $P(x, t, B)$ are weakly continuous in $(x, t)$ and that $P(0, T_1, [c, \infty)) = \gamma > 0$.

Theorem 5 ([27]) Suppose that for an arbitrary Borel function $u$ on $[0, \infty)$, the function $E_x u(\xi(t))$ is continuous in $x$ and $t$. Then, for the model described above, there exists an average-cost optimal strategy in the class of stationary deterministic strategies, for which the minimum value of the cost $W = (T + m)^{-1} \int V(x)\, \mu(dx)$ is achieved, where $\mu(\cdot)$ is the measure concentrated at $0$ with mass $\gamma$, and the function $V(x)$ satisfies the optimality equation
\[
\begin{aligned}
V(x) &= \max\left\{ r(x, a_T) + \int V(y)\, P'(dy \mid x, a_T),\ \ r(c, (a_0, a_T)) + \int V(y)\, P'(dy \mid c, (a_0, a_T)) \right\} \\
&= \max\left\{ -r_1 - E_x \int_0^T r(\xi(t))\, dt + \int V(y)\, P'(dy \mid x, a_T),\ \ -r_1 - E_0 \int_0^T r(\xi(t))\, dt + \int V(y)\, P'(dy \mid c, (a_0, a_T)) - r_2 m \right\},
\end{aligned}
\]
where $P'(B \mid x, a) = P(B \mid x, a) - \frac{1}{T + m}\, \mu(B)\, r(x, a)$, $a \in \{a_T, (a_0, a_T)\}$.
Theorem 6 ([27]) Suppose that for an arbitrary monotone nonincreasing bounded function $u(\cdot)$ defined on $[0, \infty)$, the function $E_x u(\xi(t))$ is monotonically nonincreasing in $x$ for any $t \ge 0$. Then the optimal strategy $\delta^*$ has the form
\[
\delta^* = \begin{cases} a_T, & x < x^*, \\ (a_0, a_T), & x \ge x^*. \end{cases}
\]
4.3 Optimal Queuing System Control
Consider a standard G/D/k queuing system in which the incoming request flow has the interarrival-time distribution function $F : \mathbb{R}_+ \to [0, 1]$, the service time is a constant $c > 0$, and the maximum allowable queue length is $k > 0$. The control consists in deciding, at the moment each request arrives, whether or not to accept it for service. An accepted request is served immediately if the device is free and joins the queue otherwise. A request that is not accepted is lost. For serving a customer, the system receives income $d > 0$ and incurs losses $w(x)$, where $x$ is the time the customer spends in the queue. Thus, the phase space is $X = [0, (k + 1)c]$. If $a_0$ is the decision not to accept a request and $a_1$ is the decision to accept it, then the set of decisions is $A = \{a_0, a_1\}$. If $x \in [kc, (k + 1)c]$, then $A(x) = \{a_0\}$; if $x \in [0, kc)$, then $A(x) = \{a_0, a_1\}$. As before, we consider the class of all admissible strategies and its subclass of stationary deterministic strategies. Let us introduce the income function $r(x, a)$:
\[
r(x, a_0) = 0, \qquad r(x, a_1) = d - w(x), \quad x \in [0, (k + 1)c].
\]
Income per unit of time is defined as
\[
\varphi(x, \delta) = \liminf_{n \to \infty} \frac{1}{n+1}\, E_x^{\delta} \sum_{i=0}^{n} r(X_i, D_i),
\]
where $X_i$ is the state of the system at the time of arrival of the $i$-th customer, $D_i$ is the decision made at that time, and $E_x^{\delta}$ is the conditional mathematical expectation of income for the initial state $x$ and the strategy $\delta$. The strategy $\delta^*$ is optimal iff $\varphi(x, \delta^*) = \sup_{\delta} \varphi(x, \delta)$, where the supremum is taken over all admissible strategies. The existence of a $\varphi$-optimal strategy in the class of stationary deterministic strategies is proved in [26], and the optimal strategy has the form $\delta^* = \{D_0^*, D_1^*, \ldots\}$, where
\[
D_i^* = \begin{cases} a_1, & \text{if } X_i \le x^*, \\ a_0, & \text{if } X_i > x^*, \end{cases} \qquad x^* \in [0, kc],
\]
i.e., the optimal strategy has a threshold structure.
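The threshold structure again leaves a single parameter $x^*$ to be tuned, and the income criterion can be estimated by simulation. The sketch below rests on an assumption that the text does not spell out: the state $X_i$ is taken to be the unfinished work seen by the $i$-th arriving request, so that it follows the recursion $X_{i+1} = \max\{0, X_i + c \cdot [\text{request } i \text{ accepted}] - T_i\}$ with interarrival times $T_i$. The loss function `w`, the exponential interarrival times, and all names are hypothetical.

```python
import random


def average_income(x_star, k, c, d, w, interarrival, n_customers=20000, seed=0):
    """Estimate the income per arriving request under the threshold rule.

    Assumption (not stated in the source): X_i is the unfinished work seen by
    the i-th arrival, and X_{i+1} = max(0, X_i + c*[accepted] - T_i), where
    T_i is the time until the next arrival.
    """
    rng = random.Random(seed)
    x, income = 0.0, 0.0
    for _ in range(n_customers):
        accept = x <= x_star and x < k * c   # a_1 is not admissible once x >= k*c
        if accept:
            income += d - w(x)               # r(x, a_1) = d - w(x); r(x, a_0) = 0
            x += c                           # the accepted request adds c units of work
        x = max(0.0, x - interarrival(rng))
    return income / n_customers


if __name__ == "__main__":
    k, c, d = 3, 1.0, 1.0
    w = lambda x: 0.4 * x                            # hypothetical waiting loss
    arrivals = lambda rng: rng.expovariate(1.2)      # hypothetical interarrival times
    for x_star in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]:
        print(x_star, round(average_income(x_star, k, c, d, w, arrivals), 4))
```

Scanning a grid of thresholds in $[0, kc]$ as in the usage lines gives a crude picture of how the estimated income depends on $x^*$.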
4.4 Optimal Dividend Policies for Risk Process Management
Let us consider a risk process in discrete time, which describes the evolution of the states of some stochastic storage system [11, 15, 16]:
\[
X_{n+1} = \max\{0,\ X_n + \xi_n - D_n\}, \quad X_0 = x, \quad n = 0, 1, \ldots, \tag{3}
\]
where $X_n \in \mathbb{R}_+$ is the state of the reserves (capital) of the system, $\xi_n \in \mathbb{R}$ is the net income, and $D_n \in \mathbb{R}_+$ is the dividend payment at time moment $n$. The evolution of the system is evaluated by the criterion
\[
\psi_\beta(x, \delta) = E \sum_{n=0}^{T} \beta^n \big( D_n + q \min\{0,\ X_n + \xi_n - D_n\} \big), \quad T \le \infty, \tag{4}
\]
where δ = (D0 , D1 , . . .) is a dividend strategy; E is the mathematical expectation over random variables ξ0 , ξ1 , . . .; β ∈ (0, 1) is the time discounting factor; and q ≥ 1 is some penalty coefficient. In the risk theory, the so-called barrier dividend strategies are considered [15, 16]: Dn = max{0, Xn − Bn }. In some cases, the optimal dividend strategy is just defined by a constant b = Bn or by a linear dividend barrier Bn = b + cn. Thus, it makes sense to consider piecewise linear barrier functions Bn (b, c) = mini=0,...,m {bi + ci n} dependent on a finite number of
P. Knopov and V. Norkin
unknown parameters $b = (b_0, \dots, b_m)$ and $c = (c_0, \dots, c_m)$. To put a restriction on the curvature of the dividend barrier, we may require
$$0 \le b_0 \le b_1 \le \dots \le b_m, \qquad c_0 \ge c_1 \ge \dots \ge c_m \ge 0. \tag{5}$$
Having substituted the parametric dividend policy $D_n(b, c) = \max\{0, X_n - B_n\}$ into the dynamic equations (3) and the objective function (4), we obtain a stochastic programming problem:
$$\Psi_\beta(x, b, c) = E \sum_{n=0}^{T} \beta^n \left( D_n(b, c) + q \min\{0, X_n(b, c) + \xi_n - D_n(b, c)\} \right) \to \max_{b, c},$$
subject to constraints (3) and (5). Note that the function $\Psi_\beta(x, b, c)$ is nonconcave in (b, c), so the resulting problem can be multi-extremal.
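The parametric problem above can be evaluated by straightforward simulation. The sketch below computes the discounted criterion (4) along one income path for a piecewise linear barrier and averages it over sampled paths. The Gaussian net income in `estimate` and all function names are assumptions made for illustration only.

```python
import random

def barrier(n, b, c):
    """Piecewise linear barrier B_n(b, c) = min_i (b_i + c_i * n)."""
    return min(bi + ci * n for bi, ci in zip(b, c))

def path_value(x0, xi, b, c, beta, q):
    """Discounted criterion (4) along one income path xi = (xi_0, ..., xi_T)."""
    x, total = x0, 0.0
    for n, income in enumerate(xi):
        d = max(0.0, x - barrier(n, b, c))            # D_n = max{0, X_n - B_n}
        rest = x + income - d
        total += beta ** n * (d + q * min(0.0, rest))  # dividend plus penalised deficit
        x = max(0.0, rest)                             # dynamics (3)
    return total

def estimate(x0, b, c, beta, q, horizon, n_paths, seed=0):
    """Sample average of path_value; the Gaussian income law is an assumption."""
    rng = random.Random(seed)
    vals = []
    for _ in range(n_paths):
        xi = [rng.gauss(0.5, 1.0) for _ in range(horizon + 1)]
        vals.append(path_value(x0, xi, b, c, beta, q))
    return sum(vals) / n_paths

if __name__ == "__main__":
    print(estimate(5.0, [3.0, 6.0], [0.5, 0.0], 0.95, 2.0, 50, 200))
```

Because the barrier enters through max and min operations, the estimate is a nonsmooth function of (b, c), which is why derivative-free or smoothing-based methods are of interest here.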
4.5 Optimization of an Energy Accumulation System
Stochastic factors play a growing role in smart power systems and grids and require proper probabilistic modeling tools [28]. In this section, we model an energy accumulation system as an inventory system with stochastic demand and losses of goods. We then apply stochastic programming methodology to the optimization of such systems. There are several ways to store energy: charging an accumulator (battery), pumping water into a reservoir, spinning a heavy disk, lifting a heavy weight, cryptocurrency mining, etc. [29]. Let us consider the following discrete-time mathematical model of electrical energy storage. Suppose an energy system consists of an energy accumulator, an external energy supplier, a local energy producer, and a local energy consumer. The accumulator is characterized by its maximal capacity $x_1$, the cost $c_1$ of providing a unit of capacity, the maximal rate of charging s, and the maximal rate of discharging r, per unit of time. Assume that the external energy supplier provides an energy inflow $x_2$ at cost $c_2$. The quantities $(x_1, x_2) \in X \subset R^2$ are the decision variables of the model. Let the rate of local consumption be described by a random variable $\xi_{1t} \ge 0$ and the productivity of the local energy producer by a random variable $\xi_{2t} \ge 0$, where t = 1, ..., T denotes time intervals (hours, days, etc.). Note that $\xi_{1t}$ and $\xi_{2t}$ may be dependent. Let us introduce the random variable $\xi_t = \xi_{1t} - \xi_{2t}$, which can take both positive and negative values. Excess energy can be stored in the accumulator if it is not full, and a shortage of energy can be covered from the accumulator if it is not empty. Energy that is neither used nor stored is lost. The rates of storage and of energy release are bounded. Let α < 1 be the efficiency of energy storage in the accumulator and β < 1 be the efficiency of energy release from the accumulator.
Let us model the level of energy stored in the accumulator by a sequence of random variables $Y_t$, t = 0, 1, ..., T. The evolution of the energy storage in the accumulator is described by the following recurrence equation:
$$Y_{t+1} = \begin{cases} \min\{x_1,\ Y_t + \alpha \min\{s, x_2 - \xi_t\}\}, & x_2 \ge \xi_t, \\ \max\{0,\ Y_t - \min\{r, \xi_t - x_2\}\}, & x_2 < \xi_t, \end{cases}$$
where $Y_0$ is given, t = 0, 1, ..., T − 1. The first line of this equation corresponds to an excess of energy provision, and the second line corresponds to a lack of energy in the system. Let us take as an instant performance indicator of the considered system the energy losses in the system at a given time interval. The losses are due to the inability to accumulate all the excess energy, the inefficiency of the accumulation process, and the inefficiency of the release process. Thus, the losses $f_{t+1}$ at time interval t + 1 are given by the following equation:
$$f_{t+1} = \begin{cases} (x_2 - \xi_t) - \left(\min\{x_1,\ Y_t + \alpha \min\{s, x_2 - \xi_t\}\} - Y_t\right), & x_2 \ge \xi_t, \\ (1 - \beta) \min\{Y_t,\ \min\{r, \xi_t - x_2\}\}, & x_2 < \xi_t. \end{cases}$$
Here the first line expresses the difference between the excess energy and the stored energy. The second line expresses the loss of energy due to the inefficiency of the release process. Another instant performance indicator could be the economic losses at a given time interval, which comprise the cost of the lost energy, the cost of unsatisfied demand, and the cost of the accumulator capacity provision:
$$g_{t+1} = \begin{cases} c_1 x_1 + c_2 \left((x_2 - \xi_t) - \left(\min\{x_1,\ Y_t + \alpha \min\{s, x_2 - \xi_t\}\} - Y_t\right)\right), & x_2 \ge \xi_t, \\ c_1 x_1 + c_3 \left(\xi_t - x_2 - \beta \min\{Y_t,\ \min\{r, \xi_t - x_2\}\}\right), & x_2 < \xi_t. \end{cases}$$
Here the first line describes the cost of the accumulator capacity provision plus the cost of the lost energy. The second line expresses the cost of the accumulator capacity provision plus the penalty for the lack of energy provision, with $c_3$ as a penalty coefficient, e.g., an extra energy tariff. In the case of nonzero accumulator setup costs $c_0 > 0$, we have
$$g_{t+1} = \begin{cases} c_2 (x_2 - \xi_t), & x_1 = 0,\ x_2 \ge \xi_t, \\ c_0 + c_1 x_1 + c_2 \left((x_2 - \xi_t) - \left(\min\{x_1,\ Y_t + \alpha \min\{s, x_2 - \xi_t\}\} - Y_t\right)\right), & x_1 > 0,\ x_2 \ge \xi_t, \\ c_3 (\xi_t - x_2), & x_1 = 0,\ x_2 < \xi_t, \\ c_0 + c_1 x_1 + c_3 \left(\xi_t - x_2 - \beta \min\{Y_t,\ \min\{r, \xi_t - x_2\}\}\right), & x_1 > 0,\ x_2 < \xi_t. \end{cases}$$
All quantities Yt , ft , and gt are functions of decision variables (x1 , x2 ) and random variables ξ t−1 = (ξ0 , ξ1 , . . . , ξt−1 ), t = 1, 2, . . . , T . As a total performance indicator, one can take the expected economic losses per unit of time to be minimized:
$$G(x_1, x_2) = E_\xi\, \frac{1}{T} \sum_{t=0}^{T-1} g_{t+1}(x_1, x_2, \xi^t) \to \min_{(x_1, x_2)^\top \in X},$$
where $E_\xi$ denotes the mathematical expectation over all random variables $\xi_0, \xi_1, \dots, \xi_{T-1}$. The formulated optimization problem is a nonconvex nonsmooth stochastic optimization problem. In the presence of setup costs, it is even discontinuous (see Fig. 4). One needs to find a global optimum for this problem. This expectation problem can be approximated by the so-called sample average approximation problem [1–3]:
$$G_N(x_1, x_2) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{T} \sum_{t=0}^{T-1} g_{t+1}(x_1, x_2, \xi^{tn}) \to \min_{(x_1, x_2)^\top \in X}, \tag{6}$$
where $\xi^{(t-1)n} = (\xi_0^n, \xi_1^n, \dots, \xi_{t-1}^n)$, n = 1, 2, ..., N, are composed as parts of independent random sequences $\xi_0^n, \xi_1^n, \dots, \xi_{T-1}^n$, n = 1, 2, ..., N.
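As an illustration of the sample average approximation (6), the following sketch implements one transition of the storage level together with the economic loss with setup costs, and averages the losses over demand paths. The function names and the keyword-argument layout are assumptions; the parameter values in the usage lines are those reported for the numerical example of Sect. 7.2.

```python
import random

def step(y, xi, x1, x2, alpha, beta, s, r, c0, c1, c2, c3):
    """One transition of the storage level and the economic loss g_{t+1}."""
    setup = c0 + c1 * x1 if x1 > 0 else 0.0       # no accumulator, no capacity cost
    if x2 >= xi:                                   # excess of energy
        y_next = min(x1, y + alpha * min(s, x2 - xi))
        lost = (x2 - xi) - (y_next - y)            # excess that was not stored
        return y_next, setup + c2 * lost
    demand_gap = min(r, xi - x2)                   # lack of energy
    released = min(y, demand_gap)
    y_next = max(0.0, y - demand_gap)
    return y_next, setup + c3 * (xi - x2 - beta * released)

def saa_objective(x1, x2, samples, y0=0.0, **p):
    """Sample average of per-period economic losses over demand paths, cf. (6)."""
    total = 0.0
    for path in samples:
        y = y0
        for xi in path:
            y, g = step(y, xi, x1, x2, **p)
            total += g / len(path)
    return total / len(samples)

if __name__ == "__main__":
    rng = random.Random(0)
    params = dict(alpha=1.0, beta=1.0, s=1.2, r=1.0, c0=0.6, c1=0.5, c2=3.0, c3=4.0)
    # N = 50 paths of length T = 20 with normal demand, mean 3 and variance 9.
    samples = [[rng.gauss(3.0, 3.0) for _ in range(20)] for _ in range(50)]
    print(saa_objective(2.0, 3.0, samples, **params))
    print(saa_objective(0.0, 3.0, samples, **params))
```

Evaluating `saa_objective` on a grid of $(x_1, x_2)$ is a simple way to see the jump at $x_1 = 0$ caused by the setup cost.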
5 Parametric Control Strategies
In the previous sections, we demonstrated that, under certain conditions, optimal control strategies may be deterministic and stationary and may depend on the state of the system and on a finite number of unknown parameters. So it makes sense to search for controls of some parametric form. Such an approach is widely used for solving complex optimal control problems [30, Ch. 15]. In this section, we consider several kinds of inventory control strategies that depend on a finite number of unknown parameters. Having substituted such a parametric control into the original optimal control problem, we obtain a stochastic optimization problem in the unknown parameters. There are several methods for solving such problems [1–3]. However, the resulting problems may turn out to be nonconvex, nonsmooth, and even discontinuous, so standard methods may not be applicable. To solve such complicated problems, we employ the successive stochastic smoothing method [22] and demonstrate its relevance.
5.1 (s, S)-Replenishment Strategies
An inventory stochastic process in discrete time has the form
$$X_{t+1} = (X_t + D_t - \xi_t)_+ = \max\{0, X_t + D_t - \xi_t\}, \qquad t = 0, 1, \dots, T, \tag{7}$$
where t is discrete time, $X_t$ is the value of the store at the beginning of the time interval [t, t + 1), $D_t$ is an instant restocking at time moment t, and $\xi_t$ is the total random demand during the time interval [t, t + 1). The store is bounded, $X_t \in [0, Q]$, t = 0, 1, ..., T. An alternative variant of this equation is the following:
$$X_{t+1} = (X_t - \xi_t)_+ + D_t = \max\{0, X_t - \xi_t\} + D_t, \qquad t = 0, 1, \dots, T. \tag{8}$$
Let us consider the so-called (s, S) two-dimensional parametric strategy of restocking:
$$D_t = U_{X_t}(s, S), \quad t = 0, 1, \dots, T, \qquad U_x(s, S) = \begin{cases} \emptyset, & s > S, \\ 0, & s \le S,\ s < x, \\ S - x, & s \le S,\ s \ge x, \end{cases} \tag{9}$$
where the pair (s, S) constitutes the parameters of the strategy, 0 ≤ s < S ≤ Q. Note that the function $U_x(s, S)$ is discontinuous in both variables x and (s, S). The cost function associated with each time interval [t, t + 1), $r_t = r(X_t, D_t, \xi_t)$, consists of three terms: $c_1(X_t, D_t, \xi_t)$ is the cost of storing goods, $c_2(D_t)$ is the cost of buying goods, and $e(X_t, D_t, \xi_t)$ is the revenue from selling goods, where
$$r(x, d, \xi) = c_1(x, d, \xi) + c_2(d) - e(x, d, \xi),$$
$$c_1(x, d, \xi) = \max\{\alpha(x + d - \xi), 0\} = \begin{cases} \alpha(x + d - \xi), & x + d > \xi, \\ 0, & x + d \le \xi; \end{cases}$$
$$c_2(d) = \begin{cases} 0, & d = 0, \\ a + bd, & d > 0; \end{cases} \qquad e(x, d, \xi) = \begin{cases} \beta\xi, & x + d > \xi, \\ \beta(x + d), & x + d \le \xi; \end{cases} \qquad \alpha, \beta, a, b \ge 0.$$
Then the cost function $\bar f_{x,\xi}(s, S) = r(x, U_x(s, S), \xi)$ for the simplest (s, S) restocking strategy takes the form
$$\bar f_{x,\xi}(s, S) = r(x, U_x(s, S), \xi) = \begin{cases} \emptyset, & s > S; \\ \alpha(x - \xi) - \beta\xi, & s \le S,\ s \le x,\ x > \xi; \\ -\beta x, & s \le S,\ s \le x,\ x \le \xi; \\ a + b(S - x) + \alpha(S - \xi) - \beta\xi, & s \le S,\ s > x,\ S > \xi; \\ a + b(S - x) - \beta S, & s \le S,\ s > x,\ S \le \xi. \end{cases}$$
The cost function $\bar f_{x,\xi}(s, S)$ is defined on the set $\{(s, S) \in R^2 : 0 \le s \le S \le Q\}$. To simplify the feasibility set, we redefine the cost function $\bar f_{x,\xi}(s, S)$ by means of the penalty value (C + s − S) for the case s > S, where C is a sufficiently large constant. Now we can consider the function $f_{x,\xi}(s, S)$ defined on the cube $K = \{(s, S) \in R^2 : 0 \le s \le Q,\ 0 \le S \le Q\}$:
$$f_{x,\xi}(s, S) = \begin{cases} C + s - S, & s > S; \\ \alpha(x - \xi) - \beta\xi, & s \le S,\ s \le x,\ x > \xi; \\ -\beta x, & s \le S,\ s \le x,\ x \le \xi; \\ a + b(S - x) + \alpha(S - \xi) - \beta\xi, & s \le S,\ s > x,\ S > \xi; \\ a + b(S - x) - \beta S, & s \le S,\ s > x,\ S \le \xi. \end{cases} \tag{10}$$
Note that the functions $c_2(d)$, $r(x, d, \xi)$, and $f_{x,\xi}(s, S)$ are discontinuous (see Fig. 1).
The multidimensional (s, S)-strategy is constructed as follows. Let $s \in R^m$ and $S \in R^m$ be vectors such that $0 \le s_1 \le s_2 \le \dots \le s_m \le Q$, $Q \ge S_1 \ge S_2 \ge \dots \ge S_m \ge 0$, and $s_i \le S_i$, i = 1, ..., m (by convention $s_0 = 0$). Then
$$U_x(s, S) = \begin{cases} \emptyset, & s_m > S_1, \\ 0, & s_m \le S_1,\ x \ge S_1, \\ S_{i(x)} - x, & s_m \le S_1,\ s_{i(x)-1} \le x < s_{i(x)}, \end{cases}$$
where $i(x)$ is the index for which $s_{i(x)-1} \le x < s_{i(x)}$. The cost function $\bar f_{x,\xi}(s, S) = r(x, U_x(s, S), \xi)$ for the multidimensional (s, S) restocking strategy takes the form
$$\bar f_{x,\xi}(s, S) = \begin{cases} \emptyset, & s_m > S_1; \\ \alpha(x - \xi) - \beta\xi, & s_m \le S_1,\ s_1 \le x \le Q,\ x > \xi; \\ -\beta x, & s_m \le S_1,\ s_1 \le x \le Q,\ x \le \xi; \\ a + b(S_{i(x)} - x) + \alpha(S_{i(x)} - \xi) - \beta\xi, & s_m \le S_1,\ s_{i(x)-1} \le x < s_{i(x)},\ S_{i(x)} > \xi; \\ a + b(S_{i(x)} - x) - \beta S_{i(x)}, & s_m \le S_1,\ s_{i(x)-1} \le x < s_{i(x)},\ S_{i(x)} \le \xi. \end{cases}$$
Fig. 1 One period (random) cost as a function of (s, S)
This function is defined on the convex set $\Omega = \{(s, S) \in R^{2m} : 0 \le s_1 \le s_2 \le \dots \le s_m \le Q;\ Q \ge S_1 \ge S_2 \ge \dots \ge S_m \ge 0;\ s_i \le S_i,\ i = 1, \dots, m\}$. This set can also be expressed with a certain convex function H(s, S), namely, $\Omega = \{(s, S) : H(s, S) \le 0\}$, where
$$H(s, S) = \max\Big\{ \max_{1 \le i \le m} \{s_i - s_{i+1}\},\ \max_{1 \le i \le m} \{S_{i+1} - S_i\},\ \max_{1 \le i \le m} \{s_i - S_i\} \Big\}.$$
Here, by convention, $s_0 = 0$, $s_{m+1} = Q$, $S_0 = Q$, and $S_{m+1} = 0$. To simplify the feasibility set, we can redefine the function $\bar f_{x,\xi}(s, S)$ for $(s, S) \notin \Omega$ as $f_{x,\xi}(s, S) = C + \max\{0, H(s, S)\}$. Now we can consider the function $f_{x,\xi}(s, S)$ as defined on the cube $K = \{(s, S) \in R^{2m} : 0 \le s_i \le Q,\ 0 \le S_i \le Q,\ i = 1, \dots, m\}$:
$$f_{x,\xi}(s, S) = \begin{cases} C + H(s, S), & H(s, S) > 0; \\ \alpha(x - \xi) - \beta\xi, & H(s, S) \le 0,\ s_1 \le x \le Q,\ x > \xi; \\ -\beta x, & H(s, S) \le 0,\ s_1 \le x \le Q,\ x \le \xi; \\ a + b(S_{i(x)} - x) + \alpha(S_{i(x)} - \xi) - \beta\xi, & H(s, S) \le 0,\ s_{i(x)-1} \le x < s_{i(x)},\ S_{i(x)} > \xi; \\ a + b(S_{i(x)} - x) - \beta S_{i(x)}, & H(s, S) \le 0,\ s_{i(x)-1} \le x < s_{i(x)},\ S_{i(x)} \le \xi. \end{cases}$$
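For the two-parameter case, the penalised one-period cost can be written directly from its three components. The sketch below is a minimal illustration with hypothetical function names; it evaluates r(x, U_x(s, S), ξ) using the restocking rule (9), so the boundary case x = s is treated as a restocking event, and it uses an arbitrary large constant C = 10^6 as the penalty.

```python
def restock(x, s, S):
    """(s, S) rule (9): order up to S when the stock x does not exceed s."""
    return S - x if x <= s else 0.0

def one_period_cost(x, xi, s, S, alpha, beta, a, b, C=1e6):
    """Penalised one-period cost on the square 0 <= s, S <= Q."""
    if s > S:
        return C + s - S                       # infeasible pair: large penalty
    d = restock(x, s, S)
    level = x + d
    storage = max(alpha * (level - xi), 0.0)   # c1: cost of unsold stock
    purchase = a + b * d if d > 0 else 0.0     # c2: the fixed cost a causes the jump
    revenue = beta * min(level, xi)            # e: revenue from satisfied demand
    return storage + purchase - revenue

if __name__ == "__main__":
    # The cost jumps when s crosses the current stock level x = 1.
    for s in (0.5, 0.9, 1.0, 1.5):
        print(s, one_period_cost(1.0, 3.0, s, 8.0, 1.0, 2.0, 4.0, 1.0))
```

The printed values change abruptly once s reaches x, which is the discontinuity in (s, S) discussed above.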
5.2 Other Parametric Replenishment Strategies
The barrier replenishment strategy is a particular case of the (s, S)-strategy with S = s, namely, $U_x(s) = \max\{0, s - x\}$. It is continuous in both variables. In this case, $x + U_x(s) = \max\{x, s\}$. The state $X_t$ of the stock at time t is obtained recursively from the equations
$$X_{\tau+1} = (X_\tau + D_\tau - \xi_\tau)_+ = \max\{0, X_\tau + D_\tau - \xi_\tau\}, \qquad \tau = 0, 1, \dots, t - 1,$$
where $D_\tau = \max\{0, s - X_\tau\}$. The values $X_t = X_t(s)$ of the stock, as functions of the parameter s, are still complicated nonsmooth nonconvex functions. The same is true of the corresponding cost function (with zero fixed cost a = 0):
$$f_{x,\xi}(s) = \begin{cases} \alpha(x - \xi) - \beta\xi, & s \le x,\ x > \xi; \\ -\beta x, & s \le x,\ x \le \xi; \\ b(s - x) + \alpha(s - \xi) - \beta\xi, & s > x,\ s > \xi; \\ b(s - x) - \beta s, & s > x,\ s \le \xi. \end{cases}$$
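The recursion for the stock under a barrier strategy is short enough to state as code. The following sketch, with a hypothetical function name, returns the whole trajectory for a given barrier s and demand sequence.

```python
def barrier_stock_path(s, x0, demands):
    """Stock levels X_0, ..., X_t under D = max(0, s - X) and dynamics (7)."""
    xs = [x0]
    for xi in demands:
        d = max(0.0, s - xs[-1])                 # refill up to the barrier s
        xs.append(max(0.0, xs[-1] + d - xi))     # demand is served, stock cannot go negative
    return xs

if __name__ == "__main__":
    print(barrier_stock_path(5.0, 2.0, [1.0, 6.0, 2.0]))  # [2.0, 4.0, 0.0, 3.0]
```

Each max operation introduces a kink, so the final stock is a piecewise linear but nonsmooth function of s.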
In the multinomenclature case, the stock is described by a vector $x = (x^1, \dots, x^k)$, the demand is a random vector $\xi = (\xi^1, \dots, \xi^k)$, and the barrier strategy is also a vector $U_x(s) = \left(\max\{0, s^1 - x^1\}, \dots, \max\{0, s^k - x^k\}\right)$ dependent on the vector parameter $s = (s^1, \dots, s^k)$. The total one-period cost is the sum of the costs of the individual goods,
$$f_{x,\xi}(s) = \sum_{i=1}^{k} f_{x^i,\xi^i}(s^i),$$
where
$$f_{x^i,\xi^i}(s^i) = \begin{cases} \alpha_i(x^i - \xi^i) - \beta_i \xi^i, & s^i \le x^i,\ x^i > \xi^i; \\ -\beta_i x^i, & s^i \le x^i,\ x^i \le \xi^i; \\ b_i(s^i - x^i) + \alpha_i(s^i - \xi^i) - \beta_i \xi^i, & s^i > x^i,\ s^i > \xi^i; \\ b_i(s^i - x^i) - \beta_i s^i, & s^i > x^i,\ s^i \le \xi^i. \end{cases}$$
The cost function $f_{x,\xi}(s)$ is defined on the set
$$\Big\{ s \in R^k : 0 \le s^i \le Q,\ \sum_{i=1}^{k} s^i \le Q \Big\}.$$
This set contains a bounding constraint $\sum_{i=1}^{k} s^i \le Q$.
The piecewise constant restocking strategy has the form
$$U_x(s, S) = \begin{cases} \emptyset, & s_m > S_1, \\ 0, & s_m \le S_1,\ x \ge S_1, \\ S_{i(x)}, & s_m \le S_1,\ s_{i(x)-1} \le x < s_{i(x)}, \end{cases}$$
where the vectors $s \in R^m$, $S \in R^m$ satisfy the constraints $0 \le s_1 \le s_2 \le \dots \le s_m \le Q$; $Q \ge S_1 \ge S_2 \ge \dots \ge S_m \ge 0$; $s_i + S_i \le Q$, i = 1, ..., m.
The multinomenclature (s, S) replenishment strategy reads as follows. Let k be the number of different goods at a warehouse of common volume Q, $x^i$ be the volume of good i, and the pair $(s^i, S^i)$ be the parameters of the $(s^i, S^i)$-restocking strategy for good i, i = 1, ..., k; $s = (s^1, \dots, s^k)$, $S = (S^1, \dots, S^k)$, $x = (x^1, \dots, x^k)$. Thus, the whole (s, S) restocking strategy $U_x(s, S) = \left(U^1_{x^1}(s^1, S^1), \dots, U^k_{x^k}(s^k, S^k)\right)$ comprises k functions:
$$U^i_{x^i}(s^i, S^i) = \begin{cases} \emptyset, & s^i > S^i, \\ 0, & s^i \le S^i,\ s^i < x^i, \\ S^i - x^i, & s^i \le S^i,\ s^i \ge x^i; \end{cases} \qquad i = 1, \dots, k.$$
The vector function $U_x(s, S)$ is defined on the set
$$\Big\{ (s, S) \in R^{2k} : 0 \le s^i \le S^i \le Q,\ i = 1, \dots, k;\ \sum_{i=1}^{k} S^i \le Q \Big\}.$$
Note that this set contains a bounding constraint $\sum_{i=1}^{k} S^i \le Q$.
6 Dynamic Inventory Stochastic Optimization Problems
In this section, we approximate inventory optimal control problems by finite-dimensional stochastic optimization problems, which, in turn, are approximated by sample average deterministic problems.
6.1 Inventory Stochastic Optimization Problems in Discrete Time
The stochastic optimization problem for selecting the optimal (s, S) replenishment strategy is
$$F(s, S) = E_\xi\, \frac{1}{T+1} \sum_{t=0}^{T} r(X_t, D_t, \xi_t) \to \min_{0 \le s \le S \le Q}, \qquad T < \infty, \tag{11}$$
or
$$F(s, S) = E_\xi \sum_{t=0}^{T} \gamma^t\, r(X_t, D_t, \xi_t) \to \min_{0 \le s \le S \le Q}, \qquad T \le \infty, \tag{12}$$
subject to the dynamic constraints (8) and (9). Here $E_\xi$ denotes the mathematical expectation with respect to the random variable $\xi = (\xi_0, \xi_1, \dots, \xi_T)$, and γ is a discounting factor, 0 < γ < 1. In general, the function $f_{x,\xi}(s, S) = r(x, U_x(s, S), \xi)$ is discontinuous on the set {(s, S) : 0 ≤ s ≤ S ≤ Q}, owing to the discontinuity of both the control U and the cost $c_2$, as illustrated in Fig. 1. This circumstance makes problems (11)–(12) particularly difficult. The mathematical expectations in (11) and (12) can be estimated by sample average values [1–3]:
$$F_N(s, S) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{T+1} \sum_{t=0}^{T} r(X_t^n, D_t^n, \xi_t^n) \to \min_{0 \le s \le S \le Q}, \qquad T < \infty, \tag{13}$$
Fig. 2 A multiperiod average cost function
and
$$F_N(s, S) = \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} \gamma^t\, r(X_t^n, D_t^n, \xi_t^n) \to \min_{0 \le s \le S \le Q}, \qquad T \le \infty, \tag{14}$$
where $\xi^n = (\xi_1^n, \dots, \xi_T^n)$ are independent samples of the random demand $\xi_t$ in independent series n = 1, ..., N; $X_t^n$, t = 0, ..., T, is the evolution of the store under the sample $\xi^n$ according to dynamic equation (1); and $D^n = \left(D_0^n = U_{X_0^n}(s, S), \dots, D_T^n = U_{X_T^n}(s, S)\right)$ is the sequence of controls that corresponds to the demand sample $\xi^n$. An example of the function $F_N(s, S)$ is depicted in Fig. 2. The functions $F_N(s, S)$ can be discontinuous and multi-extremal. In the case of barrier strategies, they may be continuous but are still nonsmooth, nonconvex, and multi-extremal. For their optimization, one can apply the so-called successive stochastic smoothing method [22] (see also [31, 32]). We validate this method for the so-called strongly lower semicontinuous functions.
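A sample average of the form (13) can be computed by simulating the store along each demand path. The sketch below is illustrative only: it assumes dynamics (8) with the (s, S) rule (9), exponential demand in the usage lines, and hypothetical function and parameter names.

```python
import random

def r(x, d, xi, alpha, beta, a, b):
    """One-period cost: storage plus purchase minus revenue."""
    storage = max(alpha * (x + d - xi), 0.0)
    purchase = a + b * d if d > 0 else 0.0
    revenue = beta * min(x + d, xi)
    return storage + purchase - revenue

def average_cost(s, S, demands, x0, alpha, beta, a, b, C=1e6):
    """Sample average cost over demand paths, cf. (13), with a penalty for s > S."""
    if s > S:
        return C + s - S
    total = 0.0
    for path in demands:
        x = x0
        for xi in path:
            d = S - x if x <= s else 0.0                 # (s, S) rule (9)
            total += r(x, d, xi, alpha, beta, a, b) / len(path)
            x = max(0.0, x - xi) + d                     # dynamics (8)
    return total / len(demands)

if __name__ == "__main__":
    rng = random.Random(1)
    # Assumed demand law: exponential with mean 3, N = 100 paths of 30 periods.
    demands = [[rng.expovariate(1.0 / 3.0) for _ in range(30)] for _ in range(100)]
    for s, S in ((1.0, 5.0), (2.0, 8.0), (4.0, 10.0)):
        print(s, S, average_cost(s, S, demands, 5.0, 1.0, 2.0, 4.0, 1.0))
```

Since the same demand paths are reused for every (s, S), the resulting function is deterministic and can be handed to a global or smoothing-based optimizer.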
6.2 Inventory Stochastic Optimization Problems in Continuous Time
An inventory stochastic process in continuous time is piecewise constant in time and jumps at random times $t_k$, k = 0, 1, ..., when rare clients come to the warehouse with the demand $\xi_k$; $t_0 = 0$. Once the demand is satisfied, the stock is replenished. Let $X_k$ be the stock, $D_k$ be the restocking, and $\xi_k$ be the random demand at the time moment $t_k$. The evolution of the stock is described by the equations
$$X_{k+1} = (X_k - \xi_k)_+ + D_k = \max\{0, X_k - \xi_k\} + D_k, \qquad k = 0, 1, \dots, K, \tag{15}$$
where K = K(T) is the random number of clients that come to the warehouse until time T. The (s, S) restocking strategy is defined as follows:
$$D_k = U_{X_k}(s, S), \quad k = 0, 1, \dots, K; \qquad U_x(s, S) = \begin{cases} \emptyset, & s > S, \\ 0, & s \le S,\ s < x, \\ S - x, & s \le S,\ s \ge x. \end{cases}$$
The volume of the warehouse is bounded, $X_k \in [0, Q]$. With each time moment $t_k$, a cost function $f_k(s, S)$ is associated, for example, given by the following formula:
$$f_k(s, S) = \begin{cases} C + s - S, & s > S; \\ -\beta\xi_k + \alpha(X_k - \xi_k)(t_{k+1} - t_k), & s \le S,\ X_k > \xi_k,\ X_k - \xi_k > s; \\ -\beta X_k + a + bS + \alpha S(t_{k+1} - t_k), & s \le S,\ X_k \le \xi_k. \end{cases}$$
The stochastic optimization problem for selecting the optimal restocking strategy is
$$F(s, S) = E\, \frac{1}{t_K} \sum_{k=0}^{K(T)} f_k(s, S) \to \min_{0 \le s \le S \le Q} \tag{16}$$
or
$$F(s, S) = E \sum_{k=0}^{K(T)} \gamma^k f_k(s, S) \to \min_{0 \le s \le S \le Q} \tag{17}$$
subject to the dynamic constraint (15). Here E denotes the mathematical expectation with respect to the flows of random variables $\{t_0 = 0, t_1, \dots\}$ and $\{\xi_0, \xi_1, \dots\}$, and γ is a discounting factor, 0 < γ < 1. The mathematical expectations in (16) and (17) can be estimated by sample average values [1–3]:
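$$F_N(s, S) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{t^n_{K^n}} \sum_{k=0}^{K^n(T)} f_k^n(s, S) \to \min_{0 \le s \le S \le Q}, \qquad T < \infty, \tag{18}$$
and
$$F_N(s, S) = \frac{1}{N} \sum_{n=1}^{N} \sum_{k=0}^{K^n(T)} \gamma^k f_k^n(s, S) \to \min_{0 \le s \le S \le Q}, \qquad T < \infty, \tag{19}$$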
where
$$f_k^n(s, S) = \begin{cases} C + s - S, & s > S; \\ -\beta\xi_k^n + \alpha(X_k^n - \xi_k^n)(t_{k+1}^n - t_k^n), & s \le S,\ X_k^n > \xi_k^n,\ X_k^n - \xi_k^n > s; \\ -\beta X_k^n + a + bS + \alpha S(t_{k+1}^n - t_k^n), & s \le S,\ X_k^n \le \xi_k^n; \end{cases}$$
$K^n$ is the random number of clients coming to the warehouse up to time T in the n-th series; $t_0^n, t_1^n, \dots, t^n_{K^n}$ and $\xi_0^n, \dots, \xi^n_{K^n}$ are independent samples of client arrival times and the corresponding demands in independent simulation series n = 1, ..., N; $X_k^n$, k = 0, ..., $K^n$, is the evolution of the store under the n-th sample according to dynamic equation (15); and $\left(D_0^n = U_{X_0^n, \xi_0^n}(s, S), \dots, D^n_{K^n} = U_{X^n_{K^n}, \xi^n_{K^n}}(s, S)\right)$ is the sequence of controls at times $t_0^n, t_1^n, \dots, t^n_{K^n}$ in the n-th independent series. The functions $F_N(s, S)$ can be discontinuous and multi-extremal.
7 Numerical Optimization of Discontinuous Cost Functions
In this section, we consider a numerical method for the optimization of discontinuous functions that arise in inventory optimization problems. The idea is to approximate the original function sequentially by smooth ones and to optimize the latter by stochastic optimization methods.
7.1 Averaged Functions
We limit the consideration to the case of the so-called strongly lower semicontinuous functions.
Definition 1 ((Strongly) Lower Semicontinuous Functions, [32]) A function $F : R^n \to R^1$ is called lower semicontinuous (lsc) at a point x if $\liminf_{k \to \infty} F(x^k) \ge F(x)$ for all sequences $x^k \to x$. A function $F : R^n \to R^1$ is called strongly lower semicontinuous (strongly lsc) at a point x if it is lower semicontinuous at x and there exists a sequence $x^k \to x$
such that F is continuous at every $x^k$ and $F(x^k) \to F(x)$. A function F is called lower semicontinuous (strongly lower semicontinuous) on $X \subseteq R^n$ if this is the case for all x ∈ X. The property of strong lower semicontinuity is preserved under continuous transformations. The averaged functions obtained from the original nonsmooth or discontinuous function by convolution with some kernel have a smoother character. Therefore, they are often used in optimization theory (see [31–33] and references therein).
Definition 2 The set (family) of bounded integrable functions $\{\mu_\theta : R^n \to R_+,\ \theta \in R_+\}$ satisfying, for any ε > 0, the condition
$$\lim_{\theta \to 0} \int_{\varepsilon B} \mu_\theta(z)\, dz = 1, \qquad B = \{x \in R^n \mid \|x\| \le 1\},$$
is called a family of mollifiers. Kernels $\{\mu_\theta\}$ are said to be smooth if the functions $\mu_\theta(\cdot)$ are continuously differentiable. A function $F : R^n \to R^1$ is called bounded at infinity if there are positive numbers C and r such that |F(x)| ≤ C for all x with ‖x‖ ≥ r. Given a locally integrable, bounded at infinity, function $F : R^n \to R^1$ and a family of smoothing kernels $\{\mu_\theta\}$, the associated family of averaged functions $\{F_\theta,\ \theta \in R_+\}$ is defined as follows:
$$F_\theta(x) := \int_{R^n} F(x - z)\mu_\theta(z)\, dz = \int_{R^n} F(z)\mu_\theta(x - z)\, dz. \tag{20}$$
As can be seen from the definition, smoothing kernels can have an unbounded support $\operatorname{supp} \mu_\theta = \{x \mid \mu_\theta(x) > 0\}$. To ensure the existence of the integrals (20), we assume that the function F is bounded at infinity. We can always assume this property if we are interested in the behavior of F within some bounded area. If $\operatorname{supp} \mu_\theta \to 0$, then this assumption is superfluous. For example, a family of kernels can be constructed as follows. Let ψ be some probability density function with bounded support supp ψ, and let a positive numerical sequence $\{\theta_\nu,\ \nu \in N\}$ tend to 0 as ν → ∞. Then the smoothing kernels can be taken as follows:
$$\psi_{\theta_\nu}(z) := \frac{1}{(\theta_\nu)^n}\, \psi(z/\theta_\nu).$$
If the function F is not continuous, then we cannot expect the averaged functions Fθ (x) to converge to F uniformly. But we don’t need that. We need such a convergence of the averaged functions Fθ (x) to F that guarantees the convergence of the minima of Fθ (x) to the minima of F . This property is guaranteed by the so-called epi-convergence of functions.
Definition 3 (Epi-Convergence [34]) A sequence of functions $\{F^\nu : R^n \to \bar R,\ \nu \in N\}$ epi-converges to a function $F : R^n \to \bar R$ at a point x iff
(i) $\liminf_{\nu \to \infty} F^\nu(x^\nu) \ge F(x)$ for all $x^\nu \to x$;
(ii) $\lim_{\nu \to \infty} F^\nu(x^\nu) = F(x)$ for some sequence $x^\nu \to x$.
The sequence $\{F^\nu\}_{\nu \in N}$ epi-converges to F if this is the case at every point $x \in R^n$; then we write $F = \text{e-lim}\, F^\nu$.
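Theorem 7 ([32]) For a strongly lower semicontinuous locally integrable function $F : R^n \to R^1$, any associated sequence of averaged functions $\{F_\theta,\ \theta \in R_+\}$ epi-converges to F, that is, for any sequence $\theta_\nu \downarrow 0$, we have $F = \text{e-lim}\, F_{\theta_\nu}$.
The next important property of epi-convergent functions shows that optimization of a discontinuous function F(x) under a constraint x ∈ K can, in principle, be performed by approximating F(x) with epi-convergent functions $F^k(x)$, for example, with averaged functions.
Theorem 8 ([35]) If the sequence of functions $\{F^k : R^n \to \bar R\}$ epi-converges to $F : R^n \to \bar R$, then for any compact $K \subset R^n$
$$\lim_{\varepsilon \downarrow 0} \Big( \liminf_{k} \big( \inf_{K_\varepsilon} F^k \big) \Big) = \lim_{\varepsilon \downarrow 0} \Big( \limsup_{k} \big( \inf_{K_\varepsilon} F^k \big) \Big) = \inf_{K} F,$$
where $K_\varepsilon = K + \varepsilon B$, $B = \{x \in R^n \mid \|x\| \le 1\}$. If
$$F^k(x_\varepsilon^k) \le \inf_{K_\varepsilon} F^k + \delta_k, \qquad x_\varepsilon^k \in K_\varepsilon, \qquad \delta_k \downarrow 0 \text{ as } k \to \infty,$$
then
$$\limsup_{\varepsilon \downarrow 0} \big( \limsup_{k} x_\varepsilon^k \big) \subseteq \operatorname{argmin}_K F,$$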
where $(\limsup_k x_\varepsilon^k)$ denotes the set $X_\varepsilon$ of limit points of the sequence $\{x_\varepsilon^k\}$ and $(\limsup_{\varepsilon \downarrow 0} X_\varepsilon)$ denotes the set of limit points of the family $\{X_\varepsilon,\ \varepsilon \in R_+\}$ as ε ↓ 0. Note that in the optimization problem without constraints, the theorem states that $\lim_k \inf_x F^k = \inf_x F$. To optimize discontinuous functions, we approximate them with averaged functions. The convolution of a discontinuous function with the corresponding kernel (probability density) improves the analytical properties of the resulting function but increases the computational complexity of the problem, since it transforms the deterministic function into an expectation function, which is a multidimensional integral. Therefore, such an approximation makes sense only in combination with the corresponding stochastic optimization methods. First, we study conditions for continuity and continuous differentiability of the averaged functions.
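Definition 4 (Steklov Functions [36]) For a locally integrable function $F : R^n \to R$, Steklov averaged functions are the functions
$$F_\theta(x) := \int_{R^n} F(x - z)\psi_\theta(z)\, dz,$$
where
$$\psi_\theta(z) = \begin{cases} 1/\theta^n, & \max_{1 \le i \le n} |z_i| \le \theta/2, \\ 0, & \text{otherwise}. \end{cases}$$
Equivalently,
$$F_\theta(x) = \frac{1}{\theta^n} \int_{x_1 - \theta/2}^{x_1 + \theta/2} dy_1 \cdots \int_{x_n - \theta/2}^{x_n + \theta/2} dy_n\, F(y). \tag{21}$$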
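In one dimension, formula (21) is simply the mean of F over a window of width θ, which makes the smoothing effect easy to inspect. The sketch below approximates this mean by a midpoint rule for a unit step function; the function names and the number of quadrature nodes are assumptions made for illustration.

```python
def steklov_average(F, x, theta, n_nodes=2001):
    """Midpoint-rule value of (1/theta) * integral of F over [x - theta/2, x + theta/2]."""
    h = theta / n_nodes
    start = x - theta / 2 + h / 2
    return sum(F(start + j * h) for j in range(n_nodes)) * h / theta

def step_fn(y):
    """Discontinuous test function: 0 to the left of the origin, 1 otherwise."""
    return 0.0 if y < 0 else 1.0

if __name__ == "__main__":
    # The jump at 0 is replaced by a linear ramp of width theta.
    for x in (-1.0, -0.25, 0.0, 0.25, 1.0):
        print(x, round(steklov_average(step_fn, x, 1.0), 3))
```

For the step function, the exact Steklov average is the ramp min{1, max{0, x/θ + 1/2}}, which is continuous but not differentiable at x = ±θ/2, in line with the discussion of Lipschitz continuity that follows.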
The next proposition records the well-known fact that the Steklov functions are locally Lipschitz.
Proposition 1 ([33]) For a locally bounded and integrable function $F : R^n \to R$, the associated Steklov functions $F_\theta$ are locally Lipschitz, that is, for each compact set $K \subset R^n$, the function $F_\theta$ is Lipschitzian on K with the constant
$$L_K = \frac{2n}{\theta} \sup_{x \in K_\theta} |F(x)|,$$
where $K_\theta := \{x + z \mid x \in K,\ \max_{i=1,\dots,n} |z_i| \le \theta/2\}$.
Differentiability of the averaged functions, however, cannot be guaranteed in general if the smoothing kernels $\psi_\theta$ are not differentiable or the function F itself is not sufficiently smooth. Although in the case of discontinuous functions the Steklov functions are not differentiable, we can always construct differentiable averaged functions by reapplying the averaging to the Steklov functions [31]. For a locally integrable function, we introduce averaged functions of the form
$$F_{\alpha\beta}(x) := \int_{R^n} F_\alpha(x - z)\psi_\beta(z)\, dz = \int_{R^n} dy \int_{R^n} F(x - y - z)\psi_\alpha(y)\psi_\beta(z)\, dz$$
with densities $\psi_\alpha$ and $\psi_\beta$ from Definition 4. We can also represent $F_{\alpha\beta}(x)$ as $F_{\alpha\beta}(x) = E\, F(x - \alpha\xi - \beta\eta)$, where ξ and η are independent random vectors with independent components uniformly distributed on [−1/2, 1/2]. The gradient of the function $F_{\alpha\beta}(x)$ is calculated by the formula [31]:
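$$\nabla F_{\alpha\beta}(x) = \int_{-1/2}^{1/2} d\xi_1 \cdots \int_{-1/2}^{1/2} d\xi_n \int_{-1/2}^{1/2} d\eta_1 \cdots \int_{-1/2}^{1/2} d\eta_{i-1} \int_{-1/2}^{1/2} d\eta_{i+1} \cdots \int_{-1/2}^{1/2} d\eta_n \Big[ \lambda_{\alpha\beta}(x, \xi, \eta) \Big], \tag{22}$$
where
$$\lambda_{\alpha\beta}(x, \xi, \eta) = \sum_{i=1}^{n} e_i\, \beta^{-1} \Big[ F\big(z_1^{\alpha\beta}(\xi, \eta), \dots, z_{i-1}^{\alpha\beta}(\xi, \eta),\ x_i + \alpha\xi_i + \tfrac{\beta}{2},\ z_{i+1}^{\alpha\beta}(\xi, \eta), \dots, z_n^{\alpha\beta}(\xi, \eta)\big) - F\big(z_1^{\alpha\beta}(\xi, \eta), \dots, z_{i-1}^{\alpha\beta}(\xi, \eta),\ x_i + \alpha\xi_i - \tfrac{\beta}{2},\ z_{i+1}^{\alpha\beta}(\xi, \eta), \dots, z_n^{\alpha\beta}(\xi, \eta)\big) \Big],$$
$$z_i^{\alpha\beta}(\xi, \eta) := x_i - \alpha\xi_i - \beta\eta_i,$$
$e_i$ is the i-th unit coordinate vector, and in the i-th term of the sum the integration over $\eta_i$ is omitted. Thus, the random vector $\lambda_{\alpha\beta}(x, \xi, \eta)$ is an unbiased estimate of the gradient $\nabla F_{\alpha\beta}(x)$, where ξ and η are random vectors with independent components uniformly distributed on [−1/2, 1/2]. The kernels of Steklov functions have bounded supports. We can also consider smoothed functions obtained by employing a differentiable kernel with unbounded support. For example, let the kernel be the Gaussian probability density, i.e.,
$$\psi(y) = (2\pi)^{-n/2} e^{-\|y\|^2/2}.$$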
Let us consider the following family of averaged functions:
$$F_\theta(x) = \frac{1}{\theta^n} \int_{R^n} F(y)\, \psi\!\left(\frac{x - y}{\theta}\right) dy, \qquad \theta > 0.$$
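Suppose that F is globally bounded (one can even assume that $|F(x)| \le \gamma_1 + \gamma_2 \|x\|^{\gamma_3}$ with some nonnegative constants $\gamma_1, \gamma_2, \gamma_3$). Then, for a strongly lsc function F, the averaged functions $F_\theta$ epi-converge to F as θ ↓ 0, and each function $F_\theta$ is analytic with the gradient
$$\nabla F_\theta(x) = \frac{1}{\theta^{n+2}} \int_{R^n} F(x + y)\, \psi\!\left(\frac{y}{\theta}\right) y\, dy = \frac{1}{\theta} \int_{R^n} [F(x + \theta z) - F(x)]\, z\, \psi(z)\, dz$$
or
$$\nabla F_\theta(x) = E_\eta\, \frac{1}{2\theta} [F(x + \theta\eta) - F(x - \theta\eta)]\, \eta, \tag{23}$$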
where the random vector η has the standard normal distribution and $E_\eta$ denotes the mathematical expectation over η. Thus, the random vector
$$\xi_\theta(x, \eta) = \frac{1}{2\theta} [F(x + \theta\eta) - F(x - \theta\eta)]\, \eta \tag{24}$$
with the Gaussian random vector η is an unbiased statistical estimate of the gradient $\nabla F_\theta(x)$. For an expectation function $F(x) = E_\omega f(x, \omega)$ (with f(x, ω) such that $E_\omega |f(x, \omega)|$ exists and grows at infinity no faster than some polynomial in x), we can rewrite (by Fubini's theorem) the gradient $\nabla F_\theta(x)$ in the form
$$\nabla F_\theta(x) = E_{\eta\omega}\, \xi_\theta(x, \eta, \omega), \qquad \xi_\theta(x, \eta, \omega) = \frac{1}{2\theta} [f(x + \theta\eta, \omega) - f(x - \theta\eta, \omega)]\, \eta,$$
where $E_{\eta\omega}$ denotes the expected value over the combined random variable (η, ω). The finite-difference approximations $\xi_\theta(x, \eta, \omega)$ are unbiased estimates of $\nabla F_\theta(x)$.
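The finite-difference vector (24) requires only two evaluations of F per sample. The sketch below is an illustration with hypothetical function names: it computes the vector for one Gaussian draw and averages it over many draws to approximate the gradient of the smoothed function.

```python
import random

def grad_estimate(F, x, theta, eta):
    """Finite-difference vector (24) for one draw eta of the Gaussian vector."""
    plus = F([xi + theta * e for xi, e in zip(x, eta)])
    minus = F([xi - theta * e for xi, e in zip(x, eta)])
    scale = (plus - minus) / (2.0 * theta)
    return [scale * e for e in eta]

def smoothed_gradient(F, x, theta, n_samples, seed=0):
    """Monte Carlo average of (24), an estimate of the gradient of F_theta at x."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_samples):
        eta = [rng.gauss(0.0, 1.0) for _ in x]
        g = grad_estimate(F, x, theta, eta)
        acc = [a + gi / n_samples for a, gi in zip(acc, g)]
    return acc

if __name__ == "__main__":
    # A discontinuous function: a quadratic bowl plus a jump across v[0] = 0.
    F = lambda v: v[0] ** 2 + v[1] ** 2 + (1.0 if v[0] > 0 else 0.0)
    print(smoothed_gradient(F, [0.5, -0.5], 0.3, 5000))
```

For a linear function the estimate for a single draw η equals (a·η)η, whose expectation is the true gradient a; for discontinuous F the variance grows as θ decreases, which motivates starting with a large θ.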
7.2 Stochastic Methods for Minimization of Discontinuous Cost Functions
Consider a problem of constrained minimization of a generally discontinuous function subject to box or other convex constraints. Examples are the sample average problem (6) of Sect. 4.5, problems (13) and (14) of Sect. 6.1, and problems (18) and (19) of Sect. 6.2. Such problems can be solved by collective random search algorithms [37]. In this section, we develop stochastic quasigradient algorithms to solve these problems. A problem of constrained optimization can be reduced to the problem of unconstrained optimization of a coercive function F(x) by using nonsmooth or discontinuous penalty functions as described in [22]. Suppose the function F(x) is strongly lower semicontinuous. In view of Theorem 7, it is always possible to construct a sequence of smoothed averaged functions $F_{\theta_\nu}$ that epi-converges to F. By Theorem 8, the global minima of $F_{\theta_\nu}$ converge to the global minima of F as $\theta_\nu \to 0$. Convergence of local minima was studied in [32]. Let us consider some procedures for optimizing the function F using the approximating averaged functions $F_{\theta_\nu}$. Suppose one can find the global minima $x^\nu$ of the functions $F_{\theta_\nu}$, ν = 0, 1, .... Then, by Theorem 8, any limit point of the sequence $\{x^\nu\}$ is a global minimum of the function F. However, finding global minima of $F_{\theta_\nu}$ can be quite a difficult task, so consider the following method:
The Successive Stochastic Smoothing Method ($S^3$-Method) [22]
The method sequentially minimizes a sequence of smoothed functions $F_{\theta_\nu}$ with decreasing smoothing parameter $\theta_\nu \downarrow 0$. The sequence of approximations $x^\nu$ is constructed by the following steps [22]:
1. Fix a sufficiently large initial value $\theta_0$ of the smoothing parameter. Select a starting point $x^0$ and set ν = 0.
2. For the fixed smoothing parameter $\theta_\nu$, minimize the smoothed function $F_{\theta_\nu}$ by some stochastic optimization method, starting from the point $x^\nu$ and using the finite-difference stochastic gradients (24), to find the next approximation $x^{\nu+1}$.
3. If the stopping criterion is not fulfilled, set ν := ν + 1 and go to step 2.
For the minimization of the smoothed function $F_{\theta_\nu}$ under fixed $\theta_\nu$, one can apply any stochastic finite-difference optimization method based on the finite-difference representations (22) and (23) of the gradients of the smoothed functions. Asymptotic convergence of such methods to critical points of $F_{\theta_\nu}(x)$ was studied in [33, 38, 39]. If $\nabla F_{\theta_\nu}(x^\nu) \to 0$, then, by the results of [32], the constructed sequence $x^\nu$ converges to the set of points that satisfy necessary optimality conditions for F. If this method is applied sequentially to a sequence of smoothed functions $F_{\theta_\nu}$ with $\theta_\nu \downarrow 0$, it can approach the global minima of F [22]. This method requires estimating the gradients $\nabla F_{\theta_\nu}(x)$ during the iterative optimization process for a smooth averaged function $F_{\theta_\nu}$. In general, this is a rather complicated and time-consuming procedure that requires the calculation of multidimensional integrals. However, asymptotically consistent estimates can be constructed in parallel with the construction of the main minimization sequence by using the so-called averaging procedure [33, 39, 40].
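Consider the following stochastic optimization procedure for iteratively optimizing a function $F_\theta(x)$ while evaluating its gradients $\nabla F_\theta(x)$ in parallel:
$$z^0 = \xi_0(x^0), \quad x^0 \in R^n, \qquad x^{k+1} = x^k - \rho_k z^k, \qquad z^{k+1} = z^k - \lambda_k \big(z^k - \xi_k(x^k, \eta^k)\big), \qquad k = 0, 1, \dots,$$
where the vectors $\xi_k(x^k, \eta^k)$ are given by formula (24) and have conditional expectations $E\{\xi_k(x^k, \eta^k) \mid x^k\} = \nabla F_\theta(x^k)$.
Theorem 9 ([40], Theorem V.8) Let the numbers $\rho_k$ and $\lambda_k$ satisfy the conditions
$$0 \le \lambda_k \le 1, \qquad \lim_k \lambda_k = 0, \qquad \sum_{k=0}^{\infty} \lambda_k = +\infty, \qquad \sum_{k=0}^{\infty} \lambda_k^2 < +\infty, \qquad \lim_k \frac{\rho_k}{\lambda_k} = 0.$$
Then, with probability one, $(z^k - \nabla F_\theta(x^k)) \to 0$ as k → ∞.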
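The averaging procedure and the outer loop over decreasing smoothing parameters can be combined in a few lines. The sketch below is only an illustration of the scheme, not the implementation of [22]: the step sizes $\lambda_k = (k+2)^{-3/4}$ and $\rho_k = 0.5\,\theta/(k+2)$ are assumptions chosen to satisfy the conditions of Theorem 9, the fixed number of inner iterations replaces a stopping criterion, and all names are hypothetical.

```python
import random

def fd_vector(F, x, theta, rng):
    """Finite-difference vector (24) at x for a fresh Gaussian draw."""
    eta = [rng.gauss(0.0, 1.0) for _ in x]
    plus = F([xi + theta * e for xi, e in zip(x, eta)])
    minus = F([xi - theta * e for xi, e in zip(x, eta)])
    return [(plus - minus) / (2.0 * theta) * e for e in eta]

def s3_minimize(F, x0, thetas, inner_iters=500, seed=0):
    """Successive smoothing: for each theta, run the averaged quasigradient iteration."""
    rng = random.Random(seed)
    x = list(x0)
    for theta in thetas:                       # decreasing smoothing parameters
        z = fd_vector(F, x, theta, rng)        # z^0
        for k in range(inner_iters):
            lam = 1.0 / (k + 2) ** 0.75        # sum diverges, sum of squares converges
            rho = 0.5 * theta / (k + 2)        # rho_k / lambda_k -> 0
            g = fd_vector(F, x, theta, rng)    # xi_k evaluated at x^k
            x = [xi - rho * zi for xi, zi in zip(x, z)]
            z = [zi - lam * (zi - gi) for zi, gi in zip(z, g)]
    return x

if __name__ == "__main__":
    # Discontinuous one-dimensional test function with a jump at the origin.
    F = lambda v: (v[0] - 1.0) ** 2 + (0.5 if v[0] < 0 else 0.0)
    print(s3_minimize(F, [-2.0], [1.0, 0.5, 0.25, 0.1]))
```

The choice of the sequence of θ values and of the inner iteration budget strongly affects the behavior in practice; the values above are placeholders rather than recommendations.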
Fig. 3 The illustration of convergence of the S 3 -method from different starting points
Figure 3 illustrates the convergence of the successive stochastic smoothing method [22] to deep local minima of the average cost function $F_N(s, S)$ from Sect. 6.1, starting from different initial points. Note that the function $F_N(s, S)$ is discontinuous and has many shallow local minima. In Fig. 4, we further illustrate the application of the successive stochastic smoothing method [22] to the optimization of the energy accumulation system described in Sect. 4.5. The parameters of problem (6) were taken as follows: y = 0 (initial level of the stored energy), s = 1.2 (rate of charging), r = 1 (rate of discharging), a = 1 (efficiency of charging), b = 1 (efficiency of discharging), c1 = 0.5 (accumulator capacity provision costs), c2 = 3 (inflow energy cost), c3 = 4 (extra energy tariff), c0 = 0.6 (accumulator setup costs), T = 20 (time horizon), E(ξ) = 3 (mean of the normally distributed demand ξ), D(ξ) = 9 (variance of the normally distributed demand ξ), and N = 50 (number of independent samples of the dynamic demand). In this example, the objective function is nonconvex and discontinuous (due to the presence of setup costs) and has two local minima. If the setup costs are sufficiently large, the optimization process converges to the minimum with zero accumulator capacity provision. With small or zero setup costs, the method finds the other minimum, in which the accumulator is present.
P. Knopov and V. Norkin
Fig. 4 The illustration of convergence of the S 3 -method for the energy accumulator system with setup costs
8 Conclusions
The paper investigates the possibility of reducing stochastic optimal storage control problems to finite-dimensional stochastic optimization problems. To this end, conditions are first established under which the optimal control is stationary and Markov, that is, it depends only on the state of the controlled system. The main tool for obtaining these results is the Bellman optimality equation. Then, for some particular cases of the problem, a parametric form of the optimal control is established, in which the control depends on the state of the system and a finite number of unknown parameters. After this form is substituted into the original control problem, the problem turns into a finite-dimensional stochastic programming problem. A wide variety of methods is available for solving the latter, including methods for problems of large dimension. Thus, this approach makes it possible, to some extent, to overcome the curse of dimensionality of dynamic programming. However, implementing this approach raises several problems: (a) How should a parametric form of control be chosen if there is no sufficient theoretical justification for the choice? (b) The optimal control is, as a rule, discontinuous; therefore, the parametric form and the corresponding objective function of the stochastic programming problem can also be discontinuous. (c) Constraints on the control in the original problem induce constraints on the sought parameters of the derived stochastic programming problem.
In this article, we address all of these issues. (a) Taking the inventory control problem as an example, we consider a whole family of possible discontinuous parametric controls: (s, S)-strategies, barrier strategies, and piecewise-constant multidimensional strategies. (b) To minimize discontinuous functions, we propose a special method, namely, the method of sequential stochastic smoothing of the objective function, which does not use function gradients but only stochastic finite-difference analogs of stochastic gradients. (c) To take into account the constraints on the sought parameters, we use discontinuous penalty functions that do not change the objective function on the feasible set of parameters but, outside it, exceed the maximum value of the function and increase with the distance from the admissible region. Using illustrative examples of finding an optimal (s, S) inventory control strategy and of optimizing the energy accumulation system, we show the efficiency of the proposed approach to the approximate solution of stochastic optimal control problems. Further research is associated with the development of new forms of parametric control and with increasing the dimension of the parameter space in these forms.
Acknowledgments The work was supported by grant 2020.02/0121 of the National Research Foundation of Ukraine and by grant CPEA-LT-2016/10003 of the Diku Foundation, Norway.
References
1. Y. Ermoliev, R. Wets (eds.), Numerical Techniques for Stochastic Optimization (Springer, Berlin, 1988)
2. A. Ruszczyński, A. Shapiro (eds.), Stochastic Programming. Handbooks in OR & MS, vol. 10 (Elsevier, Amsterdam, 2003)
3. A. Shapiro, D. Dentcheva, A. Ruszczyński, Lectures on Stochastic Programming: Modeling and Theory (SIAM, Philadelphia, 2009)
4. W.B. Powell, Approximate Dynamic Programming. Solving the Curses of Dimensionality, 2nd edn. (Wiley, Hoboken, 2011)
5. L.G. Gubenko, E.S. Shtatland, On discrete time Markov decision processes. Teoriya veroyatnostey i matematicheskaya statistika (In Russian), in Theory of Probability and Mathematical Statistics, vol. 7 (Kyiv State University, Kyiv, 1972), pp. 51–64
6. L.G. Gubenko, E.S. Shtatland, On controlled Markov and semi-Markov models and some concrete problems in optimization of stochastic systems, in Proceedings of the Seminar Controlled Random Processes and System, Kyiv (1972), pp. 87–119
7. L.G. Gubenko, E.S. Shtatland, Controlled semi-Markov processes. Cybern. Syst. Anal. 8, 200–205 (1972). https://doi.org/10.1007/BF01068488
8. E.B. Dynkin, A.A. Yushkevich, Controlled Markov Processes (Springer, Berlin, 1979)
9. D.P. Bertsekas, S.E. Shreve, Stochastic Optimal Control: The Discrete-Time Case (Athena Scientific, Belmont, 1996)
10. E.A. Feinberg, A. Shwartz (eds.), Handbook of Markov Decision Processes: Methods and Applications (Springer, New York, 2002)
11. N.U. Prabhu, Stochastic Storage Processes, 2nd edn. (Springer, New York, 1998). https://doi.org/10.34229/2707-451X.20.1.1
12. H. Daduna, P.S. Knopov, L.P. Tur, Optimal strategies for an inventory system with cost function of general form. Cybern. Syst. Anal. 35(4), 601–618 (1999). https://doi.org/10.1007/BF02835856
13. S.S. Demchenko, P.S. Knopov, V.A. Pepelyaev, Optimal strategies for inventory control systems with a convex cost function. Cybern. Syst. Anal. 36(6), 891–897 (2000). https://doi.org/10.1023/A:1009413511883
14. T.V. Pepelyaeva, L.B. Vovk, I.Yu. Demchenko, Optimal strategies for the multi-task inventory control model. Cybern. Syst. Anal. 52(1), 107–112 (2016). https://doi.org/10.1007/s10559-016-9805-6
15. H.U. Gerber, An Introduction to Mathematical Risk Theory (Huebner Foundation, Philadelphia, 1979)
16. H. Schmidli, Stochastic Control in Insurance (Springer, London, 2008)
17. O.V. Viskov, A.N. Shiryaev, Controls leading to optimal stationary modes, in Mathematical Institute of Academy of Science USSR. Collect. for Theory of Probability and Mathematical Statistics (Nauka, Moscow, 1964), pp. 35–45
18. D. Blackwell, Discounted dynamic programming. Ann. Math. Statist. 36(1), 226–235 (1965)
19. A. Maitra, Discounted dynamic programming on compact metric spaces. Sankhyā, Ser. A 30(2), 211–216 (1968)
20. C. Derman, Finite State Markovian Decision Processes (Academic Press, New York, 1970)
21. E.A. Fainberg, On controlled finite state Markov processes with compact control sets. Theory Prob. Appl. 20(4), 856–861 (1975). https://doi.org/10.1137/1120093
22. V.I. Norkin, A stochastic smoothing method for nonsmooth global optimization. Cybernet. Comput. Technol. (1), 5–14 (2020). https://doi.org/10.34229/2707-451X.20.1.1
23. R.K. Chornei, H. Daduna, P. Knopov, Control of Spatially Structured Random Processes and Random Fields with Applications (Springer, New York, 2006)
24. S.S. Demchenko, P.S. Knopov, R.K. Chornej, Optimal strategies for a semi-Markovian inventory system. Cybern. Syst. Anal. 38, 124–136 (2002). https://doi.org/10.1023/A:1015556518666
25. P.S. Knopov, T.V. Pepelyaeva, I.Yu. Demchenko, A semi-Markov inventory control model. Cybern. Syst. Anal. 52(5), 730–736 (2016). https://doi.org/10.1007/s10559-016-9874-6
26. H. Daduna, P.S. Knopov, Optimal admission control for M/D/1/K queuing systems. Math. Meth. Oper. Res. 50(1), 91–100 (1999). https://doi.org/10.1007/s001860050037
27. P.S. Knopov, V.A. Pepelyaev, On one model of controlled semi-Markov processes (In Russian). Tavricheskij Vestnik. Inform. Mat. (2), 5–13 (2005). https://tvim.info/files/2005-2-5_1.pdf
28. Y. Li, D. Jayaweera, Probabilistic methods applied in power and smart grids, in Smart Power Systems and Renewable Energy System Integration, Studies in Systems, Decision and Control 57, ed. by D. Jayaweera (Springer, Berlin, 2016), pp. 141–148
29. A. Helwig, Applied research in energy storage, in Smart Power Systems and Renewable Energy System Integration, Studies in Systems, Decision and Control 57, ed. by D. Jayaweera (Springer, Berlin, 2016), pp. 179–200
30. C.A. Floudas, P.M. Pardalos et al. (eds.), Handbook of Test Problems in Local and Global Optimization (Springer, Dordrecht, 1999)
31. A.M. Gupal, V.I. Norkin, Algorithm for the minimization of discontinuous functions. Cybernetics 13(2), 220–223 (1977). https://doi.org/10.1007/BF01073313
32. Yu.M. Ermoliev, V.I. Norkin, R.J-B. Wets, The minimization of semicontinuous functions: mollifier subgradients. SIAM J. Contr. Optim. 33(1), 149–167 (1995). https://doi.org/10.1137/s0363012992238369
33. A.M. Gupal, Stochastic Methods for Solving Nonsmooth Extremal Problems (In Russian) (Naukova Dumka, Kyiv, 1979)
34. R.T. Rockafellar, R.J.-B. Wets, Variational Analysis (Springer, Berlin, 1998)
35. Y.M. Ermoliev, V.I. Norkin, On nonsmooth and discontinuous problems of stochastic systems optimization. Europ. J. Oper. Res. 101, 230–244 (1997). https://doi.org/10.1016/s0377-2217(96)00395-5
36. V.A. Steklov, Sur les expressions asymptotiques de certaines fonctions définies par les équations différentielles du second ordre et leurs applications au problème du développement d'une fonction arbitraire en séries procédant suivant les diverses fonctions. Communications de la Société mathématique de Kharkow, Série 2. 10, 97–199 (1907). In French
37. B.V. Kumar, P. Sivakumar, M.M.R. Singaravel, K. Vijayakumar (eds.), Intelligent Paradigms for Smart Grid and Renewable Energy Systems (Springer, Berlin, 2021)
38. B.T. Polyak, Introduction to Optimization (Optimization Software, New York, 1987)
39. V.S. Mikhalevich, A.M. Gupal, V.I. Norkin, Methods of Nonconvex Optimization (Nauka, Moscow, 1987). In Russian
40. Y.M. Ermoliev, Methods of Stochastic Programming (Nauka, Moscow, 1976). In Russian
Challenges for a Massive Integration of Flexible Resources in LV Networks Pablo Arboleya, Lucía Suárez, Rubén Medina, and Alberto Méndez
P. Arboleya: Universidad de Oviedo, Oviedo, Spain; Lemur Research Group, Gijon, Spain; LEMUR Group, Department of Electrical Engineering, Universidad de Oviedo, Oviedo, Spain
L. Suárez: Telemanagement System Infrastructure Dept., ERedes Electrical Distribution, EDP Group, Lisbon, Portugal
R. Medina: Plexigrid, Stockholm, Sweden
A. Méndez: Plexigrid, Gijón, Spain
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_4

1 Actual Context: The Path Towards a Digital Grid
Success in coping with the unquestionable challenges arising from the shift to a low-carbon economy is closely linked to innovation and technological development in the field of energy. In the process of energy transition towards a sustainable energy system, efficiency and saving measures are complemented by efforts in technological innovation, guided in turn by the need to produce electrical energy in a more sustainable and efficient way and at competitive prices, reducing external dependence and enabling the fight against climate change. The main areas in which lines of
research are structured are basically storage, generation (development of more efficient renewable energies) and development of the electric vehicle, as well as the distribution and transport of energy. In fact, in recent years, electricity distribution has undergone more changes than in the preceding decades, thanks to advances in these fields of study. The costs of several technologies, such as the electric vehicle (EV), photovoltaic (PV) generation systems, energy storage systems (ESS) and heat pumps (HP), have fallen sharply, making them more accessible to end users. These technologies have turned customers who previously only consumed electricity into prosumers, that is, customers who produce electricity and consume it flexibly. Investment in smart grids is also one of the main areas of disruptive innovation in the electricity sector. In fact, many investments in innovation within this sector are related to its digital transformation and the need to make the most of the opportunities offered by new information and communication technology (ICT) developments. The arguments set out above are based on an analysis of the current scenario and of various regulations at national level, but above all on the measures derived from the European guidelines established in the so-called Clean Energy for All Europeans package [1], a set of measures framed in various lines of action such as energy consumption in buildings, renewable energy, energy efficiency and renewal of the electricity market. The package launches a series of guidelines contained in several European directives, including Directive (EU) 2019/944 of the European Parliament and of the Council of 5 June 2019 concerning common rules for the internal market in electricity and amending Directive 2012/27/EU [2]. This Directive regulates all aspects of the generation, transmission and distribution of electricity and includes radical changes with respect to the previous Directive.
These changes relate above all to the change in the energy model: the Directive favours a model that promotes distributed generation over a purely centralized one, and it also aims to empower consumers, above all active consumers. Active consumers are understood to be not only those who generate electricity (prosumers) but also those who can participate in flexibility markets, individually or in aggregate, through demand response programmes. Another radical change introduced by this regulation is the definition of the new role of energy communities and the protection and promotion of this type of entity. Finally, with regard to the management of the distribution network, the new Directive continues to impose a clear model for the separation of activities. It promotes the modernization of the networks and their increased visibility to operators as the main line of action to increase the penetration of distributed generation (DG) and renewable energies (RE), the deployment of EV charging systems and the implementation of ESSs. In addition, it will impose a reduction in the costs of distribution facilities that will lead to lower prices for end customers. Achieving these goals implies a radical change in which a multitude of technologies are incorporated. A bottleneck in this process will undoubtedly be the remote and optimized management of both the distribution networks and all the devices connected to them. The digitalization of electricity network management is one of the most critical
processes to be undertaken in the short term, and the vast majority of distribution companies are not yet prepared for it. The digitalization process, which affects all productive activity, is essential in the electricity sector to achieve a sustainable energy system. Digitalization makes it possible to manage renewable generation better, introduce energy efficiency measures, incorporate innovative technologies and, above all, manage consumption, distributed resources and flexible loads, whose presence is growing year by year. Digitalization is essential for extracting the full potential of intelligent networks (known as "smart grids"), which allow active management of the electricity distribution system in real time: knowing the changes in consumption and generation, predicting them and anticipating them by efficiently coordinating all the resources available in the network. For years the high-voltage energy transmission network has had intelligence associated with its devices. However, this is not the case at lower voltage levels, in what are known as terminal electricity distribution networks. The current trend is to transform this type of distribution network into intelligent energy networks that can manage increasingly complex and numerous systems and elements, for example, the more atomized generation from micro-generation (PV, wind, or other non-renewable systems). Intelligent networks promote the automation, integration and coordination of all the agents connected to them, developing real-time control systems, security and reliability of the electricity system, prediction and coverage systems, demand management, cybersecurity and unique installations.
The introduction of smart meters allows the development of user management applications and orients the activity of this sector towards retail services, promoting the direct participation of customers in the electricity system and the interaction among customers and with the electricity companies. Furthermore, this greater digitalization of the electricity system allows progress to be made in the development of self-consumption and self-generation, aspects which are covered by the latest European Commission policies. The change from totally conventional and predictable distributed consumption to the adoption of new technologies that enable end users to become electricity producers represents a change of paradigm that will undoubtedly bring many advantages related to increased efficiency and the democratization of energy. However, in order to address this change with guarantees, a series of challenges must be resolved in relation to:
• Real-time management of all the new resources that will be massively present in the distribution networks in the short term (charging of electric vehicles, heat pumps, other electric air conditioning devices, devices placed behind the meters with IoT-based domotic control, etc.)
• The massive and efficient integration of renewable energies (photovoltaic, micro-wind, micro-cogeneration) and other types of distributed generation, as well as energy storage systems, in the electricity distribution network
• The efficient management of energy flows in the distribution network so that the necessary investments in network infrastructure are contained, optimizing the use of existing resources
In order to face the challenges mentioned above, a radical change in the way distribution systems are managed is essential. A digitalization strategy must be implemented to allow for semi-automatic or automatic operation of both the distribution systems themselves and the devices embedded in them. This is the only path to a massive and coordinated penetration of these devices in the distribution systems that reduces the investment in infrastructure and maximizes the efficiency and sustainability of the whole system. To this end, and bearing in mind that this change implies the creation of complex cyber-physical networks, it is necessary to implement systems for managing large volumes of data based on big data techniques, as well as systems for advanced representation and monitoring of electrical data that allow accessibility and compatibility with all types of devices. It is also critical to implement intelligent analysis systems for the data obtained, using all kinds of available mathematical tools, from conventional load flow algorithms, state estimation and contingency analysis to advanced algorithms based on statistical methods, machine learning or artificial intelligence. Such tools make it possible to extract the maximum amount of knowledge from the data in order to operate and plan the networks and the devices connected to them efficiently and sustainably. From the perspective of the DSOs who manage the current and future digital smart grids, the challenges mentioned above are being analysed in detail, and innovative solutions are being sought that will most likely lead to a paradigm shift in the management of the low-voltage network. It is common to compare LV networks that are being digitized with higher-voltage networks that were already digitized some years ago.
But there are also other trends that involve innovative approaches, stating the problem from scratch and seeking decentralized management solutions to deal with distributed energy resources. In this sense, the current trend is to consider the enormous low-voltage distribution networks as a great multitude of small simple entities, each made up of a transformer, its associated low-voltage network and the digital devices which supervise it at the different points of consumption. In this way, a problem of enormous dimensions is reduced to many much simpler problems which are managed individually. Each simple entity must be coordinated with the other simple entities that make up the distribution network in its closest environment, and it must be integrated with the distributed resources connected to its low-voltage network. With this new vision of the management of the low-voltage network, the control that has been centralized until now is greatly simplified, moving towards a decentralized network operation controlled by a multitude of processing units that work in coordination with each other, each of them managing a simple entity. The deployment of smart meters has given a decisive impulse: it has finally allowed the installation of measuring and supervising devices at the different connection points of the network, which will help the active management of the distribution system. Therefore, for the distributors, the challenge added to those mentioned above consists of taking the necessary steps to evolve the functionalities offered by these devices today and to continue innovating in order to achieve real-time or quasi-real-time operation for the massive and efficient integration of the distributed resources.
2 Regulatory Framework
This chapter deals with the challenges that the electricity system has to face in order to be able to deploy distributed resources in a massive, technically safe and economically and environmentally profitable manner. To this end, one of the main requirements is that distributed generation (DG) systems, ESSs at the prosumer level and other devices installed at the end-user level, such as HPs or EVs, can actively interact with the rest of the system's agents following market mechanisms, providing benefits of course to the end users of these resources but also to other agents of the system. For this reason, this section explains in a simple way which agents are currently involved in the electricity system and what their roles are, reviewing the mechanisms of energy purchase and sale and the existing balancing services. It then explains how new resources and the new actors emerging from recent regulatory changes can be integrated into this complex system, as well as their possible business models.
2.1 Traditional Roles in the Electrical Sector: Production and Operation Markets
The liberalization of the European electricity sector, promoted by Directive 96/92/EC of the European Parliament [3], has meant that the large electricity companies, which traditionally combined a series of roles such as generator, transporter, distributor and retailer, have split into a conglomerate of different companies, each with a very specific role. In most cases, this separation of roles has been determined by an incompatibility of activities established by the aforementioned directive or by its particular transposition into law in the different member states of the European Union. Thus, for example, the generation and the transmission or distribution of electricity are incompatible activities, and therefore a company that owns energy-generating units cannot own or operate electricity transmission or distribution networks. Figure 1 shows a highly simplified scheme of the different agents that exist in the electrical system. Traditionally, and assuming that there is no distributed generation in the distribution network, the energy is injected into the transmission grid by the generating companies (GENCOs). This energy flows through the transmission grid (owned by a so-called TRANSCO) at high voltage to reduce losses, and the voltage is reduced as the final consumption is approached. This voltage reduction is done through step-down substations that connect the high-voltage transmission grid to the medium-voltage distribution grid owned by a distribution company (DISCO). Some end users obtain energy from the medium-voltage distribution grid (or even from the transmission grid), but the vast majority of domestic users obtain energy from the
Fig. 1 Simplified scheme of the electrical sector actor interactions
low-voltage distribution grid that is connected to the medium-voltage grid through small distribution transformer stations. Regarding the interactions between agents and the role of each one, the diagram shows how the retailer establishes energy contracts with the consumer. The establishment of this type of contract is totally liberalized, allowing the provision of different services ranging from conventional tariffs with an energy term and a power term to the so-called flat rates, in which the consumer basically pays for the contracted power (power term) and can consume "unlimited" energy with the only restriction of not exceeding a certain power. Another, more sophisticated type of tariff includes the so-called implicit demand response programmes, in which the retailer sends the consumer a price signal either the day before or in real time so that the consumer adjusts consumption to those price signals. To acquire energy, the retailer buys it from the electricity pool by submitting purchase bids. The electricity pool is operated by the market operator, which is responsible for matching the purchase bids submitted by the retailers with the sale offers submitted by the generators. Generation and retailing are not compatible because the bidding strategies of these two agents are opposed: while the retailers pursue the purchase of energy at the lowest possible price, the generators aim to obtain the highest price. Buying and selling energy is also a liberalized activity, as the agents who make bids for the purchase and sale of energy at the pool are free to select the bid prices and energy packages, always within limits pre-established by the regulator.
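The matching performed by the market operator can be illustrated with a deliberately simplified sketch. It assumes simple price-quantity orders for a single hour and a uniform clearing price taken from the last accepted sale offer; real pool algorithms also handle complex orders, price limits and network constraints, none of which are modelled here. The `Order` class, the function name and all figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Order:
    agent: str
    price: float   # limit price in EUR/MWh
    energy: float  # energy in MWh

def clear_pool(buy_bids, sell_offers):
    """Match the most generous buy bids with the cheapest sale offers.

    Returns (cleared energy in MWh, clearing price in EUR/MWh). The clearing
    price is the price of the last accepted sale offer (an assumption of this
    sketch); it is None if nothing can be matched.
    """
    buys = sorted(buy_bids, key=lambda o: -o.price)    # highest price first
    sells = sorted(sell_offers, key=lambda o: o.price)  # lowest price first
    i = j = 0
    buy_left = buys[0].energy if buys else 0.0
    sell_left = sells[0].energy if sells else 0.0
    cleared, price = 0.0, None
    while i < len(buys) and j < len(sells) and buys[i].price >= sells[j].price:
        traded = min(buy_left, sell_left)
        cleared += traded
        price = sells[j].price
        buy_left -= traded
        sell_left -= traded
        if buy_left == 0.0:                 # current buy bid fully served
            i += 1
            buy_left = buys[i].energy if i < len(buys) else 0.0
        if sell_left == 0.0:                # current sale offer exhausted
            j += 1
            sell_left = sells[j].energy if j < len(sells) else 0.0
    return cleared, price

retailer_bids = [Order("R1", 60.0, 50.0), Order("R2", 45.0, 30.0), Order("R3", 30.0, 40.0)]
generator_offers = [Order("G1", 20.0, 40.0), Order("G2", 40.0, 50.0), Order("G3", 55.0, 60.0)]
cleared_mwh, clearing_price = clear_pool(retailer_bids, generator_offers)
print(cleared_mwh, clearing_price)  # 80.0 40.0
```

In this example retailer R3 bids too low and generator G3 asks too much, so both are left out of the matching, which is the opposition of strategies described above.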
Once the retailer has acquired the energy packages, it needs to transport that energy to the final consumer following the energy flow described above, for which it must pay an access fee to the transporters and also to the distributors for the use of their networks. In the majority of European countries, transport and distribution operate as geographical monopolies, i.e. once a company has its network deployed within a certain geographical area, another company cannot deploy a second redundant network. In return, in these cases, the regulator classifies
both transmission and distribution as regulated activities (as opposed to all other liberalized activities) and thus sets the prices that third-party agents have to pay to distribution and transmission network owners for the use of their networks. In a very simplified way, we could say that the generators receive income from the sale of electricity in the pool, while the transporters and distributors receive income from the access tariff and are also remunerated for the investments they make to keep their networks in optimal condition. The retailer, on the other hand, receives its revenues from the supply contracts established with consumers, and its main expenses are the access tariffs paid to distributors and transporters and the payment for the energy obtained through the pool. There is also a modality in which retailers and generators establish bilateral contracts, bypassing the pool-matching mechanism. The procedure described above applies to several types of markets, but especially to the so-called day-ahead market, where generators and retailers negotiate the energy packages to be sold or bought in each of the 24 slots (each representing one hour) of the following day. It must be pointed out that once the daily market is closed and all purchase and sale bids are matched, the market operator (MO) checks with the transmission system operator (TSO) whether or not the transmission grid is capable of handling the power flows resulting from the energy transactions. If technical constraints related to the overload of transmission system elements and to voltage levels are not met, the TSO activates a set of correction mechanisms that modify the matching performed by the MO. In the previous paragraphs and in the diagram of Fig. 1, for the sake of simplicity, it has been assumed that both retailers and generators operate individually in the market. However, in doing so, they are responsible for any power imbalances.
In other words, generators are responsible for generating the energy they have sold, and retailers for delivering the energy they have bought. Designing the best strategy to operate in the market, as well as taking responsibility for the energy balance in each of the so-called imbalance settlement periods (ISPs), is a very complex and specialized task. A bad strategy when designing purchase and sale offers can leave an agent out of the matching. On the other hand, once a certain bid is closed, maintaining the committed balance during each of the 24 h is also complicated. For this reason, both generators and retailers usually close agreements with specialized trading companies, which design optimized operation strategies for their customers (in this case generation units, retailers or other consumption units). We will refer to this type of company as balance responsible parties (BRPs). Although generation and retailing are not compatible activities, a generator and a retailer are allowed to contract the services of the same BRP, and this is normally the case. A BRP usually has in its portfolio both retailers and generators with different technologies, since the joint management of a wide and varied portfolio of clients allows some imbalances to be compensated by others, thus reducing the risk of imbalances and the cost directly generated by them (penalties). BRPs and their role are represented in Fig. 2, which gives a more realistic picture of how the different agents interact in the production market.
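The benefit of managing a varied portfolio can be made concrete with a small numerical sketch. The figures, member names and sign convention (positive means a surplus with respect to the committed schedule) are hypothetical and chosen only for illustration; the function compares the imbalance volume that would be settled if each member were balanced individually with the volume left after netting inside one BRP portfolio.

```python
def imbalance_volumes(deviations_by_member):
    """Return (individual, netted) imbalance volumes per settlement period.

    deviations_by_member maps a member name to a list of deviations in MWh
    (actual minus scheduled, surplus positive), one value per ISP.
    """
    series = list(deviations_by_member.values())
    periods = range(len(series[0]))
    # Each member settled on its own: absolute deviations add up.
    individual = [sum(abs(dev[t]) for dev in series) for t in periods]
    # One BRP for the whole portfolio: deviations are netted first.
    netted = [abs(sum(dev[t] for dev in series)) for t in periods]
    return individual, netted

portfolio = {
    "wind_farm": [3.0, -2.0, 1.0],   # hypothetical generator deviations
    "retailer": [-2.0, 1.0, -1.5],   # hypothetical retailer deviations
}
individual, netted = imbalance_volumes(portfolio)
print(individual)  # [5.0, 3.0, 2.5]
print(netted)      # [1.0, 1.0, 0.5]
```

In every period the netted volume is smaller than the sum of the individual ones, which is why a BRP with generators and retailers in the same portfolio is exposed to lower penalties.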
P. Arboleya et al.
Fig. 2 Production market representation considering BRPs
The above-mentioned day-ahead market is a so-called production market. The mission of the production markets is to close the energy transactions between actors that will carry out the energy exchange within a time horizon that is at least a few hours away from the closing of the market. The production markets are usually operated by the market operator (MO), and in addition to the day-ahead market, there are a number of intraday markets in all European countries. There are two types of intraday markets, conventional (intraday auction markets) and a continuous intraday market (also called single intraday coupling). Regarding conventional intraday markets, they usually last 1 hour, and their application window is separated by a few hours from the closing. For example, in the case of the Iberian electricity market affecting Spain and Portugal, the first intraday market extends from 14:00 to 15:00 h, and it handles energy transactions from midnight on the same day (day D) to midnight the following day (day D+1). The rest of the intraday markets are staggered in time, opening at 5 p.m., 9 p.m., 1 a.m., 4 a.m., and 9 a.m. They last approximately 1 h, and they handle energy transactions that can be made from approximately 3 h after closing time until midnight on day D+1. Figure 3 shows the structure and application periods of the different markets. In the case of the continuous intraday market, the difference with the conventional market is that the rounds are continuous and agents from all European areas can participate as long as maximum power is not reached at the international interconnection points. Furthermore, energy transactions can be agreed only 1 h in advance.
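As a small illustration of the staggered auction sessions, the following sketch looks up the next conventional intraday session that opens after a given hour. The opening hours are the Iberian ones quoted above; the function name and the simplification to whole hours are assumptions.

```python
# Opening hours (24 h clock) of the Iberian intraday auction sessions quoted in the text.
INTRADAY_OPENINGS = [1, 4, 9, 14, 17, 21]


def next_intraday_opening(hour):
    """Return (opening_hour, days_ahead) for the first session opening at or after `hour`."""
    for opening in INTRADAY_OPENINGS:
        if opening >= hour:
            return opening, 0
    # No session is left today, so the first session of the next day applies.
    return INTRADAY_OPENINGS[0], 1


print(next_intraday_opening(15))  # session opening at 17:00 on the same day
print(next_intraday_opening(23))  # session opening at 01:00 on the following day
```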
Fig. 3 Scheme containing the different kinds of electricity markets
Challenges for a Massive Integration of Flexible Resources in LV Networks
It is very important to note that in the day-ahead market, the generators or their respective BRPs play the role of energy sellers, while the retailers or their respective market representatives (BRPs) play the role of buyers. This does not have to be the case in the different intraday markets. Suppose, for example, that a generator has sold in the day-ahead market on day D 10 MWh for the hourly period from 14:00 to 15:00 on day D+1. However, at 23:00 h on day D, the generator detects a fault in one of its groups and determines that for the period under study it will only be able to produce 8 MWh. The generator could wait for the opening of the next intraday market (intraday number 5 opening at 1 a.m. of day D+1) and buy the remaining 2 MWh to maintain the balance. It could be the case that these 2 MWh are sold by a retailer which has bought them on the day-ahead market but has determined that it will not be able to deliver them. In the previous paragraphs, we have explained how the purchase and sale of energy is carried out in the day-ahead market and how corrections are established in the successive intraday markets. As has been seen, the different production markets include correction mechanisms for imbalances that were unforeseen when a market closed but become foreseeable in the following hours. However, the sum of the electrical power demanded plus the system losses has to be equal to the generation at all times. Keeping the energy balance in the imbalance settlement periods is not enough for the stability of the system. When an imbalance occurs, the frequency of the system is affected in such a way that if generation exceeds demand plus losses, the frequency will increase. The frequency will decrease if demand plus losses exceeds generation.
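The generator example and the frequency rule above can be written as a short sketch. The function names and return labels are hypothetical; the arithmetic is the 10 MWh sold versus 8 MWh expected from the text.

```python
def intraday_order(sold_mwh, expected_mwh):
    """Order an agent would place in the next intraday session to restore its balance."""
    gap = sold_mwh - expected_mwh
    if gap > 0:
        return ("buy", gap)     # cannot deliver everything that was sold
    if gap < 0:
        return ("sell", -gap)   # can deliver more than was sold
    return ("none", 0.0)


def frequency_trend(generation_mw, demand_mw, losses_mw):
    """Direction in which the system frequency moves for a given power balance."""
    surplus = generation_mw - (demand_mw + losses_mw)
    if surplus > 0:
        return "increases"
    if surplus < 0:
        return "decreases"
    return "stable"


print(intraday_order(10.0, 8.0))            # ('buy', 2.0)
print(frequency_trend(100.0, 97.0, 5.0))    # decreases
```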
The transmission system operator (TSO) has a set of market mechanisms which allow it to maintain this balance at all times by monitoring the frequency and through the so-called balance service providers (BSPs). These market mechanisms are grouped into what are called operation markets. The difference between production and operation markets is that operation markets are managed by the system operator and not by the market operator. In addition, production markets are energy markets because what is traded are energy packages, while operation markets are hybrid markets. They are hybrid because they are energy markets but also capacity markets, where in some cases certain actors are remunerated for the capacity to produce or vary their generation/consumption in addition to the energy produced. In general, we could say that BRPs participate in the production markets, while BSPs participate in the operation markets. In the general case, BRPs and BSPs may be different agents (e.g. the case of Belgium); in that case, the transactions of balance services made by a certain BSP would affect one or more BRPs. It is very common for the role of BSP to be assumed by a BRP-type company, as these roles are not separated in many European countries. The balancing services that a BSP, or in its absence a BRP, can provide to the TSO are varied and will be described below. On the one hand, there is the primary regulation, also called frequency containment reserves (FCR), which is a decentralized frequency control service that is mandatory for all generators capable of providing it. Traditionally, generators of this type used to be large generators with
a very high inertia, but nowadays any load or generator that can act quickly when faced with a change in frequency could provide the primary regulation. The power variation that an element makes when primary regulation is activated does not need to be maintained beyond a few seconds. In the case of prosumers, if they could act very quickly in the event of frequency variations, they could be providers of primary regulation. The remuneration for primary regulation capacity varies from one country to another: in countries such as Spain it is a compulsory but non-remunerated service, while in countries such as Belgium, the Netherlands or Germany, primary regulation capacity is auctioned weekly. In addition to primary regulation, secondary and tertiary regulations are also balance services. Secondary regulation is a power-frequency control service that is handled centrally and with market mechanisms, as opposed to primary regulation, which acts automatically in a decentralized manner. In other words, generators that modify their power in the case of primary regulation do so because they are programmed to operate without receiving any external order or having a central control that coordinates them. This is due to the need for very rapid action in the case of primary regulation. In the case of secondary regulation, however, the power variation required to adjust the system frequency is applied a few seconds after the frequency fluctuation has occurred. The power variation provided by secondary regulation then replaces the primary regulation and is dispatched automatically but in a centralized way, which is why it is also called automatic frequency restoration reserves (aFRR). The upward or downward power provided by secondary regulation must be maintained during an imbalance settlement period (ISP), traditionally 15 min.
It has traditionally been a compulsory but remunerated service for generation units that could provide it, but the service could also be provided by prosumers or demand units. In this service, the availability to increase or decrease power is remunerated, as well as the extra energy that is injected or no longer injected into the system if the secondary regulation is activated. In countries like Spain, this service is auctioned on day D for day D+1, as can be observed in Fig. 3. The cost of this service is transferred to the demand and generation units that generated the imbalance or to the BRPs that represent such agents. Tertiary regulation is similar to secondary regulation, but it acts after an imbalance settlement period has passed to release secondary regulation and can extend over several ISPs, 2 h in the case of countries such as Spain. Tertiary regulation services are also called manual frequency restoration reserves (mFRR). Normally, those units that subscribe to this service are obliged to offer an upregulation or downregulation band. In Spain, for example, this market is resolved between 11 p.m. and midnight on day D−1 for the whole of day D. As in the case of secondary regulation, both availability and energy are paid for, so it is again a hybrid product: a capacity product combined with an energy product. In the case of countries such as Spain, the system operator or TSO also manages a service called replacement reserves. In theory, this mechanism is used to deal with deviations that are foreseen several hours in advance but do not fall within any intraday market. At present, it is expected that with the entry of the continuous intraday market, this deviation management service will disappear.
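The hybrid nature of secondary and tertiary regulation, where both availability and energy are paid, can be made concrete with a minimal sketch. The prices and the band size are invented for the example, and real settlement rules differ from country to country.

```python
def activated_energy_mwh(power_mw, minutes):
    """Energy delivered when a power variation is held for a given number of minutes."""
    return power_mw * minutes / 60.0


def hybrid_revenue(band_mw, capacity_price, energy_mwh, energy_price):
    """Capacity payment for the offered band plus energy payment for what was activated."""
    return band_mw * capacity_price + energy_mwh * energy_price


# Hypothetical figures: a 5 MW band paid at 10 EUR/MW, fully activated for one
# 15 min ISP, with the activated energy paid at 60 EUR/MWh.
energy = activated_energy_mwh(5.0, 15)
revenue = hybrid_revenue(5.0, 10.0, energy, 60.0)
print(energy, revenue)  # 1.25 125.0
```

If the band is never activated, only the capacity term remains, which is what distinguishes these products from the pure energy products traded in the production markets.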
2.2 Changes Introduced by European Legislation
The new European Directive (EU) 2019/944 [2] establishes common rules for the internal market in electricity and amends the previous regulation based on Directive 2012/27/EU [4]. This section does not go into details of implementation (which in many cases are not set out in the directive itself and are left open to the different national governments of the member states) and focuses on the general guidelines. It aims to be a very concise summary structured around four main axes: (1) the change of energy model, (2) the strengthening of the role of active consumers as the main source of flexibility, (3) the implementation of energy communities, and (4) the change in the development, planning and management of distribution networks.
The Need of a New Energy Model
With regard to the new energy model, the regulation recognizes the need for change and proposes to encourage it by providing long-term regulatory stability that promotes investment in this new model. On the one hand, the growing process of electrification will undoubtedly lead to an increase in the demand for electrical energy, and on the other hand, the objectives in terms of reducing global energy consumption and decarbonizing the European Union are very clear. Therefore, if we want to reduce the global energy consumed in a context of growing demand for electrical energy, we have no choice but to introduce radical changes in the system that will allow us to exploit the installed resources to the limit, increasing efficiency and reducing costs. Both distributed generation and climate control systems based on heat pumps, electric vehicles with conventional or bidirectional charging systems for the application of V2G, electric energy storage systems, etc. will increase the overall load on the electricity system in terms of energy volume but will also provide flexibility to the system as this type of generation/load is flexible, i.e. it can be regulated relatively easily without provoking a loss of comfort for the end users. The management of the flexibility of new loads/generators installed in the distribution network is identified as a critical element in achieving these objectives.
Empowerment of Active Consumers: Managing Their Flexibility in an Aggregated Way
The European Union recognizes the need to put active consumers who can participate in demand response programmes at the centre of the solution. The EU defines the term demand response as “the change of electricity load by final customers from their normal or current consumption patterns in response to market signals, including in response to time-variable electricity prices or incentive payments, or in response to the acceptance of the final customer’s bid to sell demand reduction or increase at a price in an organized market whether alone or through aggregation”.
In this definition, the European Union deals with two concepts of demand response: on the one hand, what is known as implicit demand response, in which customers adapt their consumption manually or automatically to price or incentive signals that vary over time, and, on the other hand, explicit demand response, which allows a third party, generally an aggregation entity, to execute actions on the flexible devices of a final consumer. It will be imposed by regulation that all customer groups, whether commercial, industrial or domestic, must have access to electricity markets in order to commercialize their flexibility and, in this respect, the importance of the role of the aggregator in managing the flexibility of these customers is recognized. In addition, customers will be able to contract the management of their flexibility with an aggregator which can be independent of their distributor, but also of the retailer with whom the customer has established the supply contract. The regulation also guarantees the integration of electric vehicle recharging systems but goes even further. It recognizes existing legal and commercial barriers in the form of excessive taxes and administrative burdens and plans to remove them. It will therefore encourage the use of self-generated energy, its use in storage systems and its injection into the grid. Explicit mention is made of the need to promote all types of electric mobility solutions, intelligent recharging systems and V1G and V2G technologies. Moreover, the directive recognizes the lack of real-time or quasi-real-time information at consumer level, notes that this information is absolutely necessary to carry out this type of implementation and identifies it as an unresolved issue in the current system. It will therefore be necessary to promote the deployment of intelligent metering systems that allow consumers to participate in all demand response programmes (implicit or explicit).
A New Actor in the Stage: Energy Communities
With regard to the implementation of the new role of “energy community” or “energy citizen community”, the EU defines this new actor of the system as a legal entity that:
• Is based on voluntary and open participation and whose effective control is exercised by partners or members who are natural persons, local authorities including municipalities or small enterprises
• Whose main objective is to provide environmental, economic or social benefits to its members or partners or to the locality in which it operates, rather than generating a financial return
• Participates in the generation, including from renewable sources, distribution, supply, consumption, aggregation, storage of energy, provision of energy efficiency services or provision of electric vehicle charging services or other energy services to its members or partners
Energy communities constitute another mechanism through which their members or partners can participate in the electricity markets since in this case they would be
represented by the community itself. The implementation of this type of community not only brings an increase in energy efficiency and an environmental benefit but also allows certain consumers to participate in the benefits of the different electricity markets. Furthermore, energy communities are seen as a mechanism to fight energy poverty, which means that in addition to the technical and economic benefits of demand-side management described above, there are also social benefits in this new actor. It should also be taken into account that regulations allow these types of energy communities to set up their own distribution networks and also to become electricity distributors with full rights and duties. Once again and as mentioned in the previous point, having overcome the regulatory barriers, a change of mentality must be promoted in society, and in addition the advanced metering infrastructure installed within the community itself must allow its operators to manage the community’s assets in real time or quasi-real time.
Adapting the Distribution Grid to This Challenging Scenario
With regard to the management of the distribution network, the new directive continues to impose a clear model for the segregation of activities and to promote the modernization and the increase of observability of the networks by the operators as the main line of action to promote the penetration of distributed resources and manage their flexibility. It should be borne in mind that the figure of the distributor is mainly remunerated from the income received from the payment of the access tariff by the agents using its infrastructure. Moreover, the states reward the distributors with a percentage of the investments they make to maintain their networks in optimal conditions. The new regulation imposes the reduction of these incentives. This measure will lead to a reduction in the income that distribution companies receive for maintaining and upgrading their networks in a context in which the energy managed within them will undoubtedly increase. With this regulatory framework, the only way to profitably and massively integrate the different distributed resources without creating continuous congestion problems in the distribution network is to manage them in an optimal and coordinated manner. In this sense, the different states will implement network codes applicable to the distribution networks so that they have tools that allow them to carry out congestion management in real time. The European Union’s position on storage systems is curious, as it explicitly states that distribution network operators must not own, develop, manage or operate energy storage facilities, as storage services must be based on market mechanisms and therefore cannot be implemented by agents whose activity is regulated, as is the case with distribution. A distributor can only own a storage system when it is fully integrated in its network and not used for balancing or congestion management.
With the arguments set out above, there is no doubt that adapting the distribution networks and their measurement and communication infrastructure is one of the
critical points for achieving significant levels of penetration of the distributed resources resulting in an overall benefit for the system by managing the flexibility that the distributed resources provide.
3 Establishing the Business Models for Aggregate Flexibility Managers
The previous sections described the products, services and markets available today as well as those that are under development but which are guaranteed to be implemented through a stable, long-term regulatory framework. In this section, we will develop how the flexibility provided by elements, such as electric vehicles, heat pumps, storage systems, etc., can give added value to different agents in the electricity system and how to monetize this added value through different products or business models. It is necessary to emphasize that in a context in which the electricity sector is regulated by market mechanisms, in order to achieve a massive penetration of this type of resource and thus favour an overall increase in efficiency and the process of decarbonization, it is not enough for a solution to be viable from a technical point of view, but it must also be viable from an economic point of view. In this section, we will review what services a flexibility aggregator can provide to the different agents involved in the system. Some of these services are compatible with each other, and therefore a flexibility aggregator could offer two or more compatible services simultaneously to more than one agent, thus maximizing the benefits derived from managing this flexibility. In other cases, the services or products offered will be incompatible, and the aggregator will have to choose, depending on its portfolio of aggregated devices, which is its best business model and focus on specific products or services. It must be pointed out that the role of aggregator can be embedded within the retailer in some cases, but in other cases, the aggregator and the retailer or supplier will be different companies. In the scheme represented in Fig. 4, we can observe how the supplier is in charge
Fig. 4 Scheme representing aggregator and supplier as different companies. In some cases, both roles could be within the same company
Fig. 5 Scheme representing the aggregator interaction with the rest of the agents in the systems as well as the products it can provide them
of providing energy to the prosumer, but the aggregator is the one managing its flexibility. In some cases, duplicate or redundant metering is needed for these kinds of applications. In the following paragraphs, the products that the aggregator can offer to the different agents of the system will be classified according to which agent within the electricity system can be a customer of them, and the added value of each specific product will be explained for each specific customer. A summary of the different products as well as a representation of the interaction of the aggregator with the different agents can be observed in Fig. 5.
3.1 Specific Products for Distributors
The extent to which energy distribution companies can control what is happening in their networks is acceptable at a high-voltage level but really low or nonexistent in many cases in low-voltage networks. In some circumstances, the power transformer stations that connect the low-voltage network with the medium-voltage network have tap changers that allow them to regulate the output voltage of the transformer to a certain degree depending on the load on the low-voltage lines they serve. However, the number of operations that this type of device can perform is very limited, and once a tap is changed, it must remain in that position for a long period of time, so this mechanism is not acceptable in a context where the dynamics of load variation causes rapid fluctuations in the voltage level. Something similar could be said about
the use of capacitor banks that inject reactive power into the system, raising its voltage. Their use is not very widespread, and their control mechanisms do not allow a fast and continuous regulation of the injected reactive power. Another issue to take into account is the possible overloading of lines and power transformers. It must be taken into account that the European low-voltage distribution network in urban environments has a very complex topology which allows it to be reconfigured, thus distributing the load from some power transformer stations to others, i.e. a certain line can be fed (not simultaneously) from two or more power transformer stations, thus allowing some congestion to be resolved. However, once again, in the majority of cases, this reconfiguration is manual, and once it is done, it tends to remain static for months or years, so this technique does not respond to the dynamic control requirements necessary to manage distributed resources with high variability either. In this respect, it must be said that distributed resources not only provide flexibility in the sense that they can vary their power or shift their consumption over time, but a common feature is that they are connected to the grid through converters whose catalogue of functions is very extensive. Electric vehicles with simple or bidirectional charging, heat pumps, accumulation systems, photovoltaic systems and other resources are all connected to the network through converters that allow a very fast variation of the working conditions. Regulations such as the one proposed by the IEEE in its 1547 Standard on the requirements for converters for interconnecting distributed resources to the network [5] or the advances made by working group WG17 of IEC’s technical committee TC57 to adapt the IEC 61850 standard on systems for the automation of power utilities already include this type of functionality for smart converters [6].
A detailed description of the above-mentioned functionalities goes beyond the scope of this document and can be found in the literature in documents such as the report by the Electric Power Research Institute (EPRI) on “Common Functions for Smart Inverters” [7]. In summary, we could say that the functionalities described are the enabling technologies for the implementation of the whole set of products and services that the flexible resources can offer to distributors and also to the rest of the system’s agents. In this way, the products/services that a flexibility aggregator can offer to a distributor are the following:
• Congestion management: In the event that the aggregated resources are concentrated in a specific geographical distribution area, the aggregator could coordinate the resources in such a way as to guarantee a point of operation for the distributor below the overload limit. An example would be the coordinated charging of electric vehicles, which would avoid peak loads and thus reduce the need for investments in network maintenance and upgrading by the distributor. Normally the peak demand in a low-voltage network, depending on the consumer profile, can occur either at midday or at night. The distributor must have the necessary infrastructure to deal with these peaks of load, which normally last for a very short time, and this means operating for most of the time with an oversized infrastructure. The aggregation service would make it possible to
resolve infrastructure congestion by coordinating flexibility and thus contribute to increasing the quality of supply while also allowing distribution companies to save on infrastructure investment.
• Voltage control: As mentioned above, the voltage regulation mechanisms currently available to distributors are very few and have slow dynamics at best. The distributed devices, however, have an underutilized asset that makes it possible to regulate the voltage, namely the converter present in solar generation systems, accumulation systems, some electric vehicle recharging systems, etc. Grid connection converters use their rated current for very short periods of time; in the case of solar generators, for example, at most they will use that maximum current to inject active power at the hour of maximum radiation on a sunny day, and the rest of the time they will be operating below their nominal capacity. This spare capacity can be used to inject reactive power “free of charge” for the owner of the asset so that local voltage control is exercised in the distribution network. It should be noted that the example of the solar converter has been given because in the case of a peak demand at night, which causes large voltage drops, the full capacity of the solar converters could be used to inject reactive power and raise the grid voltage. In addition, in the case of distribution networks, given the high R/X ratio (resistance/reactance), there is also a high correlation between the consumption and injection of active power and the voltage level, so a Volt–Watt type control could also be implemented in the converters. This control would prevent an excessive increase or decrease in voltage due to over-injection or overconsumption by distributed resources, and this type of control could and should be installed in vehicle recharging systems, V2G systems, heat pumps, etc.
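The two voltage control ideas above can be sketched in a few lines. The reactive headroom follows from the apparent power relation S² = P² + Q². The Volt–Watt curve is a generic piecewise-linear limit whose breakpoints (1.05 and 1.10 p.u.) are assumptions chosen for illustration and are not taken from any standard or from the cited reports.

```python
import math


def reactive_headroom_kvar(s_rated_kva, p_kw):
    """Reactive power a converter can still exchange without exceeding its rating."""
    p = min(abs(p_kw), s_rated_kva)
    return math.sqrt(s_rated_kva**2 - p**2)


def volt_watt_limit(v_pu, v_start=1.05, v_end=1.10):
    """Fraction of rated active power that may be injected at voltage v_pu.

    Full injection is allowed up to v_start, none above v_end, with a linear
    ramp in between (hypothetical breakpoints).
    """
    if v_pu <= v_start:
        return 1.0
    if v_pu >= v_end:
        return 0.0
    return (v_end - v_pu) / (v_end - v_start)


# A 5 kVA solar converter injecting 3 kW still has 4 kvar available;
# at night (0 kW) its whole 5 kVA rating can be used for reactive power.
print(reactive_headroom_kvar(5.0, 3.0), reactive_headroom_kvar(5.0, 0.0))
print(volt_watt_limit(1.00), volt_watt_limit(1.10))
```

The night-time case matches the example in the text: with no active power to inject, the full capacity of the solar converter is available to support the voltage.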
3.2 Specific Products for Balance Responsible Parties (BRPs)
BRPs are responsible for maintaining the energy balance in the system and are accountable for deviations from that balance. Because of this, it is common for a single BRP to represent a diversified portfolio of generation technologies, as well as consumers or retailers, in order to reduce the risk of mismatches and therefore reduce costs. In this sense, having an aggregator within its portfolio of represented companies can be of great value to the BRP precisely because of the flexibility that the aggregator manages. There are three mechanisms by which a BRP can monetize the flexibility services provided by an aggregator, which are the following:
• Day-ahead portfolio optimization: The flexibility provided by the aggregator to the BRP will allow it to operate in the day-ahead electricity market by transferring net load from periods of high market price to periods of low market price, reducing its cost of energy purchase. In this way, the BRP could monetize in the daily market the flexibility provided by the aggregator, which in turn should share these benefits with the managed prosumers.
• Intraday portfolio optimization: The model is similar to the one presented for the day-ahead market operation, but in this case, the flexibility provided by the aggregator will be used by the BRP to operate in the different intraday markets.
• Self-balancing portfolio optimization: If any deviation from the schedule is detected in one of the so-called imbalance settlement periods (ISPs), the aggregator can provide the BRP with the necessary flexibility to reduce this imbalance and thus avoid penalties. The difference between this case and the two previous ones is that now the BRP will not use the aggregator’s flexibility to operate in the market but to reduce its deviation from its scheduled energy. However, as before, the BRP must remunerate the aggregator for this flexibility with an amount that must obviously be less than the penalty avoided. The aggregator must in turn remunerate the owners of the aggregated devices who are the ultimate suppliers of the flexibility. For example, an aggregator specialized in V2G applications could work with a BRP whose portfolio is based on renewable generation. In cases where the generation is lower than expected, the aggregator could send an order to the vehicles to inject power into the network during part of the ISP (usually 15 min), thus avoiding the deviation penalty. If the generation was higher than expected, the aggregator would give a charge order to the vehicles. In this way, remuneration could be obtained not only for injecting power into the network but also for consuming it in the appropriate period.
• Hedging/portfolio adequacy: In this case, the aggregator would sign a bilateral contract with a BRP in such a way that it would activate its flexibility at a fixed price in the event that the market price at which the BRP was purchasing the energy exceeded a certain value.
As in the previous case, the service provided by the aggregator to the BRP would be remunerated according to the conditions set out in the contract and not through a market mechanism [8].
In the case of the three previous products, it must be taken into account that in some cases they may not be compatible with congestion management services provided to the distributor, i.e. the BRP may demand an increase in consumption from the aggregator, but this increase in consumption is incompatible with avoiding congestion in a given distribution area. In these cases, there are mechanisms for prioritizing the power increase or decrease orders of an aggregator when it offers products simultaneously to BRPs and distributors [9].
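Two of the BRP products described above can be illustrated with a minimal sketch: moving flexible consumption to the cheapest day-ahead hours and choosing the V2G order in the self-balancing case. The greedy allocation, the prices and the per-hour limit are assumptions for the example and are not the optimization method used in [8] or [9].

```python
def shift_flexible_load(prices, energy_mwh, max_per_hour_mwh):
    """Greedily place flexible energy in the cheapest hours, up to a per-hour limit."""
    schedule = [0.0] * len(prices)
    remaining = energy_mwh
    for hour in sorted(range(len(prices)), key=lambda h: prices[h]):
        if remaining <= 0:
            break
        schedule[hour] = min(max_per_hour_mwh, remaining)
        remaining -= schedule[hour]
    return schedule


def self_balancing_order(scheduled_mwh, forecast_mwh):
    """V2G order for a generation portfolio: inject on a shortfall, charge on a surplus."""
    deviation = forecast_mwh - scheduled_mwh
    if deviation < 0:
        return ("inject", -deviation)
    if deviation > 0:
        return ("charge", deviation)
    return ("none", 0.0)


# Hypothetical day-ahead prices (EUR/MWh) for four hours and 3 MWh of flexible load.
prices = [50.0, 20.0, 80.0, 30.0]
schedule = shift_flexible_load(prices, energy_mwh=3.0, max_per_hour_mwh=2.0)
cost = sum(p * e for p, e in zip(prices, schedule))
print(schedule, cost)                    # [0.0, 2.0, 0.0, 1.0] 70.0
print(self_balancing_order(10.0, 8.0))   # ('inject', 2.0)
```

A per-hour limit such as `max_per_hour_mwh` is also one simple place where a distributor's congestion constraint could enter the schedule, which is the compatibility issue raised in the paragraph above.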
3.3 Specific Products for Balance Service Providers (BSPs)
The services that an aggregator can provide to a BSP are very similar in technical terms to those provided to a BRP; the main difference is the way in which the BSP benefits from them. In this case, the BSP can use the flexibility to participate in operation markets selling primary, secondary or tertiary regulation, so the products offered are clear:
• Primary control: For this model, the aggregator should configure the distributed resources it manages to respond very quickly and automatically to changes
in network frequency. This does not pose any technical difficulty since the converters used to interface the distributed resource with the grid have phase-locked loop (PLL) systems that allow them to synchronize with the network and therefore estimate variations in frequency. In the case of electric vehicles, on detecting a drop in the grid frequency, the charger could automatically cut off the charge and thus assist in restoring the frequency and supporting the system. If the vehicles were also equipped with V2G technology, they could also react to a drop in the frequency by injecting power. It should be borne in mind that the energy managed in the primary regulation is relatively small because its duration does not exceed a few seconds, so the charging times of the vehicles in the example described would not be affected. In some countries such as Spain, this service is compulsory but not paid. In other countries, such as Germany, this service is auctioned on a weekly basis.
• Secondary control: The activation signal for flexibility would be automatically generated in a central control system, which would impose the activation of power increase or decrease by zones. In this case, the power increase or decrease should also be maintained for short periods of time determined by the duration of the imbalance settlement periods (15 min), so the activation of this service would not, in principle, entail a relevant loss of comfort for the end users in the vast majority of cases. In other words, delaying by 15 min the turning on or off of a heat pump or the charging of an electric vehicle would have practically no impact on the end user, but it could bring important benefits for the system and also generate significant economic revenues for the aggregator and the end users.
• Tertiary control: The business model of providing flexibility to participate in tertiary regulation is very similar to that described for secondary regulation, but in this case, the response time is longer, and the power increase or decrease must be maintained for a longer period. The aggregator could stagger short-duration orders of power increase/decrease to the different devices that it manages, making the overall action last the 2 h required but without causing losses of comfort to the end users.
As mentioned for services provided to BRPs, in the case of services provided to BSPs, there may be temporary incompatibilities with services provided to distributors in case the aggregator offers services simultaneously to both agents. However, there is no incompatibility between services provided by the aggregator to a BRP and to a BSP simultaneously.
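The staggering of short-duration orders described for tertiary control can be illustrated with a simple round-robin schedule. The sketch below is an assumption-laden example, not the authors' method: it supposes that each device can deliver the requested power change on its own, and the function name, the 15 min slot length and the device names are hypothetical.

```python
def stagger_orders(devices, total_minutes=120, slot_minutes=15):
    """Assign each time slot to one device in round-robin order.

    Returns a list of (start_minute, end_minute, device) tuples. Every slot is
    covered, so the aggregate response lasts `total_minutes`, while each single
    device is only acted upon during short, non-consecutive slots (as long as
    more than one device is available).
    """
    if not devices:
        raise ValueError("at least one device is required")
    n_slots = total_minutes // slot_minutes
    schedule = []
    for slot in range(n_slots):
        start = slot * slot_minutes
        device = devices[slot % len(devices)]
        schedule.append((start, start + slot_minutes, device))
    return schedule


# Hypothetical portfolio of four flexible devices covering a 2 h activation.
plan = stagger_orders(["heat_pump_1", "ev_charger_1", "heat_pump_2", "ev_charger_2"])
for start, end, device in plan:
    print(f"{start:3d}-{end:3d} min: {device}")
```

With four devices and eight slots, each device is interrupted twice for 15 min, with 45 min of normal operation in between.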
3.4 Specific Products for Prosumers or Active Consumers
The prosumer is the key player and ultimate provider of the flexibility that will most commonly be managed through an aggregator or energy community. In the case of the energy community, it can assume the functions of the aggregator, manage the
flexibility of its associates and reach agreements with both BRPs and BSPs to put a value on this flexibility. As energy communities are, by definition, non-profit-making entities, the profits obtained would be used to reduce the price of the electricity supply of their associates. This price reduction should be applied in proportion to the flexibility provided by each of the associates. When an active consumer signs a supply contract with a retailer, the latter can act as an aggregator, managing not only the consumer’s supply but also its flexibility and allowing the consumer to share in the profits obtained by using its flexibility in the various markets previously described. It may also be the case that the aggregator and the retailer are separate entities, so that the retailer is responsible for supplying the energy to the prosumer, but it is the aggregator that manages its flexibility and operates with it by offering it to distributors, BRPs and BSPs, in which case the prosumer should also share in the profits obtained by the aggregator. To illustrate, a given active consumer could contract his supply with retailer “A”, but the management of the charge/discharge of his EV in the case of V2G would be assigned to an aggregator “B”. Company A would install a meter covering the whole installation (house plus vehicle) to bill for the energy consumed. Company B would install a meter that would only cover the EV charger in order to know how much flexible energy has been used. The consumer would pay A for the energy used to charge the car, but B would pay the consumer for managing the charge/discharge process of the vehicle. The possibilities here are endless.
The previous paragraph has described how to involve prosumers in the benefits that aggregators can obtain from managing the flexibility provided by consumers.
However, there are products that aggregators can offer directly and specifically to consumers so that they benefit directly from their own flexibility. These services are as follows [8, 9]:
• Time-of-use optimization: The tariffs that retailers offer to their customers sometimes have variable prices in the different periods of the day, and in some cases, these prices may even vary dynamically in real time. The aggregator can learn the customer’s capabilities and habits in order to shift as much load as possible from high- to low-price periods, so that the consumer gets the most benefit from its flexibility by reducing its energy cost.
• Maximum power control: A very important term in the majority of electricity tariffs offered by retailers is the so-called power term, which represents the cost of the availability of power. In other words, the higher the peak power the user can consume, the more the user pays. In many cases, the power term represents a very important part of the tariff, and the user only reaches this power in specific situations, e.g. when charging the vehicle coincides with a consumption peak in the home. Let us assume that a user has a photovoltaic generator, an accumulation system and an electric vehicle. Let us also suppose that the surplus energy generated by the PV panel during the day is used to charge the battery and the battery begins to discharge at 7 p.m. because the electricity consumption of the home begins to increase. At 10 p.m., the electric vehicle starts to charge, but the battery is already discharged so that all the power of the vehicle plus that of
the house has to be obtained from the network, which means a high peak and implies the need to increase the power term of the tariff and with it the total cost. The aggregator would charge the battery during the day but, instead of starting to discharge it at 7 p.m., would wait until the vehicle demanded power at 10 p.m. and then discharge the battery to supply the vehicle; in this way, the impact of the vehicle would be smaller, and the user could reduce the power term of the tariff. This is just one of countless examples that can be given.
• Self-balancing: Similar to the previous product and available to those prosumers who have significant flexible capacity, self-balancing techniques would allow all the prosumer’s flexible resources to be managed in an integrated and optimal way, taking into account energy purchase and sales prices, capacities and consumer habits.
• Controlled islanding: In the case of weak networks with power quality problems such as micro outages or undervoltage or overvoltage problems, the aggregator could intentionally isolate the consumer from the network and make it self-sufficient for a certain period of time.
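The maximum power control example (PV, battery and EV) can be checked numerically with a small sketch. All figures below (hourly house demand, EV charging power, battery energy and power rating) are hypothetical values chosen for illustration, and the one-hour time step is an assumption that keeps the arithmetic simple.

```python
def peak_grid_power(load_kw, battery_kwh, max_discharge_kw, start_hour):
    """Return the peak power drawn from the grid (kW) over an hourly profile.

    load_kw: dict hour -> total demand (house + EV) in kW, one-hour steps.
    The battery discharges from `start_hour` onwards, covering as much of the
    demand as its power rating and remaining energy allow.
    """
    energy_left = battery_kwh
    peak = 0.0
    for hour in sorted(load_kw):
        demand = load_kw[hour]
        discharge = 0.0
        if hour >= start_hour:
            # One-hour steps, so kW and kWh are numerically interchangeable.
            discharge = min(demand, max_discharge_kw, energy_left)
            energy_left -= discharge
        peak = max(peak, demand - discharge)
    return peak


# Hypothetical evening profile: house demand from 19:00, EV charging from 22:00.
house = {19: 2.0, 20: 3.0, 21: 3.0, 22: 2.0, 23: 1.0}
ev = {22: 3.7, 23: 3.7}
total = {h: house[h] + ev.get(h, 0.0) for h in house}

early = peak_grid_power(total, battery_kwh=6.0, max_discharge_kw=3.0, start_hour=19)
late = peak_grid_power(total, battery_kwh=6.0, max_discharge_kw=3.0, start_hour=22)
print(f"discharge from 19:00 -> peak {early:.1f} kW")
print(f"discharge from 22:00 -> peak {late:.1f} kW")
```

With these assumed numbers, holding the battery back until the vehicle starts charging lowers the peak drawn from the network, which is the effect that would allow the user to contract a lower power term.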
4 Conclusions
The process of decarbonization is closely linked to a growing trend in electrification. This electrification will bring about a notable increase in the demand for electrical energy and in the level of penetration of distributed resources such as renewable generation, storage systems, electric vehicles, heat pumps, etc. This change in the energy model is not merely an option but a real priority for the technical, economic and environmental sustainability of the system. A coordinated integration that exploits all the capacities of digitalization of the network and the market mechanisms available can be beneficial for each and every one of the agents in the electrical system and especially for the final consumer. Any other option than coordinated integration is simply not feasible. The role of the aggregator is essential for managing the flexibility in the network provided by the new distributed resources. The catalogue of business models available to aggregators is very broad since they can provide services to almost all agents in the system. In some cases, these services are compatible over time, and in others they are not. It is currently not clear which will be the most profitable and widespread aggregation model, if any, or if there will simply be different types of aggregators that manage different technologies with different business models, but what is clear is that this figure will be at the centre of the energy paradigm shift in the short to medium term, at least as far as the European Union is concerned.
References
1. Council of European Union, Clean Energy for all Europeans Package (2019)
2. Council of European Union, Directive (EU) 2019/944 of the European Parliament and of the Council of 5 June 2019 on common rules for the internal market for electricity and amending Directive 2012/27/EU (2019)
3. Council of European Union, Directive 96/92/EC of the European Parliament and of the Council of 19 December 1996 concerning common rules for the internal market in electricity (1996)
4. Council of European Union, Directive 2012/27/EU of the European Parliament and of the Council of 25 October 2012 on energy efficiency, amending Directives 2009/125/EC and 2010/30/EU and repealing Directives 2004/8/EC and 2006/32/EC (2012)
5. IEEE Draft Standard Conformance Test Procedures for Equipment Interconnecting Distributed Energy Resources with Electric Power Systems and Associated Interfaces, in IEEE P1547.1/D9.8, December 2019 (2019), pp. 1–283
6. T. C. 57 IEC, IEC 61850: Communication networks and systems for power utility automation, in International Electrotechnical Commission Std, vol. 53 (2010), p. 54
7. Common Functions for Smart Inverters, 4th edn. (EPRI, Palo Alto, 2016). 3002008217
8. H. de Heer, M. van der Laan, USEF: Workstream on Aggregator Implementation Models. USEF Aggregator Workstream Final Report (2017)
9. P. Olivella-Rosell, P. Lloret-Gallego, P. Munné-Collado, R. Villafafila-Robles, A. Sumper, S. Ottessen, J. Rajasekharan, B. Bremdal, Local flexibility market design for aggregators providing multiple flexibility services at distribution network level. Energies (11), 882 (2018)
Electrical Railway Power Supply Systems for High-Speed Lines: From Traditional Grids to Smart Grids
Daniel Serrano-Jimenez, Sandra Castano-Solis, Eneko Unamuno, and Jon Andoni Barrena
1 Introduction
Electrical railway power supply systems (ERPSS) are defined as the set of elements required to feed the trains with the necessary energy to ensure their proper operation. The type and configuration of these elements have changed significantly over time, driven by the technological developments available at each moment. Historically, the first railway electrifications date back to the late nineteenth century. They consisted of low-voltage DC installations, typically 750–1500 V, supplied from the utility grid by means of rotary converters or mercury arc rectifiers. The simplicity of the speed control of DC motors made this type of system very convenient at first, but their reduced maximum voltage capability motivated the development of AC electrification. AC railway electrification arose at the beginning of the twentieth century with the development of the series-wound DC motors with pole commutator. Although these motors can run on AC, the supply must have a low frequency to prevent sparking during commutation. The need for a reduced frequency was addressed by two different approaches. The first and earlier one was the creation of proprietary transmission systems with their own generation. This topology was adopted in Germany, Austria, and Switzerland. The other approach was based on the utilization
D. Serrano-Jimenez, ETS de Ingeniería de Minas y Energía, Universidad Politécnica de Madrid, Madrid, Spain
S. Castano-Solis, ETS de Ingeniería y Diseño Industrial, Universidad Politécnica de Madrid, Madrid, Spain
E. Unamuno · J. A. Barrena, Faculty of Engineering of Mondragon Unibertsitatea (EPS-MU), Arrasate, Spain
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_5
of rotary converters fed from the utility grid, and it was implemented in Sweden and Norway. Both solutions used a nominal voltage of 15 kV and a frequency of 16 2/3 Hz. Meanwhile, DC electrification continued to expand in other countries like Spain or Italy, albeit at a higher voltage of 3 kV. The next milestone in railway electrification came with the inclusion of onboard converters after World War II. This made it possible to use high voltage levels at industrial frequency on the catenary, 25 kV 50 Hz, with DC traction motors on the train. The use of industrial frequency makes it possible to connect the utility grid and the catenary by means of simple transformers, thus reducing the cost of the traction substation significantly. As a result of this evolution process, there are currently five electrification schemes in Europe: 750 V DC, 1500 V DC, 3000 V DC, 15 kV 16 2/3 Hz, and 25 kV 50 Hz [1–4]. Currently, most of the new ERPSS for medium- and high-power-demanding applications such as high-speed railway lines are based on transformer-based configurations. Despite the aforementioned benefits, these systems present important power quality issues that are leading to the development of new technological solutions based on modern power converters. These new systems involve a change not only in the electrical installation but also in the way that railway systems are operated. The next two sections describe the main conventional configurations and the principal developments found for the transformer-based and converter-based systems [5].
2 Transformer-Based Configurations
2.1 Conventional Configurations
ERPSS are typically divided into five main parts: the generation system, the transmission and distribution system, the traction substations, the catenary system, and the electrical traction units. As explained before, the possibility of using industrial frequency on the catenary favored the utilization of the utility grid as the generation system. Regarding the transmission and distribution lines, the owner and operator of the infrastructure have varied. In some cases, the railway operator was in charge of building and maintaining the lines, while in other cases, this role has been adopted by the utility grid operator. The current tendency is toward the latter approach. The traction substations are the elements in charge of connecting the transmission or distribution grid to the corresponding catenary system. For high-power-demanding applications, like high-speed rail, the traction substations must be connected directly to the transmission system. Each substation is usually composed of two transformers in a single busbar arrangement that provides a reasonable trade-off between flexibility and cost.
Fig. 1 Conventional transformer-based configurations
In order to reduce the power imbalance in the utility grid, each transformer is typically connected to different phases of the transmission system. This requires dividing the catenary into isolated electrical sections separated by neutral zones. Figure 1 shows the typical transformer-based arrangements. In the first case, the traction substation is directly connected to the utility grid, while in the second case, there is an intermediate proprietary railway transmission grid. The catenary system is defined as the set of conductors in charge of transmitting the power to the trains from the traction substations. It is usually composed of many different conductors that are grouped according to their voltage level. The positive conductors consist of the contact and messenger wires and additional positive feeders when needed. The neutral conductors are composed of the rails and, where present, return lines. Finally, there is a third group for the negative conductors used in the bivoltage configurations. The simplest catenary system is the direct feeding configuration. As shown in Fig. 2a, it uses a 25 kV single-phase transformer that is connected between the positive lines and the rails. This scheme presents significant voltage drops due to the high resistance of the return path. In order to reduce the current flowing through earth, two principal schemes have been proposed. The first one, shown in Fig. 2b, is based on the utilization of an additional neutral conductor and a set of booster transformers whose primary and secondary windings are connected in series with the positive and neutral lines, respectively. The neutral line is grounded between every two booster transformers, providing an additional path for the current to flow. The second scheme is shown in Fig. 2c. This configuration uses a single-phase 50 kV transformer with a grounded secondary center tap to feed the +25 kV
Fig. 2 Catenary feeding schemes (a) Monovoltage, direct feeding scheme (b) Monovoltage, booster feeding scheme (c) Bivoltage, autotransformer feeding scheme
positive lines and the –25 kV negative line. To reduce the transmission voltage from 50 kV to 25 kV, a set of autotransformers is placed along the catenary. Since this system uses two different voltage levels, it is also known as bivoltage. It is important to highlight the possibility of using nonsymmetrical voltages on the catenary lines. Although the cost of these two latter schemes is higher than that of the direct feeding one due to the additional conductors and transformers used, the lower voltage drop in the catenary makes it possible to increase the length of the electrical sections, thus reducing the number of traction substations required.
The electrical traction unit is the last element in the power supply scheme. It is in charge of providing electrical energy to the traction motors from the catenary. The type of these motors has changed significantly over time. As commented before, the first motors used in railway electrification were DC and single-phase AC motors. Subsequently, with the development of power electronic converters, the use of three-phase AC asynchronous and synchronous motors has become the standard solution due to their lower cost and easier maintenance. Finally, it is important to highlight that the power drawn from the catenary must also include the power of the auxiliary services such as the heating/cooling system or the lighting system.
Despite being the most common electrification scheme now, transformer-based configurations present important issues from the power quality point of view [6]. The power imbalance is perhaps the most important one, although the harmonic pollution and the low power factor are also important issues to consider. As is generally known, the connection of single-phase loads, such as railways, to three-phase utility grids produces a negative sequence current injection that distorts the voltage. This distortion can lead to malfunctioning of the electrical equipment when it is not conveniently mitigated.
To avoid it, transformer-based railway systems must be connected to utility grids with sufficient short-circuit power to accept such imbalance [7, 8]. Besides the aforementioned power quality concerns, transformer-based configurations face a major obstacle to enhancing the efficiency and reliability of the railway operation, namely the neutral zones. As commented before, the single-phase transformers of the traction substations are usually connected to different phases of the utility grid, which makes it necessary to divide the catenary into isolated electrical sections. This sectioning reduces the potential use of regeneration power and prevents the trains from being fed from different traction substations at the same time. The next sections describe the most important solutions found in the literature to overcome these issues.
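The negative sequence injection caused by a single-phase load, and the reason why a high short-circuit power is required, can be illustrated with the following sketch. It applies the standard symmetrical-component (Fortescue) transformation to a load connected between two phases and uses the common rule of thumb that the voltage unbalance factor is roughly the ratio of the single-phase load power to the short-circuit power at the connection point. The function names and the numerical values are assumptions for illustration only.

```python
import cmath
import math

A = cmath.exp(2j * math.pi / 3)  # Fortescue operator: 1 at 120 degrees


def sequence_components(ia, ib, ic):
    """Return (zero, positive, negative) sequence phasors of three line currents."""
    i0 = (ia + ib + ic) / 3
    i1 = (ia + A * ib + A**2 * ic) / 3
    i2 = (ia + A**2 * ib + A * ic) / 3
    return i0, i1, i2


def approx_voltage_unbalance(s_load_mva, s_sc_mva):
    """Rule-of-thumb voltage unbalance factor for a single-phase load."""
    return s_load_mva / s_sc_mva


# A traction transformer connected between phases a and b draws a current I
# from phase a and returns it through phase b; phase c carries no current.
i = 1.0
i0, i1, i2 = sequence_components(i, -i, 0.0)
print(f"|I2| / |I1| = {abs(i2) / abs(i1):.3f}")

# Hypothetical case: 20 MVA single-phase load on a busbar with 2000 MVA
# of short-circuit power.
print(f"approximate voltage unbalance = {approx_voltage_unbalance(20.0, 2000.0):.1%}")
```

The first result shows that such a load injects as much negative sequence current as positive sequence current, so the only passive way to keep the resulting voltage unbalance small is a stiff grid.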
2.2 Balancing Transformers
Balancing transformers are specially connected power transformers capable of producing a balanced system of n phases from a balanced system of m phases. According to [9], this balanced transformation is based on two premises: the existence of a balanced set of electromotive forces on both sides of the transformer and a balanced set of magnetomotive forces on each limb of the transformer.
Fig. 3 Balancing transformer configurations (a) Scott (b) Le-Blanc (c) Impedance matching (d) Modified Woodbridge
The four balancing connections most used in railway electrification are Scott, Le-Blanc, impedance matching, and Woodbridge [10, 11]. As shown in Fig. 3, these transformers present complex connections that usually require an unequal number of turns on each limb to fulfill the two previous premises. These characteristics greatly complicate the manufacturing process and thus increase their cost in comparison to conventional transformers. The reduction of the voltage imbalance by means of balancing transformers has been comprehensively studied in the literature [12, 13]. Most of these works make use of simple formulae to calculate such impact. However, as [14] states, this approach can underestimate its real value because it neglects the interaction between the railway system and the electrical system. In [15], the technical performance and cost of different balancing and conventional transformers are compared. It concludes that although balancing transformers have a higher initial cost, they require less power compensating capacity, which results in a lower total investment. Furthermore, balancing transformers can be an interesting option to reduce harmonic pollution due to their special connections, which can lead to a natural harmonic cancelation [16].
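To make the balancing effect concrete, the sketch below evaluates the primary line currents of an idealized Scott connection for different loads on its two single-phase secondaries. It assumes the textbook arrangement (main primary winding of N1 turns between phases B and C, teaser winding of √3/2 · N1 turns from phase A to the midpoint of the main winding), ideal windings and unity power factor loads; the function names and the turns ratio k are illustrative.

```python
import cmath
import math

A = cmath.exp(2j * math.pi / 3)  # Fortescue operator


def scott_primary_currents(i_teaser, i_main, k=1.0):
    """Primary line currents (ia, ib, ic) of an ideal Scott connection.

    i_teaser, i_main: secondary current phasors of the teaser and main windings.
    k: assumed secondary-to-primary turns ratio N2/N1 of the main winding.
    """
    ia = (2 / math.sqrt(3)) * k * i_teaser  # ampere-turn balance on the teaser
    ib = k * i_main - ia / 2                # the teaser current splits equally
    ic = -k * i_main - ia / 2               # between both halves of the main
    return ia, ib, ic


def unbalance_ratio(ia, ib, ic):
    """Ratio of negative to positive sequence current magnitude."""
    i1 = (ia + A * ib + A**2 * ic) / 3
    i2 = (ia + A**2 * ib + A * ic) / 3
    return abs(i2) / abs(i1)


# With a positive sequence supply, the main secondary voltage lags the teaser
# voltage by 90 degrees, so a unity power factor main current is -1j times
# its magnitude when the teaser current is taken as reference.
for teaser_load, main_load in [(1.0, 1.0), (1.0, 0.5), (1.0, 0.0)]:
    currents = scott_primary_currents(teaser_load, -1j * main_load)
    print(teaser_load, main_load, round(unbalance_ratio(*currents), 3))
```

Under these assumptions the primary currents are balanced only when both catenary sections draw the same power; with a single loaded section the transformer behaves like a plain single-phase connection.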
2.3 Converter Compensators
SVC
The static VAR compensator (SVC) is an electronic device capable of providing variable impedance by means of its controllable switches, typically thyristors. SVCs can be classified as either thyristor-controlled reactor (TCR) when the variable impedance is a reactor or thyristor-switched capacitor (TSC) when it is a capacitor [17]. It is also common to add parallel constant impedance branches to cancel certain harmonics. The most common application of these electronic devices is voltage control. Raising the voltage level of the catenary makes it possible to increase the power transmission capability of the line and thus the number of trains that can be fed in the same section. To this end, a single-phase TSC is placed at the beginning or at the end of each section, regulating the reactive power injected depending on the voltage of the catenary [18, 19]. The main drawback of this solution is that it only provides discrete levels of reactive compensation. Besides voltage control, SVCs can also be applied to balance the load. To this end, a three-phase SVC in delta arrangement is placed on the utility grid side of the traction substation [20]. The Steinmetz circuit demonstrates that, by controlling the reactive components connected to each phase, the system is able to transfer power between phases to balance the load. It is important to take into account the possible resonances between the SVC and the system [21].
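The Steinmetz principle mentioned above can be verified numerically. The sketch assumes the classical case of a purely resistive single-phase load of conductance G between phases a and b, compensated with a capacitor of susceptance G/√3 between b and c and an inductor of the same magnitude between c and a, under a positive sequence supply of 1 pu. Function names and per-unit values are illustrative.

```python
import cmath
import math

A = cmath.exp(2j * math.pi / 3)  # Fortescue operator


def delta_line_currents(g_load, compensated=True):
    """Line currents (ia, ib, ic) of a delta-connected load on a 1 pu supply.

    g_load: conductance of the single-phase load between phases a and b.
    compensated: if True, add the Steinmetz capacitor (b-c) and inductor (c-a).
    """
    va, vb, vc = 1.0, A**2, A  # positive sequence phase voltages
    b = g_load / math.sqrt(3) if compensated else 0.0
    i_ab = g_load * (va - vb)      # resistive load branch
    i_bc = 1j * b * (vb - vc)      # capacitive branch
    i_ca = -1j * b * (vc - va)     # inductive branch
    return i_ab - i_ca, i_bc - i_ab, i_ca - i_bc


def sequence_magnitudes(ia, ib, ic):
    """Return (|I1|, |I2|), the positive and negative sequence magnitudes."""
    i1 = (ia + A * ib + A**2 * ic) / 3
    i2 = (ia + A**2 * ib + A * ic) / 3
    return abs(i1), abs(i2)


for flag in (False, True):
    i1, i2 = sequence_magnitudes(*delta_line_currents(1.0, compensated=flag))
    print(f"compensated={flag}: |I1|={i1:.3f} pu, |I2|={i2:.3f} pu")
```

In the compensated case the negative sequence component vanishes and the grid sees a balanced load; a real SVC has to track G as the train load changes, which is why the reactive branches are made controllable.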
STATCOM
The advent of high-power self-commutated switches such as the GTO (gate turn-off thyristor), IGBT (insulated gate bipolar transistor), or IGCT (integrated gate-commutated thyristor) enabled the development of voltage source converters (VSCs) and all their potential applications. The static synchronous compensator (STATCOM) is one of them. It consists of a VSC that can provide continuously variable reactive power compensation without the use of large reactive energy storage elements by simply controlling the voltage magnitude. In [22], the utilization of the STATCOM is presented to increase the catenary voltage during the railway operation. Like the SVC, the STATCOM can also provide load balancing on the three-phase utility grid [23]. The control technique can be based on the Steinmetz circuit principle but also on direct power control. This latter approach relies on the statement that only balanced systems without current and voltage harmonics produce constant instantaneous power. Accordingly, it can provide load balance and harmonic cancelation simultaneously.
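The statement underlying direct power control can be checked with a short numerical experiment. The sketch samples the instantaneous three-phase power over one fundamental period for a balanced set of currents and for an unbalanced one; voltages are assumed balanced and sinusoidal at 1 pu, each current is assumed in phase with its own voltage, and the function names are illustrative.

```python
import math


def instantaneous_power(i_amplitudes, samples=200):
    """Sample p(t) = va*ia + vb*ib + vc*ic over one fundamental period.

    Voltages form a balanced 1 pu set; each current is in phase with its own
    voltage and has the amplitude given in i_amplitudes (phases a, b, c).
    """
    shifts = (0.0, -2 * math.pi / 3, 2 * math.pi / 3)
    power = []
    for n in range(samples):
        wt = 2 * math.pi * n / samples
        p = sum(math.cos(wt + s) * amp * math.cos(wt + s)
                for s, amp in zip(shifts, i_amplitudes))
        power.append(p)
    return power


def ripple(power):
    """Peak-to-peak variation of the sampled instantaneous power."""
    return max(power) - min(power)


balanced = instantaneous_power((1.0, 1.0, 1.0))
unbalanced = instantaneous_power((1.0, 1.0, 0.0))  # phase c carries no current
print(f"balanced ripple:   {ripple(balanced):.4f} pu")
print(f"unbalanced ripple: {ripple(unbalanced):.4f} pu")
```

The balanced case gives a constant power, whereas the unbalanced case shows an oscillation at twice the fundamental frequency; a controller that forces this oscillation to zero therefore balances the currents seen by the grid.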
Fig. 4 Railway power conditioner
Railway Power Conditioner
Another common power converter compensator is the so-called railway power conditioner. As shown in Fig. 4, this solution consists of two single-phase VSCs in back-to-back configuration, whose AC sides are connected to different catenary sections by means of step-up transformers [24, 25]. The use of an appropriate coordination control between the converters can solve the problems of power imbalance, harmonic pollution, and power factor correction at the same time. Efforts to improve this compensator have been aimed at simplifying the converter topology. For example, the work presented in [26] proposes the use of a half-bridge converter configuration instead of the full-bridge topology implemented in its conventional arrangement. This approach reduces the number of switches required by half. Other proposals [27, 28] advocate replacing the two single-phase converters with a three-phase converter. In this case, the number of switches is reduced from eight to six. Finally, in order to eliminate the use of step-up transformers, there are two recent proposals based on the utilization of a hybrid converter [29] or a modular multilevel converter [30].
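A simplified view of the coordination between the two converters is sketched below. It assumes that the two catenary sections are fed from a balancing transformer (for instance a Scott connection), for which equal active powers and zero reactive power on both secondaries correspond to a balanced three-phase load, and that the back-to-back link is lossless. The function names and the sign convention are hypothetical and are not taken from the cited works.

```python
def rpc_setpoints(p1_mw, q1_mvar, p2_mw, q2_mvar):
    """Return converter set-points ((p_c1, q_c1), (p_c2, q_c2)).

    Sign convention (assumed): positive values are absorbed by the converter
    from its catenary section; negative values are injected into it.
    """
    # Move half of the active power difference through the DC link.
    transfer = (p1_mw - p2_mw) / 2.0
    conv1 = (-transfer, -q1_mvar)  # inject into section 1, supply its reactive power
    conv2 = (transfer, -q2_mvar)   # absorb from section 2, supply its reactive power
    return conv1, conv2


def transformer_side(load, converter):
    """Power seen by one transformer secondary: load plus converter absorption."""
    return load[0] + converter[0], load[1] + converter[1]


# Hypothetical section loads: 10 MW / 3 Mvar and 4 MW / 1 Mvar.
load1, load2 = (10.0, 3.0), (4.0, 1.0)
c1, c2 = rpc_setpoints(*load1, *load2)
print(transformer_side(load1, c1), transformer_side(load2, c2))
```

With these assumed loads both secondaries end up supplying the same active power and no reactive power, while the net active power exchanged by the two converters is zero.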
Co-phase Power Conditioner
The co-phase power conditioner [31, 32] can be seen as an evolution of the railway power conditioner that can not only solve the power quality concerns of the conventional transformer-based configurations but also reduce the number of neutral zones. As shown in Fig. 5, it consists of two single-phase VSCs in back-to-back configuration connected in parallel to the traction transformer. Because the converter can control the voltage, the neutral zone at the front of each traction substation can be avoided. This solution can also be implemented using a balancing transformer [33]. There are three main operation modes depending on the element providing the energy. In the normal operation mode, the energy is supplied by both the transformer and the converter. In the other two modes, only one of the elements of the traction substation is available, leading to the converter operation mode and the transformer operation mode. It is important to highlight that these two latter operation modes correspond to degraded situations and that, in the transformer case, the power quality issues are the same as those of a monovoltage direct feeding configuration. In order to reduce the compensation capacity of the converter and, thus, its cost, some authors have proposed combining the co-phase system with passive reactive elements connected in series with the step-up transformers [34, 35]. This approach is known as the hybrid co-phase power conditioner, and it is shown on the right side of Fig. 5.
Fig. 5 Co-phase power conditioner
3 Converter-Based Systems
3.1 Conventional and Advanced AC Systems
As described in the introduction, the difficulties of using AC motors at industrial frequency at the beginning of railway electrification led to the development of the AC low-frequency railway systems. These systems can be implemented following two different approaches: centralized and decentralized. The centralized approach is based on the utilization of a proprietary transmission line that connects to the catenary through a set of traction transformer substations. In this configuration, the electrical power is principally supplied by dedicated power plants, although it can also include some connections to the utility grid by means of traction converter substations when needed. In contrast, the decentralized approach connects the catenary to the utility grid by means of traction converter substations. It is important to mention the possibility of building a transmission grid parallel to the catenary in order to increase its power transmission capability. Figure 6 shows the two aforementioned approaches. Regarding the traction converter substations, there are two possible configurations: rotary and static. Rotary converters were the first to appear, and they consist of a three-phase motor connected to a single-phase generator by a mechanical shaft. Depending on the motor type of the converter, the frequency conversion can be fixed or variable. If the motor is synchronous, the frequency conversion is fixed and equal to the ratio between the numbers of poles of the two electrical machines. On the other hand, if the motor is asynchronous, the frequency conversion can be regulated. As will be discussed later, the type of converter used must be in accordance with the operation scheme adopted in the railway grid. The high power losses along with the considerable maintenance cost of rotary converters have promoted the utilization of static ones.
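The fixed frequency conversion of a synchronous-synchronous rotary converter follows directly from the synchronous speed relation n = 120 f / p. The short sketch below reproduces it; the pole numbers are hypothetical values chosen so that a 50 Hz grid yields 16 2/3 Hz, and the function name is illustrative.

```python
from fractions import Fraction


def rotary_output_frequency(f_grid_hz, motor_poles, generator_poles):
    """Output frequency of a synchronous-synchronous rotary converter.

    Both machines share one shaft, so they turn at the same synchronous speed
    n = 120 * f / poles (rpm). Equating both speeds gives
    f_out = f_grid * generator_poles / motor_poles.
    """
    return Fraction(f_grid_hz) * Fraction(generator_poles, motor_poles)


# Hypothetical machine pair: 12-pole motor on a 50 Hz grid, 4-pole generator.
f_out = rotary_output_frequency(50, 12, 4)
print(f_out, float(f_out))
```

Because the ratio is fixed by construction, the catenary frequency and phase angle follow those of the utility grid, which is the reason for the angle considerations discussed for decentralized systems.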
The first static converters were based on a direct power conversion approach by means of cycloconverters with thyristor valves. Later on, with the development of VSCs with self-commutated switches, DC link converters have become the standard solution. They consist of a three-phase converter and a single-phase converter in back-to-back configuration. The topologies of these converters have changed significantly over time. The first approaches were based on the series connection of two-level converters with high-power switches. Subsequently, the different multilevel topologies with mature medium-power switches [36–38] have been gaining ground. One of the most promising topologies is the modular multilevel converter [39, 40], which can operate without the use of the final step-up transformer [41]. It consists of a string of identical independent submodules or cells, typically made of two-level half-bridge converters connected in series to reach the desired voltage level. In contrast to transformer-based configurations, AC converter-based systems present a non-sectioned catenary that makes it possible to feed the trains from multiple traction substations simultaneously. The parallel connection of these substations requires having approximately the same phase in all of them under no load
Electrical Railway Power Supply Systems for High-Speed Lines: From. . .
Fig. 6 Conventional AC converter-based configurations: (a) centralized scheme, (b) decentralized scheme
D. Serrano-Jimenez et al.
conditions. This issue is particularly important in decentralized systems with synchronous-synchronous rotary converters. As previously explained, they perform a fixed frequency conversion, and thus the voltage angle set on the catenary at no load is directly proportional to the angle on the utility grid. In order to avoid circulating currents, the voltage angle differences must be carefully studied and reduced when needed. The most common measures to reduce them are the addition of series reactors and the use of tap-changer transformers.

The operation of conventional AC converter-based systems depends on the configuration scheme adopted. In centralized schemes, the railway grid is operated asynchronously to the utility grid. This means that the railway operator is in charge of frequency regulation and active power control of the railway grid. Any rotary converters connected to the utility grid must have an asynchronous-synchronous configuration capable of accommodating the frequency variations between the two grids. In contrast, decentralized schemes are operated synchronously to the utility grid. In this case, the frequency regulation is performed by the synchronous generators of the utility grid, but active power control is limited to the static converter stations, where they exist. This type of configuration requires the use of synchronous-synchronous rotary converters.

With the outstanding development of power electronic converters, controllable static converter stations have become a very attractive option for the electrification of high-speed railway lines. On the one hand, they do not introduce any power imbalance into the utility grid, and they can reduce the harmonic pollution and correct the power factor at the same time.
On the other hand, they can implement an active power control strategy that improves the reliability and efficiency of railway system operation. In light of these benefits, some authors have proposed the development of advanced 25 kV, 50 Hz converter-based systems [42–44]. Despite being decentralized systems, they operate asynchronously to the utility grid, and thus they combine the benefits of both configurations. This approach is aligned with the current evolution of electrical power systems toward smart grids [45].
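The no-load angle condition for parallel substations discussed in this section can be illustrated with the textbook power-angle relation for two voltage sources joined by a purely reactive link. The sketch below is an assumption-laden simplification (lossless link, hypothetical voltages, angle, and reactances; the function name is not from the chapter), intended only to show why a small angle difference drives a circulating power and why extra series reactance reduces it:

```python
import math

def circulating_power_mw(v1_kv, v2_kv, angle_diff_deg, reactance_ohm):
    """Active power exchanged between two sources through a lossless reactance.

    Uses P = V1 * V2 * sin(delta) / X with RMS voltages in kV and X in ohm,
    which gives the result directly in MW.
    """
    delta = math.radians(angle_diff_deg)
    return v1_kv * v2_kv * math.sin(delta) / reactance_ohm

# Hypothetical case: two 15 kV substations, 5 degrees apart at no load.
base = circulating_power_mw(15.0, 15.0, 5.0, 10.0)
with_reactor = circulating_power_mw(15.0, 15.0, 5.0, 20.0)
print(f"{base:.2f} MW without reactor, {with_reactor:.2f} MW with doubled reactance")
```

Doubling the link reactance halves the circulating power for the same angle difference, which is the effect sought with series reactors.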
3.2 Conventional and Advanced DC Systems

Conventional DC systems consist of a set of traction converter substations connected in parallel to a continuous catenary. As shown in Fig. 7, these traction substations can be supplied directly from the utility grid or through an intermediate distribution line. In both cases, there are two main topologies for the traction substation. The first and simplest consists of a six-pulse uncontrolled rectifier. The second and most widespread is based on a 12-pulse uncontrolled rectifier. In this case, the traction substation requires a three-phase transformer with two secondary windings, one connected in wye and the other in delta.
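One practical difference between the two rectifier topologies is the set of harmonics they inject on the AC side. As a brief sketch based on the standard result for an ideal p-pulse rectifier (characteristic orders h = kp ± 1; the function name is an assumption), the 30° shift between the wye and delta secondaries removes the 5th, 7th, 17th, and 19th harmonics in the 12-pulse case:

```python
def characteristic_harmonics(pulse_number, max_order):
    """AC-side characteristic harmonic orders h = k*p +/- 1 of an ideal p-pulse rectifier."""
    orders = []
    k = 1
    while k * pulse_number - 1 <= max_order:
        for h in (k * pulse_number - 1, k * pulse_number + 1):
            if h <= max_order:
                orders.append(h)
        k += 1
    return orders

print(characteristic_harmonics(6, 25))   # [5, 7, 11, 13, 17, 19, 23, 25]
print(characteristic_harmonics(12, 25))  # [11, 13, 23, 25]
```

Real substations also show non-characteristic harmonics caused by unbalance and transformer tolerances, which this idealized list ignores.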
Fig. 7 Conventional DC converter-based configurations
The use of diode rectifier substations is an important limitation that makes it difficult to reduce the cost of the installation and to enhance railway operation. On the one hand, diode rectifiers cannot regulate the catenary voltage, which increases the number of substations required. On the other hand, they do not allow bidirectional power flow, which reduces the potential of regenerative braking. In this context, railway engineers have proposed replacing the diodes with more advanced power switches. The thyristor rectifier is the most common approach. To obtain bidirectional power flow, the traction substation must include two three-phase thyristor rectifiers in antiparallel arrangement. More recently, VSC substations have been gaining ground. Their fully controllable power switches reduce the traction substation to a single power converter, but their high price is still a major barrier for most applications.

Despite all the advantages brought by the new controllable traction substations, the low voltage levels achieved so far restrict their application to low-power uses. The idea of increasing the catenary voltage has been considered on many occasions in the past [46], but the cost and capabilities of power electronic switches have always prevented it from thriving [47, 48]. Only recently, with the outstanding developments achieved in high-voltage direct-current transmission grids, has this idea gained strength again. In [49], the authors propose a medium-voltage DC converter-based system to feed a high-speed railway line. Since this approach is very similar to the advanced AC converter-based system, this configuration will
be referred to as the advanced DC converter-based system hereafter. These configurations have the same advantages as advanced AC systems, such as active power control, load power balancing, harmonic pollution reduction, and voltage support. However, the use of DC brings additional benefits and some drawbacks. The two most important advantages are the reduction of line losses and the simplification of the control. On the downside, fault clearance is still an open research issue despite the great advances made recently.

Finally, DC electrification schemes can also adopt a bivoltage catenary configuration. To achieve this, the autotransformer stations used in AC must be replaced with equivalent DC/DC autoconverters that redistribute the current between the feeder lines. It is also important to include a DC/DC converter at the front of each traction substation to guarantee balanced DC voltage operation [50].
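The interest in raising the catenary voltage mentioned in this section can be made concrete with a simple resistive-loss estimate. The sketch below assumes a purely resistive feeder and a constant transmitted power; the power, resistance, and the two voltage levels are hypothetical values chosen for illustration, not figures from the chapter:

```python
def resistive_loss_fraction(power_mw, voltage_kv, resistance_ohm):
    """Fraction of the transmitted power dissipated in a resistive DC feeder."""
    current_ka = power_mw / voltage_kv            # MW / kV = kA
    loss_mw = current_ka ** 2 * resistance_ohm    # kA^2 * ohm = MW
    return loss_mw / power_mw

# Hypothetical comparison: 6 MW through 0.1 ohm at 3 kV and at 24 kV.
low = resistive_loss_fraction(6.0, 3.0, 0.1)
high = resistive_loss_fraction(6.0, 24.0, 0.1)
print(f"{low:.4f} vs {high:.4f}, ratio {low / high:.0f}")
```

For the same power and conductor, the loss fraction falls with the square of the voltage, so an eightfold voltage increase cuts it by a factor of 64 in this idealized case.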
References

1. R.J. Hill, Electric railway traction. Part 3: Traction power supplies. Power Eng. J. 8(6), 275–286 (1994). https://doi.org/10.1049/pe:19940604
2. T. Oura, Y. Mochinaga, H. Nagasawa, Railway electric power feeding systems. Japan Railw. Transp. Rev. (16), 48–58 (1998)
3. L. Abrahamsson, T. Schütte, S. Östlund, Use of converters for feeding of AC railways for all frequencies. Energy Sustain. Dev. 16(3), 368–378 (2012). https://doi.org/10.1016/j.esd.2012.05.003
4. A. Steimel, Power-electronic grid supply of AC railway systems, in 2012 13th International Conference on Optimization of Electrical and Electronic Equipment (OPTIM), (2012), pp. 16–25. https://doi.org/10.1109/OPTIM.2012.6231844
5. D. Serrano-Jiménez, L. Abrahamsson, S. Castaño-Solís, J. Sanz-Feito, Electrical railway power supply systems: Current situation and future trends. Int. J. Electr. Power Energy Syst. 92 (2017). https://doi.org/10.1016/j.ijepes.2017.05.008
6. S.M. Mousavi Gazafrudi, A. Tabakhpour Langerudy, E.F. Fuchs, K. Al-Haddad, Power quality issues in railway electrification: A comprehensive perspective. IEEE Trans. Ind. Electron. 62(5), 3081–3090 (2015). https://doi.org/10.1109/TIE.2014.2386794
7. S.-L. Chen, F.-C. Kao, T.-M. Lee, Specification of minimum short circuit capacity for three-phase unbalance evaluation of high-speed railway power system, in Proceedings 1995 International Conference on Energy Management and Power Delivery EMPD '95, vol. 1, (1995), pp. 323–330. https://doi.org/10.1109/EMPD.1995.500747
8. I. Saboya, I. Egido, E. Pilo, L. Rouco, Impact of high-speed trains in small isolated power system phase to phase imbalances, in WIT Transactions on the Built Environment, vol. 127, (2012), pp. 615–626. https://doi.org/10.2495/CR120521
9. J.E. Parton, A general theory of phase transformation. Proc. IEE – Part IV Inst. Monogr. 99(2), 12–23 (1952). https://doi.org/10.1049/pi-4.1952.0002
10. B.-K. Chen, B.-S. Guo, Three phase models of specially connected transformers. IEEE Trans. Power Deliv. 11(1), 323–330 (1996). https://doi.org/10.1109/61.484031
11. Z. Zhang, B. Wu, J. Kang, L. Luo, A multi-purpose balanced transformer for railway traction applications. IEEE Trans. Power Deliv. 24(2), 711–718 (2009). https://doi.org/10.1109/TPWRD.2008.2008491
12. T.-H. Chen, Criteria to estimate the voltage unbalances due to high-speed railway demands. IEEE Trans. Power Syst. 9(3), 1672–1678 (1994). https://doi.org/10.1109/59.336089
13. T.H. Chen, Network modelling of traction substation transformers for studying unbalance effects. IEE Proc. Gener. Transm. Distrib. 142(2), 103 (1995). https://doi.org/10.1049/ip-gtd:19951592
14. H.-Y. Kuo, T.-H. Chen, Rigorous evaluation of the voltage unbalance due to high-speed railway demands. IEEE Trans. Veh. Technol. 47(4), 1385–1389 (1998). https://doi.org/10.1109/25.728533
15. H. Yu, Y. Yue, C. Zhe, C. Zhifei, T. Ye, Research on the selection of railway traction transformer, in 2010 Conference Proceedings IPEC, (2010), pp. 677–681. https://doi.org/10.1109/IPECON.2010.5697012
16. H.E. Mazin, W. Xu, Harmonic cancellation characteristics of specially connected transformers. Electr. Power Syst. Res. 79(12), 1689–1697 (2009). https://doi.org/10.1016/j.epsr.2009.07.006
17. M.H. Rashid, Power electronics handbook: Devices, circuits, and applications, 2nd edn. (Academic, Burlington)
18. R. Grünbaum, SVC for the Channel Tunnel Rail Link: providing flexibility and power quality in rail traction, in IEE Seminar on Power – It’s a Quality Thing, (2005), p. 3. https://doi.org/10.1049/ic.2005.0691
19. G. Celli, F. Pilo, S.B. Tennakoon, Voltage regulation on 25 kV AC railway systems by using thyristor switched capacitor, in Ninth International Conference on Harmonics and Quality of Power. Proceedings (Cat. No.00EX441), vol. 2, (2000), pp. 633–638. https://doi.org/10.1109/ICHQP.2000.897752
20. G. Zhu, C. Jianye, L. Xiaoyu, Compensation for the negative-sequence currents of electric railway based on SVC, in 2008 3rd IEEE Conference on Industrial Electronics and Applications, (2008), pp. 1958–1963. https://doi.org/10.1109/ICIEA.2008.4582862
21. L. Sainz, L. Monjo, S. Riera, J. Pedra, Study of the Steinmetz circuit influence on AC traction system resonance. IEEE Trans. Power Deliv. 27(4), 2295–2303 (2012). https://doi.org/10.1109/TPWRD.2012.2211084
22. P.-C. Tan, P.C. Loh, D.G. Holmes, A robust multilevel hybrid compensation system for 25-kV electrified railway applications. IEEE Trans. Power Electron. 19(4), 1043–1052 (2004). https://doi.org/10.1109/TPEL.2004.830038
23. A. Bueno, J.M. Aller, J.A. Restrepo, R. Harley, T.G. Habetler, Harmonic and unbalance compensation based on direct power control for electric railway systems. IEEE Trans. Power Electron. 28(12), 5823–5831 (2013). https://doi.org/10.1109/TPEL.2013.2253803
24. Y. Mochinaga, Y. Hisamizu, M. Takeda, T. Miyashita, K. Hasuike, Static power conditioner using GTO converters for AC electric railway, in Conference Record of the Power Conversion Conference – Yokohama 1993, (1993), pp. 641–646. https://doi.org/10.1109/PCCON.1993.264181
25. A. Luo, W. Wu, J. Shen, S. Z., Railway static power conditioners for high-speed train traction power supply systems using three-phase V/V transformers. IEEE Trans. Power Electron. 26(10), 2844–2856 (2011). https://doi.org/10.1109/TPEL.2011.2128888
26. F. Ma, A. Luo, X. Xu, H. Xiao, C. Wu, W. Wang, A simplified power conditioner based on half-bridge converter for high-speed railway system. IEEE Trans. Ind. Electron. 60(2), 728–738 (2013). https://doi.org/10.1109/TIE.2012.2206358
27. Z. Sun, X. Jiang, D. Zhu, G. Zhang, A novel active power quality compensator topology for electrified railway. IEEE Trans. Power Electron. 19(4), 1036–1042 (2004). https://doi.org/10.1109/TPEL.2004.830032
28. C. Wu, A. Luo, J. Shen, F.J. Ma, S. Peng, A negative sequence compensation method based on a two-phase three-wire converter for a high-speed railway traction power supply system. IEEE Trans. Power Electron. 27(2), 706–717 (2012). https://doi.org/10.1109/TPEL.2011.2159273
29. S. Hu et al., A new integrated hybrid power quality control system for electrical railway. IEEE Trans. Ind. Electron. 62(10), 6222–6232 (2015). https://doi.org/10.1109/TIE.2015.2420614
30. F. Ma et al., A railway traction power conditioner using modular multilevel converter and its control strategy for high-speed railway system. IEEE Trans. Transp. Electrif. 2(1), 96–109 (2016). https://doi.org/10.1109/TTE.2016.2515164
31. Z. Shu, S. Xie, Q. Li, Single-phase back-to-back converter for active power balancing, reactive power compensation, and harmonic filtering in traction power system. IEEE Trans. Power Electron. 26(2), 334–343 (2011). https://doi.org/10.1109/TPEL.2010.2060360
32. N.-Y. Dai, M.-C. Wong, K.-W. Lao, C.-K. Wong, Modelling and control of a railway power conditioner in co-phase traction power system under partial compensation. IET Power Electron. 7(5), 1044–1054 (2014). https://doi.org/10.1049/iet-pel.2013.0396
33. M. Chen, Q. Li, G. Wei, Optimized design and performance evaluation of new cophase traction power supply system, in 2009 Asia-Pacific Power and Energy Engineering Conference, (2009), pp. 1–6. https://doi.org/10.1109/APPEEC.2009.4918578
34. K.-W. Lao, N. Dai, W.-G. Liu, M.-C. Wong, Hybrid power quality compensator with minimum DC operation voltage design for high-speed traction power systems. IEEE Trans. Power Electron. 28(4), 2024–2036 (2013). https://doi.org/10.1109/TPEL.2012.2200909
35. K.-W. Lao, M.-C. Wong, N. Dai, C.-K. Wong, C.-S. Lam, A systematic approach to hybrid railway power conditioner design with harmonic compensation for high-speed railway. IEEE Trans. Ind. Electron. 62(2), 930–942 (2015). https://doi.org/10.1109/TIE.2014.2341577
36. A. Nabae, I. Takahashi, H. Akagi, A new neutral-point-clamped PWM inverter. IEEE Trans. Ind. Appl. IA-17(5), 518–523 (1981). https://doi.org/10.1109/TIA.1981.4503992
37. T.A. Meynard, H. Foch, Multi-level conversion: high voltage choppers and voltage-source inverters, in PESC '92 Record. 23rd Annual IEEE Power Electronics Specialists Conference, (1992), pp. 397–403. https://doi.org/10.1109/PESC.1992.254717
38. H. Richard, B. Baker, Electric power converter. 3867643 (1974)
39. H. Akagi, Classification, terminology, and application of the modular multilevel Cascade converter (MMCC). IEEE Trans. Power Electron. 26(11), 3119–3130 (2011). https://doi.org/10.1109/TPEL.2011.2143431
40. M. Glinka, R. Marquardt, A new AC/AC multilevel converter family. IEEE Trans. Ind. Electron. 52(3), 662–669 (2005). https://doi.org/10.1109/TIE.2005.843973
41. J. Ranneberg, Transformerless topologies for future stationary AC-railway power supply, in 2007 European Conference on Power Electronics and Applications, (2007), pp. 1–11. https://doi.org/10.1109/EPE.2007.4417330
42. X. He et al., Advanced cophase traction power supply system based on three-phase to single-phase converter. IEEE Trans. Power Electron. 29(10), 5323–5333 (2014). https://doi.org/10.1109/TPEL.2013.2292612
43. Z. Shu, X. Xie, Y. Jing, Advanced co-phase traction power supply simulation based on multilevel converter, in Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science SE – 63, ed. by F. L. Gaol, Q. V. Nguyen, vol. 145, (Springer, Berlin/Heidelberg, 2012), pp. 459–465
44. X. He, A. Guo, X. Peng, Y. Zhou, Z. Shi, Z. Shu, A traction three-phase to single-phase Cascade converter substation in an advanced traction power supply system. Energies 8(9), 9915–9929 (2015). https://doi.org/10.3390/en8099915
45. E. Pilo de la Fuente, S.K. Mazumder, I.G. Franco, Railway electrical smart grids: An introduction to next-generation railway power systems and their operation. IEEE Electrif. Mag. 2(3), 49–55 (2014). https://doi.org/10.1109/MELE.2014.2338411
46. D. Laousse, C. Brogard, H. Caron, C. Courtois, Direct current – A future under which conditions. Elektrische Bahnen 114, 260–275 (2016)
47. M.L. Erlangen, High voltage DC power supply – Part 1: Basics and systems. Elektrische Bahnen 109, 271–275 (2011)
48. M.L. Erlangen, High voltage DC power supply – Part 2: Technology and migration strategies. Elektrische Bahnen 109, 672–679 (2011)
49. A. Gomez-Exposito, J.M. Mauricio, J.M. Maza-Ortega, VSC-based MVDC railway electrification system. IEEE Trans. Power Deliv. 29(1), 422–431 (2014). https://doi.org/10.1109/TPWRD.2013.2268692
50. H. Kakigano, Y. Miura, T. Ise, Low-voltage bipolar-type DC microgrid for super high quality distribution. IEEE Trans. Power Electron. 25(12), 3066–3075 (2010). https://doi.org/10.1109/TPEL.2010.2077682
Energy-Efficient Scheduling of Intraterminal Container Transport

S. Mahdi Homayouni and Dalila B. M. M. Fontes
1 Introduction

The expansion of maritime transport capacity and the development of world trade depend on and support each other. Maritime transport enables trade and contacts between economies and ensures a reliable supply of energy, food, and commodities. Despite being one of the slowest modes of transportation, maritime transportation is widely used (particularly for intercontinental transport) since it can handle large volumes, offers competitive (low) costs, and is less polluting [9]. Conceptually, any goods other than time-sensitive or content-sensitive ones can be moved by sea. The maritime sector is responsible for the transportation of most international trade, more specifically for almost 85% of the world’s total international trade [33], about 90% of the European Union’s external trade (imports and exports), and one-third of its internal trade [7]. Although maritime transportation is among both the most affordable and the most environmentally friendly modes of transportation [9, 22, 23], its performance needs to be sustained and even improved in order to comply with stricter environmental policies without losing competitiveness. Innovative planning and scheduling methods are critical for the successful performance of seaports since container handling operations are responsible for most of the energy consumption.
S. Mahdi Homayouni
LIAAD - INESC TEC, Porto, Portugal

D. B. M. M. Fontes
LIAAD - INESC TEC, Porto, Portugal
Faculdade de Economia da Universidade do Porto, Porto, Portugal

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_6
The introduction of containers in the 1960s allowed for easier loading and unloading as well as for the design of standard handling equipment, which in turn led to more efficient scheduling and control in seaports. While general cargo ships spend, on average, 50–70% of their time in ports being loaded and/or unloaded, containerships spend only about 15–30% [4]. Containerships carry all goods in standardized containers and carry most of the world’s non-bulk cargo. A container can be transferred between truck, train, and ship relatively easily, and its standard size simplifies transportation. Nowadays, about 17% of the total seaborne cargo is handled in containers [9].

Once a containership moors at a container terminal, several quay cranes are assigned to the ship to load export containers and/or unload import containers. Export containers are made available at the quayside by vehicles that transport one container at a time from the storage yard. A quay crane then picks up the container from the vehicle and loads it into the ship. Import containers are handled in the reverse direction; that is, quay cranes unload the containers, one at a time, onto a vehicle that then transports each one to the storage yard. The transport between the quayside and the storage yard is performed by several vehicles, which are shared among several ships. Quay crane operations and vehicle operations are closely interconnected. Therefore, to use quay cranes and vehicles efficiently, they need to be scheduled jointly.

This work addresses the problem of scheduling intraterminal container transport while optimizing both operations time and energy efficiency, which are among the main concerns of port authorities. Not only may large energy cost savings be accomplished, but substantial environmental benefits may also be achieved.
The intraterminal scheduling involves scheduling the quay cranes, which load containers from the vehicles at the quayside into the ship and unload containers from the ship onto the vehicles at the quayside, as well as scheduling the vehicles that transport the containers between the quayside and the storage yard. Although quay cranes are exclusively allocated to ships, vehicles are shared among several ships. Each container is transported by a single vehicle that can move at different speeds, which is a novel idea in this context. For this problem, we seek solutions that balance energy efficiency and operational efficiency; thus, we aim at minimizing both the total energy consumption and the makespan.

The contributions of this work are threefold: (i) the introduction of automated guided vehicles (AGVs) that can move at different speeds (AGVs with adjustable speed); (ii) three bi-objective mixed-integer linear programming (MILP) models for simultaneously scheduling quay cranes and vehicles in order to load containers onto ships, to unload containers from ships, and to both load and unload containers to and from ships (dual-cycling), respectively; and (iii) an assessment of the advantages of considering AGVs with adjustable speed in order to balance the makespan and the energy consumption.

In the remainder of this chapter, Sect. 2 provides a brief description of handling operations in container terminals and reviews the works on intraterminal transportation and the vehicle scheduling problem as well as the works on energy efficiency in maritime operations. The intraterminal container
scheduling problem with adjustable AGV speed is formulated as three mixed-integer linear programming models in Sect. 3, and Sect. 4 describes the relationship between speed adjustment and energy consumption. Section 5 reports on a set of numerical experiments and their results. Finally, Sect. 6 provides some conclusions and future research directions.
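Since the problem has two objectives to be minimized, makespan and energy consumption, a natural way to compare candidate schedules is to keep only the non-dominated ones. The following is a minimal sketch of such a filter; the function name and the sample values are hypothetical and do not come from the chapter's experiments:

```python
def pareto_front(solutions):
    """Return the non-dominated (makespan, energy) pairs, both to be minimized."""
    front = []
    for s in solutions:
        # s is dominated if another point is no worse in both objectives
        dominated = any(
            o[0] <= s[0] and o[1] <= s[1] and o != s
            for o in solutions
        )
        if not dominated:
            front.append(s)
    return sorted(set(front))

# Hypothetical (makespan, energy) values for five candidate schedules.
candidates = [(100, 50), (110, 40), (120, 45), (105, 48), (100, 55)]
print(pareto_front(candidates))  # [(100, 50), (105, 48), (110, 40)]
```

Each point on the resulting front represents a different compromise: a shorter makespan can only be obtained by accepting a higher energy consumption, and vice versa.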
2 Container Terminals

Seaports primarily serve as an interface between different modes of transportation, e.g., rail, truck, and maritime. It is common practice to dedicate a part of the seaport (known as a terminal) to a specific type of load (e.g., dry bulk, liquid bulk, or containers). Although different types of loads may require different types of equipment, similar activities are generally performed to load and unload a ship. Container terminals (CTs) are areas of a seaport specifically designed to load and unload containerships. Figure 1 illustrates a conventional container terminal.

Fig. 1 Schematic side and top views of a container terminal (labeled elements: quayside, quay cranes, connecting area, intraterminal transport, vehicles, load/unload stations, yard side)
Export containers are received from the hinterland through the terminal gate. After the initial legal inspections, the containers are transferred to the storage yard, where they wait to be shipped. Once the ship onto which these containers are to be loaded arrives, the containers are retrieved from the storage yard and transported by intraterminal vehicles to the quayside, where a quay crane loads them into the ship. Import containers go through the same process in reverse; that is, once unloaded from the ship onto the intraterminal vehicles, the containers are transported to the storage yard, from where they are delivered to the final customer or to another mode of transportation (e.g., trains, smaller ships, or trucks).

The goal of CTs used to be to move the containers as quickly as possible and at the least possible cost. More recently, mainly due to regulation and legislation, CTs have also become concerned with energy efficiency and energy consumption. Therefore, a CT needs to be able to receive, store, and dispatch containers efficiently and rapidly while saving energy and reducing emissions. To do so, CTs have to resort to new technologies (e.g., replacing manually operated cranes with automated cranes and manually driven carts with automated guided vehicles) and to new techniques (e.g., joint scheduling methods and energy-efficient scheduling) to coordinate all types of handling equipment.

Most of the literature focuses only on efficiency measures such as CT throughput, ship turnaround, and equipment utilization [32, 38]. Recently, some studies including energy and emissions have been reported, although most studies considering energy measures address management practices and policies [e.g., 1, 19] rather than operational decisions.
Energy concerns at the operational level have mostly dealt with the coordination between shipping lines and terminals, and thus, the decisions to be made are ship mooring time, berth allocation, and equipment assignment [15, 29]. In addition, energy savings are measured mostly at the ship level, associated with ship mooring delay (which implies additional sailing), late departure (subsequently compensated through sailing speed), and idle time [5, 37]. Here, we study the trade-off between energy consumption and operational efficiency and its impacts on the integrated scheduling of quay cranes and intraterminal vehicles, since reducing energy consumption allows, on the one hand, large energy cost savings and, on the other hand, substantial environmental benefits.
2.1 Vehicle Scheduling in CTs

Typically, in CTs, a set of containers to be loaded, unloaded, or a mix of both must be transported between defined pickup and drop-off points by a set of identical vehicles. The ready time of a container corresponds to the moment when it can be delivered to the vehicle, either at the storage area (for loading tasks) or at the quayside (for unloading tasks). Traditionally, CTs employed yard trucks to transport the containers; however, these have been progressively replaced by unmanned vehicles. The current standard mainly involves automated guided vehicles (AGVs), which are controlled
by a central computer that decides on the dispatching and movement of each vehicle. AGVs have been widely implemented in modern automated container terminals, e.g., the Euromax CT in Rotterdam, the Netherlands, and the Pasir Panjang Terminal in Singapore. AGVs usually follow a fixed path guided by markers, wires, lasers, or computer vision and can carry one forty-foot equivalent unit (FEU) container or two twenty-foot equivalent unit (TEU) containers per trip. However, since AGVs cannot lift containers by themselves, they need to interact with the cranes that deliver and/or receive the containers. Therefore, the operations of these two equipment types (vehicles and cranes) need to be synchronized and thus have to be scheduled simultaneously. Recent reviews on AGV scheduling methods in container terminals can be found in [9, 14].

Among the rich literature on vehicle scheduling, we briefly review works on jointly scheduling cranes and vehicles, since, as mentioned before, their operations are interconnected. In a loading task, the crane can only pick up the container once the vehicle has arrived at the quayside, and the vehicle is only released to fetch its next container once the crane has picked the container up. In an unloading task, the vehicle can only start carrying the container to the storage yard after the crane has placed it on the vehicle. The impact of a lack of coordination has already been demonstrated. For example, Homayouni et al. [11] have shown that, in order to obtain the same performance (quay crane (QC) total tardiness, AGV total travelling time, and storage equipment total operational time) when scheduling QCs and AGVs sequentially rather than simultaneously, the AGV fleet has to be doubled in size. Kaveshgar and Huynh [13] found that when QCs and vehicles are scheduled sequentially rather than simultaneously, the makespan of the handling operations is, on average, 10% larger. Recently, several works have been reported on the simultaneous scheduling of AGVs and QCs; however, all of them optimize time-related measures. Kaveshgar and Huynh [13] and Zhen et al.
[36] propose MILP models for the joint scheduling of QCs and yard trucks with a single-cycling strategy, while the MILP models proposed in [2, 10, 28] consider a dual-cycling strategy. These works also propose heuristic approaches: genetic algorithms (GAs) in [10, 13], particle swarm optimization (PSO) algorithms in [28, 36], and an imperialist competitive algorithm (ICA) in [2].
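The coupling between crane and vehicle operations described in this section can be illustrated with a very small simulation of a single QC loading a ship. This is a sketch under strong simplifying assumptions that are not taken from the reviewed models: containers are served in a fixed order, AGVs are assigned round-robin, every one-way trip takes the same time, and the QC handling time is constant. All names and numbers are hypothetical.

```python
def loading_makespan(n_containers, n_agvs, travel_time, qc_time):
    """Makespan of loading n containers with one QC and n_agvs vehicles."""
    agv_free_at_yard = [0.0] * n_agvs  # when each AGV can start its next loaded trip
    qc_free = 0.0                      # when the QC is ready for its next pick-up
    for j in range(n_containers):
        v = j % n_agvs                               # round-robin vehicle assignment
        arrival = agv_free_at_yard[v] + travel_time  # AGV reaches the quayside
        pickup = max(arrival, qc_free)               # crane and vehicle must both be ready
        qc_free = pickup + qc_time                   # QC busy until it returns empty
        agv_free_at_yard[v] = pickup + travel_time   # released AGV drives back empty
    return qc_free

# Hypothetical instance: 4 containers, 5 time units per trip, 2 per QC cycle.
for fleet in (1, 2, 10):
    print(fleet, loading_makespan(4, fleet, travel_time=5, qc_time=2))
```

With one AGV the crane mostly waits for the vehicle, whereas with a large fleet the crane becomes the bottleneck; a joint schedule has to account for both effects at once.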
2.2 Energy Efficiency in CT Scheduling

The reduction of energy consumption can be accomplished not only through the introduction of new technology and infrastructure but also through advanced planning and scheduling methods. The typical objective functions in energy-aware scheduling are energy consumption and pollutant emissions. Energy efficiency entails providing the same services while consuming less energy, which in turn reduces emissions. Thus, energy-efficient scheduling aims not only at reducing energy consumption but also at increasing operational efficiency. Although energy efficiency and operational efficiency seem to be conflicting objectives, they are connected and can reinforce each other. For example, reducing travel time leads to a reduction in the energy consumed and also in the operational costs [6]. Recent literature surveys on energy-efficient practices in seaports can be found in [12, 27]. Here, we only briefly review recent works on energy efficiency that consider vehicle scheduling in CTs.

Liu and Ge [16] use queuing theory to study the impact of the waiting behavior of AGVs on carbon emissions. They conclude that the optimal number of QCs increases with the AGV expected arrival rate and the mean fuel consumption, whereas it decreases with the QC mean service rate and the electricity consumption. Zhao et al. [35] address the simultaneous scheduling of QCs and AGVs in order to minimize the total energy consumed by both. They propose a two-stage tabu search (TS) algorithm, which schedules the QC operations in the first stage and optimizes the AGV dispatching in the second. Another two-phase approach is proposed in [34]. The first phase schedules the QC operations while minimizing the energy consumed by the QC movements. The second phase deals with AGV assignment and scheduling; it minimizes the energy consumed by the AGVs and maximizes their utilization rate. The first phase resorts to an enumeration strategy and the second phase to a GA.

Although the energy-efficient scheduling of vehicles in CTs has become the subject of recent research, AGVs with adjustable speed have never been considered. AGV scheduling studies share the assumption that AGVs move at a constant speed, although in practice the speed is adjustable and controllable. Furthermore, adjustable-speed vehicles have already been studied in other contexts. The pollution routing problem (PRP), a variant of the vehicle routing problem, includes determining the speed of the vehicles on each of the route segments.
This problem was first proposed in 2011 by Bektaş and Laporte [3]. The concept of energy-efficient speed optimization has recently been proposed for the ship routing problem through “slow steaming,” i.e., sailing at a lower speed. Slow steaming leads to a significant reduction in fuel consumption, as well as in pollutant emissions; reducing the cruising speed by 20% can cut bunker consumption by 50% [25]. Nevertheless, sailing at a reduced speed implies a longer travel time, and thus, shipping companies must look for the “best” trade-off between energy efficiency and service level (or number of required ships). Recent reviews on concepts and models for the energy-efficient ship routing problem can be found in [18].
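The 20% and 50% figures quoted from [25] are consistent with a widely used approximation in which a ship's fuel consumption per unit of time, $F$, grows with the cube of its speed. That cubic relation is an assumption introduced here for illustration only; it is not stated in the text above.

\[
\frac{F(0.8\,v)}{F(v)} = 0.8^{3} = 0.512
\]

Under this assumption, sailing 20% slower reduces fuel burned per unit of time by roughly 49%. Because the voyage then takes $1/0.8 = 1.25$ times longer, the fuel used over a fixed distance falls by a smaller amount, $1 - 0.512 \times 1.25 = 0.36$, which is the trade-off between energy efficiency and service level mentioned above.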
3 Problem Definition and Formulation
This section provides a detailed description of the problem being addressed and its mathematical formulation. The joint scheduling of QC handling tasks and intraterminal transport requires simultaneously solving four interdependent combinatorial optimization problems, namely, sequencing container handling operations on each QC (QC scheduling), determining which vehicle performs each transport
Energy-Efficient Scheduling of Intraterminal Container Transport
task (vehicle assignment), sequencing transport tasks on each vehicle (vehicle scheduling), and determining the speed at which AGVs travel on each segment of the transport task (vehicle mode).
Loading and unloading containerships requires QCs, which move containers between the containership and the quayside, and AGVs, which transport containers between the quayside and the storage yard. To handle containers between the containership and the quayside, a set $I$ of QCs is allocated to each containership, and each QC $i \in I$ loads or unloads its ordered set $J_i$ of containers. Each QC handles (loads or unloads) one container at a time, and each container is handled by a single QC.
In loading tasks, a container is transported from a specific load/unload (LU) station at the storage yard to the quayside by an AGV, and then a QC retrieves it from the AGV and loads it into the containership. A conventional QC is equipped with a trolley that can move along the crane arm to transport a container between the ship and the quayside and vice versa. At the quayside, the spreader of the QC clutches the container and lifts it vertically (to a safe height); the trolley then moves (loaded) horizontally over the ship, and the spreader reshuffles other containers (if required) and places the container in its predetermined place in the ship. We consider the handling time of a QC in loading tasks to be the time from the moment the spreader picks up the container from the AGV (releasing the AGV) until the QC trolley is empty and back at the quayside (ready for its next assignment).
Similarly, in unloading tasks, the trolley (which is already at the quayside after delivering its previous container to an AGV) moves horizontally from the quayside to the ship, where the spreader grabs the assigned container (reshuffling other containers if necessary) and lifts it to a safe height; the trolley then moves back loaded to the quayside, where it places the container on the assigned AGV. Once loaded, the AGV travels to the storage yard and delivers the container to its predetermined LU station. The handling time of a QC in unloading tasks is the time elapsed from the moment its trolley leaves the quayside empty until its spreader places the container on the AGV.
The intraterminal container transport is done using a set A of identical AGVs, each of which carries one container at a time and travels without preemption. The AGVs are not dedicated, meaning that each transport task can be performed by any of the AGVs. The location of each AGV at the beginning of the scheduling period is given, and while operating, AGVs remain wherever they finish a task (dwell point) until the start of the next assignment. Therefore, no empty travels are imposed. However, AGVs may need to travel empty from their current dwell point to the location where the container of the next assignment needs to be picked up (either an LU station at the storage yard or a QC, depending on whether the container is being loaded or unloaded) before they can carry out their assignment, i.e., transport the container from its current location to its destination. Note that the origin and destination of each transport task (i.e., each container) are inputs to the problem.
The operation of QCs and AGVs is interconnected. On the one hand, when an AGV arrives at the quayside, it may need to wait to be loaded or unloaded by the QC
Fig. 2 Single-cycling and dual-cycling strategies for intraterminal transport (legend: loaded moves, empty moves, loading container, unloading container, vehicle, QCs, storage yard)
before pursuing its next assignment. On the other hand, the QC may need to wait for the AGV to arrive at the quayside in order to pick up or drop off the container, depending on whether the handling task is a loading or an unloading task, respectively. We assume that the transfer time between QC and AGV (whether a pickup or a drop-off) is negligible. Regarding the storage yard side, we assume that all containers are initially available at the LU station and that the transfer time of containers between the AGV and the predetermined LU station is negligible.
The CT floor layout is known; thus, the distances between LU stations and QCs are known, and the transportation times are distance-dependent. Nevertheless, it is assumed that AGVs can move at different average speeds and that the empty and the loaded travel segments of each transport task can be performed at different speeds. Moreover, the energy consumption of an AGV per time unit depends, on the one hand, on its average speed (the higher the speed, the higher the energy consumed per time unit) and, on the other hand, on the load being transported (i.e., AGV energy consumption per time unit is lower on empty travels than on loaded ones). Finally, no congestion issues for AGVs are considered in the scheduling problem.
It is common, in practice, to unload importing containers first and then to load exporting containers. Such a strategy is known as single-cycling mode, and AGVs serve one containership at a time. However, combining container loading and unloading operations may significantly increase the efficiency of the container terminal by reducing the empty travel distance as well as the idle times of both QCs and AGVs. This latter strategy is known as dual-cycling, and in it, AGVs serve several ships. The resulting optimization problem is harder. The single-cycling and dual-cycling strategies are illustrated in Fig. 2.
The next three subsections propose three mixed-integer linear programming models, two associated with the single-cycling strategy and another associated with
the dual-cycling strategy: (i) one for a set of loading tasks, (ii) one for a set of unloading tasks, and (iii) one for a mixed set of loading and unloading tasks.
3.1 MILP for Loading Tasks
Here, we present an MILP model for a set of loading tasks that minimizes the makespan (the completion time of the last loading task performed by a QC) and the total energy consumed by the AGVs. This model is based on the idea of using a set of chained decisions for AGV assignments that avoids introducing a new index for the AGVs, thereby substantially reducing the number of decision variables. First, transport tasks are explicitly assigned to an AGV, and then chains of assigned tasks are built without considering AGVs. These chains of independent decision variables are connected through the completion time constraints. Since we also decide on the AGV speed for each segment (empty and loaded), which is a novel idea, we include variables for each trip segment. For the loaded trip segment, we determine the AGV speed; for the empty trip segment, in addition to the AGV speed, we also need to determine its origin (i.e., the dwell point of the assigned AGV). Note that regardless of the AGVs' location, we always consider an initial empty travel, although it may have a zero travel time. In addition, we assume that the energy consumed by an AGV while idle (waiting for its next assignment) is negligible.
The notation used and the model proposed follow:

Sets and Indices
$I$    Set of QCs, indexed by $i, k$
$J_i$    Set of containers to be loaded/unloaded by QC $i \in I$, indexed by $j, l$
$A$    Set of AGVs, indexed by $a$
$V$    Set of speed values for the AGVs, indexed by $v, \omega$

Parameters
$n_i = |J_i|$    Number of containers to be loaded/unloaded by QC $i \in I$
$T_{ij}$    Tasks (handling and transport) of container $j \in J_i$ of QC $i \in I$
$\tau_{ij}$    QC handling time of task $T_{ij}$, $i \in I$, $j \in J_i$
$\theta_{kl}^{ijv}$    AGV empty travel time for performing task $T_{ij}$ at speed $v \in V$ immediately after completing task $T_{kl}$, $i, k \in I$, $j \in J_i$, $l \in J_k$
$\theta_{a}^{ijv}$    AGV $a \in A$ empty travel time for performing task $T_{ij}$, as its first, at speed $v \in V$, $i \in I$, $j \in J_i$
$\vartheta_{ij}^{v}$    AGV loaded travel time for performing task $T_{ij}$ at speed $v \in V$, $i \in I$, $j \in J_i$
$e^{v}$    AGV energy consumption per time unit when travelling empty at speed $v \in V$
$\varepsilon^{v}$    AGV energy consumption per time unit when travelling loaded at speed $v \in V$
$M$    A sufficiently large positive integer
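Decision Variables
$x_{kl}^{ijv}$    Binary variable set to 1 if the empty travel of task $T_{ij}$ is performed at speed $v \in V$ immediately after completing task $T_{kl}$ (by the same AGV) and to 0 otherwise, $i, k \in I$, $j \in J_i$, $l \in J_k$
$\chi_{ij}^{v}$    Binary variable set to 1 if the loaded travel of task $T_{ij}$ is performed at speed $v \in V$ and to 0 otherwise, $i \in I$, $j \in J_i$
$y_{a}^{ijv}$    Binary variable set to 1 if the first task of AGV $a \in A$ is the empty travel of task $T_{ij}$ and is done at speed $v \in V$ and to 0 otherwise, $i \in I$, $j \in J_i$
$w_{ij}$    Binary “dummy” variable set to 1 if task $T_{ij}$ is the last task of an AGV and to 0 otherwise, $i \in I$, $j \in J_i$
$r_{ij}$    AGV arrival time at the destination of task $T_{ij}$, $i \in I$, $j \in J_i$
$c_{ij}$    QC completion time of task $T_{ij}$, $i \in I$, $j \in J_i$
$E$    Total energy consumed by the AGVs
$c_{\max}$    Task makespan

Minimize
\[ z_1 = c_{\max}, \qquad z_2 = E \tag{1} \]
Subject to:
\[ \sum_{i \in I} \sum_{j \in J_i} \sum_{v \in V} y_{a}^{ijv} \le 1, \quad \forall a \in A, \tag{2} \]
\[ \sum_{i \in I} \sum_{j \in J_i} w_{ij} = \sum_{i \in I} \sum_{j \in J_i} \sum_{a \in A} \sum_{v \in V} y_{a}^{ijv}, \tag{3} \]
\[ \sum_{k \in I \setminus \{i\}} \sum_{l \in J_k} \sum_{v \in V} x_{kl}^{ijv} + \sum_{l < j \in J_i} \sum_{v \in V} x_{il}^{ijv} + \sum_{a \in A} \sum_{v \in V} y_{a}^{ijv} = 1, \quad \forall i \in I;\ j \in J_i, \tag{4} \]
\[ \sum_{k \in I \setminus \{i\}} \sum_{l \in J_k} \sum_{v \in V} x_{ij}^{klv} + \sum_{l > j \in J_i} \sum_{v \in V} x_{ij}^{ilv} + w_{ij} = 1, \quad \forall i \in I;\ j \in J_i, \tag{5} \]
\[ \sum_{v \in V} \chi_{ij}^{v} = 1, \quad \forall i \in I;\ j \in J_i, \tag{6} \]
\[ c_{ij} \ge c_{i(j-1)} + \tau_{ij}, \quad \forall i \in I;\ j \in J_i \setminus \{1\}, \tag{7} \]
\[ c_{i1} \ge \tau_{i1}, \quad \forall i \in I, \tag{8} \]
\[ c_{ij} \ge r_{ij} + \tau_{ij}, \quad \forall i \in I;\ j \in J_i, \tag{9} \]
\[ r_{ij} \ge c_{kl} - \tau_{kl} + \theta_{kl}^{ijv} + \vartheta_{ij}^{\omega} + M \left( x_{kl}^{ijv} + \chi_{ij}^{\omega} - 2 \right), \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k \ (\text{if } i = k: j > l);\ v, \omega \in V, \tag{10} \]
\[ r_{ij} \ge \theta_{a}^{ijv} + \vartheta_{ij}^{\omega} + M \left( y_{a}^{ijv} + \chi_{ij}^{\omega} - 2 \right), \quad \forall i \in I;\ j \in J_i;\ a \in A;\ v, \omega \in V, \tag{11} \]
\[ c_{\max} \ge c_{i n_i}, \quad \forall i \in I, \tag{12} \]
\[ E = \sum_{i \in I} \sum_{j \in J_i} \left( \sum_{k \in I} \sum_{l \in J_k} \sum_{v \in V} x_{kl}^{ijv}\, \theta_{kl}^{ijv}\, e^{v} + \sum_{v \in V} \chi_{ij}^{v}\, \vartheta_{ij}^{v}\, \varepsilon^{v} + \sum_{a \in A} \sum_{v \in V} y_{a}^{ijv}\, \theta_{a}^{ijv}\, e^{v} \right), \tag{13} \]
\[ x_{kl}^{ijv},\ \chi_{ij}^{v},\ y_{a}^{ijv},\ w_{ij} \in \{0, 1\}, \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k;\ v \in V;\ a \in A, \tag{14} \]
\[ c_{\max},\ c_{ij},\ r_{ij},\ e_{ij} \ge 0, \quad \forall i \in I;\ j \in J_i. \tag{15} \]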
In expressions (1), we define the two objectives to be minimized: the makespan, i.e., the latest time at which a container is loaded into the ship, and the total energy consumed by the AGVs. Constraints (2) and (3) ensure, respectively, that each AGV has at most one first task and that the number of first tasks equals the number of last tasks. Constraints (4) and (5) enforce task precedence: on the one hand, each task is either the first task of an AGV or preceded by another task, and on the other hand, it is either the last task of an AGV or followed by another task. In addition, constraint (6) ensures that each task is performed and that its loaded travel uses a single speed value. Constraints (7) to (11) are the completion time constraints, and they connect the AGV and QC decisions. A handling task can only be completed once the QC performing it has completed its previous task and the container handling time has elapsed (constraint (7), or constraint (8) if it is the QC's first task) and after the container has arrived at the quayside and its handling time has elapsed (constraint (9)). An AGV can only arrive at the quayside to deliver a container after it has delivered the previous container, travelled empty from the quayside to the LU station at the chosen speed, picked up the container, and travelled back, loaded, to the quayside at the chosen speed, as enforced by constraint (10). For the first task of an AGV, no previous container delivery exists; thus, the arrival time at the quayside is given by constraint (11). The makespan and the total energy consumed are computed as given in expressions (12) and (13). Finally, constraints (14) and (15) define the nature of the variables.
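To make the role of the completion time constraints concrete, the following minimal Python sketch evaluates them for decisions that are already fixed, rather than solving the MILP. It assumes the QC sequences, the AGV chains, and the speed of every segment have been chosen, so that each task has one empty and one loaded travel time; it then computes the earliest $r_{ij}$ and $c_{ij}$ allowed by constraints (7) to (11), together with the makespan (12) and the energy (13). The function name `evaluate_loading_schedule`, its arguments, and all numbers in the example are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Task = Tuple[int, int]  # (QC i, position j in that QC's sequence, starting at 1)


def evaluate_loading_schedule(
    tau: Dict[Task, float],           # QC handling time of each task
    agv_chains: List[List[Task]],     # ordered tasks of each AGV (fixed assignment)
    empty_time: Dict[Task, float],    # empty travel time before each task (speed fixed)
    loaded_time: Dict[Task, float],   # loaded travel time of each task (speed fixed)
    empty_power: Dict[Task, float],   # energy per time unit on the empty segment
    loaded_power: Dict[Task, float],  # energy per time unit on the loaded segment
):
    """Earliest r and c values allowed by (7)-(11), plus makespan (12) and energy (13)."""
    agv_pred: Dict[Task, Optional[Task]] = {}
    for chain in agv_chains:
        for pos, task in enumerate(chain):
            agv_pred[task] = chain[pos - 1] if pos > 0 else None

    r: Dict[Task, float] = {}
    c: Dict[Task, float] = {}
    pending = set(tau)
    while pending:
        progressed = False
        for task in sorted(pending):
            i, j = task
            qc_pred = (i, j - 1) if j > 1 else None
            prev = agv_pred[task]
            if (qc_pred is not None and qc_pred not in c) or (
                prev is not None and prev not in c
            ):
                continue  # a predecessor has not been timed yet
            # In loading, the AGV is released when the QC picks up its previous container.
            release = c[prev] - tau[prev] if prev is not None else 0.0
            r[task] = release + empty_time[task] + loaded_time[task]  # (10) or (11)
            qc_ready = c[qc_pred] if qc_pred is not None else 0.0
            c[task] = max(qc_ready, r[task]) + tau[task]  # (7), (8), (9)
            pending.remove(task)
            progressed = True
        if not progressed:
            raise ValueError("circular waiting between QC and AGV sequences")

    makespan = max(c.values())  # (12)
    energy = sum(
        empty_time[t] * empty_power[t] + loaded_time[t] * loaded_power[t] for t in tau
    )  # (13)
    return r, c, makespan, energy


# Hypothetical instance: two QCs with two containers each, two AGVs.
tau = {(1, 1): 40.0, (1, 2): 50.0, (2, 1): 30.0, (2, 2): 60.0}
chains = [[(1, 1), (2, 2)], [(2, 1), (1, 2)]]
empty = {(1, 1): 0.0, (2, 1): 0.0, (2, 2): 50.0, (1, 2): 40.0}
loaded = {(1, 1): 100.0, (2, 1): 80.0, (2, 2): 100.0, (1, 2): 90.0}
p_empty = {t: 10.0 for t in tau}
p_loaded = {t: 15.0 for t in tau}
r, c, makespan, energy = evaluate_loading_schedule(
    tau, chains, empty, loaded, p_empty, p_loaded
)
print(makespan, energy)
```

In the MILP the big-M terms switch constraints (10) and (11) on only for the chosen predecessor and speeds; the sketch mirrors that by looking up a single predecessor per task.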
3.2 MILP for Unloading Tasks
Scheduling a set of unloading tasks is quite similar to scheduling a set of loading tasks, since containers go through the same process in reverse. The notation is the one introduced in Sect. 3.1, and all constraints of that model remain valid except constraints (9) to (12), which, because containers now move in the opposite direction (unloading first and then transporting), are replaced by constraints (16) to (19). Thus, the MILP model for a set of unloading tasks can be formulated as follows:
(In)equations (1) to (8),
\[ c_{ij} \ge r_{kl} + \theta_{kl}^{ijv} + M \left( x_{kl}^{ijv} - 1 \right), \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k \ (\text{if } i = k: j > l);\ v \in V, \tag{16} \]
\[ c_{ij} \ge \theta_{a}^{ijv} + M \left( y_{a}^{ijv} - 1 \right), \quad \forall i \in I;\ j \in J_i;\ a \in A;\ v \in V, \tag{17} \]
\[ r_{ij} \ge c_{ij} + \vartheta_{ij}^{v} + M \left( \chi_{ij}^{v} - 1 \right), \quad \forall i \in I;\ j \in J_i;\ v \in V, \tag{18} \]
\[ c_{\max} \ge r_{i n_i}, \quad \forall i \in I, \tag{19} \]
(In)equations (13) to (15).
A handling task can only be finished, i.e., a container can only be unloaded from the ship and onto the AGV, once an AGV has finished its previous delivery to the storage yard and travelled back empty to the quayside, as stated by constraint (16), unless it is the first task of the AGV, in which case only the empty travel to the quayside needs to be accounted for (see constraint (17)). Moreover, the time at which the container is delivered to the storage yard is given by the time at which it is unloaded onto the AGV plus the loaded travel time to the specific LU station in the storage yard, as in constraint (18). The makespan needs to be redefined, since it is now given by the time at which the last delivered container reaches its location at the storage yard, as given by constraint (19).
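As a small worked illustration of how the unloading constraints chain together, consider the following hypothetical numbers (they are not taken from the problem instances). AGV $a$ starts with task $T_{kl}$, which is also the first task of QC $k$, with $\theta_{a}^{klv} = 20$ s, $\tau_{kl} = 40$ s, and $\vartheta_{kl}^{v} = 100$ s; the same AGV then serves task $T_{ij}$ with $\theta_{kl}^{ijv} = 50$ s. With the corresponding binary variables equal to 1, the big-M terms vanish and the constraints read:

\[
\begin{aligned}
c_{kl} &\ge \tau_{kl} = 40 && \text{by (8)}, \\
c_{kl} &\ge \theta_{a}^{klv} = 20 && \text{by (17)}, \quad \text{so the earliest value is } c_{kl} = 40, \\
r_{kl} &\ge c_{kl} + \vartheta_{kl}^{v} = 140 && \text{by (18)}, \\
c_{ij} &\ge r_{kl} + \theta_{kl}^{ijv} = 190 && \text{by (16)}.
\end{aligned}
\]

For any pair of tasks that is not chained, the binary variable is 0, the right-hand side is reduced by $M$, and the constraint becomes non-binding.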
3.3 MILP Model for Dual-Cycling
As previously mentioned, port authorities commonly schedule the unloading of the importing containers followed by the loading of the exporting containers. Although single-cycling results in less complicated problems, both for the quayside and the
yard side operations, it implies additional travelling and handling times as well as additional idle times. In single-cycling, for almost all operations, the cranes and vehicles perform one empty travel after each loading or unloading task. By addressing loading and unloading simultaneously (dual-cycling), empty crane movements and empty AGV travels can be reduced, which can be used either to reduce the resources needed (fewer QCs and/or AGVs) or to increase productivity. In addition, since the loading and unloading tasks may involve multiple ships, AGVs may be shared among several ships. A cycle is a complete round-trip of the vehicle from the quayside to the storage yard and back to the quayside, or vice versa. A dual-cycling strategy is expected to result in a higher level of productivity for handling operations at CTs. In order to adapt the MILP models previously proposed and to take advantage of the dual-cycling strategy, we separate the tasks into two sets, namely, $U$, the set of unloading tasks, and $L$, the set of loading tasks. Then, considering the sets, indices, parameters, and decision variables defined in Sect. 3.1, the MILP model can be formulated as follows:
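(In)equations (1) to (8),
\[ c_{ij} \ge r_{ij} + \tau_{ij}, \quad \forall i \in I;\ j \in J_i;\ T_{ij} \in L, \tag{20} \]
\[ r_{ij} \ge c_{kl} - \tau_{kl} + \theta_{kl}^{ijv} + \vartheta_{ij}^{\omega} + M \left( x_{kl}^{ijv} + \chi_{ij}^{\omega} - 2 \right), \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k;\ v, \omega \in V;\ T_{ij}, T_{kl} \in L, \tag{21} \]
\[ r_{ij} \ge \theta_{a}^{ijv} + \vartheta_{ij}^{\omega} + M \left( y_{a}^{ijv} + \chi_{ij}^{\omega} - 2 \right), \quad \forall i \in I;\ j \in J_i;\ a \in A;\ v, \omega \in V;\ T_{ij} \in L, \tag{22} \]
\[ c_{ij} \ge r_{kl} + \theta_{kl}^{ijv} + M \left( x_{kl}^{ijv} - 1 \right), \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k;\ v \in V;\ T_{ij}, T_{kl} \in U, \tag{23} \]
\[ c_{ij} \ge \theta_{a}^{ijv} + M \left( y_{a}^{ijv} - 1 \right), \quad \forall i \in I;\ j \in J_i;\ a \in A;\ v \in V;\ T_{ij} \in U, \tag{24} \]
\[ r_{ij} \ge c_{ij} + \vartheta_{ij}^{v} + M \left( \chi_{ij}^{v} - 1 \right), \quad \forall i \in I;\ j \in J_i;\ v \in V;\ T_{ij} \in U, \tag{25} \]
\[ c_{ij} \ge c_{kl} - \tau_{kl} + \theta_{kl}^{ijv} + M \left( x_{kl}^{ijv} - 1 \right), \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k;\ v \in V;\ T_{ij} \in U,\ T_{kl} \in L, \tag{26} \]
\[ r_{ij} \ge r_{kl} + \theta_{kl}^{ijv} + \vartheta_{ij}^{\omega} + M \left( x_{kl}^{ijv} + \chi_{ij}^{\omega} - 2 \right), \quad \forall i, k \in I;\ j \in J_i;\ l \in J_k;\ v, \omega \in V;\ T_{ij} \in L,\ T_{kl} \in U, \tag{27} \]
\[ c_{\max} \ge c_{i n_i}, \quad \forall i \in I;\ T_{i n_i} \in L, \tag{28} \]
\[ c_{\max} \ge r_{i n_i}, \quad \forall i \in I;\ T_{i n_i} \in U, \tag{29} \]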
(In)equations (13) to (15).
(In)equations (1) to (8) and (13) to (15), as before, define the two objectives under consideration and ensure the correct usage of the AGVs (number of first and last tasks and task precedence constraints) and of the QCs (task precedence constraints), the correct calculation of the total energy consumed by the AGVs, and the nature of the variables. These constraints are common to both the loading and the unloading MILP models. The remaining constraints are the completion time constraints that interconnect the AGV and QC decisions for (i) loading tasks (constraints (20) to (22)), (ii) unloading tasks (constraints (23) to (25)), and (iii) a loading task chained with an unloading task on the same AGV (constraints (26) and (27)). Finally, constraints (28) and (29) state that the makespan is the latest task completion time among loading and unloading tasks.
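The four ways in which two consecutive tasks of one AGV can be combined differ only in when the AGV is released by the earlier task and in which time variable the later task constrains. The Python sketch below spells this out for fixed decisions, assuming the predecessor, the speeds, and hence the travel times are already chosen. The helper names `agv_release` and `task_times` and the numbers in the example are hypothetical; the MILP itself encodes the same logic through big-M terms.

```python
from typing import Optional, Tuple


def agv_release(prev_type: Optional[str], prev_r: float, prev_c: float,
                prev_tau: float) -> float:
    """Time at which the AGV is free after its previous task."""
    if prev_type is None:   # first task of the AGV, see (22) and (24)
        return 0.0
    if prev_type == "L":    # released when the QC picks up the container: c_kl - tau_kl
        return prev_c - prev_tau
    if prev_type == "U":    # released on delivery at the LU station: r_kl
        return prev_r
    raise ValueError("task type must be 'L', 'U', or None")


def task_times(task_type: str, release: float, theta: float, vartheta: float,
               tau: float, qc_ready: float) -> Tuple[float, float]:
    """Earliest (r_ij, c_ij) for one task; qc_ready is c_i(j-1), or 0 for a first QC task."""
    if task_type == "L":
        r = release + theta + vartheta            # (21), (22), (27)
        c = max(qc_ready, r) + tau                # (7), (8), (20)
    elif task_type == "U":
        c = max(release + theta, qc_ready + tau)  # (23), (24), (26) with (7), (8)
        r = c + vartheta                          # (25)
    else:
        raise ValueError("task type must be 'L' or 'U'")
    return r, c


# Hypothetical chain on one AGV: a loading task followed by an unloading task.
r1, c1 = task_times("L", agv_release(None, 0.0, 0.0, 0.0),
                    theta=0.0, vartheta=100.0, tau=40.0, qc_ready=0.0)
r2, c2 = task_times("U", agv_release("L", r1, c1, 40.0),
                    theta=10.0, vartheta=90.0, tau=50.0, qc_ready=0.0)
print((r1, c1), (r2, c2))
```

In the example the AGV needs only a short empty move (10 time units) between delivering a container to one QC and receiving one from another, which is the saving that dual-cycling aims for.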
4 Speed Adjustment of AGVs
Conventionally, CTs make use of diesel-powered AGVs; however, in recent years, they have started to adopt (electric) battery-powered ones. This new generation of AGVs consumes less energy and generates fewer carbon emissions. Thus, by using this type of AGV, CTs not only decrease operational costs but also become more environmentally responsible. According to Schmidt et al. [26], battery-powered AGVs can save more than 10% of the terminal transport costs in comparison with diesel-powered ones. Hence, battery-powered AGVs, in addition to being more environmentally friendly, are an economically viable alternative for CTs [17, 26]. The Hamburg Container Terminal Altenwerder is a recent and ongoing example of the transition from diesel-powered AGVs to battery-powered ones. The conversion of the full fleet of 100 AGVs to fast-charging lithium-ion batteries is projected to be completed by the end of 2022, contributing to the overall reduction of carbon dioxide and nitrogen dioxide emissions and to the improvement of air quality [8].
Although the idea of using adjustable speed vehicles in container terminals has not been investigated yet, it has been used (at least theoretically) to reduce AGV energy consumption within manufacturing systems [24] and has been the subject of significant recent research in other areas such as the green vehicle routing problem and the pollution routing problem [3, 21, 31].
The dimensions of a typical AGV in container terminals are approximately 15 m by 3 m, with a deadweight of 25 tons; it can usually carry either one 40/45 ft container weighing up to 40 tons or two 20 ft containers with a combined weight
of up to 70 tons. Following the works by Zhong et al. [39] and Wang et al. [30] and based on real data from container terminals, we consider the AGV “nominal” travel speeds to be 6 m/s when empty and 3 m/s when carrying a 40-ton load. The average energy consumption is, respectively, 10 and 15 kWs per second of travel (i.e., a power of 10 and 15 kW) when travelling empty and with a 40-ton load. Although AGVs can autonomously adjust their instantaneous speed according to the current requirements of the schedule and trajectory, it is assumed that they travel at a constant average speed in a specific travel segment. Therefore, the average travel time between any pair of locations can be calculated given the CT layout and the specific average speed.
Let us consider an AGV travelling along a specific route segment of known distance $x$ at the average “nominal” speed $v_0$; its “nominal” travel time is then $t_0 = x / v_0$ time units. The AGV can travel the same segment at the average speed $v_1$, taking $t_1 = x / v_1$ time units. Let $\alpha$ be the speed factor and $v_1 = \alpha v_0$; then $t_1 = \frac{1}{\alpha} t_0$. Figure 3 shows the variation of the travel time $t_1$ as a ratio of the nominal travel time $t_0$ with the variation of the speed factor $\alpha$, for values of $\alpha$ ranging from 0.5 to 1.5.
The relationship between energy consumption and vehicle speed and weight in container terminals, which depends on whether the AGV is travelling empty or loaded, is yet to be investigated. Therefore, we resort to the data used in similar works [6, 20, 31] to estimate the vehicle energy consumption under different speeds and loads. Figure 3 also shows the estimated energy consumption (in kWs) of an AGV travelling empty and carrying a 40-ton load, for various speed values.
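The travel-time relation follows directly from the definitions of $t_0$, $t_1$, and $\alpha$. As a short check, the two endpoints of the range considered, $\alpha = 0.5$ and $\alpha = 1.5$, are worked out below:

\[
t_1 = \frac{x}{v_1} = \frac{x}{\alpha v_0} = \frac{1}{\alpha}\, t_0,
\qquad
\alpha = 0.5 \;\Rightarrow\; t_1 = 2\, t_0,
\qquad
\alpha = 1.5 \;\Rightarrow\; t_1 = \tfrac{2}{3}\, t_0 \approx 0.67\, t_0 .
\]

The relation is not symmetric: halving the speed doubles the travel time, whereas a 50% speed increase saves only one third of it.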
5 Results and Discussions
This section reports the results of the computational experiments conducted to ascertain the performance of the MILP models. We also investigated the impact of the dual-cycling strategy on the makespan and on the energy consumption. The models were implemented in Python® 3.7 and were solved using Gurobi® 9.0. All computational experiments were carried out on a 3.20 GHz Intel® Core™ i7-8700 PC with 24 GB RAM.
5.1 Problem Instances
The problem under study is novel as AGVs with adjustable speed are considered here for the first time. Hence, there are no benchmark instances in the literature. Therefore, we propose a set of small-sized problem instances to test the approaches and compare the results obtained. These instances as well as the optimal solutions obtained are available to the research community at https://fastmanufacturingproject.wordpress.com/problem-instances/.
Fig. 3 Top: variation of the travel time ratio with the speed factor ($v_1 = \alpha v_0$ and $t_1 = \frac{1}{\alpha} t_0$). Bottom: variation of the energy consumption ratio with the speed factor ($v_1 = \alpha v_0$ and $e_1 = \beta e_0$; $\beta$ is unknown, and the energy consumption values were empirically determined). Axes: travel time and energy consumption in kWs against the speed factor $\alpha$; series: empty and loaded.
We consider a rectangular container terminal, 300 m long and 100 m wide, with unidirectional paths on which the AGVs travel. There are six QCs at the quayside and six LU stations in the storage yard. The travelling distance in meters between any two points of the terminal is reported in the Appendix (Table 4).
We generated 35 small-sized problem instances, and each instance is used twice: once with all tasks treated as loading tasks (data set 1) and once with all tasks treated as unloading tasks (data set 2). The number of containers ranges from 4 to 16; they are loaded or unloaded by two to six QCs and transported by a fleet of two AGVs. In the storage yard, there are between one and six LU stations. All AGVs are initially parked at LU station 1. The QC handling time of each container, in seconds, has been randomly generated from a uniform distribution U(30, 80). The data for the loading instances (data set 1) can be found in the Appendix (Table 5). For each instance, we report the number of QCs and the number of container tasks (Q-T), the destination QC (from where the containers are loaded into the ship), and, in parentheses, the QC handling time and the origin LU station (where the containers are picked up). Unloading instances (data set 2) are obtained from the same data (in Table 5) by reversing the origins and destinations, i.e., by taking the QC as the container's origin and the LU station as its destination. We also defined a set of 26 mixed loading-unloading problem instances (data set 3) to assess the advantages of using the dual-cycling strategy. Each of these instances is obtained by combining one loading instance (taken from data set 1) with one unloading instance (taken from data set 2).
AGVs have three speed levels, namely, lower, nominal, and higher. In the nominal mode, AGVs move at an average speed of 6 m per second and consume 10 kWs of energy per second when travelling empty.
When carrying a 40-ton container, the AGVs move at an average speed of 3 m per second and consume 15 kWs of energy per second. At the lower and higher speed levels, the AGV average speed is 20% slower and 20% faster, respectively, than the nominal one. AGV travel time and energy consumption at the lower and higher speed levels are computed as discussed in Sect. 4.
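The three speed levels can be turned into travel times with a few lines of Python. The sketch below uses only the nominal speeds and nominal consumption rates stated above and the 20% speed changes; the 300 m segment length, the dictionary names, and the function names are hypothetical. Energy is computed for the nominal level only, because the consumption rates at the other two levels come from the empirical data of Sect. 4 and are not restated here.

```python
# Nominal values stated in the text: speed in m/s, consumption in kWs per second.
NOMINAL_SPEED = {"empty": 6.0, "loaded": 3.0}
NOMINAL_RATE = {"empty": 10.0, "loaded": 15.0}
# Speed factors of the three levels (20% slower, nominal, 20% faster).
SPEED_FACTOR = {"lower": 0.8, "nominal": 1.0, "higher": 1.2}


def travel_time(distance_m: float, load: str, level: str) -> float:
    """Travel time in seconds for a segment, given load state and speed level."""
    return distance_m / (SPEED_FACTOR[level] * NOMINAL_SPEED[load])


def nominal_energy(distance_m: float, load: str) -> float:
    """Energy in kWs for a segment travelled at the nominal speed level."""
    return travel_time(distance_m, load, "nominal") * NOMINAL_RATE[load]


# Hypothetical 300 m segment.
for load in ("empty", "loaded"):
    times = {lvl: round(travel_time(300.0, load, lvl), 1) for lvl in SPEED_FACTOR}
    print(load, times, nominal_energy(300.0, load))
```

Keeping the speed factor separate from the nominal speed makes it easy to add further levels, at the cost of a larger MILP, as discussed in the solution method.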
5.2 The Solution Method
In multi-objective optimization, a non-dominated solution is a solution for which it is not possible to improve one objective without degrading at least one of the other objectives. A known strategy to solve multi-objective problems is “lexicographic” optimization, in which the objectives are ranked in order of importance. First, the problem is converted into a single-objective optimization problem by considering only the highest ranked objective and disregarding the others. Once an optimal solution is found, another single-objective optimization problem is solved, whose objective function is the one ranked next. In addition, this problem includes extra constraints that impose, for each higher ranked objective function, a value at least as good as the best one found when optimizing it.
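The lexicographic idea can be illustrated on a small finite list of candidate solutions. The Python sketch below is only an illustration: in the chapter each stage is a MILP solve, whereas here a plain enumeration stands in for the solver. The function name `lexicographic_best` is hypothetical, and the candidate list is made up, although two of its entries reuse the values reported for instance L01 under Scen.N in Table 1.

```python
from typing import Dict, List


def lexicographic_best(candidates: List[Dict[str, float]],
                       order: List[str]) -> Dict[str, float]:
    """Pick the best candidate when objectives are minimized in the given rank order."""
    pool = list(candidates)
    for key in order:
        best = min(sol[key] for sol in pool)
        # Keep only solutions that are at least as good as the best value found,
        # which plays the role of the extra constraint added at the next stage.
        pool = [sol for sol in pool if sol[key] <= best]
    return pool[0]


# Hypothetical candidates (makespan in seconds, energy in kWs).
candidates = [
    {"makespan": 200, "energy": 4100},
    {"makespan": 200, "energy": 4083},
    {"makespan": 274, "energy": 4000},
    {"makespan": 300, "energy": 4000},
]

makespan_first = lexicographic_best(candidates, ["makespan", "energy"])
energy_first = lexicographic_best(candidates, ["energy", "makespan"])
print(makespan_first, energy_first)
```

Swapping the rank order yields the two extreme solutions: the fastest schedule with the least energy among the fastest, and the most frugal schedule with the shortest makespan among the most frugal.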
The problem under study is a bi-objective optimization problem, and we solve it by considering, in turn, that each of the objective functions has the higher rank. This way we find two extreme best solutions. Note that increasing the number of speed levels increases the size of the optimization problem and of the MILP model, which in turn leads to an increase in the computation time. Therefore, reducing the number of speed levels considered is of utmost importance.
Since the AGVs consume less energy at lower speed levels, when optimizing the total energy consumed, it is enough to consider the lower speed level. However, when optimizing the makespan subject to the smallest possible energy consumption, it may be possible to increase the AGV travelling speed without increasing the overall energy consumption and thereby decrease the makespan. Since the marginal increase in energy consumption grows with speed (see Fig. 3), when minimizing the total energy consumed, only the lower and nominal speed levels are considered. In contrast, the makespan cannot be reduced by considering speed levels other than the higher one. Nonetheless, it may be possible to reduce energy consumption by moving the AGVs slower while maintaining the best makespan value. However, travelling time increases faster for large speed reductions; therefore, when minimizing the makespan, only the nominal and higher speed levels are considered. We employed this strategy because solving the problem instances with two speed levels is computationally much less expensive than solving them with three.
To further analyze our approach, we found the Pareto front for a subset of problem instances. In order to do so, we employed the ε-constraint method. To implement it, we made the following changes to our model: (i) make it single-objective by considering only the energy minimization, and (ii) add a constraint imposing an upper limit on the makespan.
This new model is solved several times, each time with a different and tighter upper limit on the makespan. The upper limit is decreased gradually, by about 1.5% at each step. This way we obtain a series of trade-off solutions in which the makespan improves step by step, at the cost of a natural and inevitable gradual degradation of the energy consumption.
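The sweep can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the implementation used for the experiments: a finite list of made-up (makespan, energy) pairs replaces the MILP, and each "solve" is a simple minimum over the pairs that respect the current makespan limit. The function name `epsilon_constraint_front` and the 1.5% default step are taken as hypothetical parameters matching the description above.

```python
from typing import List, Tuple

Solution = Tuple[float, float]  # (makespan, energy)


def epsilon_constraint_front(candidates: List[Solution],
                             step: float = 0.015) -> List[Solution]:
    """Minimize energy under a makespan limit that shrinks by `step` each round."""
    front: List[Solution] = []
    limit = max(m for m, _ in candidates)  # loosest limit: every candidate is feasible
    while True:
        feasible = [s for s in candidates if s[0] <= limit]
        if not feasible:
            break  # the limit is now below the best attainable makespan
        # Stand-in for one single-objective solve: least energy, ties broken by makespan.
        best = min(feasible, key=lambda s: (s[1], s[0]))
        if not front or front[-1] != best:
            front.append(best)
        limit *= 1.0 - step  # tighten the upper limit by about 1.5%
    return front


# Hypothetical candidates; (260, 4100) is dominated and should never be selected.
candidates = [(274, 4000), (250, 4020), (220, 4050), (200, 4083), (260, 4100)]
front = epsilon_constraint_front(candidates)
print(front)
```

Several consecutive limits usually return the same solution, so the sketch records a point only when it changes; the step size controls how finely the front is sampled and how many solves are needed.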
5.3 Single-Cycling Strategy Results
The results for data sets 1 and 2 are reported, respectively, in Tables 1 and 2. For each problem instance, these tables report instance characteristics, minimum total energy consumed ($E^*$), and minimum makespan ($C^*_{\max}$). For each of these two solutions, we also report the best value obtained for the other objective. We use three scenarios, namely, Scen.N, Scen.L∼N, and Scen.N∼H. Scenario Scen.N (nominal speed only) is the baseline used to show the advantage of considering AGVs with adjustable speed, while Scen.L∼N and Scen.N∼H are used to avoid solving the instances with three speed levels, as discussed in Sect. 5.2. Finally, we also report the improvement attained by considering AGVs with adjustable speed (over AGVs with a single nominal speed) through the computation of gaps. As an example, the GAP for $E^*$
Table 1 Results for the 35 loading problem instances in data set 1

                Scen.N                          Scen.L∼N        Scen.N∼H        GAPs
Ins     Q-T     E*      Cmax    C*max   E       E*      Cmax    C*max   E       E*      Cmax    C*max   E
L01     2–4     4000    274     200     4083    3900    309     175     4474    −2.5    12.9    −12.8   9.6
L02     2–4     4833    437     262     5083    4713    528     229     5569    −2.5    21.0    −12.5   9.6
L03     2–4     5000    420     266     5417    4875    516     229     5815    −2.5    22.8    −13.9   7.4
L04     2–6     6500    403     323     6917    6338    443     276     7482    −2.5    9.8     −14.5   8.2
L05     2–6     6333    541     316     6583    6175    662     274     7045    −2.5    22.3    −13.3   7.0
L06     2–6     7000    605     372     7583    6825    739     322     8006    −2.5    22.0    −13.5   5.6
L07     2–8     9333    738     436     9750    9100    913     384     10,467  −2.5    23.7    −11.9   7.4
L08     2–8     9500    787     466     10,250  9263    962     405     10,951  −2.5    22.3    −13.1   6.8
L09     2–10    10,917  849     465     11,250  10,644  1052    404     12,182  −2.5    23.9    −13.0   8.3
L10     2–10    11,417  889     525     11,833  11,131  1104    449     12,606  −2.5    24.1    −14.6   6.5
L11     2–12    12,583  578     553     13,083  12,269  667     494     14,063  −2.5    15.5    −10.7   7.5
L12     2–12    13,250  1064    593     13,750  12,919  1308    504     14,969  −2.5    22.9    −15.0   8.9
L13     2–14    17,500  823     726     17,917  17,063  991     610     19,590  −2.5    20.4    −16.0   9.3
L14     2–14    16,750  1321    721     17,417  16,331  1633    614     19,034  −2.5    23.7    −14.9   9.3
L15     2–16    19,333  926     814     19,917  18,850  1119    688     21,821  −2.5    20.9    −15.5   9.6
L16     2–16    20,667  971     847     20,917  20,150  1171    722     22,797  −2.5    20.6    −14.7   9.0
L17     3–6     6333    350     288     6583    6175    425     250     7181    −2.5    21.4    −13.2   9.1
L18     3–6     9000    477     406     9250    8775    565     347     10,087  −2.5    18.3    −14.6   9.0
L19     3–9     11,333  901     510     11,750  11,050  1111    437     12,778  −2.5    23.4    −14.3   8.7
L20     3–9     10,250  840     465     10,750  9994    1034    394     11,778  −2.5    23.1    −15.2   9.6
L21     3–12    12,500  586     554     12,917  12,188  698     479     13,817  −2.5    19.0    −13.6   7.0
L22     3–12    15,000  1159    632     15,667  14,625  1429    –       –       −2.5    23.3    –       –
L23     3–15    17,417  862     734     18,083  –       –       –       –       –       –       –       –
L24     3–15    17,833  1406    763     18,500  17,388  1733    –       –       −2.5    23.3    –       –
L25     4–8     9500    435     407     9667    9263    520     344     10,559  −2.5    19.7    −15.3   9.2
L26     4–8     10,667  574     485     10,750  10,400  699     418     11,682  −2.5    21.8    −13.9   8.7
L27     4–12    15,500  1201    651     15,833  15,113  1489    –       –       −2.5    23.9    –       –
L28     4–12    14,500  1155    620     14,833  14,138  1428    –       –       −2.5    23.6    –       –
L29     4–16    21,667  937     871     21,917  21,125  1141    –       –       −2.5    21.8    –       –
L30     4–16    18,000  846     742     18,583  17,550  1026    –       –       −2.5    21.3    –       –
L31     5–10    14,250  587     563     14,417  13,894  718     –       –       −2.5    22.3    –       –
L32     5–10    12,833  1034    559     13,083  12,513  1275    –       –       −2.5    23.4    –       –
L33     5–15    19,417  1503    798     19,667  18,931  1866    –       –       −2.5    24.1    –       –
L34     5–15    18,917  1494    781     19,250  –       –       –       –       –       –       –       –
L35     6–12    18,333  788     700     18,667  17,875  959     –       –       −2.5    21.7    –       –
Max                                                                             −2.5    24.1    −10.7   9.6
Mean                                                                            −2.5    21.3    −13.9   8.3
Min                                                                             −2.5    9.8     −16.0   5.6
S. Mahdi Homayouni and D. B. M. M. Fontes
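Table 2 Results for the 35 unloading problem instances (data set 2)

| Ins | Q-T | Scen.N $E^*$ | Scen.N $C_{\max}$ | Scen.N $C^*_{\max}$ | Scen.N $E$ | Scen.L∼N $E^*$ | Scen.L∼N $C_{\max}$ | Scen.N∼H $C^*_{\max}$ | Scen.N∼H $E$ | GAP $E^*$ | GAP $C_{\max}$ | GAP $C^*_{\max}$ | GAP $E$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| U01 | 2–4 | 4667 | 226 | 226 | 4667 | 4550 | 264 | 197 | 4914 | −2.5 | 16.8 | −12.5 | 5.3 |
| U02 | 2–4 | 4833 | 367 | 217 | 5000 | 4713 | 458 | 187 | 5343 | −2.5 | 25.0 | −13.5 | 6.9 |
| U03 | 2–4 | 5000 | 383 | 226 | 5333 | 4875 | 479 | 197 | 5644 | −2.5 | 25.0 | −12.7 | 5.8 |
| U04 | 2–6 | 7083 | 301 | 301 | 7083 | 6906 | 360 | 262 | 7490 | −2.5 | 19.4 | −13.2 | 5.7 |
| U05 | 2–6 | 6250 | 493 | 260 | 6417 | 6094 | 602 | 226 | 6871 | −2.5 | 22.0 | −13.1 | 7.1 |
| U06 | 2–6 | 6833 | 521 | 287 | 7333 | 6663 | 646 | 269 | 7596 | −2.5 | 24.0 | −6.3 | 3.6 |
| U07 | 2–8 | 9167 | 714 | 397 | 9500 | 8938 | 874 | 343 | 10,225 | −2.5 | 22.5 | −13.6 | 7.6 |
| U08 | 2–8 | 9333 | 709 | 394 | 10,000 | 9100 | 875 | 356 | 10,621 | −2.5 | 23.5 | −9.5 | 6.2 |
| U09 | 2–10 | 11,000 | 825 | 433 | 11,250 | 10,725 | 1031 | 369 | 12,190 | −2.5 | 25.0 | −14.8 | 8.4 |
| U10 | 2–10 | 11,333 | 854 | 446 | 11,667 | 11,050 | 1062 | 383 | 12,615 | −2.5 | 24.4 | −14.0 | 8.1 |
| U11 | 2–12 | 13,000 | 760 | 524 | 13,250 | 12,675 | 894 | 465 | 14,238 | −2.5 | 17.5 | −11.2 | 7.5 |
| U12 | 2–12 | 13,250 | 1006 | 527 | 13,583 | 12,919 | 1243 | 457 | 14,667 | −2.5 | 23.6 | −13.2 | 8.0 |
| U13 | 2–14 | 17,667 | 819 | 717 | 18,000 | 17,225 | 979 | 610 | 19,569 | −2.5 | 19.6 | −15.0 | 8.7 |
| U14 | 2–14 | 17,000 | 1275 | 667 | 17,417 | 16,575 | 1594 | 565 | 19,010 | −2.5 | 25.0 | −15.3 | 9.1 |
| U15 | 2–16 | 20,333 | 914 | 797 | 20,417 | 19,825 | 1599 | 684 | 22,082 | −2.5 | 74.8 | −14.2 | 8.2 |
| U16 | 2–16 | 20,750 | 1550 | 808 | 21,000 | 20,231 | 1937 | 678 | 22,912 | −2.5 | 25.0 | −16.2 | 9.1 |
| U17 | 3–6 | 7083 | 290 | 290 | 7083 | 6906 | 349 | 251 | 7490 | −2.5 | 20.1 | −13.7 | 5.7 |
| U18 | 3–6 | 9167 | 692 | 379 | 9333 | 8938 | 865 | 327 | 9979 | −2.5 | 25.0 | −13.8 | 6.9 |
| U19 | 3–9 | 11,083 | 825 | 445 | 11,500 | 10,806 | 1021 | 382 | 12,034 | −2.5 | 23.7 | −14.2 | 4.6 |
| U20 | 3–9 | 10,000 | 750 | 408 | 10,500 | 9750 | 937 | 342 | 11,360 | −2.5 | 25.0 | −16.2 | 8.2 |
| U21 | 3–12 | 13,000 | 593 | 521 | 13,167 | 12,675 | 708 | 450 | 14,536 | −2.5 | 19.4 | −13.7 | 10.4 |
| U22 | 3–12 | 14,750 | 1094 | 583 | 15,417 | 14,381 | 1365 | 489 | 16,819 | −2.5 | 24.8 | −16.2 | 9.1 |
| U23 | 3–15 | 17,667 | 791 | 697 | 17,917 | 17,225 | 967 | – | – | −2.5 | 22.3 | – | – |
| U24 | 3–15 | 18,083 | 1350 | 700 | 18,333 | 17,631 | 1688 | – | – | −2.5 | 25.0 | – | – |
| U25 | 4–8 | 10,667 | 441 | 419 | 10,750 | 10,400 | 532 | 353 | 11,610 | −2.5 | 20.8 | −15.9 | 8.0 |
| U26 | 4–8 | 10,917 | 825 | 425 | 11,333 | 10,644 | 1031 | 363 | 12,326 | −2.5 | 25.0 | −14.5 | 8.8 |
| U27 | 4–12 | 15,333 | 606 | 592 | 15,750 | 14,950 | 750 | – | – | −2.5 | 23.8 | – | – |
| U28 | 4–12 | 14,417 | 1105 | 559 | 14,750 | 14,056 | 1365 | – | – | −2.5 | 23.6 | – | – |
| U29 | 4–16 | 21,833 | 1625 | 825 | 22,000 | – | – | – | – | – | – | – | – |
| U30 | 4–16 | 18,583 | 737 | 720 | 18,917 | 18,119 | 908 | – | – | −2.5 | 23.3 | – | – |
| U31 | 5–10 | 15,333 | 575 | 575 | 15,333 | 14,950 | 719 | – | – | −2.5 | 25.0 | – | – |
| U32 | 5–10 | 12,750 | 754 | 513 | 13,167 | 12,431 | 923 | – | – | −2.5 | 22.4 | – | – |
| U33 | 5–15 | 18,917 | 781 | 733 | 19,500 | 18,444 | 946 | – | – | −2.5 | 21.1 | – | – |
| U34 | 5–15 | 18,917 | 1494 | 781 | 19,250 | – | – | – | – | – | – | – | – |
| U35 | 6–12 | 19,667 | 742 | 742 | 19,667 | 19,175 | 927 | – | – | −2.5 | 25.0 | – | – |
| Max | | | | | | | | | | −2.5 | 74.8 | −6.3 | 10.4 |
| Mean | | | | | | | | | | −2.5 | 24.5 | −13.6 | 7.4 |
| Min | | | | | | | | | | −2.5 | 16.8 | −16.2 | 3.6 |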
Energy-Efficient Scheduling of Intraterminal Container Transport
was calculated as in Eq. (30). The other GAP values have been calculated in a similar way. Note that the objective values in Tables 1 and 2 have been rounded to the nearest integer.

\[
\mathrm{GAP}_{E^*} = \frac{E^*_{(\mathrm{Scen.L}\sim\mathrm{N})} - E^*_{(\mathrm{Scen.N})}}{E^*_{(\mathrm{Scen.N})}} \times 100. \tag{30}
\]
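As a minimal sketch of how Eq. (30) can be evaluated, the snippet below computes two GAP values for instance U01 using the rounded objective values listed in Table 2. The function name `gap_percent` is an illustrative choice, not part of the original work, and small deviations from the tabulated GAPs are possible because the tables report rounded objective values.

```python
def gap_percent(value, reference):
    """Relative change of `value` with respect to `reference`, in percent (Eq. 30)."""
    return (value - reference) / reference * 100.0


# Instance U01, Table 2: E* is 4667 under Scen.N and 4550 under Scen.L~N.
gap_energy = gap_percent(4550, 4667)

# Instance U01, Table 2: Cmax (for the minimum-energy solution) is 226 under
# Scen.N and 264 under Scen.L~N.
gap_makespan = gap_percent(264, 226)

print(round(gap_energy, 1), round(gap_makespan, 1))  # -2.5 16.8
```

Both values agree with the GAP columns reported for U01.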
Generally speaking, minimizing the total energy consumed ($E^*$) is computationally less expensive than minimizing the makespan ($C^*_{\max}$). As can be seen in Tables 1 and 2, the MILP model was solved to optimality for all the problem instances when only one speed level was considered (Scen.N), regardless of the objective function being optimized. However, that was not the case when two speed levels were allowed (scenarios Scen.L∼N and Scen.N∼H). Of the 70 instances attempted, four could not be solved when considering AGVs with lower and nominal speed levels and optimizing the total energy consumption (due either to lack of memory or to reaching the predetermined CPU time threshold of 36,000 s), while 23 could not be solved when considering AGVs with nominal and higher speed levels and optimizing the makespan.

The average CPU time to find $C^*_{\max}$ under Scen.N is 395.66 s (ranging from 0.04 to 5641.82 s) for data set 1 and 536.64 s (ranging from 0.03 to 6609.27 s) for data set 2. However, the average CPU time to find $E^*$ is much lower, 1.22 s (ranging from 0.04 to 12.68 s) for data set 1 and 8.82 s (ranging from 0.05 to 279.27 s) for data set 2. Under Scen.L∼N, the average CPU time to find $E^*$ is 376.03 and 29.55 s for data sets 1 and 2, respectively; under Scen.N∼H, the average CPU time to find $C^*_{\max}$ is 166.16 and 35.79 s for data sets 1 and 2, respectively. (Note that the latter averages were computed only for 23 and 24 problem instances in data sets 1 and 2, respectively.)

It is not surprising to see that the adjustable speed strategy can improve the intraterminal transport performance by decreasing the makespan or energy consumption. Nevertheless, as expected, decreasing one objective increases the other one. Regarding the energy consumption $E^*$, it is 2.5% lower than that of a fixed nominal speed, under any of three speed level scenarios. Recall that, according to the values reported in Fig. 3, travelling at the lower speed allows for a decrease in the energy consumption per time unit of 22% while increasing the travel time by 25%; hence, the overall energy consumption can be decreased by 2.5%. However, achieving the minimum energy consumption increases the makespan on average by 21.3% for data set 1 and by 24.5% for data set 2 (ranging from 9.8% to 24.1% and from 16.8% to 74.8% for data sets 1 and 2, respectively). Similarly, one can see that on average $C^*_{\max}$, under three speed level scenarios, decreases by 13.9% and 13.6%, respectively, for data sets 1 and 2. Nevertheless, this lower makespan value is achieved at the cost of increasing the energy consumption that, on average, goes up by 8.3% and 7.4% for data sets 1 and 2, respectively.
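The 2.5% figure follows directly from the two percentages quoted above, since energy is the product of the energy consumed per time unit and the travel time. The short sketch below reproduces that arithmetic; the constant and function names are illustrative assumptions, and the only inputs are the 22% and 25% values stated in the text.

```python
# Values stated in the text for travelling at the lower speed level.
LOW_SPEED_POWER_FACTOR = 1.0 - 0.22  # 22% less energy per time unit
LOW_SPEED_TIME_FACTOR = 1.0 + 0.25   # 25% longer travel time


def relative_energy(power_factor, time_factor):
    """Energy relative to nominal speed: (energy per time unit) x (travel time)."""
    return power_factor * time_factor


low_speed_energy = relative_energy(LOW_SPEED_POWER_FACTOR, LOW_SPEED_TIME_FACTOR)
saving_percent = (1.0 - low_speed_energy) * 100.0

print(f"relative energy at low speed: {low_speed_energy:.3f}")
print(f"energy saving: {saving_percent:.1f}%")
```

Applying the resulting factor of 0.975 to the Scen.N value of $E^*$ for U01 (4667) gives about 4550, which is the Scen.L∼N value reported for that instance in Table 2.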
5.4 Dual-Cycling Strategy Results

We investigate the impact of the dual-cycling strategy by considering a set of 26 problem instances. These instances were obtained by combining one problem instance from data set 1 (loading task instance) with one instance from data set 2 (unloading task instance). Regarding the AGVs, we consider them to be shared, that is, for each instance, there are two AGVs to transport all containers. If the loading and unloading tasks are performed by the same set of QCs, then the instance refers to a single ship that needs to be unloaded and loaded. In this case, a higher priority is given to the unloading tasks. However, when the unloading and loading tasks are performed by different sets of QCs, then the instance refers to two different ships that are under service simultaneously; thus, the model decides on the priority of the tasks.

The problem instances under the dual-cycling strategy were also solved using the “lexicographic” approach previously described. The results obtained (under scenarios Scen.L∼N and Scen.N∼H) are reported in Table 3. Note that the objective values in this table have been rounded to the nearest integer. We report on the instances, each of which is a combination of the two problem instances specified in the “comb. of” column and their characteristics. For convenience, we also report the single-cycling best (of each) objective values (previously reported in Tables 1 and 2). The GAP values are calculated between objective values of the dual-cycling problem instances and the summation of respective values of the unloading and loading problem instances, assuming that the two sets of unloading and loading tasks are performed consecutively for a single ship under service (or concurrently for two ships under service) without loss of time and energy. The GAP values for $E^*$ have been calculated as shown in Eq. (31), wherein $E^*_{(D)}$, $E^*_{(U)}$, and $E^*_{(L)}$ are the minimum energy consumption under the dual-cycling strategy and under the single-cycling strategy for unloading tasks and for loading tasks, respectively. The other GAP values have been calculated in a similar way. Note that an optimal solution was not found for $E^*$ and/or $C^*_{\max}$ for problem instances D18, D21, D22, D25, and D26 due to reaching the imposed CPU time limit of 36,000 s.
\[
\mathrm{GAP}_{E^*_{(D)}} = \frac{E^*_{(D)} - \left(E^*_{(L)} + E^*_{(U)}\right)}{E^*_{(L)} + E^*_{(U)}} \times 100. \tag{31}
\]
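To make the baseline in Eq. (31) concrete, the following sketch evaluates it for instance D01 (the combination U01&L01) with the rounded values listed in Table 3. The function name `dual_cycling_gap` is an illustrative assumption; the same helper is applied to $C^*_{\max}$ on the assumption, stated in the text, that the other GAP values are computed in a similar way.

```python
def dual_cycling_gap(dual, unloading, loading):
    """GAP (in percent) of a dual-cycling value against the sum of the
    corresponding single-cycling unloading and loading values (Eq. 31)."""
    baseline = unloading + loading
    return (dual - baseline) / baseline * 100.0


# Instance D01 = U01 & L01, values as listed in Table 3.
gap_energy = dual_cycling_gap(dual=7475, unloading=4550, loading=3900)
gap_makespan = dual_cycling_gap(dual=351, unloading=197, loading=175)

print(round(gap_energy, 1), round(gap_makespan, 1))  # -11.5 -5.6
```

Both results match the GAP entries reported for D01.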
As can be seen from the results reported, the dual-cycling strategy can decrease $E^*$ and $C^*_{\max}$, on average, by 7.5% (ranging from 0.0% to 18.3%) and 18.1% (ranging from 5.6% to 34.5%), respectively. It is interesting to note that for most problem instances, under the dual-cycling strategy, decreasing $E^*$ or $C^*_{\max}$ not only does not imply additional time or energy, respectively, but also, in most cases, allows for their reduction. In other words, with the dual-cycling strategy, it is possible not only to decrease $E^*$ and $C^*_{\max}$ (on average by 7.5% and 18.1%, respectively) but also to decrease the corresponding $C_{\max}$ and $E$ (on average by 27.1% and 3.5%, respectively).

Table 3 Results for the 26 problem instances with dual-cycling strategy

| Ins | comb. of | Q-T | Unloading $E^*$ | Unloading $C^*_{\max}$ | Loading $E^*$ | Loading $C^*_{\max}$ | Dual-cycling $E^*$ | Dual-cycling $C_{\max}$ | Dual-cycling $C^*_{\max}$ | Dual-cycling $E$ | GAP $E^*$ | GAP $C_{\max}$ | GAP $C^*_{\max}$ | GAP $E$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| D01 | U01&L01 | 2–8 | 4550 | 197 | 3900 | 175 | 7475 | 510 | 351 | 8832 | −11.5 | −10.9 | −5.6 | −5.9 |
| D02 | U02&L02 | 2–8 | 4713 | 187 | 4713 | 229 | 7800 | 510 | 350 | 9258 | −17.2 | −48.3 | −15.9 | −15.2 |
| D03 | U03&L03 | 2–8 | 4875 | 197 | 4875 | 229 | 8125 | 491 | 350 | 9442 | −16.7 | −50.7 | −17.8 | −17.6 |
| D04 | U01&L02 | 4–8 | 4550 | 197 | 4713 | 229 | 9263 | 528 | 318 | 10,956 | 0.0 | −33.3 | −25.4 | 4.5 |
| D05 | U01&L03 | 4–8 | 4550 | 197 | 4875 | 229 | 9425 | 516 | 346 | 11,686 | 0.0 | −33.8 | −18.8 | 8.9 |
| D06 | U02&L01 | 4–8 | 4713 | 187 | 3900 | 175 | 7800 | 354 | 243 | 8749 | −9.4 | −53.8 | −32.9 | −10.9 |
| D07 | U02&L03 | 4–8 | 4713 | 187 | 4875 | 229 | 9425 | 954 | 339 | 11,381 | −1.7 | −2.1 | −18.6 | 2.0 |
| D08 | U03&L01 | 4–8 | 4875 | 197 | 3900 | 175 | 8450 | 396 | 277 | 9304 | −3.7 | −49.8 | −25.6 | −8.0 |
| D09 | U03&L02 | 4–8 | 4875 | 197 | 4713 | 229 | 8450 | 792 | 279 | 9860 | −11.9 | −21.4 | −34.5 | −12.1 |
| D10 | U04&L04 | 2–12 | 6906 | 262 | 6338 | 276 | 12,513 | 709 | 504 | 14,166 | −5.5 | −11.7 | −6.3 | −5.4 |
| D11 | U05&L05 | 2–12 | 6094 | 226 | 6175 | 274 | 10,400 | 648 | 443 | 12,897 | −15.2 | −48.7 | −11.3 | −7.3 |
| D12 | U06&L06 | 2–12 | 6663 | 269 | 6825 | 322 | 11,700 | 751 | 493 | 14,230 | −13.3 | −45.8 | −16.6 | −8.8 |
| D13 | U04&L05 | 4–12 | 6906 | 262 | 6175 | 274 | 13,000 | 992 | 450 | 15,223 | −0.6 | −2.9 | −16.1 | 4.7 |
| D14 | U04&L06 | 4–12 | 6906 | 262 | 6825 | 322 | 13,650 | 1068 | 486 | 15,234 | −0.6 | 4.6 | −16.7 | 4.8 |
| D15 | U05&L04 | 4–12 | 6094 | 226 | 6338 | 276 | 11,213 | 578 | 367 | 13,207 | −9.8 | −44.6 | −27.0 | −8.0 |
| D16 | U05&L06 | 4–12 | 6094 | 226 | 6825 | 322 | 12,675 | 1309 | 453 | 15,199 | −1.9 | 3.6 | −17.3 | 9.2 |
| D17 | U06&L04 | 4–12 | 6663 | 269 | 6338 | 276 | 12,675 | 805 | 429 | 15,156 | −2.5 | −26.1 | −21.4 | 0.5 |
| D18 | U06&L05 | 4–12 | 6663 | 269 | 6175 | 274 | – | – | – | – | – | – | – | – |
| D19 | U07&L07 | 2–16 | 8938 | 343 | 9100 | 384 | 16,900 | 1069 | 636 | 20,069 | −6.3 | −40.2 | −12.5 | −3.0 |
| D20 | U08&L08 | 2–16 | 9100 | 356 | 9263 | 405 | 17,063 | 1670 | 686 | 20,084 | −7.1 | −9.1 | −9.9 | −6.9 |
| D21 | U07&L08 | 4–16 | 8938 | 343 | 9263 | 405 | 18,038 | 1815 | – | – | −0.9 | −1.1 | – | – |
| D22 | U08&L07 | 4–16 | 9100 | 356 | 9100 | 384 | – | – | – | – | – | – | – | – |
| D23 | U17&L17 | 3–12 | 6906 | 251 | 6175 | 250 | 11,050 | 657 | 453 | 13,837 | −15.5 | −15.0 | −9.7 | −5.7 |
| D24 | U18&L18 | 3–12 | 8938 | 327 | 8775 | 347 | 14,463 | 797 | 532 | 17,009 | −18.3 | −43.9 | −21.0 | 7.3 |
| D25 | U17&L18 | 6–12 | 6906 | 251 | 8775 | 347 | 15,600 | 842 | – | – | −0.5 | −22.9 | – | – |
| D26 | U18&L17 | 6–12 | 8938 | 327 | 6175 | 250 | 13,650 | 646 | – | – | −9.7 | −41.4 | – | – |
| Max | | | | | | | | | | | 0.0 | 4.6 | −5.6 | 9.2 |
| Mean | | | | | | | | | | | −7.5 | −27.1 | −18.1 | −3.5 |
| Min | | | | | | | | | | | −18.3 | −53.8 | −34.5 | −17.6 |
5.5 Pareto Front Analysis

In the last set of experiments of this chapter, we employed the ε-constraint method to obtain the Pareto front (at least partially) of a selected set of problem instances. The fronts obtained are depicted in Figs. 4 and 5. The trade-off curves were obtained by solving each problem instance under the three speed level scenario. We started by obtaining the extreme solution that optimizes the energy consumed $E^*$ while disregarding the makespan and calculated the corresponding makespan. Then we solved a series of MILP models minimizing $E$ with an additional constraint on the maximum allowed value for $C_{\max}$ while gradually decreasing it (by about 1.5%). The optimization process was repeated until we reached the other extreme solution, obtaining $C^*_{\max}$.

In Figs. 4 and 5, the extreme solutions are shown in red. Note that for problem instances U30 and L30, we were not able to solve the MILP model to optimality when the maximum allowed makespan is, respectively, 820 and 970 s and lower. Nevertheless, the $E$ values of these points are very close to the optimum ones since the optimality gaps reported by Gurobi® are always below 1%. Thus, we show these values in orange. As can be seen, for problem instances U10, L10, U20, and L20 at a specific level of energy consumption, the makespan can be decreased to a great extent. As an example, in problem instance L20, by allowing the intraterminal transport to consume 10,150 kW (which is only 1.5% more than the minimum energy consumption), the makespan can be decreased from 890 s to about 600 s (about 33% reduction). Or we can say that to decrease the energy consumption by 1.5%, the makespan value becomes almost three times as much as the minimum makespan value. The trade-off curves for problem instances D10 and D15 follow a similar pattern.
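The iterative procedure described in this section can be sketched on a toy problem. The example below is not the authors' MILP model: it assumes a single vehicle that performs a few trips one after another, picks one of three speed levels per trip, and replaces the MILP solver with brute-force enumeration. The low-speed factors follow the percentages quoted in the chapter; the power factor for the high speed level (1.35) and the trip durations are made-up values for illustration only.

```python
from itertools import product

# (time factor, power factor) relative to the nominal speed level.
# "low" uses the figures quoted in the text (25% longer, 22% less power);
# the "high" power factor is a hypothetical value chosen for illustration.
SPEEDS = {"low": (1.25, 0.78), "nominal": (1.0, 1.0), "high": (1 / 1.2, 1.35)}


def evaluate(durations, levels):
    """Return (energy, makespan) of one speed assignment for sequential trips."""
    times = [d * SPEEDS[lvl][0] for d, lvl in zip(durations, levels)]
    energy = sum(t * SPEEDS[lvl][1] for t, lvl in zip(times, levels))
    return energy, sum(times)


def min_energy(durations, max_makespan=float("inf")):
    """Minimize energy subject to makespan <= max_makespan (by enumeration).

    Ties in energy are broken in favour of the shorter makespan.
    Returns (energy, makespan, levels), or None if no assignment is feasible.
    """
    best = None
    for levels in product(SPEEDS, repeat=len(durations)):
        energy, makespan = evaluate(durations, levels)
        if makespan <= max_makespan + 1e-9:
            if best is None or (energy, makespan) < best[:2]:
                best = (energy, makespan, levels)
    return best


def epsilon_constraint_front(durations, step=0.015):
    """Trace trade-off points by tightening the makespan bound by `step` each round."""
    front = []
    point = min_energy(durations)  # extreme solution: energy only
    while point is not None:
        front.append(point)
        point = min_energy(durations, max_makespan=point[1] * (1.0 - step))
    return front


if __name__ == "__main__":
    for energy, makespan, levels in epsilon_constraint_front([60, 40, 100]):
        print(f"E = {energy:7.2f}  Cmax = {makespan:7.2f}  {levels}")
```

Because the bound is tightened in fixed relative steps, the last point found lies within one step of the minimum makespan but is not guaranteed to coincide with it; a separate makespan-minimizing solve, as in the lexicographic approach of the chapter, would be needed to pin down that extreme exactly.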
Fig. 4 Pareto front analysis for problem instances U10, L10, U20, and L20 (one panel per instance; axes: Energy and Makespan)

Fig. 5 Pareto front analysis for problem instances U30, L30, D10, and D15 (one panel per instance; axes: Energy and Makespan)

6 Conclusion

Although innovative technology for intraterminal transport can be used to decrease energy consumption, its usage is not yet common. Besides, such technologies require large financial investments. Moreover, several significant benefits can be achieved by improving the efficiency of existing equipment. In this work, we propose to employ the concept of the pollution-routing problem for intraterminal container transportation which, without a significant impact on service level, can decrease the energy consumption and/or service time (makespan) at almost no extra operational cost. The problem is formulated as three mixed-integer linear programming (MILP) models for a set of loading tasks, a set of unloading tasks, and a set of mixed unloading and loading tasks under a dual-cycling strategy. We also generated three sets of small-sized problem instances that can be used for future comparison purposes.

The results show that under a speed-adjustable policy for intraterminal transport (with three speed levels), the minimum makespan can be decreased on average by 13.9% and 13.6% for problem instances in data sets 1 and 2, respectively, while the energy consumption can be decreased by 2.5% in both data sets. These results were obtained by allowing the vehicles to also travel at speed levels 20% lower and 20% higher than the nominal speed level. In addition, we have also shown that the dual-cycling strategy can decrease the minimum energy consumption and makespan, on average, by 7.5% and 18.1%, respectively. The Pareto fronts of a selected subset of problem instances have shown that several trade-off solutions can be of great importance since a large reduction in one of the objectives can be achieved with a small increase in the other.

This work can be extended in several ways, for example, by considering other types of intraterminal transporters, including other container handling operations within the scheduling process, and developing heuristic solution methods capable of solving larger and more realistic problem instances in a timely manner. In order to estimate the energy consumption rate at different speed levels, we used the data gathered for the low-weight and heavyweight carrying vehicles. However, a comprehensive theoretical/practical investigation of the energy consumption rate of vehicles in container terminals (specifically AGVs and other autonomous vehicles) seems to be urgently needed.

Acknowledgments The authors acknowledge receiving FEDER/COMPETE2020/NORTE2020/POCI/PIDDAC/MCTES/FCT funds through grants PTDC/EGE-OGE/31821/2017 and PTDC/EEI-AUT/31447/2017.
Appendix

The travelling distances between QCs and LU stations in the designed CT are reported in Table 4, and the data set for the proposed small-sized problem instances is reported in Table 5.

Table 4 Travelling distance (in m) between QCs and LUs in the designed CT layout

| | QC1 | QC2 | QC3 | QC4 | QC5 | QC6 | LU1 | LU2 | LU3 | LU4 | LU5 | LU6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| QC1 | 0 | 250 | 300 | 350 | 400 | 450 | 150 | 200 | 250 | 300 | 350 | 400 |
| QC2 | 50 | 0 | 250 | 300 | 350 | 400 | 200 | 150 | 200 | 250 | 300 | 350 |
| QC3 | 100 | 50 | 0 | 250 | 300 | 350 | 250 | 200 | 150 | 200 | 250 | 300 |
| QC4 | 150 | 100 | 50 | 0 | 250 | 350 | 300 | 250 | 200 | 150 | 200 | 250 |
| QC5 | 200 | 150 | 100 | 50 | 0 | 250 | 350 | 300 | 250 | 200 | 150 | 200 |
| QC6 | 250 | 200 | 150 | 100 | 50 | 0 | 400 | 350 | 300 | 250 | 200 | 150 |
| LU1 | 150 | 200 | 250 | 300 | 350 | 400 | 0 | 50 | 100 | 150 | 200 | 250 |
| LU2 | 200 | 150 | 200 | 250 | 300 | 350 | 250 | 0 | 50 | 100 | 150 | 200 |
| LU3 | 250 | 200 | 150 | 200 | 250 | 300 | 300 | 250 | 0 | 50 | 100 | 150 |
| LU4 | 300 | 250 | 200 | 150 | 200 | 250 | 350 | 300 | 250 | 0 | 50 | 100 |
| LU5 | 350 | 300 | 250 | 200 | 150 | 200 | 400 | 350 | 300 | 250 | 0 | 50 |
| LU6 | 400 | 350 | 300 | 250 | 200 | 150 | 450 | 400 | 350 | 300 | 250 | 0 |
Q-T 2–4
2–4
2–6
2–8
2–10
2–12
2–12
2–14
2–14
2–16
2–16
Ins L01
L03
L05
L07
L09
L11
L12
L13
L14
L15
L16
QC 1 2 5 6 3 4 3 4 2 3 1 2 3 4 1 2 4 5 2 3 3 4
(76, 1) (59, 1) (76, 5) (59, 5) (60, 3) (50, 4) (72, 3) (80, 4) (42, 2) (37, 2) (64, 1) (47, 2) (56, 4) (60, 3) (59, 2) (78, 1) (67, 5) (52, 3) (64, 2) (57, 1) (69, 2) (37, 3)
(50, 1) (56, 1) (49, 5) (37, 5) (60, 3) (64, 4) (59, 4) (52, 3) (38, 2) (60, 3) (63, 1) (77, 1) (55, 3) (79, 3) (51, 1) (41, 3) (35, 4) (66, 5) (46, 1) (66, 2) (40, 3) (78, 4)
(42, 3) (58, 4) (56, 4) (78, 4) (47, 3) (71, 3) (38, 2) (79, 2) (71, 3) (47, 4) (55, 3) (67, 2) (33, 4) (69, 3) (77, 3) (32, 2) (45, 2) (48, 2) (80, 3) (38, 3) (76, 2) (68, 2) (62, 1) (66, 2) (48, 4) (62, 4) (41, 2) (32, 3) (39, 3) (35, 5) (78, 1) (45, 1) (43, 4) (61, 2)
Table 5 Data set for loading small-sized problem instances
(47, 2) (31, 3) (51, 1) (75, 2) (30, 3) (49, 4) (31, 2) (80, 2) (54, 5) (38, 5) (59, 2) (74, 1) (66, 3) (75, 3) (30, 2) (66, 1) (68, 3) (54, 4) (64, 2) (75, 3) (31, 3) (41, 4) (37, 1) (31, 2) (60, 2) (39, 4) (43, 3) (63, 1) (49, 4) (71, 5) (61, 2) (40, 3) (50, 2) (69, 3)
L10
L08
L06
L04
Ins L02
(70, 2) (72, 3) (80, 4) (43, 2)
2–10
2–8
2–6
2–6
Q-T 2–4 QC 3 4 1 2 5 6 5 6 4 5 (68, 3) (49, 3) (38, 2) (68, 1) (41, 5) (63, 6) (63, 6) (77, 6) (54, 4) (37, 5)
(58, 3) (70, 3) (52, 2) (55, 1) (30, 6) (79, 6) (38, 5) (79, 5) (38, 5) (55, 4) (70, 1) (48, 2) (36, 5) (72, 5) (62, 6) (66, 6) (57, 4) (46, 5)
(30, 5) (66, 5) (47, 4) (44, 4)
(79, 5) (31, 5)
3–6
3–9
3–12
3–15
4–8
4–12
4–16
L17
L19
L21
L23
L25
L27
L29
1 2 3 2 3 4 1 2 3 1 2 3 1 2 3 4 2 3 4 5 3 4 5 6
(76, 1) (36, 2) (57, 1) (39, 2) (34, 3) (78, 4) (71, 1) (30, 2) (72, 3) (72, 2) (66, 2) (47, 2) (36, 1) (74, 2) (77, 1) (40, 2) (39, 3) (67, 5) (50, 3) (68, 4) (46, 3) (47, 5) (79, 3) (32, 4)
(37, 1) (55, 2) (50, 2) (31, 3) (78, 2) (66, 3) (60, 1) (78, 1) (64, 3) (62, 3) (61, 3) (56, 3) (73, 1) (40, 2) (58, 2) (43, 1) (48, 4) (41, 3) (46, 5) (68, 3) (60, 4) (58, 3) (30, 5) (49, 3)
(51, 3) (56, 4) (56, 4) (77, 5) (54, 3) (50, 4) (56, 4) (67, 5)
(51, 4) (60, 4) (59, 2) (54, 2) (72, 2) (67, 2) (41, 1) (67, 1) (40, 1)
(71, 5) (70, 3) (39, 4) (63, 3)
(54, 1) (57, 2) (76, 3) (73, 2) (54, 1) (33, 3) (56, 1) (51, 2) (48, 3)
L30
L28
L26
L24
L22
L20
L18
4–16
4–12
4–8
3–15
3–12
3–9
3–6
4 5 6 3 4 5 4 5 6 4 5 6 3 4 5 6 3 4 5 6 1 2 3 4
(64, 3) (71, 4) (63, 4) (74, 3) (42, 4) (32, 5) (52, 5) (32, 6) (80, 6) (47, 5) (79, 6) (32, 4) (64, 3) (45, 3) (68, 4) (63, 3) (63, 3) (51, 4) (60, 5) (41, 5) (75, 1) (69, 2) (45, 3) (62, 3)
(73, 3) (47, 4) (56, 3) (38, 3) (65, 4) (44, 5) (33, 6) (79, 5) (32, 6) (58, 6) (30, 5) (49, 6) (75, 3) (72, 4) (77, 4) (74, 4) (42, 4) (43, 3) (30, 5) (47, 5) (39, 2) (80, 2) (48, 2) (36, 3) (64, 3) (45, 4) (68, 4) (63, 4) (46, 1) (67, 1) (77, 3) (50, 3)
(48, 3) (52, 5) (65, 3) (77, 4) (74, 5) (66, 4) (50, 4) (56, 5) (67, 5)
(35, 1) (44, 2) (42, 3) (63, 2)
(34, 5) (63, 4) (33, 4) (70, 4) (39, 4) (63, 4)
(51, 4) (56, 5) (77, 6)
Q-T 5–10
5–15
6–12
Ins L31
L33
L35
QC 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 6
(61, 1) (71, 2) (30, 2) (38, 2) (55, 1) (80, 2) (32, 3) (63, 4) (48, 4) (67, 4) (72, 1) (61, 2) (53, 3) (57, 1) (65, 3) (58, 2)
(59, 1) (62, 1) (30, 1) (37, 2) (42, 1) (43, 2) (47, 2) (52, 4) (57, 3) (43, 3) (59, 3) (57, 1) (48, 3) (40, 2) (50, 1) (47, 3) (53, 3) (70, 3) (71, 4) (75, 4) (65, 4)
L34
Ins L32
5–15
Q-T 5–10 QC 2 3 4 5 6 2 3 4 5 6 (79, 3) (55, 3) (46, 4) (77, 4) (80, 4) (45, 2) (34, 3) (43, 4) (66, 4) (56, 4)
(66, 3) (45, 3) (56, 4) (34, 4) (67, 4) (50, 3) (47, 2) (70, 3) (35, 4) (36, 3)
(77, 2) (52, 3) (67, 4) (60, 4) (56, 4)
References

1. A.S. Alamoush, F. Ballini, A.I. Ölçer, Ports’ technical and operational measures to reduce greenhouse gas emission and improve energy efficiency: a review. Mar. Pollut. Bull. 160, 111508 (2020)
2. S. Behjati, N. Nahavandi, A mathematical model and grouping imperialist competitive algorithm for integrated quay crane and yard truck scheduling problem with non-crossing constraint. IJE Trans. A Basics 32(10), 1464–1479 (2019)
3. T. Bektaş, G. Laporte, The pollution-routing problem. Transp. Res. B: Methodol. 45(8), 1232–1250 (2011)
4. M. Christiansen, K. Fagerholt, B. Nygreen, D. Ronen, Ship routing and scheduling in the new millennium. Eur. J. Oper. Res. 228(3), 467–483 (2013)
5. Y. Du, Q. Meng, S. Wang, H. Kuang, Two-phase optimal solutions for ship speed and trim optimization over a voyage using voyage report data. Transp. Res. B: Methodol. 122, 88–114 (2019)
6. R. Eshtehadi, M. Fathian, E. Demir, Robust solutions to the pollution-routing problem with demand and travel time uncertainty. Transp. Res. D: Transp. Environ. 51, 351–363 (2017)
7. EU, Study on the Analysis and Evolution of International and EU Shipping (European Commission, 2015)
8. EU, Port of Hamburg uses green ‘smart batteries’ to support the German energy transition, 2020. https://ec.europa.eu/regional_policy/en/projects/germany/port-of-hamburg-uses-greensmart-batteries-to-support-the-german-energy-transition
9. S.M. Homayouni, D.B. Fontes, Metaheuristics for Maritime Operations (Wiley, London, 2018)
10. S.M. Homayouni, S.H. Tang, Multi objective optimization of coordinated scheduling of cranes and vehicles at container terminals. Math. Probl. Eng. 2013, 746781 (2013)
11. S.M. Homayouni, S.H. Tang, O. Motlagh, A genetic algorithm for optimization of integrated scheduling of cranes, vehicles, and storage platforms at automated container terminals. J. Comput. Appl. Math. 270, 545–556 (2014)
12. Ç. Iris, J.S.L. Lam, A review of energy efficiency in ports: operational strategies, technologies and energy management systems. Renew. Sustain. Energy Rev. 112, 170–182 (2019)
13. N. Kaveshgar, N. Huynh, Integrated quay crane and yard truck scheduling for unloading inbound containers. Int. J. Prod. Econ. 159, 168–177 (2015)
14. D. Kizilay, D.T. Eliiyi et al., A comprehensive review of quay crane scheduling, yard operations and integrations thereof in container terminals. Flexible Serv. Manuf. J. 33, 1–42 (2021)
15. Y. Li, F. Chu, F. Zheng, M. Liu, A bi-objective optimization for integrated berth allocation and quay crane assignment with preventive maintenance activities. IEEE Trans. Intell. Transp. Syst. (2020). https://doi.org/10.1109/TITS.2020.3023701
16. D. Liu, Y.-E. Ge, Modeling assignment of quay cranes using queueing theory for minimizing CO2 emission at a container terminal. Transp. Res. D Transp. Environ. 61, 140–151 (2018)
17. N. Ma, C. Zhou, A. Stephen, Simulation model and performance evaluation of battery-powered AGV systems in automated container terminals. Simul. Model. Pract. Theory 106, 102146 (2021)
18. S.A. Mansouri, H. Lee, O. Aluko, Multi-objective decision support to enhance environmental sustainability in maritime shipping: a review and future directions. Transp. Res. E Logist. Transp. Rev. 78, 3–18 (2015)
19. J. Martínez-Moya, B. Vazquez-Paja, J.A.G. Maldonado, Energy efficiency and CO2 emissions of port container terminal equipment: Evidence from the port of Valencia. Energy Policy 131, 312–319 (2019)
20. M. Meißner, L. Massalski, Modeling the electrical power and energy consumption of automated guided vehicles to improve the energy efficiency of production systems. Int. J. Adv. Manuf. Technol. 110(1), 481–498 (2020)
21. R. Moghdani, K. Salimifard, E. Demir, A. Benyettou, The green vehicle routing problem: a systematic literature review. J. Cleaner Prod. 279, 123691 (2021)
22. H.N. Psaraftis, C.A. Kontovas, Balancing the economic and environmental performance of maritime transportation. Transp. Res. D Transp. Environ. 15(8), 458–462 (2010)
23. X. Qiu, E.Y. Wong, J.S.L. Lam, Evaluating economic and environmental value of liner vessel sharing along the maritime silk road. Mar. Policy Manag. 45(3), 336–350 (2018)
24. S. Riazi, K. Bengtsson, B. Lennartson, Energy optimization of large-scale AGV systems. IEEE Trans. Autom. Sci. Eng. 18(2), 638–649 (2021)
25. D. Ronen, The effect of oil price on containership speed and fleet size. J. Oper. Res. Soc. 62(1), 211–216 (2011)
26. J. Schmidt, C. Meyer-Barlag, M. Eisel, L.M. Kolbe, H.-J. Appelrath, Using battery-electric AGVs in container terminals–assessing the potential and optimizing the economic viability. Res. Transp. Bus. Manag. 17, 99–111 (2015)
27. E. Sdoukopoulos, M. Boile, A. Tromaras, N. Anastasiadis, Energy efficiency in European ports: State-of-practice and insights on the way forward. Sustainability 11(18), 4952 (2019)
28. L. Tang, J. Zhao, J. Liu, Modeling and solution of the joint quay crane and truck scheduling problem. Eur. J. Oper. Res. 236(3), 978–990 (2014)
29. G. Venturini, Ç. Iris, C.A. Kontovas, A. Larsen, The multi-port berth allocation problem with speed optimization and emission considerations. Transp. Res. D Transp. Environ. 54, 142–159 (2017)
30. N. Wang, D. Chang, X. Shi, J. Yuan, Y. Gao, Analysis and design of typical automated container terminals layout considering carbon emissions. Sustainability 11(10), 2957 (2019)
31. Y. Xiao, X. Zuo, J. Huang, A. Konak, Y. Xu, The continuous pollution routing problem. Appl. Math. Comput. 387, 125072 (2020)
32. Y. Yang, M. Zhong, Y. Dessouky, O. Postolache, An integrated scheduling method for AGV routing in automated container terminals. Comput. Ind. Eng. 126, 482–493 (2018)
33. Z. Yuan, A brief literature review on ship management in maritime transportation, in Technical Report Series, TR/IRIDIA/2016-001 (IRIDIA, Universite Libre de Bruxelles, 2016)
34. L. Yue, H. Fan, C. Zhai, Joint configuration and scheduling optimization of a dual-trolley quay crane and automatic guided vehicles with consideration of vessel stability. Sustainability 12(1), 24 (2020)
35. Q. Zhao, S. Ji, D. Guo, X. Du, H. Wang, Research on cooperative scheduling of automated quayside cranes and automatic guided vehicles in automated container terminal. Math. Probl. Eng. 2019, 6574582 (2019)
36. L. Zhen, S. Yu, S. Wang, Z. Sun, Scheduling quay cranes and yard trucks for unloading operations in container ports. Ann. Oper. Res. 273(1), 455–478 (2019)
37. L. Zhen, D. Zhuge, L. Murong, R. Yan, S. Wang, Operation management of green ports and shipping networks: overview and research opportunities. Front. Eng. Manag. 6, 152–162 (2019)
38. M. Zhong, Y. Yang, Y. Dessouky, O. Postolache, Multi-AGV scheduling for conflict-free path planning in automated container terminals. Comput. Ind. Eng. 142, 106371 (2020)
39. M. Zhong, Y. Yang, S. Sun, Y. Zhou, O. Postolache, Y.-E. Ge, Priority-based speed control strategy for automated guided vehicle path planning in automated container terminals. Trans. Inst. Measur. Control (2020). https://doi.org/10.1177/0142331220940110
Learning-Based Control for Hybrid Battery Management Systems

Jonas Mirwald, Ricardo de Castro, Jonathan Brembeck, Johannes Ultsch, and Rui Esteves Araujo
J. Mirwald · J. Brembeck · J. Ultsch
Institute of System Dynamics and Control, German Aerospace Center (DLR), Weßling, Germany
e-mail: [email protected]
R. de Castro
Department of Mechanical Engineering, University of California, Merced, CA, USA
e-mail: [email protected]
R. E. Araujo
INESC TEC and Faculty of Engineering, University of Porto, Porto, Portugal
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_7

1 Introduction

Electric vehicles (EVs) are currently seen as a key technology for sustainable transportation. They enable the integration of renewable energies with transportation systems and provide a promising avenue to reduce environmental impact [1]. However, to fulfill these goals, several challenges need to be addressed. One challenge is the parameter variation of large battery packs, which occurs due to manufacturing tolerances and nonuniform aging of battery cells. These variations introduce the so-called weakest cell problem, i.e., the performance of the battery pack is limited by the cell with the largest thermal and capacity degradation [2]. To overcome these issues, active balancing systems, capable of equalizing charge and temperature, have been developed [3]. Another challenge in the design of battery packs lies in the selection of battery chemistries that simultaneously offer high energy density, high power density, and long life. To attenuate these issues, hybrid and modular energy storage systems, composed of heterogeneous units, have been investigated [4]. One promising research avenue deals with battery-supercapacitor hybridization. Supercapacitors with high power density and durability are particularly suited to handle rapid power bursts, while battery packs with high energy density can provide average power during vehicle cruising. Numerous works have utilized these properties to reduce peak power loads, weight, and stress of the battery (see, e.g., [5, 6] and references therein).

In the literature, active battery balancing and hybridization (e.g., battery-supercapacitor) have been treated as two independent functions, performed by separate power converters. In contrast, the authors’ group has been investigating the possibility of integrating these two functions into one system [7] (see Fig. 1). The concept, called hybrid battery management system (HBMS), exploits the power electronics already embedded in the balancing circuit to simultaneously enable battery equalization and hybridization, avoiding the need to incorporate multiple power converters. This increased level of integration has great potential to reduce hardware costs and raise competitiveness of hybrid energy storage solutions while posing research challenges for the design, real-time control, and implementation of the HBMS.

Fig. 1 Block diagram of the hybrid battery management system and the RL-based controller (blocks shown: RL training algorithm, agent with actions 1 to n and states, supercapacitor pack, power converters 1 to n, battery cells 1 to n, load, power re-scaling, and drivetrain model). Note: each j-th action is equal to the converter reference current $i^*_{b,j}$

The HBMS is capable of simultaneously equalizing battery capacity and temperature while enabling hybridization with additional storage systems, such as supercapacitors. Despite these attractive features, the HBMS poses numerous
control issues, such as coordinating a large number of power converters, enforcing actuation and safety constraints, and making trade-offs between multiple technical and economic objectives. Model-based controllers, such as [7, 8], are one possible approach to tackle these issues. These controllers rely on mathematical models that approximate and predict the behavior of the plant (energy storage and power conversion in the HBMS). However, these models, usually obtained from physical laws, are not always easy to derive or parameterize. For example, batteries depend on complex chemical reactions, requiring involved partial differential equations or complicated approximations via electric-equivalent circuits [9], while power converters have switching and nonlinear behavior [10]. Constructing and parameterizing these representations is time-consuming, is subject to uncertainties and modeling mismatches, and often needs engineering insight to find a good balance between complexity and accuracy. Additionally, the real-time deployment of model-based controllers needs, in some cases, significant computational effort [11], especially if optimization-based approaches are used, which presents hurdles for execution in embedded systems. Reinforcement learning (RL) offers an alternative design route to tackle some of these hurdles. Similarly to optimal control, the RL algorithm tries to optimize a policy with regard to a predefined reward function, which encodes the control goal. In model-free RL, this optimization is done solely based on the states, actions, and rewards observed during the repeated interaction with the environment, called training. One major advantage of RL is that simulation models (i.e., synthetic data) or even the real plant can be used for the training process, avoiding the need for a controller synthesis model.
By using offline training against a simulation model, the computational effort is shifted to the design stage, where significantly higher computational power is available. The control policy obtained after training, usually a rather small multilayer neural network, can then be evaluated efficiently after deployment on the target platform. Still, there are some limitations to RL [12, 13, 14]: (i) the offline training might require a significant amount of time to achieve the desired goals; (ii) RL algorithms are sensitive to the chosen parameters and hyperparameters, which can be difficult to identify; (iii) the RL algorithm tends to produce non-smooth actions unless this is explicitly penalized in the training process; (iv) the successful deployment of RL may heavily depend on the difference between the training model and the real system. RL has been applied to a wide variety of problems, ranging from self-driving cars to finance and healthcare [14]. More recently, it has been considered for energy storage systems. For example, [15] used RL to search for battery electrolyte compositions that can reduce electric conductivity. Energy management of multiple energy storage systems (e.g., combining batteries, fuel cells, and/or supercapacitors) is another task that plays to RL’s strengths. In this case, RL is well suited to handle the stochastic uncertainties associated with future driving cycle information and to reduce computational effort when compared to optimal receding-horizon control approaches [16, 17]. Energy storage arbitrage in smart grids is another emerging problem tackled by RL. For example, [18] developed Q-learning-based algorithms to decide when to buy energy from the grid, accumulate it in the battery, and resell it to the
grid while maximizing profit. In comparison with model-based solutions, these RL algorithms were able to increase profit margins of grid operators by more than 50%. RL is also being applied to derive control strategies for fast battery charging. For instance, [19] investigated the use of deep deterministic policy gradient (DDPG) algorithms to reduce the charging times of batteries in the presence of voltage and temperature constraints and parameter variations. One common element in these previous works is the treatment of the battery at pack level, i.e., capacity and temperature variations of the modules/cells are neglected, and the battery operation is approximated by a single virtual cell. To the best of the authors’ knowledge, the application of RL to control batteries at module or cell level has received less attention in the literature to date (particularly in HBMS), which represents a research gap that this work addresses. The main contribution of this work consists in evaluating the potential of RL algorithms to control HBMS. To support this investigation, a numerically efficient simulation model of the HBMS is developed in the Modelica language. Particular attention is dedicated to reducing the numerical complexity of the HBMS model, which is instrumental in decreasing the training times of the RL algorithm, one of the main hurdles when applying this method in practice. This efficient simulation model is then used to train a control policy for the HBMS based on a soft actor-critic (SAC) algorithm [20, 21]. The application of SAC algorithms brings two advantages: (i) SAC is able to handle continuous control actions and feedback states; (ii) SAC offers sample-efficient learning, thanks to its ability to automatically adapt the exploration of the control policy during training, often requiring less training time than other RL algorithms previously employed in the control of energy storage systems, such as Q-learning [18] and DDPG [19].
2 HBMS Modeling

As depicted in Fig. 1, the HBMS is composed of a battery pack, a supercapacitor (SC) pack, power conversion, and the load, which emulates the power consumption of the electric vehicle. The battery pack contains n single-cell modules connected in series. Each cell is linked with the primary side of a bidirectional DC-DC power converter, while the converters’ secondary sides are connected in parallel with the SCs. The power converters enable the cell-to-cell and the cell-to-SC transfer of energy. The goal of this work is to design an RL-based control algorithm that uses the power converters to equalize state of charge and temperature while reducing current stress in the cells. To support the design of the RL algorithm, this section presents the modeling environment for the HBMS. This modeling is carried out in Modelica, an object-oriented, acausal, and open-source language [22]. Modelica allows different physical domains, such as electrical and thermal, to be integrated in the HBMS model. Its object-oriented features also enable the automated creation of multiple instances of battery cells and power converters. As a result, HBMS models with
different dimensions (ranging from a few cells to hundreds of cells) can be created with reduced coding effort. Additionally, Modelica’s HBMS model can be exported through the functional mock-up interface (FMI) standard [22] and combined with other simulation environments, such as Python [21], for training RL agents.
2.1 Power Conversion

This subsection presents the principle of operation and modeling of the power converter, with particular emphasis on the efficient computation of energy losses.
Dual Half-Bridge Converter

The power balancing hardware relies on a dual half-bridge (DHB) configuration. In addition to galvanic isolation, the DHB offers zero-voltage switching, bidirectional power flow, and a reduced number of switches when compared to dual full-bridge configurations [23, 24]. As depicted in Fig. 2, the DHB interfaces with the battery cell (vb) on the primary side and with the SC pack (vsc) on the secondary side. The DHB is composed of four main components: two input inductors (Lb, Lsc), two half bridges (S1, S2 and S3, S4), auxiliary capacitors (C1, . . . , C4), and a high-frequency transformer. The two half bridges, with their four switches, regulate the power flow between the primary and secondary sides of the converter. This power is transferred via the high-frequency transformer and its leakage inductor (Ls) [25]. Through pulse-width modulation of the DHB’s switches, two square voltage waveforms, shifted by φ [rad] and with switching frequency fsw [1/s], are applied to the transformer’s terminals (vab, vcd in Fig. 2). The resulting switching scheme is discussed in more detail at the end of this section. The power extracted from the battery cell is given by [26]:
Fig. 2 Schematic configuration of the dual half bridge (battery side: vb, ib, Lb, S1, S2, C1, C2; transformer with leakage inductor Ls, current is, and terminal voltages vab, vcd; supercapacitor side: S3, S4, C3, C4, Lsc, isc, vsc)
Fig. 3 Modelica model of the dual half-bridge converter (top: dual half bridge with high-frequency transformer; bottom: half-bridge model)
$$P(\varphi) = v_b\, i_b = \frac{\varphi\,(\pi - \varphi)}{\pi} \cdot \frac{v_b^2}{2\pi f_{sw} L_s} \tag{1}$$
which reveals that the maximum power flow is achieved when the phase shift φ reaches π/2. To further facilitate the control of power, an inner current loop is included in the DHB. This loop is based on a proportional-integral (PI) control law and regulates the input current (ib) via manipulation of the phase shift (φ). The interested reader is referred to [26] for additional details. A detailed Modelica model of the DHB is developed, incorporating energy losses. In Fig. 3 (top), the dual half-bridge converter with a high-frequency transformer is shown in a typical Modelica diagram layer. All components are described in an acausal representation with flow and potential variables (i.e., current and voltage in the case of electric systems, blue components). The red interfaces capture the thermal flow between the components, and the magenta indicates Boolean control signals of the switches. In Fig. 3 (bottom), the sub-model of the half bridge, which is identical on both sides of the DHB, is shown. The half-bridge model implements a DC-to-AC voltage conversion and contains copper and switching losses, as well as parasitic elements, such as the inductors’ resistances. Each switch is implemented through a MOSFET and an antiparallel freewheeling diode, based on the Infineon IPB009N03L G datasheet [27]. The control strategy must prevent the Boolean signals fire_p and fire_n from being true at the same time to avoid short circuits. Due to Modelica’s object-oriented design paradigm, the DHB can be constructed
Fig. 4 Switching signals (S1,fire, S3,fire), voltages (vab, vcd), currents (is, isc, ib), and power losses (Ploss,C) of the DHB over time, obtained with a phase shift of φ = 0.3 · π/2
using two instances of the half bridge (cf. Fig. 3 (top)). The values of the DHB parameters (Lb, Lsc, C1, . . . , C4, etc.) can be found in [26], Table 5.3. Figure 4 shows the simulation results of the DHB when operating with a constant phase shift of φ = 0.3 · π/2 and voltages vb = 5.4 V, vSC = 4.5 V. The first plot shows the two MOSFET fire signals. These signals are shifted by φ and generate square voltages (vab, vcd) in the half bridges (second plot). The square voltages are applied to the transformer, leading to a large variation of the current (is) in the transformer’s leakage inductor (third plot). The fourth plot shows the currents in the battery cell and SCs, which are filtered by the DHB’s inductors. Finally, the last plot illustrates the power losses, which are discussed in detail in the next subsection.
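The phase-shift power relation (1) can be checked numerically with a short sketch. The values of the cell voltage, switching frequency, and leakage inductance below are illustrative assumptions, not the parameters of [26]; the function name `dhb_power` is likewise hypothetical.

```python
import math

def dhb_power(phi, v_b, f_sw, l_s):
    """Eq. (1): power extracted from the cell for a phase shift phi [rad]."""
    return phi * (math.pi - phi) / math.pi * v_b ** 2 / (2.0 * math.pi * f_sw * l_s)

# Illustrative values only (assumptions, not taken from the chapter)
V_B = 3.7     # cell voltage [V]
F_SW = 20e3   # switching frequency [Hz]
L_S = 1e-6    # leakage inductance [H]

# Sweep the phase shift from 0 to pi and locate the maximum
phis = [k * math.pi / 200 for k in range(201)]
powers = [dhb_power(p, V_B, F_SW, L_S) for p in phis]
phi_best = phis[powers.index(max(powers))]
p_max = dhb_power(math.pi / 2, V_B, F_SW, L_S)

print(f"phase shift at maximum power: {phi_best:.4f} rad")
print(f"maximum power: {p_max:.2f} W")
```

Under these assumptions the sweep peaks at φ = π/2, where (1) reduces to vb² / (8 fsw Ls).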
Energy Losses in the MOSFETs

This subsection models the switching (Pon, Poff) and conduction (ohmic) losses (PΩ) of the MOSFETs. As a starting point, the ideal switch¹ with closed resistance (RD) and opened conductance losses from the Modelica Standard Library [22] is considered. This model was extended with additional elements to capture the switch-on and switch-off losses. To better understand how these losses are modeled, a single MOSFET is considered (Fig. 5).
1 Modelica.Electrical.Analog.Interfaces.IdealSemiconductor
Fig. 5 Single MOSFET with diode and measurements of VD and ID
During the switch-on phase, the current ID of the MOSFET/diode rises linearly, while the blocking voltage VD falls close to zero, leading to a triangular-shaped power loss (see Fig. 6 and [28] for details). A similar pattern also appears during the switch-off phase. Since the rise (ton) and fall (toff) times are usually very small (order of magnitude of nanoseconds) [27], the numerical simulator needs small integration steps to accurately represent the “triangular” power losses. This causes a significant slowdown of the numerical simulation, which is further aggravated by the high number of events generated by the switching frequency of the MOSFETs (usually in the order of kHz). To improve numerical efficiency, a more pragmatic approach for modeling the switching losses is developed. The energy lost during switching is assumed to be continuously dissipated in the “on” and “off” state, respectively (see Fig. 6). This equivalent power loss depends on the switching frequency fsw and has a rectangular shape with duty cycle D. It is defined as:

$$P_{on} = \begin{cases} \dfrac{V_{D,pre} \cdot I_{D,post} \cdot t_{on}}{2} \cdot \dfrac{f_{sw}}{D}, & S_{j,fire} = \text{true} \\ 0, & S_{j,fire} = \text{false} \end{cases} \tag{2}$$

$$P_{off} = \begin{cases} 0, & S_{j,fire} = \text{true} \\ \dfrac{V_{D,post} \cdot I_{D,pre} \cdot t_{off}}{2} \cdot \dfrac{f_{sw}}{1 - D}, & S_{j,fire} = \text{false} \end{cases} \tag{3}$$
The subscripts (·)pre and (·)post denote the point in time before (pre) or after (post) the switching event when the value is recorded (see Fig. 6). Although the time-domain representations of the “triangular” and the “rectangular” power losses differ, the accumulated energy losses of these two models are the same. In other words, the area under the “rectangular” power losses is equal to the area under the “triangular” power losses (see the green and red areas represented in Fig. 6).
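The energy equivalence between the two loss shapes can be verified with a minimal sketch. It assumes that the “triangular” pulse has a peak equal to the product of blocking voltage and on-state current and a base equal to the switching time; all numerical values are illustrative and not taken from the datasheet [27].

```python
def rect_on_power(v_pre, i_post, t_on, f_sw, duty):
    """Eq. (2): equivalent loss dissipated while the switch is on."""
    return v_pre * i_post * t_on * f_sw / (2.0 * duty)

def rect_off_power(v_post, i_pre, t_off, f_sw, duty):
    """Eq. (3): equivalent loss dissipated while the switch is off."""
    return v_post * i_pre * t_off * f_sw / (2.0 * (1.0 - duty))

def triangular_energy(v, i, t_sw, steps=10_000):
    """Energy of a triangular power pulse (peak v*i, base t_sw), midpoint rule."""
    dt = t_sw / steps
    energy = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        shape = 1.0 - abs(2.0 * t / t_sw - 1.0)  # rises from 0 to 1, falls back to 0
        energy += v * i * shape * dt
    return energy

# Illustrative operating point (assumptions)
V_BLOCK, I_ON = 4.0, 8.0      # blocking voltage [V], on-state current [A]
T_ON, T_OFF = 20e-9, 30e-9    # rise and fall times [s]
F_SW, DUTY = 20e3, 0.5        # switching frequency [Hz], duty cycle [-]

p_on = rect_on_power(V_BLOCK, I_ON, T_ON, F_SW, DUTY)
p_off = rect_off_power(V_BLOCK, I_ON, T_OFF, F_SW, DUTY)

# Energy per switching period: rectangle (power times on/off duration) vs. triangle
energy_rect_on = p_on * DUTY / F_SW
energy_rect_off = p_off * (1.0 - DUTY) / F_SW
energy_tri_on = triangular_energy(V_BLOCK, I_ON, T_ON)
energy_tri_off = triangular_energy(V_BLOCK, I_ON, T_OFF)

print(energy_rect_on, energy_tri_on)
print(energy_rect_off, energy_tri_off)
```

The rectangular pulse spreads the same energy over a much longer interval, which is why the solver no longer has to resolve nanosecond-scale transients.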
Fig. 6 Approximated MOSFET with switching and conduction (ohmic) losses (plotted over time: Ploss,M; Pon, Poff, and PΩ; qualitative switching losses; qualitative blocking voltage VD; qualitative current flow ID; MOSFET fire signal, with the pre and post sampling instants marked)
Due to the “rectangular” approximation, it is possible to avoid the very small integration time steps that would be necessary if the “triangular” power losses were modeled in detail. This brings an important practical benefit: the simulation time of the HBMS can be reduced. The overall losses in the MOSFET simulation model (Ploss,M) are therefore the sum of switching and conduction losses:

$$P_{loss,M} = P_{\Omega} + P_{on} + P_{off} \tag{4}$$

where PΩ captures the Joule losses dissipated in the MOSFET resistance (RD · ID²).
Quasi-Stationary Model of the DHB

Despite the above simplifications, the simulation of the DHB model still generates a large number of events. For example, for a test simulation run with constant voltage on the battery cell and SC sides and a total simulation time of 5.7 s, the Modelica model of the DHB still needs a calculation time of about 421 s and generates
Fig. 7 DHB losses Ploss,C [W] at vSC = 2.7 V (cyan edges) and 5.4 V (red edges). Note: the power losses for negative balancing currents (ib < 0) are symmetrical and are not shown in this plot
~1.3 · 10^6 events. This yields a poor real-time factor of 0.014 (DASSL² solver, Intel i7-8665U, 16 GB RAM, NVMe SSD). To further reduce the simulation time, a quasi-stationary DHB model is developed. This model captures the main time constants of the DHB and calculates the energy losses and actuation limits through multidimensional lookup tables. To parameterize these tables, the following operational range is defined: (i) voltage on the battery cell side vb ∈ [2.7 V, 4.5 V], (ii) voltage on the SC side vsc ∈ [2.7 V, 5.4 V], and (iii) phase shift φ ∈ [0, π/2]. By means of parallelized parameter-sweep simulations on an equidistant grid, all relevant data at these operating points could be generated. In Fig. 7, the resulting power losses for different operating points are shown; these data allow a smooth multidimensional lookup table to be built for the DHB power losses, Ploss,C = f(ib, vb, vSC), which depends on the battery cell and SC voltages and the input current (ib). Note that Ploss,C includes not only the MOSFET losses but also power losses in the other components of the DHB (such as inductor and transformer). The DHB maximum power and input current (ib) are subject to physical limits. For example, the (theoretical) maximum power that the DHB can transfer is obtained when the phase shift reaches φmax = π/2, as expressed in (1). In practice, the maximum power is also affected by the DHB power losses. Since these losses are difficult to define analytically, a lookup table is constructed for determining the maximum transferable power and maximum input current ib,max, using the same approach as described in the previous paragraph.
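The sweep-then-interpolate idea behind the lookup tables can be sketched as follows. This is a simplified illustration, not the authors’ Modelica TableND implementation: the grid is swept directly over (ib, vb, vSC), and `placeholder_loss` is a hypothetical stand-in for one detailed switching simulation.

```python
import bisect
import itertools

def linspace(lo, hi, n):
    """Equidistant grid with n nodes between lo and hi."""
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]

def sweep(loss_fn, axes):
    """Evaluate loss_fn on every node of the grid (the parameter sweep)."""
    index_ranges = [range(len(ax)) for ax in axes]
    return {idx: loss_fn(*(ax[i] for ax, i in zip(axes, idx)))
            for idx in itertools.product(*index_ranges)}

def lookup(axes, table, point):
    """Multilinear interpolation; queries outside the grid are clamped."""
    base, frac = [], []
    for ax, x in zip(axes, point):
        x = min(max(x, ax[0]), ax[-1])
        i = min(bisect.bisect_right(ax, x) - 1, len(ax) - 2)
        base.append(i)
        frac.append((x - ax[i]) / (ax[i + 1] - ax[i]))
    value = 0.0
    # Weighted sum over the 2^N corners of the enclosing grid cell
    for corner in itertools.product((0, 1), repeat=len(axes)):
        weight = 1.0
        for c, f in zip(corner, frac):
            weight *= f if c else 1.0 - f
        value += weight * table[tuple(i + c for i, c in zip(base, corner))]
    return value

def placeholder_loss(i_b, v_b, v_sc):
    # Hypothetical loss surface standing in for a detailed DHB simulation
    return 0.05 + 0.02 * i_b ** 2 + 0.01 * abs(v_sc - v_b)

# Axes: input current [A], cell voltage [V], SC voltage [V]
axes = [linspace(0.0, 11.8, 13), linspace(2.7, 4.5, 7), linspace(2.7, 5.4, 10)]
table = sweep(placeholder_loss, axes)
print(lookup(axes, table, (5.0, 3.7, 4.5)))
```

Once the table is filled offline, each query costs only a handful of multiplications, which is what makes the quasi-stationary model fast during training.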
2 DASSL is a variable-order, variable-step numerical integration method that is able to handle the occurrence of events during switching
Fig. 8 Quasi-stationary dual half-bridge model in a mixed diagram and equation representation (inputs i_b_ref, i_b_max, i_b_lim; second-order filter with f = 65 Hz; TableND lookup for P_loss,C; current sources on the battery and SC sides). Note: i_b_ref = ib*
The dynamic response of the DHB is dominated by the input current controller. The goal of this controller is to manipulate the phase shift φ such that the input current ib accurately follows the reference ib∗ (generated by the RL agent). The closed-loop response of this controller – which is designed using linear methods [29] – is approximated here by a second-order filter (Gi (s)) with critical damping and cutoff frequency fcut = 65 Hz. The resulting quasi-stationary Modelica model of the DHB is shown in Fig. 8. Both sides of the DHB converter are modeled through current sources. The current source on the battery cell side is controlled through the second-order filter, i.e., ib = Gi (s)ib∗ . The current source on the SC side is controlled through a power-balance constraint: isc = (vb ib − Ploss, C )/vsc . Finally, the power losses Ploss, C are computed from a lookup table and transferred to a Modelica heat port. The quasi-stationary DHB model provides a real-time factor of ~44.5, which is approximately 3200-fold larger compared to the switching DHB model presented in the previous subsection. Figure 9 depicts the power losses obtained with the two modeling approaches using balancing currents determined in [10]. The percentage goodness of fit (cf. [30] – C.5) of the power losses is about 84%, which is an acceptable value considering the substantial improvement of the real-time simulation factor obtained with the quasi-stationary DHB model. Because of these features, the quasi-stationary DHB model was used in the training of the RL agent.
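The structure of the quasi-stationary model can be sketched in a few lines. The sketch assumes that the 65 Hz cutoff frequency is used as the natural frequency of a critically damped second-order filter, uses a simple explicit Euler integration instead of a Modelica solver, and replaces the lookup table by a hypothetical `placeholder_loss` function.

```python
import math

F_CUT = 65.0                    # cutoff frequency from the text [Hz]
OMEGA = 2.0 * math.pi * F_CUT   # assumption: taken as the natural frequency [rad/s]

def placeholder_loss(i_b, v_b, v_sc):
    # Hypothetical stand-in for the lookup table P_loss,C = f(i_b, v_b, v_SC)
    return 0.02 * i_b ** 2

def simulate_dhb(i_ref, v_b, v_sc, loss_fn, t_end=0.05, dt=1e-5):
    """Step response of the quasi-stationary DHB model (explicit Euler)."""
    i_b, di_b, t = 0.0, 0.0, 0.0
    history = []
    for _ in range(int(round(t_end / dt))):
        # G_i(s) = OMEGA^2 / (s + OMEGA)^2, i.e., critical damping
        ddi_b = OMEGA ** 2 * (i_ref - i_b) - 2.0 * OMEGA * di_b
        i_b += dt * di_b
        di_b += dt * ddi_b
        t += dt
        p_loss = loss_fn(i_b, v_b, v_sc)
        # Power-balance constraint on the SC side
        i_sc = (v_b * i_b - p_loss) / v_sc
        history.append((t, i_b, i_sc, p_loss))
    return history

history = simulate_dhb(i_ref=5.0, v_b=3.7, v_sc=4.5, loss_fn=placeholder_loss)
t_end, i_b_end, i_sc_end, p_loss_end = history[-1]
print(f"t = {t_end:.3f} s: i_b = {i_b_end:.3f} A, i_sc = {i_sc_end:.3f} A, "
      f"P_loss = {p_loss_end:.3f} W")
```

Because no switching events are generated, the step size is limited only by the filter bandwidth rather than by the switching period.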
Fig. 9 Validation of the quasi-stationary DHB model (power losses Ploss,C [W] of the switching model and the quasi-stationary model, and balancing current ib [A], over time [s])
2.2 Battery Cell

This subsection discusses the battery cell model and its parameterization.
Battery Cell Modeling

The battery cell model relies on the representation proposed in [31, 32]. This model is modified to better match the lithium nickel manganese cobalt oxide (NMC) cell (Li-Tec HEI40, 40 Ah [33]) that is used in the authors’ group’s experimental vehicle, ROboMObil [34]. Figure 10 shows the Modelica implementation of the battery cell. Each cell is composed of an internal voltage source (vint), an equivalent series resistance (R), and a Coulomb counter for the state of charge (SoC). The internal cell voltage vint is calculated as a combination of the following terms (see Table 1):
• v0 captures linear temperature effects in the open-circuit voltage.
• vp,fs and vp,vs are the fixed- and variable-structure polarization voltages. The variable-structure voltage (vp,vs) has switching dynamics, which depend on the filtered current if and its sign. The filtered current if is calculated by applying a first-order filter with time constant τd to the cell current i.
• ve captures the exponential voltage variations when the cell approaches full charging conditions.
• vl1 is a linear voltage term, proportional to the battery’s actual discharge I(t) = I0 + ∫ i(t) dt, where I0 is the initial discharge.
• vl2 is an additional term that improves the fit of the voltage curve to the NMC-type cells used in this work. This term provides a smooth transition to a second linear voltage term, which is activated when the discharge reaches the value I1. This transition is implemented through a logistic sigmoid function f(x) = logsig(x) = 1/(1 + exp(−x)), and its smoothness is adjusted via the parameter Inorm.
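The smooth activation of the second linear voltage term can be illustrated with a short sketch of the vl2 term. The parameter values are the I1, Inorm, and Cl,2 entries of Table 2; converting Cl,2 from V/As to V/Ah so that all quantities share one unit is an assumption made here for illustration.

```python
import math

def logsig(x):
    """Logistic sigmoid used for the smooth transition."""
    return 1.0 / (1.0 + math.exp(-x))

def linear_zone_2(discharge, c_l2, i_1, i_norm):
    """Second linear voltage term, blended in around discharge = I_1."""
    delta = discharge - i_1
    return -c_l2 * delta * logsig(delta / i_norm)

I_1 = 15.66              # activation point [Ah]
I_NORM = 0.86            # smoothness parameter [Ah]
C_L2 = -5.44e-6 * 3600   # slope, converted from V/As to V/Ah (assumption)

for discharge in (0.0, 10.0, 15.66, 20.0, 40.0):
    v_l2 = linear_zone_2(discharge, C_L2, I_1, I_NORM)
    print(f"I = {discharge:5.2f} Ah -> v_l2 = {v_l2:.6f} V")
```

Well below I1 the sigmoid suppresses the term, at I1 it vanishes exactly, and well above I1 it approaches the pure linear term −Cl,2 (I − I1).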
Fig. 10 Implementation of the modified temperature-dependent generic battery model for NMC li-ion cells (simplification) with age-defining parameters αage and βage
The equivalent series resistance R (see Eq. (14) in Table 1) has a nominal value R |Ta,ref,1 , which is obtained at a reference temperature Ta, ref, 1 . This nominal value can be modified due to battery aging factors (α age ) and Arrhenius-based temperature effects [9]. To compute the cell’s temperature, a heat exchange model ([30], Appendix C.2) is employed, which assumes that the heat transfer is dominated by Joule losses in the series resistance, Ploss, B = R(T(t)) · i2 (t). Finally, the cell’s capacity is described by the variable Q (see (15) in Table 1), whose value can be reduced by the aging factor β age . The description of all the parameters employed in the battery cell model is presented in Table 2. The computational efficiency provided by this battery model is high. For example, a dynamic single-cell discharge simulation with a length of 1280 s, performed with a load as used during validation (cf. Sect. 4.3), only takes about 0.5 s, yielding a real-time factor of 2560. Because of this fast simulation time, no simplifications were performed in the battery model.
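Table 1 Battery cell model equations

| Description | Equation | Number |
|---|---|---|
| Internal cell’s voltage | $v_{int} = v_0 + v_{p,fs} + v_{p,vs} + v_e + v_{l1} + v_{l2}$ | (5) |
| Thermodynamics voltage | $v_0 = v_0\vert_{T_{a,ref,1}} + \frac{\partial v}{\partial T} \cdot \left(T(t) - T_{a,ref,1}\right)$ | (6) |
| Fixed-structure polarization voltage | $v_{p,fs} = -K(T(t)) \cdot I(t) \cdot \frac{Q}{Q - I(t)}$ | (7) |
| Variable-structure polarization voltage | $v_{p,vs} = -\frac{1}{3600} \cdot K(T(t)) \cdot i_f(t) \cdot p_r(Q, I(t), i_f(t))$ | (8) |
| Exponential zone voltage | $v_e = A \cdot \exp(-B \cdot I(t))$ | (9) |
| Linear zone 1 voltage | $v_{l1} = (-1) \cdot C_{l,1} \cdot I(t)$ | (10) |
| Linear zone 2 voltage | $v_{l2} = -C_{l,2} \left(I(t) - I_1\right) \cdot \mathrm{logsig}\left(\frac{I(t) - I_1}{I_{norm}}\right)$ | (11) |
| Polarization constant | $K(T(t)) = K\vert_{T_{a,ref,1}} \cdot \exp\left(\alpha_K \cdot \left(\frac{1}{T(t)} - \frac{1}{T_{a,ref,1}}\right)\right)$ | (12) |
| Variable-structure polarization ratio | $p_r = \begin{cases} \frac{Q}{\xi_P Q - I(t)}, & \text{if } i_f(t) \ge 0 \\ \frac{Q}{\zeta_P Q + I(t)}, & \text{else} \end{cases}$ | (13) |
| Resistance | $R = \left(1 + \alpha_{age}\right) \cdot R\vert_{T_{a,ref,1}} \cdot \exp\left(\beta_R \cdot \left(\frac{1}{T(t)} - \frac{1}{T_{a,ref,1}}\right)\right)$ | (14) |
| Capacity | $Q = \left(1 - \beta_{age}\right) \cdot Q\vert_{T_{a,ref,1}}$ | (15) |
| State of charge | $q = 1 - \frac{I(t)}{Q}$ | (16) |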
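The resistance, capacity, and state-of-charge relations, Eqs. (14) to (16) of Table 1, together with the Joule loss Ploss,B = R · i², can be evaluated with a minimal sketch. The numerical values are the Table 2 entries; the unit of the nominal resistance is not stated there and is assumed to be ohm, and the function names are hypothetical.

```python
import math

T_REF_1 = 24.85 + 273.15   # reference temperature T_a,ref,1 [K]
R_REF = 2.14e-4            # nominal resistance at T_a,ref,1 (unit assumed: ohm)
BETA_R = 3.24e3            # Arrhenius coefficient beta_R [K]
Q_REF = 44.99              # nominal capacity at T_a,ref,1 [Ah]

def resistance(temp_k, alpha_age=0.0):
    """Eq. (14): series resistance with aging and Arrhenius temperature effects."""
    return (1.0 + alpha_age) * R_REF * math.exp(BETA_R * (1.0 / temp_k - 1.0 / T_REF_1))

def capacity(beta_age=0.0):
    """Eq. (15): cell capacity reduced by the aging factor beta_age."""
    return (1.0 - beta_age) * Q_REF

def state_of_charge(discharge_ah, beta_age=0.0):
    """Eq. (16): state of charge from the accumulated discharge I(t) [Ah]."""
    return 1.0 - discharge_ah / capacity(beta_age)

def joule_loss(current_a, temp_k, alpha_age=0.0):
    """Heat generated in the series resistance, P_loss,B = R * i^2."""
    return resistance(temp_k, alpha_age) * current_a ** 2

r_cold = resistance(273.15)                  # 0 degC, beginning of life
r_aged = resistance(T_REF_1, alpha_age=0.2)  # reference temperature, aged cell
print(f"R(0 degC) / R_ref = {r_cold / R_REF:.2f}")
print(f"SoC after 10 Ah discharge = {state_of_charge(10.0):.3f}")
print(f"Joule loss at 40 A, 25 degC = {joule_loss(40.0, 298.15):.3f} W")
```

The sketch shows the expected qualitative behavior: the resistance rises at low temperature and with aging, while the usable capacity shrinks with βage.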
Battery Cell Model Parametrization

The battery cell model was parameterized for NMC chemistry using information from the cell’s datasheet and experimental measurements [33]. The datasheet information made it possible to identify parameters such as the nominal cell voltage. The experimental data was used to identify the remaining parameters of the battery model through the following three-step optimization approach:
1. Experimental measurements from dynamic discharges (0 °C and 25 °C) were used to estimate the temperature-related parameters (R|Ta,ref,1, βR). The squared temperature error was used as cost function and minimized through the simplex optimization algorithm [35].
2. Experimental measurements from the constant-current charge/discharge references (0 °C and 25 °C) were used to estimate the parameters that affect the internal voltage (such as ∂v/∂T, K|Ta,ref,1, αK, Cl,1, Q|Ta,ref,1, Cl,2, I1, and Inorm). The squared voltage error was used as cost function and minimized through a genetic algorithm [35].
3. The parameters obtained in step 2 were then used as initial values for a second optimization run using the simplex algorithm, enabling the further refinement of the parameter estimate.
Fig. 11 Optimization results of the temperature-dependent battery cell model for NMC li-ion cells for continuous discharge at 24.85 °C ambient temperature (a) and dynamic discharge at 27.78 °C ambient temperature (b). Each panel compares the experimental data with the generic battery model in terms of current [A], temperature [°C], and voltage [V] over time [s]
Note that, during the parametrization, the cell is assumed to be at the beginning of life (BOL): α age = β age = 0. Figure 11 compares the battery cell model output and experimental measurements. Overall, one can find a small error between the measured terminal voltage and the cell model, for both constant and dynamic current excitation. The dynamic excitation test presents a less smooth voltage behavior than the constant current test, which can be explained by the faster and higher current excitation employed in this test (cf. Figure 11 b). It is also worth noting that the cell temperature did not change much during the tests. This can be explained by the large heat capacity of the cells. All optimized parameters of the battery cell model can be found in Table 2.
2.3 Supercapacitor

As the last component of the HBMS, the modeling of the supercapacitors is discussed. Two types of models were considered. The first (Fig. 12, left) is based on a simple RC equivalent circuit composed of an ideal capacitance, an equivalent series resistance, and a self-discharge parallel resistance. The second (Fig. 12, right) considers a more complex “two branches” equivalent circuit, which was proposed and experimentally validated in [36]; it is composed of: (i) one RC branch, where the capacitance has a linear voltage dependence, (ii) one RC branch with constant
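Table 2 Optimized battery cell parameters

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| $R\vert_{T_{a,ref,1}}$ | 2.14 · 10^−4 | $Q\vert_{T_{a,ref,1}}$ | 44.99 Ah |
| $\beta_R$ | 3.24 · 10^3 K | $(\xi_P, \zeta_P)$* | (1, 0.1) |
| $\partial v / \partial T$ | 7.25 · 10^−4 V/K | $A$* | 0 |
| $K\vert_{T_{a,ref,1}}$ | 2.50 · 10^−4 V/Ah | $B$* | 0 |
| $\alpha_K$ | 0 K | $v_0\vert_{T_{a,ref,1}}$* | 4.2 V |
| $C_{l,1}$ | 7.74 · 10^−6 V/As | $T_{a,ref,1}$* | 24.85 °C |
| $C_{l,2}$ | −5.44 · 10^−6 V/As | $T_{a,ref,2}$* | −1.32 °C |
| $I_1$ | 15.66 Ah | $\tau_d$* | 30 s |
| $I_{norm}$ | 0.86 Ah | | |

*Parameter value extracted from datasheet or prefixed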
Fig. 12 Modeling of the SCs with one (left, simple RC equivalent circuit) and two branches (right, “two branches” equivalent circuit)
parameters, and (iii) a parallel self-discharge resistance. In both models, the SCs’ energy loss Ploss, SC is captured by heat dissipated in the resistances. Figure 13 provides a comparison of the SC terminal voltage for both models during a charge/discharge test. The “two-branch” model exhibits a nonlinear behavior during the charging and discharging phases. Nevertheless, the voltage mismatch between both models is relatively small. Additionally, the simple RC model is about ten times faster to simulate than the more complex “two-branch” model. Since the simple RC model provides good accuracy and reduced simulation times, this model is deployed during the training of the RL agent.
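The simple RC model can be sketched as a single-state simulation. The topology below (self-discharge resistance in parallel with the ideal capacitance, series resistance at the terminal) and the explicit Euler integration are assumptions made for illustration; the default parameter values are placeholders and not the identified pack parameters.

```python
def simulate_sc(current_fn, c=430.0, r_s=4.8e-3, r_p=2500.0,
                v0=0.0, t_end=60.0, dt=1e-2):
    """Simple RC supercapacitor model; positive current charges the SC.

    Returns the final capacitor voltage and a trace of
    (time [s], terminal voltage [V], dissipated power [W]).
    """
    v_c, t = v0, 0.0
    trace = []
    for _ in range(int(round(t_end / dt))):
        i = current_fn(t)
        # Heat dissipated in the series and self-discharge resistances
        p_loss = r_s * i ** 2 + v_c ** 2 / r_p
        trace.append((t, v_c + r_s * i, p_loss))
        # Capacitor voltage: charging current minus self-discharge current
        v_c += dt * (i - v_c / r_p) / c
        t += dt
    return v_c, trace

# Constant 10 A charge for 60 s starting from an empty capacitor
v_final, trace = simulate_sc(lambda t: 10.0)
print(f"capacitor voltage after 60 s: {v_final:.4f} V")
print(f"initial terminal voltage: {trace[0][1]:.3f} V, initial loss: {trace[0][2]:.2f} W")
```

With a single state and no voltage-dependent capacitance, this model is cheap to integrate, which is consistent with the speed advantage reported above for the simple RC model.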
2.4 Load, Drivetrain Model, and Integration of Components

In what follows, a small-scale HBMS is considered, composed of the following:
Fig. 13 Comparison of the modeling with one or two branches (parameters extracted from [36]): SC terminal voltage [V] of the RC and “two branches” models, and current isc [A], over time [s]
• A small battery pack with three NMC cells [33] in series (3s1p); this represents a 1-to-30 reduction factor with respect to the ROboMObil [34], the reference vehicle employed in this study, which contains a 90s1p full-scale battery pack.
• An SC pack (2s4p), with a hybridization ratio γ of 2.2% (i.e., the capacity ratio of the SCs and the battery pack); this is in line with the sizing pattern adopted in [7].
• DHB converters with a maximum current of 11.8 A (cf. Fig. 8); the parameters of the converter were extracted from [26].

The load of the HBMS is connected to a Modelica drivetrain model of the ROboMObil (from the Modelica PowerTrain library [37]) via a power rescaling block. The drivetrain model emulates the power consumption Pload,0 and the respective drivetrain losses (e.g., inverter losses) of the vehicle for a given driving cycle. To ensure that the current requested from the battery cells (of the small-scale HBMS) reaches average values close to 1 C,³ a 1-to-15 power rescaling factor γps is used: Pload,1 = γps · Pload,0. Figure 14 illustrates the velocity and load power requested for the two driving cycles considered in this work.
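The effect of the power rescaling on the cell current can be estimated with a back-of-the-envelope sketch. It reads the “1-to-15” factor as γps = 1/15, assumes a nominal cell voltage of 3.6 V (a hypothetical value), uses the 40 Ah nominal capacity of the cell, and ignores the contribution of the SCs and converters.

```python
GAMMA_PS = 1.0 / 15.0   # assumption: "1-to-15" read as a factor of 1/15
N_SERIES = 3            # cells in series in the small-scale pack (3s1p)
V_CELL_NOM = 3.6        # hypothetical nominal cell voltage [V]
Q_CELL_NOM = 40.0       # nominal cell capacity [Ah]

def rescale_power(p_load_0):
    """P_load,1 = gamma_ps * P_load,0 [W]."""
    return GAMMA_PS * p_load_0

def c_rate(p_load_1):
    """Approximate C-rate of the series string for a given load power."""
    pack_current = p_load_1 / (N_SERIES * V_CELL_NOM)   # [A]
    return pack_current / Q_CELL_NOM                    # [1/h]

p_vehicle = 6480.0   # illustrative full-scale drivetrain power [W]
p_small = rescale_power(p_vehicle)
print(f"P_load,1 = {p_small:.1f} W -> about {c_rate(p_small):.2f} C")
```

Under these assumptions, a full-scale demand of roughly 6.5 kW maps to about 1 C in the three-cell pack.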
3 Reinforcement Learning

This section presents the principle of operation of the reinforcement learning algorithm selected in this work as well as its application to the HBMS.
3 The C-rate of 1 C describes the current necessary to discharge the battery’s nominal capacity during 1 h.
Fig. 14 Velocity and load power (Pload,0) of the driving cycles employed in this work: WLTC (a) is used for training, while Stuttgart (b) is used for validation
3.1 Principle of Operation

RL algorithms are developed assuming a stochastic environment, usually formulated as a Markov decision process (MDP). At time step k, the MDP receives an action $a_k \in A \subset \mathbb{R}^m$ and produces a state vector $s_k \in S \subset \mathbb{R}^n$, where A and S are the action and state spaces, respectively. The state evolves according to the state-transition probability $p(s_{k+1} \mid s_k, a_k)$, which denotes the probability of transitioning from state $s_k$ to state $s_{k+1}$ by taking the action $a_k$ [38]. After the transition to state $s_{k+1}$, the reward $r_{k+1}$ is assigned. This interaction, as depicted in Fig. 15, leads to a trajectory $s_k, a_k, r_{k+1}, s_{k+1}, a_{k+1}, r_{k+2}, \ldots$. A trajectory is called an episode if a terminal state is reached and a reset to initial conditions occurs. The sum of rewards of an episode is referred to as the return. The RL goal is to find an optimal policy $\pi^*(a_k \mid s_k)$ that maximizes the expected return J:

$$J = \sum_k \mathbb{E}_{(s_k, a_k) \sim \rho_\pi} \left[ r(s_k, a_k) \right] \tag{17}$$
with the expectation denoted by $\mathbb{E}$ and $\rho_\pi(s_k, a_k)$ being the state-action marginal of the trajectory distribution induced by the stochastic policy $\pi(a_k \mid s_k)$ (cf. [20]). While trying to optimize the policy (exploitation) based on the observed transition trajectories, a certain randomness during the training is necessary to discover the full state and action space (exploration). Several types of RL algorithms can be used to maximize J. In the soft actor-critic algorithm [20], which is adopted in this work,
Learning-Based Control for Hybrid Battery Management Systems
Fig. 15 Interaction of reinforcement learning agent with the environment [38]
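the standard RL objective (17) is augmented with an information-theoretical entropy term $\mathcal{H}(X)$ (cf. [39]):
$$\tilde{J} = \sum_k \mathbb{E}_{(s_k, a_k) \sim \rho_\pi}\left[ r(s_k, a_k) + \alpha \, \mathcal{H}\left(\pi(\cdot \mid s_k)\right) \right] \tag{18}$$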
where $\alpha$ represents the weight (known as the temperature parameter), which balances exploration and exploitation. Starting the training with a high value of $\alpha$ promotes higher levels of exploration and discovery. Afterward, as the agent learns the impact of its actions on the environment and the rewards, lower values of $\alpha$ can be deployed, shifting the focus toward exploitation [12]. Previous research has shown that the SAC algorithm offers sample-efficient learning and outperforms other widely used RL algorithms [12].
The SAC algorithm [12] is an off-policy actor-critic algorithm incorporating a replay buffer $\mathcal{D}$ and neural networks to approximate the policy (actor) as well as two action-value functions (critic). Two action-value functions are used to counteract the positive bias in the policy improvement step, which can deteriorate performance (cf. [40, 41]). Moreover, two additional target neural networks are used to stabilize training. In the SAC algorithm (cf. Algorithm 1), the stochastic policy is usually parametrized as a Gaussian distribution with mean and variance given by the actor neural network. During training, the SAC alternates between performing environment steps and gradient steps. To perform an environment step, an action $a_k$ is sampled from the Gaussian policy and applied to the environment. Together with the returned reward $r_k$ and next state $s_{k+1}$, the sampled action is then stored inside the replay buffer. During the gradient step phase, the parameters of the policy, the two action-value functions, and the temperature parameter are updated with stochastic gradient descent steps, which are computed based on uniformly sampled batches $\mathcal{B}$ of the
replay buffer $\mathcal{D}$. As a last step during the gradient step phase, the target networks' parameters are computed as an exponentially moving average of the parameters of the two action-value functions. The environment step phase and the gradient step phase are repeated until a certain number of iterations is reached [40, 41]. After training, the deterministic part of the policy (i.e., the policy neural network) is used for deployment.
Algorithm 1: Basic SAC principle of operation [12, 21]
1: Initialize policy parameters, parameters of action-value function networks, parameters of target-value function networks and create empty replay buffer $\mathcal{D}$
2: Retrieve initial state $s_0$ from environment
3: for each training time step $k$:
4:     Sample action from policy: $a_k \sim \pi(\cdot \mid s_k)$
5:     Sample transition from environment: $s_{k+1} \sim p(\cdot \mid s_k, a_k)$
6:     Add transition to replay buffer: $\mathcal{D} = \mathcal{D} \cup \{(s_k, a_k, r_k, s_{k+1})\}$
7:     if update condition* then:
8:         for number of gradient steps:
9:             Extract a batch of transitions $\mathcal{B}$ from the replay buffer $\mathcal{D}$
10:            With $\mathcal{B}$, compute policy gradient, action-value function gradients and temperature gradient
11:            Update the policy network, action-value networks and $\alpha$
12:            Update target-value networks
13:        end for
14:    end if
15:    if $s_{k+1}$ is terminal then:
16:        Reset environment, retrieve new initial state
17:    end if
18: end for
* the update condition can be, e.g., an update every $n_u \in \mathbb{N}$ steps
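The two bookkeeping elements of Algorithm 1, the replay buffer (steps 6 and 9) and the exponentially moving average of the target networks (step 12), can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the class and function names, the `done` flag stored with each transition, and the smoothing factor `tau = 0.005` are assumptions made for the example.

```python
import numpy as np

class ReplayBuffer:
    """Fixed-capacity FIFO buffer of transitions (s_k, a_k, r_k, s_{k+1}, done).

    Hypothetical helper for illustration; not taken from the chapter's toolchain.
    """

    def __init__(self, capacity, state_dim, action_dim, seed=0):
        self.capacity = capacity
        self.states = np.zeros((capacity, state_dim))
        self.actions = np.zeros((capacity, action_dim))
        self.rewards = np.zeros(capacity)
        self.next_states = np.zeros((capacity, state_dim))
        self.dones = np.zeros(capacity, dtype=bool)
        self.size = 0   # number of valid entries
        self.pos = 0    # next write position
        self.rng = np.random.default_rng(seed)

    def add(self, s, a, r, s_next, done):
        """Step 6 of Algorithm 1: store one transition, overwriting the oldest."""
        i = self.pos
        self.states[i], self.actions[i] = s, a
        self.rewards[i], self.next_states[i], self.dones[i] = r, s_next, done
        self.pos = (self.pos + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        """Step 9 of Algorithm 1: draw a uniformly sampled batch."""
        idx = self.rng.integers(0, self.size, size=batch_size)
        return (self.states[idx], self.actions[idx], self.rewards[idx],
                self.next_states[idx], self.dones[idx])


def soft_update(target_params, online_params, tau=0.005):
    """Step 12: target <- (1 - tau) * target + tau * online (tau is an assumed value)."""
    return [(1.0 - tau) * t + tau * p for t, p in zip(target_params, online_params)]
```

With a small `tau`, the target parameters follow the action-value networks slowly, which is the stabilizing effect referred to in the text.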
3.2 Application of Deep Reinforcement Learning to the HBMS
To apply RL to the HBMS, it is necessary to specify (i) the actions ($a_k$), (ii) the state vector ($s_k$), and (iii) the reward function ($r_k$) that contains the control objectives.
Actions
The RL actions $a_k$ consist of the reference currents $i^*_{b,k,j}$ for each $j$-th power converter:
$$a_k = \left[ i^*_{b,k,1}, \ldots, i^*_{b,k,n} \right]^T. \tag{19}$$
The actions are subject to saturation and rate limit constraints. As discussed in Sect. 2.1.3, the DHB saturates the maximum current $i^{max}_{b,k,j}(q_{SC,k}, q_{k,j})$ that can be extracted from each battery cell. This saturation is modelled here as:
$$a^{sat}_{k,j} = \begin{cases} a_{k,j}, & \text{if } |a_{k,j}| \le \eta_a \cdot a_{max,k} \\ \eta_a \cdot a_{max,k}, & \text{if } a_{k,j} > \eta_a \cdot a_{max,k} \\ -\eta_a \cdot a_{max,k}, & \text{if } a_{k,j} < -\eta_a \cdot a_{max,k} \end{cases}, \qquad a_{max,k} = \min_j \, i^{max}_{b,k,j}(q_{SC,k}, q_{k,j}) \tag{20}$$
where $a^{sat}_{k,j}$ is the saturated action for the $j$-th power converter, $a_{max,k}$ the worst-case current limit among all power converters, and $\eta_a = 0.9$ a safety margin. The action is rate limited to enforce smoothness of the RL actions. This limit is defined as:
$$\Delta a^{sat}_{k,j} = \left| a^{sat}_{k,j} - a^{sat}_{k-1,j} \right| \le \Delta a_{max} \cdot \Delta t \tag{21}$$
where $a^{sat}_{k-1,j}$ is the action applied in the previous time step, $\Delta a_{max} = 2.5$ A/s the current rate limit (defined by the control engineer), and $\Delta t = 1$ s the sample time.
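The two constraints in (20) and (21) can be illustrated with a short sketch. It assumes that the rate limit is enforced by clipping the change with respect to the previously applied action and that saturation is applied before rate limiting; the text does not state either detail, and the function names are hypothetical.

```python
import numpy as np

ETA_A = 0.9     # safety margin eta_a from Eq. (20)
DA_MAX = 2.5    # current rate limit in A/s from Eq. (21)
DT = 1.0        # sample time in s

def saturate(a, i_b_max):
    """Eq. (20): clip each action to +/- eta_a * a_max, with a_max the worst-case limit."""
    a_max = np.min(i_b_max)          # worst-case current limit among all converters
    bound = ETA_A * a_max
    return np.clip(a, -bound, bound)

def rate_limit(a_sat, a_sat_prev):
    """Eq. (21): keep |a_sat - a_sat_prev| <= DA_MAX * DT (clipping is an assumption)."""
    step = DA_MAX * DT
    return a_sat_prev + np.clip(a_sat - a_sat_prev, -step, step)

def constrain_action(a, a_sat_prev, i_b_max):
    """Saturation followed by rate limiting (the ordering is an assumption)."""
    return rate_limit(saturate(a, i_b_max), a_sat_prev)
```

For example, with converter limits of 11.8 A, 11.0 A, and 11.5 A, the worst-case limit is 11.0 A, so a requested action of 12 A is first saturated to 9.9 A and, starting from a previous action of 0 A, applied as 2.5 A in the first time step.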
State Vector
The state vector $s_k$ is defined as follows:
$$s_k = \left[ q_k^T, \; q_{SC,k}, \; \Delta q_k^T, \; \Delta T_k^T, \; (a^{sat}_{k-1})^T, \; a_{max,k}, \; \alpha_{age}^T, \; \beta_{age}^T, \; i_{load,k} \right]^T \tag{22}$$
where
• $q_k = [q_{k,1}, \ldots, q_{k,n}]^T$ and $q_{SC,k}$ represent the SoC of the battery cells and SC pack.
• $\Delta q_{k,j}$ and $\Delta T_{k,j}$ represent the SoC and temperature deviations, computed as:
$$\Delta q_k = \left[ \Delta q_{k,1}, \ldots, \Delta q_{k,n} \right]^T, \quad \Delta q_{k,j} = q_{k,j} - \bar{q}_k \tag{23}$$
$$\Delta T_k = \left[ \Delta T_{k,1}, \ldots, \Delta T_{k,n} \right]^T, \quad \Delta T_{k,j} = T_{k,j} - \bar{T}_k \tag{24}$$
where $\bar{q}$ and $\bar{T}$ correspond to the mean SoC and mean temperature of the battery pack, respectively.
• $a^{sat}_{k-1}$ and $a_{max,k}$ are the actions applied in the previous time step and the saturation limits, respectively. This information helps the RL agent to enforce rate and range limits.
• $\alpha_{age} = [\alpha_{age,1}, \ldots, \alpha_{age,n}]^T$ and $\beta_{age} = [\beta_{age,1}, \ldots, \beta_{age,n}]^T$ contain information about the variability of the cells' inner resistance and capacity. This information can be obtained by battery diagnosis algorithms (such as [9]), and it helps the RL agent to understand the cell-to-cell variability in the battery pack.
• $i_{load,k}$ represents the load current requested from the battery pack. It is computed based on the battery voltage and load power ($P_{load,1}$) requested from the battery pack (see Sect. 2.4).
It is worth noting that the above states are normalized with respect to their maximum expected values before passing them to the RL agent.
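As an illustration of how the state vector (22) and the deviations (23) and (24) could be assembled, consider the following sketch. The function name `build_state` and the normalization vector `scale` (the "maximum expected values") are hypothetical; the chapter does not list the normalization constants.

```python
import numpy as np

def build_state(q, q_sc, T, a_sat_prev, a_max, alpha_age, beta_age, i_load, scale):
    """Assemble the state vector of Eq. (22) and normalize it element-wise.

    q, T, a_sat_prev, alpha_age, beta_age: arrays of length n (one entry per cell)
    q_sc, a_max, i_load: scalars
    scale: assumed vector of maximum expected values, length 6 * n + 3
    """
    q = np.asarray(q, dtype=float)
    T = np.asarray(T, dtype=float)
    dq = q - q.mean()   # Eq. (23): deviation from the mean SoC
    dT = T - T.mean()   # Eq. (24): deviation from the mean temperature
    s = np.concatenate([q, [q_sc], dq, dT, a_sat_prev, [a_max],
                        alpha_age, beta_age, [i_load]])
    return s / np.asarray(scale, dtype=float)
```

For the three-cell pack considered here ($n = 3$), this ordering yields a state vector with 21 entries.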
Reward Function
The goal of RL is to maximize the expected return $J$; therefore, control goals must be incorporated into the obtained rewards. The control goals in this work are to minimize (i) battery current stress, (ii) SoC deviations $\Delta q$, (iii) temperature deviations $\Delta T$, and (iv) power losses and to maintain (v) smooth control values $a$. For this, the following nominal reward function is used:
$$r_{\text{no abort},k} = -\frac{1}{w_i^2} \sum_{j=1}^{n} \left(1 + \alpha_{age,j}\right) \cdot i_{k,j}^2 - \frac{1}{w_{\Delta T}^2} \cdot \Delta T_k^T \Delta T_k - \frac{1}{w_{\Delta q}^2} \cdot \Delta q_k^T \Delta q_k - \frac{1}{w_l} \cdot P_{losses,k} - \frac{1}{w_{\Delta a}} \cdot \sum_{j=1}^{n} \left| \Delta a^{sat}_{k,j} \right| \tag{25}$$
where $w_i$, $w_{\Delta T}$, $w_{\Delta q}$, $w_l$, and $w_{\Delta a}$ are weights that the designer can use to prioritize different control goals and normalize the different reward terms. The first term of the reward function aims at reducing the battery cell current $i_{k,j} = a^{sat}_{k,j} + i_{load,k}$; it encourages the use of battery cells with less degradation, i.e., with lower aging factor $\alpha_{age,j}$. The following two terms aim at reducing temperature and SoC deviations, respectively. The smoothness term (weighted by $1/w_{\Delta a}$) enables the learning of smooth actions $a$ by penalizing large differences between the current action ($a_k$) and the previous action ($a^{sat}_{k-1}$). Finally, the loss term (weighted by $1/w_l$) aims at reducing the power losses in the battery cells, converters, and SCs:
$$P_{losses} = \sum_{j=1}^{n} \left( P_{loss,B,j} + P_{loss,C,j} + P_{loss,SC,j} \right) \tag{26}$$
There are several conditions that might prematurely end an episode, in which case the (highly) negative reward rabort < 0 is assigned at that time step k instead of
$r_{\text{no abort},k}$. This abort mechanism speeds up the training by showing the agent that specific state-action pairs are less desirable. The final reward function is therefore described as:
$$r_k = \begin{cases} r_{\text{no abort},k}, & \text{if } \text{abort}_k = \text{false} \\ r_{\text{abort}}, & \text{else} \end{cases} \tag{27}$$
where $\text{abort}_k$ is a Boolean condition that becomes true when the following state constraints are violated:
$$50\% \le q_{SC,k} \le 100\%, \qquad 5\% \le q_{k,j} \le 95\%. \tag{28}$$
These constraints prevent deep discharge and overcharge of the battery cells and SCs and must be enforced by the RL agent.
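A compact way to read the reward definition (25) to (28) is as a function of the quantities available at time step $k$. The sketch below is an illustration only: the dictionary keys for the weights are hypothetical names, a weight set to `None` means that the corresponding term is removed from the reward, and SoC values are assumed to be expressed as fractions (e.g., 0.95 for 95%).

```python
import numpy as np

def nominal_reward(i_cell, alpha_age, dT, dq, p_losses, da_sat, w):
    """Eq. (25); terms whose weight is None are dropped."""
    r = 0.0
    if w.get("w_i") is not None:      # age-weighted cell current stress
        r -= np.sum((1.0 + alpha_age) * i_cell**2) / w["w_i"]**2
    if w.get("w_dT") is not None:     # temperature deviations
        r -= (dT @ dT) / w["w_dT"]**2
    if w.get("w_dq") is not None:     # SoC deviations
        r -= (dq @ dq) / w["w_dq"]**2
    if w.get("w_l") is not None:      # power losses
        r -= p_losses / w["w_l"]
    if w.get("w_da") is not None:     # smoothness of the control actions
        r -= np.sum(np.abs(da_sat)) / w["w_da"]
    return float(r)

def abort(q, q_sc):
    """Eq. (28): True if any SoC constraint is violated."""
    q = np.asarray(q, dtype=float)
    ok = (0.50 <= q_sc <= 1.00) and bool(np.all((0.05 <= q) & (q <= 0.95)))
    return not ok

def reward(q, q_sc, i_cell, alpha_age, dT, dq, p_losses, da_sat, w):
    """Eq. (27): abort penalty if constraints are violated, nominal reward otherwise."""
    if abort(q, q_sc):
        return float(w["r_abort"])
    return nominal_reward(i_cell, alpha_age, dT, dq, p_losses, da_sat, w)
```

Keeping the abort check separate from the nominal reward mirrors the case distinction in (27) and makes it easy to switch individual terms on and off for different agents.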
4 Results and Discussion
This section presents the training setup, training results, and validation results obtained when controlling the HBMS with the RL algorithm.
4.1 Training Setup
To train the RL agent, the control engineer needs to specify several parameters, including the training length, the initial values of the HBMS model, the SAC hyperparameters, and the weights of the reward function. Regarding the training length, each episode has a maximum length of 500 s (representing 500 time steps), which is sufficiently long to cover the dominant time constants of the HBMS. The overall training process consists of $2 \cdot 10^6$ time steps, which is also adequately long to achieve convergence for the reward functions. The initial values of the parameters of the environment (the HBMS model) are randomized in order to increase the robustness of the RL agent; this also decreases overfitting to a specific system initialization. The initial values are sampled from a continuous uniform distribution with intervals summarized in Table 3. These parameters include the initial time of the driving cycle ($t_{dc,0}$), which allows the RL agent to be exposed to a different sequence of load current $i_{load,k}$ during the episodes. Various values for the initial SoC and aging are also considered. For the aging factors, it is assumed that $\alpha_{age,j}$ and $\beta_{age,j}$ are fully correlated for a single cell (e.g., when $\alpha_{age,j} = 0$, also $\beta_{age,j} = 0$), while there is no correlation assumed between cells. For training, random intervals of the WLTC driving cycle (cf. Figure 14) are used. WLTC is a representative driving cycle that contains a wide range of operating conditions that are expected from the vehicle.
Table 3 Random sampling for training initialization
Description | Variable | Sampling interval
Beginning of driving cycle | $t_{dc,0}$ | [0 s, 1299 s]
Cell's initial temperature* | $T_{0,j}$ | [$T_a$ − 1 °C, $T_a$ + 1 °C]
Initial mean of the cells' SoC | $\bar{q}_0$ | [45%, 85%]
Initial cell's deviation from the sampled mean | $\Delta q_{0,j}$ | [−5%, 5%]
Fixed resistance increase** | $\alpha_{age,j}$ | [0, $\alpha_{age,EOL}$]
Fixed capacity decrease** | $\beta_{age,j}$ | [0, $\beta_{age,EOL}$]
Supercapacitor pack's SoC | $q_{SC,0}$ | [75%, 95%]
*ambient temperature: $T_a$ = 25 °C
**end-of-life (EOL) values: $\alpha_{age,EOL}$ = 240% and $\beta_{age,EOL}$ = 12% [29, 42, 43].

Table 4 Reward function parametrizations
Agent | Description | $w_i$ | $w_{\Delta T}$ | $w_{\Delta q}$ | $w_{\Delta a}$ | $w_l$ | $r_{abort}$
a | SoC deviation | NA* | NA* | 0.01 | 2 | NA* | −3000
b | Temperature deviation | NA* | 0.25 | NA* | 2 | NA* | −3000
c | Cell current and temperature deviation | 60 | 1 | NA* | 10 | NA* | −3000
d | Cell current, temperature deviation, SoC deviation, and losses | 60 | 1 | 0.01 | 2 | 1 | −10,000
*not available (NA): corresponding term is removed from the reward function
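The randomized initialization of Table 3 can be sketched as below. The sketch interprets the statement that $\alpha_{age,j}$ and $\beta_{age,j}$ are fully correlated for a single cell as drawing one uniform aging fraction per cell and scaling it by both end-of-life values; this interpretation, the dictionary keys, and the function name are assumptions made for illustration.

```python
import numpy as np

T_AMBIENT = 25.0   # ambient temperature in degC (Table 3)
ALPHA_EOL = 2.40   # end-of-life resistance increase, 240 %
BETA_EOL = 0.12    # end-of-life capacity decrease, 12 %

def sample_initial_conditions(rng, n_cells=3):
    """Draw one set of initial values from the uniform intervals of Table 3."""
    q_mean = rng.uniform(0.45, 0.85)            # initial mean of the cells' SoC
    u = rng.uniform(0.0, 1.0, size=n_cells)     # one aging fraction per cell (assumption)
    return {
        "t_dc0": rng.uniform(0.0, 1299.0),      # start time within the driving cycle in s
        "T0": rng.uniform(T_AMBIENT - 1.0, T_AMBIENT + 1.0, size=n_cells),
        "q0": q_mean + rng.uniform(-0.05, 0.05, size=n_cells),
        "alpha_age": u * ALPHA_EOL,             # fully correlated with beta_age per cell
        "beta_age": u * BETA_EOL,               # independent between cells
        "q_sc0": rng.uniform(0.75, 0.95),       # supercapacitor pack SoC
    }
```

A call such as `sample_initial_conditions(np.random.default_rng(0))` would be made at every environment reset, so that each episode starts from a different point of the driving cycle and a different aging pattern.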
Regarding the SAC hyperparameters, mostly the default values provided by [21] are deployed. One major modification is the size of the replay buffer, which was increased to $2 \cdot 10^6$ time steps; this allows the RL agent to learn from the whole training process. The mapping between states and actions is performed by the RL agent using a feedforward neural network with 256 neurons in each of two hidden layers and the rectified linear unit $f(x) = \mathrm{relu}(x) = \max(0, x)$ [44] as activation function. Four types of reward functions, with different weights and abort penalizations, were considered in this study (see Table 4). They target some of the most common control goals of battery management [8] and constitute the following family of RL agents:
• Agents (a) and (b) focus on minimization of SoC and temperature deviations, respectively.
• Agent (c) aims primarily at current stress reduction, with a small incentive for reducing temperature deviations.
• Agent (d) attempts to achieve a balance between all goals (SoC balancing, temperature balancing, current stress, and losses).
The corresponding reward function parametrizations are heuristically determined, and the RL algorithm is very sensitive to their changes. Thus, besides the
Fig. 16 Episode returns and lengths during training of RL agent (a). Outlier returns below −10,000 not shown. Mean with added and subtracted standard deviation shown (note that there are no samples with an episode length larger than 500 s)
choice of hyperparameters, this is one of the main challenges of the application of RL: the weights and abort penalizations need to be carefully chosen to create a stable training that leads to the desired agent’s behavior.
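To make the network structure described in the training setup concrete, the following NumPy sketch builds a feedforward actor with two hidden layers of 256 ReLU neurons. It is not the Stable Baselines implementation: the state dimension of 21 (three cells), the output of a mean and a log standard deviation per action, the clipping of the log standard deviation, the tanh squashing, and the initialization are assumptions that follow common SAC practice rather than details given in this chapter.

```python
import numpy as np

STATE_DIM, ACTION_DIM, HIDDEN = 21, 3, 256   # 21 states assumes n = 3 cells

def init_actor(rng):
    """Weights and biases for a 21-256-256-6 network (He-style init, assumed)."""
    sizes = [STATE_DIM, HIDDEN, HIDDEN, 2 * ACTION_DIM]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def relu(x):
    return np.maximum(0.0, x)

def actor_forward(params, s):
    """Return mean and log standard deviation of the Gaussian policy."""
    h = s
    for W, b in params[:-1]:
        h = relu(h @ W + b)              # two hidden layers with ReLU
    W, b = params[-1]
    out = h @ W + b                      # linear output layer
    mean = out[:ACTION_DIM]
    log_std = np.clip(out[ACTION_DIM:], -20.0, 2.0)   # assumed bounds
    return mean, log_std

def act(params, s, rng=None):
    """Sample during training (rng given) or use the deterministic mean for deployment."""
    mean, log_std = actor_forward(params, s)
    if rng is None:
        return np.tanh(mean)
    noise = rng.standard_normal(ACTION_DIM)
    return np.tanh(mean + np.exp(log_std) * noise)
```

The squashed output lies in [−1, 1] and would still have to be scaled to the admissible current range before being applied to the converters.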
4.2 Training Results
Figure 16 shows the development of episode returns and episode lengths for RL agent (a). The episode returns show a high variance, which is caused by difficult initial conditions resulting from the randomization. The mean of the episode returns increases during the training from about −9000 to about −5000. The small changes toward the end indicate that the training has converged. The episode lengths initially show many aborts but quickly increase toward the maximum episode length. This shows that the agent learns to consider the abort conditions and to adjust its actions $a_k$ accordingly. The later decrease of the mean episode length can be explained by the agent's exploration. The other RL agents (b, c, and d) show a similar training behavior; their results are omitted here for the sake of brevity.
4.3 Validation Results
This section presents the validation results of the RL agents. In contrast with the previous section, the custom Stuttgart driving cycle in its full length (cf. Figure 14) is employed in the validation. This makes it possible to assess the performance of the RL agents
Table 5 Validation initialization
Description | Variable | Initialization
Beginning of driving cycle | $t_{dc,0}$ | 0 s
Cells' initial temperature* | $T_0$ | $[T_a, T_a, T_a]^T$
Initial cells' SoC | $q_0$ | $[90\%, 87.5\%, 85\%]^T$
Fixed resistance increase | $\alpha_{age}$ | $[0.25 \cdot \alpha_{age,EOL}, \; 0.35 \cdot \alpha_{age,EOL}, \; \alpha_{age,EOL}]^T$
Fixed capacity decrease | $\beta_{age}$ | $[0.25 \cdot \beta_{age,EOL}, \; 0.35 \cdot \beta_{age,EOL}, \; \beta_{age,EOL}]^T$
Supercapacitor pack's SoC | $q_{SC,0}$ | 95%
*ambient temperature: $T_a$ = 25 °C
on a driving cycle that is entirely unknown to them. Additionally, the initial values of the environment are fixed in order to make the results comparable (see Table 5). Table 6 summarizes the validation results of the different RL agents. The following performance metrics are considered:
• Mean (absolute) SoC deviations, $\mu_{|\Delta q|} = \frac{1}{K} \sum_{k=1}^{K} \sum_{j=1}^{n} |\Delta q_{k,j}|$, with $K$ denoting the number of validation time steps.
• Mean (absolute) temperature deviations, $\mu_{|\Delta T|} = \frac{1}{K} \sum_{k=1}^{K} \sum_{j=1}^{n} |\Delta T_{k,j}|$.
• Smoothness of control actions, measured as the mean differences in control actions: $\mu_{|\Delta a|} = \frac{1}{K} \sum_{k} \sum_{j} |\Delta a^{sat}_{k,j}|$.
• Root-mean-square (RMS) current for each cell ($\mathrm{RMS}_j$).
• Average RMS current in the battery pack ($\frac{1}{n} \sum_{j} \mathrm{RMS}_j$).
• Overall energy losses in the battery cells, power converters, and SCs.
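The metrics listed above can be computed from logged validation signals as in the following sketch. The array layout (one row per time step, one column per cell), the rectangle-rule integration of the losses, and all function names are assumptions made for illustration.

```python
import numpy as np

def mean_abs_deviation(x):
    """(1/K) * sum_k sum_j |x_kj| for an array of shape (K, n)."""
    x = np.asarray(x, dtype=float)
    return float(np.abs(x).sum() / x.shape[0])

def rms_per_cell(i_cell):
    """RMS current of each cell for an array of shape (K, n)."""
    i_cell = np.asarray(i_cell, dtype=float)
    return np.sqrt(np.mean(i_cell**2, axis=0))

def energy_losses_wh(p_losses, dt=1.0):
    """Integrate the loss power in W over time (rectangle rule) and convert to Wh."""
    return float(np.sum(p_losses) * dt / 3600.0)

def validation_metrics(dq, dT, da_sat, i_cell, p_losses, dt=1.0):
    """Collect the metrics of this section in one dictionary."""
    rms = rms_per_cell(i_cell)
    return {
        "mu_dq": mean_abs_deviation(dq),
        "mu_dT": mean_abs_deviation(dT),
        "mu_da": mean_abs_deviation(da_sat),
        "rms_per_cell": rms,
        "rms_average": float(rms.mean()),
        "losses_wh": energy_losses_wh(p_losses, dt),
    }
```

Note that the deviation metrics sum over the cells before averaging over time, as in the definitions above, so they grow with the number of cells in the pack.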
No Balancing
Figure 17 shows the SoC and temperature behavior with no control ($a_k = 0$). Due to different aging of the cells, variations in the SoC and temperature emerge. The most-aged cell 3 is the fastest to discharge, while the least-aged cell 1 is the slowest to do so. Cell 3 also presents higher temperature than the other cells, which is expected given its higher internal resistance. Furthermore, the temperature increase during the driving cycle is relatively moderate: 1 °C increase with respect to ambient temperature. This can be explained by the high heat capacity of the battery cells (cf. Sect. 2.2.2).
Agent (a)
Figure 18 shows the validation results of RL agent (a). This agent reduces the SoC deviations $\Delta q_j$ within the first 600 s of the driving cycle. During this initial
Table 6 Summary of evaluation metrics' results for Stuttgart driving cycle with no balancing actions and different trained agents. Best results are marked in green
Balancing actions | $\mu_{\Delta q}$ [%/100] | $\mu_{\Delta T}$ [°C] | $\mu_{\Delta a}$ [A] | RMS current [A], Aver. | RMS current [A], $\mathrm{RMS}_{j=1}$ | RMS current [A], $\mathrm{RMS}_{j=2}$ | RMS current [A], $\mathrm{RMS}_{j=3}$ | Integrated losses [Wh]
No balancing | 0.0603 | 0.422 | 0.0 | 41.60 | 41.60 | 41.60 | 41.60 | 0.98
Agent (a) | 0.0171 | 0.315 | 0.883 | 41.77 | 43.35 | 42.01 | 39.94 | 1.46
Agent (b) | 0.0463 | 0.291 | 2.678 | 41.82 | 42.24 | 44.27 | 38.95 | 1.78
Agent (c) | 0.0407 | 0.337 | 8.840 | 41.19 | 42.81 | 41.12 | 39.63 | 1.56
Agent (d) | 0.0245 | 0.330 | 1.731 | 41.22 | 43.15 | 40.76 | 39.76 | 1.29
Fig. 17 SoCs, temperatures, SoC deviations, and temperature deviations of cells and SoC of SC pack during validation on Stuttgart driving cycle with no balancing actions
period, the agent charges the most-aged cell 3 with the maximum balancing current and discharges (most of the time) the stronger cells (1 and 2). Afterward, the magnitude of the control actions is reduced, but the agent is still able to keep the SoC differences ($\Delta q_j$) low. In comparison to the “no balancing” scenario, the RL agent (cf. Table 6):
• Reduces $\mu_{\Delta q}$ by 72% and $\mu_{\Delta T}$ by 25%. This suggests that the control objectives of SoC balancing and temperature balancing are not independent from each other.
• Reduces current stress in the most-aged cell 3 (−4.0%) and increases stress in the least-aged cell 1 (+4.2%). The average RMS current increases slightly (from 41.60 A to 41.77 A), because it is not an explicit part of the reward function.
• Increases the integrated losses (+49%). This mainly stems from the higher losses generated by the power converters.
It is also interesting to note that the SC is mainly used during the first 600 s to support the SoC balancing task.
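The percentages quoted for agent (a) follow directly from the entries of Table 6. The short calculation below reproduces them as relative changes with respect to the "no balancing" row; the dictionary keys are hypothetical labels, while the numbers are copied from the table.

```python
def rel_change(value, baseline):
    """Relative change in percent with respect to the baseline."""
    return 100.0 * (value - baseline) / baseline

# Values copied from Table 6 (no balancing vs. agent (a))
no_balancing = {"mu_dq": 0.0603, "mu_dT": 0.422, "rms_cell1": 41.60,
                "rms_cell3": 41.60, "losses_wh": 0.98}
agent_a = {"mu_dq": 0.0171, "mu_dT": 0.315, "rms_cell1": 43.35,
           "rms_cell3": 39.94, "losses_wh": 1.46}

changes = {key: rel_change(agent_a[key], no_balancing[key]) for key in no_balancing}

for key, value in changes.items():
    print(f"{key}: {value:+.1f} %")
```

Rounded to whole or single-decimal percentages, these values match the figures stated in the list above (about −72%, −25%, +4.2%, −4.0%, and +49%).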
Agent (b)
Figure 19 shows the results of RL agent (b), which focuses primarily on reducing temperature variations. It can be observed that this agent charges the most-aged cell 3 with a value that is always very close to saturation; the agent's goal is to shift thermal load from the most-aged cell 3 to the less-aged cells (1 and 2). Despite these efforts, only modest improvements in temperature variation are obtained: $\mu_{\Delta T}$ is
Fig. 18 SoCs, temperatures, SoC deviations, and temperature deviations of cells and SoC of SC pack as well as balancing actions by SoC deviation minimizing agent during validation on Stuttgart driving cycle
decreased by 8% with respect to RL agent (a). This result is due to the high heat capacity of the battery cells, which leads to small temperature increases during the driving cycles (with a load current of 1 C on average).
Agent (c)
Figure 20 shows the results of RL agent (c), which focuses on minimization of the current stress. During the first 200 s, the RL agent extracts energy from the SCs to support the battery cells. Interestingly, this support is aging-aware: the agent generates actions that decrease the current in the most-aged cells (2 and 3) and increase the current in the least-aged cell (1). After 200 s, the SCs provide minimum support to the battery pack. RL agent (c) features the lowest average RMS current of all agents, 41.19 A, which represents a reduction of 1.0% when compared to the no balancing scenario. At cell
Fig. 19 SoCs, temperatures, SoC deviations, and temperature deviations of cells and SoC of SC pack as well as balancing actions by temperature deviation minimizing agent during validation on Stuttgart driving cycle
level, RL agent (c) reduces the current of the most-aged cell (3) by 4.7% while increasing the current in the least-aged cell by 2.9%. In contrast with the previous test cases, RL agent (c) presents more chattering in the control action, which is due to the reduced smoothness weight ($1/w_{\Delta a}$). It therefore had less incentive to learn a smooth action behavior.
Agent (d)
This final agent (d) aims at fulfilling all control objectives simultaneously. Table 6 shows that this agent achieves the following:
• Second place in SoC deviation (just behind agent (a)).
• Third place in temperature deviation (but very close to agents (a) and (b)).
• Second place in average current stress (very close to agent (c)).
Fig. 20 SoCs, temperatures, SoC deviations, and temperature deviations of cells and SoC of SC pack as well as balancing actions by cell current and temperature deviation minimizing agent during validation on Stuttgart driving cycle
• Second place in integrated losses (behind no balancing but ahead of all other RL agents).
The time-domain results of agent (d) (Fig. 21) show a reduction in SoC imbalances during the first 400 s. After this initial transient, the SoC deviations are kept at an approximately constant value. By doing so, the agent's actions can be kept at a lower (absolute) amplitude, providing a good compromise between power losses and SoC balance. The corresponding actions are less smooth than for RL agent (a) but smoother than for agent (c). This can be explained by the lower weight on the penalization of control value differences in reward function (c).
Fig. 21 SoCs, temperatures, SoC deviations, and temperature deviations of cells and SoC of SC pack as well as balancing actions by cell current, temperature deviation, SoC deviation, and losses minimizing agent during validation on Stuttgart driving cycle
5 Conclusion and Outlook
This work investigated the use of RL to control HBMS. First, a multi-physical model of the HBMS components (power converters, battery cells, and supercapacitors) was implemented in Modelica. To improve numerical efficiency, a quasi-static model of the power converter was developed. This improved the real-time simulation factor of the HBMS model by more than 3000-fold and accelerated the RL training. Additionally, the battery cell model was optimized and validated with experimental data from NMC cells. The model was incorporated into an RL toolchain featuring the SAC algorithm. Multiple control objectives were assessed, yielding RL agents that focus on (a) SoC equalization, (b) temperature equalization, (c) age-weighted cell current reduction, and (d) a trade-off between these three objectives and power losses. To increase the
robustness of the obtained RL agents, several randomizations were incorporated in the training process. The trained agents were then validated with an unknown driving cycle. It could be shown that an RL agent trained with a reward function that incorporates the objectives of SoC balancing, temperature balancing, age-weighted cell current reduction, control value smoothness, and heat losses can satisfy these goals. Based on the weights of the reward function, it can either focus on a subset of these objectives or on all of them at the same time. As one might expect, the agent performs better when focusing on a smaller subset of objectives. In comparison to the scenario without control, RL agents that prioritize a single objective (a, b, c) were able to reduce SoC deviations by 72%, temperature deviations by 31%, and the RMS current in the most-aged cell by 4.7%. The maximum temperatures and the average RMS cell current are difficult to reduce. For the temperature, the large heat capacity of the battery cell neither permits high temperature increases nor allows large gains from temperature balancing. The average RMS cell current is dominated by the load current of the whole HBMS, which is 1 C on average, and only slight adjustments are possible with a maximum balancing current of 0.25 C for a limited time interval. When all objectives are traded off simultaneously, RL agent (d) still performed well, reducing the SoC deviations by 59%, temperature deviations by 22%, and the RMS current in the most-aged cell by 4.4%. The combination of all objectives is helped by the fact that the control objectives are not independent from each other: balancing the SoC deviations has a positive effect on the temperature deviations and vice versa, while reducing the current with focus on the most-aged cells also leads to SoC deviation balancing. All RL agents increased energy losses due to balancing actions, which is a drawback of the HBMS.
Additionally, the maximum temperatures in the battery cells were only marginally reduced by the RL agents due to the large heat capacity of the battery cells. Future work should consider cells with smaller heat capacity to better assess the potential of RL for thermal management. Also, this work considered constant aging, although the driving cycle would lead to an aging increase. Therefore, it is planned to adapt the algorithm to consider time-varying aging and to use a more detailed aging model for this. It is also planned to compare the RL agents' performance with model-based controllers and incorporate preview information in the RL's state vector. If the RL agents show a good performance in comparison with model-based approaches, an experimental validation, e.g., with a hardware-in-the-loop system, would give new insights. Furthermore, the RL agent's robustness against parameter uncertainties, e.g., different equivalent impedance of the components, could be assessed. Finally, approaches using multi-agent RL could be beneficial when scaling up the HBMS to a realistic size that might be used by future electric vehicles.
Acknowledgments The authors' thanks go to Tobias Posielek for his feedback and help in scientific writing.
Author Contributions Conceptualization, R.C., J.M., J.B., and R.A.; methodology, R.C., J.M., J.B., and R.A.; RL and FMI framework, J.U. and J.M.; Modelica models, J.B. and J.M.; validation, J.M. and R.C.; resources, J.B., J.U., and R.C.; writing, original draft preparation, J.M., J.B., R.C., J.U., and R.A.; writing, review and editing, J.M., J.B., R.C., J.U., and R.A.; visualization, J.M., J.B., R.C., and J.U.; and supervision, R.C. All authors have read and agreed to the published version of the manuscript.
Funding The work of J.M., J.B., and J.U. was funded by the DLR internal project NGC-A&E. This project also funded R.C. during his employment at DLR. The participation of R.A. in this work was financed by the Portuguese funding agency Fundação para a Ciência e a Tecnologia (FCT) within project UIDB/50014/2020.
References
1. International Energy Agency (IEA). Global EV Outlook 2020, Paris, France, 2020. [Online]. Available: https://www.iea.org/reports/global-ev-outlook-2020.
2. J. Barreras, D. Frost, D. Howey, Smart balancing systems: an ultimate solution to the weakest cell problem? in IEEE Vehicular Technology Society Newsletter, (2018)
3. J. Barreras, C. Pinto, R. de Castro, E. Schaltz, S. Andreasen, R. Araujo, Multi-objective control of balancing systems for Li-Ion battery packs: a paradigm shift? in IEEE vehicle power and propulsion conference, Coimbra, Portugal, (2014)
4. E. Chemali, M. Preindl, P. Malysz, A. Emadi, Electrochemical and electrostatic energy storage and management systems for electric drive vehicles: State-of-the-art review and future trends. IEEE J. Emerg. Select. Topics Power Electr. 4(3), 1117–1134 (2016)
5. R. Araujo, R. de Castro, C. Pinto, P. Melo, D. Freitas, Combined sizing and energy management in EVs with batteries and supercapacitors. IEEE Trans. Veh. Technol. 63(7), 3062–3076 (2014)
6. Q. Zhang, W. Deng, G. Li, Stochastic control of predictive power management for battery/supercapacitor hybrid energy storage systems of electric vehicles. IEEE Trans. Indus. Inform. 14(7), 3023–3030 (2018)
7. R. de Castro, C. Pinto, J. Barreras, R. Araújo, D. Howey, Smart and hybrid balancing system: Design, modeling, and experimental demonstration. IEEE Trans. Veh. Technol. 68(12), 11449–11461 (2019)
8. F. Altaf, B. Egardt, L. Johannesson Mardh, Load Management of Modular Battery Using Model Predictive Control: Thermal and state-of-charge balancing. IEEE Trans. Control Syst. Technol. 25(1), 47–62 (2017)
9. G.L. Plett, Battery management systems, Vol. 2: Equivalent-circuit methods (Norwood, Artech House, 2016)
10. R. de Castro, J. Brembeck, R.E. Araujo, Nonlinear control of dual half bridge converters in hybrid energy storage systems, in IEEE vehicular power and propulsion conference, Gijon, Spain, (2020)
11. S. Di Cairano, I.V. Kolmanovsky, Real-time optimization and model predictive control for aerospace and automotive applications, in Annual American Control Conference, Milwaukee, USA, (2018)
12. T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel and S. Levine, Soft Actor-Critic Algorithms and Applications, 2018. [Online]. Available: https://arxiv.org/abs/1812.05905.
13. A. Raffin, F. Stulp, Generalized State-Dependent Exploration for Deep Reinforcement Learning in Robotics, 2020. [Online]. Available: https://arxiv.org/abs/2005.05719.
14. Y. Li, Reinforcement Learning Applications, 2019. [Online]. Available: https://arxiv.org/abs/1908.06973.
15. D. Liu, Application of Deep Reinforcement Learning for Battery Design, Master's thesis (University of Missouri, USA, 2020)
16. H. Sun, Z. Fu, F. Tao, L. Zhu, P. Si, Data-Driven Reinforcement-Learning-Based Hierarchical Energy Management Strategy for Fuel Cell/Battery/Ultracapacitor Hybrid Electric Vehicles. J. Power Sour. 455 (2020)
17. B. Xu, J. Shi, S. Li, H. Li, and Z. Wang, Energy Consumption and Battery Aging Minimization Using a Q-learning Strategy for a Battery/Ultracapacitor Electric Vehicle, 2020. [Online]. Available: https://arxiv.org/abs/2010.14115.
18. J. Cao, D. Harrold, Z. Fan, T. Morstyn, D. Healey, K. Li, Deep reinforcement learning-based energy storage arbitrage with accurate Lithium-ion battery degradation model. IEEE Trans. Smart Grid 11(5), 4513–4521 (2020)
19. S. Park, A. Pozzi, M. Whitmeyer, W. T. Joe, D. M. Raimondo and S. Moura, Reinforcement Learning-based Fast Charging Control Strategy for Li-ion Batteries, 2020. [Online]. Available: https://arxiv.org/abs/2002.02060.
20. T. Haarnoja, A. Zhou, P. Abbeel, S. Levine, Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor, in International Conference on Machine Learning, Stockholm, Sweden, (2018)
21. A. Hill, A. Raffin, M. Ernestus, A. Gleave, R. Traore, P. Dhariwal, C. Hesse, O. Klimov, A. Nichol, M. Plappert, A. Radford, J. Schulman, S. Sidor, and Y. Wu, Stable Baselines. https://github.com/hill-a/stable-baselines, 2018.
22. Modelica Association, Modelica, 2020. [Online]. Available: http://www.modelica.org. [Accessed 10 2020].
23. F. Gao, N. Mugwisi, D. Rogers, Three degrees of freedom operation of a dual half bridge, in European conference on power electronics and applications, Genova, Italy, (2019)
24. H. Li, F.Z. Peng, J.S. Lawler, A natural ZVS medium-power bidirectional DC-DC converter with minimum number of devices. IEEE Trans. Ind. Appl. 39(2), 525–535 (2003)
25. H. Li, F.Z. Peng, J. Lawler, Modeling, simulation, and experimental verification of soft-switched Bi-directional DC-DC converters, in Sixteenth annual IEEE applied power electronics conference and exposition, Anaheim, USA, (2001)
26. C. F. A. Pinto, Sizing and Energy Management of a Distributed Hybrid Energy Storage System for Electric Vehicles, Ph.D. thesis, University of Porto, Porto, Portugal, 2018. [Online]. Available: https://sigarra.up.pt/feup/en/pub_geral.pub_view?pi_pub_base_id=266342.
27. Infineon Technologies AG, IPB009N03L G, 2016. [Online]. Available: https://www.infineon.com/dgdl/Infineon-IPB009N03L-DS-v01_04en.pdf?fileId=db3a30431689f4420116d426b6770ca3.
28. D. Graovac, M. Pürschel, A. Kiep, MOSFET Power Losses Calculation Using the Data-Sheet Parameters (Infineon Technologies AG, Neubiberg, 2006)
29. C. Pinto, J.V. Barreras, E. Schaltz, R.E. Araujo, Evaluation of advanced control for Li-ion battery balancing systems using convex optimization. IEEE Trans. Sustain. Energy 7, 1703–1717 (2016)
30. J. Brembeck, Model Based Energy Management and State Estimation for the Robotic Electric Vehicle ROboMObil, Ph.D. thesis, Technical University of Munich, Munich, 2018.
31. O. Tremblay, L.-A. Dessaint, Experimental validation of a battery dynamic model for EV applications. World Electr. Veh. J. 3, 289–298 (2009)
32. S.N. Motapon, A. Lupien-Bedard, L.-A. Dessaint, H. Fortin-Blanchette, K. Al-Haddad, A generic electrothermal Li-ion battery model for rapid evaluation of cell temperature temporal evolution. IEEE Trans. Ind. Electron. 64, 998–1008 (2017)
33. J. Brembeck, A physical model-based observer framework for nonlinear constrained state estimation applied to battery state estimation. Sensors 19(20), 4402 (2019)
34. J. Brembeck, L.M. Ho, A. Schaub, C. Satzger, J. Tobolar, J. Bals, G. Hirzinger, ROMO – The robotic electric vehicle, in IAVSD International Symposium on Dynamics of Vehicle on Roads and Tracks, Manchester, UK, (2011)
35. A. Pfeiffer, Optimization library for interactive multi-criteria optimization tasks, in International MODELICA Conference, Munich, Germany, (2012)
36. R. Faranda, M. Gallina, D.T. Son, A new simplified model of double-layer capacitors, in International Conference on Clean Electrical Power, Capri, Italy, (2007)
37. J. Tobolar, M. Otter, T. Bünte, Modelling of vehicle powertrains with the Modelica PowerTrain Library, in Systemanalyse in der Kfz-Antriebstechnik, Augsburg, Germany, (2007)
38. R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction, 2nd edn. (The MIT Press, Cambridge, 2018)
39. B. D. Ziebart, Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, Ph.D. thesis, Carnegie Mellon University, Pittsburgh, USA, 2010.
40. H.V. Hasselt, Double Q-learning, in International Conference on Neural Information Processing Systems, Vancouver, Canada, (2010)
41. S. Fujimoto, H. van Hoof, D. Meger, Addressing Function Approximation Error in Actor-Critic Methods, in International Conference on Machine Learning, Stockholm, Sweden, (2018)
42. S.F. Schuster, M.J. Brand, P. Berg, M. Gleissenberger, A. Jossen, Lithium-ion cell-to-cell variation during battery electric vehicle operation. J. Power Sources 297, 242–251 (2015)
43. W. Waag, S. Käbitz, D.U. Sauer, Experimental investigation of the Lithium-ion battery impedance characteristic at various conditions and aging states and its influence on the application. Appl. Energy 102, 885–897 (2013)
44. X. Glorot, A. Bordes, Y. Bengio, Deep Sparse Rectifier Neural Networks, in International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, (2011)

Ricardo de Castro Research and initial submission were done while being an employee at DLR Institute of System Dynamics and Control; he is with the University of California, Merced, since the review process.
Robust, Resilient, and Energy-Efficient Satellite Formation Control Sean Phillips, Christopher Petersen, and Rafael Fierro
1 Introduction

In recent years, the concept of spacecraft formation flight has become more attractive and cost-effective, thanks to advances in manufacturing and automation. However, many open problems in spacecraft coordination are not yet understood. These problems arise in a variety of scenarios including, but not limited to, energy-efficient formation keeping, resilience to communication link faults [1], and lack of robustness to unmodeled dynamics and sensor errors. In this chapter, we address the challenge of coordinating a team of satellites to accomplish a mission while maintaining a specified formation. In the formation under consideration, each agent converges to an oscillatory trajectory around a center point, with a prespecified separation, in an elliptical orbit. The objective of this chapter is to provide a provably stable, robust, and resilient satellite formation control methodology.
Approved for public release; distribution is unlimited. Public Affairs release approval # AFRL-2021-1208. S. Phillips () · C. Petersen Space Vehicles Directorate, Air Force Research Laboratory, Kirtland, NM, USA R. Fierro Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_8
Related Work

Formation Control

Many approaches have been proposed to solve the general vehicle formation guidance and control problem (see, for instance, [2, 3] and the references therein). Some techniques utilize consensus protocols [4, 5] and model predictive control (MPC) [6, 7] or reference governors [8]. Spacecraft formation flight is challenging as vehicles have to perform tasks within highly nonlinear and unknown dynamic environments subject to energy constraints and limited computational resources. Moreover, formation control [9] is an important coordination method in the control and robotics communities, thanks to applications in exploration, surveillance, mapping, ocean sampling, and transportation [10]. The use of multiple spacecraft for experiments is steadily increasing. To be more specific, successful formation flight missions have been reported in [11, 12]. Three spacecraft were used in the Aero-4 mission for formation flight with differential drag [13], and five spacecraft were used in NASA's THEMIS mission to study Earth's substorms [14].

Robust and Resilient Coordination

Efficient coordination of multiagent systems requires a reliable and resilient communication network. The authors in [15] presented a distributed coordination approach to enable a team of robots to intermittently communicate while patrolling an environment. Intermittent connectivity was achieved as each robot synchronizes its speed with robots moving along neighboring perimeters. An algorithm that allows heterogeneous robots to reach a goal location in a 3D environment while maintaining connectivity with a base station by forming connectivity chains is proposed in [16]. The resilience of multiple spacecraft in formation measures the capability of recovering the cooperative performance after experiencing faults on communication links. The notion of robustness, on the other hand, refers to the system's ability to tolerate perturbations, disturbances, or errors that may be induced by modeling approaches, communication or actuator noise, etc. (see [17, 18] for more information). Robust synchronization in multiagent systems under intermittent communication has been considered in [19, 20]. The authors in [21] described a resilient formation control algorithm. The synchronization of multiagent systems over a directed graph subject to communication link faults was considered in [1]. To address the communication link fault problem, the authors proposed a distributed state observer-based adaptive control protocol. Recent work in [22, 23] outlines an approach called the weighted mean subsequence reduced (W-MSR) algorithm for a resilient communication network, which is guaranteed to reach consensus despite malicious agents attempting to affect the agreement value. In this chapter, we leverage this algorithm to enforce the resiliency of the spacecraft formation control.

Chapter Outline

The rest of this chapter is organized as follows: Sect. 2 first introduces some notation and network preliminaries and then presents the spacecraft model and the W-MSR algorithm to enforce resiliency of the formation. In Sect. 3, we formulate the N spacecraft formation control problem considered herein. The control law development is described in Sect. 4. Section 5 describes network resilience to communication malfunctions and adversarial errors. Section 6 discusses satellite formation robustness. Section 7 concludes the chapter.
2 Preliminaries

2.1 Notation Preliminaries

We denote the set of real numbers as $\mathbb{R}$, nonnegative real numbers as $\mathbb{R}_{\ge 0}$, and natural numbers as $\mathbb{N}_0$. Given a differentiable function $f(x) : \mathbb{R}^n \to \mathbb{R}$, its derivative with respect to the variable $x$ is the row vector $\frac{\partial f}{\partial x}$. Likewise, the gradient of $f(x)$ with respect to $x$ is defined as the column vector $\nabla_x f = \left(\frac{\partial f}{\partial x}\right)^\top$. Given a vector $a = [a_1\ a_2\ \dots\ a_n]^\top \in \mathbb{R}^n$, we denote the $L_2$-norm as $\|a\|$, more specifically $\|a\|^2 = a_1^2 + a_2^2 + \dots + a_n^2$; the infinity norm of $a$, denoted by $\|a\|_\infty$, is given by $\|a\|_\infty = \max_i |a_i|$, where $|\cdot|$ is the absolute value. The unit vector in the direction of $a \in \mathbb{R}^3$ is defined by $\phi_a = a/\|a\|$. Given a vector $a \in \mathbb{R}^3$, an orthogonal projection matrix $P_a$ is given by

\[
P_a = I_3 - \phi_a \phi_a^\top, \tag{1}
\]

where $I_3$ is the $3 \times 3$ identity matrix. Given $b \in \mathbb{R}^3$, the unit projection of $b$ onto the plane orthogonal to $a$ is given by $\phi_b^a = P_a b/\|P_a b\|$. Given the sets $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^n$, we define the cardinality of $A$ as $|A|$, and the relative complement of $A$ with respect to $B$ (also known as set minus) is denoted as $B \setminus A$, which is the set of elements in $B$ that are not in $A$. More information on set notation and theory can be found in [24].
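As a small illustration of the projection notation above, the following sketch evaluates $P_a b$ and the unit projection $\phi_b^a$ for one pair of vectors. The helper names (`project_orthogonal`, `unit_projection`) and the sample vectors are assumptions made for this example only; they do not come from the chapter.

```python
import math

def dot(a, b):
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """L2-norm ||a||."""
    return math.sqrt(dot(a, a))

def unit(a):
    """Unit vector phi_a = a / ||a|| (a must be nonzero)."""
    n = norm(a)
    return [x / n for x in a]

def project_orthogonal(a, b):
    """P_a b = (I_3 - phi_a phi_a^T) b, the part of b orthogonal to a."""
    phi = unit(a)
    s = dot(phi, b)
    return [bi - s * pi for bi, pi in zip(b, phi)]

def unit_projection(a, b):
    """phi_b^a = P_a b / ||P_a b|| (undefined when b is parallel to a)."""
    return unit(project_orthogonal(a, b))

# Hypothetical sample vectors: a points along the third axis.
a = [0.0, 0.0, 2.0]
b = [3.0, 4.0, 5.0]
Pb = project_orthogonal(a, b)    # component of b in the plane orthogonal to a
phi_ba = unit_projection(a, b)   # unit vector along that component
```

Here `Pb` lies in the plane orthogonal to `a`, so its inner product with `a` vanishes, and `phi_ba` has unit length.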
2.2 Network Preliminaries

In this work, we investigate the communication between numerous satellites to enable formation control. More specifically, we consider the case where a satellite may not be able to communicate with every other satellite, but only with a select subset of satellites, which we refer to as neighboring satellites. A common way to model such an interaction is through graphical topology networks. The remainder of this section introduces the notation and preliminary results on graph theory; more information can be found in [25].

Consider a set of $N$ satellite systems. An undirected graph, denoted by $\mathcal{G}$, is defined as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, 2, \dots, N\}$ is called the satellite set and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is called the edge set. Under this terminology, the set of neighboring satellites for satellite $i$ is given by $\mathcal{N}_i = \{j \in \mathcal{V} \mid (i, j) \in \mathcal{E}\}$; more specifically, $\mathcal{N}_i$ is the set of all satellites that are able to communicate with satellite $i$. The adjacency matrix $A = [a_{ij}]_{N \times N}$ represents the edges and nodes in matrix form, where, for each $(i, j) \in \mathcal{E}$, the element $a_{ij}$ is the $(i, j)$-th entry of $A$. If $a_{ij} > 0$, then there exists a link between satellite $i$ and satellite $j$. In this work, it is assumed that the graph is undirected, which implies that the adjacency matrix is symmetric, i.e., $a_{ij} = a_{ji}$ for all $i, j \in \mathcal{V}$. It is also assumed that $a_{ii} = 0$ for all $i \in \mathcal{V}$.

To establish the resiliency properties of the satellite formation problem, we leverage the notions of $r$-robust graphs as in [26, 27]. The following definitions present the properties of $r$-reachable sets and $r$-robust graphs.

Definition 1 ([27, Definition 2]) For a graph $\Gamma$ and a set $S \subset \mathcal{V}$ of nodes of $\Gamma$, we say that $S$ is an $r$-reachable set if there exists $i \in S$ such that $|\mathcal{N}_i \setminus S| \ge r$, where $r \in \mathbb{Z}_{\ge 1}$.

Definition 2 ([27, Definition 3]) A graph $\Gamma$ is $r$-robust if, for every pair of nonempty, disjoint subsets of $\mathcal{V}$, at least one of the subsets is $r$-reachable, where $r \in \mathbb{Z}_{\ge 1}$.
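Definitions 1 and 2 can be checked directly on small graphs. The sketch below is a brute-force test that enumerates every pair of nonempty, disjoint node subsets, so it is only practical for a handful of nodes. The function names and the two example graphs (a complete graph and a path on four nodes) are assumptions chosen for illustration.

```python
from itertools import product

def is_r_reachable(S, neighbors, r):
    """Definition 1: some node in S has at least r neighbors outside S."""
    return any(len(neighbors[i] - S) >= r for i in S)

def is_r_robust(neighbors, r):
    """Definition 2, by exhaustive enumeration (exponential in the node count).

    `neighbors` maps each node to the set of its neighbors.
    """
    nodes = sorted(neighbors)
    # Label each node 0 (in neither subset), 1 (in S1), or 2 (in S2).
    for labels in product((0, 1, 2), repeat=len(nodes)):
        S1 = {v for v, lab in zip(nodes, labels) if lab == 1}
        S2 = {v for v, lab in zip(nodes, labels) if lab == 2}
        if not S1 or not S2:
            continue  # both subsets must be nonempty
        if not (is_r_reachable(S1, neighbors, r)
                or is_r_reachable(S2, neighbors, r)):
            return False
    return True

# Hypothetical example graphs on four nodes.
complete4 = {i: {j for j in range(4) if j != i} for i in range(4)}
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

complete_is_2_robust = is_r_robust(complete4, 2)
complete_is_3_robust = is_r_robust(complete4, 3)
path_is_1_robust = is_r_robust(path4, 1)
path_is_2_robust = is_r_robust(path4, 2)
```

For the complete graph, the pair of two-node subsets defeats 3-robustness because each node then has only two neighbors outside its own subset; for the path, the two end nodes taken as singleton subsets defeat 2-robustness.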
2.3 Spacecraft Model

In this chapter, we are interested in studying resilient formation control of spacecraft in close proximity. Therefore, we leverage the relative Hill's frame to describe the motion of each satellite in the network [28]. The set of equations used in this chapter comes from the following derivation. Consider an inertial frame, one that is Earth-centered and fixed. For $N$ agent satellites and a target satellite $x_t$ about which the agents will maneuver (also sometimes referred to as the chief), the equations of motion are given by

\[
\ddot{R}_j = -\frac{\mu}{|R_j|^3} R_j + \frac{1}{m_j} U_j + F_j, \tag{2}
\]

for each $j \in \{t\} \cup \{1, 2, \dots, N\}$, where $R_j \in \mathbb{R}^3$ is the position vector of the spacecraft in the inertial frame, $\ddot{R}_j \in \mathbb{R}^3$ is the acceleration of the spacecraft with respect to the inertial frame, $\mu$ is Earth's gravitational parameter, $m_j$ is the mass of the $j$-th spacecraft, $U_j \in \mathbb{R}^3$ is the actuation force of the spacecraft, and $F_j \in \mathbb{R}^3$ collects the forces induced from other sources (e.g., solar radiation pressure, drag, etc.). The equations of motion in (2) can yield stable, elliptical orbits about the center of gravity (approximately the center of the Earth), given the correct initial conditions. To reduce the nonlinear dynamics in (2) to the Hill's frame, the following assumptions are made:
Assumption 1 The primary acting force on all spacecraft is spherical, two-body gravity generated by the central body Earth.

Assumption 2 The mass loss of the spacecraft is significantly smaller than the total mass of the spacecraft.

Assumption 3 The target spacecraft is in a circular orbit with radius $|R_t| = r_0$.

Assumption 4 The distance from the target spacecraft to each other spacecraft is significantly less than the distance from the target spacecraft to the center of the Earth.

Assumption 1 is leveraged to remove higher-order gravity terms, which may be several orders of magnitude smaller than the gravity of the primary body, i.e., Earth (see Chapter 3 of [29]). Regarding Assumption 2, the mass of a satellite changes over time due to fuel usage, which can alter the overall dynamics of the satellite. However, over small time intervals, Assumption 2 implies that the change in fuel mass during the maneuvers considered is small compared to the mass of the satellite, so the mass can be treated as constant in the dynamics. Assumptions 3–4 allow simplification of the nonlinear equations in (2) to the classical relative equations. With the target spacecraft in an equilibrium orbit, we attach a non-inertial frame to it with unit axis vectors $e_i$, $e_j$, and $e_k$, called the radial track, the in-track, and the out-of-plane orbital positions, respectively. The relative position vector $x_i$, for each $i$-th satellite, $i \in \{1, 2, \dots, N\}$, is given by

\[
D(t) R_i = \begin{bmatrix} r_0 \\ 0 \\ 0 \end{bmatrix} + x_i,
\]

where the rotation matrix $D(t)$ transforms the inertial frame to the relative frame, based upon where the target spacecraft is in its equilibrium orbit. Utilizing Assumptions 3 and 4, the relative equations of motion abide by the following dynamics:

\[
\dot{x}_i = v_i, \qquad \dot{v}_i = A_x x_i + A_v v_i + B u_i + w_i, \tag{3}
\]
where, for each $i \in \mathcal{V}$, $x_i \in \mathbb{R}^3$ is the relative position, $v_i \in \mathbb{R}^3$ is the relative velocity, $u_i \in \mathbb{R}^3$ is the control signal expressed in the relative frame, and $w_i$ are external perturbations in the relative frame. The matrices $A_x$, $A_v$, and $B$ in (3) are given by

\[
A_x = \begin{bmatrix} 3n^2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -n^2 \end{bmatrix}, \quad
A_v = \begin{bmatrix} 0 & 2n & 0 \\ -2n & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} \frac{1}{m_i} & 0 & 0 \\ 0 & \frac{1}{m_i} & 0 \\ 0 & 0 & \frac{1}{m_i} \end{bmatrix}, \tag{4}
\]

where $n = \sqrt{\mu / r_0^3}$ is the angular velocity magnitude of the target spacecraft in the inertial frame, commonly known as the mean motion. Nominally, each satellite agent has access to its own full state through perfect sensor measurements and to the state of its neighbors. In Sect. 6, we consider the practical case of measurements containing noise. For the remainder of this chapter, we consider the satellite $t$ to be centered at the origin, i.e., $x_t = v_t = 0$. Recall that $x_i$, for $i \in \mathcal{V}$, is the relative position of each satellite in a non-inertial frame. However, the analysis allows for a general target to be considered.

As mentioned in Sect. 1, the life span of a satellite is limited by the amount of fuel onboard, so maintaining a formation around an object with minimal fuel is ideal. Because the eigenvalues of the state matrix in (3) have zero real part, there exists a stable zero-fuel elliptical orbit in the $e_i$-$e_j$ plane [28]. More specifically, these ellipses have their center along the $e_j$ axis, a semimajor axis of length $2a$ along the $e_j$ axis, and a semiminor axis of length $a$ along the $e_i$ axis, for any given constant $a > 0$. One such solution is given by (3) with $x_i = (a, 0, 0)$, $v_i = (0, -2na, 0)$, and $u(t) = 0$ for all $t \ge 0$, which yields $x_i(t) = (a \cos nt, -2a \sin nt, 0)$. Figure 1 provides a numerical solution with $a = 1$.
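The zero-fuel ellipse can be reproduced with a few lines of numerical integration. The following is a minimal sketch, assuming a fixed-step fourth-order Runge-Kutta integrator and a 1 s step (both are choices made for this example and are not taken from the chapter). It integrates (3) with $u_i = w_i = 0$ from $x_i = (a, 0, 0)$, $v_i = (0, -2na, 0)$ over roughly one orbital period $2\pi/n$ and records the extent of the motion along the radial and in-track axes.

```python
import math

def hcw_rhs(state, n):
    """Right-hand side of (3) with u_i = 0 and w_i = 0.

    state = [x, y, z, vx, vy, vz] in the relative frame
    (x radial, y in-track, z out-of-plane).
    """
    x, y, z, vx, vy, vz = state
    ax = 3.0 * n**2 * x + 2.0 * n * vy   # first rows of A_x x + A_v v
    ay = -2.0 * n * vx                   # second rows
    az = -n**2 * z                       # third rows
    return [vx, vy, vz, ax, ay, az]

def rk4_step(state, n, dt):
    """One classical Runge-Kutta step (integrator chosen for this sketch)."""
    def shifted(s, k, h):
        return [si + h * ki for si, ki in zip(s, k)]
    k1 = hcw_rhs(state, n)
    k2 = hcw_rhs(shifted(state, k1, dt / 2.0), n)
    k3 = hcw_rhs(shifted(state, k2, dt / 2.0), n)
    k4 = hcw_rhs(shifted(state, k3, dt), n)
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def simulate_one_period(n, a, dt=1.0):
    """Integrate the zero-fuel initial condition over about one period."""
    steps = int(round(2.0 * math.pi / n / dt))
    state = [a, 0.0, 0.0, 0.0, -2.0 * n * a, 0.0]
    traj = [state]
    for _ in range(steps):
        state = rk4_step(state, n, dt)
        traj.append(state)
    return traj

traj = simulate_one_period(n=0.0012, a=1.0)
radial_extent = max(abs(s[0]) for s in traj)     # close to a
in_track_extent = max(abs(s[1]) for s in traj)   # close to 2a
```

With these settings the trajectory closes on itself after one period, and the in-track extent is about twice the radial extent, matching the 2:1 ellipse described above.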
2.4 Weighted Mean Subsequence Reduced Algorithm Description

One iconic problem in distributed systems is the consensus problem, namely, the case of a distributed set of agents sharing measured information to achieve an agreement value. The discrete consensus problem is defined as

\[
x_i(k + 1) = u_i(k), \tag{5}
\]

where $x_i, u_i \in \mathbb{R}$ are the information state and input for agent $i \in \mathcal{V}$ and $k \in \mathbb{N}_0$ is the discrete time step. If the input for each agent is assigned to be $u_i = -\sum_{j=1}^{N} a_{ij}(x_i - x_j)$ and the graph is connected, then consensus is achieved [30, 31]. See Fig. 2 for a numerical example of four agents over a fully connected graph: starting from the initial conditions $x_1(0) = 1$, $x_2(0) = 2$, $x_3(0) = 3$, and $x_4(0) = 4$, the agents converge to $x_i(k) = 2.5$ for each $i \in \{1, 2, 3, 4\}$ and $k \ge 1$ and thus reach consensus. However, if a single agent is malfunctioning, then the agents may never converge to the true consensus value. See Fig. 3 for a numerical example with the same initial conditions as in Fig. 2, except that agent 1 holds a constant value of $x_1(k) = 1$ for all $k \ge 0$. In this case, the agents still reach a consensus value, but it is the wrong value, namely, the value of the malfunctioning agent. More precisely, the distinction between malfunctioning agents and normal agents is as follows:
Fig. 1 A numerical solution of (3) from $x_i = (1, 0, 0)$, $v_i = (0, -2n, 0)$ with $n = 0.0012$ and $u(t) = 0$ for all $t \ge 0$
Definition 3 A node $i$ is called normal if it applies the intended dynamics at each time step; if not, then it is called malfunctioning.

In this work, we leverage the weighted mean subsequence reduced (W-MSR) algorithm to enforce resiliency of the formation algorithm. We now give some preliminaries on this algorithm as applied to the nominal satellite problem with potentially malfunctioning agents. More specifically, this algorithm intends to limit the influence of such malfunctioning agents by implicitly removing the most extreme values from influencing neighboring agents. For instance, in the standard consensus algorithm in (5), the agents are tasked with converging to an agreement value. If one of the agents' sensors is malfunctioning, this will drive the other agents to converge to its value, which is not the most accurate value. Therefore, each agent locally decides which information to distrust by disregarding the most extreme values it receives. More specifically, the general W-MSR algorithm is defined as follows [22]:

(1) At each time instance $k$, each normal agent $i$ receives output information from its neighbors and sorts the received values.
(2) If there are $\tilde{n}$ or more values larger than the current local state, the normal agent removes the $\tilde{n}$ largest values (similarly for values smaller than the current local state). If there are fewer than $\tilde{n}$ values larger than the local state, the normal agent removes all of these values (similarly for values smaller than the current local state). Then, define the set $R_i(k)$ as the set of agents whose values were removed by normal agent $i$ at time $k$.
(3) Each normal agent $i$ updates its value according to its neighbors $\mathcal{N}_i$ in the network graph, excluding the agents contained in the set $R_i(k)$.

This algorithm effectively removes the influence of the malfunctioning agents on the remaining agents, as seen in Fig. 4.

Fig. 2 A numerical example of the consensus problem (5) from initial conditions $x_1(0) = 1$, $x_2(0) = 2$, $x_3(0) = 3$, and $x_4(0) = 4$ (agent values versus time step)

Fig. 3 A numerical example of the consensus problem (5) as in Fig. 2; however, agent 1 (in blue) is malfunctioning and sharing a constant value, $x_1(k) = 1$ for all $k \ge 0$ (agent values versus time step)

Fig. 4 A numerical example of the consensus problem (5) as in Fig. 3 with the W-MSR applied. As in Fig. 3, agent 1 (in blue) is malfunctioning and sharing a constant value, $x_1(k) = 1$ for all $k \ge 0$ (agent values versus time step)
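The three W-MSR steps can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, the agents are indexed from 0 (so index 0 plays the role of agent 1 in the figures), and the update in step (3) is assumed to be an equal-weight average of the agent's own value and the retained neighbor values, which is one common choice and not necessarily the weighting used by the authors. Setting `n_tilde = 0` disables the filtering and reduces the update to plain averaging.

```python
def wmsr_step(x, neighbors, normal, n_tilde):
    """One synchronous W-MSR update; malfunctioning agents keep their value."""
    new_x = list(x)
    for i in normal:
        own = x[i]
        # Step (1): sort neighbor values above and below the local state.
        larger = sorted((j for j in neighbors[i] if x[j] > own),
                        key=lambda j: x[j], reverse=True)
        smaller = sorted((j for j in neighbors[i] if x[j] < own),
                         key=lambda j: x[j])
        # Step (2): drop up to n_tilde extreme values on each side.
        removed = set(larger[:n_tilde]) | set(smaller[:n_tilde])
        # Step (3): average the own value with the retained neighbor values
        # (equal weights are an assumption of this sketch).
        kept = [x[j] for j in neighbors[i] if j not in removed]
        new_x[i] = (own + sum(kept)) / (1 + len(kept))
    return new_x

def run(x0, neighbors, normal, n_tilde, steps):
    """Iterate the update for a fixed number of time steps."""
    x = list(x0)
    for _ in range(steps):
        x = wmsr_step(x, neighbors, normal, n_tilde)
    return x

# Four agents on a fully connected graph; agent 0 holds its value at 1.
neighbors = {i: [j for j in range(4) if j != i] for i in range(4)}
normal = [1, 2, 3]
x0 = [1.0, 2.0, 3.0, 4.0]

plain = run(x0, neighbors, normal, n_tilde=0, steps=100)      # no filtering
resilient = run(x0, neighbors, normal, n_tilde=1, steps=100)  # W-MSR filtering
```

In this particular setup, the unfiltered run is dragged to the value held by the malfunctioning agent, while the filtered run keeps the normal agents inside the range of their own initial values and lets them agree with each other.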
3 Problem Statement

In this section, we formally introduce the problem. Consider a formation of $N$ spacecraft in the Hill's frame. Each $i$-th spacecraft, $i \in \mathcal{V}$, has relative position states $x_i \in \mathbb{R}^3$, relative velocity states $v_i \in \mathbb{R}^3$, and inertial position vectors defined by $R_i$ (see Fig. 5). Let these $N$ satellites be in relative proximity to the target satellite whose position $x_t$ is fixed in the relative frame. Then we define the problem as follows:

Problem Given $N$ spacecraft with full control authority, develop a control law that drives each spacecraft to a formation with the following properties:

1. The formation is centered at a fixed point $x_t$.
2. Each spacecraft must reach a formation plane defined by a normal $\alpha \in \mathbb{R}^3$ satisfying $\|\alpha\| = 1$, i.e.,

\[
\alpha^\top (x_i - x_t) = 0, \quad i = 1, \dots, N. \tag{6}
\]

3. The spacecraft formation must circumnavigate in an elliptical orbit with a constant orbital period $T_p$, a semimajor axis $\beta$, and a semiminor axis $\gamma$, with the axis directions defined by $\alpha_M$ and $\alpha_m$, respectively.
4. The spacecraft must achieve the formation with a predefined phase,

\[
\cos^{-1}\!\left( \frac{x_i^\top x_j}{\|x_i\| \|x_j\|} \right) = \theta_{ij}, \quad i = 1, \dots, N,\ j \in \mathcal{N}_i, \tag{7}
\]

where the set $\mathcal{N}_i$ is the neighbor set of spacecraft $i$, which is formally defined in Sect. 2.2 but, for brevity, here describes which spacecraft in the formation are connected in a graph structure to make the formation unique.

Requirements 2–4 impose the shape of the formation; however, depending on the angle definitions $\theta_{ij}$, this shape may not be unique. The next assumption imposes this uniqueness.

Assumption 5 The angular distance $\theta_{ij}$ between satellites $i$ and $j$ given in (15) is sufficient to ensure that there exists a unique circular formation that satisfies the formation plane (6), (14), and (15).

The above assumption imposes a global rigidity-like condition on the graph structure $\mathcal{G}$. A discussion on the importance of such a framework can be found in references [32, 33].

The above requirements are desirable for spacecraft missions as there exist zero-fuel elliptical trajectories in the relative frame which can be exploited to increase satellite mission life. The reason for the existence of these trajectories is given in Sect. 2.3. It should be noted that while this work emphasizes zero-fuel trajectories, the control scheme applies to any planar elliptical formation. Figure 5 gives a pictorial representation with the ellipse in a plane defined by the relative unit axis $e_k$ (the coordinate system is formally defined in Sect. 2.3).

Fig. 5 The formation problem consisting of three satellites (two cooperative and one malfunctioning) with position states $(x_1, x_2, x_3)$ connected over a digraph shown in the bottom left. The goal of each satellite is to converge to a prescribed orbit around the target $x_t$ using only neighboring information
4 Control Law Development

Several distributed control protocols have been proposed for common orbit convergence [34] and spacecraft formations via cyclic pursuit [35]. In the literature, these problems are treated independently of each other. This chapter describes a technique based on our previous work [32, 36] and guarantees convergence of the spacecraft to a prescribed formation graph. A summary of the overall control is given below, with in-depth explanation in the following subsections:

1. Scale $x_i$ and $v_i$ according to (12) such that the desired elliptical trajectory is a circular trajectory.
2. Compute the desired velocity trajectory given in (21).
3. Using the velocity trajectory, compute the control signal $g_i$ from (22).
4. Using $g_i$ from above, the control signal is

\[
u_i = B^{-1}\left(-A_x x_i - A_v v_i + C^{-1} g_i\right), \tag{8}
\]

where $C$ is the scaling matrix given in (11).
4.1 Scaling and Feedback

The control law in references [32, 36] relies on single integrator dynamics where the velocity state is controlled. To facilitate this framework, a feedback technique is first performed. Assuming that there are no perturbation forces, i.e., $w_i = 0$, the following control law is used:

\[
u_i = B^{-1}\left(-A_x x_i - A_v v_i + z_i\right), \tag{9}
\]

where $z_i$ is a new control signal. Substituting (9) into the dynamics in (3) yields the following double integrator dynamics:

\[
\dot{x}_i = v_i, \qquad \dot{v}_i = z_i. \tag{10}
\]
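The cancellation behind (9) and (10) can be verified numerically for a single state. The sketch below writes out the rows of $A_x$, $A_v$, and $B$ from (4) by hand; the function names and the sample numbers (state, mass, and new input $z_i$) are hypothetical values picked for illustration.

```python
def hcw_accel(x, v, u, n, m):
    """Relative acceleration from (3) with w_i = 0: A_x x + A_v v + B u."""
    return [
        3.0 * n**2 * x[0] + 2.0 * n * v[1] + u[0] / m,
        -2.0 * n * v[0] + u[1] / m,
        -n**2 * x[2] + u[2] / m,
    ]

def linearizing_control(x, v, z, n, m):
    """Control (9): u = B^{-1}(-A_x x - A_v v + z), with B = (1/m) I_3."""
    return [
        m * (-3.0 * n**2 * x[0] - 2.0 * n * v[1] + z[0]),
        m * (2.0 * n * v[0] + z[1]),
        m * (n**2 * x[2] + z[2]),
    ]

# Hypothetical sample values.
n, m = 0.0012, 100.0
x = [10.0, -5.0, 2.0]
v = [0.01, 0.02, -0.03]
z = [1e-3, -2e-3, 5e-4]

u = linearizing_control(x, v, z, n, m)
accel = hcw_accel(x, v, u, n, m)   # equals z up to rounding, as in (10)
```

The resulting acceleration matches the commanded $z_i$ regardless of the state, which is exactly the double integrator behavior in (10).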
To align with the work in references [32, 36], the equations of motion in (10) are scaled in order to make the desired elliptical formation a circle. Noting that $\alpha$, $\alpha_M$, and $\alpha_m$ define an orthonormal basis, a constant scaling matrix can be given:

\[
C = \begin{bmatrix} \frac{1}{\beta} \alpha_M^\top \\[2pt] \frac{1}{\gamma} \alpha_m^\top \\[2pt] \alpha^\top \end{bmatrix}. \tag{11}
\]

Define a new set of coordinates,

\[
q_i = C x_i, \qquad p_i = C v_i, \tag{12}
\]

where $q_i \in \mathbb{R}^3$ and $p_i \in \mathbb{R}^3$ are the scaled position and velocity, respectively. Taking the time derivative of (12), it follows that the equations of motion in (10) become, in the scaled frame,

\[
\dot{q}_i = p_i, \qquad \dot{p}_i = g_i, \tag{13}
\]

where $g_i = C z_i$. Now, in this new scaled system, the desired formation becomes circular and fits well into the previous works. In fact, Problem Definitions 3 and 4 in Sect. 3 reduce to the following:

3'. The spacecraft must circulate around the target $q_t = C x_t$ with a prescribed radius $\rho > 0$, i.e.,

\[
\|q_i - q_t\| = \|q_{it}\| = \rho, \quad i = 1, \dots, N. \tag{14}
\]

4'. The spacecraft must achieve the formation with a predefined separation distance,

\[
\|q_i - q_j\| = d_{ij}, \quad i = 1, \dots, N,\ j \in \mathcal{N}_i, \tag{15}
\]

where again $\mathcal{N}_i$ is the neighbor set of $i$.

Note that requirements (3') and (4') are identical to those in the previous work in [32, 36], which allows us to leverage a similar approach to design a control law that satisfies these requirements for a fuel-efficient formation.
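To see the effect of the scaling in (11) and (12), the sketch below maps points of an ellipse with semi-axes $\beta$ and $\gamma$ into the scaled coordinates and measures their distance from the origin. The basis vectors, the axis lengths, and the helper names are assumptions made for this example (the in-track axis is taken as the semimajor direction, with $x_t = 0$).

```python
import math

def scaling_matrix(alpha_M, alpha_m, alpha, beta, gamma):
    """Rows of C in (11): alpha_M^T / beta, alpha_m^T / gamma, alpha^T."""
    return [
        [c / beta for c in alpha_M],
        [c / gamma for c in alpha_m],
        list(alpha),
    ]

def matvec(C, x):
    """Matrix-vector product C x for a list-of-rows matrix."""
    return [sum(cij * xj for cij, xj in zip(row, x)) for row in C]

# Hypothetical orthonormal basis and axis lengths.
alpha_M = [0.0, 1.0, 0.0]   # semimajor direction (in-track)
alpha_m = [1.0, 0.0, 0.0]   # semiminor direction (radial)
alpha = [0.0, 0.0, 1.0]     # formation-plane normal
beta, gamma = 2.0, 1.0

C = scaling_matrix(alpha_M, alpha_m, alpha, beta, gamma)

# Sample the ellipse x = beta cos(t) alpha_M + gamma sin(t) alpha_m.
scaled_radii = []
for k in range(12):
    t = 2.0 * math.pi * k / 12
    x = [beta * math.cos(t) * aM + gamma * math.sin(t) * am
         for aM, am in zip(alpha_M, alpha_m)]
    q = matvec(C, x)                                   # q = C x as in (12)
    scaled_radii.append(math.sqrt(sum(c * c for c in q)))
```

Every sampled point lands at the same scaled radius, so the ellipse becomes a circle in the $q$ coordinates; with $C$ exactly as written in (11), that radius is 1 in this example.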
4.2 Lyapunov Control Law

The control law exploited is from [32, 37] and utilizes a Lyapunov framework. In contrast to traditional Lyapunov theory, the desired steady state is not a specific equilibrium point, but rather a dynamic formation configuration, which is defined by the Problem Statement in Sect. 3. To apply Lyapunov theory, items 1–4 in the Problem Statement are reflected in a single function whose minimum represents the desired equilibrium trajectory. This subsection introduces the four functions that make up this function. Note that the stability analysis is performed on the transformed, scaled system (13); because this mapping is an isomorphism, the stability properties in the scaled system carry over to the original system. Firstly, consider the following functions:

\[
V_1 = \frac{1}{2} \sum_i \left(\alpha^\top q_{it}\right)^2, \qquad
V_2 = \frac{1}{2} \sum_i \left(\|P_\alpha q_{it}\| - \rho\right)^2, \qquad
V_3 = \frac{1}{4} \sum_i \sum_{j \in \mathcal{N}_i} a_{ij} \left( \|\phi_i^\alpha - \phi_j^\alpha\|^2 - \frac{d_{ij}^2}{\rho^2} \right)^2, \tag{16}
\]

where $\phi_i^\alpha = P_\alpha q_{it} / \|P_\alpha q_{it}\|$. The functions in (16) express different components of the Problem Statement in Sect. 3; more specifically, $V_1$ represents the sum error of all spacecraft to the target plane, $V_2$ is the sum error of all spacecraft to the constant radius objective, and $V_3$ is the sum of the inter-distance error between each spacecraft. The gradients of $V_1$, $V_2$, and $V_3$ with respect to the $i$-th scaled spacecraft position $q_i$ in (12), denoted $\nabla_{q_i}$, are given by

\[
\begin{aligned}
\nabla_{q_i} V_1 &= \alpha^\top q_{it}\, \alpha, \\
\nabla_{q_i} V_2 &= \left(\|P_\alpha q_{it}\| - \rho\right) \phi_i^\alpha, \\
\nabla_{q_i} V_3 &= \frac{1}{\|P_\alpha q_{it}\|} \sum_{j \in \mathcal{N}_i} a_{ij} \left( \|\phi_i^\alpha - \phi_j^\alpha\|^2 - \frac{d_{ij}^2}{\rho^2} \right) \left[ \left(\phi_i^\alpha - \phi_j^\alpha\right)^\top \left(\alpha \times \phi_i^\alpha\right) \right] \left(\alpha \times \phi_i^\alpha\right).
\end{aligned} \tag{17}
\]
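These gradients have an orthogonality relationship, which is given by the following lemma:

Lemma 1 The gradients $\nabla_{q_i} V_k$ for each $k \in \{1, 2, 3\}$ in (17) satisfy $(\nabla_{q_i} V_1)^\top (\nabla_{q_i} V_2) = (\nabla_{q_i} V_2)^\top (\nabla_{q_i} V_3) = (\nabla_{q_i} V_1)^\top (\nabla_{q_i} V_3) = 0$ for each $i \in \mathcal{V}$.

Proof First, we consider the case $(\nabla_{q_i} V_1)^\top (\nabla_{q_i} V_2)$. Note that $P_\alpha = I_3 - \alpha \alpha^\top$ and $\alpha^\top P_\alpha = (1 - |\alpha|^2) \alpha^\top$, which vanishes because $\|\alpha\| = 1$. It follows that

\[
(\nabla_{q_i} V_1)^\top (\nabla_{q_i} V_2) = \left(\alpha^\top q_{it}\, \alpha\right)^\top \left(\|P_\alpha q_{it}\| - \rho\right) \phi_i^\alpha = \alpha^\top q_{it} \left(\|P_\alpha q_{it}\| - \rho\right) \frac{\alpha^\top P_\alpha q_{it}}{\|P_\alpha q_{it}\|} = 0.
\]

Next, we consider the case of $(\nabla_{q_i} V_2)^\top (\nabla_{q_i} V_3)$. It follows that

\[
(\nabla_{q_i} V_2)^\top (\nabla_{q_i} V_3) = \left( \left(\|P_\alpha q_{it}\| - \rho\right) \phi_i^\alpha \right)^\top \frac{1}{\|P_\alpha q_{it}\|} \sum_{j \in \mathcal{N}_i} \gamma(\phi_j^\alpha, \phi_i^\alpha)\, \alpha \times \phi_i^\alpha, \tag{18}
\]

where

\[
\gamma(\phi_j^\alpha, \phi_i^\alpha) = a_{ij} \left( \|\phi_i^\alpha - \phi_j^\alpha\|^2 - \frac{d_{ij}^2}{\rho^2} \right) \left(\alpha \times \phi_i^\alpha\right)^\top \left(\phi_i^\alpha - \phi_j^\alpha\right)
\]

is a scalar function. Furthermore, because $\gamma$ is a scalar function and $(\|P_\alpha q_{it}\| - \rho)$ is a scalar, it follows that

\[
(\nabla_{q_i} V_2)^\top (\nabla_{q_i} V_3) = \sum_{j \in \mathcal{N}_i} \frac{\left(\|P_\alpha q_{it}\| - \rho\right) \gamma(\phi_j^\alpha, \phi_i^\alpha)}{\|P_\alpha q_{it}\|} \left(\phi_i^\alpha\right)^\top \left(\alpha \times \phi_i^\alpha\right) = 0, \tag{19}
\]

because $\phi_i^\alpha$ and $\alpha \times \phi_i^\alpha$ are orthogonal to each other by the definition of the cross product. Lastly, the case of $(\nabla_{q_i} V_1)^\top (\nabla_{q_i} V_3)$ is shown in [32, Lemma 1].

Now define $V_{4i}$, which only concerns the velocity of the $i$-th agent, in contrast to $V_1$, $V_2$, and $V_3$, which concern the positions of the formation:

\[
V_{4i} = \frac{1}{2} \|p_i - h_{1i} - h_{2i} - h_{3i}\|^2, \tag{20}
\]

where the functions $h_{1i}$, $h_{2i}$, and $h_{3i}$ are

\[
\begin{aligned}
h_{1i} &= -k_1 \nabla_{q_i} V_1, \\
h_{2i} &= -k_2 \nabla_{q_i} V_2 + k_0 \|P_\alpha q_{it}\|\, \alpha \times \phi_i^\alpha, \\
h_{3i} &= -k_3 \|P_\alpha q_{it}\| \nabla_{q_i} V_3,
\end{aligned} \tag{21}
\]

where $k_1, k_2, k_3 > 0$, and $k_0$ is the angular velocity of the circular motion as defined by the right-hand rule about the axis $\alpha$. The function $V_{4i}$ represents the error to a velocity trajectory that the spacecraft needs to follow in order to reach the desired formation, as derived by Miao et al. [32]. The control strategy for this problem focuses on stabilizing the velocity states $p_i$ to the given trajectory exponentially fast. If these trajectories are reached sufficiently quickly, then the position states can converge to the desired trajectory. The control law that performs this for each $i$-th agent is

\[
g_i = -k_4 \left(p_i - h_{1i} - h_{2i} - h_{3i}\right), \tag{22}
\]

where $k_4 > 0$ is a user-defined parameter chosen sufficiently large.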
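The interplay between the desired velocity (21) and the control law (22) can be explored with a small simulation. The following is a minimal sketch under strong simplifying assumptions that are not part of the chapter: a single agent with no neighbors (so $h_{3i} = 0$), a target at the scaled origin ($q_t = 0$), hypothetical gains, and a fixed-step Runge-Kutta integrator. It integrates the scaled double integrator (13) and reports the final out-of-plane error and projected radius.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def axpy(s, a, b):
    """Return s * a + b for vectors a and b."""
    return [s * x + y for x, y in zip(a, b)]

def desired_velocity(q, alpha, rho, k0, k1, k2):
    """h_1i + h_2i from (21) for one agent with q_t = 0 and h_3i = 0."""
    out_of_plane = dot(alpha, q)                 # alpha^T q
    proj = axpy(-out_of_plane, alpha, q)         # P_alpha q
    r = math.sqrt(dot(proj, proj))               # ||P_alpha q||, assumed > 0
    phi = [c / r for c in proj]
    h = [-k1 * out_of_plane * a for a in alpha]  # h_1i
    h = axpy(-k2 * (r - rho), phi, h)            # gradient part of h_2i
    h = axpy(k0 * r, cross(alpha, phi), h)       # circulation part of h_2i
    return h

def rk4_step(state, dt, f):
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

# Hypothetical plane normal, radius, and gains.
alpha = [0.0, 0.0, 1.0]
rho = 1.0
k0, k1, k2, k4 = 0.5, 1.0, 1.0, 10.0

def closed_loop(state):
    """Scaled dynamics (13) under the control law (22)."""
    q, p = state[:3], state[3:]
    h = desired_velocity(q, alpha, rho, k0, k1, k2)
    g = [-k4 * (pi - hi) for pi, hi in zip(p, h)]
    return list(p) + g                           # [q_dot, p_dot]

state = [2.0, 0.0, 1.0, 0.0, 0.0, 0.0]           # q(0) and p(0)
for _ in range(4000):                            # 40 s with dt = 0.01
    state = rk4_step(state, 0.01, closed_loop)

q = state[:3]
plane_error = abs(dot(alpha, q))
proj = axpy(-dot(alpha, q), alpha, q)
radius = math.sqrt(dot(proj, proj))
```

In this setup the agent reaches the formation plane and settles on a circle whose radius is close to $\rho$; for a finite $k_4$ a small residual offset from $\rho$ can remain, which is in line with the practical (rather than exact) stability notion used in the analysis that follows.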
The following lemma now gives the result for the velocity trajectory:

Lemma 2 The velocity dynamics of the $i$-th spacecraft given by (13) under the control law (22) converges to $h_{1i} + h_{2i} + h_{3i}$ exponentially.

Proof First, define the velocity error as $\bar{p}_i = p_i - h_{1i} - h_{2i} - h_{3i}$. By definition, the function $V_{4i}$ is positive definite with respect to the desired trajectory for all $\bar{p}_i \in \mathbb{R}^3$. That means $V_{4i}$ is a candidate Lyapunov function. Taking the time derivative of $V_{4i}$ and substituting (22) gives

\[
\begin{aligned}
\dot{V}_{4i} &= \left(\nabla_{p_i} V_{4i}\right)^\top \dot{p}_i \\
&= \left(p_i - h_{1i} - h_{2i} - h_{3i}\right)^\top g_i \\
&= -k_4 V_{4i} < 0, \quad \forall\, \|\bar{p}_i\| > 0.
\end{aligned} \tag{23}
\]

More specifically, the velocity trajectories converge to the set of points defined by $p_i = h_{1i} + h_{2i} + h_{3i}$. Due to the form of $\dot{V}_{4i}$, we integrate with respect to time to obtain

\[
V_{4i}(t) = e^{-k_4 t} V_{4i}(0), \tag{24}
\]

which, with (20), leads to

\[
\|p_i(t) - h_{1i}(t) - h_{2i}(t) - h_{3i}(t)\| = \|p_i(0) - h_{1i}(0) - h_{2i}(0) - h_{3i}(0)\|\, e^{-\frac{k_4}{2} t}. \tag{25}
\]

Next, we consider the following definition:

Definition 4 (Semiglobal Practical Asymptotic Stability) Given a continuous-time system $\dot{x} = f(x)$, a nonempty closed set $\mathcal{A} \subset \mathbb{R}^n$, and an open set $\mathcal{U} \subset \mathbb{R}^n$, the set $\mathcal{A}$ is said to be semiglobally practically asymptotically stable for the system if there exists a function $\beta \in \mathcal{KL}$ such that every solution from $\mathcal{U}$ satisfies $|x(t)| \le \beta(|x(0)|, t) + \varepsilon$ for a constant $\varepsilon > 0$.

With the above lemma and definition, we can make the following assertion, leveraging results from [38]:

Proposition 1 Consider the scaled dynamics in (13) with feedback law (22). Assume the initial conditions satisfy $\|P_\alpha q_i(0)\| > 0$. The position dynamics of the $i$-th spacecraft given by (13) and scaled input (22) are semiglobally practically asymptotically stable.
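Proof Consider the following Lyapunov function:

\[
V = V_1 + V_2 + V_3. \tag{26}
\]

Recall from (16) that the functions $V_s$, $s \in \{1, 2, 3\}$, are given by

\[
V_1 = \frac{1}{2} \sum_i \left(\alpha^\top q_{it}\right)^2, \qquad
V_2 = \frac{1}{2} \sum_i \left(\|P_\alpha q_{it}\| - \rho\right)^2, \qquad
V_3 = \frac{1}{4} \sum_i \sum_{j \in \mathcal{N}_i} a_{ij} \left( \|\phi_i^\alpha - \phi_j^\alpha\|^2 - \frac{d_{ij}^2}{\rho^2} \right)^2,
\]

which define the set of points to which we want solutions to stabilize. First, note that $V$ is positive definite everywhere outside of the set to which we want to stabilize, denoted by $\mathcal{A}$, and $V = 0$ for all $(p, q) \in \mathcal{A}$. The time derivative of this Lyapunov function is given by

\[
\dot{V} = \sum_{i=1}^{N} \left(\nabla_{q_i} V_1 + \nabla_{q_i} V_2 + \nabla_{q_i} V_3\right)^\top p_i. \tag{27}
\]

Noting that the trajectories of $p_i$ exponentially converge via (25),

\[
\begin{aligned}
\dot{V} &\le \sum_{i=1}^{N} \left(\nabla_{q_i} V_1 + \nabla_{q_i} V_2 + \nabla_{q_i} V_3\right)^\top \left(h_{1i} + h_{2i} + h_{3i}\right) + \Big\| \sum_{i=1}^{N} \left(\nabla_{q_i} V_1 + \nabla_{q_i} V_2 + \nabla_{q_i} V_3\right) \Big\|\, c_1 e^{-c_2 \|\bar{p}_i\| t} \\
&= \sum_{i=1}^{N} \left( -k_1 \|\nabla_{q_i} V_1\|^2 - k_2 \|\nabla_{q_i} V_2\|^2 - 2 k_3 \|\nabla_{q_i} V_3\|^2 \cdot \|P_\alpha q_i\| \right) + \Big\| \sum_{i=1}^{N} \left(\nabla_{q_i} V_1 + \nabla_{q_i} V_2 + \nabla_{q_i} V_3\right) \Big\|\, c_1 e^{-c_2 \|\bar{p}_i\| t} \\
&= \Lambda(q_i) + \Big\| \sum_{i=1}^{N} \left(\nabla_{q_i} V_1 + \nabla_{q_i} V_2 + \nabla_{q_i} V_3\right) \Big\|\, c_1 e^{-c_2 \|\bar{p}_i\| t},
\end{aligned} \tag{28}
\]

where $\Lambda(q_i)$ denotes the first sum in the second line of (28).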
Consider (28) without the second term. This is exactly the case from [38], in which it was demonstrated that, locally, V˙ < 0. A sketch of the proof is given here: It can be verified that ||Pα qi (t)|| > 0 for all t > 0. Thus, (qi ) ≤ 0 and V˙ ≤ 0, which implies without the second term the system is stable. It can be shown that ||∇qi Vj || = 0 for each j = {1, 2} and for each i ∈ V as t goes to infinity. If ∇qi V1 = 0 then the ith spacecraft reaches the desired plane. If ∇qi V2 = 0, then the ith spacecraft reaches the desired projected radius, i.e., ||Pα qi || = ρ, but as we approach the desired plane, ||qi || = ρ, and the satellites begin circulating at
the prescribed rate. Thus, conditions 1, 2, and 3' are met. This is the result from Theorem 1 in [38]. To meet the last condition, it can be seen from a linear analysis that the system locally asymptotically converges to the desired inter-satellite distances. Global convergence to the inter-satellite distances can be achieved by adding a decaying sinusoidal term to the control, but this is not pursued here. However, (28) now contains the second term, which intuitively represents the initial error in the velocity component. Define the function $\zeta(t) = c_1 e^{-c_2\|\bar p_i\| t} \ge 0$, which decays exponentially to zero. With $\Xi(q_i)$ denoting the first term of (28) and $\theta \in (0, 1)$, applying Young's inequality to each product gives
$$
\begin{aligned}
\dot V &\le \Xi(q_i) + \sum_{i=1}^{N} \Big(\|\nabla_{q_i}V_1\|\zeta(t) + \|\nabla_{q_i}V_2\|\zeta(t) + \|\nabla_{q_i}V_3\|\zeta(t)\Big) \\
&\le \Xi(q_i) + \sum_{i=1}^{N} \left(\frac{\|\nabla_{q_i}V_1\|^2 k_1\theta}{2} + \frac{\zeta(t)^2}{2k_1\theta} + \frac{\|\nabla_{q_i}V_2\|^2 k_2\theta}{2} + \frac{\zeta(t)^2}{2k_2\theta} + \frac{2\|\nabla_{q_i}V_3\|^2 k_3\|P_\alpha q_i(0)\|\theta}{2} + \frac{\zeta(t)^2}{4k_3\|P_\alpha q_i(0)\|\theta}\right) \\
&= \Xi(q_i)(1 - \theta) + \frac{\zeta(t)^2}{2k_1\theta} + \frac{\zeta(t)^2}{2k_2\theta} + \frac{\zeta(t)^2}{4k_3\|P_\alpha q_i(0)\|\theta} \\
&\le \Xi(q_i)(1 - \theta) + \frac{\zeta(0)^2}{2k_1\theta} + \frac{\zeta(0)^2}{2k_2\theta} + \frac{\zeta(0)^2}{4k_3\|P_\alpha q_i(0)\|\theta},
\end{aligned} \tag{29}
$$
so that
$$\dot V \le 0 \quad \text{whenever} \quad |\Xi(q_i)|(1 - \theta) \ge \frac{\zeta(0)^2}{2k_1\theta} + \frac{\zeta(0)^2}{2k_2\theta} + \frac{\zeta(0)^2}{4k_3\|P_\alpha q_i(0)\|\theta}.$$
Far from the desired formation, it can be seen that V̇ ≤ 0. This is the regime in which the first term of (28) dominates, and from [38] and the above sketch it results in asymptotic stability. However, due to the second term, in a smaller region about the desired formation, V̇ may not be negative semi-definite. This results in practical asymptotic stability in the sense of the definition above. In the following section, we present numerical simulations to demonstrate the results. Namely, we show that the satellites converge to the desired formation and that the first term of (28) does dominate the second term.
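The sufficient condition at the end of the proof can be checked numerically. The following is a minimal sketch, not the authors' code: `xi` stands for the value of the first term of (28), and the values chosen for ζ(0), ||Pα qi(0)|| and θ are illustrative assumptions; only the gains k1 = k2 = k3 = 0.01 are taken from the simulations in Sect. 4.3.

```python
def residual_bound(zeta0, k1, k2, k3, proj_norm0, theta):
    """Constant on the right-hand side of the last inequality in (29).

    zeta0      -- zeta(0), the initial size of the velocity-error term (assumed value)
    proj_norm0 -- ||P_alpha q_i(0)||, assumed strictly positive
    theta      -- free parameter of Young's inequality, in (0, 1)
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    if proj_norm0 <= 0.0:
        raise ValueError("proj_norm0 must be positive")
    return (zeta0 ** 2 / (2.0 * k1 * theta)
            + zeta0 ** 2 / (2.0 * k2 * theta)
            + zeta0 ** 2 / (4.0 * k3 * proj_norm0 * theta))


def vdot_guaranteed_nonpositive(xi, zeta0, k1, k2, k3, proj_norm0, theta):
    """True when |xi| (1 - theta) dominates the residual, so that V_dot <= 0."""
    bound = residual_bound(zeta0, k1, k2, k3, proj_norm0, theta)
    return abs(xi) * (1.0 - theta) >= bound


# Illustrative numbers only (zeta0, proj_norm0 and theta are assumptions).
bound = residual_bound(zeta0=1e-3, k1=0.01, k2=0.01, k3=0.01,
                       proj_norm0=5.0, theta=0.5)
far = vdot_guaranteed_nonpositive(-1e-3, 1e-3, 0.01, 0.01, 0.01, 5.0, 0.5)
near = vdot_guaranteed_nonpositive(-1e-4, 1e-3, 0.01, 0.01, 0.01, 5.0, 0.5)
print(bound, far, near)
```

With these assumed numbers the condition holds for the larger gradient term and fails for the smaller one, which mirrors the distinction between the far and near regions made in the text.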
4.3 Nominal Formation Control Numerical Simulation
Consider the case of ten satellites. Each satellite has a mass of 100 kg and is controlled with continuous thrusters in the three independent directions. The target spacecraft is in a circular, low Earth orbit corresponding to a mean motion of n = 0.0012 and is assigned to be the center of the relative motion frame, i.e., xt = (0, 0, 0). In each of the following simulations, the control gains are chosen
as k1 = k2 = k3 = 0.01 and k4 = 0.03. We assign the value k0 = n, since this is the angular velocity of the orbit, which yields one full circumnavigation per period of the target spacecraft's orbit. In each simulation, we desire to achieve a 2 × 1 elliptical orbit in the ei − ej plane, which implies the following parameters:
$$\alpha_M = (1, 0, 0), \quad \alpha = (0, 0, 1), \quad \alpha_m = (0, 1, 0), \quad \beta = 2, \quad \gamma = 1, \quad \rho = 3. \tag{30}$$
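The choice k0 = n ties the circumnavigation rate to the orbital period of the target. The sketch below works out that period and the corresponding circular orbit radius; it assumes that n is expressed in rad/s (the text does not state the unit) and uses the standard value of Earth's gravitational parameter. The function names are illustrative.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # Earth's gravitational parameter, km^3/s^2


def orbital_period(mean_motion):
    """Period in seconds of a circular orbit with mean motion in rad/s."""
    return 2.0 * math.pi / mean_motion


def circular_orbit_radius(mean_motion, mu=MU_EARTH_KM3_S2):
    """Radius in km from n^2 a^3 = mu (two-body, circular orbit)."""
    return (mu / mean_motion ** 2) ** (1.0 / 3.0)


n = 0.0012  # mean motion quoted in the text; rad/s is an assumption
period_s = orbital_period(n)
radius_km = circular_orbit_radius(n)
print(f"period: {period_s:.1f} s ({period_s / 60.0:.1f} min)")
print(f"orbit radius: {radius_km:.0f} km")
```

Under this assumption one circumnavigation takes roughly 87 minutes, which is comparable to the 5000 s horizon on the time axes of the simulation plots.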
Furthermore, each agent wants to be "equiangular" from its neighbors. In this case, with ten spacecraft, the agents want to achieve a splay state configuration; namely, each agent wants to be π/5 radians away from its sequential neighbor. For this case, the distances dij in augmented Problem Definition 4' given in Sect. 4.1 are collected in the matrix δ = [δij]N×N, where each element δij is the prescribed distance between satellites i and j, i.e., the quantity defined in (15):
$$
\delta \approx \begin{bmatrix}
0.00 & 1.85 & 3.53 & 4.85 & 5.71 & 6.00 & 5.71 & 4.85 & 3.53 & 1.85 \\
1.85 & 0.00 & 1.85 & 3.53 & 4.85 & 5.71 & 6.00 & 5.71 & 4.85 & 3.53 \\
3.53 & 1.85 & 0.00 & 1.85 & 3.53 & 4.85 & 5.71 & 6.00 & 5.71 & 4.85 \\
4.85 & 3.53 & 1.85 & 0.00 & 1.85 & 3.53 & 4.85 & 5.71 & 6.00 & 5.71 \\
5.71 & 4.85 & 3.53 & 1.85 & 0.00 & 1.85 & 3.53 & 4.85 & 5.71 & 6.00 \\
6.00 & 5.71 & 4.85 & 3.53 & 1.85 & 0.00 & 1.85 & 3.53 & 4.85 & 5.71 \\
5.71 & 6.00 & 5.71 & 4.85 & 3.53 & 1.85 & 0.00 & 1.85 & 3.53 & 4.85 \\
4.85 & 5.71 & 6.00 & 5.71 & 4.85 & 3.53 & 1.85 & 0.00 & 1.85 & 3.53 \\
3.53 & 4.85 & 5.71 & 6.00 & 5.71 & 4.85 & 3.53 & 1.85 & 0.00 & 1.85 \\
1.85 & 3.53 & 4.85 & 5.71 & 6.00 & 5.71 & 4.85 & 3.53 & 1.85 & 0.00
\end{bmatrix}. \tag{31}
$$
Note that δii is equal to zero for each i ∈ V, and that the entries δij with i, j ∈ V such that |i − j| = 5 correspond to agents that are π radians apart, i.e., to a separation of 2ρ. Lastly, the initial conditions of each spacecraft are chosen randomly as follows: xi(0) ∈ [−10, 10] × [−10, 10] × [−10, 10] (km), vi(0) ∈ [−1, 1] × [−1, 1] × [−1, 1] (km/s). The formation trajectory, the norm of each spacecraft's control, and the overall Lyapunov function V = V1 + V2 + V3 in (16) are shown in Figs. 6 and 7. The trajectory in Fig. 6 shows that the agents reach the desired 2 × 1 ellipse smoothly; moreover, they reach the formation defined by the δ matrix in (31). Together with the spatial trajectories in Fig. 6, the top plot in Fig. 7 shows that the control inputs converge to zero once the formation is achieved. More specifically, the resulting control trajectories directly correspond to the zero-fuel elliptical motions. To confirm that the desired formation is achieved, the bottom plot in Fig. 7 shows that the Lyapunov function in (16), evaluated along the solution, converges to zero.
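The entries of (31) agree, to two decimals, with chord lengths between equally spaced points on a circle of radius ρ = 3. The sketch below regenerates the matrix under that reading; the chord formula is an inference from the printed values rather than a restatement of (15), and the function name is illustrative.

```python
import math


def splay_distance_matrix(num_agents, radius):
    """Chord lengths between num_agents equally spaced points on a circle.

    Entry (i, j) is 2 * radius * sin(pi * |i - j| / num_agents), i.e. the
    straight-line distance between two points separated by an angle of
    2 * pi * |i - j| / num_agents.
    """
    return [[2.0 * radius * math.sin(math.pi * abs(i - j) / num_agents)
             for j in range(num_agents)]
            for i in range(num_agents)]


delta = splay_distance_matrix(num_agents=10, radius=3.0)
for row in delta:
    print(" ".join(f"{entry:4.2f}" for entry in row))
```

The largest entry, 2ρ = 6, appears where the index offset is 5, which corresponds to agents on opposite sides of the circle.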
Fig. 6 Trajectories of the ten agents converging to a 2 × 1 ellipse (axes: radial, in-track, and cross-track position, in km)
5 Satellite Network Resilience to Communication Malfunction
In this section, we consider the case in which the information coming from neighboring satellites in the network is potentially corrupted by internal malfunctions, which may affect the formation of satellites. As mentioned in Definition 3, each satellite is considered either a normal agent or a malfunctioning agent. Intuitively, a malfunctioning agent does not know that it is malfunctioning and therefore cannot alert its neighbors to the error. Instead, these malfunctioning agents share their output as yi = εi(t, xi, vi), where εi : R≥0 × R3 × R3 → R6 is a potentially time-varying unknown function defining the sensor malfunction. If the satellite is a normal agent, then its output is given by yi = xi, as in the nominal satellite mode presented in Sect. 2.3.
Fig. 7 (top) Norm of the inputs |ui| of each agent, showing that the control input converges to zero as the spacecraft reaches the formation; (bottom) the Lyapunov function composed with the solution. Horizontal axes: time (sec)
5.1 Resilient Formation Control Algorithm
In this section, we utilize the work in [22] to define an algorithm with characteristics similar to the W-MSR algorithm presented in the previous section. Note that V in (16) has only one term that depends on neighboring satellite information. More specifically, the function V3 in (17) uses information from connected satellites to determine the actions necessary to achieve the correct formation on the plane, which makes it vulnerable to malfunctioning communications. The following update law is applied to guarantee that such information does not influence the formation of the spacecraft:
(i) Let F be the number of malfunctioning agents.
(ii) At each time step t, each i-th satellite receives values from all its neighbors, calculates
$$\psi_{ij}(t) := \left(|\phi_i^\alpha - \phi_j^\alpha|^2 - \frac{\delta_{ij}^2}{\rho^2}\right) \left\langle \alpha \times \phi_i^\alpha,\ \phi_i^\alpha - \phi_j^\alpha \right\rangle \left(\alpha \times \phi_i^\alpha\right)$$
for each j ∈ N(i) ∪ {i}, and sorts this list in descending order.
(iii) For each i-th agent, if there are more than F elements larger than ψii in the sorted list from step (ii), remove the top F elements. If there are fewer than F such elements, remove all elements larger than ψii. The same procedure is applied to the values smaller than ψii. Note that, for each agent, ψii is identically zero.
(iv) Let Ri(t) denote the set of agents whose values were removed by agent i at time t.
(v) Each i-th node applies its update value gi = h1i + h2i + h3i, where h1i–h3i are given in (21) with V3 augmented to have the following form:
$$V_3 = \frac{1}{4}\sum_i \sum_{j \in N(i) \setminus R_i(t)} \left(|\phi_i^\alpha - \phi_j^\alpha|^2 - \frac{\delta_{ij}^2}{\rho^2}\right)^2,$$
which implies that $\nabla_{q_i}V_3$ is given by
$$\nabla_{q_i}V_3 = \frac{1}{|P_\alpha q_{it}|}\sum_{j \in N(i) \setminus R_i} \left(|\phi_i^\alpha - \phi_j^\alpha|^2 - \frac{\delta_{ij}^2}{\rho^2}\right) \left\langle \alpha \times \phi_i^\alpha,\ \phi_i^\alpha - \phi_j^\alpha \right\rangle \left(\alpha \times \phi_i^\alpha\right).$$
Under this augmented algorithm, at each time step t and for each i-th agent, a set of agents Ri must be computed. Each agent then discards the information of the neighboring agents contained in Ri. This effectively limits the amount of information used by each agent. As a consequence, the convergence rate decreases, because less information is available to the Lyapunov-based control law in (17). Following [22], if the graph is (2F + 1)-robust, where F is the number of malfunctioning agents, then the formation problem is solved under the influence of malfunctioning communication. To showcase this, consider the same setup as in Sect. 4.3 but with two malfunctioning agents, as shown in Figs. 8 and 9. Namely, in this case, agents 4 and 10 output a constant, randomly chosen position value yi ∈ [0, 100]3 ⊂ R3, where [0, 100]3 is a regular three-dimensional volume 100 units wide in each dimension. Agents are assumed to be completely connected, so the second eigenvalue of the Laplacian is λ2 = 10. This network topology meets the robustness criterion required for the agents to be resilient to errors in their neighbors. Initially, neighboring agents incorporate the corrupted output value yi into their formation control algorithm while converging to the formation defined in (31). After some transient response, the agents identify the malfunctioning agents and remove their data from the formation control algorithm, which can be seen in the bottom two plots in Fig. 9. Moreover, the agents converge to a 2 × 1 km elliptical orbit, as indicated by the control inputs ui in Fig. 9 converging to zero, which also indicates that the information from the malfunctioning agents is removed.
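The filtering in steps (ii) to (iv) can be sketched as follows. This is a simplified illustration rather than the chapter's implementation: each received value is treated as a scalar, the agent's own value is taken as the reference (zero for ψii), and the function name and data layout are hypothetical.

```python
def filter_neighbors(own_value, neighbor_values, max_faulty):
    """Discard suspicious neighbor values in the spirit of a W-MSR filter.

    own_value       -- the agent's own reference value (psi_ii, zero in the text)
    neighbor_values -- dict mapping neighbor id -> received scalar value
    max_faulty      -- F, the assumed number of malfunctioning agents

    Returns (kept_ids, removed_ids). Up to F of the largest values above
    own_value and up to F of the smallest values below it are removed; if
    fewer than F values lie on one side, all of them are removed.
    """
    larger = sorted((j for j, v in neighbor_values.items() if v > own_value),
                    key=lambda j: neighbor_values[j], reverse=True)
    smaller = sorted((j for j, v in neighbor_values.items() if v < own_value),
                     key=lambda j: neighbor_values[j])
    removed = set(larger[:max_faulty]) | set(smaller[:max_faulty])
    kept = [j for j in neighbor_values if j not in removed]
    return kept, removed


# Hypothetical values received by one agent; ids 3 and 5 are outliers.
received = {1: 0.2, 2: -0.1, 3: 50.0, 4: 0.05, 5: -30.0, 6: 0.3}
kept, removed = filter_neighbors(0.0, received, max_faulty=1)
print(sorted(kept), sorted(removed))
```

The removed set plays the role of Ri(t), and the kept neighbors are the ones over which the augmented V3 is summed. Note that the filter also discards some healthy neighbors when F exceeds the number of actual outliers, which is one way to see why convergence slows down.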
Fig. 8 Trajectories of the ten agents converging to a 2 × 1 ellipse under the influence of two malfunctioning agents (in red) (axes: radial, in-track, and cross-track position, in km)
6 Satellite Formation Robustness
In Sect. 4, we considered the nominal case of formation control. Namely, there were no applied external forces or errors included in the model. In practical applications, several external perturbations beyond spherical gravity affect the motion of satellites. Examples are (a) gravity due to the unequal mass distribution of Earth (violating Assumption 1), (b) solar radiation pressure (SRP), (c) drag due to the atmosphere of the Earth, and (d) gravity due to the Moon and the Sun. In all cases, these perturbation accelerations are additive to the equations of motion, i.e., wi in (3). It follows that, through the change of variables in (12) with ui assigned by (9), the scaled dynamics with additive disturbances are
$$\dot q_i = p_i, \qquad \dot p_i = g_i + \Omega_i, \tag{32}$$
where $g_i = C z_i$ and $\Omega_i = C w_i$.
Fig. 9 (top) Norm of the inputs |ui| of each agent, showing that the control input converges to zero as the spacecraft reaches the formation; (center and bottom) trajectories of the first and second components (x1i ∈ R and x2i, respectively) of the position of each agent. Horizontal axes: time (sec)
More specifically, we consider the case of input-to-state stability, which is defined as follows:¹
Definition 5 (Input-to-State Stable) A system
$$\dot x = f(x, \omega), \qquad x(t) \in \mathbb{R}^n,$$
is called input-to-state stable (ISS) with respect to ω if there exist functions γ ∈ K and β ∈ KL such that
$$\|x(t)\| \le \beta(\|x_0\|, t) + \gamma(\|\omega\|_\infty)$$
holds for all x(0) = x0, all admissible inputs ω, and all t ≥ 0.
¹ A function γ : R≥0 → R≥0 is called a class-K function, denoted as γ ∈ K, if γ(0) = 0 and γ is strictly increasing. A function β : R≥0 × R≥0 → R≥0 is a class-KL function, denoted as β ∈ KL, if it is nondecreasing in its first argument, nonincreasing in its second argument, lim r→0 β(r, s) = 0 for each s ∈ R≥0 and, likewise, lim s→∞ β(r, s) = 0 for each r ∈ R≥0.
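As a concrete illustration of Definition 5, the scalar system ẋ = −x + ω satisfies the ISS estimate with β(r, t) = r e^{−t} and γ(r) = r. The sketch below integrates this toy system with a forward-Euler step and checks the bound along the trajectory; the system, the disturbance signal, and the step size are illustrative assumptions and are unrelated to the spacecraft model.

```python
import math


def simulate(x0, disturbance, dt=1e-3, t_end=10.0):
    """Forward-Euler integration of x_dot = -x + w(t); returns (t, x) pairs."""
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [(0.0, x)]
    for k in range(steps):
        x = x + dt * (-x + disturbance(k * dt))
        trajectory.append(((k + 1) * dt, x))
    return trajectory


def iss_bound(x0, w_max, t):
    """beta(|x0|, t) + gamma(w_max) with beta(r, t) = r exp(-t), gamma(r) = r."""
    return abs(x0) * math.exp(-t) + w_max


w_max = 0.5
trajectory = simulate(x0=4.0, disturbance=lambda t: w_max * math.sin(3.0 * t))
violations = [t for t, x in trajectory if abs(x) > iss_bound(4.0, w_max, t) + 1e-9]
print("bound violated at", len(violations), "samples")
```

The state decays from its initial value and then remains inside a band whose width is set by the disturbance magnitude, which is the qualitative behavior that the ISS property guarantees.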
Fig. 10 Trajectories of the three agents (Spacecraft 1, Spacecraft 2, Spacecraft 3) converging to a 2 × 1 ellipse under unmodeled disturbances (axes: radial, in-track, and cross-track position, in km)
As the control law in this chapter is Lyapunov-based and continuous, it affords a certain level of robustness, as demonstrated by the next theorem.
Theorem 1 The closed-loop system given by (13) with control strategy (22) is input-to-state stable (ISS) with respect to the additive disturbance Ωi after a sufficiently large time t.
Proof Let the scaled velocity dynamics for the i-th spacecraft be given as
$$\dot p_i = g_i + \Omega_i, \tag{33}$$
where the disturbance is bounded, $\|\Omega_i\| \le c_3$, for some constant $c_3 > 0$. Using Young's inequality,² we can infer the following on V̇4i and V4i:
² Young's inequality states that $ab \le \frac{a^2}{2\varepsilon} + \frac{\varepsilon b^2}{2}$ for all a, b, ε > 0.
Fig. 11 Combined Lyapunov function of three agents converging to a 2 × 1 ellipse under unmodeled disturbances (axes: V (dimensionless) versus time (hr))
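$$
\begin{aligned}
\dot V_{4i} &= (p_i - h_{1i} - h_{2i} - h_{3i})^\top\left(-k_4 (p_i - h_{1i} - h_{2i} - h_{3i}) + \Omega_i\right) \\
&= -k_4\|\bar p_i\|^2 + \bar p_i^\top \Omega_i \\
&\le -k_4\|\bar p_i\|^2 + \|\bar p_i\| c_3 \\
&\le -k_4\|\bar p_i\|^2 + \frac{k_4\theta\|\bar p_i\|^2}{2} + \frac{c_3^2}{2k_4\theta}, && \theta \in (0, 4) \\
&\le -k_4\left(2 - \frac{\theta}{2}\right)V_{4i} + \frac{c_3^2}{2k_4\theta}, && \theta \in (0, 4)
\end{aligned} \tag{34}
$$
which, after integrating with respect to time, leads to
$$V_{4i}(t) \le V_{4i}(0)\, c_1 e^{-c_2 t} + \int_0^t e^{-c_2 (t - s)} \frac{c_3^2}{2k_4\theta}\, ds, \qquad \theta \in (0, 4), \tag{35}$$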
recalling that p̄i = pi − h1i − h2i − h3i. This implies that the velocity pi converges to a bounded region around the desired velocity h1i + h2i + h3i, whose size depends on the maximum magnitude of the disturbance. From the above, ISS of the velocity states then follows from Lemma 4.6 of [39].
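The size of the bounded region can be made explicit by evaluating the integral in (35). The following worked step assumes $c_2 > 0$ and, for the last line only, reads $V_{4i} = \tfrac{1}{2}\|\bar p_i\|^2$ off the first line of (34); both are assumptions made for this illustration.

```latex
\begin{align*}
\int_0^t e^{-c_2 (t - s)} \frac{c_3^2}{2 k_4 \theta}\, ds
  &= \frac{c_3^2}{2 k_4 \theta c_2}\left(1 - e^{-c_2 t}\right)
   \le \frac{c_3^2}{2 k_4 \theta c_2}, \\
V_{4i}(t) &\le V_{4i}(0)\, c_1 e^{-c_2 t} + \frac{c_3^2}{2 k_4 \theta c_2}, \\
\limsup_{t \to \infty} \|\bar p_i(t)\|
  &\le \sqrt{2 \cdot \frac{c_3^2}{2 k_4 \theta c_2}}
   = \frac{c_3}{\sqrt{k_4 \theta c_2}}.
\end{align*}
```

Under these assumptions the residual velocity error scales linearly with the disturbance bound $c_3$ and shrinks as the gain $k_4$ grows.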
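With the velocity states proven ISS, we can say that trajectories in the velocity space satisfy
$$
\begin{aligned}
\|p_i - h_{1i} - h_{2i} - h_{3i}\| &\le \beta(\|\bar p_i\|, t) + c_4, \qquad \beta \in \mathcal{KL} \\
&\le c_1 e^{-c_2 t} + c_4,
\end{aligned} \tag{36}
$$
where $c_4 > 0$ is a constant bound given by the ISS property and dependent on the magnitude of the additive disturbance. This, in turn, affects (28) as
$$
\begin{aligned}
\dot V &\le \Xi(q_i) + \left\|\sum_{i=1}^{N} \left(\nabla_{q_i}V_1 + \nabla_{q_i}V_2 + \nabla_{q_i}V_3\right)\right\| \left(c_1 e^{-c_2\|\bar p_i\| t} + c_4\right) \\
&\le \Xi(q_i) + \left(\sum_{i=1}^{N} \|\nabla_{q_i}V_1\| + \|\nabla_{q_i}V_2\| + \|\nabla_{q_i}V_3\|\right)\left(c_1 e^{-c_2\|\bar p_i\| t} + c_4\right),
\end{aligned} \tag{37}
$$
where $\Xi(q_i)$ denotes the first term of (28).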
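Now consider that, after a sufficiently long time t, the component $c_4$ dominates $c_1 e^{-c_2\|\bar p_i\| t}$. If we assume that the exponential term is sufficiently small compared with the constant $c_4$, we can perform an analysis similar to the one above using Young's inequality, with $\Xi(q_i)$ the first term of (28) and $\theta \in (0, 1)$, to obtain the following:
$$
\begin{aligned}
\dot V &\le \Xi(q_i) + \left(\sum_{i=1}^{N} \|\nabla_{q_i}V_1\| c_4 + \|\nabla_{q_i}V_2\| c_4 + \|\nabla_{q_i}V_3\| c_4\right) \\
&\le \Xi(q_i) + \sum_{i=1}^{N} \left(\frac{\|\nabla_{q_i}V_1\|^2 k_1\theta}{2} + \frac{c_4^2}{2k_1\theta} + \frac{\|\nabla_{q_i}V_2\|^2 k_2\theta}{2} + \frac{c_4^2}{2k_2\theta} + \frac{2\|\nabla_{q_i}V_3\|^2 k_3\|P_\alpha q_i(0)\|\theta}{2} + \frac{c_4^2}{4k_3\|P_\alpha q_i(0)\|\theta}\right) \\
&= \Xi(q_i)(1 - \theta) + \frac{c_4^2}{2k_1\theta} + \frac{c_4^2}{2k_2\theta} + \frac{c_4^2}{4k_3\|P_\alpha q_i(0)\|\theta}.
\end{aligned} \tag{38}
$$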
From Proposition 1, it follows that the first term of (28) is negative definite, and thus we can demonstrate ISS from Lemma 4.6 of [39]. To illustrate this, consider a formation of three satellites in the same orbit as in the previous examples, with initial conditions
$$
\begin{aligned}
x_1 &= [10,\ 5,\ 10] \text{ (km)}, & v_1 &= [0,\ 0,\ 0] \text{ (km/s)}, \\
x_2 &= [-10,\ -10,\ -10] \text{ (km)}, & v_2 &= [0,\ 0,\ 0] \text{ (km/s)}, \\
x_3 &= [10,\ -10,\ 10] \text{ (km)}, & v_3 &= [0,\ 0,\ 0] \text{ (km/s)}.
\end{aligned} \tag{39}
$$
The three satellites have the goal of reaching a 2 × 1 ellipse. In contrast to the above, the satellites are affected not just by spherical Earth gravity but also by gravity due to unequal mass distribution (i.e., J2), SRP, drag, and gravity from the Sun and the Moon. To give an idea of the relative orders of magnitude, primary spherical gravity, around which the nominal equations of motion are derived, is roughly
10⁻² km/s² at low Earth orbit. The perturbation accelerations have the following orders of magnitude at low Earth orbit: (a) unequal mass distribution, 10⁻⁵ km/s²; (b) SRP, 10⁻¹⁰ km/s²; (c) drag, 10⁻⁴ km/s²; and (d) Sun and Moon gravity, 10⁻⁹ km/s² (see [29]). Note that while some of these accelerations are small, they persist over time. Figures 10 and 11 demonstrate that, even under these unmodeled effects, the satellites converge to a neighborhood of a 2 × 1 ellipse. Furthermore, Fig. 11 shows that the overall Lyapunov function does not reach zero but remains within a bound around zero, demonstrating the ISS robustness property.
7 Conclusion
In this chapter, we considered a novel spacecraft formation problem, wherein satellites were modeled in Hill's frame. The particular formation under consideration requires each agent to converge to an oscillatory trajectory around a center point, with a prespecified separation, in an elliptical orbit. To solve this problem, we recast it as a set stabilization problem and designed a Lyapunov-based controller. We then showed that the set of points defining this formation is locally practically stable. Moreover, the control law was shown to be resilient to malfunctioning satellite communications or measurements through a W-MSR protocol and robust, in the sense of input-to-state stability, to additive perturbations.
Acknowledgments The work of Rafael Fierro was supported in part by the Air Force Research Laboratory (AFRL) under agreement number FA9453-18-2-0022.
References
1. C. Chen, K. Xie, F.L. Lewis, S. Xie, R. Fierro, Adaptive synchronization of multi-agent systems with resilience to communication link faults. Automatica 111, 108636 (2019)
2. A.R. Girard, J.B. De Sousa, J.K. Hedrick, An overview of emerging results in networked multivehicle systems, in Proceedings of the American Control Conference (2004), pp. 1485–1490
3. J. Marshall, M.E. Broucke, B. Francis, Formations of vehicles in cyclic pursuit. Trans. Autom. Control. 49(11), 1963–1974 (2004)
4. W. Ren, Formation keeping and attitude alignment for multiple spacecraft through local interactions. J. Guid. Control. Dyn. 30(2), 633–638 (2007)
5. K. Zhang, M. Demetriou, Optimization and adaptation of consensus penalty terms for the attitude synchronization of spacecraft formation, in Proceedings of the Guidance, Navigation, and Control Conference (2013)
6. D. Morgan, S. Chung, F.Y. Hadaegh, Model predictive control of swarms of spacecraft using sequential convex programming. J. Guid. Control Dyn. 37(6), 1725–1740 (2014)
7. L. Sauter, P. Palmer, Analytic model predictive controller for collision-free relative motion reconfiguration. J. Guid. Control Dyn. 35(4), 1069–1079 (2012)
8. G.R. Frey, C.D. Petersen, F.A. Leve, E. Garone, I.V. Kolmanovsky, A.R. Girard, Parameter governors for coordinated control of n-spacecraft formations. J. Guid. Control Dyn. 40(11), 3020–3025 (2017)
9. A.K. Das, R. Fierro, V. Kumar, J.P. Ostrowski, J. Spletzer, C.J. Taylor, A vision-based formation control framework. IEEE Trans. Robot. Autom. 18(5), 813–825 (2002). https://doi.org/10.1109/TRA.2002.803463
10. K. Sakurama, Unified formulation of multi-agent coordination with relative measurements. IEEE Trans. Autom. Control, 1–16 (2020). https://doi.org/10.1109/TAC.2020.3030761
11. M. Kirschner, O. Montenbruck, S. Bettadpur, Flight dynamics aspects of the grace formation flying, in International Workshop on Satellite Constellations and Formation Flying (2001), pp. 19–20
12. E. Gill, O. Montenbruck, S. D'Amico, Autonomous formation flying for the prisma mission. J. Spacecr. Rocket. 44(3), 671–681 (2007)
13. J. Gangestad, B. Hardy, D. Hinkley, Operations, orbit determination, and formation control of the aerocube-4 cubesats, in Proceedings of the Conference on Small Satellites (2007), pp. 671–681
14. J.L. Burch, V. Angelopoulos, The THEMIS mission (Springer, New York, 2009)
15. X. Yu, M.A. Hsieh, Synthesis of a time-varying communication network by robot teams with information propagation guarantees. IEEE Robot. Autom. Letters 5(2), 1413–1420 (2020)
16. V.S. Varadharajan, D. St-Onge, B. Adams, G. Beltrame, Swarm relays: distributed self-healing ground-and-air connectivity chains. IEEE Robot. Autom. Letters 5(4), 5347–5354 (2020)
17. J. Hromkovic, R. Klasing, A. Pelc, P. Ruzicka, W. Unger, Dissemination of Information in Communication Networks: Broadcasting, Gossiping, Leader Election, and Fault-Tolerance (Texts in Theoretical Computer Science. An EATCS Series) (Springer, Berlin, 2005)
18. U. Mackenroth, Robust Control Systems: Theory and Case Studies (Springer, Berlin, 2004)
19. S. Phillips, R.G. Sanfelice, Robust distributed synchronization of networked linear systems with intermittent information. Automatica 105, 323–333 (2019)
20. S. Phillips, Y. Li, R.G. Sanfelice, A hybrid consensus protocol for pointwise exponential stability with intermittent information. 10th IFAC Symp. Nonlinear Control Syst. NOLCOS 2016 49(18), 146–151 (2016)
21. L. Guerrero-Bonilla, D. Saldana, V. Kumar, Design guarantees for resilient robot formations on lattices. IEEE Robot. Autom. Letters 4(1), 89–96 (2018)
22. H. Zhang, S. Sundaram, A simple median-based resilient consensus algorithm, in Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton) (2012), pp. 1734–1741. https://doi.org/10.1109/Allerton.2012.6483431
23. H.J. LeBlanc, H. Zhang, X. Koutsoukos, S. Sundaram, Resilient asymptotic consensus in robust networks. IEEE J. Sel. Areas Commun. 31(4), 766–781 (2013)
24. J. Aubin, H. Frankowska, Set-valued Analysis, in Systems and Control (Birkhäuser, Basel, 1990)
25. C. Godsil, G. Royle, Algebraic Graph Theory (Springer, New York, 2001)
26. J. Usevitch, D. Panagou, r-robustness and (r, s)-robustness of circulant graphs, in Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control (CDC) (2017), pp. 4416–4421. https://doi.org/10.1109/CDC.2017.8264310
27. D. Saldaña, A. Prorok, S. Sundaram, M.F.M. Campos, V. Kumar, Resilient consensus for time-varying networks of dynamic agents, in Proceedings of the 2017 American Control Conference (ACC) (2017), pp. 252–258. https://doi.org/10.23919/ACC.2017.7962962
28. A. de Ruiter, C. Damaren, J. Forbes, Spacecraft Dynamics and Control: An Introduction (Wiley, London, 2012)
29. O. Montenbruck, E. Gill, Satellite Orbits: Models, Methods, Applications (Springer, Heidelberg, 2005)
30. R. Olfati-Saber, R.M. Murray, Graph rigidity and distributed formation stabilization of multivehicle systems, in Proceedings of the IEEE Conference on Decision and Control, vol. 3 (IEEE, New York, 2002), pp. 2965–2971
31. R. Olfati-Saber, J.S. Shamma, Consensus filters for sensor networks and distributed sensor fusion, in 44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference (CDC-ECC'05) (IEEE, New York, 2005), pp. 6698–6703
32. Z. Miao, D. Thakur, R.S. Erwin, J. Pierre, Y. Wang, R. Fierro, Orthogonal vector field-based control for a multi-robot system circumnavigating a moving target in 3D, in 2016 IEEE 55th Conference on Decision and Control (CDC) (2016), pp. 6004–6009
33. B. Anderson, C. Yu, B. Fidan, J.M. Hendrickx, Rigid graph control architectures for autonomous formations. IEEE Control. Syst. Mag. 28(6), 48–63 (2008)
34. D. Thakur, S. Hernandez, M.R. Akella, Spacecraft swarm finite-thrust cooperative control for common orbit convergence. AIAA J. Guid. Control Dyn. 38(3), 478–488 (2015). https://doi.org/10.2514/1.G000621
35. J.L. Ramirez-Riberos, M. Pavone, E. Frazzoli, D.W. Miller, Distributed control of spacecraft formations via cyclic pursuit: Theory and experiments. AIAA J. Guid. Control Dyn. 33(5), 1655–438 (2010). https://doi.org/10.2514/1.46511
36. Z. Miao, Y. Wang, R. Fierro, Cooperative circumnavigation of a moving target with multiple nonholonomic robots using backstepping design. Syst. Control Lett. 103, 58–65 (2017)
37. C. Petersen, R. Fierro, Network-Lyapunov technique for spacecraft formation control, in Proceedings of the AIAA SciTech Conference (2018)
38. Z. Miao, D. Thakur, R.S. Erwin, J. Pierre, Y. Wang, R. Fierro, Orthogonal vector field-based control for a multi-robot system circumnavigating a moving target in 3D, in Proceedings of the IEEE Conference on Decision and Control, Las Vegas, NV (2016), pp. 6004–6009
39. H.K. Khalil, Nonlinear Systems (Prentice Hall, Englewood, 2002)
A Methodology for the Assessment of Efficiency in Systems Under Transient Conditions: Case Study for Hybrid Storage Systems in Elevators
Jorge García, Cristina González-Morán, Pablo García, and Pablo Arboleya
Acronyms
AC: Alternating Current
DC: Direct Current
DPC: Direct-Parallel Connection
EB: Electrochemical Battery
EMI: Electromagnetic Interference
ESR: Equivalent Series Resistor
ESS: Energy Storage System
EV: Electric Vehicle
HESS: Hybrid Energy Storage System
HF: High Frequency
IGBT: Insulated Gate Bipolar Transistor
IoT: Internet of Things
LPF: Low-Pass Filter
MPPT: Maximum Power Point Tracking
Mtoe: Million tons of oil equivalent
NPC: Neutral-Point Clamped (converter)
PEC: Power Electronic Converter
PECG: Power Electronic Converter for Grid
PEGL: Power Electronics Converter for Generation/Load
PFC: Power Factor Correction
PI: Proportional-Integral (regulator)
PV: Photovoltaic
SPC: Series-Parallel Connection
SM: Supercapacitor Module
VSI: Voltage Source Inverter
WP: Wind Power
J. García · C. González-Morán · P. García · P. Arboleya
LEMUR Group, Department of Electrical Engineering, University of Oviedo, Gijon, Spain
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_9
1 Introduction
The energy consumption of buildings at the EU level represented 458 million tons of oil equivalent (Mtoe), more than 40% of the final energy consumption and more than 60% of the electrical energy consumption [1]. Although a high percentage of this energy is used for ambient heating [2], loads such as large household appliances and auxiliary services in buildings also represent a significant share of electricity consumption [3]. Any increase in the efficiency of these types of loads, however small, can lead to significant overall energy savings. In addition, as part of its “Clean Energy for all Europeans” package [4], the Council of the European Union has established a series of measures to achieve the objectives of efficiency, energy consumption, and decarbonization, recognizing that the deployment of systems that allow for better management of consumption flexibility and distributed generation is a pending task. Managing flexibility is critical not only because of the need to increase efficiency but also to reduce the impact on the distribution network of the growing electrification process, which is currently shifting toward electricity large parts of the energy consumption previously covered by other technologies. As a general figure, it can be estimated that vertical transportation in buildings (elevators, vertical conveyors, escalators, etc.) moves more than one billion people worldwide every day, which represents a significant portion of the energy consumption of an average building. Elevator systems have been reported to account for up to around 10% of the energy consumption of an average building [5, 6]. Therefore, measures to improve the performance of the elevator system will also have an enormous impact on global energy consumption in transportation applications, making it a strategic research topic.
Several research lines in the technical literature confirm this interest, fostered by the development of ubiquitous IoT (Internet of Things) technologies and intelligent management strategies that underpin the concept of energy management in smart buildings [7–10]. In particular, the present work focuses on two of these research lines. The first one is based on the development of software techniques that provide analysis and design tools for the dimensioning and development of elevator systems, with the aim of optimizing their overall performance [11–14]. These tools are generally developed to provide the output parameters of functional system design procedures, starting with the configuration of elevators [15–17]. However, there are also development platforms for designing the management of elevator systems, which mainly focus on the analysis of system traffic [18, 19]. The second research area is the integration of energy storage devices at the system level to improve the energy efficiency of the overall system [7, 20–24]. This work discusses the operation of an energy management system in a hybrid energy storage system specifically developed for an elevator powertrain in a commercial building. Besides providing energy support in case of mains failure, the storage units are also coordinated to attain efficient management of the power flows in the full system, allowing for the implementation of a peak-shaving grid power
strategy. The design procedure of such a storage system needs to take into account the effects of the varying operating conditions on key performance parameters. Among these key performance parameters, the system efficiency plays a vital role, as it is required to define major aspects such as the power losses, the thermal performance, the reliability of the system, or even exploitation costs. For this purpose, the work introduces a methodology for the selection of the optimal configuration of the power electronic conversion systems in terms of energy efficiency. The algorithm, intended for evaluating the performance of a given power conversion stage, includes a procedure for the selection of the power electronic topology and for the adjustment of the dynamic control parameters. The major contribution of the work consists of the definition of a method that captures the same information that can be obtained from the thorough computation of full historic mission profiles of the power demanded by the system, but using a simplified characteristic power profile. As a consequence, by using the proposed strategy, which relies on this simple characteristic profile, an optimal configuration of the control parameters can still be derived for the system even though the complete power profiles are not available. An additional advantage of this contribution is that the approach shows very accurate results with a reduced number of calculations. This last aspect opens the possibility of implementing the resulting algorithm, with its low computational burden, in real-time control schemes. The methodology is illustrated and applied to a particular case study defined in this discussion.
But in any case, the same control algorithm could be applied to other applications that inherently perform on a non-steady-state operation, such as any kind of load/generator with accumulation capability, vehicle charging systems combined with second life batteries, vehicle to grid applications, flexible distributed resources, etc. Taking into account this possibility of generalization, a revision of the state of the art of energy efficiency in converters is given in the following paragraphs. The analysis of the efficiency in power electronic converters is generally a complex task that depends on several aspects. Firstly, it relies upon the main purpose and ratings of the system considered. But also, it depends on the specific technical solution deployed for the application. And, finally, it also depends on the operating conditions given by the load profiles of the system. One key aspect to assess this performance is the accuracy of the analytical and simulation models considered. The simplest definition of the efficiency, ηd , of any general system able to process energy can be expressed by Eq. (1) as a function of the steady-state average rated input and output power values, PI and PO , respectively, and alternatively as a function of the power losses, Ploss :
$$\eta_d = \frac{P_O}{P_I} = \frac{P_O}{P_O + P_{loss}} \tag{1}$$
If applied to power electronic converters, this definition assumes that the device is operating in a steady state, at nominal operating values. But this figure of merit, even though it gives a general idea of the performance of the system, is not suitable
J. García et al.
for most applications. This expression does not consider the operation of the system under different functional conditions (percentage of full load, input/output voltage levels, etc.), nor the evolution of the system during transient or complex operation modes, in which the steady state is not clearly defined or is not representative of the overall performance (e.g., in an energy backup system, the steady state is idle; thus, the output power is null most of the time, and the power consumption evolves as a sequence of transients). If any of the power values involved varies in time, then the efficiency in a given interval can still be computed using Eq. (1), but it must be done in terms of energy rather than power levels. In general terms, a power converter might operate at conditions far from the rated settings. Examples of these applications are renewable generators, such as photovoltaic (PV) systems (with varying irradiance) or wind power (WP) generation systems (with variable wind speed conditions), powertrain converters in electric vehicles (EVs) (with varying load profiles due to different driving cycles), storage systems in microgrids for grid support (with random instantaneous demand and generation profiles), etc. For all these applications with a wide operating range, the simplistic definition of efficiency (steady state, nominal operation) is not accurate enough to support sound design decisions based on the energy efficiency performance. A number of methodologies and calculation strategies have been proposed in the literature in order to assess the efficiency of the converters in these applications. In the case of grid integration of renewable sources, the most common analysis assumes different load conditions at rated voltage values in the converters, always in steady-state operation [23–25].
The case of single-phase power factor correction (PFC) converters is also reported [26, 27], discussing the efficiency values of a variety of topologies in steady-state operation. In [28], a steady-state analysis of a resonant DC-DC converter is performed as a function of variables such as the load and the voltage levels at the input or at the output of the converter. Reference [29] compares the performance of a neutral-point clamped (NPC) converter under different modulation strategies. This is done by building a detailed model of the losses in the system, also taking into account the thermal behavior of the converter. In [30], the authors propose a systematic approach for assessing the power density of different topologies for three-phase converters, taking into account the effect of losses, power quality, EMI, etc. However, in these comparisons, the performance is assessed at a single operating condition, either the worst-case scenario or the nominal load power demand. In [31], an extensive comparative evaluation of three-phase converters is presented, which includes an assessment of the efficiency under key constructive parameters, such as the power density, switching frequency, or power-to-mass ratio, considering steady-state operation at two different load conditions. In some specific applications, for instance photovoltaic (PV) applications, the way to cope with this lack of accuracy is to define several composite efficiency figures of merit, such as the European Efficiency (ηE) or the California Energy Commission Efficiency (ηC) [32–34]. These are, in summary, weighted average
A Methodology for the Assessment of Efficiency in Systems Under Transient. . .
values of the converter efficiency calculated from the efficiency values measured at different load conditions in steady state, defined by (2) and (3):

$$\eta_E = 0.03\,\eta_{5\%} + 0.06\,\eta_{10\%} + 0.13\,\eta_{20\%} + 0.10\,\eta_{30\%} + 0.48\,\eta_{50\%} + 0.20\,\eta_{100\%} \tag{2}$$

$$\eta_C = 0.04\,\eta_{10\%} + 0.05\,\eta_{20\%} + 0.12\,\eta_{30\%} + 0.21\,\eta_{50\%} + 0.53\,\eta_{75\%} + 0.05\,\eta_{100\%} \tag{3}$$

where ηi% is the efficiency of the converter loaded with i% of the rated power. These weighted efficiencies account for the converter performance over a wide range of loads and are good parameters for comparing different inverters for a given PV installation [35]. However, this method, even if extrapolated to other cases, is not valid for applications in which the main operation is a sequence of transient load profiles rather than a steady-state operating point (e.g., transportation, stochastic loads, microgrid storage systems, etc.). The definition of efficiency as a quotient between input and output steady-state average power values (even for different operating points) does not provide enough information on the performance of the energy conversion process in such applications.
For instance, in the case of PV systems, a number of studies assess the effects on the PV converter efficiency of different dynamic conditions on the maximum power point tracking (MPPT) algorithms [36], or even of different output power factors [37]. In the case of transportation applications, several references consider the performance of traction systems assuming fixed operating conditions [38, 39]. In [40], the author carries out a thorough computation of the traction system efficiency as a function of key parameters, such as the torque, the rotor speed, or the voltage levels in the inverter in steady state. Apart from transportation applications, the performance of AC drives has been studied as a function of key parameters such as the output power level [41] or the rotor speed [42]. In general, a relevant number of studies focused on the assessment of the power efficiency for generic industrial applications can be found in the technical literature [43].
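As a minimal sketch of how the weighted figures of merit in (2) and (3) are evaluated, the following Python fragment stores the weights exactly as given above and applies them to a set of steady-state efficiencies. The function name and the efficiency values in `eta_example` are illustrative assumptions, not data from this chapter.

```python
# Weights of Eqs. (2) and (3): load level (% of rated power) -> weight.
EURO_WEIGHTS = {5: 0.03, 10: 0.06, 20: 0.13, 30: 0.10, 50: 0.48, 100: 0.20}
CEC_WEIGHTS = {10: 0.04, 20: 0.05, 30: 0.12, 50: 0.21, 75: 0.53, 100: 0.05}


def weighted_efficiency(eta_by_load, weights):
    """Weighted average of steady-state efficiencies measured at fixed load levels."""
    missing = sorted(set(weights) - set(eta_by_load))
    if missing:
        raise ValueError(f"missing efficiency at load levels (%): {missing}")
    return sum(w * eta_by_load[load] for load, w in weights.items())


# Hypothetical efficiency curve, for illustration only (not measured data).
eta_example = {5: 0.90, 10: 0.93, 20: 0.95, 30: 0.96, 50: 0.97, 75: 0.97, 100: 0.96}

eta_euro = weighted_efficiency(eta_example, EURO_WEIGHTS)
eta_cec = weighted_efficiency(eta_example, CEC_WEIGHTS)
print(f"eta_E = {eta_euro:.4f}, eta_C = {eta_cec:.4f}")
```

Both figures only combine steady-state points, which is why they cannot describe a system that spends most of its time in transients.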
In these applications, the computation of efficiency and losses in transient operating conditions is done by means of the derivation of efficiency maps for combustion engines but also for electric machines and drives [44–49]. This scheme presents the desired parameter (usually, efficiency) in a multivariable coordinate system that is used in specific algorithms for the computation of the efficiency. Paradigmatic cases of systems operating under varying operating conditions are energy storage systems (ESS) for grid-connected applications, in which the ESS is intended to support the power quality of the grid, by absorbing and providing the harsh instant power transients that might perturb the grid energy supply. Another typical example is the case of the integration of distributed resources in power systems. The efficiency can be monitored in real time in order to change operating conditions, thus optimizing the efficiency of the system [50]. A relevant example of such distributed resources is the case of hybrid energy storage systems (HESS),
in which two or more different storage technologies are coordinated by means of power electronic converters and dedicated control stages. Under these coordination strategies, some of the individual devices work only during fast transients and remain in standby most of the time. In these systems, the usual definitions of efficiency do not provide enough information to make design decisions that consider relevant aspects such as round-trip/daily performance, reliability, or lifecycle efficiency [51]. For instance, the concept of efficiency maps has also been applied to compute the efficiency of some types of HESS [52, 53]. In the case of HESS for electric transportation, several interesting studies take into account their intrinsically continuously varying conditions. Reference [54] analyzes the performance of converters in hybrid storage systems, considering driving profiles, in order to select the optimal topology among the options studied. Similarly, the performance of industrial power converters can be assessed by computing the losses once the power profile is known [55]. However, these approaches need a complete definition of the key system variables as a function of time over the entire driving profile considered, which is not always possible, and they also imply relatively high computation requirements.
In the present work, a methodology is proposed for selecting the optimal configuration of power conversion systems in terms of energy efficiency for complete mission profiles. In this context, a mission profile is understood as the set of all relevant conditions to which the system under study is subjected in its specific application during its characteristic operation throughout its service life and which have an effect on its reliability.
The optimal configuration is defined as a set of parameters of given design variables (such as specific power converter topology, values of control parameters as, for instance, the bandwidth of the controllers, etc.) that ensure the highest efficiency in the given operating conditions. A preliminary version of the methodology was presented in [60], but the full derivation of the algorithm, the main discussion on the results of the technique to a full mission profile, and the application of a simplified version of the technique, based on a characteristic profile, are original contributions of this work. Also, the present paper includes the application of the methodology to a specific target implementation, aiming to validate the claims and the obtained results. The case of study is a HESS, and it will be applied to a target application, a storage system for energy support to the powertrain in an elevator system. The structure of the chapter is detailed in the following lines. The work firstly outlines the proposed methodology in Sect. 2. Then, the methodology is thoroughly applied to the specific case of study of HESS in Sect. 3. Also, in that section, the target application is introduced. Section 4 defines the target application converters’ power and control stages, and some hints on the calculation of the power losses in each case are provided in Sect. 5. Section 6 introduces the mission profiles for the application, which are the basic inputs to assess the performance of the system and are used in Sect. 7 to compute the full losses upon every operating condition considered. A discussion on the results after these computations is carried out in Sect. 8, which also introduces and verifies the simplification of the algorithm by
using a single power profile. Finally, Sect. 9 summarizes the conclusions of the research and proposes some future developments.
2 Methodology

In order to achieve the goals of the former discussion, the methodology of the work has been defined as the following sequence of steps:

(a) Definition of the Target Application. In the first place, a specific target application is considered. The general purpose of the system to design, the operation modes and features, and the limits of applicability must be established. This step is detailed in Sect. 3.

(b) Design of the Power Converter. After the application has been defined, the ratings of the converter, sources, loads, and storage devices, as well as the possible converter structure options under consideration, need to be settled. This work considers the case of two structures, although it can be generalized to a higher number of options. The control parameters also need to be established. Further details are given in Sect. 4.

(c) Model of the Power Losses. Since the energy efficiency is the main parameter to optimize in this work, a thorough calculation of the power losses in the converter has to be carried out in order to estimate the losses at different operating points. This computation is a function of the instant electrical operation parameters. The model needs to be experimentally verified with a real system or with adequate laboratory equipment. This is discussed, for the case of study, in Sect. 5.

(d) Characterization of the Mission Profile. Once the system is characterized, the real power profile of the operation of the system is generated, using the power profile generation tools available in the literature. Interval-defined profiles can then be extracted, which may be used for the assessment of the system performance. The construction of such profiles is detailed in Sect. 6.

(e) Calculation of Energy Losses and Efficiency. In order to consider not only the steady-state behavior but also the transient operation, a more generic definition of efficiency is taken into account. This well-known definition is obtained by integrating the instantaneous power of expression (1) over a given time interval, which yields energy, E. The result is expressions (4)–(5), where the sub-index Loss stands for losses:
$$\eta_d = \frac{E_O}{E_I} = \frac{E_O}{E_O + E_{Loss}} \tag{4}$$

$$\eta_d = \frac{\int_0^T p_o(t)\,dt}{\int_0^T p_o(t)\,dt + \int_0^T p_{Loss}(t)\,dt} \tag{5}$$
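The energy-based definition (5) can be evaluated numerically from sampled power waveforms. The sketch below is only an illustration under stated assumptions: the trapezoidal rule is one possible integration choice, and the pulse profile and loss levels are hypothetical values, not results of this work.

```python
def trapezoid(values, times):
    """Trapezoidal integral of sampled values over (possibly non-uniform) times."""
    if len(values) != len(times):
        raise ValueError("values and times must have the same length")
    return sum(
        0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    )


def energy_efficiency(p_out, p_loss, times):
    """Eq. (5): E_O / (E_O + E_Loss) over the interval covered by `times`."""
    e_out = trapezoid(p_out, times)
    e_loss = trapezoid(p_loss, times)
    if e_out + e_loss == 0:
        raise ValueError("no energy processed in the interval")
    return e_out / (e_out + e_loss)


# Hypothetical profile: a 2 s pulse of 2 kW inside a 10 s window, with
# 50 W of standby losses at all times and 150 W of losses during the pulse.
# Repeated time stamps encode the ideal steps of the pulse.
times = [0.0, 4.0, 4.0, 6.0, 6.0, 10.0]
p_out = [0.0, 0.0, 2000.0, 2000.0, 0.0, 0.0]
p_loss = [50.0, 50.0, 150.0, 150.0, 50.0, 50.0]

eta_interval = energy_efficiency(p_out, p_loss, times)
eta_steady = 2000.0 / (2000.0 + 150.0)  # Eq. (1) evaluated at the pulse level only
print(f"energy-based: {eta_interval:.3f}, steady-state: {eta_steady:.3f}")
```

In this example the standby losses accumulated while the output is idle pull the interval efficiency below the steady-state figure, which is the effect that Eq. (1) alone cannot capture.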
It is necessary to take into account that the results from (1) to (5) depend on the evolution of the system variables during the time considered. It is assumed that the target system behavior is defined between instants t = 0 and t = T. The idea is to calculate the energy losses along the evolution of the system in the typical operating conditions, including transient operation. Given that this evolution is a function of the control scheme implemented, the main goal of the work is to obtain an expression of the instant power losses as a function of the control variables:

$$p_{Loss}(t) = f\left(x_1(t), x_2(t), \ldots, x_n(t), d_1, d_2, \ldots, d_m\right) \tag{6}$$
where pLoss(t) is the instant value of the power losses, x1(t) . . . xn(t) are the n independent control variables of the system, and d1 . . . dm are the m independent design parameters considered. Once this expression is obtained and the evolution of the control variables is defined, the total energy loss in the system under these operating conditions can be calculated using Eq. (5). For the case under study, these calculations are described in Sect. 7.

(f) Differential Power Losses' Map and Computation of the Losses Through a Characteristic Profile. Up to this step, the methodology computes the power losses of the different options available, which requires thorough knowledge of both the description of the power losses and a complete set of mission profiles able to characterize the performance of the system in every relevant mode of operation. As discussed in the introductory section, this technique provides very rigorous results; however, it implies extensive, time-consuming computations, which might not be practical in terms of data size or processing capability, especially if the information must be ready in a short amount of time (e.g., for real-time control applications) or if the desired performance is not known in enough detail (e.g., mission profiles not available). The novelty of the work is the discussion and establishment of a methodology that reaches the same goal while avoiding extensive computations; this allows conclusions to be drawn comparatively faster, and also from incomplete information on the load profiles of the system. For this purpose, the approach followed in this work is to select simple yet representative transient profiles for the defined operational modes of the system, which simplifies the comparison procedure. Hence, a comparative study can be carried out to inform the choice of the most
suitable power topology for a given application. It also helps to select the best set of design parameters once a topology is chosen. To this end, two basic simplifications have been introduced in the technique; they are the basic contributions of the work. Firstly, for each target application considered, the calculation focuses on the difference in power losses between the two topologies rather than on the independent power losses of each alternative. This halves the number of calculations to carry out for each target application. As will be discussed later, this differential power losses' scheme gives results in favor of one or the other topological solution, depending on the final sign of the efficiency balance. Secondly, the performance of each of the possible converters is assessed considering only a simple, characteristic, single profile of the power demand. This profile must be defined without knowing the mission profile information, so that it can be applied in systems in which the full demand conditions are not clear. The definition of the characteristic profile is discussed in Sect. 8. However, the results obtained from the first, complete technique need to be compared with those achieved with the proposed simplifications, in order to verify that the simplification in the profile does not jeopardize the validity of the conclusions derived. The following sections apply the proposed methodology to a HESS converter for a target industrial application.
3 Definition of the Target Application

The forthcoming discussion considers the performance, in terms of efficiency, of a HESS such as the one depicted in Fig. 1. In this case of study, the system under consideration has two energy storage devices: an electrochemical battery (EB) module intended for sustained power support and a supercapacitor module (SM) intended for fast dynamics peak power support. Such a HESS will be implemented in a target industrial application, defined as an energy management system for the powertrain of an elevator.
Figure 1 shows a power electronic converter for grid (PECG), which interfaces the mains facility, considered in both applications as the primary energy source, to the full system under consideration. In particular, the converter interconnects the grid to the DC link, implemented in the real system by an assembly of bulk DC capacitors, which effectively decouples the energy flows in the full system. This converter is, in principle, bidirectional, although the specific operating characteristics of the system might not allow a reverse energy flow back to the grid. The power electronics converter for generation/load (PEGL) is the main actuator, in this case the converter or set of converters and loads that need to be interfaced to the grid. For the considered application, this element is the electric machine used in the elevator system. This converter might operate sending energy back to the DC link, in regenerative braking mode.
[Figure 1: block diagram. Blocks: Primary Energy Source (Grid), PECG, DC link, DC Link Cap, PEGL, Load (Elev), and the HESS, formed by PEC S1 with ESS1 (EB) and PEC S2 with ESS2 (SM). Power flows: pGrid, pBat, pSCaps, pHESS, pCdc, pLoad, pD.]

Fig. 1 Modeling approach in terms of the power flow balance in the considered hybrid energy storage system (HESS). Gray arrows indicate possible power flows. Black arrows point positive reference for the power flow.

Table 1 Rated (nominal) parameters of the system under study

Symbol     Parameter                               Value
PELEV      Rated power of the elevator system      4.3 kW
PPKSHV     Rated power demanded from the grid      2.3 kW
PNOM       Power of the storage converter          2 kW
VDC        DC link voltage                         600 V
VBAT       Battery voltage                         300 V
iBat       Battery current (rated value)           8 A
VSCaps     SM voltage (rated value)                60 V
iLSCaps    SM current (rated)                      30 A
fSW        Switching frequency                     20 kHz
Still in Fig. 1, the HESS is formed by a first energy storage system (ESS1), implemented by means of a lithium-ion EB with a dedicated power electronic converter (PEC S1), and by a second storage system (ESS2), based on an SM device and also connected to the DC link by a second subsystem converter (PEC S2). The application considered is the powertrain of an elevator system in commercial buildings, with the corresponding hybrid energy storage system under test. The converters for the storage devices are based on commercial solutions, with a robust, cost-efficient silicon-based IGBT technology. Table 1 shows the main parameters of the elevator and the associated HESS. Besides energy support in case of mains failure, the HESS can also help to optimize the energy management in the system. In this particular case, the energy management strategy implemented under normal operation is peak shaving of the power demanded from the grid. This scheme limits the power delivered from the grid to a given value, PPKSHV, so that a grid tariff can be contracted for a nominal power smaller than the rated value of the elevator powertrain. The remaining power required by the system is supplied by the hybrid storage system, following expressions (7) and (8):
$$p_{Grid}(t) = \begin{cases} p_{Load}(t) & \text{if } p_{Load}(t) \le P_{PKSHV} \\ P_{PKSHV} & \text{otherwise} \end{cases} \tag{7}$$

$$p_{HESS}(t) = p_{Grid}(t) - p_{Load}(t) \tag{8}$$
In addition to (7) and (8), it must be mentioned that this strategy also implements a balance of the HESS energy level to ensure that the hybrid subsystem is kept at a rated charge level, in order to guarantee proper operation in the long term.
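The peak-shaving rule of (7) and (8) can be sketched for a single sample of load power as follows. The function name is an assumption made for illustration; the limit is the PPKSHV value of Table 1, and the sign of pHESS follows (8) as written, so it is negative when the HESS covers part of the load. The charge-balancing action mentioned above is not modeled.

```python
P_PKSHV = 2300.0  # W, grid power limit (Table 1)


def peak_shaving(p_load, p_limit=P_PKSHV):
    """Return (p_grid, p_hess) for one sample of load power, per Eqs. (7)-(8)."""
    p_grid = p_load if p_load <= p_limit else p_limit   # Eq. (7)
    p_hess = p_grid - p_load                            # Eq. (8)
    return p_grid, p_hess


# Load samples in W; 4300 W is the rated elevator power of Table 1.
for p_load in (1000.0, 2300.0, 4300.0):
    p_grid, p_hess = peak_shaving(p_load)
    print(f"p_load={p_load:7.1f} W  p_grid={p_grid:7.1f} W  p_hess={p_hess:7.1f} W")
```

At the rated elevator power, the magnitude of the HESS contribution is 2 kW, which matches the storage converter rating PNOM in Table 1.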
4 Design of the Power Converter

For the interconnection of the storage devices with the DC link, two different topological solutions will be considered. The first topology, based on the standard half-bridge inverter, is defined as direct-parallel connection (DPC), and it is shown in Fig. 2a. The alternative topology is the series-parallel connection (SPC), depicted
Fig. 2 (a) Direct-parallel connection of two energy storage devices. (b) Series-parallel connection. (c) Control scheme of the HESS considered [19]
in Fig. 2b. This figure is a possible implementation of the general scheme depicted in Fig. 1. The current and voltage values of the EB and SM storage units are iBat, VBat, iSCaps, and VSCaps, respectively. The DC link node is implemented by means of a bank of capacitors represented by CDC. VDC is the voltage at the DC link, while iCDC is the current through the DC link capacitor bank itself. The currents from the power electronic converters for the EB and SM are iPECS1 and iPECS2, respectively, while iLOAD and iGRID are the currents from the load side and grid side converters, respectively.
Both topologies have been analyzed in depth in previous works [56–58]. The choice between the two options is not straightforward, given that it depends on the operating point considered. DPC is the standard solution that implements a parallel operation of two bidirectional boost converters to interface DC storage sources to a DC link. It is a well-known solution, and it allows for an independent design of each boost converter, since the operation of both converters is independent. However, it has a main drawback when the operation of the converter requires large voltage gains. These large gains imply a very small duty ratio in the power switches, which also limits the margins for the control actions, yielding slow responses to fast dynamic requirements. This limitation is overcome by the use of the SPC, which extends the control action ranges significantly, but at the cost of higher current stresses at some specific operating conditions of the converter. Therefore, the choice between DPC and SPC makes this discussion a relevant example to illustrate the proposed methodology. Prior to finding the theoretical expression of the losses, the control scheme implemented needs to be defined.
In this kind of system, distinct control strategies can be implemented, either considering or disregarding several features, such as peak shaving, hybrid vs. simple storage, PV generation capability (in case the system has this feature), and even strategies that consider sending energy to the grid [59]. For the sake of simplicity, only a standard hybrid control will be considered in this work, without generation capability and without sending energy back to the grid. This helps to focus the analysis on the storage system alone, but it must be noticed that all the mentioned features can be included in the analysis provided that the available information is detailed enough. The control variables for the EB and for the SM converters will be the currents through these devices, iBat and iSCaps, respectively [60]. The associated current control loops will define the share of the power managed by each device, provided that the bandwidths of both loop controllers are defined adequately. In addition, there is a DC link control loop that aims to keep the DC bus voltage constant; however, it is assumed that the small, slow variations in the DC link voltage due to the usual operation of the converter do not significantly affect the efficiency performance of the system. Each of those current control loops has a number of possible design strategies to tune the controllers effectively. This work will assume a first-order behavior of the current loops, and therefore, the main design parameter for each control loop will be the bandwidth of the controller [58]. But the values of the instant currents through the inductors do not give an intuitive view of the actual power share, and therefore,
the system is analyzed considering the instant power values at the EB and at the SM, pBat and pSCaps, respectively. Thus, for any instant, (6) can be particularized for the target system as (9):

$$p_{Loss}(t) = f\left(p_{Bat}(t), p_{SCaps}(t), V_{DC}, V_{Bat}, V_{SCaps}\right) \tag{9}$$
where VDC, VBat, and VSCaps are the voltages at the DC link, at the EB, and at the SM, respectively. This scheme is depicted in Fig. 2c. The HESS power flows are controlled by current control loops at the storage systems through standard PI regulators. The references for the feedback loops in the HESS control scheme are the instant power levels demanded from the EB and the SM, p*Bat and p*SCaps, respectively. These power levels are transformed into the current levels required from the storage units, i*Bat and i*SCaps. This adaptation is performed by two specific computation blocks, CRBat and CRSCaps, for the EB and the SM, respectively. The regulators RBat and RSCaps in the two current feedback loops represented in Fig. 2c then provide the control actions to the EB and SM converters. These converters are represented in Fig. 2c by the two controlled current sources that inject currents iPECS1 and iPECS2 into the system DC link, implemented by the capacitor CDC. It is assumed that another feedback loop controls the voltage at the DC link by injecting the required power from the grid by means of the regulator RGrid, in order to keep the system operation stable by ensuring that the DC link voltage follows the reference V*DC.
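The internal form of the computation blocks CRBat and CRSCaps is not given in the text. A minimal sketch of one plausible realization follows, assuming that the current reference is the power reference divided by the storage device voltage; the clamp to the rated current is an added safeguard and is also an assumption. The voltage and current ratings are taken from Table 1.

```python
def current_reference(p_ref, v_device, i_max):
    """Map a power reference (W) to a current reference (A), clamped to +/- i_max.

    Assumes p = v * i at the storage terminals; the clamp is an added
    safeguard, not something stated in the chapter.
    """
    if v_device <= 0:
        raise ValueError("device voltage must be positive")
    i_ref = p_ref / v_device
    return max(-i_max, min(i_max, i_ref))


# Rated values from Table 1.
V_BAT, I_BAT_MAX = 300.0, 8.0        # battery voltage (V) and rated current (A)
V_SCAPS, I_SCAPS_MAX = 60.0, 30.0    # SM voltage (V) and rated current (A)

# Hypothetical power references in W.
i_bat_ref = current_reference(1500.0, V_BAT, I_BAT_MAX)
i_scaps_ref = current_reference(-2400.0, V_SCAPS, I_SCAPS_MAX)
print(i_bat_ref, i_scaps_ref)
```

In a real system the SM voltage varies with its state of charge, so the division would use the measured voltage rather than the rated value used here.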
5 Model of the Power Losses

The model of the power losses depends on the specific technologies involved in the power converter implementation. The converter for the HESS will be implemented by means of a commercial IGBT inverter. The total losses in the system need to consider the switching and conduction losses at each of the IGBTs, the magnetic losses at the inductors (core losses and copper losses, depending on the winding strategy, etc.), as well as the DC link capacitor losses. The rest of the losses in the system will be disregarded, given that their values are much smaller than the ones considered above. The turn-on losses, pTurn-on, the turn-off losses, pTurn-off, and the full switching losses, pSW, at each switch have been modeled through existing models in the literature [61–65] and are given by (10)–(12):
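$$p_{Turn\text{-}on}(t) = \frac{1}{2}\, V_{CE} \cdot I_{Con} \cdot t_{on} \cdot f_{SW} + p_{Irr} \tag{10}$$

$$p_{Turn\text{-}off}(t) = \frac{1}{2}\, V_{CE} \cdot I_{Coff} \cdot t_{off} \cdot f_{SW} \tag{11}$$

$$p_{SW}(t) = p_{Turn\text{-}on} + p_{Turn\text{-}off} + p_{Cce} + p_{Driver} \tag{12}$$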
where ICon and ICoff are the values of the instant currents through the collector of the switch at the switching on and off instants, respectively (therefore considering the effect of the HF ripple); fSW is the switching frequency; ton and toff are the turn-on and turn-off switching intervals of the switch; VCE is the collector-to-emitter voltage at the DC link; pIrr is the power loss due to the reverse recovery of the diode; pCce are the losses due to the charge and discharge of the collector-to-emitter capacitance; and pDriver are the losses due to the driver. Generally speaking, pDriver is smaller than any other component of the switching losses. It can also be said that the losses at turn-on are usually larger than the ones at turn-off. All these parameters have been calculated from the manufacturer's datasheet [66, 67]. On the other hand, the conduction losses at each of the switches can be expressed by (13):

$$p_{CD}(t) = \frac{I_{Coff} + I_{Con}}{2} \cdot V_{CESat} \cdot t_{on} \cdot f_{SW} \tag{13}$$
where VCESat is the collector-to-emitter saturation voltage. The losses in the inductors must be divided into core losses and copper losses in the winding. The losses in a magnetic core, pco, can be approximated analytically, for instance, by the well-known empirical Steinmetz equation [68]:

$$p_{co}(t) = c \cdot f_{SW}^{\,x} \cdot B_{HF}^{\,y} \cdot Vol_e \tag{14}$$
where c, x, and y are constants of the magnetic material, BHF is the HF component of the magnetic field in the core, and Vole is the equivalent volume of the core. The copper losses, pcu , are given considering Dowell’s formula [69, 70], as per (15):
$$p_{cu}(t) = I_{Lavg}^{2} \cdot R_{Ldc} + I_{Lrms(HF)}^{2} \cdot R_{Lac} \tag{15}$$
where RLdc and RLac are the DC and AC components of the equivalent resistance of the inductor, taking into account the proximity effect. Finally, the losses at the DC link (pCdc) can be modeled through the ESR of the capacitors, although in VSI converters these losses are usually low [41, 61, 62, 71]. They are given by (16):
Table 2 Parameters of the converter

Symbol     Parameter                        Value
CDC        DC link capacitance              2.2 mF
ESRCDC     ESR of DC link capacitor         < 20 mΩ
Lbat       Battery converter induct.        2.1 mH
RLbat      Series resistance of Lbat        0.2 Ω
LScaps     Supercap converter induct.       350 μH
RLScaps    Series resistance of LSCaps      0.06 Ω
SX         IGBT reference (IXYS)            MIXA10WB1200TED

$$p_{Cdc} = ESR_{CDC} \cdot I_{CDCrms}^{2} \tag{16}$$
where ESRCDC is the equivalent series resistance of the DC link capacitor, and ICDCrms is the rms value of the current through the DC link capacitor. Notice that each of the current and voltage values in the expressions from (8) to (16) needs to be expressed as a function of the instant values of the control parameters defined in (9), i.e., pBat, pSCaps, VDC, VBat, and VSCaps. These expressions can be derived explicitly by considering the instant current and voltage waveforms in the different components and nodes of the topologies, as discussed in [56–58]. The final expression of the losses has been calculated as the sum of these partial components for the system setup defined in Table 2.
The resulting expressions are depicted graphically in Fig. 3. These plots show an instant power losses' map as a function of the instant power values, pBat and pSCaps. They represent a particular state of the system, for given values of VDC, VBat, and VSCaps; however, they indicate the general trend of the system in terms of the steady-state losses. The lighter the zone in the map, the lower the steady-state losses at that set of coordinates. Two different maps have been calculated, one for each of the two topologies considered, DPC (Fig. 3a) and SPC (Fig. 3b). However, the idea underlying this work is to establish a methodology for comparing different power topologies. In this sense, it is more interesting to depict a single instant power losses' map that accounts for the difference in the losses between the two options. This map is represented in Fig. 4. In this case, the z axis represents the losses' difference, defined by (17):

$$\Delta p_{Loss} = p_{Loss(SPC)} - p_{Loss(DPC)} \tag{17}$$
This map can be read as follows: in the clearer regions, DPC losses are greater than SPC losses for the same conditions, and hence, SPC would be preferred from the point of view of losses. This ΔpLoss map has been experimentally validated through the characterization of the laboratory demonstrator shown in Fig. 5, already reported in the literature [57, 58].
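The construction of the differential map in (17) can be illustrated with a minimal Python sketch. The two loss functions below are placeholders assumed purely for illustration (simple quadratic models with made-up coefficients); they are not the loss expressions derived in this chapter, and the names `delta_loss_map`, `toy_loss_dpc`, and `toy_loss_spc` are hypothetical.

```python
def delta_loss_map(loss_spc, loss_dpc, p_bat_axis, p_scaps_axis):
    """Tabulate delta_p_loss = p_loss(SPC) - p_loss(DPC), as in (17).

    Rows follow p_bat_axis, columns follow p_scaps_axis; powers and losses in W.
    """
    return [
        [loss_spc(p_bat, p_scaps) - loss_dpc(p_bat, p_scaps) for p_scaps in p_scaps_axis]
        for p_bat in p_bat_axis
    ]


# Placeholder loss models, assumed for illustration only (not the chapter's expressions).
def toy_loss_dpc(p_bat, p_scaps):
    return 6.0 + 2.0e-6 * p_bat ** 2 + 2.0e-6 * p_scaps ** 2


def toy_loss_spc(p_bat, p_scaps):
    return 4.0 + 2.0e-6 * p_bat ** 2 + 4.0e-6 * p_scaps ** 2


axis_w = [-2000.0, -1000.0, 0.0, 1000.0, 2000.0]
delta_map = delta_loss_map(toy_loss_spc, toy_loss_dpc, axis_w, axis_w)

# Negative entries mark operating points where the SPC model loses less than the DPC model.
for p_bat, row in zip(axis_w, delta_map):
    print(p_bat, [round(v, 2) for v in row])
```

With real loss expressions substituted for the placeholders, the same table would correspond to the surface plotted in Fig. 4 for one fixed set of VDC, VBat, and VSCaps values.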
Fig. 3 Instant power losses’ map for the industrial application, as a function of battery and supercap inductor currents, for DPC (a) and SPC (b) configuration. The clearer the area, the lower the losses
Fig. 4 Instant power losses’ subtraction map for the industrial application, as a function of the battery and supercap inductor currents. The darker the area, the higher the losses in DPC
Fig. 5 Laboratory demonstrator of the system depicted in Fig. 1, with the characteristic values of Tables 1 and 2
6 Characterization of the Mission Profile
The next step is to define the typical operation of the system under consideration as a set of trajectories in the loss maps obtained. In fact, a first glimpse at the losses’ subtraction map in Fig. 4 provides an initial guess of the operating regions in which each of the two topologies under consideration (either DPC or SPC) presents a better efficiency performance. For instance, it is clear that the SPC structure will be preferred if the system operates for long intervals with SM power levels that are relatively small compared to the EB power levels, provided that both power levels have the same sign, as in
these regions in Fig. 4, the value of ΔpLoss is negative. In any case, the goal of the research outlined in this paper is to attain a figure of merit of the performance of each topology that is based on the power profiles obtained from the real operation of the system. In order to do so, the dynamic evolution of the system and the mission profile of the power levels under consideration need to be defined.
The generation of the power references for both storage devices, i.e., for the EB and SM converters, is achieved by defining the energy management strategy (EMS) and by applying it to the measured load power demand, pLoad. There are many strategies to attain an optimal operation of the system, balancing both the functional features and the reliability of the set of devices involved. These strategies need to take into account key functional aspects of the devices (charging and discharging rates, current and voltage stresses in the storage devices and in the converters, etc.).
In this case, to simplify the explanation of the methodology, the grid power peak-shaving strategy outlined in (7) and (8) is achieved by implementing the diagram shown in Fig. 6 [58]. The power demanded by the load, pLoad, enters a limiting block, LIM; the output of the limiter is the reference for the power demanded from the grid, p*Grid. This output (p*Grid) equals the input (pLoad) if the input is smaller than the limiting value, PPKSHV. Otherwise, the output, p*Grid, is clamped at the value PPKSHV. In any case, the difference between the load and the grid power values is the reference to the storage system, p*HESS. The reference to the EB, p*Bat, is obtained by applying a low-pass filter (LPF) to the full energy storage system reference, p*HESS, while the reference for the SM, p*SCaps, is obtained by subtracting the EB power reference from the hybrid storage system reference (a bandwidth limitation in the internal HESS control system might be included in order to prevent controller instability due to the resulting p*SCaps reference waveform).
Fig. 6 Generation of power references for the EB and SM converters from the load power signal by means of a limiter block and a LPF [58]
As a result, the expressions of the storage devices’ instant power values, given in (18) and (19), are a function of the bandwidth in the storage system, BWHESS, and of the time:

$$p^{*}_{Bat}(t) = f(BW_{HESS}, t) \qquad (18)$$

$$p^{*}_{SCaps}(t) = f(BW_{HESS}, t) \qquad (19)$$

Therefore, the study of the system efficiency must consider this bandwidth (and, in general, the dynamic control design parameters) as a key specification.
The mission profiles for the instant power magnitudes involved in the industrial application have been obtained from a software tool specifically developed for elevator powertrain performance analysis [59]. This tool provides daily profiles of all the significant parameters in the simulation (voltage, current, and instant power values) as a function of time, which can be extended through a full year. In addition, this tool is fully configurable, and therefore, the main parameters of the elevator system considered (such as number of elevator cabins in the system, number of floors, payload, relevant kinematic information, case of operation, standby consumption, regeneration capability, etc.) can be tuned easily. Also, several control schemes can be simulated, among which is the peak-shaving strategy discussed here, with the limiting value PPKSHV and the bandwidth BWHESS as configurable parameters.
For the present analysis, a non-regenerative braking elevator system has been configured. The main parameters of this system are listed in Table 3. The information of a full day of operation is obtained from the simulator as a set of time arrays in a *.txt file, which includes all relevant information. This profile, for a BWHESS of 0.1 Hz, has been plotted in Fig. 7a for a full day. Fig. 7b shows a detail of a 10-minute interval of the daily profile considered, at a moment of the day that presents a typical use of the system (within the interval between 2 pm and 3 pm). It can be seen how the operating conditions given by (7) and (8) are fulfilled.

Table 3 Parameters of the elevator system
Symbol     Parameter                                             Value
PLoad      Rated power of the elevator system                    4.3 kW
Ncab       Number of elevator cabins                             1
Nfloor     Number of floors, from max to min                     10
CW         Rated cabin weight                                    600 kg
Pld        Rated payload                                         450 kg
cNOM       Rated cabin speed                                     0.9 m/s
CUSE       Case of use (as per [72–74])                          5
Ts         Sampling time of the simulated signals                1 s
PPKSHV     Limiting value for the grid power peak-shaving EMS    2.3 kW
BWHESS     Bandwidth for the hybrid energy storage system        0.001–10 Hz
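The reference generation described for Fig. 6 (limiter, subtraction, low-pass filter) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the chapter does not specify the filter order or its discretization, so a discrete first-order low-pass filter is assumed here, and the function name `ems_references` and the short load sequence are hypothetical. The values PPKSHV = 2.3 kW and Ts = 1 s are taken from Table 3.

```python
import math


def ems_references(p_load, p_pkshv, bw_hess_hz, ts_s):
    """Return (p_grid, p_hess, p_bat, p_scaps) reference lists from sampled load power."""
    # Discrete first-order low-pass coefficient (assumed filter structure).
    alpha = 1.0 - math.exp(-2.0 * math.pi * bw_hess_hz * ts_s)
    p_grid, p_hess, p_bat, p_scaps = [], [], [], []
    bat = 0.0  # LPF state, assumed to start at rest
    for p in p_load:
        grid = min(p, p_pkshv)       # LIM block: clamp the grid reference at P_PKSHV
        hess = p - grid              # remainder is assigned to the storage system
        bat += alpha * (hess - bat)  # LPF output: slow component for the battery (EB)
        p_grid.append(grid)
        p_hess.append(hess)
        p_bat.append(bat)
        p_scaps.append(hess - bat)   # fast component for the supercapacitors (SM)
    return p_grid, p_hess, p_bat, p_scaps


# Assumed example load (W); it is not taken from the simulated mission profile.
load_w = [0.0, 4300.0, 4300.0, 4300.0, 1000.0, 0.0]
p_grid, p_hess, p_bat, p_scaps = ems_references(
    load_w, p_pkshv=2300.0, bw_hess_hz=0.1, ts_s=1.0
)
for row in zip(load_w, p_grid, p_hess, p_bat, p_scaps):
    print([round(v, 1) for v in row])
```

By construction, the battery and supercapacitor references always add up to the HESS reference, so changing BWHESS only redistributes the same power between the two storage devices.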
Fig. 7 Mission profiles for a full day of the elevator system considered, for BWHESS = 0.1 Hz. The values of the power demanded by the elevator system (PLoad ), of the storage system (PHESS ), of the EB (PBat ), and of the SM (PSCaps ) and demanded from the grid (PGrid ) are represented: (a) for a full day and (b) for a 10-minute slot
7 Computation of the Efficiency
Once the differential power losses’ expression and the mission profiles are known, the differential energy loss, ΔeLoss, can be calculated by (20):

$$\Delta e_{Loss} = \int_{t=0}^{t=T_{day}} \Delta p_{Loss}(t)\, dt \qquad (20)$$
where ΔpLoss(t) is the time evolution of the differential power losses in the system; at each instant, this function is evaluated as a position in the coordinate system (PBat, PSCaps) of the differential power losses’ map previously obtained. Therefore, this computation is, in effect, the integral along the trajectory that the operating point describes over the losses’ map. Figure 8 shows this trajectory for a design case in which the bandwidth for the storage system (i.e., the LPF bandwidth discussed in Fig. 6) is BWHESS = 0.1 Hz. The total amount of energy losses can then be computed.
Fig. 8 Daily trajectory over the instant power differential losses’ map for a bandwidth of BWHESS = 0.1 Hz. (a) Three-dimensional view. (b) Zenithal view
In order to decide which structure to select (DPC vs. SPC), the sign of the resulting computed energy loss must be considered. This sign indicates the better of the two options considered for this dynamic design. Yet, it does not guarantee that the result is valid for every feasible bandwidth. In order to extend this assessment to all the possible dynamic conditions, the calculations have been repeated for different bandwidth values, from BWHESS = 0.002 Hz (all the power in the HESS is managed by the SM, i.e., pBat ≈ 0) to 5 Hz (power mainly managed by the EB, and thus pSCaps ≈ 0). Fig. 9 shows the daily mission profile for the system under the condition of BWHESS = 0.002 Hz, while Fig. 10 shows the differential losses’ computations for this value of BWHESS.
Fig. 9 Mission profiles for a full day of the elevator system considered, for BWHESS = 0.002 Hz. (a) For a full day and (b) for a 10-minute slot
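The integral in (20) can be approximated numerically once the differential loss has been evaluated along the sampled trajectory. The following Python sketch assumes a trapezoidal rule and a hypothetical function `toy_delta_p` standing in for the map lookup; neither the function nor the constant references used in the example come from this chapter.

```python
def differential_energy_loss_kwh(delta_p_loss_w, ts_s):
    """Trapezoidal approximation of (20): samples in W, step in s, result in kWh."""
    joules = sum(
        0.5 * (a + b) * ts_s for a, b in zip(delta_p_loss_w, delta_p_loss_w[1:])
    )
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


def trajectory_delta_loss(delta_p_fn, p_bat_w, p_scaps_w):
    """Evaluate the differential loss at each (p_bat, p_scaps) point of the trajectory."""
    return [delta_p_fn(pb, ps) for pb, ps in zip(p_bat_w, p_scaps_w)]


# Hypothetical differential-loss function standing in for the map lookup.
def toy_delta_p(p_bat, p_scaps):
    return 1.0e-3 * (abs(p_scaps) - abs(p_bat))


# One hour of constant, assumed references sampled at Ts = 1 s.
n_samples = 3601
p_bat = [500.0] * n_samples
p_scaps = [1500.0] * n_samples
delta_p = trajectory_delta_loss(toy_delta_p, p_bat, p_scaps)
delta_e = differential_energy_loss_kwh(delta_p, ts_s=1.0)
print(f"{delta_e:.6f} kWh")
```

In an actual study, the lookup function would interpolate the ΔpLoss map, and the references would come from the simulated daily profile for each bandwidth under test.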
Fig. 10 Daily trajectory over the instant power differential losses’ map for a bandwidth of BWHESS = 0.002 Hz. (a) Three-dimensional view. (b) Zenithal view
Fig. 11 Mission profiles for a full day of the elevator system considered, for BWHESS = 5 Hz. (a) For a full day and (b) for a 10-minute slot
Analogously, Figs. 11 and 12 represent the mission profiles and the differential losses’ map for BWHESS = 5 Hz. It is clear that these extreme values for the bandwidth in the LPF are not suitable for the operation of the real system, as they imply working conditions in which one of the individual storage units is not being utilized; however, they are relevant precisely because they give an idea of the performance of the system under these extreme usage conditions.
Fig. 12 Daily trajectory over the instant power differential losses’ map for a bandwidth of BWHESS = 5 Hz. (a) Three-dimensional view. (b) Zenithal view
Figure 13 shows the trend of the differential energy loss, ΔeLoss, as a function of BWHESS. As can be seen, depending on the dynamic share of power values between both storage units, the value of ΔeLoss changes significantly. Moreover, at a certain dynamic condition, the sign of ΔeLoss changes, which implies that the preferred choice for the structure is different on each side of that condition. For instance, for small BWHESS values, that is to say, when most of the power is managed by the SM, the value of ΔeLoss is positive, which implies that the DPC structure provides lower losses than the SPC arrangement. On the other hand, for fast dynamic operation (most of the power is managed by the EB unit), SPC provides a more efficient operation.
Fig. 13 Evolution of the differential energy loss, ΔeLoss, in kWh in a full day, for the same mission profile as a function of BWHESS
Besides simply choosing the structure that performs better in the application, accurate numerical calculations of the energy loss in a given period of time can also be carried out. Thus, an optimal design, solely from the efficiency point of view, can be calculated by including constraints in the design, such as physically feasible and operative design points, and by implementing a given optimization algorithm. Besides, this information also allows for computing the economic and reliability performance of the given choice. However, the optimization procedure and the reliability and cost analyses are beyond the scope of this work.
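The sign-based decision discussed for the bandwidth sweep can be written down as a short Python sketch. The sweep values below are invented for illustration and are not the results plotted in Fig. 13; the helper names `preferred_topology` and `classify_sweep` are hypothetical. The sign convention follows (17): a positive ΔeLoss means SPC loses more than DPC.

```python
def preferred_topology(delta_e_loss_kwh):
    """Map the sign of delta_e_loss = e_loss(SPC) - e_loss(DPC) to a topology choice."""
    if delta_e_loss_kwh > 0:
        return "DPC"  # SPC loses more, so DPC is preferred
    if delta_e_loss_kwh < 0:
        return "SPC"  # DPC loses more, so SPC is preferred
    return "either"


def classify_sweep(sweep):
    """Return per-bandwidth choices and the bandwidth intervals where the choice flips."""
    choices = [(bw, preferred_topology(de)) for bw, de in sorted(sweep)]
    crossovers = [
        (bw0, bw1)
        for (bw0, c0), (bw1, c1) in zip(choices, choices[1:])
        if c0 != c1
    ]
    return choices, crossovers


# (BW_HESS in Hz, delta_e_loss in kWh): illustrative values only, not chapter results.
sweep = [(0.002, 0.04), (0.1, 0.01), (1.0, -0.02), (5.0, -0.05)]
choices, crossovers = classify_sweep(sweep)
print(choices)
print(crossovers)
```

The crossover interval indicates where a finer bandwidth sweep would be needed to locate the condition at which both structures present the same daily losses.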
8 Discussion on the Definition of a Simple Estimator
The calculation of the differential energy loss for each dynamic condition implies a number of computations and a data storage size that impose some constraints on real-time applications of the proposed methodology. However, this depends largely on the resolution and size of the data arrays needed in the calculations of the algorithm. In addition, these constraints might be removed by using high-performance processors selected in accordance with these requirements.
A more critical limitation of this analysis is that in some applications, the mission profiles of the demand side might not be available. In these cases, a simplification of the algorithm can be carried out. The idea is to repeat the same procedure but with a simplified input signal. Fig. 14 shows such a simplified power profile for the system under consideration, for the three relevant values of BWHESS represented in the previous sections of this work (0.002 Hz, 0.1 Hz, and 5 Hz). This profile corresponds to a given travel of the elevator. The choice of this travel must be made carefully, so that it is representative of the real use of the system. In order to select this profile, several sources of information can be used, from technical literature to manufacturer datasheets, or even regulations and directives, depending on the specific application. In this case, the characteristic profile has been arranged considering the guidelines for indoor elevators in the regulations [72–74].
The characteristic profile is configured considering two sequential trips. The first one (from 0 s to around 30 s) is a trip from the ground floor to the midfloor of the building, with the cabin travelling empty; after reaching the midfloor, the cabin travels down to the ground floor again with a given payload (from 30 s to 60 s). The parameters of this trip are listed in Table 4.
The computation of the differential energy loss for this simplified profile follows the same definition as (20), taking into account that the profile now corresponds to the characteristic trip defined in the previous paragraphs, with Tchar being the length of this characteristic trip:

$$\delta e_{Loss} = \int_{t=0}^{t=T_{char}} \Delta p_{Loss}(t)\, dt \qquad (21)$$
Fig. 14 Characteristic profile of the elevator system considered, showing PLoad, PHESS, PBat, PSCaps, and PGrid (in W) versus time (in s). (a) BWHESS = 0.002 Hz, (b) BWHESS = 0.1 Hz, and (c) BWHESS = 5 Hz

Table 4 Parameters of the simplified profile
Symbol     Parameter                             Value
Ninit      Initial floor                         0
Nend       Final (ground) floor                  4
Pldinit    Payload of initial descending trip    0 kg
Pldend     Payload of final ascending trip       0 kg

In order to distinguish the differential energy loss for a full-day profile from the characteristic one, the latter is denoted as δeLoss. As this last term is defined only over a short interval of about one minute, its absolute value is several orders of magnitude smaller than the energy loss defined for a full-day profile. In order to compare both terms, ΔeLoss and δeLoss, an estimator has been defined by (22):

$$\Delta \hat{e}_{Loss} = \delta e_{Loss} \cdot \frac{T_{day} \cdot (1 - t_{stdby})}{T_{char}} \qquad (22)$$

where tstdby is the fraction of time, as a per-unit value, in which the elevator is in standby operation mode, obtained from the definition of the characteristic profile [59, 74]. The estimated energy losses are depicted in Fig. 15 for the dynamic conditions considered throughout this research, and they are compared against the computation with a full-day profile.
From this figure, it can be seen that the estimator ΔêLoss closely follows the full-profile value ΔeLoss, which implies that the information required to decide between both structures is basically the same whether it is obtained from the full computation (complete mission profile) or from the simplified version of the algorithm (based on the single characteristic profile). Some error can be seen between both plots; nevertheless, the relevant information for real-time applications is present in the simplified version of the algorithm, as the trend in both curves is similar. This is achieved with a relatively small number of calculations; therefore, it enables the real-time selection of the best control scheme for a given set of operation characteristics. More importantly, the information given by the simplified algorithm can be used as a first approach for the design when the mission profile of the target application is not known, and it might be refined if a full characterization of the system is eventually carried out.
Fig. 15 Evolution of the differential energy loss, ΔeLoss, and of the estimator ΔêLoss in kWh, as a function of BWHESS
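The scaling in (22) is simple enough to check with a short Python sketch. The function name `estimate_daily_delta_loss` is hypothetical, and the numerical inputs in the example (energy loss of the characteristic trip and standby fraction) are assumed values chosen only to show the arithmetic; they are not taken from the case study.

```python
def estimate_daily_delta_loss(delta_e_char, t_char_s, t_stdby_pu, t_day_s=86400.0):
    """Scale the characteristic-trip loss to a full day, following (22).

    delta_e_char: differential energy loss of the characteristic trip (any energy unit)
    t_char_s:     duration of the characteristic trip, in s
    t_stdby_pu:   per-unit fraction of the day spent in standby, between 0 and 1
    """
    if t_char_s <= 0:
        raise ValueError("t_char_s must be positive")
    if not 0.0 <= t_stdby_pu <= 1.0:
        raise ValueError("t_stdby_pu must be a per-unit value in [0, 1]")
    return delta_e_char * (t_day_s / t_char_s) * (1.0 - t_stdby_pu)


# Assumed inputs: 2e-5 kWh per 60 s characteristic trip, 75 % of the day in standby.
estimate_kwh = estimate_daily_delta_loss(2.0e-5, t_char_s=60.0, t_stdby_pu=0.75)
print(f"{estimate_kwh:.4f} kWh per day")
```

The estimator keeps the sign of δeLoss, so the topology decision derived from it is the same as the one derived from the characteristic trip alone; the scaling only matters when the magnitude of the daily loss is of interest.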
9 Conclusions and Future Developments
A methodology for optimal dynamic efficiency performance design of power converters has been presented. This methodology has been illustrated by applying it to a case study consisting of the operation of two different arrangements of power electronic converters for hybrid energy storage systems in a target application of an elevator system. Once the system is defined (nominal values, case of operation, etc.) and the main structure options and control scheme are identified, the power losses in the system are calculated. After defining the mission profiles that are the basic input to assess the performance of the system, the full losses for every operating condition considered are computed.
The main contributions of this work are a simplification of the algorithm by considering the differential energy loss (rather than the energy losses in each of the structure choices) and the use of a single power profile, which enables the methodology in cases where the full mission profile is not known or where fast computation is required. In any case, the outputs of the methodology are the optimal power topology (in this case, either DPC or SPC), the control design parameter, BWHESS, and an estimate of the total energy loss in a given amount of service time.
Future developments include a refinement of the estimator using specific identification techniques (e.g., an online adjustment of the characteristic profile from statistical information, starting from the initial solution presented here), the extension of the algorithm to more than two initial topological options, and the extension to other power converter applications. Also, the methodology can be extended to take into account an optimization procedure and a reliability and cost analysis that would provide an optimal system design.
It must also be said that this analysis considers the losses from the point of view of the power converter only. It does not consider the implications of the alternative topologies, or even of the bandwidths, on the losses of the EB and SM. In particular, the dependence of the battery losses on the controller bandwidths should be taken into account in order to account for all the energy losses in the system. Therefore, another future development of this work is to include the models of these losses in the analysis. However, the methodology and the procedure remain the same as those outlined here.
Another interesting development is to explore the effect of a variable bandwidth scheme in the application of the algorithm. The idea is to establish a more complex control scheme than the one exemplified in Fig. 6 by including different parameters for the bandwidth definition of both the EB and the SM modules. This would open a further optimization of the trajectories themselves, potentially increasing the savings.
Acknowledgments This work has been partially supported by the Spanish Government, Innovation Development, and Research Office (MEC), under research grant ENE2016-77919, Project “Conciliator,” and research grant PID2019-111051RB-100, Project “B2B-Energy,” and by the European Union through ERDF Structural Funds (FEDER). This work has been partially supported by the government of Principality of Asturias, under IDEPA grants 2017ThyssenSV-PA-17-RIS3-3 and FC-GRUPIN-IDI/2018/000241. This work has been partially funded from the European Union’s H2020 Research and Innovation program under grant agreement 864459, Project “Cost Effective Technological Developments for Accelerating Energy Transition (TALENT).”
References
1. Mapping and analyses of the current and future (2020–2030) heating/cooling fuel deployment (fossil/renewables), EUROPEAN COMMISSION DIRECTORATE-GENERAL FOR ENERGY https://ec.europa.eu/energy/sites/ener/files/documents/Report%20WP2.pdf
2. T. Nowak, Heat Pumps, Integrating technologies to decarbonize heating and cooling (European Copper Institute) https://www.ehpa.org/fileadmin/user_upload/White_Paper_Heat_pumps.pdf
3. Mapping and analyses of the current and future (2020–2030) heating/cooling fuel deployment (fossil/renewables), EUROPEAN COMMISSION DIRECTORATE-GENERAL FOR ENERGY https://ec.europa.eu/energy/sites/ener/files/documents/Report%20WP1.pdf
4. European Directive EU2019/944, Clean Energy for all Europeans. https://ec.europa.eu/energy/topics/energy-strategy/clean-energy-all-europeans_en
5. J. Fisher, Thyssenkrupp is the first to enable conversion of existing elevators into net-zero energy units through modernization (Press Release, https://www.thyssenkrupp-elevator.com, 2017)
6. B. Urban, K. Roth, Demonstrating a net-zero solar energy elevator in a Boston office building. Fraunhofer USA Center for Sustainable Energy Systems, USA, Final Report 1, May 2016
7. E. Bilbao, P. Barrade, I. Etxeberria-Otadui, A. Rufer, S. Luri, I. Gil, Optimal energy management strategy of an improved elevator with energy storage capacity based on dynamic programming. IEEE Trans. Ind. Appl. 50(2), 1233–1244 (2014)
8. M. Manic, D. Wijayasekara, K. Amarasinghe, J.J. Rodriguez-Andina, Building energy management systems: The age of intelligent and adaptive buildings. IEEE Ind. Electron. Mag. 10(1), 25–39 (2016)
9. T. Tukia, S. Uimonen, M. Lehtonen, Evaluating the Applicability of Elevators in Frequency Containment Reserves, in 2018 15th International Conference on the European Energy Market (EEM), (2018), pp. 1–5
10. C.F. Nicolas, I. Ayestaran, I. Martinez, P. Franco, Model-Based Development of an FPGA Encoder Simulator for Real-Time Testing of Elevator Controllers, in 2016 IEEE 19th international symposium on real-time distributed computing (ISORC), (2016), pp. 53–60
11. C. Nicolas, I. Ayestaran, T. Poggi, G. Sagardui, J. Martin, A CAN Restbus HiL Elevator Simulator Based on Code Reuse and Device Para-Virtualization, in 2017 IEEE 20th International Symposium on Real-Time Distributed Computing (ISORC), (2017), pp. 117–124
12. H.-M. Ryu, S.-J. Kim, S.-K. Sul, T.-S. Kwon, K.-S. Kim, Y.-S. Shim, K.-R. Seok, Dynamic Load Simulator for High-Speed Elevator System, in Proceedings of the Power Conversion Conference-Osaka 2002, Cat. No.02TH8579, vol. 2, (2002), pp. 885–889
13. N. Chaosangket, P. Sasithong, S.K. Wijayasekara, W. Asdornwised, L. Wuttisittikulkij, P. Vanichchanunt, M. Saadi, A Simulation Tool for Vertical Transportation Systems using Python, in 2018 5th International Conference on Business and Industrial Research (ICBIR), (May 2018), pp. 270–275
14. H.-M. Ryu, S.-K. Sul, Position Control for Direct Landing of Elevator Using Time-based Position Pattern Generation, in Conference Record of the 2002 IEEE Industry Applications Conference. 37th IAS Annual Meeting, vol. 1, (Oct 2002), pp. 644–649
15. FineLIFT TM, Website, 2019. [Online]. Available: https://www.4msa.com
16. P. Cortes, J. Muñuzuri, L. Onieva, Design and analysis of a tool for planning and simulating dynamic vertical transport. Simulation 82(4), 255–274 (2006)
17. L. Al-Sharif, R. Peters, R. Smith, Elevator energy simulation model. IAEE Elev. Technol. 14 (2004)
18. R.D. Peters, The Application of Simulation to Traffic Design and Dispatcher Testing, in The 3rd Lift Symposium on Lift and Escalator Technologies, Northampton, UK, vol. 1, (Oct 2013)
19. G. Barney, L. Al-Sharif, Elevator Traffic Handbook: Theory and Practice (Routledge, 2016)
20. N. Jabbour, C. Mademlis, Improved control strategy of a supercapacitor-based energy recovery system for elevator applications. IEEE Trans. Power Electron. 31(12), 8398–8408 (Dec 2016)
21. A. Rufer, P. Barrade, A supercapacitor-based energy-storage system for elevators with soft commutated interface. IEEE Trans. Ind. Appl. 38(5), 1151–1159 (2002)
22. N. Jabbour, C. Mademlis, Supercapacitor-based energy recovery system with improved power control and energy Management for Elevator Applications. IEEE Trans. Power Electron. 32(12), 9389–9399 (2017)
23. H.H. Sathler, L.H. Sathler, F.L.F. Marcelino, T.R. de Oliveira, S.I. Seleme, P.F.D. Garcia, A comparative efficiency study on bidirectional grid interface converters applied to low power DC nanogrids, in 2017 Brazilian Power Electronics Conference (COBEP), JUIZ DE FORA, Brazil, (2017), pp. 1–6. https://doi.org/10.1109/COBEP.2017.8257219
24. R. Reshma Gopi, S. Sreejith, Converter topologies in photovoltaic applications – A review. Renew. Sust. Energ. Rev. 94, 1–14 (2018) ISSN 1364-0321
25. T. Kang, T. Kang, B. Chae, K. Lee, Y. Suh, Comparison of voltage source and current source based Converter in 5MW PMSG wind turbine systems, in 2015 9th International Conference on Power Electronics and ECCE Asia (ICPE-ECCE Asia), Seoul, (2015), pp. 894–901. https://doi.org/10.1109/ICPE.2015.7167888
26. B. Singh, S. Singh, A. Chandra, K. Al-Haddad, Comprehensive study of single-phase AC-DC power factor corrected converters with high-frequency isolation. IEEE Trans. Indus. Inform. 7(4), 540–556 (2011). https://doi.org/10.1109/TII.2011.2166798
27. J. Yang, J. Zhang, X. Wu, Z. Qian, M. Xu, Performance comparison between buck and boost CRM PFC converter, in 2010 IEEE 12th Workshop on Control and Modeling for Power Electronics (COMPEL), Boulder, CO, (2010), pp. 1–5. https://doi.org/10.1109/COMPEL.2010.5562437
28. X. Fang et al., Efficiency-oriented optimal design of the LLC resonant converter based on peak gain placement. IEEE Trans. Power Electr. 28(5), 2285–2296 (2013). https://doi.org/10.1109/TPEL.2012.2211895
29. S. Busquets-Monge, J. Bordonau, J.A. Beristain, Comparison of losses and thermal performance of a three-level three-phase neutral-point-clamped dc-ac converter under a conventional NTV and the NTV2 modulation strategies, in IECON 2006 – 32nd Annual Conference on IEEE Industrial Electronics, Paris, (2006), pp. 4819–4824. https://doi.org/10.1109/IECON.2006.347400
30. R. Lai et al., A systematic topology evaluation methodology for high-density three-phase PWM AC-AC converters. IEEE Trans. Power Electr. 23(6), 2665–2680 (2008). https://doi.org/10.1109/TPEL.2008.2005381
31. T. Friedli, J.W. Kolar, J. Rodriguez, P.W. Wheeler, Comparative evaluation of three-phase AC–AC matrix converter and voltage DC-Link Back-to-Back converter systems. IEEE Trans. Indus. Electr. 59(12), 4487–4510 (2012). https://doi.org/10.1109/TIE.2011.2179278
32. California Energy Commission (CEC) Emerging Renewables Program (ERP) Guidebook, 4th Edition, posted on the web in January 2005, http://www.energy.ca.gov/2005publications/CEC300-2005-001/CEC-300-2005-001-ED4F.PDF
33. Standard IEC 61683:1999, Photovoltaic Systems-Power Conditioners Procedure for Measuring Efficiency (International Electrotechnical Commission, 1999)
34. J. Newmiller, W. Erdman, J.S. Stein, S. Gonzalez, Sandia Inverter Performance Test Protocol Efficiency Weighting Alternatives, in 2014 IEEE 40th Photovoltaic Specialist Conference (PVSC), Denver, CO, (2014), pp. 0897–0900. https://doi.org/10.1109/PVSC.2014.6925058
35. B. Brooks, C.M. Whitaker, Guideline for the Use of the Performance Test Protocol for Evaluating Inverters Used in Grid–Connected Photovoltaic Systems (California Energy Commission, 2005)
36. K. Ishaque, Z. Salam, Dynamic efficiency of direct control based maximum power point trackers, in 2014 5th international conference on intelligent systems, Modelling and Simulation, Langkawi, (2014), pp. 429–434. https://doi.org/10.1109/ISMS.2014.79
37. K.N.D. Malamaki, C.S. Demoulias, K.O. Oureilidis, Analytical Calculation of the PV Converter Efficiency Curve at Non-unity Power Factors, in 2017 52nd International Universities Power Engineering Conference (UPEC), HERAKLION, Crete, Greece, (2017), pp. 1–6. https://doi.org/10.1109/UPEC.2017.8231924
38. D. Yu, Z. Xiaohu, B. Sanzhong, S. Lukic, A. Huang, Review of non-isolated bi-directional DC–DC converters for plug-in hybrid electric vehicle charge station application at municipal parking decks. Proc. IEEE APEC, 1145–1151 (2010)
39. R.M. Schupbach, J.C. Balda, Comparing DC–DC converters for power management in hybrid electric vehicles. Proc. IEEE IEMDC, 1369–1374 (2003)
40. J.O. Estima, A.J. Marques Cardoso, Efficiency analysis of drive train topologies applied to electric/hybrid vehicles. IEEE Trans. Vehicular Technol. 61(3), 1021–1031 (2012). https://doi.org/10.1109/TVT.2012.2186993
41. E.P. Wiechmann, P. Aqueveque, R. Burgos, J. Rodriguez, On the efficiency of voltage source and current source inverters for high-power drives. IEEE Trans. Indus. Electr. 55(4), 1771–1782 (April 2008). https://doi.org/10.1109/TIE.2008.918625
42. M.V. Nitesh, S. Arjun, S.A. Ahammed, P. Ramesh, N.C. Lenin, Experimental Comparison of a 70 Watt Switched Reluctance Machine with Different Types of Converter Topologies. Energy Procedia 117, 306–313 (2017) ISSN 1876-6102
43. N. Nguyen-Quang, D.A. Stone, C.M. Bingham, M.P. Foster, Comparison of single-phase matrix converter and H-bridge converter for radio frequency induction heating, in 2007 European Conference on Power Electronics and Applications, Aalborg, (2007), pp. 1–9. https://doi.org/10.1109/EPE.2007.4417415
44. L. Krcmar, O. Mach, J. Cernohorsky, Design and efficiency mapping of an electric drive for mobile robotic container platform for use in industrial halls, in 2018 International Symposium on Power Electronics, Electrical Drives, Automation and Motion (SPEEDAM), Amalfi, (2018), pp. 963–967. https://doi.org/10.1109/SPEEDAM.2018.8445396
45. S. Overington, S. Rajakaruna, High-efficiency control of internal combustion engines in blended charge depletion/charge sustenance strategies for plug-in hybrid electric vehicles. IEEE Trans. Vehicular Technol. 64(1), 48–61 (2015). https://doi.org/10.1109/TVT.2014.2321454
46. D.S. Amogh, S.B. Rudraswamy, Engine Torque Output Control Using Torque Maps - For Automotive Application, in 2018 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT), Mysuru, India, (2018), pp. 815–818. https://doi.org/10.1109/ICEECCOT43722.2018.9001354
47. C. Wang, M. Liang, Y. Chai, Finite-time identification algorithm for volumetric efficiency map in SI gasoline engines. IEEE Trans. Indus. Electr. 67(12), 10702–10712 (2020). https://doi.org/10.1109/TIE.2019.2962481
48. M.H. Mohammadi, D.A. Lowther, A computational study of efficiency map calculation for synchronous AC motor drives including cross coupling and saturation effects, in 2016 IEEE Conference on Electromagnetic Field Computation (CEFC), Miami, FL, (2016), pp. 1–1. https://doi.org/10.1109/CEFC.2016.7816154
49. A. Rassolkin, H. Heidari, A. Kallaste, T. Vaimann, J.P. Acedo, E. Romero-Cadaval, Efficiency Map Comparison of Induction and Synchronous Reluctance Motors, in 2019 26th International Workshop on Electric Drives: Improvement in Efficiency of Electric Drives (IWED), Moscow, Russia, (2019), pp. 1–4. https://doi.org/10.1109/IWED.2019.8664334
50. J.M. Cano, Á. Navarro-Rodríguez, A. Suárez, P. García, Variable switching frequency control of distributed resources for improved system efficiency. IEEE Trans. Indus. Appl. 54(5), 4612–4620 (2018). https://doi.org/10.1109/TIA.2018.2836365
51. H. Liu, J. Jiang, H. Wu, W. Wang, Reliability evaluation of PV converter based on modified hybrid power control method. Microelectr. Reliab., 100–101 (2019) ISSN 0026-2714
52. J. Torres, P. Moreno-Torres, G. Navarro, M. Blanco, M. Lafoz, Fast energy storage systems comparison in terms of energy efficiency for a specific application. IEEE Access 6, 40656–40672 (2018). https://doi.org/10.1109/ACCESS.2018.2854915
53. J. Torres, G. Navarro, M. Blanco, M. González-de-Soto, L. García-Tabares, M. Lafoz, Efficiency Map to Evaluate the Performance of Kinetic Energy Storage Systems Used with Renewable Generation, in 2018 20th European Conference on Power Electronics and Applications (EPE’18 ECCE Europe), Riga, (2018), pp. 1–9
54. S. Dusmez, A. Hasanzadeh, A. Khaligh, Comparative analysis of bidirectional three-level DC–DC converter for automotive applications. IEEE Trans. Indus. Electr. 62(5), 3305–3315 (2015). https://doi.org/10.1109/TIE.2014.2336605
55. V. de Nazareth Ferreira, A. Fagner Cupertino, H. Augusto Pereira, A. Vagner Rocha, S. Isaac Seleme, B. de Jesus Cardoso Filho, Design and selection of high reliability converters for Mission critical industrial applications: A rolling mill case study. IEEE Trans. Indus. Appl. 54(5), 4938–4947 (2018). https://doi.org/10.1109/TIA.2018.2829104
56. R. Georgious, J. García, Á. Navarro-Rodríguez, P. García, A study on the control design of nonisolated converter configurations for hybrid energy storage systems. IEEE Trans. Indus. Appl. 54(5), 4660–4671 (2018)
57. J. Garcia, R. Georgious, P. Garcia, A. Navarro-Rodriguez, Non-isolated high-gain three-port converter for hybrid storage systems. ECCE (2016)
58. R. Georgious, J. Garcia, P. Garcia, A. Navarro-Rodriguez, A Comparison of Non-Isolated High-Gain Three-Port Converters for Hybrid Energy Storage Systems. MDPI Energies 11(3), 658 (2018). https://doi.org/10.3390/en11030658
59. J. Garcia, S. Saeed, R. Georgious, I. Pelaez, I. El Sayed, B. Mohamed, J. Mendiolagoitia, Development and Testing of a Software Simulation Tool for Elevators, in 2019 IEEE Vehicle Power and Propulsion Conference (VPPC), Hanoi, (2019)
60. J. García, C. González-Morán, P.G. Fernández, P. Arboleya, Efficiency comparison in power converters under transient operation conditions: Application to hybrid energy storage systems, in 2018 IEEE energy conversion congress and exposition (ECCE), Portland, OR, (2018), pp. 5344–5350
284
J. García et al.
61. M. Kamil, Switch Mode Power Supply (SMPS) Topologies (Part I), AN1114 (Microchip Technology Inc., 2007) 62. A. Bersani, Switch Mode Power Supply (SMPS) Topologies (Part II), AN1207 (Microchip Technology Inc., 2009) 63. A. Maniktala, Switching Power Supplies A to Z (Copyright © 2006, Elsevier Inc) All rights reserved 64. B. Hauke, Basic Calculation of a Boost Converter’s Power Stage (TI Application Report SLVA372C–November 2009–Revised, 2014) 65. B.T. Lynch, Under the hood of a DC/DC boost converter, in Presented at the TI Power Supply Design Seminar, Dallas, TX, 20082009, Paper SEM1800, 66. E. Datasheet, Preliminary Technical Information (MT Series) https://www.e-guasch.com/ 67. SCT2280KE N-channel SiC power MOSFET datasheet, ROHM Semiconductor. https:// www.rohm.com/ 68. C.P. Steinmetz, On the law of hysteresis. reprint, Proc. IEEE 72(2), 196–221,2 (1984) 69. S.A. Mulder, Loss formulas for power ferrites and their use in transformer design (Philips Components, Eindhoven, 1994), pp. 1–16 70. P.L. Dowell, Effects of eddy currents in transformer windings. Electrical Engineers, Proceedings of the Institution of 113(8), 1387–1394 (1966) 71. H. Wang, F. Blaabjerg, Reliability of capacitors for DC-link applications in power electronic converters—An overview. IEEE Trans. Indust. Appl. 50(5) 72. ISO 25745-1:2012. Energy performance of lifts, escalators and moving walks—Part 1: Energy measurement and verification 73. ISO 25745-2:2015. Energy performance of lifts, escalators and moving walks—Part 2: Energy calculation and classification for lifts (elevators) 74. VDI 4707, VDI Guidelines, Lefst-Energy Efficiency. Association of German Engineers
A Holistic Approach to the Energy-Efficient Smoothing of Traffic via Autonomous Vehicles

Amaury Hayat, Xiaoqian Gong, Jonathan Lee, Sydney Truong, Sean McQuade, Nicolas Kardous, Alexander Keimer, Yiling You, Saleh Albeaik, Eugene Vinistky, Paige Arnold, Maria Laura Delle Monache, Alexandre Bayen, Benjamin Seibold, Jonathan Sprinkle, Dan Work, and Benedetto Piccoli

A. Hayat: Ecole des Ponts Paristech, Marne-la-Vallée, France
X. Gong: Arizona State University, Tempe, AZ, USA
J. Lee · N. Kardous · A. Keimer · Y. You · S. Albeaik · E. Vinistky · A. Bayen: University of California at Berkeley, Berkeley, CA, USA
S. Truong · S. McQuade · P. Arnold · B. Piccoli: Rutgers University Camden, Camden, NJ, USA
M. L. Delle Monache: INRIA Grenoble - Rhône Alpes, Montbonnot-Saint-Martin, France
B. Seibold: Temple University, Philadelphia, PA, USA
J. Sprinkle: University of Arizona, Tucson, AZ, USA
D. Work: Vanderbilt University, Nashville, TN, USA
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_10

1 Introduction

1.1 Stop-and-Go Waves
It is well known that a road carrying many vehicles at the same speed and with the same spacing can develop traffic jams [46, 50]. Stop-and-go waves can seem to appear without any particular reason (no lane reduction, no roadwork ahead, etc.). There is, however, a simple explanation for this phenomenon: steady states in traffic flows are sometimes unstable [7]. Stop-and-go waves have a large impact on both the economy and the sustainability of traffic. Not only can they reduce the outflow compared to a steady-state situation where all vehicles are equally spaced and have the same speed, but the fuel consumption and CO2 emissions associated with constant braking and accelerating are also much higher than those of the corresponding steady state [39].

This chapter introduces an initiative aiming at reducing stop-and-go waves by incorporating a small number of autonomous vehicles (AVs) into the traffic and using them as system controllers. After presenting an overview of the project in Sect. 1, we focus in Sects. 2–3 on a control design approach derived from a theory-based model, and in Sect. 4, we present some results obtained when using the resulting controller in simulations on a highway model. In particular, we show that in highly congested traffic, this controller allows for a strong reduction of fuel consumption for the same throughput, even with a very low proportion of AVs in the traffic.
1.2 An Initiative to Increase Energy Efficiency
Finding ways to increase energy efficiency, i.e., the miles traveled per gallon of fuel (denoted “fuel economy” in the United States) or the distance traveled per joule of electrical energy, through traffic control is not a new idea. Numerical simulation results and stability analyses for regulating traffic flow via autonomous vehicles are available (see [10, 20, 47, 51]). Other approaches used variable speed limits (see [2, 21, 52]) or jam absorption [23, 34]. Our project is based on evidence from experimental results on a ring road (see [44, 45, 53]). However, the energy efficiency of today’s vehicular mobility relies on the combination of two elements: control via static assets (traffic lights, metering, variable speed limits, etc.) and onboard vehicle automation (adaptive cruise control (ACC), ecodriving, etc.). These two families of controls were not co-designed and are not engineered to work in coordination. Recent studies have shown limitations of these controls and sometimes even negative impacts of ACC [30].

Our general approach focuses on the technology development, implementation and prototyping, and validation of mobile traffic control (MTC), in other words, using AVs as mobile controllers in the traffic flow. This can be viewed as an extension of classical traffic control (in which static infrastructure actuates traffic flow). In the MTC paradigm, automated vehicles impact the entire surrounding traffic via their behavior, offering enhanced possibilities to optimize the energy footprint of traffic, if designed correctly. We want to demonstrate for the first time that considerably reduced fuel consumption of all vehicles in traffic can be achieved via distributed control of a small proportion of controlled autonomous vehicles (CAVs). The level of autonomy expected for the CAVs is Level 2 (according to the SAE taxonomy J3016_201806).
Compared to baseline vehicular technologies, our work offers a significant design departure: control algorithms for the CAVs consider the impact one vehicle can have on overall traffic, improving the resulting overall fuel consumption. We focus on using a few vehicles as traffic controllers (via CAV technology) to improve the energy efficiency of traffic flow. The target is to achieve energy gains exceeding 10% on average for all vehicles on the road when the traffic is congested, through automation of less than 5% of the vehicles in the flow (called the penetration rate in the following). This ambitious goal is motivated by prior field experiments that demonstrated fuel consumption reductions of up to 40% with a single CAV on a single-lane track under ideal conditions [44]. Of course, for real-life multilane traffic, many new difficulties have to be taken into account, such as lane-changing, drivers’ responses to actuation, interaction between CAVs, topography of the road, etc. To achieve our aim, several challenges need to be addressed on the technical level:
• Establishing the minimum sensing and connectivity required for eliminating traffic waves with mobile actuation
• Investigating control requirements to dampen stop-and-go traffic
• Designing simulation models that reliably represent multilane stop-and-go traffic and the relevant behaviors
• Deducing a precise and realistic energy model for vehicle consumption, taking into account the different types of vehicles
• Designing sensing systems on a highway as well as estimation algorithms to detect the traffic state using on-board vehicle sensing and/or infrastructure sensor networks
• Finding efficient control algorithms either from accurate mathematical models and tools or from machine learning methods
In the following, we present the main methods and achievements related to each of these issues, before focusing on the design of control algorithms from microscopic mathematical models.

Testbed Development
Research and development for a testbed began with evaluating modern closed-circuit television (CCTV) camera systems for possible deployment on roadside poles for traffic observation. High-resolution (4K) cameras capable of individual pan-tilt-zoom control were selected.
Roadside poles at a height of 110 ft were selected to minimize optical occlusion between vehicles. Six cameras could be housed on a single roadside pole with their fields of view adjusted for complete coverage of 500 linear feet of roadway. The testbed is being developed on a portion of I-24 (Tennessee), complementing the I-210 (California) test site presented in this chapter. This site was chosen for its roadside layout, traffic patterns, and existing data sources. Testing of the proposed configuration was conducted on an existing roadside pole owned by the Tennessee Department of Transportation (TDOT). The test also provided valuable video data for computer vision algorithm development. The poles carry 18 4K-resolution cameras that can provide continuous coverage of traffic on I-24, capturing trajectories from ≈150,000 vehicles daily [17] (Fig. 1).
Fig. 1 Installation of a roadside pole on the I-24
Fast Car Tracking Computer Vision Methods
To meet the needs of our project, we required a new method for object tracking. Existing methods, while fairly accurate, fall far short of the testbed needs in terms of speed; a fast method is necessary to process video data as it is streamed and to provide high-fidelity estimates of vehicle positions. The core innovation of this new tracking method is to rely on a convolutional neural network for object localization within a small crop of a video frame, rather than on object detectors that specialize in finding all object locations within an entire video frame. The lightweight localization network runs much faster than the object detector at only a small cost in accuracy. Computer vision algorithms have been continually developed on video data from cameras on roadside traffic monitoring poles, mimicking the setup of the I-24 testbed. This has enabled algorithms to be tuned for performance with large numbers of objects in view, which exhibit realistic vehicle dynamics within the field of view.

Sensing and Hardware Development
Modern vehicles (since 1995 in the United States) operate through a system of control units and sensors that are interconnected through a Controller Area Network (CAN) architecture. Messages between electronic control units throughout a vehicle are sent through a CAN bus (or in some cases on multiple buses), which provides an opportunity to detect the vehicle state directly by recording these messages. While many off-the-shelf solutions exist to inspect CAN messages or (in some cases) record these data, such solutions do not reliably deliver the messages in a timely way that permits closed-loop control of the system. To address these limitations in data fidelity and timeliness, we developed a custom library called libpanda that interfaces with a vehicle’s CAN bus through optimized software and off-the-shelf hardware for data acquisition in real time [5, 48].
Libpanda is a multi-threaded C++-based library for custom Panda interface applications (such as those in [41]) using the observer-based software design pattern (i.e., callbacks). Also featured in the software area are pre-made data recording utilities, simple data visualizers, startup services, and network handling services.
Libpanda utilities save CSV-formatted data that are of interest for vehicle state estimation, control, and in some cases estimation of surrounding traffic. Recorded data values include instantaneous velocity, acceleration, wheel speeds, steering angle (and rate), accelerator angle, brake angle, cruise control settings and state, as well as other vehicle states. Localization and information on the surroundings include radar trace information from vehicles equipped with adaptive cruise control systems, permitting estimation of following distance and lead vehicle dynamics. Additional contextual information is provided by synchronized GPS sensors that come standard with the off-the-shelf CAN bus hardware. Libpanda records, for each drive, a pair of files that are synchronized to GPS time: one containing the CAN data and the other the GPS data. Downstream analysis of these data files is performed by our data analytics tool strym [4], written in Python. Strym uses existing standards for decoding CSV-formatted CAN message files to produce time-series data representing desired signals for a particular drive. Validation of CAN data through offline analysis using strym permitted us to produce runtime analysis and computation on the data through a software bridge that joins the Robot Operating System (ROS) [38] with streaming data produced by libpanda. Through this bridge, additional software components can make runtime decisions on data coming from the CAN bus and could potentially inject control commands to vehicles in the future. Using these software and hardware interfaces, it is possible to imagine how to put a human in the loop with sensed data from the vehicle that a driver cannot perceive in real time. Inspired by this possibility, we created the CAN Coach, a system that continuously feeds time gap sensor information from the CAN bus back to the driver in near real time.
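The quantity that the CAN Coach feeds back can be illustrated with a short sketch. It assumes the usual definition of time gap (following distance divided by the follower's own speed), which the text does not spell out, and it uses made-up sample values. The function names are hypothetical and are not part of libpanda or strym.

```python
from statistics import mean, stdev

def time_gap(distance_m, speed_mps, min_speed_mps=0.1):
    """Time gap in seconds: following distance divided by own speed.

    Assumed definition. Returns None near standstill, where the ratio
    is not meaningful.
    """
    if speed_mps < min_speed_mps:
        return None
    return distance_m / speed_mps

def gap_error_stats(distances_m, speeds_mps, target_s=2.0):
    """Mean and sample standard deviation of (time gap - target)."""
    errors = []
    for d, v in zip(distances_m, speeds_mps):
        gap = time_gap(d, v)
        if gap is not None:
            errors.append(gap - target_s)
    return mean(errors), stdev(errors)

# Illustrative samples only (not measured data).
distances = [40.0, 44.0, 50.0, 46.0]   # following distance (m)
speeds = [20.0, 20.0, 25.0, 20.0]      # own speed (m/s)
mean_err, std_err = gap_error_stats(distances, speeds)
```

The two returned statistics correspond to the mean and standard deviation of the time gap error that are compared across driving modes.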
Three sets of preliminary experiments were conducted in which the study vehicle follows a lead vehicle driving a specified driving profile, to assess the potential of the CAN Coach to modify driver behavior. The experiments consider three modes: normal driving (the driver is given no prompt and no feedback), instructed driving (the driver is given a prompt to drive at a 2-second time gap but no feedback from the CAN Coach), and coached driving (a 2-second prompt and CAN Coach feedback). The mean time gap errors from the 2-second target are 0.39 s (normal driving), 0.09 s (instructed driving), and 0.01 s (coached driving). With the CAN Coach, the standard deviation of the time gap error was reduced by 72% and 68% relative to normal driving and instructed driving, respectively. Given this reduction in both the mean and the standard deviation of the time gap error, we conclude that it is possible to “coach” drivers using only data from the CAN. Figure 2 shows how the CAN Coach fits in the loop.

Fig. 2 A diagram showing where libpanda, ROS, and the CAN Coach fit in the hardware/software stack

Traffic Flow Modeling
To train and test wave smoothing control strategies, we have been developing high-fidelity traffic simulation models that explicitly produce nonequilibrium phenomena. Target traffic patterns, such as stop-and-go waves and traffic congestion, are evaluated and carefully tested for consistency with known phenomena on real roads. We developed a model to capture bulk traffic on a stretch of I-210 in California. We investigated various calibration routines and associated metrics [43]. This was done to prepare for the development of a model on I-24, and so while not complete, it served as a starting point to better design microsimulations consistent with the macrostatistics of bulk traffic flow.

Energy Modeling
Based on two fitted energy models, calibrated to measurements conducted by Toyota, we have produced two simplified models that adapt easily to the structure of the control and simulation framework employed below. Specifically, the models are of the form $P(v, a)$, i.e., the instantaneous energy consumption rate at time $t$ is given by $P(\dot{x}_i(t), \ddot{x}_i(t))$, where $\dot{x}_i(t)$ is the speed and $\ddot{x}_i(t)$ the acceleration of a vehicle at that same time $t$. The quantity $P$ can be fuel usage per time (g/s or ℓ/s) or the power equivalent (kW) of the battery depletion per time. One model (PriusEV 1.0) is for a hypothetical vehicle that possesses all the characteristics of the 2017 Prius, except for being a fully electric vehicle rather than a hybrid vehicle. While the (proprietary) measurements/models by Toyota do cover real battery properties, particularly the fact that the power consumption depends on the battery’s state of charge, the simplified fit in Fig. 3 is restricted to a fixed battery state of charge of 60%. The other model (Tacoma 1.0) represents an internal combustion engine vehicle, the 2017 Tacoma. These simplified models, shown in Fig. 3, are represented by explicit equations (and thus admit rapid evaluation) and are piecewise smooth, convex functions. This last property is an approximation of reality, as a real vehicle’s gear switching dynamics tend to introduce non-convexity effects that may render a vehicle-specific unsteady driving profile more energy-efficient than a uniform velocity profile.
The motivation for this simplification is that we wish to have energy metrics representing vehicle types in an averaged sense, rather than results that are fine-tuned to a specific single vehicle. In particular, the simple structure facilitates optimization and machine learning.
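Because the models have the form P(v, a), the energy used along a trajectory is the time integral of P evaluated on the speed and acceleration signals. The sketch below illustrates this with a trapezoidal rule. The function `toy_power` is a made-up placeholder used only to make the example runnable; it is not the PriusEV 1.0 or Tacoma 1.0 fit, whose coefficients are not given here, and all names are hypothetical.

```python
def toy_power(v, a, mass=1500.0, drag=0.4, idle=500.0):
    """Placeholder power demand in watts (NOT one of the fitted models).

    Crude road-load expression, clipped at zero (no energy recuperation).
    """
    return max(0.0, idle + mass * a * v + drag * v ** 3)

def trajectory_energy(times, speeds, accels, power=toy_power):
    """Integrate P(v(t), a(t)) over time with the trapezoidal rule (joules)."""
    rates = [power(v, a) for v, a in zip(speeds, accels)]
    energy = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        energy += 0.5 * (rates[k] + rates[k - 1]) * dt
    return energy

# Two 10 s profiles sampled every second, both close to 20 m/s.
times = [float(k) for k in range(11)]
steady_v, steady_a = [20.0] * 11, [0.0] * 11
# Oscillating profile: acceleration alternates +1 / -1 m/s^2.
wavy_a = [1.0 if k % 2 == 0 else -1.0 for k in range(11)]
wavy_v = [19.5 if k % 2 == 0 else 20.5 for k in range(11)]

steady_energy = trajectory_energy(times, steady_v, steady_a)
wavy_energy = trajectory_energy(times, wavy_v, wavy_a)
```

With this placeholder, the oscillating profile consumes more than the steady one. That outcome is a property of the toy expression and only illustrates how such a comparison would be set up with the fitted models.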
Fig. 3 Visualization of simplified models for energy consumption rate P (in power equivalent) as a function of instantaneous vehicle speed v and acceleration a. Left: PriusEV 1.0 model for an electric vehicle. The negative values of P reflect that the electric vehicle can recuperate some of the energy when braking. Right: Tacoma 1.0 for an internal combustion engine vehicle

Fig. 4 An example of a safe set. The safe set is defined in the domain where distance is positive, lead speed is between 0 and 30, and follower speed is between 0 and 30. The red surface represents the boundary of the safe set. The safe set is the set of states bounded by the surface (i.e., all states below the surface)
Safety
With real-life implementation in mind, ensuring safety of the control algorithms is paramount. We seek to establish a method of validating safety via reachability analysis, i.e., by looking at the set of reachable states for the AV given admissible controls. We aim to produce safe sets, i.e., sets of states from which the system, evolved forward for all time under our controller, never reaches a collision. To solve the reachability problem, we formulate it as a two-player game with a leading vehicle and a subject (following) vehicle. The two-player game is an optimization problem over the control inputs of both vehicles. The cost function of the optimization problem is the minimum distance between the two vehicles over all times. The subject vehicle is governed by a controller, while the leading vehicle’s input is an input that minimizes the cost function. Both vehicles have a maximal acceleration and deceleration. The safe set is determined as the set of states at which the optimal value function is positive. An example is shown in Fig. 4.

Control Design Using Reinforcement Learning
Due to the complexities of the driving and lane-changing model, one approach pursued is model-free reinforcement learning (RL), which can return effective controllers via optimization. At a high level, the RL formulation consists of defining a reward function (in this case, maximizing average miles per gallon) and then searching for a controller that optimizes the sum of the reward over a given horizon. By casting the problem as an optimization problem, we can find controllers for systems whose dynamics are challenging to analyze. RL works by running the controller repeatedly in an environment, i.e., a simulator or a real-world deployment. Figure 5 demonstrates the process of running RL to acquire sequences of (state, action, reward) triples. Over these repeated runs, we acquire an estimate of the expected cumulative reward and then use optimization schemes to update the controller parameters to increase the expected cumulative reward, often using first-order/gradient-based methods. By running this process repeatedly, we eventually return a controller that is close to a local maximum of the cumulative reward function.

Fig. 5 The reinforcement learning loop. At each time step, an agent (the controller) receives a state and reward from the environment/simulator, uses the state to compute a desired action, and then executes that action to receive its next state and reward. This loop continues until the system horizon is reached

Formally, since the autonomous vehicles are decentralized and their observations are local, we cast the problem as a decentralized partially observable Markov decision process (DEC-POMDP), a formalism in which decentralized actors each try to optimize their reward functions while having only partial access to the true global system state. In particular, each of our cars observes only its own speed, the speed of its leading car, and the distance to the lead car; these are all quantities that can easily be acquired by radar but are insufficient to reconstruct the global state. RL studies the problem of how an agent can learn to take actions in its environment to maximize its cumulative discounted reward.
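The loop of Fig. 5 and the notion of cumulative reward can be made concrete with a deliberately tiny example. Everything in it is an assumption made for illustration: the one-vehicle "environment", the proportional controller, and the quadratic reward are stand-ins, not the simulator, policy, or reward used in this project.

```python
def step(state, action, dt=0.1):
    """Toy environment: state is a speed (m/s), action an acceleration (m/s^2).

    The reward penalizes squared acceleration, a stand-in for energy use.
    """
    next_state = max(0.0, state + action * dt)
    reward = -(action ** 2) * dt
    return next_state, reward

def controller(state, target=10.0, gain=0.5):
    """Toy policy: accelerate proportionally to the speed error."""
    return gain * (target - state)

def rollout(policy, initial_state, horizon):
    """Run the agent-environment loop; collect (state, action, reward) triples."""
    trajectory = []
    state = initial_state
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory

def cumulative_reward(trajectory, discount=1.0):
    """Sum of (optionally discounted) rewards along one rollout."""
    return sum((discount ** t) * r for t, (_, _, r) in enumerate(trajectory))

trajectory = rollout(controller, initial_state=0.0, horizon=100)
total = cumulative_reward(trajectory)
```

In an actual RL method, many such rollouts would be averaged to estimate the expected cumulative reward, and the controller parameters (here only `gain`) would be updated to increase it.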
Specifically, each agent, indexed by $i$, tries to optimize
\[
J^{\pi_i} = \mathbb{E}_{\rho_0,\, p(s_{t+1} \mid s_t, a_t)} \left[ \sum_{t=0}^{T} r_t(s_t, a_t^i) \;\middle|\; \pi_i(a_t^i \mid o_t^i) \right],
\]
where $r_t$ is the reward at time $t$ and the expectation is over a distribution of initial states $\rho_0$, the probabilistic dynamics $p(s_{t+1} \mid s_t, a_t)$, and the probabilistic controller $\pi_i$, which depends on a local observation $o_t^i$ rather than the global state $s_t$. Here we have each agent optimize a local reward function rather than a global reward function. As a preliminary example of this approach, we use Proximal Policy Optimization [42] to approximately optimize the following intuitive reward function: $r(s_t, a_t) = -P(v_t, a_t)$, i.e., the negative of the instantaneous energy consumption rate. Additionally, a bonus of 5 is added to the reward every time the vehicle completes 50 m to ensure that it makes forward progress. After several hundred iterations, the controller generates the space-time diagram shown in Fig. 6 (only one lane is shown) at a penetration rate of 10%.

Fig. 6 Leftmost lane of the I-210 without control (left) and under control of AVs at a 10% penetration rate (right)

Control Design Using Mathematical Models
Control strategies were designed in multilane multi-population microscopic models by using a small number of autonomous vehicles (AV) (less than 5% penetration rate) as Lagrangian control actuators that can smooth stop-and-go waves in multilane traffic flow. A microscopic multilane model is typically composed of two components: longitudinal dynamics for each lane and a lane-changing mechanism. The parameters used in the lane-changing mechanism were shown to have a potentially large impact on the macroscopic behavior of the system [28]. This implies that the control algorithm has to be robust with respect to the lane-changing mechanism. Two other approaches were considered by looking at the traffic at two higher scales: a macro-model consisting of a PDE coupled to several ODEs by a flux relation, which represents the interaction between the global traffic flow and the AV, identified as moving bottlenecks, and a new mean-field model that allows us to take the behavior of the well-known and well-studied microscopic models (such as the IDM) to a mesoscopic scale. The latter is obtained by considering the microscopic equations as equations on concentrated distributions representing the individual vehicles and showing convergence when the number of vehicles goes to infinity. In the following, we focus on a way to design an efficient controller from a microscopic model. This is detailed in Sects. 3–4.
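Returning to the reinforcement learning example of this section, the reward described there (the negative energy rate plus a bonus of 5 for every 50 m completed) can be sketched as follows. The energy rate is passed in as a number because the fitted models are not reproduced here. How the bonus is detected (counting the 50 m boundaries crossed during a step) is an assumption, as are all names.

```python
def progress_bonus(prev_position, position, segment=50.0, bonus=5.0):
    """Bonus for every full `segment` of road completed during this step."""
    crossed = int(position // segment) - int(prev_position // segment)
    return bonus * max(0, crossed)

def reward(energy_rate, prev_position, position):
    """r = -P(v, a) plus the forward-progress bonus."""
    return -energy_rate + progress_bonus(prev_position, position)

# No boundary crossed: pure energy penalty.
r_plain = reward(3.0, prev_position=10.0, position=12.0)
# The 50 m mark is crossed during this step: the bonus is added.
r_bonus = reward(3.0, prev_position=49.0, position=51.0)
```

Without the bonus term, standing still would maximize a reward of this kind whenever the energy rate at rest is smallest, which is the situation the forward-progress bonus is meant to rule out.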
2 Control System Studied
Depending on the scale at which they represent vehicular traffic, mathematical traffic models can usually be classified into different categories: microscopic, mesoscopic, macroscopic, and cellular. We refer to the survey papers [3, 6] and references therein for general discussions of the models at various scales in the literature. In this section, we only focus on microscopic and macroscopic models. Microscopic traffic models represent traffic by looking at each single vehicle. The aim is to simulate each single vehicle via the variables of position and velocity. Each vehicle is seen as a particle or agent, and its trajectory usually evolves according to the behavior of the vehicles in front. The dynamics of all vehicles can then be described by a system of ordinary differential equations (ODEs). In this section, we first focus on two microscopic traffic models:
• The Bando-Follow the Leader (Bando-FTL) model
• The Intelligent Driver Model (IDM)
Then, we introduce a novel multilane, multiclass traffic model by dividing the vehicle population into human-driven vehicles and autonomous vehicles. The autonomous vehicles are influenced by external policy-makers with controlled dynamics. The multilane traffic is distinguished by its hybrid nature: continuous dynamics on each lane and discrete events due to lane-changing. For the lane-changing mechanism, we consider three components: safety (cars do not become too close), incentive (cars benefit from changing lanes with larger prescribed acceleration), and cooldown time (cars cannot change lanes too rapidly). Finally, we study an optimal control problem related to the controlled microscopic hybrid system to model, for instance, the minimization of energy cost.
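The three components of the lane-changing mechanism can be read as three conditions that must hold simultaneously. The sketch below only illustrates that structure: the thresholds, the argument names, and the exact form of each test are assumptions, not the rules used in the models of this chapter.

```python
from dataclasses import dataclass

@dataclass
class LaneChangeParams:
    min_gap: float = 5.0    # assumed minimum gap to new neighbors (m)
    min_gain: float = 0.2   # assumed acceleration gain required (m/s^2)
    cooldown: float = 3.0   # assumed minimum time between changes (s)

def may_change_lane(gap_front, gap_rear, accel_current, accel_target,
                    time_since_change, p=LaneChangeParams()):
    """Return True only if safety, incentive, and cooldown all allow a change."""
    # Safety: enough room to the new leader and the new follower.
    safe = gap_front >= p.min_gap and gap_rear >= p.min_gap
    # Incentive: the prescribed acceleration in the target lane is larger.
    incentive = accel_target - accel_current >= p.min_gain
    # Cooldown: enough time has passed since the last lane change.
    rested = time_since_change >= p.cooldown
    return safe and incentive and rested
```

Each lane change is a discrete event of this kind, applied on top of the continuous car-following dynamics, which is what gives the multilane model its hybrid nature.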
2.1 Microscopic Traffic Models: Bando-FTL Model and IDM Model
Consider a population of $P \in \mathbb{N}$ vehicles on an open stretch of single-lane road. For each vehicle $i \in \{1, \dots, P\}$, let $i_L \in \{1, \dots, P\}$ and $i_F \in \{1, \dots, P\}$ be the indices of the vehicles leading and following vehicle $i$, respectively. In other words, we associate with vehicle $i$ an index vector $\iota(i) = (i, i_L, i_F)$, where $i_L$ is the index of the leader and $i_F$ the index of the follower of vehicle $i$. To fix notation, we set $i_L = 0$ if there is no vehicle in front of vehicle $i$ and $i_F = 0$ if there is no vehicle following vehicle $i$. Let the considered time horizon $T \in \mathbb{R}_{>0}$ be fixed. For each vehicle $i \in \{1, \dots, P\}$, let $(x_i(t), v_i(t)) \in \mathbb{R} \times \mathbb{R}_{\ge 0}$ be the position-velocity vector of vehicle $i$ at time $t \in [0, T]$.

The Bando-FTL Model
The Bando-Follow the Leader (FTL) model is a combination of the Bando model and the Follow the Leader model. In this section, we introduce the Bando model and the Follow the Leader model separately before giving the formulation of the Bando-FTL model. The Bando model, which is also called the Optimal Velocity (OV) model, is a traffic model proposed in [1]. One key feature is that each vehicle adjusts its acceleration or deceleration according to the difference between its current velocity and its own “optimal” or preferred velocity. Specifically, the dynamics are given by the following system of second-order ODEs:
\[
\begin{aligned}
\dot{x}_i(t) &= v_i(t), & t \in [0, T],\ i \in \{1, \dots, P\}, \\
\dot{v}_i(t) &= a \big( V(h_i(t)) - v_i(t) \big), & t \in [0, T],\ i \in \{1, \dots, P\}.
\end{aligned}
\tag{1}
\]
In the above formula, the dot represents the time derivative, and $a \in \mathbb{R}_{>0}$ is a sensitivity constant. For every $t \in [0, T]$, $h_i(t) := x_{i_L}(t) - x_i(t)$ is the headway of vehicle $i$. The function $V : \mathbb{R} \to \mathbb{R}$ is called “OV function” and describes the optimal velocity determined by the headway. In general, $V$ is monotonically increasing with respect to the headway, i.e., the optimal velocity is “smaller” if the headway is small and increases as the headway increases. In addition, if the headway is large enough for a sufficiently large time, the vehicle’s velocity should arrive at the maximum velocity $v_{\max} \in \mathbb{R}_{>0}$, which may depend on the driver. One example of OV function is $V : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$, $h \mapsto V(h)$, with
\[
V(h) = v_{\max} \, \frac{\tanh(h - l - d) + \tanh(l + d)}{1 + \tanh(l + d)},
\tag{2}
\]
where $l \in \mathbb{R}_{>0}$ is the length of the vehicles and $d \in \mathbb{R}_{>0}$ is the minimum safe distance between vehicles. Note that the equilibrium point of the Bando model is obtained when all vehicles travel at constant velocity and have the same headway. The Bando model has been extended in [33] by considering interactions with the following vehicle (in addition to the leading one) to stabilize the traffic flow.

The Follow the Leader (FTL) model was introduced in [16]. It is based on the idea that the acceleration of a vehicle is determined only by the vehicle in front. In particular, the acceleration of a vehicle is directly proportional to the difference between its leader’s velocity and its own velocity and is inversely proportional to the square of the vehicle’s headway. The main dynamics are given by
\[
\begin{aligned}
\dot{x}_i(t) &= v_i(t), & t \in [0, T],\ i \in \{1, \dots, P\}, \\
\dot{v}_i(t) &= b \, \frac{v_{i_L}(t) - v_i(t)}{(h_i(t))^2}, & t \in [0, T],\ i \in \{1, \dots, P\},
\end{aligned}
\tag{3}
\]
where $b \in \mathbb{R}_{>0}$ reflects the sensitivity of the driver. In the case that there is no leading vehicle for vehicle $i$, i.e., $i_L = 0$, the dynamics of vehicle $i$ are given by
\[
\dot{x}_i(t) = v_{\max} \quad \forall t \in [0, T].
\tag{4}
\]
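Equations (1)–(4) translate directly into code. The sketch below evaluates the OV function of Eq. (2) and integrates a two-vehicle FTL platoon, Eqs. (3)–(4), with an explicit Euler scheme. The parameter values, the time step, and the choice of integrator are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def optimal_velocity(h, v_max=30.0, l=4.5, d=2.5):
    """OV function of Eq. (2): zero at h = 0, tends to v_max for large h."""
    return v_max * (math.tanh(h - l - d) + math.tanh(l + d)) / (1.0 + math.tanh(l + d))

def bando_acceleration(h, v, a=0.5, **ov):
    """Right-hand side of Eq. (1): relax toward the optimal velocity."""
    return a * (optimal_velocity(h, **ov) - v)

def ftl_acceleration(h, v, v_lead, b=20.0):
    """Right-hand side of Eq. (3) for a positive headway h."""
    return b * (v_lead - v) / h ** 2

def simulate_ftl_pair(x, v, x_lead, v_max=30.0, dt=0.01, steps=2000):
    """Leader obeys Eq. (4); the follower obeys Eq. (3). Explicit Euler."""
    for _ in range(steps):
        acc = ftl_acceleration(x_lead - x, v, v_max)
        x, v = x + v * dt, v + acc * dt
        x_lead += v_max * dt
    return x, v, x_lead

# Follower starts 30 m behind a free leader and 10 m/s slower.
x, v, x_lead = simulate_ftl_pair(x=0.0, v=20.0, x_lead=30.0)
```

In this run the follower accelerates while the headway grows, and because the FTL acceleration decays with the square of the headway, the speed gain stays small.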
We would like to mention the following drawback of the FTL model: the acceleration for vehicle i is zero as long as vehicle i and its leader have the same velocity. This implies that an extremely small headway is allowed even with high speeds. One way to handle the abovementioned drawback of the FTL model is to combine the Bando model and the FTL model in the following way:
$$\begin{aligned}
\dot{x}_i(t) &= v_i(t), && t \in [0, T],\ i \in \{1, \dots, P\},\\
\dot{v}_i(t) &= a\big(V(h_i(t)) - v_i(t)\big) + b\,\frac{v_{i_L}(t) - v_i(t)}{(h_i(t))^2}, && t \in [0, T],\ i \in \{1, \dots, P\}.
\end{aligned} \tag{5}$$
We point out that the Bando-FTL model justifies the fact that drivers adjust their acceleration or deceleration based on their own velocities, the optimal velocities, and the velocities of their leading vehicles. This model allows formation and persistence of stop-and-go waves and can be well calibrated following the experimental results of [46] (see [11]).

The Intelligent Driver Model
The Intelligent Driver Model (IDM) introduced in [49] is a time-continuous microscopic car-following model, which is widely used in the traffic engineering community. It assumes that each vehicle driver decides to accelerate or to brake depending only on their own velocity and on the position and velocity of the leading vehicle immediately ahead. To simplify notations, we set the velocity difference or approaching rate of vehicle $i$ at time $t \in [0, T]$ as $\Delta v_i(t) := v_i(t) - v_{i_L}(t)$ and the net distance of vehicle $i$ at time $t \in [0, T]$ as $s_i(t) := x_{i_L}(t) - x_i(t) - l$, where $l \in \mathbb{R}_{>0}$ is again the vehicle length. The dynamics of vehicles are then described by the following system of ODEs:
$$\begin{aligned}
\dot{x}_i(t) &= v_i(t), && t \in [0, T],\ i \in \{1, \dots, P\},\\
\dot{v}_i(t) &= a\left[1 - \left(\frac{v_i(t)}{v_0}\right)^{\delta} - \left(\frac{s^*(v_i(t), \Delta v_i(t))}{s_i(t)}\right)^{2}\right], && t \in [0, T],\ i \in \{1, \dots, P\},
\end{aligned} \tag{6}$$
where the desired minimum gap of vehicle $i$, represented by the function $s^* : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, is given by
$$s^*(v_i(t), \Delta v_i(t)) := s_0 + v_i(t)\tau + \frac{v_i(t)\,\Delta v_i(t)}{2\sqrt{ab}}, \qquad t \in [0, T]. \tag{7}$$
The parameters’ meaning and choice are as follows:
• $v_0 \in \mathbb{R}_{>0}$ is the desired velocity the vehicle would drive at in free traffic (m/s).
• $s_0$ is the minimum desired net distance.
• $\tau$ is the safety time gap.
• $a$ is the maximum vehicle acceleration.
• $b$ is the comfortable braking deceleration.
• The exponent $\delta \in \mathbb{R}_{>0}$ is a tuning parameter, usually set to 4.
The desired minimum gap depends on the safety time gap, the vehicle acceleration, the deceleration, and the velocity difference. Specifically, Eq. (7) contains three terms:
1. The minimum distance $s_0$ in congested traffic
2. The safety gap that the follower must have with its leader, $v_i(t)\tau$
3. The term $\frac{v_i(t)\,\Delta v_i(t)}{2\sqrt{ab}}$, which is designed to stabilize the vehicle platoon in terms of velocity
Furthermore, the acceleration of vehicle $i$ can be separated into a free road term
$$a_{\mathrm{free}}(t) = a\left[1 - \left(\frac{v_i(t)}{v_0}\right)^{\delta}\right], \qquad t \in [0, T], \tag{8}$$
and an interaction term
$$a_{\mathrm{int}}(t) = -a\left(\frac{s^*(v_i(t), \Delta v_i(t))}{s_i(t)}\right)^{2} = -a\left(\frac{s_0 + v_i(t)\tau}{s_i(t)} + \frac{v_i(t)\,\Delta v_i(t)}{2\sqrt{ab}\,s_i(t)}\right)^{2}, \qquad t \in [0, T]. \tag{9}$$
Note that in the case of a free road, when the net distance $s_i$ of vehicle $i$ is large, the vehicle’s acceleration is governed by the free road term Eq. (8), which vanishes as $v_i$ approaches $v_0$. Thus, a vehicle on a free road will gradually approach its desired velocity $v_0$. For large approaching rate $\Delta v_i$, the interaction term Eq. (9) is dominated by the term
$$-a\left(\frac{v_i(t)\,\Delta v_i(t)}{2\sqrt{ab}\,s_i(t)}\right)^{2} = -\frac{(v_i(t)\,\Delta v_i(t))^2}{4b\,s_i(t)^2},$$
which leads to a driving behavior that compensates velocity differences while trying not to brake much harder than the comfortable braking deceleration $b$. For negligible velocity differences and small net distance, the interaction term Eq. (9) is approximately equal to $-a\,\frac{(s_0 + v_i(t)\tau)^2}{s_i(t)^2}$. This resembles a simple repulsive force such that small net distances are quickly enlarged toward an equilibrium net distance.
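The split of the IDM acceleration into a free road term and an interaction term, Eqs. (6)-(9), can be sketched in Python as follows. The function name and all default parameter values are assumptions made for illustration; only the exponent 4 for `delta` comes from the text.

```python
import math


def idm_terms(v, dv, s, v0=30.0, s0=2.0, tau=1.5, a=1.0, b=1.5, delta=4.0):
    """Return (a_free, a_int) of Eqs. (8) and (9).

    v: own velocity, dv: approaching rate v - v_lead, s: net distance.
    Defaults are hypothetical example values.
    """
    # Desired minimum gap, Eq. (7).
    s_star = s0 + v * tau + v * dv / (2.0 * math.sqrt(a * b))
    # Free road term, Eq. (8): vanishes when v reaches v0.
    a_free = a * (1.0 - (v / v0) ** delta)
    # Interaction term, Eq. (9): repulsion that grows as the gap closes.
    a_int = -a * (s_star / s) ** 2
    return a_free, a_int


def idm_acceleration(v, dv, s, **params):
    """Total IDM acceleration of Eq. (6)."""
    a_free, a_int = idm_terms(v, dv, s, **params)
    return a_free + a_int
```

For example, a stopped vehicle whose net distance equals `s0` behind a stopped leader has `a_free = a` and `a_int = -a`, so the total acceleration is zero.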
We point out that the IDM model has many drawbacks when it comes to the drivers’ safety and the vehicles’ real capability especially in the case of a collision (see [12]). Therein, a modified version of IDM model was proposed and tested in terms of string stabilization.
2.2 Lane-Changing Conditions for Multilane Traffic
In the case of multilane traffic, frequent lane-changing maneuvers could disrupt traffic flow and, even worse, lead to accidents. In addition, lane-changing behaviors have a significant impact on the formation and propagation of stop-and-go traffic waves. Recently, efforts to model lane changes have rapidly increased (see [25, 26, 29]). In this subsection, we propose lane-changing rules based on acceleration due to two main advantages (see [54]):
1. The lane-changing decision-making process is dramatically simplified.
2. One can readily calculate the accelerations based on an underlying microscopic traffic model.
We consider an open stretch of road with $m \in \mathbb{N}_{\geq 1}$ lanes and assume that the number of vehicles on lane $j \in J := \{1, \dots, m\}$ is $P_j$. We associate each vehicle with
$i \in \{1, \dots, P_j\}$, on lane $j \in J$, a vector of indices $\iota(i) = (i, j, i_L, i_F)$, where $i_L$ and $i_F$ are defined as in Sect. 2.1. We design lane-changing rules based on three components (cooldown time, safety, and incentive for lane-changing) and assume that lane-changing is performed instantaneously. Then, we propose that a vehicle performs a lane change if the following conditions are met:
1. The vehicle did not make a recent lane change. To prevent a vehicle from changing lanes too frequently, we assign each vehicle a cooldown time $\tau > 0$. A vehicle is allowed to perform lane-changing at most once every $\tau$ time units. This constraint is motivated by two reasons: in practice, a lane change is not instantaneous (so there should be a fixed minimal duration between two lane changes), and a driver who has just changed lanes is less likely to change lanes again right away. According to [24], only 15% of the vehicles change lanes while traveling the road section. In the simulation presented in [19], we choose the cooldown time T1 = 5 s over the time interval [0, 1000 s]. The simulation shows that our model with this cooldown time can handle the case in which around 40% of the vehicles perform a lane change. In particular, one can model a higher lane-changing frequency by taking smaller cooldown times.
2. The lane change can be performed safely into a target lane. This corresponds to having enough space between the two vehicles in the target lane between which the vehicle is moving.
3. There is sufficient incentive to perform lane-changing. In other words, if the expected acceleration of the vehicle in the new lane is sufficiently larger than its acceleration in the current lane, the lane change will be performed (assuming that the other requirements are met).
We now instantiate the above rules in a mathematically precise way.
We associate with each vehicle $i$ in lane $j \in J$ an internal time for lane change, called $\tau_i^j : [0, T] \to \mathbb{R}_{\geq 0}$ with $T > 0$ being fixed, such that the following holds:
$$\dot{\tau}_i^j(t) = 1, \quad t \in [0, T], \qquad \tau_i^j(0) = \tau_{i,0}^j \in [0, \tau), \tag{10}$$
where $\tau \in \mathbb{R}_{>0}$ is the cooldown time. To avoid synchronous lane changes, we assume that all initial conditions of the internal times are distinct from one another. That is, we assume for $j, j' \in J$, $i \in \{1, \dots, P_j\}$, $k \in \{1, \dots, P_{j'}\}$, with $j \neq j'$ and $i \neq k$,
$$\tau_{i,0}^{j} \neq \tau_{k,0}^{j'}.$$
We assume that a vehicle does not change lanes unless its internal time reaches the cooldown time $\tau$. In addition, we reset the internal time for each vehicle to be zero just after it reaches the cooldown time $\tau$. Thus, we set
$$\begin{aligned}
\dot{\tau}_i^j(t) &= 1, && t \in\, ]k\tau, (k+1)\tau] \cap [0, T],\\
\tau_i^j(k\tau+) &= 0, && k \in \{1, 2, \dots\},\ j \in J,\ i \in \{1, \dots, P_j\},
\end{aligned} \tag{11}$$
where $k\tau+$ represents the right-hand side limit, which is needed as $\tau_i^j$ is only piecewise continuous. Note that we do not allow two vehicles to change lane at the same time. This assumption is reasonable since it has been experimentally shown that lane-changing is not frequent in a traffic flow (see [27]). Let us now describe the safety and incentive conditions. Let $\Delta \in \mathbb{R}_{>0}$ be fixed. Vehicle $i$ on lane $j \in J$ will change to lane $j' = j + 1$ or $j' = j - 1$, with $j' \in J$, at time $t \in [0, T]$, if the following conditions are met:
Safety: $\bar{a}_i^{j'}(t) \geq -\Delta$ and $\bar{a}_l^{j'}(t) \geq -\Delta$;
Incentive: $\bar{a}_i^{j'}(t) \geq a_i^{j}(t) + \Delta$,
where $l$ is the index of the potential follower of vehicle $i$ on the new lane $j'$, $a_i^j(t)$ denotes the acceleration of vehicle $i$ on lane $j$ at time $t$, and $\bar{a}_l^{j'}(t)$ and $\bar{a}_i^{j'}(t)$ are the expected accelerations of vehicles $l$ and $i$ on lane $j'$ at time $t$, respectively.
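The three lane-changing components (cooldown, safety, incentive) can be sketched as the following Python helpers. This is only an illustration under assumptions: the function names are hypothetical, the internal time is advanced with a fixed step `dt`, and the expected accelerations are assumed to be computed elsewhere from a car-following model such as the IDM.

```python
def step_internal_time(timer, dt, cooldown):
    """Advance the internal time at unit rate, as in Eqs. (10)-(11).

    Returns (new_timer, may_change): the timer is reset to zero once it
    reaches the cooldown time, and only then may a lane change be tested.
    """
    timer += dt
    if timer >= cooldown:
        return 0.0, True
    return timer, False


def lane_change_conditions(a_current, a_new_self, a_new_follower, threshold):
    """Safety and incentive conditions with threshold Delta > 0.

    a_current: acceleration of the vehicle on its current lane
    a_new_self: expected acceleration of the vehicle on the target lane
    a_new_follower: expected acceleration of its potential follower there
    """
    safe = a_new_self >= -threshold and a_new_follower >= -threshold
    incentive = a_new_self >= a_current + threshold
    return safe and incentive
```

A simulation loop would call `step_internal_time` for each vehicle at every step and evaluate `lane_change_conditions` for each adjacent lane only when `may_change` is true.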
2.3 A Controlled Hybrid System
In this subsection, we present a multi-population multilane model. First, we divide the whole traffic population into two classes: autonomous vehicles and human-driven vehicles. Let $M_j \in \mathbb{N}_{\geq 1}$ and $N_j \in \mathbb{N}_{\geq 1}$ represent the number of autonomous vehicles and human-driven vehicles on lane $j \in J$, respectively. Because external policy-makers can influence the dynamics of the autonomous vehicles, we only add controls to the autonomous vehicles instead of controlling the whole traffic population. The dynamics of vehicles on each lane are continuous and are governed by a system of ODEs consisting of, for instance, the Bando-FTL model or the IDM described in Sect. 2.1. Roughly speaking, for any microscopic model, the acceleration of an individual vehicle is determined by its desired velocity and the position-velocity vectors of its leader and its own. Let $(x_i^j(t), v_i^j(t)) \in \mathbb{R} \times \mathbb{R}_{\geq 0}$ be the position-velocity vector of human-driven vehicle $i$ and $(y_k^j(t), w_k^j(t)) \in \mathbb{R} \times \mathbb{R}_{\geq 0}$ be the position-velocity vector of autonomous vehicle $k$ on lane $j \in J$ at time $t \in [0, T]$. Then, the dynamics of vehicles on the $j$-th lane can be written for $t \in [0, T]$ as
$$\begin{aligned}
\dot{y}_k^j(t) &= w_k^j(t), && (j, k) \in J \times \{1, \dots, M_j\},\\
\dot{w}_k^j(t) &= \mathrm{ACC}\big(y_k^j(t), w_k^j(t), z_{k_L}^j(t), \nu_{k_L}^j(t), v_k^{j,*}(t)\big) + u_k^j(t), && (j, k) \in J \times \{1, \dots, M_j\},\\
\dot{x}_i^j(t) &= v_i^j(t), && (j, i) \in J \times \{1, \dots, N_j\},\\
\dot{v}_i^j(t) &= \mathrm{ACC}\big(x_i^j(t), v_i^j(t), \tilde{z}_{i_L}^j(t), \tilde{\nu}_{i_L}^j(t), v_i^{j,*}(t)\big), && (j, i) \in J \times \{1, \dots, N_j\},
\end{aligned} \tag{12}$$
where $\mathrm{ACC} : \mathbb{R} \times \mathbb{R}_{\geq 0} \times \mathbb{R} \times \mathbb{R}_{\geq 0} \times \mathbb{R}_{\geq 0} \to \mathbb{R}$ is the general formulation of a vehicle’s acceleration based on a microscopic traffic model; $(z_{k_L}^j(t), \nu_{k_L}^j(t))$ and $(\tilde{z}_{i_L}^j(t), \tilde{\nu}_{i_L}^j(t))$ are the position-velocity vectors of the leaders of autonomous vehicle $k$ and human-driven vehicle $i$ on lane $j$, respectively, at time $t \in [0, T]$; and $v_k^{j,*}(t)$ and $v_i^{j,*}(t)$ are the desired velocities of autonomous vehicle $k$ and human-driven vehicle $i$ on lane $j$, respectively. Notice that the dynamics of autonomous vehicle $k$ on lane $j$ are distinguished by the control term $u_k^j : [0, T] \to \mathbb{R}$. In addition, the lane-changing mechanism of the vehicles introduced in Sect. 2.2 generates discrete events for the multilane, multiclass traffic system. The presence of both time-dependent continuous dynamics in Eq. (12) and discrete events leads to a system of hybrid nature (see [35, 36]). Such systems are characterized by the presence of continuous dynamics, which, at discrete times (so-called switching times), are affected by logic variables. The latter, in turn, change their values depending on the values of the continuous variables at switching times. We call such a system a hybrid system or a switched system. In our example, the switching times are precisely the lane-changing times. The discrete variables are the lane indices of each vehicle, which may change depending on the positions, speeds, and accelerations of vehicles, which represent the continuous variables. Because of the cooldown time assumption, we avoid the famous Zeno phenomenon, which strongly impacts the behavior of hybrid systems [9]. Furthermore, we formulate an optimal control problem related to the mentioned controlled hybrid system to minimize, for instance, the energy cost. Let $U := \{u : [0, T] \to \mathbb{R}^M\}$ be the set of admissible controls, where $M = \sum_{j \in J} M_j$ is the total number of the autonomous vehicles. We define a cost functional $F : U \to \mathbb{R}$ by
$$F(u) = \sum_{j \in J} \int_0^T \left\{ L_j\big(x^j(t), v^j(t), y^j(t), w^j(t), u^j(t)\big) + \sum_{k=1}^{M_j} \frac{|u_k^j(t)|}{M_j} \right\} dt, \tag{13}$$
where for each $j \in J$, the Lagrangian function $L_j : \mathbb{R}^{N_j} \times \mathbb{R}_{\geq 0}^{N_j} \times \mathbb{R}^{M_j} \times \mathbb{R}_{\geq 0}^{M_j} \times \mathbb{R}^{M_j} \to \mathbb{R}$ is sufficiently smooth. The hybrid optimal control problem is to minimize the cost functional Eq. (13) over the set of admissible controls $U$, where $x$, $v$, $y$, and $u$ satisfy the hybrid system Eq. (12). For the existence of optimal control, we refer the interested reader to [13].
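To make the structure of the cost functional Eq. (13) concrete, the sketch below approximates it with a left Riemann sum on a uniform time grid. It is an illustration under assumptions, not the chapter's implementation: the function name, the nested-list data layout, and the idea of passing precomputed values of the Lagrangian $L_j$ are all hypothetical.

```python
def cost_functional(running_cost, controls, dt):
    """Left Riemann approximation of Eq. (13).

    running_cost[j][n]: value of L_j at time step n (assumed precomputed)
    controls[j][k][n]:  control u_k^j of autonomous vehicle k on lane j
                        at time step n (M_j >= 1 vehicles per lane)
    dt: time step of the uniform grid
    """
    total = 0.0
    for lane_cost, lane_controls in zip(running_cost, controls):
        m_j = len(lane_controls)  # number of autonomous vehicles on the lane
        for n, l_value in enumerate(lane_cost):
            # Average absolute control effort on this lane at step n.
            effort = sum(abs(u_k[n]) for u_k in lane_controls) / m_j
            total += (l_value + effort) * dt
    return total
```

For one lane with two autonomous vehicles, two time steps of length 0.5, $L_j \equiv 1$, and controls `[1.0, -1.0]` and `[3.0, 0.0]`, the sum is $(1 + 2)\cdot 0.5 + (1 + 0.5)\cdot 0.5 = 2.25$.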
2.4 A Mean-Field Approach
Macroscopic traffic models describe vehicular traffic as fluid flow by assuming a sufficiently large number of vehicles on a road. By capturing and predicting the main phenomena of microscopic dynamics, macroscopic models provide overall and statistical views of traffic. There are three important variables in a macroscopic traffic model:
• The traffic density $\rho : \mathbb{R} \times [0, T] \to \mathbb{R}_{\geq 0}$
• The speed $v : \mathbb{R} \times [0, T] \to \mathbb{R}_{\geq 0}$ of the traffic
• The flow rate $q : \mathbb{R} \times [0, T] \to \mathbb{R}_{\geq 0}$ passing through a fixed point
The three variables satisfy the following relationship: for $(x, t) \in \mathbb{R} \times [0, T]$,
$$q(x, t) = \rho(x, t)\,v(x, t). \tag{14}$$
The evolution of the traffic density is governed by partial differential equations (PDEs). For example, the Lighthill-Whitham-Richards (LWR) model in [31, 40] consists of a single conservation law for the density and closes the system by assuming that the speed depends only on the density. Models with more equations and different “closure relationships” were developed (see [14, 15] for an extensive discussion). For the multi-population traffic system introduced in Sect. 2.3, one can use a coupled ODE-PDE system to model the dynamics of a small number of autonomous vehicles with ODEs and the large number of human-driven vehicles with PDEs. This is a combination of the microscopic and macroscopic models using multiple scales. One may use a mean-field approach to relate the two different scales, i.e., microscopic and macroscopic, both formally and rigorously by letting the number $P$ of vehicles go to infinity. For a population of $P \in \mathbb{N}$ vehicles on an open stretch of a single lane, we again let, for $t \in [0, T]$, the vector $(x_i(t), v_i(t)) \in \mathbb{R} \times \mathbb{R}_{\geq 0}$ be the position-velocity of vehicle $i \in \{1, \dots, P\}$. We assume that drivers adjust their acceleration based on the position-velocity vectors of several vehicles ahead (instead of only one vehicle as is usually assumed in classical car-following models). To realize that, we introduce a locally Lipschitz convolutional kernel function $H : \mathbb{R} \times \mathbb{R}_{\geq 0} \to \mathbb{R}$, which is of sublinear growth, and describe the dynamics of the $P$ vehicles as
$$\begin{aligned}
\dot{x}_i(t) &= v_i(t), && t \in [0, T],\ i \in \{1, \dots, P\},\\
\dot{v}_i(t) &= H * \mu_P(x_i, v_i)(t), && t \in [0, T],\ i \in \{1, \dots, P\},
\end{aligned} \tag{15}$$
where $\mu_P = \frac{1}{P} \sum_{i=1}^{P} \delta_{(x_i, v_i)}$ is the probability measure obtained using Dirac distributions placed at the cars’ position-speed locations, also called the empirical measure. Then, the rigorous mean-field limit of the finite-dimensional ODE system in Eq. (15) consists of a Vlasov-Poisson-type PDE:
$$\partial_t \mu + v\,\nabla_x \mu = \nabla_v \cdot [(H * \mu)\mu], \qquad (t, x, v) \in [0, T] \times \mathbb{R} \times \mathbb{R}_{\geq 0}, \tag{16}$$
which describes the evolution of the density distribution $\mu : (0, T) \times \mathbb{R} \times \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}$ of infinitely many vehicles in $(x, v)$ space (see [13]).
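For a finite number of vehicles, the convolution with the empirical measure in Eq. (15) reduces to an average of the kernel over all vehicles. The sketch below assumes the usual convention $(H * \mu_P)(x_i, v_i) = \frac{1}{P}\sum_{j} H(x_i - x_j, v_i - v_j)$; the velocity-alignment kernel `alignment_kernel` is a purely hypothetical choice for illustration and is not the kernel used in the chapter.

```python
def alignment_kernel(dx, dv):
    """Hypothetical kernel: pulls velocities together, weaker at long range."""
    return -dv / (1.0 + dx * dx)


def mean_field_accelerations(kernel, xs, vs):
    """Evaluate (H * mu_P)(x_i, v_i) for every vehicle i.

    xs, vs: positions and velocities of the P vehicles.
    """
    p = len(xs)
    return [
        sum(kernel(xi - xj, vi - vj) for xj, vj in zip(xs, vs)) / p
        for xi, vi in zip(xs, vs)
    ]
```

With two vehicles at positions 0 and 1 and velocities 1 and 3, the slower vehicle is accelerated by 0.5 and the faster one decelerated by 0.5; when all velocities are equal, every acceleration is zero.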
For multi-population and multilane traffic with a small number of autonomous vehicles and a large number of human-driven vehicles, we can proceed as follows: The mean-field limit is computed only for the human-driven vehicles by taking their number to infinity, while the number of autonomous vehicles is kept fixed. The lane-changing maneuvers of the human-driven vehicles generate in the limit a source term for the Vlasov-Poisson-type PDEs Eq. (16) (one equation per lane). The autonomous vehicles’ lane-changing behavior (discrete dynamics) can be considered as controls, thus leading to a controlled hybrid system as in Sect. 2.3. Similarly, in the limit, one obtains a coupled hybrid system of controlled ODEs (for the autonomous vehicles) and Vlasov-Poisson-type PDEs with source terms. It turns out that the optimal controls for the finite-dimensional hybrid ODE system converge to the optimal controls of the hybrid ODE-PDE system when the number of human-driven vehicles approaches infinity. This is proved rigorously by taking advantage of Γ-convergence [8] and using a generalized Wasserstein metric [37]. For details on the limit process, see [18].
2.5 Numerical Implementation and Difficulties About Modeling
A traffic smoothing algorithm or “controller” will be selected among candidate controllers that we first test in a simulation in the traffic simulator “sumo” [32]. We use this platform to evaluate the controller’s potential real-world performance in reducing the energy consumption of the bulk traffic. This step allows us to select the best performing controllers for the eventual road test. To meaningfully evaluate the road performance, the bulk traffic in the simulation must be similar to real-life bulk traffic. There are features of the bulk traffic that are especially important in evaluating the controllers, such as stop-and-go traffic waves. We designed a microscopic traffic model in sumo to simulate the bulk traffic. The simulation will include stop-and-go waves in expected traffic density regimes using the Intelligent Driver Model (IDM), augmented with a small amount of additive noise in the velocity to trigger instabilities that can grow into full nonlinear stop-and-go waves if traffic flow is in the dynamically unstable regime. This microscopic model must be parameterized to have macroscopic features such as decreasing velocity with increasing density and congestion formation on a segment of road. The simulation will have a rate of cars spawning at the upstream boundary, called the inflow, and a rate of cars leaving the segment, called the outflow. When the inflow is higher than the outflow, congestion will form. The vehicle accelerations from the microscopic model are used to calculate emissions and energy usage for the bulk traffic flow. In order to test that our bulk traffic model was representing the type of traffic we require, we started with a model of a small section of highway with no on-ramps or off-ramps. We needed a way to force different levels of congestion and observe the presence or lack of stop-and-go waves. An initial challenge was the
dramatic effect of simulation step size on the presence and quality of stop-and-go waves. Unexpected traffic waves would be present for large time steps. Designing this bulk traffic model on a small section of highway involves intricate car-spawning conditions at the beginning of the highway segment. In order to control congestion levels, we designed a downstream “ghost cell,” which is a section of road used only to control conditions on the road segment of interest, where vehicles obey a lower speed limit which propagates upstream to traffic in the highway segment. The vehicles in the downstream ghost cell approach this new speed with a relaxation term so as to smooth out unexpected acceleration artefacts and prevent them from propagating into the segment of interest. Before this segment, we also had to implement an upstream ghost cell in order to circumvent issues due to vehicle spawning logic in a high congestion setting. These two ghost cells provided a way to control the traffic density and make sure we were achieving stop-and-go waves in expected density and throughput regimes. This design of a highway segment represents the minimum testbed to meaningfully evaluate the potential road performance of controllers. This was done before including other necessary features, such as inclusion of trucks, more sophisticated lane change models, and highway topology, such as on/off ramps (Fig. 7).
Fig. 7 Visualization of a simulation on the stretch of the I-210 (top), zoom on a start of congestion (bottom). Regular vehicles are represented in white; AVs are represented in red but disabled in this simulation
2.6 Energy Model Used
The results presented in Sect. 4 assume that all vehicles on the roadway (uncontrolled as well as controlled vehicles) are combustion engine vehicles with identical fuel consumption characteristics. We use a simplified, regularized energy function that is calibrated to measurements conducted by Toyota on a 2017 Tacoma vehicle. The energy demand is expressed via a power equivalent function $P(v, a)$ that expresses the instantaneous power as a function of the vehicle speed $v$ and acceleration $a$. The simplified function $P$, shown in Fig. 3, is designed to structurally resemble physical power functions, augmented by structurally simple correction terms. It is of the form
$$P(v, a) = \max\big(m\,a\,v + C_0 + C_1 v + C_2 v^2 + C_3 v^3,\ 0\big) + \max\big(p_1 a + p_3\,a\,v,\ 0\big), \tag{17}$$
where m = 2041 kg, C0 = 3405.54 W, C1 = 83.1239 kg m/s², C2 = 6.76507 kg/s, C3 = 0.70413 kg/m, p1 = 4598.71 kg m/s, and p3 = 975.127 kg. To obtain fuel consumption (in volume per time), the fitted conversion factor 15.09 kW = 1 gallon/h is used, which incorporates the engine’s efficiency. Of the two terms in Eq. (17), the first represents a physics equivalent power function (on flat roads) of the form $P = vF$, where the force $F$ is composed of the acceleration force $ma$ and a generalized friction and drag force function (the $C_j v^j$ terms). The second term is a correction term that accounts for engine and powertrain inefficiencies. In both terms, the maximum with zero expresses the fact that no energy is recuperated during braking, which is the typical situation for combustion engine vehicles. In contrast, electric vehicles do generally recuperate a certain amount of energy during braking (see Fig. 3), albeit not the full amount possible during the braking that tends to occur in strong stop-and-go traffic waves. Hence, flow smoothing will still improve the energy efficiency of electric vehicles, but not as significantly as the results shown in Sect. 4. In the simulator, the vehicle energy is evaluated as follows: Given that the integration time step $\Delta t$ is chosen suitably small, energy is computed via a simple first-order quadrature rule: for each vehicle, evaluate the energy rate $f(v(t), a(t))$ at time $t$, with a version of the acceleration devoid of noise, and add $f(v(t), a(t))\Delta t$ to the cumulative energy.
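The power function Eq. (17) and the first-order quadrature described above can be sketched in Python as follows. The coefficients are the ones quoted in the text; the function names are hypothetical, and the sketch assumes that the energy rate $f$ coincides with $P$ and that speeds and accelerations are sampled on a uniform grid of step `dt` in SI units.

```python
M = 2041.0                                   # vehicle mass (kg)
C0, C1, C2, C3 = 3405.54, 83.1239, 6.76507, 0.70413
P1, P3 = 4598.71, 975.127
WATTS_PER_GALLON_PER_HOUR = 15090.0          # 15.09 kW = 1 gallon/h


def power(v, a):
    """Power equivalent function of Eq. (17), in watts."""
    physics = M * a * v + C0 + C1 * v + C2 * v ** 2 + C3 * v ** 3
    correction = P1 * a + P3 * a * v
    # No energy is recuperated during braking: clip both terms at zero.
    return max(physics, 0.0) + max(correction, 0.0)


def cumulative_energy(speeds, accels, dt):
    """First-order quadrature: sum of power * dt over the samples (joules)."""
    return sum(power(v, a) * dt for v, a in zip(speeds, accels))


def fuel_gallons(energy_joules):
    """Convert energy to fuel volume with the fitted conversion factor."""
    return energy_joules / WATTS_PER_GALLON_PER_HOUR / 3600.0
```

At standstill the function returns the constant $C_0$, and under strong braking both terms are clipped to zero, which reflects the absence of recuperation.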
3 Control Design
As stated in the introduction, our ideal goal is to suppress stop-and-go waves in the system, namely, to bring it to a situation where all vehicles have the same velocity. Given the models’ dynamics in Eq. (5) or in Eq. (6), this corresponds to bringing all the vehicles to a steady state. This steady state is uniquely defined by the velocity; in particular, a given velocity determines the headway of all vehicles.
In this section, we first present a simple and ideal controller. We explain why this controller works in ideal situations and can prevent the appearance of new stop-and-go waves, but struggles to dissipate existing stop-and-go waves in practical situations. Then we present an approach to make this controller more robust and adapt it to realistic situations with strong stop-and-go waves. We illustrate these points by providing numerical simulations on an ideal one-lane straight road. Numerical simulations in a realistic multilane setting as well as quantitative results are the subject of the next section.
3.1 Ideal Controller
Let us assume that, given the inflow and outflow conditions, there exists a steady state with velocity $v_{\mathrm{opt}} \in \mathbb{R}_{>0}$ for the system. A natural idea to force the traffic flow into this steady state would be to set the velocity of each AV to be $v_{\mathrm{opt}}$, using the following proportional control law:
$$\dot{v}_{\mathrm{AV}}(t) = -k\big(v_{\mathrm{AV}}(t) - v_{\mathrm{opt}}\big), \tag{18}$$
where $k \in \mathbb{R}_{>0}$ is a design parameter to be chosen. To understand why such a simple idea might work, one has to see stop-and-go waves as propagating backward because all vehicles in the jam alternate between decelerating, when they are at the top of the wave, and accelerating, when they are at the bottom of the wave. Usually such acceleration brings them to a higher speed than the steady-state speed, which therefore inevitably leads to a further deceleration. A few AVs in the bulk traffic dampening stop-and-go waves are like rocks on a shore breaking sea waves. Hence, this control law has already been used in several works of research, in both theoretical and experimental settings. In [11], for instance, the authors study this control law implemented on a single AV on a ring road, i.e., with the $i$-th vehicle being also identified as the $(N + i)$-th vehicle, and with the Bando-FTL (see Eq. (5)) traffic model. They show theoretically that this controller guarantees the local exponential stability of the system for up to 9 vehicles (which corresponds to a penetration rate of 11%). They also show numerically that this stability holds reasonably well up to 20 vehicles (penetration rate 5%). In [22], the authors show that this controller in fact guarantees the local exponential stability for any number of vehicles on the road (penetration rate as low as desired); moreover, they show that the exponential decay rate and the optimal $k$ do not depend on the number of vehicles. However, as could be expected, the bound we are able to obtain on the basin of attraction decreases strongly with the number of vehicles. This means that in order to stabilize an entire ring road with a single vehicle, the vehicles’ velocities must be very close to the steady-state value if the number of vehicles becomes large. In other words, this controller might not be able to ensure the stability of motion in heavy stop-and-go sequences. This, as we will see in the next subsection, will be the main problem. Nevertheless, this means that this simple controller is good at preventing the emergence of stop-and-go waves in a steady flow. The controller was also tested in a field experiment, on a single-lane ring road with 20 vehicles and a single AV. Several variants of the controller were tested. Among them was a variant where¹
$$\dot{v}_{\mathrm{AV}}(t) = -\beta\big(v_{\mathrm{AV}}(t) - v_1(t)\big), \tag{19}$$
where $\beta$ is a positive coefficient, $\alpha = \min\left(\max\left(\frac{\Delta x - \Delta x_s}{\gamma}, 0\right), 1\right)$ with $\Delta x_s$ being a safety distance, and
$$v_1(t) = \alpha(t)\,v_{\mathrm{target}}(t) + (1 - \alpha(t))\,v_{\mathrm{lead}}(t), \tag{20}$$
$$v_{\mathrm{target}}(t) = \frac{1}{\tau}\int_0^{\tau} v_{\mathrm{AV}}(\sigma)\,d\sigma + \min\left(\max\left(\frac{\Delta x - l_1}{l_2}, 0\right), 1\right), \tag{21}$$
with $\tau$ a characteristic time and $l_1$ and $l_2$ two distances, typically $l_1 = 7$ m and $l_2 = 23$ m. This amounts to replacing the speed to be reached, $v_{\mathrm{opt}}$, by $v_1$, an objective speed that depends on time. This time dependency is important, as we will see later on in Sect. 3.2. Despite the fact that a single-lane ring road is a very favorable situation, the results were astonishing, with a nearly complete disappearance of stop-and-go waves and up to 40% fuel consumption reduction.
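The proportional law Eq. (18) and the blended reference speed of Eq. (20) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the weight $\alpha$ is assumed to be the quantity $(\Delta x - \Delta x_s)/\gamma$ saturated to the interval $[0, 1]$.

```python
def saturate(x):
    """Clip x to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def proportional_acceleration(v_av, v_ref, gain):
    """Proportional law of Eqs. (18)-(19): drive v_av toward v_ref."""
    return -gain * (v_av - v_ref)


def blended_reference(gap, v_target, v_lead, safety_gap, gamma):
    """Reference speed of Eq. (20).

    The weight alpha goes from 0 (gap at or below the safety gap, follow
    the leader's speed) to 1 (large gap, use the target speed).
    """
    alpha = saturate((gap - safety_gap) / gamma)
    return alpha * v_target + (1.0 - alpha) * v_lead
```

When the gap is below the safety distance, the reference collapses to the leader's speed; when the gap is large, the reference is the target speed, so the AV behaves like the ideal controller.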
¹ In [11], this variant is given as a discretized control law. We state its continuous version.

Limits in Practical Situations
While the controller in Eq. (18) works perfectly when the system is close to the target steady state, it suffers from several limitations when the speed variance is high, all related to the following fact: if an AV follows the controller in Eq. (18) while the system undergoes large stop-and-go waves, it might crash into its leading vehicle at the top of the wave. A natural quick fix would consist of adding a safety mechanism that overrides the controller and brakes when the AV is too close to its leading vehicle. However, if the AV is itself decelerating when at the top of the stop-and-go wave, it propagates the stop-and-go wave, and this destroys the stabilizing effect of the controller. This is where the difficulty lies, and the reason why we consider these two simple options:
• The first option is to replace $v_{\mathrm{opt}}$ by $\min(v_{\mathrm{opt}}, v_{\mathrm{lead}})$, where $v_{\mathrm{lead}}$ is the velocity of the vehicle in front of the AV. However, this does not prevent it from being caught in a stop-and-go wave in numerical experiments. In some cases, the stop-and-go wave reduces the average velocity in such a manner that $v_{\mathrm{lead}}$ is below $v_{\mathrm{opt}}$ most of the time. This, in turn, leads the AVs to propagate the stop-and-go wave instead of dissipating it. This can be observed in Fig. 8, which represents the speed variance of vehicles over time on a straight stretch of road with three lanes. The road is 2000 m long, and 5% of the vehicles are AVs, while the remaining 95% are regular vehicles modeled with the Bando-FTL model. For the first 500 s, the AVs obey the same dynamics as the regular vehicles; they then start following the controller dynamics at t = 500 s. The simulation is done in MATLAB using a fourth-order explicit scheme. We see in Fig. 8 that a little before t = 500 s, the speed variance is high, indicating heavy stop-and-go waves, and this does not seem to change after the AVs are turned on. The switch at t = 500 s does not lead to any apparent reduction of the overall speed variance of each lane, indicating that the AVs do not seem to have a real smoothing effect in this case.

Fig. 8 Speed variance with respect to time in three lanes of a straight road, using the controller with min(vopt , vlead ) instead of vopt . AVs are turned on at t = 500 s. The different lanes are represented in different colors; the proportion of AVs in inflow is 5% and the same for each lane

• Another approach consists of adding to the control law a term that mimics a regular vehicle behavior. For instance, for a traffic flow modeled with Bando-FTL (see Eq. (5)), this would give the new control law
$$\dot{v}_{\mathrm{AV}}(t) = a\big(V(x_{\mathrm{lead}}(t) - x_{\mathrm{AV}}(t)) - v_{\mathrm{AV}}(t)\big) + b\,\frac{v_{\mathrm{lead}}(t) - v_{\mathrm{AV}}(t)}{(x_{\mathrm{lead}}(t) - x_{\mathrm{AV}}(t))^2} - k\big(v_{\mathrm{AV}}(t) - v_{\mathrm{opt}}(t)\big), \tag{22}$$
with the parameter $k \in \mathbb{R}$ properly chosen and the parameters $a, b \in \mathbb{R}$ as introduced in Eq. (5). When the AV is too close to the leading vehicle, the FTL term is dominant and is expected to avoid crashes. However, this approach was shown in [7] to have some limitations: not only is there, with the noise inherent to practical situations, a lower bound on the penetration rate that can achieve the stabilization of the system, but the $k$ needed for the controller to stabilize the system efficiently also increases exponentially when the penetration rate decreases linearly and could quickly become impractical.
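The augmented control law Eq. (22) combines the Bando term, the FTL term, and the proportional term. The following sketch evaluates it for a single AV; the function names and all default parameter values are assumptions for illustration, and the OV function is the example of Eq. (2).

```python
import math


def optimal_velocity(h, v_max, l, d):
    """Example OV function of Eq. (2)."""
    return v_max * (math.tanh(h - l - d) + math.tanh(l + d)) / (1.0 + math.tanh(l + d))


def augmented_control(x_av, v_av, x_lead, v_lead, v_opt,
                      a=0.5, b=20.0, k=1.0, v_max=30.0, l=4.5, d=2.5):
    """Acceleration of the AV under Eq. (22) (default values are hypothetical)."""
    h = x_lead - x_av                        # headway of the AV
    if h <= 0.0:
        raise ValueError("headway must be positive")
    bando = a * (optimal_velocity(h, v_max, l, d) - v_av)
    ftl = b * (v_lead - v_av) / h ** 2       # dominant at small headway
    return bando + ftl - k * (v_av - v_opt)
```

At a uniform-flow state (equal velocities, with the common velocity equal to both $V(h)$ and $v_{\mathrm{opt}}$) all three terms vanish, while at a very small headway with a slower leader the FTL term produces strong braking.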
Finally, another limit comes from the estimation of the steady-state speed vopt and the possible subsequent errors. While underestimating vopt would mean diminishing the throughput, overestimating vopt could render the system unstable. Indeed, in this case the AV tries to reach a higher speed than what the overall system can reach. As a consequence, it will always end up being too close to its leading vehicle, triggering a safety brake, and starting a wave again. This effect was already seen in field experiments on a ring road [44].
3.2 Robust Approach
To tackle the limitations mentioned in section “Limits in Practical Situations” and obtain a robust controller, our approach is to stabilize the AV around another speed $v_{cmd}(t)$ instead of $v_{opt}$. Motivated by the field experiment in [11, 44], this speed should be time-dependent, and the controller then becomes
$$\dot{v}_{AV}(t) = -k\,\bigl(v_{AV}(t) - v_{cmd}(t)\bigr) + \dot{v}_{cmd}(t). \tag{23}$$
The term $\dot{v}_{cmd}$ compensates for the fact that $v_{cmd}$ may depend on $t$. Indeed, we then recover $\frac{d}{dt}\bigl(v_{AV}(t) - v_{cmd}(t)\bigr) = -k\,\bigl(v_{AV}(t) - v_{cmd}(t)\bigr)$, so that the velocity of the AV converges to $v_{cmd}(t)$. From this point, the goal is to find a $v_{cmd}$ that reconciles two opposing goals: being as large as possible to avoid decreasing the outflow, while making sure that the AVs decelerate as little as possible. Our proposed algorithm to determine $v_{cmd}$ is inspired by the TCP algorithm used to reduce congestion in communication networks. The general philosophy is to select a target speed, reduce it if the AV has to brake at some point (meaning that the target speed was chosen too high compared to the stop-and-go wave), and then increase it slowly again. Besides, if the headway of the AV becomes significantly larger than the usual headway in the stop-and-go wave, the speed is increased to the target speed. Finally, if the AV is really too far from its leading vehicle, $v_{cmd}$ is set to a free-flow speed to catch up. This gives $v_{cmd}$ a hysteresis behavior that allows it to adapt to the current stop-and-go wave. More precisely, we define three distances $d_0$, $d_1$, and $d_2 \in \mathbb{R}_{>0}$ (typically $d_0 = 4.5$ m, $d_1 = 25$ m, and $d_2 = 100$ m) and two speeds:
• $v_{des} = 0.95\,v_{opt}$, slightly slower than the steady-state speed $v_{opt}$ to compensate for potential errors and overestimation of $v_{opt}$, which would make the system unstable, as seen in the previous subsection
• $v_{fast}$, a free-flow speed, typically significantly higher than $v_{opt}$
Then, denoting the headway by $h \equiv x_{AV} - x_{lead}$, we act for $t \in [0, T]$ as follows:
• If $h(t) < d_0$, then
$$v_{cmd}(t) = v_{lead}(t) \quad \text{(safety measure)} \tag{24}$$
A Holistic Approach to the Energy-Efficient Smoothing of Traffic via. . .
• If $h(t) \in (d_0, d_1]$, then we denote by $t_1$ the last time at which $h(t)$ crossed the threshold $d_0$ (if it does not exist, we set $t_1 = 0$) and have
$$v_{cmd} = v_{des}\,\min\bigl(c_1 + c_2\,(t - t_1),\, 1\bigr) \quad \text{(TCP-type hysteresis)}, \tag{25}$$
where $c_1$ and $c_2$ are design parameters with b = 0.8 typically.
• If $h(t) \in [d_1, d_2)$, then $v_{cmd} = v_{des}$
• If $h(t) \geq d_2$, then
$$v_{cmd} = v_{fast} \quad \text{(catching-up)} \tag{26}$$
Note that by setting initially t1 = 0, we ensure that the AV will start by stabilizing the system to a speed that is significantly smaller than vdes . This will ensure that the AV has some room to absorb a stop-and-go wave that would be already present in the system.
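The four headway cases above can be read as a small piecewise function. The following Python sketch is only an illustration under stated assumptions: the function names `v_cmd` and `update_t1`, the default values of `c1` and `c2`, and the handling of the boundary values `h = d0` and `h = d1` (which the intervals in the text leave ambiguous) are hypothetical choices, not taken from the chapter.

```python
def v_cmd(h, t, t1, v_lead, v_des, v_fast,
          d0=4.5, d1=25.0, d2=100.0, c1=0.5, c2=0.05):
    """Commanded speed for headway h (m) at time t (s).

    t1 is the last time the headway crossed d0 (0.0 if it never did).
    c1 and c2 are placeholder design parameters (hypothetical values).
    """
    if h < d0:
        # Too close to the leader: copy its speed (safety measure).
        return v_lead
    if h <= d1:
        # TCP-type ramp: start at c1 * v_des and grow linearly up to v_des.
        # h == d0 and h == d1 are assigned to this branch by assumption.
        return v_des * min(c1 + c2 * (t - t1), 1.0)
    if h < d2:
        # Comfortable gap: hold the target speed.
        return v_des
    # Very large gap: catch up at the free-flow speed.
    return v_fast


def update_t1(h_prev, h_now, t, t1, d0=4.5):
    """Return t if the headway crossed d0 between two samples, else keep t1."""
    crossed = (h_prev < d0) != (h_now < d0)
    return t if crossed else t1
```

With `v_des = 9.5` and the placeholder parameters, a headway of 10 m right after a crossing gives half of the target speed, and the command ramps back to `v_des` after 10 s without a new crossing.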
4 Results
We present the implementation of our controller on a model representing a stretch of the I-210 in California described in Sect. 2.5. We consider a situation where too many vehicles arrive from the inflow compared to the outflow speed, so that the system rapidly becomes congested. After 600 s, the road is entirely congested with strong stop-and-go waves (see Fig. 9, left-hand side). The AVs are then activated and follow the control algorithm described in Sect. 3.2. They represent 5% of the total number of vehicles and are distributed randomly among the lanes at the inflow with a uniform distribution. In these simulations, the outflow speed is imposed at 5.5 m s⁻¹, while vehicles arrive with an inflow speed of 25.5 m s⁻¹. The road has five lanes, is 1430 m long, and represents a part of the State Route 134 as a segment of the I-210. We observe an exceptionally large reduction of fuel consumption, that is, a large increase of the energy efficiency (about 40% less fuel), while having a comparable throughput (about +1% vehicles per minute). These numbers are quite impressive and, of course, have to be taken with caution for several reasons. First and foremost, the car-following model used for the simulations (the IDM with the chosen parameters) produces, in the given traffic flow regime, strong stop-and-go traffic waves throughout the domain, so the uncontrolled baseline state is governed by persistent strong braking and acceleration phases. Real traffic flow may only rarely be in such an extreme strong-wave state all over the road, and some drivers may drive less aggressively. Second, it is assumed that the uncontrolled vehicles always drive according to the chosen model dynamics and accept the control vehicles’ dynamics without any adverse reactions. That being said, the results are not completely unrealistic. In the field experiments on a ring road using an AV as a sparse controller [44], a 40% fuel reduction was obtained, which is in the same ballpark.
A. Hayat et al.
Table 1 Effect of AVs on the traffic energy efficiency

| Quantity | Uncontrolled | 2% AV | 5% AV | 15% AV | 30% AV | Ideal |
|---|---|---|---|---|---|---|
| Throughput (vehicles per hour) | 7319 | 7365 | 7353 | 7286 | 7262 | 7750 |
| Throughput vs. baseline | 0% | +0.63% | +0.46% | −0.45% | −0.77% | +5.8% |
| Energy efficiency (miles per gallon) | 18.1 | 27.4 | 32.0 | 35.5 | 36.5 | 40.4 |
| Fuel consumption reduction | 0% | 34% | 43% | 49% | 50% | 55% |
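The last two rows of Table 1 appear to be linked by simple arithmetic: over a fixed distance, the fuel used is inversely proportional to the miles per gallon, so the reduction relative to the uncontrolled case is 1 − mpg_uncontrolled / mpg. The short Python check below assumes that this is how the reduction row was obtained (the text does not say so explicitly); the per-point marginal gain is an extra illustrative quantity that is not in the table.

```python
# Energy efficiency (miles per gallon) copied from Table 1.
mpg = {"Uncontrolled": 18.1, "2% AV": 27.4, "5% AV": 32.0,
       "15% AV": 35.5, "30% AV": 36.5, "Ideal": 40.4}


def fuel_reduction_percent(mpg_case, mpg_baseline):
    """Fuel saved over a fixed distance, relative to the baseline, in percent."""
    return 100.0 * (1.0 - mpg_baseline / mpg_case)


reductions = {name: round(fuel_reduction_percent(value, mpg["Uncontrolled"]))
              for name, value in mpg.items()}

# Marginal gain per added percentage point of AVs (0 -> 2 -> 5 -> 15 -> 30).
penetration = [0, 2, 5, 15, 30]
names = ["Uncontrolled", "2% AV", "5% AV", "15% AV", "30% AV"]
marginal = [(reductions[names[i + 1]] - reductions[names[i]])
            / (penetration[i + 1] - penetration[i])
            for i in range(len(names) - 1)]
```

Under this assumption the computed reductions match the table row (0, 34, 43, 49, 50, and 55%), and the gain per added percentage point of AVs drops from 17 points to well below 1 point, which is the decreasing marginal effect discussed in the text.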
Besides the absolute value of these numbers, it is interesting to compare them to the ideal baseline, namely, the steady-state behavior where all vehicles keep the same velocity and headway. This ideal baseline represents the lowest possible bound on fuel consumption (equivalently, the highest bound on energy efficiency). In this case, the throughput is still roughly the same, despite an increase (+5.8% vehicles per hour), while the fuel consumption is reduced by 55% with respect to the uncontrolled case. This means that the energy efficiency achieved with our controller applied to 5% AVs is already quite close to the ideal situation in terms of reduction of fuel consumption. Increasing the proportion of AVs logically improves the energy efficiency, but with a decreasing marginal effect, as it is bounded anyway by the ideal case. This is summarized in Table 1. To visualize the effect on the stop-and-go waves, we present Fig. 9, which contains the time-space diagrams for each lane with and without the controller. The x axis represents the time, the y axis the positions of the different vehicles at the considered time, and the color the actual velocity, where red stands for low velocities and green for high velocities. On the left-hand side of Fig. 9, there are only regular vehicles, while on the right-hand side, there are 5% of AVs equipped with the controller of the previous section. The AVs are activated in the portion of time that is highlighted (upper-right side). This means that until reaching this time, they imitate normal human drivers (modeled by the Bando-FTL model), and as soon as they reach this time, they start following the controller. We see that without any controllers, strong stop-and-go waves appear and propagate backward on the road. This is illustrated by the many parallel lines of darker colors. When adding the controllers, these stop-and-go waves are partially dissipated even though some still manage to remain. One could do the same with a much larger proportion of AVs and see that, logically, even fewer waves appear, as seen in Fig. 10 with 15% of AVs.
Fig. 9 Time-space diagrams with no control (left side) and 5% AVs (right side). Colors represent the vehicle velocities
Fig. 10 Time-space diagrams with 15% AVs (left) and 30% AVs (right). Colors represent the vehicle velocities
5 Conclusion
In this chapter, we presented an initiative aiming at dissipating stop-and-go waves by using a small number of autonomous vehicles (AVs) as sparse controllers for traffic flow in a complex setting: a multilane highway. Because of the lane changes, the resulting system is of hybrid nature, with boundary conditions which make it difficult to study theoretically. We proposed several approaches to obtain control algorithms, from reinforcement learning controllers to model-based controllers. We have focused on finding robust model-based controllers when the traffic is highly congested. In a highly congested situation, the model-based controller exhibits a large energy reduction compared with the uncontrolled case, even with a small proportion of AVs, also called the penetration rate. Nevertheless, this controller has limitations, and research is still ongoing to ensure suitability for application to an experimental setting. When increasing the proportion of AVs, the marginal gains in energy efficiency decrease. Therefore, an interesting question is the existence of an optimal trade-off between penetration rate and energy efficiency.
Acknowledgments The authors would like to thank Maria Teresa Chiri for her remarks. This research is based upon work supported by the US Department of Energy’s Office of Energy Efficiency and Renewable Energy (EERE) under the Vehicle Technologies Office award number CID DE-EE0008872. The views expressed herein do not necessarily represent the views of the US Department of Energy or the US Government.
References
1. M. Bando, K. Hasebe, A. Nakayama, A. Shibata, Y. Sugiyama, Dynamical model of traffic congestion and numerical simulation. Phys. Rev. E 51(2), 1035 (1995)
2. L.D. Baskar, D. Schutter, H. Hellendoorn, Model-based predictive traffic control for intelligent vehicles: Dynamic speed limits and dynamic lane allocation, in Proceedings of the 2008 IEEE Intelligent Vehicles Symposium (IV’08) (2008), pp. 174–179
3. N. Bellomo, C. Dogbe, On the modeling of traffic and crowds: A survey of models, speculations, and perspectives. SIAM Rev. 53(3), 409–463 (2011)
4. R. Bhadani, J. Sprinkle, STRYM: A data-analytic tool for CAN-bus messages (Department of Electrical and Computer Engineering, The University of Arizona, Arizona, 2020), 0.3.1
5. M. Bunting, R. Bhadani, J. Sprinkle, Libpanda - a high performance library for vehicle data collection, in The Workshop on Data-Driven and Intelligent Cyber-Physical Systems (Submitted, 2021)
6. J.A. Carrillo, M. Fornasier, G. Toscani, F. Vecil, Particle, kinetic, and hydrodynamic models of swarming, in Mathematical Modeling of Collective Behavior in Socio-Economic and Life Sciences (Springer, Berlin, 2010), pp. 297–336
7. S. Cui, B. Seibold, R. Stern, D.B. Work, Stabilizing traffic flow via a single autonomous vehicle: Possibilities and limitations, in Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV) (IEEE, New York, 2017), pp. 1336–1341
8. G. Dal Maso, An Introduction to Γ-Convergence (Springer, New York, 1993)
9. S. Dashkovskiy, P. Feketa, Zeno phenomenon in hybrid dynamical systems. PAMM 17(1), 789–790 (2017)
10. L.C. Davis, Effect of adaptive cruise control systems on traffic flow. Phys. Rev. E 69, 066110 (2004)
11. M.L. Delle Monache, T. Liard, A. Rat, R. Stern, R. Bhadani, B. Seibold, J. Sprinkle, D.B. Work, B. Piccoli, Feedback control algorithms for the dissipation of traffic waves with autonomous vehicles, in Computational Intelligence and Optimization Methods for Control Engineering (Springer, Berlin, 2019), pp. 275–299
12. O. Derbel, T. Peter, H. Zebiri, B. Mourllion, M. Basset, Modified intelligent driver model for driver safety and traffic stability improvement. IFAC Proceedings Volumes 46(21), 744–749 (2013)
13. M. Fornasier, B. Piccoli, F. Rossi, Mean-field sparse optimal control. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 372(2028), 20130400 (2014)
14. M. Garavello, B. Piccoli, Traffic flow on networks, in AIMS Series on Applied Mathematics, vol. 1 (American Institute of Mathematical Sciences (AIMS), Springfield, 2006). Conservation laws models
15. M. Garavello, K. Han, B. Piccoli, Models for vehicular traffic on networks, in AIMS Series on Applied Mathematics, vol. 9 (American Institute of Mathematical Sciences (AIMS), Springfield, 2016)
16. D.C. Gazis, R. Herman, R.W. Rothery, Nonlinear follow-the-leader models of traffic flow. Oper. Res. 9(4), 545–567 (1961)
17. D. Gloudemans, W. Barbour, N. Gloudemans, M. Neuendorf, B. Freeze, S. ElSaid, D.B. Work, Interstate-24 motion: Closing the loop on smart mobility, in Proceedings of the 2020 IEEE Workshop on Design Automation for CPS and IoT (DESTION) (IEEE, New York, 2020), pp. 49–55
18. X. Gong, B. Piccoli, G. Visconti, Mean-field limit of a hybrid system for multi-lane multi-class traffic. Preprint arXiv:2007.14655 (2020)
19. X. Gong, B. Piccoli, G. Visconti, Mean-field of optimal control problems for hybrid model of multilane traffic. IEEE Control Syst. Letters 5(6), 1964–1969 (2020)
20. M. Guériau, R. Billot, N.-E. El Faouzi, J. Monteil, F. Armetta, S. Hassas, How to assess the benefits of connected vehicles? A simulation framework for the design of cooperative traffic management strategies. Transp. Res. Part C: Emerg. Technol. 67, 266–279 (2016)
21. Y. Han, D. Chen, S. Ahn, Variable speed limit control at fixed freeway bottlenecks using connected vehicles. Transp. Res. B Methodol. 98, 113–134 (2017)
22. A. Hayat, B. Piccoli, S. Truong, Dissipation of Traffic Jams Using a Single Autonomous Vehicle on a Ring Road. Preprint (2020)
23. Z. He, L. Zheng, L. Song, N. Zhu, A jam-absorption driving strategy for mitigating traffic oscillations. IEEE Trans. Intell. Transp. Syst. 18(4), 802–813 (2017)
24. M. Herty, G. Visconti, Analysis of risk levels for traffic on a multi-lane highway. IFAC-PapersOnLine 51(9), 43–48 (2018)
25. W.-L. Jin, A kinematic wave theory of lane-changing traffic flow. Transp. Res. Part B Methodol. 44(8–9), 1001–1021 (2010)
26. W.-L. Jin, A multi-commodity Lighthill-Whitham-Richards model of lane-changing traffic flow. Procedia. Soc. Behav. Sci. 80, 658–677 (2013)
27. E. Kallo, A. Fazekas, S. Lamberty, M. Oeser, Microscopic traffic data obtained from videos recorded on a German motorway. Mendeley Data 1, 7 (2019)
28. N. Kardous, A. Hayat, S. McQuade, X. Gong, S. Truong, P. Arnold, A. Bayen, B. Piccoli, A rigorous multi-population multi-lane hybrid traffic model and its mean-field limit for dissipation of waves via autonomous vehicles. Preprint (2020)
29. J.A. Laval, C.F. Daganzo, Lane-changing in traffic streams. Transp. Res. Part B Methodol. 40(3), 251–264 (2006)
30. C.-Y. Liang, H. Peng, Optimal adaptive cruise control with guaranteed string stability. Veh. Syst. Dyn. 32(4–5), 313–330 (1999)
31. M.J. Lighthill, G.B. Whitham, On kinematic waves II. A theory of traffic flow on long crowded roads. Proc. R. Soc. Lond. A Math. Phys. Sci. 229(1178), 317–345 (1955)
32. P.A. Lopez, M. Behrisch, L. Bieker-Walz, J. Erdmann, Y. Flötteröd, R. Hilbrich, L. Lücken, J. Rummel, P. Wagner, E. Wiessner, Microscopic traffic simulation using SUMO, in Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC) (2018), pp. 2575–2582
33. A. Nakayama, Y. Sugiyama, K. Hasebe, Effect of looking at the car that follows in an optimal velocity model of traffic flow. Phys. Rev. E 65(1), 016112 (2001)
34. R. Nishi, A. Tomoeda, K. Shimura, K. Nishinari, Theory of jam-absorption driving. Transp. Res. Part B Methodol. 50, 116–129 (2013)
35. B. Piccoli, Hybrid systems and optimal control, in Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No. 98CH36171), vol. 1 (IEEE, New York, 1998), pp. 13–18
36. B. Piccoli, Necessary conditions for hybrid optimization, in Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No. 99CH36304), vol. 1 (IEEE, New York, 1999), pp. 410–415
37. B. Piccoli, F. Rossi, Generalized Wasserstein distance and its application to transport equations with source. Arch. Ration. Mech. Anal. 211(1), 335–358 (2014)
38. M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, A.Y. Ng, ROS: an open-source robot operating system, in ICRA Workshop on Open Source Software, vol. 3 (Kobe, Japan, 2009), p. 5
39. R.A. Ramadan, B. Seibold, Traffic flow control and fuel consumption reduction via moving bottlenecks, in Transportation Research Board Conference (2017)
40. P.I. Richards, Shock waves on the highway. Oper. Res. 4(1), 42–51 (1956)
41. H. Schafer, E. Santana, A. Haden, R. Biasini, A commute in data: The comma2k19 dataset. arXiv preprint arXiv:1812.05752 (2018)
42. J. Schulman, F. Wolski, P. Dhariwal, A. Radford, O. Klimov, Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
43. S.-A. Shanto, G. Gunter, R. Ramadan, B. Seibold, D. Work, Challenges of Microsimulation Calibration with Traffic Waves using Aggregate Measurements. Preprint (2020)
44. R.E. Stern, S. Cui, M.L. Delle Monache, R. Bhadani, M. Bunting, M. Churchill, N. Hamilton, H. Pohlmann, F. Wu, B. Piccoli, et al., Dissipation of stop-and-go waves via control of autonomous vehicles: field experiments. Transp. Res. Part C: Emerg. Technol. 89, 205–221 (2018)
45. R.E. Stern, Y. Chen, M. Churchill, F. Wu, M.L. Delle Monache, B. Piccoli, B. Seibold, J. Sprinkle, D.B. Work, Quantifying air quality benefits resulting from few autonomous vehicles stabilizing traffic. Transp. Res. Part D: Transp. Environ. 67, 351–365 (2019)
46. Y. Sugiyama, M. Fukui, M. Kikuchi, K. Hasebe, A. Nakayama, K. Nishinari, S.-I. Tadaki, S. Yukawa, Traffic jams without bottlenecks–experimental evidence for the physical mechanism of the formation of a jam. New J. Phys. 10(3), 033001 (2008)
47. A. Talebpour, H.S. Mahmassani, Influence of connected and autonomous vehicles on traffic flow stability and throughput. Transp. Res. Part C: Emerg. Technol. 71, 143–163 (2016)
48. The University of Arizona, Libpanda: A software library and utilities for interfacing with vehicle hardware systems (2020)
49. M. Treiber, A. Hennecke, D. Helbing, Congested traffic states in empirical observations and microscopic simulations. Phys. Rev. E 62(2), 1805 (2000)
50. J. Treiterer, J. Myers, The hysteresis phenomenon in traffic flow. Transp. Traffic Theory 6, 13–38 (1974)
51. M. Wang, W. Daamen, S.P. Hoogendoorn, B. van Arem, Cooperative car-following control: Distributed algorithm and impact on moving jam features. IEEE Trans. Intell. Transp. Syst. 17(5), 1459–1471 (2016)
52. M. Wang, W. Daamen, S.P. Hoogendoorn, B. van Arem, Connected variable speed limits control and car-following control with vehicle-infrastructure communication to resolve stop-and-go waves. J. Intell. Transp. Syst. 20(6), 559–572 (2016)
53. F. Wu, R.E. Stern, S. Cui, M.L. Delle Monache, R. Bhadani, M. Bunting, M. Churchill, N. Hamilton, R. Haulcy, B. Piccoli, B. Seibold, J. Sprinkle, D.B. Work, Tracking vehicle trajectories and fuel rates in phantom traffic jams: Methodology and data. Transp. Res. Part C: Emerg. Technol. 99, 82–109 (2019)
54. Z. Zheng, Recent developments and research needs in modeling lane changing. Transp. Res. Part B Methodol. 60, 16–32 (2014)
Optimal Energy Management of Electric Vehicles Supplied by Battery and Supercapacitors: A Multi-Objective Approach
Bảo-Huy Nguyễn and João Pedro F. Trovão
B.-H. Nguyễn
e-TESC Lab., University of Sherbrooke, Sherbrooke, QC, Canada
CTI Lab. for EVs, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
J. P. F. Trovão
e-TESC Lab., University of Sherbrooke, Sherbrooke, QC, Canada
Canada Research Chair in Efficient Electric Vehicles with Hybridized Energy Storage Systems, University of Sherbrooke, Sherbrooke, QC, Canada
INESC Coimbra, University of Coimbra, DEEC, Polo II, Coimbra, Portugal
Polytechnic of Coimbra, IPC-ISEC, DEE, Coimbra, Portugal
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_11

1 Introduction
An energy management strategy (EMS) is mandatory for the hybrid energy storage system (HESS) of electric vehicles (EVs), except for the passive topology [1–3]. Among the various HESS topologies, the semi-active configuration using battery and supercapacitor (SC) is studied due to its good trade-off between cost and performance [4]. The strategy shares the instantaneous power between the sources to achieve the goal of energy management. The EMS can be conventionally developed by using mono-objective approaches [5–11]. In a battery/SC system, the single objective is often to extend the battery lifetime while keeping the SC state of charge (SoC) and current within their operating boundaries [5, 7–11]. Recently, several efforts have been dedicated to developing multi-objective EMS, e.g., [12–14]. Decomposition methodologies are used in [14] to deal with multi-objective problems by using mono-objective methods. That paper uses this approach to study the energy management of series hybrid electric vehicles (HEVs) using a battery/SC HESS. Fuel consumption and electric cost are addressed in a scalar objective function with equal priorities. In [13], a global objective function is obtained by summation of three single costs without any weighting factor. The negotiations between the conflicting objectives are therefore not fully considered. Generally, the above works obtain unique solutions for multi-objective problems. A framework of multi-objective optimization for HEVs is presented in [12], which is a complete work on multi-objective EMS with Pareto optimality. This framework is rigorous but complicated, with numerous mathematical developments. In addition, the author of [12] used dynamic programming (DP) to evaluate the performance of the Pareto-based real-time strategy. However, this is not a Pareto set of global optimal solutions but only a particular one. Recent work presented in [15] has proposed a multi-objective optimal EMS for battery/SC EVs by employing an alternative approach based on Pontryagin’s minimum principle (called alt-PMP), which was introduced in [16]. That study obtains a quasi-analytical solution which is efficient in terms of computation resources and therefore faster than DP. On the other hand, the benchmarking role of offline optimal strategies should be emphasized. Although real-time EMS can be developed based on optimization techniques such as model predictive control (MPC) [17, 18], stochastic DP [19], adaptive PMP [20, 21], and meta-heuristic methods [22], they are all suboptimal strategies. Hence, a benchmark is needed to examine their effectiveness, and this benchmark should be a global optimal solution of the energy management problem. To obtain it, the whole driving condition, which is the disturbance of the optimal control system, must be assumed to be known in advance.
As a consequence, the optimal benchmark should be an offline strategy obtained by numerical simulation. The aim of this chapter is to introduce a multi-objective offline optimal EMS for the battery/SC HESS to generate a Pareto front benchmark for performance evaluation and/or EMS tuning. The multi-objective optimization problems are treated by using a hierarchical structure proposed in [23]. This approach decomposes the EMS into strategic and tactical levels, which correspond to the multi-objective scalarization and the optimization problem-solving layers, respectively. The weighted sum method is traditionally used for scalarization of multiple objective functions [24–26]. Using this method, each objective is associated with a weighting factor that reflects the priority given to it. The weighted objective functions are then summed to form a single scalar objective function to be solved using an appropriate optimization method. At the tactical layer, global optimal solutions for the benchmark are deduced by using DP. This is a backward-computation dynamic optimization method which is well known for its ability to deduce global optimal solutions for a wide range of complex nonlinear systems [27, 28]. In addition, comparison with a DP-based optimal benchmark is a common practice for evaluating the performance of real-time EMS in energy management studies such as [29]. To give an example of the evaluation using the Pareto front benchmark, the well-known filtering strategy is used with a range of cutoff frequencies. In the following, Sect. 2 states the multi-objective energy management problem for battery/SC EVs and addresses the general methodology of the chapter. In Sect. 3, the studied system is modeled and controlled with a well-known rule-based EMS, which is the filtering strategy. Section 4 then presents the multi-objective optimal EMS with the weighted sum scalarization method using DP. Numerical results obtained via simulation based on a real EV reference model are given and discussed in Sect. 5. Conclusion and perspectives are then drawn in Sect. 6.
2 Problem Statement and General Methodology
In this section, the studied system is first described together with the engineering problem statement. Secondly, the necessity of a multi-objective approach to deal with this problem is pointed out, and the role of a Pareto front benchmark is discussed. Next, the general formulation of energy management problems is given. Finally, the general methodology for multi-objective optimal EMS development is presented.
2.1 Studied System and Engineering Problem Statement
A semi-active configuration of HESS is studied, as presented in Fig. 1. The battery directly maintains the DC voltage supplied to the traction subsystem. The SCs are connected to the DC bus through a bidirectional DC/DC converter composed of a power inductor and a chopper. Since this work focuses on the energy management of the HESS, the traction subsystem can be simplified as a dynamic current source. This current source imposes the demanded traction current $i_{trac}$, which reflects the traction subsystem dynamics, on the HESS.
2.2 Multi-Objective Approach with Pareto Front for Benchmarking
The above engineering problem statement points out the necessity of considering SC losses in addition to the battery degradation. This means that the energy management problem should be treated by using a multi-objective approach instead of a mono-objective one. A more general discussion is given here to address the advantages of the multi-objective approach over its counterpart. Since this work aims to deduce an optimal benchmark, the discussion is limited to the advantages of the multi-objective benchmark approach.
Fig. 1 Studied system: an EV supplied by a battery/SC HESS (diagram labels: battery, SCs, inductor, chopper, inverter, electrical drive, drivetrain, chassis; the traction subsystem is represented by an equivalent dynamic current source)
To be general, the battery and the SCs can be considered as main and auxiliary sources associated with the objective functions $J_{main}$ and $J_{aux}$, respectively. Here, $J_{main}$ and $J_{aux}$ are generally defined as the main and the auxiliary objectives of the energy management problem for illustration purposes. Their specific meanings will be assigned regarding the battery power and the SC losses in Sect. 4. Let us assume that there are three strategies, denoted by 1, 2, and 3, to be evaluated and improved (Fig. 2). By using the mono-objective approach (see Fig. 2a), that is, a comparison with only one criterion $J_{main}$, it is easy to select Strategy 3 as the best one. However, as seen in Fig. 2b, the picture can change from a multi-objective viewpoint when $J_{aux}$ is considered. Even though Strategy 2 has a slightly higher value of $J_{main}$, it incurs a much lower cost $J_{aux}$ than Strategy 3 does. It can therefore be considered a better choice than Strategy 3. This illustration shows why a multi-objective approach can bring a global picture that covers the points of view necessary for a correct performance evaluation. Next, the role of a multi-objective benchmark, which is a Pareto front, is illustrated in Fig. 2c. The Pareto front is a set of non-dominated solutions of a multi-objective optimization problem. If these solutions are globally optimal for given trade-offs, the Pareto front can serve as a multi-objective benchmark. The utopia point is the unrealistic ideal solution which optimizes all the conflicting objectives. Strategies can be evaluated by comparison with the benchmark. One may argue that the multi-objective approach has the disadvantage of being somewhat subjective and requiring expertise for compromises. However, that is the essence of real-world engineering, especially for hybrid systems, which combine different characteristics. Furthermore, once a multi-objective strategy is developed, it is easy to reduce it to a mono-objective one as a particular case. By contrast, if only the mono-objective strategy is developed, a global viewpoint may be lacking. Consequently, some better solutions may be overlooked.

Fig. 2 Illustration of multi-objective approach for performance evaluation of EMS: (a) mono-objective evaluation, (b) multi-objective evaluation, (c) Pareto front as an optimal benchmark (axes: $J_{main}$ versus strategy in (a); $J_{main}$ versus $J_{aux}$ in (b) and (c), where (c) also shows the Pareto front benchmark and the utopia point)

A common question is how to select “the best of the best” solution on the Pareto front, which is a set of non-dominated solutions. That selection can be done by two approaches, as explained in [24]: (i) the ideal method, where the full Pareto front is computed first and then the solution is chosen based on higher-level technical information, and (ii) the preference-based method, where the trade-off is made first and then the optimization problem is solved to obtain only a single solution regarding the preferred choice. Both approaches are often heuristic and rely on the developer’s expertise. On the other hand, the aim of this study is to achieve a benchmark for evaluating the performance of other EMS. For that purpose, the whole Pareto front should serve as the benchmark instead of a single point, and the chosen α granularity is enough to have a graphical representation at this stage.
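The comparison of the three strategies can be made concrete with a small non-domination check. The Python sketch below is only an illustration: the cost pairs assigned to Strategies 1, 2, and 3 are hypothetical numbers chosen to mimic the qualitative picture of Fig. 2, and the helper names `dominates` and `non_dominated` are not from the chapter.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b in every objective
    and strictly better in at least one (both objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the entries that no other entry dominates."""
    return {name: p for name, p in points.items()
            if not any(dominates(q, p)
                       for other, q in points.items() if other != name)}


# Hypothetical (J_main, J_aux) values, for illustration only.
strategies = {
    "Strategy 1": (1.00, 0.90),
    "Strategy 2": (0.62, 0.30),
    "Strategy 3": (0.60, 0.80),
}

# Mono-objective view: only J_main is compared.
best_mono = min(strategies, key=lambda s: strategies[s][0])

# Multi-objective view: keep every strategy that is not dominated.
front = non_dominated(strategies)
```

With these numbers, the mono-objective comparison picks Strategy 3, while the multi-objective view keeps both Strategy 2 and Strategy 3 as non-dominated and discards Strategy 1. Choosing between the two remaining strategies is exactly the trade-off decision discussed above.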
2.3 General Formulation of Energy Management Problems
An energy management problem can be formulated by adopting the form of optimal control [27, 30] as follows: Find the optimal control laws $u^*(t)$ for the system
$$\frac{d}{dt}x(t) = f\bigl(x(t), u(t), w(t), t\bigr), \tag{1}$$
in which $x(t)$ is the state variables and $w(t)$ the disturbances, which minimize the objective functions $[J_1 \cdots J_n]$ given as
$$J = [J_1 \cdots J_n]^T \tag{2}$$
with the constraints
$$\begin{cases} p\bigl(x(t), u(t), t\bigr) \le 0 \\ q\bigl(x(t), u(t), t\bigr) = 0, \end{cases} \tag{3}$$
where $p$ and $q$ are sets of functions expressing the inequality and equality constraints of the system, respectively. The objective functions (2) are expressed by
$$J_i = \underbrace{h_i\bigl(x(t_f), t_f\bigr)}_{\text{Cost of the final state}} + \underbrace{\int_{t_0}^{t_f} g_i\bigl(x(t), u(t), t\bigr)\,dt}_{\text{Cost of the whole procedure}} \tag{4}$$
with $i \in \{1, 2, \cdots, n\}$, where $g$ and $h$ denote arbitrary functions and $t_0$ and $t_f$ are the initial and the final time, respectively. When the problem is properly formulated, one can use various types of methods to solve it depending upon the study purpose. They can be either rule-based or optimization-based as well as either real-time or offline methods [1].
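To see how Eqs. (1) and (4) are evaluated in practice for a given control sequence, the following Python sketch integrates the state with a forward Euler step and accumulates the running cost with a rectangle rule. This is a minimal sketch under stated assumptions: the function name `simulate_and_cost`, the discretization scheme, and the toy integrator model in the usage lines are hypothetical and are not the chapter's HESS model.

```python
def simulate_and_cost(f, g, h, x0, u_seq, w_seq, dt, t0=0.0):
    """Evaluate one objective J_i of Eq. (4) for a given control sequence.

    f(x, u, w, t): state derivative, as in Eq. (1)
    g(x, u, t):    running cost integrand
    h(x, t):       cost of the final state
    Forward Euler and a rectangle rule are used (an assumed discretization).
    """
    x, t, running = x0, t0, 0.0
    for u, w in zip(u_seq, w_seq):
        running += g(x, u, t) * dt      # accumulate the integral term
        x = x + f(x, u, w, t) * dt      # advance the state by one step
        t += dt
    return x, h(x, t) + running


# Toy example (hypothetical): integrator dynamics with a quadratic cost.
x_final, cost = simulate_and_cost(
    f=lambda x, u, w, t: u - w,
    g=lambda x, u, t: u ** 2,
    h=lambda x, t: x ** 2,
    x0=1.0, u_seq=[1.0, 1.0], w_seq=[0.0, 0.0], dt=0.5)
```

In this toy run the state moves from 1.0 to 2.0, the running cost is 1.0, and the final-state cost is 4.0, so the objective equals 5.0. Constraints of the form (3) are not checked here; a solver would have to enforce them separately.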
2.4 General Methodology
Problem formulation translates the engineering problem stated in Sect. 2.1 into the mathematical form given in Sect. 2.3. Besides, an optimal benchmark requires the global optimal solutions of the optimization problems. DP is known to be suitable for deducing this kind of solution [27]. Moreover, dealing with multiple objective functions is the discipline of multi-objective optimization [24, 26]. There are two main groups of multi-objective optimization methods: vectorization and scalarization [24, 31]. The vectorization techniques iteratively generate populations of feasible solutions in the search space. The populations converge to a set of non-dominated solutions, which is the Pareto front. Alternatively, the scalarization methods form a single (scalar) objective function from the original multiple objectives by assigning them weighting factors. Optimization techniques are then applied to solve the problem of this scalar function for varying weights, which yields the Pareto front. It is pointed out that vectorization is not always suitable for optimal control, while scalarization has advantages [31]. This is because the vectorization techniques require solving many optimization subproblems for the populations of solution candidates, which would be very time-consuming. The vectorization approach is more appropriate for optimal design/sizing of energy storage systems; e.g., a well-investigated design is presented in [32]. Hence, the scalarization technique is employed in this study of multi-objective optimal control for the energy management problem.
Fig. 3 General methodology: (a) model reduction and strategy decomposition for EMS development, (b) structure of multi-objective global optimal EMS using DP
Structure of Optimal Multi-Objective Energy Management
An energy system can be dealt with at three levels: system model, control, and strategy (see Fig. 3a). As the EMS is at a higher level than the local control, it has slower dynamics than the lower layer. It is therefore neither necessary nor effective to consider the full dynamical model for EMS development and study [23]. Thus, model reduction should be done in order to develop the EMS effectively. At this step, a reduced model is derived from the full dynamical system and the local control. The energy management problem is then formulated based on this reduced model. According to the formulated problem and the study objective, different structures of EMS could be used. In this study, the hierarchical structure of two management layers proposed in [23] is adapted. The strategy is decomposed into strategic and tactical layers. The philosophy is that the strategic layer gives the global directions and the tactical one then handles the system by following these guidelines. In [23], the structure is realized in the real-time EMS by a rule-based strategy at the higher layer, which restricts the search space of the optimization-based strategy at the lower layer. This study adapts the above hierarchical structure to deal with the multi-objective optimal energy management problem (compare Fig. 3b with the right part of Fig. 3a). The multi-objective scalarization plays the role of the strategic layer. It gives the scalarized objective function as a guideline for the lower layer. The tactical layer is realized by DP, which globally minimizes each given scalarized objective function. Since DP is a backward calculation technique, a backward model must be deduced from the reduced model. It is presented in detail in Sect. 4.2. It is also worth noting that DP is a closed-loop optimal control method [27] and, thus, state feedback is mandatory. Meanwhile, multi-objective scalarization handles its duty in an open-loop scheme. The measure going from the tactical layer to the strategic one is therefore eliminated.
Pareto Front Benchmark Generation
The steps for generating the Pareto front benchmark are illustrated in Fig. 4. There are several techniques to scalarize multiple objective functions, among which the weighted sum method is the most widely used [24, 25]. A weighting factor ki is assigned to each objective function, and the weighted functions are summed to create a single objective function. Because the performance measures have different units, the objective functions must also be made dimensionless, so normalization factors are introduced. The choice of these factors depends on the application. The weighted sum objective function Jws is therefore expressed as follows:
$$J_{ws} = k_1 \frac{J_1}{J_{1\_nom}} + (1 - k_1) \times \left[ k_2 \frac{J_2}{J_{2\_nom}} + \cdots + \left[ k_{n-1} \frac{J_{n-1}}{J_{n-1\_nom}} + (1 - k_{n-1}) \frac{J_n}{J_{n\_nom}} \right] \right] \tag{5}$$
where 0 ≤ ki ≤ 1 with i ∈ {1, . . . , n − 1}, and Ji_nom is the normalization factor of the objective Ji. Since the studied battery/SC HESS can be considered as the combination of a main source and an auxiliary source (see Sect. 2.2), the weighted sum objective function reduces to:
$$J_{ws} = \alpha \frac{J_{main}}{J_{main\_nom}} + (1 - \alpha) \frac{J_{aux}}{J_{aux\_nom}} \tag{6}$$
in which α ∈ [0, 1] is the weighting factor. With a given set of weighting factors, the multi-objective problem is scalarized into a series of single-objective problems. Each single-objective problem is then solved by DP to produce the optimal solution for the corresponding weighting factor. The set of these optimal solutions is the Pareto front benchmark, as illustrated in Fig. 4.
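The sweep over the weighting factor can be sketched in a few lines. The example below is only an illustration of Eq. (6): the candidate objective pairs, the normalization values, and the helper names `weighted_sum` and `pareto_sweep` are hypothetical, and a simple minimum over a finite list stands in for the DP solver used in the chapter.

```python
def weighted_sum(j_main, j_aux, alpha, j_main_nom, j_aux_nom):
    """Scalarized objective of Eq. (6) for one weighting factor alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * j_main / j_main_nom + (1.0 - alpha) * j_aux / j_aux_nom


def pareto_sweep(candidates, alphas, j_main_nom, j_aux_nom):
    """For each alpha, keep the candidate (J_main, J_aux) minimizing Eq. (6)."""
    front = []
    for alpha in alphas:
        best = min(
            candidates,
            key=lambda c: weighted_sum(c[0], c[1], alpha, j_main_nom, j_aux_nom),
        )
        if best not in front:
            front.append(best)
    return front


# Hypothetical objective pairs; (8, 8) is dominated by (4, 4).
candidates = [(10.0, 1.0), (4.0, 4.0), (1.0, 10.0), (8.0, 8.0)]
alphas = [i / 10 for i in range(11)]
front = pareto_sweep(candidates, alphas, j_main_nom=10.0, j_aux_nom=10.0)
```

With these illustrative numbers the dominated pair is never selected, whatever the value of α.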
3 Modeling, Control, and Rule-Based Strategy
Before dealing with the strategy level, the system should be properly controlled. Moreover, in this study, the EMS is not developed in isolation but by following a systematic procedure, which is based on the model organization and control scheme of the system using a unified formalism. Therefore, it is necessary to carry out the modeling and the control of the system first.
Fig. 4 Illustration of multi-objective benchmark generation
3.1 Modeling and Control
For control and energy management, it is sufficient to use simple models of the battery and the SCs [21, 29] as follows:
$$\begin{cases} u_{bat} = u_{bat\_OC}(SoC_{bat}) - r_{bat}(SoC_{bat})\, i_{bat} \\ SoC_{bat} = SoC_{bat}(0) - \dfrac{1}{3600\, C_{bat}} \displaystyle\int_0^t i_{bat}\, dt \end{cases} \tag{7}$$
$$u_{SC} = u_{SC}(0) - \frac{1}{C_{SC}} \int_0^t i_{SC}\, dt - r_{SC}\, i_{SC} \tag{8}$$
where ubat is the battery voltage, ubat_OC the battery open-circuit voltage which is a function of battery state of charge SoCbat, rbat the battery series resistance, ibat the battery current, Cbat the battery capacity, uSC the SC voltage, rSC the SC series resistance, CSC the SC capacitance, and iSC the SC current. The inductor of the converter is given by its linear dynamic model as:
$$u_{SC} = L \frac{d}{dt} i_{SC} + r_L\, i_{SC} + u_{ch} \tag{9}$$
in which L is the inductor inductance, rL is the inductor series resistance, and uch is the chopper voltage. The average linear model of the chopper is utilized:
$$\begin{cases} u_{ch} = m_{ch}\, u_{bat} \\ i_{ch} = m_{ch}\, \eta_{ch}^{k}\, i_{SC} \end{cases} \quad \text{with} \quad k = \begin{cases} 1 & \text{if } u_{bat}\, i_{ch} \ge 0 \\ -1 & \text{if } u_{bat}\, i_{ch} < 0 \end{cases} \tag{10}$$
where ich is the chopper current, mch the duty cycle of the pulse width modulation, and ηch the chopper average efficiency. The parallel connection model follows Kirchhoff's current law:
$$\begin{cases} u_{bat} \ \text{common} \\ i_{trac} = i_{bat} + i_{ch} \end{cases} \tag{11}$$
where itrac is the traction current. The traction part is simplified as an equivalent dynamic current source which generates the traction current as a function of the traction power and the battery voltage:
$$i_{trac} = \frac{P_{trac}}{u_{bat}} \tag{12}$$
The tuning path goes from the control variable mch to the objective variable ibat. The control path is then obtained by inverting the tuning path. Based on the control path, the control scheme is deduced by step-by-step inversion of the element models. There are two main kinds of inversion: direct and indirect. The elements which do not store energy are directly inverted from their modeling equations. For example, from (11), the direct inversion of the parallel connection is given as:
$$i_{ch\_ref} = i_{trac\_mea} - i_{bat\_ref} \tag{13}$$
By contrast, the elements containing dynamic models cannot be directly inverted, because doing so leads to derivative terms which violate physical causality [33]. Hence, indirect inversions must be used for them. This kind of inversion is realized by closed-loop control. In this case, the SC current control is derived from (9) as follows:
$$u_{ch\_ref} = u_{SC\_mea} - \left( k_P + k_I \frac{1}{s} \right) \left( i_{SC\_ref} - i_{SC\_mea} \right) \tag{14}$$
where kP and kI are the gains of the well-known proportional-integral (PI) controller.
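As an illustration of the indirect inversion in (14), the following sketch closes a discrete-time PI loop around the inductor model (9) using forward-Euler integration. All numerical values (inductance, resistance, gains, sampling time, constant SC voltage) are hypothetical and chosen only so that the loop is well damped; they are not the parameters of the studied vehicle.

```python
class PIController:
    """Discrete PI controller: kP * e + kI * integral(e)."""

    def __init__(self, kp, ki, ts):
        self.kp, self.ki, self.ts = kp, ki, ts
        self.integral = 0.0

    def step(self, error):
        self.integral += self.ki * error * self.ts
        return self.kp * error + self.integral


def chopper_voltage_ref(pi, u_sc_mea, i_sc_ref, i_sc_mea):
    """Eq. (14): compensate u_SC and close the loop on the SC current."""
    return u_sc_mea - pi.step(i_sc_ref - i_sc_mea)


# Hypothetical parameters (assumptions for illustration only).
L_IND, R_L, TS = 1e-3, 0.01, 1e-4      # H, ohm, s
pi = PIController(kp=1.0, ki=100.0, ts=TS)
u_sc, i_sc, i_ref = 40.0, 0.0, 50.0    # V (held constant), A, A

for _ in range(2000):                  # 0.2 s of simulated time
    u_ch = chopper_voltage_ref(pi, u_sc, i_ref, i_sc)
    # Eq. (9) solved for di_SC/dt, integrated with forward Euler.
    i_sc += TS * (u_sc - R_L * i_sc - u_ch) / L_IND
```

The integral term ends up compensating the resistive drop rL iSC, so the current settles on its reference without steady-state error.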
3.2 Rule-Based Filtering Strategy
The filtering-based strategy, which relies on a low-pass filter (LPF) [5, 6], is used as an example for performance evaluation:
$$i_{bat\_ref} = \frac{1}{\tau_{LPF}\, s + 1}\, i_{trac\_ref} \tag{15}$$
where τLPF is the time constant of the LPF. The cutoff frequency of the filter is calculated by:
$$f_c = \frac{1}{2 \pi \tau_{LPF}} \tag{16}$$
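Equations (15) and (16) can be checked numerically with a short sketch. The backward-Euler discretization and the function names below are assumptions made for illustration; the two cutoff frequencies are the ones examined later in Sect. 5.2.

```python
import math


def lpf_time_constant(fc_hz):
    """Invert Eq. (16): tau_LPF = 1 / (2 * pi * fc)."""
    return 1.0 / (2.0 * math.pi * fc_hz)


def lpf_filter(signal, tau, ts):
    """Backward-Euler discretization of the filter 1 / (tau * s + 1) in Eq. (15)."""
    gain = ts / (tau + ts)
    y, out = signal[0], []
    for x in signal:
        y += gain * (x - y)
        out.append(y)
    return out


tau_1 = lpf_time_constant(0.050)   # fc1 = 50 mHz, about 3.18 s
tau_2 = lpf_time_constant(0.010)   # fc2 = 10 mHz, about 15.9 s

# Step in the traction current reference (hypothetical 100 A step, 10 s long).
i_trac_ref = [0.0] + [100.0] * 100
i_bat_ref = lpf_filter(i_trac_ref, tau_1, ts=0.1)
```

A lower cutoff frequency gives a larger time constant, so the battery reference follows the traction demand more slowly and the SCs must supply the difference for longer.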
The SC voltage is constrained between its upper and lower limits [34]. In addition, during intervals with no power request from the traction subsystem, the battery charges the SCs if the SC voltage is lower than uSC_max.
4 Multi-Objective Optimal Energy Management System
4.1 Problem Formulation
System Dynamical Model
Formulating an optimal control problem consists in determining the system dynamical model, the objective function, and the constraints, which are addressed in general form in (1), (2), and (3). The model can be obtained by combining the component models given in (7)–(12) as in [35], which eventually yields a nonlinear model of current and voltage relationships [21]. However, a more general linear model can be deduced by considering the power and energy relationships of the HESS, inherited from [15], as illustrated in Fig. 5. By defining the positive direction of the source powers as discharging to supply the traction subsystem, the power coupling node is given by:
$$P_{bat} + P_{SC} = P_{trac} \tag{17}$$
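From the SC subsystem power PSC, the SC subsystem losses can be calculated by:
$$P_{SC\_loss} = P_{SC} \left( k_{SC}\, \eta_{SC}^{-k_{SC}} - k_{SC} \right) \quad \text{with} \quad k_{SC} = \begin{cases} 1 & \text{if } P_{SC} \ge 0 \\ -1 & \text{if } P_{SC} < 0 \end{cases} \tag{18}$$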
Fig. 5 The studied battery/SC HESS power flows (blocks: battery, with Pbat as the control variable; SCs pure capacitance, with ESC as the state variable and power PSC0; SCs subsystem losses PSC_loss, with output power PSC; traction, with Ptrac as the disturbance)
in which ηSC is the SC subsystem efficiency considering the DC/DC converter losses and the Joule losses caused by the SC series resistance rSC. On the other hand, the charging and discharging dynamics of the pure SC capacitance are modeled by:
$$\frac{d}{dt} E_{SC} = -P_{SC0} \tag{19}$$
where the pure charging/discharging power PSC0 is given by:
$$P_{SC0} = P_{SC}\, \eta_{SC}^{-k_{SC}} \tag{20}$$
Hence, considering the battery power Pbat as the control variable and the SC energy ESC as the state variable, the dynamical model used for EMS development follows from (17), (19), and (20):
$$\frac{d}{dt} E_{SC} = \left( P_{bat} - P_{trac} \right) \eta_{SC}^{-k_{SC}} \tag{21}$$
where the traction power demand Ptrac is the disturbance imposed on the system.
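The reduced model (17)–(21) amounts to a single update per time step, sketched below with a forward-Euler discretization. The function name `sc_step` and all numerical values (efficiency, powers, sampling time, initial energy) are hypothetical and serve only to check the signs.

```python
def sc_step(e_sc, p_bat, p_trac, eta_sc, ts):
    """One forward-Euler step of the reduced HESS model.

    Returns the next SC energy (J) and the SC subsystem losses (W).
    """
    p_sc = p_trac - p_bat                               # Eq. (17)
    k_sc = 1 if p_sc >= 0 else -1                       # discharging / charging
    p_sc0 = p_sc * eta_sc ** (-k_sc)                    # Eq. (20)
    p_sc_loss = p_sc * (k_sc * eta_sc ** (-k_sc) - k_sc)  # Eq. (18)
    e_next = e_sc - ts * p_sc0                          # Eqs. (19) and (21)
    return e_next, p_sc_loss


# Discharging: the SCs supply 2 kW of a 5 kW traction demand.
e_dis, loss_dis = sc_step(50000.0, p_bat=3000.0, p_trac=5000.0, eta_sc=0.95, ts=0.1)

# Charging: 1 kW of braking power goes to the SCs, the battery stays idle.
e_chg, loss_chg = sc_step(50000.0, p_bat=0.0, p_trac=-1000.0, eta_sc=0.95, ts=0.1)
```

Note that (18) as written returns a signed value that follows the sign of PSC (negative while charging in this sketch); the quadratic form used in the objective function removes that sign.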
Objective Functions
This study minimizes two objectives, the battery degradation and the SC losses, collected in the objective function vector:
$$J = \begin{bmatrix} J_{bat} & J_{SC} \end{bmatrix}^{T} \tag{22}$$
These objectives conflict with each other, because minimizing the battery degradation requires using more SC power, i.e., more SC losses, to support the battery, and vice versa. Dealing with such conflicting objectives falls within the scope of multi-objective optimization methods. In order to reduce the battery aging factors, from the control point of view, the battery power usage should be reduced. Moreover, in terms of optimal control, it is convenient to have the control variable taken into account in the objective function. Thus, the battery objective function can be given by:
$$J_{bat} = \int_0^T P_{bat}^2\, dt \tag{23}$$
where T is the final time of the driving cycle; the quadratic function is employed to reduce the battery power in both positive and negative regions. The objective function to be minimized from the SC side is the SC subsystem losses PSC_loss. From (18), applying the quadratic form, the SC objective function is expressed as:
$$J_{SC} = \int_0^T \left( P_{bat} - P_{trac} \right)^2 \left( k_{SC}\, \eta_{SC}^{-k_{SC}} - k_{SC} \right)^2 dt \tag{24}$$
As mentioned in Sect. 2.4, the two objective functions should be normalized in order to properly apply the weighted sum method for scalarization of the objective function vector J as follows:
$$J_{ws} = \alpha \frac{J_{bat}}{J_{bat\_nom}} + (1 - \alpha) \frac{J_{SC}}{J_{SC\_nom}} \tag{25}$$
with 0 ≤ α ≤ 1 being the weighting factor. Here, we define the normalization factors Jbat_nom and JSC_nom as the means of the optimal solutions of $P_{bat}^2$ and $P_{SC\_loss}^2$, precomputed individually in the cases α = 0 and α = 1, respectively, as follows:
$$\begin{cases} J_{bat\_nom} = \mathrm{mean}\left( P_{bat}^{*2} \right)\big|_{\alpha=0} \\ J_{SC\_nom} = \mathrm{mean}\left( P_{SC\_loss}^{*2} \right)\big|_{\alpha=1} \end{cases} \tag{26}$$
Applying (23), (24), and (26) to (25), the weighted sum objective function to be implemented for computation of the studied optimal control problem is given by:
$$J_{ws} = \int_0^T \left[ \alpha \frac{P_{bat}^2}{\mathrm{mean}\left( P_{bat}^{*2} \right)\big|_{\alpha=0}} + (1 - \alpha) \frac{\left( P_{bat} - P_{trac} \right)^2 \left( k_{SC}\, \eta_{SC}^{-k_{SC}} - k_{SC} \right)^2}{\mathrm{mean}\left( P_{SC\_loss}^{*2} \right)\big|_{\alpha=1}} \right] dt \tag{27}$$
If the higher priority is given to the objective of extending battery lifetime, i.e., α is close to 1, the battery power is kept small, which means lower aging stress. By contrast, if α is close to 0, Pbat is kept close to Ptrac so that the second term of the objective function Jws can be minimized.
Constraints
Constraints on the control and state variables can be imposed on an optimal control problem. In practical applications, they are often the limits of the variables, here the battery power Pbat and the SC energy ESC [15]. An upper bound on Pbat avoids the high peak power demand which can be harmful for the battery. For the same reason, the maximum charging power of the battery should be limited; this is a lower bound because charging is in the negative direction. Hence, the control variable constraints of the studied problem are given by:
$$P_{bat\_min} \le P_{bat} \le P_{bat\_max} \tag{28}$$
These bounds can be calculated from the maximum continuous load current and the maximum charging current, which are often specified in terms of the battery C-rate. The second sort of constraints concerns the state variable, which is often among the most critical issues of optimal control. In a battery/SC HESS, the SC energy is restricted by the limited voltage and capacitance of the SCs as follows:
$$\frac{1}{2} C_{SC} \left( 0.5\, u_{SC\_nom} \right)^2 \le E_{SC} \le \frac{1}{2} C_{SC}\, u_{SC\_nom}^2 \tag{29}$$
Moreover, a final-state constraint can be imposed on an optimal control problem. This sort of constraint is especially useful for energy management of HESS, where the auxiliary source energy should be recovered at the end of the driving cycle. Firstly, it implies that the SCs only support the battery by compensating the power fluctuations, whereas the battery provides essentially all the energy for range autonomy. Secondly, and more importantly for the benchmark purpose, the final-state constraint ensures a fair comparison when evaluating the effectiveness of the EMS in terms of battery power smoothing and energy consumption. In the studied HESS, this charge-sustaining condition is given by:
$$E_{SC}(T) = E_{SC}(0) \tag{30}$$
Consequently, the multi-objective optimal HESS energy management problem can be formulated as follows:
$$P_{bat}^{*} = \arg\min \left\{ \int_0^T \left[ \alpha \frac{P_{bat}^2}{\mathrm{mean}\left( P_{bat}^{*2} \right)\big|_{\alpha=0}} + (1 - \alpha) \frac{\left( P_{bat} - P_{trac} \right)^2 \left( k_{SC}\, \eta_{SC}^{-k_{SC}} - k_{SC} \right)^2}{\mathrm{mean}\left( P_{SC\_loss}^{*2} \right)\big|_{\alpha=1}} \right] dt \right\}$$
$$\text{s.t.:} \quad \begin{cases} \dfrac{d}{dt} E_{SC} - \left( P_{bat} - P_{trac} \right) \eta_{SC}^{-k_{SC}} = 0 \\ P_{bat\_max} - P_{bat} \ge 0 \\ P_{bat} - P_{bat\_min} \ge 0 \\ \dfrac{1}{2} C_{SC}\, u_{SC\_nom}^2 - E_{SC} \ge 0 \\ E_{SC} - \dfrac{1}{2} C_{SC} \left( 0.5\, u_{SC\_nom} \right)^2 \ge 0 \\ E_{SC}(T) - E_{SC}(0) = 0 \end{cases} \tag{31}$$
In the next subsection, we apply DP to solve this optimal control problem for each value of α; the set of optimal solutions obtained for 0 ≤ α ≤ 1 then forms the Pareto front.
4.2 Dynamic Programming
Here, DP is used at the tactical layer to deduce the optimal solution of each dynamic optimization problem defined by a weighting factor value. DP is based on the Bellman principle of optimality, which leads to an effective numerical search technique for finding the optimal control law of a dynamic system [27]. Considering the general system model (1) in discrete form, DP is expressed by the Bellman equation as follows:
$$J_{k,N}^{*}\big( x(k) \big) = \min_{u(k)} \Big\{ \underbrace{g_D\big( x(k), u(k) \big)}_{\text{cost-to-go from current stage } k \text{ to next stage } k+1} + \underbrace{J_{k+1,N}^{*}\Big( f\big( x(k), u(k) \big) \Big)}_{\text{optimal cost-to-go from next stage } k+1 \text{ to final stage } N} \Big\} \tag{32}$$
in which the subscript k, N denotes the procedure going from the kth stage to the final Nth stage (and similarly for k + 1, N), and gD is the discrete form of the function g mentioned in (4). Figure 6 illustrates the solving procedure for an arbitrary scalar component of the state vector x. The goal is to find the optimal path from the current state xi(k) to the final state x(N). The optimal paths from all possible next states x(k + 1) to x(N) must be determined in advance. DP repeats this procedure backward from x(N) to the initial state x(0). Because the computation is carried out in discrete time, the system must be discretized in time and quantized in state and control.
Fig. 6 Illustration of dynamic programming computation procedure (backward computation over the time discretization and state quantization between min and max, with path constraints, a final state constraint, and feasible and infeasible transitions)
It is noteworthy that, unlike in real-time control, stability is not an issue for offline optimal control methods like DP. The offline optimal control law is obtained for an a priori known disturbance while respecting the control and state constraints, which often reflect the physical limits that keep the system stable. Hence, it is unnecessary to analyze the stability of DP, because the proposed approach is intended for offline benchmark comparison and not for real-time control. DP is of interest because of its natural ability to deal with control- and state-constrained problems for various sorts of nonlinear systems. Thanks to its ability to generate globally optimal solutions under all types of constraints and for all types of dynamical systems, DP is the most widely used method to deduce the optimal benchmark for energy management problems. The drawback of DP is its heavy computational burden. Moreover, the method requires a priori known disturbances; thus, DP can only be applied in offline simulations. In this study, we use the dpm MATLAB function introduced in [36] to implement DP in the inner loop (Algorithm 1), while the outer loop iterates over the weighting factor α, as illustrated in Fig. 4.
Algorithm 1 System model implementation for DP computation using the dpm function introduced in [36]

Definition
  INP = [inp.W, inp.U, inp.X]   // input structure
  PAR = [par.η, par.α, par.mean(P*bat^2), par.mean(P*SC_loss^2)]   // user-defined parameters
  X = inp.X{1}   // state variable structure
  C = inp.C{1}   // cost matrix
  I = I   // infeasible matrix
  out = [Pbat, ESC, PSC_loss]   // user-defined output signals

Step 1: Initialization
Do
  Ptrac = inp.W{1}   // Disturbance w(t) = Ptrac
  Pbat = inp.U{1}   // Control input u(t) = Pbat
  ESC = inp.X{1}   // State variable x(t) = ESC
  η = par.η   // SC subsystem efficiency
  α = par.α   // Cost function weighting factor

Step 2: System model and cost function computation
While ∀t ∈ T do
  if Ptrac − Pbat ≥ 0   // SC subsystem efficiency coefficient k
    k = 1
  else
    k = −1
  end if
  ESC = ESC + inp.Ts × (Pbat − Ptrac) × η^(−k)   // SC energy ESC
  X{1} = ESC   // Update the state variable x(t)
  PSC_loss = (Pbat − Ptrac) × (k − k × η^(−k))   // SC subsystem losses
  I = 0   // Summarize infeasible matrix
  C{1} = α × Pbat^2 / par.mean(P*bat^2) + (1 − α) × PSC_loss^2 / par.mean(P*SC_loss^2)   // Calculate cost matrix
end while

Step 3: Output
Do
  out.Pbat = Pbat
  out.ESC = ESC
  out.PSC_loss = PSC_loss
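As a complement to Algorithm 1, the backward recursion (32) applied to problem (31) can be sketched in plain Python. This is not the dpm implementation: as a simplifying assumption, the decision at each stage is the next point of the ESC grid, and Pbat is recovered by inverting the model, so that no interpolation of the cost-to-go is needed. The normalization factors are set to 1, and the grid, the 1 s time step, and the ten-sample traction profile are hypothetical.

```python
import math


def stage(e_now, e_next, p_trac, alpha, eta, ts, p_min, p_max):
    """Stage cost g_D and battery power for one transition on the E_SC grid."""
    p_sc0 = (e_now - e_next) / ts              # Eq. (19), forward Euler
    k = 1 if p_sc0 >= 0 else -1                # discharging / charging
    p_sc = p_sc0 * eta ** k                    # Eq. (20) inverted
    p_bat = p_trac - p_sc                      # Eq. (17)
    if not p_min <= p_bat <= p_max:            # Eq. (28)
        return math.inf, p_bat
    p_loss = p_sc * (k * eta ** (-k) - k)      # Eq. (18)
    # Normalization factors of Eq. (26) are taken as 1 here (assumption).
    return ts * (alpha * p_bat ** 2 + (1 - alpha) * p_loss ** 2), p_bat


def solve_dp(p_trac, alpha, e_grid, i0, eta=0.95, ts=1.0,
             p_min=-1000.0, p_max=6000.0):
    n = len(e_grid)
    cost_to_go = [math.inf] * n
    cost_to_go[i0] = 0.0                       # Eq. (30): E_SC(T) = E_SC(0)
    policy = []
    for p_tr in reversed(p_trac):              # backward recursion, Eq. (32)
        new_cost, choice = [math.inf] * n, [None] * n
        for i in range(n):
            for j in range(n):
                if cost_to_go[j] == math.inf:
                    continue
                g, _ = stage(e_grid[i], e_grid[j], p_tr, alpha, eta, ts, p_min, p_max)
                if g + cost_to_go[j] < new_cost[i]:
                    new_cost[i], choice[i] = g + cost_to_go[j], j
        cost_to_go = new_cost
        policy.insert(0, choice)
    i, p_bat, e_sc = i0, [], [e_grid[i0]]      # forward pass from E_SC(0)
    for step, p_tr in enumerate(p_trac):
        j = policy[step][i]
        p_bat.append(stage(e_grid[i], e_grid[j], p_tr, alpha, eta, ts, p_min, p_max)[1])
        e_sc.append(e_grid[j])
        i = j
    return p_bat, e_sc


e_grid = [20000.0 + 250.0 * i for i in range(81)]   # J, hypothetical E_SC limits
p_trac = [0.0, 2000.0, 5000.0, 5000.0, 2000.0, 0.0, -1000.0, -1000.0, 0.0, 0.0]
p_bat_a0, e_sc_a0 = solve_dp(p_trac, alpha=0.0, e_grid=e_grid, i0=40)
p_bat_a1, e_sc_a1 = solve_dp(p_trac, alpha=1.0, e_grid=e_grid, i0=40)
```

With α = 0 only the SC losses are penalized, so the battery follows the traction demand exactly and the SCs stay idle; with α = 1 the SCs are used to reduce the integral of the squared battery power. Repeating the call for intermediate values of α gives the outer loop of Fig. 4.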
5 Results and Discussions
This section presents the numerical validation of the proposed approach. The simulation configuration and scenario are described first, followed by the Pareto front that serves as a multi-objective optimal benchmark. Finally, representative cases for different weighting factor values are given and discussed to show the advantages of the proposed EMS.
Fig. 7 The eCommander EV with the associated Valence U24-12XP Li-ion battery and Maxwell BMOD0058 E016 B02 SC modules of e-TESC laboratory, University of Sherbrooke, as the reference model
5.1 Simulation Setup
The simulation is carried out using the parameters of the eCommander EV available at the e-TESC laboratory, University of Sherbrooke [37], as the reference vehicle (Fig. 7). The vehicle total mass, including the HESS and the driver, is 871 kg. Nine Maxwell BMOD0058 E016 B02 SC modules are connected as three parallel branches of three modules in series. The SCs are linked to a DC/DC converter having an average efficiency of 95%. The SC subsystem is directly connected to a battery composed of 12 Valence U24-12XP Li-ion modules, arranged as 3 parallel branches of 4 modules in series (4s/3p arrangement). The full dynamical model, local control, and filtering-based strategy are simulated in the MATLAB/Simulink environment. A real-world driving cycle recorded on the campus of our university, with a length of 199.4 s, is used as the vehicle speed reference (see Fig. 8 and footnote 1). The discretization step, i.e., sampling time, of the problem is 0.1 s. Hence, there are 1994 time steps to be computed. The upper and lower bounds of the state variable ESC are based on the SC voltage limits of 45 V and 22.5 V, respectively. The maximum and minimum battery power constraints are set to 6 kW of discharging for traction and −1 kW of charging for regenerative braking. Both ESC and Pbat are quantized in 200 steps. Two main results are given. The first is the Pareto front generated with the weighting factor α varying from zero to one; results of the filtering-based strategy are given to illustrate the benchmark role of the generated Pareto front. The second is the set of energy and power trajectories of the studied HESS for some specific values of α.
1 HESU Eco Drive Platform: https://www.gel.usherbrooke.ca/e-TESC/?page_id=89.
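The figures quoted in the setup can be cross-checked with a few lines of arithmetic. The number of time steps follows directly from the text; the SC energy bounds additionally require the equivalent SC capacitance, which is not stated in this section. The value of 58 F used below is an assumption (a nominal 58 F per module in a 3s/3p arrangement), so the resulting energies are indicative only.

```python
def sc_energy_joule(c_sc_farad, u_sc_volt):
    """Energy stored in a capacitance: E = 0.5 * C * u^2."""
    return 0.5 * c_sc_farad * u_sc_volt ** 2


CYCLE_LENGTH_S = 199.4
SAMPLING_TIME_S = 0.1
n_steps = round(CYCLE_LENGTH_S / SAMPLING_TIME_S)

C_SC = 58.0                                  # F, assumed equivalent capacitance
e_sc_max = sc_energy_joule(C_SC, 45.0)       # upper voltage limit from the text
e_sc_min = sc_energy_joule(C_SC, 22.5)       # lower voltage limit from the text
e_sc_max_kwh = e_sc_max / 3.6e6
e_sc_min_kwh = e_sc_min / 3.6e6
```

Under this assumption the usable SC energy window is roughly 0.004 to 0.016 kWh, and the lower bound is one quarter of the upper one because the voltage limit is halved.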
Fig. 8 The real-world driving cycle obtained in the University of Sherbrooke campus with the studied eCommander vehicle (vehicle speed in km/h versus time in s)
5.2 Pareto Front as a Multi-Objective Optimal Benchmark
The Pareto front generated by the multi-objective optimal EMS is given in Fig. 9. Since DP is a global optimization method, this front can be used as a benchmark to evaluate the performance of sub-optimal strategies. The well-distributed convex form of the generated Pareto front supports the validity of the weighted sum scalarization. To illustrate the benchmark role of the generated Pareto front, results of the filtering-based strategy are used. Two typical values of the LPF cutoff frequency, fc1 = 50 mHz and fc2 = 10 mHz, are studied. With fc1, the EMS causes low SC system losses but relieves less of the battery stress. By contrast, fc2 is better for battery lifetime (less stress) but forces the SC system to work harder, causing higher losses. Both are fairly far from the set of optimal solutions. For a better evaluation of the filtering-based strategy, a set of LPF cutoff frequencies with a 5 mHz step is examined, as given in Fig. 9. This set corresponds in fact to nine different filtering-based strategies. Fig. 9 also shows that the results of the filtering-based strategy are distant from the Pareto front. Advanced EMSs giving results closer to the Pareto front are therefore preferred. An EMS whose results lie between the Pareto front and the curve of the filtering strategy can be considered better than the latter. For instance, in [21], an optimization-based real-time strategy has been developed based on an adaptation of PMP to deduce a closed-loop control scheme of the SC voltage. Simulation and experimental validations have been carried out to verify the superiority of that EMS over the filtering strategy. Moreover, appropriate methods used for energy management of HEVs could be adapted for HESS because of their analogy in terms of power flow model. For example, the recent work [29] has proposed a linear quadratic regulator (LQR)-based EMS for a parallel hybrid powertrain whose performance has been shown to be close to DP results. It could therefore be of interest to extend this strategy to energy management of EVs supplied by HESS.
Fig. 9 Pareto front benchmark generated from multi-objective optimal EMS
In this case, solutions to the multi-objective optimization problem are computed by solving weighted sum problems with a well-dispersed set of weights. In addition, the L1 (Manhattan metric) and L∞ (Chebyshev metric) distances to reference points can be used, so that the designer can define which level of each objective function should be attained [26]. For instance, to identify more suitable Pareto optimal solutions, the L∞ metric makes it possible to compute unsupported solutions (i.e., non-dominated solutions which are convexly dominated) in addition to the supported non-dominated solutions obtained from the weighted sum scalarizing function and the L1 metric. Moreover, these different computation processes offer the benchmarking process different insights into the possible trade-offs regarding the tuning and evaluation of different real-time strategies.
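The difference between supported and unsupported solutions can be illustrated on a toy set of three non-dominated points. The points, the reference point at the origin, and the function names below are hypothetical; the example only shows that a weighted Chebyshev (L∞) scalarization can select a convexly dominated point that no weighted sum ever selects.

```python
def weighted_sum(point, w):
    """Weighted sum of two normalized objectives, weight w in [0, 1]."""
    return w * point[0] + (1 - w) * point[1]


def chebyshev(point, w, ref=(0.0, 0.0)):
    """Weighted L-infinity distance to a reference point."""
    return max(w * abs(point[0] - ref[0]), (1 - w) * abs(point[1] - ref[1]))


# Three mutually non-dominated points; C lies above the segment joining
# A and B, so it is convexly dominated (an unsupported solution).
points = {"A": (0.0, 1.0), "B": (1.0, 0.0), "C": (0.6, 0.6)}
weights = [i / 20 for i in range(21)]

ws_picks = {min(points, key=lambda n: weighted_sum(points[n], w)) for w in weights}
ch_picks = {min(points, key=lambda n: chebyshev(points[n], w)) for w in weights}
```

Here the weighted sum only ever returns A or B, whereas the Chebyshev scalarization also returns C for balanced weights.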
5.3 Typical Cases
In these scenarios, the weighting factor α is chosen in {0, 0.25, 0.75, 1}, reflecting the priority given to battery lifetime extension. Figure 10 shows the SC energy ESC and the HESS power profiles of the studied cases. They include the traction power Ptrac, which is the disturbance, and the battery power Pbat, which is the control variable. In all scenarios, the state variable ESC is kept between its maximum and minimum limits. The final state ESC(T) is controlled to be equal to the initial state ESC(0), which means that the SCs are charge sustaining. Pbat is kept within its constraints as expected. In traction mode, the SCs support the battery to make Pbat smoother with lower peak values. In regenerative braking mode, all the energy is charged to the SCs, as expected of most EMSs. More detailed discussions on each case follow. The case α = 1 given in Fig. 10a means that all the priority is put on battery degradation reduction. In fact, this is the common case when the mono-objective approach is applied to the main source. The battery current is kept very smooth around a small average value. Most of the requested power is provided by the SCs. It is worth noting that, to provide the same power, SCs at a lower voltage, i.e., lower energy, must deliver a higher current. This may put a heavy duty on the power electronics converter, especially regarding stability, magnetic saturation of the power inductor, and converter efficiency. High Joule losses are also a consequence. The results of this case are used as the normalization factor for the scalarization of the SC objective function. By contrast, in the case α = 0 plotted in Fig. 10b, the only objective is to minimize the SC system losses while recovering all regenerative energy within the constraints of the battery current. The SCs therefore support the battery only to reduce the peak power demand that exceeds its power constraint. As in the previous case for the SCs, the battery objective function is normalized by using these results. Between the two extreme cases above, either major or minor priority can be given to each objective. The two scenarios where α = {0.75, 0.25} are presented in Fig. 10c, d, respectively. They are trade-offs between battery stresses and SC system losses. The higher the weighting factor α is, the lower and smoother the battery power is, because in this study α is introduced as the priority given to the objective of battery power smoothing. Meanwhile, the SC power varies inversely. The smoothness and the reduction of battery power depend upon the chosen value of α, which could be based on the expertise of the strategy designer.
Fig. 10 SC energy and HESS power trajectories with typical values of α. (a) α = 1. (b) α = 0. (c) α = 0.75. (d) α = 0.25. Each panel plots the SC energy (kWh) with its limits and initial value, and the HESS powers (kW), i.e., traction power, battery power, and maximum battery power, against time (s)
6 Conclusion
This chapter has presented a systematic approach to develop multi-objective optimal EMS for HESS-based EVs. The SC subsystem losses are taken into account in addition to the main objective of extending battery life-span. A hierarchical structure has been employed in which the EMS is decomposed into strategic and tactical layers. At the strategic level, the multi-objective optimal control problem has been dealt with by using the weighted sum scalarization method. DP has then been used at the tactical level for problem-solving, thanks to its ability to give the global optimal solution. Consequently, a Pareto front has been generated as a multi-objective EMS benchmark, whose advantages are validated via numerical evaluations with intensive analyses. We have illustrated the benchmarking role of the Pareto front by comparing it to the well-known filtering strategy as a real-time EMS for a real EV. The proposed methodology is not limited to the studied battery/SC system but can be extended to other hybridized systems such as hybrid electric vehicles (HEVs) or fuel cell/battery/SC HESS. The optimal EMS can also be combined with component sizing to form a strategy/sizing bi-level optimization problem, which is of interest for future study. Furthermore, using advanced scalarization methods such as the L1 and L∞ metrics may enrich the Pareto front and strengthen the benchmarking assessment.
Acknowledgments This work was supported in part by Grant 950-230672 from Canada Research Chairs Program, in part by Grant 2019-NC-252886 from Fonds de recherche du Québec – Nature et Technologies, in part by FCT-Portuguese Foundation for Science and Technology project UIDB/00308/2020, and by the European Regional Development Fund through the COMPETE 2020 Program within project MAnAGER (POCI-01-0145-FEDER-028040).
References
1. F.R. Salmasi, Control strategies for hybrid electric vehicles: evolution, classification, comparison, and future trends. IEEE Trans. Veh. Technol. 56(5), 2393–2404 (2007)
2. X. Luo, J. Wang, M. Dooner, J. Clarke, Overview of current development in electrical energy storage technologies and the application potential in power system operation. Appl. Energy 137, 511–536 (2015)
3. D.-D. Tran, M. Vafaeipour, M. El Baghdadi, R. Barrero, J. Van Mierlo, O. Hegazy, Thorough state-of-the-art analysis of electric and hybrid vehicle powertrains: topologies and integrated energy management strategies. Renew. Sust. Energ. Rev. 119, 109596 (2020)
4. I. Aharon, A. Kuperman, Topological overview of powertrains for battery-powered vehicles with range extenders. IEEE Trans. Power Elect. 26(3), 868–876 (2011)
5. A. Florescu, S. Bacha, I. Munteanu, A.I. Bratcu, A. Rumeau, Adaptive frequency-separation-based energy management system for electric vehicles. J. Power Sources 280, 410–421 (2015)
6. A. Tani, M.B. Camara, B. Dakyo, Energy management based on frequency approach for hybrid electric vehicle applications: fuel-cell/lithium-battery and ultracapacitors. IEEE Trans. Veh. Technol. 61(8), 3375–3386 (2012)
7. Z. Song, H. Hofmann, J. Li, X. Han, M. Ouyang, Optimization for a hybrid energy storage system in electric vehicles using dynamic programing approach. Appl. Energy 139, 151–162 (2015)
8. Z. Song, H. Hofmann, J. Li, J. Hou, X. Han, M. Ouyang, Energy management strategies comparison for electric vehicles with hybrid energy storage system. Appl. Energy 134, 321–331 (2014)
9. M. Adnane, B.-H. Nguyen, A. Khoumsi, J.P.F. Trovão, Driving mode predictor-based real-time energy management for dual-source electric vehicle. IEEE Trans. Transport. Electrific. 7(3), 1173–1185 (2021)
10. A. Florescu, A.I. Bratcu, I. Munteanu, A. Rumeau, S. Bacha, LQG optimal control applied to on-board energy management system of all-electric vehicles. IEEE Trans. Control Syst. Technol. 23(4), 1–13 (2014)
11. Y.-H. Hung, C.-H. Wu, An integrated optimization approach for a hybrid energy system in electric vehicles. Appl. Energy 98, 479–490 (2012)
12. A.A. Malikopoulos, A multiobjective optimization framework for online stochastic optimal control in hybrid electric vehicles. IEEE Trans. Control Syst. Technol. 24(2), 440–450 (2016)
13. L. Li, S. You, C. Yang, Multi-objective stochastic MPC-based system control architecture for plug-in hybrid electric buses. IEEE Trans. Indust. Electron. 63(8), 4752–4763 (2016)
14. S. Zhang, R. Xiong, Adaptive energy management of a plug-in hybrid electric vehicle based on driving pattern recognition and dynamic programming. Appl. Energy 155, 68–78 (2015)
15. B.-H. Nguyen, T. Vo-Duy, C.H. Antunes, J.P.F. Trovão, Multi-objective benchmark for energy management of dual-source electric vehicles: an optimal control approach. Energy 223, 119857 (2021)
16. B.-H. Nguyen, T. Vo-Duy, M.C. Ta, J.P.F. Trovão, Optimal energy management of hybrid storage systems using an alternative approach of Pontryagin's minimum principle. IEEE Trans. Transport. Electrific. 7(4), 2224–2237 (2021)
17. B. Hredzak, V.G. Agelidis, G. Demetriades, Application of explicit model predictive control to a hybrid battery-ultracapacitor power source. J. Power Sources 277, 84–94 (2015)
18. O. Gomozov, J.P.F. Trovão, X. Kestelyn, M. Dubois, Adaptive energy management system based on a real-time model predictive control with non-uniform sampling time for multiple energy storage electric vehicle. IEEE Trans. Veh. Technol. 66(7), 5520–5530 (2017)
19. T. Fletcher, R. Thring, M. Watkinson, An energy management strategy to concurrently optimise fuel consumption & PEM fuel cell lifetime in a hybrid vehicle. Int. J. Hydrogen Energy 41(46), 21503–21515 (2016)
20. K. Ettihir, L. Boulon, K. Agbossou, Optimization-based energy management strategy for a fuel cell/battery hybrid power system. Appl. Energy 163, 142–153 (2016)
21. B.-H. Nguyen, R. German, J.P.F. Trovão, A. Bouscayrol, Real-time energy management of battery/supercapacitor electric vehicles based on an adaptation of Pontryagin's minimum principle. IEEE Trans. Veh. Technol. 68(1), 203–212 (2019)
22. J.P.F. Trovão, C.H. Antunes, A comparative analysis of meta-heuristic methods for power management of a dual energy storage system for electric vehicles. Energy Convers. Manag. 95, 281–296 (2015)
23. J.P.F. Trovão, P.G. Pereirinha, H.M. Jorge, C.H. Antunes, A multi-level energy management system for multi-source electric vehicles – an integrated rule-based meta-heuristic approach. Appl. Energy 105, 304–318 (2013)
24. K. Deb, Multi-objective optimization, in Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques, ed. by E.K. Burke, G. Kendall, ch. 15, 2nd edn. (Springer, New York, 2014), pp. 403–449
25. I.Y. Kim, O.L. De Weck, Adaptive weighted sum method for multiobjective optimization: a new method for Pareto front generation. Struct. Multidiscip. Optim. 31(2), 105–116 (2006)
26. C.H. Antunes, M.J. Alves, J. Clímaco, Multiobjective Linear and Integer Programming (Springer International Publishing, Basel, 2016)
27. D.E. Kirk, Optimal Control Theory: An Introduction (Prentice-Hall, Hoboken, 1970)
28. A.E. Bryson, Y.-C. Ho, Applied Optimal Control: Optimization, Estimation, and Control (John Wiley & Sons, Hoboken, 1975)
29. B.-H. Nguyen, J.P.F. Trovão, R. German, A. Bouscayrol, Real-time energy management of parallel hybrid electric vehicles using linear quadratic regulation. Energies 13 (2020)
30. A. Sciarretta, L. Guzzella, Control of hybrid electric vehicles. IEEE Control Syst. Mag. 27(2), 60–70 (2007)
31. F. Logist, B. Houska, M. Diehl, J. Van Impe, Fast Pareto set generation for nonlinear optimal control problems with multiple objectives. Struct. Multidiscip. Optim. 42(4), 591–603 (2010)
32. Z. Song, J. Li, X. Han, L. Xu, L. Lu, M. Ouyang, H. Hofmann, Multi-objective optimization of a semi-active battery/supercapacitor energy storage system for electric vehicles. Appl. Energy 135, 212–224 (2014)
33. A. Bouscayrol, J.P. Hautier, B. Lemaire-Semail, Graphic formalisms for the control of multiphysical energetic systems: COG and EMR, in Systemic Design Methodologies for Electrical Energy Systems: Analysis, Synthesis and Management, ed. by X. Roboam, ch. 3 (ISTE Ltd., Washington, 2013), pp. 89–124
34. B.-H. Nguyen, R. German, J.P.F. Trovão, A. Bouscayrol, Improved voltage limitation method of supercapacitors in electric vehicle applications, in Proceedings of the 2016 IEEE Vehicle Power and Propulsion Conference, Hangzhou (2016)
35. B.-H. Nguyen, J.P.F. Trovão, R. German, A. Bouscayrol, Impact of supercapacitors on fuel consumption and battery current of a parallel hybrid truck, in Proceedings of the 2019 IEEE Vehicle Power and Propulsion Conference, Hanoi (2019)
36. O. Sundström, L. Guzzella, A generic dynamic programming Matlab function, in Proceedings of the IEEE International Conference on Control Applications (2009), pp. 1625–1630
37. M.J. Blondin, J.P.F. Trovão, Soft-computing techniques for cruise controller tuning for an off-road electric vehicle. IET Elect. Syst. Transport. 9(4), 196–205 (2019)
Transient Stability and Protection Evaluation of Distribution Systems with Distributed Energy Resources Guilherme S. Morais, Mariana Resener, Bibiana M. P. Ferraz, Ana P. Zanatta, Maicon J. S. Ramos, and Younes Mohammadi
G. S. Morais · B. M. P. Ferraz · A. P. Zanatta · M. J. S. Ramos: Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
M. Resener: Simon Fraser University, Surrey, B.C., Canada
Y. Mohammadi: Luleå University of Technology, Luleå, Sweden
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_12

1 Introduction
Electricity occupies a prominent place in the development of society worldwide, since economic progress and quality of life currently depend on it. The growth and spread of electricity consumption created the need to increase generation capacity, which led to the emergence of large hydroelectric plants. In Brazil, for instance, these large plants were installed far from the main consumption centers, basically to make better use of the inflows and elevation differences of the rivers. As a result, long high-voltage transmission lines were built to transport the energy generated in these plants. Several other countries adopted a similar approach, owing to the proximity either of rivers or of the primary energy sources used to generate electricity (such as mineral coal used in thermal power plants). Thus, the basic topology of an electrical power system (EPS) is characterized by large generators supplying energy to loads through transmission systems.

The transmission system transports energy to consumption centers or large consumers, often over long distances. As generators normally generate at lower voltages, they are connected to the grid through step-up transformers. This voltage increase is necessary to reduce transmission losses, making long-distance transmission feasible [1]. Close to the load centers, the voltage is reduced to subtransmission levels. Subtransmission systems transmit power in smaller quantities, from the substations (SBs) to the distribution SBs, serving medium-sized consumers and distribution systems. Finally, the distribution SBs provide energy to consumers through primary medium-voltage feeders. Connected to the primary feeders, through step-down transformers, are the secondary feeders that supply power at low voltage to smaller customers, such as residential and commercial customers [2]. Some examples of typical voltages used in EPS are the following:
• Generation: 400 V to 24 kV
• Transmission: 138 kV to 765 kV
• Subtransmission: 69 kV and 138 kV
• Distribution: medium voltage (13.8 kV to 34.5 kV) and low voltage (110 V to 440 V)
With the connection of distributed energy resources (DER) in power distribution systems (PDS), the basic EPS topology has been modified. DER can be defined as electricity generation and storage systems connected to distribution networks, usually behind the meter (consumer unit) [3]. The definition of DER often includes distributed generation (DG), energy storage, and demand-side management. Several countries are going through a transition from a centralized power system to a more distributed model, which changes the power flows and increases the complexity of EPS. In recent years, interest in DER has increased considerably, due to factors such as [4, 5]:
• The need to use different primary energy sources, in order to increase the reliability and safety of the electrical system
• Technological advances that have enabled different types of energy generation to be applied in DG, such as wind turbines and photovoltaic panels
• Awareness of environmental conservation and rational use of energy, which has brought about concern for the environment, highlighting renewable energy generation

As the number of DER connections increases, DER become important to systemic issues, such as supporting the stability of the EPS in general. For instance, the blackout in Europe in 2006 was caused by the loss of a transmission line and a consequent generation deficit, which reduced the system frequency. One of the conclusions about this event was that the blackout could have reached a smaller area if the generators connected to the distribution systems had remained connected [6]. This underlines the importance of understanding the impacts of DER on the stability of electrical power systems and, consequently, the dynamic behavior of DER in general.
Regarding the type of technology applied in distributed generation, there is a trend toward connecting DER through inverters, since many primary energy sources do not produce energy through rotating machines or, when they do, the machines may not be synchronous [7]. In Brazil, the presence of synchronous distributed generators connected to PDS is significant [8, 9], due to the development of the sugar and ethanol sector, with a wide diffusion of cogeneration or combined heat and power systems.

Power utility engineers and researchers have made a great effort to consider the presence of DER in PDS, in order to optimize the system operation. Distribution utilities must guarantee the quality of supply and safety for both the population and the electrical system, and, therefore, the connection of generators to these systems is a concern. The protection of these generators is usually set so that any disturbance in the system removes them from operation, since utilities usually do not have control over these machines, while the distribution company is responsible for any damage to consumers. The connection of DER brings new technical aspects that must be analyzed. All protection and control devices normally used in distribution systems, as well as planning and optimization techniques for their operation, have to be adapted to consider DER. While DER impose new challenges, their integration can also bring several benefits. Increasing the potential efficiency gains, and thereby allowing more DER to be integrated, is an objective of this modern power system era.

In this context, this chapter aims to contribute to the understanding of the impacts of connecting DER to distribution networks, focusing on transient stability and protection aspects. We present an analysis of an unbalanced PDS with multiple DER, through simulations using the software DIgSILENT PowerFactory [10], where the dynamic behavior of the system is evaluated as the presence of photovoltaic generation increases. Furthermore, typical protection devices are modeled, in order to evaluate the behavior of the PDS under different disturbances.
This chapter is organized as follows: Sect. 2 describes the main concepts related to power system stability and protection systems; Sect. 3 presents the model of the system used in the case study, as well as data of the dynamic models and protection settings; Sect. 4 presents the test cases and results; Sect. 5 presents the final conclusions; and Sect. 6 brings suggestions for future work.
2 Transient Stability and Protection of Distribution Systems with DER

2.1 Modeling the Stability Problem
The differential equations describing the power system dynamics can be obtained through a power balance formulation for each synchronous generator. A prime mover provides mechanical power to the machine; part of this mechanical energy is converted into electricity and delivered to the power grid, and the remainder becomes the accelerating power of the machine rotor [11]. The following subsections briefly describe the main concepts related to the power system stability problem.
The Swing Equation
The mathematical formulation of rotational inertia is essential to power system stability analysis, since it describes the effect of an imbalance between the electromagnetic and mechanical torques of synchronous machines [12]. Any unbalanced torque acting on the generator rotor results in its acceleration or deceleration, according to Newton's second law:

$$J \frac{\partial^2 \delta_m}{\partial t^2} = T_m - T_e = T_a, \tag{1}$$

where:
$J$  combined moment of inertia of generator and turbine [kg m²]
$\delta_m$  angular position with respect to a synchronously rotating reference [rad]
$T_m$  mechanical torque [Nm]
$T_e$  electromagnetic torque [Nm]
$T_a$  accelerating torque [Nm]
$t$  time [s]

The mechanical and electrical torques are considered positive for a synchronous machine operating as a generator. The mechanical torque originates from the prime mover (e.g., water in hydroelectric plants or steam in thermoelectric power plants), while the electrical power required by the load generates the electrical torque through magnetic fields [11]. If the machine is operating as a generator, the mechanical torque acts to accelerate the rotor and the electrical torque acts in the opposite direction. Thus, if the mechanical torque is greater than the electrical torque, the acceleration is positive; otherwise, the machine decelerates. In steady state, both torques are equal, and the machine operates at zero acceleration and constant speed. Since power is the product of torque and angular speed, multiplying (1) by the mechanical angular speed gives:

$$J \omega_m \frac{\partial^2 \delta_m}{\partial t^2} = P_m - P_e = P_a, \tag{2}$$

where:
$\omega_m$  mechanical angular speed [mech. rad/s]
$P_m$  mechanical power [W]
$P_e$  electrical power output [W]
$P_a$  accelerating power [W]
Since a rapid loss of synchronism would make the power system unstable, the mechanical angular speed $\omega_m$ is assumed not to deviate significantly from the synchronous speed $\omega_s$. Therefore, the inertia constant is defined as [11]:

$$J \omega_m \cong J \omega_s = M, \tag{3}$$

where $M$ is given in kg m²/s and $\omega_s$ is the electrical synchronous speed [electrical rad/s]. Machine data usually provide another constant, defined as the ratio between the kinetic energy [MJ] stored when the machine operates at synchronous speed and the generator nominal power [MVA], as follows [13]:

$$H = \frac{0.5\, J \omega_{sm}^2}{S_{nom}}, \tag{4}$$

where $S_{nom}$ is the three-phase nominal power [MVA] and $\omega_{sm}$ is the synchronous mechanical angular speed [mech. rad/s]. Based on (3) and (4), the inertia constant is given by:

$$M = \frac{2 H S_{nom}}{\omega_{sm}}. \tag{5}$$
Since the electrical power injected into the power system is a function of electrical angles, it is necessary to relate the machine mechanical angles to the power system angles. As a result, (2) becomes:

$$\frac{2H}{\omega_s} \frac{\partial^2 \delta}{\partial t^2} = P_m - P_e = P_a, \tag{6}$$

where $P_m$, $P_e$, and $P_a$ are the mechanical, electrical, and accelerating power in per unit (on the same base as the constant $H$), respectively; $\omega_s$ is the synchronous electrical angular speed [electrical rad/s]; and $\delta$ is the synchronous machine rotor angle, measured as the angular difference between the synchronous reference and the axis of the magnetic field generated by the field winding (direct axis), expressed in electrical radians. Expression (6) is referred to as the swing equation, which describes the motion of a synchronous machine. It can be rewritten as two first-order differential equations:

$$\frac{2H}{\omega_s} \frac{\partial \omega}{\partial t} = P_m - P_e, \tag{7}$$

$$\frac{\partial \delta}{\partial t} = \omega - \omega_s, \tag{8}$$
where Pm and Pe are expressed in per unit, δ in electrical radians, and ω and ωs in electrical radians per second.
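As a quick numerical illustration of (4), (5), (7), and (8), the following minimal sketch computes H and M for a machine and evaluates the right-hand sides of the first-order swing equations. The function names and the machine data (J = 2000 kg m², a four-pole 60 Hz machine, 10 MVA) are hypothetical values chosen only for illustration; they are not taken from the case study of this chapter.

```python
import math


def inertia_constants(J, omega_sm, s_nom):
    """Return (H, M) following Eqs. (4) and (5).

    J        -- combined moment of inertia [kg m^2]
    omega_sm -- synchronous mechanical angular speed [mech. rad/s]
    s_nom    -- three-phase nominal power [VA]
    """
    H = 0.5 * J * omega_sm ** 2 / s_nom   # Eq. (4): kinetic energy / rated power [s]
    M = 2.0 * H * s_nom / omega_sm        # Eq. (5): algebraically equal to J * omega_sm
    return H, M


def swing_derivatives(omega, p_m, p_e, H, omega_s):
    """Right-hand sides of Eqs. (8) and (7): (d delta/dt, d omega/dt).

    p_m and p_e are in per unit; omega and omega_s are in electrical rad/s.
    """
    d_delta = omega - omega_s                        # Eq. (8)
    d_omega = omega_s * (p_m - p_e) / (2.0 * H)      # Eq. (7) solved for d omega/dt
    return d_delta, d_omega


# Hypothetical machine data (assumptions, for illustration only).
J = 2000.0                             # kg m^2
omega_sm = 2.0 * math.pi * 60.0 / 2.0  # four poles at 60 Hz -> 60*pi mech. rad/s
s_nom = 10e6                           # 10 MVA expressed in VA

H, M = inertia_constants(J, omega_sm, s_nom)
print(f"H = {H:.3f} s, M = {M:.1f} kg m^2/s")

# With p_m > p_e at synchronous speed, the rotor accelerates (d omega/dt > 0).
print(swing_derivatives(2.0 * math.pi * 60.0, 1.0, 0.8, H, 2.0 * math.pi * 60.0))
```

SI units are used throughout, so the ratio in (4) comes out in seconds regardless of whether the energy and power are quoted in J and VA or in MJ and MVA.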
Synchronous Machine Formulation
A synchronous machine is basically composed of a field winding and an armature winding. The magnetic excitation field is produced by energizing the field winding with a direct current source; the field winding is located on the machine's rotor, which is rotated by a turbine coupled to its shaft. The armature winding is located in the stator and is formed by a set of coils (phases a, b, c) placed in slots along the stator periphery. In addition, synchronous machines can also have rotor slots for damper windings, which are formed by short-circuited conductive bars and are used to improve the damping of rotor speed oscillations [12].

In accordance with [14], it is quite common to use synchronous generators with cylindrical-type rotors in power systems with DG. This type of machine features the field winding inserted in slots along the rotor perimeter. Moreover, smooth-pole synchronous generators operate at high speeds and normally have two or four poles. The cylindrical rotor provides paths for eddy current circulation, which act as equivalent damper windings under dynamic conditions [12, 15]. A synchronous machine can be represented by an equivalent rotor circuit and the three-phase circuit of the armature winding. Thus, the terminal voltage at any winding is given by Anderson and Fouad [15]:

$$v = \pm r\, i(t) \pm \frac{\partial \lambda(\theta, t)}{\partial t}, \tag{9}$$
where:
$r$  winding resistance [pu]
$\lambda(\theta, t)$  flux linkage [pu]

Using (9) directly is complicated, since the flux linkage varies with the rotor position, which is measured with respect to a fixed reference on the stator. Therefore, Park's transformation is used to change the system reference to a rotating reference frame that follows the direction of rotation [15]. A sixth-order model can be used to represent smooth-pole synchronous generators, with one field winding, two damper windings (one on the direct axis and another on the quadrature axis), and an equivalent winding representing eddy currents (on the quadrature axis). Note that in this model, the stator transients are neglected [12]. Thus, the electrical model, in dq0 coordinates, is given by Kundur et al. [12]:

$$v_d = -r_s i_d - \omega \lambda_q, \tag{10}$$

$$v_q = -r_s i_q - \omega \lambda_d, \tag{11}$$

$$v_{fd} = i_{fd} r_{fd} + \frac{\partial \lambda_{fd}}{\partial t}, \tag{12}$$

$$v_{1d} = i_{1d} r_{1d} + \frac{\partial \lambda_{1d}}{\partial t}, \tag{13}$$

$$v_{1q} = i_{1q} r_{1q} + \frac{\partial \lambda_{1q}}{\partial t}, \tag{14}$$

$$v_{2q} = i_{2q} r_{2q} + \frac{\partial \lambda_{2q}}{\partial t}, \tag{15}$$

where:
$r_{fd}$  field circuit resistance [pu]
$r_{1d}$  resistance of amortisseur winding in direct-axis [pu]
$r_{1q}$  resistance of first amortisseur winding in quadrature-axis [pu]
$r_{2q}$  resistance of second amortisseur winding in quadrature-axis [pu]
$\lambda_d, \lambda_q$  magnetic fluxes of direct-axis stator and quadrature-axis stator [pu]
$\lambda_{fd}, \lambda_{1d}$  magnetic flux of direct-axis rotor [pu]
$\lambda_{1q}, \lambda_{2q}$  magnetic flux of quadrature-axis rotor [pu]
$v_d, i_d$  voltage and current of direct-axis stator [pu]
$v_q, i_q$  voltage and current of quadrature-axis stator [pu]
$v_{fd}, i_{fd}$  field voltage and current [pu]
$v_{1d}, i_{1d}$  voltage and current of amortisseur winding in direct-axis [pu]
$v_{1q}, i_{1q}$  voltage and current of first amortisseur winding in quadrature-axis [pu]
$v_{2q}, i_{2q}$  voltage and current of second amortisseur winding in quadrature-axis [pu]
Finally, the 6th order model of synchronous generators with one field winding and three damper windings is represented by the swing equations (7) and (8) and the expressions (10)–(15) and is further used in this study.
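For numerical integration, the rotor voltage equations (12)–(15) can be rearranged so that each flux derivative stands alone on the left-hand side. The sketch below performs only that rearrangement. Setting the damper winding voltages to zero is an assumption made here, based on the earlier description of the damper windings as short-circuited bars; it is not stated as an equation in the text.

```latex
\begin{aligned}
\frac{\partial \lambda_{fd}}{\partial t} &= v_{fd} - r_{fd}\, i_{fd}, \\
\frac{\partial \lambda_{1d}}{\partial t} &= v_{1d} - r_{1d}\, i_{1d} = -r_{1d}\, i_{1d}
  \quad (\text{assuming } v_{1d} = 0), \\
\frac{\partial \lambda_{1q}}{\partial t} &= v_{1q} - r_{1q}\, i_{1q} = -r_{1q}\, i_{1q}
  \quad (\text{assuming } v_{1q} = 0), \\
\frac{\partial \lambda_{2q}}{\partial t} &= v_{2q} - r_{2q}\, i_{2q} = -r_{2q}\, i_{2q}
  \quad (\text{assuming } v_{2q} = 0).
\end{aligned}
```

Together with (7) and (8), these four flux equations give six state variables ($\delta$, $\omega$, $\lambda_{fd}$, $\lambda_{1d}$, $\lambda_{1q}$, $\lambda_{2q}$), which is consistent with the sixth-order designation, while the stator equations (10) and (11) remain algebraic because the stator transients are neglected.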
Power-Angle Curve
The swing equation of a synchronous generator includes the mechanical power input, which can be considered constant if the electrical variations in the power system occur before the speed governor acts [13]. Thus, if the mechanical power is constant, the electrical power ($P_e$) determines whether the rotor accelerates, decelerates, or keeps rotating at synchronous speed. Steady-state operation occurs when the mechanical and electrical powers are equal. Electrical power variations are related to the power system conditions. Severe disturbances can cause rapid variations of $P_e$, which cause electromechanical transients. When the effect of speed variations on the generated voltage is neglected, the way in which $P_e$ varies is determined by the power flow equations and by the model chosen to represent the electrical behavior of the machine [13]. Figure 1 depicts the diagram of a simple system composed of a synchronous generator connected through a lossless transmission line to a remote bus, considered as an infinite bus. The generator is represented by its classic model, that is, a voltage behind its synchronous reactance.
Fig. 1 Machine system versus infinite bus
Fig. 2 Power-angle curve
The active and reactive power delivered by the generator are given by Anderson and Fouad [15], Stevenson [16]:

$$P = \frac{EV}{x} \sin(\delta), \tag{16}$$

$$Q = -\frac{E^2}{x} - \frac{EV}{x} \cos(\delta), \tag{17}$$
where $x$ is the equivalent reactance given by $x = x_s + x_{LT}$, $x_{LT}$ is the reactance of the transmission line, and $x_s$ is the synchronous reactance. Thus, the relationship between active power and angle is sinusoidal, as shown in Fig. 2. In steady-state operation, the synchronous generator can be represented by an internal voltage in series with the direct-axis synchronous reactance. This model is an approximation, since factors such as (i) saturation of the magnetic circuit and (ii) the difference between direct-axis and quadrature-axis reactances (in the case of salient pole generators) are not considered [15]. Therefore, the power-angle function is highly nonlinear (considering the adopted model simplifications), since the power varies with the sine of the angle. However, if more accurate and complex machine models are considered, including the effects of the excitation system controls such as automatic voltage regulators, the power-angle curve of a synchronous machine differs from that given by (16), but its general form remains similar [12].
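The interaction between the power-angle relation (16) and the swing equations (7) and (8) can be illustrated with a minimal simulation sketch of the single machine versus infinite bus system. The parameter values (E = 1.1 pu, V = 1.0 pu, x = 0.5 pu, H = 3.5 s, Pm = 0.8 pu), the classical fourth-order Runge-Kutta integrator, and the absence of damping are all assumptions made for illustration; they do not reproduce the PowerFactory models used later in the chapter.

```python
import math


def electrical_power(delta, E, V, x):
    """Eq. (16): active power [pu] delivered at rotor angle delta [rad]."""
    return E * V * math.sin(delta) / x


def swing_rhs(state, p_m, E, V, x, H, omega_s):
    """Right-hand sides of Eqs. (8) and (7) for state = (delta, omega)."""
    delta, omega = state
    p_e = electrical_power(delta, E, V, x)
    return (omega - omega_s, omega_s * (p_m - p_e) / (2.0 * H))


def rk4_step(state, dt, *args):
    """One classical Runge-Kutta step of the swing equations."""
    k1 = swing_rhs(state, *args)
    k2 = swing_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *args)
    k3 = swing_rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *args)
    k4 = swing_rhs(tuple(s + dt * k for s, k in zip(state, k3)), *args)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(delta0, p_m, E, V, x, H, f_s=60.0, t_end=2.0, dt=1e-3):
    """Return the rotor angle trajectory, starting at synchronous speed."""
    omega_s = 2.0 * math.pi * f_s
    state = (delta0, omega_s)
    deltas = [delta0]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, p_m, E, V, x, H, omega_s)
        deltas.append(state[0])
    return deltas


# Hypothetical per-unit data (assumptions, for illustration only).
E, V, x, H, p_m = 1.1, 1.0, 0.5, 3.5, 0.8

# Steady state: Pm = Pe, so delta_eq = arcsin(Pm * x / (E * V)).
delta_eq = math.asin(p_m * x / (E * V))

# Start 0.2 rad away from equilibrium: the undamped rotor swings around delta_eq.
deltas = simulate(delta_eq + 0.2, p_m, E, V, x, H)
print(f"delta_eq = {delta_eq:.4f} rad, "
      f"swing range = [{min(deltas):.4f}, {max(deltas):.4f}] rad")
```

Because this sketch has no damping term, the oscillation does not decay; in a real machine the damper windings and controllers described in this section reduce it over time.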
Considering Losses in Transmission Lines
Angular stability analysis requires finding the initial values of all generator variables, with the machine operating in steady state.
Fig. 3 System of a single synchronous generator
Fig. 4 Steady-state phasor diagram
It is important to remember that the dependence between the reactive power delivered by the generator and the phase angle of the internal voltage cannot be overlooked when the generator is connected to a power distribution system. Since the R/X ratio of these systems can be high, the line resistance cannot be neglected in this case [17]. Consider a system with a synchronous generator, represented by a voltage source in series with an impedance (classic model), connected through a line with losses, as shown in Fig. 3. The effective internal voltage of the generator is equal to:

$$E = V + (R + jX) I_T, \tag{18}$$

where $X = X_S + X_L$, $R = R_S + R_L$, and:
$X_S$  generator stator reactance [Ω]
$X_L$  series reactance of power distribution feeder [Ω]
$R_S$  generator stator resistance [Ω]
$R_L$  series resistance of power distribution feeder [Ω]
$I_T$  magnitude of the terminal current of a generator [A]
$V$  magnitude of the remote bus voltage [V]

The phasor diagram for the system depicted in Fig. 3 is detailed in Fig. 4, adapted from [12]. The internal angle of a generator operating in steady state is then given by Kundur et al. [12]:

$$\delta = \arctan\left[\frac{X I_T \cos(\phi) - R I_T \sin(\phi)}{V + R I_T \cos(\phi) + X I_T \sin(\phi)}\right], \tag{19}$$

where $\phi$ is the angle between the phasors $I_T$ and $V$. Then, rewriting (19):
$$\delta = \arctan\left[\frac{XP - RQ}{V^2 + RP + XQ}\right], \tag{20}$$

where $P = V I_T \cos(\phi)$ is the active power injected into the remote bus and $Q = V I_T \sin(\phi)$ is the reactive power injected into the remote bus. From (20), it follows that [17]:
• If the generator operates with a capacitive power factor, it injects reactive power into the system, so $Q > 0$. The argument of the arctangent in (20) therefore decreases, and so does the angle $\delta$.
• If the generator operates with an inductive power factor, it absorbs reactive power from the system, so $Q < 0$. The argument of the arctangent in (20) therefore increases, and so does the angle $\delta$.
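The equivalence of (19) and (20), and the effect of the sign of Q on the internal angle, can be checked numerically. The sketch below is a minimal illustration in per unit; the function names and the values V = 1.0, P = 0.8, R = 0.3, X = 0.6, and Q = ±0.3 are hypothetical. `math.atan2` is used in place of arctan, which gives the same result whenever the denominator is positive.

```python
import math


def internal_angle_from_current(V, I_T, phi, R, X):
    """Eq. (19): internal angle [rad] from the terminal current and its angle."""
    num = X * I_T * math.cos(phi) - R * I_T * math.sin(phi)
    den = V + R * I_T * math.cos(phi) + X * I_T * math.sin(phi)
    return math.atan2(num, den)


def internal_angle_from_power(V, P, Q, R, X):
    """Eq. (20): internal angle [rad] from the powers injected at the remote bus."""
    return math.atan2(X * P - R * Q, V ** 2 + R * P + X * Q)


# Hypothetical per-unit data with a high R/X ratio, as in distribution feeders.
V, P, R, X = 1.0, 0.8, 0.3, 0.6

for Q in (0.3, 0.0, -0.3):
    # Recover I_T and phi from P = V*I_T*cos(phi) and Q = V*I_T*sin(phi).
    I_T = math.hypot(P, Q) / V
    phi = math.atan2(Q, P)
    d19 = internal_angle_from_current(V, I_T, phi, R, X)
    d20 = internal_angle_from_power(V, P, Q, R, X)
    print(f"Q = {Q:+.1f} pu: delta(19) = {d19:.4f} rad, delta(20) = {d20:.4f} rad")
```

For the same active power, the case with Q > 0 yields a smaller internal angle than the case with Q < 0, in line with the two observations above.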
2.2 Dynamic Controllers
Controllers play an important role in keeping synchronous machines at a stable operating point. To that end, they act to ensure that voltage and frequency remain within acceptable limits and do not vary significantly, thus contributing to the stability of the system. The main controllers are:
• Automatic voltage regulator (AVR)
• Speed governor (GOV)
• Power system stabilizer (PSS)

Next, the excitation systems and the speed governor are described in more detail.
Excitation Systems
The main function of an excitation system is to provide direct current for the field winding of synchronous machines and to control the generated voltage output. For this, the control is performed using a system composed of the automatic voltage regulator (AVR) and the exciter of the synchronous machine. In addition, the excitation is also responsible for the control of reactive power injection and the power factor of the equipment. Thus, it plays an important role in functions that improve system stability and guarantee the proper operation of the generator [12]. The time response of an excitation system must be as short as possible, so that the voltage regulator is able to control the terminal voltage of the generator whenever the system is subjected to any disturbance or short-term event. Figure 5 depicts the main components of the excitation system of a synchronous machine [18]. As illustrated in Fig. 5, an excitation system of a synchronous machine presents the following components:
Fig. 5 Block diagram of excitation system
• Exciter, responsible for producing the power required by the field winding of the synchronous generator in the form of voltage and direct current
• Voltage regulator, whose function is to keep the terminal voltage of the generator constant and stabilized, that is, within the allowed and previously established values
• Voltage transducer, in which the output signal corresponds to the main voltage reference for the excitation control system
• Load compensator, which is used to control voltage, considering an internal or external point of the generator
• Power system stabilizer (PSS), which consists of a complementary compensator that provides additional damping torque
• Limiters and protection circuits, which aim to guarantee a proper operation for the synchronous generator, without exceeding the limits of the synchronous machine (capability curve) and the excitation system
Types of Excitation Systems
In the IEEE 421.1-2007 standard [19], concepts related to excitation systems are discussed. The document focuses on three systems, classified according to the excitation power source used:
• DC excitation systems
• AC excitation systems
• Static excitation systems

A DC excitation system uses a direct current generator to supply current to the main generator field, through a commutator at the end of the generator shaft or through slip rings. The exciter may be driven by a motor or by the generator shaft. In addition, it can be self-excited or separately excited. Older systems used mechanical equipment to perform field excitation control, varying a rheostat [18].
An AC excitation system uses an alternator and either stationary or rotating rectifiers (controlled or not) to supply direct current to the main generator field. The early AC excitation systems used a combination of magnetic and rotary amplifiers as regulators, but the new systems use electronic amplifiers for the regulation [18].

In a static excitation system, only static components are used. The excitation is usually derived from the generator terminals or an auxiliary bus, through an exciter transformer and a thyristor rectifier. The rectifiers supply excitation current directly to the main synchronous generator field through slip rings. In some cases, the transformer can be eliminated by using an auxiliary bus for excitation. Currently, static excitation is the most widely used type [18].
Speed Governor
The speed governor has the important role of controlling the speed of the prime mover (turbine) connected to the generator, so that the frequency is maintained at its nominal value, as well as contributing to the load balance (active power) of the electrical system. However, the characteristics and performance of the speed governor may vary depending on the nature of the generation plant. A representation of a thermal power station is presented in Fig. 6, where steam is used to drive the electrical generator. The generated frequency is related to the rotational speed of the shaft; thus, the governor acts on the valves, controlling the steam flow rate into the turbine. In addition, a speed measuring device (SD) provides information about the rotational speed, which allows the governor to restrict the steam flow when needed and consequently to control the generator frequency and load.
Fig. 6 Example of thermal power station
Fig. 7 Diagram blocks of steam turbine governor
A block diagram of the control loop of the speed governor is illustrated in Fig. 7. Speed control basically consists of a feedback system, in which the error is the difference between the measured speed and the reference value. To correct this error, the position of the valves, which defines the amount of steam admitted into the turbine, is changed in response to variations in the torque imposed by the electrical load demanded by the system. This control is needed to maintain a fixed rotational speed, and consequently a fixed alternating current frequency, while load variations are applied to the turbine. In other words, the controller has to monitor the machine speed and, in case of any load variation, increase or decrease the mechanical power of the shaft according to the speed deviation. In addition, in a multi-machine system it is essential that the speed remains close to the nominal value, so that the system stays synchronized. To obtain adequate load sharing among generators operating in parallel, a control called speed droop, or R, is used. With this function, the GOV decreases the speed reference as the load increases; that is, the higher the load, the lower the speed. It is implemented in all controllers that aim to operate in a stable condition. R is an adjustable parameter, typically assumed to be 3% to 5% [20], but it can also be obtained through:

$$R = \left(\frac{f_e - f_l}{f_n}\right) \times 100, \tag{21}$$
where:
$f_e$  no-load machine frequency [Hz]
$f_l$  full-load machine frequency [Hz]
$f_n$  machine-rated frequency [Hz]

An example of droop control is presented in Fig. 8, where R is equal to 5%, the load varies from 0 to 1.0 pu, and the curves have different initial frequency settings. Note that when there is no load on the generator (0 pu), the frequency is at its highest; when the load is maximum (1.0 pu), the frequency drops to its lowest. The frequency variation between minimum and maximum load is 3 Hz (5% of 60 Hz).
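The droop relation (21) and the 3 Hz figure quoted above can be reproduced with a short sketch. The straight-line characteristic between no load and full load is an assumption made here, and the no-load setting of 61.5 Hz is a hypothetical value chosen for illustration; the function names are not from the source.

```python
def droop_percent(f_no_load, f_full_load, f_rated):
    """Eq. (21): speed droop R in percent."""
    return (f_no_load - f_full_load) / f_rated * 100.0


def droop_frequency(load_pu, f_no_load, droop_pct, f_rated=60.0):
    """Frequency [Hz] at a given load [pu].

    Assumes a straight-line droop characteristic between
    no load (0 pu) and full load (1.0 pu).
    """
    return f_no_load - (droop_pct / 100.0) * f_rated * load_pu


# Hypothetical no-load frequency setting of 61.5 Hz with R = 5%.
for load in (0.0, 0.25, 0.5, 0.75, 1.0):
    f = droop_frequency(load, f_no_load=61.5, droop_pct=5.0)
    print(f"load = {load:.2f} pu -> f = {f:.2f} Hz")

# Recovering R from the end points of the characteristic with Eq. (21).
f_full = droop_frequency(1.0, f_no_load=61.5, droop_pct=5.0)
print(f"R = {droop_percent(61.5, f_full, 60.0):.1f} %")
```

Changing the no-load setting shifts the whole line up or down without changing its slope, which corresponds to the different initial frequency settings of the curves in Fig. 8.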
Fig. 8 5% Speed droop operation
Fig. 9 Example of isochronous mode control
There is also the isochronous control mode, which is usually used when the generator is islanded or when the generator has the highest rated power in the network. In this mode, the speed of the machine is kept constant, and any load variation triggers a quick action by the controller to correct speed variations and keep the frequency at its nominal value. That is, this control mode is based only on the speed of the machine, regardless of the generator load. Figure 9 shows the frequency response in relation to the generator load for isochronous control. However, the use of the isochronous mode can lead to instability problems in a system with more than one generator supplying several loads, since correct load sharing among the generating units of the system cannot be achieved.
2.3 Protection System
The protection system aims to guarantee the disconnection of an electrical system when the latter is subjected to any disturbance that could lead to an unstable condition [21]. Thus, it is essential that the protection devices operate as quickly as possible, in order to minimize risks to human life and damage to power system equipment. The classification and identification of protection devices are covered by the standard device numbers of the American National Standards Institute (ANSI), where numbers are used to identify and describe the protections. The protection system is usually formed by circuit breakers, fuses, instrument transformers, and relays. Relays are the central part of the protection scheme, and they comprise a wide range of devices that act to protect the system from different types of events, such as overload, short circuit, overvoltage, undervoltage, and frequency variations. The relay operation consists of analyzing the voltage and current provided by the voltage transformer (VT) and the current transformer (CT). From those quantities, the relay verifies the power system operating conditions according to its settings. In case of abnormal operation, a trip command is sent to the circuit breaker, disconnecting the affected circuit. Due to the continuous increase of embedded generation interconnected to the grid, the use of protection relays becomes essential to guarantee the reliability and stability of the system. Unlike electromechanical relays, digital relays are currently capable of performing several protection functions in a single device.
Function 50/51 The instantaneous overcurrent relay (50) and the time-overcurrent relay (51) are the most applied protections in the electrical system. Generally, overcurrent events are the most common ones in power systems and subject the equipment to the highest levels of stress and risk of damage [21]. These undesirable conditions may be caused by system overload or short-circuit events. The principle of overcurrent protection is quite simple: whenever the current exceeds a previously set value, the relay operates in order to protect the desired element or circuit. In addition, the relay operating time is based on the time-current relation; therefore, the manufacturers offer several options of curves to better attend the needs of the protection system. In Fig. 10, a single-line diagram is presented, with the overcurrent function inserted. The setup of an overcurrent relay consists of choosing a pickup current, which is the value that trips the device (maximum tolerable current). Regarding the operating time, it is necessary to set a definite operating time for the instantaneous overcurrent relay (50). As for the time-overcurrent relay (51), a time dial has to be set, which provides different operating times at the same operating current level. Typical values of IEEE standard (C37.112-1996) and International Electrotechnical Commission
G. S. Morais et al.
Fig. 10 Example of a single-line diagram with the overcurrent protection enabled
Table 1 Typical values for each characteristic curve [22]
Standard  Curve               K      α     L      β
IEC       Standard inverse    0.14   0.02  0      1
IEC       Very inverse        13.5   1     0      1
IEC       Extremely inverse   80     2     0      1
IEC       Long-time inverse   120    1     0      1
IEC       Short-time inverse  0.05   0.04  0      1
IEEE      Moderately inverse  0.515  0.02  1.14   1
IEEE      Very inverse        196.1  2     4.91   1
IEEE      Extremely inverse   282    2     1.217  1
(IEC) standard (60255) can be used in the following expressions:
$$T_{relay} = TD \left( \frac{K}{M^{\alpha} - \beta} + L \right), \tag{22}$$
$$M = \frac{I_{sc}}{I_{pickup}}. \tag{23}$$
where:
$TD$ is the time dial
$I_{sc}$ is the short-circuit current [A]
$I_{pickup}$ is the current chosen to operate [A]
$M$ is the multiple of pickup
The typical values of the variables that represent the characteristic curves are shown in Table 1. The characteristic curves available in the IEC standard are shown in Fig. 11, where it is possible to compare the operating differences among the curves, regarding the operating time and their slope.
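As an illustration of Eqs. (22) and (23), the following minimal Python sketch evaluates the relay operating time using the constants of Table 1. The function name, the dictionary layout, and the choice to return infinity at or below pickup are assumptions made for this example only; they are not taken from any relay firmware or from the simulation software used in this chapter.

```python
# Curve constants (K, alpha, L, beta) transcribed from Table 1.
CURVES = {
    ("IEC", "standard inverse"): (0.14, 0.02, 0.0, 1.0),
    ("IEC", "very inverse"): (13.5, 1.0, 0.0, 1.0),
    ("IEC", "extremely inverse"): (80.0, 2.0, 0.0, 1.0),
    ("IEC", "long-time inverse"): (120.0, 1.0, 0.0, 1.0),
    ("IEC", "short-time inverse"): (0.05, 0.04, 0.0, 1.0),
    ("IEEE", "moderately inverse"): (0.515, 0.02, 1.14, 1.0),
    ("IEEE", "very inverse"): (196.1, 2.0, 4.91, 1.0),
    ("IEEE", "extremely inverse"): (282.0, 2.0, 1.217, 1.0),
}


def relay_operating_time(i_sc, i_pickup, time_dial, standard, curve):
    """Operating time from Eqs. (22) and (23) for one curve of Table 1."""
    k, alpha, l_const, beta = CURVES[(standard, curve)]
    m = i_sc / i_pickup  # Eq. (23): multiple of pickup
    if m <= 1.0:
        # Assumption of this sketch: at or below pickup the relay never operates.
        return float("inf")
    return time_dial * (k / (m ** alpha - beta) + l_const)  # Eq. (22)
```

For instance, with the IEC extremely inverse curve, a time dial of 0.3, a pickup of 200 A, and a hypothetical fault current of 600 A, M = 3 and the sketch returns 0.3 × 80 / (9 − 1) = 3.0.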
Functions 27 and 59
The undervoltage protection (27) aims to detect events in which the voltage drops below the rated voltage. Such a drop can be caused by load starting, loss of a capacitor
Transient Stability and Protection Evaluation of Distribution Systems with. . .
Fig. 11 Time-overcurrent curves
bank, switching, and other causes. Besides, a voltage sag can also be a consequence of other disturbances in the system, such as a short circuit. VTs are therefore essential for providing voltage information to the relays. The overvoltage protection (59) operates similarly to the 27 relay; in this case, however, it detects when the voltage rises significantly above the rated voltage. This function protects generators against overvoltages, which can occur due to sudden load switching or to a malfunction of the voltage regulator.
Function 81
The frequency relay (81 under/over) is used to identify variations in frequency caused by load unbalance or by a malfunction of the speed controller of the prime movers. Frequency events are usually related to an imbalance of active power between generation and load demand. If not resolved quickly, frequency instability can result in the shutdown of one or more load blocks. Hence, frequency protection is fundamental to assure network stability, since the electrical system operates within a very narrow frequency range. According to [23], a procedure defined by the Brazilian regulatory agency, the declared frequency for the system is 60 Hz, and the mandatory frequency requirements are:
• Frequency cannot be above 66 Hz or below 56.5 Hz.
• A deviation of system frequency outside the range from 59.5 Hz to 60.5 Hz must not last for more than 30 s.
Fig. 12 Operating conditions and limits for frequency
• Frequency cannot be above 62 Hz for more than 30 s or above 63.5 Hz for more than 10 s.
• Frequency cannot be below 58.5 Hz for more than 10 s or below 57.5 Hz for more than 5 s.
A diagram is presented in Fig. 12, which offers an alternative view of the aforementioned requirements [23].
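The frequency requirements listed above can be checked against a sampled frequency trace with a short Python sketch such as the one below. It assumes uniformly spaced samples and interprets each time limit as a maximum continuous duration; both points, as well as all function and variable names, are assumptions of this example and not part of the procedure in [23].

```python
def longest_run(samples, dt, predicate):
    """Longest continuous time (s) during which predicate(frequency) holds."""
    longest = current = 0.0
    for f in samples:
        current = current + dt if predicate(f) else 0.0
        longest = max(longest, current)
    return longest


# (label, condition on the frequency in Hz, maximum tolerated duration in s)
LIMITS = [
    ("above 66 Hz", lambda f: f > 66.0, 0.0),
    ("below 56.5 Hz", lambda f: f < 56.5, 0.0),
    ("outside 59.5-60.5 Hz", lambda f: f < 59.5 or f > 60.5, 30.0),
    ("above 62 Hz", lambda f: f > 62.0, 30.0),
    ("above 63.5 Hz", lambda f: f > 63.5, 10.0),
    ("below 58.5 Hz", lambda f: f < 58.5, 10.0),
    ("below 57.5 Hz", lambda f: f < 57.5, 5.0),
]


def frequency_violations(samples, dt):
    """Labels of the limits violated by a trace sampled every dt seconds."""
    return [label for label, condition, max_s in LIMITS
            if longest_run(samples, dt, condition) > max_s]
```

With a hypothetical trace that stays at 58.0 Hz for 12 s (1 s sampling), only the "below 58.5 Hz" limit is reported, since the excursion lasts longer than 10 s but less than 30 s.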
Function 81R
The rate of change of frequency (ROCOF or df/dt) protection (81R) is used when fast action is necessary in cases of load shedding, in order to speed up the operating time for frequency deviations, and to detect islanding situations. The rate of change of frequency is an immediate indicator of power imbalance. This function is sensitive to rapid frequency changes, and it may be enabled to detect both positive and negative deviations. The relay measures the frequency decay rate, which makes it possible to act well before the system frequency actually dips to a point at which generator underfrequency relays or other elements would trip. Thus, the relay can act on the circuit breakers and disconnect loads from the network in order to restore the load-generation balance. Relays calculate the rate of change of frequency over a window of some cycles, usually between 2 and 40 cycles. This signal is processed by filters, and the resulting signal is used to detect an islanding situation. If the rate of change of frequency exceeds the setting, a trip signal is immediately sent to the circuit breaker. Typical settings for the 81R function on 60 Hz systems are between 0.10 and 1.20 Hz/s [24].
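The windowed calculation described above can be sketched in Python as follows. The sketch assumes one frequency measurement per cycle and a plain finite difference across the window; the filtering stage mentioned in the text is omitted, and the function names are hypothetical.

```python
def rocof(freq_per_cycle, window_cycles, f_nominal=60.0):
    """Rate of change of frequency (Hz/s) over the last window_cycles cycles.

    Assumption of this sketch: freq_per_cycle holds one measurement per cycle.
    """
    if len(freq_per_cycle) <= window_cycles:
        raise ValueError("not enough samples for the chosen window")
    window_s = window_cycles / f_nominal  # window length in seconds
    delta_f = freq_per_cycle[-1] - freq_per_cycle[-1 - window_cycles]
    return delta_f / window_s


def rocof_trip(freq_per_cycle, window_cycles, setting_hz_per_s, f_nominal=60.0):
    """True when the magnitude of df/dt exceeds the setting (either sign)."""
    return abs(rocof(freq_per_cycle, window_cycles, f_nominal)) > setting_hz_per_s
```

For a hypothetical frequency that falls by 0.05 Hz per cycle, a 6-cycle window (0.1 s at 60 Hz) gives about −3 Hz/s, which exceeds a 1.0 Hz/s setting.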
Function 67
The directional overcurrent relay (67) is sensitive to the direction of the power flow that circulates through the system. In other words, this relay operates when the current flows in the operating direction and the operating threshold is exceeded [25]. Directional relays are applicable to different disturbances, such as phase and ground faults. They are also used in generators in order to prevent reverse power flow (generators operating as motors) [21]. For its correct operation, the relay needs two quantities [22]:
• A polarization quantity, which can be either voltage or current (voltage is generally used)
• An operating quantity, for which current is the most common choice
The direction of the current flow is obtained through the phasor comparison between the operating current and the polarization voltage. The resulting angular displacement gives the direction of the power flow [22]. Thus, it is important to choose the relay polarization connection, which may be 30°, 60°, or 90°. These connections define the reference voltages used to obtain the polarization voltage. In Fig. 13, an example of a 90° connection is shown, where the polarization voltage Vbc lags VAN by 90°. Moreover, the angle r, which is the angle formed between the polarization voltage and the maximum torque line, was set to 45°. The relay operating limit is formed by the lines at +90° and −90° in relation to the maximum torque line, dividing the plane into two parts: the operation region (area highlighted in green) and the region where the relay is out of
Fig. 13 Phasor diagram for 67 relay
Table 2 Typical connections for relay 67
Connection  Angle    Phase A I  Phase A V  Phase B I  Phase B V  Phase C I  Phase C V
1           30°      Ia         Vac        Ib         Vba        Ic         Vcb
2           60° Δ    Ia − Ib    Vac        Ibc        Vba        Ic − Ia    Vcb
3           60° Y    Ia         −Vc        Ib         −Va        Ic         −Vb
4           90°–60°  Ia         Vbc        Ib         Vca        Ic         Vab
5           90°–60°  Ia         Vbc        Ib         Vca        Ic         Vab
service (area highlighted in red). Therefore, the relay is able to obtain the direction of a short-circuit current. In Table 2, typical connection configurations are presented, as well as the respective polarization voltages and currents. The neutral directional overcurrent relay (67N) works similarly to the phase relay, with some particularities. The polarization quantity is based on the zero-sequence voltage (3V0) or the zero-sequence current (3I0). Thus, any earth fault or phase imbalance produces a polarization quantity in the neutral, allowing the relay to act properly.
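The directional decision described for the 90° connection can be illustrated with complex phasors. In the sketch below, the maximum torque line is assumed to lead the polarization voltage by the angle r (45° by default); this sign convention, the function names, and the numerical values used afterwards are assumptions of the example and should be checked against the actual relay manual.

```python
import cmath
import math


def phasor(magnitude, angle_deg):
    """Build a complex phasor from magnitude and angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def in_operating_region(i_op, v_pol, torque_angle_deg=45.0):
    """True when i_op lies within +/-90 degrees of the maximum torque line.

    Assumption: the maximum torque line leads v_pol by torque_angle_deg.
    """
    max_torque = v_pol * cmath.exp(1j * math.radians(torque_angle_deg))
    # A positive projection of i_op on the maximum torque line means the
    # angle between them is smaller than 90 degrees.
    return (i_op * max_torque.conjugate()).real > 0.0


def directional_trip(i_op, v_pol, pickup_amps, torque_angle_deg=45.0):
    """Operate only if both the direction and the threshold conditions hold."""
    return abs(i_op) > pickup_amps and in_operating_region(
        i_op, v_pol, torque_angle_deg)
```

Taking VAN as the 0° reference, Vbc sits at −90° and the assumed maximum torque line at −45°. A current Ia lagging VAN by 60° then falls in the operation region, whereas the same current shifted by 180° does not.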
Fuses
Fuses are devices widely used in distribution systems, since they offer satisfactory performance in protecting the network at a lower price than other devices. Besides, they are applicable both in urban and rural areas [21]. They consist of a metal wire or strip and operate as a switch, interrupting the current flow. When the operating current exceeds the rated current, the metallic material melts by Joule effect and interrupts the circuit current. As a disadvantage, however, a fuse needs to be replaced every time it operates. The fuse operating characteristics follow a time-current relation. Thus, fuses are classified according to their time responses: type “H” (slow), “K” (fast), or “T” (slow). For fuse sizing, the fuse rated current and the maximum expected load have to be calculated. According to [21], the fuse capacity has to be greater than or equal to 150% of the maximum current, given by:
$$I_e \ge 1.5 \cdot I_{max}, \tag{24}$$
where:
$I_e$ is the fuse current [A]
$I_{max}$ is the maximum current for the estimated load [A]
The design must consider a possible increase in demand over time. The load growth is estimated through Eq. (25), where a period of 5 years is normally used [21]:
$$K = \left(1 + \frac{C_{\%}}{100}\right)^{n}, \tag{25}$$
where:
$C_{\%}$ is the annual growth rate [%]
$n$ is the number of years in the analysis period [year]
Another important aspect to be assessed is the short-circuit current at the site. For the specification, the current at the end of the circuit is calculated for a phase-to-ground fault, considering an impedance of 40 Ω. The fuse rated current may be defined from the relation:
$$K \cdot I_n \le I_e \le \frac{1}{4} \cdot I_{sc\,1\phi\text{-}min}, \tag{26}$$
where:
$I_n$ is the rated current [A]
$I_{sc\,1\phi\text{-}min}$ is the minimum short-circuit current at the end of the circuit for a phase-to-ground fault, which is multiplied by a safety factor of 1/4 [A]
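Equations (25) and (26) translate directly into a few lines of Python. The sketch below is only illustrative: the function names are hypothetical, and the currents used in the example afterwards are made-up values, not data from the test feeder.

```python
def growth_factor(rate_percent, years):
    """Load growth factor K of Eq. (25)."""
    return (1.0 + rate_percent / 100.0) ** years


def fuse_current_range(i_n, i_sc_min, rate_percent, years):
    """Admissible fuse current interval (lower, upper) from Eq. (26)."""
    lower = growth_factor(rate_percent, years) * i_n
    upper = 0.25 * i_sc_min  # safety factor of 1/4 on the minimum fault current
    if lower > upper:
        raise ValueError("no fuse current satisfies Eq. (26) for these inputs")
    return lower, upper
```

With a growth rate of 5% per year over 5 years, K is approximately 1.2763. For a hypothetical branch with a rated current of 10 A and a minimum phase-to-ground fault current of 200 A, the admissible interval is roughly 12.76 A to 50 A.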
3 Study Case and Modeling
This section further details the data and parameters used in modeling the system. Once the models are properly validated, simulations are carried out considering different test cases, followed by an in-depth analysis of the responses.
3.1 Feeder Model The transient stability of the IEEE 34-node test system [26] is evaluated in this chapter, considering the presence of DERs connected along the feeder. The model is based on an unbalanced system, located in the state of Arizona (USA), which operates at a voltage of 24.9 kV and a frequency of 60 Hz. Its rated load is around 2.06 MVA, and despite being a mostly three-phase system, it has some single-phase and two-phase overhead lines. The network is simulated using the DIgSILENT PowerFactory software [10], and the model consists of the following sources: four photovoltaic generators (PVs)
Fig. 14 Single-line diagram of IEEE 34-node test system (legend: R, automatic recloser; CB, circuit breaker; SG, synchronous generator; PV, photovoltaic plant)
installed along the feeder and a steam turbine-generator connected to node 828. Moreover, the substation is connected at the beginning of the feeder (node 800), represented by an infinite bus (External Grid). The complete system is shown in Fig. 14.
3.2 Dynamic Models This section presents the data used in dynamic simulations of the synchronous generator, voltage and speed regulators, PVs and protection system.
Synchronous Generator
The nominal power of the synchronous generator is 4.875 MVA, operating at 13.8 kV. The machine is connected to the 24.9 kV distribution system through a step-up transformer (TF5), at node 828. Regarding the control mode, the SG is set to regulate the voltage at the point of common coupling at 1.00 pu. The dynamic simulations considered a round-rotor machine with one field winding, one damping winding on the direct axis, and two damping windings on the quadrature axis. Table 3 presents the parameters of the model.
Voltage Regulator
For the excitation system, a static-type model was used, namely the IEEE ST2A AVR [27]. Switching and charging effects of the rectifier were disregarded, which sets up the exciter energy source model [28]. Figure 15 shows a block diagram of the model, and Table 4 describes the set of parameters [28].
Table 3 Steam generator parameters
Description                                             Parameter   Value
Rated apparent power                                    S (MVA)     4.875
Inertia time constant                                   H (s)       1.05
Rotor type                                              –           Round
Unsaturated d axis synchronous reactance                Xd (pu)     1.55
Unsaturated q axis synchronous reactance                Xq (pu)     1.55
Unsaturated d axis synchronous transient reactance      X'd (pu)    0.28
Unsaturated q axis synchronous transient reactance      X'q (pu)    0.65
Unsaturated d axis synchronous subtransient reactance   X''d (pu)   0.19
Unsaturated q axis synchronous subtransient reactance   X''q (pu)   0.19
Stator leakage reactance                                Xl (pu)     0.15
d axis transient open-circuit time constant             T'd (s)     6.5
q axis transient open-circuit time constant             T'q (s)     1.25
d axis subtransient open-circuit time constant          T''d (s)    0.035
q axis subtransient open-circuit time constant          T''q (s)    0.035
Fig. 15 Voltage regulator model
Fig. 16 Steam turbine model with reheating
3.3 Governor Since a thermoelectric power plant was considered in this study, the IEEE TGOV1 governor model [29] was selected to control a steam turbine, which is coupled to the shaft of the generator. In Fig. 16, the TGOV1 block diagram is illustrated. Speed regulator parameters are shown in Table 5, based on the data available in [28].
Table 4 ST2A excitation system parameters
Description                                                      Parameter     Value
Voltage regulator gain                                           KA (pu)       180
Exciter constant related to self-excited field                   KE (pu)       1
Excitation control system stabilizer gains                       KF (pu)       0.01
Voltage regulator time constant                                  TA (s)        0.15
Exciter time constant                                            TE (s)        0.5
Excitation control system stabilizer time constant               TF (s)        1
Rectifier loading factor proportional to commutating reactance   KC (pu)       1.82
Potential circuit gain coefficient                               KP (pu)       14
Potential circuit gain coefficient                               KI (pu)       8
Minimum voltage regulator outputs                                VRmin (pu)    0
Maximum voltage regulator outputs                                VRmax (pu)    1
Minimum field voltage                                            EFDmin (pu)   0
Maximum field voltage                                            EFDmax (pu)   4.2625

Table 5 Steam turbine governor
Description                   Parameter   Value
Permanent droop               R (pu)      0.05
Steam bowl time constant      T1 (s)      0.01
Time constant                 T2 (pu)     1.5
Time constant                 T3 (pu)     5
Turbine damping coefficient   Dt (pu)     0.0
Minimum valve position        Vmin (pu)   0
Maximum valve position        Vmax (pu)   1.1
3.4 PV Model
The model used in this work is the one provided by DIgSILENT, which considers the basic characteristics of the PV cell and of the inverter that connects the solar modules to the grid. In addition, it provides a control loop and information about the model's voltage and power measurements. In this work, the simulations are performed based on the instantaneous generation at a given moment in time. A rated power of 500 kVA and an operating voltage of 0.4 kV were considered for each PV. In total, the system has four PV units, generating 300 kW of active power each (there is no reactive power contribution). They are connected to nodes 840, 848, 858, and 812, through step-up transformers TF1, TF2, TF3, and TF4, respectively. Table 6 shows the parameters of the solar system.
Table 6 Solar-PV system parameters
Description                               Parameter   Value
Open-circuit voltage of module at STC     VOC (V)     43.8
MPP voltage of module at STC              VMPP (V)    35
MPP current of module at STC              IMPP (A)    4.58
Short-circuit current of module at STC    ISC (A)     5
Temperature correction factor (voltage)   au (1/K)    −0.0039
Temperature correction factor (current)   ai (1/K)    0.0004
No. of series connected modules           Ns          20
No. of parallel connected modules         Np          140
Time constant of module                   Tr (s)      0
Fig. 17 Protection zones for each type of device
3.5 Modeling of Protection System
For the protection system, some equipment, such as relays, fuses, and a recloser, was inserted along the feeder. These devices oversee the protection of the network whenever it is exposed to any disturbance or event that could lead the system to an unstable condition. In Fig. 17, the protection devices are identified. Concerning digital relays, the model 751-5A by Schweitzer Engineering Laboratories (SEL) was selected, which is a feeder relay widely used in power systems and has several built-in functions. In PowerFactory, the following functionalities are modeled: F50/F51 (phase, earth, negative sequence), F59, F27, F81, and F81R, among others of a complementary nature. In addition to these, the relay also has a reclose function (F79). As the directional overcurrent protection (phase and neutral) is not available in the SEL model, the SR 750 relay by General Electric (GE) was selected to perform this function. In addition to this device, a generic relay provided by DIgSILENT
was also installed, since it allows implementing other features, such as the reclosing operation. Further details regarding the parameters adopted, as well as protection functions used in this work, are discussed in the following sections.
Feeder Protection
Close to the substation, between nodes 800 and 802, three digital relays were installed, each with a different protection role, monitoring faults near the transformer as well as those along the feeder. The devices and their functions are as follows:
• Relay A1, responsible for phase time-overcurrent protection (F51) and neutral time-overcurrent protection (F51N)
• Relay A2, responsible for phase directional overcurrent protection (F67), with the reclosing operation (F79) enabled
• Relay A3, responsible for neutral directional overcurrent protection (F67N), with the reclosing operation (F79) enabled
The settings of the relays are explained next.
(a) Relay A1: SEL 751-5A
Phase time-overcurrent protection (F51) aims to protect the network from short circuits, regardless of the current direction. Considering that the rated current between nodes 800 and 802 is 230 A, the phase pickup was set to 200 A for Relay A1. Moreover, the IEC Extremely Inverse Curve was selected, with a time dial setting of 0.3 s. The neutral function (F51N) provides ground fault protection and takes its residual current input from the phase CTs. Thus, the zero-sequence currents are used for fault-type identification. According to [22], the pickup current of the neutral relay must respect the following expression:
$$(10\%\ \text{to}\ 45\%) \times I_N \le I_{pickup} \le \frac{I_{sc}}{a}, \tag{27}$$
where:
$I_N$ is the nominal current [A]
$I_{pickup}$ is the pickup current of the neutral relay [A]
$I_{sc}$ is the short-circuit current for a phase-to-ground fault at the end of the segment [A]
$a$ is equal to 1.1 if the relay is digital and 1.5 if the relay is electromechanical
The nominal current is obtained through load flow calculations, considering that the synchronous generator is operating with a power factor equal to 0.8 and that each PV is generating 300 kW with unity power factor. Thus, the nominal current is equal to 79.187 A between nodes 800 and 802, while the short-circuit current measured
for a ground fault applied at node 840 is approximately 132 A. Thus, the pickup current was set to 20 A for Relay A1, with the IEC Extremely Inverse Curve and a time dial setting of 0.3 s. The current threshold respects the recommendations, as expressed by:
$$0.2 \times 79.187 \le I_{pickup} \le \frac{132}{1.1}, \tag{28}$$
$$15.8374\ \text{A} \le I_{pickup} \le 120\ \text{A}. \tag{29}$$
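The bounds of Eqs. (28) and (29) can be reproduced with a small Python helper. The function name and its default arguments (a 20% imbalance factor and the digital-relay divisor of 1.1) are choices made for this sketch; the numerical inputs are the ones quoted in the text.

```python
def neutral_pickup_range(i_nominal, i_sc_ground, imbalance=0.2, relay_factor=1.1):
    """Lower and upper bounds for the neutral pickup current, as in Eq. (27).

    imbalance is the fraction of the nominal current (0.10 to 0.45 in [22]);
    relay_factor is 1.1 for digital and 1.5 for electromechanical relays.
    """
    return imbalance * i_nominal, i_sc_ground / relay_factor


lower, upper = neutral_pickup_range(79.187, 132.0)
# The 20 A setting chosen for Relay A1 must lie inside the interval.
setting_ok = lower <= 20.0 <= upper
```

The helper returns about 15.84 A and 120 A, so the 20 A setting of Relay A1 lies inside the admissible interval.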
(b) Relay A2: Generic
The phase directional overcurrent function (F67) is used to protect the distribution network against faults that may occur in the direction of the substation. Furthermore, provided that the system load is low, this function enables the protection of the generator from overload, imposing a limit on the current in the direction of the substation. Thus, considering that the nominal current corresponds to 79.187 A and the maximum overload factor is equal to 1.5, the following pickup current is obtained:
$$I_{pickup} = 1.5 \times I_N = 1.5 \times 79.187 = 118.780\ \text{A}. \tag{30}$$
The protection was configured with a definite time curve at 0.03 s. Complementary to directional sensing, the reclosing function (F79) was also activated for a single operation, considering an operating time of 1 s. Thus, if the circuit breaker is opened by Relay A2, for example, the relay sends a “close” signal to the circuit breaker 1 s after the switching event. After the reclosing attempt, the relay finally enters lockout status.
(c) Relay A3: Generic
The neutral directional overcurrent function (F67N) was enabled on Relay A3 to protect the grid against ground faults that may occur in the direction of the substation. For this protection, Relay A3 determines the voltage reference from the zero-sequence voltage (3 × V0) derived from a wye connection of the VTs, which makes it possible to determine the direction of short circuits. The operating current (3 × I0), in turn, is obtained from a wye connection of the CTs. Generally, in unbalanced systems, there is a residual current flowing through the neutral. According to [21], the maximum factor of imbalance between the conductor's phases must be between 0.1 and 0.3. Considering this recommendation, a pickup current equal to 10 A was chosen for Relay A3, with a definite time curve at 0.03 s and the reclosing function enabled.
Table 7 Summary of the protections near the substation
Identification  Model       ANSI  Adjustment  Curve       Dial/definite time
Relay A1        SEL 751-5A  F51   200 A       E.I. (IEC)  0.3
Relay A1        SEL 751-5A  F51N  20 A        E.I. (IEC)  0.3
Relay A2        Generic     F67   118.8 A     –           0.03 s
Relay A3        Generic     F67N  10 A        –           0.03 s
Fig. 18 Single-line diagram close to the substation
Fig. 19 Time-overcurrent curves of substation devices
Table 7 presents a summary of all the parameters defined for each relay. The single-line diagram of the protective zone is illustrated in Fig. 18, and the time-overcurrent curves are shown in Fig. 19.
Protection at the SG Point of Common Coupling
In order to provide greater protection to the distribution system, three relays were installed at the SG point of common coupling (PCC), two of them on the HV
Table 8 Recommended adjustments to the connection of generator accessories (independent producers)
ANSI  Description                            Adjustments  Definite time
27    Undervoltage relay                     0.8 pu       5 s
27    Undervoltage relay                     0.7 pu       1.5 s
59    Overvoltage relay                      1.1 pu       5 s
59    Overvoltage relay                      1.2 pu       0.5 s
81U   Underfrequency relay                   58.5 Hz      0.2 s
81U   Underfrequency relay                   59.0 Hz      2 s
81O   Overfrequency relay                    60.5 Hz      2 s
81O   Overfrequency relay                    61 Hz        0.2 s
67    Phase directional overcurrent relay    *            *
67N   Neutral directional overcurrent relay  *            *
* Depending on direction of current
side of the transformer (24.9 kV) and one on the LV side (13.8 kV). Thus, the system consists of the following devices:
• Relay B1, responsible for the phase time-overcurrent (F51), neutral time-overcurrent (F51N), undervoltage (F27), overvoltage (F59), underfrequency (F81U), overfrequency (F81O), and rate of change of frequency (F81R) functions
• Relay B2, responsible for phase directional overcurrent protection (F67), in the direction of the distribution company
• Relay B3, responsible for neutral directional overcurrent protection (F67N), in the direction of the SG
For the modeling of these devices, the reference used was the standard [30], provided by Companhia Energética de Minas Gerais (CEMIG), the company responsible for energy distribution in the state of Minas Gerais, Brazil. This document establishes the recommended adjustments for the connection of distributed generation to the medium voltage system. Table 8 presents the parameters used in modeling the protection at the PCC.
(a) Relay B1: SEL 751-5A
Protections F27, F59, F81U, and F81O were enabled on Relay B1, installed on the LV side of transformer TF5, following the settings given in the standard (Table 8). For the overcurrent function, the procedure recommended in [21] was used, in which the adjustment is based on the transformer nominal power. It is known that:
$$S_{TF} = 5\ \text{MVA}, \tag{31}$$
$$V_{LV} = 13.8\ \text{kV}, \tag{32}$$
$$V_{HV} = 24.9\ \text{kV}, \tag{33}$$
where:
$S_{TF}$ is the nominal power of the transformer
$V_{LV}$ is the nominal voltage on the LV side of the transformer
$V_{HV}$ is the nominal voltage on the HV side of the transformer
From this information, it is possible to define the nominal current at the LV side of the transformer and use it as a parameter for the overcurrent calculation. The nominal current on the LV side of the transformer ($I_{LV}$) can be obtained as:
$$I_{LV} = \frac{S_{TF}}{\sqrt{3} \times V_{LV}}, \tag{34}$$
$$I_{LV} = \frac{5 \times 10^{6}}{\sqrt{3} \times 24.9 \times 10^{3}} = 115.9338\ \text{A}. \tag{35}$$
Considering a maximum overload factor of 150%, the pickup current is obtained through:
$$I_{pickup} = 1.5 \times 115.934 = 173.90\ \text{A}. \tag{36}$$
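The arithmetic of Eqs. (35) and (36) can be checked with the short Python sketch below. The function name is hypothetical, and the line-to-line voltage is passed as an argument so that the same calculation can be repeated for either side of the transformer; the call shown uses the 24.9 kV value that appears in the printed substitution of Eq. (35).

```python
import math


def three_phase_current(s_va, v_line_volts):
    """Nominal line current (A) of a three-phase element: S / (sqrt(3) * V)."""
    return s_va / (math.sqrt(3.0) * v_line_volts)


# Values substituted in Eq. (35): 5 MVA and 24.9 kV.
i_nominal = three_phase_current(5e6, 24.9e3)

# Eq. (36): 150% overload factor applied to the nominal current.
i_pickup = 1.5 * i_nominal
```

The sketch gives approximately 115.93 A for the nominal current and 173.90 A for the pickup current, in agreement with the values above.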
Thus, a pickup current of 175 A was chosen, with the IEC Extremely Inverse Curve and a time dial setting of 0.5 s. Concerning ground faults close to the generator, since the transformer configuration is delta-star with grounded neutral ($\Delta$-$Y_g$), there is no zero-sequence current flowing on the delta side. However, for ground faults on the grid side, there will be zero-sequence current flowing through the $Y_g$ connection. Therefore, the neutral time-overcurrent protection (F51N) was enabled on Relay B1. Using Eq. (27) and considering a factor of 20%, a nominal current of 115.934 A, and a short-circuit current of 232 A for a fault at the end of the feeder, the pickup current is set at 25 A on Relay B1. This value is within the acceptable range, as demonstrated in what follows:
$$0.2 \times 115.934 \le I_{pickup} \le \frac{232}{1.1}, \tag{37}$$
$$23.187\ \text{A} \le I_{pickup} \le 210.91\ \text{A}. \tag{38}$$
Another important protection is the rate of change of frequency (81R), which was also enabled on this relay. According to [31], in order to detect an islanding condition, values in the range of 0.10–1.20 Hz/s are generally used. Thus, this protection was set to 1.00 Hz/s, with a maximum operating time equal to 100 ms.
(b) Relay B2: GE SR 750
According to [30], the directional function is suggested for faults on the distribution network and on the DG side. Thus, considering that Relay B2 monitors only faults that may occur on the grid side, the nominal current ($I_N$) is calculated based on the generator capacity ($P_{inst}$), in kW, using:
$$I_N = \frac{P_{inst}}{\sqrt{3} \times V_{pcc} \times 0.92}, \tag{39}$$
where $V_{pcc}$ is the voltage at the PCC, in kV. Thus, $I_N$ can be obtained as:
$$I_N = \frac{4{,}875}{\sqrt{3} \times 24.9 \times 0.92} = 122.865\ \text{A}. \tag{40}$$
It is recommended that the pickup current be 1.05 times the value of $I_N$, since the generated power may exceed the nominal power by up to 5%. Thus, the current was set to 129 A, with a time dial of 0.2 and the Extremely Inverse Curve, which results from:
$$I_{pickup} = 1.05 \times I_N = 1.05 \times 122.865 = 129.01\ \text{A}. \tag{41}$$
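As a quick numerical check of Eqs. (39) to (41), the following Python sketch evaluates the nominal and pickup currents of Relay B2. The function name and its default arguments (the 0.92 factor from Eq. (39) and the 5% margin) are introduced here for illustration only.

```python
import math


def directional_pickup(p_inst_kw, v_pcc_kv, factor=0.92, margin=1.05):
    """Nominal current from Eq. (39) and pickup current from Eq. (41), in A."""
    i_nominal = p_inst_kw / (math.sqrt(3.0) * v_pcc_kv * factor)
    return i_nominal, margin * i_nominal


i_n_b2, i_pickup_b2 = directional_pickup(4875.0, 24.9)
```

The results are approximately 122.86 A and 129.01 A, consistent with the 129 A setting adopted for Relay B2.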
(c) Relay B3: GE SR 750
The directional protection was enabled in Relay B3 in order to protect the grid from faults that may occur on the generator side. As there is no load connected to the generator bus, the demand was considered to be very small or zero. Therefore, the adjustment should consider the minimum possible current value, as indicated in [30]. Thus, Relay B3 was set to a pickup current of 5 A, a time dial of 0.1, and the Extremely Inverse Curve. Table 9 summarizes the adjustments proposed for the protection devices. In Figs. 20 and 21, the one-line diagram close to the synchronous generator and the coordination of the protections (selectivity) are presented, respectively.
Fuses
Fuses were considered in some branches of the test feeder, in order to protect them against short-circuit currents. For all analyzed cases, a growth rate of 5% per year was considered, for a period n of 5 years. Thus, from Eq. (25), the K factor can be obtained as:
$$K = \left(1 + \frac{5}{100}\right)^{5} = 1.2763. \tag{42}$$
Table 9 Summary of the protections for the SG devices
Identification  Model       ANSI  Settings  Curve        Dial/definite time
Relay B1        SEL 751-5A  F51   174 A     E.I. (IEC)   0.5
Relay B1        SEL 751-5A  F51N  25 A      E.I. (IEC)   0.5
Relay B1        SEL 751-5A  F27   0.8 pu    –            5.0 s
Relay B1        SEL 751-5A  F27   0.7 pu    –            1.5 s
Relay B1        SEL 751-5A  F59   1.1 pu    –            5.0 s
Relay B1        SEL 751-5A  F59   1.2 pu    –            0.5 s
Relay B1        SEL 751-5A  F81O  60.5 Hz   –            2.0 s
Relay B1        SEL 751-5A  F81O  61.0 Hz   –            0.2 s
Relay B1        SEL 751-5A  F81U  59.0 Hz   –            2.0 s
Relay B1        SEL 751-5A  F81U  58.5 Hz   –            0.2 s
Relay B1        SEL 751-5A  F81R  1.0 Hz/s  –            0.10 s
Relay B2        GE SR 750   F67   129 A     E.I. (IEC)   0.2
Relay B3        GE SR 750   F67   5 A       E.I. (IEC)   0.1

Fig. 20 One-line diagram close to the synchronous generator
Fig. 21 Time-overcurrent curves of the SG protection devices
Table 10 Fuses and specifications
Fuse  Type  Nodes    Ie (A)
F1    10 K  808–810  1.549 ≤ Ie ≤ 71
F2    20 K  816–818  15.413 ≤ Ie ≤ 47.5
F3    10 K  836–862  2.374 ≤ Ie ≤ 76.75
The rated current of each fuse is obtained from the short-circuit current at the end of each chosen branch (nodes 808–810, 816–822, and 836–838), the nominal current, and Eq. (26). The expected relation between demanded current and short-circuit current, as well as the type and location of each fuse, is shown in Table 10.
Recloser: Relay C1
Due to the long extension of the feeder, a recloser with a time-overcurrent relay (phase and neutral) was inserted at node 858. For any downstream fault, the device ensures the trip of the circuit breaker and, after a period, the automatic reclosing of the switch. In addition, Fuse F3 is downstream from the recloser, at the end of the feeder, so the recloser settings must consider the coordination of both devices. Using the SEL 751-5A relay, the settings consider a pickup current equal to 48 A, a time dial of 0.8, and the Extremely Inverse Curve. In addition, three reclosing operations were considered in the following sequence: 1.0 s, 2.0 s, and 4.0 s, while the fault counter resets after 30 s. For the neutral protection, the pickup current range is obtained using Eq. (27), yielding:
$$0.2 \times 14.085 \le I_{pickup} \le \frac{132}{1.1}, \tag{43}$$
$$2.817\ \text{A} \le I_{pickup} \le 120\ \text{A}. \tag{44}$$
Thus, the neutral pickup current was set to 30 A, with a time dial of 0.50 and the Extremely Inverse Curve. The neutral reclosing function was configured in the same manner as the phase one (number of attempts and time of operation). In Fig. 22, the time-overcurrent curves are presented.
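To visualize the reclosing sequence configured for Relay C1, the following Python sketch lists the trip and reclose instants for a fault downstream of the recloser. It is a simplification: a constant trip delay per shot is assumed (in practice the delay follows the inverse-time curve), a non-permanent fault is assumed to clear during the first dead time, and the 30 s reset of the fault counter is not modeled. All names are hypothetical.

```python
def reclose_timeline(fault_time, trip_delay, dead_times, fault_is_permanent=True):
    """List of (time in s, event) pairs for a recloser with the given dead times.

    Assumptions of this sketch: the same trip_delay applies to every shot, and a
    non-permanent fault clears during the first dead time.
    """
    events = []
    t = fault_time
    for dead in dead_times:
        t += trip_delay
        events.append((t, "trip"))
        t += dead
        events.append((t, "reclose"))
        if not fault_is_permanent:
            return events  # successful reclosing, the sequence ends here
    # Permanent fault: the relay trips once more and locks out.
    t += trip_delay
    events.append((t, "trip"))
    events.append((t, "lockout"))
    return events


timeline = reclose_timeline(1.0, 0.1, [1.0, 2.0, 4.0])
```

With a hypothetical trip delay of 0.1 s and a permanent fault applied at 1.0 s, the three reclosing attempts are followed by a final trip and lockout at about 8.4 s.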
Summary of Settings Finally, Table 11 presents a summary of the functions enabled in each relay.
Fig. 22 Time-overcurrent curves of Relay C1 and Fuse F3

Table 11 Summary of all the implemented protections along the feeder
Relay  Model      F51  F51N  F27  F59  F81  F81R  F67  F67N  F79
A1     SEL 751-A  ✓    ✓     –    –    –    –     –    –     –
A2     Generic    –    –     –    –    –    –     ✓    –     ✓
A3     Generic    –    –     –    –    –    –     –    ✓     ✓
B1     SEL 751-A  ✓    ✓     ✓    ✓    ✓    ✓     –    –     –
B2     GE SR 750  –    –     –    –    –    –     ✓    –     –
B3     GE SR 750  –    –     –    –    –    –     –    ✓     –
C1     SEL 751-A  ✓    ✓     –    –    –    –     –    –     ✓
4 Results and Discussion
In this section, the results are presented and discussed. Different scenarios were simulated, and the transient behavior of the synchronous generator is evaluated. In order to analyze stability and validate critical points of the protection system, the definition of cases took into account some aspects, such as (i) the number of PVs connected to the network, (ii) different locations for applying the fault, (iii) islanding, and (iv) changes in the protection sensitivity of the SG device. The simulations are performed with the DIgSILENT software, and the main SG variables are observed, among them voltage, frequency, active power, reactive power, field voltage, and rotor angle. In this study, a total of five different cases were simulated, as shown below:
• Case 1: phase-to-ground fault at node 822
• Case 2: phase-to-ground fault at node 860
Table 12 Events and specifications
Case        Event            Node  Number of PVs  SG relay  Substation relay  Recloser
1           Phase-to-ground  822   4
2 (a), (b)  Phase-to-ground  860   4
3 (a), (b)  Phase-to-ground  860   0/2/4
4           Islanding        800   2
5 (a), (b)  Phase-to-ground  800   4
• Case 3: phase-to-ground fault at node 860, considering different numbers of PVs connected to the system
• Case 4: opening of circuit breaker CB-A and grid islanding
• Case 5: phase-to-ground fault at node 800
Complementary scenarios to cases 2, 3, and 5 were also simulated. In those scenarios, the protection device of the synchronous generator was disabled, in order to verify the impact on the system dynamics under a less selective protection scheme. Table 12 summarizes the case studies, presenting the event type, the locations where the events took place, the number of PVs connected to the system, and the status of the protection devices during the simulations (enabled or not). The nodes where the events were applied are highlighted in Fig. 23. With respect to the transient stability analysis, Table 13 summarizes the criteria adopted in this study, establishing acceptable values for each variable.
4.1 Case 1: Phase-to-Ground Fault at Node 822
In this case, a phase-to-ground fault was applied at node 822 (phase “A”). The fault occurred at 1.0 s of simulation, and after 233 ms, Fuse F2 opens, disconnecting the faulted branch from the rest of the grid. Figure 24 shows the opening event of the protection device, isolating the fault on the respective phase. Due to the fast response of the fuse, none of the other protective equipment installed along the feeder operates, keeping the generators connected to the grid. However, depending on the chosen settings and the coordination design, other protections could trip before the fuse, which would cause the shutdown of one or more generators. Regarding the SG behavior, Fig. 25 illustrates that the fault caused a voltage dip at the PCC, reaching almost 0.9 pu during the transient period. Nevertheless, the
G. S. Morais et al.
[Fig. 23: single-line diagram of the 34-node feeder. Legend: R - Automatic Recloser; CB - Circuit Breaker; SG - Synchronous Generator; PV - Photovoltaic Plant]
Fig. 23 Feeder with the highlighted nodes for the simulated events
Fig. 24 Fuse status for each phase—Case 1
Table 13 Criteria for stability analysis

| Variable | Acceptable values | Description |
|---|---|---|
| ΔP | 0.5 p.u. | Torsional stress |
| TV | 0.8 p.u. ≤ TV ≤ 1.1 p.u. (must normalize within 10 s) | Voltage level in transient state |
| SV | 0.95 p.u. ≤ SV ≤ 1.05 p.u. | Voltage level in steady state |
| TF | 56.5 Hz ≤ TF ≤ 66.0 Hz (must normalize to SF within 30 s at most) | Frequency in transient state |
| SF | 59.9 Hz ≤ SF ≤ 60.1 Hz | Frequency in steady state |
Transient Stability and Protection Evaluation of Distribution Systems with. . .
Fig. 25 Terminal voltage on SG bus during the event, Case 1

Table 14 Analysis of the criteria, Case 1

| Variable | Lower limit | Status^a | Upper limit | Status |
|---|---|---|---|---|
| ΔP (kW) | −937.5 | ✓ | 3937.5 | ✓ |
| VRT (pu) | 0.9 | ✓ | 1.1 | ✓ |
| VRP (pu) | 0.93 | ✓ | 1.05 | ✓ |
| FRT (Hz) | 56.5 | ✓ | 66 | ✓ |
| FRP (Hz) | 59.9 | ✓ | 60.1 | ✓ |

^a The symbol ✓ indicates that no violations resulted
fast fuse operation allowed the SG to recover the voltage magnitude, which stabilized at 1.0 pu in steady state. The other SG responses, such as frequency, active power, reactive power, and rotor angle, showed no significant deviation. Table 14 summarizes the criteria analyzed in this test case and shows that no violations resulted.
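The pass/fail check behind Table 14 can be sketched in a few lines of Python. The limits below are the ones listed in Table 14; the measured values in the usage lines are hypothetical placeholders rather than simulation results, and the function and key names are assumptions made for illustration.

```python
# Limits taken from Table 14 (Case 1); keys mirror the table's variable names.
LIMITS_CASE_1 = {
    "dP_kW": (-937.5, 3937.5),
    "VRT_pu": (0.9, 1.1),
    "VRP_pu": (0.93, 1.05),
    "FRT_Hz": (56.5, 66.0),
    "FRP_Hz": (59.9, 60.1),
}


def check_criteria(measured, limits):
    """Return {variable: bool}; True means lower <= value <= upper."""
    return {
        name: limits[name][0] <= value <= limits[name][1]
        for name, value in measured.items()
    }


# Hypothetical measurements, for illustration only (not results from the chapter).
example = {"dP_kW": 1500.0, "VRT_pu": 0.92, "VRP_pu": 1.0,
           "FRT_Hz": 60.2, "FRP_Hz": 60.0}
status = check_criteria(example, LIMITS_CASE_1)
violations = [name for name, ok in status.items() if not ok]
```

An empty `violations` list corresponds to a table in which every status cell is marked as compliant.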
4.2 Case 2 (a): Phase-to-Ground Fault at Node 860
In Case 2, a permanent phase-to-ground fault was applied at node 860, downstream of the recloser, with all the PVs operating. To make the results easier to follow, the timeline of events is presented in Fig. 26. In addition, Fig. 27 shows the status of each circuit breaker installed on the grid, where CB-A is close to the substation, CB-B is close to the SG, and CB-C is at the end of the feeder.
Fig. 26 Sequence of events—Case 2(a)
Fig. 27 Status for each CB—Case 2(a)
According to Fig. 26, the first protection device to operate is Relay B1 (SG relay), at 1.118 s of the simulation, opening CB-B due to neutral time overcurrent (F51N). As this device does not have the reclosing function (F79) enabled, the SG is permanently disconnected from the grid, assuming an islanded condition. The second device to trip is Relay C-1 (recloser) at 1.206 s, opening CB-C and isolating the faulted branch. After 1.0 s, Relay C-1 completes its first reclosing operation (2.203 s) and once again connects the branch to the rest of the grid. Relay A1 (feeder relay) then operates and opens CB-A at 2.32 s, making the system lose its second source. After 1.0 s, Relay A1 restores the connection between the substation and the feeder; however, at 3.436 s, the device operates for the second time and sends the “open” signal to CB-A. Since it has only one reclosing operation
Fig. 28 Voltage response of SG—Case 2(a)
set, Relay A1 reaches lockout status, and the circuit breaker can no longer be reclosed. As a consequence of those events, no source supplies the network anymore, since all the PVs were switched off. Figure 26 illustrates the timeline of the simulation, as well as the description of the events. A question that may arise is why, after the recloser reconnects the branch to the grid, the next protection device to operate is Relay A1 instead of Relay C1, given that the recloser tripped first when the fault was applied. This can be explained by the opening of CB-B, which disconnects the SG. Without one of the sources, the short-circuit current at the fault location drops, even though the substation contribution remains the same as before the events. Thus, the substation relay now responds faster than the recloser. Regarding the voltage at the SG point of common coupling, a voltage sag was observed, followed by an overvoltage caused by the fast switching of CB-A. However, the AVR responded properly, regulating the voltage at 1.0 pu again, as shown in Fig. 28.
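The change in operating order described above can be illustrated with a small numerical sketch. It assumes, purely for illustration, that both devices follow the IEC standard-inverse characteristic and that the recloser carries the substation and SG contributions together while the substation relay carries only the substation contribution. The chapter does not state the curve types here, and every current and setting below is hypothetical.

```python
def standard_inverse_time(i_seen, i_pickup, tms):
    """Operating time (s) of an IEC standard-inverse overcurrent element."""
    multiple = i_seen / i_pickup
    if multiple <= 1.0:
        return float("inf")  # below pickup: the element does not operate
    return tms * 0.14 / (multiple ** 0.02 - 1.0)


# Hypothetical fault contributions (A) and relay settings.
I_SUBSTATION, I_SG = 400.0, 200.0
RECLOSER = {"i_pickup": 200.0, "tms": 0.05}
SUBSTATION_RELAY = {"i_pickup": 100.0, "tms": 0.08}


def operating_times(sg_connected):
    """Return (recloser time, substation relay time) for a downstream fault."""
    # Assumption: the recloser sees both contributions,
    # the substation relay only the substation contribution.
    i_recloser = I_SUBSTATION + (I_SG if sg_connected else 0.0)
    return (standard_inverse_time(i_recloser, **RECLOSER),
            standard_inverse_time(I_SUBSTATION, **SUBSTATION_RELAY))


t_rec_with, t_sub_with = operating_times(sg_connected=True)
t_rec_without, t_sub_without = operating_times(sg_connected=False)
```

With these made-up numbers the recloser is faster while the SG is connected, and the substation relay becomes the faster device once the SG is lost, because only the recloser current has dropped.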
Case 2 (b): SG Protection Device Disabled
Another interesting condition to evaluate in this case is a protection system made less sensitive to the fault by disabling the generator protective device (Relay B1). This allows the SG behavior during the disturbance to be examined more thoroughly, as it reduces the chances of the SG being disconnected from the grid. In Fig. 29, it is possible to observe the status of each circuit breaker, where only
Fig. 29 Circuit breakers during fault for a new condition of simulation, Case 2(b)

Table 15 Analysis of the criteria with less protection sensitivity, Case 2(b)

| Variable | Lower limit | Upper limit |
|---|---|---|
| ΔP (kW) | −937.5 | 3937.5 |
| VRT (pu) | 0.9 | 1.1 |
| VRP (pu) | 0.93 | 1.05 |
| FRT (Hz) | 56.5 | 66 |
| FRP (Hz) | 59.9 | 60.1 |

Status marks as extracted: one X in the lower-limit status column and none in the upper-limit status column; the row of the X is not recoverable.
^a The symbol ✓ indicates that no violations resulted, while X indicates that the criterion was not met
the recloser operates, isolating the short circuit from the main feeder without disconnecting any source from the grid, except for PV 01 and PV 02 (installed on the islanded branch). Furthermore, three reclosing attempts are observed on CB-C before it finally reaches lockout status. Fig. 30 shows the impact of the reclosing attempts on the voltage: three voltage dips occur after CB-C opens for the first time. However, once the recloser is in lockout and the fault is isolated, the terminal voltage returns to 1.00 pu, showing that the AVR responded properly to the event. Table 15 summarizes the criteria analyzed in this test case.
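The trip, reclose, and lockout pattern seen for a permanent fault can be sketched as a simple event generator. This is a simplified illustration and not the relay logic used in the simulations: it assumes a constant trip delay for every shot, and the function name, the 0.2 s and 0.1 s trip delays, and the event labels are hypothetical. Only the 1.0 s dead time and the number of reclosing attempts come from the text.

```python
def reclosing_sequence(t_fault, trip_delay, dead_time, reclose_attempts):
    """List (time_s, event) pairs for a permanent fault until lockout."""
    events = []
    t = t_fault
    for shot in range(reclose_attempts + 1):
        t += trip_delay                      # relay times out and trips
        events.append((round(t, 3), "trip"))
        if shot == reclose_attempts:         # no reclosing shots left
            events.append((round(t, 3), "lockout"))
            break
        t += dead_time                       # breaker stays open, then recloses
        events.append((round(t, 3), "reclose"))
    return events


# Recloser with three reclosing attempts (as observed on CB-C in Case 2(b)).
recloser_events = reclosing_sequence(1.0, 0.2, 1.0, reclose_attempts=3)
# Relay with a single reclosing operation (as described for Relay A1).
feeder_events = reclosing_sequence(1.0, 0.1, 1.0, reclose_attempts=1)
```

Under these assumptions a device with three reclosing attempts trips four times before lockout, while a device with one reclosing operation trips twice.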
Fig. 30 Voltage response of SG for a new condition of simulation—Case 2(b)
4.3 Case 3 (a): Different Number of PVs Connected
Case 3 was simulated to assess the impact that the number of PVs in operation could have on the stability of the synchronous generator, considering the following operational conditions: four PVs, two PVs, and zero PVs. The results of these simulations complement those of Case 2, since the same fault is applied at node 860 and only the number of PVs in service is changed. First, the results showed some changes in the voltage profile. Although the PVs contribute only active power under normal conditions, that is, they operate with a power factor equal to 1.00, the connection of inverter-based DGs increased the voltage at most of the nodes along the feeder. This change can be seen in Fig. 31, which presents the positive-sequence voltage at all nodes for four and zero PVs in the pre-event condition. At node 864, for instance, the voltage rose from 0.944 to 0.987 pu with the introduction of four PVs to the grid. Such results indicate that PVs can play an important role in helping to control the system voltage. The SG responses showed no significant difference across the scenarios, which can be explained by the fast operation of the SG protective relay, which disconnects the generator, and by the consequent similarity of the events that follow. However, the voltage response presented some peculiarities that merit further exploration. Figure 32 shows the voltage response for different numbers of PVs connected to the system. The voltage at the SG reaches a higher transient value in the case without PVs, which can be explained by the different operating point before the fault. It is important to highlight that the opening of CB-B results
Fig. 31 Voltage profile of 34-node feeder on a pre-event condition—Case 3(a)
Fig. 32 Voltage response for different number of PVs operating—Case 3(a)
in overvoltages, which may trip the specific generator protection, not considered in this work. Comparing the results for different numbers of PVs, a higher number of PVs connected to the network decreased the magnitude of the deviation in the transient voltage response. Moreover, the voltage took less time to stabilize than when no PV was connected to the system.
Fig. 33 SG voltage for a different protection scheme—Case 3(b)
Case 3 (b): SG Protection Device Disabled
The procedure performed in Case 2(b) was replicated for Case 3(b), in which the generator protection device (Relay B1) is disabled. Figure 33 shows the voltage response for this condition, with a zoom on the interval between 0 and 2 s, the period in which the fault and the recloser opening occur. During the short circuit, the responses are quite similar; however, after the disconnection of the faulted branch, the magnitude of the overvoltage at the synchronous generator differs, being smaller when all PVs are in operation. Regarding the behavior of the PVs during the events, Fig. 34 shows the active and reactive power injected into the system by PV3 (located upstream of the recloser), considering all PVs in operation. During the fault, while the recloser was still closed, the PV contributed reactive power, since the voltage in the system tended to drop. Once the recloser opened the circuit breaker, the PV reactive power contribution became zero, and the generation of active power increased.
4.4 Case 4: CB-A Opening and Grid Islanding
In this case, PVs 1 and 3 were turned off, while PVs 2 and 4 remained in operation. Additionally, the synchronous generator dispatch was changed to provide a total of 1,122 kW to the grid, so that the power drawn by the distribution system from the substation was approximately zero. At 1.00 s of the simulation, CB-A
Fig. 34 Active and reactive power supplied by PV3—Case 3(b)
Fig. 35 Circuit breaker status during islanding—Case 4
was opened, leading the system to an islanded condition. Figure 35 shows the exact moment of CB-A opening, as well as the state of the other circuit breakers. The active power behavior of the SG is presented in Fig. 36, which shows a small deviation from the pre-islanding operating point. The frequency response, presented in Fig. 37, shows a negligible frequency deviation. The frequency quickly stabilized at 60 Hz, showing
Fig. 36 Active power during islanding—Case 4
Fig. 37 Frequency during the islanding—Case 4
that, for this operating point, the system could operate in an islanded condition. None of the enabled protection functions tripped in this event, so other protection schemes must be evaluated in order to avoid islanding. Table 16 presents the analysis of the criteria for this test case and shows that no violations resulted.
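Table 16 Analysis of the criteria, Case 4

| Variable | Lower limit | Status^a | Upper limit | Status |
|---|---|---|---|---|
| ΔP (kW) | −1315.5 | ✓ | 3559.5 | ✓ |
| VRT (pu) | 0.9 | ✓ | 1.1 | ✓ |
| VRP (pu) | 0.93 | ✓ | 1.05 | ✓ |
| FRT (Hz) | 56.5 | ✓ | 66 | ✓ |
| FRP (Hz) | 59.9 | ✓ | 60.1 | ✓ |

^a The symbol ✓ indicates that no violations resulted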
Fig. 38 Circuit breaker status during fault on bus 800—5(a)
4.5 Case 5 (a): Fault at Node 800
In Case 5, a single phase-to-ground fault was applied at node 800 at 1.0 s of simulation, with all PVs connected to the system and the SG generating 1500 kW. In this case, Relay A4 (directional phase overcurrent relay) tripped first, at 1.06 s, opening CB-A and islanding the grid from the substation. Although the fault was isolated, Relay B1 (SG relay) tripped very quickly on the 81R protection (rate of change of frequency) at 1.201 s, causing the system to lose the SG and, consequently, all the PVs. As the relays close to the substation have one reclosing operation enabled, CB-A was closed again at 2.11 s. However, the fault is permanent, so at 2.17 s a trip command is sent to the circuit breaker, which then opens permanently. Fig. 38 presents the status of each circuit breaker. Due to the voltage sag caused by the short circuit, the SG increased its reactive power generation in order to contribute to voltage recovery. Figure 39 illustrates the behavior of the field voltage, showing that, as expected, the AVR increases the field
Fig. 39 Field voltage of generator limited by $E_{fd\max}$, Case 5(a)
voltage, limited by the parameter $E_{fd\max}$, which is the maximum field voltage allowed for the generator.
Case 5 (b): SG Protection Device Disabled
In this case, the fault at node 800 (Case 5) was simulated with the SG protection disabled. The results show that the substation protection tripped, as illustrated in Fig. 40. Fig. 41 presents the active power of the SG, which deviates significantly from its value before the event and may therefore cause torsional stress. According to [32], torsional stress can be evaluated by the difference between the active power generated immediately before and immediately after a contingency, which should not exceed 0.5 p.u. of the generator nominal power in order to safeguard the shaft of the turbine-generator system. The limits for this scenario are included in Fig. 41. In this case, the power exceeded the recommendation, which can result in harmful torsional effects. Thus, it is important to properly evaluate the dynamic behavior and protection systems of synchronous distributed generators in order to avoid damage. The analysis of the criteria for this test case is presented in Table 17, which shows that violations resulted.
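The ΔP limits listed in the criteria tables appear consistent with the pre-event active power plus or minus 0.5 p.u. of the generator rating. The short sketch below reproduces them under the assumption of a 4875 kVA rating; that rating is inferred here from the tabulated limits and is not quoted in this section, and the function name is hypothetical.

```python
def torsional_limits(p_pre_kw, s_rated_kva, max_step_pu=0.5):
    """Return (lower, upper) bounds in kW for the post-event active power."""
    margin = max_step_pu * s_rated_kva
    return p_pre_kw - margin, p_pre_kw + margin


S_RATED_KVA = 4875.0  # assumption: inferred from the tabulated limits

# Case 5: SG generating 1500 kW before the fault.
case_5_limits = torsional_limits(1500.0, S_RATED_KVA)
# Case 4: SG dispatch changed to 1,122 kW before islanding.
case_4_limits = torsional_limits(1122.0, S_RATED_KVA)
```

The two results match the ΔP rows of Table 17 (−937.5 and 3937.5 kW) and Table 16 (−1315.5 and 3559.5 kW), which supports reading the limits as bounds on the post-event active power.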
Fig. 40 Circuit breaker status under new condition of simulation—Case 5(b)
Fig. 41 Active power during the dynamic simulation—Case 5(b)
5 Conclusions
This chapter presented an analysis of the transient stability and protection system of an unbalanced distribution system with DERs. Through dynamic simulations with DIgSILENT PowerFactory, five different cases were evaluated in order to contribute to the understanding of the impacts that DER connection may have on the PDS, as well as the effect of protection devices on system stability.
Table 17 Analysis of the criteria for Case 5(b)

| Variable | Lower limit | Upper limit |
|---|---|---|
| ΔP (kW) | −937.5 | 3937.5 |
| VRT (pu) | 0.9 | 1.1 |
| VRP (pu) | 0.93 | 1.05 |
| FRT (Hz) | 56.5 | 66 |
| FRP (Hz) | 59.9 | 60.1 |

Status marks as extracted: one X in the lower-limit status column and two X in the upper-limit status column; the rows of the X marks are not recoverable.
^a The symbol ✓ indicates that no violations resulted
To that end, a synchronous distributed generator and four PVs were added to a modified version of the IEEE 34-node system, and the operational limits of the SG were analyzed in the test cases. The simulations showed that the number of PVs connected to the distribution network has a direct impact on the dynamic behavior of the system, especially on the voltage and frequency response of the SG, and that it also affects the voltage profile along the feeder. During the disturbance, with the larger number of DG units connected (a total of four PVs), the SG voltage reached close to 1.1 pu, while with no PVs the magnitude was about 1.16 pu. Moreover, the SG needed 2 s less to stabilize when all the PVs were connected to the system. It was also possible to verify the behavior of PV3 during the recloser operation and under a fault event. The results showed that, when faced with a short circuit, the PV contributed reactive power, injecting 108.2 kvar into the grid, mainly because of the voltage dip along the nodes. When the recloser opened, that is, when the faulted branch was disconnected from the main feeder, the PV began to contribute only active power (300 kW), with the power factor varying from 0.916 to 1.00. Therefore, PV behavior may change according to the operational conditions of the system. In other words, although the PV generates only active power under normal operating conditions, during a short-circuit event the inverter-based DG changes its operating mode and provides reactive power to the grid, as evidenced by the variation of the power factor. The implementation of protection devices also influences the dynamic behavior of the system, which may vary according to the criteria and the level of sensitivity adopted in the protection devices.
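As a quick plausibility check on the PV3 figures quoted above, the power factor can be computed from active and reactive power, and the relation can be inverted to find the active power implied by a given power factor. The sketch assumes that the reported 0.916 power factor and the 108.2 kvar injection refer to the same instant during the fault; the implied active power is derived here and is not a value reported in the chapter.

```python
import math


def power_factor(p_kw, q_kvar):
    """Power factor magnitude from active and reactive power."""
    s_kva = math.hypot(p_kw, q_kvar)
    return abs(p_kw) / s_kva if s_kva > 0.0 else 0.0


def active_power_from_pf(q_kvar, pf):
    """Active power (kW) implied by a reactive power and a power factor < 1."""
    return abs(q_kvar) * pf / math.sqrt(1.0 - pf ** 2)


# After the recloser opens: 300 kW and no reactive power (values from the text).
pf_after = power_factor(300.0, 0.0)
# During the fault: 108.2 kvar at a power factor of 0.916 (values from the text).
p_during_fault_kw = active_power_from_pf(108.2, 0.916)
```

Under that assumption the active power during the fault comes out at roughly 247 kW, which is consistent with the observation that active power generation increased once the recloser opened.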
In a case with more sensitive settings, that is, with less tolerance to disturbances, the protection acts quickly, preventing excessive torsional stress and, above all, guaranteeing the integrity of the machines. Conversely, less sensitive settings can expose the system to instability and cause damage to the synchronous generator. The islanding of the feeder was simulated by opening the circuit breaker at the substation. The number of PVs and the SG dispatch were chosen so that the power drawn by the distribution feeder from the substation was approximately equal to zero, which represents a favorable condition for islanding. The results demonstrate that none of the enabled protection functions tripped in this event; thus, other protection schemes must be studied to avoid islanding. Under
the aspect of operational limits, the results showed that the SG suffered almost negligible oscillations from the circuit breaker switching, with a variation of less than 1%. Lastly, the results of this chapter should be useful for distribution networks and distribution system operators, since the case studies are similar to events that occur in real networks and show their consequences, as well as for students who wish to better understand the stability and protection fields.
6 Future Works
Possible topics to be addressed in future work are the following:
• Expand the stability analysis studies considering the use of other controls (voltage and governor), and compare the responses.
• Evaluate the impact of connecting a wind farm (wind power plant) to the test network.
• Consider more than one synchronous generator in the simulations.
• Consider different grid code standards.
• Apply other types of events, such as switching a large load or a block of loads.
• Consider both PV and wind plants (a more complex view of DERs).
• Design a voltage stability index applicable to distribution systems.
• Consider regional stability.
References
1. M.E. El-Hawary, Introduction to Electrical Power Systems (John Wiley & Sons, Hoboken, 2008)
2. T.A. Short, Electric Power Distribution Handbook (CRC Press, Boca Raton, 2014)
3. E.R. Office, Distributed energy resources: impacts on energy planning studies. Empresa de Pesquisa Energética. Technical Report (2018)
4. G. Pepermans, J. Driesen, D. Haeseldonckx, R. Belmans, W. D’haeseleer, Distributed generation: definition, benefits and issues. Energy Policy 787–798 (2005)
5. N. Jenkins, Distributed generation, ser. Energy Engineering. Institution of Engineering and Technology (2010). https://digital-library.theiet.org/content/books/po/pbrn001e
6. Union for the Coordination of Transmission of Electricity (UCTE), Final Report: system disturbance on 4 November 2006. Union for the Coordination of Transmission of Electricity (UCTE). Technical Report (2006)
7. W. El-Khattam, M. Salama, Distributed generation technologies, definitions and benefits. Electr. Power Syst. Res. 71(2), 119–128 (2004)
8. W. Freitas, J.C.M. Vieira Jr., A.M. Franca, L.C.P.d. Silva, V.F.d. Costa, Análise comparativa entre geradores síncronos e geradores de indução com rotor tipo gaiola de esquilo para aplicação em geração distribuída. Sba: Controle & Automação Sociedade Brasileira de Automatica 16, 332–344 (2005)
9. S. Granville, P. Lino, F. Ralston, L.A. Barroso, M. Pereira, Recent advances of sugarcane biomass cogeneration in Brazil, in 2009 IEEE Power Energy Society General Meeting (2009), pp. 1–5
10. DIgSILENT, DIgSILENT User Manual (2020). https://www.digsilent.de/en/. Accessed 10 Mar 2020
11. N.G. Bretas, L.F.C. Alberto, Estabilidade transitória em sistema eletroenergéticos. EESC/USP (2000)
12. P. Kundur, N.J. Balu, M.G. Lauby, Power System Stability and Control, vol. 7 (McGraw-Hill, New York, 1994)
13. J.J. Grainger, W.D. Stevenson Jr., Power System Analysis (McGraw-Hill, New York, 1994)
14. R. Kuiava, Projeto de controladores para o amortecimento de oscilações em sistemas elétricos com geração distribuída. Ph.D. Dissertation, Universidade de São Paulo (2010)
15. P.M. Anderson, A.A. Fouad, Power System Control and Stability (Wiley-IEEE Press, Piscataway, 2002)
16. W.D. Stevenson, Elementos de análise de sistemas de potência (McGraw-Hill, New York, 1974)
17. W. Freitas, J.C. Vieira, A. Morelato, W. Xu, Influence of excitation system control modes on the allowable penetration level of distributed synchronous generators. IEEE Trans. Energy Convers. 20(2), 474–480 (2005)
18. M. Eremia, M. Shahidehpour, Handbook of Electrical Power System Dynamics: Modeling, Stability, and Control (John Wiley & Sons, Hoboken, 2013)
19. IEEE, IEEE standard definitions for excitation systems for synchronous machines, in IEEE Std 421.1-2007 (Revision of IEEE Std 421.1-1986) (2007), pp. 1–33
20. Woodward, Speed droop and power generation, in Application Note 01302 (Revision NEW) (1991), pp. 1–10
21. J. Mamede Filho, D.R. Mamede, Proteção de sistemas elétricos de potência (LTC, Rio de Janeiro, 2011)
22. G. Kindermann, Proteção de Sistemas Elétricos de Potência, vol. 1 (UFSC - EEL - LABPLAN, Florianópolis, 2005)
23. ANEEL - National Agency of Electric Energy, Procedimentos de Distribuição de Energia Elétrica no Sistema Elétrico Nacional - PRODIST: Módulo 8 - Qualidade da Energia Elétrica (2021), pp. 1–88. https://www.aneel.gov.br/documents/656827/14866914/Módulo_8-Revisão_12/342ff02a-8eab-2480-a135-e31ed2d7db47
24. J.C.M. Vieira, W. Freitas, Z. Huang, W. Xu, A. Morelato, Formulas for predicting the dynamic performance of ROCOF relays for embedded generation applications. IEE Proc. Gen. Trans. Distrib. 153(4), 399–406 (2006)
25. C. Mardegan, Proteção e Seletividade em sistemas elétricos industriais (Atitude Editorial Ltd., São Paulo, 2012)
26. W.H. Kersting, Radial distribution test feeders. IEEE Trans. Power Syst. 6(3), 975–985 (1991)
27. IEEE, IEEE recommended practice for excitation system models for power system stability studies, in IEEE Std 421.5-2016 (Revision of IEEE Std 421.5-2005) (2016), pp. 1–207
28. M. Resener, Avaliação do impacto dos controladores de excitação na estabilidade transitória de geradores síncronos conectados em sistemas de distribuição. Master’s Thesis (2011)
29. P. Pourbeik, G. Chown, J. Feltes, F. Modau, S. Sterpu, R. Boyer, K. Chan, L. Hannett, D. Leonard, L. Lima, W. Hofbauer, L. Gerin-Lajoie, S. Patterson, J. Undrill, F. Langenbacher, Dynamic models for turbine-governors in power system studies. Technical Report. PES-TR77 (2013)
30. CEMIG, Requisitos Para Conexão de Acessantes Produtores de Energia Elétrica ao Sistema de Distribuição da Cemig D - Média Tensão (2020). https://novoportal.cemig.com.br/wp-content/uploads/2020/07/ND.5.31.pdf/
31. J.C. de Melo Vieira Jr., Metodologias para ajuste e avaliação do desempenho de relés de proteção anti-ilhamento de geradores síncronos distribuídos. Master’s Thesis. Unicamp, Campinas (2006)
32. IEEE, IEEE screening guide for planned steady-state switching operations to minimize harmful effects on steam turbine-generators. IEEE Trans. Power Apparatus Syst. PAS-99(4), 1519–1521 (1980)
Fuzzy Logic Control for Motor Drive Performance Improvement in EV Applications
Minh C. Ta, Binh-Minh Nguyen, and Thanh Vo-Duy
1 Introduction
The purpose of this chapter is to discuss an intelligent control technique based on fuzzy logic to improve the performance of motor drives in electric vehicle (EV) applications. The fundamental and most distinguishing feature of an EV compared to an internal combustion engine (ICE) car is the use of an electric motor as “horse power,” which has clear advantages over its counterpart [1]:
• The torque response of the electric motor is very accurate and quick, approximately 10 to 100 times faster than that of an ICE.
• As the electromagnetic torque is proportional to the motor current, the developed torque can be calculated easily for control purposes.
• The use of an electric motor eliminates many mechanical parts and enables various configurations, including attaching a motor to each wheel, which allow
M. C. Ta
e-TESC Lab., University of Sherbrooke, Sherbrooke, QC, Canada
CTI Lab. for EVs, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
B.-M. Nguyen
Department of Advanced Science and Technology, Toyota Technological Institute, Nagoya, Japan
T. Vo-Duy
CTI Lab. for EVs, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. J. Blondin et al. (eds.), Intelligent Control and Smart Energy Management, Springer Optimization and Its Applications 181, https://doi.org/10.1007/978-3-030-84474-5_13
M. C. Ta et al.
flexible and advanced control algorithms to make car operation safer, more comfortable, and more intelligent.
Although several kinds of electric motors can be utilized, recent battery electric vehicles (EVs) and hybrid electric vehicles (HEVs) primarily employ two types of motors: induction motors (IMs) and interior permanent magnet (IPM) synchronous motors. The control of these two motors is addressed in this chapter. In the traction architecture, the motor drive is placed between the power electronic control layer (inner) and the motion control layer (outer). Motor drive control consists of current control, flux control, and motor angular speed control. In regular driving, the “speed controller” is the driver, who presses or releases the acceleration pedal to give the torque command signal to the motor drive in order to attain the desired speed or travel trajectory. In advanced EV control configurations, speed control is the inner loop of the motion control, for example, in the slip ratio control in [2], which enhances the safety and comfort of the driver and passengers. In other applications, the speed controller reproduces how a human behaves during real-world driving exercises, leading to a closer representation of how the final version of the vehicle will perform [3]. The design of the speed controller requires special care, given that the environment interacts with the vehicle body through road characteristics (surface adhesion, road slope), wind, and so on. Besides, there are uncertainties in the system model caused by unknown or imprecise parameters during operation. All these disturbances and plant uncertainties directly affect the performance of the speed controller, especially when it is designed by a model-based approach, such as proportional-integral-derivative (PID) control.
Fuzzy logic (FL), which was first introduced by Zadeh in 1965 [4] as one class of artificial intelligence (AI), has been successfully applied to motor drives in industrial applications, transportation systems, aerospace, household appliances, etc. Fuzzy logic controllers (FLCs) for the speed control loop have been shown to have superior dynamic performance over PID speed controllers. An FLC is nonlinear by nature and has flexible control gains that can deal with poorly known models and disturbances [5]. In this chapter, the application of FLC in an EV is extended by showing how it can control an IM or an IPM of any size and any configuration, with only minimal fine-tuning between different vehicle models. After a review of recent applications of FLC to electric motor drives and EVs in Sect. 2, the modeling of an EV is presented in Sect. 3, followed by the dynamic models of the IM and IPM, as well as the vector control of the drives, in Sect. 4. The general procedure for the design of an FLC is given in Sect. 5. The approach is applied in Sect. 6 to design an FL speed controller for an off-road EV. A comparative study with PI control is also provided. The EV has been adapted as an on-road laboratory platform for advanced electric and hybrid vehicle research at e-TESC Lab. of the University of Sherbrooke in Canada. The controlled system is tested by simulation in Sect. 7 with various scenarios for performance analysis and evaluation. The flexibility
Fuzzy Logic Control for Motor Drive in EVs
of FLC is also demonstrated by employing the same FLC on another platform, the Mitsubishi i-MiEV at CTI Laboratory for EVs, Hanoi University of Science and Technology in Vietnam. A summary of the FLC design and its prospects is given in the conclusion of this chapter.
2 Review on Recent Applications of FL to Motor Drives and EV Systems
Since its first engineering application was reported in 1975, fuzzy logic has been successfully used in numerous fields such as control systems engineering, image processing, power engineering, industrial automation, robotics, consumer electronics, and optimization. In the area of electric motor drives in particular, extensive research has been reported since the 1990s. Applications include speed control of DC and AC drives, parameter estimation, diagnostics, and so on, for use in industrial processes, robotics, railway systems, photovoltaic (PV) renewable energy systems, etc. In the EV/HEV domain, as the vehicle powertrain is an electric motor drive, the experience in FLC design for industrial applications can be directly exploited. The fuzzy logic approach is especially suitable for EVs, considering the high level of uncertainties and disturbances, such as road condition (tire-road friction, slope, slip/skid phenomena), wind, payload variation, etc. In this section, a brief review of the recent literature on FL in motor drives and EV applications is carried out. FLC and adaptive FLC were used to improve the performance of induction motor drives [6, 7]. Experimental results with a large variation of the moment of inertia (as much as five times) showed that the robustness of the speed control system was greatly improved compared to that of conventional PI control. The principle of FL was also utilized to estimate the rotor resistance, a parameter that strongly influences the accuracy of indirect vector control. FL is also used in a stator resistance observer of an IM in [8]. The current error is the input of the FLC process, which decides the stator resistance increment. The estimated stator resistance is shown to be acceptable in both simulation and experimental results. FL can be used in the direct torque control configuration [9, 10]. Lai and Lin [9] presented a direct torque control based on a hybrid FLC for an IM drive.
The proposed technique alternates between a PI controller and an FLC by a simple switching mechanism, which uses the speed error as the threshold value. The PI controller works in steady state, while the FLC is selected in the transient state to provide fast response and low overshoot. In comparison with other approaches, the hybrid fuzzy logic controller shows more robust performance and the lowest root-mean-squared value of the speed error. In [10], a conventional PI or PID controller is replaced by a direct fuzzy logic controller for the torque regulation of an IM. The FLC requires the torque error and the change in torque error as two inputs to calculate the incremental current command. It can be seen from the experimental results that this
advanced method provides more accurate estimation of the torque and the flux than other techniques. Sensorless speed vector control of an IM using model reference adaptive systems (MRAS) based on rotor flux is discussed in [11]. The conventional MRAS-PI speed estimator is replaced by two proposed controllers: sliding mode control is employed to improve stability and dynamic response, while the FL technique minimizes the speed error signal. In both open-loop and closed-loop operation modes, the MRAS-FLC performs best under transient conditions and when the load torque disturbance is removed. Liu et al. [12] introduce an expert controller based on fuzzy logic to improve the performance of the current controllers for IMs in the field-weakening region. In high-speed flux weakening of an IM, the expert controller treats the reference d-axis current $i_{ds}^{*}$ from the flux-weakening (FW) strategy and the reference q-axis current $i_{qs}^{*}$ from the speed regulator to handle the current change according to two requirements: reasonable current tuning and a limited current margin in FW. Compared with traditional current controllers, the command current pre-treated by the fuzzy inference expert controller follows an admissible trajectory for the current regulators. A review of the different switching techniques and switching patterns of voltage space vectors, along with artificial intelligence controllers such as artificial neural networks, adaptive neural fuzzy inference systems, and fuzzy logic control, is given in [13].

Besides motor drives and power electronic control, FL has found many other energy-related applications in EV and HEV systems. Trovao et al. proposed an FLC as an energy management algorithm for multisource energy storage systems (MESSs) containing batteries and supercapacitors (SCs). 
One input of the FLC is the ratio between the power demanded by the powertrain and the rated power that the batteries can deliver to the DC link; the other input is the state of charge (SoC) of the supercapacitors. The FLC outputs the gain that decides the distribution ratio of the supercapacitor current in the MESS [14, 15]. The coupled energy management algorithm, based on FLC and a filtering technique in the inner control layer, can extend the battery lifetime by reducing the root-mean-square value of the battery current by 12% in comparison with a battery-only architecture. The proposed energy management system (EMS), which is equivalent to an energy- and power-split management strategy, can enhance the stability of the motor drive DC voltage [15] and has been tested on several EV configurations, including a three-wheel electric vehicle powered by a battery-capacitor energy storage system [16]. In [17], the decision-making property of fuzzy logic is used to build the driving map of an HEV according to driving conditions. An HEV city bus for shuttle service, equipped with the proposed fuzzy logic-based driving strategy, was built and tested on a real service route. Compared with the conventional deterministic table-based strategy, it showed reduced NOx emissions and better charge balance without an extra battery charger. Regarding antilock braking systems of electric vehicles, a wheel slip controller based on the fuzzy logic technique is proposed by Khatun in [18]. The
torque demand is computed from the slip ratio and the load torque. Over the 5 s examined period, the proposed fuzzy logic controller demonstrates improved longitudinal performance even on an icy road. Power optimization can also be achieved by using FLC [19]: the paper proposes an algorithm that reduces the core losses of the induction motor, thus improving the efficiency of the driving system for electric vehicles. Besides energy management and traction systems, the FL approach has been applied to many automotive topics such as driver assistance systems, vehicle dynamic control and ride comfort, and estimation of battery performance and battery changing. A good list of references can be found in [20].
3 Vehicle Modeling

There exist various types of EV prototypes, such as the three-wheel EV [16], six-wheel EV [21], and eight-wheel EV driven by in-wheel motors [22]. As illustrated in Fig. 1, this chapter investigates one of the most popular configurations, the four-wheel EV with front-wheel drive. The AC motor M (an IM or IPM motor in this chapter) converts the electric power received from the storage system into mechanical power in the form of the electromagnetic torque Tm on the motor shaft and the rotor (mechanical) rotation speed Ωm. The gearbox GB increases the torque generated by the motor by the gear ratio kgear. The torque is transmitted to the axle of the two front wheels, while the two rear wheels rotate freely under the effect of road friction. The rotational motions of the motor and the wheels are expressed as

$$J_m \frac{d\Omega_m}{dt} = T_m - \frac{T_{d,1}}{k_{gear}} - \frac{T_{d,2}}{k_{gear}} \tag{1}$$

$$J_w \frac{d\Omega_{w,i}}{dt} = T_{d,i} - R_{wh} F_{d,i}, \quad i = 1, 2 \tag{2}$$
where Jm and Jw are the motor inertia and wheel inertia, respectively. The wheel has the radius Rwh, the rotational speed Ωw,i, and the driving force Fd,i. The interaction between the motor and the wheel is represented by the drive shaft torque Td,i. Let Kd be the torsional stiffness of the drive shaft; Td,i is then expressed as

$$T_{d,i} = K_d \int \left( \frac{\Omega_m}{k_{gear}} - \Omega_{w,i} \right) dt, \quad i = 1, 2 \tag{3}$$

Let vveh be the longitudinal speed of the vehicle. The difference between the vehicle speed and the wheel speed is specified by the slip ratio

$$\lambda_i = \frac{R_{wh} \Omega_{w,i} - v_{veh}}{\max\left( R_{wh} \Omega_{w,i},\ v_{veh},\ \varepsilon \right)} \tag{4}$$
Fig. 1 Electric vehicle and its traction system
where ε is a small positive number to avoid division by zero. As can be seen from Fig. 1, there exists a nonlinear relationship between the driving force and the slip ratio. This relationship is commonly described by the “magic formula” [23]

$$F_{d,i} = \begin{cases} f_i(\lambda_i) & \text{if } \lambda_i \ge 0 \\ -f_i(\lambda_i) & \text{if } \lambda_i < 0 \end{cases} \tag{5}$$

where

$$f_i(\lambda_i) = A_i \sin\left( B_i \tan^{-1}\left[ C_i \lambda_i - D_i \left( C_i \lambda_i - \tan^{-1}(C_i \lambda_i) \right) \right] \right) \tag{6}$$

where Ai = μi Fz,i, Fz,i is the vertical load of wheel i, μi is the friction coefficient, and Bi, Ci, and Di are the shape factors. Summing all driving forces and resistance forces acting on the vehicle body of mass Mveh, the longitudinal motion of the vehicle is given by the following equation:

$$M_{veh} \frac{dv_{veh}}{dt} = \sum_{i=1}^{4} F_{d,i} - F_{res} \tag{7}$$
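As a minimal sketch of how Eqs. (4) and (6) can be evaluated numerically, the following Python functions compute the slip ratio and the magic-formula shape function. The function names and all numerical values (radius, friction coefficient, vertical load, shape factors) are illustrative assumptions, not data from this chapter.

```python
import math

def slip_ratio(r_wh, omega_w, v_veh, eps=1e-6):
    """Slip ratio of Eq. (4); eps avoids division by zero at standstill."""
    wheel_speed = r_wh * omega_w
    return (wheel_speed - v_veh) / max(wheel_speed, v_veh, eps)

def magic_formula(lam, mu, f_z, b, c, d):
    """Shape function f_i of Eq. (6), with A_i = mu_i * F_z,i."""
    a = mu * f_z
    x = c * lam
    return a * math.sin(b * math.atan(x - d * (x - math.atan(x))))

# Hypothetical operating point: wheel surface speed 2 m/s, vehicle speed 1 m/s
lam = slip_ratio(r_wh=0.5, omega_w=4.0, v_veh=1.0)
# Hypothetical tire/road data (mu, F_z in newtons, shape factors B, C, D)
force = magic_formula(lam, mu=0.8, f_z=2000.0, b=1.5, c=10.0, d=0.5)
```

Because the sine is bounded, the magnitude of `force` can never exceed A_i = μ_i F_z,i, which is the peak friction force the road can transmit.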
The resistance forces acting on the vehicle include the air resistance, the rolling resistance, and the gravity component in the direction of travel when the vehicle is climbing:

$$F_{res} = f_{roll} M_{veh} g \cos(\alpha) + \frac{1}{2} \rho C_d A_f \left( v_{veh} + v_{wind} \right)^2 + M_{veh} g \sin(\alpha) \tag{8}$$
where froll is the rolling resistance coefficient, g is the acceleration of gravity, α is the incline angle of the road, ρ is the mass density of the air, Cd is the aerodynamic drag coefficient, Af is the equivalent frontal area of the vehicle, and vwind is the wind speed, which is positive if the wind resists the forward motion of the vehicle and negative if the wind pushes the vehicle forward.

To conclude this section, note that Eqs. (1)–(8) describe a complex nonlinear system with several uncertainties. Common uncertainties are introduced by the vehicle mass, the wind force, and the road incline angle. Moreover, the road condition might change frequently during operation, so the road friction coefficient and the shape factors of the magic formula are actually time-varying parameters. The torsional characteristics of the shaft might also introduce vibration into the traction system [24]. For these reasons, model-based approaches such as pole placement, H-infinity-based robust control, and the linear quadratic regulator (LQR) increase the complexity and burden of system design. To extend the application of EVs, automotive engineers should pay attention to design approaches that are practically simple yet effective. This motivates the application of fuzzy logic control.
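The resistance force of Eq. (8) and the longitudinal dynamics of Eq. (7) can be sketched as follows. This is only an illustration: the explicit Euler step, the function names, and the value g = 9.81 m/s² are assumptions, while the vehicle data in the example are the eCommander values listed in Table 1 of this chapter (with the lower bound of the rolling resistance range).

```python
import math

def resistance_force(v_veh, m_veh, f_roll, rho, c_d, a_f,
                     alpha=0.0, v_wind=0.0, g=9.81):
    """Eq. (8): rolling, aerodynamic, and grade resistance in newtons."""
    rolling = f_roll * m_veh * g * math.cos(alpha)
    aero = 0.5 * rho * c_d * a_f * (v_veh + v_wind) ** 2
    grade = m_veh * g * math.sin(alpha)
    return rolling + aero + grade

def vehicle_speed_step(v_veh, drive_forces, f_res, m_veh, dt):
    """One explicit Euler step of Eq. (7) (integration scheme is an assumption)."""
    return v_veh + dt * (sum(drive_forces) - f_res) / m_veh

# Flat road, no wind, 10 m/s, Table 1 data with f_roll = 0.02
f_res = resistance_force(10.0, m_veh=871.0, f_roll=0.02,
                         rho=1.223, c_d=0.65, a_f=2.0)
# Hypothetical driving forces of 200 N on each driven wheel
v_next = vehicle_speed_step(10.0, [200.0, 200.0], f_res, m_veh=871.0, dt=0.1)
```

With these numbers the rolling term is about 171 N and the aerodynamic term about 79 N, so the vehicle accelerates slightly under the assumed 400 N of total driving force.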
4 Modeling and Vector Control of AC Motor Drives

Using the vector control principle, an AC motor can be analyzed and controlled like a separately excited DC motor for high-performance applications. The fundamental principle of vector control (VC) is to model the motor in the d-q synchronously rotating frame and to control the torque-generating component of the stator current in the q-axis while maintaining or adjusting the flux-related component of the stator current in the d-axis. First introduced by Hasse in 1969 [25] and Blaschke in 1972 [26], vector control has, thanks to the development of power switches and sophisticated microprocessors, been the standard in industry and other application fields for several decades. The vector control principle was initially developed for the IM. The method was then applied directly to synchronous motors (SM), including PM synchronous motors. In the following, VC is presented for the IM and then used for the IPM motor.
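4.1 Referential Transformation

The transformation between the three-phase stationary reference frame a-b-c and the synchronous frame d-q is realized by the Park transformation:

$$\begin{bmatrix} d \\ q \\ 0 \end{bmatrix} = \frac{2}{3} \begin{bmatrix} \cos\theta_e & \cos\left(\theta_e - \frac{2\pi}{3}\right) & \cos\left(\theta_e + \frac{2\pi}{3}\right) \\ -\sin\theta_e & -\sin\left(\theta_e - \frac{2\pi}{3}\right) & -\sin\left(\theta_e + \frac{2\pi}{3}\right) \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} \tag{9}$$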
In the above equation, d, q, and 0 represent the components in the d-axis, q-axis, and zero sequence, respectively; a, b, and c represent the components in the three-phase stationary reference frame; θe is the rotor electrical position, with θe = ωe t; and ωe is the synchronous (electrical) angular speed. When the zero-sequence component is zero, the matrix of the direct Park transformation simplifies to

$$K_{Park} = \frac{2}{3} \begin{bmatrix} \cos\theta_e & \cos\left(\theta_e - \frac{2\pi}{3}\right) & \cos\left(\theta_e + \frac{2\pi}{3}\right) \\ -\sin\theta_e & -\sin\left(\theta_e - \frac{2\pi}{3}\right) & -\sin\left(\theta_e + \frac{2\pi}{3}\right) \end{bmatrix} \tag{10}$$
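The inverse Park transformation matrix $K_{Park}^{-1}$ has the form

$$K_{Park}^{-1} = \begin{bmatrix} \cos\theta_e & -\sin\theta_e \\ \cos\left(\theta_e - \frac{2\pi}{3}\right) & -\sin\left(\theta_e - \frac{2\pi}{3}\right) \\ \cos\left(\theta_e + \frac{2\pi}{3}\right) & -\sin\left(\theta_e + \frac{2\pi}{3}\right) \end{bmatrix} \tag{11}$$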
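The transformations (10) and (11) can be checked numerically with a short sketch. The function names below are illustrative assumptions; the code simply writes out the matrix products with the standard library and assumes a zero zero-sequence component.

```python
import math

TWO_PI_3 = 2.0 * math.pi / 3.0

def park(a, b, c, theta_e):
    """Direct Park transformation of Eq. (10): (a, b, c) -> (d, q)."""
    d = (2.0 / 3.0) * (a * math.cos(theta_e)
                       + b * math.cos(theta_e - TWO_PI_3)
                       + c * math.cos(theta_e + TWO_PI_3))
    q = -(2.0 / 3.0) * (a * math.sin(theta_e)
                        + b * math.sin(theta_e - TWO_PI_3)
                        + c * math.sin(theta_e + TWO_PI_3))
    return d, q

def inverse_park(d, q, theta_e):
    """Inverse Park transformation of Eq. (11): (d, q) -> (a, b, c)."""
    # k = 0, 1, -1 gives the angles theta_e, theta_e - 2pi/3, theta_e + 2pi/3
    return tuple(d * math.cos(theta_e - k * TWO_PI_3)
                 - q * math.sin(theta_e - k * TWO_PI_3)
                 for k in (0, 1, -1))

# Balanced unit-amplitude three-phase set aligned with the d-axis
theta = 0.7
abc = (math.cos(theta), math.cos(theta - TWO_PI_3), math.cos(theta + TWO_PI_3))
d, q = park(*abc, theta)          # d close to 1, q close to 0
abc_back = inverse_park(d, q, theta)
```

A balanced set aligned with the d-axis maps to a constant d component and a zero q component, which is what makes the d-q quantities behave like DC values in steady state.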
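4.2 Modeling of IM in d-q Frame

The electrical dynamic model of the IM can be expressed in the d-q frame as shown in (12):

$$\begin{cases} u_{ds} = R_s i_{ds} + \sigma L_s \dfrac{d i_{ds}}{dt} - \omega_e \sigma L_s i_{qs} \\[2mm] u_{qs} = R_s i_{qs} + \sigma L_s \dfrac{d i_{qs}}{dt} + \omega_e \left( \dfrac{L_m}{L_r} \psi_{dr} + \sigma L_s i_{ds} \right) \end{cases} \tag{12}$$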
where uds, uqs, ids, and iqs denote the voltages and currents in the d-q frame transformed from the a-b-c frame; Ls, Lr, and Lm are the stator, rotor, and magnetizing inductances, respectively; Rs and Rr are the stator and rotor resistances; ψdr is the d-axis rotor flux; and σ is the total leakage factor

$$\sigma = 1 - \frac{L_m^2}{L_s L_r} \tag{13}$$

Denoting by ed and eq the coupling terms

$$\begin{cases} e_d = -\omega_e \sigma L_s i_{qs} \\[1mm] e_q = \omega_e \sigma L_s i_{ds} + \omega_e \dfrac{L_m}{L_r} \psi_{dr} \end{cases} \tag{14}$$

we obtain the compact form

$$\begin{cases} u_{ds} = R_s i_{ds} + \sigma L_s \dfrac{d i_{ds}}{dt} + e_d \\[2mm] u_{qs} = R_s i_{qs} + \sigma L_s \dfrac{d i_{qs}}{dt} + e_q \end{cases} \tag{15}$$
The relation between the synchronously rotating speed ωe and the rotor electrical speed ωm is

$$\omega_e = \omega_m + \omega_{sl}, \qquad \omega_{sl} = \frac{i_{qs}}{\tau_r i_{ds}} \tag{16}$$

where ωsl is the slip frequency and τr is the electrical time constant of the rotor (τr = Lr/Rr). The rotor position θe needed in the Park transformation can be deduced from ωe as

$$\theta_e = \int_0^t \omega_e \, dt \tag{17}$$
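In a digital controller, Eqs. (16) and (17) are typically evaluated once per sampling period. The sketch below assumes a simple rectangular (forward Euler) integration and hypothetical values for the currents, rotor time constant, and sampling time; the function names are also assumptions.

```python
def slip_frequency(i_qs, i_ds, tau_r):
    """Slip frequency of Eq. (16); i_ds must be nonzero."""
    return i_qs / (tau_r * i_ds)

def update_flux_angle(theta_e, omega_m, i_qs, i_ds, tau_r, dt):
    """One rectangular-integration step of Eq. (17); returns (theta_e, omega_e)."""
    omega_e = omega_m + slip_frequency(i_qs, i_ds, tau_r)
    return theta_e + omega_e * dt, omega_e

# Hypothetical operating point: omega_m = 96 rad/s (electrical), tau_r = 0.5 s
theta_e = 0.0
for _ in range(10):                       # 10 steps of 1 ms
    theta_e, omega_e = update_flux_angle(theta_e, 96.0, 10.0, 5.0, 0.5, 1e-3)
```

Here the slip frequency is 10 / (0.5 × 5) = 4 rad/s, so ωe = 100 rad/s and the angle advances by about 1 rad in 10 ms. A practical implementation would also wrap θe to one electrical revolution, which is omitted here. Because τr depends on the rotor resistance, any error in Rr propagates directly into θe, which is the sensitivity of indirect vector control mentioned in the literature review.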
The electromagnetic torque is generated by the cross product of the rotor flux and the stator current and can be expressed in the d-q frame as

$$T_m = \frac{3}{2} p \frac{L_m}{L_r} \left( \psi_{dr} i_{qs} - \psi_{qr} i_{ds} \right) \tag{18}$$

To complete the model of the IM, we include the dynamic equation of the rotating part:

$$T_m - T_l = J_{eq} \frac{d\Omega_m}{dt} = \frac{J_{eq}}{p} \frac{d\omega_m}{dt} \tag{19}$$
where p is the number of pole pairs, Ωm is mechanical rotor speed (i.e., ωm = Ωm p), Jeq is the equivalent moment of inertia, and Tl is the equivalent load torque of the vehicle on the motor shaft. Tl can be calculated from the vehicle model in Sec. 3, using (1)–(3) and (7), (8).
4.3 Vector Control of IM

If the d-axis of the d-q frame is oriented to coincide with the rotor flux vector, the flux component on the q-axis ψqr is zero (hence the name field-oriented control (FOC)), and (18) becomes

$$T_m = \frac{3}{2} p \frac{L_m}{L_r} \psi_{dr} i_{qs} \tag{20}$$

The torque developed by the motor in (20) is proportional to the q-axis current iqs if the flux ψdr is controlled to be constant. This important feature, the basic principle of vector control, shows the analogy between the IM and the separately excited DC motor. The rotor flux can be dynamically estimated by Eq. (21):

$$L_m i_{ds} = \psi_{dr} + \frac{L_r}{R_r} \frac{d\psi_{dr}}{dt} \tag{21}$$

In steady state, the flux is constant and (21) reduces to the linear relation ψdr = Lm ids; therefore, (20) gives

$$T_m = \frac{3}{2} p \frac{L_m^2}{L_r} i_{ds} i_{qs} \tag{22}$$
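The link between Eqs. (20), (21), and (22) can be illustrated with a small simulation. This is a sketch under stated assumptions: the explicit Euler discretization, the function names, and the motor parameters are hypothetical and are not the data of the motor studied later in the chapter.

```python
def flux_step(psi_dr, i_ds, l_m, l_r, r_r, dt):
    """Explicit Euler step of Eq. (21): Lm*ids = psi_dr + (Lr/Rr) dpsi_dr/dt."""
    dpsi_dt = (r_r / l_r) * (l_m * i_ds - psi_dr)
    return psi_dr + dt * dpsi_dt

def torque_foc(psi_dr, i_qs, p, l_m, l_r):
    """Torque under rotor-flux orientation, Eq. (20)."""
    return 1.5 * p * (l_m / l_r) * psi_dr * i_qs

# Hypothetical parameters: Lm = 0.1 H, Lr = 0.105 H, Rr = 1 ohm, p = 2
L_M, L_R, R_R, P = 0.1, 0.105, 1.0, 2
psi = 0.0
for _ in range(2000):                     # 2 s with a 1 ms step (about 19 rotor time constants)
    psi = flux_step(psi, 5.0, L_M, L_R, R_R, 1e-3)

torque_20 = torque_foc(psi, 10.0, P, L_M, L_R)          # Eq. (20) with estimated flux
torque_22 = 1.5 * P * (L_M ** 2 / L_R) * 5.0 * 10.0     # Eq. (22), steady state
```

After the flux has settled to Lm·ids = 0.5 Wb, the two torque values agree, which confirms that (22) is simply (20) evaluated at the steady-state flux.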
4.4 Modeling of IPM Motors in d-q Frame

The model of IPM motors can be obtained in a similar way as for IMs. The electrical part of the IPM motor is represented in the d-q coordinate system as follows:

$$\begin{bmatrix} u_d \\ u_q \end{bmatrix} = \begin{bmatrix} R + L_d s & 0 \\ 0 & R + L_q s \end{bmatrix} \begin{bmatrix} i_d \\ i_q \end{bmatrix} + \begin{bmatrix} -\omega_m L_q i_q \\ \omega_m \left( L_d i_d + \psi_p \right) \end{bmatrix} \tag{23}$$

where

• R is the winding resistance
• Ld and Lq are the stator winding inductances in the d and q axes, respectively
• ψp is the flux generated by the permanent magnet
Fig. 2 Block diagram of the structure of the IPM motor model on the d-q coordinate system
The IPM motor model in the d-q frame is illustrated in Fig. 2. The interaction between the d and q axes in the electrical part of the motor model appears through the components

$$e_d = -\omega_m L_q i_q, \qquad e_q = \omega_m \left( L_d i_d + \psi_p \right) \tag{24}$$

Equation (24) is equivalent to (14) in the case of the IM. Eliminating the coupling between the d and q axes is one of the important problems in designing the motor controller in order to achieve a good response of the motor torque to the torque demand. The motor torque is given by

$$T_m = \frac{3}{2} p \left[ \psi_p i_q + \left( L_d - L_q \right) i_d i_q \right] \tag{25}$$
Equation (25) shows that the motor torque comprises two parts: the mutual torque (the result of the interaction between the PM field and the stator current iq) and the reluctance torque. The reluctance torque component exists because of the difference between the inductances Ld and Lq. It is the basis of the algorithm for optimal torque-per-current control, called maximum torque per ampere (MTPA) control, which exploits the saliency of the IPM motor.
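To make the MTPA idea concrete, the sketch below evaluates Eq. (25) and finds, by brute-force search, the split of a given current magnitude between the d and q axes that maximizes torque. The search method, function names, and motor parameters are illustrative assumptions (with Ld < Lq, so that negative id adds reluctance torque); a drive would normally use a closed-form or tabulated MTPA law rather than a scan.

```python
import math

def ipm_torque(i_d, i_q, p, psi_p, l_d, l_q):
    """Eq. (25): mutual torque plus reluctance torque."""
    return 1.5 * p * (psi_p * i_q + (l_d - l_q) * i_d * i_q)

def mtpa_search(i_s, p, psi_p, l_d, l_q, steps=9000):
    """Scan the current angle (0 to 90 degrees from the q-axis, toward
    negative i_d) and keep the (i_d, i_q, torque) with the largest torque."""
    best = (0.0, i_s, ipm_torque(0.0, i_s, p, psi_p, l_d, l_q))
    for k in range(1, steps + 1):
        beta = 0.5 * math.pi * k / steps
        i_d, i_q = -i_s * math.sin(beta), i_s * math.cos(beta)
        torque = ipm_torque(i_d, i_q, p, psi_p, l_d, l_q)
        if torque > best[2]:
            best = (i_d, i_q, torque)
    return best

# Hypothetical salient motor: p = 4, psi_p = 0.1 Wb, Ld = 1 mH, Lq = 3 mH
i_d_opt, i_q_opt, t_opt = mtpa_search(100.0, 4, 0.1, 1e-3, 3e-3)
t_id_zero = ipm_torque(0.0, 100.0, 4, 0.1, 1e-3, 3e-3)
```

For this assumed motor, the optimum uses a clearly negative id and yields more torque than the id = 0 strategy at the same current magnitude. When Ld = Lq the reluctance term vanishes and the search returns id = 0.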
4.5 General Layout of Vector Control of AC Motors

The modeling of the IM and IPM motor, especially the torque expressions (20) and (25), leads to similar control schemes for both motors. The general layout of vector control of an AC motor is given in Fig. 3, which shows two control loops in the d-q frame: the inner loop for controlling the currents and the outer loop for speed control in the q-axis and flux control in the d-axis. For simplicity of illustration, the decoupling networks based on (14) and (24) and the flux control loop are omitted in Fig. 3.

If the bandwidth of the current control loops is high enough, the transfer function of the closed-loop current-controlled part is equivalent to a first-order function with a small time constant. We can then design the speed controller independently of the inner current loops. The most popular control law is PI control, whose gains are calculated using the motor parameters and the equivalent parameters of the traction system.

Figure 4 presents the simplified block diagram of vector-controlled induction motor drives, in which the block “current-controlled part” includes the electrical
Fig. 3 Principle schema of vector control of AC motor
[Block diagram: speed controller; two current controllers; d-q to a-b-c transformation; PWM; inverter; load; a-b-c to d-q transformation of the measured currents; rotor position from position sensor or estimator]
Fig. 4 Simplified block diagram of vector-controlled induction motor drives (FW: flux weakening; Jeq: equivalent moment of inertia)
Fig. 5 Simplified block diagram of vector-controlled IPM motor drives (MTPA: maximum torque per ampere)
part of the motor (Eqs. (14) and (15)) associated with the two current controllers in Fig. 3. The (mechanical) rotational part of the motor is represented by the transfer function 1/(Jeq s), which is directly deduced from (19). The motor speed controller is denoted Cω(s), and FW is the “flux-weakening” block. The vector control of the IPM motor can be described in the same manner, which results in the simplified block diagram illustrated in Fig. 5.

It is worth noting the particularity in the generation of the current $i_{ds}^{*}$ in the d-axis path:

• In the IM drive, the reference value of the flux-generating current component $i_{ds}^{*}$ is kept constant below base speed and reduced in inverse proportion to the speed in the high-speed region. This is realized by the FW block in Fig. 4.
• In the IPM drive, the MTPA algorithm is employed to maximize the torque/current ratio by exploiting the saliency of the motor. The MTPA algorithm is active over the whole base-speed region. In the high-speed range, where flux weakening is needed to avoid voltage saturation, the MTPA algorithm is compared with the FW condition to yield the optimal value of the current $i_{ds}^{*}$ [27]. This operation is carried out in the block “MTPA-FW” in Fig. 5.
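The flux-weakening rule described for the IM drive (constant flux current below base speed, inverse proportionality above it) can be written in a few lines. The function name and the numerical values are assumptions for illustration only.

```python
def ids_reference(omega_m, ids_rated, omega_base):
    """Flux-current reference for an IM drive: constant up to base speed,
    then reduced in inverse proportion to the speed magnitude."""
    speed = abs(omega_m)
    if speed <= omega_base:
        return ids_rated
    return ids_rated * omega_base / speed

# Hypothetical values: rated flux current 50 A, base speed 200 rad/s
below_base = ids_reference(100.0, 50.0, 200.0)    # 50.0
above_base = ids_reference(400.0, 50.0, 200.0)    # 25.0
```

At twice the base speed the flux current, and hence the steady-state rotor flux, is halved, which keeps the back-EMF within the available inverter voltage.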
In the next part of the chapter, the design of a speed controller Cω (s) in Figs. 4 and 5 using fuzzy logic is addressed.
5 Fuzzy Logic Control: Principles and Design Procedure

5.1 FLC vs. Conventional Control

The design of a conventional control system is normally based on the mathematical model of the plant. Figure 6 illustrates the basic feedback configuration of a control system, in which P(s) represents the transfer function (or, more generally, the model) of the plant, C(s) denotes the controller, and u, y, w, and e are the control signal, output signal, input signal, and error signal, respectively. If an accurate mathematical model P(s) with known parameters is available, a controller C(s) can be designed for the specified performance. Unfortunately, for complex processes and systems, such
Fig. 6 Basic feedback configuration of a control system
Fig. 7 Basic configuration of a system using fuzzy logic: FLC in place of general conventional controller C(s)
as cement plants, electrical power delivery systems, and EVs, a reasonably good mathematical model is difficult to find. On the other hand, the plant operator may have good experience in controlling the process. For most practical systems, models are ill-defined. Even if a plant model is well known, there may be parameter variation problems. Very often, the model is multivariable and nonlinear, such as the dynamic model of an AC motor. Vector control in the d-q frame, presented in the previous section, can overcome this problem, but accurate vector control is nearly impossible [5]. In indirect vector control, for instance, the motor parameters may vary considerably, which degrades the field orientation because the synchronously rotating speed ωe and the rotor position θe used for the direct and inverse coordinate transformations are calculated from these parameters in (16) and (17). To overcome such problems, various adaptive control techniques and online parameter identification algorithms have been investigated. Better control performance is obtained, at the expense of control complexity and longer execution time.

Fuzzy logic control, on the other hand, does not strictly need any mathematical model of the plant. It is based on plant operator experience. FLC is basically an adaptive and nonlinear control, which gives robust performance for a linear or nonlinear plant with parameter variation [5]. If the mathematical model is known, the FLC design becomes more convenient: the model can be used in the preliminary calculation and simulation stage to shorten the control design procedure.

Figure 7 shows the analogy between the FLC and the conventional controller in Fig. 6. The FLC takes the place of the traditional controller C(s) in this feedback configuration. One input is the error e between the reference (desired) signal and the system output response; the other is the change in error ce. 
Studies have shown that for most control problems, this two-input configuration is good enough to give high performance. The FLC in Fig. 7 is equivalent to a PI controller whose proportional and integral gains are automatically adjusted according to the working conditions. That explains why an FLC yields superior performance to conventional PI control. Other variants, equivalent to a P-type FLC or PID-type FLC, are also possible [5]; however, the PI-type FLC is by far the most popular.
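The equivalence with a PI controller can be made explicit with a short derivation. It assumes a continuous PI law with fixed gains K_p and K_i and a sampling time T_s (symbols introduced here for illustration), and uses backward differences for the derivatives, with cu(k) = u(k) − u(k − 1) and ce(k) = e(k) − e(k − 1):

```latex
u(t) = K_p\, e(t) + K_i \int_0^t e(\tau)\, d\tau
\;\Longrightarrow\;
\frac{du}{dt} = K_p \frac{de}{dt} + K_i\, e(t)
\;\Longrightarrow\;
cu(k) \approx K_p\, ce(k) + K_i T_s\, e(k)
```

A conventional PI controller is therefore a fixed linear map from (e, ce) to the control increment cu. A PI-type FLC keeps the same inputs and output but replaces this linear map with the nonlinear map defined by its rule base and membership functions, so its effective gains change with the operating point.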
5.2 General Design Procedure of an FLC

Fuzzy control is basically a process based on a fuzzy inference system (FIS). The FIS is essentially a formulation of the mapping from a given input set to an output set using FL. An FIS consists of the following steps [5]:

• Fuzzification of input variables
• Application of fuzzy operators (AND, OR) in the IF (antecedent) part of the rule and implication from the antecedent to the consequent (THEN part of the rule)
• Aggregation of the consequents across the rules
• Defuzzification

By placing the FIS in the general configuration of the feedback control illustrated in Fig. 7, we can deduce the structure of an FLC in the control system. The FLC in Fig. 8 contains three main blocks, F (fuzzification), I (inference), and D (defuzzification), along with two other functional blocks, integral and knowledge base, which are described in the following:

1. Fuzzification
A fuzzy variable has a value that is expressed in natural language. The role of this stage is to convert the deterministic (non-fuzzy or crisp) values into fuzzy values:

• Identify the input and output variables and their range of (crisp) values.
• Define the universe of discourse of the input and output fuzzy variables.
• Introduce the fuzzy sets of the fuzzy variables corresponding to the input(s) and output(s).
• Choose the form of the membership functions.

The first input to the FLC is the error e(k) between the reference w(k) (desired value of the output) and the system output y(k) (actual value), and the second input is the change of error ce(k) between two instants k and (k − 1). The following relations hold in a discrete system at instant k:
Fig. 8 Structure of FLC
$$e(k) = w(k) - y(k) \tag{26}$$

$$ce(k) = e(k) - e(k-1) \tag{27}$$
A membership function (MF) can have different shapes, such as triangular, trapezoidal, Gaussian, bell, sigmoid, etc. MFs can be represented by mathematical functions, segmented straight lines, or look-up tables. The simplest and most commonly used MF is the triangular type, because it can be realized by straight lines or by a linear function in programming. It can be symmetrical or asymmetrical in shape.

2. Inference
Inference is the heart of an FLC: it simulates human decision-making and deduces (infers) the fuzzy control action by using the fuzzy implication and the inference rules of FL:

• Formulate the rules of type IF ... THEN ... by using fuzzy operators (AND, OR, NOT) in the IF (antecedent) part of the rule and implication from the antecedent to the consequent (THEN part of the rule).
• Establish the rule table (matrix).
• Aggregate the consequents across the rules.

There are a number of implication methods in the literature, of which the Mamdani type and Sugeno type are the most frequent. In the Mamdani method, each rule is evaluated by the Minimum operator, and the total fuzzy output is the union (OR) of all the component MFs (Maximum operator). In the Sugeno (or Takagi-Sugeno-Kang) method of implication, output MFs are only constants or have linear relations with the inputs. With a constant output MF, it is called the zero-order Sugeno method, whereas with a linear relation, it is known as the first-order Sugeno method. It can be shown that if the Mamdani and Sugeno methods are applied to the same problem, the output is nearly the same. In practice, the Mamdani (or Max-Min) method is the most commonly used implication (aggregation) method.

3. Defuzzification
The result of the implication and aggregation steps is the fuzzy output, which is the union of the outputs of all the individual rules that are validated. 
Conversion of this fuzzy output to the crisp output is performed in the defuzzification stage. There are three methods of defuzzification: the center of gravity (COG) method, the height method, and the mean of maxima method. In the COG method, the crisp output YO of the output fuzzy variable Y is taken to be the geometric center of the area of the output fuzzy value μ(Y), which is formed by taking the union of all the contributions of the rules. Mathematically, the COG can be expressed as

$$Y_O = \frac{\int Y\, \mu(Y)\, dY}{\int \mu(Y)\, dY} \tag{28a}$$

where $\int \mu(Y)\, dY$ denotes the area of the region bounded by the curve μ(Y).
If μ(Y) is defined with a discrete membership function, YO can be calculated by the following formula, which uses summations instead of integrations:

$$Y_O = \frac{\sum_{i=1}^{n} Y_i\, \mu(Y_i)}{\sum_{i=1}^{n} \mu(Y_i)} \tag{28b}$$
Here Yi is a sample element and n represents the number of contributed samples in the given fuzzy set. In the height method of defuzzification, the COG method is simplified to consider only the height of each contributing MF and the midpoint of its base. The height method is further simplified in the mean of maxima method, where only the highest MF component in the output is considered. The COG method gives the most precise output value, with nearly the same calculation effort as the other two methods. It is therefore the most used in the literature.

4. Integral
The output of the FLC is the change (increment/decrement) of the control variable. This signal is summed or integrated to generate the actual signal u sent to the controlled plant. In a discrete system, the updated control variable is calculated at instant k as

$$u(k) = u(k-1) + cu(k) \tag{29}$$
That is, the discrete integration is the sum of the change in the control variable and its immediately past value.

5. Knowledge base
The knowledge base block contains the database for the blocks F, I, and D and the rule base for the block I. In this sense, the knowledge base block plays a supervisory role, which relates to human intelligence (such as knowledge of the system and/or operating experience).

In the general structure of a fuzzy feedback control system in Fig. 8, scale factors are introduced. The loop error e and the change in error ce are converted to the respective per unit (pu) signals E and CE by multiplying by the scale factors ke and kce. Similarly, the output change cu is derived by multiplying the pu output CU by the scale factor kcu and is then summed to generate the u signal:

$$k_e = \frac{E}{e}; \qquad k_{ce} = \frac{CE}{ce}; \qquad k_{cu} = \frac{cu}{CU} \tag{30}$$
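The signal path around the inference block, i.e., Eqs. (26), (27), (29), and (30), can be sketched as a small wrapper class. The class name, the scale factor values, and in particular the stand-in inference map (a saturated sum of E and CE) are assumptions used only to make the example runnable; in a real FLC the inference map is the fuzzification, rule evaluation, and defuzzification chain.

```python
class IncrementalFLC:
    """PI-type FLC shell: scaling and discrete integration around a
    normalized inference map (E, CE) -> CU, all three in per unit."""

    def __init__(self, inference, k_e, k_ce, k_cu):
        self.inference = inference
        self.k_e, self.k_ce, self.k_cu = k_e, k_ce, k_cu
        self.e_prev = 0.0
        self.u = 0.0

    def step(self, w, y):
        e = w - y                          # Eq. (26)
        ce = e - self.e_prev               # Eq. (27)
        self.e_prev = e
        cu_pu = self.inference(self.k_e * e, self.k_ce * ce)   # E, CE per Eq. (30)
        self.u += self.k_cu * cu_pu        # cu = k_cu * CU, then Eq. (29)
        return self.u

# Stand-in inference (assumption): saturated sum of the normalized inputs
def saturated_sum(e_pu, ce_pu):
    return max(-1.0, min(1.0, e_pu + ce_pu))

flc = IncrementalFLC(saturated_sum, k_e=0.1, k_ce=0.1, k_cu=2.0)
u1 = flc.step(w=5.0, y=0.0)   # e = 5, ce = 5  -> CU = 1.0  -> u = 2.0
u2 = flc.step(w=5.0, y=5.0)   # e = 0, ce = -5 -> CU = -0.5 -> u = 1.0
```

Keeping the scaling and integration outside the inference map is what allows the same normalized rule base to be reused on another plant by changing only ke, kce, and kcu.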
Working with pu values has the great advantage that the “normalized FLC” can be applied to all plants of the same family. For a different plant, we only need to change the scale factors to conform to its specific database. It also makes the FLC more convenient to design. The scale factors can be constant or programmable. Programmable scale factors can control the operation sensitivity
in different regions of control or the same strategy can be applied in similar response loops. The above general design procedure will be illustrated by the design of an FLC for an off-road electric vehicle, driven by an IM.
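The fuzzification, inference, and defuzzification stages described above can be put together in a compact sketch. It is a toy example under stated assumptions: only three fuzzy sets per variable (N, Z, P) with symmetric triangular MFs, a nine-rule base chosen for illustration, Mamdani min-max inference, and the discrete COG of Eq. (28b) evaluated on a sampled universe [−1, 1]. None of these choices is taken from the case study that follows.

```python
def trimf(x, a, b, c):
    """Triangular MF with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Illustrative sets on the normalized universe; the outer feet lie beyond
# +/-1 so that N and P reach full membership at the ends of the range.
SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}

# Illustrative rule base: (E set, CE set) -> CU set
RULES = {("N", "N"): "N", ("N", "Z"): "N", ("N", "P"): "Z",
         ("Z", "N"): "N", ("Z", "Z"): "Z", ("Z", "P"): "P",
         ("P", "N"): "Z", ("P", "Z"): "P", ("P", "P"): "P"}

def mamdani_cog(e_pu, ce_pu, samples=201):
    """Mamdani inference (AND = min, aggregation = max) with discrete COG."""
    dof = {name: 0.0 for name in SETS}
    for (set_e, set_ce), out in RULES.items():
        w = min(trimf(e_pu, *SETS[set_e]), trimf(ce_pu, *SETS[set_ce]))
        dof[out] = max(dof[out], w)        # strongest rule per output set
    num = den = 0.0
    for i in range(samples):
        y = -1.0 + 2.0 * i / (samples - 1)
        # Clip each output MF at its degree of fulfillment, then take the union
        mu = max(min(dof[name], trimf(y, *SETS[name])) for name in SETS)
        num += y * mu                      # numerator of Eq. (28b)
        den += mu                          # denominator of Eq. (28b)
    return num / den if den > 0.0 else 0.0

cu_zero = mamdani_cog(0.0, 0.0)    # close to 0
cu_pos = mamdani_cog(0.5, 0.0)     # positive
cu_max = mamdani_cog(1.0, 1.0)     # about 2/3
```

Note that with COG on a bounded universe the largest output is the centroid of the outermost set (about 2/3 here), not 1; this is usually absorbed into the output scale factor.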
6 Fuzzy Logic Speed Control for EV Applications

6.1 System Description

The design of the FLC is demonstrated in a case study using the eCommander off-road EV available at our laboratory at the University of Sherbrooke (Fig. 9). The EV has been adapted as an on-road laboratory platform for advanced electric and hybrid vehicle research in our lab [28, 29]. The vehicle is driven by a three-phase induction motor with the DC bus supplied by a battery pack. The main parameters of the vehicle and the motor are given in Table 1.
6.2 Design of FL Speed Control

Consider the FL speed controller in a vector control drive system, i.e., an FLC in place of Cω(s) in Fig. 4, which corresponds to Figs. 7 and 8. The controller observes the pattern of the speed loop error signal and correspondingly updates the output cu ($ci_{qs}^{*}$) so that the actual speed $\Omega_m$ matches the command speed $\Omega_m^{*}$.

Fig. 9 The eCommander off-road vehicle model reference at the e-TESC laboratory, University of Sherbrooke
Table 1 System parameters for numerical study

Parameters                                   Symbols    Values
Vehicle (eCommander)
  Vehicle mass (net vehicle and a driver)    Mveh       871 kg
  Maximum goods carrying capability          Mg_max     272 kg
  Aerodynamic drag coefficient               Cd         0.65
  Equivalent frontal area                    Af         2 m²
  Air mass density (at 20 °C)                ρ          1.223 kg/m³
  Rolling resistance coefficient             froll      0.02–0.08
  Motor to wheel transmission ratio          kgear      20.5
  Efficiency of the transmission             ηgear      0.91
  Wheel radius                               Rwh        0.3175 m
Electrical motor (ABM induction motor)
  Nominal DC bus voltage                     Udc_nom    48 V
  Stator resistance                          Rs         1.627 mΩ
  Rotor resistance                           Rr         0.415 mΩ
  Mutual magnetization inductance            Lm         320 μH
  Stator leakage inductance                  Lls        19.42 μH
  Rotor leakage inductance                   Llr        19.42 μH
  Nominal frequency                          fnom       60 Hz
  Number of pole pairs                       p          2
The design of the FLC for the speed loop follows the general procedure described in Sect. 5.2.

1. Fuzzification
There are two input signals to the FLC: the error $e = \Omega_m^{*} - \Omega_m$ and the change in error ce, which is related to the derivative of the error de/dt. In a discrete system, de/dt = ce/Ts, where Ts is the sampling time. With constant Ts, ce is proportional to de/dt. Figure 8 also illustrates how to calculate ce by using the delay operator $z^{-1}$.

The controller output cu (lowercase) in a vector control drive is the change of current $ci_{qs}^{*}$ (Figs. 5, 7, and 8). This signal is summed or integrated to generate the control signal u ($i_{qs}^{*}$ in this case).

According to the data given in Table 1 for the vehicle and motor, we can calculate the universe of discourse for the inputs (speed error e and change in speed error ce) and the output cu (change of current $ci_{qs}^{*}$). We can define the fuzzy sets (linguistic values) as follows:

• Negative big (NB)
• Negative medium (NM)
• Negative small (NS)
• Nearly zero (ZE)
• Positive small (PS)
• Positive medium (PM)
• Positive big (PB)

For simplicity and a better visual effect, the fuzzy sets can be further coded by levels using numbers from −3 (for NB) to 3 (for PB). Table 2 summarizes the universe of discourse of the variables and the distribution of the fuzzy sets. Note that the values given in the table are for the motor side, as all the parameters and variables of the vehicle have been referred to the motor shaft.

Table 2 Distribution of fuzzy sets

Linguistic value    Symbol   Level   e (rad/s)       ce (rad/s²)   cu (A/s)
Negative big        NB       −3      −800 → −250     −20 → −5      −10 → −3
Negative medium     NM       −2      −500 → −80      −15 → −2      −7 → −1
Negative small      NS       −1      −250 → 0        −5 → 0        −3 → 0
Nearly zero         ZE       0       −80 → 80        −2 → 2        −1 → 1
Positive small      PS       1       0 → 250         0 → 5         0 → 3
Positive medium     PM       2       80 → 500        2 → 15        1 → 7
Positive big        PB       3       250 → 800       5 → 20        3 → 10

The universes of discourse of the input and output variables are converted into pu values and expressed by MFs as shown in Fig. 10. The triangular MFs are asymmetrical because near the origin (steady state), the signals require more precision. All the MFs are balanced for positive and negative values of the variables.

2. Inference
The rules of type IF ... THEN are established in this stage. Given 7 fuzzy sets for each variable, there are 7 × 7 = 49 possible rules, which are connected by the operator OR:

Rule 1: IF e is (NB) AND ce is (NB) THEN cu is (NB)
OR Rule 2: IF e is (NB) AND ce is (NM) THEN cu is (NB)
...
OR Rule 49: IF e is (PB) AND ce is (PB) THEN cu is (PB)

Table 3 shows the corresponding rule table for the speed controller, expressed in pu. The top row and left column of the matrix indicate the fuzzy sets of the variables E and CE, respectively, and the MFs of the output variable CU are shown in the body of the matrix. Note that the rule table is displayed using the levels (−3; 3) for the linguistic values. For a given operating point, only some rules are active, which are then implicated using Max-Min operators (Mamdani method of implication):
Fuzzy Logic Control for Motor Drive in EVs
Fig. 10 Membership functions of input and output variables, plotted over the normalized range −1 to 1 with the fuzzy sets labeled NL, NM, NS, ZE, PS, PM, PL. (a) Speed error membership function (E). (b) Change in speed error membership function (CE). (c) Output membership function (CU)
M. C. Ta et al.
Table 3 Rule table for FLC of speed control (body: level of CU)

CE \ E   −3   −2   −1    0    1    2    3
  −3     −3   −3   −3   −3   −2   −1    0
  −2     −3   −3   −3   −2   −1    0    1
  −1     −3   −3   −2   −1    0    1    2
   0     −3   −2   −1    0    1    2    3
   1     −2   −1    0    1    2    3    3
   2     −1    0    1    2    3    3    3
   3      0    1    2    3    3    3    3
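The rule table lends itself to a direct lookup. The following is a minimal sketch in Python, not the authors' implementation: the names `RULE_TABLE`, `rule_output`, and `saturated_sum` are hypothetical. It encodes Table 3 by levels and checks an observation that can be read off the table, namely that every entry equals the sum of the E and CE levels limited to the range −3 to 3.

```python
# Table 3 coded by levels: -3 (NB) ... 3 (PB).
# Rows follow the CE level from -3 to 3, columns the E level from -3 to 3.
RULE_TABLE = [
    [-3, -3, -3, -3, -2, -1, 0],
    [-3, -3, -3, -2, -1, 0, 1],
    [-3, -3, -2, -1, 0, 1, 2],
    [-3, -2, -1, 0, 1, 2, 3],
    [-2, -1, 0, 1, 2, 3, 3],
    [-1, 0, 1, 2, 3, 3, 3],
    [0, 1, 2, 3, 3, 3, 3],
]
LEVELS = range(-3, 4)
SYMBOLS = {-3: "NB", -2: "NM", -1: "NS", 0: "ZE", 1: "PS", 2: "PM", 3: "PB"}


def rule_output(e_level, ce_level):
    """Return the CU level for the given E and CE levels (hypothetical helper)."""
    return RULE_TABLE[ce_level + 3][e_level + 3]


def saturated_sum(e_level, ce_level):
    """Sum of the two input levels, limited to the range -3 ... 3."""
    return max(-3, min(3, e_level + ce_level))


# Check that the whole table follows the saturated-sum pattern.
table_is_saturated_sum = all(
    rule_output(e, ce) == saturated_sum(e, ce) for e in LEVELS for ce in LEVELS
)

if __name__ == "__main__":
    print("Table equals saturated sum:", table_is_saturated_sum)
    # Rule 2 of the text: e is NB, ce is NM.
    print("Rule 2 output:", SYMBOLS[rule_output(-3, -2)])
```

Because the table is symmetric, swapping the roles of rows and columns gives the same lookup result.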
Fig. 11 Fuzzy rule surface: CU (normalized) plotted over E (normalized) and CE (normalized)
• Calculate the degree of fulfillment (DOF) of each rule using the AND or min operator.
• Aggregate the total fuzzy output using the OR or max operator.

3. Defuzzification
The crisp output CU, the change of q-axis current (pu) $Ci_{qs}^{*}$, is calculated by using (28b) of the COG method:

$$Ci_{qs}^{*} = \frac{\sum_{i=1}^{n} Ci_{qs,i}^{*}\,\mu(Ci_{qs,i}^{*})}{\sum_{i=1}^{n} \mu(Ci_{qs,i}^{*})} \tag{31}$$

where $\mu(Ci_{qs,i}^{*})$ is the membership function of $Ci_{qs,i}^{*}$, $i$ is a sample element, and $n$ represents the number of contributed samples in the given fuzzy set.
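The three stages (fuzzification, Max-Min inference, COG defuzzification as in (31)) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the controller used in the chapter: the MF peak positions in `PEAKS` are hypothetical values obtained by dividing the e column of Table 2 by 800 rad/s and are reused for all three variables, the outermost sets are treated as shoulders, and the function names are invented for this sketch.

```python
# Hypothetical peaks of the triangular MFs on the normalized universe [-1, 1],
# ordered NB, NM, NS, ZE, PS, PM, PB (assumption, not read from Fig. 10).
PEAKS = [-0.625, -0.3125, -0.1, 0.0, 0.1, 0.3125, 0.625]

# Table 3 by levels; each entry equals the E level plus the CE level,
# limited to -3 ... 3. Rows: CE level, columns: E level.
RULES = [[max(-3, min(3, e + ce)) for e in range(-3, 4)] for ce in range(-3, 4)]


def membership(x, k):
    """Membership of x in set k (0 = NB ... 6 = PB); outer sets are shoulders."""
    peak = PEAKS[k]
    if x <= peak:
        if k == 0:
            return 1.0
        left = PEAKS[k - 1]
        return 0.0 if x <= left else (x - left) / (peak - left)
    if k == 6:
        return 1.0
    right = PEAKS[k + 1]
    return 0.0 if x >= right else (right - x) / (right - peak)


def flc_output(e, ce, samples=401):
    """Crisp normalized output CU for normalized inputs e and ce."""
    e = max(-1.0, min(1.0, e))
    ce = max(-1.0, min(1.0, ce))

    # Inference: DOF of each rule with min, aggregation per output set with max.
    strength = [0.0] * 7
    for i in range(7):          # index of the CE set
        mu_ce = membership(ce, i)
        for j in range(7):      # index of the E set
            dof = min(membership(e, j), mu_ce)
            out = RULES[i][j] + 3
            strength[out] = max(strength[out], dof)

    # Defuzzification: center of gravity over a sampled output universe, Eq. (31).
    num = 0.0
    den = 0.0
    for s in range(samples):
        cu = -1.0 + 2.0 * s / (samples - 1)
        mu = max(min(strength[k], membership(cu, k)) for k in range(7))
        num += cu * mu
        den += mu
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    for e, ce in [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (1.0, 1.0)]:
        print(f"E={e:+.2f} CE={ce:+.2f} -> CU={flc_output(e, ce):+.4f}")
```

With symmetric MFs and the antisymmetric rule table, the output changes sign when both inputs change sign, and it is zero at the origin.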
$Ci_{qs}^{*}$ is then converted by the scale factor $k_{cu}$ to give the change of current $ci_{qs}^{*}$. This signal is integrated to generate the control signal $u$ ($i_{qs}^{*}$ in this case). Figure 11 shows the fuzzy surface of the rules from the rule table (matrix). It can be seen that the distribution is concentrated around the origin, meaning that high accuracy of the controlled system is expected. The rule matrix and the MF description of the variables are based on knowledge of the system, and fine-tuning them for optimal performance may be time-consuming. For a simulation-based system design, controller tuning by C programming or, more recently, with the help of the MATLAB Fuzzy Logic Toolbox™ may be reasonably fast.
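The scaling and integration of the FLC output mentioned at the start of this passage can be illustrated as follows. This is a sketch under assumptions: the sampling time `dt`, the gain value, the current limit `i_max`, and the clamping itself are hypothetical choices made for the example, and forward-Euler integration is only one possible discretization.

```python
def integrate_command(cu_pu_sequence, k_cu, dt, i_max, i0=0.0):
    """Build the q-axis current reference from normalized FLC outputs.

    cu_pu_sequence: normalized FLC outputs CU, one per sampling instant
    k_cu: scale factor from pu to A/s (hypothetical value in the example)
    dt: sampling time in seconds (assumption)
    i_max: current limit in A (assumption, not taken from the source)
    """
    i_qs = i0
    history = []
    for cu_pu in cu_pu_sequence:
        ci_qs = k_cu * cu_pu                   # change of current, A/s
        i_qs += ci_qs * dt                     # forward-Euler integration
        i_qs = max(-i_max, min(i_max, i_qs))   # keep the reference within limits
        history.append(i_qs)
    return history


if __name__ == "__main__":
    # Constant full-scale output for five samples, with illustrative numbers.
    print(integrate_command([1.0] * 5, k_cu=10.0, dt=0.1, i_max=3.0))
```

A larger k_cu makes the reference move faster for the same fuzzy output, which is the tuning knob discussed later for the i-MiEV case study.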
6.3 Comparison of PI Controller and FLC

Speed Control Using PI Controller
To control the speed of a motor, proportional-integral controller (PIC) has been shown to be a standard method. For instance, a traditional way to design the PIC-based speed control is as follows. We let the transfer function from motor torque to motor speed and the transfer function of the PIC be

$$P_m(s) = \frac{1}{J_{eq}s} \tag{32}$$

$$C_m(s) = \frac{k_p s + k_i}{s} \tag{33}$$

where $k_p$ and $k_i$ are the control gains. The transfer function of the closed-loop system $P_c(s)$ including $P_m(s)$ and $C_m(s)$ is

$$P_c(s) = \frac{P_m(s)C_m(s)}{1 + P_m(s)C_m(s)} = \frac{\dfrac{1}{J_{eq}s}\,\dfrac{k_p s + k_i}{s}}{1 + \dfrac{1}{J_{eq}s}\,\dfrac{k_p s + k_i}{s}} = \frac{\dfrac{k_p}{J_{eq}}s + \dfrac{k_i}{J_{eq}}}{s^2 + \dfrac{k_p}{J_{eq}}s + \dfrac{k_i}{J_{eq}}} \tag{34}$$

Let $\lambda_1$ and $\lambda_2$ be the desired poles of $P_c(s)$, and the PI gains can be derived as

$$s^2 + \frac{k_p}{J_{eq}}s + \frac{k_i}{J_{eq}} \equiv (s - \lambda_1)(s - \lambda_2) \;\Rightarrow\; \begin{cases} k_p = -J_{eq}(\lambda_1 + \lambda_2) \\ k_i = J_{eq}\lambda_1\lambda_2 \end{cases} \tag{35}$$
Readers should notice that the motor torque is limited by the maximum motor current. On the other hand, the current control loop always has a certain bandwidth. Therefore, λ1 and λ2 must not be placed too far from the origin in the left half-plane.
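As a worked instance of (35), the short Python sketch below computes the PI gains for a pair of desired poles and then recovers the poles from the denominator of (34) as a check. The inertia value `J_EQ = 0.5` is a hypothetical placeholder, since the actual equivalent inertia is not restated here, and the function names are invented for this example.

```python
import cmath


def pi_gains(j_eq, lam1, lam2):
    """PI gains from the desired closed-loop poles, following Eq. (35)."""
    kp = -j_eq * (lam1 + lam2)
    ki = j_eq * lam1 * lam2
    # Real gains require real poles or a complex-conjugate pair.
    if abs(complex(kp).imag) > 1e-9 or abs(complex(ki).imag) > 1e-9:
        raise ValueError("poles must be real or a complex-conjugate pair")
    return complex(kp).real, complex(ki).real


def closed_loop_poles(j_eq, kp, ki):
    """Roots of s^2 + (kp/j_eq) s + ki/j_eq, the denominator of Eq. (34)."""
    b = kp / j_eq
    c = ki / j_eq
    root = cmath.sqrt(b * b - 4.0 * c)
    return (-b + root) / 2.0, (-b - root) / 2.0


J_EQ = 0.5  # kg.m^2, hypothetical value chosen only for illustration
KP, KI = pi_gains(J_EQ, complex(-14, 6), complex(-14, -6))

if __name__ == "__main__":
    print("kp =", KP, "ki =", KI)
    print("poles:", closed_loop_poles(J_EQ, KP, KI))
```

For the pole pair −14 ± 6j, the gains reduce to kp = 28·Jeq and ki = 232·Jeq, so both scale linearly with the equivalent inertia.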
Preliminary Discussion
The above design procedure works well if we only control a separate motor drive. However, the circumstances change considerably when dealing with EV applications with the model shown in Fig. 1. The targeted plant consists of the EV body, the motor, and the mechanism that connects them. The EV plant is actually nonlinear. Hence, the nominal transfer function (32) fails to describe the true dynamics of the targeted plant. The designers also have to deal with physical uncertainties, such as the vehicle mass, the road friction coefficient, and the torsional stiffness of the drive shaft. Considering the aforementioned issues, the disadvantages of the PIC for EVs have been discussed in our recent studies [30, 31]. Although stable poles are assigned to the local transfer function Pc(s), the poles of the overall system might move during long-term operation of the EV. If the poles move toward the imaginary axis, the control system might suffer from vibration. Some robust control tools can improve system performance, and a typical tool is the disturbance observer (DOB) [32]. Although the DOB is simple to implement, its design and analysis are nontrivial for a nonlinear system such as an EV. The design of the DOB requires several techniques such as the H-infinity norm and μ-synthesis. From a practical point of view, automotive engineers still need approaches that are convenient to design without complex calculations.

Based on the above discussion, FLC turns out to be an attractive candidate for speed control. Following the previous section, readers can see that the FLC actually consists of two actions: the integral control action and the proportional control action. However, unlike the traditional fixed-gain PIC, the FLC can be treated as a PIC with the gains adjusted and refined in real time by the fuzzy law.
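The remark that the poles may move when the plant differs from the nominal model can be illustrated on the linear model (32) alone. The sketch below is a simplified illustration, not the nonlinear EV analysis of the chapter: it keeps the gains designed for a nominal inertia with poles −14 ± 6j and evaluates the denominator of (34) for a larger actual inertia. The nominal inertia of 1.0 and the inertia ratios are hypothetical.

```python
import cmath
import math

J_NOM = 1.0            # nominal equivalent inertia, hypothetical value
KP = 28.0 * J_NOM      # from Eq. (35) with poles -14 +/- 6j
KI = 232.0 * J_NOM


def closed_loop_poles(j_actual, kp=KP, ki=KI):
    """Roots of s^2 + (kp/j_actual) s + ki/j_actual for fixed gains."""
    b = kp / j_actual
    c = ki / j_actual
    root = cmath.sqrt(b * b - 4.0 * c)
    return (-b + root) / 2.0, (-b - root) / 2.0


def damping_ratio(j_actual, kp=KP, ki=KI):
    """Damping ratio of the second-order denominator for fixed gains."""
    return (kp / j_actual) / (2.0 * math.sqrt(ki / j_actual))


if __name__ == "__main__":
    for ratio in (1.0, 1.5, 2.0):   # actual inertia / nominal inertia (illustrative)
        p1, _ = closed_loop_poles(ratio * J_NOM)
        zeta = damping_ratio(ratio * J_NOM)
        print(f"J/J_nom={ratio:.1f}  pole={p1.real:.2f}{p1.imag:+.2f}j  zeta={zeta:.3f}")
```

In this simplified setting, a heavier vehicle moves the pole pair toward the imaginary axis and lowers the damping ratio, in proportion to the inverse square root of the inertia ratio.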
7 Simulation and Performance Evaluation

7.1 Comparative Study of PI Controller and FLC
To compare the performance of the PIC and the FLC, we considered the motor speed control problem described in Fig. 4 and assumed that the reference speed is given while the vehicle runs on a road surface with the friction coefficient μ = 0.8. The road friction coefficient was reduced to μ = 0.3 during the short period from 12 s to 14 s. The PIC was designed using (35) with the desired poles λ1,2 = −14 ± 6j, and the moment of inertia Jeq was calculated using the nominal mass of the vehicle. The nominal mass was calculated under the assumption that the vehicle carries two passengers, each weighing 70 kg. Two test cases were performed. In Test 1, the plant is the simplified linear model (32). Both the PIC and the FLC showed very good control performance, as can be seen in Fig. 12. In Test 2, the above controllers were verified by using the nonlinear vehicle model described by Eqs. (1)–(8). In this test, the vehicle has to carry four passengers with additional luggage of 50 kg. As can be seen from Fig. 13, both controllers still guarantee good tracking performance. However, the PIC suffers from more oscillation with a higher overshoot. Motivated by the above results, in the following section, we verify the performance of the FLC and the overall control system in Fig. 4 by a standard driving cycle test.

Fig. 12 Simulation results with simplified linear model without uncertainties. (a) Speed response. (b) Speed zoom
7.2 Simulation of Vehicle Operation
In order to examine the performance, especially the robustness, of the fuzzy logic speed controller, the vehicle is tested over the modified ECE cycle (adapted to the off-road vehicle speed range), with a maximum speed of 45 km/h and a duration of 195 s.
Fig. 13 Simulation results (using nonlinear EV model with parameter uncertainties). (a) Speed response. (b) Speed zoom
The fuzzy logic speed controller is realized by using the Fuzzy Logic Toolbox™ of MATLAB. The main results are reported in Fig. 14. To represent system parameter variation, the vehicle total mass changes twice, at the two stop periods (Fig. 14c). At the beginning, there is a 70-kg person driving the vehicle; then, at the 40th second, the vehicle is fully loaded with 272 kg of goods and two 70-kg people; finally, at the 100th second, the goods are unloaded. Consequently, the vehicle inertia characteristic changes during the driving cycle. Moreover, we examine the vehicle running on different road conditions with the rolling resistance coefficient varying from 0.02 to 0.08, as presented in Fig. 14e. The resistance force of the road on the vehicle therefore changes significantly. The global response given in Fig. 14a and the relative error (in comparison to the top speed of 40 km/h) plotted in Fig. 14h confirm the good performance of the vehicle speed fuzzy logic controller. Thanks to the fast torque dynamics of the electrical motor reflected in Fig. 14g, the controller can quickly respond to reference changes and robustly adapt to the parameter and disturbance variations, keeping the error always lower than 2.5%.
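The figures quoted in this scenario can be cross-checked with simple arithmetic. The Python sketch below is only a bookkeeping aid with hypothetical names: it expresses a speed error relative to the 40 km/h top speed, converts the 2.5% bound into km/h, lists the payload at each stage of the cycle, and computes the relative change of the rolling resistance coefficient between its extreme values.

```python
V_TOP_KMH = 40.0  # top speed used in the text as the normalization base


def relative_speed_error(v_ref_kmh, v_kmh, v_top_kmh=V_TOP_KMH):
    """Speed error in percent of the top speed."""
    return abs(v_ref_kmh - v_kmh) / v_top_kmh * 100.0


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Payload carried at each stage of the driving cycle, in kg.
PAYLOAD_KG = {
    "from 0 s": 70,                # one 70-kg driver
    "from 40 s": 272 + 2 * 70,     # goods plus two 70-kg people
    "from 100 s": 2 * 70,          # goods unloaded, two people remain
}

# The 2.5 % error bound expressed in km/h.
MAX_DEVIATION_KMH = 2.5 / 100.0 * V_TOP_KMH

if __name__ == "__main__":
    print("Payload by stage [kg]:", PAYLOAD_KG)
    print("2.5 % of top speed [km/h]:", MAX_DEVIATION_KMH)
    print("Rolling coefficient change 0.02 -> 0.08 [%]:", percent_change(0.02, 0.08))
    print("Error for 11 km/h against a 12 km/h reference [%]:",
          relative_speed_error(12.0, 11.0))
```

On this basis, the 2.5% bound corresponds to a deviation of about 1 km/h from the reference.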
Fig. 14 Vehicle speed response over the modified ECE driving cycle under system parameters and resistance variations. (a) Vehicle speed response. (b) Speed zoom 1. (c) Vehicle mass. (d) Speed zoom 2. (e) Rolling coefficient. (f) Speed zoom 3. (g) Motor torque. (h) Speed error
To give a better view, three areas of the speed response are zoomed in. Zoom 1 (Fig. 14b) shows the overshoot when the speed reference is set to 12 km/h and the speed drop when froll suddenly rises from 0.035 to 0.06 with only one person driving the vehicle. In Zoom 2, plotted in Fig. 14d, when the vehicle is fully loaded and one more person has been added, the rolling resistance coefficient drops to 0.02, which makes the speed rise above the reference. Finally, Zoom 3 (Fig. 14f) presents the scenario in which the coefficient froll increases dramatically, to four times its value, from 0.02 to 0.08, which causes a drop in vehicle speed. At this time, the vehicle carries only two people, as the goods were unloaded at the previous stop period. In all three scenarios, it can be seen that the fuzzy logic controller quickly responds to the changes in resistance force regardless of the vehicle total mass. The vehicle speed, after small errors due to the disturbance changes, is forced to follow the reference within about 1 s. This verifies the robustness of the developed fuzzy logic vehicle speed controller.
7.3 Flexibility of Fuzzy Logic Controller
The previous section shows that, in comparison with the traditional PIC, the FLC might improve the performance of the motor actuator to a certain extent. This subsection discusses another merit of the FLC: flexibility. To this end, we demonstrate the performance of the FLC using a completely different targeted plant. Figure 15 shows the commercial car Mitsubishi i-MiEV, which was acquired to serve as a research prototype at the CTI Lab. for EVs, Hanoi University of Science and Technology, Vietnam. The vehicle and motor parameters were measured in our lab, and the values are reported in Table 4. In comparison with the eCommander in Fig. 9 and Table 1, the i-MiEV is heavier and is driven by a different drivetrain with an IPM motor.

Fig. 15 Mitsubishi i-MiEV as research prototype at CTI Lab. for EVs, Hanoi University of Science and Technology
Table 4 Specifications of the i-MiEV model

Parameters                    Symbols   Values
Vehicle (i-MiEV)
Vehicle mass                  Mveh      1080 kg
Radius of wheel               Rwh       0.285 m
Wheel moment of inertia       Jw        1.25 kg·m²
Equivalent frontal area       Af        2.37 m²
Drag coefficient              Cd        0.35
Gearbox ratio                 kgear     7.065
Electrical motor (IPM motor)
Rated power                   Prated    49 kW
Nominal voltage               Unom      330 V
d-axis inductance             Ld        140 μH
q-axis inductance             Lq        210 μH
Phase winding resistance      R         12 mΩ
Permanent magnet flux         ψ         0.06 Wb
Number of pole pairs          p         4
In this case study, we control the speed of the IPM motor of the i-MiEV vehicle. The test condition and the reference speed are similar to those of the previous test using the eCommander vehicle. In this test, we treat the i-MiEV as a “black box”; that is, we assume that we do not know its physical parameters. We use the same FLC developed for the eCommander in Sect. 6 and adjust only the gain kcu. As can be seen in Fig. 16, the tracking performance is poor if kcu is small. On the other hand, the motor speed suffers from vibration if kcu is chosen too large. This also causes vibration in the motor torque (Fig. 17). Such vibrations should be eliminated since they reduce the comfort of the driver and have negative effects on the inner loops of the motor drive system (current control loop, power electronics converter control). The designer should therefore balance tracking performance against driver comfort when adjusting the gain of the FLC.

This simulation study illustrates the flexibility of the FLC for practical applications. After developing the fuzzy law using a given vehicle prototype (eCommander), the controller can be readily implemented for other vehicle prototypes. Even if the i-MiEV is a “black box,” a quick fine-tuning process is enough to achieve good control performance.
Fig. 16 Performance of fuzzy logic controller with different gains kcu. (a) Speed response. (b) Speed zoom
Fig. 17 Motor torque with different gains kcu

8 Conclusion
Electrification has become an indispensable trend in the automotive industry. While energy storage systems are the key components enabling EV penetration into the market, electric motors are the soul of the whole EV system. The use of electric motors not only resolves the “traditional” problems of pollution and fossil fuel shortage but also brings many other benefits, such as vehicles that are safer, more comfortable, and more enjoyable to drive. As many of the latest technologies can be embedded, the vehicle can be autonomous and act as an intelligent agent in the ecosystem of the Internet of Things and smart energy. It is expected that modern EVs in the very near future will be equipped with one of the five levels of autonomous driving. In the normal mode, the driver imposes the torque demand on the traction part by pressing or releasing the acceleration pedal. In the driver assistance mode (level 1), for instance, the vehicle can be monitored through cruise control. In the latter case, the reference torque is generated by the vehicle speed control loop. This chapter is devoted to designing the speed controller and has shown how FLC, a class of artificial intelligence, can be incorporated into the whole system.

Inspired by the advantages of the electric motor that underlie the fundamental merits of EVs over ICE vehicles, the chapter focuses on how to design the “best quality” motor torque reference. The FLC approach has been adopted and applied in a direct and simple way that is nevertheless systematic from a practical design point of view. After a thorough literature review on fuzzy control applications for motor drives and EVs, the vehicle powertrain has been modeled, focusing on the drivetrain and the electrical motor models. Unlike other works that deal with one specific type of traction drive, this chapter has covered two commonly used AC motor drives for EVs: the IM and the IPM motor. These two kinds of motors can use the same general vector control layout. Afterward, the FLC principle and design procedure have been addressed, from philosophy to detailed guidelines. A comparative study by simulation has shown that if the plant is a simplified linear model, both the PI controller and the FLC yield very good and similar control outcomes in terms of overshoot, response time, tracking error, and steady-state error. However, when considering the real EV model with nonlinear characteristics, the FLC showed better performance than the PI controller.
The proposed FLC has been evaluated via numerical simulation using a practical vehicle-based model. To verify the robustness and uncertainty tolerance of the fuzzy controller, both parameter and disturbance variations have been applied to the system. The results show that despite up to 39% change in vehicle mass and 300% change in rolling resistance force, the tracking performance of the vehicle speed control loop is ensured with less than 2.5% relative error. Moreover, thanks to the normalization of the membership functions and the inference design, the proposed FLC can accommodate a wide range of practical applications. This was tested by applying the same FLC to another EV platform, the Mitsubishi i-MiEV driven by an IPM motor. The simulation study has confirmed the flexibility of the FLC for practical applications. A controller designed for one vehicle can be conveniently implemented for other vehicle prototypes. Only a fine-tuning process is required to achieve good control performance. The principle of FLC can also be further applied in other control layers of the EV system. Further uncertainties and nonlinearities, such as the slip characteristics of tire dynamics, would also be of interest for future studies based on this FLC.

Acknowledgments We would like to express our sincerest gratitude to Dr. Bảo-Huy Nguyễn, postdoctoral researcher at e-TESC Lab., Department of Electrical and Computer Engineering, University of Sherbrooke, Canada, for his time contributing the simulations in Sects. 6.2 and 7.2. We also express our deepest thanks to Prof. João Pedro F. Trovão at e-TESC Lab., Department of Electrical and Computer Engineering, University of Sherbrooke, Sherbrooke, QC, J1K 2R1, Canada, for his discussion and his great support of the electric vehicle model used in this study.
References
1. Y. Hori, Future vehicle driven by electricity and control-research on four-wheel-motored UOT Electric March II. IEEE Trans. Ind. Electron. 5, 954–962 (2004)
2. H. Fujimoto, Regenerative brake and slip angle control of electric vehicle with in-wheel motor and active front steering. Tech. rep., SAE Technical Paper, 2011
3. S. Cash, O. Olatunbosun, Fuzzy logic field-oriented control of an induction motor and a permanent magnet synchronous motor for hybrid/electric vehicle traction applications. Int. J. Electric Hybrid Veh. 9(3), 269–284 (2017)
4. L.A. Zadeh, Fuzzy sets. Inform. Control 8(3), 338–353 (1965)
5. B.K. Bose, Modern Power Electronics and AC Drives (Prentice Hall, Englewood Cliffs, 2002)
6. M. Ta-Cao, H. Le-Huy, Model reference adaptive fuzzy controller and fuzzy estimator for high performance induction motor drives, in IEEE Industry Applications Conference—IAS’96, vol. 1 (1996), pp. 380–387
7. M. Ta-Cao, Digital Control of Induction Machines using Fuzzy Logic. Ph.D. Dissertation, Université Laval, 1997. Text in French
8. B. Karanayil, M.F. Rahman, C. Grantham, Stator and rotor resistance observers for induction motor drive using fuzzy logic and artificial neural networks. IEEE Trans. Energy Convers. 20(4), 771–780 (2005)
9. Y.-S. Lai, J.-C. Lin, New hybrid fuzzy controller for direct torque control induction motor drives. IEEE Trans. Power Electron. 18(5), 1211–1219 (2003)
10. H. Rehman, Fuzzy logic enhanced robust torque controlled induction motor drive system. IEE Proc. Control Theory Appl. 151(6), 754–762 (2004)
11. S.M. Gadoue, D. Giaouris, J.W. Finch, MRAS sensorless vector control of an induction motor using new sliding-mode and fuzzy-logic adaptation mechanisms. IEEE Trans. Energy Convers. 25(2), 394–402 (2009)
12. Y. Liu, J. Zhao, R. Wang, C. Huang, Performance improvement of induction motor current controllers in field-weakening region for electric vehicles. IEEE Trans. Power Electron. 28(5), 2468–2482 (2012)
13. M.A. Hannan, J. Abd Ali, P.J. Ker, A. Mohamed, M.S. Lipu, A. Hussain, Switching techniques and intelligent controllers for induction motor drive: issues and recommendations. IEEE Access 6, 47489–47510 (2018)
14. J.P. Trovão, M.A. Silva, C.H. Antunes, M.R. Dubois, Stability enhancement of the motor drive DC input voltage of an electric vehicle using on-board hybrid energy storage systems. Appl. Energy 205, 244–259 (2017)
15. J.P. Trovão, M.A. Silva, M.R. Dubois, Coupled energy management algorithm for MESS in urban EV. IET Electr. Syst. Transp. 7(2), 125–134 (2017)
16. J.P.F. Trovão, M.-A. Roux, E. Ménard, M.R. Dubois, Energy- and power-split management of dual energy storage system for a three-wheel electric vehicle. IEEE Trans. Veh. Technol. 66(7), 5540–5550 (2017)
17. H.-D. Lee, S.-K. Sul, Fuzzy-logic-based torque control strategy for parallel-type hybrid electric vehicle. IEEE Trans. Ind. Electron. 45(4), 625–632 (1998)
18. P. Khatun, C.M. Bingham, N. Schofield, P. Mellor, Application of fuzzy control algorithms for electric vehicle antilock braking/traction control systems. IEEE Trans. Veh. Technol. 52(5), 1356–1364 (2003)
19. R. Kassem, K. Sayed, A. Kassem, R. Mostafa, Power optimisation scheme of induction motor using FLC for electric vehicle. IET Electr. Syst. Transp. 10(3), 301–309 (2020)
20. V. Ivanov, A review of fuzzy methods in automotive engineering applications. Eur. Transp. Res. Rev. 7(3), 1–10 (2015)
21. L. Boulon, D. Hissel, A. Bouscayrol, O. Pape, M. Péra, Simulation model of a military HEV with a highly redundant architecture. IEEE Trans. Veh. Technol. 59(6), 2654–2663 (2010)
22. S. Hiroshi, Multi-purpose electric vehicle “KAZ”. IATSS Res. 25(2), 96–97 (2001)
23. H.B. Pacejka, Tyre and Vehicle Dynamic (Elsevier BVl, 2006)
24. T. Karikomi, K. Itou, T. Okubo, S. Fujimoto, Development of the shaking vibration control for electric vehicles, in 2006 SICE-ICASE International Joint Conference (2006), pp. 2434–2439
25. K. Hasse, Zur Dynamik drehzahlgeregelter Antriebe mit stromrichtergespeisten Asynchron-Kurzschlusslaufermaschinen. Ph.D. Dissertation, Technische Hochschule Darmstadt, 1969. Text in German
26. F. Blaschke, The principle of field orientation as applied to the new transvector closed loop control system for rotating field machines. Siemens Rev. 34(3), 217–220 (1972)
27. K.H. Nam, AC Motor Control and Electrical Vehicle Applications (CRC Press, Boca Raton, 2019)
28. M.J. Blondin, J.P. Trovão, Soft-computing techniques for cruise controller tuning for an off-road electric vehicle. IET Electr. Syst. Transp. 9(4), 196–205 (2019)
29. C.T. Nguyen, B.-H. Nguyen, J.P.F. Trovão, M.C. Ta, Effect of battery voltage variation on electric vehicle performance driven by induction machine with optimal flux-weakening strategy. IET Electr. Syst. Transp. 10(4), 351–359 (2020)
30. B.-M. Nguyen, S. Hara, H. Fujimoto, Y. Hori, Slip control for IWM vehicles based on hierarchical LQR. Control Eng. Pract. 93, 104179 (2012)
31. B.-M. Nguyen, H.V. Nguyen, M. Ta-Cao, M. Kawanishi, Longitudinal modelling and control of in-wheel-motor electric vehicles as multi-agent systems. Energies 13(20), 5437 (2020)
32. T. Umeno, Y. Hori, Robust speed control of DC servomotors using modern two degrees-of-freedom controller design. IEEE Trans. Ind. Electron. 38(5), 363–368 (1991)