English Pages 402 Year 2020
Springer Tracts in Nature-Inspired Computing
Mahdi Khosravy Neeraj Gupta Nilesh Patel Tomonobu Senjyu Editors
Frontier Applications of Nature Inspired Computation
Springer Tracts in Nature-Inspired Computing Series Editors Xin-She Yang, School of Science and Technology, Middlesex University, London, UK Nilanjan Dey, Department of Information Technology, Techno India College of Technology, Kolkata, India Simon Fong, Faculty of Science and Technology, University of Macau, Macau, Macao
The book series is aimed at providing an exchange platform for researchers to summarize the latest research and developments related to nature-inspired computing in the most general sense. It includes analysis of nature-inspired algorithms and techniques, inspiration from natural and biological systems, computational mechanisms and models that imitate them in various fields, and the applications to solve real-world problems in different disciplines. The book series addresses the most recent innovations and developments in nature-inspired computation, algorithms, models and methods, implementation, tools, architectures, frameworks, structures, applications associated with bio-inspired methodologies and other relevant areas. The book series covers the topics and fields of Nature-Inspired Computing, Bio-inspired Methods, Swarm Intelligence, Computational Intelligence, Evolutionary Computation, Nature-Inspired Algorithms, Neural Computing, Data Mining, Artificial Intelligence, Machine Learning, Theoretical Foundations and Analysis, and Multi-Agent Systems. In addition, case studies, implementation of methods and algorithms as well as applications in a diverse range of areas such as Bioinformatics, Big Data, Computer Science, Signal and Image Processing, Computer Vision, Biomedical and Health Science, Business Planning, Vehicle Routing and others are also an important part of this book series. The series publishes monographs, edited volumes and selected proceedings.
More information about this series at http://www.springer.com/series/16134
Editors Mahdi Khosravy Media Integrated Communication Lab Graduate School of Engineering Osaka University Osaka, Japan
Neeraj Gupta Department of Computer Science and Engineering Oakland University Rochester, MI, USA
Nilesh Patel Department of Computer Science and Engineering Oakland University Rochester, MI, USA
Tomonobu Senjyu Department of Electrical and Electronics Engineering, Faculty of Engineering University of the Ryukyus Nishihara, Japan
ISSN 2524-552X ISSN 2524-5538 (electronic) Springer Tracts in Nature-Inspired Computing ISBN 978-981-15-2132-4 ISBN 978-981-15-2133-1 (eBook) https://doi.org/10.1007/978-981-15-2133-1 © Springer Nature Singapore Pte Ltd. 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Nowadays, as industries and everyday life face increasingly complicated problems, there is a particular need for stronger computation techniques that can provide fast and accurate solutions. Problems grow in complexity and nonlinearity, and there is a great lack of information about the conditions, limitations and physical rules governing new technological aspects. Even without such a lack of knowledge about the hidden layers of these problems, their numerous aspects, multiple formulations and nonlinear equations together give them a wild nature that requires methodologies beyond the known classic computations. Just as Mother Nature has been the greatest inspiring teacher for humans since ancient times, natural phenomena in this modern era inspire scientists to devise meta-heuristic computation techniques for efficiently solving complex problems. By mimicking natural phenomena, a variety of nature-inspired computational techniques have been invented during the last two decades, and these techniques have achieved considerable efficiency. A variety of problems have been managed and solved by these techniques, and as nature-inspired computation is applied to non-convex, nonlinear, incompletely informed problems, researchers focus on three targets: (i) increasing the accuracy of the techniques, (ii) increasing their speed or, equivalently, finding ways to reduce the complexity of the operators, and (iii) implementing these techniques in real-life applications. This book presents some of the frontier theory and applications of nature-inspired computation (NIC) across a wide range, from electrical energy transmission expansion planning to robotics and fault detection in agricultural machinery. Accordingly, this book comprises 17 chapters as follows. In Chap. 1, Helmi and Lotfy, after listing the well-known classic NIC techniques and briefly surveying newer ones, review six of the most recent NIC techniques, from 2018 and 2019. As the strategy of incorporating the prediction of a moving optimum, based on a priori information, into the optimization process has gained attention in recent years, in Chap. 2 Meier and Kramer introduce the foundations of prediction-based dynamic optimization with nature-inspired methods. They present benchmark sets and quality measures and give insight into mechanisms to employ
predictions in evolution strategy and particle swarm optimization-based search. Further, they show how predictive uncertainty information allows the optimizer to explore regions with higher predictive uncertainty more extensively. Chapter 3 gives a descriptive tutorial on the very recent plant biology-inspired optimization technique. The chapter contains illustrative figures and details of the optimizer's algorithm structure. In Chap. 4, Pei presents some new trends in fitness landscape analysis in evolutionary computation and meta-heuristic studies. The fitness landscape concept is borrowed from evolutionary biology, where it presents and visualizes the relationship between a genotype and its success. Evolutionary computation algorithms use this concept to describe the success of an optimized variable. It forms a bridge between the evolutionary computation algorithm and the problem it optimizes. In Chap. 5, Rajakumar presents the biological motivation drawn from the lion's behavior and its interpretation in the lion algorithm (LA). The chapter briefly discusses the first version of the algorithm, followed by a detailed, illustrated description. Subsequently, the chapter discusses the performance of the LA on different benchmark suites as well as notable applications with problem formulations. In Chap. 6, Zamani and Amirghasemi present a self-adaptive search procedure for solving the quadratic assignment problem, exploring the structure of the problem through regular interchanges of facilities made by a linear assignment technique. The quadratic assignment problem was traditionally introduced as a mathematical model related to economic activities, and it models many real-world problems, from optimally arranging machines in factories to finding the best location of departments within plants.
Biological systems contain self-adaptive mechanisms, and such self-adaptivity can inspire computer scientists to use the same mechanisms in their algorithms. In Chap. 7, Rebello and Oliveira present applications of the meta-heuristic known as the grey wolf optimizer (GWO) to the NP-hard transmission network expansion planning (TNEP) problem. In addition, in Chap. 8 Khosravy et al. study the searching behavior of the very recent Mendelian evolutionary theory optimizer by tracing the points in the search space. In Chap. 9, Chatterjee et al. propose a novel meta-heuristic optimization algorithm employing the concept of artificial cells, which are inspired by biological living cells. The artificial cell division (ACD) algorithm is applied to traverse the search space efficiently while decreasing the search time. In Chap. 10, Takano et al. introduce problem frameworks to determine coordinated operation schedules of microgrid components, including controllable generation systems (CGs), energy storage systems (ESSs) and controllable loads (CLs). The discussion of the problem frameworks includes electricity trade with the conventional power grids and uncertainty originating from variable renewable energy sources and/or electricity consumption. Particle swarm optimization (PSO) is deployed as the basis of the solution method. In Chap. 11, Duque et al. present applications of the bio-inspired metaheuristic known as monkey search (MS) to the planning of electrical energy distribution systems (EEDS). The technique is inspired by the behavior of a monkey searching for food in a jungle by climbing trees.
In Chap. 12, Gupta et al. discuss the training of artificial neural networks with the recent plant biology-inspired optimizer. In Chap. 13, Fister et al. propose a new method for evolving classification pipelines automatically, founded on stochastic nature-inspired population-based optimization algorithms. Chapter 14 reviews fourteen state-of-the-art nature-inspired optimizers and compares their efficiency in training an artificial neural network for fault detection in the hydraulic system of agricultural machinery. In Chap. 15, Panoeiro et al. compare the performance of the bat algorithm, grey wolf optimizer and sine cosine algorithm on an optimization problem that consists of determining the optimal wind farm layout configuration, with the objectives of maximizing the extracted power while minimizing the costs related to the project. In Chap. 16, Mandava and Vundavilli develop an adaptive neural network algorithm to tune the gains of the PID controller and propose the modified chaotic invasive weed optimization (MCIWO) algorithm to train the weights of the neural network (NN). Finally, in Chap. 17 Firmino et al. optimize the crane operating time by ant colony optimization. In a nutshell, this book presents cutting-edge research in NIC theory and applications. A wide range of algorithms is introduced and applied to advanced problems in real life and industry. We are thankful to all the contributors for their valuable contributions. In addition, we are thankful to the book series editor for endless support. Last but not least, no words can express our sincere gratitude to the team members of Springer, who are always supportive.
Osaka, Japan: Mahdi Khosravy
Michigan, USA: Neeraj Gupta
Michigan, USA: Nilesh Patel
Okinawa, Japan: Tomonobu Senjyu
Contents
1 Recent Advances of Nature-Inspired Metaheuristic Optimization
   Ahmed Mohamed Helmi and Mohammed Elsayed Lotfy
2 Prediction in Nature-Inspired Dynamic Optimization
   Almuth Meier and Oliver Kramer
3 Plant Genetics-Inspired Evolutionary Optimization: A Descriptive Tutorial
   Neeraj Gupta, Mahdi Khosravy, Nilesh Patel, Om Prakash Mahela, and Gazal Varshney
4 Trends on Fitness Landscape Analysis in Evolutionary Computation and Meta-Heuristics
   Yan Pei
5 Lion Algorithm and Its Applications
   B. R. Rajakumar
6 A Self-adaptive Nature-Inspired Procedure for Solving the Quadratic Assignment Problem
   Reza Zamani and Mehrdad Amirghasemi
7 Modified Binary Grey Wolf Optimizer
   Gustavo Rebello and Edimar José de Oliveira
8 Tracing the Points in Search Space in Plant Biology Genetics Algorithm Optimization
   Mahdi Khosravy, Neeraj Gupta, Nilesh Patel, Om Prakash Mahela, and Gazal Varshney
9 Artificial Cell Swarm Optimization
   Sankhadeep Chatterjee, Subham Dawn, and Sirshendu Hore
10 Application Example of Particle Swarm Optimization on Operation Scheduling of Microgrids
   Hirotaka Takano, Hiroshi Asano, and Neeraj Gupta
11 Modified Monkey Search Technique Applied for Planning of Electrical Energy Distribution Systems
   F. G. Duque, L. W. De Oliveira, E. J. De Oliveira, B. H. Dias, and C. A. Moraes
12 Artificial Neural Network Trained by Plant Genetic-Inspired Optimizer
   Neeraj Gupta, Mahdi Khosravy, Nilesh Patel, Saurabh Gupta, and Gazal Varshney
13 Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines
   Iztok Fister Jr., Milan Zorman, Dušan Fister, and Iztok Fister
14 Evolutionary Artificial Neural Networks: Comparative Study on State-of-the-Art Optimizers
   Neeraj Gupta, Mahdi Khosravy, Nilesh Patel, Saurabh Gupta, and Gazal Varshney
15 Application of Recent Metaheuristic Techniques for Optimizing Power Generation Plants with Wind Energy
   F. F. Panoeiro, G. Rebello, V. A. Cabral, C. A. Moraes, I. C. da Silva Junior, L. W. Oliveira, and B. H. Dias
16 Design and Comparison of Two Evolutionary and Hybrid Neural Network Algorithms in Obtaining Dynamic Balance for Two-Legged Robots
   Ravi Kumar Mandava and Pandu R. Vundavilli
17 Optimizing the Crane's Operating Time with the Ant Colony Optimization and Pilot Method Metaheuristics
   Andresson da Silva Firmino, Valéria Cesário Times, and Ricardo Martins de Abreu Silva
About the Editors
Mahdi Khosravy was born in Birjand, Iran, in 1979. He received his B.Sc. in Electrical Engineering (bio-electric) from Sahand University of Technology, Tabriz, Iran, and his M.Sc. in Biomedical Engineering (bio-electric) from Beheshti University of Medical Studies, Tehran, Iran. He received his Ph.D. in Information Technology from the University of the Ryukyus, Okinawa, Japan. He was awarded by the head of the university for his excellence in research activities. To broaden his international experience in education and research, he joined the University of Information Science and Technology (UIST), Ohrid, Macedonia, as an assistant professor in September 2010. In 2016, he established a journal in information technology (ejist.uist.edu.mk) at UIST, of which he currently holds the executive editorship. His professorship at UIST helped him greatly to extend his international collaborations. In July 2017, he became an associate professor. In August 2018, he joined the Energy Lab at the University of the Ryukyus as a visiting researcher. Since April 2018, he has jointly been a visiting associate professor in the Electrical Engineering Department, Federal University of Juiz de Fora, Brazil. Since November 2019, Dr. Khosravy has been an appointed researcher in the Media Integrated Communication Lab, Osaka University, Japan. Dr. Khosravy is a member of IEEE.
Neeraj Gupta was born in India in November 1979. His multidisciplinary studies give him the ability to combine different engineering fields and explore cutting-edge research in optimization, information technology, network design, image processing and smart grids. He received a Diploma in civil engineering (specializing in environmental and pollution control) in 1999, a Bachelor of Engineering in electrical and electronics engineering in 2003, a Master of Technology (M.Tech.) in engineering systems in 2006, and a Ph.D. in economic operation of power systems (power and control) in February 2013 from IIT Kanpur, India. He worked as a postdoctoral fellow (Sr. Project Engineer) at the Indian Institute of Technology (IIT) Jodhpur, India, for one year (June 2012 to May 2013) and thereafter joined the same institute as faculty (May 2013 to August 2014). He also has two years of industrial and academic experience from before his M.Tech. At IIT Jodhpur, he collaborated on many national and international projects funded by MNRE, UNICEF and others. He was an Assistant Professor at the University for Information Science and Technology "St. Paul the Apostle", Ohrid, Macedonia, from 2014 to 2017. Currently, he is teaching and research staff at the Department of Engineering and Technology, Oakland University, USA. Owing to his exposure to different engineering fields and a wide research domain, his current research interests are in optimization, smart grid technology, smart cities, big data problems, multi-agent modeling, IoT and its applications, and the development of heuristic optimization algorithms, particularly in the area of multilateral and real-time operation of complex systems.
Nilesh Patel is an Associate Professor in the Department of Computer Science and Engineering at Oakland University, Rochester, Michigan. Prior to his tenure at Oakland University, he served as an Assistant Professor at the University of Michigan, Dearborn. In addition to his academic service, Dr. Patel served as a Software Architect and Software Engineering Manager at Ford Motors and Visteon Corporation, where he played an instrumental role in the design and development of the first voice-enabled vehicular control and GPS navigation systems. His research interests include Deep Machine Learning, Pattern Recognition, Visual Computing, Evolutionary Computing, and Big Data Analytics.
Tomonobu Senjyu (SM'06) was born in Saga Prefecture, Japan, in 1963. He received the B.S. and M.S. degrees in Electrical Engineering from the University of the Ryukyus, Nishihara, Japan, in 1986 and 1988, respectively, and the Ph.D. degree in Electrical Engineering from Nagoya University, Nagoya, Japan, in 1994. He is currently a Full Professor in the Department of Electrical and Electronics Engineering, University of the Ryukyus. His research interests are in the areas of renewable energy, power system optimization and operation, power electronics, and advanced control of electrical machines.
Chapter 1 Recent Advances of Nature-Inspired Metaheuristic Optimization
Ahmed Mohamed Helmi (1) and Mohammed Elsayed Lotfy (2, 3) (✉)
(1) Computer and Systems Department, Engineering Faculty, Zagazig University, Zagazig 44519, Egypt, [email protected]
(2) Electrical Power and Machines Department, Engineering Faculty, Zagazig University, Zagazig 44519, Egypt, [email protected]
(3) Electrical and Electronics Engineering Department, Engineering Faculty, University of the Ryukyus, Nishihara, Okinawa 903-0213, Japan
© Springer Nature Singapore Pte Ltd. 2020. M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_1
1 Introduction
Nature-inspired metaheuristic algorithms have received notable interest in the optimization research community [1]. They are applied in a wide range of areas, from medical applications [2] to power system planning [3]. As the power and popularity of metaheuristic optimization grow, researchers have become more interested in its possible efficiency across a wide range of potential applications. These include fields in which techniques are already available but may still need optimization, such as signal processing [4, 5], image processing [6], image adaptation [7], image enhancement [8], data mining [9], big data [10, 11], telecommunications [12–15], quality assessment [16], noise cancelation [17], morphological kernel design [18], morphological filtering [19], blind source separation [20–24], blind component processing [25], and acoustic data embedding [26]. As a subclass of stochastic optimization, metaheuristic techniques perform much better than plain random search and are hampered neither by the requirements of mathematical methods, such as continuity and differentiability of functions, nor by the exponential time of exhaustive search. The historical genetic algorithm (GA) was introduced by Holland in 1960 [27, 28], inspired by Darwin's theory of evolution. Later, in 1989, Goldberg gave a detailed explanation of GA [28], from theory to implementation. GA is the root of the nature-inspired optimization family. The main concepts of applying a heuristic search like GA outline the similar techniques that followed. Problem formulation, solution encoding, fitness function implementation, the mechanism for generating new solutions, and termination conditions are the building blocks of any GA-like technique. The success of GA (and its small family of evolutionary algorithms (EA) [29]) in different fields opened a window to numerous contributions to the optimization literature. Researchers have been motivated by different sources in nature, for example, behavior of animals, movements of insects, physical processes, chemical reactions, and other
natural phenomena. However, the limitations of GA were also fruitful in directing research toward alternative techniques, rather than enhancing existing ones. The principles of Darwinian evolution for solving optimization problems have been reiterated in GA [30–32]. A very recent variation of GA applies Mendelian evolution to multiple species, as inspired by plant biology [33, 34], incorporating the use of double-strand DNA for evolution. Nature-inspired algorithms (NIA) constitute a big category of metaheuristic techniques and may themselves be classified into subclasses. Without loss of generality, the main classification factors are solution encoding, number of search agents, movement mechanism and objective function, local or global optimality, and implementation or coding style. Solutions may be encoded as real-valued numbers; hence, there is a continuous search space with an infinite number of solutions. When the number of feasible solutions is countable, the search space is said to be combinatorial, and integer values compose the solutions. The latter class of problems is well known in the literature as combinatorial optimization problems (COPs) [35]. The number of search agents or solutions divides techniques into single-solution-based (or trajectory-based) and population-based techniques. The movement mechanism is, in principle, the method of generating neighborhood solutions to the current one(s) in order to examine various promising solutions. This can be implicitly interpreted as the ability of the applied search technique to explore as many regions of the search space as possible and to exploit local regions around the "good" solutions found so far. Hence, any member of the NIA class is asked to achieve a good compromise between exploration and exploitation. The objective function provides the criteria for evaluating solutions found by the search algorithm. It determines whether the problem is one of maximization or minimization (i.e., fitness function or cost function).
On the other hand, if there is only one measuring criterion, then a single-objective technique is enough to look for a global optimum (maximum or minimum). When solutions have more than one optimality factor, multiobjective techniques should be used to find one or more Pareto optima [36]. A search technique may perform "local" optimal movements to enhance one "good" solution already found; hence, it is classified as a local optimizer, whereas global optimizers try to find near-optimal solutions starting from a random region of the search space. At the level of computer implementation, an optimization technique may run sequentially, or parallelization can be used to speed up the search process. For a more detailed and comprehensive study of such classifications, please refer to the book by Talbi [37]. Besides the endless sources of inspiration in nature, the extensibility of NIA has been proven throughout the literature. Various improvements to NIA have been made possible by its simple background and flexible architecture. Very strong evidence is provided by hybrid metaheuristic techniques [37, 38], in which different levels of cooperation among metaheuristic techniques, or between them and exact techniques, are possible. NIA has a notable share in hybrid techniques, as recently discussed in [39]. It is worth mentioning that an interesting overview of NIA and its applications was introduced by Yang [40] in 2014. The following NIA were well reviewed in [40]: simulated annealing [41], GA, differential evolution (DE) [42], ant algorithms [43], bee-inspired algorithms [44–46], particle swarm optimization (PSO) [47], firefly algorithm (FFA) [48], cuckoo search algorithm (CSA) [49], bat algorithm [50], harmony search (HS) [51], and flower pollination algorithms (FPA) [52]. Later, in 2017, Lindfield and Penny handled the NIA topic in their work [53]. EA, CSA, FFA, artificial bee colony [46], and ant colony optimization [54] were reviewed again. PSO, the bacterial foraging inspired algorithm [55], and physical-inspired optimization algorithms [56] were introduced in detail. Implementation issues of the algorithms were also of interest in [53]. Bozorg-Haddad introduced a new set of NIA in [57], which contains cat swarm optimization [58], league championship algorithm [59], anarchic society optimization [60], teaching–learning-based optimization [61], krill herd algorithm [62], gray wolf optimization (GWO) [63], shark smell optimization [64], ant lion optimization (ALO) [65], gradient evolution [66], moth-flame optimization (MFO) [67], crow search algorithm (CRSA) [68], and dragonfly algorithm (DA) [69], while CSA and FPA are revisited. Yang returned in 2018 with a new study [70] covering the state of the art of many of the aforementioned techniques. The focus was on the following topics: mathematical analysis of NIA, no-free-lunch (NFL) theorems in the literature, and various application areas of NIA, for example, feature selection, classification, computational geometry, wireless networks, modeling to generate alternatives, and others. In this chapter, we highlight the recent newborn techniques in the NIA family that have not been considered in similar previous studies. We introduce a comprehensive view of all aspects of each algorithm considered here. Figure 1 shows a timeline of the NIA techniques mentioned here.
Fig. 1. Timeline (per year) for historical and recent nature-inspired algorithms
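The GA building blocks named in the introduction (solution encoding, fitness function, mechanism for generating new solutions, and termination condition) can be illustrated with a minimal sketch. It is not taken from any cited work: the OneMax test problem, the function names, tournament selection, one-point crossover, elitism, and all parameter values are assumptions chosen only for illustration.

```python
import random


def one_max(bits):
    """Fitness function (assumed toy problem): number of ones, to be maximized."""
    return sum(bits)


def tournament(population, fitness, rng, k=3):
    """Selection: the fittest of k individuals drawn at random."""
    return max(rng.sample(population, k), key=fitness)


def crossover(parent_a, parent_b, rng):
    """One-point crossover of two bit strings of equal length."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(bits, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in bits]


def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=100,
                      mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Solution encoding: fixed-length bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    # Termination condition: a fixed number of generations.
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best solution over unchanged
        while len(children) < pop_size:
            a = tournament(population, fitness, rng)
            b = tournament(population, fitness, rng)
            children.append(mutate(crossover(a, b, rng), rng, mutation_rate))
        population = children
        best = max(population, key=fitness)
    return best


if __name__ == "__main__":
    solution = genetic_algorithm(one_max)
    print(solution, one_max(solution))
```

Swapping `one_max` for another fitness function, or the bit string for another encoding, changes the problem without touching the loop, which is the sense in which these blocks outline any GA-like technique.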
2 Recent Novel Techniques
The inspiration behind a novel technique in the NIA class arouses the passion of interested workers in the field. Among the vast NIA literature, this section focuses on six novel techniques that have joined the NIA family very recently. Five single-objective optimization techniques are included, namely the emperor penguins colony (2019), seagull optimization algorithm (2019), sailfish optimizer (2019), pity beetle algorithm (2018), and emperor penguin optimizer (2018). The multi-objective artificial sheep algorithm (2018) is introduced as well. For each technique introduced in this section, a summary of the inspiration background is given, the search operators and algorithmic behavior are clarified, the main steps of the proposed algorithm are stated simply, the effect of the search parameters on exploration and exploitation is discussed, and finally, the testing sites where empirical results show the superiority of the proposed algorithm over other optimizers are mentioned. In light of the NFL theorem, we consider that the testing sites of each newly proposed optimization technique, as well as the algorithms selected for comparison, should be identified.
2.1 Emperor Penguins Colony Algorithm
Harifi et al. introduced the emperor penguins colony (EPC) algorithm in [71]. It is a social swarm-based technique inspired by the behavior of the largest living species of penguin in Antarctica, the emperor penguin. To overcome the extremely cold weather, male individuals group into huddles (a huddle may gather hundreds of penguins). In a huddle, the body surface temperature of a penguin reaches 37 °C in 2 h. A penguin needs to keep its temperature at a certain level (about 35 °C) for the growth of its fetus. The center of a huddle is much warmer than its boundaries. Hence, the outer penguins make spiral-like movements (see Fig. 2) toward the inside to capture more heat, and everyone in the huddle becomes as warm as required. Each penguin determines its movement according to an attractiveness measure, which in turn depends on the body temperature and heat radiation of each penguin [71]. The initial population is formed from the starting positions of the penguins, which are scattered in the huddle. Penguins move toward the warmer one (i.e., a penguin with the highest heat intensity). Cost is determined according to the heat intensity and the distance between penguins. When attraction (based on a heat absorption coefficient) takes place, new solutions are generated and the heat intensity is updated.
Fig. 2. Spiral-like movement inside a huddle of emperor penguins
Sorting all solutions determines the best solution so far. For convergence purposes, a damping ratio of heat radiation is updated, and then the movement is applied and the attraction process is tested. The radiation emitted from each penguin body per unit time, $Q_{penguin}$ (W), is calculated as:

$$Q_{penguin} = A \varepsilon \sigma T_s^4 \tag{1}$$

where the total body surface area is $A = 0.56\ \mathrm{m^2}$, the emissivity of bird plumage is $\varepsilon = 0.98$, the Stefan–Boltzmann constant is $\sigma = 5.6703 \times 10^{-8}\ \mathrm{W/m^2K^4}$, and the absolute temperature is $T_s = 308.15$ K (equal to 35 °C).
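As a quick numerical check of Eq. 1, the following minimal sketch plugs in the constants quoted above. The function and constant names are illustrative choices, not part of [71]. With these values the radiated power comes out at roughly 280.6 W.

```python
# Constants as quoted in the text for Eq. 1.
SIGMA = 5.6703e-8   # Stefan-Boltzmann constant, W/(m^2 K^4)
AREA = 0.56         # total body surface area A, m^2
EMISSIVITY = 0.98   # emissivity of bird plumage
T_S = 308.15        # absolute surface temperature, K (35 degrees C)


def heat_radiation(area=AREA, emissivity=EMISSIVITY, temperature=T_S):
    """Radiated power in watts: Q = A * eps * sigma * T^4 (Eq. 1)."""
    return area * emissivity * SIGMA * temperature ** 4


q_penguin = heat_radiation()
print(f"Q_penguin = {q_penguin:.1f} W")
```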
Fig. 3. Coordinated spiral movement of penguins
A. M. Helmi and M. E. Lotfy
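The attractiveness is calculated as follows:

$$Q = A \varepsilon \sigma T_s^4 e^{-\mu x} \tag{2}$$

where $\mu$ is the attenuation coefficient and $x$ represents the distance between penguins, considering a linear radiation emitted from penguin bodies. The spiral-like movement aims to optimize the temperature of a penguin. For a penguin at position k when traveling from position i to position j (see Fig. 3), the $x_k$ and $y_k$ components can be modeled as in Eq. 3.

$$
\begin{aligned}
x_k &= a\, e^{\, b \frac{1}{b} \ln\left\{ (1-Q)\, e^{b \tan^{-1}\frac{y_i}{x_i}} + Q\, e^{b \tan^{-1}\frac{y_j}{x_j}} \right\}} \cos\left\{ \frac{1}{b} \ln\left\{ (1-Q)\, e^{b \tan^{-1}\frac{y_i}{x_i}} + Q\, e^{b \tan^{-1}\frac{y_j}{x_j}} \right\} \right\} \\
y_k &= a\, e^{\, b \frac{1}{b} \ln\left\{ (1-Q)\, e^{b \tan^{-1}\frac{y_i}{x_i}} + Q\, e^{b \tan^{-1}\frac{y_j}{x_j}} \right\}} \sin\left\{ \frac{1}{b} \ln\left\{ (1-Q)\, e^{b \tan^{-1}\frac{y_i}{x_i}} + Q\, e^{b \tan^{-1}\frac{y_j}{x_j}} \right\} \right\}
\end{aligned}
\tag{3}
$$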
where a and b are two predetermined constants that control the shape of the logarithmic spiral. To avoid monotonous movement, a random component is added as a mutation operator (Eq. 4).

$$x_k \leftarrow x_k + \varphi \epsilon_i, \qquad y_k \leftarrow y_k + \varphi \epsilon_i \tag{4}$$

where $\varphi$ is the mutation factor and $\epsilon_i$ is a randomly generated vector for each solution i in the population. The EPC algorithm uses a uniform distribution; however, other distributions are also applicable. The main steps and mathematical formulation of the EPC algorithm are summarized in Algorithm 1. The algorithm moves adaptively from the exploration phase into the exploitation phase by increasing the attenuation coefficient $\mu$ in Eq. 2, which, together with reducing the heat radiation in Eq. 1, reduces the attractiveness factor Q. Decreasing the mutation factor $\varphi$ in Eq. 4 also leads directly to solution convergence.
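The movement described by Eqs. 2 to 4 can be sketched in a few lines. This is only an illustrative reading under stated assumptions, not the reference implementation of [71]: the attractiveness passed to the spiral step is assumed to be already scaled into [0, 1] (otherwise the logarithm in Eq. 3 is undefined), `atan2` replaces the plain inverse tangent so that all quadrants are handled, and the mutation noise is drawn uniformly from [-1, 1]. All function names are hypothetical.

```python
import math
import random


def attractiveness(q_penguin, mu, distance):
    """Eq. 2: heat radiation attenuated exponentially with distance."""
    return q_penguin * math.exp(-mu * distance)


def spiral_move(p_i, p_j, q, a=1.0, b=0.5):
    """Eq. 3: point on a logarithmic spiral between penguins i and j.

    q is assumed to lie in [0, 1]; q = 0 keeps the angle of i,
    q = 1 takes the angle of j.
    """
    theta_i = math.atan2(p_i[1], p_i[0])  # assumption: atan2 instead of tan^-1(y/x)
    theta_j = math.atan2(p_j[1], p_j[0])
    theta_k = (1.0 / b) * math.log(
        (1.0 - q) * math.exp(b * theta_i) + q * math.exp(b * theta_j)
    )
    radius = a * math.exp(b * theta_k)
    return radius * math.cos(theta_k), radius * math.sin(theta_k)


def mutate(point, phi, rng):
    """Eq. 4: add the mutation factor times uniform noise (range assumed)."""
    return tuple(c + phi * rng.uniform(-1.0, 1.0) for c in point)


rng = random.Random(0)
x_k, y_k = spiral_move((1.0, 0.0), (0.0, 1.0), q=0.5)
print(mutate((x_k, y_k), phi=0.1, rng=rng))
```

Shrinking `phi` over the iterations narrows the random perturbation, which matches the convergence argument given above.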
Algorithm 1: Emperor Penguins Colony (EPC)
Generate a random initial population of solutions
Evaluate the cost of initial solutions
Determine an initial value for heat absorption coefficient
Repeat until Max number of iterations is reached
  For all solutions in current population with size [...]
    Compare the cost of each pair of solutions
    If [...] then
      Calculate heat radiation by Eq. 1
      Calculate attractiveness by Eq. 2
      Calculate coordinated spiral movement by Eq. 3
      Generate new solutions by Eq. 4
      Evaluate new solutions
    End if
  End for
  Find best solution so far after sorting
  Update parameters: decrease heat radiation, decrease mutation coefficient, and increase heat absorption coefficient
End loop
The performance of the EPC algorithm was examined using eight standard benchmark test functions, including the Ackley, sphere, and Rosenbrock functions. The algorithms of comparison were GA, imperialist competitive algorithm (ICA) [72], PSO, ABC, DE, HS, GWO, and invasive weed optimization [73]. The robustness of the proposed algorithm was tested by varying the population size between 10 and 50. Empirical results show that EPC performance is acceptable and that it does not lose quality on larger-dimensional problems. Statistical analysis was also performed using the Friedman test [74] and the Iman-Davenport test [75]. Moreover, the authors of [71] stated that the EPC algorithm can perform well on multimodal and nonlinear optimization problems.

2.2 Seagull Optimization Algorithm
Dhiman et al. introduced the seagull optimization algorithm (SOA) in [76]. This novel algorithm is inspired by the migration and attacking behaviors of seagulls, a kind of sea bird. Seagulls live in colonies; hence, this is another swarm-based technique.
Fig. 4. Attacking behavior of seagulls
Their seasonal movement, or migration, also occurs in groups, and it has been observed that seagulls avoid colliding with each other by adjusting their positions. The migration direction is determined by the fittest seagull. Seagulls can attack other birds with a spiral-shaped movement (see Fig. 4). Migration and attacking behaviors are explicitly mapped onto the exploration and exploitation tasks of SOA. The search process starts by generating collision-free solutions. To ensure this behavior, a movement operator A is applied to generate a new population $\vec{C}_s$ from the current one $\vec{P}_s$ at iteration x as follows in Eq. 5.

$$\vec{C}_s = A \times \vec{P}_s(x) \tag{5}$$

where A is linearly decreased from 2 to 0 using Eq. 6.

$$A = f_c - \left( x \times \left( f_c / Max_{Iteration} \right) \right) \tag{6}$$

where $f_c = 2$ and $x = 0, 1, \ldots, Max_{Iteration}$. During the search process, seagulls make use of search experience and attack in a spiral movement, which can be modeled using Eq. 7.

$$r = u \times e^{kv} \tag{7}$$

where u and v are two constants that define the shape of the spiral path (both are set to 1 in SOA), and k is a random value in the range $[0, 2\pi]$. The x, y, and z components of the vector r can be derived as in Eq. 8.

$$x' = r \times \cos(k), \qquad y' = r \times \sin(k), \qquad z' = r \times k \tag{8}$$
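The linear schedule of Eq. 6 and the spiral components of Eqs. 7 and 8 are simple enough to check directly. The sketch below uses hypothetical function names; the defaults $f_c = 2$ and $u = v = 1$ are the values quoted in the text.

```python
import math


def movement_factor(iteration, max_iteration, fc=2.0):
    """Eq. 6: A decreases linearly from fc (iteration 0) to 0 (last iteration)."""
    return fc - iteration * (fc / max_iteration)


def spiral_components(k, u=1.0, v=1.0):
    """Eqs. 7-8: radius r = u * exp(k * v) and the x', y', z' components."""
    r = u * math.exp(k * v)
    return r * math.cos(k), r * math.sin(k), r * k


for it in (0, 50, 100):
    print(it, movement_factor(it, 100))
print(spiral_components(math.pi / 2))
```

Because r grows exponentially with k, the spiral components span several orders of magnitude over $[0, 2\pi]$, which is worth keeping in mind when scaling positions.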
Solutions are updated according to the position of the best neighbor using Eq. 9.

$$\vec{M}_s = B \times \left( \vec{P}_{best}(x) - \vec{P}_s(x) \right) \tag{9}$$

where $\vec{M}_s$ represents the positions of the current solution $\vec{P}_s$ with respect to $\vec{P}_{best}$, the best solution so far. B introduces the randomization required to balance the exploration and exploitation phases. B depends on a random value $rd \in [0, 1]$, and it is calculated as in Eq. 10.

$$B = 2 \times A^2 \times rd \tag{10}$$

Now the distance $\vec{D}_s$ between the current solution and the best solution can be calculated as in Eq. 11.

$$\vec{D}_s = \left| \vec{C}_s + \vec{M}_s \right| \tag{11}$$

Updating the positions of search agents can be formulated as in Eq. 12.

$$\vec{P}_s(x) = \left( \vec{D}_s \times x' \times y' \times z' \right) + \vec{P}_{best}(x) \tag{12}$$

SOA is a simple algorithm, as can be seen from its main steps in Algorithm 2.

Algorithm 2: Seagull Optimization Algorithm (SOA)
Generate an initial population
Initialize parameters: [...], [...], and [...]
Repeat until $Max_{Iteration}$ is reached
  Apply collision avoidance using Eq. 5
  Evaluate solutions of current population and determine the best solution so far
  Calculate [...] using Eq. [...]
  Calculate x', y', and z' components using Eq. 8
  Calculate distance $\vec{D}_s$ using Eqs. [...]
  Update current population using Eq. 12
  Update A using Eq. 6
End loop
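Putting Eqs. 5 to 12 together, one position update of a single search agent might look like the sketch below. It is an interpretation rather than the code of [76]: it assumes one random rd and one random k per agent (not per dimension) and takes the absolute value in Eq. 11 element-wise. The function name `soa_update` is hypothetical.

```python
import math
import random


def soa_update(position, best, iteration, max_iteration, rng,
               fc=2.0, u=1.0, v=1.0):
    """One SOA position update for a single agent (Eqs. 5-12)."""
    a = fc - iteration * (fc / max_iteration)      # Eq. 6
    b = 2.0 * a * a * rng.random()                 # Eq. 10, rd in [0, 1]
    k = rng.uniform(0.0, 2.0 * math.pi)
    r = u * math.exp(k * v)                        # Eq. 7
    spiral = (r * math.cos(k)) * (r * math.sin(k)) * (r * k)  # x' * y' * z'
    new_position = []
    for p, p_best in zip(position, best):
        c = a * p                                  # Eq. 5, collision avoidance
        m = b * (p_best - p)                       # Eq. 9, move toward the best
        d = abs(c + m)                             # Eq. 11
        new_position.append(d * spiral + p_best)   # Eq. 12
    return new_position


rng = random.Random(42)
print(soa_update([1.0, -2.0], [0.5, 0.5], iteration=3, max_iteration=8, rng=rng))
```

At the last iteration A is zero, so both the collision-avoidance term and B vanish and the agent lands exactly on the best solution, which illustrates the shift from exploration to exploitation.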
The performance of SOA was examined using 23 commonly used benchmark test functions. Some are unimodal, testing exploitation around the optimum, whereas others are multimodal, testing exploration (or diversity) and local optima avoidance. Another set of test functions used for the same purposes comprises CEC 2005 and CEC 2015 [77, 78]. SOA was compared to spotted hyena optimizer (SHO) [79], GWO, PSO, MFO, multiverse optimizer (MVO) [80], sine cosine algorithm (SCA) [81], gravitational search algorithm (GSA) [82], GA, and DE. Empirical results suggested that SOA outperforms many other optimizers in most test cases.
The Wilcoxon signed-rank test [83] was applied at the 5% level of significance. The Mann–Whitney U rank sum test [84] was also applied to the average values of CEC 2015 problems. These statistical tests reveal the superiority of SOA. Moreover, SOA efficiency was examined in solving several real-life constrained optimization problems, namely constraint handling, optical buffer design, pressure vessel design, speed reducer design, welded beam design, tension/compression spring design, rolling-element bearing design, and 25-bar truss design problems. The latter set of tests shows that SOA can be very effective when solving constrained engineering problems, with reasonable computational cost and a high speed of convergence.

2.3 Sailfish Optimizer
Shadravan et al. introduced the sailfish optimizer (SFO) in [85]. SFO is a new swarm-based optimization technique that depends on two populations of solutions. It is inspired by the social behavior of a group of hunting fish called sailfish. They follow a remarkable hunting strategy that involves alternating attacks on a school of smaller fish such as sardines (see Fig. 5). This strategy saves the energy of the hunters that rest while other members are injuring the prey. Another interesting behavior is that sailfish change their body color before attacking, most probably to determine who attacks first and to prevent interference between group members.
Fig. 5. A sailfish is attacking a school of sardine
Sailfish represent the population of candidate solutions, which are the main search agents in the search space. A second population, that of the sardines, is incorporated to find the best solution in its region. During the search process, an elite of the fittest
solutions of both populations are kept in $X^{i}_{elite\_SF}$ and $X^{i}_{injured\_S}$ (at the ith iteration) in order to determine the movements of the next solutions of each population. Sailfish can attack the prey from all directions, and they shrink the circle of attack. This strategy is translated into updating the position of the current solution $X^{i}_{old\_SF}$, within a hypersphere neighborhood around the best solution $X^{i}_{elite\_SF}$, into $X^{i}_{new\_SF}$ as in Eq. 13.

$$X^{i}_{new\_SF} = X^{i}_{elite\_SF} - \lambda_i \times \left( rand(0,1) \times \left( \frac{X^{i}_{elite\_SF} + X^{i}_{injured\_S}}{2} \right) - X^{i}_{old\_SF} \right) \tag{13}$$

where the coefficient $\lambda_i$ is calculated in Eq. 14 and affects the search region around the sardines.

$$\lambda_i = 2 \times rand(0,1) \times PD - PD \tag{14}$$

PD is the prey density at the ith iteration, which decreases along the search process. PD is adaptively updated according to the population sizes of sailfish and sardines, $N_{SF}$ and $N_S$, respectively, in Eq. 15.

$$PD = 1 - \frac{N_{SF}}{N_{SF} + N_S} \tag{15}$$

In SFO settings, $N_{SF}$ is a percentage of $N_S$, where the initial number of sardines is larger than the number of sailfish. Thus, $\lambda_i \in [-1, 1]$, and its fluctuation models the exploration to find a global solution. The other population, the sardines, is updated according to the details of the hunt. Sailfish power gradually decreases, as does the ability of sardines to escape and maneuver. This leads to a quick capture of injured sardines; in other words, exploitation takes place. A new position of a sardine $X^{i}_{new\_S}$ at the ith iteration can be calculated as given in Eq. 16.

$$X^{i}_{new\_S} = rand(0,1) \times \left( X^{i}_{elite\_SF} - X^{i}_{old\_S} + AP \right) \tag{16}$$

where $X^{i}_{old\_S}$ represents the current position of the sardine, $X^{i}_{elite\_SF}$ is the position of the best sailfish (i.e., the best solution so far), and AP is the sailfish's attacking power, which is linearly decreased each iteration as shown in Eq. 17.

$$AP = A \times \left( 1 - \left( 2 \times Itr \times \varepsilon \right) \right) \tag{17}$$

where $A = 4$ and $\varepsilon = 0.001$. Reducing the attacking power assists the convergence of solutions, together with limiting the number of position updates in the sardine population. This can be achieved using Eqs. 18 and 19.
$$\alpha = N_S \times AP \tag{18}$$

$$\beta = d_i \times AP \tag{19}$$

Here, moving from the exploration phase into the exploitation phase is controlled by the value of the AP coefficient. If $AP \geq 0.5$, then all sardines get updated, whereas $AP < 0.5$ limits the updates to just $\alpha$ sardines with $\beta$ positions. At the end of the hunting step, the position of a sailfish $X^{i}_{SF}$ is replaced by the position of its corresponding hunted sardine $X^{i}_{S}$ (i.e., a better solution) in order to increase the chance of hunting new prey. This situation is modeled in Eq. 20.

$$X^{i}_{SF} = X^{i}_{S}, \quad \text{if } f(S_i) < f(SF_i) \tag{20}$$

Using two different populations, where sailfish updates depend on the best sardine positions, supports good exploration of the search space as well as reducing the chances of being trapped in local minima. Algorithm 3 introduces the main steps of SFO.

Algorithm 3: Sailfish Optimizer (SFO)
Generate initial populations of sailfish and sardine
Initialize the parameters A and $\varepsilon$
Evaluate solutions in both populations and determine best solution of each one
Repeat until stopping criteria is met
  For each solution in sailfish
    Calculate $\lambda_i$ using Eq. 14
    Update positions of sailfish using Eq. 13
  End for
  Calculate AP coefficient using Eq. 17
  If AP < 0.5 then
    Calculate $\alpha$ and $\beta$ using Eq. 18 and Eq. 19
    Update positions of solutions in sardine by Eq. 16
  else
    Update positions of all sardine by Eq. 16
  End if
  Calculate the fitness of all sardine
  If a better solution in sardine is found, then
    Replace a sailfish with best sardine by Eq. 20
    Remove the best sardine from population
    Update best sailfish and best sardine
  End if
End loop
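The update rules of Eqs. 13 to 19 can be sketched as follows. This is a hedged illustration, not the code of [85]: $\alpha$ and $\beta$ are truncated to integers and clamped to valid counts (AP becomes negative late in the run), the updated sardines and variables are chosen uniformly at random, and one random factor is drawn per agent. All function names are hypothetical.

```python
import random


def prey_density(n_sailfish, n_sardine):
    """Eq. 15: PD = 1 - N_SF / (N_SF + N_S)."""
    return 1.0 - n_sailfish / (n_sailfish + n_sardine)


def attack_power(iteration, a=4.0, eps=0.001):
    """Eq. 17: AP = A * (1 - 2 * Itr * eps)."""
    return a * (1.0 - 2.0 * iteration * eps)


def update_sailfish(x_old, x_elite_sf, x_injured_s, pd, rng):
    """Eqs. 13-14 for one sailfish."""
    lam = 2.0 * rng.random() * pd - pd
    r = rng.random()
    return [e - lam * (r * (e + s) / 2.0 - o)
            for o, e, s in zip(x_old, x_elite_sf, x_injured_s)]


def update_sardines(sardines, x_elite_sf, ap, rng):
    """Eq. 16 applied to all sardines, or to a subset when AP < 0.5 (Eqs. 18-19)."""
    n_s, dim = len(sardines), len(sardines[0])
    if ap >= 0.5:
        rows, n_vars = list(range(n_s)), dim
    else:
        # Assumption: alpha and beta are truncated and clamped to valid counts.
        alpha = max(0, min(n_s, int(n_s * ap)))
        n_vars = max(0, min(dim, int(dim * ap)))
        rows = rng.sample(range(n_s), alpha)
    updated = [list(s) for s in sardines]
    for i in rows:
        r = rng.random()
        for j in rng.sample(range(dim), n_vars):
            updated[i][j] = r * (x_elite_sf[j] - sardines[i][j] + ap)
    return updated


rng = random.Random(7)
pd = prey_density(n_sailfish=10, n_sardine=90)
print(update_sailfish([1.0, 1.0], [0.0, 0.0], [0.5, 0.5], pd, rng))
print(update_sardines([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0], attack_power(10), rng))
```

With the quoted settings, AP starts at 4 and falls below 0.5 only after a few hundred iterations, so the partial sardine update is a late-stage, exploitation-oriented mechanism.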
Experimental tests of SFO performance were conducted using 20 benchmark functions with various characteristics. They vary between unimodal and multimodal, as well as in dimensionality. As mentioned earlier in this chapter, unimodal optimization functions are suitable for checking the exploitation behavior of the examined algorithm, whereas multimodal ones are used for testing exploration capabilities. The following optimizers were chosen for comparison: GA, PSO, GWO, satin bowerbird optimizer (SBO) [86], ALO, and salp swarm algorithm [87]. Reported results show a high speed of convergence and high efficiency of SFO over competitors
in most cases. For non-convex optimization problems (i.e., there are multiple feasible regions) as well as non-separable ones, where variables are dependent or interrelated, SFO shows great ability as an optimizer for such hard problems. Also, the statistical t-test was run on two different population sets, and the p-values show acceptance of SFO. For large-scale problems, a set of 11 benchmark functions with 300 dimensions was optimized by SFO, which outperformed all other optimizers of comparison. Moreover, SFO was tested on 5 engineering problems in comparison with different groups of state-of-the-art optimizers. The 5 problems, together with the optimizers tested alongside the proposed SFO, are as follows: the I-beam design problem with the optimizers MFO, adaptive response surface method (ARSM) [88], improved ARSM [89], CSA, and symbiotic organisms search [90]; the welded beam design problem with the optimizers GA, GSA, an EA variant by Siddall [91], a decision-making-based technique by Coello [92], Richardson's random method, simplex method, Davidon–Fletcher–Powell [93], and co-evolutionary PSO [94]; the gear train design problem with the optimizers GA, mine blast algorithm (MBA) [95], CRSA, MFO, and CSA; the three-bar truss design problem with the optimizers PSO-DE [96], MBA, MFO, the method of Ray and Saini [97], and CSA; and lastly, the circular antenna array design problem with the optimizers SBO, water cycle algorithm [98], and GA. The efficiency of SFO in comparison with other optimizers was demonstrated again on this class of engineering optimization problems. In conclusion, SFO is highly recommended as a global optimization algorithm. The assumptions of the NFL theorem still hold; however, SFO has passed extensive experimental and statistical tests.

2.4 Pity Beetle Algorithm
Kallioras et al. introduced the pity beetle algorithm (PBA) in [99]. The behavior of the six-toothed spruce bark beetle motivated Kallioras et al. to devise this new swarm-based optimization technique. The first generation of these insects starts with pioneer beetles searching a forest randomly in order to find suitable hosts. Trees are the hosts targeted by the beetles. Once a host is found, other beetles are attracted through a pheromone message, and a new generation is created. In this context, one beetle represents a solution, whereas a host is a randomly selected region in the search space. Population evolution is one crucial aspect of this kind of beetle, and it distinguishes PBA within the swarm intelligence class. In the searching stage, randomly located hosts are explored by pioneer beetles. After that, an aggregation stage starts, in which pheromones are spread to gather more beetles (the population is locally increased). In this stage, both weakened and robust hosts are discovered. At some threshold, the population becomes overcrowded, which in turn directs the beetles to look for other hosts. The latter stage is known as the anti-aggregation stage. To simulate this strategy, PBA uses a random sampling technique (RST) to ensure that almost all portions of the search space are equally represented in each generated population, so that diversification of the search space is well performed. For a D-dimensional problem, to construct a population of size N, the range of each of the D variables is divided into N non-overlapping equal intervals. Then a single value is
selected randomly from each variable range to produce a sample of size D. Thus, there is a hypervolume of $N \times D$ values, which are randomly matched to generate a population. The samples (i.e., initial solutions) are determined according to Eq. 21.

$$x_{ij} = F_x^{-1}\left( \frac{i - 1 + r_{x_i} / (u_i - l_i)}{N} \right) \tag{21}$$

where $r_{x_i} \in [l_i, u_i]$ is a random value in the range between the lower and upper bounds $l_i$ and $u_i$, respectively, of the ith parameter, and $F_x$ is the cumulative distribution function of the uniform distribution for the jth solution $x_{i,j}$. Following metaheuristic tradition, PBA has three main steps: initialization, movement mechanism, and solutions update. PBA starts by generating a population of N pioneer beetles that are randomly positioned in a hypervolume space (generation 0). Each of these initial solutions is attracted to the fittest one. A brood of six populations, each of size N, is created at the best-decided position (second generation), and then a new hypervolume selection pattern (i.e., movement mechanism) is decided for each one. Finally, new populations replace old ones. To achieve a good balance between the exploration and exploitation phases, PBA depends on the quality of previously obtained solutions rather than on the number of search iterations or the number of objective function evaluations. For the same purpose, updating the positions of solutions follows a distinct strategy, as explained below. Moving in the search space does not follow the same procedure all the time. PBA mimics the beetles' behavior of looking for a better position than the starting one in order to create their own population. PBA implements five methods for selecting a new hypervolume, namely neighboring search volume, mid-scale search volume, large-scale search volume, global-scale search volume, and memory consideration search volume. Once one of these selection patterns is chosen, a search area is examined around the birth position (denoted as $x_{birth}$ in Eq. 22) of the generated population, as explained by Eq. 22, where the previously mentioned RST is called again.

$$x_j^{(g)} = RST\left( l^{(g)}, u^{(g)}, D, N \right), \quad \text{where } \left[ l_i^{(g)}, u_i^{(g)} \right] \in \left[ x_{birth,i}^{(g)} \left( 1 - f_{pat} \right),\; x_{birth,i}^{(g)} \left( 1 + f_{pat} \right) \right] \tag{22}$$

where i represents the ith position of the jth solution $x_j$, g denotes the current generation number, $l_i^{(g)}$ and $u_i^{(g)}$ are the lower and upper bounds of the ith dimension in generation g, respectively, and $f_{pat}$ is a parameter related to the applied search pattern (called the pattern factor), which determines the size of the investigated neighborhood around $x_{birth}^{(g)}$.
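The sampling scheme behind Eqs. 21 and 22 can be illustrated with a short sketch. It reads RST as a Latin-hypercube-style sampler: each variable range is cut into N equal intervals, one uniform value is drawn per interval, and the values are shuffled independently per variable before being matched into solutions. That reading, the sorting of the bounds in `pattern_bounds` (needed when a coordinate of the birth position is negative), and the function names are assumptions rather than details confirmed by [99].

```python
import random


def rst(lower, upper, dim, n, rng):
    """Random sampling technique: n solutions, one value per interval and variable."""
    columns = []
    for d in range(dim):
        width = (upper[d] - lower[d]) / n
        # One uniform draw inside each of the n equal intervals of variable d.
        values = [lower[d] + (i + rng.random()) * width for i in range(n)]
        rng.shuffle(values)  # random matching across variables
        columns.append(values)
    return [[columns[d][j] for d in range(dim)] for j in range(n)]


def pattern_bounds(x_birth, f_pat):
    """Eq. 22: per-dimension bounds x_birth * (1 -/+ f_pat), sorted (assumption)."""
    lower = [min(x * (1.0 - f_pat), x * (1.0 + f_pat)) for x in x_birth]
    upper = [max(x * (1.0 - f_pat), x * (1.0 + f_pat)) for x in x_birth]
    return lower, upper


rng = random.Random(3)
low, up = pattern_bounds([2.0, -4.0], f_pat=0.1)
for solution in rst(low, up, dim=2, n=5, rng=rng):
    print(solution)
```

Every interval of every variable is represented exactly once in the population, which is the even-coverage property the text attributes to RST.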
Neighboring search hypervolume is used first by each starting solution (Step #3 in Algorithm 4), which results in a new population using Eq. 22.

[Algorithm 4: positions of populations; pseudo-code listing not recoverable]
In Eq. 22, $f_{pat}$ is replaced by a neighboring factor $f_{nb}$, which lies in the range [0.01, 0.2]. In this pattern, the selected neighborhood is relatively close to the position of the birth solution. A predetermined number of populations (analogous to broods at this stage) are generated. The best solution in each newborn population is determined, and the best among all populations is named $x_{birth}^{(g+1)}$. The latter replaces $x_{birth}^{(g)}$ if it is fitter. If the neighboring search hypervolume fails to find a better solution than the starting one, then PBA switches to the mid-scale search hypervolume pattern. A search neighborhood is investigated around the current best solution using Eq. 22, but $f_{pat}$ is now replaced by a mid-scale factor $f_{ms}$ with range [0.1, 1] (Step #8 in Algorithm 4). Again, the best solution among the new positions is determined and compared to the starting one. In case of improvement, the newly generated positions constitute a new population. The large-scale search hypervolume phase is also started when the neighboring search hypervolume fails to move the starting solution to a better one. PBA switches to this mode with a probability pr, which is a parameter of the algorithm. In Eq. 22, the large-scale factor $f_{ls}$ with range [1, 100] replaces $f_{pat}$ (Step #10 in Algorithm 4). Similarly, the new randomly positioned solutions are compared with each other to determine the best starting one. If PBA decides not to perform a large-scale search, it has another choice of movement, the memory consideration search hypervolume pattern (Step #12 in Algorithm 4). The N best solutions so far are kept in a memory MEM which is initialized with the
positions of the initial population. This pattern models another interesting aspect of the beetles' behavior, namely colony expansion. The procedure for investigating a new search area differs from the previously explained patterns and goes as follows: just one solution in MEM is selected to move randomly to a new position, which replaces the worst one in MEM in case of improvement. Also, a very narrow local search is carried out close to the newly found position using Eq. 22 with a fine-tuning parameter $f_{tn}$ in the range [0.005, 0.05]. When a large number of objective function evaluations $FE_{un}$ has passed with no improvement in the quality of solutions, the global-scale search hypervolume strategy is called. Given the maximum allowable number of function evaluations $FE_{Total}$, $FE_{un}$ can be calculated using a multiplication factor $f_{FE}$ in the range [0.05, 0.25] (Step #5 in Algorithm 4). A new population is created using the RST method within the global lower and upper bounds. Algorithm 5 presents a pseudo-code for PBA. The important steps of PBA (Steps #4–#11 in Algorithm 5) were clarified previously in Algorithm 4.

Algorithm 5: Pity Beetle Algorithm (PBA)
1: Reset [...] to [...]
2: Initialize search parameters
3: Repeat
4:   For each generated population in the [...]
5:     For each solution in the population
6:       Select a new hypervolume search pattern
7:       Evaluate solutions
8:       Increment [...]
9:     End
10:    Update population positions
11:  End
12: Until [...] exceeds [...] value
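The parameter ranges and the stagnation trigger described above can be collected in a small helper. The sketch assumes that $FE_{un}$ is simply the product $f_{FE} \times FE_{Total}$, which is one plausible reading of "calculated using a multiplication factor" and is not confirmed by the text. The dictionary keys and function names are hypothetical.

```python
# Pattern-factor ranges quoted in the text for use in Eq. 22.
FACTOR_RANGES = {
    "neighboring": (0.01, 0.2),    # f_nb
    "mid_scale": (0.1, 1.0),       # f_ms
    "large_scale": (1.0, 100.0),   # f_ls
    "fine_tuning": (0.005, 0.05),  # f_tn
}


def check_factor(pattern, value):
    """Return True when a chosen pattern factor lies inside its quoted range."""
    low, high = FACTOR_RANGES[pattern]
    return low <= value <= high


def stagnation_limit(fe_total, f_fe):
    """FE_un as f_FE * FE_Total (assumed form), with f_FE in [0.05, 0.25]."""
    if not 0.05 <= f_fe <= 0.25:
        raise ValueError("f_FE is expected in [0.05, 0.25]")
    return int(f_fe * fe_total)


def needs_global_search(fe_without_improvement, fe_total, f_fe):
    """Trigger the global-scale pattern after FE_un evaluations without improvement."""
    return fe_without_improvement >= stagnation_limit(fe_total, f_fe)


print(stagnation_limit(10_000, 0.25))
print(needs_global_search(3_000, 10_000, 0.25))
```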
Discussion. The perturbation mechanism developed in PBA tries to overcome known obstacles facing metaheuristic techniques. To begin with, initializing the first solutions using the RST method is at least more efficient than a plain random start. Together with generating more than one population around promising locations, this should reduce the effect of bad starts in the search space. The variety of neighborhoods generated via different walks in the search space implicitly handles the issues of premature convergence and trapping in local minima. PBA has a chance to perform a mutation-like step in the large-scale search hypervolume pattern. This is also the case if PBA enters the memory consideration search hypervolume pattern, where the worst solution in the population of best solutions so far may be discarded in favor of a better one, in addition to fine-tuning that newcomer by means of local search. One final advantage of PBA is its use of the number of function evaluations with no improvement in solution quality as an indicator of being stuck near a local minimum. As a direct, explicit treatment of this problem, PBA executes a random walk supervised by RST in the search space via its global-scale search hypervolume strategy.
On the other hand, PBA has several control parameters, each with a specific range. The authors carried out a sensitivity analysis for PBA, which gives empirical evidence that PBA is not influenced by its parameters, as clarified below. However, some extra effort may be needed to tune the PBA parameters each time it is applied to a new optimization problem. This also opens a window to examine low-level hybridization between PBA and an auxiliary technique that optimizes PBA's algorithmic parameters. Each parameter works in a preferable range, and any problem-dependent implementation requires fixing the values of the parameters N, $f_{nb}$, $f_{tn}$, $f_{ms}$, $f_{ls}$, pr, and $f_{FE}$ (Step #2 in Algorithm 5). The authors of [99] carried out a sensitivity analysis to test the best combination of search parameters. A set of 25 randomly generated combinations of parameters (each in its specified range) was used while optimizing six commonly addressed unimodal and multimodal benchmark test functions. This test reveals the robustness of PBA against changes in its search parameters as well as growth in problem dimension. For performance testing, another set of benchmark test functions, with dimensions 10 and 30, was optimized using PBA in comparison with many state-of-the-art algorithms, most of them PSO variants: PSO with inertia weight [100], PSO with constriction factor [101], a local version of PSO with inertia weight [102], unified PSO scheme [103], fully informed PSO [104], comprehensive learning PSO [105], aging leader and challengers PSO [106], and weighted quantum PSO [107], in addition to group search optimizer (GSO) and improved GSO [107]. In most cases, PBA achieved better average best solutions than the optimizers of comparison. Moreover, PBA outperformed the best two algorithms on some 30-dimensional instances of the CEC 2013 competition [108].
Moreover, many instances with between 10 and 100 dimensions from the CEC 2014 benchmark problems [109] were tested using PBA against the following optimizers: simultaneous optimistic optimization [110], fireworks algorithm with differential mutation [111], and some improved versions of adaptive differential evolution in [112] and [113]. The superiority of PBA, as well as its robustness and efficiency, was demonstrated on these particular problems. Owing to its considerable advantages, PBA is expected to receive notable interest in the stochastic optimization community and to be widely applied to problems in different domains.

2.5 Emperor Penguin Optimizer
Dhiman et al. introduced the emperor penguin optimizer (EPO) in [114]. The social behavior of emperor penguins inside their huddles is once again the source of inspiration. As introduced earlier in this chapter, the EPC optimization algorithm [71] mimics the spiral movement of a penguin from the huddle boundaries toward the warmer point inside. However, Dhiman et al. [114] addressed the huddling behavior before the work in [71], and from a different point of view. The EPO algorithm implements other activities such as forming the huddle boundaries, updating the temperature profile, calculating distances between individuals, and relocating the effective mover around the huddle.
The huddle boundaries are generated randomly at the very beginning. The huddle is assumed to be formed on an L-shaped polygon plane (see Fig. 6). The initial huddle boundary can be generated using Eq. 23.

$$F = \Phi + i\Omega \tag{23}$$

where F is an analytical function on the polygon plane, $\Phi$ defines the wind velocity that affects the huddle boundary, $\Omega$ is another vector, and i is the imaginary constant. The ambient temperature changes of the huddle can be modeled as given by Eq. 24.

$$T' = T - \frac{MaxItr}{Itr - MaxItr}, \qquad T = \begin{cases} 0, & \text{if } R > 1 \\ 1, & \text{if } R < 1 \end{cases} \tag{24}$$

where the temperature $T = 0$ when the polygon radius $R > 1$, and $T = 1$ for $R < 1$ (see Fig. 6, where the polygon is on the right). R is a randomly generated value in [0, 1]. $T'$ controls the exploration and exploitation phases of the algorithm.
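A short sketch shows how the temperature profile behaves over a run, reading Eq. 24 as $T' = T - MaxItr / (Itr - MaxItr)$. That reading, the treatment of the unspecified case $R = 1$ as $T = 1$, and the function name are assumptions. The expression is only defined for $Itr < MaxItr$.

```python
def temperature_profile(iteration, max_iteration, radius):
    """Eq. 24 (as read here): T' = T - MaxItr / (Itr - MaxItr), for Itr < MaxItr."""
    if iteration >= max_iteration:
        raise ValueError("the profile is undefined at Itr >= MaxItr")
    t = 0.0 if radius > 1.0 else 1.0  # assumption: R == 1 is treated as T = 1
    return t - max_iteration / (iteration - max_iteration)


for itr in (0, 500, 999):
    print(itr, temperature_profile(itr, 1000, radius=0.5))
```

Since the fraction is negative for every valid iteration, $T'$ grows from about T + 1 at the start to T + MaxItr at the last iteration, which agrees with the range of $T'$ reported in the experimental settings for MaxItr = 1000.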
Fig. 6. A huddle of penguins forming like a L-shaped polygon to the left, and its equivalent graphical model is to the right
The distance between a penguin and the fittest individual in the huddle at the current iteration Itr can be expressed mathematically as shown in Eq. 25.

$$\vec{D}_{ep} = \left| S(\vec{A}) \cdot \vec{P}(Itr) - \vec{C} \cdot \vec{P}_{ep}(Itr) \right| \tag{25}$$

where $\vec{P}$ defines the position of the best search agent while $\vec{P}_{ep}$ indicates the position of the concerned penguin. To ensure collision avoidance between penguins, two vectors $\vec{A}$ (see Eq. 26) and $\vec{C}$ are introduced. $S(\vec{A})$ is responsible for moving the penguin toward the optimal location and is computed as in Eq. 27, while $\vec{C}$ is a random vector in [0, 1]. $|\cdot|$ denotes the absolute value function.

$$\vec{A} = M \times \left( T' + \left| \vec{P} - \vec{P}_{ep} \right| \right) \times Rand() - T' \tag{26}$$

$$S(\vec{A}) = \left( \sqrt{f \cdot e^{-Itr/l} - e^{-Itr}} \right)^2 \tag{27}$$

where M is set to 2 in Eq. 26 to maintain a gap between individuals. Two parameters, f with range [2, 3] and l with range [1.5, 2] (as suggested by the sensitivity analysis of EPO), also help to control the balance between exploration and exploitation, and e denotes the exponential function. After locating the best mover, an individual penguin can update its position as shown in Eq. 28.

$$\vec{P}_{ep}(Itr + 1) = \vec{P}(Itr) - \vec{A} \cdot \vec{D}_{ep} \tag{28}$$
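The position update of Eqs. 24 to 28 can be traced with the sketch below for a single penguin. It is an illustration under assumptions, not the implementation of [114]: the vector operations are applied per dimension, Rand() and the entries of C are drawn uniformly from [0, 1], Eq. 24 is read as $T' = T - MaxItr / (Itr - MaxItr)$, and the default values of f and l are arbitrary picks from the quoted ranges. The function name `epo_step` is hypothetical.

```python
import math
import random


def epo_step(p_ep, p_best, iteration, max_iteration, rng,
             m=2.0, f=2.5, l=1.75, radius=0.5):
    """One EPO position update for a single penguin (Eqs. 24-28), Itr < MaxItr."""
    t = 0.0 if radius > 1.0 else 1.0
    t_prime = t - max_iteration / (iteration - max_iteration)       # Eq. 24
    s = math.sqrt(f * math.exp(-iteration / l)
                  - math.exp(-iteration)) ** 2                       # Eq. 27
    new_position = []
    for x, best in zip(p_ep, p_best):
        a = m * (t_prime + abs(best - x)) * rng.random() - t_prime   # Eq. 26
        c = rng.random()                                             # entry of C
        d = abs(s * best - c * x)                                    # Eq. 25
        new_position.append(best - a * d)                            # Eq. 28
    return new_position


rng = random.Random(5)
print(epo_step([1.0, -1.0], [0.2, 0.3], iteration=10, max_iteration=100, rng=rng))
```

Because $f \geq 2$ and $l \geq 1.5$, the term under the square root stays non-negative for every non-negative iteration count, so Eq. 27 is well defined within the quoted parameter ranges.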
Thus, changing the position of the mover penguin (i.e., the optimal solution) triggers recomputation of the huddling behavior, and a new iteration of the EPO algorithm takes place. Algorithm 6 shows a pseudo-code for EPO.

Algorithm 6: Emperor Penguin Optimizer (EPO)
Generate initial population
Choose search parameters
Evaluate search agents
Repeat
  Determine the huddle boundary using Eq. 23
  Calculate the temperature profile using Eq. 24
  Compute the distance for each individual using Eqs. [...]
  Update positions of search agents using Eq. 28
  Update search parameters
  Check if any search agent exceeds the boundary and then amend it
  Evaluate search agents and determine the optimal solution
Until stopping criteria is met
Report the optimal solution so far

The proposed EPO algorithm has been tested using 44 benchmark test functions divided into five categories, the first four being unimodal, multimodal, fixed-dimension multimodal, and composite functions.
Mathematical expressions for all instances of the mentioned categories are available in the paper and other sources. The fifth type of test problems belongs to the CEC 2015 functions [115]. A set of 8 state-of-the-art optimizers was chosen for comparison with EPO: GA, PSO, HS, GSA, GWO, SCA, SHO, and MVO.
Parameter settings for EPO during experimental test were population size = 80, A 2 ½1:5; 1:5, SðÞ 2 ½0; 1:5, M ¼ 2, f 2 ½2; 3, l 2 ½1:5; 2, and T 2 ½1; 1000, ~ MaxItr ¼ 1000. According to the performed sensitivity analysis of EPO, the indicated parameter values seem to be recommended for any further applications of EPO. EPO was able to find optimal solution of most cases of both tested unimodal and multimodal functions. Thus, EPO exploration and exploitation abilities were verified. Composite test functions are formed by means of shifting, rotating, expanding, and combining classical functions, so that they are well examination suit for local minima avoidance capability on an optimizer. EPO was able to outperform other competitors in this category of test function with reasonable convergence rates. EPO was also the best optimizer for most of CEC 2015 benchmark test functions. One more empirical test for scalability reveals that EPO performance hasn’t been degraded so much while changing dimensionality between 30 and 100. EPO undergoes the statistical analysis of variance test with sample size 30 and 95% confidence of interval. Results show that EPO is statistically significant from other compared algorithms for all 29 tested benchmark functions. In the field of engineering optimization, EPO has been applied to 7 constrained and unconstrained nonlinear engineering problems like pressure vessel, speed reducer, welded beam, tension/compression spring, 25-bar truss, rolling-element bearing, and displacement of loaded structure. The proposed EPO was also very effective and outperforms the other optimizers in most cases. Like other new members of NIA class, authors of EPO are looking forward to adapt this basic version of EPO to work in other important optimization sites like feature extraction which requires binary-coded solutions, multi-objective optimization, and COPs. 0
2.6 Multi-objective Artificial Sheep Algorithm
Lai et al. introduced the multi-objective artificial sheep algorithm (MOASA) in [116]. The social behavior of sheep flocks was the inspiration for Wang et al. to develop a novel optimization technique called the binary artificial sheep algorithm (BASA) [117]. Based on BASA, Lai et al. created an enhanced version with new operators for multi-objective optimization (MOO). Inside a sheep flock, individuals always try to follow their bellwether (see Fig. 7), the strongest one in the herd, but they may collide with each other while moving. The bellwether's leadership is one major behavior that affects the position updates of the rest of the herd. A sheep flock is loose, which reflects good divisibility over a search region. It may be considered a blind herd because of the collisions between individuals, but this also indicates a highly efficient positive feedback. Another interesting observation is that individuals of the herd move randomly around their local position when strolling or playing, which is a self-awareness behavior. Herd members share information about the position of the bellwether. Before illustrating the algorithmic features of MOASA, let us take a quick look at the MOO class. Unlike single-objective optimization, MOO has more than one objective, each with an independent measuring criterion.
Recent Advances of Nature-Inspired Metaheuristic Optimization
Fig. 7. Leadership of a bellwether in a sheep herd
Thus, an objective function vector is attached to each solution, and the concept of Pareto dominance [118] defines the optimal one. An optimal solution, also called a Pareto optimum, dominates all other candidates if and only if it is the best one in all metrics. For the ith candidate solution $X_i(t)$ at iteration $t$ in a population of size $N$, the objective function vector for $o$ objectives can be expressed as $F(X_i(t)) = [f_1(X_i(t)), \ldots, f_o(X_i(t))]$, $i = 1, \ldots, N$. In MOO, more than one Pareto optimum may exist, and together they constitute what is called the Pareto front. In the context of MOO algorithms, the Pareto front is called the external archive, which preserves the current nondominated elite solutions during the search process. The MOASA algorithm consists of four main operators: selection of the bellwether, position update, releasing tired sheep, and neighborhood search of the external archive. A leader selection strategy based on the external archive first appeared in [118], and it is used here to determine the fittest solution to be followed by the rest of the herd. For a hypercubic objective search space, only one particle is randomly chosen from the external archive to be the bellwether. Let $X_B(t)$ denote the position of the bellwether (in the center of the entire flock); the movement of the ith member toward the bellwether can then be calculated as in Eq. 29.

$$x^{bw}_{i,d}(t) = x^d_B(t) + c_1 \left| x^d_B(t) - x^d_i(t) \right| \tag{29}$$
where $x^{bw}_{i,d}(t)$ is the new position of the ith member with respect to the dth dimension only and $c_1$ is a random coefficient calculated as $c_1 = a\,(2 r_1 - 1)(1 - t/T)$, with a randomly generated number $r_1 \in [0, 1]$, $a$ in the range [0, 1], and $T$ the total number of iterations. The parameter $c_1$ controls the leadership influence, which decreases linearly over the search lifetime. When the bellwether is not able to make notable movements, the rest of the flock are allowed to follow the self-awareness behavior and look for better positions near their local region. Let $x^{self}_{i,d}(t)$ denote the new position of the ith individual in just one dimension $d$, calculated as in Eq. 30; $x^{self}_{i,d}(t)$ represents the inertia and local random search of a solution.

$$x^{self}_{i,d}(t) = x^d_i(t) + R \left| x^d_B(t) - x^d_i(t) \right| \tag{30}$$
where $R$ is a random number in $[-1, 1]$. The position update mechanism combines the bellwether influence and the self-awareness movement, as expressed in Eq. 31 for the specific dth dimension.

$$x^d_i(t+1) = u_i\, x^{self}_{i,d}(t) + (1 - u_i)\, x^{bw}_{i,d}(t) \tag{31}$$
where $u_i = b\, r_2$ with a random number $r_2$ in [0, 1], and $b$ is decreased from 0.5 to 0 in order to gradually limit the local movements and move closer to the fittest solution in the external archive. After that, a sheep that is closer to the bellwether has a greater chance of updating its position. The closeness of the ith sheep is denoted as $nDom(i)$, the number of other individuals dominated in all dimensions by the ith sheep. Thus, the probability $P_i$ that the ith sheep in a population of size $N$ updates its position is defined as in Eq. 32.

$$P_i = \frac{nDom(i)}{N} \tag{32}$$
Then just one sheep is randomly selected by the roulette wheel selection method and updated in a randomly chosen dth dimension. In order to escape from local optima, a chaotic mutation (based on chaotic maps) is applied to what is called a "tired" sheep: a sheep that cannot move to a better position for a predetermined number of iterations is considered tired. The neighborhood search is conducted with the help of the external archive together with neighbor particles, which are generated using Eq. 33.

$$L_1 = L_0 \left( B^d_u - B^d_l \right), \qquad x^d_n = x^d_{rep} + (rand - 0.5)\, L_1 \tag{33}$$
where $x^d_{rep}$ represents the dth dimension of a particle in the external archive, $x^d_n$ is the resulting neighbor particle, and $d = (d_1, \ldots, d_r)$ is a set of $r$ randomly chosen dimensions. The number $r$ of dimensions has to be gradually decreased, as shown in Eq. 34, to achieve solution convergence.
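The dominance count and selection probability of Eq. 32, followed by roulette wheel selection, can be sketched as below. This is only an illustrative sketch, not the MOASA authors' implementation: it assumes minimization and the usual Pareto dominance rule (no worse in every objective and strictly better in at least one), and the function names `dominates`, `update_probabilities` and `roulette_wheel` are hypothetical.

```python
import random


def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb (minimization assumed)."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def update_probabilities(objectives):
    """Eq. 32: P_i = nDom(i) / N, where nDom(i) counts the sheep dominated by sheep i."""
    n = len(objectives)
    n_dom = [
        sum(dominates(fi, fj) for j, fj in enumerate(objectives) if j != i)
        for i, fi in enumerate(objectives)
    ]
    return [count / n for count in n_dom]


def roulette_wheel(weights, rng=random):
    """Pick one index with chance proportional to its weight (uniform if all are zero)."""
    total = sum(weights)
    if total == 0:
        return rng.randrange(len(weights))
    pick = rng.uniform(0, total)
    running = 0.0
    last_positive = 0
    for i, w in enumerate(weights):
        if w > 0:
            running += w
            last_positive = i
            if pick <= running:
                return i
    return last_positive  # guards against floating-point round-off


# Four sheep with two objectives each (made-up values).
objectives = [(1.0, 1.0), (2.0, 2.0), (1.5, 3.0), (3.0, 0.5)]
probs = update_probabilities(objectives)
chosen = roulette_wheel(probs, random.Random(0))
```

In this small example only the first sheep dominates any other, so it is the only one that can be selected for an update.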
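$$\begin{cases} \sum_{i=1}^{NG} g^{min}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{min}_{j,t} + \sum_{k=1}^{NV} v^{min}_{k,t} \le d_t \left(1 - a^{low}_t\right) \\ d_t \left(1 + a^{up}_t\right) \le \sum_{i=1}^{NG} g^{max}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{max}_{j,t} \end{cases}, \text{ for } \forall t, \tag{9}$$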
where $d_t$ is the forecasted net load, which can be calculated as the sum of electricity consumption and VREG outputs; $a^{up}_t$ and $a^{low}_t$ are the coefficients for securing upper and lower margins; $g^{max}_{i,t}$ and $g^{min}_{i,t}$ are the maximum and the minimum outputs of CGs at $t$; $s^{max}_{j,t}$ and $s^{min}_{j,t}$ are the maximum and the minimum outputs of ESSs at $t$ ($s^{min}_{j,t} < 0 < s^{max}_{j,t}$); $v^{min}_{k,t}$ is the maximum consumption of CLs at $t$ ($v^{min}_{k,t} < 0$). In addition, the microgrid components have several operational constraints, as shown below.
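• State duration for controllable generation systems

$$\begin{cases} \text{If } 0 < u^{on}_{i,t} < MUT_i \text{ then } u_{i,t} = 1 \\ \text{If } 0 < u^{off}_{i,t} < MDT_i \text{ then } u_{i,t} = 0 \end{cases}, \text{ for } \forall i, \forall t, \tag{10}$$

• Ramp rate for controllable generation systems

$$\Delta G^{down}_i \le g_{i,t} - g_{i,t-1} \le \Delta G^{up}_i, \text{ for } \forall i, \forall t, \tag{11}$$

• Maximum and minimum outputs of controllable generation systems

$$g^{min}_{i,t} \le g_{i,t} \le g^{max}_{i,t}, \text{ for } \forall i, \forall t, \tag{12}$$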
H. Takano et al.
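$$\begin{cases} g^{min}_{i,t} = \max\left( G^{min}_i,\; g_{i,t-1} + \Delta G^{down}_i \right) \\ g^{max}_{i,t} = \min\left( G^{max}_i,\; g_{i,t-1} + \Delta G^{up}_i \right) \end{cases}, \text{ for } \forall i, \forall t, \tag{13}$$

• Operation state for energy storage systems

$$Q^{min}_j \le q_{j,t} \le Q^{max}_j, \text{ for } \forall j, t \in TS_j, \tag{14}$$

$$\begin{cases} \text{If } s_{j,t} < 0 \text{ then } q_{j,t} = q_{j,t-1} - \eta_j s_{j,t} \\ \text{If } s_{j,t} > 0 \text{ then } q_{j,t} = q_{j,t-1} - \frac{1}{\eta_j} s_{j,t} \end{cases}, \text{ for } \forall j, t \in TS_j, \tag{15}$$

• Maximum and minimum outputs of energy storage systems

$$s^{min}_{j,t} \le s_{j,t} \le s^{max}_{j,t}, \text{ for } \forall j, t \in TS_j, \tag{16}$$

$$\begin{cases} s^{min}_{j,t} = \max\left( S^{min}_j,\; q_{j,t-1} - Q^{max}_j \right) \\ s^{max}_{j,t} = \min\left( S^{max}_j,\; q_{j,t-1} - Q^{min}_j \right) \end{cases}, \text{ for } \forall j, t \in TS_j, \tag{17}$$

• Operation state for controllable loads

$$P^{min}_k \le p_{k,t} \le P^{max}_k, \text{ for } \forall k, \forall t, \tag{18}$$

$$p_{k,t} = p_{k,t-1} - \xi_k v_{k,t}, \text{ for } \forall k, t \in TV_k, \tag{19}$$

• Maximum consumption of controllable loads

$$v^{min}_{k,t} \le v_{k,t} \le 0, \text{ for } \forall k, t \in TV_k, \tag{20}$$

$$v^{min}_{k,t} = \max\left( V^{min}_k,\; p_{k,t-1} - P^{max}_k \right), \text{ for } \forall k, t \in TV_k, \tag{21}$$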
where $u^{on}_{i,t}$ and $u^{off}_{i,t}$ are the consecutive operating and suspending durations of CGs; $MUT_i$ and $MDT_i$ are the minimum operating and suspending durations of CGs; $\Delta G^{up}_i$ and $\Delta G^{down}_i$ are the ramp-up and the ramp-down rates of CGs; $q_{j,t}$ is the state of charge (SOC) level in ESSs; $\eta_j$ is the overall efficiency of ESSs; $p_{k,t}$ is the SOC level in CLs; $P^{max}_k$ and $P^{min}_k$ are the maximum and the minimum SOC levels of CLs; $\xi_k$ is the overall efficiency of CLs.
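The time-varying output windows of Eqs. (13) and (17) and the SOC update of Eq. (15) can be sketched as follows. This is a minimal sketch under stated assumptions: the ramp-down rate is taken as a non-positive number (as the form of Eq. (13) suggests), the time interval is 1 h so that MW and MWh values can be combined directly, and all function names are hypothetical.

```python
def cg_output_bounds(g_prev, G_min, G_max, dG_down, dG_up):
    """Eq. (13): ramp-limited output window of one CG (dG_down assumed <= 0)."""
    g_min = max(G_min, g_prev + dG_down)
    g_max = min(G_max, g_prev + dG_up)
    return g_min, g_max


def ess_output_bounds(q_prev, S_min, S_max, Q_min, Q_max):
    """Eq. (17): output window of one ESS (negative = charging, positive = discharging)."""
    s_min = max(S_min, q_prev - Q_max)
    s_max = min(S_max, q_prev - Q_min)
    return s_min, s_max


def ess_next_soc(q_prev, s, eta):
    """Eq. (15): SOC after one interval with output s and overall efficiency eta."""
    if s < 0:
        return q_prev - eta * s   # charging raises the SOC by eta * |s|
    if s > 0:
        return q_prev - s / eta   # discharging lowers the SOC by s / eta
    return q_prev


# Illustrative numbers only: a CG at 10 MW with +/-5 MW ramp limits, and an ESS
# rated +/-1.8 MW with an SOC range of 2.6 to 10.4 MWh, currently at 5.2 MWh.
cg_window = cg_output_bounds(10.0, 4.0, 20.0, -5.0, 5.0)
ess_window = ess_output_bounds(5.2, -1.8, 1.8, 2.6, 10.4)
soc_after_charge = ess_next_soc(5.2, -1.0, 0.9)
```

When the SOC is close to its lower limit, the second term of the `min` in `ess_output_bounds` becomes the binding one and the discharge window shrinks accordingly.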
Application Example of Particle Swarm Optimization on Operation
This problem framework is similar to the UC-ELD problems in conventional power grids and is suitable for independent operation of small power grids. By (8), the ESSs and the CLs contribute to saving the operation cost of CGs described in (6) and (7). However, it is not easy to decide the coefficients for securing upper and lower margins, $a^{up}_t$ and $a^{low}_t$, appropriately. Since these equations do not include the trading electricity, the imbalance penalty must be evaluated additionally in the solution process.

2.2 Including Electricity Trade
As already mentioned, microgrids can operate both connected to and disconnected from the conventional power grids. Electricity trade with the conventional power grids often not only eases the difficulty of supply–demand management but also provides operational alternatives for the microgrid operators. If we consider the electricity trade, the objective function is modified as

$$F(u, g, s, v) = \sum_{t=1}^{T} \left[ C_t(u_t, g_t) + M_t e_t \right], \tag{22}$$

$$C_t(u_t, g_t) = \sum_{i=1}^{NG} \left[ A_i + B_i g_{i,t} + C_i g^2_{i,t} + SC_i \left(1 - u_{i,t-1}\right) u_{i,t} \right], \tag{23}$$
where $e_t$ is the trading electricity ($e_t \in \mathbb{R}$) and $M_t$ is the price of electricity trade. Moreover, the operational constraints are expressed as follows.

• Balance in power supply and demand

$$d_t = \sum_{i=1}^{NG} g_{i,t} u_{i,t} + \sum_{j=1}^{NS} s_{j,t} + \sum_{k=1}^{NV} v_{k,t} + e_t, \text{ for } \forall t, \tag{24}$$

• Reserve margin

$$\begin{cases} \sum_{i=1}^{NG} g^{min}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{min}_{j,t} + \sum_{k=1}^{NV} v^{min}_{k,t} \le d_t \left(1 - a^{low}_t\right) - e_t \\ d_t \left(1 + a^{up}_t\right) - e_t \le \sum_{i=1}^{NG} g^{max}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{max}_{j,t} \end{cases}, \text{ for } \forall t. \tag{25}$$
The other equations [Eqs. (5)–(21) excluding (6)–(9)] are common with the general problem framework described in Sect. 2.1. In this problem framework, the microgrid sells electricity to or buys it from the conventional power grids in response to the electricity price at each time. Violation of (25) can be regarded as the imbalance penalty. However, as in the problem formulation described in Sect. 2.1, there is still room for discussion on how to set $a^{up}_t$ and $a^{low}_t$ appropriately.
2.3 Considering Uncertainty
In the problem frameworks described in Sects. 2.1 and 2.2, the uncertainty in the net load is treated by the coefficients for securing upper and lower margins, $a^{up}_t$ and $a^{low}_t$. These coefficients are normally set to secure a margin of several percent of the forecasted value of net load, e.g., $a^{up}_t = a^{low}_t = 0.05$ (5% of the forecasted net load). However, variation in the net load is not constant, especially when the VREG outputs are considered, and thus the appropriateness of a setting based on the forecasted value is open to discussion. If the coefficients become bigger, operational risks originating from the uncertainty can be reduced. In contrast, the resulting additional start-up/shutdown of the CGs increases the operation cost. With a view to solving the issue, the authors define the net load as

$$d_t \in \left[ d^{min}_t, d^{max}_t \right], \text{ for } \forall t, \tag{26}$$
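where $d^{max}_t$ and $d^{min}_t$ are the maximum and the minimum assumable values of net load. Now, the objective function is extended as

$$F(u, g, s, v, e) = \sum_{t=1}^{T} \int_{d^{min}_t}^{d^{max}_t} \left\{ C_t(u_t, g_t) + M_t e_t + I_t e'_t \right\} f(d_t)\, \mathrm{d}d_t, \tag{27}$$

$$C_t(u_t, g_t) = \sum_{i=1}^{NG} \left[ A_i + B_i g_{i,t} + C_i g^2_{i,t} + SC_i \left(1 - u_{i,t-1}\right) u_{i,t} \right], \tag{28}$$

$$d_t = \sum_{i=1}^{NG} g_{i,t} u_{i,t} + \sum_{j=1}^{NS} s_{j,t} + \sum_{k=1}^{NV} v_{k,t} + e_t + e'_t, \text{ for } \forall t, \tag{29}$$

$$\begin{cases} \text{If } d_t - e_t < \sum_{i=1}^{NG} g^{min}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{min}_{j,t} + \sum_{k=1}^{NV} v^{min}_{k,t} \text{ then} \\ \quad e'_t = d_t - e_t - \left( \sum_{i=1}^{NG} g^{min}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{min}_{j,t} + \sum_{k=1}^{NV} v^{min}_{k,t} \right) \\ \text{If } d_t - e_t > \sum_{i=1}^{NG} g^{max}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{max}_{j,t} \text{ then} \\ \quad e'_t = d_t - e_t - \left( \sum_{i=1}^{NG} g^{max}_{i,t} u_{i,t} + \sum_{j=1}^{NS} s^{max}_{j,t} \right) \\ \text{Else } e'_t = 0 \end{cases}, \text{ for } \forall t, \tag{30}$$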
where $f(d_t)$ is the probability density function of net load and $I_t$ is the price of imbalance penalty. In the proposed problem framework, $d^{max}_t$, $d^{min}_t$ and $f(d_t)$ can be determined with reference to past records of microgrid operations. By (29) and (30), the operational constraints of microgrids are integrated into the objective function. The trading electricity is split into $e_t$ and $e'_t$ so that the expected trading cost and the expected imbalance penalty in (27) are calculated individually. As a result, we do not need to set the coefficients $a^{up}_t$ and $a^{low}_t$, because the reserve margin is secured automatically depending on the result of the expected cost calculation. The other constraints [Eqs. (10)–(21)] are used in common with the previous frameworks. Here, the authors ignore (10) and (11) in the target problem. This is because the time interval in operation scheduling, $\Delta t$, often satisfies the following conditions, and thus the constraints become inactive in the microgrids [21, 30, 31].
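$$\Delta t \ge MUT_i + MDT_i, \text{ for } \forall i, \tag{31}$$

$$\begin{aligned} \Delta G^{up}_i &\ge G^{max}_i - G^{min}_i \\ \Delta G^{down}_i &\le G^{min}_i - G^{max}_i \end{aligned}, \text{ for } \forall i. \tag{32}$$

Similarly, the evaluation of (13) is unnecessary under these conditions. In this case, $g^{max}_{i,t}$ and $g^{min}_{i,t}$ are simply equal to $G^{max}_i$ and $G^{min}_i$, respectively.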
3 Application of Particle Swarm Optimization

The target optimization problems formulated in Sects. 2.1–2.3 are complicated MIP problems that have both discrete and continuous optimization variables. As defined in (1)–(4), the former is the state of CGs, $u$, and the latter includes the outputs of the CGs, $g$, the ESSs, $s$, and the CLs, $v$. In the formulation of Sect. 2.3, the trading electricity, $e$, is included in the latter. Since it is extremely difficult to solve the target problems exactly, intelligent optimization-based algorithms have been attracting attention as a realistic alternative. In fact, several nature-inspired metaheuristics and evolutionary computation algorithms have been applied to similar problem frameworks. Typical algorithms are the genetic algorithm (GA) [18], simulated annealing (SA) [19, 20], tabu search [21, 25] and PSO [22, 23, 30]. In this chapter, PSO is selected as the basis of the solution method. An improvement strategy for solution methods and its application are also introduced.

3.1 Solution Method Based on Standard Particle Swarm Optimization
PSO is a population-based stochastic computational method that optimizes a problem by iteratively trying to improve a solution candidate with regard to a given measure of solution quality (called the fitness function) [32]. The algorithm was originally developed through studies on simulated social behavior representing the movement of birds or fish (called particles) and has many similarities with GA. However, unlike the GA, the standard PSO has no evolutionary operators such as crossover and mutation. An initial set of randomly created solutions (called the initial swarm) propagates through the search space toward the global optimal solution over a number of iterations (called moves). All members of the swarm share information in the search space. Each particle $l$ has a position, $x^n_l$, and a velocity, $y^n_l$, in iteration $n$ ($n = 1, 2, \ldots, N$) and flies through the search space to find its best position and velocity in accordance with (33) and (34).

$$x^{n+1}_l = x^n_l + y^{n+1}_l, \tag{33}$$

$$y^{n+1}_l = \omega y^n_l + c_1 r_1 \left( \hat{x}_l - x^n_l \right) + c_2 r_2 \left( \min_{l \in L} \hat{x}_l - x^n_l \right), \tag{34}$$
where $c_1$ and $c_2$ are the cognitive factors that represent the trust for each particle and the swarm; $r_1$ and $r_2$ are random numbers in the range [0, 1]; $\hat{x}_l$ is the personal best for particle $l$ ($\hat{x}_l$: pbest, $\min_{l \in L} \hat{x}_l$: gbest); $L$ is the set of all particles. Here, the inertia weight factor, $\omega$, controls the iteration size in the PSO algorithm. In this chapter, the position of particles is expressed as

$$x^n_l = (u, g, s, v, e). \tag{35}$$
Although PSO has succeeded in many continuous problems, it still has some difficulty in treating discrete optimization problems [33]. In order to modify the ON/OFF states of CGs, $u_{i,t}$, a sigmoid function is introduced in part of the PSO algorithm.

$$\text{If } 0.5 < \frac{1}{1 + e^{-u_{i,t}}} \text{ then } u_{i,t} = 1, \text{ else } u_{i,t} = 0, \text{ for } \forall i, \forall t. \tag{36}$$
As shown in (36), if the value of the sigmoid limiting transformation function is greater than 0.5, the ith CG becomes "ON" at time $t$; otherwise, it becomes "OFF". During the iterative process, the fitness function of individual particles is evaluated to measure their superiority. With the aim of handling the constraints described in Sects. 2.1–2.3, the penalty method, which replaces a constrained optimization problem by a series of unconstrained problems, is applied. The fitness function in the general problem framework described in Sect. 2.1 can be formulated as

$$E(u, g, s, v) = \sum_{t=1}^{T} \left[ C_t(u_t, g_t) + VIO_t \right], \tag{37}$$
where $VIO_t$ is the weighted sum of violations of the constraints, including the violations of (8) and (9) (if the constraints (10) and (11) become active, they are also included in the calculation). In the formulation of Sect. 2.2, the fitness function is calculated as

$$E(u, g, s, v) = \sum_{t=1}^{T} \left[ C_t(u_t, g_t) + M_t e_t + VIO''_t \right], \tag{38}$$
where $VIO''_t$ is the weighted sum of violations of the constraints including (24) and (25) instead of (8) and (9). On the other hand, the fitness function in the problem framework described in Sect. 2.3 can be represented as

$$E(u, g, s, v, e) = \sum_{t=1}^{T} \int_{d^{min}_t}^{d^{max}_t} \left\{ C_t(u_t, g_t) + M_t e_t + I_t e'_t + VIO'_t \right\} f(d_t)\, \mathrm{d}d_t, \tag{39}$$
where $VIO'_t$ is the weighted sum of violations of the constraints excluding the violations of the supply–demand balance and reserve margin constraints, because these violations are already evaluated in $I_t e'_t$.

3.2 Solution Method Based on Binary Particle Swarm Optimization and Quadratic Programming
In order to improve compatibility between the target problems and their solution method, the authors replace the optimization variables and the operation cost function with

$$u'_{h,t} \in \{0, 1\}, \text{ for } \forall h, \forall t, \tag{40}$$

$$g'_{h,t} \in \left[ G'^{min}_h, G'^{max}_h \right], \text{ for } \forall h, \forall t, \tag{41}$$

$$C_t(u'_t, g'_t) = \sum_{h=1}^{NG+3} \left\{ A_h + B_h g'_{h,t} + C_h g'^2_{h,t} + SC_h \left(1 - u'_{h,t-1}\right) u'_{h,t} \right\}, \tag{42}$$
where $h$ is the index of controllable components ($h = 1, \ldots, NG, \ldots, NG+3$); $u'_{h,t}$ is the state of controllable components, which is an element of vectors $u'_t$ and $u'$; $g'_{h,t}$ is the output of controllable components, which is an element of vectors $g'_t$ and $g'$; $G'^{max}_h$ and $G'^{min}_h$ are the maximum and the minimum outputs of controllable components. In (40) and (41), the ESSs and the CLs are each aggregated into a large-scale component to keep the formulation clear. The $(NG+1)$th component is the aggregated ESS, the $(NG+2)$th component is the aggregated CL, and the $(NG+3)$th component is the trading electricity. Since $g'_{NG+1,t}$, $g'_{NG+2,t}$ and $g'_{NG+3,t}$ include 0 in their controllable range, we can set $u'_{NG+1,t}$, $u'_{NG+2,t}$ and $u'_{NG+3,t}$ to 1 in the available periods of the target components (ESSs: $t \in TS_j$, CLs: $t \in TV_k$) and to 0 in the other periods (ESSs: $t \notin TS_j$, CLs: $t \notin TV_k$). In the previous problem frameworks, operation costs of the ESSs and the CLs are expressed as the fuel cost change of CGs. That is, the ESSs and the CLs affect the objective function indirectly. Therefore, the sets of coefficients $(A_{NG+1}, B_{NG+1}, C_{NG+1})$ and $(A_{NG+2}, B_{NG+2}, C_{NG+2})$ are both set to $(0, 0, 0)$. Meanwhile, the set of coefficients $(A_{NG+3}, B_{NG+3}, C_{NG+3})$ is set to $(0, M, 0)$. Since there is no start-up cost on the $(NG+1)$th–$(NG+3)$th components, $SC_{NG+1}$, $SC_{NG+2}$ and $SC_{NG+3}$ can be simply set to 0. Constraints for the balance in supply–demand [(8) and (24)] and the reserve margin [(9) and (25)] are expressed as shown below.

• Balance in power supply and demand

$$d_t = \sum_{h=1}^{NG+3} g'_{h,t} u'_{h,t}, \text{ for } \forall t, \tag{43}$$
• Reserve margin

$$\begin{cases} \sum_{h=1}^{NG+3} g'^{min}_{h,t} u'_{h,t} \le d_t \left(1 - a^{low}_t\right) \\ d_t \left(1 + a^{up}_t\right) \le \sum_{h=1}^{NG+3} g'^{max}_{h,t} u'_{h,t} \end{cases}, \text{ for } \forall t, \tag{44}$$

where $g'^{max}_{h,t}$ and $g'^{min}_{h,t}$ are the maximum and the minimum outputs of controllable components at time $t$. Equations (29) and (30) can also be represented as

$$d_t = \sum_{h=1}^{NG+3} g'_{h,t} u'_{h,t} + e'_t, \text{ for } \forall t, \tag{45}$$

$$\begin{cases} \text{If } d_t < \sum_{h=1}^{NG+3} g'^{min}_{h,t} u'_{h,t} \text{ then } e'_t = d_t - \sum_{h=1}^{NG+3} g'^{min}_{h,t} u'_{h,t} \\ \text{If } d_t > \sum_{h=1}^{NG+3} g'^{max}_{h,t} u'_{h,t} \text{ then } e'_t = d_t - \sum_{h=1}^{NG+3} g'^{max}_{h,t} u'_{h,t} \\ \text{Else } e'_t = 0 \end{cases}, \text{ for } \forall t. \tag{46}$$
Now, if we fix the states of controllable components at each time, $u'_{h,t}$, the target problem can be relaxed into a special type of optimization problem that has a quadratic objective function and several variables subject to linear constraints. As a result, quadratic programming (QP) solvers can be applied to the target problem after creating $u'$, and we do not need to be concerned with the difficulty of determining continuous variables in the PSO algorithm. Under these circumstances, BPSO, which is an extended PSO algorithm, can be applied to the UC problems. In other words, the transformation reduces the dimension of the application target of nature-inspired metaheuristic algorithms from $(u, g, s, v, e)$ to $u'$, and thus we can expect to improve the search ability of the applied algorithms. In the proposed BPSO-QP, the position of the particles can be defined as

$$x^n_l = u', \tag{47}$$

$$\text{If } 0.5 < \frac{1}{1 + e^{-u'_{h,t}}} \text{ then } u'_{h,t} = 1, \text{ else } u'_{h,t} = 0, \text{ for } \forall h, \forall t. \tag{48}$$
This strategy is available not only for the PSO but also for the other intelligent optimization-based algorithms.
4 Performance Evaluation of Particle Swarm Optimization

Preliminary numerical simulations were carried out to evaluate the performance of the standard PSO-based and the BPSO-QP-based solution methods. The time interval, $\Delta t$, was set to 1 h, and daily operation schedules ($t = 1, 2, \ldots, 24$) from 1:00 AM ($t = 1$) to 12:00 PM ($t = 24$) were determined. To simplify the discussion, the general problem framework formulated in Sect. 2.1 was selected, and the CLs were removed from the microgrid model illustrated in Fig. 1. Table 1 summarizes the numerical cases, and Fig. 2 displays the profiles of the net load, which consists of the electricity consumption and the VREG outputs. As shown in Fig. 2, PVs on a sunny day (ideal condition) were used as the aggregated VREG. Specifications of the CGs are described in Table 2 and Fig. 3. These specifications were made by referring to [21, 23, 30, 31].
Table 1. Numerical cases
Case   Available CGs   Available aggregated ESS
1      CGs 1 and 2     $S^{max} = 1.6$ (MW), $S^{min} = -1.6$ (MW), $Q^{max} = 2.0$ (MWh), $Q^{min} = 0.4$ (MWh)
2      CGs 1–3         $S^{max} = 3.2$ (MW), $S^{min} = -3.2$ (MW), $Q^{max} = 4.0$ (MWh), $Q^{min} = 0.8$ (MWh)
3      CGs 1–4         $S^{max} = 4.8$ (MW), $S^{min} = -4.8$ (MW), $Q^{max} = 6.0$ (MWh), $Q^{min} = 1.2$ (MWh)
4      All CGs         $S^{max} = 6.2$ (MW), $S^{min} = -6.2$ (MW), $Q^{max} = 8.0$ (MWh), $Q^{min} = 1.6$ (MWh)
In the preliminary numerical simulations, there was no time constraint on utilization of the aggregated ESS. In other words, the variable, $TS$, did not specify the available time of the aggregated ESS. The initial SOC level of the aggregated ESS was set to 50% of its capacity, and the SOC level had to be returned to the original level by the end of the scheduling period. That is, the following additional constraint was satisfied in the ESS operations.

$$\sum_{j=1}^{NS} q_{j,0} = \sum_{j=1}^{NS} q_{j,T} = 0.5 \sum_{j=1}^{NS} Q^{max}_j. \tag{49}$$

Fig. 2. Net load profiles in each case (panels: Cases 1–4; series: net load, electric loads, VREG outputs; electric power [MW] versus time)

Table 2. Specifications of CGs for Sect. 4 (#: any currency unit is applicable)
i   $A_i$ (#)   $B_i$ (#/MW)   $C_i$ (#/MW$^2$)   $SC_i$ (#)   $G^{MAX}_i$ (MW)   $G^{MIN}_i$ (MW)
1   12000.0     3800.0         1.2                3000.0       20.0               4.0
2   7800.0      3100.0         1.8                500.0        16.0               3.2
3   2400.0      2500.0         2.8                500.0        12.0               2.4
4   12000.0     3800.0         1.4                3000.0       20.0               4.0
5   2400.0      2500.0         3.0                500.0        12.0               2.4

Fig. 3. Fuel costs of CGs (panels: all CGs and CG 1 to CG 5; fuel cost [#] versus CG output [MW])

4.1 Basis of Discussions
The PSO, as is well known, cannot guarantee mathematical optimality of its result. Therefore, to set a basis for discussion, the authors approximately solved the target problem by the following enumeration-based solution method (enumeration-QP), outlined as follows.

1. All the feasible UC solution candidates are enumerated assuming no ESS under (31) and (32).
2. The QP solver derives the optimal output shares of the CGs for each enumerated UC solution candidate in consideration of operation of the aggregated ESS. By the relationship $\Delta g_t = s_t$, the aggregated ESS influences the fuel costs of CGs.
3. The UC solution candidate having the best output shares of the CGs and the aggregated ESS in the previous step is determined as the optimal operation schedule of the microgrid.

If either (31) or (32) is not satisfied, the operation plan at time $t - 1$ affects the plan at time $t$. In contrast, under (31) and (32), we can regard the target optimization problem in steps 1 and 2 above as a static hierarchical optimization problem [21, 23, 30, 31]. This is because the optimization variables at time $t$ ($u_{i,t}$ and $g_{i,t}$) are independent of their values at time $t - 1$. It also means that the enumeration-QP can be applied only under (31) and (32). Moreover, since the UC problem and the ELD problem are solved individually, the enumeration-QP does not guarantee the global optimality of its solution. However, unlike the intelligent algorithms, whose solutions depend heavily on the choice of initial solutions and/or random number sequences, this solution method provides stable and consistent solutions.

4.2 Results and Discussion in Performance of Particle Swarm Optimization
The optimal operation schedules of microgrid components (CGs and ESSs) were determined by the enumeration-QP, the standard PSO and the BPSO-QP, respectively. Through trial and error, parameters of the standard PSO were set as follows: $|L| = 100$, $N = 20{,}000$, $\omega = 0.7$, $c_1 = 1.4$ and $c_2 = 1.4$. Since the BPSO-QP required much calculation time, its parameters were modified as follows: $|L| = 100$, $N = 1000$, $\omega = 0.7$, $c_1 = 1.4$ and $c_2 = 1.4$. The only difference between their parameters is the maximum number of iterations, chosen in view of the required calculation time. Initial particles were randomly set in the search space. Table 3 summarizes the operation costs of the obtained solutions. In Table 3, we can confirm that the BPSO-QP obtained the best solution in each case. The solutions of the standard PSO were approximately 10% worse than those of the enumeration-QP, although its maximum number of iterations was larger than that of the BPSO-QP.

Table 3. Solution comparison of enumeration-QP, standard PSO and BPSO-QP
Case   Enumeration-QP   Standard PSO          BPSO-QP
1      1493958.2        1628064.8 (+9.0%)     1479829.3 (−0.9%)
2      1762434.9        1981212.9 (+12.4%)    1708589.9 (−3.1%)
3      2443987.8        2640737.1 (+8.1%)     2267596.6 (−7.2%)
4      3114027.3        3541872.7 (+13.7%)    3001256.0 (−3.6%)

Fig. 4. Comparison of search records of standard PSO and BPSO-QP (fitness function versus iteration; Cases 1–4)

Table 4. Comparison of calculation time until 1000 iteration
Case   Standard PSO (s)   BPSO-QP (s)
1      16.23              109.46
2      17.16              422.70
3      17.84              919.39
4      19.62              2106.99
Figure 4 displays the transitions of gbest in each PSO, and Table 4 compares their calculation times. In Fig. 4, an increase in the case number leads to slower convergence of gbest. This is because the target problem becomes more difficult as the number of CGs increases. In fact, the number of possible UC candidates in Case 1 can be simply calculated as $(2^2)^{24}$, while the number in Case 4 is $(2^4)^{24}$. For the same reason, the calculation time in Case 4 was the longest in each PSO. As shown in Table 4, the BPSO-QP still leaves plenty of room for discussion regarding calculation time; nevertheless, its convergence improved dramatically in comparison with that of the standard PSO. Because of the problem transformation described in Sect. 3.2, the process in the QP solver became very complicated, and thus the calculation time increased. However, the issue is beyond the scope of this chapter because each calculation time was sufficiently short for practical use. It will be solved or relaxed in the authors' future work.
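The size of the UC search space quoted above, and the kind of per-step enumeration used in step 1 of the enumeration-QP, can be sketched as follows. This is only an illustration: the feasibility rule (the net load must lie inside the output range of the committed CGs) is an assumption made for the sketch, and the function names are hypothetical.

```python
from itertools import product


def feasible_commitments(G_min, G_max, demand):
    """List the ON/OFF patterns of one time step whose output range can cover `demand`."""
    patterns = []
    for u in product((0, 1), repeat=len(G_min)):
        low = sum(g * state for g, state in zip(G_min, u))
        high = sum(g * state for g, state in zip(G_max, u))
        if low <= demand <= high:
            patterns.append(u)
    return patterns


def candidate_count(n_units, n_steps=24):
    """Number of ON/OFF schedules when every time step is free: (2**n_units)**n_steps."""
    return (2 ** n_units) ** n_steps


# Two CGs with output limits of 4.0-20.0 MW and 3.2-16.0 MW, and a net load of 18 MW.
patterns = feasible_commitments([4.0, 3.2], [20.0, 16.0], 18.0)
count_two_units = candidate_count(2)
```

Because the time steps are independent under (31) and (32), the enumeration can be carried out step by step instead of over the full product space counted by `candidate_count`.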
5 Verification of Validity in Problem Frameworks

The authors verified the validity of the formulated optimization problems through numerical simulations using the microgrid model illustrated in Fig. 1 and discussions of their results. The time interval, $\Delta t$, was set to 1 h, and daily operation schedules ($t = 1, 2, \ldots, 24$) were determined in the same way as in the preliminary numerical simulations described in Sect. 4. However, the optimization period was set from 8:00 AM ($t = 1$) to 7:00 AM ($t = 24$) in consideration of the operation of the time-constrained components in the microgrid model. The controllable time of the CLs was set from 9:00 PM to 7:00 AM. Figure 5 shows the net load profile and the price in electricity trade. As shown in Fig. 5, the aggregated PV on a sunny day was used as the aggregated VREG. The imbalance penalty was set to a huge value to avoid electricity surplus/shortage in the operation stage. Parameters of the microgrid model were set as shown in Tables 5 and 6, and these settings were basically fixed in this section. These parameters were made by modifying the parameters described in Sect. 4. Here, there was no time constraint on utilization of the aggregated ESS, and the aggregated ESS had to satisfy (49).

Fig. 5. Net load profile and electricity trading price (electric power [MW] and price in trade [#/MWh] versus time)
Table 5. Specifications of CGs for Sect. 5 (#: any currency unit is applicable)
i   $A_i$ (#)   $B_i$ (#/MW)   $C_i$ (#/MW$^2$)   $SC_i$ (#)   $G^{MAX}_i$ (MW)   $G^{MIN}_i$ (MW)
1   12000.0     3800.0         1.2                3000.0       20.0               4.0
2   7800.0      3100.0         1.8                1000.0       16.0               3.2
3   2400.0      2500.0         2.8                500.0        12.0               2.4

Table 6. Specifications of aggregated ESS and aggregated CL
ESS   $S^{max}$ (MW)   $S^{min}$ (MW)   $Q^{max}$ (MWh)   $Q^{min}$ (MWh)
      1.8              −1.8             10.4              2.6
CL    $V^{max}$ (MW)   $P^{max}$ (MWh)  $P^{min}$ (MWh)
      −1.5             9.6              2.4
Under these conditions, the authors determined several operation schedules as summarized in Table 7. For reference, operation schedules without consideration of the reserve margin were determined according to the problem framework of Sect. 2.1 [the general framework excluding (9)]. The reserve margin was secured to compensate for deviation within 5% of the net load in the problem frameworks of Sects. 2.1 and 2.2. In addition, only the uncertainty originating from the aggregated VREG (PVs) was considered in each problem framework to simplify discussion of the numerical results. Case 1 is essentially the same as the UC-ELD problems in conventional power grids. Case 4 is the most difficult condition because the number of optimization variables becomes large. Based on the results in Sect. 4, the BPSO-QP was selected as the solution method. Through trial and error, parameters for the BPSO were set as follows: $|X| = 40$, $N = 300$, $\omega = 0.9$, $c_1 = 2.0$ and $c_2 = 2.0$.

Table 7. Available components in each numerical case
Case   CGs 1–3     Aggregated ESS   Aggregated CL   Framework
1a     Available   Unavailable      Unavailable     Sect. 2.1 without (9)
1b     Available   Unavailable      Unavailable     Sect. 2.1
1c     Available   Unavailable      Unavailable     Sect. 2.2
1d     Available   Unavailable      Unavailable     Sect. 2.3
2a     Available   Available        Unavailable     Sect. 2.1 without (9)
2b     Available   Available        Unavailable     Sect. 2.1
2c     Available   Available        Unavailable     Sect. 2.2
2d     Available   Available        Unavailable     Sect. 2.3
3a     Available   Unavailable      Available       Sect. 2.1 without (9)
3b     Available   Unavailable      Available       Sect. 2.1
3c     Available   Unavailable      Available       Sect. 2.2
3d     Available   Unavailable      Available       Sect. 2.3
4a     Available   Available        Available       Sect. 2.1 without (9)
4b     Available   Available        Available       Sect. 2.1
4c     Available   Available        Available       Sect. 2.2
4d     Available   Available        Available       Sect. 2.3

5.1 Results and Discussion
Operation costs for the forecasted net load are listed in Table 8, and expected operation costs are summarized in Table 9. Figures 6, 7, 8 and 9 display the resulting operation schedules of the microgrid components. Figures 10, 11 and 12 show the corresponding transitions of the SOC levels.

Table 8. Numerical results (operation costs for forecasted net load)
Case   a           b           c           d
1      1941945.5   1956575.6   1956575.6   1980156.6
2      1913463.0   1920218.2   1917874.0   1949546.0
3      1953169.5   1967799.7   1961365.3   1991382.4
4      1924697.8   1931453.0   1929108.8   1960789.9
Table 9. Numerical results (expected operation costs)
Case   a   b   c           d
1      –   –   1987690.3   1980315.3
2      –   –   2040216.1   1949630.0
3      –   –   2046965.7   1991366.8
4      –   –   2051418.4   1960832.3

Fig. 6. Operation schedules in Case 1 [panels: Case 1a, 1b, 1c, 1d; horizontal axis: Time; vertical axis: Electric power [MW]; legend: g3, g2, g1, e, Net load]
Fig. 7. Operation schedules in Case 2 [panels: Case 2a, 2b, 2c, 2d; horizontal axis: Time; vertical axis: Electric power [MW]; legend: g3, g2, g1, s, e, Net load]
Fig. 8. Operation schedules in Case 3 [panels: Case 3a, 3b, 3c, 3d; horizontal axis: Time; vertical axis: Electric power [MW]; legend: g3, g2, g1, v, e, Net load]
Fig. 9. Operation schedules in Case 4 [panels: Case 4a, 4b, 4c, 4d; horizontal axis: Time; vertical axis: Electric power [MW]; legend: g3, g2, g1, s, v, e, Net load]
Fig. 10. SOC level in Case 2 [panels: Case 2a, 2b, 2c, 2d; horizontal axis: Time; vertical axis: State of charge [MWh]; legend: q, Qmax, Qmin]
Fig. 11. SOC level in Case 3 [panels: Case 3a, 3b, 3c, 3d; horizontal axis: Time; vertical axis: State of charge [MWh]; legend: p, Pmax, Pmin]
Fig. 12. SOC level in Case 4 [panels: Case 4a, 4b, 4c, 4d; horizontal axis: Time; vertical axis: State of charge [MWh]; legend: q, p, Qmax, Qmin, Pmax, Pmin]
By comparing Case 1 with Cases 2–4 in Tables 8 and 9, we can see that the ESSs and/or the CLs were operated appropriately to save the operation costs of the CGs. The operation costs and the expected costs in Case 2 were smaller than those in Case 1. For example, as shown in Figs. 6–9, the operation schedules in Cases 2–4 required an earlier start-up of CG 1 than in Case 1 (Case 1: 15:00, Case 2: 14:00, Case 3: 14:00, Case 4: 13:00). CG 1, as described in Table 5, is the costly CG, and the resulting operation costs in Cases 2–4 became high. In Case 3, the operation costs were the highest. This is because the CLs increase the total electricity consumption in the microgrid even though their operation is controllable. In contrast, the increase in operation costs was mitigated in Case 4 by the cooperative operation of the CGs, the ESSs and the CLs.

As shown in Table 8, the operation schedules in the general problem framework described in Sect. 2.1 (Cases 1a–4a and 1b–4b) were the best results from the viewpoint of operation cost for the assumed net load. This means that the operation cost increases in the other cases in order to secure a margin against uncertainty in actual operation. However, Cases 1a–4a and 1b–4b carry the potential risk of an imbalance penalty caused by unexpected changes in the VREG outputs. In Table 9, the expected operation costs in this problem framework could not be evaluated because of the setting of the imbalance penalty, which is why there are no values for these cases in the table. In addition, the expected operation costs in Cases 1c–4c were worse than the corresponding costs in Table 8. On the other hand, the expected operation costs in Cases 1d–4d were almost the same as those in Table 8, even though the potential risk was reflected in the expected operation cost. As a result, the expected operation costs in Cases 1d–4d were smaller than those in Cases 1c–4c, in contrast to the operation costs for the forecasted net load. From these results, we can conclude that the authors’ proposal described in Sect. 2.3 functioned well.

5.2 Additional Numerical Simulations
In the numerical simulations described in Sect. 5.1, the time-constrained ESS was not considered. As already mentioned, some controllable components, such as EVs, change their attributes depending on whether the conventional power grid accepts reverse power flow from them (accepted: ESS; not accepted: CL). Therefore, the authors determined the optimal operation schedules for Case 4 with the CLs replaced by time-constrained ESSs (listed as Case 5 in Tables 10 and 11). In this determination, the initial SOC level of the aggregated time-constrained ESS was set to 50% of its capacity, and the ESS had to be fully charged by the end of the available period. The available time of the time-constrained ESS was set from 9:00 PM to 7:00 AM. Table 10 summarizes the operation costs for the forecasted net load, and Table 11 lists the expected operation costs. Figures 13 and 14 show the resulting operation schedules of the microgrid components and their SOC levels, respectively. These results show a tendency similar to that of the results in Sect. 5.1. Owing to the discharging of the ESS, the results of Case 5 were better than those of Case 3.
Table 10. Results of additional numerical simulations (operation costs for forecasted net load)
Case   a           b           c           d
5      1923640.4   1930395.7   1928051.5   1959730.7

Table 11. Results of additional numerical simulations (expected operation cost)
Case   a   b   c           d
5      –   –   2050349.4   1959770.9
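The comparison between Case 5 and Cases 3 and 4 can be checked with a few lines of arithmetic. The sketch below copies the forecasted-net-load costs from Tables 8 and 10 and reports the relative saving of Case 5 for each problem framework (a to d). The names `FORECAST_COST` and `saving_percent` are illustrative and do not come from the chapter.

```python
# Operation costs for the forecasted net load, copied from Tables 8 and 10.
FORECAST_COST = {
    3: {"a": 1953169.5, "b": 1967799.7, "c": 1961365.3, "d": 1991382.4},
    4: {"a": 1924697.8, "b": 1931453.0, "c": 1929108.8, "d": 1960789.9},
    5: {"a": 1923640.4, "b": 1930395.7, "c": 1928051.5, "d": 1959730.7},
}


def saving_percent(reference, new):
    """Relative cost reduction of `new` with respect to `reference`, in percent."""
    return 100.0 * (reference - new) / reference


for framework in "abcd":
    vs_case3 = saving_percent(FORECAST_COST[3][framework], FORECAST_COST[5][framework])
    vs_case4 = saving_percent(FORECAST_COST[4][framework], FORECAST_COST[5][framework])
    print(f"{framework}: {vs_case3:.2f}% vs Case 3, {vs_case4:.3f}% vs Case 4")
```

For every framework the saving relative to Case 3 is between 1% and 2%, while the saving relative to Case 4 is below 0.1%, so replacing the CLs with time-constrained ESSs changes the forecasted cost of Case 4 only marginally.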
Fig. 13. Operation schedules in Case 5 [panels: Case 5a, 5b, 5c, 5d; horizontal axis: Time; vertical axis: Electric power [MW]; legend: g3, g2, g1, s1, s2, e, Net load]
Fig. 14. SOC level in Case 5 [panels: Case 5a, 5b, 5c, 5d; horizontal axis: Time; vertical axis: State of charge [MWh]; legend: q1, q2, Q1max, Q1min, Q2max, Q2min]
6 Conclusions

This chapter presented several problem frameworks and their solution methods for obtaining coordinated operation schedules of the controllable components in microgrids. In Sect. 2, three problem frameworks were discussed. They differ in whether the electricity trade is considered and whether the uncertainty is considered. In Sect. 3, the standard PSO and the BPSO were selected as the basis of the solution methods for the formulated problems. With a view to applying the BPSO, the authors proposed a problem transformation that reduces the dimension of the solution space handled by the nature-inspired metaheuristic algorithms from $(u, g, s, v, e)$ to $u'$. In other words, the proposed transformation improved the compatibility between the target problems and their solution methods. In Sect. 4, preliminary numerical simulations were performed to evaluate the performance of the standard PSO and the BPSO-QP on the target problems. From the discussion of their results, we could conclude that the solutions of the standard PSO were less accurate than those of the proposed BPSO-QP. On the other hand, the calculation time of the BPSO-QP remained a challenge. In Sect. 5, the differences between the problem frameworks were examined. As a result, we could conclude that each problem framework functioned appropriately. In future work, the authors will resolve or mitigate the calculation time issue of the BPSO-QP. Furthermore, the authors will propose more effective solution methods that take into account the characteristics of the target problems and/or the nature-inspired metaheuristic algorithms.

Acknowledgements. The authors would like to acknowledge the support provided by the Japan Society for the Promotion of Science (KAKENHI Grant Numbers 16K06215 and 19K04325) and the Gifu Renewable Energy System Research Center of Gifu University. Contributions to this study by Ryota Goto and Kan Nakae, who are pursuing their master’s degrees at Gifu University, are also acknowledged.
References

1. Office of Electricity Delivery and Energy Reliability (2012) DOE microgrid workshop report. Summary Report
2. Ton DT, Smith MA (2012) The U.S. Department of Energy’s microgrid initiative. Electr J 25(8):84–94
3. Hatziargyriou N, Asano H, Iravani R, Marnay C (2007) Microgrids for distributed generation. IEEE Power and Energy Magazine
4. Liu CC, McAuthur S, Lee SJ (2016) Smart grid handbook. In: 3 Volume Set. Wiley
5. Investigating R&D Committee on advanced power system (2011) Current status of advanced power systems including microgrid and smartgrid (in Japanese). IEEJ Technical Report 1229
6. New Energy and Industrial Technology Development Organization (2018) Case Studies of Smart Community Demonstration Project. http://www.nedo.go.jp/english/reports_20130222.html. Access date: 31 May 2019
7. Kerr RH, Scheidt JL, Fontana AJ, Wiley JK (1966) Unit Commitment. IEEE Trans Power App Syst PAS-85:417–421
8. Sen S, Kothari DP (1989) Optimal thermal generating unit commitment: a review. Int J Electr Power Energy Syst 20(7):443–451
9. Hobbs BF, Rothkopf MH, O’Neill RP, Chao HP (2001) The next generation of electric power unit commitment models. In: International series in operations research & management science, vol 36
10. Padhy NP (2004) Unit commitment - a bibliographical survey. IEEE Trans Power Syst 19(2):1196–1205
11. Bhardwaj A, Tung NS, Kamboj V (2012) Unit commitment in power system: a review. Int J Power Eng 6(1):51–57
12. Saravanan B, Das S, Sikri S, Kothari DP (2013) A solution to the unit commitment problem - a review. Front Energy 7(2):223–236
13. Zheng QP, Wang J, Liu AL (2015) Stochastic optimization for unit commitment - a review. IEEE Trans Power Syst 30(4):1913–1924
14. Snyder WL, Powell HD, Raiburn JC (1987) Dynamic programming approach to unit commitment. IEEE Trans Power Syst 2(2):339–348
15. Ouyang Z, Shahidehpour SM (1991) An intelligent dynamic programming for unit commitment application. IEEE Trans Power Syst 6(3):1203–1209
16. Cohen AI, Yoshimura M (1983) A branch-and-bound algorithm for unit commitment. IEEE Trans Power App Syst PAS-102(2):444–451
17. Chen CL, Wang SC (1993) Branch-and-bound scheduling for thermal generating units. IEEE Trans Energy Conversion 8(2):184–189
18. Kazarlis SA, Bakirtzis AG, Petridis V (1996) A genetic algorithm solution to the unit commitment problem. IEEE Trans Power Syst 11(1):83–92
19. Mantawy AH, Abdel-Magid YL, Selim SZ (1998) A simulated annealing algorithm for unit commitment. IEEE Proc Generation Trans Distribution 145(1):56–64
20. Simopoulos DN, Kavatza SD, Vournas CD (2006) Unit commitment by an enhanced simulated annealing algorithms. IEEE Trans Power Syst 21(1):68–76
21. Takano H, Zhang P, Murata J, Hashiguchi T, Goda T, Iizaka T, Nakanishi Y (2015) A determination method for the optimal operation of controllable generators in micro grids that copes with unstable outputs of renewable energy generation. Electr Eng Japan 190(4):56–65
22. Jeong YW, Park JB (2010) A new quantum-inspired binary PSO: application to unit commitment problem for power systems. IEEE Trans Power Syst 25(3):1486–1495
23. Hayashi Y, Miyamoto H, Matsuki J, Iizuka T, Azuma H (2008) Online optimization method for operation of generators in micro Grid (in Japanese). IEEJ Trans PE128-B(2):388–396
24. Juste KA, Kita H, Tanaka E, Hasegawa J (1999) An evolutionary programming solution to the unit commitment problem. IEEE Trans Power Syst 14(4):1452–1459
25. Rajan CCA, Mohan MR (2004) An evolutionary programming-based tabu search method for solving the unit commitment problem. IEEE Trans Power Syst 19(1):577–585
26. Lu B, Shahidehpour M (2005) Short-term scheduling of battery in a grid-connected PV/battery system. IEEE Trans PES 20(2):1053–1061
27. Palma-Behnke R, Benavides C, Lanas F, Severino B, Reyes L, Llanos J, Saez D (2013) A microgrid energy management system based on the rolling horizon strategy. IEEE Trans Smart Grid 4(2):996–1006
28. Li N, Uckun C, Constantinescu EM, Birge JR, Hedman KW, Botterud A (2016) Flexible operation of batteries in power system scheduling with renewable energy. IEEE Trans Sustain Energy 7(2):685–696
29. Hammati R, Saboori H (2016) Short-term bulk energy storage scheduling for load leveling in unit commitment: modeling, optimization, and sensitivity analysis. J Adv Res 7(3):360–372
30. Soe TZ, Takano H, Shiomi R, Taoka H (2018) Determination method for optimal cooperative operation plan of microgrids by providing alternatives for microgrid operators. J Int Council Electr Eng 8(1):103–110
31. Takano H, Nagaki Y, Murata J, Iizaka T, Ishibashi T, Katsuno T (2016) A study on supply and demand planning for Power Producer-Suppliers utilizing output of megawatt solar plants. J Int Council Electr Eng 6(1):102–109
32. Clerc M (2006) Particle swarm optimization. ISTE Ltd
33. Lee S, Soak S, Oh S, Pedryczm W, Jeon M (2008) Modified binary particle swarm optimization. Progress Natural Sci (18):1161–1166
Chapter 11
Modified Monkey Search Technique Applied for Planning of Electrical Energy Distribution Systems

F. G. Duque, L. W. De Oliveira, E. J. De Oliveira, B. H. Dias(✉), and C. A. Moraes
Department of Electrical Energy, Federal University at Juiz de Fora (UFJF), Juiz de Fora, Brazil
{Felipe.duque,camile.aredes}@engenharia.ufjf.br, {leonardo.willer,edimar.oliveira}@ufjf.edu.br, {Felipe.duque
1 Introduction

An electric power system (EPS) has the basic function of supplying quality electrical energy to commercial, industrial, and residential consumers. An EPS is defined as a set of equipment that operates in a coordinated way to convert, transmit, and supply energy to consumers, keeping the quality standard as high as possible and meeting requirements such as [1]: continuity of supply; compliance with required standards; flexibility to adapt to possible network topology changes; and maintenance, with quick recovery in case of faults. The new philosophy of electric power systems, with lower cost and higher quality-of-service requirements, coupled with their modernization, has led to the search for more efficient solutions to issues such as monitoring.

An EPS comprises electrical distribution systems (EDS), which have a significant impact on the planning of electric networks [2] and are highly complex [3, 4]. Among the challenges for EDS are the diversity of loads connected to the feeders and their variations throughout the day, as well as the presence of distributed generation (DG). To ensure that operation requirements are met, the National Electric Energy Agency (ANEEL) regulates the EDS in Brazil [5]. Modern EDS are changing due to their growth and the development of new technologies that allow a more flexible, dynamic and efficient operation, such as intelligent electronic devices (IEDs), smart meters (SM), and phasor measurement units (PMU) [6, 7]. The planning of measurement systems in EDS has objectives that differ from those of transmission systems due to the presence of unbalanced loads between phases, increased DG, and radial configuration [8]. Therefore, different strategies should be considered for allocating meters in EDS.

This scenario requires advanced monitoring, control, and protection under network conditions. Since complete instrumentation for monitoring is economically unfeasible, the state estimation (SE) process is crucial for the control of the operation [1]. This was studied in terms of accuracy and allocation of measurement devices in [9, 10]. In such studies, traditional measurements such as voltage and active and reactive power at the substation (SS), together with low-precision pseudo-measurements based on historical data [11], are used in the SE to overcome the low availability of direct measurements. However, the uncertainties in the pseudo-measurements affect the accuracy of the SE. Moreover, online monitoring is important for modern distribution networks, particularly with regard to concepts such as smart grid and self-healing, which require an advanced data acquisition infrastructure [12]. Among the relevant data for supervision and control are real-time synchronized voltage and current phasors. In this context, synchronized PMU emerge [13]. However, a technical–economic feasibility analysis should be conducted to support the decision-making process on investments in various measurement systems.

© Springer Nature Singapore Pte Ltd. 2020
M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_11

Tools for planning measurement systems must handle mixed types of measurements and meters, such as SM and PMU. In addition, these tools should be able to represent the equipment capacities, accuracies and related costs, as well as traditional substation measurements and low-precision pseudo-measurements. Finally, the different requirements of cost and state estimation must be modeled properly in the light of different indexes for decision making.

The smart grid concept opens a range of research to improve the EDS service quality, with the components of an electric network operating as independent agents with intelligence, competing and cooperating with each other to achieve general objectives [14]. Hence, the role of a state estimator is crucial in modern energy management systems, due to the diversity of applications that depend on accurate real-time data to estimate the operational condition [15].
In particular, the problem of meter allocation for state estimation consists of deciding the type, quantity, and location of the equipment to be strategically placed, aiming at better accuracy and reduced costs to ensure the maximum efficiency of the distribution service. The optimal allocation of meters in an EDS is a complex task that must be aided by efficient computational tools able to handle mixed-integer and combinatorial problems. In this context, metaheuristics have the potential to be applied due to their ability to find solutions with a good trade-off between quality and computational effort.

Metaheuristic algorithms are present in many optimization applications, but identifying the most suitable metaheuristic for a given problem is a challenge. In this context, [16] studied the effectiveness of these algorithms in providing optimal solutions to complex problems. The principles of Darwinian evolution for solving optimization problems have been applied repeatedly in genetic algorithms (GA) [17–19]. In [19], the main objective is to show the effect of optimizing the capacity and location of generators, both to reduce transmission investment and to increase network reliability. A very recent variation of GA applies Mendelian evolution to multiple species, as inspired by plant biology [20], incorporating the use of double-strand DNA for evolution. The adapted bat-inspired algorithm (ABA) [21] combined with search space shrinking (SSS) resulted in an efficient hybrid algorithm (EHA) for transmission network expansion planning (TEP). The SSS technique has a crucial role in defining the initial candidates of the ABA, which considerably reduces the solution search space and thus improves the computational performance of the proposed ABA. Low-frequency interference that corrupts electrocardiogram (ECG) signals must be estimated and removed, a task motivated by the importance of the ECG in assessing a patient’s health condition. Particle swarm optimization (PSO) is used in [22] to investigate a faster parameterization technique for one-stage morphological ECG baseline estimation.

This chapter presents an approach to the state estimation problem in EPS. To solve it, the modified monkey search (MMS) technique is presented, which stands out for its simplicity, its effectiveness, and the very promising results obtained in previous applications.

1.1 Contributions
Regarding the context of meter allocation in distribution systems, the contributions of this chapter are:
• The description of the application of a recent method in the literature to handle the problem at hand, based on an improved metaheuristic;
• The improved algorithm modifies the original monkey search (MS) algorithm to properly represent the features and constraints of the meter allocation problem;
• The aforementioned modifications are described in detail to contribute to the literature on metaheuristic applications;
• Regarding the meter allocation problem, important aspects and constraints such as topological variations are considered.

1.2 Organization
The remainder of the chapter is organized as follows: Sect. 2 presents the original monkey search and the improved algorithm with their mathematical foundations; Sect. 3 describes the application of the MMS algorithm to the meter allocation problem; Sect. 4 shows the simulations and case studies with an appropriate discussion of the results. The concluding remarks are made in Sect. 5.

2 Background and Related Works

2.1 Original Monkey Search
The optimization technique called monkey search (2007) is inspired by the behavior of a monkey searching for food in a jungle [23]. The search proceeds through up and down movements in trees that have food sources. During the search, the monkey stores and updates in its memory the best routes found. This adaptive memory is then used to obtain more promising routes. The MS technique associates the mechanisms of adaptive memory and evolution of routes with a search for solutions of combinatorial optimization problems. This association can be summarized as:
• The root and “nodes” of a tree contain food sources that are related to possible solutions of an optimization problem;
• A branch connects two consecutive nodes of a tree;
• The path is formed by the set of nodes from the root to the top of a tree;
• The height of a tree (h) determines the number of nodes of a path;
• Adaptive memory (AM) stores the best solutions acquired during the search for solutions, which are used to lead the search process. In the optimization problem, AM is also used to evaluate convergence.

Once the search is started, the monkey begins to traverse the branches of the trees, which are encoded by a binary structure [24]. The monkey climbs a tree from its root to its highest point (top), whose height is a parameter of the algorithm. The sequence of chosen branches is referred to as a path from the root to the top. With every feasible solution found, the monkey updates the data in its memory. Figure 1 shows the binary structure of a tree with root given by node “A” and height h = 6, where, from each node, one of two options can be chosen and each option leads to a new node or candidate solution. From the root, the process can evolve to node “B” through the right branch, or to node “C” through the left branch. A possible coding for the decisions taken is “0” (decision for the branch on the right) and “1” (decision for the branch on the left).
Fig. 1. Tree structure: a random search; b targeted search
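The branch coding can be illustrated with a short sketch. It assumes that a path is stored as a sequence of bits, one per branch, with 0 for the right branch and 1 for the left branch as stated above. The names `decode_path` and `all_paths` are illustrative and are not taken from the monkey search literature.

```python
from itertools import product

# Coding used in the text: 0 = right branch, 1 = left branch.
BRANCH = {0: "right", 1: "left"}


def decode_path(bits):
    """Translate a bit sequence into the branch taken at each level."""
    return [BRANCH[b] for b in bits]


def all_paths(h):
    """Enumerate every root-to-top path of a binary tree of height h."""
    return list(product((0, 1), repeat=h))


# A path with six branches, as in a tree of height h = 6.
print(decode_path([1, 0, 0, 1, 0, 0]))
print(len(all_paths(6)))
```

Since each of the h levels offers two choices, a tree of height h has 2^h distinct root-to-top paths, which is why an exhaustive exploration is only practical for small heights.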
Considering that the reinforced branches are visited in the random search of Fig. 1a, the covered path “A-C-D-E-K-L-M” can be represented by its binary coding as 1(“A-C”)-0(“C-D”)-0(“D-E”)-1(“E-K”)-0(“K-L”)-0(“L-M”). The rise from one node to another occurs through a branch and consists of obtaining a new solution (upper node) from the current solution (lower node). This is accomplished through a perturbation of the current solution. The perturbation of one solution can result in two other solutions (nodes), and the chosen solution is the better of the two. A perturbation of the current solution can be an increment or decrement in the value of an element of the solution vector. Therefore, the following must be defined:
• What elements should be modified?
• What should be the increase?
• What should be the decrease?
• How many elements should be changed?

It is worth mentioning that all these decisions affect the search process. A large-scale change may modify the current solution too much and greatly affect its quality; that is, the change may cause the current solution to escape from a good point. On the other hand, very small changes may cause the current solution to become trapped in suboptimal points.

2.2 Modified Monkey Search Algorithm
The modified monkey search algorithm consists of two basic phases. The first phase generates a set of initial solutions to load the adaptive memory with good-quality solutions. The second phase is the climbing process in the trees to search for better solutions until the adaptive memory contains only solutions of high quality, which identifies the convergence of the process. Each phase of the algorithm is described as follows.

Phase 1: Search for the Starting Tree

At the beginning of the optimization process, the AM is empty. The process of climbing the tree starts from an initial solution. In the allocation of meters, the initial solution corresponds to the base case (no meter installed). As defined, the tree root is associated with a solution to the optimization problem; more specifically, it corresponds to the best individual, “ibest.” At the beginning of the process, the root corresponds to the base case. When the MMS method starts the search process, there is no information about the paths to be investigated. In the present work, an exhaustive search from the root to the top was adopted for this phase. That is, perturbing a solution yields two other solutions, and so on up to the top. The maximum number of paths explored in the tree (possible solutions) is given by Eq. (1). Thus, for example, for a tree height equal to eight (h = 8) there are 256 possible solutions starting from the root.

$$c = 2^{h} \qquad (1)$$

where c is the possible number of solutions of a tree and h is the height of the tree. From the results of this exhaustive search, the adaptive memory mechanism starts to store a set of solutions that will serve as references for the next trees. In the present work, the AM always stores the ten best nodes (solutions) in descending order of quality, so that the ibest (root) is the first in the list. Figure 2a depicts the exploration of the initial tree, in which all initially unexplored paths are traversed during the search process, resulting in the “complete” exploration of the tree.

Fig. 2. Tree exploration steps

Phase 2: Search for Subsequent Trees

The subsequent trees investigated in the MMS search process are obtained from perturbations of the best solution found; that is, a tree always starts from the “ibest” of the AM. However, in this case, there is no exhaustive search. In other words, only the better of the two nodes generated by each perturbation is visited. In this way, the top of the tree is reached after h perturbations from the root. Figure 2b depicts the exploration of a subsequent tree, in which only some of the initially unexplored paths are traversed during the search process, resulting in the “partial” exploration of the tree.

2.3 MMS Flowchart
Figure 3 shows the flowchart of the proposed MMS algorithm, which consists of nine steps. The flowchart is described in detail below:
Step 1: Data entry. In this step, the data of the system to be analyzed and the parameters of the MMS method are defined, such as the height of the trees (h), the size of the AM, and the maximum number of trees to be visited (ia_max).
Step 2: Climb up the starting tree. This step consists of exploring the starting tree from its root. The root may be any solution, and once it is known its fitness can be computed. From there, the solution is successively disturbed until the top of the tree is reached, with the evaluation of all the paths (solutions).
Step 3: Start AM. The “n” best solutions according to fitness, where “n” is the size of the AM, are stored in the AM in descending order of quality. In this case, the “ibest” is clearly identified as the first element of the AM.
Fig. 3. MMS proposed algorithm flowchart
Step 4: Disturb “ibest”. Beginning of the exploration (climb) of the subsequent tree Ai. This step is always performed from “ibest” in an attempt to find an even better solution. The move from one node of the tree to another at a higher level is performed through the solution perturbation mechanism described above.
Step 5: Choose a new solution. As already described, the upper node chosen is the one with the better fitness of the two nodes generated by the disturbance.
Step 6: Evaluate the new solution. Two cases may occur: (i) if the new solution has better fitness than “ibest,” it becomes the “ibest” and a new tree must be analyzed, so the process returns to step 4; (ii) if the new solution is not better than “ibest,” the AM is updated whenever this solution is better than some AM solution. In that update, the worst solution of the AM is abandoned to make room for the better one. In this case, the algorithm goes to step 7.
Step 7: Convergence test. The global convergence of the MMS algorithm is achieved when all adaptive memory solutions are close enough to each other. If this is not achieved, the algorithm ends when all trees have been visited (ia_max). The required proximity is given by a tolerance whose value is important for the performance of the algorithm. A very high value implies a shorter computing time but may lead to premature convergence to a suboptimal solution, whereas very low values result in high processing times, compromising the efficiency of the method. If convergence has not been achieved, the algorithm continues to step 8 to check whether the top of the tree has been reached.
Step 8: Check top. This step verifies whether the top has been reached. If so, no solution better than “ibest” was found, and the algorithm must return to step 4 to perform a new perturbation from the root (“ibest”). This procedure is called the descent of the tree. Otherwise, the climb continues and the algorithm goes to step 9.
Step 9: Disturb the solution. As already described, two new solutions are obtained by perturbing the current solution. From there, the algorithm returns to step 5.
The algorithm details are given below:
Global variables initialization: h, c, AM, ε, α, β, ia_max, nperi, CPMU, CSM;
Initial tree
    Base solution (without meters): definition of the root solution by applying Eq. (5) with NPMU = NSM = 0;
    Search in the initial tree: search all tree paths by disturbing the current solution;
        Variations in Nb (increment and/or decrement) subject to (NPMU + NSM ≤ Nb – 1);
        For each disturbed solution apply Eq. (5);
Calculations of subsequent tree parameters
    ibest = best initial tree solution;
    tolerance = AM(10) – AM(1);
    Initialization of the subsequent tree counter (ih = 1);
While tolerance > ε and ih < ia_max
    For ia = 1:c, or until ibest(ih) < ibest(ih-1)
        Variations in Nb (increment and/or decrement) subject to (NPMU + NSM ≤ Nb – 1);
        For each disturbed solution apply Eq. (5);
        Comparison of each disturbed solution with ibest;
        If the disturbed solution is better than ibest
            Updating of AM and ibest;
        Else
            Updating of AM;
        End If
    End For
    Tolerance updating;
    Increment in the tree counter (ih = ih + 1);
End While
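The procedure above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the objective `toy_obf`, the perturbation `flip_one`, and the function name `mms` are hypothetical stand-ins chosen only to make the loop runnable, and the sketch assumes a minimization problem in which duplicate solutions may occupy the adaptive memory. The initial tree is searched exhaustively, while each subsequent tree keeps only the better of the two children per level and stops as soon as a solution better than “ibest” appears.

```python
import random

def flip_one(x, rng):
    """Hypothetical perturbation: two children, each with one random bit flipped."""
    children = []
    for _ in range(2):
        i = rng.randrange(len(x))
        children.append(x[:i] + (1 - x[i],) + x[i + 1:])
    return children

def toy_obf(x):
    """Hypothetical objective to minimize (NOT Eq. (5)): favors three meters on low-numbered buses."""
    return (sum(x) - 3) ** 2 + 0.01 * sum(i for i, bit in enumerate(x) if bit)

def mms(objective, perturb, root, h=8, am_size=10, eps=0.0, ia_max=20, seed=0):
    rng = random.Random(seed)
    # Initial tree: exhaustive search, every node spawns two children up to height h.
    level, visited = [root], [root]
    for _ in range(h):
        level = [child for node in level for child in perturb(node, rng)]
        visited.extend(level)
    am = sorted(visited, key=objective)[:am_size]  # best (lowest objective) first
    trees = 1
    # Subsequent trees: greedy climb from ibest until global convergence or ia_max trees.
    while objective(am[-1]) - objective(am[0]) > eps and trees < ia_max:
        ibest = current = am[0]
        for _ in range(h):
            current = min(perturb(current, rng), key=objective)
            if objective(current) < objective(am[-1]):
                # Insert by quality and drop the worst entry to keep the AM size fixed.
                am = sorted(am + [current], key=objective)[:am_size]
            if objective(current) < objective(ibest):
                break  # a new ibest was found: the next tree starts from it
        trees += 1
    return am, trees

am, trees = mms(toy_obf, flip_one, root=(0,) * 10)
print(trees, toy_obf(am[0]), toy_obf(am[-1]))
```

With h = 8 the initial tree visits 2^8 = 256 top-level solutions, whereas each subsequent tree evaluates at most 2h = 16 candidates, which is where the saving in computational effort comes from.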
2.4 Differences Between the MS and MMS Algorithms
The modifications that distinguish the modified method (MMS) from the basic method (MS) were needed to improve the performance of the algorithm for its specific application to meter allocation in distribution systems. The differences between MS and MMS are described below:
• The first difference lies in the climbing of subsequent trees. While the proposed MMS evaluates only the best solution of the AM (“ibest”), the original MS evaluates any of the solutions belonging to the AM set;
• The original MS treats all trees equally (initial tree = subsequent tree), and its exploration consists of an exhaustive search in every tree. The proposed MMS treats the trees differently (initial tree ≠ subsequent tree): only the initial tree is searched exhaustively, whereas the subsequent trees are explored until the tree convergence criterion is met;
• The original MS defines a path as a set of explored branches that does not necessarily follow the root-to-top sequence; it may start at any level of the tree (≠ the root) and end at another level above (≠ the top). The proposed MMS defines a path as a sequence of branches explored from the root to the top (root-top) or to the level at which the convergence of the tree is obtained;
• Finally, the most important aspect is the intensification of the current solution. After a series of executions, the proposed MMS creates a rank that allows a reduction of the search space (stagnation of the search process), thus improving the quality of the solution and the computational time. No such procedure exists in the original MS.

2.5 MS and MMS Algorithm Approach
The work described in [23] demonstrated that MS is competitive with other metaheuristic methods. To demonstrate the efficiency of the method, tests were performed to optimize the Lennard-Jones function and Morse clusters, and to simulate protein molecules based on a geometric model for protein folding. The following year, Ref. [25] used the MS algorithm to solve optimization problems with continuous variables, applying it to benchmark problems with 30, 1000, or even 10,000 dimensions and showing that MS can find good-quality solutions for large dimensions. Reference [24] applied the metaheuristic to multidimensional assignment problems (MAP), and Ref. [26] applied the MS technique to the discretizable molecular distance geometry problem (DMDGP).

Regarding the MMS algorithm, three articles have been published in the area of power systems. First, Ref. [27] proposed improvements to the MS optimization technique to obtain a better representation of the capacitor allocation problem and to increase computational efficiency. The objective of that problem is to optimize the operation of the distribution network over a planning horizon, minimizing system losses with minimum investment cost in capacitors. Later, Ref. [28] proposed a model that considers different load levels, voltage limit restrictions, and practical values for fixed and switched capacitor banks, as well as for unit costs and the emission coefficient. The objectives are to minimize energy loss, improve voltage levels, and reduce the emission of carbon dioxide. Finally, Ref. [29] presented an extended optimal power flow (E-OPF) for state estimation in energy distribution systems that considers different network configurations. The objective function combines the state estimation error (SEE) of the weighted least squares (WLS) approach with additional indexes related to state variables, which improve the state estimation process.
3 Problem Formulation

3.1 Modeling of the Meter Allocation Problem via MMS
The main aspects of the proposed MMS algorithm applied to the meter allocation problem, such as parameter definition, convergence criteria, and perturbation mechanisms, are discussed in this section.

Tree Parameters
For the meter allocation problem, the initial solution is not random and corresponds to the base case, that is, the system without allocation of meters. This solution is defined as the root of the initial tree. The tree height (h) is of great importance since it determines the number of nodes to be investigated and, consequently, the number of solutions obtained. The determination of this parameter should consider the following aspects:
• A high value of h increases the probability of obtaining good solutions, but implies a high number of paths, candidate solutions, and computational effort;
• A low value of h implies low computational effort but also limits the search space, which can affect the quality of the solution.
Hence, the choice of h for a given problem is a compromise between solution quality and computational requirements.

Solution Perturbation Mechanism
The perturbation mechanism is required to derive new candidate solutions. For the meter allocation problem, this mechanism consists of increasing or decreasing the number of meters at a random bus of a randomly chosen solution. The number of increments or decrements is also random, from one to three variations. In order to exemplify this mechanism, a 10-bus system is considered, for which the allocation of Table 1 is a candidate solution.

Table 1. Meter allocation of a candidate solution
Bus no.        1  2  3  4  5  6  7  8  9  10
No. of meters  1  1  0  0  0  1  0  1  0  1
According to Table 1, the solution establishes meters in buses 1, 2, 6, 8, and 10. To apply the disturbance, the number of increments is taken as 1 and the number of decrements as 2. The randomly chosen buses are bus 3 for the increment and buses 6 and 8 for the decrements. Thus, after this perturbation, the solution of Table 2 is derived.

Table 2. New meter allocation candidate solution
Bus no.        1  2  3  4  5  6  7  8  9  10
No. of meters  1  1  1  0  0  0  0  0  0  1
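The perturbation of Table 1 into Table 2 can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch assumes at most one meter per bus (as in the two tables) and that each of the one to three variations is independently an increment or a decrement; the text does not specify how that split is drawn.

```python
import random

def apply_perturbation(meters, increments, decrements):
    """Deterministic move: add a meter at each bus in `increments`, remove one at each bus in `decrements`."""
    new = list(meters)
    for bus in increments:      # buses are numbered from 1, as in the tables
        new[bus - 1] = 1
    for bus in decrements:
        new[bus - 1] = 0
    return new

def random_perturbation(meters, rng, max_variations=3):
    """Random move with one to `max_variations` variations (assumed split between increments and decrements)."""
    new = list(meters)
    for _ in range(rng.randint(1, max_variations)):
        empty = [k for k, m in enumerate(new) if m == 0]
        occupied = [k for k, m in enumerate(new) if m == 1]
        moves = (["inc"] if empty else []) + (["dec"] if occupied else [])
        if rng.choice(moves) == "inc":
            new[rng.choice(empty)] = 1
        else:
            new[rng.choice(occupied)] = 0
    return new

table1 = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]
table2 = apply_perturbation(table1, increments=[3], decrements=[6, 8])
print(table2)  # allocation of Table 2

sample = random_perturbation(table1, random.Random(1))
```

Because every variation touches a single bus, a perturbed solution differs from its parent in at most three positions.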
Adaptive Memory
The adaptive memory of the proposed MMS algorithm consists of a list containing the ten current best solutions. This memory is formed during the search in the initial tree and is updated in the subsequent trees. In order to explain the formation and updating of the adaptive memory, AM in Eq. (2), it is considered that, after searching the initial tree, it comprises the solutions AM1 to AM10, arranged in descending order of quality.

$$AM = [AM_1, AM_2, AM_3, AM_4, AM_5, AM_6, AM_7, AM_8, AM_9, AM_{10}] \tag{2}$$

The vector of Eq. (2) is obtained through an exhaustive search in the initial tree (A1). The best solution, whose value is AM1, is defined as the root of the first subsequent tree (A2). This solution is perturbed until the convergence criterion of the subsequent tree is reached. The memory is updated whenever a solution better than any member of the set [AM1: AM10] is found. The new solution is inserted into this set at the position defined by its quality, and the subsequent values are shifted to the right. The value of the last position, AM10, is discarded and replaced by the value stored in AM9 before the shift. Notice that the size of the memory AM remains the same, with ten positions. For instance, suppose that during the search in tree A2 two solutions, AMx and AMy, are found that are better than AM3 and AM6 of the initial memory of Eq. (2), respectively. In this case, the new configuration of the adaptive memory, updated after the search in tree A2, is shown in Eq. (3).

$$AM = [AM_1, AM_2, AM_x, AM_3, AM_4, AM_5, AM_y, AM_6, AM_7, AM_8] \tag{3}$$
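The update rule can be sketched in a few lines of Python. The numeric values below are hypothetical objective values chosen only for illustration (lower is better, since the OBF is minimized), and `update_am` is an illustrative name; a new solution is placed immediately ahead of the first stored solution it beats, and the worst entry is dropped so that the memory keeps ten positions.

```python
import bisect

def update_am(am, candidate, size=10):
    """Insert `candidate` into the ascending list `am` if it beats the worst entry; keep `size` entries."""
    if len(am) == size and candidate >= am[-1]:
        return list(am)                # not better than any stored solution
    updated = list(am)
    bisect.insort(updated, candidate)  # position defined by quality
    return updated[:size]              # the last position is discarded

am = [float(v) for v in range(1, 11)]  # hypothetical AM1..AM10 = 1.0 .. 10.0
am = update_am(am, 2.5)                # "AMx": better than AM3 = 3.0
am = update_am(am, 5.5)                # "AMy": better than AM6 = 6.0
print(am)  # [1.0, 2.0, 2.5, 3.0, 4.0, 5.0, 5.5, 6.0, 7.0, 8.0]
```

After the two insertions, the former AM9 and AM10 have left the memory, one per update.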
It is observed that the memory update is not limited to solutions better than AM1 (“ibest”). In the example, AMx and AMy are not better than AM1 but are better than AM3 and AM6 of tree A1. This update strategy allows a faster convergence of the algorithm and thus increases the computational efficiency. For the meter allocation problem, the AM is organized in ascending order of the objective function (OBF), which must be minimized.

Solution Intensification Process
During the search process of the proposed MMS algorithm, the set of covered branches is marked. In the meter allocation problem, the maximum quantity of meters is initially not limited. After a given number of covered branches, the number of meters is limited to the value that occurs most often in the solutions of this set. Thus, from this point up to the convergence, the perturbation mechanism changes the buses that receive meters but maintains the total number of meters, which reduces the search space and speeds up the algorithm. As an example, in the hypothetical 10-bus system, after 300 iterations of the MMS algorithm, a “rank” is created for the number of meters observed in the covered solutions, as in Table 3, where 100 solutions have 5 meters, 45 solutions have 6 meters, and so on.
Table 3. Rank for the number of meters in the found solutions
Number of meters  1  2  3   4   5    6   7   8  9  10
Score             0  0  10  25  100  45  15  5  0  0
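The rank of Table 3 can be sketched in Python as follows. The function names are hypothetical, the threshold of 100 is the value of nperi reported in Sect. 4.1, and the order in which the solutions are encountered is an assumption (a shuffled sequence built from the scores of Table 3). Once one meter count reaches the threshold, the sketch switches to a move that relocates a meter without changing the total.

```python
import random
from collections import Counter

def locked_meter_count(meter_counts, nperi=100):
    """Return the first number of meters whose rank reaches `nperi`, or None if none does."""
    rank = Counter()
    for n in meter_counts:
        rank[n] += 1
        if rank[n] >= nperi:
            return n
    return None

def swap_perturbation(meters, rng):
    """Move one meter to an empty bus, keeping the total number of meters unchanged."""
    occupied = [k for k, m in enumerate(meters) if m == 1]
    empty = [k for k, m in enumerate(meters) if m == 0]
    new = list(meters)
    if occupied and empty:
        new[rng.choice(occupied)] = 0
        new[rng.choice(empty)] = 1
    return new

scores = {3: 10, 4: 25, 5: 100, 6: 45, 7: 15, 8: 5}   # non-zero scores of Table 3
history = [n for n, s in scores.items() for _ in range(s)]
rng = random.Random(0)
rng.shuffle(history)                                    # assumed order of discovery

locked = locked_meter_count(history)
print(locked)  # 5

solution = [1, 1, 1, 0, 0, 1, 0, 0, 0, 1]               # hypothetical solution with 5 meters
moved = swap_perturbation(solution, rng)
```

Only the count “5” can reach the threshold with these scores, so the result does not depend on the shuffle.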
The number of covered branches after which the number of meters is limited is a parameter. A low value implies low processing time but can limit the search space too much, whereas a high value can affect the computational efficiency. Thus, this value must balance computational time against the quality of the solution.

Convergence Criterion
In MMS, the convergence criterion for the initial tree differs from the criterion for subsequent trees, as described below.
• Criterion for the initial tree: The initial tree convergence is achieved when all paths of this tree are covered (exhaustive search for paths).
• Criteria for subsequent trees: The subsequent tree convergence is achieved when at least one of the following conditions is met: (i) When the solution obtained through a perturbation is better than the root solution of the tree (ibest).
In addition to the tree convergence criteria, which allow the transition from one tree to another, there is the global convergence criterion of the MMS algorithm, which is met when the difference between the objective functions of the solutions in the last and the first positions of the adaptive memory is less than or equal to a tolerance ε, as formulated in Eq. (4) for a given tree Ai.

$$OBF(AM_{10}) - OBF(AM_1) \le \varepsilon \tag{4}$$

where OBF(AM10) is the objective function of the adaptive memory solution in the final position for Ai, OBF(AM1) is the objective function of the adaptive memory solution in the initial position for Ai, and ε is the tolerance value. Another global convergence criterion is met when all subsequent trees are covered, that is, when the maximum number of trees (ia_max) is reached.

3.2 Measurement Planning Methodology
The objective function of the meter allocation problem is formulated in Eq. (5) [30].

$$OBF = \min\left[\gamma\left(\sum_{k=1}^{N_b} c_{UMF_k}\,x_{UMF_k} + \sum_{k=1}^{N_b} c_{MI_k}\,x_{MI_k}\right) + \alpha\sum_{c \in C} IMVD^{c} + \beta\sum_{c \in C} IAFD^{c}\right] \tag{5}$$

wherein c_{UMF_k} and c_{MI_k} are the investment costs in PMU and SM, respectively, in bus k; x_{UMF_k} and x_{MI_k} are the integer variables that represent the investment option in PMU and SM, respectively, in bus k (1: investment, 0: investment not indicated); and γ, α and β are weights for the investment cost and for the indexes [30] related to the magnitude (IMVD)
and phase angle (IAFD) of the nodal voltages, respectively. Notice that the IMVD and IAFD indexes are known as least absolute values (LAV). Each individual of the MMS represents a candidate solution for the planning of monitoring equipment and is coded in a vector with two parts: (a) part A: values for the decision variables of the problem, that is, x_{UMF_k} and x_{MI_k}; and (b) part B: branches monitored by the equipment allocated according to part A. Figure 4 shows the coding adopted in the MMS algorithm.
Fig. 4. Encoding a candidate solution in the proposed MMS
In the example of Fig. 4, the system has six buses; bus “1” refers to the substation (SS), and buses “4,” “5,” and “6” are load buses. In this case, the maximum number of devices, PMU and/or SM, is five, since the SS already has its own measurement. This means that another constraint for the optimization model is NPMU + NSM ≤ Nb − 1, where NPMU and NSM are the numbers of PMU and SM, respectively. This constraint was not included in the proposed OPF model since it is enforced by the metaheuristic MMS technique. The proposed coding for part A is binary, where the value “1” in a position determines the allocation of a meter in the bus associated with this position, while the value “0” represents no investment in the corresponding bus. Also, each candidate solution is represented by two vectors, one for PMU and one for SM. In the illustration of Fig. 4, the upper vector stores the investment decisions on PMU and the lower vector the decisions on SM. Therefore, in this example, PMUs are allocated in buses “2” and “3,” and an SM in bus “5.”
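Part A of the coding and Eq. (5) can be illustrated with the following Python sketch. The function names are hypothetical; the unit costs (1.0 for PMU, 0.2 for SM) are the ones reported in Sect. 4.4, a single cost per device type and γ = 1 are assumptions made here, and the IMVD and IAFD values are taken as given inputs because in the chapter they come from the state estimation model, which is not reproduced.

```python
def investment_cost(x_pmu, x_sm, c_pmu=1.0, c_sm=0.2):
    """Investment terms of Eq. (5), assuming one cost per device type for all buses."""
    return c_pmu * sum(x_pmu) + c_sm * sum(x_sm)

def is_feasible(x_pmu, x_sm):
    """Constraint NPMU + NSM <= Nb - 1 (the substation already has its own measurement)."""
    return sum(x_pmu) + sum(x_sm) <= len(x_pmu) - 1

def obf(x_pmu, x_sm, imvd, iafd, gamma=1.0, alpha=1.0, beta=1.0):
    """Eq. (5) for given index values; `imvd` and `iafd` hold one value per configuration."""
    return gamma * investment_cost(x_pmu, x_sm) + alpha * sum(imvd) + beta * sum(iafd)

# Six-bus example of Fig. 4: PMUs at buses 2 and 3, SM at bus 5.
x_pmu = [0, 1, 1, 0, 0, 0]
x_sm = [0, 0, 0, 0, 1, 0]
print(is_feasible(x_pmu, x_sm), round(investment_cost(x_pmu, x_sm), 2))

# 16-bus check with the MMS row of Table 9: PMUs at buses 7, 10 and 14, no SM.
x_pmu16 = [1 if k in (7, 10, 14) else 0 for k in range(1, 17)]
x_sm16 = [0] * 16
value = obf(x_pmu16, x_sm16, imvd=[4.1509], iafd=[0.4092])
print(round(value, 4))
```

With α = β = 1 and a single configuration, the second call gives 3.00 + 4.1509 + 0.4092 = 7.5601, which matches the OBF reported for the MMS in Table 9.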
The coding of Part B is also binary, where “1” indicates that the corresponding branch is monitored and “0” means the absence of measurement, according to Fig. 4. Thus, the sections monitored by the PMUs of Part A are highlighted in Fig. 4 (S2, S3, S4, S5, and S6), whereas the SM of Part A monitors S8. It should be noted that Part B of a candidate solution is built from the branches that the equipment of Part A is able to monitor, which ensures that the solution is consistent in terms of the network elements to be monitored. For instance, in Fig. 4, the PMUs of buses “2” and “3” can monitor sections S2, S3, S4, S5, and S6, taking into account their locations and/or monitoring channels and/or communication infrastructure. Candidate solutions that do not meet operational constraints, such as precision, are penalized by the search process. Different configurations are considered for planning, each referring to a network topology, aiming at the best planning for the different operating conditions to which an EDS is normally subjected [31]. Each candidate configuration is evaluated through the objective function of Eq. (5), which includes the accuracy of the state estimation process and the investment cost. To obtain the estimator accuracy, the proposed E-OPF model is executed for each topology. The optimization tool for solving the modified OPF model is based on the primal–dual interior-point method [32].
4 Results and Discussion

The results presented in this section seek to show the effectiveness of the MMS algorithm for the planning of meters in EDS. The methods compared are the modified monkey search (MMS) [27], the original monkey search (MS) [28], the genetic algorithm (GA) [33], simulated annealing (SA) [34], and exhaustive search (ES).

4.1 The Parameters Used for the MMS and MS

The parameters of the MMS and MS were the same as those used in Refs. [27, 28]: (i) the height of a tree h = 8, totaling c = 256 paths; (ii) the global convergence tolerance ε = 0 (Case 1) and ε = 100 (Case 2); (iii) the maximum number of trees ia_max = 20; (iv) adaptive memory size (AMS) = 10; and (v) the number of solutions for the intensification process nperi = 100.

4.2 The Parameters Used for GA

The parameters used for the implementation of the GA were the same as those used in Ref. [33] for all cases: (i) a crossover rate of about 95%; (ii) a mutation rate of about 2%; (iii) a population of 300 individuals; (iv) 100 generations; (v) a convergence criterion based on the maximum number of generations; (vi) elitism; (vii) decimal encoding of individuals; (viii) selection via roulette; and (ix) two cutoff points for the crossover.

4.3 The Parameters Used for SA

The parameters used for the implementation of the SA were the same as those used in Ref. [34] for all cases: (i) Boltzmann constant, KB = 1; (ii) initial temperature, T0 = 30; (iii) maximum number of iterations, kmax = 300; and (iv) cooling rate, α = 0.95.

4.4 The Parameters of the Meter Allocation Problem

Two case studies were used to evaluate the MMS: Case 1, a tutorial with the 16-bus system [35]; and Case 2, the 33-bus system [36]. The parameters of the meter allocation problem are:
• Investment costs as in Ref. [30]: (i) PMU: 1.0 u.c.; (ii) SM: 0.2 u.c., where “u.c.” means “cost unit”;
• Measurement errors as in Ref. [30]: (i) about 1% for the measurements of SS and PMU; (ii) about 10% for SM; and (iii) about 50% for pseudo-measurements based on historical data; and
• Number of simulations: about 100 simulations of each metaheuristic algorithm (MMS, MS, GA, and SA).

4.5 The Machine Configuration and the Software
The tests were performed by using a 4 GHz Intel Core i5-4200U processor with 3.20 GHz RAM. The proposed metaheuristics were developed by using the software MATLAB version R2013a.

Case 1: Tutorial
The well-known 16-bus 23 kV test system of [35], shown in Fig. 5, is used as a tutorial to explain the application of the proposed approach. In order to evaluate the impact of
Fig. 5. 16-bus test system [35]
different scenarios in the optimization, two analyses are performed. The first one, Analysis-A, considers only the base case, Fig. 5, whereas Analysis-B includes more than one scenario.

Analysis-A
In Analysis-A, the weights α and β of Eq. (5) are equal to 1.0 as in [30], and Ns is also equal to “1,” which means that only one scenario is considered, the configuration of Fig. 5. With no allocation of measurement devices, there is no investment cost and the OBF is 24.3543 u.c. The ten best solutions obtained from the search in the initial tree of the proposed MMS, which form the vector AM, are presented in Table 4.

Table 4. Solutions of the initial adaptive memory, 16-bus system
Position  OBF (u.c.)    Position  OBF (u.c.)
AM1       8.4926        AM6       8.5029
AM2       8.4927        AM7       8.5092
AM3       8.4941        AM8       8.5101
AM4       8.4942        AM9       8.5151
AM5       8.4943        AM10      8.5190
Table 5 presents the number of evaluated solutions and the cost evolution obtained in each tree. For instance, the root of tree A2 is the best solution found in tree A1 (OBF = 8.4926 u.c.), which is subjected to two perturbations to derive the best solution found in A2 (OBF = 8.4923 u.c.). Figure 6 presents the convergence of the algorithm.

Table 5. Number of perturbations and the cost evolution per tree, 16-bus
Tree (Ai)  Number of perturbations  Cost at the beginning → cost at the end of the tree
1          256                      10.0560 → 8.4926
2          2                        8.4926 → 8.4923
3          14                       8.4923 → 8.4665
4          16                       8.4665 → 8.4163
5          35                       8.4163 → 8.2883
6          96                       8.2883 → 8.0517
7          158                      8.0517 → 7.9386
8          254                      7.9386 → 7.5601
Fig. 6. Convergence curve of the MMS algorithm, 16-bus
Table 6 presents the reduction (in u.c. and %) in the total cost (OBF) obtained by the MMS in relation to the base case without meter allocation. Table 7 presents the average processing time and the average number of covered trees for 100 executions of the MMS algorithm. Figure 7 shows the best solutions found in the 100 executions of the proposed MMS.
Fig. 7. Best solutions found by the proposed MMS algorithm, 16-bus

Table 6. Reduction in the total cost, 16-bus
Reduction (%)  Reduction (u.c.)
68.96          16.7942

Table 7. Average processing time and the number of trees, 16-bus
The average time (min.)  The average number of trees
2.26                     9.56
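As a quick check, the figures of Table 6 follow from the base-case OBF of 24.3543 u.c. and the best MMS result of 7.5601 u.c. (the final cost of tree 8 in Table 5); the symbol ΔOBF is introduced here only for this illustration:

```latex
\begin{align*}
\Delta OBF &= 24.3543 - 7.5601 = 16.7942\ \text{u.c.} \\
\frac{\Delta OBF}{OBF_{\text{base}}} &= \frac{16.7942}{24.3543} \approx 0.6896 = 68.96\,\%
\end{align*}
```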
The intensification process is addressed in Table 8 and Fig. 8, where it can be verified that after 310 perturbations, solutions with “3” meters have been found in the AM 100 times, that is, the number “3” reaches the “rank” = 100. Thus, from this point up to the convergence, the algorithm sets the maximum number of meters at “3.”

Table 8. Rank of solutions according to the number of meters, 16-bus
Number of meters  1   2   3    4   5   6   7  8  9  10  11  12  13  14
Rank              29  48  100  71  40  14  2  6  0  0   0   0   0   0
Fig. 8. Intensification process illustration
Finally, Table 9 presents the best optimized results obtained by the MMS, MS, GA, and SA algorithms, where the time refers to the processing time for convergence. The investment cost (Inv. cost) consists of the sum of the first two terms of Eq. (5), which are associated with the investment in PMU and SM. It can be observed that the MMS gives the best solution, associated with the smallest OBF, in this case. After 100 simulations of the MMS, the best result is 7.5601 u.c. and the worst is 8.1105 u.c., which means a deviation between the best and worst results of 1.66%.

Table 9. Results for the 16-bus test system, Analysis-A
Algorithm  PMU buses  SM buses  IMVD (%)  IAFD (crad)  Inv. cost (u.c.)  OBF (u.c.)  Time (min)
MMS        7-10-14    –         4.1509    0.4092       3.00              7.5601      2.26
MS         6-9-14     10        4.1391    0.3764       3.20              7.7155      5.65
GA         7-11       6-9-10    4.6417    0.5168       2.60              7.7585      9.81
SA         7-10-13    3         4.0439    0.4067       3.20              7.6506      6.29
For comparison purposes, the estimation indexes and costs are also presented when: (a) PMU, or SM, is placed at all system buses except the substation, Table 10; (b) PMU, and/or SM, are placed at the terminal buses of the feeder of Fig. 5, buses 5, 10, and 14, Table 11; (c) the exhaustive search (ES) process is used, Table 12, whose objective is to assess the solutions obtained by the proposed optimization methodology.

Table 10. PMU and SM at all buses, 16-bus, one configuration
Device at all buses  IMVD (%)  IAFD (crad)  Inv. cost (u.c.)  OBF (u.c.)
PMU                  1.0000    0.0668       13.00             14.07
SM                   10.6674   1.9326       2.60              15.20

Table 11. PMU and SM at the feeder terminals, 16-bus, one configuration
PMU buses  SM buses  IMVD (%)  IAFD (crad)  Inv. cost (u.c.)  OBF (u.c.)
5-10-14    –         5.0673    0.3333       3.00              8.40
5-10       14        10.7827   1.0141       2.20              14.00
5-14       11        7.5913    1.0865       2.20              10.88
10-14      5         5.4617    0.4241       2.20              8.09
14         5-10      8.7984    0.8355       1.40              11.03
11         5-14      11.9706   0.8329       1.40              14.20
5          11-14     17.0848   3.6249       1.40              22.11
–          5-10-14   20.3452   4.0092       0.60              24.95

Table 12. ES results, 16-bus, one configuration
No. devices  PMU buses  SM buses  IMVD (%)  IAFD (crad)  Inv. cost (u.c.)  OBF (u.c.)  Time (min)
2            7-11       –         5.2182    0.4757       2.00              7.69        2.69
3            7-10-14    –         4.1509    0.4092       3.00              7.56        22.02
4            3-6-7-14   –         3.2478    0.3207       4.00              7.57        122.21
The ES was applied to determine the best places to be monitored, with the number of devices fixed at “3,” as found by the MMS, as well as at “2” and “4” in order to evaluate a small variation around “3.” It can be observed that the best result from the ES matches the best MMS solution. The results of Tables 10 and 11 show the presence of conflicting objectives in the OBF of Eq. (5). If a large number of precise and expensive devices such as PMUs are placed, the investment cost and the OBF increase in relation to the optimal results. On the other hand, if only SMs are placed, even at all buses, the estimation indexes increase due to the lower accuracy of SM compared to PMU. The OBF then increases despite the lower investment cost.
Analysis-B
Analysis-B considers the same conditions as Analysis-A with the addition of eight probable configuration scenarios to the scenario of Fig. 5. The purpose is to find an optimal solution that establishes a suitable trade-off between different scenarios in terms of state estimation precision and investment cost. The considered scenarios are:
Scenario 1: open switches S5, S10, S16;
Scenario 2: open switches S5, S10, S14;
Scenario 3: open switches S5, S10, S15;
Scenario 4: open switches S1, S10, S16;
Scenario 5: open switches S1, S10, S14;
Scenario 6: open switches S1, S10, S15;
Scenario 7: open switches S1, S5, S16;
Scenario 8: open switches S1, S5, S14; and
Scenario 9: open switches S1, S5, S15.
In scenarios 1–3, the entire system is supplied by the feeder that starts from the substation through S1. The feeder in scenarios 4–6 starts through S5 and, in scenarios 7–9, through S10. In this case, the weights α and β of the OBF in Eq. (5) are equal to “1/9” to maintain the trade-off between state estimation precision and investment costs. From empirical analyses, it could be concluded that a good choice for α and β is the inverse of the number of scenarios. Without allocation of PMU or SM, the OBF is 29.8138 u.c. Table 13 presents the best results. The ES was performed for the same total number of measurement devices as found by the MMS. Both lead to the same result for this case, but the ES requires much more time than the proposed MMS with the embedded state estimation model (SEM).

Table 13. Results for the 16-bus test system, Analysis-B
Algorithm  PMU buses  SM buses  IMVD (%)  IAFD (crad)  Inv. cost (u.c.)  OBF (u.c.)  Time (min)
MMS/ES     2-6-9-13   –         3.9732    0.3904       4.00              8.3636      20.78/1093.63
MS/GA      2-6-9-14   4         3.9269    0.4110       4.20              8.5379      52.01/84.90
SA         2-6-7-13   5         3.9203    0.3568       4.20              8.4771      61.15
In 100 simulations of the MMS algorithm, which obtained the best result in Table 13, the deviation between the best result, 8.3636 u.c., and the worst one, 9.1743 u.c., is 1.99%. Compared with Analysis-A, the results of Analysis-B show that the consideration of different network scenarios affects the planning of measurement systems in EDS, since the OBF increases with the number of scenarios due to the more realistic representation. Finally, Fig. 9 shows the convergence curves of the algorithms discussed in Table 13.
Fig. 9. Convergence curve of the MMS, MS, GA, and SA algorithms, 16-bus, Analysis-B
Case 2: 33-bus system
The 12.66 kV 33-bus test system of [36], Fig. 10, is formed by a substation and 32 branches. The exhaustive search is performed with the total number of measurement devices obtained by MMS to determine the global best location and assess the proposed approach.
Fig. 10. 33-bus test system [36]
In this case, the considered scenarios are listed hereafter [28]:
Scenario 1: open switches S33(7–20), S34(8–14), S35(11–21), S36(17–32) and S37(24–28), the original configuration of Fig. 10;
Scenario 2: open switches S7(7–20), S10(8–14), S14(11–21), S32(17–32) and S37(24–28); and
Scenario 3: open switches S7(7–20), S9(8–14), S14(11–21), S32(17–32) and S37(24–28), which is the optimized configuration for minimum loss [37].
The weight factors α and β of Eq. (5) are set at “1/3.” In this case, the OBF is 24.5351 u.c. when no PMU or SM is allocated. Table 14 presents the best results, where the one from the MMS/SEM is the best found among the metaheuristics. The deviation between the best and worst solutions from the MMS/SEM is 4.60%, with the respective OBF equal to 11.4439 u.c. and 13.6279 u.c. Again, the solution from the MMS algorithm is close to the globally optimal ES solution, whereas GA and SA can also find good options in terms of OBF. Figure 11 shows the convergence curves of the algorithms discussed in Table 14.

Table 14. Results for the 33-bus test system, Analysis-B
Algorithm  PMU buses          SM buses  IMVD (%)  IAFD (crad)  Inv. cost (u.c.)  OBF (u.c.)  Time (min)
MMS        12-17-23-30        8         6.9554    0.2885       4.20              11.44       16.35
GA         12-17-21-23-25-31  5-8-30    5.44206   0.3128       6.60              12.35       51.60
SA         5-8-12-17-23-31    21-25-30  5.2904    0.2350       6.60              12.13       39.60
ES         10-16-23-30        7         6.8430    0.2689       4.20              11.31       4.89 × 10^4
Fig. 11. Convergence curve of the MMS, GA, and SA algorithms, 33-bus, Analysis-B
5 Discussion and Conclusions

The changes to the MS required to obtain an MMS suited to meter allocation were presented. The modifications made the method easier to implement and to understand. To clarify the modifications made to the MS, a tutorial was presented for the meter allocation problem. Finally, a case study with the 33-bus system was presented. For this system, compared with MS, GA, and SA, the proposed MMS obtained the best results both in computational terms, with less processing time, and in financial terms, with the lowest total cost in the optimal allocation of meters.

5.1 Further Research Topics
Further work includes applying the proposed methodology to three-phase distribution systems with unbalanced load characteristics, as well as under different load scenarios, such as light, medium, and heavy loads and load changes throughout the day. The method should also be validated for current scenarios involving renewable energy sources, active networks due to the insertion of distributed generation, contingency analysis, and restoration scenarios.

Acknowledgements. The authors gratefully acknowledge the financial support, in part, of CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brasil), CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasil), INERGE (Instituto Nacional de Energia Elétrica), and FAPEMIG (Fundação de Amparo à Pesquisa no Estado de Minas Gerais). The authors also express gratitude for the educational support of UFJF (Federal University of Juiz de Fora).
References

1. Gomez-Exposito A, Abur A (2004) Power system state estimation: theory and implementation. CRC Press
2. Sallam AA, Malik OP (2018) Electric distribution systems. Wiley-IEEE Press
3. De Araujo LR, Penido DRR, Carneiro S Jr, Pereira JLR (2018) Optimal unbalanced capacitor placement in distribution systems for voltage control and energy losses minimization. Electr Power Syst Res 154:110–121
4. Primadianto A, Lu CN (2017) A review on distribution system state estimation. IEEE Trans Power Syst 32(5):3875–3883
5. Agência Nacional de Energia Elétrica (Brasil) [Online]. Available at http://www.aneel.gov.br
6. Fan J, Borlase S (2009) The evolution of distribution. IEEE Power Energ Mag 7(2):63–68
7. Repo S, Maki K, Jarventausta P, Samuelsson O (2008) ADINE-EU demonstration project of active distribution network. In: Proceedings of CIRED Seminar 2008: SmartGrids for Distribution, vol 23–24. Frankfurt, Germany, pp 1–5
8. Meliopoulos AS, Zhang F (1996) Multiphase power flow and state estimation for power distribution systems. IEEE Trans Power Syst 11(2):939–946
9. Singh R, Pal BC, Vinter RB (2009) Measurement placement in distribution system state estimation. IEEE Trans Power Syst 24(2):668–675
10. Singh R, Pal BC, Jabr RA, Vinter RB (2011) Meter placement for distribution system state estimation: an ordinal optimization approach. IEEE Trans Power Syst 26(4):2328–2335
11. Muscas C, Pilo F, Pisano G, Sulis S (2009) Optimal allocation of multichannel measurement devices for distribution state estimation. IEEE Trans Instrum Meas 58(6):1929–1937
12. Madlener R, Liu J, Monti A, Muscas C, Rosen C (2009) Measurement and metering facilities as enabling technologies for smart electricity grids in Europe. A Sectoral e-Business Watch
13. Raghuraman S, Jegatheesan R (2011) A survey on state estimation techniques in electrical power system. In: Recent advancements in electrical, electronics and control engineering (ICONRAEeCE), pp 199–205
14. Amin SM, Wollenberg BF (2005) Toward a smart grid: power delivery for the 21st century. IEEE Power Energ Mag 3(5):34–41
15. Exposito AG, Abur A, Jaen AV, Quiles CG (2011) A multilevel state estimation paradigm for smart grids. Proc IEEE 99(6):952–976
16. Dey N (ed) (2017) Advancements in applied metaheuristic computing. IGI Global
17. Gupta N, Patel N, Tiwari BN, Khosravy M (2018) Genetic algorithm based on enhanced selection and log-scaled mutation technique. In: Proceedings of the future technologies conference. Springer, pp 730–748
18. Singh G, Gupta N, Khosravy M (2015) New crossover operators for real coded genetic algorithm (RCGA). In: 2015 international conference on intelligent informatics and biomedical sciences (ICIIBMS), IEEE, pp 135–140
19. Gupta N, Khosravy M, Patel N, Senjyu T (2018) A bi-level evolutionary optimization for coordinated transmission expansion planning. IEEE Access 6:48455–48477
20. Gupta N, Khosravy M, Patel N, Sethi IK (2018) Evolutionary optimization based on biological evolution in plants. Procedia Comput Sci Elsevier 126:146–155
21. Moraes CA, De Oliveira EJ, Khosravy M, Oliveira LW, Honório LM, Pinto MF (2020) A hybrid bat-inspired algorithm for power transmission expansion planning on a practical Brazilian network. In: Applied nature-inspired computing: algorithms and case studies. Springer, Singapore, pp 71–95
22. Khosravy M, Gupta N, Patel N, Senjyu T, Duque CA (2020) Particle swarm optimization of morphological filters for electrocardiogram baseline drift estimation. In: Applied nature-inspired computing: algorithms and case studies. Springer, Singapore, pp 1–21
23. Mucherino A, Seref O (2007) Monkey search: a novel metaheuristic search for global optimization. AIP Conf Proc 953(1):162–173
24. Kammerdiner AR, Mucherino A, Pardalos PM (2009) Application of monkey search metaheuristic to solving instances of the multidimensional assignment problem. Optimization and cooperative control strategies. Springer, Berlin, Heidelberg, pp 385–397
25. Zhao RQ, Tang WS (2008) Monkey algorithm for global numerical optimization. J Uncertain Syst 2(3):165–176
26. Mucherino A, Liberti L, Lavor C, Maculan N (2009, July) Comparisons between an exact and a metaheuristic algorithm for the molecular distance geometry problem. In: Proceedings of the 11th annual conference on genetic and evolutionary computation, ACM, pp 333–340
27. Duque FG, de Oliveira LW, de Oliveira EJ, Marcato AL, Silva IC Jr (2015) Allocation of capacitor banks in distribution systems through a modified monkey search optimization technique. Int J Electr Power Energy Syst 73:420–432
28. Duque FG, de Oliveira LW, de Oliveira EJ (2016) An approach for optimal allocation of fixed and switched capacitor banks in distribution systems based on the monkey search optimization method. J Control Autom Electr Syst 27(2):212–227
29. Duque FG, de Oliveira LW, de Oliveira EJ, Augusto AA (2017) State estimator for electrical distribution systems based on an optimization model. Electr Power Syst Res 152:122–129
30. Liu J, Tang J, Ponci F, Monti A, Muscas C, Pegoraro PA (2012) Trade-offs in PMU deployment for state estimation in active distribution grids. IEEE Trans Smart Grid 3(2):915–924
31. Tecchio PP, Benedito RA, Alberto LFC (2010) The behavior of WLS state estimator near the maximum loadability point of power systems. In: Power and energy society general meeting IEEE, pp 1–6
32. Oliveira EJ, Oliveira LW, Pereira JLR, Honório LM, Silva IC, Marcato ALM (2015) An optimal power flow based on safety barrier interior point method. Int J Electr Power Energy Syst 64:977–985
33. da Silva IC, Carneiro S, de Oliveira EJ, de Souza Costa J, Pereira JLR, Garcia PAN (2008) A heuristic constructive algorithm for capacitor placement on distribution systems. IEEE Trans Power Syst 23(4):1619–1626
34. Su CT, Lee CS (2001) Feeder reconfiguration and capacitor setting for loss reduction of distribution systems. Electr Power Syst Res 58(2):97–102
35. Civanlar S, Grainger JJ, Yin H, Lee SSH (1988) Distribution feeder reconfiguration for loss reduction. IEEE Trans Power Deliv 3(3):1217–1223
36. Baran ME, Wu FF (1989) Network reconfiguration in distribution systems for loss reduction and load balancing. IEEE Trans Power Deliv 4(2):1401–1407
37. Oliveira LW, Carneiro S, de Oliveira EJ, Pereira JLR, Silva IC, Costa JS (2010) Optimal reconfiguration and capacitor allocation in radial distribution systems for energy losses minimization. Int J Electr Power Energy Syst 32(8):840–848
Chapter 12
Artificial Neural Network Trained by Plant Genetic-Inspired Optimizer

Neeraj Gupta (1), Mahdi Khosravy (2, 3; corresponding author), Nilesh Patel (1), Saurabh Gupta (4, 5), and Gazal Varshney (6)

(1) Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA
(2) Media Integrated Communication Lab, Graduate School of Engineering, Osaka University, Osaka, Japan, [email protected]
(3) Electrical Engineering Department, Federal University of Juiz de Fora, Juiz de Fora, Brazil
(4) Department of Advanced Engineering, John Deere India Pvt. Ltd., Pune, India
(5) Research Scholar, Department of Computer Science, Banasthali Vidyapith, Vanasthali, Rajasthan, India
(6) University of Information Science and Technology, Ohrid, North Macedonia
© Springer Nature Singapore Pte Ltd. 2020 M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_12

1 Introduction

Despite differences in structure and connections, nearly all artificial neural networks (ANNs) use common training algorithms. Backpropagation (BP) and other gradient-based (GB) training algorithms are widely used as classical techniques. These training mechanisms are mostly deterministic numerical approaches [1]. They have the advantage of fast convergence but can easily become trapped in a local extremum. After evolutionary algorithms showed strong performance on problems in many fields, they were used to train ANNs in place of conventional training algorithms [2, 3]. As the population-based approach attracted many researchers to ANN training, the genetic algorithm (GA), based on the law of natural selection, was applied successfully to improve the results [2–4]. Successful implementations of evolutionary and swarm algorithms for training ANNs are listed in [2–6]. Following the success of evolutionary algorithms as intelligent algorithms and as an increasingly significant tool in the field of neuroevolution, many metaheuristic optimization techniques have been proposed as training strategies for ANNs [6, 7]. These algorithms can handle a broad set of variables of all types, such as linear, nonlinear, and discrete, without any additional bounding criteria. Given the computational power that nature-inspired algorithms [6, 7] have shown over the last few decades, a wide range of research papers use them to design and train ANNs. For an overview of recent advanced training techniques, readers are referred to the survey and publications in [8, 9].

Before implementing Mendelian evolutionary theory optimization (METO) as a training algorithm, it is worth discussing the ANN. The framework of using an evolutionary algorithm as the training mechanism of an ANN is known as neuroevolution, a form of artificial intelligence [10]. Its introduction increased the efficiency of computational intelligence (CI). Although neuroevolution is slower than the deterministic approach, it trains the ANN to high accuracy as an intelligent learning technique. Population-based evolutionary algorithms have shown a clear advantage over single-solution-based optimizers and over the classical ANN training methods described above [9–12]. Several variants of evolutionary algorithms have been proposed to date to train the ANN as an intelligent form of learning [2–13]. The development of neuroevolution can be traced in the survey papers [12, 14]. The papers [12, 14, 15] cover neuroevolution and its scope in detail. Moreover, the study in [15] uses the genetic algorithm as the training scheme that decides the ANN weights. Over the last decades, variants of GA, evolutionary strategies, and memetic algorithms have been suggested with different coding schemes to improve the accuracy of neuroevolution. Reference [16] suggests using GA and particle swarm optimization (PSO) together to train the ANN, [17] focuses on biogeography-based optimization (BBO), and [18] proposes an improvement that uses differential evolution (DE) as the evolutionary training algorithm. Building on the conclusions of these papers, recent neuroevolution training algorithms include ant lion optimization [19], grasshopper optimization [20], invasive weed optimization (IWO) [21], teaching–learning-based optimization [22], and many more [19].
2 Evolutionary Nature-Inspired Optimizers

Evolutionary optimization [23, 24] has drawn great attention from researchers, especially for non-convex nonlinear problems that classical optimization fails to solve. Although a variety of nature-inspired meta-heuristic optimizers already exist, the growing demand for accuracy and speed motivates efforts to present more accurate and faster optimizers. The wide range of evolutionary algorithms (EAs) covers optimizers inspired by different natural phenomena, such as the genetic algorithm (GA) [25–27], biogeography-based optimization (BBO) [28], PSO [29–34], the bat algorithm [35–37], cuckoo search [38], firefly [39], teaching–learning-based optimization (TLBO) [40], and the plant biology-inspired optimizer known as Mendelian evolutionary theory optimization (METO) [41, 42]. Although this chapter focuses on neuroevolution, evolutionary optimizers (EOs) have great potential in a variety of applications, such as text feature detection [43], blind component processing [44], blind source separation [45–49], noise cancelation [50], image enhancement [51, 52], ECG processing [53–56], quality assessment [57], data mining [58], imaging systems [59], information hiding [33], telecommunications [60–63], morphological filtering [64, 65], image adaptation [66], acoustic OFDM [67], power line communications [68], fault detection [69], etc.

The main motivation of this work is to improve existing EA training algorithms by introducing a new evolutionary structure, such as the integration of multi-species and self-organization in population genetics. The distinctive features of the proposed METO, and the contributions of this work, are as follows:

1. It is a binary-coded ANN trainer: it explores the transformed genome weight search space instead of the real-valued one.
2. Its double-strand chromosome scheme makes it much better than the conventional GA.
3. It deploys the natural breeding process of fusing opposite strands of DNA instead of the single-strand chromosome used in GA.
4. It possesses a global information exchange strategy through the exchange of genetic characteristics between parents of different species.
5. It internally possesses a self-organizing process after mutation, which resists inappropriate changes in genetic characteristics during evolution.
6. It bypasses stagnation at local optima during the training of the ANN.
7. It outperforms peer algorithms, as given in Table 1 of the last chapter.
8. It supports the design of an AI-based diagnostic tool for condition monitoring of the oil filter in AgM.
9. It reduces the number of ANN evaluations needed to reach the global optimum.
3 Artificial Neural Network (ANN)

An ANN is an artificial intelligence (AI) system composed of neurons arranged in input, hidden, and output layers, where the neurons are connected in various fashions through synaptic weights. Designing it is a tedious task because it depends on various parameters, such as the network structure, the transfer functions used, and the values of the synaptic weights. For the selected ANN structure, the synaptic weights associated with all connections between neurons acquire new values in each evolution phase until the network has sufficient knowledge to solve the problem. This process continues until a termination criterion is met, such as a number of evolution phases or an acceptable mean square error that reaches the goal value. After training, the ANN is tested and validated on samples of the same problem that it has not seen. This step certifies that the designed ANN can be used to solve problems such as classification, pattern recognition, and regression. The ANN problem can be defined for the set of inputs

$$X = \{ x = x_i \mid x \in \mathbb{R}^n,\ i = 1, 2, \ldots, n \} \tag{1}$$

and outputs

$$Y = \{ y = y_o \mid y \in \mathbb{R}^m,\ o = 1, 2, \ldots, m \} \tag{2}$$

This shows that the input and output layers of the ANN have n and m artificial neurons, respectively. To design the ANN for these data sets with a chosen number of hidden layers, the objective function can be represented as

$$F = f(X, Y, W) \tag{3}$$

where W is the set of synaptic connections between all layers, $W = \{W_{12}, W_{23}, \ldots, W_{l-1,l}\}$. Here, l is the total number of layers, including input, hidden, and output. In this formulation, minimizing F, the mean square error (MSE), gives the optimized synaptic weights of the ANN.

A pictorial view of a multilayer perceptron (MLP) ANN can be seen in Fig. 1, where each neuron is composed of two sections. The first section adds all the inputs; the second passes the sum to the transfer function. The output of this neuron is then available as input to the neurons in the next layer. The possible connections between neurons are also shown in Fig. 1, where each neuron is connected to all neurons of the next layer through synaptic weights. The mathematical representation of each neuron is

$$y_p = F\left( \sum_{i=0}^{n} w_{pi}\, x_{pi} + b_p \right) \tag{4}$$

Here, $x_{pi}$ is the ith input to the pth neuron, which is multiplied by the synaptic weight $w_{pi}$ associated with the connection between the ith and pth neurons. In this expression, F denotes the transfer function, which can be of any type, such as a step, linear, or nonlinear function. Each neuron requires a bias to fire efficiently, represented by b in the equation above, and $y_p$ is the output of the pth neuron. Each neuron in any layer of the MLP follows this equation and passes the learned information $y_p$ to the next neuron.
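The neuron model of Eq. (4) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the example weights, and the choice of `tanh` as the default transfer function are assumptions, and the bias is taken to be added to the weighted sum before the transfer function is applied.

```python
import math

def neuron_output(inputs, weights, bias, transfer=math.tanh):
    """Output y_p of one neuron: transfer(sum_i w_pi * x_pi + b_p)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return transfer(weighted_sum + bias)

def layer_output(inputs, weight_rows, biases, transfer=math.tanh):
    """Outputs of a fully connected layer: one weight row and one bias per neuron."""
    return [neuron_output(inputs, row, b, transfer)
            for row, b in zip(weight_rows, biases)]

def step(s):
    """Step transfer function: fires (1.0) when the net input is non-negative."""
    return 1.0 if s >= 0.0 else 0.0

# Two inputs feeding a layer of two neurons (illustrative values only).
hidden = layer_output([1.0, 2.0], [[0.5, -0.25], [1.0, 1.0]], [0.0, -4.0], transfer=step)
```

Chaining `layer_output` calls, with the output of one layer used as the input of the next, gives the forward pass of a fully connected MLP of the kind shown in Fig. 1.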
Fig. 1. Schematic of artificial neural network
During the learning process of the MLP, the weight $w_{pi}$ of each synaptic connection between neurons is subject to change. Finding the optimal values of these synaptic weights is the ANN learning problem. The weights stop changing once the network has learned the behavior of the problem. At that point, the ANN output is the desired one, as justified by the minimized objective function. After training, validation is the next step before the ANN is used to solve the problem. For supervised learning, the ANN is generally tested on a set of input patterns and associated outputs:

$$T = \{ t_s = (x^t_i \in \mathbb{R}^n,\ y^t_j \in \mathbb{R}^m) \mid i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m \} \tag{5}$$

Here, T is the testing set, each element of which is composed of an input pattern $x^t_i$ and an output $y^t_i$. These are vectors of n and m elements, respectively, equal to the number of neurons in the corresponding layers of the ANN. The ANN is acceptable as an intelligent system for the given testing set if the MSE e, calculated from the output y of the trained ANN and the actual output $y^t$, is within an acceptable limit:

$$e = \frac{1}{sn} \sum_{i}^{s} \sum_{j}^{n} \left( y^t_{i,s} - y_{i,s} \right)^2 \tag{6}$$

Until e is minimized, the synaptic weights are subject to change in order to learn the behavior of the input pattern. Classical learning algorithms, such as gradient descent and backpropagation, have generally been used for this. These techniques can be trapped at local minima and yield non-optimal synaptic weight values, which indicates that the ANN cannot capture the exact behavior of the problem. However, researchers at Uber have claimed that a structurally simple ANN trained by evolutionary algorithms is competitive with sophisticated, industry-standard gradient-descent deep learning algorithms. The reason they give is that neuroevolution does not get stuck in dead ends. Inspired by this concept and previous work, we present a novel binary-coded evolutionary algorithm that takes advantage of biological intelligence to transfer information from one evolution phase to the next.
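The acceptance test based on the MSE of Eq. (6) can be sketched as follows. The sketch assumes that the error is averaged over all testing samples and over all output elements of each sample; the function names and the tolerance value are hypothetical and only illustrate the idea.

```python
def mean_square_error(targets, outputs):
    """MSE over s samples, each a list of output values of equal length."""
    s = len(targets)
    n = len(targets[0])
    total = sum((t - y) ** 2
                for target_row, output_row in zip(targets, outputs)
                for t, y in zip(target_row, output_row))
    return total / (s * n)

def is_acceptable(targets, outputs, tolerance):
    """True when the testing error is within the chosen (assumed) limit."""
    return mean_square_error(targets, outputs) <= tolerance

# Two testing samples with two outputs each (illustrative values only).
actual = [[1.0, 0.0], [0.0, 1.0]]
predicted = [[1.0, 0.0], [0.0, 0.0]]
error = mean_square_error(actual, predicted)
```

In a neuroevolution setting, a function like `mean_square_error` would play the role of the fitness that the optimizer minimizes, while `is_acceptable` corresponds to the validation check applied after training.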
4 Complexity Level in Designing ANN

The design levels of an ANN are shown in Fig. 2, where the search for connection weights for a given network structure is at the lowest difficulty level. The main objective is to minimize the ANN error while avoiding overfitting. Searching for the optimal weight values is called learning the network, for which we adopt the meta-heuristic evolutionary algorithm. However, finding the optimal weights alone does not give the best solution, because the structure of the neurons in the hidden layers is also important. Thus, along with finding the optimal weights, the proposed meta-heuristic technique optimizes the behavior of each neuron by finding its transfer function. We therefore handle a quasi-level design of the architecture of the fixed-structure ANN. To decide the learning rule, we apply a variety of meta-heuristic algorithms and propose the one that best trains the network. In summary, the three-level design of the ANN has the following modes of automation:

1. The weight search strategy is fully automated by the meta-heuristic learning algorithm (evolution of weights).
2. The choice of learning algorithm/rule is manual (evolution of learning rule).
3. The selection of hidden layers and the neurons in them is done by hand, while the behavior of each neuron is automated by the meta-heuristic learning algorithm. Thus, the evolution of architecture is quasi-automated: only the neurons' behavior is optimized along with the weights.

Fig. 2. Complexity level in designing ANN
Fig. 3. Population of chromosomes to represent artificial neural networks
5 Generation of Population

Each population contains n chromosome strands. Each strand is composed of b bits, where b is the sum of the $l_w$ bits encoding the weights and the $l_{TF}$ bits encoding the transfer functions (TFs). Each chromosome represents one ANN, so the number of ANNs simulated in an evolution equals the number of chromosome strands in the population. One evolution is completed by two sequential sub-evolutions, which produce the F1 and F2 generations. The offspring population and the next parent population are formed based on the fitness value of each ANN. Figure 3 illustrates the population of chromosomes that represents the ANNs. For this strategy, we adopt an elite selection operator to form a population of better ANN parents for the next generation.
Fig. 4. Block diagram of training ANN by PGO
A block diagram of ANN training is given in Fig. 4. METO provides weight and bias values to the ANN structure, and the calculated fitness is returned to METO to select the next, better weight values. For the supplied weights, the ANN calculates the fitness using the training samples. The complete METO process, in the context of optimizing the ANN, is detailed in Fig. 5. Parent-1 and parent-2, each representing a set of weights and biases, are first cross-bred to generate the F1 generation offspring, and heredity is then transferred when the F1 offspring self-breed to generate the F2 generation offspring. Finally, the best of all generated offspring and both parents are taken to form the heredity passed to the next evolution epoch.
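The epoch structure described above (cross-breeding to F1, self-breeding to F2, then elite selection over parents and offspring) can be sketched as follows. This is only a structural illustration: the uniform crossover used here is a generic stand-in and is not METO's double-strand breeding operator, and the function names, offspring count, and toy fitness are all assumptions.

```python
import random

def crossover(a, b, rng):
    """Uniform crossover of two equal-length bit lists (generic stand-in operator)."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def evolve_epoch(parent1, parent2, loss, rng, n_offspring=4):
    """One epoch: cross-breed parents into F1, self-breed F1 into F2, keep the two best."""
    f1 = [crossover(parent1, parent2, rng) for _ in range(n_offspring)]
    f2 = [crossover(rng.choice(f1), rng.choice(f1), rng) for _ in range(n_offspring)]
    # Elite selection: parents compete with all offspring, lower loss is better.
    pool = sorted([parent1, parent2] + f1 + f2, key=loss)
    return pool[0], pool[1]

rng = random.Random(0)
p1 = [1, 1, 1, 1, 0, 0, 0, 0]
p2 = [0, 0, 0, 0, 1, 1, 1, 1]
toy_loss = sum  # toy fitness: number of 1-bits, to be minimized
best, second = evolve_epoch(p1, p2, toy_loss, rng)
```

Because the parents stay in the selection pool, the best loss can never get worse from one epoch to the next, which is the property the elite selection step is meant to provide. In ANN training, `toy_loss` would be replaced by a function that decodes the chromosome and evaluates the network on the training samples.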
Fig. 5. Evolution of ANN using PGO algorithms
Figure 6 presents the genotype representation of the weights in the form of a chromosome. Each weight is represented by l bits, where l can be taken as $\max(10, [w_U - w_L])$. Here, $w_U$ and $w_L$ are the upper and lower limits of the weight, respectively. METO operates on the genotype representation of the weights, and the decoded values are then supplied as input to the ANN. The right-side bits in Fig. 6 represent the transfer functions (TFs) and are treated differently from the weights: they decode to integer values, each of which identifies a TF. For example, if we have four TFs, the decoded output of one string is an integer between 1 and 4. A string associated with a weight, by contrast, decodes to a real value.
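A minimal sketch of how such bit strings might be decoded is given below. The chapter does not spell out the decoding rule, so the linear mapping of the binary integer onto the interval between the lower and upper weight limits, the modulo wrap for the TF index, and the function names are all assumptions made for illustration.

```python
def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned binary integer."""
    return int("".join(str(b) for b in bits), 2)

def decode_weight(bits, w_lower, w_upper):
    """Map a bit string linearly onto the real interval [w_lower, w_upper] (assumed rule)."""
    max_int = 2 ** len(bits) - 1
    return w_lower + (w_upper - w_lower) * bits_to_int(bits) / max_int

def decode_tf(bits, n_tf):
    """Map a bit string onto a transfer-function index between 1 and n_tf (assumed rule)."""
    return bits_to_int(bits) % n_tf + 1

# A 10-bit weight gene and a 2-bit TF gene (illustrative values only).
weight_low = decode_weight([0] * 10, -1.0, 1.0)
weight_high = decode_weight([1] * 10, -1.0, 1.0)
tf_index = decode_tf([1, 1], 4)
```

With 10 bits per weight, the interval is divided into 1023 steps, so the bit length directly sets the resolution at which the optimizer can place each weight.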
Fig. 6. Genotype representation of weights and transfer functions (TF)
6 Loss Function

Optimizing the ANN requires a way to measure the quality of the solution provided by METO. This is done with the objective function, which guides the solution toward the goal. The objective function takes the real-valued weights, the TFs, and the training samples as arguments and returns a numerical value. Based on this value, METO optimizes the ANN parameters to minimize the objective function. In other words, the loss function measures the inconsistency between the predicted value and the actual label. The robustness of the ANN increases as the nonnegative loss decreases. Mean squared logarithmic error (MSLE) is a widely used loss function, and we use it to train our model. It is defined as

$$L = \frac{1}{n} \sum_{i=1}^{n} \left( \log\left(y^{(i)} + 1\right) - \log\left(\hat{y}^{(i)} + 1\right) \right)^2 \tag{7}$$

This function is a variant of the mean square error (MSE) and measures the difference between the actual and predicted values. Because it takes the logarithm of both values, it does not heavily penalize huge differences between them. It penalizes underestimates more than overestimates, and it compares with MSE as follows:

1. MSE and MSLE are nearly equal for small actual and predicted values.
2. MSLE is much smaller than MSE, almost negligible, if the predicted and actual values are huge.
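The behavior of Eq. (7) can be checked numerically with a short sketch. The function names and the sample values are illustrative assumptions; `math.log1p(v)` computes log(1 + v), which matches the terms inside Eq. (7).

```python
import math

def msle(actual, predicted):
    """Mean squared logarithmic error, following Eq. (7)."""
    n = len(actual)
    return sum((math.log1p(y) - math.log1p(p)) ** 2
               for y, p in zip(actual, predicted)) / n

def mse(actual, predicted):
    """Plain mean square error, for comparison."""
    n = len(actual)
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n

# Same absolute error of 50, once as an underestimate and once as an overestimate.
under = msle([100.0], [50.0])
over = msle([100.0], [150.0])

# Small values: MSLE and MSE are close. Large values: MSLE is far smaller.
small_gap = abs(msle([0.01], [0.02]) - mse([0.01], [0.02]))
large_ratio = msle([1000.0], [2000.0]) / mse([1000.0], [2000.0])
```

Here `under` exceeds `over` although the absolute error is the same, which illustrates the asymmetry toward underestimates, while `small_gap` and `large_ratio` illustrate the two listed comparisons with MSE.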
7 Feature Extraction

Feature extraction turns raw data into information suitable for machine learning algorithms. It eliminates the redundancy present in many types of measured data, which facilitates generalization during the learning phase. We use the mel-frequency cepstral coefficient (MFCC) technique, which extracts 13 features from the acoustic signal. The features extracted in Chap. 4 are used together with the 13 MFCC features to train the model, giving a total of 15 input features for the ANN. For example, for six fault conditions, the extracted features can be labeled from 1 to 6. Developing a predictive model is an iterative process that involves these steps:

i. Select the training and validation data.
ii. Select a classification algorithm.
iii. Iteratively train and evaluate classification models.

To train the model, the data can be divided into a training set (70%) and a validation set (30%), both sampled randomly without repetition. The evolutionary ANN is first trained on the training set and then checked on the validation set for its accuracy, which is calculated as

$$\text{accuracy} = \frac{1}{n} \left( \hat{y}_{\text{validation}} - y_{\text{validation}} \right) \times 100. \tag{8}$$
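The 70/30 split and the validation check can be sketched as follows. The sketch assumes that accuracy means the percentage of validation labels predicted correctly, which is one common reading and not necessarily the exact form of Eq. (8); the function names and the toy data are illustrative.

```python
import random

def split_train_validation(samples, rng, train_fraction=0.7):
    """Randomly split samples into training and validation sets without repetition."""
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    validation = [samples[i] for i in indices[cut:]]
    return train, validation

def accuracy_percent(true_labels, predicted_labels):
    """Percentage of labels predicted correctly (assumed meaning of accuracy)."""
    hits = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * hits / len(true_labels)

rng = random.Random(42)
data = list(range(10))  # stand-in for ten labeled feature vectors
train_set, validation_set = split_train_validation(data, rng)
score = accuracy_percent([1, 2, 3, 4], [1, 2, 3, 5])
```

Shuffling the indices once and cutting the list guarantees that no sample appears in both sets, which is what sampling without repetition requires.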
8 Conclusion

In this chapter, we explained how METO is integrated with artificial neural networks. To make this integration clear, a brief introduction to METO was presented. We also described the objective function, the mean squared logarithmic error, which penalizes underestimates more than overestimates. The neuroevolutionary METO-ANN system is suggested for faulty condition detection in the hydraulic system of an agriculture machine, where the condition of the system is predicted from the sounds of the oil pump. To this end, the pollution level must be detected, and this chapter suggests a model trained on 15 features, 13 of which come from MFCC. In a separate research work, we will provide simulation results and the relevant analysis evaluating the performance of METO against other optimizers.
References

1. Anthony M, Bartlett PL (2009) Neural network learning: theoretical foundations. Cambridge University Press
2. Blackwell T, Branke J (2004) Multi-swarm optimization in dynamic environments. In: Workshops on applications of evolutionary computation. Springer, Berlin, Heidelberg, pp 489–500
3. Branke J (1995) Evolutionary algorithms for neural network design and training. In: Proceedings of the 1st nordic workshop on genetic algorithms and its applications
4. Blackwell T, Branke J (2006) Multiswarms, exclusion, and anti-convergence in dynamic environments. IEEE Trans Evol Comput 10(4):459–472
5. Branke J, Kaußler T, Smidt C, Schmeck H (2000) A multi-population approach to dynamic optimization problems. Evolutionary design and manufacture. Springer, London, pp 299–307
6. Engelbrecht AP (2007) Computational intelligence: an introduction. John Wiley & Sons
7. Grefenstette JJ (1999) Evolvability in dynamic fitness landscapes: a genetic algorithm approach. In: Proceedings of the 1999 congress on evolutionary computation-CEC99, Cat. No. 99TH8406, vol 3, pp 2031–2038
8. Gudise VG, Venayagamoorthy GK (2003) Comparison of particle swarm optimization and backpropagation as training algorithms for neural networks. In: Proceedings of the 2003 IEEE swarm intelligence symposium. SIS'03, Cat. No. 03EX706, pp 110–117
9. Holm JE, Botha EC (1999) Leap-frog is a robust algorithm for training neural networks. Network Comput Neural Syst 10(1):1–13
10. Torrecilla JS, Otero L, Sanz PD (2007) Optimization of an artificial neural network for thermal/pressure food processing: evaluation of training algorithms. Comput Electron Agric 56(2):101–110
11. Floreano D, Dürr P, Mattiussi C (2008) Neuroevolution: from architectures to learning. Evol Intel 1(1):47–62
12. Stanley KO, Clune J, Lehman J, Miikkulainen R (2019) Designing neural networks through neuroevolution. Nat Mach Intell 1(1):24–35
13. Pagliuca P, Nolfi S (2019) Robust optimization through neuroevolution. PLoS ONE 14(3):e0213193
14. Sloss AN, Gustafson S (2019) 2019 Evolutionary algorithms review. arXiv preprint arXiv:1906.08870
15. Mohammed MA, Ghani MKA, Arunkumar NA, Hamed RI, Abdullah MK, Burhanuddin MA (2018) A real time computer aided object detection of nasopharyngeal carcinoma using genetic algorithm and artificial neural network based on Haar feature fear. Future Gener Comput Syst 89:539–547
16. Jamali B, Rasekh M, Jamadi F, Gandomkar R, Makiabadi F (2019) Using PSO-GA algorithm for training artificial neural network to forecast solar space heating system parameters. Appl Therm Eng 147:647–660
17. Pham BT, Nguyen MD, Bui KTT, Prakash I, Chapi K, Bui DT (2019) A novel artificial intelligence approach based on multi-layer perceptron neural network and biogeography-based optimization for predicting coefficient of consolidation of soil. CATENA 173:302–311
18. Dahou A, Elaziz MA, Zhou J, Xiong S (2019) Arabic sentiment classification using convolutional neural network and differential evolution algorithm. Comput Intell Neurosci
19. Heidari AA, Faris H, Aljarah I, Mirjalili S (2019) An efficient hybrid multilayer perceptron neural network with grasshopper optimization. Soft Comput 23(17):7941–7958
20. Alameer Z, Elaziz MA, Ewees AA, Ye H, Jianhua Z (2019) Forecasting gold price fluctuations using improved multilayer perceptron neural network and whale optimization algorithm. Res Policy 61:250–260
21. Gong Y, Xiao S (2019) Synthesis of sparse arrays in presence of coupling effects based on ANN and IWO. In: 2019 IEEE international conference on computational electromagnetics (ICCEM), pp 1–3
22. Dash CSK, Behera AK, Dehuri S, Cho SB (2019) Building a novel classifier based on teaching learning based optimization and radial basis function neural networks for non-imputed database with irrelevant features. Appl Comput Inf
23. Dey N (ed) (2017) Advancements in applied metaheuristic computing. IGI Global
24. Dey N, Ashour AS (2016) Antenna design and direction of arrival estimation in metaheuristic paradigm: a review. Int J Serv Sci Manage Eng Technol 7(3):1–18
25. Gupta N, Patel N, Tiwari BN, Khosravy M (2018 Nov) Genetic algorithm based on enhanced selection and log-scaled mutation technique. In: Proceedings of the future technologies conference. Springer, Cham, pp 730–748
26. Singh G, Gupta N, Khosravy M (2015 Nov) New crossover operators for real coded genetic algorithm (RCGA). In: 2015 international conference on intelligent informatics and biomedical sciences (ICIIBMS), IEEE, pp 135–140
27. Gupta N, Khosravy M, Patel N, Senjyu T (2018) A bi-level evolutionary optimization for coordinated transmission expansion planning. IEEE Access 6:48455–48477
28. Simon D (2008) Biogeography-based optimization. IEEE Trans Evol Comput 12(6):702–713
29. Chatterjee S, Sarkar S, Hore S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Comput Appl 28(8):2005–2016
30. Jagatheesan K, Anand B, Samanta S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimisation-based parameters optimisation of PID controller for load frequency control of multi-area reheat thermal power systems. Int J Adv Intell Paradigms 9(5–6):464–489
31. Chatterjee S, Hore S, Dey N, Chakraborty S, Ashour AS (2017) Dengue fever classification using gene expression data: a PSO based artificial neural network approach. In: Proceedings of the 5th international conference on frontiers in intelligent computing: theory and applications. Springer, Singapore, pp 331–341
32. Jagatheesan K, Anand B, Dey N, Gaber T, Hassanien AE, Kim TH (2015 Sept) A design of PI controller using stochastic particle swarm optimization in load frequency control of thermal power systems. In: 2015 fourth international conference on information science and industrial applications (ISI), IEEE, pp 25–32
33. Chakraborty S, Samanta S, Biswas D, Dey N, Chaudhuri SS (2013 Dec) Particle swarm optimization based parameter optimization technique in medical information hiding. In: 2013 IEEE international conference on computational intelligence and computing research, pp 1–6
34. Khosravy M, Gupta N, Patel N, Senjyu T, Duque CA (2020) Particle swarm optimization of morphological filters for electrocardiogram baseline drift estimation. In: Dey N, Ashour AS, Bhattacharyya S (eds) Applied nature-inspired computing: algorithms and case studies. Springer, Singapore, pp 1–21
35. Moraes CA, De Oliveira EJ, Khosravy M, Oliveira LW, Honório LM, Pinto MF (2020) A hybrid bat-inspired algorithm for power transmission expansion planning on a practical Brazilian network. In: Dey N, Ashour AS, Bhattacharyya S (eds) Applied nature-inspired computing: algorithms and case studies. Springer, Singapore, pp 71–95
36. Satapathy SC, Raja NSM, Rajinikanth V, Ashour AS, Dey N (2018) Multi-level image thresholding using Otsu and chaotic bat algorithm. Neural Comput Appl 29(12):1285–1307
37. Rajinikanth V, Satapathy SC, Dey N, Fernandes SL, Manic KS (2019) Skin melanoma assessment using Kapur's entropy and level set: a study with bat algorithm. In: Smart intelligent computing and applications. Springer, Singapore, pp 193–202
38. Dey N, Samanta S, Yang XS, Das A, Chaudhuri SS (2013) Optimisation of scaling factors in electrocardiogram signal watermarking using cuckoo search. Int J Bio-Inspired Comput 5(5):315–326
39. Dey N, Samanta S, Chakraborty S, Das A, Chaudhuri SS, Suri JS (2014) Firefly algorithm for optimization of scaling factors during embedding of manifold medical information: an application in ophthalmology imaging. J Med Imaging Health Inf 4(3):384–394
40. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching learning-based optimization: a novel method for constrained mechanical design optimization problems. Computer-Aided Des 43(3):303–315
41. Gupta N, Khosravy M, Patel N, Sethi IK (2018) Evolutionary optimization based on biological evolution in plants. Procedia Comput Sci Elsevier 126:146–155
42. Gupta N, Khosravy M, Mahela OP, Patel N (2020) Plants biology inspired genetics algorithm: superior efficiency to firefly optimizer. In: Applications of firefly algorithm and its variants. Springer Tracts in Nature-Inspired Computing (STNIC). Springer International Publishing (in press)
43. Gutierrez CE, Alsharif MR, Khosravy M, Yamashita K, Miyagi H, Villa R (2014) Main large data set features detection by a linear predictor model. AIP Conf Proc 1618(1):733–737
44. Khosravy M, Gupta N, Marina N, Asharif MR, Asharif F, Sethi IK (2015 Nov) Blind components processing a novel approach to array signal processing: a research orientation. In: 2015 international conference on intelligent informatics and biomedical sciences (ICIIBMS), IEEE, pp 20–26
45. Khosravy M, Asharif MR, Yamashita K (2009) A PDF-matched short-term linear predictability approach to blind source separation. Int J Innov Comput Inf Control (IJICIC) 5(11):3677–3690
46. Khosravy M, Alsharif MR, Yamashita K (2009) A PDF-matched modification to Stone's measure of predictability for blind source separation. In: International symposium on neural networks. Springer, Berlin, Heidelberg, pp 219–228
47. Khosravy M, Asharif MR, Yamashita K (2011) A theoretical discussion on the foundation of Stone's blind source separation. SIViP 5(3):379–388
48. Khosravy M, Asharif M, Yamashita K (2008 July) A probabilistic short-length linear predictability approach to blind source separation. In: 23rd international technical conference on circuits/systems, computers and communications (ITC-CSCC 2008), Yamaguchi, Japan, pp 381–384
49. Khosravy M, Kakazu S, Alsharif MR, Yamashita K (2010) Multiuser data separation for short message service using ICA (Signal Processing). IEICE Tech Rep SIP 109(435):113–117
Artificial Neural Network Trained
279
50. Khosravy M, Asharif MR, Sedaaghi MH (2008) Medical image noise suppression: using mediated morphology. IEICE Tech Rep, IEICE, pp 265–270 51. Ashour AS, Samanta S, Dey N, Kausar N, Abdessalemkaraa WB, Hassanien AE (2015) Computed tomography image enhancement using cuckoo search: a log transform based approach. J Signal Inf Process 6(03):244 52. Khosravy M, Gupta N, Marina N, Sethi IK, Asharif MR (2017) Brain action inspired morphological image enhancement. Nature-inspired computing and optimization. Springer, Cham, pp 381–407 53. Dey N, Mukhopadhyay S, Das A, Chaudhuri SS (2012) Analysis of P-QRS-T components modified by blind watermarking technique within the electrocardiogram signal for authentication in wireless telecardiology using DWT. Int J Image Graphics Signal Process 4(7):33 54. Dey N, Ashour AS, Shi F, Fong SJ, Sherratt RS (2017) Developing residential wireless sensor networks for ECG healthcare monitoring. IEEE Trans Consum Electr 63(4):442–449 55. Sedaaghi MH, Khosravi M (2003 July) Morphological ECG signal preprocessing with more efficient baseline drift removal. In: Proceedings of the 7th IASTED international conference, ASC, pp 205–209 56. Khosravi M, Sedaaghi MH (2004 Feb) Impulsive noise suppression of electrocardiogram signals with mediated morphological filters. In: The 11th Iranian Conference on Biomedical Engineering, Tehran, Iran, pp 207–212 57. Khosravy M, Patel N, Gupta N, Sethi IK (2019) Image Quality assessment: a review to full reference indexes. Recent trends in communication, computing, and electronics. Springer, Singapore, pp 279–288 58. Gutierrez CE, Alsharif MR, Yamashita K, Khosravy M (2014) A tweets mining approach to detection of critical events characteristics using random forest. Int J Next-Gener Comput 5 (2):167–176 59. Hore S, Chakraborty S, Chatterjee S, Dey N, Ashour AS, Van Chung L, Le DN (2016) An integrated interactive technique for image segmentation using stack based seeded region growing and thresholding. 
Int J Electr Comput Eng 6(6):2088–8708 60. Khosravy M, Alsharif MR, Guo B, Lin H, Yamashita K (2009 Mar) A robust and precise solution to permutation indeterminacy and complex scaling ambiguity in BSS-based blind MIMO-OFDM receiver. In: International conference on independent component analysis and signal separation. Springer, Berlin, Heidelberg, pp 670–677 61. Asharif F, Tamaki S, Alsharif MR, Ryu HG (2013) Performance improvement of constant modulus algorithm blind equalizer for 16 QAM modulation. Int J Innov Comput Inf Control 7(4):1377–1384 62. Khosravy M, Alsharif MR, Yamashita K (2009) An efficient ICA based approach to multiuser detection in MIMO OFDM systems. Multi-carrier systems and solutions 2009. Springer, Dordrecht, pp 47–56 63. Khosravy M, Alsharif MR, Khosravi M, Yamashita K (2010 June) An optimum pre-filter for ICA based multi-input multi-output OFDM system. In: 2010 2nd international conference on education technology and computer, IEEE, vol 5, pp V5–129 64. Sedaaghi MH, Daj R, Khosravi M (2001 Oct) Mediated morphological filters. In: Proceedings 2001 international conference on image processing (cat. no. 01CH37205), IEEE, vol 3, pp 692–695 65. Khosravy M, Gupta N, Marina N, Sethi IK, Asharif MR (2017) Morphological filters: an inspiration from natural geometrical erosion and dilation. Nature-inspired computing and optimization. Springer, Cham, pp 349–379
280
N. Gupta et al.
66. Khosravy M, Gupta N, Marina N, Sethi IK, Asharif MR (2017) Perceptual adaptation of image based on Chevreul-Mach bands visual phenomenon. IEEE Signal Process Lett 24(5): 594–598 67. Khosravy M, Punkoska N, Asharif F, Asharif MR (2014) Acoustic OFDM data embedding by reversible Walsh-Hadamard transform. AIP Conf Proc 1618(1):720–723 68. Picorone AAM, Oliveira TR, Sampaio-Neto R, Khosravy M, Ribeiro MV (2020) Channel characterization of low voltage electric power distribution networks for PLC applications based on measurement campaign. Int J Electr Power Energy Syst 116:105554 69. Gupta S, Khosravy M, Gupta N, Darbari H (2019) In-field failure assessment of tractor hydraulic system operation via pseudospectrum of acoustic measurements. Turkish J Electr Eng Comput Sci 27(4):2718–2729
Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines

Iztok Fister Jr.1(B), Milan Zorman1, Dušan Fister2, and Iztok Fister1

1 Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroška cesta 46, 2000 Maribor, Slovenia
2 Faculty of Economics and Business, University of Maribor, Razlagova 14, SI-2000 Maribor, Slovenia

1 Introduction
As has long been known, some basic research areas (e.g., medicine, biology) cannot solve certain problems in their highly specialized laboratories without modern computational methods and algorithms. Bioinformatics is a vibrant interdisciplinary research area that solves problems where the aid of digital computers is unavoidable. Therefore, it is no wonder that the discipline encompasses specialists from different research areas, such as computer scientists, mathematicians, biologists, geneticists, and statisticians. Indeed, the advent of digital computers has changed the principles of experimental work as well. In the past, most experiments in science were performed either in vitro or in vivo. The former refers to controlled experiments conducted outside of an organism (e.g., in cellular biology), while the latter refers to experimentation on a living organism (e.g., animals). Recently, most experiments have been performed in silico, that is, on computers or as computer simulations (e.g., DNA analysis). For instance, let us imagine the whole human genotype, which consists of approximately 25,000 different genes. In line with this, several questions have arisen in bioinformatics: How can we analyze all the complex interactions between these genes, or even gene expression profiles, without specialized computational algorithms? How can we process the big data that are produced by next-generation sequencing (NGS) technology [29] and can easily be measured in gigabytes or even terabytes? How can we acquire new information or laws from biodata? Very topical problems have arisen as a consequence of all these questions, and the answers are being sought in many scientific projects sponsored by research agencies around the world.
For example, in recent years, the European Union's Horizon 2020 programme supported many projects in these areas. By the same token, many hospitals opened their doors and hired bioinformaticians. Bioinformaticians support clinicians or laboratory researchers with new information that can be obtained from data [19].

© Springer Nature Singapore Pte Ltd. 2020. M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_13

Most of the questions posed above coincide with those faced by data science as well [5]. Data science is a multi-disciplinary domain that uses scientific methods, processes, algorithms, and systems for extracting knowledge and, thus, enabling new insights from structured and unstructured data. Often, data scientists also need to use methods from machine learning (ML) in data analysis. ML is a part of artificial intelligence (AI) that is capable of building a mathematical model of sample data in order to make predictions or decisions without being programmed explicitly to perform this task [2]. Thus, data scientists are confronted with the question of how to employ these methods as effectively as possible. As a key solution, automated machine learning (AutoML) has become a topic of considerable interest [25]. The purpose of AutoML is to automate some phases of ML tasks, especially those demanding that data scientists select the proper ML method or set the most appropriate hyperparameters. Obviously, these tasks are far from easy for non-experts. Therefore, AutoML searches for customized ML pipelines, which are optimized sequences of ML methods, processes, algorithms, and appropriate hyperparameters for controlling their behavior.

AutoML can be modeled as an optimization problem. There is a big pool of different AutoML methods that are based on evolutionary algorithms (EAs) [12]. TPOT [26] is a tree-based pipeline optimization tool for AutoML that is based on genetic programming (GP). Some improvements of TPOT are offered in the thesis [15]. RECIPE [8] is another excellent example; it relies on grammar-based GP to build customized classification pipelines.
Interestingly, paper [34] reports a new EA for the AutoML task of automatically selecting the best ensemble of classifiers and their hyperparameter settings for an input dataset. A swarm intelligence (SI) algorithm, ant colony optimization (ACO), was used in the work by Costa and Rodrigues [7].

Classification problems are very common within the bioinformatics area. The classification task means assigning each item of the sample data, represented by features, to one of a predefined set of classes. Although the task seems trivial from the expert's point of view, a data scientist, for example, needs to perform at least three tasks to accomplish it: The proper feature selection algorithm must be selected first, followed by selection of the appropriate ML classification method. Finally, the optimal hyperparameter setting must be found for each of the selected algorithms and ML methods. As a result, a classification pipeline is modeled manually, representing a customized sequence of classification methods and algorithms from the data scientist's point of view. As this example shows, modeling a classification pipeline is a complex task.

Additionally, it is worth mentioning that the bioinformatics community also includes data scientists who are not computer experts. Potentially, they have trouble conducting classification tasks on their own sample datasets. For that reason, they rely mostly on existing, usually commercial, software. However, this approach has many bottlenecks, especially due to the lack of robustness of the software solutions on the market.
In this paper, stochastic nature-inspired population-based algorithms are proposed for evolving classification pipelines automatically in bioinformatics. In line with this, a novel AutoML method, named NiaAML, is presented, which has the following advantages:
• the problem of AutoML is modeled as a continuous optimization problem, so that any stochastic nature-inspired population-based algorithm can be used for solving it,
• the proposed method has a layered architecture; hence, new components (e.g., classifiers) can be added easily,
• it is also intended to be used by non-programmers.

The structure of this paper is as follows: Sect. 2 deals with AutoML and its characteristics. Section 3 outlines the proposed NiaAML solution, and Sect. 4 presents the performed experiments and results. Section 5 concludes the paper and outlines directions for future work.
2 AutoML

The advent of big data has brought an explosion of ML research and applications [20]. Consequently, scientists from various areas have been confronted with the issue of how to analyze data as efficiently as possible. Typically, a lot of ML methods are employed in this analysis. Unfortunately, the performance of the analysis depends on the selected ML methods, on the one hand, while all those methods are sensitive to a plethora of parameters controlling their behavior, on the other. Therefore, finding the optimal sequence of ML methods to apply to sample data (i.e., the ML pipeline), together with their optimal parameter settings, is a big problem even for experts, let alone other domain specialists. Obviously, the optimal ML pipeline cannot be found in one step, but demands a "trial-and-error" approach of the kind that has proven itself in evolutionary computation (EC) [12]. Unfortunately, this approach cannot be applied directly here, due to the huge amount of sample data. As a result, novel approximation methods must be sought that achieve results good enough for use in practice.

In order to simplify complex ML tasks, AutoML has evolved with the aim of automating the creation of optimal ML pipelines and enabling other domain specialists to use the complex set of ML methods and their optimal hyperparameter settings easily. This means that AutoML is a way of democratizing ML, wherein ML experts wish to bring state-of-the-art ML methods closer to other domain specialists. Although AutoML development is still at an early stage, AutoML can even outperform human ML experts, which shows its potential for the future. This fast-moving area of ML is focused on three main issues:
• Hyperparameter optimization (HPO),
• Meta-learning,
• Neural architecture search (NAS).

Proper settings of hyperparameter values are crucial for good performance of ML methods. Typically, these methods have a lot of hyperparameters, with ranges usually unknown in advance. Therefore, it is not easy to determine their optimal values manually. This issue has led researchers in AutoML to HPO, where, in summary, two optimization types are proposed:
• blackbox optimization,
• multi-fidelity optimization.

The former captures algorithms such as model-free blackbox HPO and Bayesian optimization. Model-free blackbox HPO refers to traditional search algorithms, starting from grid search, going through random search, and ending with stochastic nature-inspired population-based algorithms [13,16]. Bayesian optimization, on the other hand, has been applied successfully for tuning deep neural networks. Commonly, both mentioned types of blackbox optimization algorithms are too time-consuming to be used in deep learning (DL). The latter type speeds up manual tuning by probing hyperparameter settings on small subsets of data, casting this into a formal multi-fidelity algorithm, and applying it on sample data. This HPO type presents the best trade-off between optimization performance and runtime.

Meta-learning is the science of systematically observing how different ML methods perform on a wide range of learning tasks, and then learning from this experience [20]. The challenge in meta-learning is to learn from prior experience in a systematic, data-driven way, with the goal of searching for optimal models for new tasks. This means that the learning does not start from scratch, but is based on the collected meta-data. Meta-learning consists of two steps [20]:
• Collecting meta-data, which describe prior learning tools and previously learned models.
The meta-data consist of algorithm configurations, including hyperparameter settings, pipeline composition and/or network architecture, and model evaluations expressed as accuracy and training time.
• Exploring the meta-data to learn, extract, and transfer knowledge for guiding the search for the optimal models for new tasks.

Unfortunately, learning from prior experience is effective only as long as a new task does not represent completely unrelated phenomena or random noise. However, real-world tasks are usually not sensitive to the mentioned disturbances.

Deep learning has brought a need for complex network architectures. The more efficient architectures cannot be searched for manually. Therefore, interest in automated NAS has been growing recently. Although NAS has a significant overlap with hyperparameter optimization and meta-learning, it can be seen as a subfield of AutoML. According to Hutter et al. [20], NAS methods are classified with regard to:
• search space,
• search strategy,
• performance estimation strategy.

Incorporating prior knowledge about the search space can reduce its dimensionality and, thus, simplify the search. A search strategy determines how to explore the search space. A performance estimation strategy searches for those performance measures that reduce the cost of their estimation, on the one hand, and achieve highly predictive performance on unseen data, on the other [20].
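The model-free blackbox HPO described in this section can be illustrated with a minimal sketch of grid search and random search. The search space `DOMAIN`, the function `toy_loss`, and all other names below are assumptions made purely for illustration; in practice, the loss would be the validation error of a trained model.

```python
import itertools
import random

# Hypothetical search space with two continuous hyperparameters.
DOMAIN = {"gamma": (0.1, 100.0), "c": (0.1, 100.0)}


def toy_loss(config):
    """Stand-in for a validation loss; its minimum sits at gamma=10, c=1."""
    return (config["gamma"] - 10.0) ** 2 + (config["c"] - 1.0) ** 2


def grid_search(domain, loss, points_per_axis=5):
    """Evaluate every combination on a regular grid and return the best one."""
    names = list(domain)
    axes = []
    for name in names:
        low, high = domain[name]
        step = (high - low) / (points_per_axis - 1)
        axes.append([low + i * step for i in range(points_per_axis)])
    candidates = [dict(zip(names, combo)) for combo in itertools.product(*axes)]
    return min(candidates, key=loss)


def random_search(domain, loss, budget=25, seed=0):
    """Evaluate `budget` uniformly sampled configurations and return the best."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.uniform(low, high) for name, (low, high) in domain.items()}
        for _ in range(budget)
    ]
    return min(candidates, key=loss)


best_grid = grid_search(DOMAIN, toy_loss)
best_random = random_search(DOMAIN, toy_loss)
```

Both searches spend the same budget of 25 evaluations here. The grid probes only five distinct values per hyperparameter, whereas random search probes 25, which is one reason it is often preferred when only a few hyperparameters matter.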
3 Proposed NiaAML Method

Although classification pipelines have already been composed using various stochastic nature-inspired population-based algorithms (e.g., TPOT, RECIPE, etc.), their origins were usually found in genetic programming (GP) [22], where individuals are represented as trees. This study is focused on real-valued stochastic nature-inspired population-based algorithms for evolving classification pipelines. The NiaAML method was developed according to the directions of AutoML development proposed in [20]. In this sense, NiaAML corresponds to the meta-data collection step. As already seen, collecting meta-data is the meta-learning step devoted to pipeline composition, including hyperparameter optimization and model evaluation. An evolving classification pipeline is illustrated in Fig. 1, from which it can be seen that the classification pipeline composition consists of three tools:
• feature selection,
• classifier selection,
• hyperparameter optimization.
Fig. 1. A classification pipeline evolving
The task of the feature selection tool is to determine a suitable algorithm for feature selection, while the task of the classifier selection tool is to select the proper ML method. The algorithms, as well as the classification methods, are incorporated into a framework of tools. The hyperparameter optimization serves as a search mechanism for specifying the proper values of the parameters arising in the feature selection algorithms and classification methods. The result of the pipeline composition is a customized classification pipeline that needs to be evaluated in the model evaluation step. The task of evaluation is to assess the quality of the composed model on sample data. The evolving process, consisting of pipeline composition and model evaluation, is repeated until the termination condition is satisfied.

The architecture of the NiaAML method is depicted in Fig. 2. The architecture is represented schematically as a feature diagram (FD) [21] and describes the tasks needed for composing customized classification pipelines. The FD is a tree consisting of vertices and arcs. Vertices represent task models, and arcs determine the relationships between them. The vertices are either mandatory or optional, where the former are denoted by closed dots and the latter by open dots. As can be seen from Fig. 2, the task of composing the customized classification pipelines (‘NiaAML’ vertex) has three conceptual levels, as follows:
• pipeline composing models (features in FD),
• model families (sub-features in FD),
• tools and hyperparameter settings (attributes in FD).

Fig. 2. Feature diagram of NiaAML method

Each feature/sub-feature is determined by its own set of sub-features/attributes, which are connected to each other by various relationships. There are three different relationships in the FD: ‘and’, ‘one of’, and ‘more of’. The ‘one of’ relationship is denoted by an open semicircle joining the arcs from a feature to its sub-features, the ‘more of’ by a closed semicircle, while the ‘and’ relationship has no semicircle. The meaning of these relationships is explained together with the detailed description of the FD that follows.

Composing the customized classification pipeline with NiaAML consists of three mandatory tasks: modeling the feature selection, the classifier selection, and the hyperparameter optimization. These tasks are connected with the relation ‘and’, which means that all three tasks must be included in the process of composing the pipeline. Modeling the feature selection is composed of modeling the feature selection algorithm and the feature scaling. Both sub-features in the FD are connected with the relation ‘more of’. Because modeling the feature scaling is optional, the relation means that modeling the feature selection can be performed with or without the feature scaling. However, the feature selection algorithm must be modeled in any case. A classification method can be modeled from four classification families: linear, SVM, boosting, and decision tree. These sub-features in the FD are mutually connected with the relation ‘one of’, meaning that exactly one of the classification families takes part in composing the classification pipeline. The hyperparameter optimization is specific, because it has only one task, i.e., modeling the hyperparameter setting, which depends on the selected feature selection algorithm as well as on the selected classification method.

The lowest level in the FD tree represents a framework of tools and hyperparameter settings. This consists of feature selection/scaling algorithms, classification methods, and optimized hyperparameter settings. All framework elements are selected with regard to the corresponding model families. For instance, a feature selection algorithm can be modeled using four different feature selection algorithms. Moreover, the rescaling and normalization procedures are available for the feature scaling. At the moment, one classification method is modeled for each classification family, except for the decision tree family, which supports three classification methods. In the future, we plan to increase the number of classification methods. As already explained, the hyperparameter setting is model dependent.

In the remainder of the paper, the algorithm for composing the classification pipeline is presented in detail. The section is concluded with an illustration of the model evaluation.

3.1 Composing the Classification Pipeline
The problem can be defined informally as follows: The task is to compose the classification pipeline from tools incorporated into a framework, where the pipeline consists of feature selection and feature scaling algorithms, classification methods, and the optimal setting of the hyperparameters controlling the algorithms and/or methods, such that the maximum classification accuracy is achieved. The problem is defined as an optimization problem and solved by particle swarm optimization (PSO) [11]. Although it can be solved with other stochastic population-based nature-inspired algorithms as well, PSO was preferred due to its simplicity.
3.1.1 Particle Swarm Optimization

PSO is a member of the SI-based algorithm family that was developed by Eberhart and Kennedy in 1995 [11]. It is inspired by the social behavior of bird flocking and fish schooling [9,17]. The algorithm works with a swarm (i.e., population) of particles representing candidate solutions $x_i^{(t)}$ of the problem to be solved. The particles fly virtually through the problem space and are attracted by more promising regions. When the particles are located in the vicinity of these regions, they are rewarded by the algorithm with better values of the fitness function. The PSO algorithm uses additional memory, in which each particle's personal best location $p_i^{(t)}$, as well as the swarm's global best location $g^{(t)}$ in the search space, are saved. In each time step $t$ (i.e., generation), all particles change their velocities $v_i^{(t)}$ toward their personal and global best locations according to the following formula:

$$
\begin{aligned}
v_i^{(t+1)} &= v_i^{(t)} + C_1 \cdot \mathrm{rand}(0,1) \cdot \left(g^{(t)} - x_i^{(t)}\right) + C_2 \cdot \mathrm{rand}(0,1) \cdot \left(p_i^{(t)} - x_i^{(t)}\right), \\
x_i^{(t+1)} &= x_i^{(t)} + v_i^{(t)},
\end{aligned}
\tag{1}
$$

where $C_1$ and $C_2$ are learning rates typically initialized to 2, and rand(0, 1) is a random value drawn from the uniform distribution in the interval [0, 1]. The pseudo-code of the original PSO is illustrated in Algorithm 1, from which it can be seen that PSO is distinguished from the classical EAs by four specialties:
• it does not have survivor selection,
• it does not have parent selection,
• it does not have a crossover operator, and
• the mutation operator is replaced by the move operator, which changes each element of particle $x_i^{(t)}$ with probability of mutation $p_m = 1.0$.

Algorithm 1 The original PSO algorithm
  procedure ParticleSwarmOptimization
    t ← 0;
    P(t) ← Initialize;                                   ▷ initialization of population
    while not TerminationConditionMeet do
      for all x_i(t) ∈ P(t) do
        f_i(t) = Evaluate(x_i(t));                       ▷ evaluation of candidate solution
        if f_i(t) ≤ f_best_i(t) then
          p_i(t) = x_i(t); f_best_i(t) = f_i(t);         ▷ preserve the local best solution
        end if
        if f_i(t) ≤ f_best(t) then
          g(t) = x_i(t); f_best(t) = f_i(t);             ▷ preserve the global best solution
        end if
        x_i(t) = Move(x_i(t));                           ▷ move the candidate w.r.t. Eq. (1)
      end for
      t = t + 1;
    end while
  end procedure

Let us mention that selection is implemented in PSO implicitly, i.e., by permanently improving the personal best solution. However, when no further improvement is possible, the algorithm gets stuck in a local optimum. In what follows, the necessary modifications to the PSO algorithm are described in order to prepare it for composing the classification pipelines. The section is concluded with a discussion about the classification pipeline evaluation.

3.1.2 Representation of Individuals

In the modified PSO algorithm, individuals are represented by floating-point vectors as follows:

$$
x_i^{(t)} = \{ \underbrace{x_{i,1}^{(t)}}_{\text{FeatureSelection}}, \underbrace{x_{i,2}^{(t)}}_{\text{FeatureScaling}}, \underbrace{x_{i,3}^{(t)}}_{\text{Classification}}, \underbrace{x_{i,4}^{(t)}, \ldots, x_{i,k}^{(t)}, x_{i,k+1}^{(t)}, \ldots, x_{i,D}^{(t)}}_{\text{HyperParameters}} \},
\tag{2}
$$

where each element $x_{i,j}^{(t)}$ of the individual is mapped to an attribute from the corresponding feature set according to the genotype–phenotype mapping illustrated in Table 1. The table illustrates a mapping of the elements of a vector to the features, and then into the attributes of the corresponding feature sets.

Table 1. Genotype–phenotype mapping

| Vector elements | Feature name | Feature set |
|---|---|---|
| $x_{i,1}^{(t)}$ | FeatureSelection | {DE, PSO, GWO, BA} |
| $x_{i,2}^{(t)}$ | FeatureScaling | {No, Rescaling, Normalization} |
| $x_{i,3}^{(t)}$ | Classification | {MLP, LS-SVM, ADA, RF, ERT, BAG} |
| $x_{i,4}^{(t)}, \ldots, x_{i,k}^{(t)}, x_{i,k+1}^{(t)}, \ldots, x_{i,D}^{(t)}$ | HyperParameters | Model dependent |

Indeed, the first three elements of the floating-point vector are mapped to attributes as follows:

$$
attr_{feat}^{(t)} = \frac{x_{i,j}^{(t)}}{|attr_{feat}|}, \quad \text{for } j = 1, \ldots, 3,
\tag{3}
$$

where $attr_{feat}^{(t)}$ denotes the specific attribute of the feature and $|attr_{feat}|$ is the size of the feature set, with feat ∈ {FeatureSelection, FeatureScaling, Classification}. The result of the division is truncated. The remaining elements of the vector represent the absolute values of the corresponding hyperparameters lying in the corresponding hyperparameter domains. Although the maximum number of elements is fixed according to the algorithm and method with the maximum number of control hyperparameters, the number of effective elements is variable and depends on the selected tools. The hyperparameter domain, however, depends on the particular hyperparameter and must be determined experimentally.
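The update rule of Eq. (1), the loop of Algorithm 1, and the mapping of Table 1 can be sketched together in a few lines of Python. This is a minimal sketch under stated assumptions, not the NiaAML implementation: the genes are assumed to lie in [0, 1], the attribute index is taken as the truncated product of the gene and the feature set size (one plausible reading of Eq. (3)), positions are clamped to the unit interval, the freshly updated velocity is applied in the move, and `toy_fitness` is a hypothetical stand-in for the real pipeline fitness.

```python
import random

# Feature sets from Table 1; the dict order matches vector elements 1 to 3.
FEATURE_SETS = {
    "FeatureSelection": ["DE", "PSO", "GWO", "BA"],
    "FeatureScaling": ["No", "Rescaling", "Normalization"],
    "Classification": ["MLP", "LS-SVM", "ADA", "RF", "ERT", "BAG"],
}


def decode(vector):
    """Map the first three genes to attributes; the rest stay hyperparameters.

    Assumption: each gene lies in [0, 1] and the attribute index is the
    truncated value of gene * set size (clamped so that 1.0 stays valid).
    """
    pipeline = {}
    for gene, (feature, attributes) in zip(vector, FEATURE_SETS.items()):
        index = min(int(gene * len(attributes)), len(attributes) - 1)
        pipeline[feature] = attributes[index]
    pipeline["HyperParameters"] = list(vector[3:])
    return pipeline


def toy_fitness(vector):
    """Hypothetical stand-in for the pipeline fitness (lower is better)."""
    return sum((gene - 0.3) ** 2 for gene in vector)


def pso(fitness, dim=10, swarm_size=20, generations=50, c1=2.0, c2=2.0, seed=1):
    """Minimize `fitness` over [0, 1]^dim, following the loop of Algorithm 1."""
    rng = random.Random(seed)
    x = [[rng.random() for _ in range(dim)] for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    p = [particle[:] for particle in x]       # personal best locations
    p_fit = [float("inf")] * swarm_size
    g, g_fit = x[0][:], float("inf")          # global best location
    for _ in range(generations):
        for i in range(swarm_size):
            f = fitness(x[i])
            if f <= p_fit[i]:                 # preserve the personal best
                p[i], p_fit[i] = x[i][:], f
            if f <= g_fit:                    # preserve the global best
                g, g_fit = x[i][:], f
            for j in range(dim):              # move the particle, cf. Eq. (1)
                v[i][j] += (c1 * rng.random() * (g[j] - x[i][j])
                            + c2 * rng.random() * (p[i][j] - x[i][j]))
                # Assumption: the updated velocity is applied and the
                # position is clamped to the unit interval.
                x[i][j] = min(1.0, max(0.0, x[i][j] + v[i][j]))
    return g, g_fit


best_vector, best_fitness = pso(toy_fitness)
best_pipeline = decode(best_vector)
```

Because the whole pipeline is encoded in one real-valued vector, any other continuous optimizer could replace `pso` here without touching `decode`.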
3.1.3 Hyperparameter Domains

This subsection defines the hyperparameter domains. In NiaAML, the search for the optimal hyperparameter setting is part of the optimization process itself and not a separate process, as proposed by the AutoML community. Consequently, NiaAML performs HPO simultaneously with composing the classification pipelines and can therefore be faster than traditional AutoML. In our study, we dealt with feature scaling (rescaling or normalization), four feature selection algorithms, and six classification methods. Feature scaling can either be selected (i.e., rescaling or normalization) or not selected. In both cases, however, there are no special parameters for controlling these algorithms. For feature selection, the following stochastic nature-inspired population-based algorithms were proposed: differential evolution (DE) [32], PSO, gray wolf optimization (GWO) [23], and bat algorithm (BA) [35]. For classification, one of six methods can be selected: multilayer perceptron (MLP) [27], least squares support vector machine (LS-SVM) [33], AdaBoost (ADA) [28], random forest (RF) [4], extremely randomized trees (ERT) [14], and bagging (BAG) [3]. The hyperparameters of these algorithms and methods that were the subject of HPO are listed in Table 2.

Table 2. Hyperparameter domains

Alg. | Hyperparameter | Domain of values
DE | F, CR | F ∈ [0.5, 0.9], CR ∈ [0.0, 1.0]
PSO | C1, C2 | C1 ∈ [1.5, 2.5], C2 ∈ [1.5, 2.5]
GWO | a | a ∈ [0.0, 2.0]
BA | A, r, Qmin, Qmax | A ∈ [0.5, 1.0], r ∈ [0.0, 0.5], Qmin ∈ [0.0, 1.0], Qmax ∈ [1.0, 2.0]
MLP | act, sol, lr | act ∈ {identity, logistic, tanh, relu}, sol ∈ {lbfgs, sgd, adam}, lr ∈ {constant, invscaling, adaptive}
LS-SVM | gamma, c | gamma ∈ [0.1, 100], c ∈ [0.1, 100]
ADA | n_estim, alg | n_estim ∈ [10, 110], alg ∈ {samme, samme.r}
RF | n_estim | n_estim ∈ [10, 110]
ERT | n_estim | n_estim ∈ [10, 110]
BAG | n_estim | n_estim ∈ [10, 110]
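The domains in Table 2 can be written down as a small lookup structure, which also makes the resulting vector length easy to check. The sketch below is illustrative only: the dictionary layout and names are hypothetical, and the three leading selection elements (one each for feature selection, feature scaling, and classification) are an assumption based on the encoding described earlier.

```python
# Hyperparameter domains transcribed from Table 2. Tuples are numeric
# intervals (low, high); lists are categorical choices.
FEATURE_SELECTION = {
    "DE": {"F": (0.5, 0.9), "CR": (0.0, 1.0)},
    "PSO": {"C1": (1.5, 2.5), "C2": (1.5, 2.5)},
    "GWO": {"a": (0.0, 2.0)},
    "BA": {"A": (0.5, 1.0), "r": (0.0, 0.5),
           "Qmin": (0.0, 1.0), "Qmax": (1.0, 2.0)},
}
CLASSIFICATION = {
    "MLP": {"act": ["identity", "logistic", "tanh", "relu"],
            "sol": ["lbfgs", "sgd", "adam"],
            "lr": ["constant", "invscaling", "adaptive"]},
    "LS-SVM": {"gamma": (0.1, 100.0), "c": (0.1, 100.0)},
    "ADA": {"n_estim": (10, 110), "alg": ["samme", "samme.r"]},
    "RF": {"n_estim": (10, 110)},
    "ERT": {"n_estim": (10, 110)},
    "BAG": {"n_estim": (10, 110)},
}


def max_hyperparameters(tools):
    """Largest number of hyperparameters owned by any single tool."""
    return max(len(params) for params in tools.values())


def in_domain(value, domain):
    """True if value lies in an interval (tuple) or a categorical set (list)."""
    if isinstance(domain, tuple):
        low, high = domain
        return low <= value <= high
    return value in domain


# Assumed: one selection element each for feature selection, scaling, classifier.
N_SELECTION_ELEMENTS = 3
n_hyper = max_hyperparameters(FEATURE_SELECTION) + max_hyperparameters(CLASSIFICATION)
vector_size = N_SELECTION_ELEMENTS + n_hyper
print(n_hyper, vector_size)
```

Under these assumptions, BA contributes four slots and MLP three, which gives the seven hyperparameter slots and the vector length of 10 quoted in the text.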
Note that the maximum number of hyperparameters occurring in the selected algorithms and classification methods is seven (four for BA plus three for MLP in Table 2). As a result, the size of the real-valued vector for composing the classification pipeline is 10.

3.1.4 Fitness Function Evaluation

The value of the fitness function is calculated using tenfold cross-validation [1]. The data are split into k = 10 equal parts using stratified sampling, and each of the k parts is classified by the classification pipeline. The main advantage of this approach is that any two of the so-called training sets have 80% of the data in common when k = 10. Consequently, the
Continuous Optimizers for Automatic Design and Evaluation
trade-off between the bias and variance parts of the prediction error is kept small, because both are reduced as far as possible. The performance of the classification is estimated by accuracy (Accuracy), a statistical measure that estimates the proportion of correct predictions (i.e., true positives and true negatives) among the total number of samples. Mathematically, this proportion can be expressed as:

$$\mathrm{Accuracy}(M(x_i)) = \frac{TP + TN}{TP + TN + FP + FN}, \tag{4}$$
where $M(x_i)$ denotes a model built on the basis of the customized classification pipeline $x_i$, TP = True Positive, TN = True Negative, FP = False Positive, and FN = False Negative. The fitness function is then defined as follows:

$$f(x_i) = 1 - \frac{1}{k} \sum_{j=1}^{k} \mathrm{Accuracy}_j(M(x_i)), \tag{5}$$

where $\mathrm{Accuracy}_j(M(x_i))$ is the accuracy on the jth fold, calculated according to Eq. (4). Let us emphasize that the fitness function evaluates the average performance of the classification method over the 10 folds of the training data. The task of the optimization is to minimize the value of the fitness function.

3.2 Model Evaluation
In this step, the quality of the classification pipeline composed in the previous step, which consists of the selected features, classifier, and hyperparameters, is evaluated. Typically, the performance of a classification method in data science is evaluated by applying the evolved model to unseen test data. A standard 80–20% holdout validation is used, where 80% of the data is used for training and the other 20% for testing. The classification performance is then assessed according to the accuracy as expressed in Eq. (4).
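Both the cross-validated fitness of Eq. (5) and the holdout evaluation rest on the accuracy of Eq. (4). The following pure-Python sketch shows one way the fitness could be computed; it is not the NiaAML implementation. The round-robin stratification, the function names, and the `majority_class` stand-in pipeline are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import mean


def accuracy(y_true, y_pred):
    """Eq. (4) for any number of classes: share of correct predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, dealing each class out round-robin."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds


def fitness(X, y, fit_predict, k=10):
    """Eq. (5): one minus the mean accuracy over the k held-out folds."""
    scores = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        y_pred = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        scores.append(accuracy([y[i] for i in test_idx], y_pred))
    return 1.0 - mean(scores)


def majority_class(X_train, y_train, X_test):
    """Hypothetical stand-in pipeline: predict the most frequent training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Toy data: 80 samples of class "a" and 20 of class "b".
y = ["a"] * 80 + ["b"] * 20
X = [[float(i)] for i in range(100)]
score = fitness(X, y, majority_class, k=10)
print(score)
```

Because the optimizer minimizes the fitness, a pipeline that classifies every fold perfectly would score 0, while the majority-class stand-in above is penalized by exactly the share of the minority class.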
4 Experiments and Results
The purpose of our experimental work was to show that the algorithm for composing the classification pipeline finds results that are comparable to, if not better than, those tuned by data science experts. In line with this, three experiments were performed using well-known datasets from the UCI machine learning repository [10]. All three datasets were taken from the life sciences domain. Their characteristics are listed in Table 3.
Table 3. Datasets used

Dataset name | Characteristics | Attribute characteristics | #instances | #features | Missing data
Yeast | Multivariate | Real | 1,484 | 8 | No
Ecoli | Multivariate | Real | 336 | 8 | No
Abalone | Multivariate | Categorical, integer, real | 4,177 | 8 | No
As can be seen from the table, all datasets are multivariate with 8 features. As the number of features (denoted as #features in the table) is low, we can expect that the feature selection algorithm cannot reduce this number by much. On the other hand, the number of instances (denoted as #instances in the table) ranges from 336 to 4,177. Although NiaAML is prepared to tune the algorithms' parameters as well, the parameter values were fixed in this preliminary study for simplicity. The parameter settings of the algorithms used are listed in Table 4. Note that the population size NP = 20 and the number of fitness function evaluations nFES = 400 are fixed for all algorithms to ensure a fair comparison. The parameter settings of the NiaAML algorithm itself are the same as those presented in Table 4 for PSO, except for the population size NP = 15 and the number of fitness function evaluations nFES = 500. Interestingly, the GWO algorithm is parameterless and therefore does not demand any parameter setting. The optimal setting of hyperparameters is the subject of the HPO as presented in Table 2.

Table 4. Parameter setting

Algorithm | Acronym | Parameter 1 | Parameter 2 | Parameter 3 | Parameter 4
Differential evolution | DE | F = 0.5 | CR = 0.9 | - | -
Gray wolf optimizer | GWO | - | - | - | -
Particle swarm algorithm | PSO | C1 = 2.0 | C2 = 2.0 | w = 0.7 | v ∈ [−4, 4]
Bat algorithm | BA | A = 0.5 | r = 0.5 | Q ∈ [0.0, 2.0] | -
The results of the optimization were measured according to three statistical measures: precision, Cohen's kappa κ, and F1-score. The precision is defined as follows [1]:

$$\mathrm{precision}(M(x_i)) = \frac{TP}{TP + FP}, \tag{6}$$

where TP = True Positives and FP = False Positives. A metric for handling multiclass problems is Cohen's kappa, defined as [6]:

$$\kappa(M(x_i)) = \frac{n \sum_{j=1}^{k} CM_{jj} - \sum_{j=1}^{k} CM_{j.}\, CM_{.j}}{n^2 - \sum_{j=1}^{k} CM_{j.}\, CM_{.j}}, \tag{7}$$

where n is the total number of samples, k is the number of classes, $CM_{j.}$ is the sum of the elements in the jth row of the confusion matrix CM, and $CM_{.j}$ is the sum of the elements in the jth column of CM [31]. The
metric measures the level of agreement between two annotators on a multiclass classification problem and can take values in the interval [−1, 1]. Values at or below 0 indicate no agreement beyond chance, whereas values above 0.8 indicate almost perfect agreement. Finally, the F1-score based on precision and recall is expressed as follows:

$$F_1(M(x_i)) = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \tag{8}$$

where $\mathrm{recall} = \frac{TP}{TP + FN}$, FN = False Negatives, and TP = True Positives. It is worth mentioning that precision, recall, and F1 are defined for binary classification only, i.e., for two prediction classes. However, using sklearn's metrics library, one can compute a weighted average over many binary classification tasks and thus use those measures for multiclass classification. Let us emphasize that we focus on the results of composing the classification pipeline in this preliminary study. This means that the results of the model evaluation phase were left for the future. In the remainder of the paper, the aforementioned experiments are described in detail.

4.1 The Results on the Yeast Dataset
The top three results of composing the classification pipeline according to the measure Accuracy on the Yeast dataset are illustrated in Fig. 3, from which it can be seen that these were all obtained by applying the RF classification method.
Fig. 3. Top three configurations found on the Yeast dataset
The numerical results of the data depicted in Fig. 3 are presented in Table 5, where the best results are ranked by place from 1 to 3. In the table, the values in square brackets denote the number of features retained out of the total set in the column 'Feature selection' and the optimized number of estimators used by the corresponding classification method in the column 'Classification method'. The sign 'n/a' (i.e., not applicable) in the column 'Feature scaling' indicates that no feature scaling was applied. The column 'Accuracy' denotes the accuracy calculated according to Eq. (4).

Table 5. Numerical results obtained on the Yeast dataset

Place | Feature selection | Feature scaling | Classification method | Accuracy
1 | [8/8] | n/a | Random forest [93] | 0.6337
2 | [8/8] | n/a | Random forest [98] | 0.6329
3 | [8/8] | n/a | Random forest [110] | 0.6321
Interestingly, the best results were obtained using the PSO FS algorithm, no feature scaling, and the RF classification method. Although the FS procedure was performed, the top three configurations show that the best results are obtained by incorporating all the features. This indicates that each feature plays a relevant role in the interpretation of results and that redundancy among the features is low. RF with 93 estimators is the best among all built models. Accuracy decreases slightly as the number of estimators, and hence the model complexity, increases. The results according to the statistical measures, i.e., Accuracy, Precision, Cohen's kappa, and F1-score, are listed in Table 6.

Table 6. Statistical measures obtained on the Yeast dataset

Statistical measure | Statistical score
Accuracy | 0.6465
Precision | 0.6382
Cohen's kappa, κ | 0.5356
F1-score | 0.6387
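The measures reported in Table 6 follow Eqs. (4) and (6)–(8). As a hedged illustration, the sketch below computes them from a confusion matrix in plain Python. The support-weighted averaging of the per-class precision and F1 is an assumption about how the multiclass scores were aggregated, and the function name and the small example matrix are hypothetical.

```python
def classification_measures(cm):
    """Accuracy, weighted precision, weighted F1 and Cohen's kappa.

    cm is a square confusion matrix where cm[i][j] counts samples of
    true class i that were predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(k)]                       # CM_i.
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # CM_.j
    diagonal = sum(cm[i][i] for i in range(k))

    accuracy = diagonal / n
    # Eq. (7): agreement corrected by the products of row and column totals.
    chance = sum(row_sums[i] * col_sums[i] for i in range(k))
    kappa = (n * diagonal - chance) / (n * n - chance)

    precision = 0.0
    f1 = 0.0
    for i in range(k):
        # One-vs-rest precision and recall for class i (Eqs. 6 and 8).
        p = cm[i][i] / col_sums[i] if col_sums[i] else 0.0
        r = cm[i][i] / row_sums[i] if row_sums[i] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = row_sums[i] / n  # assumed: weight each class by its support
        precision += weight * p
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "f1": f1, "kappa": kappa}


# Hypothetical two-class confusion matrix, not taken from the experiments.
measures = classification_measures([[5, 1], [2, 4]])
print(measures)
```

For this example matrix, 9 of 12 samples lie on the diagonal, and the row and column totals give a chance term of 72, so kappa evaluates to (108 − 72) / (144 − 72) = 0.5.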
As can be seen from the table, Cohen's kappa κ = 0.5356 is higher than 0.5. This indicates moderate agreement between the true and predicted labels. The Accuracy = 0.6465 obtained in the evaluation is higher than the cross-validation Accuracy = 0.6337.

4.2 The Results on the Ecoli Dataset
The three best results of the NiaAML algorithm for composing the classification pipeline according to the measure Accuracy obtained on the Ecoli dataset are presented in Table 7. The meaning of the variables in the table is the same as discussed in the previous experiment. In this case too, the RF classification method was preferred by NiaAML, and the FS procedure did not prove beneficial. Feature scaling was not applied in any case, while the number of estimators used by the classification method was more than 100 in each instance. Although the numbers of estimators are similar to those for the Yeast dataset, the classification accuracy is much higher for the Ecoli dataset.

Table 7. Numerical results obtained on the Ecoli dataset

Place | Feature selection | Feature scaling | Classification method | Accuracy
1 | [7/7] | n/a | Random forest [104] | 0.8899
2 | [7/7] | n/a | Random forest [101] | 0.8882
3 | [7/7] | n/a | Random forest [109] | 0.8880
The graphical representation of the same results is presented in Fig. 4, from which it can be seen that the second configuration, with the lowest number of estimators, has the smallest standard deviation, and the third configuration, with the highest number of estimators, the largest. All three configurations reach approximately the same maximum classification accuracy of 0.9697. We can conclude that high classification accuracy can be achieved with an arbitrary number of estimators (unless too low), but the standard deviation increases with the number of estimators, possibly due to over-fitting. It is therefore desirable to keep the complexity of the classification method as low as possible to avoid such problems.
Fig. 4. Top three configurations found on the Ecoli dataset
The results according to the four statistical measures are listed in Table 8, from which it can be seen that Cohen's kappa κ increases significantly compared to the Yeast dataset. This indicates very high agreement between the true and predicted classes and might highlight the more predictive (correlative) nature of the Ecoli dataset. Using the optimal classification method configuration, the evaluation Accuracy = 0.9412 is markedly higher than the cross-validation Accuracy = 0.8899.

Table 8. Statistical measures obtained on the Ecoli dataset

Statistical measure | Statistical score
Accuracy | 0.9412
Precision | 0.9373
Cohen's kappa, κ | 0.9017
F1-score | 0.9318

4.3 The Results on the Abalone Dataset
The last experiment was conducted on the Abalone dataset. The results of NiaAML for composing the classification pipeline obtained on this dataset are presented in Table 9, from which it can be seen that the ERT and bagging classification methods proved to be the most beneficial. The overall Accuracy is the lowest among the three datasets, reaching barely over 0.55, which might indicate the increased complexity (i.e., number of instances) of the Abalone dataset. In our opinion, this is also the main reason why RF performed worse.

Table 9. Numerical results obtained on the Abalone dataset

Place | Feature selection | Feature scaling | Classification method | Accuracy
1 | [8/8] | n/a | Extremely randomized trees [110] | 0.5650
2 | [8/8] | n/a | Bagging [81] | 0.5545
3 | [8/8] | n/a | Bagging [83] | 0.5542
The same results are presented graphically in Fig. 5. As can be seen from the figure, the first configuration reaches the highest overall Accuracy. ERTs are comprehensive classification methods that perform well on complex datasets. The second configuration, i.e., Bagging [81], has the lowest standard deviation among the three but produces two outliers, i.e., the black points lying well above and below the box in the boxplot.
Fig. 5. Top three configurations found on the Abalone dataset
Finally, the results according to the four statistical measures are listed in Table 10, from which it can be seen that Cohen's kappa κ indicates only fair agreement. Although the evaluation Accuracy = 0.5754 is higher than the cross-validation Accuracy = 0.5650, the predictive performance obtained should be judged with care. The Precision is also the lowest among all three datasets, which indicates an increase in FP predictions.

Table 10. Statistical measures obtained on the Abalone dataset

Statistical measure | Statistical score
Accuracy | 0.5754
Precision | 0.5701
Cohen's kappa, κ | 0.3589
F1-score | 0.5725
4.4 Summary
In order to compare the quality of the classification pipelines, the best results obtained on the various datasets are presented in Fig. 6. The following conclusions can be drawn from the results: The Yeast and Ecoli datasets are smaller (1,484 and 336 instances, respectively) than the Abalone dataset, which consists of 4,177 instances. On the smaller datasets, RF outperforms the other classification methods. However, as the number of instances increases, as in the Abalone dataset, more comprehensive methods such as ERT and bagging emerge.
Fig. 6. The best classification pipelines obtained on the three datasets
The results obtained by NiaAML are comparable to or better than those reported by domain experts. For the Yeast dataset, the authors of [36] report Accuracy = 0.5792 for a so-called infinite latent SVM classifier. This is significantly lower than in our case, where we reached Accuracy = 0.6465. For the Ecoli dataset, the following accuracies were reported in [24]: Accuracy = 0.8214 for the J48 decision tree classifier, Accuracy = 0.8095 for the Ridor classifier, and Accuracy = 0.8125 for the JRip classifier. The authors of [30] report a mean Accuracy = 0.8880 by cross-validation in the best case. Using NiaAML, we scored Accuracy = 0.9412 in the evaluation and Accuracy = 0.8899 by cross-validation. While the evaluation result is significantly better, the cross-validation result is comparable. A review of [18] revealed the following results for the Abalone dataset: Accuracy = 0.5422 for decision trees, Accuracy = 0.5478 for Naïve Bayes, Accuracy = 0.5393 for k-nearest neighbors, and Accuracy = 0.5456 for SVM. The proposed approach gives Accuracy = 0.5754, which is at least comparable to the results of the domain experts.
5 Conclusions and Future Work
Recently, ML methods such as classification have become useful tools in various sciences. Unfortunately, using these methods is far from easy. Typically, they are controlled by many hyperparameters, whose optimal setting depends on the problem to be solved. Fortunately, the different methods solving a particular step in the ML phase, like feature selection/scaling, classification, and HPO, can be composed into pipelines and executed one after another. Therefore, a new discipline of ML has emerged, i.e., AutoML, with the goal of helping ordinary users apply ML methods. AutoML is capable of selecting the most appropriate ML methods, as well as finding their optimal parameters. Normally, the algorithms for composing ML methods into ML pipelines are deterministic in nature. In our study, we go a step further and propose the stochastic NiaAML for composing classification pipelines. The stochastic NiaAML is capable of: (1) performing automated feature selection and feature scaling to reduce the complexity of a dataset, (2) classifier selection, and (3) HPO to find the optimal configuration of the classifier. Classifier configurations are tested using cross-validation. The proposed NiaAML was tested on three different ML datasets: Yeast, Ecoli, and Abalone. Although feature selection was applied to each dataset, it was not found to be beneficial because the number of features was too low. It was found that NiaAML searches successfully for the optimal classification method and its configuration. The solution thus might help non-technical users obtain good classification performance. As a result, we can conclude that the obtained results are comparable to those proposed by domain experts. In the future, we would like to extend the NiaAML framework to detect and account for imbalanced datasets. Furthermore, constrained feature selection might be proposed, where a user could specify the maximal number of features to be incorporated into the classifier. In addition, a Web graphical user interface and an automated visualization framework might be desired as well.
References

1. Aggarwal CC (2014) Data classification: algorithms and applications. Chapman & Hall/CRC data mining and knowledge discovery series. Chapman and Hall/CRC
2. Bishop CM (2007) Pattern recognition and machine learning, 5th edn. Information science and statistics. Springer
3. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
4. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
5. Cleveland WS (2014) Data science: an action plan for expanding the technical areas of the field of statistics. Stat Anal Data Mining 7:414–417
6. Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Measure 20(1):37–46
7. Costa VO, Rodrigues CR (2018) Hierarchical ant colony for simultaneous classifier selection and hyperparameter optimization. In: 2018 IEEE congress on evolutionary computation (CEC). IEEE, pp 1–8
8. de Sá AGC, Pinto WJGS, Oliveira LOVB, Pappa GL (2017) RECIPE: a grammar-based framework for automatically evolving classification pipelines. In: European conference on genetic programming. Springer, pp 246–261
9. Dey N (2017) Advancements in applied metaheuristic computing. IGI Global
10. Dua D, Graff C (2017) UCI machine learning repository
11. Eberhart R, Kennedy J (1995) Particle swarm optimization. In: Proceedings of ICNN '95—international conference on neural networks, vol 4, pp 1942–1948
12. Eiben AE, Smith JE (2015) Introduction to evolutionary computing, 2nd edn. Springer Publishing Company, Incorporated
13. Fister I Jr, Yang X-S, Fister I, Brest J, Fister D (2013) A brief review of nature-inspired algorithms for optimization. Elektrotehniški vestnik 80(3):116–122
14. Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63(1):3–42
15. Gijsbers P (2018) Automatic construction of machine learning pipelines. Master's thesis, Eindhoven University of Technology
16. Gupta N, Khosravy M, Patel N, Senjyu T (2018) A bi-level evolutionary optimization for coordinated transmission expansion planning. IEEE Access 6:48455–48477
17. Gupta N, Khosravy M, Patel N, Sethi I (2018) Evolutionary optimization based on biological evolution in plants. Proc Comput Sci 126:146–155
18. Herranz J, Matwin S, Nin J, Torra V (2010) Classifying data from protected statistical datasets. Comput Sec 29(8):875–890
19. Holzinger A, Dehmer M, Jurisica I (2014) Knowledge discovery and interactive data mining in bioinformatics—state-of-the-art, future challenges and research directions. BMC Bioinf 15(6):I1
20. Hutter F, Kotthoff L, Vanschoren J (eds) (2019) Automatic machine learning: methods, systems, challenges. Series on challenges in machine learning. Springer
21. Kang KC, Cohen SG, Hess JA, Novak WE, Peterson AS (1990) Feature-oriented domain analysis (FODA) feasibility study. Technical report CMU/SEI-90-TR-021, Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA
22. Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge, MA, USA
23. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
24. Mohamed WNHW, Salleh MNM, Omar AH (2012) A comparative study of reduced error pruning method in decision tree algorithms. In: 2012 IEEE international conference on control system, computing and engineering. IEEE, pp 392–397
25. Olson RS, Bartley N, Urbanowicz RJ, Moore JH (2016) Evaluation of a tree-based pipeline optimization tool for automating data science. In: Proceedings of the genetic and evolutionary computation conference 2016, GECCO 2016. ACM, New York, NY, pp 485–492
26. Olson RS, Moore JH (2016) TPOT: a tree-based pipeline optimization tool for automating machine learning. In: Workshop on automatic machine learning, pp 66–74
27. Rosenblatt F (1961) Principles of neurodynamics. Perceptrons and the theory of brain mechanisms. Cornell Aeronautical Lab Inc, Buffalo, NY
28. Schapire RE (1999) A brief introduction to boosting. In: Proceedings of the 16th international joint conference on artificial intelligence, IJCAI '99, vol 2. Morgan Kaufmann Publishers Inc, San Francisco, CA, pp 1401–1406
29. Schuster SC (2007) Next-generation sequencing transforms today's biology. Nat Methods 5(1):16
30. Soda P, Iannello G (2010) Decomposition methods and learning approaches for imbalanced dataset: an experimental integration. In: 2010 20th international conference on pattern recognition. IEEE, pp 3117–3120
31. Stehman SV (1997) Selecting and interpreting measures of thematic classification accuracy. Remote Sens Environ 62(1):77–89
32. Storn R, Price K (1997) Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. J Glob Opt 11(4):341–359
33. Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
34. Xavier-Júnior JC, Freitas AA, Feitosa-Neto A, Ludermir TB (2018) A novel evolutionary algorithm for automated machine learning focusing on classifier ensembles. In: 2018 7th Brazilian conference on intelligent systems (BRACIS). IEEE, pp 462–467
35. Yang X-S (2010) A new metaheuristic bat-inspired algorithm. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 65–74
36. Zhu J, Chen N, Xing EP (2011) Infinite latent SVM for classification and multi-task learning. In: Advances in neural information processing systems, pp 1620–1628
Chapter 14
Evolutionary Artificial Neural Networks: Comparative Study on State-of-the-Art Optimizers

Neeraj Gupta1, Mahdi Khosravy2,3, Nilesh Patel1, Saurabh Gupta4,5, and Gazal Varshney6

1 Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA
2 Media Integrated Communication Lab, Graduate School of Engineering, Osaka University, Suita, Japan, [email protected]
3 Electrical Engineering Department, Federal University of Juiz de Fora, Juiz de Fora, Brazil
4 Department of Advanced Engineering, John Deere India Pvt. Ltd., Pune, India
5 Research Scholar, Department of Computer Science, Banasthali Vidyapith, Vanasthali, Rajasthan, India
6 University of Information Science and Technology, Ohrid, North Macedonia
© Springer Nature Singapore Pte Ltd. 2020. M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_14

1 Introduction

Today, evolutionary optimizers (EOs) [1, 2] play an important role in solving real-life problems. EOs are inspired by natural phenomena; examples include the genetic algorithm (GA) [3, 4] and PSO [5–10]. Apart from the application discussed in this chapter, EOs have great potential to be applied in a wide range of fields such as text feature detection [11], blind component processing [12], noise cancelation [13], blind source separation [14–18], data mining [19], image enhancement [20, 21], ECG processing [22–25], quality assessment [26], information hiding [9], image segmentation [27], morphological filtering [28, 29], acoustic OFDM [30], telecommunications [31–57], power line communications (PLC) [35], image adaptation [36], and fault detection systems [37]. In this chapter, we present the results of an ANN trained by a plant genetics-inspired optimizer known as the Mendelian evolutionary theory optimizer (METO) [38, 39], in comparison with thirteen other state-of-the-art optimizers. These algorithms are (1) binary hybrid GA (BHGA) [40], (2) biogeography-based optimization (BBO) [41], (3) invasive weed optimization (IWO) [42], (4) shuffled frog leap algorithm (SFLA) [43], (5) teaching-learning-based optimization (TLBO) [44], (6) cuckoo search (CS) [45], (7) novel bat algorithm (NBA) [46–48], (8) gravitational search algorithm (GSA) [49], (9) covariance matrix adaptation evolution strategy (CMAES) [50], (10) differential evolution (DE) [51], (11) firefly algorithm (FA) [52], (12) social learning PSO (SLPSO) [53], and (13) real coded simulated annealing (RSA) [54]. For these algorithms, one can find a large body of literature that demonstrates the efficiency
of these optimizers. We have selected a two-layer ANN with one hidden layer and one output layer, as given in Fig. 1. The hidden layer has 10 neurons, and the output layer is a softmax layer with six outputs. This model is trained for 15 input samples. The dataset under study is complex and has diverse features. One can observe that the dataset is too disturbed to be classified with conventional or standard machine learning techniques. Thus, it needs an intelligent method, which is METO in our simulation. We trained the ANN model 100 times, with each training run independent of the others, which gives us 100 trained models. Thus, we have a distribution of solutions from each optimizer. We analyzed this distribution statistically using the Kruskal–Wallis test. The analysis is presented in this chapter to show that METO is statistically the best among all optimizers.
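To make the statistical comparison concrete, the following sketch computes the Kruskal–Wallis H statistic for the final objective values of several optimizers. It is a plain-Python illustration of the standard formula with average ranks and tie correction, not the authors' analysis code; the sample values and optimizer labels are hypothetical. Converting H to a p-value requires the chi-square distribution with (number of groups − 1) degrees of freedom, which is left out here.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Give every distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    start = 0
    while start < n_total:
        end = start
        while end < n_total and pooled[end] == pooled[start]:
            end += 1
        count = end - start
        rank_of[pooled[start]] = (start + 1 + end) / 2  # ranks start at 1
        tie_term += count ** 3 - count
        start = end

    weighted = 0.0
    for group in groups:
        rank_sum = sum(rank_of[value] for value in group)
        weighted += rank_sum ** 2 / len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # The correction equals 1 when there are no ties.
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical final objective values from three optimizers (lower is better).
results = {
    "optimizer_A": [1.0, 2.0, 3.0],
    "optimizer_B": [4.0, 5.0, 6.0],
    "optimizer_C": [7.0, 8.0, 9.0],
}
h_stat = kruskal_wallis_h(list(results.values()))
print(h_stat)
```

A rank-based test is a natural fit here because the 100 final objective values per optimizer need not be normally distributed.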
Fig. 1. ANN structure used in this comparative study
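The network in Fig. 1 can be sketched as a function of one flat weight vector, which is the form an evolutionary optimizer would search over. This is a minimal illustration under stated assumptions: the input dimension of 15, the tanh hidden activation, and the cross-entropy objective are not confirmed by the source, and all names are hypothetical.

```python
import math
import random

# 10 hidden neurons and 6 softmax outputs follow the text; 15 inputs is an assumption.
N_INPUT, N_HIDDEN, N_OUTPUT = 15, 10, 6
# Every neuron owns one weight per incoming signal plus one bias.
N_WEIGHTS = N_HIDDEN * (N_INPUT + 1) + N_OUTPUT * (N_HIDDEN + 1)


def forward(weights, x):
    """Class probabilities for input x, given a flat weight vector."""
    pos = 0
    hidden = []
    for _ in range(N_HIDDEN):
        w = weights[pos:pos + N_INPUT]
        b = weights[pos + N_INPUT]
        pos += N_INPUT + 1
        # tanh is an assumed hidden activation.
        hidden.append(math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b))
    logits = []
    for _ in range(N_OUTPUT):
        w = weights[pos:pos + N_HIDDEN]
        b = weights[pos + N_HIDDEN]
        pos += N_HIDDEN + 1
        logits.append(sum(wi * hi for wi, hi in zip(w, hidden)) + b)
    # Softmax, shifted by the largest logit for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def objective(weights, samples, labels):
    """Mean cross-entropy (an assumed loss) that a trainer would minimize."""
    loss = 0.0
    for x, label in zip(samples, labels):
        loss -= math.log(forward(weights, x)[label] + 1e-12)
    return loss / len(samples)


rng = random.Random(0)
candidate = [rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]
probs = forward(candidate, [0.5] * N_INPUT)
loss = objective(candidate, [[0.5] * N_INPUT], [2])
print(N_WEIGHTS, probs, loss)
```

Under these assumptions each candidate solution is a vector of 226 real values, so any continuous optimizer can train the network by minimizing `objective` without needing gradients.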
2 Agriculture Machinery

The case study for feature detection via neuro-evolutionary techniques is the prognostic condition evaluation and fault detection of agricultural machinery. The tractor is the backbone of the agricultural industry and is considered a prime mover. Tractors are classified as off-road and agricultural machinery and are widely used for various purposes in addition to agriculture. In India, the tractor industry is divided into four main categories: less than 30 HP, 31 HP to 40 HP, 41 HP to 50 HP, and greater than 50 HP. A tractor's main function is to operate as a prime mover and to supply the power required to operate the various types of implements from the drawbar, 3-point hitch rear PTOs, or front PTO. Most of the implements are draft implements operated through the hydraulic system. Therefore, the most important part of a tractor is its hydraulic system. Put simply, the hydraulic system of a tractor is an open-center system in which the drivetrain acts as a sump. The oil is drawn through the hydraulic filter into the hydraulic pump, which then distributes it evenly to the steering unit and the rockshaft unit (3-point hitch). The filter cleans the oil by blocking the impurities and debris present in it. The output of the steering unit is the inlet of the brakes. The pump supplies flow to the control valve, and the control valve controls the rockshaft cylinder. In the field, it is very common for small particles to contaminate the hydraulic system. The filter is used to separate these impurities from the hydraulic oil, but it has a pre-defined filtration efficiency, and 100% filtration is not possible. Although the filter is able to keep the particles out initially, once it is blocked, the bypass opens to avoid any pressure burst of the system
and the particles find their way into the hydraulic system, where they accumulate in the pumps, valves, drivetrain, and gears and degrade performance. The objective is to diagnose the choking stages using smartphones and to facilitate a timely change of the filter by characterizing the noise emitted by the pump at different levels of filter choking. The inlet and outlet pressures, oil flow, engine RPM, and pump noise will be collected and analyzed to identify the relation between filter choking and pump noise. The output of this analysis will help to notify the operator when a filter change is required.

2.1 System Context
Figure 2 shows the system context for monitoring and tracking tractor performance. The current system is capable of capturing real-time health and performance data of the tractor aggregates and transmitting it over a cellular network using a cellular modem. The use of sensors increases the complexity of the system, and it is not feasible to fit sensors for condition monitoring of all the aggregates. It is therefore proposed to use a cellphone to capture the noise of the aggregates on the tractor, to use an algorithm to detect their current health using the hardware available on the tractor, and to transmit the result over the network to the owner and dealer in order to avoid severe damage and machine downtime.

2.2 Scenarios
The operational scenarios of the problem are as follows:
1. Manipulation of onboard vehicle data or code: Code onboard the vehicle is manipulated remotely to cause unexpected vehicle behavior (e.g., applying the brakes, killing the engine, or depressing the accelerator).
2. Service technician accessibility: The service technician has access to the onboard telematics system through the OBD-II port, which is a mandated interface. A malicious agent playing the role of the service technician is a security threat to the system.
3. OEM software introduction: Regular updates are sent by the manufacturers to improve or enhance the products and services provided by telematics systems.

The use scenarios are as follows:
1. Insurance: Pay-for-usage insurance policies allow drivers to pay only for as much insurance as they need based on their driving usage. Insurance companies subscribe to the driver's telematics information to monitor time and hours of usage, driving location, speeds, etc.
2. OEM: Tractor manufacturers use tractor data to inform service needs, new software updates, new features and offers, etc.
Fig. 2. Average convergence curve of all algorithms for optimizing the ANN
2.3 Stakeholders and Needs
Figure 3 shows the stakeholder description [55] of all the major stakeholders together with their needs. The primary stakeholders are the operator, the tractor manufacturer, and the service technicians.
Fig. 3. Stakeholder analysis
Other stakeholders act as secondary stakeholders. The reduced scope mapped to system requirements is shown in Fig. 4. A tractor is an engineering vehicle specifically designed to deliver a high tractive effort (or torque) at slow speeds, for the purposes of pulling a trailer or machinery used in agriculture or construction. Most commonly, the term is used to describe a farm tractor
Fig. 4. Stakeholder analysis—reduced scope
that provides the power and traction to mechanize agricultural tasks, especially (and originally) primary and secondary tillage and, with the change in farming practices, a great variety of automated tasks. Agricultural implements may be towed behind or mounted on the tractor, and the tractor also acts as a prime mover if the implement requires hydraulic or mechanical power. Such an implement is operated using hydraulic power from the highly complex hydraulic system of the tractor. Since a tractor's main function relies on the hydraulic system, it is essential to keep this system in prime working condition. However, due to the severity of the operating conditions, this maintenance does not happen automatically and is carried out by the operator. As human interaction is involved, the accuracy and reliability of this maintenance are limited. With this in mind, the project was established to support the fault diagnostic proficiency of the operator.
3 Convergence Curve
As discussed before, each trainer yields a distribution of solutions, so we present the average convergence curve to show the performance of the proposed trainer. The curve in Fig. 5 shows the performance of the above-described algorithms. Here, the dark, highlighted blue line is for METO; it shows how fast METO converges to the global optimal solution compared with the other optimizers.
Fig. 5. Average convergence curve of all algorithms for optimizing the ANN
In this figure, the x-axis shows the number of ANNs trained to reach the optimal solution. Here, we have trained 100,000 ANNs through all optimizers to obtain the global optimal solution. The vertical axis shows the value of the objective function that we want to minimize while training the ANN.
Fig. 6. ROC curve for PGO trained ANN
The associated results are given in Table 1, where the performance of the trainers is reported in eight measures. The first is B, the best trained ANN achieved by the particular trainer. The second is M, the mean of the distribution achieved by each trainer from 100 individual runs. The third is Me, the median of the solution distribution. The fourth and fifth are Mo and MC, which are, respectively, the mode and the number of times Mo occurs among all solution attempts. The sixth is std, the standard deviation of the distribution, associated with M. The seventh is C, the consistency of the trainer, which reveals how many times the trainer achieves a solution below the threshold. The threshold considered in our simulation is 2.06E−03, which is the mean of METO. The last attribute is W, the worst performance of the trainer.
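The eight measures described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function name `summarize_runs`, the sample data, and the tie-breaking rule for the mode (the smallest of the most frequent values) are hypothetical choices, and "below the threshold" is read as strictly below.

```python
from collections import Counter
from statistics import mean, median, stdev


def summarize_runs(final_errors, threshold):
    """Summarize one trainer's final objective values over independent runs."""
    counts = Counter(final_errors)
    # Most frequent value; ties are broken toward the smaller value (assumption).
    mode_value, mode_count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Consistency: share of runs that end strictly below the threshold.
    hits = sum(1 for e in final_errors if e < threshold)
    return {
        "B": min(final_errors),        # best run
        "M": mean(final_errors),       # mean of the distribution
        "Me": median(final_errors),    # median of the distribution
        "Mo": mode_value,              # mode
        "MC": mode_count,              # how often the mode occurs
        "std": stdev(final_errors),    # sample standard deviation
        "C": 100.0 * hits / len(final_errors),  # consistency in percent
        "W": max(final_errors),        # worst run
    }


# Hypothetical run results, not the chapter's data.
runs = [0.5, 0.1, 0.3, 0.1, 0.9]
summary = summarize_runs(runs, threshold=0.35)
```

With every run value distinct, this tie-breaking rule makes Mo coincide with B and MC equal 1, which is one possible reading of the rows of Table 1 where Mo equals B.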
From the table, we can observe that METO is a highly consistent trainer and the best one for developing an intelligent and efficient ANN. Also, the validation accuracy it achieves is more than 70%. The associated ROC curves are shown in Fig. 6, which contains four ROC plots. The first is the training ROC, which shows that the METO trainer is very effective for Class 1, whereas its performance degrades for Class 5. The same can be observed in the validation ROC, the test ROC, and the ROC for all data. The matrix of true class versus predicted class is given in Fig. 7.
Fig. 7. Matrix for showing the distribution of output in true class and predicted class
Table 1. Comparative results of ANN for various training algorithms
Trainer  B         M         Me        Mo        MC  std       C (%)  W
PGO      1.11E−16  2.06E−03  1.05E−10  1.11E−16  2   3.66E−03  74     1.49E−02
BHGA     6.55E−02  1.33E−01  1.29E−01  6.55E−02  1   3.08E−02  0      2.19E−01
BBO      1.45E−01  2.60E−01  2.41E−01  1.45E−01  1   6.97E−02  0      4.17E−01
IWO      3.15E−01  4.16E−01  4.18E−01  3.15E−01  1   4.38E−02  0      4.95E−01
SFLA     2.49E−01  3.45E−01  3.44E−01  2.49E−01  1   4.58E−02  0      4.39E−01
TLBO     1.01E−01  3.41E−01  3.70E−01  1.01E−01  1   1.06E−01  0      4.84E−01
CS       3.57E−01  4.46E−01  4.51E−01  3.57E−01  1   2.32E−02  0      4.82E−01
NBA      1.59E−01  4.72E−01  4.77E−01  1.59E−01  1   8.99E−02  0      5.99E−01
GSA      6.51E−02  1.12E−01  1.07E−01  1.01E−01  3   2.41E−02  0      1.61E−01
CMAES    9.50E−02  1.59E−01  1.49E−01  9.50E−02  1   4.67E−02  0      2.64E−01
DE       2.72E−01  3.32E−01  3.34E−01  2.72E−01  1   2.11E−02  0      3.72E−01
FA       6.98E−01  7.73E−01  7.76E−01  6.98E−01  1   2.61E−02  0      8.13E−01
SLPSO    5.54E−01  6.40E−01  6.41E−01  5.54E−01  1   3.32E−02  0      7.08E−01
RSA      2.32E−01  3.15E−01  3.07E−01  2.32E−01  1   4.40E−02  0      4.19E−01
The matrix in Fig. 7 is supported by the matrices given in Figs. 8 and 9. In these, we can see the positive predictive value for all classes, where each class corresponds to one fault condition of the monitoring system. We can observe that it is very high for the normal operating condition of the hydraulic pump. Based on this, we can sort the performance of the monitoring system from high to low positive prediction capability as Class 1, Class 6, Class 3, Class 4, Class 2, and Class 5. The matrix in Fig. 7 also shows the analysis in more detail: the trained model gives 90% true positive results for Class 1, 79% for Class 6, and so on for the other classes.
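The per-class quantities discussed above (positive predictive value and true positive rate) follow directly from a confusion matrix. The sketch below is illustrative only: the function name `per_class_rates`, the row/column convention (rows are true classes, columns are predicted classes), and the three-class counts are assumptions, not the chapter's data.

```python
def per_class_rates(confusion):
    """confusion[t][p] = number of samples of true class t predicted as class p."""
    n = len(confusion)
    rates = []
    for c in range(n):
        true_pos = confusion[c][c]
        predicted_as_c = sum(confusion[t][c] for t in range(n))  # column sum
        actually_c = sum(confusion[c])                           # row sum
        # Positive predictive value: correct share of all predictions of class c.
        ppv = true_pos / predicted_as_c if predicted_as_c else 0.0
        # True positive rate: correct share of all samples that are class c.
        tpr = true_pos / actually_c if actually_c else 0.0
        rates.append({"class": c + 1, "ppv": ppv, "tpr": tpr})
    return rates


# Hypothetical 3-class counts, not the chapter's data.
example = [
    [9, 1, 0],
    [2, 6, 2],
    [1, 1, 8],
]
rates = per_class_rates(example)
# Classes sorted from high to low positive predictive value.
ranking = [r["class"] for r in sorted(rates, key=lambda r: r["ppv"], reverse=True)]
```

Sorting by `ppv` mirrors the class ordering given in the text, while `tpr` corresponds to the per-class true positive percentages.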
Fig. 8. Result of trained ANN for giving positive and false alarm
Statistically speaking, the ANN-MLP algorithm escapes the local extremes in the datasets while achieving the best classification accuracy, and thus the minimum MSLE. The METO algorithm explores the search space and finds diverse ANN structures during the optimization. In addition, the Flipper operator randomly spreads points across the search space, which is an important mechanism for avoiding local extremes. From the literature, we can see that an ANN needs a training algorithm that avoids local minima. Here, the results of METO show a strong exploration mechanism that avoids local minima in favor of the global solution. This is due to the solution changing in every evolution epoch for every training dataset. Based on the statistical results, METO appears to be a very effective trainer.
Fig. 9. Result of trained ANN for giving true positive and false negative rate
4 Kruskal–Wallis Statistical Analysis of the Results
We tested the distributions of all training algorithms for significant differences using the Kruskal–Wallis one-way ANOVA rank test [56, 57], an extension of the Mann–Whitney test to more than two groups. For the test, we take the null hypothesis, H0, to be “the distribution of METO is the same as the distribution of the other training algorithms,” where the distributions of all training algorithms come from independent experiments. The test discriminates between the distributions of the trainers based on the critical chi-square value, $\chi^2$, and the Kruskal–Wallis test (KWT) value. If the KWT value is smaller than $\chi^2$, H0 cannot be rejected; thus, to reject H0, the KWT value should be greater than $\chi^2$. For this procedure, the p-value is used to test for a significant difference between distributions at the 1% significance level. Table 2 provides the additional test results. The ANOVA results have six attributes. Sr represents the source of the variability. Based on the different types of variability, three types of sources are given. The first is Cl, representing groups; it is the variability due to the differences among the distribution means. The second is e, the error, which is the variability due to the differences between the data within each group and the group mean. This is also called the variability within the group.
The third is T, the total, which represents the total variability. SS is the sum of squares due to each Sr, and df is the number of degrees of freedom (DoF) associated with each Sr. For Cl, df is the DoF between the distributions/groups and is calculated as df = K − 1, where K = 13 is the number of trainers. For e, df is the DoF within the groups and is defined as df = N − K, where N = 650 is the number of observations. The total DoF is df = N − 1, which is equal to (N − K) + (K − 1). The next attribute, MS, is the mean square for each source, calculated as SS/df. The F-statistic is given for the Sr and is the ratio of the MS values. The last column of the table is the p-value, which is the probability that $\chi^2$ can take a value larger than the computed test-statistic value. ANOVA1 derives this probability from the cdf of the F-distribution. In the ANOVA table, $\chi^2$ and the p-value are the important quantities, whereas the other parameters described above support their calculation.
Table 2. Kruskal–Wallis ANOVA1 table
Sr  SS       df       MS       χ²       prob > χ²
Cl  2.6E+7   1.20E+1  1.72E+6  5.84E+2  2.77E−117
e   2.29E+6  6.37E+2  3.60E+3  –        –
T   2.29E+7  6.49E+2  –        –        –
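As a rough illustration of the test described above, the following sketch computes the Kruskal–Wallis statistic and its degrees of freedom for a small example. It is a minimal sketch, not the procedure used to produce Table 2: the function names and the sample groups are hypothetical, the tie correction is omitted, and the 1% critical value for two degrees of freedom (9.210) is taken from standard chi-square tables.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, df) for a list of samples; no tie correction is applied."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        weighted += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, len(groups) - 1  # df = K - 1


# Hypothetical samples for three trainers, not the chapter's data.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_value, dof = kruskal_wallis_h(groups)

CHI2_CRITICAL_1PCT_DF2 = 9.210  # standard table value for alpha = 0.01, df = 2
reject_h0 = h_value > CHI2_CRITICAL_1PCT_DF2
```

The same bookkeeping gives the df column of Table 2: with K = 13 and N = 650, K − 1 = 12, N − K = 637, and N − 1 = 649.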
Moreover, to show the significant difference between the distributions of solutions achieved by each trainer, notched box plots are shown in Figs. 7 and 8. Each notched box is associated with one optimizer and has two sections divided by a centerline, which is the median. The two end edges of each notched box, the bottom and the top, indicate the q1 = 25th and q3 = 75th percentiles, respectively. Outliers O in the distribution are plotted individually using the “+” symbol. In Figs. 7 and 8, we can observe that the METO results are free of any outlier. Also, there is a significant difference between METO and the other optimizers. We can observe that the notches of the METO box plot do not overlap the others, which shows that the true medians differ from the others at the 95% confidence level. Beyond the whisker length, the ith solution in the solution distribution is displayed as an outlier $O_i$ if $O_i > q_3 + w\,(q_3 - q_1)$ or $O_i < q_1 - w\,(q_3 - q_1)$, where w is the maximum whisker length. The horizontal axis numbers in each subplot represent the training algorithm number, where from 1 to 13 they are METO, BHGA, BBO, IWO, DE, CMAES, SFLA, FA, TLBO, CUCKOO, NBA, GSA, and SLPSO, respectively.
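The outlier rule and the notch comparison described above can be sketched as follows. This is an assumption-laden illustration rather than the plotting routine used for the figures: the names `box_summary` and `notches_overlap` and the sample data are hypothetical, w = 1.5 is assumed as the maximum whisker length, the quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, and the notch half-width 1.57 (q3 − q1)/√n is the common convention for notched box plots.

```python
from math import sqrt
from statistics import quantiles


def box_summary(values, w=1.5):
    """Quartiles, outliers, and notch interval for one solution distribution."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - w * iqr
    upper_fence = q3 + w * iqr
    half_notch = 1.57 * iqr / sqrt(len(values))  # assumed notch convention
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
        "notch": (med - half_notch, med + half_notch),
    }


def notches_overlap(a, b):
    """True if the two notch intervals share at least one point."""
    return a["notch"][0] <= b["notch"][1] and b["notch"][0] <= a["notch"][1]


# Hypothetical solution distributions, not the chapter's data.
sample_a = [1, 2, 3, 4, 5, 6, 7, 8, 100]
sample_b = [20, 21, 22, 23, 24, 25, 26, 27, 28]
box_a, box_b = box_summary(sample_a), box_summary(sample_b)
```

Non-overlapping notches, as in this example, are the visual cue the text relies on when it states that the medians differ.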
5 Conclusion
It is worth discussing the performance of the proposed ANN trainer relative to the others. Generally speaking, the Epimutation operation maintains diversity in the population species. This is an additional mechanism, along with the Flipper operation, to promote exploration. However, in the later phase of the search strategy, it causes the solution to diverge; thus, the F2 generation offspring bring the solution back toward the best solution and promote exploitation. We have discussed this in Chapter 5 as well. Moreover, the individual ANNs in the population that pass to the next evolution epoch are selected by elitism, where the ANN with the best fitness value is selected. Also, the F1 generation offspring explore the area between two parents. Therefore, the combination of all operators in METO avoids local optima and provides the best solution by smoothly balancing exploration and exploitation. Half of the iterations are devoted to exploration and the rest to exploitation.
Fig. 10. Genotype representation of weights and transfer functions (TF)
According to this comprehensive study, METO provides remarkable results and is highly recommended as an ANN trainer for developing an intelligent monitoring system. METO is efficient for very large datasets with many features, whereas problems with few features and small datasets can be solved by gradient-based training algorithms (GBTA) such as back-propagation. METO training is slower than GBTA but provides better results. We have also trained the ANN using other swarm and evolutionary algorithms and, in comparison, found that METO is the best of all. It avoids the very large number of local minima of the objective function, which render the other algorithms almost ineffective (Fig. 10).
Moreover, the accuracy of the ANN with optimized weights and biases is considerably high. Here, we also discuss the reason for the poor performance of the other training algorithms: it is their low exploration capability, whereas METO outperforms them through two exploration mechanisms, one in Epimutation and the other in Flipper.
References
1. Dey N (ed) (2017) Advancements in applied metaheuristic computing. IGI Global
2. Dey N, Ashour AS (2016) Antenna design and direction of arrival estimation in metaheuristic paradigm: a review. Int J Serv Sci Manag Eng Technol 7(3):1–18
3. Gupta N, Patel N, Tiwari BN, Khosravy M (2018) Genetic algorithm based on enhanced selection and log-scaled mutation technique. In: Proceedings of the future technologies conference, Springer, Cham, pp 730–748
4. Singh G, Gupta N, Khosravy M (2015) New crossover operators for real coded genetic algorithm (RCGA). In: 2015 international conference on intelligent informatics and biomedical sciences (ICIIBMS), IEEE, pp 135–140
5. Chatterjee S, Sarkar S, Hore S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Comput Appl 28(8):2005–2016
6. Jagatheesan K, Anand B, Samanta S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimisation-based parameters optimisation of PID controller for load frequency control of multi-area reheat thermal power systems. Int J Adv Intell Paradig 9(5–6):464–489
7. Chatterjee S, Hore S, Dey N, Chakraborty S, Ashour AS (2017) Dengue fever classification using gene expression data: a PSO based artificial neural network approach. In: Proceedings of the 5th international conference on frontiers in intelligent computing: theory and applications, Springer, Singapore, pp 331–341
8. Jagatheesan K, Anand B, Dey N, Gaber T, Hassanien AE, Kim TH (2015) A design of PI controller using stochastic particle swarm optimization in load frequency control of thermal power systems. In: 2015 fourth international conference on information science and industrial applications (ISI), IEEE, pp 25–32
9. Chakraborty S, Samanta S, Biswas D, Dey N, Chaudhuri SS (2013) Particle swarm optimization based parameter optimization technique in medical information hiding. In: 2013 IEEE international conference on computational intelligence and computing research, pp 1–6
10. Khosravy M, Gupta N, Patel N, Senjyu T, Duque CA (2020) Particle swarm optimization of morphological filters for electrocardiogram baseline drift estimation. In: Dey N, Ashour AS, Bhattacharyya S (eds) Applied nature-inspired computing: algorithms and case studies. Springer, Singapore, pp 1–21
11. Gutierrez CE, Alsharif MR, Khosravy M, Yamashita K, Miyagi H, Villa R (2014) Main large data set features detection by a linear predictor model. In: AIP conference proceedings, vol 1618, no 1, pp 733–737
12. Khosravy M, Gupta N, Marina N, Asharif MR, Asharif F, Sethi IK (2015) Blind components processing a novel approach to array signal processing: a research orientation. In: 2015 international conference on intelligent informatics and biomedical sciences (ICIIBMS), IEEE, pp 20–26
13. Khosravy M, Asharif MR, Sedaaghi MH (2008) Medical image noise suppression: using mediated morphology. IEICE Tech Rep, IEICE, pp 265–270
14. Khosravy M, Asharif MR, Yamashita K (2009) A PDF-matched short-term linear predictability approach to blind source separation. Int J Innov Comput Inform Control (IJICIC) 5(11):3677–3690
15. Khosravy M, Alsharif MR, Yamashita K (2009) A PDF-matched modification to Stone’s measure of predictability for blind source separation. In: International symposium on neural networks, Springer, Berlin, Heidelberg, pp 219–228
16. Khosravy M, Asharif MR, Yamashita K (2011) A theoretical discussion on the foundation of Stone’s blind source separation. SIViP 5(3):379–388
17. Khosravy M, Asharif M, Yamashita K (2008) A probabilistic short-length linear predictability approach to blind source separation. In: 23rd international technical conference on circuits/systems, computers and communications (ITC-CSCC 2008), Yamaguchi, Japan, pp 381–384
18. Khosravy M, Kakazu S, Alsharif MR, Yamashita K (2010) Multiuser data separation for short message service using ICA. IEICE Tech Rep, SIP (Signal Processing) 109(435):113–117
19. Gutierrez CE, Alsharif MR, Yamashita K, Khosravy M (2014) A tweets mining approach to detection of critical events characteristics using random forest. Int J Next-Gener Comput 5(2):167–176
20. Ashour AS, Samanta S, Dey N, Kausar N, Abdessalemkaraa WB, Hassanien AE (2015) Computed tomography image enhancement using cuckoo search: a log transform based approach. J Signal Inform Process 6(03):244
21. Khosravy M, Gupta N, Marina N, Sethi IK, Asharif MR (2017) Brain action inspired morphological image enhancement. Nature-inspired computing and optimization. Springer, Cham, pp 381–407
22. Dey N, Mukhopadhyay S, Das A, Chaudhuri SS (2012) Analysis of P-QRS-T components modified by blind watermarking technique within the electrocardiogram signal for authentication in wireless telecardiology using DWT. Int J Image Gr Signal Process 4(7):33
23. Dey N, Ashour AS, Shi F, Fong SJ, Sherratt RS (2017) Developing residential wireless sensor networks for ECG healthcare monitoring. IEEE Trans Consum Electron 63(4):442–449
24. Sedaaghi MH, Khosravi M (2003) Morphological ECG signal preprocessing with more efficient baseline drift removal. In: Proceedings of the 7th IASTED international conference, ASC, pp 205–209
25. Khosravi M, Sedaaghi MH (2004) Impulsive noise suppression of electrocardiogram signals with mediated morphological filters. In: The 11th Iranian conference on biomedical engineering, Tehran, Iran, pp 207–212
26. Khosravy M, Patel N, Gupta N, Sethi IK (2019) Image quality assessment: a review to full reference indexes. Recent trends in communication, computing, and electronics. Springer, Singapore, pp 279–288
27. Hore S, Chakraborty S, Chatterjee S, Dey N, Ashour AS, Van Chung L, Le DN (2016) An integrated interactive technique for image segmentation using stack based seeded region growing and thresholding. Int J Electric Comput Eng 6(6):2088–8708
28. Sedaaghi MH, Daj R, Khosravi M (2001) Mediated morphological filters. In: Proceedings 2001 international conference on image processing (Cat. No. 01CH37205), IEEE, vol 3, pp 692–695
29. Khosravy M, Gupta N, Marina N, Sethi IK, Asharif MR (2017) Morphological filters: an inspiration from natural geometrical erosion and dilation. Nature-inspired computing and optimization. Springer, Cham, pp 349–379
30. Khosravy M, Punkoska N, Asharif F, Asharif MR (2014) Acoustic OFDM data embedding by reversible Walsh-Hadamard transform. In: AIP conference proceedings, vol 1618, no 1, pp 720–723
31. Khosravy M, Alsharif MR, Guo B, Lin H, Yamashita K (2009) A robust and precise solution to permutation indeterminacy and complex scaling ambiguity in BSS-based blind MIMO-OFDM receiver. In: International conference on independent component analysis and signal separation, Springer, Berlin, Heidelberg, pp 670–677
32. Asharif F, Tamaki S, Alsharif MR, Ryu HG (2013) Performance improvement of constant modulus algorithm blind equalizer for 16 QAM modulation. Int Innov Comput Inform Control 7(4):1377–1384
33. Khosravy M, Alsharif MR, Yamashita K (2009) An efficient ICA based approach to multiuser detection in MIMO OFDM systems. Multi-carrier systems and solutions 2009. Springer, Dordrecht, pp 47–56
34. Khosravy M, Alsharif MR, Khosravi M, Yamashita K (2010) An optimum pre-filter for ICA based multi-input multi-output OFDM system. In: 2010 2nd international conference on education technology and computer, IEEE, vol 5, pp V5–129
35. Picorone AAM, Oliveira TR, Sampaio-Neto R, Khosravy M, Ribeiro MV (2020) Channel characterization of low voltage electric power distribution networks for PLC applications based on measurement campaign. Int J Electric Power Energy Syst 116:105554
36. Khosravy M, Gupta N, Marina N, Sethi IK, Asharif MR (2017) Perceptual adaptation of image based on Chevreul-Mach bands visual phenomenon. IEEE Signal Process Lett 24(5):594–598
37. Gupta S, Khosravy M, Gupta N, Darbari H (2019) In-field failure assessment of tractor hydraulic system operation via pseudospectrum of acoustic measurements. Turk J Electric Eng Comput Sci 27(4):2718–2729
38. Gupta N, Khosravy M, Patel N, Sethi IK (2018) Evolutionary optimization based on biological evolution in plants. Procedia Comput Sci 126:146–155
39. Gupta N, Khosravy M, Mahela OP, Patel N (2020) Plants biology inspired genetics algorithm: superior efficiency to firefly optimizer. In: Applications of firefly algorithm and its variants, from Springer tracts in nature-inspired computing (STNIC). Springer International Publishing (in press)
40. Gupta N, Khosravy M, Patel N, Senjyu T (2018) A bi-level evolutionary optimization for coordinated transmission expansion planning. IEEE Access 6:48455–48477
41. Simon D (2008) Biogeography-based optimization. IEEE Trans Evol Comput 12(6):702–713
42. Xing B, Gao WJ (2014) Invasive weed optimization algorithm. Innovative computational intelligence: a rough guide to 134 clever algorithms. Springer, Cham, pp 177–181
43. Eusuff M, Lansey K, Pasha F (2006) Shuffled frog-leaping algorithm: a memetic metaheuristic for discrete optimization. Eng Optim 38(2):129–154
44. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching learning-based optimization: a novel method for constrained mechanical design optimization problems. Comput-Aided Des 43(3):303–315
45. Dey N, Samanta S, Yang XS, Das A, Chaudhuri SS (2013) Optimisation of scaling factors in electrocardiogram signal watermarking using cuckoo search. Int J Bio-Inspir Comput 5(5):315–326
46. Moraes CA, De Oliveira EJ, Khosravy M, Oliveira LW, Honório LM, Pinto MF (2020) A hybrid bat-inspired algorithm for power transmission expansion planning on a practical Brazilian network. In: Dey N, Ashour AS, Bhattacharyya S (eds) Applied nature-inspired computing: algorithms and case studies. Springer, Singapore, pp 71–95
47. Satapathy SC, Raja NSM, Rajinikanth V, Ashour AS, Dey N (2018) Multi-level image thresholding using Otsu and chaotic bat algorithm. Neural Comput Appl 29(12):1285–1307
48. Rajinikanth V, Satapathy SC, Dey N, Fernandes SL, Manic KS (2019) Skin melanoma assessment using Kapur’s entropy and level set: a study with bat algorithm. In: Smart intelligent computing and applications. Springer, Singapore, pp 193–202
49. Rashedi E, Nezamabadi-Pour H, Saryazdi S (2009) GSA: a gravitational search algorithm. Inf Sci 179(13):2232–2248
50. Hansen N, Müller SD, Koumoutsakos P (2003) Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evol Comput 11(1):1–18
51. Islam SM, Das S, Ghosh S, Roy S, Suganthan PN (2011) An adaptive differential evolution algorithm with novel mutation and crossover strategies for global numerical optimization. IEEE Trans Syst Man Cybern Part B (Cybern) 42(2):482–500
52. Dey N, Samanta S, Chakraborty S, Das A, Chaudhuri SS, Suri JS (2014) Firefly algorithm for optimization of scaling factors during embedding of manifold medical information: an application in ophthalmology imaging. J Med Imaging Health Inform 4(3):384–394
53. Cheng R, Jin Y (2015) A social learning particle swarm optimization algorithm for scalable optimization. Inf Sci 291:43–60
54. Hrstka O, Kučerová A, Lepš M, Zeman J (2003) A competitive comparison of different types of evolutionary algorithms. Comput Struct 81(18–19):1979–1990
55. Händel P, Ohlsson M, Skog I, Ohlsson J, Movelo AB (2015) Determination of activity rate of portable electronic equipment. U.S. Patent Application 14/377,689
56. Siegel S (1988) The Kruskal-Wallis one-way analysis of variance by ranks. Nonparametric statistics for the behavioral sciences
57. Ostertagová E, Ostertag O, Kováč J (2014) Methodology and application of the Kruskal-Wallis test. Appl Mech Mater 611:115–120
Chapter 15
Application of Recent Metaheuristic Techniques for Optimizing Power Generation Plants with Wind Energy
F. F. Panoeiro, G. Rebello, V. A. Cabral, C. A. Moraes, I. C. da Silva Junior, L. W. Oliveira, and B. H. Dias
Department of Electrical Energy, Federal University of Juiz de Fora (UFJF), Juiz de Fora, Brazil
1 Introduction
With the growth of energy consumption and generating costs, as well as an increasing concern regarding environmental preservation and global warming, alternative energy sources such as wind energy have experienced exponential growth globally. According to the Global Wind Energy Council (GWEC), 51.3 GW of installed capacity was added in 2018, with a forecast increase of 300 GW in the next 5 years, coming especially from emerging markets and from offshore wind farms [1].
Wind energy can be generated in two ways: onshore and offshore. Onshore wind farms present lower installation costs than offshore wind farms, owing to the simpler structures that support the wind turbines on the ground and to the smaller electrical collector system, which reduces the final cost of the project. Offshore wind farms, besides avoiding impacts on people’s lives such as visual and noise pollution, present lower payback times and higher efficiency, since wind speeds are higher at sea [2, 3].
In terms of generation reliability, the location of the wind turbines in a wind farm is highly relevant, since it directly affects the extracted power and converted energy, both associated with intermittent wind regimes. The process of transforming the kinetic energy of the wind into electrical energy is not ideal. Besides the loss of kinetic energy when the wind interacts with the turbine, turbulence can be generated. Turbines located downstream have their conversion potential reduced and experience increased loads on their supporting structures. This phenomenon is known as the wake effect, and several studies have proposed models for it, since it is an important factor to be taken into account in the layout optimization process [4, 5]. Wind intermittency is the second factor to be considered, since it affects the amount of energy that can be extracted. The two main aspects related to the wind are its velocity and incidence direction [6, 7].
Several objective functions can be used to obtain the optimal wind turbine layout, such as (i) maximizing power conversion; (ii) maximizing net revenues; (iii) maximizing energy production; and/or (iv) minimizing cost per extracted power [6, 8–10].
© Springer Nature Singapore Pte Ltd. 2020 M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_15
Several optimization algorithms have been used to solve the offshore wind turbine optimal layout problem. The deployment of intelligent techniques such as metaheuristics based on evolutionary computation and swarm intelligence has been highlighted due to the combinatorial nature of the decision-making process, which involves continuous and discrete variables. The standard genetic algorithm (GA) and its variations are used in [8–13], while in [6, 7, 14] other evolutionary algorithms (EA) were implemented to solve the problem. The technique named artificial immune system (AIS) is used in [15]. In [16–18], the particle swarm optimization (PSO) method is applied to solve the problem.
1.1 Contributions
Following this line of research, the present chapter compares three optimization techniques, the bat algorithm (BA), the grey wolf optimizer (GWO), and the sine cosine algorithm (SCA), applied to the offshore wind farm layout optimization (OWFLO) problem. These techniques perform global and local searches efficiently and have been applied to many nonlinear optimization problems with a non-convex solution space and large dimensions, such as the OWFLO problem. In addition, a resource named chaotic map is applied to the optimization techniques in order to improve the local/global search stages of the metaheuristics.
1.2 Organization
This chapter is organized in the following manner: Sect. 2 presents the problem formulation, i.e., the objective function and problem constraints. The methodologies employed, corresponding to the optimization algorithms, and the chaotic map resource are described in Sect. 3. Section 4 presents the simulation results and analysis. The final observations and conclusion are presented in Sect. 5.
2 Problem Formulation
2.1 Wake Effect
In the OWFLO problem, it is necessary to consider the wind weakening effect. To do so, the ‘wake effect’ model proposed in [5] is used. This is the effect that an upstream turbine ‘j’ causes on a downstream turbine ‘i’; in other words, the downstream output power is reduced due to the variation in the mean wind speed caused by the upstream turbine [16]. The wind speed at a downstream wind turbine ‘i’, with respect to the variation caused by the upstream turbine ‘j’ ($u_{(i,j)}$), is obtained through Eqs. (1) and (2).

$$u_{(i,j)} = u_0 \left[ 1 - \frac{2a}{\left( 1 + \alpha \dfrac{x_{ij}}{r_i} \right)^{2}} \right] \qquad (1)$$

$$r_i = r_j \sqrt{\frac{1 - a}{1 - 2a}} \qquad (2)$$

where
$r_i$ is the effective radius of the rotor located downstream related to the upstream rotor $r_j$;
$x_{ij}$ is the distance between turbines i and j in the direction of wind incidence;
$\alpha$ is the wake expansion rate;
$a$ is the axial induction factor;
$u_0$ is the mean wind speed.
The axial induction factor is associated with the thrust coefficient ($C_T$), and the wake expansion rate is associated with the hub height of the wind turbine ($z$) and the surface roughness ($z_0$); they are obtained through Eqs. (3) and (4), respectively.

$$C_T = 4a(1 - a) \qquad (3)$$

$$\alpha = \frac{0.5}{\ln\left( z / z_0 \right)} \qquad (4)$$
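As a hedged illustration of Eqs. (1)–(4), the sketch below evaluates the speed behind a single upstream turbine. The function names and the numerical values (thrust coefficient 0.88, hub height 60 m, roughness 0.3 m, rotor radius 20 m, free-stream speed 12 m/s) are assumptions chosen for the example, not data from this chapter; the induction factor is taken as the root of Eq. (3) with a < 0.5.

```python
from math import log, sqrt


def axial_induction(c_t):
    """Invert C_T = 4a(1 - a), Eq. (3), for the root with a < 0.5."""
    return 0.5 * (1.0 - sqrt(1.0 - c_t))


def wake_expansion_rate(hub_height, roughness):
    """Eq. (4): alpha = 0.5 / ln(z / z0)."""
    return 0.5 / log(hub_height / roughness)


def downstream_speed(u0, x_ij, r_j, a, alpha):
    """Eqs. (1)-(2): speed at turbine i, a distance x_ij behind turbine j."""
    r_i = r_j * sqrt((1.0 - a) / (1.0 - 2.0 * a))  # Eq. (2)
    return u0 * (1.0 - 2.0 * a / (1.0 + alpha * x_ij / r_i) ** 2)  # Eq. (1)


# Illustrative values only (assumptions, not taken from the chapter).
a = axial_induction(0.88)
alpha = wake_expansion_rate(hub_height=60.0, roughness=0.3)
u_200 = downstream_speed(u0=12.0, x_ij=200.0, r_j=20.0, a=a, alpha=alpha)
u_400 = downstream_speed(u0=12.0, x_ij=400.0, r_j=20.0, a=a, alpha=alpha)
```

The speed deficit shrinks with distance, so `u_400` is closer to the free-stream speed than `u_200`; this is the behavior a layout optimizer exploits when spacing turbines along the wind direction.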
In the case of multiple interferences, i.e., when the downstream turbine is affected by more than one upstream turbine, the resulting wind speed $u_{res,i}$ is obtained by summing the kinetic energy weakening effects caused by the upstream turbines ‘j’, as shown in Eq. (5).

$$u_{res,i} = u_0 \left[ 1 - \sqrt{ \sum_{\substack{j = 1 \\ j \neq i}}^{N} \left( 1 - \frac{u_{ij}}{u_0} \right)^{2} } \right] \qquad (5)$$

2.2 General Formulation
The resolution of the OWFLO problem consists of determining, in the best possible way, the location of the wind turbines by evaluating parameters that indicate the performance of the configuration. In this work, the total wind power production and the annual costs of the project are used by the authors as the basis for evaluation. The formulation of the nonlinear optimal layout problem is represented by the Eqs. (6)–(9). In general, the objective function (OBF) of the optimal layout problem aims at minimizing the project costs and maximizing the wind farm extracted power [16].
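$$\min\ \mathrm{OBF} = \frac{N \left( \dfrac{2}{3} + \dfrac{1}{3}\, e^{-0.00174 N^{2}} \right)}{\sum_{k=0}^{360} \sum_{i=1}^{N} f_k\, P_i\left( u_{res,i} \right)} \qquad (6)$$

s.t.: $\dots$

$$P_i\left( u_{res,i} \right) = \begin{cases} \dots & \text{for } u_{res,i} \le 2.3\ \text{m/s} \\ \dots & \text{for } 2.3 < u_{res,i} \le 12.8\ \text{m/s} \\ \dots & \text{for } 12.8 \le u_{res,i} \le 18\ \text{m/s} \\ 0 & \text{for } u_{res,i} > 18\ \text{m/s} \end{cases} \qquad (10)$$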
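To make the evaluation of a candidate layout more concrete, the sketch below combines the resultant wind speed of Eq. (5) with the cost-per-power ratio of Eq. (6). It is only a sketch under stated assumptions: all function names are hypothetical, and `hypothetical_power_curve` is a placeholder cubic curve in which only the 2.3, 12.8, and 18 m/s break points come from the text; the coefficient 0.3 is invented for illustration.

```python
from math import exp, sqrt


def resultant_speed(u0, single_wake_speeds):
    """Eq. (5): combine the speeds u_ij seen behind each upstream turbine j."""
    deficit = sqrt(sum((1.0 - u_ij / u0) ** 2 for u_ij in single_wake_speeds))
    return u0 * (1.0 - deficit)


def annual_cost(n_turbines):
    """Numerator of Eq. (6): N (2/3 + 1/3 exp(-0.00174 N^2))."""
    return n_turbines * (2.0 / 3.0 + (1.0 / 3.0) * exp(-0.00174 * n_turbines ** 2))


def hypothetical_power_curve(u):
    """Placeholder curve (assumption): zero outside the operating range, cubic up to rated speed."""
    if u <= 2.3 or u > 18.0:
        return 0.0
    return 0.3 * min(u, 12.8) ** 3


def objective(frequencies, speeds_by_direction, power_curve):
    """Eq. (6): cost divided by the frequency-weighted total power.

    frequencies[k] is f_k and speeds_by_direction[k][i] is u_res,i for direction k.
    """
    n_turbines = len(speeds_by_direction[0])
    total_power = sum(
        f_k * sum(power_curve(u) for u in speeds)
        for f_k, speeds in zip(frequencies, speeds_by_direction)
    )
    return annual_cost(n_turbines) / total_power


# Two turbines, one wind direction: the second turbine sits in the wake of the first.
u_free = 12.0
u_second = resultant_speed(u_free, [9.0])
obf = objective([1.0], [[u_free, u_second]], hypothetical_power_curve)
```

A metaheuristic would call a function like `objective` once per candidate layout, after recomputing the wake speeds for every wind direction.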
3 Metaheuristics
Metaheuristic algorithms are used in applications across the most diverse domains [19, 20]. The stochastic optimization methods known as metaheuristics are the methodologies applied here to solve the OWFLO problem. These techniques are applied in several areas of knowledge because they are easy to implement, adapt flexibly to different problems, and do not require derivatives. In general, they are widely used in nonlinear problems with a non-convex solution space (with local maxima and minima) and large dimensions, such as the OWFLO problem. Many metaheuristic implementations can be found in the literature. Genetic algorithms (GA) apply the principles of Darwinian evolution to solving optimization problems [21, 22]. As an optimization technique, GA has itself evolved, resulting in several variants. A recent variant applies Mendelian evolution to multiple species, inspired by plant biology [23], incorporating the use of double-strand DNA for evolution.
Currently, the metaheuristics are a widely popular approach to solve combinatorial problems. This has been reflected in the application such as transmission expansion planning (TEP), which is solved by a bi-level evolutionary optimization [24] and a hybrid bat-inspired algorithm [25]. Particle swarm optimization (PSO) is used to investigate a faster parameterization technique in the one-stage morphological ECG baseline estimation [26] and also is used to do the load frequency control (LFC) of multi-area reheat thermal power system with proportional–integral–derivative (PID) controller [27]. PSO in combination with trained neural network is an approach into structural failure prediction of multistoried RC buildings [28]. Chaotic bat algorithm is applied in complex image processing such as automatic target recognition [29]. Due to this scenario of metaheuristic applications, in the specialized literature, there are different classifications regarding metaheuristics techniques: (a) algorithm’s inspiration (biology, physics, chemistry, among others); (b) number of solutions involved in the optimization process (individuals or populations) [30]. Among these last two classifications immediately cited, it can be observed recent applications of bio-inspired techniques in swarm intelligence to solve the OWFLO problem. Such techniques are based on the collective behavior of some species and, normally, have fewer parameters/operators to be adjusted than evolutive techniques (crossover, mutation, elitism, among others). The population-based techniques normally use a set of random initial solutions that are improved over the course of iterations (searching process). These techniques are efficient since the candidate solutions share information regarding the search space allowing a more effective exploration toward the optimal solution of the problem. 
In the solution searching process, each technique has its own characteristics/operators that allow a search throughout the entire search space (global search) or around promising regions (local search) [31]. In the present work, the authors chose to investigate three population-based techniques: two of them are bio-inspired swarm intelligence algorithms and the third is based on mathematical formulations. The goal is to determine the most efficient technique to solve the OWFLO problem. To do so, the bat algorithm, the grey wolf optimizer, and the sine cosine algorithm were implemented. In addition, a chaotic map is applied to the optimization techniques to improve the global/local search process of the methods.

3.1 Bat Algorithm (BA)
The bat algorithm (BA), proposed in [32], is a bio-inspired optimization technique based on the echolocation of bats. Bats are able to locate obstacles and prey/food through the emission and capture of ultrasonic waves, identifying distances by measuring the time the wave takes to return in the form of an echo. This biological capability is named echolocation and is widely used by bats and other animals with nocturnal habits. Figure 1 depicts the pseudo-code of the bio-inspired BA. In the bat algorithm, the number of bats $(n)$, the amplitude decay rate $(\alpha)$, and the pulse emission increase rate $(\gamma)$ are defined. The positions $(X_i)$ and their individual
F. F. Panoeiro et al.
Fig. 1. Algorithm 1—bat algorithm pseudo-code
parameters such as velocity $(V_i)$, frequency $(fr_i)$, pulse emitting rate $(r_i)$, and amplitude $(A_i)$ are randomly initialized respecting the solution region boundaries. The bats' positions represent the solutions to the problem under analysis, which are evaluated and ranked according to the objective function value. The best position $(X_*)$ is associated with the best bat in the population. Then, the iterative process runs until a stopping criterion is met, for example, a maximum number of iterations or the stagnation of the best solution. At each iteration $(t)$, the frequency $(fr_i)$, velocity $(V_i^{t+1})$, and position $(X_i^{t+1})$ of bat $i$ are updated using Eqs. (11)–(13).

$$fr_i = fr_{\min} + (fr_{\max} - fr_{\min})\,\beta \tag{11}$$

$$V_i^{t+1} = V_i^{t} + (X_i^{t} - X_*^{t})\,fr_i \tag{12}$$

$$X_i^{t+1} = X_i^{t} + V_i^{t+1} \tag{13}$$
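The update in Eqs. (11)–(13) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `bat_move`, the list-based representation of a bat, and the `rng` argument are hypothetical, and the default frequency bounds are the values listed in Table 1.

```python
import random


def bat_move(x, v, x_best, fr_min=0.0, fr_max=1.0, rng=random):
    """Apply Eqs. (11)-(13) once to a single bat (hypothetical helper).

    x, v   : current position and velocity of the bat (lists of floats)
    x_best : position of the best bat in the population
    """
    beta = rng.random()                      # random number in [0, 1)
    fr = fr_min + (fr_max - fr_min) * beta   # Eq. (11): frequency
    # Eq. (12): the velocity changes with the distance to the best bat
    v_new = [vj + (xj - bj) * fr for vj, xj, bj in zip(v, x, x_best)]
    # Eq. (13): the position moves by the updated velocity
    x_new = [xj + vj for xj, vj in zip(x, v_new)]
    return x_new, v_new, fr
```

A bat that already sits on the best position keeps its velocity unchanged, because the term $(X_i^{t} - X_*^{t})$ in Eq. (12) vanishes.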
where the frequency of bat $i$ lies between the assigned minimum $(fr_{\min})$ and maximum $(fr_{\max})$ values and $\beta$ is a random number in [0, 1]. After the bats' positions are updated, the local search stage begins, in which the pulse emitting rate $(r_i)$ is compared to a random value in [0, 1]. If the condition $rand > r_i$ is met, Eq. (14) is used to generate a new local solution $X_i^{t+1}$ for the bat from the best bat $X_*^{t}$, the mean pulse amplitude $\mathrm{mean}(A^{t})$, and $\varepsilon$, a random vector in [−1, 1] with the same dimension as the bat.

$$X_i^{t+1} = X_*^{t} + \varepsilon\,\mathrm{mean}(A^{t}) \tag{14}$$
In order to avoid violating the search space boundaries, the bat's position is checked against the established minimum and maximum boundaries. The bats are evaluated, and the global search stage begins. At this stage, two conditions are analyzed: (1) whether the value of the objective function is smaller than at the previous iteration, $f(X_i^{t+1}) < f(X_i^{t})$, and (2) whether a random number is smaller than the pulse amplitude, $rand < A_i^{t}$. If both conditions are met, the pulse emitting rate $r_i^{t+1}$ and the pulse amplitude $A_i^{t+1}$ of the bat $X_i^{t+1}$ are updated according to Eqs. (15) and (16).

$$r_i^{t+1} = r_i^{0}\,\left[1 - e^{-\gamma t}\right] \tag{15}$$

$$A_i^{t+1} = \alpha\,A_i^{t} \tag{16}$$
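A minimal sketch of the local walk of Eq. (14), the boundary check, and the updates of Eqs. (15) and (16) is given below. The helper names are hypothetical, and the defaults `alpha=0.9` and `gamma=0.85` assume that the amplitude decay rate and the pulse emission increase rate of Table 1 correspond to $\alpha$ and $\gamma$ in the equations.

```python
import math
import random


def local_walk(x_best, amplitudes, rng=random):
    """Eq. (14): random step around the best bat, scaled by the mean amplitude."""
    mean_a = sum(amplitudes) / len(amplitudes)
    return [bj + rng.uniform(-1.0, 1.0) * mean_a for bj in x_best]


def clip(x, lower, upper):
    """Push every component back inside [lower, upper]."""
    return [min(max(xj, lower), upper) for xj in x]


def update_rate_and_amplitude(r0, amp, t, alpha=0.9, gamma=0.85):
    """Eqs. (15)-(16): the pulse rate grows toward r0 while the amplitude decays."""
    r_new = r0 * (1.0 - math.exp(-gamma * t))   # Eq. (15)
    a_new = alpha * amp                         # Eq. (16)
    return r_new, a_new
```

With these defaults the amplitude shrinks by 10% every time a bat's new solution is accepted, so the local walk of Eq. (14) takes progressively smaller steps.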
At last, the position of the best bat $(X_*)$ is updated. It is interesting to notice that during the searching process the pulse amplitude $(A_i)$ decreases and the pulse emitting rate $(r_i)$ increases, tending to the maximum initial value considered $(r_i^{0})$. Therefore, in the first iterations the global search mechanism happens more frequently, but toward the end of the process the condition is rarely met because of the decrease in the pulse amplitude, leading to a more thorough local search, since the mean amplitude tends to zero.

3.2 Grey Wolf Optimizer (GWO)
The bio-inspired optimization technique named grey wolf optimizer (GWO) was proposed in [31] and is based on the social hierarchy and hunting behavior of grey wolves. The grey wolves' hierarchy is divided into alphas, betas, deltas, and omegas, in decreasing order of dominance. The alpha wolf, also called the dominant wolf, is mainly responsible for making decisions, e.g., time to hunt, places to sleep, and wake-up times. The beta wolf is the alpha's right hand and helps in the decision-making process; it is also the main candidate to replace the alpha in the future. The delta wolves belong to the categories of scouts, sentinels, elders, hunters, and caretakers. Finally, the omega wolves compose the lowest level of the hierarchy and play the role of scapegoats. The wolves' hunting strategy is divided into three steps: (i) tracking, chasing, and approaching the prey; (ii) chasing and encircling the prey until it stops moving; (iii) attacking the prey [31].
Figure 2 depicts the GWO pseudo-code.
Fig. 2. Algorithm 2—grey wolf optimizer pseudo-code
In the GWO algorithm, the size of the wolf population $(\eta)$ is defined; the wolves' positions $(X_i)$, which are randomly initialized, represent the solutions investigated within the solution region. The solutions are evaluated through the objective function and the hierarchy of the wolf pack is defined, with the three best solutions of the set giving rise to the alpha $(X_\alpha)$, beta $(X_\beta)$, and delta $(X_\delta)$ wolves, respectively, in order of superiority. After the initialization, the iterative process of the algorithm starts, i.e., the hunting step that updates the position of every wolf in the pack. To perform this update, the search coefficients $\vec{A}$ and $\vec{C}$, defined by Eqs. (17) and (18), are calculated, where $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0, 1] and $a^{t}$ is the exploration coefficient, which decreases linearly over the course of the iterations $(t)$.

$$\vec{A} = 2\,a^{t}\,\vec{r}_1 - a^{t} \tag{17}$$

$$\vec{C} = 2\,\vec{r}_2 \tag{18}$$
These coefficients provide the local or global search during the wolves' update stage; i.e., for $|\vec{A}| < 1$, the wolves are forced to attack the prey (local search refinement), and for $|\vec{A}| > 1$, the wolves search for better prey in the search space (diverge from the best
Fig. 3. a Local/global search behavior. b Wolf’s estimate position for the GWO algorithm
solution), according to Fig. 3a. As for the vector $\vec{C}$, values $|\vec{C}| < 1$ or $|\vec{C}| > 1$ attenuate or increase, respectively, the magnitude of the best solution in the searching mechanism. Then, the wolves' positions are updated. Equations (19) and (20) model the encircling behavior during the hunting stage.

$$\vec{D}_i^{t} = \left|\vec{C}\cdot\vec{X}_p^{t} - \vec{X}_i^{t}\right| \tag{19}$$

$$\vec{X}_i^{t+1} = \vec{X}_p^{t} - \vec{A}\cdot\vec{D}_i^{t} \tag{20}$$
where $t$ denotes the current iteration of the convergence process, $\vec{X}_p^{t}$ is the prey's position, and $\vec{X}_i^{t}$ is the wolf's position. It is supposed that the location of the prey is
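not known and that the alpha, beta, and delta wolves, which represent the best solutions in the pack, have better knowledge of the prey's position during the hunting stage, according to Fig. 3b. Thus, the estimated position $\vec{X}_i^{t+1}$, used to update the wolf's current position, is given by the mean displacement with respect to the positions of the dominant wolves, defined by Eqs. (21), (22), and (23).

$$\vec{D}_\alpha = \left|\vec{C}_1\cdot\vec{X}_\alpha^{t} - \vec{X}_i^{t}\right|;\quad \vec{D}_\beta = \left|\vec{C}_2\cdot\vec{X}_\beta^{t} - \vec{X}_i^{t}\right|;\quad \vec{D}_\delta = \left|\vec{C}_3\cdot\vec{X}_\delta^{t} - \vec{X}_i^{t}\right| \tag{21}$$

$$\vec{X}_1 = \vec{X}_\alpha^{t} - \vec{A}_1\cdot\vec{D}_\alpha;\quad \vec{X}_2 = \vec{X}_\beta^{t} - \vec{A}_2\cdot\vec{D}_\beta;\quad \vec{X}_3 = \vec{X}_\delta^{t} - \vec{A}_3\cdot\vec{D}_\delta \tag{22}$$

$$\vec{X}_i^{t+1} = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \tag{23}$$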
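The position update of Eqs. (17)–(23) for a single wolf can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `gwo_update` is hypothetical, the random coefficients are drawn independently for every dimension and every leader, and Eq. (17) is written in the form $a(2r_1 - 1)$ used in the standard GWO formulation of [31].

```python
import random


def gwo_update(x, leaders, a, rng=random):
    """Return the new position of one wolf (hypothetical helper).

    x       : current position of the wolf (list of floats)
    leaders : (x_alpha, x_beta, x_delta), the three best positions
    a       : exploration coefficient at the current iteration
    """
    estimates = []
    for leader in leaders:
        estimate = []
        for xj, lj in zip(x, leader):
            A = a * (2.0 * rng.random() - 1.0)   # Eq. (17)
            C = 2.0 * rng.random()               # Eq. (18)
            D = abs(C * lj - xj)                 # Eq. (21)
            estimate.append(lj - A * D)          # Eq. (22)
        estimates.append(estimate)
    # Eq. (23): mean of the three estimates, component by component
    return [sum(values) / 3.0 for values in zip(*estimates)]
```

When `a` reaches zero at the end of the run, every estimate collapses onto its leader and the wolf moves to the centroid of the alpha, beta, and delta positions.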
After the update, it is verified whether there are any violations of the search space boundaries, and the solution is evaluated through the objective function. Finally, the wolves' hierarchy is updated by replacing the alpha, beta, or delta wolf with the current solution if an improvement is observed.

3.3 Sine Cosine Algorithm (SCA)
The optimization technique known as sine cosine algorithm (SCA) was proposed in [30]. This optimization tool is based on the mathematical functions sine and cosine, in which the local and global search components are made according to random adaptive variables integrated to the algorithm. Figure 4 depicts the SCA pseudo-code.
Fig. 4. Algorithm 3—sine cosine algorithm pseudo-code
In the SCA algorithm, the total number of candidate solutions $(\eta)$ is defined. The positions $(X_i)$ of these solutions are randomly dispersed in the solution region and evaluated through the objective function, defining the best solution in the set $(P_*)$. Then, the algorithm's iterative process starts, in which the search for the optimal solution is performed by moving the current solution $X_i^{t}$ toward or away from the best solution $(P_*)$. Based on this distance, the updated position $X_i^{t+1}$ is given by Eq. (24), in which the random value in [0, 1] of the parameter $r_4$ determines the choice of the sine or cosine component.

$$X_i^{t+1} = \begin{cases} X_i^{t} + r_1^{t}\,\sin(r_2)\,\left|r_3 P_* - X_i^{t}\right|, & r_4 < 0.5 \\ X_i^{t} + r_1^{t}\,\cos(r_2)\,\left|r_3 P_* - X_i^{t}\right|, & r_4 \ge 0.5 \end{cases} \tag{24}$$

With:

$$r_1^{t} = a - t\,\frac{a}{T_{\max}} \tag{25}$$
Parameter $r_1^{t}$ is a function of the maximum number of iterations $(T_{\max})$, the current iteration $(t)$, and an exploration coefficient $(a)$ that defines the solution searching region; i.e., when $r_1^{t} < 1$, the updated position is expected to lie in the region between the current solution $X_i^{t}$ and the best solution $(P_*)$, and outside this region when $r_1^{t} > 1$, as shown in Fig. 5.
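Equations (24) and (25) can be sketched as below. The names `sca_r1` and `sca_update` are hypothetical, the absolute value around $r_3 P_* - X_i^{t}$ follows the SCA formulation of [30], and $r_2$ and $r_3$ are assumed to be drawn uniformly from $[0, 2\pi]$ and $[0, 2]$, the ranges stated in this section.

```python
import math
import random


def sca_r1(t, t_max, a=1.0):
    """Eq. (25): linearly decreasing exploration parameter."""
    return a - t * a / t_max


def sca_update(x, p_best, t, t_max, a=1.0, rng=random):
    """Eq. (24): move one solution relative to the best solution p_best."""
    r1 = sca_r1(t, t_max, a)
    x_new = []
    for xj, pj in zip(x, p_best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)   # step direction/extent
        r3 = rng.uniform(0.0, 2.0)             # weight of the best solution
        r4 = rng.random()                      # sine/cosine switch
        trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        x_new.append(xj + r1 * trig * abs(r3 * pj - xj))
    return x_new
```

With `a = 1` (the value in Table 1) and `t_max = 300`, `sca_r1` falls from 1 at the first iteration to 0 at the last one, so the final iterations leave the solutions almost unchanged.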
Fig. 5. Local/global search behavior of SCA
Since this parameter decreases over the course of the iterations, the global search is more likely to be performed at the beginning of the process and the local search at the end, refining local solutions. Parameter $r_2$ is related to the extent of the step toward or away from the best solution $(P_*)$. As for parameter $r_3$, values $r_3 < 1$ or $r_3 > 1$ respectively attenuate or increase the magnitude of the best solution,
with $r_2$ and $r_3$ drawn from $[0, 2\pi]$ and $[0, 2]$, respectively. Then, it is verified whether there are any violations of the search space boundaries, and the solution $X_i^{t+1}$ is evaluated. At last, the position of the best solution $(P_*)$ is updated.

3.4 Adaptations and Chaotic Map
The randomly generated initial solutions $(X_i)$ in the methods are binary, with '1' and '0' indicating the presence or absence of a wind turbine in the layout. During the solution update process $(X_i^{t+1})$, continuous random variables are applied. The updated solution is therefore rounded, $\mathrm{round}(X_i^{t+1})$, and possible violations of the search space boundaries are handled through Eq. (26).

$$X_i^{t+1} = \begin{cases} 1 & \text{if } X_i^{t+1} \ge 0.5 \\ 0 & \text{if } X_i^{t+1} < 0.5 \end{cases} \tag{26}$$
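The rounding rule of Eq. (26) amounts to a threshold at 0.5, as in the short sketch below (the function name `binarize` is hypothetical). Values that left the [0, 1] interval during the continuous update are mapped back to a valid bit by the same test.

```python
def binarize(x):
    """Eq. (26): 1 if the component is at least 0.5, otherwise 0."""
    return [1 if xj >= 0.5 else 0 for xj in x]


# Example: a continuous update result mapped back to a turbine layout
layout = binarize([0.49, 0.5, 1.7, -0.3])
number_of_turbines = sum(layout)
```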
In the specialized literature, different versions of the described optimization techniques are found, aiming at improving the local/global search components of the algorithms. In this context, one possible approach is to insert a chaotic model in place of the searching parameters of the optimization methods. A chaotic system can be understood as a random generator obtained from deterministic systems whose dynamics are semi-stochastic, ergodic, and very sensitive to initial conditions [33]. These deterministic systems are based upon a mathematical relation and are known as chaotic maps. According to [34], the resulting dispersion of random numbers is better, allowing solutions to take large steps to escape from local minima or short steps to refine the local search. In the present work, the authors use the sinusoidal map, represented by Eq. (27) [35].

$$x_{k+1} = a\,x_k^{2}\,\sin(\pi x_k) \tag{27}$$
where a = 2.3 and the chaotic numbers are generated in [0, 1]. In the BA optimization method, this chaotic map is applied in the adjustment of the pulse amplitude $(A_i)$ and of the frequency random variable $(\beta)$. Equations (28) and (29) depict the frequency and pulse amplitude including the chaotic map.

$$fr_i = fr_{\min} + (fr_{\max} - fr_{\min})\,a\,(x_i^{t})^{2}\,\sin(\pi x_i^{t}) \tag{28}$$

$$A_i^{t+1} = a\,(x_i^{t})^{2}\,\sin(\pi x_i^{t}) \tag{29}$$
For the GWO and SCA optimization methods, the chaotic map is applied to substitute the exploration coefficient (a) of the searching coefficient (A) and ðr1 Þ, respectively.
Equations (30) and (31) depict the novel parameters.

$$\vec{A} = a\,(x^{t})^{2}\,\sin(\pi x^{t})\,(2\,\vec{r}_1 - 1) \tag{30}$$

$$r_1^{t} = a\,(x^{t})^{2}\,\sin(\pi x^{t}) \tag{31}$$
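To make the difference between the two schedules concrete, the sketch below generates the sinusoidal map of Eq. (27) with a = 2.3 alongside a linearly decreasing coefficient. The function names and the starting value `x0 = 0.7` are assumptions made only for illustration; the chapter does not state the initial value of the map.

```python
import math


def sinusoidal_map(x0, n, a=2.3):
    """Return n successive values of Eq. (27), starting from x0."""
    xs = [x0]
    for _ in range(n - 1):
        x = xs[-1]
        xs.append(a * x * x * math.sin(math.pi * x))
    return xs


def linear_coefficient(t, t_max, a=1.0):
    """Linearly decreasing exploration coefficient of the basic GWO/SCA."""
    return a - t * a / t_max


chaotic = sinusoidal_map(0.7, 200)
linear = [linear_coefficient(t, 200) for t in range(200)]
```

The linear schedule decays monotonically toward zero, whereas the chaotic values keep jumping inside [0, 1] for the whole run, so large and small steps remain possible at any iteration.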
Parameters $\vec{A}$ and $r_1^{t}$ determine the global/local searching steps throughout the search space, as previously mentioned. This dispersion caused by the chaotic map aims at improving the exploration during these steps. Figure 6 depicts a comparison between the linear and chaotic exploration coefficients over 200 cycles/iterations.

[Plot: exploration coefficient (0 to 1) versus iterations (0 to 200); curves: GWO/SCA (linear) and sinusoidal map]
Fig. 6. Exploration coefficients of GWO and SCA. Linear and chaotic models
4 Results and Discussion

To evaluate the methodologies investigated in this chapter, simulations were performed for a well-known case study from the literature. The wind farm terrain measures 2 km × 2 km, totaling an area of 4 million m². This region was divided into 100 bits (cells) spaced 200 m apart, equivalent to five times the rotor diameter (5RD) [16]. The optimization process is binary, as previously mentioned, and the decision variable is a vector of 100 positions. The stopping criterion applied to all techniques is the maximum number of iterations. The simulations were performed with the following configurations:

• Number of initial solutions: 100, 500, and 1000;
• Maximum number of iterations: 300;
• Wind incidence scenarios: (I) north–south (0°) and (II) multiple directions (0°, 45°, 90°, 135°, 180°, 225°, 270° and 315°).

For each configuration, 40 simulations were performed starting from the same random initial solution for all optimization techniques. The adopted parameters for the optimization methods are shown in Table 1.
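The binary encoding described above can be illustrated with a short sketch that maps a 100-bit vector to turbine coordinates. The row-major ordering of the bits and the placement of each turbine at the centre of its 200 m cell are assumptions made for this example; the chapter does not specify them, and `decode_layout` is a hypothetical name.

```python
def decode_layout(bits, cells_per_side=10, spacing_m=200.0):
    """Map a binary layout vector to (x, y) cell-centre coordinates in metres."""
    coords = []
    for k, bit in enumerate(bits):
        if bit:
            row, col = divmod(k, cells_per_side)
            coords.append(((col + 0.5) * spacing_m, (row + 0.5) * spacing_m))
    return coords


rotor_diameter_m = 200.0 / 5        # 200 m spacing equals 5 rotor diameters
candidate_layouts = 2 ** 100        # size of the binary search space

example = [0] * 100
example[0] = 1                      # turbine in the first cell
example[99] = 1                     # turbine in the last cell
turbines = decode_layout(example)
```

The 2^100 candidate layouts (about 1.27 × 10^30) make exhaustive enumeration impractical, which is the motivation for the population-based search.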
Table 1. Optimization techniques' parameters

| Optimization technique | Parameter | Value | Description |
|---|---|---|---|
| BA | fr_min | 0 | Minimum frequency |
| BA | fr_max | 1 | Maximum frequency |
| BA | k | 0.85 | Pulse emission increase rate |
| BA | α | 0.9 | Amplitude decay rate |
| GWO | a | 1 | Exploration coefficient |
| SCA | a | 1 | Exploration coefficient |

4.1 Case (I): North–South
Case (I) is used to validate the methodology against the specialized literature. In this case, a mean wind speed (u_0) of 12 m/s at direction k = 0º, corresponding to the north–south direction, is considered. Since the incidence direction does not vary, the wind probability density (f_k) is equal to 1, and no weighting is applied in the calculation of the extracted power. The first analysis concerns the optimal values of the objective function obtained in the forty executions of each proposed configuration. Figure 7 depicts the results obtained for the objective function, which represents the project cost per unit of extracted output power, for the forty simulations performed in this study.

From Fig. 7a, it can be observed that, among the basic methodologies, the GWO presented the smallest dispersion of results, disregarding the discrepant values (+), named outliers. However, the SCA obtained the lowest mean, median, and minimum, indicating that this methodology found better layout configurations by achieving a lower cost per extracted power. With the introduction of the chaotic maps in the adjustment of the algorithms' parameters, the dispersion of the results differs from that of the basic methodologies. The CSCA presented the lowest dispersion but, unlike the SCA, produced some outliers (+). It is interesting to mention that, among all methodologies, the CBA was the only one capable of obtaining the minimum solution of 0.0015436 $/kW and that the CGWO obtained the lowest mean and median values. With an increase in the number of initial solutions, Fig. 7b and c show a smaller dispersion of results, which is expected, since a greater number of solutions investigate the search space.
Thus, the quality of the solutions improves, with all methodologies reaching the solution of 0.0015436 $/kW for the configuration with 1000 initial solutions. With 500 solutions, however, the BA was not able to reach the optimal value. Fig. 7b shows that the dispersion of the solutions is similar for all methodologies, but the SCA and CGWO are more likely to yield better solutions over repeated simulations, since these techniques obtained lower median values. For the configuration represented in Fig. 7c, the median of the CGWO results is at the minimum value, indicating that at least half of the simulations reached the optimal result. In general, the dispersion, median, and mean values of the CGWO solutions were better than those of the other investigated techniques,
Fig. 7. Solution dispersion for the configurations with a 100, b 500, and c 1000 solutions for Case (I)
leading to the belief that this is the most reliable technique to solve the OWFLO problem, in the sense of knowing what to expect from it. Table 2 depicts the minimum, median, and mean values obtained from the simulations for all optimization techniques for Case (I), highlighting the best results of each configuration.
Table 2. Statistical data for Case (I)

| Optimization technique | Number of solutions | Minimum ($/kW) | Median ($/kW) | Mean ($/kW) |
|---|---|---|---|---|
| BA | 100 | 0.0015489 | 0.0015540 | 0.0015541 |
| BA | 500 | 0.0015439 | 0.0015476 | 0.0015471 |
| BA | 1000 | 0.0015436 | 0.0015451 | 0.0015457 |
| GWO | 100 | 0.0015469 | 0.0015513 | 0.0015519 |
| GWO | 500 | 0.0015436 | 0.0015466 | 0.0015459 |
| GWO | 1000 | 0.0015436 | 0.0015440 | 0.0015444 |
| SCA | 100 | 0.0015454 | 0.0015509 | 0.0015513 |
| SCA | 500 | 0.0015436 | 0.0015451 | 0.0015455 |
| SCA | 1000 | 0.0015436 | 0.0015451 | 0.0015447 |
| CBA | 100 | 0.0015436 | 0.001550 | 0.0015503 |
| CBA | 500 | 0.0015436 | 0.0015453 | 0.0015458 |
| CBA | 1000 | 0.0015436 | 0.0015451 | 0.0015456 |
| CGWO | 100 | 0.0015443 | 0.0015485 | 0.0015489 |
| CGWO | 500 | 0.0015436 | 0.0015451 | 0.0015452 |
| CGWO | 1000 | 0.0015436 | 0.0015436 | 0.0015444 |
| CSCA | 100 | 0.0015468 | 0.0015506 | 0.0015511 |
| CSCA | 500 | 0.0015436 | 0.0015453 | 0.0015459 |
| CSCA | 1000 | 0.0015436 | 0.0015451 | 0.0015446 |
The second analysis concerns the convergence of the optimization techniques. The convergence curves (objective function value over the course of the iterations) depicted in Fig. 8 represent the best simulation of each optimization technique among all forty performed, in the sense of reaching the minimum values shown in Table 2 with the smallest number of iterations. The convergence curves in Fig. 8a show that only the CBA technique reaches the optimal result of 0.0015436 $/kW, after 278 iterations. Another interesting aspect is the convergence behavior of the GWO/CGWO, whose convergence to a local minimum occurs within the first iterations and is faster than that of the other techniques. From Fig. 8b, it can be seen that the SCA needed 299 iterations to reach the optimal value, whereas its chaotic version, the CSCA, reached it after 236 iterations. As for the GWO and CGWO, the basic version reached the minimum of 0.0015436 $/kW with fewer iterations than its chaotic version. In Fig. 8c, the CGWO reaches the optimal result with fewer iterations (68) than all other methodologies. In general, the convergence behavior was similar for all simulations performed, with the CGWO proving to be the most efficient method to solve the OWFLO problem. Figure 9 depicts the optimal layout obtained by the methodologies, corresponding to a cost of 0.0015436 $/kW.
Fig. 8. Convergence curves for the configurations with a 100, b 500, and c 1000 solutions for Case (I)
Table 3 depicts the number of wind turbines, costs, and total output power extracted from the wind farm obtained in the present study, compared to the results found in the literature in order to validate the methodology employed.
Fig. 9. Wind farm layout for Case (I)

Table 3. Literature comparison: Case (I)

| References | Number of wind turbines | Cost ($) | Output power (kW) | Objective function ($/kW) |
|---|---|---|---|---|
| [13] | 26 | 20 | 12,352 | 0.001620 |
| [16] | 30 | 22.1 | 14,310 | 0.0015436 |
| Present study | 30 | 22.1 | 14,310 | 0.0015436 |

4.2 Case (II): Multiple Directions
For Case (II), multiple wind directions are considered in order to analyze their impact on the layout optimization. To do so, the wind farm region is assumed to present variations in the wind incidence direction, k = (0º, 45º, 90º, 135º, 180º, 225º, 270º, and 315º), weighted with probability densities f_k = 1/8 = 0.125 and a mean speed u_0 = 12 m/s. Figure 10 depicts the dispersion of the results for Case (II). For the configuration depicted in Fig. 10a, the GWO and CGWO obtained the smallest dispersions. Comparing the methodologies applied in Case (II), it is possible to observe that increasing the number of solutions did not have a significant impact on the dispersion of the solutions, unlike in Case (I). However, better layouts were obtained for the configurations shown in Fig. 10b and c; i.e., layouts with lower cost per extracted
Fig. 10. Solution dispersion for the configurations with a 100, b 500, and c 1000 solutions for Case (II)
power were found. The minimum solution obtained in this case was 0.0015939 $/kW, reached by the SCA, CSCA, CGWO, and CBA. For each investigated solution, eight wind incidence directions are evaluated, as previously mentioned. The probability density is the same for each direction, causing a 'conflict' during the layout optimization process; i.e., a given location of the wind turbines performs differently depending on the incidence angle. This impact is reflected in
the greater dispersion of the obtained results, which explains the difficulty of obtaining a 'global' minimum solution for the problem: many simulations stagnate at local minimum solutions. Table 4 depicts the minimum, median, and mean values obtained from the simulations for all optimization techniques for Case (II), highlighting the best results of each configuration.

Table 4. Statistical data for Case (II)

| Optimization technique | Number of solutions | Minimum ($/kW) | Median ($/kW) | Mean ($/kW) |
|---|---|---|---|---|
| BA | 100 | 0.0015949 | 0.0015996 | 0.0016001 |
| BA | 500 | 0.0015966 | 0.0015990 | 0.0015995 |
| BA | 1000 | 0.0015959 | 0.0016009 | 0.0015999 |
| GWO | 100 | 0.0015949 | 0.0015977 | 0.0015984 |
| GWO | 500 | 0.0015946 | 0.0015986 | 0.0015985 |
| GWO | 1000 | 0.0015976 | 0.0016021 | 0.0016018 |
| SCA | 100 | 0.0015951 | 0.0015974 | 0.0015994 |
| SCA | 500 | 0.0015949 | 0.0015965 | 0.0015984 |
| SCA | 1000 | 0.0015939 | 0.0016001 | 0.0015998 |
| CBA | 100 | 0.0015952 | 0.0015979 | 0.0015985 |
| CBA | 500 | 0.0015939 | 0.0015985 | 0.0015993 |
| CBA | 1000 | 0.0015963 | 0.0016008 | 0.0015999 |
| CGWO | 100 | 0.0015949 | 0.0015981 | 0.0015979 |
| CGWO | 500 | 0.0015946 | 0.0015972 | 0.0015981 |
| CGWO | 1000 | 0.0015939 | 0.0015967 | 0.0015973 |
| CSCA | 100 | 0.0015952 | 0.0015985 | 0.0015995 |
| CSCA | 500 | 0.0015939 | 0.0015989 | 0.0015986 |
| CSCA | 1000 | 0.0015959 | 0.0015981 | 0.0015990 |
Figure 11 depicts the convergence curves obtained for the best solutions for Case (II). From the convergence curves shown in Fig. 11a, it can be seen that even though the GWO and CBA converge to the best solutions obtained, they stagnate at the beginning of the iterative process. For the configuration in Fig. 11b, the CSCA and CBA converge to the value 0.0015939 $/kW at iterations 260 and 9, respectively. For the configuration with 1000 solutions, depicted in Fig. 11c, the SCA and GWO obtain the same value of 0.0015939 $/kW at iterations 114 and 31, respectively. In this case, the results do not progress with the number of solutions; i.e., for each configuration, the optimization methods converge to different solutions, showing the complexity of Case (II). The location of the wind turbines for the solution of 0.0015939 $/kW is shown in Fig. 12.
Fig. 11. Convergence curves for the configurations with a 100, b 500, and c 1000 solutions for Case (II)
Table 5 summarizes the obtained results for Case (II). Table 6 presents the output power generated for each wind direction, at which it is possible to observe that for the directions of 45º, 135º, 225º, and 315º, the wake effect is less intense.
Fig. 12. Wind farm layout for Case (II)

Table 5. Case (II) results' summary

| Number of wind turbines | Cost ($) | Weighted output power (kW) | Objective function ($/kW) |
|---|---|---|---|
| 35 | 24.7177 | 15,512 | 0.0015939 |

Table 6. Output power at each wind direction

| Direction (º) | Output power (kW) |
|---|---|
| 0 | 14,653 |
| 45 | 16,644 |
| 90 | 13,749 |
| 135 | 17,029 |
| 180 | 14,686 |
| 225 | 16,557 |
| 270 | 13,778 |
| 315 | 17,003 |
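The weighted output power in Table 5 can be checked against the per-direction values in Table 6 with the probability density of 1/8 assigned to each direction. The short sketch below only reproduces that arithmetic; the variable names are illustrative.

```python
# Output power per wind direction, copied from Table 6 (degrees -> kW)
power_kw = {
    0: 14653, 45: 16644, 90: 13749, 135: 17029,
    180: 14686, 225: 16557, 270: 13778, 315: 17003,
}

probability = 1 / 8   # equal probability density for the eight directions

# Probability-weighted sum of the per-direction output powers
weighted_power_kw = sum(probability * p for p in power_kw.values())

# Mean output of the diagonal (45, 135, 225, 315) and axial directions
diagonal_mean = sum(power_kw[d] for d in (45, 135, 225, 315)) / 4
axial_mean = sum(power_kw[d] for d in (0, 90, 180, 270)) / 4
```

The weighted sum is 15,512.375 kW, which rounds to the 15,512 kW reported in Table 5. The diagonal directions average about 16,808 kW against about 14,217 kW for the axial ones, in line with the remark that the wake effect is less intense at 45º, 135º, 225º, and 315º.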
5 Conclusion

The present work presents methodologies employed to solve the OWFLO problem based on optimization techniques known as metaheuristics, including the representation of intermittent wind regimes in the problem, considering different flow directions.
Since the OWFLO is a complex problem with respect to the total number of possible combinations, the chaotic map is used to adjust some parameters of the optimization techniques in order to improve their global/local searching stages. The main goal is to compare the performance of the basic and chaotic optimization techniques in solving the OWFLO problem with regard to solution quality and convergence. According to the results obtained, the chaotic methodology CGWO was able to obtain the optimal solution for Cases (I) and (II). In general, the dispersion of the set of solutions from the forty executions carried out for each number of initial solutions was better for the CGWO; i.e., layouts with lower investment cost per extracted power were obtained, indicating that this is the most reliable technique in the sense of knowing what to expect from it. From a convergence point of view, in some cases the CGWO reaches the optimal solution after a few iterations. This is also observed for the CBA when applied to Case (II). However, Case (II) proved far more complex, causing difficulties for all investigated methods. Regarding the layout problem, it is noticeable that the number and location of the wind turbines are directly impacted by the wake effect for different wind incidence scenarios. Thus, the project costs and extracted power vary according to the scenario under study, justifying the proposed formulation and the application of the optimization techniques to this model. Novel strategies are introduced at the local/global searching steps of the optimization techniques, mostly at the exploration coefficients of the sine cosine algorithm $(r_1^{t})$ and grey wolf optimizer $(a)$. Regarding the OWFLO problem, different objective functions are investigated, since the objective function used in this work is dimensionless, i.e., more suited for academic studies.

Acknowledgements. The authors acknowledge the Brazilian National Research Council (CNPq), the Coordination for the Improvement of Higher Education Personnel (CAPES), the Foundation for Supporting Research in Minas Gerais, and Electric Power National Institute (INERGE) for their great support.
References

1. Global Wind Energy Council GWEC (2019) Global wind report forecasts over 300 GW capacity to be added in next 5 years: growth to come from emerging markets and offshore wind, 3 Apr 2019. Available https://gwec.net/. Accessed 10 May 2019
2. Hou P (2017) Optimization of large-scale offshore wind farm. Ph.D Dissertation, Aalborg Universitetsforlag
3. Kerkvliet H, Polatidis H (2016) Offshore wind farms decommissioning: a semi quantitative multi-criteria decision aid framework. Sustain Energy Technol Assess 18:69–79 (Elsevier)
4. Han X, Guo J, Wang P, Jia Y (2011) Adequacy study of wind farms considering reliability and wake effect of WTGs. In: Power and energy society general meeting, IEEE, pp 1–7
5. Jensen NO, Katic I, Hojstrup C (1986) A simple model for cluster efficiency. In: European wind energy association conference and exhibition, pp 407–410
6. Kusiak A, Song Z (2010) Design of wind farm layout for maximum wind energy capture. Renew Energy 35(3):685–694
7. González JS, Rodriguez AGG, Mora JC, Santos JR, Payan MB (2010) Optimization of wind farm turbines layout using an evolutive algorithm. Renew Energy 35(8):1671–1681
8. Gao X, Yang H, Lin L, Koo P (2015) Wind turbine layout optimization using multipopulation genetic algorithm and a case study in Hong Kong offshore. J Wind Eng Indus Aerodyn, 139
9. Wu YK et al (2014) Optimization of the wind turbine layout and transmission system planning for a large-scale offshore windfarm by ai technology. IEEE Trans Indus Appl 50(3):2071–2080 (IEEE)
10. Changshui Z, Guangdong H, Jun W (2011) A fast algorithm based on the submodular property for optimization of wind turbine positioning. Renew Energy 36(11):2951–2958
11. Duan B, Wang J, Gu H (2014) Modified genetic algorithm for layout optimization of multitype wind turbines. In: IEEE, American control conference (ACC), pp 3633–3638
12. Shakoor R et al (2014) Wind farm layout optimization by using definite point selection and genetic algorithm. In: 2014 IEEE international conference on power and energy (PECon), IEEE, pp 191–195
13. Mosetti G, Poloni C, Diviacco B (1994) Optimization of wind turbine positioning in large windfarms by means of a genetic algorithm. J Wind Eng Ind Aerodyn 51(1):105–116
14. Jiang D et al (2013) Modified binary differential evolution for solving wind farm layout optimization problems. In: 2013 IEEE symposium on computational intelligence for engineering solutions (CIES), IEEE, pp 23–28
15. Gomes LL, Oliveira LW, Silva IC Jr, Passos Filho JA (2017) Optimization of wind farms layout through artificial immune system. In: Latin-American congress on electricity generation and transmission, GLACTEE, vol 12
16. Pookpunt S, Ongsakul W (2013) Optimal placement of wind turbines within wind farm using binary particle swarm optimization with time-varying acceleration coefficients. Renew Energy 55:266–276 (Elsevier)
17. Hou P et al (2015) Optimized placement of wind turbines in large-scale offshore wind farm using particle swarm optimization algorithm. IEEE Trans Sustain Energy 6(4):1272–1282 (IEEE)
18. Yang H et al (2016) Wind farm layout optimization and its application to power system reliability analysis. IEEE Trans Power Syst 31(3):2135–2143 (IEEE)
19. Dey N (ed) (2017) Advancements in applied metaheuristic computing. IGI Global
20. Dey N (2018) Advancements in applied metaheuristic computing. IGI Global, Hershey, PA, pp 1–978
21. Gupta N, Patel N, Tiwari BN, Khosravy M (2018) Genetic algorithm based on enhanced selection and log-scaled mutation technique. In: Proceedings of the future technologies conference, Springer, pp 730–748
22. Singh G, Gupta N, Khosravy M (2015) New crossover operators for real coded genetic algorithm (RCGA). In: 2015 international conference on intelligent informatics and biomedical sciences (ICIIBMS), IEEE, pp 135–140
23. Gupta N, Khosravy M, Patel N, Sethi IK (2018) Evolutionary optimization based on biological evolution in plants. Proc Comput Sci 126:146–155 (Elsevier)
24. Gupta N, Khosravy M, Patel N, Senjyu T (2018) A bi-level evolutionary optimization for coordinated transmission expansion planning. IEEE Access 6:48455–48477
25. Moraes CA, De Oliveira EJ, Khosravy M, Oliveira LW, Honório LM, Pinto MF, A hybrid bat-inspired algorithm for power transmission expansion planning on a practical Brazilian network. In: Applied nature-inspired computing: algorithms and case studies, from Springer tracts in nature-inspired computing (STNIC), Springer International Publishing (to appear in 2019)
26. Khosravy M, Gupta N, Patel N, Senjyu T, Duque CA (2019) Particle swarm optimization of morphological filters for electrocardiogram baseline drift estimation. In: Applied nature-inspired computing: algorithms and case studies, from Springer tracts in nature-inspired computing (STNIC), Springer International Publishing (in press)
27. Jagatheesan K, Anand B, Samanta S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimisation-based parameters optimisation of PID controller for load frequency control of multi-area reheat thermal power systems. Int J Adv Intell Paradig 9(5–6):464–489
28. Chatterjee S, Sarkar S, Hore S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Comput Appl 28(8):2005–2016
29. Satapathy SC, Raja NSM, Rajinikanth V, Ashour AS, Dey N (2018) Multi-level image thresholding using Otsu and chaotic bat algorithm. Neural Comput Appl 29(12):1285–1307
30. Mirjalili S (2016) SCA: a sine cosine algorithm for solving optimization problems. Knowl-Based Syst 96:120–133 (Elsevier)
31. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61 (Elsevier)
32. Yang XS (2010) A new metaheuristic bat-inspired algorithm. In: Nature inspired cooperative strategies for optimization. Springer, pp 65–74
33. Tavozoei MS, Haeri M (2007) Comparison of different one-dimensional maps as chaotic search pattern in chaos optimization algorithms. Appl Math Comput 187(2):1076–1085 (Elsevier)
34. Mendel E, Krohling RA, Campos M (2011) Swarm algorithms with chaotic jumps applied to noisy optimization problem. Inform Sci 181(20):4494–4514 (Elsevier)
35. Gandomi AH, Yang XS (2014) Chaotic bat algorithm. J Comput Sci 5(2):224–232 (Elsevier)
Chapter 16
Design and Comparison of Two Evolutionary and Hybrid Neural Network Algorithms in Obtaining Dynamic Balance for Two-Legged Robots

Ravi Kumar Mandava (1, 2) and Pandu R. Vundavilli (1)

(1) School of Mechanical Sciences, Indian Institute of Technology Bhubaneswar, Bhubaneswar, Odisha 752050, India
[email protected]
(2) Department of Mechanical Engineering, MANIT, Bhopal 462003, India

© Springer Nature Singapore Pte Ltd. 2020 M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_16

1 Introduction

Compared to wheeled robots, legged robots are better suited to moving over discontinuous terrain. However, maintaining the dynamic balance of a legged robot is difficult because of its discrete footholds. To achieve balance, the motors mounted on the individual joints must be regulated in a coordinated manner. The stability of the two-legged robot is measured with the help of the dynamic balance margin (DBM). Over the past few decades, many researchers have worked on the development of dynamically balanced gaits for two-legged robots under various terrain conditions. In [1], the authors proposed a semi-inverse method to calculate the trunk motion of the two-legged robot for predefined zero moment point (ZMP) trajectories, with various boundary conditions, so that the ZMP stays inside the foot support polygon. Takanishi et al. [2] developed a novel control algorithm for generating optimal trajectories for the waist and trunk motion of the two-legged robot, using a performance index based on the relative position of the waist and trunk. The developed algorithm was verified on the real WL-12R two-legged robot. Goswami [3] discussed a concept called the foot rotation indicator (FRI) for determining the amount of DBM of the two-legged robot in the single support phase. Further, the authors of [4] proposed a robust algorithm for regulating the angular momentum of a two-legged robot through manipulation of the ZMP. In addition, Sano and Furusho [5] developed a control method to achieve dynamic walking of the two-legged robot in both the sagittal and frontal planes by conserving angular momentum. The developed algorithm was verified on the BLR-G2 biped robot at a speed of 0.35 m/s.

To develop a dynamically balanced, robust gait for a planar five-link two-legged robot, Seo and Yoon [6] constructed a reasonable set of gaits that fulfill the periodicity of two-legged walking. The foot strike time margin was considered as a performance measure, and the robot achieved a robust gait even under external disturbances. The generation of dynamically balanced gaits for ascending and descending a sloping surface [8] and a staircase [7] for a seven-DOF biped was established with the help of simulation studies only. In this approach, the authors did not use any controller for the motors and relied on the concept of DBM to maintain stable gait generation. Nguyen et al. [9] discussed the effect of foot structure while walking on the floor. The center of gravity of the robot with different toe mechanisms was compared with that of normal walking on flat terrain to determine the best toe mechanism. In [10], Gaurav and Ashish presented a method for trajectory generation and step planning for the navigation of a 12-DOF biped robot walking on irregular terrain with obstacles. In addition, Zhong et al. [11] designed a controller based on a neural network (NN) and a fuzzy logic controller (FLC) for walking on uneven terrain. A modified particle swarm optimization (PSO) algorithm was used to optimize the weights of the NN and the rule base of the FLC. Several of these works [7, 8, 11, 12] considered robot structures with limited degrees of freedom and movement only in the sagittal plane.

The objective of this study is to reduce the error between the desired setpoint and the actual value and thereby improve the balance of the robot. To reduce the error at each joint of the two-legged robot, the joint motors have to be tuned in a coordinated way. Researchers have used different tuning methods, which are reviewed below. In [13], Visioli established a fuzzy logic-based PID controller for a robotic manipulator.
Further, the developed controller was compared with the standard PID controller, considering the incremental fuzzy expert PID (IFE), fuzzy gain scheduling (FGS), fuzzy setpoint weighting (FSW), fuzzy self-tuning of a single parameter (SSP), and fuzzy PID controllers [14]. Later on, Mandava and Vundavilli [15] designed a PID controller for four-DOF planar and spatial robotic manipulators. The authors proposed a manual procedure to tune the gains of the controller and did not use any optimizer to obtain the optimal tuning parameters. In [16], the author discussed a self-organizing fuzzy PID (SOF-PID) controller for obtaining the best tuning parameters for the revolute joints of a robotic arm. Compared with an ordinary PID controller, the SOF-PID controller produced a smaller steady-state error, a faster rise time, and insignificant overshoot for the set input trajectory. Further, Helon and Leandro [17] used a multi-objective genetic algorithm (NSGA-II) to tune the gains of a PID controller for a two-DOF planar manipulator. The approach was simple to implement and provided excellent reference tracking performance. Gutierrez et al. [18] implemented an NN tracking controller for a single-link flexible manipulator and compared its performance with that of standard PD and PID controllers. The tracking performance of the NN-based controller was far better, and it maintained zero tracking error even with the additional friction terms. Model-based approaches to manipulator control require more computational time and give poor control performance. To overcome this problem, Murat et al. [19] presented a decentralized model in which each controller was associated with one joint and a distinct neural network was used to regulate the controller variables. Pirabakaran and Becerra [20] also demonstrated the application of an artificial NN for automatic tuning of a PID controller, i.e., a model reference adaptive controller (MRAC). In that approach, a multilayer perceptron (MLP) network and a plant emulator were built first. The emulator was used to train the NN, which adjusted the variables of the PID controller to reduce the error between the reference outputs and the measured outputs. Further, Joel Perez et al. [21] established an adaptive NN controller for a two-link robot manipulator. The stability of the tracking error was evaluated using Lyapunov control functions, and the control law was obtained based on the PID approach. To verify the tracked trajectory, a robot model with friction terms and unknown external disturbances was considered. In [22], Zang et al. designed a PID controller for a musculoskeletal robot, which helps track the joint trajectory to achieve the desired motion. Moreover, Zhong and Chen [23] presented an intelligent control system for a biped robot that analyzed the stairs and calculated the required trajectories for the feet. To overcome the problems that arise during control, the biped robot was controlled directly by the neural network and the fuzzy logic controller. A modified PSO algorithm was used to train the weights of the NN and the rules of the FLC.

Through the above studies, researchers laid the foundation for solving the problems related to the gains of PID controllers [24]; however, the resulting gains were not optimal in any sense. Moreover, manual tuning of the PID parameters is time-consuming and computationally expensive, and is therefore not suitable for online applications. To evolve optimal tuning parameters for each of the motors that control the robot, and to increase the possibility of online control, it is necessary to develop an adaptive tuning scheme.
To tackle difficulties such as trapping in local minima during training, metaheuristic optimization algorithms, namely the genetic algorithm (GA) [25–27], particle swarm optimization (PSO) [28], cat swarm optimization (CSO) [29], ant colony optimization (ACO) [30, 31], bird mating optimizer (BMO) [32], and modified invasive weed optimization (MIWO) [33], have been tried by various researchers. Alongside, Ge et al. [34] developed a controller based on a neural network and an inverse dynamical model. A Gaussian radial basis function network was used to achieve uniformly stable adaptation and asymptotic tracking. The controller reduced the errors and the bounded disturbances of the network. In addition, Woon et al. [35] designed an adaptive NN controller for the coordinated motion of a manipulator. Further, Sun et al. [36] designed NN-based sliding mode adaptive control (NNSMAC), an NN-based adaptive observer, and NN-based sliding mode adaptive output feedback control (NNSMAOFC) for tracking the trajectory of a robotic manipulator. Lyapunov theory was used to make the tracking error converge to zero. Finally, they compared the three approaches to assess their effectiveness. Mehran et al. [37] developed an adaptive NN integral sliding mode controller for a biped robot. The gains of the sliding mode controller were tuned using the bat algorithm (BA), and the result was compared with integral sliding mode controllers. That study concentrated on designing the controller for a biped robot with limited degrees of freedom and on controlling the robot only in the sagittal plane. Based on the above survey, most researchers have used neural networks to keep the ZMP within the foot support polygon of the two-legged robot [38].
Other researchers have used NN-based approaches to tune the gains of controllers for robotic manipulators. However, very few researchers have used stochastic optimization algorithms to train neural networks for problems related to the control of a two-legged robot. The contributions of the present research work are as follows:
• A variation of the standard invasive weed optimization algorithm is implemented by introducing chaotic and cosine variables during the spatial dispersal phase. These two variables help the algorithm to reduce the chance of the solution being trapped in a local optimum and to widen the search space for distributing the seeds, respectively.
• To achieve dynamically balanced walking of the two-legged robot on a flat surface, a torque-based PID controller is designed for all the joints of the two legs.
• The gains (Kp, Kd, and Ki) of the PID controllers are regulated with the help of a feed-forward neural network, and the weights of the network are trained by using a non-traditional optimization algorithm, modified chaotic invasive weed optimization (MCIWO). The performance of the developed algorithm (MCIWO-NN) is compared with that of the PSO-NN algorithm in terms of the variation of error at each joint, the average torque required to complete a single walking cycle, and the DBM of the two-legged robot.
• Finally, the gait developed by the optimal control algorithm is verified on a real two-legged robot in a laboratory, where the robot completed its walk on the flat surface successfully.
Fig. 1. Structure of the two-legged robot a line diagram, b actual robot
2 Description of the Two-Legged Robot

Two-legged robots have a structure similar to that of human beings and therefore look anthropomorphic. They are intended to work in environments shared with humans, such as hospitals, hotels, houses, and offices. The current research work focuses primarily on the design of a PID controller for a two-legged robot walking on a flat surface. Figure 1 shows the structure of the two-legged robot [39] used in this work. The block diagram presenting the methodology used to solve the problem is given in Fig. 2.

[Figure 2, blocks in order: configuration of the biped robot; forward kinematics; inverse kinematics; establish the equations of motion using the L-E formulation; verify the DBM of the gait (YES/NO decision); derive the controller equation for the torque-based PID controller; tune the gains of the PID controller with the help of an adaptive NN (input, hidden, and output layers with weights W and V and outputs Kp1, Kd1, Ki1, ..., Kp12, Kd12, Ki12), whose architecture is optimized by using MCIWO/PSO; get the gains of the torque-based PID controller; obtain the angular displacement for each joint; find the error in angular displacement for each joint.]

Fig. 2. Block diagram presenting the methodology used to solve the problem
Initially, coordinate frames are assigned to each joint to obtain the D-H parameters of the two-legged robot. These D-H parameters help calculate the position and orientation of a particular joint. To develop the dynamically balanced walk, third-order polynomial trajectories are assigned to the hip joint and the swing foot of the two-legged robot. The mathematical model of the two-legged robot walking on a flat surface is formulated under the following assumptions.
• While walking on the flat surface, the foot is always parallel to the ground, i.e., there is no ankle movement ($\theta_6 = \theta_{12} = 0$).
• The mass of each limb is assumed to be uniformly distributed along its length, and the center of mass of each limb is at one-third of its length from the endpoint.
• The walking of the two-legged robot is considered in both the sagittal and frontal planes.
• The walking cycle consists of both single and double support phases.
• The PID controller is developed for both legs of the two-legged robot.
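As an illustration of the third-order polynomial trajectories mentioned above, the following minimal Python sketch fits a cubic to given start and end conditions. The chapter does not state the boundary conditions it uses, so the zero start and end velocities, the function names `cubic_coefficients` and `evaluate`, and the numerical values are assumptions made only for this example.

```python
def cubic_coefficients(q0, qf, T, v0=0.0, vf=0.0):
    """Coefficients (a0, a1, a2, a3) of q(t) = a0 + a1*t + a2*t**2 + a3*t**3
    such that q(0) = q0, q(T) = qf, q'(0) = v0 and q'(T) = vf.
    The zero default velocities are an assumption, not taken from the chapter."""
    a0 = q0
    a1 = v0
    a2 = (3.0 * (qf - q0) - (2.0 * v0 + vf) * T) / T**2
    a3 = (-2.0 * (qf - q0) + (v0 + vf) * T) / T**3
    return a0, a1, a2, a3


def evaluate(coeffs, t):
    """Evaluate the cubic polynomial at time t."""
    a0, a1, a2, a3 = coeffs
    return a0 + a1 * t + a2 * t**2 + a3 * t**3


# Hypothetical example: move the hip 0.1 m forward in 1 s, starting and ending at rest.
hip_coeffs = cubic_coefficients(q0=0.0, qf=0.1, T=1.0)
hip_samples = [evaluate(hip_coeffs, 0.1 * step) for step in range(11)]
```

A cubic is the lowest-order polynomial that can satisfy position and velocity constraints at both ends of a step, which is one plausible reason for choosing it here.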
Further, the dynamic balance of the two-legged robot is determined with the help of the ZMP. Figure 3 shows a schematic representation of the ZMP.

[Figure 3: lower limb shown in the sagittal plane (X-Z) and in the frontal plane (Y-Z), with the ground reaction force F acting at the ZMP and the DBM marked in each plane.]

Fig. 3. Location of ZMP in X- and Y-directions
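The chapter does not print the expressions it uses for the ZMP and the DBM. As a hedged illustration only, the sketch below uses the common point-mass form of the ZMP in the X-direction (rotational inertia of the links neglected) and assumes that the DBM is the distance from the ZMP to the nearest edge of a foot centered at the origin. The function names, the foot half-length, and the sample values are hypothetical.

```python
G = 9.81  # acceleration due to gravity (m/s^2)


def zmp_x(masses, xs, zs, acc_x, acc_z):
    """Point-mass ZMP along X:
    sum(m*((az + g)*x - ax*z)) / sum(m*(az + g)).
    This simplified form is an assumption; it neglects link rotational inertia."""
    numerator = sum(m * ((az + G) * x - ax * z)
                    for m, x, z, ax, az in zip(masses, xs, zs, acc_x, acc_z))
    denominator = sum(m * (az + G) for m, az in zip(masses, acc_z))
    return numerator / denominator


def dynamic_balance_margin(zmp, foot_half_length):
    """Assumed definition: distance from the ZMP to the nearest foot edge.
    A positive value means the ZMP lies inside the foot."""
    return foot_half_length - abs(zmp)


# Hypothetical single lumped mass, 0.5 m above the ground, accelerating forward.
zmp = zmp_x([10.0], [0.02], [0.5], [1.0], [0.0])
margin = dynamic_balance_margin(zmp, foot_half_length=0.1)
```

The same two functions can be applied to the Y-direction by passing the lateral positions and accelerations together with the half-width of the foot.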
In the present chapter, PID controllers are designed for every joint of the two-legged robot. These controllers help achieve the desired gait while the two-legged robot walks on a flat surface. The Lagrange-Euler formulation (Eq. 1) is used to calculate the dynamics of the robot, and the resulting dynamics are used to design the PID controller.

$$\tau_{i,the} = \sum_{j=1}^{n} M_{ij}(q)\,\ddot{q}_j + \sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot{q}_j\,\dot{q}_k + G_i, \qquad i, j, k = 1, 2, \ldots, n \tag{1}$$
where $\tau_{i,the}$ represents the theoretical torque (N m), $q_j$ indicates the displacement (rad), $\dot{q}_j$ denotes the velocity (rad/s), and $\ddot{q}_j$ represents the acceleration (rad/s²).

$$M_{ij} = \sum_{p=\max(i,j)}^{n} \mathrm{Tr}\left[d_{pj}\,I_p\,d_{pi}^{T}\right], \qquad i, j = 1, 2, \ldots, n \tag{2}$$

$$C_{ijk} = \sum_{p=\max(i,j,k)}^{n} \mathrm{Tr}\left[\frac{\partial d_{pk}}{\partial q_p}\,I_p\,d_{pi}^{T}\right], \qquad i, j = 1, 2, \ldots, n \tag{3}$$

$$G_i = -\sum_{p=i}^{n} m_p\,g\,d_{pi}\,{}^{p}\tilde{r}_p, \qquad i, j = 1, 2, \ldots, n \tag{4}$$
where $I_p$ and ${}^{p}\tilde{r}_p$ indicate the moment of inertia (kg m²) and the center of mass (m) of the pth link, respectively, and $g$ represents the acceleration due to gravity (m/s²). It is important to note that the angular acceleration of each link plays a vital role in controlling every joint of the two-legged robot. Therefore, rearranging Eq. (1) in terms of the acceleration of the link gives Eq. (5).

$$\ddot{q}_j = \sum_{j=1}^{n} M_{ij}(q)^{-1}\left[-\sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot{q}_j\,\dot{q}_k - G_i\right] + \left(\sum_{j=1}^{n} M_{ij}(q)^{-1}\right)\tau_{i,the}, \qquad i, j, k = 1, 2, \ldots, n \tag{5}$$
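To make the rearrangement from Eq. (1) to Eq. (5) concrete, the sketch below computes the joint accelerations numerically in the equivalent matrix form, i.e., by solving M(q) q̈ = τ − C(q̇, q̇) − G. This is a minimal pure-Python illustration; the helper names `solve_linear` and `joint_accelerations` and the two-joint numbers are hypothetical and are not taken from the chapter's robot model.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def joint_accelerations(M, C, G, qd, tau):
    """Accelerations from M*qdd = tau - sum_jk C[i][j][k]*qd[j]*qd[k] - G[i]."""
    n = len(tau)
    rhs = []
    for i in range(n):
        velocity_term = sum(C[i][j][k] * qd[j] * qd[k]
                            for j in range(n) for k in range(n))
        rhs.append(tau[i] - velocity_term - G[i])
    return solve_linear(M, rhs)


# Hypothetical two-joint example (values chosen only for illustration).
M = [[2.0, 0.5], [0.5, 1.0]]
C = [[[0.0, 0.0], [0.0, 0.1]], [[0.0, 0.0], [0.0, 0.0]]]
G = [1.0, 0.2]
qd = [0.0, 2.0]
tau = [3.0, 1.0]
qdd = joint_accelerations(M, C, G, qd, tau)
```

Solving the linear system is used here instead of forming the inverse of M explicitly, which is the usual numerical choice; the result is the same as applying the inverse as written in Eq. (5).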
Now, the last term is denoted as

$$\sum_{j=1}^{n} M_{ij}(q)^{-1}\,\tau_{i,the} = \hat{\tau} \tag{6}$$
The actual torque required at every joint of the two-legged robot when the joint-based PID controller is used is given below.

$$\tau_{act} = K_p\,e + K_d\,\dot{e} + K_i \int e\,dt \tag{7}$$

In Eq. (7), $K_p$, $K_d$, and $K_i$ denote the proportional, derivative, and integral gains of the controller, respectively. After substituting the expressions for $e$ and $\dot{e}$, Eq. (7) can be expanded as follows.

$$e(\theta_i) = \theta_{if} - \theta_{is}$$

$$\tau_{i,act} = K_{pi}\,(\theta_{if} - \theta_{is}) - K_{di}\,\dot{\theta}_{is} + K_{ii}\int e(\theta_{is})\,dt, \qquad i = 1, 2, \ldots, n \tag{8}$$

where $\tau_{i,act}$ is the actual torque applied to the individual joint by the controller to move it from the initial angular position ($\theta_{is}$) to the final angular position ($\theta_{if}$). The integral term of Eq. (8) is replaced by the state variable $x_i$, which is defined in Eq. (9).

$$x_i = \int e(\theta_{is})\,dt \;\Rightarrow\; \dot{x}_i = \theta_{if} - \theta_{is}, \qquad i = 1, 2, \ldots, n \tag{9}$$

Further, Eq. (10) represents the final control equation, which controls all the joints of the robot.
$$\ddot{q}_j = \sum_{j=1}^{n} M_{ij}(q)^{-1}\left[-\sum_{j=1}^{n}\sum_{k=1}^{n} C_{ijk}\,\dot{q}_j\,\dot{q}_k - G_i\right] + K_{pi}\,(\theta_{if} - \theta_{is}) - K_{di}\,\dot{\theta}_{is} + K_{ii}\,x_i \tag{10}$$
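The closed-loop behavior described by the PID law with the integral state can be illustrated on a deliberately simplified plant. The sketch below assumes a single joint with unit inertia and no velocity-dependent or gravity terms, so that the joint acceleration equals the PID torque; the gains, step size, and function name `simulate_joint` are hypothetical and are not the values obtained in the chapter.

```python
def simulate_joint(theta_f, kp, kd, ki, theta0=0.0, dt=0.001, t_end=10.0):
    """Integrate one PID-controlled joint with a simple Euler scheme.
    Assumed plant: unit inertia, no Coriolis/centrifugal or gravity terms."""
    theta = theta0   # current joint angle (rad)
    omega = 0.0      # current joint velocity (rad/s)
    x = 0.0          # integral of the error, the state variable of the controller
    for _ in range(int(round(t_end / dt))):
        error = theta_f - theta
        torque = kp * error - kd * omega + ki * x
        omega += torque * dt      # acceleration equals torque for unit inertia
        theta += omega * dt
        x += error * dt           # x_dot = theta_f - theta
    return theta


# Hypothetical gains; the joint is driven from 0 rad to 0.5 rad.
final_angle = simulate_joint(theta_f=0.5, kp=30.0, kd=10.0, ki=20.0)
```

With these illustrative gains the simulated joint settles close to the target angle; in the chapter the gains are supplied by the neural network rather than chosen by hand.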
3 Proposed Soft Computing-Based Approaches

In the present chapter, two soft computing-based approaches, namely the MCIWO-trained NN (MCIWO-NN) and the PSO-trained NN (PSO-NN), are proposed to tune the gains of the PID controllers of the two-legged robot. A neural network consists of simple processing units that are connected to process data. The network structure (Fig. 4) consists of three layers of neurons: the input, hidden, and output layers. At every neuron, the weighted sum of the inputs plus a bias value is passed through a transfer function: linear at the input layer, tan-sigmoid at the hidden layer, and log-sigmoid at the output layer. It is a fully connected network: each neuron is connected to the neurons in the neighboring layer, and the signal passes from one layer to the next.

[Figure 4: inputs 1 to 12 at the input layer; weights W11, W12, ... between the input and hidden layers; weights V11, V12, ... between the hidden and output layers; bias values bias1 and bias2; outputs Kp1, Kd1, Ki1, ..., Kp12, Kd12, Ki12 at the output layer.]

Fig. 4. Structure of the neural network
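The forward pass of such a network can be sketched as follows. The sketch assumes the 12-16-36 layout and the transfer functions reported in Sect. 4 (linear input, tan-sigmoid hidden, log-sigmoid output) and one shared bias value per layer, which gives 12 × 16 + 16 × 36 + 2 = 770 variables. The flat parameter layout, the function names, and the grouping of the 36 outputs into (Kp, Kd, Ki) triples are assumptions made for illustration; how the chapter scales the outputs to actual gains is not stated.

```python
import math
import random

N_IN, N_HID, N_OUT = 12, 16, 36  # layout reported for the MCIWO-NN approach


def unpack(params):
    """Split a flat vector into W (12x16), V (16x36), bias1 and bias2.
    The ordering inside the flat vector is an assumption of this sketch."""
    assert len(params) == N_IN * N_HID + N_HID * N_OUT + 2
    w_flat = params[:N_IN * N_HID]
    v_flat = params[N_IN * N_HID:N_IN * N_HID + N_HID * N_OUT]
    W = [w_flat[i * N_HID:(i + 1) * N_HID] for i in range(N_IN)]
    V = [v_flat[j * N_OUT:(j + 1) * N_OUT] for j in range(N_HID)]
    return W, V, params[-2], params[-1]


def forward(params, delta_theta):
    """Map 12 joint-angle changes to 12 (Kp, Kd, Ki) triples in the range 0..1."""
    W, V, bias1, bias2 = unpack(params)
    hidden = [math.tanh(sum(delta_theta[i] * W[i][j] for i in range(N_IN)) + bias1)
              for j in range(N_HID)]
    outputs = [1.0 / (1.0 + math.exp(-(sum(hidden[j] * V[j][k] for j in range(N_HID)) + bias2)))
               for k in range(N_OUT)]
    return [tuple(outputs[3 * m:3 * m + 3]) for m in range(12)]


# Hypothetical usage with random weights in the range mentioned in Sect. 4.
rng = random.Random(1)
params = [rng.uniform(0.0, 16.0) for _ in range(768)] + [0.0, 0.0]
gains = forward(params, [0.01] * 12)
```

The optimizer only ever sees the flat vector `params`, which is why the network is unpacked inside the forward pass.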
In this work, the weights of the feed-forward NN are trained by using two metaheuristic algorithms, namely MCIWO and PSO. An NN is a powerful tool for learning and adaptation and can solve complex real-time problems. Initially, a set of training data is used to train the network, and an optimization algorithm evolves the most appropriate weights for the network so as to reduce the error for the problem at hand. The differences between the joint angles at equal time intervals for all 12 joints, $\Delta\theta_1, \Delta\theta_2, \ldots, \Delta\theta_{12}$, are taken as inputs to the neural network (Fig. 5).

$$\Delta\theta_k = \theta_j - \theta_i \tag{11}$$

where i = 1, …, 11, j = 2, …, 12, and k = 1, …, 12. The gains $K_p$, $K_d$, and $K_i$ of all twelve PID controllers are treated as the network outputs.

[Figure 5, flow chart: inputs (change in angle of the links at regular time intervals, i.e., Δθ1 to Δθ12); neural network module, with weight optimization using MCIWO/PSO for the neural network structure; outputs (tuning parameters required for the different joints, i.e., Kp1, Kd1, Ki1, ..., Kp12, Kd12, Ki12); change in error, actual torque required, ZMP, and dynamic balance.]

Fig. 5. The flow chart showing the structure of the proposed algorithm
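The network inputs of Eq. (11), together with the RMS-type fitness of Eq. (12) that the optimizers minimize, can be sketched as below. The nested-list data layout (training case, interval, joint), the reading of Eq. (11) as a per-joint difference between two consecutive time instants, and the function names are assumptions of this example; the factor 1/2 inside the square root follows Eq. (12) as printed.

```python
import math


def joint_angle_changes(theta_prev, theta_next):
    """Change in each joint angle between two consecutive time instants."""
    return [after - before for before, after in zip(theta_prev, theta_next)]


def fitness(alpha_start, alpha_final):
    """Fitness in the form of Eq. (12).
    alpha_start[i][j][k] and alpha_final[i][j][k] hold the angular displacement
    of joint k at the start and end of interval j in training case i."""
    d = len(alpha_start)
    total = 0.0
    for case_start, case_final in zip(alpha_start, alpha_final):
        b = len(case_start)
        case_sum = 0.0
        for start, final in zip(case_start, case_final):
            squared = sum((f - s) ** 2 for s, f in zip(start, final))
            case_sum += math.sqrt(0.5 * squared)
        total += case_sum / b
    return total / d


# Hypothetical data: one training case, two intervals, two joints.
inputs = joint_angle_changes([0.0] * 12, [0.1] * 12)
value = fitness([[[0.0, 0.0], [0.0, 0.0]]], [[[1.0, 1.0], [0.0, 0.0]]])
```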
The RMS deviation of the angular displacement between the end of every interval ($\alpha_{ijkf}$) and the start of the interval ($\alpha_{ijks}$) is taken as the fitness (f) of every member of the population and is given in Eq. (12).

$$f = \frac{1}{d}\sum_{i=1}^{d}\left[\frac{1}{b}\sum_{j=1}^{b}\sqrt{\frac{1}{2}\sum_{k=1}^{p}\left(\alpha_{ijkf} - \alpha_{ijks}\right)^{2}}\,\right] \tag{12}$$

where d, b, and p represent the number of training cases, the number of intervals considered in one step, and the number of joints, respectively. In this study, two metaheuristic optimization algorithms, MCIWO and PSO, are used to train the architecture of the NN. Figure 6 shows the pictorial representation of the adaptive PID controller used in this chapter. The optimization algorithms are described in detail in the following sections.

[Figure 6, block labels: MCIWO-NN and PSO-NN approaches; desired value q_des; error e; PID controller; control signal u; control plant; actual value q_act; feedback controller.]

Fig. 6. Block diagram showing the proposed control strategy

3.1 MCIWO Algorithm
The proposed MCIWO algorithm is a variation of the standard IWO algorithm, which was introduced by Mehrabian and Lucas [40]. First, a limited number of weeds are created at random positions in the N-dimensional solution space (Fig. 7). Each weed position denotes one possible solution to the problem. In the reproduction stage, each weed is permitted to produce a number of new seeds that depends on its fitness value relative to the lowest and highest fitness values in the colony. The weed with the worst fitness produces the fewest seeds, and the weed with the best fitness produces the most seeds. The primary benefit of this algorithm is that every weed in the solution space contributes to the reproduction process, so that even the weed with the worst fitness shares some useful information with the evolution process. The number of seeds (S) produced by every weed is determined by Eq. (13).

$$S = \mathrm{Floor}\left[S_{min} + \frac{f - f_{min}}{f_{max} - f_{min}}\,S_{max}\right] \tag{13}$$
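A direct transcription of Eq. (13) is sketched below. Here f is treated as a quality score for which larger is better, exactly as the equation is written; how the chapter's error-type fitness is mapped onto this score is not stated, and the handling of the degenerate case in which all weeds have the same fitness is an assumption. The function name `seed_count` is hypothetical.

```python
import math


def seed_count(f, f_min, f_max, s_min, s_max):
    """Number of seeds for a weed with fitness f, following Eq. (13) as printed."""
    if f_max == f_min:
        # Degenerate colony with identical fitness values (assumed behavior).
        return int(s_max)
    return math.floor(s_min + (f - f_min) / (f_max - f_min) * s_max)


# With S_min = 0 and S_max = 5, as used in the first parametric study.
counts = [seed_count(f, 0.0, 1.0, 0, 5) for f in (0.0, 0.5, 1.0)]
```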
where $f_{min}$ and $S_{min}$ represent the minimum fitness and the minimum number of seeds produced by each plant, and $f_{max}$ and $S_{max}$ denote the maximum fitness and the maximum number of seeds produced by each plant, respectively. To improve the performance of the algorithm, two terms, a chaotic variable [41, 42] and a cosine variable [43, 44], are added during the spatial dispersal phase. In the standard IWO, the seeds are dispersed according to a normal distribution. In the present problem, however, a chaotic random number is used to distribute the seeds, which reduces the chance of the solution being trapped in a local optimum during the search. The chaotic random number used in the present study is obtained from a Chebyshev map, given in Eq. (14).

$$X_{k+1} = \cos\left(k\,\cos^{-1}(X_k)\right) \tag{14}$$
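The Chebyshev map of Eq. (14) can be iterated as in the following sketch. The starting value 0.7, the function name, and the clamping of the argument of `acos` (to guard against floating-point rounding) are choices made for this illustration and are not specified in the chapter.

```python
import math


def chebyshev_sequence(x0, length):
    """Generate X_1 .. X_length with X_{k+1} = cos(k * arccos(X_k)), k = 1, 2, ..."""
    values = []
    x = x0
    for k in range(1, length + 1):
        x = max(-1.0, min(1.0, x))       # keep the argument of acos in [-1, 1]
        x = math.cos(k * math.acos(x))
        values.append(x)
    return values


# Hypothetical starting value; every element stays within [-1, 1].
chaotic_numbers = chebyshev_sequence(0.7, 20)
```

Because the values are bounded in [-1, 1], they can be used directly as signed, normalized step factors during seed dispersal.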
In addition to the chaotic random number, the authors have added a cosine term [45], which helps explore the search space more effectively while using fewer resources. With the cosine variable introduced, the standard deviation at generation Gen takes the form given in Eq. (15).

$$\sigma_{Gen} = \frac{(Gen_{max} - Gen)^{n}}{(Gen_{max})^{n}}\,\left|\cos(Gen)\right|\,(\sigma_{initial} - \sigma_{final}) + \sigma_{final} \tag{15}$$
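The schedule of Eq. (15) is easy to tabulate. In the sketch below the function name is hypothetical, the generation index passed to the cosine is taken in radians, and the value 4% is represented as 0.04, which is an assumption about how the percentage is meant.

```python
import math


def dispersal_sigma(gen, gen_max, n, sigma_initial, sigma_final):
    """Standard deviation of the seed dispersal at generation gen, per Eq. (15)."""
    decay = (gen_max - gen) ** n / gen_max ** n
    return decay * abs(math.cos(gen)) * (sigma_initial - sigma_final) + sigma_final


# Settings close to the reported optimum: n = 5, 50 generations, 4% and 0.00001.
schedule = [dispersal_sigma(g, 50, 5, 0.04, 1e-5) for g in range(51)]
```

The polynomial factor shrinks the spread from the initial to the final value, while the cosine factor makes it oscillate between generations instead of decreasing monotonically.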
Finally, the seeds grow into flowering weeds and are placed in the weed colony along with the parent weeds, and elimination is performed based on the fitness value. The newly produced seeds and their locations in the search space are ranked along with their parent weeds. Because of the fast reproduction of the seeds, the population in the search space exceeds its maximum limit (Pmax) after a certain number of generations. Once the population reaches the maximum size, the less fit weeds are excluded, and the weeds with better fitness are used for the next generation. This procedure is repeated until the termination condition is reached.

[Figure 7, flow chart: start; initialization (choose the weights of the neural network as the variables and set the other parameters, i.e., generations, initial population, maximum population, exponent, sigma_initial, sigma_final, Smax, and Smin); generate uniform random weeds in the search space; evaluate the fitness (error) of each weed and rank the entire population; reproduction (new seeds are produced according to the fitness of the weeds, with the cosine and chaotic variables included); spatial dispersal (the newly produced seeds are normally distributed in the search space with varying standard deviation); if the total number of weeds and seeds exceeds Pmax, competitive exclusion keeps the Pmax fittest weeds and seeds, otherwise all weeds and seeds are kept in the colony; next generation until the stop criterion is met; keep the best Kp, Kd, and Ki; stop.]

Fig. 7. MCIWO algorithm
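The overall loop (reproduction, chaotic dispersal with the cosine-modulated spread, and competitive exclusion) can be sketched on a toy minimization problem as follows. This is not the authors' implementation: the way the chaotic number and the spread are combined into a step, the conversion of cost into a quality score for the seed count, the scaling of the spread by the width of the search range, and all names are assumptions. The default parameter values are the optimal settings reported in Sect. 4.1.

```python
import math
import random


def mciwo_minimize(objective, dim, bounds, pop_init=5, pop_max=25, s_min=0,
                   s_max=6, n=5, sigma_initial=0.04, sigma_final=1e-5,
                   gen_max=50, rng_seed=0):
    """Toy MCIWO-style minimizer (illustrative sketch under stated assumptions)."""
    rng = random.Random(rng_seed)
    lo, hi = bounds
    span = hi - lo
    chaos, k = 0.7, 0  # hypothetical initial value and counter of the Chebyshev map
    weeds = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_init)]
    costs = [objective(w) for w in weeds]
    for gen in range(gen_max):
        # Cosine-modulated spread for this generation.
        sigma = ((gen_max - gen) ** n / gen_max ** n) * abs(math.cos(gen)) \
            * (sigma_initial - sigma_final) + sigma_final
        best, worst = min(costs), max(costs)
        seeds = []
        for weed, cost in zip(weeds, costs):
            # Lower cost gives higher quality and therefore more seeds (assumed mapping).
            quality = 1.0 if worst == best else (worst - cost) / (worst - best)
            for _ in range(math.floor(s_min + quality * s_max)):
                child = []
                for x in weed:
                    k += 1
                    chaos = math.cos(k * math.acos(max(-1.0, min(1.0, chaos))))
                    child.append(min(hi, max(lo, x + sigma * span * chaos)))
                seeds.append(child)
        weeds = weeds + seeds
        costs = costs + [objective(s) for s in seeds]
        # Competitive exclusion: keep at most pop_max of the fittest members.
        order = sorted(range(len(weeds)), key=costs.__getitem__)[:pop_max]
        weeds = [weeds[i] for i in order]
        costs = [costs[i] for i in order]
    return weeds[0], costs[0]


def sphere(v):
    """Toy objective standing in for the controller error."""
    return sum(x * x for x in v)


best_weed, best_cost = mciwo_minimize(sphere, dim=2, bounds=(-1.0, 1.0))
```

In the chapter's setting, `objective` would evaluate the error fitness of the controller for a given vector of 770 network variables; the sphere function is used here only so that the sketch runs on its own.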
4 Results and Discussions

In this chapter, the gains of the developed controllers are tuned with the help of a feed-forward NN, and the weights of the network are trained by using the MCIWO and PSO algorithms. The results obtained are presented and discussed in the subsequent subsections.
4.1 MCIWO-NN Approach
Initially, a study was conducted to determine the optimal number of neurons in the hidden layer of the NN. During this study, the following MCIWO parameters were kept fixed: initial population ($npop_i$) = 5, final population ($npop_f$) = 15, minimum no. of seeds ($S_{min}$) = 0, maximum no. of seeds ($S_{max}$) = 5, final standard deviation ($\sigma_{final}$) = 0.00001, initial standard deviation ($\sigma_{initial}$) = 3%, modulation index (n) = 3, and maximum no. of generations (Gen) = 20 (Fig. 8).

Fig. 8. Graphs showing results of systematic study for MCIWO-NN controller a hidden neurons versus fitness, b exponent versus fitness, c initial standard deviation versus fitness, d maximum number of seeds versus fitness, e maximum no. of population versus fitness, and f generations versus fitness
The variation of fitness with the number of hidden neurons in the NN is shown in Fig. 8a. The NN architecture with 16 hidden neurons is found to provide better fitness. As there are 12 inputs and 36 outputs, the number of connecting weights in the network is 768 (12 × 16 + 16 × 36). After including the two bias values, the number of variables in the problem is 770. While solving the problem, the connecting weights are varied between 0.0 and 16.0, and the bias values are varied between 0.0 and 0.0001. Further, the transfer functions are set as linear, tan-sigmoid, and log-sigmoid for the input, hidden, and output layers, respectively.
Fig. 9. Graphs showing results of the systematic study for PSO-NN controller a hidden neurons versus fitness, b final population size versus fitness, c maximum no. of generations versus fitness
Once the study is completed, the optimal parameters of the MCIWO algorithm are found to be as follows:
• Modulation index (n) = 5
• Initial standard deviation ($\sigma_{initial}$) = 4%
• Final population ($npop_f$) = 25
• Maximum no. of seeds ($S_{max}$) = 6
• Maximum no. of generations (Gen) = 50
Fig. 10. Comparison of error at different joints of the swing leg a joint 2, b joint 3, c joint 4, and d joint 5
4.2 PSO-NN Approach
Similarly, a study was conducted to determine a suitable number of neurons in the hidden layer of the network. The following PSO parameters were used in this study: inertia weight (w) = 1, inertia weight damping ratio (wdamp) = 0.9, acceleration constants C1 = C2 = 2, maximum population size = 10, and no. of generations = 20. The optimal number of neurons in the hidden layer is found to be 18, so the total number of variables in the problem is 866 (12 × 18 + 18 × 36 connecting weights plus two bias values). The transfer functions used for the three layers are the same as those used in the MCIWO-NN approach. The optimal values of the parameters obtained from the parametric study (Fig. 9) are as follows:
• Population size (Pmax) = 30
• No. of generations (Genmax) = 50
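For reference, a standard PSO update with the parameter values listed above (w = 1 damped by 0.9 per generation, C1 = C2 = 2, population 30, 50 generations) can be sketched as follows. The chapter does not describe its PSO variant in detail, so the velocity update, the clamping of positions to the bounds, the toy objective, and the function names are assumptions of this example.

```python
import random


def pso_minimize(objective, dim, bounds, pop=30, gens=50, w=1.0, wdamp=0.9,
                 c1=2.0, c2=2.0, rng_seed=0):
    """Minimal global-best PSO (illustrative sketch, not the authors' code)."""
    rng = random.Random(rng_seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    velocities = [[0.0] * dim for _ in range(pop)]
    personal_best = [p[:] for p in positions]
    personal_cost = [objective(p) for p in positions]
    g = min(range(pop), key=personal_cost.__getitem__)
    global_best, global_cost = personal_best[g][:], personal_cost[g]
    for _ in range(gens):
        for i in range(pop):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            cost = objective(positions[i])
            if cost < personal_cost[i]:
                personal_best[i], personal_cost[i] = positions[i][:], cost
                if cost < global_cost:
                    global_best, global_cost = positions[i][:], cost
        w *= wdamp  # damp the inertia weight after every generation
    return global_best, global_cost


def sphere(v):
    """Toy objective standing in for the controller error."""
    return sum(x * x for x in v)


# For the PSO-NN approach the search space would have 12*18 + 18*36 + 2 = 866 dimensions.
pso_best, pso_cost = pso_minimize(sphere, dim=2, bounds=(-1.0, 1.0))
```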
4.3 Comparative Study
Once the optimal structure has been obtained for the NN-based PID controllers, a comparative study was conducted between the PSO-NN and MCIWO-NN tuned controllers regarding the variation of angular error, the average torque required at every joint, the deviation of ZMP in the foot, and the DBM of the two-legged robot. Further, the quality of the solutions obtained by the PSO-NN and MCIWO-NN tuned controllers is measured in simulations. Figure 10a–d shows the deviation of error at various joints of the swing leg of the robot. Based on Fig. 10a–d, it can be observed that the magnitude of the error at the beginning is lower in the MCIWO-NN based controller than in the PSO-NN based controller, and that it decreases slowly as it reaches steady state after a certain amount of time. Moreover, it can also be observed that both the MCIWO-NN and PSO-NN based controllers approach the steady state with the same trend, but for the PSO-NN based PID controller, the steady-state error does not settle to zero. Similarly, the results for the deviation of error at different joints of the standing leg of the two-legged robot are shown in Fig. 11a–d.
Fig. 11. Comparison of error at different joints of the standing leg a joint 8, b joint 9, c joint 10, and d joint 11
Further, Fig. 12a and b shows the torque required at different joints of the swing and standing legs, respectively. Based on these graphs, the MCIWO-NN controller requires less torque than the PSO-NN tuned PID controller, while exhibiting a similar trend. Moreover, the torque required at the hip joints of both legs (joint 3 and joint 9) is high compared with the remaining joints of the two-legged robot in both the MCIWO-NN and PSO-NN based PID controllers. This may be because, while the robot moves from one point to another, the hip joint carries the other links of the leg.
Fig. 12. The torque required at various joints of the a swing leg and b standing leg
[Fig. 13 plots omitted. Axes: X-ZMP in m versus Y-ZMP in m; legend: IWO-NN, PSO-NN, MCIWO-NN; each panel includes an enlarged view.]
Fig. 13. Deviation of ZMP a X-ZMP and b Y-ZMP
Fig. 14. Comparison of DBM a X-direction and b Y-direction
Following the above comparisons, a systematic study was conducted to examine the deviation of ZMP in both the X- and Y-directions of the gait obtained (actual gait) after the controller action. Figure 13 shows the plan view of the variation of the actual ZMP in both the X- and Y-directions when the robot is moving on a flat surface. It can be observed that the X-ZMP point moves from the rear side of the ankle joint toward its front side, whereas the Y-ZMP moves from the right side of the ankle joint toward its left side. Further, it can also be observed that the ZMP lies very close to the center of the foot with the MCIWO-NN controller when compared with the other two (IWO-NN and PSO-NN) algorithms. The generated gait is balanced when the ZMP is inside the foot support. Further, Fig. 14a, b shows the comparison of DBM in the X- and Y-directions for the IWO-NN, PSO-NN, and MCIWO-NN algorithms when the two-legged robot is walking on the flat surface. It has been observed that the MCIWO trained neural network-based PID controller generates more dynamically balanced gaits than the PSO trained neural network-based PID controller. Finally, the gait angles generated by the MCIWO-NN based PID controller are fed into the real two-legged robot (Fig. 15). It can be observed that the two-legged robot walks well on the flat surface with the gait generated using the MCIWO-NN based controller.
5 Conclusions
The main aim of this chapter is the development of an adaptive PID controller that enables the two-legged robot to walk on flat terrain. MCIWO and PSO learning algorithms are used to train the feed-forward artificial neural network. The simulation results show that both the MCIWO and PSO algorithms produce high-quality solutions and can develop balanced gaits for the two-legged robot. Moreover, the gaits generated by the MCIWO-NN based PID controller are more dynamically balanced than those generated by the PSO-NN based PID controller.
Fig. 15. Snapshots show the two-legged robot walking on the flat surface
This may be because, compared with the PSO algorithm, the MCIWO algorithm explores the search space more widely owing to the inclusion of the cosine variable, and its chaotic variable reduces the chance of getting trapped in a local optimum. Both algorithms have been tested in computer simulations. The learning time for the MCIWO-NN algorithm with a population size of ten and twenty generations is about 120 min on a Dell PC with a 3.30 GHz Intel processor and 8 GB of RAM. However, the CPU time required for the adaptive NN controller to generate the gains is 0.018 s on the same computer. Finally, the optimal gait data obtained from MCIWO-NN is fed into the real two-legged robot, and the robot is seen to negotiate the flat terrain effectively.
References 1. Juricic D, Vukobratovic M (1972) Mathematical modeling of biped walking systems. ASME Publ. 72-WA/BHF13 2. Takanishi TM, Kaharaki H, Kato (1989) I Dynamic biped walking stabilized with optimal trunk and waist motion. In: IEEE/RSJ international workshop on intelligent robots and systems, 4–6 Sept 1989. Tsukuba, Japan, pp 187–192 3. Goswami (1999) Foot rotation indicator (FRI) point: a new gait planning tool to evaluate postural stability of biped robots. In: IEEE international conference on robotics and automation, May 1999. Detroit, Michigan, pp 47–52 4. Mitobi K, Capi G, Nasu Y (2004) A new control method for walking robots based on angular momentum. Mechatronics 14:163–174
362
R. K. Mandava and P. R. Vundavilli
5. Sano A, Furusho J (1990) Realization of natural dynamic walking using the angular momentum information. In: IEEE international conference on robotics and automation, pp 1476–1481 6. Seo YJ, Yoon YS (1995) Design of a robust dynamic gait of the biped using the concept of dynamic stability margin. Robotica 13:461–468 7. Vundavilli Pandu R, Sahu SK, Pratihar DK (2007) Dynamically balanced ascending and descending gaits of a two-legged robot. Int J Humanoid Rob 4(4):717–751 8. Vundavilli PR, Pratihar DK (2011) Balanced gait generations of a twolegged robot on sloping surface. Sadhana Acad Proc Eng Sci 36(4):525–550 9. Nguyen T, Tao L, Hasegawa H (2017) The effect of foot structure on locomotion of a small biped robot. In: MATEC web of conferences, vol 95 10. Mohammad R, Noorani S, Rashidi SF, Maryam S (2017) Gait generation and transition for bipedal locomotion system using Morris-Lecar model of central pattern generator, ScientiaIranica 11. Zhong Q-B, Fei C (2016) Trajectory planning for biped robot walking on uneven terraine taking stepping as an example. CAAI Trans Intell Technol 1:197–209 12. Gupta G, Dutta A (2018) Trajectory generation and step planning of a 12 DoF biped robot on uneven surface. Robotica, Cambridge University, pp 1–26 13. Visioli A (2001) Tuning of PID controllers with fuzzy logic. IEEE Proc Control Theory Appl 148(1):1–8 14. Azimi SM, Miar-Naimi H (2018) Design an analog CMOS fuzzy logic controller for the inverted pendulum with novel triangular membership function. ScientiaIranica. https://doi. org/10.24200/SCI.2018.5224.1153 15. Mandava RK, Vundavilli PR (2015) Design of PID controllers for 4-DOF planar and spatial manipulators. In: IEEE international conference on robotics, automation, control and embedded systems, 18–20 Feb 2015. Hindustan University, Chennai, India, pp 1–6 16. Kazemian HB (2002) The SOF-PID controller for the control of a MIMO robot arm. IEEE Trans Fuzzy Syst 10(4) 17. 
Helon VHA, dos Leandro SC (2012) Tuning of PID controller based on a multi-objective genetic algorithm applied to a robotic manipulator. Expert Syst Appl 39(10):8968–8974 18. Gutierrez LB, Lewis FL, Lowe JA (1998) Implementation of a neural network tracking controller for a single flexible link: comparison with PD and PID controllers. IEEE Trans Ind Electron 45:307–318 19. Sonmez M, Kandilli I, Yakut M (2006) Tracking control based on neural network for robot manipulator. In: Artificial intelligence and neural networks, volume 3949 of the series lecture notes in computer science, pp 49–57 20. Pirabakaran K, Becerra V (2002) PID autotuning using neural networks and model reference adaptive control. 15th Triennial IFAC World Congress, Spain 21. Joel Perez P, Jose PD, Rogelio S (2012) Trajectory tracking error using PID control law for two-link robot manipulator via adaptive neural networks. In: The Iberomerican conference on electronics engineering and computer science published in procedia technology, vol 3, pp 139–146 22. Zang X, Liu Y, Liu X, Zhao J (2016) Design and control of a pneumatic musculoskeletal biped robot. Technol Health Care 24:S443–S454 23. Zhong QB, Chen F (2016) Trajectory planning for biped robot walking on uneven terrain— taking stepping as an example. CAAI Trans Intell Technol 1:197–209 24. Ravi KM, Vundavilli PR (2018) Near optimal PID controllers for the biped robot while walking on uneven terrains. Int J Autom Comput 15(6):689–706 25. Montana DJ, Davis L (1989) Training feed forward neural networks using genetic algorithms. Mach Learn 762–767
Design and Comparison of Two Evolutionary and Hybrid
363
26. Asadi H, Tavakkoli Moghaddam R, Shahsavari Pour N, Najafi E (2018) A new nondominated sorting genetic algorithm based to the regression line for fuzzy traffic signal optimization problem. Scientia Iranica E 25(3):1712–1723 27. Alikhani H, Alvanchi A (2017) Using genetic algorithms for long-term planning of network of bridges. ScientiaIranica. https://doi.org/10.24200/SCI.2017.4604 28. Gudise VG, Venayagamoorthy GK (2003) Comparison of particle swarm optimization and back propagation as training algorithms for neural networks. IEEE Swarm Intell Symp 110–117 29. John Paul TY (2013) Optimizing artificial neural networks using cat swarm optimization algorithm. Int J Intell Syst Appl 1:69–80 30. Blum C, Socha K (2005) Training feed-forward neural networks with ant colony optimization: an application to pattern classification. In: fifth international conference on hybrid intelligent systems 31. Moeini R (2017) Arc based ant colony optimization algorithm for solving sewer network design optimization problem. ScientiaIranica A 24(3):953–965 32. Askarzadeh A, Rezazadeh A (2013) Artificial neural network training using a new efficient optimization algorithm. Appl Soft Comput 1206–1213 33. Giri R, Chowdhury A, Ghosh A, Das S, Abraham A, Snasel V (2010) A modified invasive weed optimization algorithm for training of feed-forward neural networks. In: IEEE international conference on systems man and cybernetics, pp 3166–3173 34. Ge SS, Hang CC, Woon LC (1997) Adaptive neural network control of robot manipulators in task space. IEEE Trans Ind Electron 44(6):746–752 35. Woon LC, Ge SS, Chen XQ, Zhang C (1999) Adaptive neural network control of coordinated manipulators. J Robotic Syst 16(4):195–211 36. Sun TR, Pei HL, Pan YP, Zhou HB, Zhang CH (2011) Neural network-based sliding mode adaptive control for robot manipulators. Neurocomputing 74(14/15):2377–2384 37. 
Rahmani M, Ghanbari A, Ettefagh MM (2016) A novel adaptive neural network integral sliding-mode control of a biped robot using bat algorithm. J Vib Control 23:1–16 38. Yazdani M, Salarieh H, Saadat Foumani M (2018) Hierarchical decentralized control of a five-link biped robot. ScientiaIranica B 25(5):2675–2692 39. Ravi KM, Pandu RV (2018) Whole body motion generation of 18-DOF biped robot on flat surface during SSP and DSP. Int J Model Identif Control 29(3):266–277 40. Mehrabian AR, Lucas C (2006) A novel numerical optimization algorithm inspired from weed colonization. Ecol Inf 1(4):355–366 41. Ahmadi M, Mojallali H (2012) Chaotic invasive weed optimization algorithm with application to parameter estimation of chaotic systems. Elsevier Chaos Solitons Fract 45:1108–1120 42. Ghasemi M, Ghavidel S, Aghaei J, Gitizadeh M, Falah H (2014) Application of chaos-based chaotic invasive weed optimization techniques for environmental OPF problems in the power system. Elsevier Chaos Solitons Fract 69:271–284 43. Basak A, Pal S, Das S, Abraham A (2010) A modified invasive weed optimization algorithm for time-modulated linear antenna array synthesis. IEEE Congress Evol Comput 1–10 44. Roy GG, Das S, Chakraborty P, Suganthan PN (2011) Design of non-uniform circular antenna arrays using a modified invasive weed optimization algorithm. IEEE Trans Antennas Propag 59(1):110–118 45. Ravi KM, Pandu RV (2018) Implementation of modified chaotic invasive weed optimization algorithm for optimizing the PID controller of the biped robot. Sadhana Acad Proc Eng Sci Springer 43(3):1–18
Optimizing the Crane’s Operating Time with the Ant Colony Optimization and Pilot Method Metaheuristics
Andresson da Silva Firmino, Valéria Cesário Times, and Ricardo Martins de Abreu Silva
Center for Informatics, Federal University of Pernambuco, Recife, PE, Brazil
{asf2,vct,rmas}@cin.ufpe.br
1 Introduction
In container terminals, due to restricted space in the storage area, containers are piled up vertically, creating various stacks. The stacks are grouped into blocks and these are distributed over the container yard and delimited by terminal traffic lanes. A block consists of a parallel group of bays, and a bay consists of a parallel array of stacks, as illustrated in Fig. 1. A longitudinal section of containers in a bay denotes a tier. The maximum stack height delimits the number of tiers of the bay. A rail-mounted gantry crane is regularly used to operate containers in a block. In this crane model, for many reasons, including safety issues, the cranes mostly cannot travel across bays while sustaining a container [3]. Thus, the movement of containers occurs between stacks in the same bay, and the containers are accessed from the sides of the bay [20].
Fig. 1. A single container block divided into bays, stacks, and tiers
© Springer Nature Singapore Pte Ltd. 2020 M. Khosravy et al. (eds.), Frontier Applications of Nature Inspired Computation, Springer Tracts in Nature-Inspired Computing, https://doi.org/10.1007/978-981-15-2133-1_17
The crane performs two operations: relocation, when a container located at the top of a stack is transferred to the top of another stack, and retrieval, when a container placed at the top of a given stack is moved outside the bay. Moreover, in a bay, each container is assigned a value indicating a retrieval priority such that all the containers are sequentially retrieved from the bay according to their priorities. The container priorities are usually assigned to ships’ stowage plans to ensure the ship’s stability and the container retrieval sequences demanded by the ports that the ships will visit. Retrieving a container that is not on top of a stack implies relocating containers above it to other stacks in the bay. These additional operations waste time, consequently reducing the productivity of the container retrieval operations, as well as the efficiency of a container terminal. Finding an optimal sequence of operations for the crane, enabling it to retrieve all the containers from the bay according to their retrieval priorities, is known as the container retrieval problem (CRP). The optimal sequence of operations is obtained by minimizing the operating time of the crane, and thus maximizing the container’s removal efficiency, bearing in mind that operating time is one of the primary measures used to evaluate the performance of ports [31]. Figure 2 illustrates an example of a CRP solution for an instance with eight containers, four stacks and four tiers. The bay is displayed as a two-dimensional
Fig. 2. A representation of a solution for the container retrieval problem
storage structure for the sake of simplicity, since the number of stacks and tiers delimits the bay dimensions and the bay stores uniformly shaped containers. The black-bordered boxes represent the bay slots, where the numbered white boxes indicate the slots occupied by containers (numbered according to their priorities) and the others denote empty slots. In this example, the crane performed four relocations and eight retrievals (i.e., twelve operations enumerated from A to L), and all retrievals occurred at the upper right corner of the bay. In relocation operations (A), (C), (E), and (J), respectively, the containers 8, 4, 6, and 8 (again) were relocated, allowing all containers to be retrieved. Note that a retrieval operation is performed whenever the highest priority container is currently placed at the top of a stack. For example, the retrieval operation (B) occurs when the container with priority one is at the top of the first stack, and thus the operation (B) retrieves the container with priority one from the bay.
Logistic optimization in container terminals has become increasingly relevant in the scientific literature. One reason for that is the increase in the rates of container port traffic over the last few decades [34]. Dealing with the heavy traffic at container terminals requires the development of strategies to reduce operational costs and increase terminal efficiency. Reflecting these trends, this paper tackles the CRP with the aim of optimizing the crane’s operating time, a relevant issue for reaching high yard operational efficiency in a container terminal system. Moreover, metaheuristic algorithms are increasingly present in applications across different domains in order to provide solutions to complicated problems [5]. Hence, this paper proposes algorithms for the CRP based on two metaheuristics: ant colony optimization (ACO) and the pilot method.
It is also worth pointing out that, although the relocation operations are time-consuming and have to be avoided as much as possible, seeking the sequence of operations with the minimum number of relocations is not a reliable way to obtain the solution with the minimum crane operating time. Lin et al. [20] have already reported that the number of relocations and the crane’s operating time do not necessarily mirror each other. Furthermore, da Silva Firmino et al. [27] have proved that minimizing the number of relocations does not ensure the solution with the minimum operating time for the CRP. Therefore, unlike most studies, where algorithms have been proposed for the CRP in order to reduce the number of relocation operations, this study proposes algorithms to lower the crane’s operating time.
The next section reviews related work and presents the main approaches to the CRP, connecting some of them to the approach of this work. Section 3 defines the CRP and how the crane’s operating time is computed in this problem. Then, Sect. 4 introduces the proposed heuristic algorithm for solving the CRP, which is used as a constructive heuristic by the other algorithms proposed in this paper. Afterward, Sects. 5 and 6 describe the proposed algorithms based, respectively, on the ant colony optimization (ACO) and pilot method metaheuristics. The computational experiments conducted are covered in Sect. 7. Finally, conclusions and directions for future research are discussed in Sect. 8.
2 Related Work
Several approaches have been proposed to solve the CRP, which is an NP-hard optimization problem [4]. In the literature, the CRP is also known as the Block Relocation Problem [2], the Block Retrieval Problem [25], and the Container Relocation Problem [9]. Traditionally, the CRP’s main optimization goal is to reduce the number of relocation operations, because a lower number of relocations empirically implies a lower operational indicator (e.g., time, energy, labor, equipment maintenance, and money consumed) to retrieve all containers from the bay. In this respect, optimal approaches can be found in [1,9,18,23,28,36], and near-optimal approaches, designed to handle larger instances, in [2,13–15,29,30]. The optimal approaches guarantee the optimal solution, but due to the NP-hardness of the CRP, the size of the problem instances that such approaches can solve is very limited when compared to the near-optimal approaches, where near-optimal solutions are pursued. Nevertheless, minimizing the number of relocations does not ensure solutions with the crane’s minimum operating indicator. For example, [20,27] indicate that the number of relocation operations and the crane’s operating time are not always proportional, and [27] shows that minimizing the number of relocations does not guarantee the solution with the crane’s minimal operating time. Accordingly, a few other works tackle optimization goals that differ from the number of relocations. In [11,26,32], the objective function is to minimize, respectively, the fuel consumption, the horizontal travel distance of the crane, and the crane’s trajectory. The studies in [8,12,14,17,19,20,27] address the crane’s operating time. Among these studies that aim to minimize the crane’s operating time, there are divergences in how the operating time is computed. Hence, the crane’s operating times are sometimes not directly comparable, as stated in [20].
For example, the authors of [9] proposed an Integer Linear Programming (ILP) formulation to minimize the crane’s travel time, but the travel time does not consider the crane’s vertical travel time. Another example is found in [12,14], where the crane’s operating time covers the crane’s horizontal and vertical travel times. However, the travel time needed to align the crane’s spreader before conducting any operation (i.e., relocation or retrieval) was not taken into consideration for computing the crane’s operating time, as is done in [17,20,27] and in this study. In particular, the algorithms proposed in [17,20,27] have produced good results in reducing the crane’s total operating time. Lin et al. [20] minimize the number of relocations, but also use a penalty function in their heuristic to minimize the crane’s operating time. Kim et al. [17] proposed a heuristic based on four circumstances present in a container bay. The two heuristics of [17,20] have shown better results than the heuristic proposed in [19] in terms of reducing the crane’s total operating time. da Silva Firmino et al. [27] developed an A* search algorithm, an optimization model, and a reactive GRASP algorithm to minimize the operating time for the restricted CRP. The reactive GRASP algorithm was able to achieve better results than the heuristic in [17], even though this heuristic was designed for the unrestricted CRP.
Depending on how relocations are allowed, there are two versions of the CRP: the restricted CRP and the unrestricted CRP. In the restricted CRP, the source stack of each relocation is restricted to the stack in which the container with the highest priority is currently located, whereas in the unrestricted CRP, no such constraint is imposed. As an example, Fig. 2 displays a restricted CRP solution with four restricted relocations. Though most studies focus on the restricted CRP, the unrestricted CRP yields more opportunities for optimization than the restricted version, as mentioned in [30]. In other words, better solutions can be achieved in the unrestricted CRP, since promising relocations may be done. Relocations are considered promising if they are not restricted and involve moving a container that needs to be relocated sooner or later in a way that results in never having to relocate it again. Thus, this work addresses the unrestricted CRP, as done in a few other works, e.g., [7,13,17,20,30]. The pilot method [33] metaheuristic has already been used for optimizing the unrestricted CRP. Tricoire et al. [30] proposed a pilot method algorithm and a constructive heuristic utilized by this algorithm. However, the proposed pilot method algorithm and its constructive heuristic were designed to minimize the number of relocation operations rather than the crane’s operating time. Similarly, the ant colony optimization (ACO) [6] metaheuristic has also been applied for optimizing the unrestricted CRP. Jovanovic et al. [14] proposed an ACO algorithm and a greedy heuristic used by this algorithm to minimize the number of container relocations. Jovanovic et al. [14] also explain how the proposed algorithms can be extended to minimize the operating time.
Nonetheless, as mentioned, the crane’s operating time covered in [14] does not consider the time required to align the crane’s spreader before performing any operation.
Taking all this into consideration, this paper differs from earlier works for the following reasons. Firstly, unlike the majority of studies, it addresses the unrestricted CRP for optimizing the crane’s operating time. In addition, the crane’s operating time covered in this paper is more complete than the one defined in [14], since our crane’s operating time computes the total time elapsed (horizontal and vertical) for the crane to empty the bay, including the time needed to align the crane’s spreader, as employed in [27]. Secondly, our approach for defining the pheromone matrix differs from that of [14]. The pheromone matrix defined in [14] consists of a multidimensional array that stores information about different bay configurations. However, this multidimensional approach for defining the pheromone matrix may generate a high memory cost, as mentioned in [14]. In this work, this problem is avoided by dynamically allocating pheromone values only for the encountered bay configurations. Also, with respect to the crane’s operating time covered in this paper, our ACO algorithm works with a different heuristic function and a lower bound to decide how to build the CRP solutions. Lastly, a constructive heuristic to quickly build high-quality solutions, in terms of the crane’s operating time, is developed. An ACO algorithm and a pilot method algorithm are also presented, both making use of the constructive heuristic to find even better solutions.
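The idea of allocating pheromone values only for encountered bay configurations can be sketched with a dictionary keyed by a snapshot of the bay. This is a hypothetical illustration, not the authors' data structure: the class name `LazyPheromone`, the initial level `tau0`, and the encoding of a relocation as a (source, target) pair are all assumptions made for the example.

```python
def bay_key(bay):
    """Hashable snapshot of a bay: a tuple of stacks, each listed bottom to top."""
    return tuple(tuple(stack) for stack in bay)

class LazyPheromone:
    """Keeps pheromone only for (bay configuration, relocation) pairs actually seen."""

    def __init__(self, tau0=1.0):
        self.tau0 = tau0   # assumed default level for pairs never visited
        self.table = {}    # grows only when a pair receives a deposit

    def get(self, bay, move):
        # Unseen pairs are not stored; they simply read as the default level.
        return self.table.get((bay_key(bay), move), self.tau0)

    def deposit(self, bay, move, amount):
        key = (bay_key(bay), move)
        self.table[key] = self.table.get(key, self.tau0) + amount

    def evaporate(self, rho):
        # Simplification: only stored entries evaporate; unseen pairs stay at tau0.
        for key in self.table:
            self.table[key] *= (1.0 - rho)

# Usage: a bay with three stacks; the move (0, 2) relocates the top of stack 0 to stack 2.
pheromone = LazyPheromone(tau0=1.0)
bay = [[3, 1], [2], []]
pheromone.deposit(bay, (0, 2), 0.5)
print(pheromone.get(bay, (0, 2)), len(pheromone.table))
```

Memory then scales with the number of configurations the ants actually visit rather than with the full space of possible bay configurations, which is the trade-off the paragraph above describes.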
3 Definition of the CRP
The CRP consists of finding a sequence of operations for the crane, enabling it to retrieve all the containers from the bay according to their retrieval priorities, where the optimal sequence of operations is the one with the minimum operating time spent by the crane. The bay has maximum width $W$ and maximum height $H$ and originally accommodates $N$ containers, labeled $1, \ldots, N$. The coordinates $(i, j)$ indicate each slot within the bay, where $i \in \{1, \ldots, W\}$ and $j \in \{1, \ldots, H\}$ denote the stack and the tier within the bay, respectively. Besides, $v(c)$ indicates the value of the retrieval priority of a container $c$, where $v(c) \in \{1, \ldots, N\}$, and all containers are of the same size. Finally, the initial configuration of the bay is known in advance, and the relocation and retrieval operations form the sequence of operations of the crane. These operations consume operating time and are subject to the following constraints:
– Each operation performed by the crane produces a bay configuration (or state), and the next crane operation must be conducted recognizing this new configuration.
– Only the container with the highest priority can be retrieved, and when a container is retrieved, it is removed from the bay.
– The retrieval operation of container $c$, such that $v(c) = N$, defines the last operation of the crane.
– Only containers from the top of a stack can be accessed (i.e., retrieved or relocated).
– The relocation operations move containers to the top of stacks, and when a stack is empty, the top is the ground (i.e., tier 1).
– The following relocation operations are not allowed: (1) from a stack $i$ to the same stack $i$; (2) to a tier $l$ such that $l > H$, i.e., it does not comply with the maximum height; (3) to a stack $k$ such that $k > W$, i.e., it does not comply with the maximum width.
– The crane’s spreader moves containers horizontally at height $H + 1$.
– In the restricted CRP, only the restricted relocation operations are allowed, that is, the source stack of each relocation is restricted to the stack in which the container with the highest priority is currently located.
A mathematical model for the CRP with a focus on minimizing the crane’s operating time has been defined in [27]. When solving the CRP, in each bay configuration examined, when the container with the highest priority is not currently located at the top of a stack, a relocation operation needs to be selected among all eligible relocations and then executed. Relocations are considered eligible if they comply with the constraints mentioned above.
In addition to the definitions presented so far, some notations used throughout this paper are described below for a better understanding. Notations and definitions regarding the crane’s operating time will be given in the following section.
– LCR expresses the list of candidate relocations to be executed by the crane in a given bay configuration. This list consists of all the eligible relocations mentioned above.
– $\breve{c}$ and $\hat{c}$ indicate, respectively, the container with the lowest priority and the container with the highest priority that are present in the bay.
– The stack where $\hat{c}$ is currently placed is designated as the source stack, and $\bar{c}$ denotes the container located at the top of the source stack.
– $S$ indicates the set of stacks of the bay, and $p(s)$ indicates the position of stack $s$ in the bay, where $p(s) \in \{1, \ldots, W\}$.
– $h(s)$ denotes the value of the highest priority present in stack $s$; if stack $s$ is empty, $h(s) = v(\breve{c}) + 1$.
– A blocking is the event where a higher priority container is below a container of lower priority, noting that a smaller value indicates a higher priority.

3.1 The Crane’s Operating Time
The crane’s operating time is the total operating time (horizontal and vertical) the crane spends to empty the bay, considering the operating time related to any operation conducted by the crane, i.e., relocation and retrieval operations. The operating time inherent to any operation done by the crane is calculated according to the steps given below.
Fig. 3. Example of the representation of the coordinates in a bay
Firstly, some notations and assumptions will be introduced. Each crane operation has an origin coordinate and a target coordinate. The retrieval operations have as their target coordinate $(\hat{W}, \hat{H})$, where $\hat{W} = W + 1$ and $\hat{H} = H + 1$, and thus any retrieval operation occurs at the upper right corner of the bay. In the initial bay configuration, the crane’s spreader is originally over stack 1. Besides, $d(s)$ denotes the distance from stack $s$ to the exit position (i.e., $|p(s) - \hat{W}|$). Figure 3 illustrates the representation of the coordinates in a bay with four stacks and three tiers. Note that the coordinates outside the bay $(1, 4)$ and $(5, 4)$ are, respectively, the initial coordinate of the crane’s spreader (i.e., $(1, \hat{H})$) and the target coordinate of any retrieval (i.e., $(\hat{W}, \hat{H})$).
The operating time is computed using the traveled distances and the respective crane speeds: the speed of the trolley (i.e., the horizontal speed) and the speed of the spreader (i.e., the vertical speed). Four crane speeds are considered: the horizontal speed carrying a container (ν1), the horizontal speed not carrying a container (ν2), the vertical speed carrying a container (ν3), and the vertical speed not carrying a container (ν4).

The function w(f, i, j, k, l) (Eq. 1) defines the operating time the crane spends to perform any operation. This function represents the traveling time to align the crane’s spreader from coordinate (f, Ĥ) to coordinate (i, j) and then move the container from coordinate (i, j) to coordinate (k, l). The coordinate (f, Ĥ) indicates the previous coordinate of the crane’s spreader before moving the container. The function w(f, i, j, k, l) comprises the following six operating times:

1. The operating time resulting from moving the crane’s trolley from position f to i horizontally without carrying any container (|f − i|).
2. The operating time resulting from lowering the crane’s spreader from tier Ĥ to j without carrying any container in order to collect the container (|j − Ĥ|).
3. The operating time resulting from lifting the container from tier j (|j − Ĥ|).
4. The operating time resulting from moving the container from stack i to position k (|i − k|).
5. The operating time resulting from lowering the container to tier l (|l − Ĥ|).
6. The operating time resulting from returning the crane’s spreader to the top without carrying any container (|l − Ĥ|).

\[
w(f, i, j, k, l) =
\underbrace{\frac{|f - i|}{\nu_2}}_{(1)} +
\underbrace{\frac{|j - \hat{H}|}{\nu_4}}_{(2)} +
\underbrace{\frac{|j - \hat{H}|}{\nu_3}}_{(3)} +
\underbrace{\frac{|i - k|}{\nu_1}}_{(4)} +
\underbrace{\frac{|l - \hat{H}|}{\nu_3}}_{(5)} +
\underbrace{\frac{|l - \hat{H}|}{\nu_4}}_{(6)}
\tag{1}
\]
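As an illustration, the six terms of Eq. 1 can be written as a small Python function. This is only a sketch: the function name and argument order are chosen here for readability, and the speed values in the usage lines are the ones reported later in Sect. 7 (expressed as distance per second).

```python
def operating_time(f, i, j, k, l, H, v1, v2, v3, v4):
    """Crane operating time of Eq. 1 for moving a container from (i, j) to (k, l).

    f is the stack over which the spreader currently rests, H is the number of
    tiers, v1/v2 are the loaded/empty trolley speeds and v3/v4 are the
    loaded/empty spreader speeds.
    """
    top = H + 1  # tier H-hat, where the spreader travels horizontally
    return (
        abs(f - i) / v2      # (1) empty trolley moves from stack f to stack i
        + abs(j - top) / v4  # (2) empty spreader is lowered to tier j
        + abs(j - top) / v3  # (3) container is lifted from tier j
        + abs(i - k) / v1    # (4) loaded trolley moves from stack i to k
        + abs(l - top) / v3  # (5) container is lowered to tier l
        + abs(l - top) / v4  # (6) empty spreader returns to the top
    )


# Bay of Fig. 3 (W = 4, H = 3): retrieve the container at stack 2, tier 3,
# with the spreader initially over stack 1. The target is (W + 1, H + 1) = (5, 4).
speeds = dict(v1=1 / 2.4, v2=1 / 1.2, v3=1 / 5.18, v4=1 / 2.59)
print(operating_time(f=1, i=2, j=3, k=5, l=4, H=3, **speeds))
```

For this retrieval, terms (5) and (6) vanish because l = Ĥ, and the remaining terms add up to about 1.2 + 2.59 + 5.18 + 7.2 = 16.17 s.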
This paper focuses on the unrestricted CRP for optimizing the crane’s operating time explained above and defined in [27]. However, it is worth mentioning that all algorithms proposed in this paper and described in the next sections can be extended to solve the restricted CRP. For this purpose, the LRC adopted by the algorithms proposed here should consist solely of restricted relocations.
4 The Heuristic Algorithm
The proposed heuristic to solve the CRP, optimizing the crane’s operating time, is called Triad. This constructive heuristic employs the three criteria defined below to select the operations used for building the solution.

Criterion 1: In case there are eligible stacks s ∈ S such that h(s) > v(c¯), then select a stack s with the smallest d(s) in order to relocate c¯, which is blocking cˆ. The advantage of this criterion is twofold. The first is that the source container c¯ is moved to a stack without generating additional relocations (i.e., non-blocking), regardless of the organization of the existing containers in that stack (i.e., when h(s) > v(c¯)). The second is that the stack closest to the exit position is selected (i.e., the smallest d(s)), thus lowering the trolley travel time when retrieving c¯.

Criterion 2: When Criterion 1 is met, a stack t is selected to receive the container c¯. If stack t has n + 1 available spaces for stacking and there exists a set C with a maximum of n containers to be pre-relocated to stack t before moving container c¯ into t, then Criterion 2 is applied as follows. The containers q ∈ C must satisfy two conditions: (1) h(t) > v(q) > v(c¯), to prevent blocking; and (2) container q is located at the top of its stack, blocking another container. Moreover, the containers in C are pre-relocated in ascending order of priority, so that the container with the lowest priority (i.e., the largest value v(q)) is pre-relocated first to stack t, thus avoiding blocking. This second criterion characterizes an unrestricted CRP approach because containers other than c¯ can be relocated (i.e., pre-relocated). The benefit of this pre-relocation is to transfer containers, at a suitable time, to a stack where they will no longer be relocated, preventing further relocations. Furthermore, only blocking containers are transferred (i.e., condition 2), meaning they would need to be relocated sooner or later because they are blocking another container.

Criterion 3: If Criterion 1 cannot be met and there are eligible stacks s ∈ S, then select a stack s with the largest h(s) to relocate c¯. Once c¯ has been placed in the selected stack s, a blocking is produced because Criterion 1 is not fulfilled (i.e., h(s) < v(c¯)). Consequently, c¯ must be relocated again in the future, and the smaller h(s) is, the sooner this second relocation is needed. Therefore, the purpose of this criterion is to delay this second relocation of c¯ as much as possible by selecting the stack with the largest h(s). This way, the second relocation will occur with fewer containers in the bay, so that an appropriate relocation is more easily found.

The relocation (A), illustrated in Fig. 2, exemplifies the use of this criterion. In this example, the container with priority eight, at the top of the first stack, needs to be moved to enable the removal of the container with priority one (i.e., v(c¯) = 8 and v(cˆ) = 1). The second stack (s2) and the fourth stack (s4) are the existing eligible stacks, with h(s2) = 7 and h(s4) = 3. Note that Criterion 1 cannot be met because h(s2) and h(s4) are smaller than v(c¯). Hence, the second stack was selected to relocate c¯ according to Criterion 3, since h(s2) > h(s4). This selected relocation (i.e., (A)) allowed the next relocation of container 8 (i.e., (J)) to take place as late as possible and thus to occur with far fewer containers in the bay.

The proposed heuristic receives as input a bay configuration and a partial solution, which it extends into a complete solution. Algorithm 1 depicts the pseudocode of this constructive heuristic. The algorithm’s iterations occur continually while the bay (bay) is not empty (lines 1–20). At the start of each iteration, the algorithm verifies whether the container with the highest retrieval priority in the bay (i.e., cˆ) is currently located at the top of its stack (line 2). If so, the container is retrieved, and hence a retrieval operation is produced (line 3). Otherwise, the container located at the top of the source stack (i.e., c¯) needs to be moved to the top of another stack.
Algorithm 1 Pseudocode of the Triad Heuristic for solving the CRP
Require: bay: bay configuration
Require: solution: partial solution
Ensure: solution built
 1: while bay is not empty do
 2:   if cˆ is located at the top then
 3:     operation ← produce a retrieval in bay by retrieving cˆ
 4:   else
 5:     t ← select a stack with Criterion 1
 6:     if t is not nil and is an unrestricted CRP then
 7:       C ← select a set of containers with Criterion 2
 8:       while C is not empty do
 9:         q ← select a container in C with Criterion 2
10:         operation ← produce a relocation in bay by moving q to the stack t
11:         add operation to solution
12:       end while
13:       operation ← produce a relocation in bay by moving c¯ to the stack t
14:     else
15:       t ← select a stack with Criterion 3
16:       operation ← produce a relocation in bay by moving c¯ to the stack t
17:     end if
18:   end if
19:   add operation to solution
20: end while
21: return solution
Since there may be various relocation options in the LRC, Criterion 1 is applied to select a stack t to place the container c¯. If there is such a stack t and a CRP unrestricted approach is employed, then a set of containers C is generated according to Criterion 2 (line 7). Next, as defined in Criterion 2, each container q ∈ C is relocated to stack t producing relocation operations in the solution (lines 9–11). When all containers in C are pre-relocated to stack t, then c¯ is relocated to stack t, and a relocation operation is produced (line 13). In case Criterion 1 does not return any stack, Criterion 3 is triggered, and a relocation operation is produced (lines 15–16). At the end of each iteration, the operation produced in the iteration (operation) is added to the solution (line 19). Finally, when the bay is emptied, the solution built is returned (line 21). It is noteworthy that the Triad heuristic is an extension of the heuristic with a penalty function developed in [20] for the unrestricted CRP. The main differences between these two heuristics are as follows. Firstly, here, the condition for pre-relocating a container is [h(t) > v(q) > v(¯ c)], while in the original heuristic, such condition is [(h(t) > v(q) > h(t) − 5) ∧ (v(q) > v(¯ c))]. The original condition was not adopted in the Triad heuristic because it limits the number of containers that can be pre-relocated. Secondly, the Triad heuristic is proposed as a constructive heuristic and, in addition, it has no penalty factor. Furthermore, the Triad heuristic can be used to solve the restricted CRP as well, since it provides different treatments for both approaches: restricted and unrestricted.
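The stack choice made by Criteria 1 and 3 can be sketched in Python as follows. The bay representation (a list of stacks, each listing priority values from bottom to top), the 0-based stack indices, and the function names are assumptions made for this illustration; the pre-relocation step of Criterion 2 is left out.

```python
def h(stack, lowest):
    """Highest priority (smallest value) in a stack, or lowest + 1 if it is empty."""
    return min(stack) if stack else lowest + 1


def select_stack(bay, src, H):
    """Choose the stack that receives the top container of bay[src].

    Returns (stack_index, criterion), or (None, None) if no stack is eligible.
    Stacks are indexed from 0, so p(s) = s + 1 and d(s) = |p(s) - (W + 1)| = W - s.
    """
    W = len(bay)
    v_c = bay[src][-1]                        # priority value of the container to move
    lowest = max(v for s in bay for v in s)   # value of the lowest-priority container
    eligible = [s for s in range(W) if s != src and len(bay[s]) < H]
    if not eligible:
        return None, None
    non_blocking = [s for s in eligible if h(bay[s], lowest) > v_c]
    if non_blocking:
        # Criterion 1: non-blocking stack closest to the exit position.
        return min(non_blocking, key=lambda s: W - s), 1
    # Criterion 3: blocking is unavoidable, so postpone it as long as possible.
    return max(eligible, key=lambda s: h(bay[s], lowest)), 3


# Hypothetical bay (not the one of Fig. 2) that reproduces the quoted values:
# container 8 blocks container 1, h(s2) = 7, h(s4) = 3, and the third stack is full.
bay = [[1, 8], [9, 7], [2, 5, 4], [6, 3]]
print(select_stack(bay, src=0, H=3))
```

With this bay, no eligible stack satisfies h(s) > 8, so the function falls back to Criterion 3 and picks the second stack (index 1), which matches the choice described for relocation (A).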
Fig. 4. Example of states and edges for an instance of the CRP
5 The ACO Algorithm
The ant colony optimization (ACO) metaheuristic [6] may be seen as an iterative process in which each iteration consists, briefly, in repeating three tasks until a certain stopping criterion is reached. The first task consists in having a colony of n artificial ants, where each ant generates a path (i.e., a solution), even if it is only a partial one, and deposits a certain amount of pheromone while walking. This pheromone information is used to guide the path-construction tasks of other ants. The second task involves evaporating a certain amount of pheromone. The last task comprises updating the best solution found so far. For solving the CRP, an ACO approach incorporating a constructive heuristic is presented. This approach considers the CRP as a path-construction problem, where nodes (or states) represent the bay configurations, and edges denote container operations. Hence, the aim is to find a (shortest) path from the initial state to a final state, exploring the states and their edges. The initial state is generated from the initial bay configuration, and each new state to be explored is generated through one operation (i.e., retrieval or relocation). The final state is achieved when all containers are retrieved. Figure 4 illustrates this representation with an instance considering 3 containers and 3 stacks, where the initial state is the leftmost, and the final state is the rightmost. Regarding the constructive heuristic incorporated in the proposed ACO approach, it extends partial solutions produced by the ants, while they are walking through the states and edges. The constructive heuristic works as an additional, fast and smart ant, where this one ant quickly finds a path (i.e., a solution), guided only by its heuristic knowledge.
Moreover, the proposed ACO approach adopts a probabilistic transition rule to decide how to expand a partial solution and a dynamic pheromone model to collect the expertise acquired by the artificial ants. This expertise is gained through the deposit and evaporation of pheromone, the latter using global and local update rules.

The proposed ACO algorithm differs from existing ones for the following reasons. First, a constructive heuristic has been incorporated into ACO for extending partial solutions produced by the ants. Second, as mentioned previously, a dynamic pheromone model has been adopted to avoid a high memory cost. More details about this model are given in the next section. Lastly, the proposed ACO approach addresses the unrestricted CRP for optimizing the crane’s operating time, and its implementation needed to be customized for this optimization goal. Therefore, the following sections describe the pheromone model, the transition rule, the local and global update rules, and the implementation details of the proposed ACO algorithm for the CRP.

5.1 The Pheromone Model
The proposed pheromone model assigns a pheromone value τ(b,o) to each state-operation pair (b, o), where b is a bay configuration (i.e., a state) and o is an eligible crane operation (i.e., an edge) originating from b. Thus, the pheromone values guide the ants to follow promising (compound) operations, according to the current bay configuration. The number of possible bay configurations grows exponentially as the bay dimension gets bigger, i.e., as the number of stacks, tiers, and containers increases [18]. Hence, storing all pheromone values in a statically allocated matrix may generate a high memory cost, as stated in [14]. Therefore, in the pheromone model proposed here, the pheromone values are created on the fly and maintained in a hash table.

5.2 The Transition Rule
When an ant reaches a state b, it must decide which edge to select among all edges originating from b in order to proceed to the next state. When applied to the CRP, for the bay configuration represented by b, this transition rule consists in selecting a container operation among the operations of the LRC. This selection is based on the heuristic function defined below and on the expertise stored in the pheromone model. Let σ be a container operation that moves a container c to a stack s, i.e., σ = (c, s). The proposed heuristic function f(σ) is defined by Eq. 2, and this function is equivalent to the function f(c, s) (Eq. 3).

\[
f(\sigma) = f(c, s) \tag{2}
\]

\[
f(c, s) =
\begin{cases}
d(s)/W + 1/v(c), & \text{if } h(s) > v(c), \\
M + 1/h(s), & \text{otherwise.}
\end{cases}
\tag{3}
\]
The function f(σ) (through f(c, s)) indicates how desirable the operation σ is, where a smaller value indicates higher desirability. The most desirable operations are those that produce no blocking (i.e., h(s) > v(c)). For this reason, M is a “sufficiently large” constant used by Eq. 3 to ensure low desirability for operations that produce blocking. If all existing operations produce blocking, the preferred operations are those whose stacks s have higher values of h(s) (indicated by 1/h(s)), thus delaying as much as possible the next relocation of the container c. Among non-blocking operations, the most desirable are those whose stacks s are closer to the exit position and whose containers c have lower priority, indicated, respectively, by d(s)/W and 1/v(c). This preference increases the chances of container c being the next to be removed from the bay and of container c staying closer to the exit position, lowering the trolley travel time when retrieving c.

For convenience, the heuristic function f′(σ) (Eq. 4) is adopted instead of function f(σ), since for f′(σ) a higher value indicates higher desirability. This way, the value of f′(σ) is inversely proportional to the value of f(σ), while one (1) is added to avoid a division by zero. Once the proposed heuristic function is specified, the function g(σ) is defined by Eq. 5. This function computes the product of the heuristic function f′ by the corresponding pheromone value.

\[
f'(\sigma) = \frac{1}{1 + f(\sigma)} \tag{4}
\]

\[
g(\sigma) = f'(\sigma) * \tau_{(b,\sigma)} \tag{5}
\]
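A minimal Python sketch of Eqs. 3 to 5 is given below. The value chosen for the constant M and the function names (f, f_prime for the inverted desirability of Eq. 4, and g) are assumptions for illustration only; the source states only that M is “sufficiently large”.

```python
M = 1000.0  # assumed value for the "sufficiently large" constant of Eq. 3


def f(d_s, W, v_c, h_s):
    """Eq. 3: smaller is better; blocking moves are pushed above M."""
    if h_s > v_c:                       # non-blocking relocation
        return d_s / W + 1.0 / v_c
    return M + 1.0 / h_s                # blocking relocation


def f_prime(d_s, W, v_c, h_s):
    """Eq. 4: inverted desirability, larger is better, always in (0, 1]."""
    return 1.0 / (1.0 + f(d_s, W, v_c, h_s))


def g(d_s, W, v_c, h_s, tau):
    """Eq. 5: heuristic desirability weighted by the pheromone value tau."""
    return f_prime(d_s, W, v_c, h_s) * tau


# Moving a container with v(c) = 4 in a bay with W = 4 stacks:
print(f_prime(d_s=1, W=4, v_c=4, h_s=7))   # non-blocking target near the exit
print(f_prime(d_s=1, W=4, v_c=4, h_s=2))   # blocking target
```

The first call scores a non-blocking move and the second a blocking one, so the first value is much larger, which is the ordering the transition rule relies on.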
Finally, the proposed transition rule is specified by Eq. 6, applying the function g(σ). The next operation is selected deterministically or nondeterministically according to a condition that compares a random variable q ∈ (0, 1) with a parameter q0 used to define the exploration rate. In the first case (q < q0), the operation with the maximal value of g(σ) is selected. Otherwise (q ≥ q0), the probability distribution defined by Eq. 7 is applied for selecting an operation. The probability of selecting an operation σ is proportional to g(σ) and inversely proportional to the sum of g(σ′) over all the relocations σ′ found in the LRC.

\[
select =
\begin{cases}
\arg\max_{\sigma \in LRC} g(\sigma), & q < q_0, \\
prob(\sigma), & q \geq q_0
\end{cases}
\tag{6}
\]

\[
prob(\sigma) = \frac{g(\sigma)}{\sum_{\sigma' \in LRC} g(\sigma')} \tag{7}
\]

5.3 Global and Local Update Rules
The global and local update rules specify when and how the pheromone values are updated. The proposed ACO approach is based on the ant colony system (ACS) [6]. Consequently, the global update rule is applied after each iteration
of the colony of ants, and only the best solution found deposits pheromone. The objective of this global update rule is to intensify the exploration around high-quality solutions, reinforcing the selection of operations contained in the set of best solutions found. This update rule is formally defined using the equations below.

\[
val(sol) = \frac{1}{1 + t(sol) - LB} \tag{8}
\]

\[
\tau_{(b,\sigma)} = (1 - p) * \tau_{(b,\sigma)} + p * \Delta\tau, \quad \forall (b, \sigma) \in sol_{best}, \quad \Delta\tau = val(sol_{best}) \tag{9}
\]
The function val(sol) (Eq. 8) measures the quality of a solution sol, where this quality is inversely proportional to the difference between its operating time (i.e., t(sol)) and the lower bound LB. This LB denotes a lower bound for all possible CRP solutions derived from the initial bay configuration and is measured in terms of the crane’s operating time. Our ACO approach adopts the lower bound proposed in [27], which is employed as a heuristic function in an A* algorithm. Note that one is added in the denominator of Eq. 8 to avoid a division by zero. Using the function val with the current best solution solbest (i.e., val(solbest)), the variable Δτ stores the quality of solbest. This variable is used by Eq. 9 for updating the pheromone values associated only with the operations found in solbest. Therefore, the global update rule is applied through Eq. 9, where the parameter p ∈ (0, 1) specifies the influence of the best found solutions regarding the deposit of pheromone.

Another updating method is the local update rule, whose goal is to diversify the exploration of the solution space by preventing the ants from selecting the same operations found in a given solution. In our ACO approach, after an ant i has generated a solution soli, the local update rule is applied using Eq. 10. The parameter ϕ ∈ (0, 1) is used to specify the intensity of pheromone evaporation in the local update rule.

\[
\tau_{(b,\sigma)} = \varphi * \tau_{(b,\sigma)}, \quad \forall (b, \sigma) \in sol_i \tag{10}
\]

5.4 Implementation
The first implementation details that must be explained for the proposed ACO method are the initialization value and the minimal limit for the pheromone model. To this end, Eq. 11 defines the initial value (τ0) and the pheromone minimal limit (τmin). Both values are based on the solution quality, since τmin is related to the current best found solution solbest, and τ0 is associated with the solution solinit found by the constructive heuristic when solving the initial bay configuration.

\[
\tau_0 = \frac{val(sol_{init})}{W}, \qquad \tau_{min} = \frac{val(sol_{best})}{W^2} \tag{11}
\]
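The pheromone bookkeeping described in Sects. 5.1 to 5.4 can be sketched with a Python dictionary playing the role of the hash table. The class and method names are hypothetical, states are assumed to be hashable (e.g., tuples of tuples), and clamping values at τmin is one possible reading of the “minimal limit”, not a detail confirmed by the text.

```python
def val(t_sol, lower_bound):
    """Eq. 8: solution quality from its operating time and the lower bound LB."""
    return 1.0 / (1.0 + t_sol - lower_bound)


class PheromoneTable:
    """Pheromone values created on demand and kept in a hash table."""

    def __init__(self, tau0):
        self.tau0 = tau0
        self.values = {}  # (state, operation) -> pheromone value

    def get(self, state, op):
        # Pairs never visited before implicitly hold the initial value tau0.
        return self.values.get((state, op), self.tau0)

    def global_update(self, best_solution, p, delta_tau, tau_min):
        """Eq. 9, applied to the (state, operation) pairs of the best solution."""
        for state, op in best_solution:
            tau = (1.0 - p) * self.get(state, op) + p * delta_tau
            self.values[(state, op)] = max(tau_min, tau)  # assumed clamping

    def local_update(self, solution, phi, tau_min):
        """Eq. 10, applied to the pairs of the solution an ant has just built."""
        for state, op in solution:
            tau = phi * self.get(state, op)
            self.values[(state, op)] = max(tau_min, tau)  # assumed clamping


# Eq. 11 for a bay with W = 4 stacks, t(sol_init) = 103 and LB = 100.
W = 4
tau0 = val(103, 100) / W
tau_min = val(103, 100) / W ** 2
table = PheromoneTable(tau0)
```

Only the pairs that an update actually touches occupy memory, which is the point of the dynamic model: the table grows with the number of visited state-operation pairs rather than with the number of possible bay configurations.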
Algorithm 2 outlines the proposed ACO algorithm for the CRP. The first step of the algorithm is to generate a solution solinit by applying the constructive heuristic to the initial state (stateinit), which is created from the initial bay configuration (bay). Besides, the current best solution (solbest) is initialized with solinit (lines 1–3). Next, the lower bound (i.e., LB) is calculated based on the initial state of the bay (stateinit), and the initial pheromone value is computed (lines 4–5). The ACO iterations are performed until some stopping criterion is met (lines 6–30). At each iteration, a colony with n ants is created, and each of these ants generates a full or empty solution (lines 7–28).
Algorithm 2 Pseudocode of the ACO algorithm for solving the CRP
Require: bay: initial bay configuration, q0: the exploration rate
Require: p: the pheromone deposit rate, ϕ: the pheromone evaporation rate
Ensure: best solution found
 1: stateinit ← bay
 2: solinit ← create an initial solution by the constructive heuristic using stateinit and an empty solution as input
 3: solbest ← solinit
 4: calculate the lower bound LB based on stateinit
 5: compute the initial pheromone value using LB and solinit
 6: while stopping criteria not satisfied do
 7:   for n ants do
 8:     sol ← ∅; state ← stateinit
 9:     while state is not final state do
10:       if cˆ is located at the top then
11:         operation ← produce a retrieval in state by retrieving cˆ
12:       else
13:         operation ← produce a relocation in state according to the transition rule and with the exploration rate q0
14:       end if
15:       if operation = nil then
16:         sol ← ∅; break while
17:       end if
18:       add operation to sol
19:       if operation is a relocation then
20:         solh ← create a solution by the constructive heuristic using state and sol as input
21:         update solbest with the best solution between solbest and solh
22:       end if
23:     end while
24:     if sol ≠ ∅ then
25:       apply the local update rule using sol and the evaporation rate ϕ
26:       update solbest with the best solution between solbest and sol
27:     end if
28:   end for
29:   apply the global update rule using solbest and the deposit rate p
30: end while
31: return solbest
At the beginning of each ant’s iteration, the solution (sol) is initialized with an empty set of operations and the current exploration state (state) is initialized with the initial state (line 8). Then, the ant’s exploration starts and is executed continuously until the final state is achieved, i.e., the corresponding bay is empty (lines 9–23). At the start of each exploration step, the algorithm verifies whether the container with the highest retrieval priority in the bay (i.e., cˆ) is currently located at the top of its stack (line 10). If so, the container is retrieved, meaning a retrieval operation is produced (line 11). Otherwise, a relocation operation needs to be selected from within the LRC according to the transition rule and the exploration rate q0 (line 13).

It is worth pointing out that, in the composition of the LRC, potentially unpromising operations are banned from this list to obtain exploration efficiency. An operation σ produces a state b and composes a partial solution solpartial together with the previous operations. Then, σ is considered potentially unpromising if solpartial cannot be used to generate a solution that is better than the current best found solution. Thus, σ is excluded from the LRC when the sum of the lower bound for state b and the crane’s operating time of solpartial is greater than or equal to the crane’s operating time of the current best found solution. Moreover, when an operation σ that has handled a container c is added to the current solution, any subsequent operation that handles the same container c is also banned from the LRC.

If there is no operation to be added to the solution, then the solution generation is aborted, thus reducing the computational cost (line 16). Otherwise, the operation produced (operation) is added to the solution (line 18). When an operation is produced (line 11 or 13), the bay’s current state (state) is updated according to operation.

After a relocation operation is added (line 19), the constructive heuristic is used for generating an additional solution solh using the current state (state) and the partial solution (sol) achieved by operation (line 20). Afterward, the solution with the lowest crane’s operating time is assigned to solbest by comparing the current best solution (solbest) to the solution obtained by the heuristic (solh) (line 21). After generating a non-empty solution sol, the local update rule is applied to the solution sol, and solbest is updated if sol is better than solbest (lines 25–26). Subsequently, when all the n ants of a colony have generated their solutions, the global update rule is applied (line 29). The algorithm ends when any of the following stopping criteria is met: either the runtime limit is reached, or a maximum number of consecutive iterations is reached without improvements in the quality of the solution generated by the ants. Finally, the best solution found is returned (line 31).
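The relocation choice in line 13 of Algorithm 2 combines the pruning test described above with the transition rule of Eqs. 6 and 7. The sketch below shows both pieces in Python under simplifying assumptions: the function names are hypothetical, the caller is assumed to have already computed g(σ) for every operation that survived pruning, and Python's standard random module stands in for the generator used by the authors.

```python
import random


def is_unpromising(partial_time, state_lower_bound, best_time):
    """Pruning test: the partial solution cannot beat the current best solution."""
    return partial_time + state_lower_bound >= best_time


def select_operation(g_values, q0, rng):
    """Transition rule of Eqs. 6 and 7.

    g_values maps each operation left in the candidate list to its g value.
    Returns None when the list is empty (line 15 of Algorithm 2).
    """
    if not g_values:
        return None
    ops = list(g_values)
    if rng.random() < q0:
        # Deterministic case of Eq. 6: take the operation with maximal g.
        return max(ops, key=g_values.get)
    # Otherwise draw an operation with probability g / sum(g), as in Eq. 7.
    weights = [g_values[op] for op in ops]
    return rng.choices(ops, weights=weights, k=1)[0]


rng = random.Random(14159)  # seed1 from Sect. 7
candidates = {("c8", "s2"): 0.04, ("c8", "s4"): 0.01}
print(select_operation(candidates, q0=0.9, rng=rng))
```

With q0 = 0.9, about nine choices out of ten follow the best-scoring operation, and the remaining ones are spread over the candidates in proportion to g.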
6 The Pilot Method Algorithm
The pilot method is a metaheuristic which enhances the quality of a heuristic that incorporates it, by providing an intelligent, constructive mechanism to evaluate certain decisions [33]. This metaheuristic applies a constructive heuristic, called the pilot heuristic, as a lookahead sub-heuristic to guide a master building process toward a more promising solution. Starting from an initial state, three tasks are repeated in an iterative process until a given stopping criterion is reached. The first task is evaluating every possible construction step of the current state using the solution produced by the pilot heuristic. The second task consists in performing the highest-ranked step; as a result, a new state is generated and set as the current state. The last task is updating the best solution found so far.

In this paper, regarding the implementation of the pilot method for the CRP, the proposed Triad heuristic is used as the pilot heuristic. Besides, the initial state is created from the initial bay configuration, and each new state is generated by one operation, i.e., a construction step, which can be either a retrieval or a relocation. The final state is achieved when all containers are retrieved. In addition, a construction state includes the current bay configuration and the partial solution derived from all the operations performed to achieve this configuration from the initial bay configuration.

Algorithm 3 outlines the proposed pilot method for the CRP. Initially, the best solution (bestsol) and the initial state (state) are initialized (line 1). At each iteration, all containers that can be removed from the bay are retrieved and state is updated (line 3). Afterward, the variables that store the best descendant of state and its respective cost are initialized (line 4). Then, all possible descendants of state are evaluated using the heuristic procedure (lines 5–14). This procedure returns a solution found by the constructive heuristic (i.e., Triad), taking into account the partial solution and the bay configuration present in the descendant state d. The descendant whose solution presents the shortest crane’s operating time (i.e., cost(solution)) is defined as the best descendant (line 8). If the best descendant finds a new best solution, bestsol is updated (lines 10–12). The algorithm ends when any of the following stopping criteria is met: the final state is achieved (i.e., state is nil), a runtime limit is reached, or a maximum number of consecutive iterations is reached without any improvement in the quality of the best solution. Finally, the best solution found is returned (line 17).

Algorithm 3 Pseudocode of the Pilot Method algorithm for solving the CRP
Require: bay: initial bay configuration
Ensure: best solution found
 1: bestsol ← nil; state ← bay
 2: while stopping criteria not satisfied do
 3:   state ← remove all retrievable containers in state and get the resulting state
 4:   bestdescend ← nil; bestcost ← ∞
 5:   for all d ∈ descendants(state) do
 6:     solution ← create a solution by the constructive heuristic using the partial solution and the bay configuration present in the state d
 7:     if cost(solution) < bestcost then
 8:       bestdescend ← d
 9:       bestcost ← cost(solution)
10:       if bestsol = nil or bestcost < cost(bestsol) then
11:         bestsol ← solution
12:       end if
13:     end if
14:   end for
15:   state ← bestdescend
16: end while
17: return bestsol
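The control flow of Algorithm 3 can be sketched in Python with the problem-specific parts passed in as functions. Everything named here is hypothetical: descendants(state) is assumed to return the states reachable by one construction step (with the retrieval step of line 3 folded into it), pilot(state) is assumed to run the pilot heuristic and return a (cost, solution) pair, and max_steps stands in for the runtime and no-improvement limits.

```python
import math


def pilot_method(state, descendants, pilot, max_steps=10_000):
    """Generic pilot-method loop following the structure of Algorithm 3."""
    best_sol, best_sol_cost = None, math.inf
    for _ in range(max_steps):
        best_desc, best_cost = None, math.inf
        for d in descendants(state):
            cost, solution = pilot(d)          # lookahead with the pilot heuristic
            if cost < best_cost:
                best_desc, best_cost = d, cost
                if cost < best_sol_cost:       # new overall best solution
                    best_sol, best_sol_cost = solution, cost
        if best_desc is None:                  # no descendant: final state reached
            break
        state = best_desc                      # commit to the best-ranked step
    return best_sol, best_sol_cost


# Toy problem: state = (remaining units, cost so far). One step removes one unit
# at cost 3 or two units at cost 4; the pilot finishes with one-unit steps only.
def toy_descendants(state):
    rem, cost = state
    steps = [(rem - 1, cost + 3)] if rem >= 1 else []
    if rem >= 2:
        steps.append((rem - 2, cost + 4))
    return steps


def toy_pilot(state):
    rem, cost = state
    return cost + 3 * rem, state


print(pilot_method((4, 0), toy_descendants, toy_pilot))
```

In the toy run, the greedy pilot alone would pay 12 from the start state, whereas the lookahead commits to the two-unit step twice and ends with a cost of 8.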
7 Computational Experiments
In this section, computational experiments carried out with instances of different sizes are presented and used to assess the proposed algorithms, as well as to compare their results to those of existing algorithms already applied to the CRP. Experiments were carried out on a computer with an Intel(R) Core(TM) i7-5500U CPU @ 2.40 GHz and 16 GB of RAM, running 64-bit Windows 10 Education. The Mersenne Twister found in the Apache Commons Math 3.3 library was the adopted random number generator. The random seeds were taken from the decimal places of π (.1415926535897932 . . .), grouped in fives, e.g., seed1 = 14,159, seed2 = 26,535, seed3 = 89,793, and so on.

All analyzed algorithms were programmed in Java and tested on the instance suite presented in [27]. In it, each problem instance results from the combination of the number of stacks and tiers with several bay occupancy rates, where containers are randomly spread within the bay and have distinct priorities. This combination resulted in 1200 instances with storage capacities ranging from 20 to 128 containers, where the instances were classified as large, medium, normal, and small, according to their storage capacities.¹ The storage capacity is the product of the number of tiers by the number of stacks. For example, a bay with 4 tiers and 6 stacks (4 × 6) has a storage capacity equal to 4 × 6 = 24. It is worth mentioning that the evaluated instance suite includes instances far larger than those of practical use, where the storage capacity is typically between 24 and 60 containers [22,35].

In all algorithms analyzed, the crane’s operating time was computed using the operating time function defined by Eq. 1 of Sect. 3.1. The speed of the trolley is 2.4 s per container width while loaded and 1.2 s per container width while empty (i.e., ν1 = 1/2.4 and ν2 = 1/1.2). The speed of the spreader is 5.18 s per tier while loaded and 2.59 s per tier while empty (i.e., ν3 = 1/5.18 and ν4 = 1/2.59). Lastly, in all the computational experiments, the proposed ACO algorithm used the standard values of the ACO parameters specified in [6]. The parameter values are p = 0.1, ϕ = 0.9 and q0 = 0.9. Besides, the number of artificial ants in a colony (i.e., n) is equal to the number of stacks in the bay.

7.1 Analysis of the Triad Heuristic Algorithm
In this section, we analyze the performance of the Triad heuristic of Sect. 4 by comparing it with other heuristic methods. Three heuristic algorithms are evaluated: TRIAD, PENALTY, and CASE, which were all designed to reduce the crane’s operating time for the unrestricted CRP. The first is the Triad algorithm proposed in this paper. The second is the heuristic with a penalty function defined in [20]. Because the PENALTY heuristic is influenced by the penalty factor, in our experiments, the penalty factor was set to 30. This value was adopted because it produced the best results in the experiments carried out in [20] regarding the crane’s operating time. The last algorithm, proposed in [17], is also a recent heuristic algorithm with good results in reducing the operating time. All analyzed algorithms were executed 21 times for each of the 1200 instances.

¹ The 1200 instances are available at www.cin.ufpe.br/~asf2/csp/instances/.
Fig. 5. A comparison of TRIAD, PENALTY, and CASE algorithms regarding the crane’s operating time
The performance results for the TRIAD, PENALTY, and CASE algorithms processed over the 1200 instances are shown in Fig. 5. This figure shows the total crane’s operating time for the solutions obtained by the algorithms over the instance classes. Also, this figure displays the percentage reduction values promoted by the TRIAD algorithm. The computational results indicate that TRIAD produces better results than PENALTY and CASE for all instance classes. The crane’s operating time obtained by TRIAD generated the lowest value when compared to the two other algorithms, with an operating time reduction of 2.7% when compared to PENALTY and of 10.3% when compared to CASE. The total operating time in hours was 32,284, 33,168, and 35,982 for TRIAD, PENALTY, and CASE, respectively. Besides, as the instance sizes increase, the PENALTY and CASE algorithms tend to degrade the quality of solutions. This adverse situation is better evidenced in large-sized problem instances, where the reduction values were 3.6 and 11.0%. It is worth mentioning that the TRIAD, PENALTY, and CASE algorithms reported an average runtime below one millisecond for
the 1200 instances. Therefore, direct comparisons concerning average runtime are not reported. Finally, the nonparametric Wilcoxon Rank Sum test [24] confirmed the statistical relevance of the obtained results. At the 99% confidence level, this test reported a statistically significant difference between the quality of the solutions obtained with the TRIAD, PENALTY, and CASE algorithms. Therefore, the proposed TRIAD algorithm produced significantly better solutions than the other algorithms in reducing the crane’s operating time.
7.2 Analysis of the ACO Algorithm
Fig. 6. Coefficient of variation for the solutions produced by ACO over the instances (panels: SMALL, NORMAL, MEDIUM, LARGE; vertical axis: coefficient of variation, 0% to 2.45%; horizontal axis: instances ordered by storage capacity in ascending order)
This experiment aims to evaluate the performance of the proposed ACO algorithm. Because randomness influences the ACO algorithm, it was executed 21
times for each instance, and each execution used the random seed corresponding to its execution number. For example, execution 1 used seed_1, whose value is 14,159, while execution 2 used seed_2, whose value is 26,535, and so on. The ACO algorithm was executed with the following stopping criteria: (i) a runtime limit of 6 s; and (ii) a maximum of x consecutive iterations without improvement in the quality of the solution generated by the ants, where x is the product of the numbers of stacks (W), tiers (H), and containers (N) found in the bay (i.e., x = W × H × N). The short limit of 6 s was chosen to make the comparison between ACO and the other algorithms possible, because the TRIAD, PENALTY, and CASE algorithms are fast to execute. First, the solutions obtained by ACO over the 21 runs on each of the 1200 instances were evaluated for variability. The proposed algorithm showed good homogeneity in the crane’s operating time of the solutions produced, especially for the small-sized and normal-sized instances, as shown in Fig. 6. The ACO algorithm demonstrated a low dispersion of objective values among the solutions produced: the highest coefficient of variation is 2.43% and the mean coefficient of variation is 0.25%. Therefore, although the proposed algorithm is a randomized approach, its solutions tend to converge even when distinct random seeds are adopted. The comparison results presented in Fig. 7 show the efficiency of ACO in terms of the crane’s operating time on the assessed instances. This figure compares the total crane’s operating time of the solutions obtained by ACO, TRIAD, PENALTY, and CASE and shows the percentage reductions achieved by the ACO algorithm for each instance class. The computational results indicate that ACO obtains better results than the other algorithms for all instance classes.
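The variability measure used above, the coefficient of variation, is the standard deviation of the objective values across runs divided by their mean. The following is a minimal sketch of that computation; the function name `coefficient_of_variation`, the sample data, and the use of the population standard deviation are assumptions made for illustration, since the text does not state which form was used.

```python
import statistics


def coefficient_of_variation(values):
    """Return the coefficient of variation (std / mean) as a percentage."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    # Population standard deviation: an assumption, the text does not say
    # whether the population or the sample form was used.
    return 100.0 * statistics.pstdev(values) / mean


# Hypothetical operating times (hours) from 21 seeded runs on one instance.
run_times = [26.0 + 0.1 * (i % 3) for i in range(21)]
cv = coefficient_of_variation(run_times)
print(f"CV = {cv:.2f}%")
```

A value well below 1%, as in this made-up sample, would correspond to the low dispersion the text reports for most instances.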
The operating time reduction is 6.1% compared to TRIAD, 8.6% compared to PENALTY, and 15.7% compared to CASE, given that the total operating time in hours is 30,325, 32,284, 33,168, and 35,982 for ACO, TRIAD, PENALTY, and CASE, respectively. In addition, according to the nonparametric Wilcoxon Rank Sum test, the proposed ACO algorithm produced significantly better solutions than the other algorithms in reducing the crane’s operating time. At the 99% confidence level, this test revealed a statistically significant difference between the quality of the solutions obtained with ACO and with the other algorithms. It is noteworthy that the ACO algorithm was able to find even better solutions than the TRIAD algorithm, bearing in mind that TRIAD had already found better solutions than the other heuristics. Note also that ACO uses TRIAD as a constructive heuristic to produce its initial solution and additional solutions while the ants are walking.
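The percentages quoted above can be checked directly from the reported totals. The short sketch below assumes that the reduction is measured relative to the competing algorithm's total, i.e., (baseline − proposed) / baseline; the text does not spell out the formula, but this reading reproduces the reported values. The helper name `reduction_percent` is hypothetical.

```python
# Total operating times in hours, as reported in the text.
totals = {"ACO": 30325, "TRIAD": 32284, "PENALTY": 33168, "CASE": 35982}


def reduction_percent(proposed, baseline):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


# Reduction achieved by ACO over each of the other algorithms.
reductions = {
    name: round(reduction_percent(totals["ACO"], hours), 1)
    for name, hours in totals.items()
    if name != "ACO"
}
print(reductions)  # {'TRIAD': 6.1, 'PENALTY': 8.6, 'CASE': 15.7}
```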
Fig. 7. A comparison of ACO, TRIAD, PENALTY, and CASE algorithms regarding the crane’s operating time
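The two stopping criteria used for ACO in this section (a 6 s runtime limit and at most x = W × H × N consecutive iterations without improvement) can be sketched as a generic improvement loop. This is only an illustration under stated assumptions: the loop stops as soon as either criterion is met, and `run_with_stopping_criteria`, `improve_step`, and `toy_step` are hypothetical names standing in for one iteration of the colony, not the authors' implementation.

```python
import random
import time


def run_with_stopping_criteria(initial_cost, improve_step, W, H, N,
                               time_limit_s=6.0):
    """Iterate until the time limit or x = W*H*N non-improving iterations."""
    max_stagnation = W * H * N  # x as defined in the text
    best = initial_cost
    stagnation = 0
    start = time.monotonic()
    # Assumption: the run stops as soon as either criterion is met.
    while (time.monotonic() - start < time_limit_s
           and stagnation < max_stagnation):
        candidate = improve_step(best)  # hypothetical: one colony iteration
        if candidate < best:
            best, stagnation = candidate, 0
        else:
            stagnation += 1
    return best


rng = random.Random(14159)  # seed value quoted in the text for execution 1


def toy_step(cost):
    # Stand-in for an ACO iteration: occasionally returns a lower cost.
    return cost - 1.0 if rng.random() < 0.1 and cost > 50.0 else cost


best = run_with_stopping_criteria(100.0, toy_step, W=3, H=3, N=6)
print(best)
```

Because x grows with the bay size, larger instances are allowed more non-improving iterations before the search gives up, while the time limit caps the total effort.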
7.3 Analysis of the Pilot Method Algorithm
A comparison between the pilot method algorithm (PILOT), proposed in this paper, and the other heuristic algorithms is also performed. PILOT was executed with the following stopping criteria: (i) a runtime limit of 6 s; and (ii) a maximum of 25 consecutive iterations without improvement in the quality of the best solution. As with the ACO algorithm, the short runtime of 6 s was used to enable comparisons with the other fast-executing heuristics. The proposed pilot method algorithm has been designed to enhance the quality of a heuristic; in this study, the aim is to improve the solution found by the TRIAD heuristic. With this in mind, comparative experiments between the PILOT, TRIAD, ACO, PENALTY, and CASE algorithms were performed. The experimental results are shown in Fig. 8. This figure compares the total crane’s operating time generated by the algorithms and shows the reductions achieved by the PILOT algorithm for the 1200 instances over the 21 runs. The operating time reduction is 3.0% compared to ACO, 8.9% compared to TRIAD, 11.3% compared to PENALTY, and 18.3% compared to CASE. The total operating time in hours is 29,404, 30,325, 32,284, 33,168, and 35,982 for PILOT, ACO, TRIAD, PENALTY, and CASE, respectively. Moreover, as the instance sizes increase, the proposed algorithms (notably the PILOT algorithm) tend to improve the quality of solutions with respect to the PENALTY and CASE algorithms. This positive aspect is more noticeable in large-sized problem instances, where the reduction values are higher. It is worth mentioning that all algorithms reported an average runtime
Fig. 8. A comparison of PILOT, ACO, TRIAD, PENALTY, and CASE algorithms regarding the crane’s operating time
of less than 3 s. Hence, due to these very short runtimes, a direct comparison concerning average runtime is not reported. The nonparametric Wilcoxon Rank Sum test revealed, at the 99% confidence level, a statistically significant difference between the quality of the solutions produced by PILOT and by the other algorithms for the 1200 instances. Thus, the proposed PILOT algorithm produced significantly better solutions than the other algorithms in reducing the crane’s operating time. It is worth noting that the PILOT solutions are only slightly better than the ACO solutions in terms of the operating time reduction (i.e., 3.0%). In particular, on small-sized and normal-sized instances, the reduction is only 0.4% and 1.7%, respectively.
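For readers unfamiliar with the rank-sum test cited above, the sketch below implements the two-sided Wilcoxon rank-sum test with the usual normal approximation and tie correction, using only the standard library. The function name `rank_sum_test` and the sample data are hypothetical, and how the authors arranged their samples (per run, per instance, or per class) is not stated in the text, so this is an illustration of the test itself rather than a reproduction of their analysis.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test using the normal approximation."""
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of a
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w <= 0:  # all observations identical: no evidence of a difference
        return 0.0, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


# Hypothetical per-run operating times (hours) for two algorithms.
pilot = [24.0 + 0.05 * i for i in range(21)]
other = [26.0 + 0.05 * i for i in range(21)]
z, p = rank_sum_test(pilot, other)
print(p < 0.01)  # True means significant at the 99% confidence level
```

A negative z here indicates that the first sample tends to have smaller values, which for operating times means better solutions.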
8 Conclusion
This research has tackled the container retrieval problem (CRP) with the aim of minimizing the crane’s operating time, a crucial issue for container terminals. This paper presented several contributions to the CRP and concentrated on its unrestricted version, in which better solutions can be achieved than in the restricted version. First, a heuristic algorithm (TRIAD) was introduced, which quickly builds high-quality CRP solutions. Thereafter, the TRIAD heuristic was incorporated as a constructive heuristic into the ACO algorithm (ACO) and the pilot method algorithm (PILOT) in order to find even better CRP solutions. The computational results show that the TRIAD heuristic achieved the best results in all evaluated instance classes when compared to other heuristics from the studied literature. In addition, ACO and PILOT found better solutions than TRIAD within short runtimes. Therefore, the results emphasize the efficiency of
all proposed algorithms compared to previous research. Furthermore, as shown throughout this paper, all proposed algorithms can be extended to solve the restricted CRP version. For future research, the plan is to extend this work to optimize other operational costs incurred by the crane, such as money and energy. In this setting, logistical constraints can also be considered, e.g., a container that cannot be placed close to another container because of an infection or explosion hazard. Regarding the customization of the proposed algorithms, one promising line is to address other stacking problems (e.g., the Premarshalling Problem) to optimize the crane’s operating time, and another is to solve, in an integrated manner, the closely related optimization problems of the container terminal system. Lastly, a further direction is to improve the performance of the proposed ACO algorithm by evaluating other ACO algorithmic variants. Moreover, other nature-inspired metaheuristic algorithms can be studied for solving the CRP, e.g., the bat-inspired algorithm [21], particle swarm optimization [16], and evolutionary optimization based on biological evolution in plants [10].
References
1. Azari E (2015) Notes on “a mathematical formulation and complexity considerations for the blocks relocation problem”. Scientia Iranica 22(6):2722–2728
2. Bacci T, Mattia S, Ventura P (2019) The bounded beam search algorithm for the block relocation problem. Comput Oper Res 103:252–264. https://doi.org/10.1016/J.COR.2018.11.008
3. Carlo HJ, Vis IF, Roodbergen KJ (2014) Storage yard operations in container terminals: literature overview, trends, and research directions. Eur J Oper Res 235(2):412–430. https://doi.org/10.1016/j.ejor.2013.10.054
4. Caserta M, Schwarze S, Voß S (2012) A mathematical formulation and complexity considerations for the blocks relocation problem. Eur J Oper Res 219(1):96–104. https://doi.org/10.1016/j.ejor.2011.12.039
5. Dey N (ed) (2018) Advancements in applied metaheuristic computing. Advances in data mining and database management. IGI Global. https://doi.org/10.4018/978-1-5225-4151-6
6. Dorigo M, Gambardella LM (1997) Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans Evol Comput 1(1):53–66. https://doi.org/10.1109/4235.585892
7. Feillet D, Parragh SN, Tricoire F (2019) A local-search based heuristic for the unrestricted block relocation problem. Comput Oper Res 108:44–56. https://doi.org/10.1016/J.COR.2019.04.006
8. Forster F, Bortfeldt A (2012) A tree search heuristic for the container retrieval problem. In: Klatte D, Lüthi HJ, Schmedders K (eds) Operations research proceedings 2011 SE-41, operations research proceedings. Springer Berlin Heidelberg, pp 257–262. https://doi.org/10.1007/978-3-642-29210-1_41
9. Galle V, Barnhart C, Jaillet P (2018) A new binary formulation of the restricted container relocation problem based on a binary encoding of configurations. Eur J Oper Res 267(2):467–477. https://doi.org/10.1016/j.ejor.2017.11.053
10. Gupta N, Khosravy M, Patel N, Sethi I (2018) Evolutionary optimization based on biological evolution in plants. Proc Comput Sci 126:146–155. https://doi.org/10.1016/j.procs.2018.07.218
11. Hussein M, Petering MEH (2012) Genetic algorithm-based simulation optimization of stacking algorithms for yard cranes to reduce fuel consumption at seaport container transshipment terminals. In: 2012 IEEE congress on evolutionary computation (CEC). IEEE, pp 1–8. https://doi.org/10.1109/CEC.2012.6256471
12. Inaoka Y, Tanaka S (2017) A branch-and-bound algorithm for the block relocation problem to minimize total crane operation time. In: 19th international conference on harbor maritime and multimodal logistics M&S (HMS 2017), pp 98–104
13. Jin B, Zhu W, Lim A (2015) Solving the container relocation problem by an improved greedy look-ahead heuristic. Eur J Oper Res 240(3):837–847. https://doi.org/10.1016/j.ejor.2014.07.038
14. Jovanovic R, Tuba M, Voß S (2019) An efficient ant colony optimization algorithm for the blocks relocation problem. Eur J Oper Res 274(1):78–90. https://doi.org/10.1016/J.EJOR.2018.09.038
15. Jovanovic R, Voß S (2014) A chain heuristic for the blocks relocation problem. Comput Ind Eng 75:79–86. https://doi.org/10.1016/j.cie.2014.06.010
16. Khosravy M, Gupta N, Patel N, Senjyu T, Duque CA (2019) Particle swarm optimization of morphological filters for electrocardiogram baseline drift estimation. Springer, pp 1–21. https://doi.org/10.1007/978-981-13-9263-4_1
17. Kim Y, Kim T, Lee H (2016) Heuristic algorithm for retrieving containers. Comput Ind Eng. https://doi.org/10.1016/j.cie.2016.08.022
18. Ku D, Arthanari TS (2016) On the abstraction method for the container relocation problem. Comput Oper Res 68:110–122. https://doi.org/10.1016/j.cor.2015.11.006
19. Lee Y, Lee YJ (2010) A heuristic for retrieving containers from a yard. Comput Oper Res 37(6):1139–1147. https://doi.org/10.1016/j.cor.2009.10.005
20. Lin DY, Lee YJ, Lee Y (2015) The container retrieval problem with respect to relocation. Transp Res Part C Emerg Technol 52:132–143. https://doi.org/10.1016/j.trc.2015.01.024
21. Moraes CA, De Oliveira EJ, Khosravy M, Oliveira LW, Honório LM, Pinto MF (2019) A hybrid bat-inspired algorithm for power transmission expansion planning on a practical Brazilian network. Springer, pp 71–95. https://doi.org/10.1007/978-981-13-9263-4_4
22. Murty KG, Liu J, Wan YW, Linn R (2005) A decision support system for operations in a container terminal. Decis Support Syst 39(3):309–332. https://doi.org/10.1016/j.dss.2003.11.002
23. Quispe KEY, Lintzmayer CN, Xavier EC (2018) An exact algorithm for the blocks relocation problem with new lower bounds. Comput Oper Res 99:206–217. https://doi.org/10.1016/J.COR.2018.06.021
24. Sheskin DJ (2007) Handbook of parametric and nonparametric statistical procedures, 4th edn. Chapman & Hall/CRC
25. de Melo da Silva M, Erdogan G, Battarra M, Strusevich V (2018) The block retrieval problem. Eur J Oper Res 265(3):931–950. https://doi.org/10.1016/j.ejor.2017.08.048
26. da Silva Firmino A, de Abreu Silva RM, Times VC (2016) An exact approach for the container retrieval problem to reduce crane’s trajectory. In: 2016 IEEE 19th international conference on intelligent transportation systems (ITSC), pp 933–938. https://doi.org/10.1109/itsc.2016.7795667
27. da Silva Firmino A, de Abreu Silva RM, Times VC (2019) A reactive grasp metaheuristic for the container retrieval problem to reduce crane’s working time. J Heurist 25(2):141–173. https://doi.org/10.1007/s10732-018-9390-0
28. Tanaka S, Takii K (2016) A faster branch-and-bound algorithm for the block relocation problem. Autom Sci Eng 13(1):181–190. https://doi.org/10.1109/TASE.2015.2434417
29. Ting CJ, Wu KC (2017) Optimizing container relocation operations at container yards with beam search. Transp Res Part E Logist Transp Rev 103:17–31. https://doi.org/10.1016/j.tre.2017.04.010
30. Tricoire F, Scagnetti J, Beham A (2018) New insights on the block relocation problem. Comput Oper Res 89:127–139. https://doi.org/10.1016/J.COR.2017.08.010
31. UNCTAD: Review of Maritime Transport (2017). UNITED NATIONS PUBLICATION. https://www.unctad.org/en/PublicationsLibrary/rmt2017_en.pdf
32. Ünlüyurt T, Aydin C (2012) Improved rehandling strategies for the container retrieval process. J Adv Transp 46(4):378–393. https://doi.org/10.1002/atr.1193
33. Voß S, Fink A, Duin C (2005) Looking ahead with the pilot method. Annals Oper Res 136(1):285–302. https://doi.org/10.1007/s10479-005-2060-2
34. World Bank: container port traffic (teu: 20 foot equivalent units) (2018). https://www.data.worldbank.org/indicator/IS.SHP.GOOD.TU?end=2018&start=2000. Accessed 30 Aug 2019
35. World Cargo News: GCS adds RMG at Moscow terminal (2013). www.worldcargonews.com/news/news/gcs-adds-rmg-at-moscow-terminal-32292. Accessed 30 Aug 2019
36. Zehendner E, Caserta M, Feillet D, Schwarze S, Voß S (2015) An improved mathematical formulation for the blocks relocation problem. Eur J Oper Res 245(2):415–422. https://doi.org/10.1016/j.ejor.2015.03.032