Studies in Computational Intelligence 967
Diego Oliva Essam H. Houssein Salvador Hinojosa Editors
Metaheuristics in Machine Learning: Theory and Applications
Studies in Computational Intelligence Volume 967
Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, selforganizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/7092
Editors Diego Oliva Computer Sciences Department CUCEI University of Guadalajara, Guadalajara, Jalisco, Mexico
Essam H. Houssein Department of Computer Science Faculty of Computers and Information Minia University Minia, Egypt
Salvador Hinojosa Computer Sciences Department CUCEI University of Guadalajara, Guadalajara, Jalisco, Mexico
ISSN 1860-949X ISSN 1860-9503 (electronic) Studies in Computational Intelligence ISBN 978-3-030-70541-1 ISBN 978-3-030-70542-8 (eBook) https://doi.org/10.1007/978-3-030-70542-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
In recent years, metaheuristics (MHs) have become important tools for solving hard optimization problems encountered in industry, engineering, biomedicine, and image processing, as well as in theoretical work. Many different metaheuristics exist, and new ones are under constant development. One of the most fundamental principles in our world is the search for an optimal state; choosing the right solution method can therefore be crucial for solving a given optimization problem, whether unconstrained or constrained. A diverse range of MHs for optimization exists. Optimization is an important and decisive activity in science and engineering: engineers can produce better designs when optimization methods save them time and reduce problem complexity. Many engineering optimization problems are inherently complex and difficult to solve with conventional methods such as dynamic programming. In recent years, more attention has been paid to innovative methods inspired by social or natural systems, which have yielded outstanding results on complex optimization problems. Metaheuristic algorithms are stochastic search methods used to find near-optimal solutions; as approximate optimization algorithms, they can better escape local optima and can be applied to a wide range of engineering problems. Recently, metaheuristics (MHs) and machine learning (ML) have become a very important and active topic for solving real-world applications in industry, science, and engineering.
Among the subjects considered are theoretical developments in MHs; performance comparisons of MHs; cooperative methods combining different types of approaches, such as constraint programming and mathematical programming techniques; parallel and distributed MHs for multi-objective optimization; adaptation of discrete MHs to continuous optimization; dynamic optimization; software implementations; and real-life applications. In addition, machine learning (ML) is a data analytics technique that uses computational methods to learn from data. Consequently, MHs have recently been combined with several ML techniques to tackle global and engineering optimization problems as well as real-world applications. The chapters published in the book "Metaheuristics in Machine Learning: Theory and Applications (MAML2020)" describe original works on different topics in both science and engineering,
such as metaheuristics, machine learning, soft computing, neural networks, multi-criteria decision making, energy efficiency, and sustainable development. In short, metaheuristic algorithms and machine learning are advanced and general search strategies. The main contribution of this book is therefore to show the advantages and importance of combining metaheuristics with machine learning in various real-world applications. Guadalajara, Mexico Minia, Egypt Guadalajara, Mexico
Diego Oliva Essam H. Houssein Salvador Hinojosa
Introduction
This book, "MAML2020", collects several metaheuristics (MHs) hybridized with machine learning (ML) methods for various real-world applications. MHs have become essential tools for solving hard optimization problems encountered in industry, engineering, biomedicine, and image processing, as well as in theoretical work. Machine learning, in turn, is a data analytics technique that uses computational methods to learn from data; consequently, MHs have recently been combined with several ML techniques to tackle global and engineering optimization problems as well as real-world applications. This book thus addresses two important computer science strategies: MHs and ML. The idea of combining these techniques is to improve the performance of the original methods in different applications. The book guides the reader through different and exciting implementations, but it also includes the theoretical support needed to understand all the ideas presented in each chapter. Moreover, each chapter that offers applications includes comparisons and updated references that support the results obtained by the proposed approaches, while providing the reader with a practical guide to the reference sources. The book was designed for graduate and postgraduate education, where students can find support to reinforce or consolidate their knowledge, and researchers can polish theirs. Professors can also find support for the teaching process in areas involving machine vision, or examples related to the main techniques addressed. Additionally, professionals who want to explore advances in the concepts and implementation of optimization and machine learning-based algorithms applied to real-world problems will find in this book an excellent guide for that purpose.
This exciting book has 30 chapters organized considering an overview of metaheuristics (MHs) and machine learning (ML) methods applied to solve various real-world applications. In this sense, Chapters 1 and 2 provide the cross-entropy-based thresholding segmentation of magnetic resonance prostatic images and hyperparameter optimization in a convolutional neural network using metaheuristic algorithms, respectively. Chapter 3 presents a diagnosis of collateral effects in climate change through the identification of leaf damage using a novel heuristics and machine learning framework. Chapter 4 explains feature engineering for machine learning and deep learning-assisted wireless communication. Chapter 5 introduces the genetic
operators and their impact on the training of deep neural networks. In Chapter 6, the implementation of metaheuristics with extreme learning machines is described. Chapter 7 presents the architecture optimization of convolutional neural networks by micro-genetic algorithms. In Chapter 8, optimizing connection weights in neural networks using a memetic algorithm incorporating chaos theory is introduced. Further, Chapter 9 provides a review of metaheuristic optimization algorithms for wireless sensor networks. In Chapter 10, a metaheuristic algorithm for white blood cell classification in healthcare informatics is presented. In Chapter 11, a review of multi-level thresholding image segmentation using nature-inspired optimization algorithms is introduced. Chapter 12 explains hybrid Harris hawks optimization with differential evolution for data clustering. Chapter 13 introduces variable mesh optimization for continuous optimization and multimodal problems. Chapter 14 presents traffic control using image processing and deep learning techniques. Chapter 15 covers drug design and discovery: theory, applications, open issues, and challenges. A thresholding algorithm applied to chest X-ray images with pneumonia is presented in Chapter 16. Moreover, Chapter 17 presents a comprehensive review of artificial neural networks for stock market prediction. Chapter 18 introduces image classification with convolutional neural networks. Chapter 19 applies machine learning techniques to find patterns and trends in bicycle-sharing systems influenced by traffic accidents and violent events in Guadalajara, Mexico. In Chapter 20, a review on machine reading comprehension (LSTM) is presented. Chapter 21 introduces a survey of metaheuristic algorithms for solving optimization problems. Chapter 22 integrates metaheuristic algorithms and minimum cross-entropy for image segmentation in mist conditions.
Chapter 23 provides a machine learning application for particle physics: Mexico's involvement in the Hyper-Kamiokande observatory. Chapter 24 provides a novel metaheuristic approach for image contrast enhancement based on grayscale mapping. Chapter 25 presents a survey of geospatial data mining techniques. In Chapter 26, an integration of Internet of Things and cloud computing for cardiac health recognition is discussed. Chapter 27 introduces combinatorial optimization for artificial intelligence-enabled mobile network automation. Chapter 28 presents the performance optimization of a PID controller based on parameter estimation using metaheuristic techniques. Chapter 29 presents solar irradiation change detection for photovoltaic systems through an ANN trained with a metaheuristic algorithm. Finally, in Chapter 30, a genetic algorithm-based global and local feature selection approach for handwritten numeral recognition is presented. It is important to mention that an advantage of this structure is that each chapter can be read separately. This book is an important reference on hybridizing metaheuristics (MHs) with machine learning (ML) methods for various real-world applications. These areas are relevant and in constant evolution. For that reason, it
is hard to collect all the information in a single book. We congratulate the authors for their effort and dedication in assembling the topics addressed in this book. Diego Oliva Essam H. Houssein Salvador Hinojosa
Contents

Cross Entropy Based Thresholding Segmentation of Magnetic Resonance Prostatic Images Using Metaheuristic Algorithms
Omar Zárate and Daniel Záldivar

Hyperparameter Optimization in a Convolutional Neural Network Using Metaheuristic Algorithms
Angel Gaspar, Diego Oliva, Erik Cuevas, Daniel Zaldívar, Marco Pérez, and Gonzalo Pajares

Diagnosis of Collateral Effects in Climate Change Through the Identification of Leaf Damage Using a Novel Heuristics and Machine Learning Framework
Juan Salazar, Eddy Sánchez-De La Cruz, Alberto Ochoa-Zezzatti, Martin Montes, Roberto Contreras-Masse, and José Mejia

Feature Engineering for Machine Learning and Deep Learning Assisted Wireless Communication
Vijay Kumar and Sarat Kumar Patra

Genetic Operators and Their Impact on the Training of Deep Neural Networks
David Eliel Bocanegra Michel and Daniel Zaldivar Navarro

Implementation of Metaheuristics with Extreme Learning Machines
Hector Escobar and Erik Cuevas

Architecture Optimization of Convolutional Neural Networks by Micro Genetic Algorithms
Edgar Saul Marquez Casillas and Valentín Osuna-Enciso

Optimising Connection Weights in Neural Networks Using a Memetic Algorithm Incorporating Chaos Theory
Seyed Jalaleddin Mousavirad, Gerald Schaefer, and Hossein Ebrahimpour-Komleh

A Review of Metaheuristic Optimization Algorithms in Wireless Sensor Networks
Essam H. Houssein, Mohammed R. Saad, Kashif Hussain, Hassan Shaban, and M. Hassaballah

A Metaheuristic Algorithm for Classification of White Blood Cells in Healthcare Informatics
Ana Carolina Borges Monteiro, Yuzo Iano, Reinaldo Padilha França, and Rangel Arthur

Multi-level Thresholding Image Segmentation Based on Nature-Inspired Optimization Algorithms: A Comprehensive Review
Essam H. Houssein, Bahaa El-din Helmy, Diego Oliva, Ahmed A. Elngar, and Hassan Shaban

Hybrid Harris Hawks Optimization with Differential Evolution for Data Clustering
Laith Abualigah, Mohamed Abd Elaziz, Mohammad Shehab, Osama Ahmad Alomari, Mohammad Alshinwan, Hamzeh Alabool, and Deemah A. Al-Arabiat

Variable Mesh Optimization for Continuous Optimization and Multimodal Problems
Jarvin A. Antón-Vargas, Luis A. Quintero-Domínguez, Guillermo Sosa-Gómez, and Omar Rojas

Traffic Control Using Image Processing and Deep Learning Techniques
Rafael Baroni, Sthefanie Premebida, Marcella Martins, Diego Oliva, Erikson Freitas de Morais, and Max Santos

Drug Design and Discovery: Theory, Applications, Open Issues and Challenges
Essam H. Houssein, Mosa E. Hosney, Diego Oliva, Noé Ortega-Sánchez, Waleed M. Mohamed, and M. Hassaballah

Thresholding Algorithm Applied to Chest X-Ray Images with Pneumonia
Jesus Murillo-Olmos, Erick Rodríguez-Esparza, Marco Pérez-Cisneros, Daniel Zaldivar, Erik Cuevas, Gerardo Trejo-Caballero, and Angel A. Juan

Artificial Neural Networks for Stock Market Prediction: A Comprehensive Review
Essam H. Houssein, Mahmoud Dirar, Kashif Hussain, and Waleed M. Mohamed

Image Classification with Convolutional Neural Networks
Alfonso Ramos-Michel, Marco Pérez-Cisneros, Erik Cuevas, and Daniel Zaldivar

Applied Machine Learning Techniques to Find Patterns and Trends in the Use of Bicycle Sharing Systems Influenced by Traffic Accidents and Violent Events in Guadalajara, Mexico
Adrian Barradas, Andrea Gomez-Alfaro, and Rosa-María Cantón-Croda

Machine Reading Comprehension (LSTM) Review (State of Art)
Marcos Pedroza, Alberto Ramírez-Bello, Adrián González Becerra, and Fernando Abraham Fausto Martínez

A Survey of Metaheuristic Algorithms for Solving Optimization Problems
Essam H. Houssein, Mohamed A. Mahdy, Doaa Shebl, and Waleed M. Mohamed

Integrating Metaheuristic Algorithms and Minimum Cross Entropy for Image Segmentation in Mist Conditions
Mario A. Navarro, Diego Oliva, Daniel Zaldívar, and Gonzalo Pajares

Machine Learning Application for Particle Physics: Mexico's Involvement in the Hyper-Kamiokande Observatory
S. Cuen-Rochin, E. de la Fuente, L. Falcon-Morales, R. Gamboa Goni, A. K. Tomatani-Sanchez, F. Orozco-Luna, H. Torres, J. Lozoya, J. A. Baeza, J. L. Flores, B. Navarro-Garcia, B. Veliz, A. Lopez, and B. Gonzalez-Alvarez

A Novel Metaheuristic Approach for Image Contrast Enhancement Based on Gray-Scale Mapping
Alberto Luque-Chang, Itzel Aranguren, Marco Pérez-Cisneros, and Arturo Valdivia

Geospatial Data Mining Techniques Survey
Jorge Antonio Robles Cárdenas and Griselda Pérez Torres

Integration of Internet of Things and Cloud Computing for Cardiac Health Recognition
Essam H. Houssein, Ibrahim E. Ibrahim, M. Hassaballah, and Yaser M. Wazery

Combinatorial Optimization for Artificial Intelligence Enabled Mobile Network Automation
Furqan Ahmed, Muhammad Zeeshan Asghar, and Ali Imran

Performance Optimization of PID Controller Based on Parameters Estimation Using Meta-Heuristic Techniques: A Comparative Study
Mohamed Issa

Solar Irradiation Changes Detection for Photovoltaic Systems Through ANN Trained with a Metaheuristic Algorithm
Efrain Mendez-Flores, Israel Macias-Hidalgo, and Arturo Molina

Genetic Algorithm Based Global and Local Feature Selection Approach for Handwritten Numeral Recognition
Sagnik Pal Chowdhury, Ritwika Majumdar, Sandeep Kumar, Pawan Kumar Singh, and Ram Sarkar
Cross Entropy Based Thresholding Segmentation of Magnetic Resonance Prostatic Images Using Metaheuristic Algorithms Omar Zárate and Daniel Záldivar
1 Introduction

Medical images (MI) are important sources of information that help detect and diagnose illnesses and abnormalities in the human body. Magnetic Resonance Imaging (MRI) systems measure the spatial distribution of several distinct tissue-related parameters, such as relaxation times and proton density [1]. MRI measurements are collections of features (that is, numerical characteristics) from a spatial array that are aggregated into multidimensional data (from a single anatomical slice). Prostate MRIs are primarily used for the medical diagnosis of prostate diseases, the most common of which are the following.

Prostatitis: the prostate can become inflamed secondary to an infectious process, a condition known as prostatitis. It is curable but requires long-term treatment; its major complication is the development of an abscess (a collection of pus) in the prostate.

Benign Prostatic Hyperplasia (BPH): the most frequent benign tumor in men and, in most cases, the cause of the bothersome voiding symptoms (prostatism) that appear after the age of 40, when it develops. It can be treated with medication or through surgical procedures.

Prostate cancer (malignant growth): one of the three most frequent cancers and a cause of death in men. It usually occurs after age 50, and its symptoms are not
O. Zárate (B) · D. Záldivar División de Electrónica y Computación, CUCEI, Universidad de Guadalajara, Av. Revolución 1500, Guadalajara, Jalisco, México e-mail: [email protected]; [email protected] D. Záldivar e-mail: [email protected] O. Zárate Departamento de Tecnologías de la Información, Universidad Tecnológica de Jalisco, Luis J. Jiménez 577, Guadalajara, Jalisco, México © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 D. Oliva et al. (eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_1
different from those of benign prostate growth (BPH). Its timely detection allows it to be cured, either through surgery or some other type of procedure [2]. It is essential to make a correct differential diagnosis in order to indicate the appropriate treatment. Prostate MRI analysis is performed by experts using visual evaluations based on their professional experience and skills. However, this type of inspection is limited and time-consuming. Due to these limitations, computer-assisted techniques have been developed to extract the anatomical structures, with segmentation being the principal methodology [3]. Image segmentation consists of obtaining the underlying structures of an image to facilitate its interpretation, finding borders or groups of pixels that form regions sharing some property such as intensity or texture; however, this approach is computationally expensive [4]. To reduce the computational time required, metaheuristic algorithms (MAs) have been proposed to solve complex engineering problems. MAs are stochastic search algorithms that use rules or heuristics applicable to any problem to accelerate convergence towards near-optimal solutions. The vast majority of MAs have been derived from biological or physical systems in nature, and each algorithm has its own advantages and disadvantages [5]. Within the family of metaheuristic algorithms, the methods of interest for this research are genetic algorithms, which emulate evolution through the passing of genes [6]. The genetic algorithm, the most popular stochastic optimization algorithm, was proposed to alleviate the drawbacks of deterministic algorithms. Taking the evolution of species as inspiration, genetic algorithms attempt to solve problems, particularly optimization problems, opening a wide field of research that remains active today as evolutionary algorithms [7].
Holland and his colleagues proposed one of the first genetic algorithms; this algorithm is based on the adaptation theory of artificial populations of individuals, inspired by a mixture of Darwin's theory of the evolution of species and Mendel's genetics [8]. Optimization is the process of finding the best possible solution to a particular problem, that is, finding the parameter values, among all possible values, that maximize or minimize its output. The vast majority of optimization problems with practical implications in science, engineering, economics, and business are very complex and challenging to solve, and such problems cannot be solved exactly using classical optimization methods. Under these circumstances, metaheuristic methods have established themselves as an alternative solution [9]. The feasibility of performing different image processing tasks oriented towards segmentation, such as image thresholding, has been demonstrated through the use of MAs. Image thresholding separates the pixels of an image by considering certain intensity values (thresholds) that partition the image histogram into a finite number of classes based on pixel values relative to the thresholds [10].
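As an illustration of the thresholding idea just described, the following sketch assigns each pixel to the histogram class delimited by a set of thresholds (a generic toy example, not the chapter's implementation; the image values and thresholds are invented):

```python
from bisect import bisect_right

def segment(image, thresholds):
    """Assign each pixel the index of the intensity class it falls into."""
    return [[bisect_right(thresholds, px) for px in row] for row in image]

# Toy 3x3 grayscale "image" with intensities in [0, 255]
image = [[12, 40, 200],
         [90, 130, 250],
         [5, 180, 60]]

# Two thresholds (nt = 2) partition the histogram into three classes:
# class 0: px < 85, class 1: 85 <= px < 170, class 2: px >= 170
print(segment(image, [85, 170]))  # [[0, 0, 2], [1, 1, 2], [0, 2, 0]]
```

Metaheuristics come into play when the threshold values themselves must be chosen optimally, since the number of candidate threshold combinations grows quickly with the number of thresholds.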
This chapter evaluates three recently published metaheuristic algorithms in multilevel thresholding with a classic thresholding criterion. The MAs applied to the segmentation of prostate magnetic resonance images (MRIs) were the Moth-Flame Optimization algorithm (MFO), published in 2015 and widely cited by the scientific community; the Sine Cosine Algorithm (SCA), published in 2016; and the Sunflower Optimization (SFO) algorithm, published in 2018. All three have been used in several investigations in digital image processing and are frequently cited in this field.
1.1 Applied Metaheuristic Algorithms

These three metaheuristic algorithms were selected for the experiments and applied to prostate magnetic resonance imaging:

• Moth-Flame Optimization algorithm (MFO). This algorithm is inspired by the navigation method of moths in nature, called transverse orientation [7].
• Sine Cosine Algorithm (SCA). This algorithm fluctuates outwards or towards the best solution using a mathematical model based on sine and cosine functions [9].
• Sunflower Optimization algorithm (SFO). This algorithm is inspired by the motion of sunflowers: every day they awaken and follow the sun like the needles of a clock, and at night they travel in the opposite direction to await its rise the next morning [11].

These three metaheuristics were applied to search for the best thresholding results with 2, 3, 4, 5, 8, 16, and 32 thresholds on five prostate magnetic resonance images, all three using minimum cross-entropy thresholding as the objective function. The algorithms are evaluated in terms of quality, and a statistical analysis is presented to compare their results against traditional approaches (Table 1). The results show that the MFO algorithm obtained the best fitness values, as evaluated by the objective function, as well as the best statistical test results. The results were competitive between SCA and MFO, with SCA requiring shorter processing times, although this alone is not the decisive factor. This work is divided into seven sections. This first section explains the three algorithms used: the Moth-Flame Optimization algorithm, the Sine Cosine Algorithm, and the Sunflower Optimization algorithm. Sections 2, 3, 4 and 5 describe
Table 1 Parameters for the MFO, SCA and SFO algorithms

MFO: population: 60; number of experiments: 30; iterations: 1000; lower bound: 1; upper bound: 255; number of thresholds nt: 2, 3, 4, 5, 8, 16, 32; objective function: minimum cross-entropy thresholding.

SCA: number of search agents: 60; number of experiments: 30; iterations: 1000; lower bound: 1; upper bound: 255; number of thresholds nt: 2, 3, 4, 5, 8, 16, 32; objective function: minimum cross-entropy thresholding.

SFO: number of sunflowers: 60; number of experiments: 30; iterations/generations: 1000; lower bound: 1; upper bound: 255; number of thresholds nt: 2, 3, 4, 5, 8, 16, 32; objective function: minimum cross-entropy thresholding; pollination rate (best value) p: 0.05; mortality rate (best value) m: 0.1; survival rate (best value): 1 − (p + m); problem dimension: 2.
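Since all three algorithms share minimum cross-entropy thresholding as the objective function, a sketch of the single-threshold Li-Lee criterion may help fix ideas (the toy histogram and the exhaustive search are for illustration only; the chapter's algorithms instead search multiple thresholds with a metaheuristic):

```python
import math

def cross_entropy(hist, t):
    """Li-Lee minimum cross-entropy criterion for a single threshold t.

    hist[i] is the number of pixels with intensity i; intensity 0 is kept
    unused here so the logarithms stay defined.
    """
    def region(lo, hi):
        counts = sum(hist[i] for i in range(lo, hi))
        if counts == 0:
            return 0.0
        mu = sum(i * hist[i] for i in range(lo, hi)) / counts  # region mean
        return sum(i * hist[i] * math.log(i / mu)
                   for i in range(lo, hi) if hist[i])
    return region(1, t) + region(t, len(hist))

# Toy bimodal histogram over intensities 0..9
hist = [0, 9, 7, 1, 0, 0, 1, 6, 8, 5]

# Exhaustive search over one threshold; MFO, SCA and SFO replace this loop
# in the multilevel case, where exhaustive search becomes too expensive.
best_t = min(range(2, len(hist)), key=lambda t: cross_entropy(hist, t))
print(best_t)  # falls in the valley between the two modes
```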
Image Segmentation Using Minimum Cross Entropy. Sections 6 and 7 analyze the results obtained and discuss the final conclusions.
2 Moth-Flame Optimizer Algorithm

Moths are insects similar to the family of butterflies; there are over 160,000 species of this insect in nature. The most interesting fact about moths is their special navigation method at night: a mechanism called transverse orientation, in which a moth flies by maintaining a fixed angle with respect to the moon, a very effective way of traveling long distances in a straight path. The inspiration for this optimizer is this navigation method, and the MFO algorithm mathematically models the behavior to perform optimization (Fig. 1). The MFO considers a spiral as the main update mechanism of moths. However, any type of spiral can be utilized, subject to the following conditions:

S(M_i, F_j) = D_i · e^{bt} · cos(2πt) + F_j   (1)

where D_i indicates the distance of the i-th moth from the j-th flame, b is a constant defining the shape of the logarithmic spiral, and t is a random number in [−1, 1]. D_i is calculated as follows:

D_i = |F_j − M_i|   (2)
Fig. 1 Moth flame transverse orientation for navigation
Fig. 2 Fly spirally around the lights
where M_i indicates the i-th moth, F_j indicates the j-th flame, and D_i indicates the distance of the i-th moth from the j-th flame (Fig. 2). The moths eventually converge towards the light; Mirjalili modeled this behavior and proposed an optimizer called the Moth-Flame Optimization (MFO) algorithm [7]. The number of flames is decreased adaptively over the course of the iterations:

flame_no = round(N − l · (N − 1) / T)   (3)

where l is the current iteration number, N is the maximum number of flames, and T indicates the maximum number of iterations.
Begin
  Input the parameters of the algorithm and the initial data
  Initialize the positions of the moths M and evaluate their fitness values
  While (the stop criterion is not satisfied, l ≤ T)
    Update flame_no
    OM = FitnessFunction(M)
    If iteration == 1
      F = sort(M)
      OF = sort(OM)
    Else
      F = sort(M_{t−1}, M_t)
      OF = sort(M_{t−1}, M_t)
    End if
    For i = 1 : N
      For j = 1 : D
        Update r and t
        Calculate D with respect to the corresponding moth
        Update the position of the moth using Eq. (1)
      End for j
    End for i
  End While
End
The computational complexity of the MFO algorithm depends on the number of moths and flames, the number of variables, the maximum number of iterations, and the sorting mechanism of the flames in each iteration [7].
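The pseudocode above can be sketched as a simplified, hypothetical Python implementation of Eqs. (1)-(3) (the sphere test function, population size, and bounds are illustrative choices, not the chapter's experimental setup):

```python
import math
import random

def mfo(objective, dim, n_moths=20, max_iter=200, lb=-10.0, ub=10.0, b=1.0):
    """Simplified Moth-Flame Optimization sketch following Eqs. (1)-(3)."""
    random.seed(1)
    moths = [[random.uniform(lb, ub) for _ in range(dim)] for _ in range(n_moths)]
    flames, flame_fit = [], []
    for l in range(1, max_iter + 1):
        # Flames are the best solutions found so far: merge moths and old
        # flames, sort by fitness, keep the n_moths best (elitism).
        pool = sorted(zip([objective(m) for m in moths] + flame_fit,
                          moths + flames))[:n_moths]
        flame_fit = [f for f, _ in pool]
        flames = [list(m) for _, m in pool]
        # Eq. (3): the number of flames decreases over the iterations
        flame_no = round(n_moths - l * (n_moths - 1) / max_iter)
        for i, moth in enumerate(moths):
            j = min(i, flame_no - 1)  # surplus moths fly around the last flame
            for d in range(dim):
                dist = abs(flames[j][d] - moth[d])             # Eq. (2)
                t = random.uniform(-1.0, 1.0)
                moth[d] = dist * math.exp(b * t) * math.cos(2 * math.pi * t) \
                          + flames[j][d]                       # Eq. (1)
                moth[d] = min(max(moth[d], lb), ub)            # keep in bounds
    return flames[0], flame_fit[0]

sphere = lambda x: sum(v * v for v in x)
best, fit = mfo(sphere, dim=2)
print(fit)  # a small value: the best flame approaches the optimum at the origin
```

In the thresholding application, `objective` would be the minimum cross-entropy criterion and each moth would encode a vector of nt thresholds in [1, 255].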
3 Sine Cosine Optimization Algorithm

Population-based optimization techniques create multiple initial random candidate solutions, which are evaluated repeatedly by an objective function and improved by a set of rules that form the kernel of the optimization technique. An optimization algorithm combines the random solutions in the solution set abruptly, with a high rate of randomness, to find the promising regions of the search space. SCA creates multiple initial random candidate solutions and requires them to fluctuate outwards or towards the best solution using a mathematical model based on sine and cosine functions. Several random and adaptive variables are also integrated into this algorithm to emphasize exploration and exploitation [9]. SCA uses the following position update:

X_i^{t+1} = X_i^t + r1 × sin(r2) × |r3 · P_i^t − X_i^t|
X_i^{t+1} = X_i^t + r1 × cos(r2) × |r3 · P_i^t − X_i^t|   (4)
Cross Entropy Based Thresholding Segmentation of Magnetic …
Fig. 3 SCA with the range in [−2, 2]
where X_i^t is the position of the current solution in the ith dimension at the tth iteration, r1/r2/r3 are random numbers, and P_i is the position of the destination point in the ith dimension. These two equations are combined as follows:

X_i^{t+1} = X_i^t + r1 × sin(r2) × |r3 · P_i^t − X_i^t|,  r4 < 0.5
X_i^{t+1} = X_i^t + r1 × cos(r2) × |r3 · P_i^t − X_i^t|,  r4 ≥ 0.5    (5)
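Equation (5) can be sketched per dimension as follows. This is a minimal illustration of the update, assuming the usual linear decay of r1 from a down to 0, not the authors' code:

```python
import numpy as np

def sca_update(X, P, l, T, a=2.0, rng=None):
    """SCA position update of Eq. (5): move solution X around destination P.

    r1 decays linearly from a to 0, shifting the search from exploration
    to exploitation; r4 picks the sine or cosine branch per dimension.
    """
    rng = rng or np.random.default_rng(1)
    r1 = a - l * (a / T)
    X_new = np.empty_like(X)
    for i in range(X.size):
        r2 = 2.0 * np.pi * rng.random()      # random angle
        r3 = 2.0 * rng.random()              # random weight on the destination
        r4 = rng.random()                    # branch selector
        step = abs(r3 * P[i] - X[i])
        trig = np.sin(r2) if r4 < 0.5 else np.cos(r2)
        X_new[i] = X[i] + r1 * trig * step
    return X_new

X, P = np.array([0.5, -1.0, 2.0]), np.array([1.0, 1.0, 1.0])
# At the last iteration r1 = 0, so the solutions stop moving:
print(np.allclose(sca_update(X, P, l=100, T=100), X))  # True
```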
where r4 is a random number in [12] (Fig. 3). begin InitializeSearchAgents[X]; while t= 0 σ ( x) = (1) 0w · x + b < 0, where x is the input vector to the perceptron with weights w and bias b. The relation between weights, inputs and bias terms in relation to the parameter space is shown in Fig. 7. A common classification problem is distinguishing two or more classes using features, as shown in Fig. 7 (top) for a particular binary classification problem using red diamonds and blue triangles, where the number of input nodes would be used as features. In principle, we could pick different values for weights w and biases b to generate different boundaries between the two classes. For the geometric shapes of Fig. 7 these features could be the number of vertices, the number of edges, the color, the perimeter, the area, and so on. If the different classes to classify are not linearly separable, the single-layer perceptron algorithm does not converge, and a multi-layer perceptron is needed for a correct classification to occur. Notice Fig. 7, (middle) and (bottom), where a second layer, called the hidden layer, is used to separate the classes in a two-step procedure. In general, a multi-layer perceptron is a neural network with several hidden layers capable to solve non-linear problems. To do this, more advanced methods such as backpropagation, and additional non-linear activation functions are used. Typical non-linear activation functions used in NN are sigmoid, tanh, ReLU, and Leaky ReLU. Traditional neural networks consist of a stack of layers where each neuron is fully connected to other neurons of the neighboring layers, also called Fully-Connected Multi-Layer Perceptrons. The first layer is refered to as the input layer x, followed by n hidden layers, and lastly an output layer y, as shown in Fig. 8.
Machine Learning in Mexico for the Hyper-K Observatory …
Fig. 7 Top: A single-layer perceptron is used to classify two linearly separable classes. Middle: Adding a new data point, the two resulting classes are non-linearly separable. More neurons can be added to the layer, but the problem still cannot be solved. Bottom: A second layer is added to transform the information of the first layer into a new, linearly separable space. Now, the original problem can be solved. Based on [19]
S. Cuen-Rochin et al.
Fig. 8 Fully-Connected Multi-Layer Perceptron: the input layer x is propagated through the hidden layers a_i = F_i(a_{i-1}, w_i) up to the output a_n = F_n(a_{n-1}, w_n), which is compared with the label y through the loss L(a_n, y). Based on [19]
3.2 Training NN with Gradient-Based Method

As displayed in Fig. 8, an N-layer perceptron has the following members:

• x: input data
• F_i: ith hidden layer
• a_i: ith layer output
• w_i: ith layer weights
• ŷ = a_n: prediction
• y: correct answer
• L: loss (cost) function

The loss function is defined as the cost to minimize the error between the prediction ŷ = a_n of the neural network and the correct answer y. Generally speaking, to train is to optimize a set of parameters by minimizing the loss function. Along this line, gradient descent is an iterative approach to optimize the weights by minimizing the loss function, as follows:

• Define the loss function L and measure the error between the predictions a_n of the model and the real answers or true labels y.
• Compute the loss change with respect to the weights w_i and apply backpropagation: ∂L/∂w_i.
• Update the weights:

w_i^{new} = w_i − λ Δw_i,    (2)
where λ is the learning rate and Δw_i is the weight adjustment obtained through the partial derivatives of the loss function L with respect to the weights w_i, that is, the gradient of the loss function. Moreover, backpropagation is the method to compute the gradient of the loss function with respect to the weights of the ith layer. To better illustrate this procedure, let's start by defining the loss function using the information of the layers:

L(a_n, y) = L(F_n(a_{n-1}, w_n), y) = ··· = L(F_n(. . . F_i(a_{i-1}, w_i) . . .), y)    (3)

where the nth layer is the output layer with the predictions ŷ ≡ a_n, the input data a_0 ≡ x, and the true labels y = [y_k]. Then, by applying the chain rule we can find the following relationship:

∂L/∂w_i = (∂L/∂a_n)(∂a_n/∂a_{n-1})(∂a_{n-1}/∂a_{n-2}) ··· (∂a_{i+1}/∂a_i)(∂a_i/∂w_i).    (4)
Thus, we can use (4) to update the weights as in (2). Information about the characteristics of the neural network is needed in order to calculate each factor of (4). As an example, to obtain ∂L/∂a_n with true labels y = [y_k] and predictions ŷ = a_n = [a_{n,k}], we can use the quadratic loss as our loss function L, that is:

L(a_n, y) = Σ_{k=1}^{m} (a_{n,k} − y_k)²,    (5)

where m is the number of training examples. Then, the change of the loss function with respect to the input vector a_n is

∂L/∂a_n = Σ_{k=1}^{m} (2a_{n,k} − 2y_k).    (6)

That is, for a quadratic loss function, the update of the weights will depend linearly on the difference between the predictions and the real values. On the other hand, the iterative algorithm of backpropagation requires the standard formula of the chain rule:

∂L/∂w_i = (∂L/∂a_i)(∂a_i/∂w_i),    (7)

and because we know by definition that a_i = F_i(a_{i-1}, w_i), we can update the weights as in (2), where:

Δw_i = ∂L/∂w_i = (∂L/∂a_i) · ∂F_i(a_{i-1}, w_i)/∂w_i.    (8)
Now, through the iterative process of backpropagation, the value of the gradient of L at the ith layer, ∂L/∂a_i, can be obtained and used to obtain the expression for the (i − 1)th layer:

∂L/∂a_{i-1} = (∂L/∂a_i)(∂a_i/∂a_{i-1}) = (∂L/∂a_i) · ∂F_i(a_{i-1}, w_i)/∂a_{i-1}.    (9)
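Equations (7)-(9) can be verified numerically on a toy network. The sketch below is our own minimal example, not the chapter's network: a scalar two-layer linear net with quadratic loss, whose backpropagated gradient is checked against a finite difference:

```python
# A scalar two-layer linear net: a1 = w1*x, a2 = w2*a1, quadratic loss L = (a2 - y)^2.
def backprop(x, y, w1, w2):
    a1 = w1 * x                 # forward pass, first layer
    a2 = w2 * a1                # forward pass, output layer
    dL_da2 = 2.0 * (a2 - y)     # Eq. (6) for a single example
    dL_dw2 = dL_da2 * a1        # Eq. (8): dL/da2 * dF2/dw2
    dL_da1 = dL_da2 * w2        # Eq. (9): dL/da2 * dF2/da1
    dL_dw1 = dL_da1 * x         # Eq. (8) at the first layer, cf. Eq. (14)
    return dL_dw1, dL_dw2

# Finite-difference check of the backpropagated dL/dw1.
x, y, w1, w2, eps = 1.5, 2.0, 0.7, -1.2, 1e-6
L = lambda u, v: (v * u * x - y) ** 2
g1, g2 = backprop(x, y, w1, w2)
num_g1 = (L(w1 + eps, w2) - L(w1 - eps, w2)) / (2 * eps)
print(abs(g1 - num_g1) < 1e-5)  # True: analytic and numeric gradients agree
```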
Therefore, we can now compute the gradient of L at the previous layer i − 1, using as input a_{i-1}, ∂L/∂a_i, and the partial derivative of F_i with respect to a_{i-1}. Iteratively, we can update the weights of each hidden layer i with (8). In particular, after n steps, we have the weights of the first layer, w_1, and we know that in this case the input data x is equal to the values a_0 of the input layer, i.e., a_0 = x. Then, because the weights and the input information at each layer and neuron are combined through the inner product, we have for the first layer:

a_1 = F_1(x, w_1) = w_1 · x,    (10)

where for simplicity we denote the input vector as x. Using (9) with i = 1 we obtain:

∂L/∂a_0 = ∂L/∂x = (∂L/∂a_1)(∂a_1/∂x),    (11)

and then with (10):

∂L/∂x = (∂L/∂a_1) w_1.    (12)

From these results, the weights of the first layer can be updated using expression (8) with i = 1:

Δw_1 = ∂L/∂w_1 = (∂L/∂a_1)(∂F_1/∂w_1),    (13)

and then with (10):

Δw_1 = (∂L/∂a_1) x.    (14)

Therefore, for the first update of the weights, we are left with:

w_1^{new} = w_2 = w_1 − λ ∂L/∂w_1 = w_1 − λ x ∂L/∂a_1.    (15)

3.2.1 Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are, and have been for a significant period of time, the de facto method for image classification. CNNs use more than just the perceptron and fully-connected-layer concepts of the standard multilayer perceptron. CNNs also use
Fig. 9 Schematics of a CNN. Taken from [20]. Credit: Chintala
the idea of convolution, or sweeping the images with different kernels, to extract feature maps, and then subsampling the information with, for example, pooling kernels, as shown in Fig. 9 [20], in the same way as the architecture of the classical LeNet-5 model [21]. After some layers of convolution and sub-sampling, the network is flattened into a 1D array. At this point, fully-connected layers are added to create the final output. Deep learning algorithms are differentiated from machine learning by the complexity of the neural networks, where more non-linear kernels and layers are used. With the use of graphics processing units (GPUs) and applications in the field of computer vision, deep learning algorithms have gained great popularity in recent years. At the same time, diverse applications of deep learning models have gained great popularity in neutrino experiments, in particular with water Cherenkov experiments [22]. Using a statistical approach, the SNO (Sudbury Neutrino Observatory) experiment was one of the first to explore the use of neural networks in neutrino experiments, in the mid-90s [23]. The ImageNet project has also helped to develop and promote a wide variety of deep learning architectures, with deep CNNs among the first to gain attention [24]. Thus, the most widely used deep learning algorithms in neutrino experiments have been inspired by the advances made in image and pattern recognition. Some of these models are LeNet, VGG, and GoogleNet/Inception/ResNet, among others [21, 25, 26]. The success of deep learning algorithms in neutrino experiments caught the attention of the research community and fueled the desire to apply these techniques to similar experiments and detectors [27–32].
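The two core operations, sweeping a kernel over the image to build a feature map and then sub-sampling it, can be sketched in a few lines of NumPy. This is a naive illustration of the operations, not the chapter's implementation:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 'valid' 2D convolution (cross-correlation) producing one feature map."""
    H, W = img.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * kernel)
    return out

def max_pool(fmap, s=2):
    """Non-overlapping s x s max pooling (sub-sampling)."""
    H, W = fmap.shape
    return fmap[:H - H % s, :W - W % s].reshape(H // s, s, W // s, s).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, 0.0, -1.0]] * 3)   # simple vertical-edge kernel
fmap = conv2d_valid(img, edge)            # 4 x 4 feature map
pooled = max_pool(fmap)                   # 2 x 2 after 2 x 2 pooling
print(fmap.shape, pooled.shape)           # (4, 4) (2, 2)
```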
4 CNN Application for Particle Identification in the Hyper-Kamiokande Experiment

The results shown in this book chapter were produced on Compute Canada's Cedar cluster, which has a "theoretical peak double precision performance of 6547 teraflops for CPUs, plus 7434 for GPUs, yielding almost 14 petaflops of theoretical peak double precision performance" [33], using Monte Carlo (MC) data from the Machine Learning for Water Cherenkov Detectors workshop at the University of Victoria, organized by Hyper-K Canada in 2019 [19]. By doing so, the Mexican research
Fig. 10 Varying the radial position of the initial interaction of the neutrino while its energy remains invariant. Illustration of the second MC dataset
group has gained expertise to ultimately implement and transfer frameworks (in the near future) to CADS-UdeG and ITESM Campus Guadalajara, and to run simulations via supercomputing using ML. As a first approach, this work presents two different MC datasets, both representing the same water Cherenkov detector with a barrel-shaped geometry and configuration. The first dataset fixes the initial neutrino interaction vertex at the center of the detector while changing the energy of the incoming neutrino beam; the second dataset keeps the neutrino energy invariant while varying the radial position within the tank of the initial neutrino interaction vertex; see Fig. 10. This research presents a CNN model for the classification of electron, muon, and gamma particles in a water Cherenkov detector. The first approach uses the information of dataset 1, while dataset 2 is used in the second approach. The accuracy obtained with the CNN model on dataset 1 was 77.7%, while the CNN model on dataset 2 had an accuracy of 70.2%. Based on these results, our immediate future work will consist of designing and proposing new water Cherenkov detector geometries and photosensor configurations, in conjunction with new CNN architectures, to improve the particle classification accuracies obtained in the present approach. At the same time, an extensive comparison is needed between convolutional neural network algorithms and the traditional likelihood analysis approaches used in Super-Kamiokande.
4.1 From Monte Carlo Simulation to Image-Like Data

Training is performed on two different datasets, both describing an IWCD geometry with a grid of single 3" PMTs. For simplicity, the top and bottom caps of the tank are not included. The simulated data from [19] was produced using the WCSim-Geant4
[34] simulation package. Events are labeled according to the neutrino-type source, either e or μ. In the case of π0 background (i.e., quark anti-quark decays into two γ), we label the event as γ. Each simulated data event has a 2D array of integers mapping active PMT locations into a two-dimensional image, matching the barrel topology of the detector. Each PMT in the image possesses a channel for registering light intensity and another for timing information.

• Dataset-1, varying energy:
  – e−, μ−, γ: 1,000,000 events each.
  – Energy between 20 MeV and 2 GeV.
  – Neutrino interaction vertex in the center of the tank.
• Dataset-2, varying radial position:
  – e−, μ−, γ: 1,000,000 events each.
  – Fixed visible energy of 200 MeV.
  – The neutrino energy is kept invariant while the radial position of the initial neutrino interaction vertex varies within the tank; see Fig. 10.

As suggested above, each event can be plotted (visualized) as a 2D image constructed from the array of activated PMTs (a map of intensities), generated by the incoming Cherenkov light produced by the neutrino-water interaction; see Fig. 11. It is important to realize that such an image represents the PMT hit intensity for the whole period associated with that event. In this case, the image has a fixed size of 88 × 168 pixels.
Fig. 11 Simulated PMT hit intensity for a neutrino event. The hit intensity comes from the Cherenkov light cone generated by the charged particles scattered from the neutrino-water interaction. This image is discrete data, 88 × 168 pixels; the intensity values are also discrete
602
S. Cuen-Rochin et al.
Fig. 12 Simulated PMT timing distribution for a neutrino event. The timing distribution comes from the Cherenkov light cone generated by the neutrino interacting with water. This image is a discrete 1D histogram
To explore the time distribution of the events, a 1D visualization (histogram) of the timing information for one event and all PMTs comes in handy. Figure 12 shows the 1D histogram of an array of 14,784 entries; such a number (88 × 168) was expected given the dimensions of the 2D images. Noticeable in Fig. 12, given the nature of the data, is a cut applied to entries with a 0 ns timestamp, since those entries represent PMTs that were not activated during the event. The peak of the timing distribution is found at around 10 ns, but another feature can be seen at about 30 ns.
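The 0 ns cut can be reproduced on a hypothetical timing array; the toy hit fraction and the 10 ns distribution below are our own illustration, not the workshop data:

```python
import numpy as np

# Hypothetical timing array for one event: 88*168 = 14784 entries, where a
# 0 ns timestamp marks a PMT that was never hit and must be cut before plotting.
rng = np.random.default_rng(42)
times = np.zeros(88 * 168)
hit = rng.random(times.size) < 0.1                 # ~10% of PMTs fire (toy choice)
times[hit] = rng.normal(10.0, 2.0, hit.sum())      # toy main peak near 10 ns

hit_times = times[times > 0]                       # drop the 0 ns (no-hit) entries
counts, edges = np.histogram(hit_times, bins=50)
print(len(times), counts.sum() == len(hit_times))  # 14784 True
```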
4.2 CNN Implementation

As a supervised ML classification problem, labels for three classes are attached to both datasets using the WCSim-Geant4 event generator framework. The input to the CNN is the 2-channel 2D images. The CNN architecture proposed for this research consists of 7 convolutional layers, 4 pooling layers, and 3 fully connected layers; see Fig. 13. The convolutional and subsampling (pooling) layers perform the feature learning, or feature extraction, stage. The input images are propagated forward through the feature maps, applying and learning the weights of 3 × 3 kernels using the backpropagation algorithm. Pooling layers of size 2 × 2 are interlaced between the convolutional layers, downsampling the feature maps to reduce computation and allow the network to go deeper, while promoting translational and rotational invariance. Finally, the fully connected layers of the network perform the classification, using the previously extracted features of the convolutional layers as input and working as a standard multilayer perceptron (MLP). The output of the softmax layer gives the classification into the three classes: electron, muon, and gamma particles. A summary of this layer structure is shown in Fig. 13.
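Given the 88 × 168 input images, the spatial sizes through such a stack can be checked with a small helper. This is an illustration under assumed 'same' padding and stride 1 for the convolutions, which the chapter does not specify:

```python
def conv_out(n, k=3, s=1, pad=1):
    """Spatial size after a convolution: floor((n + 2*pad - k) / s) + 1."""
    return (n + 2 * pad - k) // s + 1

def pool_out(n, k=2, s=2):
    """Spatial size after a pooling layer."""
    return (n - k) // s + 1

# With the assumed padding, a 3x3 convolution preserves the 88 x 168 input,
# while each 2x2 pooling halves both dimensions.
h, w = 88, 168
h, w = conv_out(h), conv_out(w)   # still 88 x 168
h, w = pool_out(h), pool_out(w)   # 44 x 84
print(h, w)  # 44 84
```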
Machine Learning in Mexico for the Hyper-K Observatory …
603
Fig. 13 The architecture of the CNN used with the two datasets. Conv: convolutional layer with kernel n × n, stride S, padding Pad; MaxPool: max pooling layer with kernel n × n, stride S; AvgPool: average pooling; FC: fully connected layer and softmax function
The training of the weights was carried out on a CUDA-GPU system. In the output layer, a cross-entropy loss function was used as the error metric and a softmax activation as the probability function for the predictions. The Adam optimizer [35] was used instead of the standard stochastic gradient descent (SGD) procedure in order to have a dynamic learning rate during the weight-update process, accelerating the convergence of the learning method. All of the above was implemented applying forward and backward propagation steps to obtain our final convolutional neural network model.
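The Adam update itself [35] combines running estimates of the first and second moments of the gradient. A minimal sketch, with our own hypothetical hyper-parameters rather than the chapter's training settings, is:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update [35]: bias-corrected first/second moment estimates
    give each weight its own adaptive step size."""
    m = b1 * m + (1 - b1) * grad        # running mean of the gradient
    v = b2 * v + (1 - b2) * grad ** 2   # running mean of the squared gradient
    m_hat = m / (1 - b1 ** t)           # bias corrections for the warm-up phase
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize L(w) = (w - 3)^2, gradient 2(w - 3), starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 5001):
    w, m, v = adam_step(w, 2.0 * (w - 3.0), m, v, t)
print(abs(w - 3.0) < 0.5)  # True: w settles near the minimizer
```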
4.3 Results and Discussion

It is standard practice to split the dataset into two (or more) so-called folds: the training dataset and the test dataset. The test dataset is used to check whether the network has been over-trained.¹⁶ Both models were trained for three epochs.¹⁷ Model 1 uses Dataset-1: the interaction of each incoming neutrino always starts at the center of the tank, but the energy of the neutrino beam at the moment of the interaction changes. Model 2 uses Dataset-2: the initial radial position of the neutrino interaction varies as depicted in Fig. 10, while the energy of each incoming neutrino remains constant. The learning curves of the Model 1 training process are shown in Fig. 14 (top), that is, the behavior of the accuracy and loss metrics of the training (lines) and test (circles) sets during the learning process. The learning curves of Model 2 are shown in Fig. 14 (bottom). The discrete test results and their continuous test-average curves are plotted at the left and right of Fig. 14, respectively. As expected, the loss function and the accuracy are anti-correlated, as shown in Fig. 14. Overtraining is not observed, as evidenced by the learning curves and by how the test (circles) learning behavior follows that of the training (lines) set. Moreover, once the loss and accuracy have reached a plateau or peak performance, learning should be stopped to avoid overtraining. In contrast, the plot of moving averages shown in Fig. 14 (right) reveals that the network associated with Dataset-1 (varying energy and direction) is still learning, while the network associated with Dataset-2 (varying radial position) seems to have reached a plateau (implying that it is no longer learning after three epochs). It follows that we could train the Dataset-1 network for more epochs to achieve better accuracy. In general, Result-1 has about 7% better classification results than Result-2. Result-1, from Dataset-1 (varying energy), corresponds to Table 1. Result-2, from Dataset-2 (varying radial position), corresponds to Table 2. For both results, muons are fairly well distinguished, while there is some confusion between electrons and

¹⁶ The lack of generality, and hence failure to classify unseen datasets, displayed by a learning algorithm (due to the training prescriptions and configurations), leading to learned patterns too closely or exactly resembling those associated with a particular set of data.
¹⁷ An epoch is one complete iteration of the dataset through the learning network.
Fig. 14 Top: Training Dataset-1 for varying energy. Bottom: Training Dataset-2 for varying radial position. Left: The test results. Right: Test averages

Table 1 Result-1 confusion matrix, varying energy

                        True labels
Predictions     muon     electron   gamma     Row total
muon            33.3%    0.2%       0.0%      33.5%
electron        0.1%     22.0%      10.8%     32.9%
gamma           0.0%     11.2%      22.5%     33.7%
Column total    33.3%    33.3%      33.3%

Accuracy: 77.71%    Error: 22.29%
gamma events. This is due to muon events having very well defined rings. On the other hand, electron and gamma rings are known to be blurry and very similar. This comparison method can be used to further study the parameter space throughout the simulated data, in order to propose novel geometrical configurations and mechanical designs for the future WCD suite and its photosensors for the Hyper-K project.
Table 2 Result-2 confusion matrix, varying radius

                        True labels
Predictions     muon     electron   gamma     Row total
muon            32.7%    0.5%       0.1%      33.3%
electron        0.6%     15.7%      11.5%     27.8%
gamma           0.0%     17.1%      21.8%     38.9%
Column total    33.3%    33.3%      33.3%

Accuracy: 70.23%    Error: 29.77%
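The reported accuracies are simply the traces of these confusion matrices, i.e., the summed percentages of correctly labeled events. A quick check, with values transcribed from Tables 1 and 2 (so they match the reported accuracies only up to the rounding of the displayed cells):

```python
import numpy as np

# Confusion matrices of Tables 1 and 2 (rows: predictions; columns: true labels,
# ordered muon, electron, gamma; values are percentages of all events).
table1 = np.array([[33.3,  0.2,  0.0],
                   [ 0.1, 22.0, 10.8],
                   [ 0.0, 11.2, 22.5]])
table2 = np.array([[32.7,  0.5,  0.1],
                   [ 0.6, 15.7, 11.5],
                   [ 0.0, 17.1, 21.8]])

# The overall accuracy is the trace: the total percentage of correctly
# labeled events.
print(round(float(np.trace(table1)), 1), round(float(np.trace(table2)), 1))  # 77.8 70.2
```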
5 Conclusions and Future Work

We showed that a CNN-based ML particle detector analysis applied to a simplified IWCD simulation dataset (for two different configurations) has the potential for particle identification with high accuracy. The first configuration consists of varying the energy of the neutrino interactions, while the second one considers varying the position of the neutrino interactions within the tank. Future work will involve more complex simulations using supercomputing facilities in Jalisco, Mexico. Future studies will include the mPMT sensor configuration and the updated geometry of the Hyper-K detector suite. By considering in the simulation the geometrical structure of the mPMT detector and its effects on the accuracy of the particle identification ML technique, we can propose new geometrical designs of the mPMT to maximize accuracy in particle identification. A comparison of CNN and other [36] machine learning techniques with the classical likelihood methods for particle identification will be included in the future. Additionally, we aim to build mechanical mPMT prototypes with future optimized geometries, to seize the possibility of implementing an mPMT detector assembly line for the international effort.

Acknowledgements The authors thank the anonymous referees for useful comments that enhanced the work. EdelaF thanks Prof. Takaki Kajita for accepting the invitation to visit Mexico, and SEP-PRODEP UDG-CA-499 for financial and logistic support during the sabbatical year (2021). He also thanks Ruth Padilla, Oscar Blanco, Humberto Pulido, and Gilberto Gómez for all the academic and partial financial support to attend the 10th Hyper-K Proto-collaboration Meeting at the University of Tokyo, as well as Guillermo Torales and José Luis García-Luna (UdeG-CA-499) for useful collaborative work. SCR and EdelaF are very thankful to Masato Shiozawa and Akira Kanoka for the invitation to the 10th Hyper-Kamiokande Proto-Collaboration Meeting.
We also thank Francesca Di Lodovico, Yoshitaka Itow, and the rest of the steering committee for accepting the participation of Mexico in Hyper-K. We thank Thomas Lindner, Matej Pavin, and John Walker for their guidance in mPMT development. We also thank Dean Karlen, Patrick de Perio, Nick Prouse, Kazuhiro Terao, Wojciech Fedorko, and the WatChMaL (Water Cherenkov Machine Learning, a working group developing machine learning for water Cherenkov detectors, https://www.watchmal.org/) group for organizing and generating the simulated data for the Machine Learning Workshop at the University
of Victoria (2019), from which we learned how to apply ML techniques to particle identification. We also thank Carlos Téllez and Alfredo Figarola from ITESM-Campus Guadalajara. We are very grateful for the thoughtful suggestions, comments, and editing of Richard Mischke, which helped to improve our manuscript.
References

1. M.G. Aartsen et al., Multimessenger observations of a flaring blazar coincident with high-energy neutrino IceCube-170922A. Science 361, 147–151 (2018)
2. A.U. Abeysekara et al., Very-high-energy particle acceleration powered by the jets of the microquasar SS 433. Nature 562, 82–85 (2018)
3. A.U. Abeysekara et al., Multiple galactic sources with emission above 56 TeV detected by HAWC. Phys. Rev. Lett. 124, 021102 (2020)
4. A.U. Abeysekara et al., Constraints on Lorentz invariance violation from HAWC observations of gamma rays above 100 TeV. Phys. Rev. Lett. 124, 131101 (2020)
5. T. Kajita et al., Establishing atmospheric neutrino oscillations with Super-Kamiokande. Nucl. Phys. B 908, 14–29 (2016)
6. T. Kajita, Kamiokande and Super-Kamiokande collaborations, Atmospheric neutrino results from Super-Kamiokande and Kamiokande: evidence for νμ oscillations. Nucl. Phys. B Proc. Suppl. 77, 123–132 (1999)
7. M. Fukugita, T. Yanagida, Baryogenesis without grand unification. Phys. Lett. B 174, 45–47 (1986)
8. J. Migenda, the Hyper-Kamiokande collaboration, Supernova model discrimination with Hyper-Kamiokande. Astrophys. J., accepted (2020). arXiv:2101.05269
9. S. Fukuda et al., Super-Kamiokande collaboration, The Super-Kamiokande detector. Nucl. Instrum. Methods Phys. Res. Sect. A 501, 418–462 (2003)
10. Hyper-Kamiokande Proto-Collaboration, Hyper-Kamiokande design report (2018), pp. 1–325. arXiv:1805.04163
11. Hyper-K Collaboration, Proposal for a water Cherenkov test beam experiment for Hyper-Kamiokande and future large-scale water-based detectors. Scientific Committee Paper, report number CERN-SPSC-2020-005, SPSC-P-365 (2020). https://cds.cern.ch/record/2712416
12. S. Cuen-Rochin, Multi-photomultiplier tube module development for the next generation Hyper-Kamiokande neutrino experiment,
in The 20th International Workshop on Next Generation Nucleon Decay and Neutrino Detectors (NNN19), University of Medellin, November 7–9 (2019). https://indico.cern.ch/event/835190/contributions/3613897/
13. K. Abe et al., the T2K collaboration, The T2K experiment. Nucl. Instrum. Methods Phys. Res. Sect. A 659, 106–135 (2011)
14. The Worldwide LHC Computing Grid, CERN computing web site (2020). Retrieved from https://home.cern/science/computing/worldwide-lhc-computing-grid
15. T. Mitchell, Machine Learning, 1st edn. (McGraw Hill Higher Education, 1997)
16. C.M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006)
17. E. Alpaydin, Introduction to Machine Learning, 2nd edn. (The MIT Press, 2014)
18. I. Goodfellow, Y. Bengio, A. Courville, Deep Learning (Adaptive Computation and Machine Learning series), 1st edn. (The MIT Press, 2016)
19. Hyper-K Canada, Machine Learning Workshop, University of Victoria, April 15–17 (2019). https://mlw.hyperk.ca/
20. S. Chintala, Neural network tutorial, in Deep Learning with PyTorch: A 60 Minute Blitz (2020). Retrieved from https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html
21. Y. LeCun et al., Gradient-based learning applied to document recognition, in Proceedings of the IEEE (1998). http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
22. F. Psihas, M. Groh, C. Tunnell, K. Warburton, A review on machine learning for neutrino experiments. Int. J. Modern Phys. (2020). arXiv:2008.01242
23. S. Brice, The results of a neural network statistical event class analysis. Sudbury Neutrino Observatory Technical Report SNO-STR-96-001 (1996)
24. ImageNet. http://www.image-net.org/
25. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition (2015). arXiv:1409.1556
26. C. Szegedy, S. Ioffe, V. Vanhoucke, A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning (2016). arXiv:1602.07261
27. A. Aurisano et al., A convolutional neural network neutrino event classifier (2016). arXiv:1604.01444
28. N. Choma et al., Graph neural networks for IceCube signal classification (2018). arXiv:1809.06166
29. R. Li, Z. You, Y. Zhang, Deep learning for signal and background discrimination in liquid based neutrino experiment. J. Phys. Conf. Ser. 1085, 042037 (2018)
30. C. Fanelli, Machine learning for imaging Cherenkov detectors (2020). https://doi.org/10.1088/1748-0221/15/02/C02012
31. J. Renner et al., Background rejection in NEXT using deep neural networks. J. Instrum. 12, T01004 (2017)
32. F. Psihas et al., Context-enriched identification of particles with a convolutional network for neutrino events. Phys. Rev. D 100, 073005 (2019)
33. Cedar, CC Doc. Retrieved from https://docs.computecanada.ca/wiki/Cedar in 2020
34. T. Dealtry, A. Himmel, J. Hoppenau, J. Lozier, Water Cherenkov Simulator (WCSim). Retrieved from https://github.com/WCSim/WCSim (2020)
35. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, in The 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA (2015)
36. Neutrino Physics and Machine Learning (NPML): Lightning Talks, 17 and 19 Jun (2020). https://indico.slac.stanford.edu/event/377/timetable/
A Novel Metaheuristic Approach for Image Contrast Enhancement Based on Gray-Scale Mapping Alberto Luque-Chang, Itzel Aranguren, Marco Pérez-Cisneros, and Arturo Valdivia
1 Introduction

Image enhancement (IE) has attracted the research community's attention due to its multiple applications in areas such as medicine, security, and transportation, among others [1–5]. The IE process is a crucial stage in almost every image processing system; its principal aim is to improve the interpretability of the information present in an image for human viewers. IE can be defined as the operation that changes an image's photometric characteristics, such as contrast, to obtain a better-looking image that is easier to interpret and can be used in further steps of any automatic image processing system [5, 6]. Overall, IE methods modify pixel values through histogram equalization, quadratic transformation, or fuzzy logic operations [2]. Among these techniques, histogram equalization (HE) [6–8] is the most applied, efficient, and straightforward procedure for IE. HE considers the statistical features of the pixels. In its operation, pixels relatively concentrated in some positions of the histogram are redistributed over its whole scale. During this process, each intensity value present in the original image is mapped to another value in the processed image, regardless of the number of corresponding pixels in the original image. Therefore, these schemes present an inadequate redistribution of the pixel data in the presence of noise or irrelevant

A. Luque-Chang · I. Aranguren (B) · M. Pérez-Cisneros · A. Valdivia
División de Electrónica y Computación, Universidad de Guadalajara, CUCEI, Guadalajara, Jalisco, Mexico
e-mail: [email protected]
A. Luque-Chang e-mail: [email protected]
M. Pérez-Cisneros e-mail: [email protected]
A. Valdivia e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
D. Oliva et al. (eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_24
small sets of pixels. Consequently, these methods produce enhanced images with different problems, such as undesirable artifacts and noise amplification [10]. Lately, the problem of image contrast enhancement (ICE) has been approached through metaheuristic techniques as an alternative to histogram equalization-based schemes, such as in [9–15], where the authors consider edge information and an entropy metric. Different from HE techniques, metaheuristic methods face the contrast enhancement problem from an optimization perspective. Therefore, under these methods, some aspects of the contrast enhancement process are related to an objective function whose optimal solution represents a good-quality enhanced image. In general, contrast enhancement algorithms based on metaheuristic schemes have demonstrated better results than those delivered by HE in terms of pixel intensity redistribution. Diverse metaheuristic algorithms, such as Social Spider Optimization (SSO) [16, 17], Particle Swarm Optimization (PSO) [18, 19], Artificial Bee Colony (ABC) [20], the Cuckoo Search Algorithm (CS) [21], and Differential Evolution (DE) [22], to mention a few, have been proposed to solve the ICE problem. These algorithms propose optimal solutions due to their ability to find robust and convenient solutions to non-linear optimization problems [23]. Although these methods have produced interesting image enhancement results, they present several flaws, such as premature convergence to local optima and an inability to maintain population diversity. The success of these algorithms depends on the balance between exploration and exploitation [24]; a bad balance often leads the algorithms to get stuck in a local optimum. The Moth Swarm Algorithm (MSA), proposed by Al-Attar et al., is a metaheuristic algorithm inspired by moths' orientation towards the moonlight.
Unlike other metaheuristic approaches, the MSA divides its population into three sub-populations that employ different movement operators that give it a right balance between exploration and exploitation [25], making it a feasible option to solve the ICE problem. Due to its exciting characteristics, MSA has been extensively applied in many complex engineering problems such as image processing [26], power systems [25], energy conversion [27], to mention a few. The entropy establishes an index of statistic diversity on the gray intensity levels present in an image. Moreover, the Kullback–Leibler entropy [28, 29], introduced by Kullback and Leibler, allows objectively evaluating the divergence between two different probability distributions. In general terms, the KL-entropy is used to assess how one probability distribution differs from a second probability considered a reference. Its minimal value represents the complete dissimilarity between both distributions, while its maximal value corresponds to its best resemblance. Applications of the KL-entropy include evaluating the randomness in continuous time-series, the characterization of the relative entropy in information systems, and the measurement of contained information when comparing different statistical models of inference. In this paper, an Image Contrast Enhancement (ICE) algorithm for generalpurpose images is introduced. The Moth Swarm Algorithm (MSA) is adopted to redistribute the histogram’s pixel intensities in the complete range so that the value
A Novel Metaheuristic Approach for Image Contrast Enhancement …
611
of the symmetric Kullback-Leibler Entropy (KL-entropy) between a candidate distribution and the original information has been maximized. This paper presents two significant contributions: (I) Using the Moth Swarm Algorithm (MSA) to find the best redistribution that represents the best-enhanced image through a penalization function. (II) The incorporation of the symmetric KL-entropy stablishes a significant metric to measure the statistical change respect the original image and the new candidate image. Though its inclusion, images with a better human visual appearance and quality metrics are produced through this new scheme. The proposed scheme’s performance has been tested considering several representative generalpurpose images commonly found in the image processing literature. Experimental results suggest that the proposed method has a better performance than other schemes in terms of several performance indexes. The rest of the paper is organized as follows: In Sect. 2, the main characteristics of the ICE problem are illustrated; in Sect. 3, the MSA is described; Sect. 4 presents the proposed MSA-Based ICE approach; furthermore, Sect. 5 records the experimental results; finally, in Sect. 6, conclusions are drawn.
2 Image Contrast Enhancement Based on Gray-Scale Mapping

Contrast enhancement is a fundamental step in the display of digital images. Moreover, the design of an effective contrast enhancement technique requires an understanding of human brightness perception. Analyzing contrast enhancement as an optimization problem leads to a vectorial representation of the solutions: each solution is a vector of L integers ∈ [0, 255], where L is the number of gray levels of the original image [20]. This vector represents a possible mapping of the gray levels of the original image. The set of encoded vectors is operated on by a metaheuristic approach so that the solutions (histograms) are iteratively improved in each generation of the optimization process. In general, a contrast enhancement task involves maximizing the objective function: the higher the index, the better the improved result. In the gray-scale mapping method, the output image is obtained by mapping each gray level of the original image to its corresponding gray level in the solution vector (each level is taken from the vector position that corresponds to the original gray level). Fig. 1 illustrates the enhancement process based on gray-level mappings under an optimization approach. In Fig. 1, a solution a is adjusted by the operators of the optimization strategy. As a result of this adjustment, a new solution b is produced. The new solution b presents a better enhancement index according to its evaluation by the objective function.
Fig. 1 Illustration of the image contrast enhancement process based on gray-level mappings under an optimization approach a image’s original gray-levels (I), b image’s mapped gray-levels (I’)
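The gray-level mapping step described above can be sketched in a few lines. The following Python snippet is illustrative only (the chapter's experiments were carried out in MATLAB, and all names here are ours): it applies a candidate solution vector to a toy image whose gray levels index directly into the mapping.

```python
def apply_mapping(image, mapping):
    """Map each original gray level g to mapping[g] (Fig. 1-style remapping)."""
    return [[mapping[g] for g in row] for row in image]

# Toy 3x3 image with gray levels in [0, 7] (L = 8 for brevity).
image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 7]]

# A candidate solution: an ascending vector of L integers in [0, 255]
# that stretches the narrow input range over the full intensity scale.
mapping = [0, 36, 72, 109, 145, 182, 218, 255]

enhanced = apply_mapping(image, mapping)
print(enhanced[0])  # [0, 36, 72]
```

In the full method, each moth encodes one such mapping vector, and the optimizer searches over these vectors rather than over pixel values directly.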
3 Moth Swarm Algorithm

The Moth Swarm Algorithm (MSA) is a recently proposed swarm algorithm introduced by Al-Attar Ali Mohamed to solve global optimization problems [25]. In this method, an optimal solution is represented by the position of a light source, and the fitness of the solution is depicted as the luminescence intensity of that light source. The MSA approach divides the swarm population into three groups: pathfinders, a small group that explores the search area following the principle of First In - Last Out (FILO); prospectors, a group that performs random spiral exploitation around the positions marked by the pathfinders; and onlookers, a group that goes directly to the best global solution found by the prospectors [19]. This section describes the main steps of the MSA approach, with attention to its leading operators.
3.1 Initialization

In the MSA approach, search agents are modeled by a set of p individual moths M = {m_1, m_2, ..., m_p}, where m_i = (m_{i,1}, m_{i,2}, ..., m_{i,n}) represents a candidate solution for a given optimization problem. These search agents are meant to interact with each other while exploring the feasible solution space, guided by a set of operators inspired by the natural behavior of moths [19]. During the initialization step, the MSA starts by generating a random population of moths as follows:

m_{ij} = rand(0, 1) · (m_j^max − m_j^min) + m_j^min,  ∀ i ∈ {1, 2, ..., p}, j ∈ {1, 2, ..., n}   (1)

where m_j^max and m_j^min are the upper and lower limits of the search space, respectively, while n stands for the dimensionality (number of decision variables) of the solution space.
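Eq. (1) amounts to sampling each coordinate uniformly inside its bounds. A minimal Python sketch (not from the chapter; function and variable names are ours):

```python
import random

def init_population(p, n, m_min, m_max, seed=0):
    """Eq. (1): m_ij = rand(0,1) * (m_max_j - m_min_j) + m_min_j."""
    rng = random.Random(seed)
    return [[rng.random() * (m_max[j] - m_min[j]) + m_min[j] for j in range(n)]
            for _ in range(p)]

# p = 5 moths in an n = 3 dimensional search space bounded by [0, 255].
moths = init_population(p=5, n=3, m_min=[0, 0, 0], m_max=[255, 255, 255])
assert all(0 <= m_ij <= 255 for m in moths for m_ij in m)
```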
3.2 Recognition Phase

To avoid premature convergence and improve solution diversity, the pathfinders update their positions using crossover operations and Lévy flights [30]. For the crossover operation, the algorithm starts by selecting the crossover points. The normalized dispersal degree of dimension j is computed as:

μ_j^i = (1 / m̄_j^i) · sqrt( (1/p_n) Σ_{a=1}^{p_n} (m_{aj}^i − m̄_j^i)² )   (2)

where:

m̄_j^i = (1/p_n) Σ_{a=1}^{p_n} m_{aj}^i   (3)

with p_n being the number of pathfinder moths. Every dimension with a low dispersal degree will belong to the set of crossover points cp, as described:

j ∈ cp if μ_j^i ≤ ϑ^i   (4)

where ϑ^i corresponds to a variation coefficient, calculated as follows:

ϑ^i = (1/n) Σ_{j=1}^{n} μ_j^i   (5)

The set of crossover points varies dynamically with the progress of the algorithm.
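Eqs. (2)–(5) select the low-dispersal dimensions as crossover points. A hedged Python sketch of this selection (our naming; it assumes strictly positive coordinate values so the division by the column mean in Eq. (2) is well defined, which holds for the gray-level vectors used here):

```python
import math

def crossover_points(pathfinders):
    """Eqs. (2)-(5): dimensions with a low dispersal degree join the set cp."""
    pn, n = len(pathfinders), len(pathfinders[0])
    mu = []
    for j in range(n):
        col = [m[j] for m in pathfinders]
        mean = sum(col) / pn                               # Eq. (3)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / pn)
        mu.append(std / mean)                              # Eq. (2), a coefficient of variation
    theta = sum(mu) / n                                    # Eq. (5)
    return [j for j in range(n) if mu[j] <= theta]         # Eq. (4)

# Dimension 0 varies little across pathfinders, dimension 1 varies a lot.
pathfinders = [[10.0, 100.0], [11.0, 200.0], [9.0, 50.0]]
print(crossover_points(pathfinders))  # [0]
```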
3.2.1 Lévy Flights

The MSA uses Lévy flights, a class of non-Gaussian random processes, to generate random walks [30]. This strategy generates random steps from the Lévy distribution. The Lévy α-stable distribution is strongly linked with a heavy-tailed probability density function (PDF), anomalous diffusion, and fractal statistics. The PDF of the individual jumps decays as ς(r) ∼ |r|^{−1−α} with the generated variable r. The stability index α ∈ (0, 2] describes how the tail of the distribution decays. A simple Lévy distribution, r ∼ Lévy(ϑ, γ), can be defined as follows:

f(r) = sqrt(γ/(2π)) · e^{−γ/(2(r−ϑ))} / (r − ϑ)^{3/2},  0 < ϑ < r < ∞   (6)
To emulate the behavior of the α-stable distribution, Lévy-flight random samples L_i are generated with Mantegna's algorithm [31], described as follows:

L_i ∼ scale ⊕ Lévy(α) ∼ 0.01 · v / |z|^{1/α}   (7)

where scale is the step size related to the scales of the problem of interest, ⊕ is the entry-wise multiplication, and v = N(0, μ_v²) and z = N(0, μ_z²) are normal stochastic distributions with:

μ_v = [ Γ(1+α) · sin(πα/2) / ( Γ((1+α)/2) · α · 2^{(α−1)/2} ) ]^{1/α},  μ_z = 1   (8)

3.2.2 Lévy Mutation
For the n_c ∈ cp crossover points, the algorithm creates a sub-trial vector x_p = (x_{p1}, x_{p2}, ..., x_{p n_c}) by perturbing the components of the host vector y_p = (y_{p1}, y_{p2}, ..., y_{p n_c}) with related components in donor vectors. The mutation strategy is described as follows:

x_p^i = m_{r1}^i + L_{p1}^i · (m_{r2}^i − m_{r3}^i) + L_{p2}^i · (m_{r4}^i − m_{r5}^i),  ∀ r1 ≠ r2 ≠ r3 ≠ r4 ≠ r5 ≠ p ∈ {1, 2, ..., n_p}   (9)

where L_{p1} and L_{p2} are two independent variables used as mutation scaling factors, generated with the stability index of the Lévy flights using L_i ∼ random(n_c) ⊕ Lévy(α). The set of mutually exclusive indices (r1, r2, r3, r4, r5) is selected from the pathfinders.
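The Mantegna step of Eqs. (7)–(8) and the differential-style mutation of Eq. (9) can be sketched together. This is a simplified, hedged Python rendering (our naming; scalar Lévy factors are used per trial vector rather than per component, for brevity):

```python
import math
import random

rng = random.Random(1)

def levy_step(alpha=1.5):
    """Mantegna's algorithm, Eqs. (7)-(8): step = 0.01 * v / |z|^(1/alpha)."""
    sigma_v = (math.gamma(1 + alpha) * math.sin(math.pi * alpha / 2)
               / (math.gamma((1 + alpha) / 2) * alpha * 2 ** ((alpha - 1) / 2))
               ) ** (1 / alpha)
    v = rng.gauss(0, sigma_v)   # v ~ N(0, sigma_v^2)
    z = rng.gauss(0, 1)         # z ~ N(0, 1), i.e. mu_z = 1
    return 0.01 * v / abs(z) ** (1 / alpha)

def levy_mutation(moths, p, alpha=1.5):
    """Eq. (9): perturb donor moths r1..r5 with Levy-distributed scale factors."""
    idx = [r for r in range(len(moths)) if r != p]
    r1, r2, r3, r4, r5 = rng.sample(idx, 5)
    L1, L2 = levy_step(alpha), levy_step(alpha)
    return [moths[r1][j]
            + L1 * (moths[r2][j] - moths[r3][j])
            + L2 * (moths[r4][j] - moths[r5][j])
            for j in range(len(moths[0]))]
```

A population of at least six moths is required so that five mutually exclusive donor indices exist.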
3.2.3 Adaptive Crossover

To achieve a complete trial solution, each pathfinder updates its position using crossover operations, incorporating the sub-trial vector's mutated variables into the corresponding variables of the host vector. This trial/mixed solution T_{pj}^i is described as follows:

T_{pj}^i = x_{pj}^i if j ∈ cp;  m_{pj}^i if j ∉ cp   (10)
Here the variation coefficient ϑ i is used to control the crossover rate instead of being used as a validation index.
3.2.4 Selection Strategy

Once the adaptive crossover operation is complete, the fitness value of the complete trial solution is calculated and compared with that of its corresponding host solution, and the best solutions are selected for the next generation. In a maximization problem, the survivor moth is selected as follows:

m_p^{i+1} = m_p^i if f(x_p^i) ≤ f(m_p^i);  x_p^i if f(x_p^i) > f(m_p^i)   (11)

Every solution gets a probability value pv_p estimated proportionally to its luminescence intensity li_p, calculated as follows:

pv_p = li_p / Σ_{p=1}^{p_n} li_p   (12)

In a maximization problem, the value li_p is calculated from the objective function fitness value fit_p as follows:

li_p = 1 / (1 + |fit_p|) for fit_p < 0;  1 + fit_p for fit_p ≥ 0   (13)
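Eqs. (12)–(13) map arbitrary (possibly negative) fitness values to positive luminescence intensities and then to selection probabilities. A small Python sketch (illustrative, our naming):

```python
def luminescence(fit):
    """Eq. (13): map a fitness value to a positive luminescence intensity."""
    return 1.0 / (1.0 + abs(fit)) if fit < 0 else 1.0 + fit

def selection_probabilities(fitness_values):
    """Eq. (12): pv_p proportional to luminescence intensity."""
    li = [luminescence(f) for f in fitness_values]
    total = sum(li)
    return [x / total for x in li]

pv = selection_probabilities([-2.0, 0.0, 3.0])
# li = [1/3, 1, 4]; probabilities sum to 1 and the best solution gets 0.75.
assert abs(sum(pv) - 1.0) < 1e-9 and abs(pv[2] - 0.75) < 1e-9
```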
3.3 Transversal Flight

The group of moths with the best luminescence intensities is set as prospectors for the next iteration. Over the course of the iterations of the MSA, the number of prospectors p_f decreases as follows:

p_f = round( (p − p_n) × (1 − i/i_c) )   (14)

where i is the current iteration number and i_c is the maximum number of iterations. Each prospector m_f updates its position according to a spiral flight path, mathematically expressed as follows:

m_f^{i+1} = |m_f^i − m_p^i| · e^θ · cos(2πθ) + m_p^i,  ∀ p ∈ {1, 2, ..., p_n};  f ∈ {p_n + 1, p_n + 2, ..., p_f}   (15)

where θ ∈ [r, 1] is a random number that defines the spiral shape and r = −1 − i/i_c.
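Eqs. (14)–(15) can be sketched as follows. This is a hedged, illustrative Python rendering (our naming; θ is drawn per dimension here for brevity, which is a simplification):

```python
import math
import random

rng = random.Random(0)

def prospector_count(p, pn, i, ic):
    """Eq. (14): the number of prospectors shrinks as iterations advance."""
    return round((p - pn) * (1 - i / ic))

def spiral_update(m_f, m_p, i, ic):
    """Eq. (15): logarithmic-spiral move of prospector m_f around pathfinder m_p."""
    r = -1 - i / ic                      # lower bound of theta shrinks over time
    out = []
    for f_j, p_j in zip(m_f, m_p):
        theta = rng.uniform(r, 1)
        out.append(abs(f_j - p_j) * math.exp(theta) * math.cos(2 * math.pi * theta) + p_j)
    return out

assert prospector_count(p=30, pn=6, i=0, ic=100) == 24
assert prospector_count(p=30, pn=6, i=100, ic=100) == 0
```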
3.4 Celestial Navigation

During the optimization process, the decreasing number of prospectors increases the number of onlookers (on = p − p_n − p_f). This may lead to a fast increment of the convergence rate. The moths with the lowest luminescence intensities in the swarm are considered onlookers. These moths travel directly to the most shining solution. In this phase, the MSA forces the onlookers to search more effectively by zooming on the hot spots of the prospectors. The onlookers are divided into two equal parts.
3.4.1 Gaussian Walks

In this phase, to focus on promising areas of the search space, a stochastic Gaussian distribution is used due to its ability to limit the distributions of random samples. The first set of onlookers, with size on_G = round(on/2), walks using Gaussian distributions r ∼ N(ϑ, μ_G²) with PDF:

f(r) = (1 / (sqrt(2π) · μ_G)) · e^{−(r−ϑ)² / (2 μ_G²)},  −∞ < r < ∞   (16)

This onlooker sub-set moves with serial steps of Gaussian walks described as follows:

m_o^{i+1} = m_o^i + ε_1 + (ε_2 × gbest^i − ε_3 × m_o^i),  ∀ o ∈ {1, 2, ..., on_G}   (17)

ε_1 ∼ random(size(n)) ⊕ N( gbest^i, (log i / i) × (m_o^i − gbest^i) )   (18)

where ε_1 is a random sample drawn from the Gaussian distribution scaled to the size of this set, gbest is the global best solution obtained in the transversal orientation phase, and ε_2 and ε_3 are random numbers ∈ [0, 1].
3.4.2 Associative Learning Mechanism with Immediate Memory (ALIM)

The second onlooker sub-set, with size on_A = on − on_G, drifts towards the global best according to associative learning operators with immediate memory, emulating moths' behavior in nature. This memory is initialized from a continuous uniform distribution on the interval from m_o^i − m_o^min to m_o^max − m_o^i. The updating equation of this subset takes the form:

m_o^{i+1} = m_o^i + 0.001 · G[m_o^i − m_o^min, m_o^max − m_o^i] + (1 − g/G) · r_1 · (pbest^i − m_o^i) + (2g/G) · r_2 · (gbest^i − m_o^i)   (19)

where o ∈ {1, 2, ..., on_A}, 2g/G is the social factor, 1 − g/G is the cognitive factor (g being the current generation and G the maximum number of generations), r_1 and r_2 are random numbers ∈ [0, 1], and pbest is a light source randomly chosen from the new pathfinder group based on the probability value pv_p of its corresponding solution. The flowchart of the MSA is shown in Fig. 2.
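Both onlooker moves can be sketched together. The Python rendering below is illustrative and makes two explicit assumptions: the spread in Eq. (18) is treated as a standard deviation, and the immediate-memory term G[a, b] in Eq. (19) is read as a uniform sample on [a, b] (following the text's remark about a continuous uniform distribution); all names are ours.

```python
import math
import random

rng = random.Random(0)

def gaussian_walk(m_o, gbest, i):
    """Eqs. (16)-(18): Gaussian-walk step of an onlooker around the global best."""
    new = []
    for x, g in zip(m_o, gbest):
        sigma = abs(math.log(i) / i * (x - g)) or 1e-9  # Eq. (18) spread; avoid zero
        eps1 = rng.gauss(g, sigma)
        eps2, eps3 = rng.random(), rng.random()
        new.append(x + eps1 + (eps2 * g - eps3 * x))    # Eq. (17)
    return new

def alim_update(m_o, pbest, gbest, bounds, g, G):
    """Eq. (19): associative-learning drift with immediate memory."""
    lo, hi = bounds
    r1, r2 = rng.random(), rng.random()
    return [x
            + 0.001 * rng.uniform(x - lo, hi - x)       # short-term memory term
            + (1 - g / G) * r1 * (pb - x)               # cognitive factor
            + (2 * g / G) * r2 * (gb - x)               # social factor
            for x, pb, gb in zip(m_o, pbest, gbest)]
```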
4 MSA-Based Image Contrast Enhancement via Gray-Levels Mapping

As illustrated in Sect. 2, ICE based on gray-levels mapping involves processing a grayscale source image by defining an appropriate gray-level distribution mapping that allows a better visualization of said source image. While the process itself is quite intuitive, the main problem is to define a gray-level mapping function that yields a maximum enhancement of details. In this paper, the MSA optimization approach is proposed for solving the ICE problem based on gray-levels mapping: the integer values of each moth encode a different candidate solution (output image), and these moths are evaluated through the objective function.
4.1 Objective Function

This section describes the proposed contrast enhancement method using the MSA, the objective function, and the representation of the solution. Since image contrast enhancement is treated as an optimization problem, an objective function is needed to evaluate the quality of the output image I′. This function combines different measures of the image: the number of edge pixels ep(I′) given by a Sobel filter, the sum of the intensities of the edge pixels ei(I′), the image resolution hp × vp, and the symmetric Kullback-Leibler entropy KLe(I, I′), also known as relative entropy [28, 32, 33].
Fig. 2 MSA flowchart
Fig. 3 shows two examples of the calculation of the symmetric Kullback-Leibler entropy with the respective histograms. A higher entropy value indicates a more significant difference between the distribution of the pixels in the histogram of the original image and that of the output image, which indicates an improvement in the image quality. The objective function can be described as follows:
Fig. 3 Comparison of two calculations on different images: I is the original image, I′1 is a contrast-improved version of I, and I′2 is an image with low contrast enhancement with respect to I
Fig. 4 Steps in the fitness function calculation
Input the images I and I′.
Calculate the sum of the edge information ep(I′).
Calculate the resolution of the image I′.
Calculate the symmetric KL entropy over the histograms of I and I′ (Eq. 21).
Perform the fitness calculation (Eq. 20).
Return the fitness value.
End.
fit = log(log ei(I′)) · (ep(I′) / (hp × vp)) · KLe(I, I′)   (20)

where KLe(I, I′) is the symmetric Kullback-Leibler entropy between the original image and the output image, given by:

KLe(I, I′) = Σ [ h_1 · log(h_1 / h_2) + h_2 · log(h_2 / h_1) ]   (21)

where h_1 and h_2 are the histograms of the original image and the output image, respectively. Fig. 4 shows the steps to calculate the fitness value fit, and Fig. 5 presents the general schema of the ICE approach proposed in this paper.
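Eqs. (20)–(21) can be sketched as follows. This Python snippet is illustrative (our naming): histograms are normalized before comparison, a small epsilon guards against empty bins, and the fitness assumes ei(I′) is large enough for log(log(·)) to be defined.

```python
import math

def sym_kl_entropy(h1, h2, eps=1e-12):
    """Eq. (21): symmetric Kullback-Leibler entropy between two histograms."""
    total1, total2 = sum(h1), sum(h2)
    s = 0.0
    for a, b in zip(h1, h2):
        p, q = a / total1 + eps, b / total2 + eps  # normalize; eps avoids log(0)
        s += p * math.log(p / q) + q * math.log(q / p)
    return s

def fitness(edge_intensity_sum, edge_pixels, hp, vp, kl):
    """Eq. (20): fit = log(log ei(I')) * ep(I') / (hp * vp) * KLe(I, I')."""
    return math.log(math.log(edge_intensity_sum)) * edge_pixels / (hp * vp) * kl

identical = sym_kl_entropy([10, 20, 30], [10, 20, 30])
different = sym_kl_entropy([10, 20, 30], [30, 20, 10])
assert identical < 1e-6 < different  # KL grows with histogram divergence
```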
4.2 Penalty Function

Due to the nature of the problem, it is desirable that the solution vector presents its gray-level distribution in ascending order. On the other hand, using a sort function limits the capabilities of the algorithm and the exploration of the solution space. The proposed solution is to pose the problem as a constrained optimization problem and define a penalty function that punishes the solution vectors whose elements are not in the desired order. The constrained optimization problem can be formulated as follows:

max fit(m) subject to: l_i < l_{i+1},  l ∈ m,  i = 1, 2, ..., L − 1
(22)
Fig. 5 General schema of the proposed methodology for ICE
Start.
Input the image I.
Initialize the optimization algorithm population.
Calculate the output image I′ for every member of the population using the mapping technique.
Calculate the fitness value of every member of the population.
Perform the operators of the optimization algorithm.
Update the value/position of the members of the population.
If the stop condition is not reached, repeat from the mapping step.
Return the fitness value.
End.
The corresponding penalty function can be defined as follows:

Fit(m) = fit(m) / P(m) for P(m) > 0;  fit(m) for P(m) = 0   (23)

where P(m) is the penalty factor that directly affects the fitness function. This factor increases with each violation of the constraints and can be calculated as follows:

P(m) = (w_1 · NVC_m + w_2 · SVC_m) × PC   (24)

where NVC_m is the number of constraints violated by m, and SVC_m is the sum of all constraint violations:

SVC_m = Σ_{i=1}^{L−1} max{0, l_i − l_{i+1}}   (25)
622
A. Luque-Chang et al.
Moreover, w_1 and w_2 are static weights, and PC is a high-value constant that ensures an effective penalization of the fitness value. In this work, w_1 = w_2 to give NVC_m and SVC_m the same importance in the penalization process.
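Eqs. (23)–(25) can be sketched as follows. This Python rendering is illustrative (our naming), and it assumes the violation magnitude in Eq. (25) is l_i − l_{i+1} for each out-of-order pair, which is our reading of the ascending-order constraint:

```python
def penalty(m, w1=1.0, w2=1.0, PC=1e6):
    """Eqs. (24)-(25): penalize mapping vectors whose gray levels are not ascending."""
    violations = [max(0, m[i] - m[i + 1]) for i in range(len(m) - 1)]
    nvc = sum(1 for v in violations if v > 0)  # number of violated constraints
    svc = sum(violations)                      # total violation magnitude
    return (w1 * nvc + w2 * svc) * PC

def penalized_fitness(fit, m):
    """Eq. (23): divide the (maximized) fitness by the penalty when P(m) > 0."""
    P = penalty(m)
    return fit / P if P > 0 else fit

assert penalty([0, 10, 20, 255]) == 0             # ascending: no penalty
assert penalized_fitness(5.0, [0, 30, 20]) < 5.0  # a descending pair is punished
```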
5 Experimental Results

The feasibility and efficacy of the proposed MSA-ICE approach are evaluated in a set of comparative experiments performed on two groups of reference images. The first group includes ten general-purpose images regularly used in image processing applications, such as Blonde, Peppers, Lena, and Cameraman. These images were transformed into low-contrast images to demonstrate the robustness of the method. The second group contains four images extracted from the Tampere Image Database 2013 (TID2013) [34], which are originally low-contrast images; this group verifies the efficiency of the method when applied to images with a real contrast problem. The results of the MSA approach are compared against five well-known metaheuristic algorithms: the Artificial Bee Colony (ABC) algorithm [27], Differential Evolution (DE) [28], the Firefly Algorithm (FA) [29], the Gravitational Search Algorithm (GSA) [30], and Particle Swarm Optimization (PSO) [31]. For each test, the evolution of the fitness function illustrated in Eq. (22) is employed to evaluate the performance of each of the compared ICE methods. For each experiment, the population size has been set to N = 30, while the maximum number of iterations is kmax = 100. This stop criterion has been selected to keep compatibility with other similar works reported in the literature [35]. Parametric settings for each algorithm are presented in Table 1. The parameter configurations chosen for each of the compared methods were obtained through exhaustive experimentation over the

Table 1 Parametric settings for ABC, DE, FA, GSA, PSO, and MSA

Algorithm | Parameters
ABC [26]  | limit = numOfFoodSources × dims; numOfFoodSources = N (population size); dims = n (dimensionality of the solution space)
DE [27]   | Crossover rate CR = 0.5; differential weight F = 0.2
FA [28]   | Randomness factor α = 0.2; light absorption coefficient γ = 1.0
GSA [29]  | Gravitation constant G_o = 100; alpha constant α = 20
PSO [30]  | Learning factors c_1 = 2 and c_2 = 2; the inertia weight factor decreases linearly from 0.9 to 0.2
MSA [19]  | Number of sub-trial vectors nc = 8
Table 2 Image quality metrics

Structural similarity index (SSIM) [36]:
SSIM = (2 μ_Ir μ_Is + C_1)(2 σ_IrIs + C_2) / ((μ_Ir² + μ_Is² + C_1)(σ_Ir² + σ_Is² + C_2))
Remark: measures the similarity of structural information between the original and the processed image.

Relative enhancement contrast (REC) [37]:
REC = 20 · log[ (1/(M×N)) Σ_{i=1}^{M} Σ_{j=1}^{N} (I(i, j))² − ( (1/(M×N)) Σ_{i=1}^{M} Σ_{j=1}^{N} R(i, j) )² ]
Remark: quantifies the contrast difference between the enhanced image and the original image.

Range redistribution (RR) [38]:
RR = (1 / (M×N · (M×N − 1))) Σ_{q=1}^{L−1} Σ_{r=q}^{L−1} P(q) P(r) (r − q)
Remark: determines the way in which pixel intensities are distributed in the improved image.

proposed ICE approach and represent the best possible parameters found in the experiments. Furthermore, the quality of the processed images is evaluated through three image quality metrics, described in Table 2: the Structural Similarity Index (SSIM), the Relative Enhancement Contrast (REC), and the Range Redistribution (RR). All experiments were performed on MatLab® R2016a, running on a computer with an Intel® Core™ i7 3.40 GHz processor and Windows 8 (64-bit, 8 GB of memory) as its operating system.
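The RR metric from Table 2 can be computed directly from an image histogram. The Python sketch below is illustrative (our naming): it assumes P(g) denotes the pixel count at gray level g, and it iterates over 0-based histogram indices, which covers the same level pairs as the 1-based sums in the table.

```python
def range_redistribution(hist):
    """RR from Table 2: RR = 1/(MN(MN-1)) * sum over pairs q <= r of P(q)P(r)(r-q).
    Larger values mean the pixel mass is spread over a wider intensity range."""
    MN = sum(hist)             # total number of pixels M*N
    L = len(hist)              # number of gray levels
    acc = 0.0
    for q in range(L):
        for r in range(q, L):
            acc += hist[q] * hist[r] * (r - q)
    return acc / (MN * (MN - 1))

narrow = [0, 50, 50, 0]    # intensities bunched in the middle of the range
spread = [25, 25, 25, 25]  # intensities spread over the whole range
assert range_redistribution(spread) > range_redistribution(narrow)
```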
5.1 Standard Test Images

This subsection examines the results for the first set of reference images. As mentioned above, this group comprises ten images widely known in the image processing literature, namely Jet, Peppers, Blonde, Pirate, Lena, Cameraman, Sailboat, Couple, Baboon, and House. This group of images is modified from its original version to a low-contrast version, making it possible to evaluate the robustness of the proposed approach. Table 3 reports the fitness results corresponding to 30 individual runs for each compared method. Values highlighted in bold indicate the best results. The comparisons are analyzed by considering the following performance indexes: the average, median, and standard deviation of the fitness values over all 30 individual runs (f_mean, f_median, and f_std, respectively) and the best and worst fitness values found among said set of experimental runs (f_best and f_worst, respectively). Since the objective function maximizes the entropy, higher fitness values indicate better performance in the contrast enhancement process.
Table 3 Comparison of fitness performance results acquired through ABC, DE, FA, GSA, PSO, and MSA methods (f_best, f_worst, f_mean, f_median, and f_std over 30 runs for each of the ten test images: Jet, Peppers, Blonde, Pirate, Lena, Cameraman, Sailboat, Couple, Baboon, and House)
Table 4 Average results for the image quality metrics SSIM, REC, and RR

Algorithm   SSIM     REC      RR
ABC         0.6742   1.0121   35.9374
DE          0.6371   0.9947   27.2145
FA          0.7126   1.0964   36.4124
GSA         0.7422   1.0824   38.4513
PSO         0.6421   0.9134   25.1401
MSA         0.7926   1.1272   39.0121
According to Table 3, for each test image, the proposed MSA-based ICE approach delivers significantly better results than the ABC, DE, FA, GSA, and PSO approaches. The resultant image quality is appraised through the three metrics listed in Table 2: SSIM, REC, and RR. All quality metrics employed in this work establish that the higher the resulting value, the better the quality of the image. Table 4 presents the average results for each quality metric and for each metaheuristic algorithm evaluated over the general-purpose images. The best results are shown in bold. The results in Table 4 indicate that the average performance of the proposed MSA approach was higher in terms of SSIM, REC, and RR compared to the ABC, DE, FA, GSA, and PSO methods. The subjective results are shown in Table 5. This table presents the resultant images and their corresponding enhanced histograms after applying the mapping functions of each metaheuristic approach. For illustrative purposes, only four images from the group (Jet, Peppers, Blonde, and Pirate) were chosen to present the visual results. Images enhanced through the MSA method show better contrast and more detailed objects than those of the other compared approaches.
5.2 Low Contrast Test Images

As mentioned at the beginning of Sect. 5, the second group of reference images includes four real low-contrast images extracted from the Tampere Image Database 2013 (TID2013) [34]. This set aims to appraise the effectiveness of the MSA-ICE methodology when tested on complex, low-contrast images. The low-contrast image group is appraised in terms of image enhancement quality through three quality metrics: the Structural Similarity Index (SSIM), the Relative Enhancement Contrast (REC), and the Range Redistribution (RR). In turn, each image in the group is subjectively analyzed. Table 6 presents the mean SSIM, REC, and RR results of thirty runs for each compared method per image. The three quality metrics (SSIM, REC, and RR) establish that the higher the resultant value, the better the contrast enhancement and quality of the image. Values in bold depict the best results. As shown in Table 6, the proposed MSA method outperforms the ABC, DE, FA, GSA, and PSO approaches in
Table 5 Resultant images and their corresponding enhanced histograms (original and MSA-, ABC-, DE-, FA-, GSA-, and PSO-enhanced versions of Jet, Peppers, Blonde, and Pirate)
terms of SSIM, REC, and RR for all the images evaluated. The results obtained indicate that the MSA-ICE approach is competent at improving authentic low-contrast images and can be applied to solve these sorts of problems. Figures 6, 7, 8 and 9 present the visual results of every image from the low-contrast image group (IG_1, IG_2, IG_3, and IG_4, respectively) for each compared method. Figures 6a, 7a, 8a and 9a present the original low-contrast images.
Table 6 Average results of SSIM, REC, and RR for the low contrast test image group

Image  Metric   ABC       DE        FA        GSA       PSO       MSA
IG_1   SSIM     0.5946    0.5847    0.6121    0.6574    0.4592    0.6921
IG_1   REC      1.2882    1.1308    1.3756    1.6267    1.0387    2.9026
IG_1   RR       25.0014   23.1421   35.1574   37.1421   22.0745   43.1271
IG_2   SSIM     0.6182    0.5074    0.6147    0.7127    0.4714    0.7941
IG_2   REC      1.2101    1.4764    1.4941    1.5232    1.0218    2.7076
IG_2   RR       29.4436   27.2180   33.1247   38.7415   24.0142   41.7131
IG_3   SSIM     0.6241    0.5112    0.6414    0.7011    0.4816    0.8014
IG_3   REC      1.2289    1.1753    1.4345    1.5024    1.0453    3.3714
IG_3   RR       26.1421   24.0142   32.1420   39.5871   21.5062   43.2014
IG_4   SSIM     0.6072    0.5214    0.6871    0.7874    0.4897    0.9121
IG_4   REC      1.2123    1.1467    1.3930    1.6758    1.0276    4.1740
IG_4   RR       27.1024   27.0174   39.4108   41.0187   26.1424   45.1473
Fig. 6 Low-contrast image IG_1 and its corresponding enhancement results with different techniques a original image, b MSA, c ABC, d DE, e FA, f GSA, and g PSO
Figures 6b, 7b, 8b and 9b demonstrate that the MSA approach is competent for application in the enhancement of low-contrast images. In turn, the images enhanced by MSA show fine details, as well as sharper edges and better contrast, in comparison to ABC (Figs. 6c, 7c, 8c and 9c), DE (Figs. 6d, 7d, 8d and 9d), FA (Figs. 6e, 7e, 8e and 9e), GSA (Figs. 6f, 7f, 8f and 9f), and PSO (Figs. 6g, 7g, 8g and 9g). The quantitative and qualitative results obtained for the two groups of images show that MSA-ICE provides a suitable alternative to improve low-contrast images.
Fig. 7 Low-contrast image IG_2 and its corresponding enhancement results with different techniques a original image, b MSA, c ABC, d DE, e FA, f GSA, and g PSO
Fig. 8 Low-contrast image IG_3 and its corresponding enhancement results with different techniques a original image, b MSA, c ABC, d DE, e FA, f GSA, and g PSO
6 Conclusions

In this work, the Moth Swarm Algorithm has been used for image contrast enhancement. In its search process, the algorithm has successfully proved able to provide an optimal distribution for the gray levels of the input image. The approach has been tested on a set of ten different images and compared with other well-known metaheuristic techniques in the literature, such as the Artificial Bee Colony (ABC), Differential Evolution (DE), the Firefly Algorithm (FA), the Gravitational Search Algorithm (GSA), and Particle Swarm Optimization (PSO). The MSA has proved to outperform the other algorithms in terms of the fitness of the solution due to its adequate adaptation to the search space, its diversity, and, finally, its fine balance between exploration and exploitation.
632
A. Luque-Chang et al.
Fig. 9 Low-contrast image IG_4 and its corresponding enhancement results with different techniques a original image, b MSA, c ABC, d DE, e FA, f GSA, and g PSO
For the experimental results, the algorithm was run 30 times for each image, considering the following performance indexes in terms of the fitness of the objective function: average, median, and standard deviation. In turn, three image quality metrics were assessed: the Structural Similarity Index (SSIM), the Relative Enhancement Contrast (REC), and the Range Redistribution (RR). Furthermore, the best and worst fitness values among all the runs were included; the experimental results show that the MSA outperforms all its competitors on those indexes. The present scheme offers several improvements in relation to other works:
• A penalty function is used in the optimization algorithms instead of sorting the integer values of the solutions.
• The symmetric Kullback-Leibler entropy is used as the statistical metric of change.
• The simplicity of this methodology implies a low computational cost for tuning the MSA parameters and phases.
The synergy between the MSA and the proposed ICE methodology yields better performance than other optimization algorithms working in the same scheme, in terms of contrast, information content, edge details, and structural similarity. As future work, it is planned to modify the MSA method to obtain better results and to apply ICE to areas of interest such as medical images, satellite images, and others.
References
1. J. Lewin, Comparison of contrast-enhanced mammography and contrast-enhanced breast MR imaging. Magn. Reson. Imaging Clin. N. Am. 26(2), 259–263 (2018) 2. M. Agarwal, R. Mahajan, Medical images contrast enhancement using quad weighted histogram equalization with adaptive gamma correction and homomorphic filtering. Procedia
Comput. Sci. 115, 509–517 (2017) 3. M. Agarwal, R. Mahajan, Medical image contrast enhancement using range limited weighted histogram equalization. Procedia Comput. Sci. 125, 149–156 (2018) 4. S.S. Sahu, A.K. Singh, S.P. Ghrera, M. Elhoseny, An approach for de-noising and contrast enhancement of retinal fundus image using CLAHE, Opt. Laser Technol. (2018) 5. H.-T. Wu, S. Tang, J. Huang, Y.-Q. Shi, A novel reversible data hiding method with image contrast enhancement. Signal Process. Image Commun. 62, 64–73 (2018) 6. X. Wang, L. Chen, An effective histogram modification scheme for image contrast enhancement. Signal Process. Image Commun. 58, 187–198 (2017) 7. S.-D. Chen, A.R. Ramli, Contrast enhancement using recursive mean-separate histogram equalization for scalable brightness preservation. IEEE Trans. Consum. Electron. 49(4), 1301–1309 (2003) 8. B. Xiao, H. Tang, Y. Jiang, W. Li, G. Wang, Brightness and contrast controllable image enhancement based on histogram specification. Neurocomputing 275, 2798–2809 (2018) 9. M. Kanmani, V. Narsimhan, An image contrast enhancement algorithm for grayscale images using particle swarm optimization. Multimed. Tools Appl. 77(18), 23371–23387 (2018) 10. M. Kanmani, V. Narasimhan, Swarm intelligent based contrast enhancement algorithm with improved visual perception for color images. Multimed. Tools Appl. 77(10), 12701–12724 (2018) 11. L. Maurya, P.K. Mahapatra, G. Saini, Modified cuckoo search-based image enhancement. Adv. Intell. Syst. Comput. 404, 625–634 (2016) 12. M.A. Al-Betar, Z.A.A. Alyasseri, A.T. Khader, A.L. Bolaji, M.A. Awadallah, Gray image enhancement using harmony search. Int. J. Comput. Intell. Syst. 9(5), 932–944 (2016) 13. S. Suresh, S. Lal, C.S. Reddy, M. S. Kiran, A novel adaptive cuckoo search algorithm for contrast enhancement of satellite images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 10(8), 3665–3676 (2017) 14. K.G. Dhal, M.I. Quraishi, S. 
Das, Performance analysis of chaotic Lévy bat algorithm and chaotic cuckoo search algorithm for gray level image enhancement. Adv. Intell. Syst. Comput. 339, 233–244 (2015) 15. K.G. Dhal, S. Das, Local search-based dynamically adapted bat algorithm in image enhancement domain. Int. J. Comput. Sci. Math. 11(1), 1–28 (2020) 16. E. Cuevas, M. Cienfuegos, A new algorithm inspired in the behavior of the social-spider for constrained optimization. Expert Syst. Appl. 41(2), 412–425 (2014) 17. L. Maurya, P.K. Mahapatra, A. Kumar, A social spider optimized image fusion approach for contrast enhancement and brightness preservation. Appl. Soft Comput. J. 52, 575–592 (2017) 18. A. Gorai, A. Ghosh, Gray-level image enhancement by particle swarm optimization, in 2009 World Congress on Nature & Biologically Inspired Computing, no. 1, pp. 72–77 19. M. Braik, A. Sheta, Particle swarm optimisation enhancement approach for improving image quality. Int. J. Innov. Comput. Appl. 1(2), 138–145 (2007) 20. A. Draa, A. Bouaziz, An artificial bee colony algorithm for image contrast enhancement. Swarm Evol. Comput. 16, 69–84 (2014) 21. A.K. Bhandari, V. Soni, A. Kumar, G.K. Singh, Cuckoo search algorithm based satellite image contrast and brightness enhancement using DWT-SVD. ISA Trans. 53(4), 1286–1296 (2014) 22. L. dos S. Coelho, J.G. Sauer, M. Rudek, Differential evolution optimization combined with chaotic sequences for image contrast enhancement. Chaos, Solitons Fractals 42(1), 522–529 (2009) 23. J.-P. Pelteret, B. Walter, P. Steinmann, Application of metaheuristic algorithms to the identification of nonlinear magneto-viscoelastic constitutive parameters. J. Magn. Magn. Mater. 464, 116–131 (2018) 24. E. Cuevas, A. Luque, D. Zaldívar, M. Pérez-Cisneros, Evolutionary calibration of fractional fuzzy controllers. Appl. Intell. (2017) 25. A.-A.A. Mohamed, Y.S. Mohamed, A.A.M. El-Gaafary, A.M. Hemeida, Optimal power flow using moth swarm algorithm. Electr. Power Syst. Res. 142, 190–206 (2017)
26. A.K. Bhandari, K. Rahul, A context sensitive Masi entropy for multilevel image segmentation using moth swarm algorithm. Infrared Phys. Technol. 98, 132–154 (2019) 27. M.A. Ebrahim, M. Becherif, A.Y. Abdelaziz, Dynamic performance enhancement for wind energy conversion system using moth-flame optimization based blade pitch controller. Sustain. Energy Technol. Assessments 27, 206–212 (2018) 28. D.H. Johnson, S. Sinanović, Symmetrizing the Kullback-Leibler distance. Unpublished 1(1), 1–8 (2001) 29. B.C.T. Cabella, M.J. Sturzbecher, D.B. de Araujo, U.P.C. Neves, Generalized relative entropy in functional magnetic resonance imaging. Phys. A Stat. Mech. Appl. 388(1), 41–50 (2009) 30. M. Jamil, H.J. Zepernick, Lévy flights and global optimization. Swarm Intell. Bio-Inspired Comput. 49–72 (2013) 31. R.N. Mantegna, Fast, accurate algorithm for numerical simulation of Lévy stable stochastic processes. Phys. Rev. E 49(5), 4677–4683 (1994) 32. I. Volkau, K.N. Bhanu Prakash, A. Ananthasubramaniam, A. Aziz, W.L. Nowinski, Extraction of the midsagittal plane from morphological neuroimages using the Kullback-Leibler's measure. Med. Image Anal. 10(6), 863–874 (2006) 33. E. Romera, Á. Nagy, Density functional fidelity susceptibility and Kullback-Leibler entropy. Phys. Lett. Sect. A Gen. At. Solid State Phys. 377(43), 3098–3101 (2013) 34. N. Ponomarenko et al., Image database TID2013: peculiarities, results and perspectives. Signal Process. Image Commun. 30, 57–77 (2015) 35. K. Jayanthi, L.R. Sudha, Optimal gray level mapping for satellite image contrast enhancement using grey wolf optimization algorithm, pp. 38–44, 2018 36. Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004) 37. C. Zhao, Z. Wang, H. Li, X. Wu, S. Qiao, J. Sun, A new approach for medical image enhancement based on luminance-level modulation and gradient modulation. Biomed. Signal Process.
Control 48, 189–196 (2019) 38. Z.Y. Chen, B.R. Abidi, D.L. Page, M.A. Abidi, Gray-level grouping (GLG): an automatic method for optimized image contrast enhancement—part I: the basic method. IEEE Trans. Image Process. 15(8), 2290–2302 (2006)
Geospatial Data Mining Techniques Survey Jorge Antonio Robles Cárdenas and Griselda Pérez Torres
1 Introduction
Today, researchers face the problem of analyzing enormous amounts of information in their studies, and geographic information is no exception. Geographic information and spatial data are stored in specialized databases; there is a variety of systems and software from different makers for this task. Geolocation technology has advanced greatly in recent years; it is now possible to acquire devices so accurate and inexpensive that millions of users have access to them and generate billions of geo-referenced records every day. Collecting georeferenced information is very important; these data by themselves are valuable for the scientific community in different research areas. Recently, new studies have led us to perform more in-depth analyses of these sources of information, and there is a need to find new ways to exploit these data collections. A solution to this problem is in development: data mining, a growing field of statistics and computer science focused on discovering patterns in data collections. Data mining techniques combined with geographic information systems give rise to what we call Geospatial Data Mining: the implementation of traditional data mining techniques adapted to geographic information systems to achieve new insights [1]. These techniques take into account the spatial relationships between geographic objects in their analysis. To get into the subject, we will start by reviewing in Sect. 2 the basic concepts related to the storage and management of geographic information and the main concepts of the young field of data science. Then, in Sect. 3, we will define what Geospatial Data Mining is.
J. A. Robles Cárdenas (B) · G. Pérez Torres Universidad de Guadalajara, Guadalajara, Jalisco, Mexico e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 D. Oliva et al. (eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_25
In Sect. 4, we will describe in depth each of the techniques that make up Geospatial Data Mining. Finally, in Sect. 5 we will discuss the two approaches with which Geospatial Data Mining is developing, and close with conclusions in Sect. 6.
2 Review of Concepts
Geospatial Reference Systems
There are two types of geospatial reference systems: absolute and relative. In a relative system, we have a number of terms to explain locations or location references with real-life everyday objects; to reference a place we use directions, addresses, etc. Instead, an absolute system uses a set of numbers, letters or symbols to reference each location on Earth: these are the coordinates. The coordinates are generally chosen so that two of the references represent a horizontal position and a third represents altitude, in a system of spherical or spheroidal angular coordinates whose center is the center of the Earth, usually expressed in degrees [2]. There are different reference systems, valid for any point on the planet. For example, WGS84 is the reference system used by ordinary GPS devices, and PZ90 is the reference system used by the Glonass system [3].
Geospatial Data
Any data associated with a place is geospatial data; we commonly reference data to a place through an address. However, if we need to be more precise and have an absolute reference of the location, we must use a geospatial reference system [3]. Researchers can acquire geospatial data using a variety of technologies. Land surveyors, census takers, aerial photographers, and even average citizens with a smartphone can collect geospatial data using GPS and enter it into a database.
Geographic Information Systems (GIS)
A Geographic Information System is composed of hardware, software, data, methods and people. GIS are capable of capturing, storing, analyzing, and displaying geospatial data, even in real time [1]. Geographic information systems allow us to analyze our environment and better understand how society relates to it.
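As a brief illustration of working with the absolute references discussed above, the following sketch computes the great-circle distance between two WGS84 latitude/longitude pairs using the haversine formula; the city coordinates and the mean Earth radius are illustrative values, not taken from this chapter.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS84 points given in degrees."""
    r = 6371.0  # mean Earth radius in km (illustrative constant)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)  # delta latitude
    dl = math.radians(lon2 - lon1)  # delta longitude
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Guadalajara to Mexico City (approximate city-centre coordinates)
d = haversine_km(20.6597, -103.3496, 19.4326, -99.1332)
```

Such unambiguous computation is exactly what absolute coordinates provide and relative references (addresses, directions) cannot.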
A GIS is a set of tools that integrates and relates various components that allow the organization, storage, manipulation, analysis and modeling of large amounts of data from the real world that are linked to a spatial reference, facilitating the incorporation of social-cultural, economic and environmental aspects that lead to more effective decision making [4].
Knowledge Discovery in Databases (KDD)
Huge amounts of data have been collected and stored in large databases thanks to database technologies and data collection techniques. But we do not always need all the data;
sometimes our application needs to discard the major part and process only the essential data. These data are called knowledge or information. There is a growing need for a new generation of computational tools to help scientists extract useful information from large volumes of digital data. Data mining and knowledge discovery in databases are closely related. Generally, data mining and KDD are treated as synonymous, but in reality data mining is part of the knowledge discovery process [1]. Real-world applications of KDD include personalized marketing, investment selection and fraud detection. KDD is the process of extracting non-trivial information, i.e., information that is implicit in a database. We say it is non-trivial because it is previously unknown and potentially useful. This process creates knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources. The resulting information must be organized in a readable format that facilitates the inference of new knowledge and is interpretable by the machine [5].
Data Mining (DM)
Data Mining is a field of statistics and computer science concerned with the process of discovering patterns in organized or unorganized volumes of data. The overall goal of the DM process is to extract information from a large dataset and transform it into an understandable data structure for later use [6]. In this field of data science, techniques are defined to extract information implicit in databases; this information is potentially useful, but people do not know it in advance. This extraction is done from large volumes of data, often incomplete, raw, disordered and unpredictable. The data analysis techniques used in DM are automated to discover relationships between study elements, data that otherwise could not be detected beforehand [7].
There are different types of data mining techniques, developed mainly by two communities, the statistics community and the database community, each with its own approach. The most frequently named techniques are: classification, association rules, characteristic rules, discriminant rules, clustering and trend detection [8]. Most of the work done with Data Mining has focused on data extraction in relational and transactional databases, but data extraction is also applied to other application databases, such as spatial databases, temporal databases, object-oriented databases, multimedia databases and others. DM has evolved into a multidisciplinary field, including database technology, machine learning, artificial intelligence, neural networks, information retrieval, and so on.
3 Definition of Geospatial Data Mining
The specificity of Geospatial Data Mining lies in the need to analyze the interaction of data with real space. This analysis must be capable of recognizing the spatial relationships between geographic objects.
A geographical database constitutes a spatio-temporal continuum in which properties concern a particular place. The properties of geographic objects are closely related to those of the neighborhood where the objects are located, so they are usually explained in terms of the properties of that neighborhood. Therefore, we can see the great importance of spatial relationships in the analysis process. Temporal aspects of spatial data are also a central point, but rarely taken into account [9]. Geospatial Data Mining refers to data mining methods combined with GIS for carrying out a spatial analysis of geographic data. Some methods are based solely on the graphical aspect of the data, while others utilize a semantic representation of spatial relations such as graphs and neighbor matrices.
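A neighbor matrix of the kind mentioned above can be derived directly from object geometry. The sketch below (the coordinates and the distance threshold are made-up values) builds a binary contiguity matrix in which entry W[i][j] is 1 when objects i and j lie within a chosen distance of each other:

```python
import math

# Hypothetical planar coordinates (x, y) of four geographic objects
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (1.1, 0.1)]
threshold = 2.0  # objects closer than this are considered neighbors

def neighbor_matrix(pts, d_max):
    """Binary contiguity matrix: w[i][j] = 1 if objects i and j are neighbors."""
    n = len(pts)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(pts[i], pts[j]) <= d_max:
                w[i][j] = 1
    return w

W = neighbor_matrix(points, threshold)
```

The same matrix can be built from topological adjacency (shared borders) instead of distance; only the neighbor test changes.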
4 Techniques of Geospatial Data Mining
Traditional data mining methods are not suited to spatial data because they support neither location data nor the implicit relationships between objects. Therefore, it is necessary to develop new methods that include the spatial relations between geographical objects and the handling of spatial data. Geospatial Data Mining techniques are an extension of traditional data mining methods. I describe below five techniques completely adapted to spatial data:
4.1 Spatial Class Identification
This technique is also known as supervised classification. Spatial class identification provides a logical description of the objects and their attributes that allows the best partitioning of the database. Classification rules constitute a decision tree in which each node represents a criterion on an attribute. The difference between the traditional technique and its counterpart for spatial databases is that the criterion can be a spatial predicate and, as spatial objects depend on the neighborhood, a rule involving the non-spatial properties of an object should be extended to neighborhood properties. In Geospatial Data Mining, a classification criterion can also be related to a spatial attribute, in which case it reflects the object's inclusion in a wider zone. The zones can be determined by the algorithm, for example by grouping adjacent objects or even merging them into one. They can also be determined from a predefined spatial hierarchy. The main application of classification is the analysis of remote sensing data; its objective is to identify each pixel of the image with a particular category. Once the homogeneous pixels are categorized, they are grouped together to form a geographic entity. The classification can be seen as an arrangement of objects in which both their properties and the properties of their neighbors are used, not only the direct neighbors but also the neighbors' neighbors and so on. These properties are non-spatial values [9].
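The final grouping step, where categorized pixels are merged into geographic entities, amounts to connected-component labeling. A minimal sketch follows; the 3x3 category grid is an invented example, not data from this survey:

```python
def label_regions(grid):
    """Group 4-connected pixels that share a category into numbered entities."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] == 0:
                count += 1  # start a new geographic entity
                stack = [(r, c)]
                labels[r][c] = count
                while stack:
                    i, j = stack.pop()
                    for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                        if (0 <= ni < rows and 0 <= nj < cols
                                and labels[ni][nj] == 0
                                and grid[ni][nj] == grid[r][c]):
                            labels[ni][nj] = count
                            stack.append((ni, nj))
    return count, labels

# Invented grid of pixel categories after per-pixel classification
grid = [[1, 1, 2],
        [1, 2, 2],
        [3, 3, 2]]
n_entities, regions = label_regions(grid)
```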
4.2 Spatial Outlier Detection
This technique looks for spatial outliers, i.e., spatially referenced objects whose non-spatial properties differ significantly from those of other spatially referenced objects in the same dataset. Informally, a spatial outlier is a local instability (in the values of non-spatial attributes), or a spatially referenced object whose non-spatial properties have extreme values relative to its neighbors, even though those values may not be significantly different from the average of the entire population [10]. For example, a new house in an old neighborhood of a growing metropolitan area is a spatial outlier based on the non-spatial attribute age, because its age is significantly less than the age of its neighbors, but not much different from the average for the growing metropolitan area. There are three categories of outliers: set-based, graph-based, and space-based; the latter are multi-dimensional outliers. A set-based outlier is a data object whose attributes are incompatible with the attribute values of other objects in a given dataset, regardless of spatial relationships. On the other hand, both graph-based and multi-dimensional space-based outliers are spatial outliers, that is, data objects whose attribute values differ significantly from those of the data objects in their spatial neighborhoods. However, they are based on different definitions of spatial neighborhood: in graph-based outlier detection, the neighborhood definition is based on the connectivity of the graph, while in multi-dimensional spatial outlier detection the definition is based on Euclidean distance [11]. The importance of identifying spatial outliers is that it reveals valuable hidden knowledge and has many applications. For example, it can help locate weather phenomena by looking for extreme values in the atmospheric properties of a region, thus revealing tornadoes or hurricanes. In the medical field, it can identify abnormal genes or tumor cells.
It also helps in logistics operations, such as discovering traffic bottlenecks on roads, pointing out military targets in satellite images, determining possible locations of oil deposits, and many more applications [12]. The detection of spatial outliers is a challenge for several reasons. First, the choice of a neighborhood is not trivial. Second, the design of statistical tests for spatial outliers needs to take into account the distribution of attribute values at various locations, as well as the distribution of attribute values in the neighborhoods. In addition, the cost of parameter determination for a neighborhood-based test can be high due to the possibility of join computations.
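A common neighborhood-based test compares each object's attribute with the average of its neighbors and standardizes the resulting differences. A toy sketch of this idea, applied to the new-house example above (the ages, the adjacency lists, and the 1.5-standard-deviation threshold are all illustrative choices, not taken from this survey):

```python
def spatial_outliers(values, neighbors, k=1.5):
    """Flag object i when the difference between its attribute value and the
    mean of its neighbors deviates from the mean of all such differences by
    more than k standard deviations (k is an illustrative threshold)."""
    diffs = [values[i] - sum(values[j] for j in neighbors[i]) / len(neighbors[i])
             for i in range(len(values))]
    mu = sum(diffs) / len(diffs)
    sd = (sum((d - mu) ** 2 for d in diffs) / len(diffs)) ** 0.5
    return [i for i, d in enumerate(diffs) if abs(d - mu) > k * sd]

# House ages along a street; house 3 is a new house in an old neighborhood
ages = [80, 75, 85, 5, 78, 82]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
outliers = spatial_outliers(ages, adjacency)
```

Note that house 3 is flagged even though an age of 5 would not be unusual over the metropolitan area as a whole; the test is local, which is the defining property of a spatial outlier.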
4.3 Spatial Association Rules
In this technique, geographic properties are analyzed to set rules that define relations between one or more spatial objects; these rules are used to find associations between the properties of objects and those of neighboring objects. To confine the number of
discovered rules, the concepts of minimum support and minimum confidence are used. The intuition behind this is that in large databases there may exist a large number of associations between objects, but most of them will be applicable to only a small number of objects, or the confidence of the rules may be low [13]. Spatial objects are related to each other by their geographical properties: topological relations such as intersects, overlap, disjoint, etc.; spatial orientations such as left of, west of, etc.; and distance information, such as close to, far away, etc. The basic idea is that a spatial database can be boiled down to a deductive relational database once the reference objects and task-relevant objects, their spatial properties, and the spatial relationships among them have been extracted according to a predefined semantics. In this approach, spatial attributes are transformed into ground facts of a logical language for relational databases. We are then allowed to specify background knowledge such as spatial hierarchies, spatial constraints and rules for qualitative spatial reasoning [14].
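Once each object has been reduced to a set of spatial predicates, support and confidence for a candidate rule can be computed directly. A small sketch, where the towns and the predicate vocabulary are invented for illustration, tests the rule close_to(river) → is(flood_zone):

```python
# Each town is reduced to the set of spatial predicates that hold for it
towns = [
    {"close_to(river)", "is(flood_zone)"},
    {"close_to(river)", "is(flood_zone)"},
    {"close_to(river)"},
    {"close_to(highway)"},
]

def support(items, data):
    """Fraction of objects for which all predicates in `items` hold."""
    return sum(items <= t for t in data) / len(data)

def confidence(lhs, rhs, data):
    """Of the objects satisfying `lhs`, the fraction also satisfying `rhs`."""
    return support(lhs | rhs, data) / support(lhs, data)

s = support({"close_to(river)", "is(flood_zone)"}, towns)
c = confidence({"close_to(river)"}, {"is(flood_zone)"}, towns)
```

A rule is kept only when s exceeds the minimum support and c exceeds the minimum confidence, which is how the number of discovered rules is confined.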
4.4 Spatial Cluster Analysis
Clustering is one of the most popular techniques and is the subject of research in several fields; besides data mining, the topic is addressed in pattern recognition, machine learning and statistics in general. Clustering is a complex task; in addition, data mining works with very large data sets and with many attributes of different types. Clustering algorithms must handle this complexity under demanding computational requirements. Recently, several algorithms have been successfully applied to real-life data mining problems [15]. Cluster analysis is the task of dividing data into meaningful or useful subgroups. This task is very useful in spatial databases. For example, a useful GIS product is the thematic map, which is created by grouping feature vectors of a dataset into clusters. The collection of clusters is known as a clustering. Partitioning Around Medoids, Clustering Large Applications, and Clustering Large Applications based upon Randomized Search are different types of clustering; the following is a description of each [16].
Partitioning Around Medoids (PAM)
Like the k-means algorithm, PAM divides data sets into groups, but based on medoids, whereas k-means is based on centroids. The advantage of using medoids is that we can reduce the dissimilarity of objects within a cluster. In PAM, the medoids are first calculated, and then each object is assigned to the nearest medoid, which forms a cluster [17].
Clustering Large Applications (CLARA)
CLARA can deal with much larger data sets than PAM, and also finds objects that are centrally located in the clusters. Slowness is the main problem of PAM, because in this type of clustering each element of the dissimilarity matrix must be iterated
one at a time. So, for n objects, the space complexity of PAM becomes O(n²), but CLARA avoids this problem [18].
Clustering Large Applications Based Upon Randomized Search (CLARANS)
This algorithm mixes both PAM and CLARA by searching only a subset of the dataset, and it does not confine itself to any sample at any given time. The main difference between PAM and CLARANS is that the latter only checks a sample of the neighbours of a node in the matrix. However, unlike CLARA, each sample is drawn dynamically, so that the nodes corresponding to particular objects are not completely eliminated. That is, in the CLARA method a sample of nodes is drawn at the beginning of a search, while in CLARANS a sample of neighbours is drawn at each step of the search. This has the benefit of not confining a search to a localized area [19].
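The medoid-swapping idea behind PAM can be sketched in a few lines. This is a simplified version (exhaustive swaps rather than PAM's full BUILD/SWAP phases, and the six 2-D points and initial medoids are invented): a medoid is exchanged for a non-medoid whenever the total dissimilarity, the sum of distances from each point to its nearest medoid, decreases.

```python
import math

def pam(points, medoids):
    """Tiny PAM-style sketch: swap a medoid for a non-medoid while the total
    dissimilarity (sum of distances to the nearest medoid) keeps decreasing."""
    def cost(meds):
        return sum(min(math.dist(p, points[m]) for m in meds) for p in points)
    improved = True
    while improved:
        improved = False
        for mi in range(len(medoids)):
            for h in range(len(points)):
                if h in medoids:
                    continue
                trial = medoids[:mi] + [h] + medoids[mi + 1:]
                if cost(trial) < cost(medoids):
                    medoids, improved = trial, True
    return sorted(medoids)

# Two obvious clusters; start with both medoids in the same cluster
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
meds = pam(pts, [0, 1])
```

CLARA applies this procedure to samples of the data, and CLARANS checks only a random sample of the possible swaps at each step, which is what makes them scale to larger datasets.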
4.5 Trends and Deviations Analysis
This analysis technique is applied to temporal sequences in relational databases, but in spatial databases the objective is to find and characterize spatial trends. Central place theory is the basis of the database approach. The analysis is performed in four steps: first, the centers are discovered by calculating the local maxima of certain attributes; once calculated, the theoretical trend of the same attributes away from the centers is determined; then the deviations from these trends are determined; and finally, the trends are explained by analyzing the properties of the areas involved. An example is the analysis of the trend of the unemployment rate in different towns compared with their distance from a reference metropolis. Another example is the trend analysis of the development of house construction [6]. A geostatistical approach is used for spatial analysis and for the prediction of spatio-temporal phenomena. This technique was first used for geological applications. Geostatistics covers a wide variety of techniques that are used in the analysis of variables distributed in space and/or time to predict unknown values. Since these values are closely related to the environment, the study of this correlation is called structural analysis. The kriging technique is used to predict values at locations outside the sample. Geostatistics is limited to the analysis of sets of points or polygonal subdivisions and deals with a single variable. Under these conditions, the analysis of single attributes is a good tool for the discovery of spatial and space-time trends.
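The second step above, estimating the theoretical trend of an attribute away from a center, can be as simple as fitting a least-squares slope of the attribute against distance. A sketch with invented unemployment figures:

```python
# Hypothetical survey: unemployment rate (%) in towns at increasing
# distances (km) from a reference metropolis
dist = [5, 20, 40, 60, 80]
rate = [3.1, 3.9, 5.2, 6.0, 7.1]

def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

trend = ols_slope(dist, rate)  # positive: the rate rises away from the centre
```

Towns whose rate falls far from the fitted line are the deviations analyzed in the third step.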
5 Geospatial Data Mining Approaches
Geospatial Data Mining techniques are carried out using two different approaches, or a combination of them. Some techniques are derived from statistical methods and others from one field of computer science: machine learning.
• Statistical Approach: In the statistical approach, one may describe autocorrelation tests, smoothing, and smoothed or contrasted factorial analysis as semantic methods. In this approach, we find methods that are based solely on the graphical aspect of the data, as in the exploratory analysis of spatial data [9].
• Machine Learning Approach: These methods utilize a semantic representation of spatial relations such as graphs and neighbor matrices. Apart from clustering, which remains a graphical method, most of the methods derived from the database approach fall into this category [20].
6 Conclusions
Data mining is a field that has been in development for several years; even though it was born in the statistical field, the application of its concepts has made considerable progress in computer science. Most data mining techniques have both a statistical and a database-oriented approach. These approaches do not conflict, but rather complement each other and lead to the development of more reliable and specialized algorithms. In the different articles on data mining methods applied to geospatial databases, I found these two approaches. The authors do not always address both approaches in their studies, as the objectives of the statistics and computer science communities may differ. This survey is oriented toward new readers in this field, to help them better understand the application of these techniques. Rather than breaking down the algorithms or implementations of each technique, I have tried to describe the basic concepts of their operation and their purposes. In this paper, I addressed five Geospatial Data Mining techniques. However, more and more techniques are being adapted to this field, so we have to keep an eye on their development. There are many lines of research that can benefit from the use of these techniques, mainly in the government sector, where I believe they would have the greatest impact. For example, they would help in the administration of public resources for the planning of infrastructure and services, the identification of areas in danger due to social or environmental factors, etc. In the health sector, we have a current example of the usefulness these techniques can have in the recent epidemiological studies being carried out around the world.
References
1. H.J. Miller, J. Han, Geographic data mining and knowledge discovery, 2nd edn. Geogr. Data Min. Knowl. Discov. 1–475 (2009). https://doi.org/10.1201/9781420073980 2. R. Tomlinson, Geographical Information Systems, Spatial Data Analysis and Decision Making in Government (University of London, 1974) 3. P. Folger, Geospatial information and geographic information systems (GIS): an overview for congress. Fed. Geospatial Inf. Manage. Coord. pp. 1–19 (2013) 4. B. Tomaszewski, Geographic Information Systems (GIS) for Disaster Management (Boca Raton, FL, 2014) 5. U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, From data mining to knowledge discovery in databases, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9078, no. 3, pp. 637–648 (2015). https://doi.org/10.1007/978-3-319-18032-8_50 6. M.-S. Chen, J. Han, P.S. Yu, Data mining: an overview from database perspective. IEEE Trans. Knowl. Data Eng. 8(6), 866–883 (1997) 7. H. Sahu, S. Shrma, S. Gondhalakar, A brief overview on data mining survey. IJCTEE 1(3), 114–121 (2008) 8. C. Wan, Data mining tasks and paradigms, in Hierarchical Feature Selection for Knowledge Discovery: Application of Data Mining to the Biology of Ageing (Springer International Publishing, Cham, 2019), pp. 7–15 9. K. Zeitouni, A survey of spatial data mining methods databases and statistics point of views. Data Warehous. Web Eng. (2011). https://doi.org/10.4018/9781931777025.ch013 10. S. Shekhar, M.R. Evans, J.M. Kang, P. Mohan, Identifying patterns in spatial information: a survey of methods. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 1(3), 193–214 (2011). https://doi.org/10.1002/widm.25 11. S. Shekhar, C.-T. Lu, P. Zhang, Detecting graph-based spatial outliers: algorithms and applications (a summary of results), in ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2001) 12. Y. Kou, C.-T.
Lu, Outlier Detection, Spatial, in Encyclopedia of GIS, ed. by S. Shekhar, H. Xiong, X. Zhou (Springer International Publishing, Cham, 2017), pp. 1539–1546 13. K. Koperski, J. Adhikary, J. Han, Spatial data mining: progress and challenges survey paper. Phys. Earth Planet. Inter. 5(C), 7 (1972). https://doi.org/10.1016/0031-9201(72)90070-2 14. D. Malerba, F.A. Lisi, An ILP method for spatial association rule mining. Multi-Relational Data Mining (2001) 15. P. Berkhin, A survey of clustering data mining techniques, in Grouping Multidimensional Data: Recent Advances in Clustering, eds. by J. Kogan, C. Nicholas, M. Teboulle (Springer, Berlin, Heidelberg, 2006), pp. 25–71 16. S. Kumar, S. Ramulu, S. Reddy, S. Kotha, Spatial data mining using cluster analysis. Int. J. Comput. Sci. Inf. Technol. (2012) 17. M. Van der Laan, K. Pollard, J. Bryan, A new partitioning around medoids algorithm. J. Stat. Comput. Simul. 73(8), 575–584 (2003). https://doi.org/10.1080/0094965031000136012 18. G. Sheikholeslami, S. Chatterjee, A. Zhang, WaveCluster: a multi-resolution clustering approach for very large spatial databases, in Proceedings of the 24th International Conference on Very Large Data Bases (1998) 19. R.T. Ng, J. Han, CLARANS: a method for clustering objects for spatial data mining. IEEE Trans. Knowl. Data Eng. 14(5), 1003–1016 (2002). https://doi.org/10.1109/TKDE.2002.1033770 20. M. Perumal, B. Velumani, A. Sadhasivam, K. Ramaswamy, Spatial data mining approaches for GIS—a brief review, in Emerging ICT for Bridging the Future—Proceedings of the 49th Annual Convention of the Computer Society of India CSI, vol. 2, pp. 579–592, 2015
Integration of Internet of Things and Cloud Computing for Cardiac Health Recognition Essam H. Houssein, Ibrahim E. Ibrahim, M. Hassaballah, and Yaser M. Wazery
1 Introduction
The Internet of Things (IoT) is often referred to as the inter-communication between many smart machines, such as sensing devices, cell phones, laptops, PDAs, and so on, without the intervention of a human [1]. There has been a noticeable improvement in network technologies in recent years to facilitate machine communication. Modern technologies such as 4G/5G have been used to improve the daily lives of communities and individuals around the world [2]. IoT has also been used in smart houses, smart infrastructure, smart grids, and smart cities to boost efficiency in these different fields. The Health Internet of Things (H-IoT) is one such area; it is a famous landmark and a milestone in the advancement of information systems, and it plays a vital role in improving people's health and quality of life. Wearable sensors, wearable networking systems, Wireless Sensor Networks (WSNs) [3, 4], Wireless Body Area Networks (WBANs), and human bond networking are now an emerging research field worldwide [5] supporting H-IoT infrastructures. These portable devices and technologies can monitor different human physiological measurements, such as blood pressure (BP) [6], heart rate (HR) [7], and respiration rate (RR) [8], with a single click at home or at work and by healthcare providers. Although IoT is still at an early stage of development, companies and industries have rapidly adopted its power in their existing infrastructures and have seen gains in efficiency as well as customer experience [9].
E. H. Houssein (B) · Y. M. Wazery Faculty of Computers and Information, Minia University, Minia, Egypt e-mail: [email protected]
I. E. Ibrahim Faculty of Computers and Information, Luxor University, Luxor, Egypt
M. Hassaballah Faculty of Computers and Information, South Valley University, Qena, Egypt
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 D. Oliva et al.
(eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_26
Fig. 1 An overview of the integration of cloud computing with IoT
Nevertheless, the deployment of IoT technologies in healthcare poses a range of issues, involving data storage, data processing, data sharing, centralized access, universal access, privacy, and protection. A promising approach to solving these issues is cloud computing technology. Figure 1 demonstrates a traditional healthcare setting that combines both cloud computing and IoT to provide omnipresent and transparent access to medical resources and community infrastructure, to deliver on-demand services across the cloud, and to perform functions that satisfy the requirements of H-IoT systems [10]. Cloud computing provides users with computational resources, such as software and hardware, as a service over a network, including servers, networking, databases, and data analytics, with flexible and scalable resources [11]. The key concept of cloud computing is to share its tremendous storage, computation, and knowledge capacity among scientific applications. Through cloud computing, customer activities are coordinated and executed with the necessary resources to deliver services efficiently. A related development is the recent transition from a centralized approach to a decentralized one, fog computing [12]. Fog computing is a technology that brings computation, storage, and networking closer to target consumers and end devices for improved service delivery [13]. One of the major elements of fog computing's success is how to incentivize the edge resources of specific users and deliver them to end-users so that fog computing is financially profitable to all economic players involved [14]. Monitoring of health criteria over significant periods may enable a shift toward customized and predictive medicine, whereby data is gathered using wearable technology outside clinical settings. In the diagnosis and treatment of many diseases, telemonitoring of physiological signals such as the electrocardiogram (ECG), photoplethysmogram (PPG), electroencephalogram (EEG), electromyogram (EMG) [15], and blood pressure level is nowadays becoming quite relevant [16, 17]. The rapid development of IoT and cloud computing paved the way for a technical transition in the healthcare sector. In many countries, one of the leading causes of death is cardiovascular disease [18]. This results in the transmission and storage of large quantities of ECG data. Managing the transmission of this bulk data poses a big challenge for IoT healthcare under tight energy usage constraints.
Table 1 Summary of healthcare IoT and cloud computing studies

| Reference | Technology | Contribution description |
|---|---|---|
| Mutlag et al. [21] | IoT, Cloud | Identify three basic reasons for controlling resources effectively in cloud-based healthcare schemes; show weaknesses of recent approaches, structures, and frameworks |
| Kumari et al. [22] | Cloud | Discuss numerous opportunities and concerns about fog-based applications in medical systems; propose a three-layer healthcare architecture for real-time applications |
| García et al. [23] | Cloud | Introduce a fog-based architecture to speed up mobile patient response; the proposed prototype reduced the response time by a factor of four |
| Farahani et al. [24] | IoT | Bring IoT into e-health; present IoT issues and guidance for the future in medical systems |
| Ahmadi et al. [25] | IoT, Cloud | Address various significant issues and concerns in IoT-based e-healthcare |
| Baker et al. [26] | IoT | Focus on various sensor designs, connection methods, and placement in context for various IoT uses in healthcare systems |
| Kraemer et al. [27] | Cloud | Classify the technologies used for fog-based applications and evaluate at which network level fog can perform complex computations |
| Islam et al. [20] | IoT, Cloud | Show how IoT is applied in the medical sector; analyze IoT security and privacy problems in healthcare systems |
| Djelouat et al. [14] | IoT, Cloud | Tackle ECG sensor limitations such as battery life and connectivity |
| Qian et al. [28] | IoT, Cloud | Propose CULT, an ECG compression technique using unsupervised dictionary learning; show several compression processing issues for IoT in healthcare |
Addressing this challenge therefore depends in large part on the development of efficient algorithms that overcome the limitations of wearable and portable embedded sensors. Additionally, these sensors are typically meant to be an element of an IoT-based ECG monitoring system. In such a system, wearable sensors detect the ECG signal, which is forwarded over a short-range link (e.g., Bluetooth) to a wearable device where it is collected and analyzed, and then forwarded to a separate computer for further processing and storage via a wide-area link (e.g., Wi-Fi) [19]. Table 1 shows the valuable contributions of several review papers that considered various elements of healthcare IoT and cloud computing. In [20], a comprehensive survey of IoT in healthcare was conducted to discuss different elements of healthcare solutions, including structures, applications, and services; several issues requiring further research were also explored, including security and standardization. The rest of the chapter is organized as follows: Sect. 2 presents an exploration of ECG analysis. The IoT framework for an ECG monitoring system is provided in Sect. 3. Major issues for ECG monitoring systems are reported in Sect. 4. The conclusion of this chapter is presented in Sect. 5.
2 Exploration of ECG Analysis
An ECG sensor tracks a heart's electrical activity at rest, providing HR and rhythm details. An ECG examination is expected when an individual is in a high-risk cardiovascular category, for example with high BP and symptoms such as headaches or shortness of breath. Integrating IoT infrastructure and technologies into the ECG tracking system addresses a significant need: alerting people to elevated heartbeats, a crucial indicator for the advanced detection of cardiac diseases. Thus, various studies [14, 29–31] have described ECG surveillance utilizing IoT technologies. It is worthy of notice that the authors in [32] built an energy-saving system for wearable gadgets that performed real-time ECG compression and QRS detection. They suggested a method to efficiently improve QRS identification while using fewer computational resources. A further benefit of this system is that no multipliers were necessary. We have also seen the rise of smartwatches in recent times, and HR solutions are now among the most important features of wearables and activity trackers. Heart rate monitors are ideal for generating data such as on-the-spot measurements or resting heart rate information, which is an important indicator of health status. They are also very helpful for tracking HR data while people work out.
2.1 ECG Data Acquisition
The data used in most studies are taken from databases made publicly available online by PhysioNet [33], such as the MIT-BIH arrhythmia database (MIT-BIH), the European ST-T database (EDB), and the St. Petersburg INCART database (INCART). Each database also includes metadata for every record, in which cardiologists have assigned a diagnostic label to each heartbeat. The first phase of ECG analysis is the reading and processing of each record of the data source. Figure 2 highlights the structure of a general ECG analysis and heartbeat classification pipeline.
Fig. 2 A structure of a typical ECG analysis
2.2 Filtering
Multiple noise types, such as power line interference and baseline wander, are major noise sources that can greatly affect ECG signal analysis. Various filtering methods are used to remove these noise sources. A Fourier decomposition technique was employed by Singhal [34] to eliminate power line interference and baseline wander from ECG signals. In [35], the normal low-pass and high-pass filters of the wavelet transformation were replaced by fractional-order ones. Using low-pass and high-pass filters to remove noise remains the standard approach.
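As a concrete illustration of baseline wander removal, the sketch below subtracts a moving-average estimate of the slow baseline from the raw signal. This is a deliberately simple stand-in for the Fourier decomposition [34] and wavelet [35] techniques cited above; the 0.8 s window length is an illustrative assumption (long enough to smooth out individual QRS complexes).

```python
def moving_average(signal, window):
    """Centered moving average; approximates the slow baseline component."""
    half = window // 2
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def remove_baseline_wander(signal, fs, window_s=0.8):
    """Subtract a moving-average estimate of the drifting baseline.

    A window longer than one heartbeat smooths out the QRS complexes,
    leaving mainly the baseline drift, which is then subtracted.
    """
    baseline = moving_average(signal, int(window_s * fs))
    return [s - b for s, b in zip(signal, baseline)]
```

Power line interference (50/60 Hz) would additionally require a notch filter, which this sketch omits.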
2.3 Heartbeat Detection
Automatic, computerized detection of ECG attributes such as the QRS complex is a promising field of research, and specialized QRS detection algorithms have been proposed. Previous works eliminate noise sources with filters and emphasize the QRS complex using methodologies such as derivatives [36], sixth powers [37], the wavelet transform [38], empirical mode decomposition [39], and multiple transforms [40].
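The derivative-based family of detectors [36, 41] can be sketched in a few lines: differentiate, square, integrate over a moving window, and threshold. The fixed 50% threshold, the 80 ms integration window, and the 250 ms refractory period below are illustrative assumptions; real detectors such as Pan–Tompkins [41] add band-pass filtering and adaptive thresholds.

```python
def detect_r_peaks(signal, fs, refractory_s=0.25):
    """Toy QRS detector: differentiate, square, integrate, threshold."""
    # Differentiate and square to emphasize the steep QRS slopes.
    feat = [(signal[i + 1] - signal[i - 1]) ** 2
            for i in range(1, len(signal) - 1)]
    # Moving-window integration smooths the squared-slope signal.
    win = max(1, int(0.08 * fs))
    integ = [sum(feat[max(0, i - win):i + 1]) / win for i in range(len(feat))]
    thr = 0.5 * max(integ)
    peaks, skip = [], int(refractory_s * fs)
    i = 0
    while i < len(integ):
        if integ[i] > thr:
            # Refine: take the raw-signal maximum just after the crossing.
            j = max(range(i, min(i + win, len(integ))),
                    key=lambda k: signal[k + 1])
            peaks.append(j + 1)
            i += skip            # refractory period: no second R peak yet
        else:
            i += 1
    return peaks
```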
2.4 Heartbeat Segmentation
Based on the R peaks found, the ECG signal is segmented into single heartbeats. The annotated positions provided by the databases may be used, or any other automatic detection algorithm, such as the algorithms proposed in the literature for segmentation [41–43]. After obtaining the R peak locations, the heartbeats are segmented by taking the average of the RR intervals obtained from the R peak locations.
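The average-RR segmentation rule described above can be sketched as follows; the choice to center the window on the R peak and to skip beats truncated by the record boundaries is an illustrative assumption.

```python
def segment_heartbeats(signal, r_peaks):
    """Cut one fixed-width window per beat, centered on each R peak.

    The window is half the average RR interval on each side, so each
    segment covers roughly one cardiac cycle.
    """
    rr = [b - a for a, b in zip(r_peaks, r_peaks[1:])]
    half = int(sum(rr) / len(rr) / 2)    # half the average RR interval
    beats = []
    for r in r_peaks:
        lo, hi = r - half, r + half
        if lo >= 0 and hi <= len(signal):
            beats.append(signal[lo:hi])  # truncated edge beats are skipped
    return beats
```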
2.5 Feature Extraction
A feature extraction phase follows, producing for every heartbeat a feature vector consisting of fewer features than the samples forming the heartbeat's ECG signal. Different feature extraction approaches have been applied in the literature [44, 45]. The most significant features utilized for diagnostic purposes are [46]:
• RR-interval: The R-wave is frequently used as one of the most noticeable characteristics to define the timing of an ECG signal. The RR-interval is the time difference between two neighboring R-waves, which can become abnormal in the case of some heart issues, such as arrhythmia;
• PR-interval: Defined as the time between the onset of the P wave and the QRS complex. It shows the time the impulse requires to travel from the sinus node to the ventricles;
• QT-interval: Measured as the time between the beginning of the Q-wave and the end of the T-wave, associated with ventricular depolarization and re-polarization. When the QT interval exceeds the usual value, there is an elevated risk of atrial arrhythmia or heart failure;
• QRS-complex: Specifically linked with ventricular depolarization, consisting of the major waves, i.e., the Q-wave, R-wave, and S-wave. Disorders such as drug toxicity or electrolyte imbalance are likely to be identified by examining the morphology and duration of the QRS complex.
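The RR-interval features above are straightforward to compute once R-peak sample indices are available. The sketch below derives mean RR, heart rate, and RMSSD (a common heart-rate-variability measure); the specific feature set is an illustrative assumption, not the full feature vectors of [44–46].

```python
def rr_features(r_peaks, fs):
    """Basic RR-interval features (in ms) from R-peak sample indices."""
    rr_ms = [(b - a) * 1000.0 / fs for a, b in zip(r_peaks, r_peaks[1:])]
    mean_rr = sum(rr_ms) / len(rr_ms)
    # Root mean square of successive differences between RR intervals.
    diffs = [(y - x) ** 2 for x, y in zip(rr_ms, rr_ms[1:])]
    rmssd = (sum(diffs) / len(diffs)) ** 0.5 if diffs else 0.0
    return {"mean_rr_ms": mean_rr,
            "heart_rate_bpm": 60000.0 / mean_rr,
            "rmssd_ms": rmssd}
```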
2.6 Feature Selection
After the features are extracted from the heartbeats, a dimension reduction (feature selection) step selects relevant features, which are then fed to the classifier, while irrelevant features are removed. In addition to conventional dimension reduction methods, metaheuristic algorithms play an important role in this process; examples include particle swarm optimization, cuckoo search, Firefly, Bat, Grey Wolf Optimizer, and the Salp swarm algorithm [47, 48].
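To make the wrapper idea behind these metaheuristics concrete, the sketch below searches over binary inclusion masks with a minimal (1+1) mutate-and-keep loop. This is a deliberate simplification, not PSO or Grey Wolf Optimizer: those methods evolve whole populations of masks, but the principle is the same — a fitness function (in practice, classifier accuracy on the selected features) scores each candidate mask.

```python
import random

def select_features(fitness, n_features, iters=200, seed=1):
    """Wrapper feature selection via a minimal (1+1) evolutionary search.

    `fitness(mask)` scores a binary inclusion mask (higher is better).
    Each iteration flips one random bit and keeps the change if the
    score does not decrease.
    """
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_features)]
    best_fit = fitness(best)
    for _ in range(iters):
        cand = best[:]
        cand[rng.randrange(n_features)] ^= 1   # flip one feature bit
        f = fitness(cand)
        if f >= best_fit:
            best, best_fit = cand, f
    return best, best_fit
```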
2.7 Classification
The final stage consists of a classification model, such as a Support Vector Machine, Decision Tree, or Convolutional Network, that labels each heartbeat according to the labeling scheme used for the dataset, such as the Association for the Advancement of Medical Instrumentation (AAMI) recommended practice [49]. Houssein et al. [50] provide an ECG heartbeat classification model based on artificial intelligence, utilizing water wave optimization (WWO) behaviors and a support vector machine (SVM). In [51], an improved twin support vector machine based on a hybrid swarm optimizer was used to overcome some limitations of ECG classification, such as high computational time and slow convergence.
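As a minimal stand-in for the SVM-based classifiers cited above (not the WWO-SVM of [50]), the sketch below labels each feature vector with the class of the nearest per-class mean. The AAMI-style string labels are illustrative.

```python
def train_centroids(beats, labels):
    """Compute one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(beats, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def classify(x, centroids):
    """Assign the label of the nearest class centroid (squared distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist(centroids[y]))
```

An SVM instead learns a maximum-margin boundary and handles overlapping classes far better; the nearest-centroid rule only shows where the learned model sits in the pipeline.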
3 IoT Framework for ECG Monitoring System
The discussion now turns to the IoT framework for the ECG monitoring system. The system consists of three basic layers, namely topology, structure, and platform. Each layer in the IoT healthcare system provides a particular function.
3.1 IoT Topology
The topology manages the configuration of the overall IoT components and describes some typical setups of the IoT framework for the ECG monitoring scheme in specific application scenarios. For the healthcare topology, Fig. 3 introduces a standard IoT and cloud computing arrangement with three key elements:
Publish: Firstly, a publisher represents a network of linked wearable sensors (sensor modules) whose signal data can be measured using sensing circuits; a significant amount of raw ECG signal data is then continuously sent to the fog node. The fog nodes support additional communication protocols to accumulate wearable sensor data. For storage and processing, ECG data from a fog node is posted to the fog server.
Broker: Second, the data obtained from various ECG wearable devices are analyzed and evaluated using a fog server (broker) decision system. Three application types with different functionalities are used in the IoT cloud environment:
• HTTP server: It accepts requests based on the standard request-response process and responds accordingly. Users need to submit a GET request to the server using a Uniform Resource Locator (URL) to access the ECG data. After processing and decoding a request message, the server replies with a standard HTTP response.
Fig. 3 A structure of a typical IoT framework
• MQTT server: The transfer of the ECG signal from the tracking node to the website is carried out using the Message Queuing Telemetry Transport (MQTT) protocol. Unlike the standard HTTP protocol, MQTT is intended to preserve long-lived connections between the system and the clients.
• Storage servers: A data storage system with storage devices in which segments of ECG information are located; it plays a critical role in diagnosing cardiovascular disease and in its early detection. As such, timely and secure storage of ECG information is considered one of the IoT cloud's most critical features.
Subscribe: Finally, through the IoT cloud's Application Programming Interface (API), users can subscribe to specific information relating to the fog node monitored by the ECG. The user, who tracks patients directly, may access the data from any position and respond instantly if unforeseen incidents occur.
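The publish/broker/subscribe pattern above can be sketched with an in-memory toy broker. A real deployment would use an actual MQTT broker over long-lived TCP connections; this class (and the `patients/42/ecg` topic name) is purely illustrative of the topic routing that connects ECG publishers (fog nodes) to subscribers (clinician applications).

```python
class MiniBroker:
    """In-memory sketch of the MQTT-style publish/subscribe pattern."""

    def __init__(self):
        self.subscribers = {}          # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        # Deliver the payload to every subscriber of this topic.
        for cb in self.subscribers.get(topic, []):
            cb(topic, payload)

# Usage: a clinician app subscribes, a fog node publishes one ECG segment.
broker = MiniBroker()
received = []
broker.subscribe("patients/42/ecg", lambda t, p: received.append(p))
broker.publish("patients/42/ecg", [0.1, 0.9, 0.2])
```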
3.2 IoT Structure
IoT structures or architectures refer to how the actual IoT elements are organized, the techniques commonly utilized for communication among smart devices, and the gateway's key function. Wireless communication methods for the IoT components are divided into two main classes: short and medium range [52]. Although short-range communication enables transmission between objects within a medical body area network (MBAN), medium-range communications are typically utilized to enable communication between the base stations and the central MBAN nodes. Each communication strategy is addressed further below for healthcare systems based on IoT and cloud computing [53] (Fig. 4).
Fig. 4 A structure of a typical ECG analysis
• Short-range communication techniques: A signal can travel from several centimeters to several meters, and these techniques can apply to different network types. Among the most commonly used short-range communication methods are infrared, Bluetooth, and ZigBee; see Table 2 for the essential characteristics of the three techniques. Based on these distinguishing characteristics, ZigBee and Bluetooth are widely utilized for healthcare applications in cloud computing and IoT [54].
• Medium-range communication techniques: A Low Power Wide Area Network (LPWAN) is a form of wireless communication across a wide area network that is important for commercial IoT software. The range of LPWAN is considerably longer than that of short-range communication strategies; it carries short data pulses that can travel up to several miles at limited bandwidth with low power consumption [55]. This makes it ideal for healthcare applications focused on cloud computing and IoT, such as remote patient management and recovery. Among the most well-recognized LPWAN standards are the LoRaWAN and Sigfox protocols (Table 3).
3.3 IoT Platform
The platform relies on computing platforms and networking technologies; it ingests the large volume of data provided by wearable devices and various sensors and immediately performs real-time data analytics and services. The value of a network platform is that it can help ensure all sensors function smoothly so that users can easily
Table 2 Essential characteristics of Wi-Fi-, ZigBee-, and Bluetooth-based ECG sensor connections

| Characteristics | Wi-Fi based ECG sensor connection | ZigBee based ECG sensor connection | Bluetooth based ECG sensor connection |
|---|---|---|---|
| Protocol | IEEE 802.11 | IEEE 802.15.4 | IEEE 802.15.1 |
| Range | 20–200 m | 2–20 m | 20–30 m |
| Data rate | 1.4–6.8 MB/s | 1.25–31.25 kB/s | 0.34–3 MB/s |
| Power | Medium | Low | Low |
| Dependency | Data collection is independent of smart terminals; very low bit error rate | Smart terminals are needed for receiving and forwarding sensed data | Smart terminals are required for receiving and sending sensed data |
| Usage | Very high, supported by households | Medium, often supported by smartphones | Low, only supported by specific devices |

Table 3 Essential characteristics of the medium-range (LPWAN) techniques SigFox, LoRaWAN, and 6LoWPAN

| Characteristics | SigFox | LoRaWAN | 6LoWPAN |
|---|---|---|---|
| Range | 3–9.3 mi | 6.2–31 mi | 0.006–0.06 mi |
| Data rate | 12.5–75 byte/s | 122.5 byte/s to 2.7 kB/s | 31.25, 5, or 2.5 kB/s |
| Payload | 0–12 bytes | 19–250 bytes | 6 bytes (header) and 127 bytes (data unit) |
| Bandwidth | 100 Hz to 1.2 kHz | 125 and 500 kHz (915 MHz); 125 and 250 kHz (868 and 780 MHz) | 5 MHz (2.4 GHz); 2 MHz (915 MHz); 600 kHz (868.3 MHz) |
| Channels | 360 channels | 80 channels (902–928 MHz); 10 channels (779–787 and 863–870 MHz) | 16 channels (2.4 GHz); 10 channels (915 MHz); 1 channel (868.3 MHz) |
| Security | No encryption mechanism | Two common protection keys: NwkSKey provides data integrity, AppSKey provides data confidentiality | Handled at the link layer |
communicate with them. Studies in [56–58] discussed many problems of the IoT platform in the ECG monitoring system, such as interoperability and the difficulties that arise when cloud infrastructure is incorporated into healthcare applications. Each study is notable for resolving a particular problem; there is, however, no universal platform addressing the concerns of the healthcare system.
Sensor Platform for ECG in a Residential Environment. SPHERE (a Sensor Platform for HEalthcare in a Residential Environment) is an IoT platform of off-the-shelf and custom sensors [59] that seeks to build a picture of people living in their homes for different clinical applications. The wearable part of the SPHERE platform is a custom-made sensor worn on the wrist [60]. Figure 5 shows various modern wearable ECG sensors used in ECG monitoring systems.
Fig. 5 Modern ECG sensors
4 Major Issues for ECG Monitoring System
An ECG monitoring system, as presented in this chapter, includes multiple elements, variables, and different involved users, encompassing diverse technologies. This complexity and heterogeneity of the perspectives and elements of the ECG monitoring system pose a variety of issues that many researchers have outlined [61]. The following sections address the problems of ECG monitoring related to the use of monitoring instruments, signal quality, sensor design, reliability, data size, visualization, and integration.
4.1 Monitoring Device
In-home environments reveal that manual static scanning (conventional surveillance) has significant limitations, because sufferers need to understand and practice using surveillance equipment such as an ECG scanner and must acquire expertise in using heart-monitoring mobile apps, which can pose an obstacle for elderly and illiterate people [62]. It has also been stressed that patients can fail to conduct monitoring tasks at home in an ad hoc network setting, lowering the benefits of regular tracking. Therefore, alarms and screening alerts should be taken into consideration when developing a tracking system for the home [63]. Sun et al. [64] tackled the problem of electrode attachment by fastening electrodes, based on a chest-pressing, conductive cloth, to the inside of a T-shirt. This approach depends on holding the T-shirt in good contact with the user's chest to ensure a good ohmic link: to obtain a good signal, the T-shirt must be kept tight against the user's body so that all the electrodes touch the skin. Rachim and Chung [65] approached the electrical connection problem similarly, integrating a conductive, chest-pressing cloth electrode into a T-shirt; here too the T-shirt needs to be held close to the patient's body to guarantee that the electrode touches the skin and a usable signal is received.
4.2 Energy Efficiency
An IoT network contains numerous sensor nodes with limited storage and high power consumption [66–70]. As these nodes produce data continuously, the battery is drained quickly, and higher energy consumption shortens the lifespan of the network. Choosing the best sink node is one of the energy optimization techniques. Clustering is a mechanism for dividing the sensor nodes into groups and selecting each group's leader based on different parameters; the groups are called clusters, and the group leaders are called sinks. Clustering helps optimize the use of network resources, node failure management, load balancing, energy usage, and network lifetime. Some of the criteria used to pick the best sink include the transmission cost inside the cluster, sensor resources, and the position of the sink node. After the best sink is chosen, the cluster nodes send their data to it, and the compiled data is forwarded to the central node. Different metaheuristic algorithms are used to select the sink node and its position [71–75].
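The sink selection criterion can be sketched as a medoid-style scan: pick the node that minimizes the total intra-cluster transmission cost. The squared-Euclidean-distance cost is an assumed (simplified) radio-energy model, and the exhaustive scan is only a baseline; the metaheuristic methods cited above [71–75] are used when clusters are large or sink position is continuous.

```python
def choose_sink(nodes):
    """Pick the cluster sink minimizing total intra-cluster transmission cost.

    `nodes` is a list of (x, y) positions; cost is modelled as squared
    Euclidean distance, since radio energy grows with distance.
    """
    def total_cost(s):
        return sum((s[0] - x) ** 2 + (s[1] - y) ** 2 for x, y in nodes)
    return min(nodes, key=total_cost)
```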
4.3 Signal Quality
Patients may engage in real-life activities during real-time monitoring, such as physical exercise, which typically results in motion artifacts, signal noise, and degradation. Lee and Chung [76] illustrated in their research the value of integrating filtering methods into real-time tracking apparatus to eliminate motion artifacts while a person is running or exercising. They also suggested using an accelerometer as a reference source for the noise. High precision in continuous monitoring of intensive care units is very important. The ECG signal, however, is noisy and measured in millivolts, which emphasizes the need for good filtering and amplification techniques; this problem was stressed and dealt with in [77].
4.4 Data
Real-time monitoring is typically done over a comparatively longer time than conventional scanning (e.g., days and perhaps weeks). As a consequence, the volume of ECG signal data produced is typically high and occasionally enormous, and processing and decoding the signal becomes a very difficult task. This highlights the need for automated processing and evaluation of ECG signals in these tracking systems so as to produce meaningful alerts for both patients and caregivers. Bianchi et al. [78] noted that it is crucial to introduce intelligent feature extraction algorithms for real-time monitoring.
4.5 System Integration
Baig et al. [79] illustrated some system integration problems facing current clinical decision support structures relating to robustness and reliability, and suggested leveraging cloud services to tackle the integration problem in real time. Jovanov et al. [80] recommended the smooth integration of knowledge from sensors and other networks, and the use of public services and standard Internet technologies for authentication and secure communication.
5 Conclusion
This chapter has reviewed ECG monitoring device designs and implementations based on IoT and cloud techniques. At first, the ECG monitoring system architecture was illustrated, together with IoT and cloud computing techniques. We discussed and compared traditional ECG sensing networks, including Wi-Fi, Bluetooth, and ZigBee. An IoT-based ECG monitoring framework was then introduced, based on the proposed architecture. Real-time ECG signals can be obtained with acceptable precision via a wearable monitoring node with three electrodes. The collected data is transmitted to the IoT cloud via Wi-Fi, which allows high throughput and broad coverage areas. The IoT cloud presents the ECG information to the user and preserves these useful data for further study, implemented with various servers, i.e., the HTTP server, the MQTT server, and the database system.
References 1. L. Atzori, A. Iera, G. Morabito, The internet of things: a survey. Comput. Netw. 54(15), 2787– 2805 (2010) 2. A.S. Arefin, K.T. Nahiyan, M. Rabbani, The basics of healthcare IoT: data acquisition, medical devices, instrumentations and measurements, in A Handbook of Internet of Things in Biomedical and Cyber Physical System (Springer, 2020), pp. 1–37
3. E.H. Houssein, M.R. Saad, K. Hussain, W. Zhu, H. Shaban, M. Hassaballah, Optimal sink node placement in large scale wireless sensor networks based on Harris’ hawk optimization algorithm. IEEE Access 8, 19381–19397 (2020) 4. M.M. Ahmed, E.H. Houssein, A.E. Hassanien, A. Taha, E. Hassanien, Maximizing lifetime of large-scale wireless sensor networks using multi-objective whale optimization algorithm. Telecommun. Syst. 72(2), 243–259 (2019) 5. T. Wu, F. Wu, J.-M. Redoute, M.R. Yuce, An autonomous wireless body area network implementation towards IoT connected healthcare applications. IEEE Access 5, 11413–11422 (2017) 6. S. Chen, N. Wu, S. Lin, J. Duan, Z. Xu, Y. Pan, H. Zhang, Z. Xu, L. Huang, B. Hu et al., Hierarchical elastomer tuned self-powered pressure sensor for wearable multifunctional cardiovascular electronics. Nano Energy 70, 104460 (2020) 7. M. Foster, R. Brugarolas, K. Walker, S. Mealin, Z. Cleghern, S. Yuschak, J. Condit, D. Adin, J. Russenberger, M. Gruen et al., Preliminary evaluation of a wearable sensor system for heart rate assessment in guide dog puppies. IEEE Sens. J. (2020) 8. L. Al-Ghussain, S. El Bouri, H. Liu, D. Zheng et al., Clinical evaluation of stretchable and wearable inkjet-printed strain gauge sensor for respiratory rate monitoring at different measurements locations. J. Clin. Monit. Comput. 1–10 (2020) 9. H. Legenvre, M. Henke, H. Ruile, Making sense of the impact of the internet of things on purchasing and supply management: a tension perspective. J. Purchas. Supply Manag. 26(1), 100596 (2020) 10. S. Selvaraj, S. Sundaravaradhan, Challenges and opportunities in IoT healthcare systems: a systematic review. SN Appl. Sci. 2(1), 139 (2020) 11. A. Karim, A. Siddiqa, Z. Safdar, M. Razzaq, S.A. Gillani, H. Tahir, S. Kiran, E. Ahmed, M. Imran, Big data management in participatory sensing: issues, trends and future directions. Future Gener. Comput. Syst. 107, 942–955 (2020) 12. P. O’Donovan, C. Gallagher, K. Leahy, D.T. 
O’Sullivan, A comparison of fog and cloud computing cyber-physical interfaces for Industry 4.0 real-time embedded machine learning engineering applications. Comput. Ind. 110, 12–35 (2019) 13. A. Baouya, S. Chehida, S. Bensalem, M. Bozga, Fog computing and blockchain for massive IoT deployment, in 2020 9th Mediterranean Conference on Embedded Computing (MECO) (IEEE, 2020), pp. 1–4 14. H. Djelouat, M. Al Disi, I. Boukhenoufa, A. Amira, F. Bensaali, C. Kotronis, E. Politi, M. Nikolaidou, G. Dimitrakopoulos, Real-time ecg monitoring using compressive sensing on a heterogeneous multicore edge-device. Microprocessors Microsystems 72, 102839 (2020) 15. M. Kilany, E.H. Houssein, A.E. Hassanien, A. Badr, Hybrid water wave optimization and support vector machine to improve EMG signal classification for neurogenic disorders, in 2017 12th International Conference on Computer Engineering and Systems (ICCES) (IEEE, 2017), pp. 686–691 16. E.H. Houssein, A.E. Hassanien, A.A. Ismaeel, EEG signals classification for epileptic detection: a review, in Proceedings of the Second International Conference on Internet of things, Data and Cloud Computing (2017), pp. 1–9 17. A. Hamad, E.H. Houssein, A.E. Hassanien, A.A. Fahmy, A hybrid eeg signals classification approach based on grey wolf optimizer enhanced SVMs for epileptic detection, in International Conference on Advanced Intelligent Systems and Informatics (Springer, 2017), pp. 108–117 18. S. Liu, Y. Li, X. Zeng, H. Wang, P. Yin, L. Wang, Y. Liu, J. Liu, J. Qi, S. Ran et al., Burden of cardiovascular diseases in china, 1990–2016: findings from the 2016 global burden of disease study. JAMA Cardiol. 4(4), 342–352 (2019) 19. M. Batra, J. Linsky, R. Heydon, B. Redding, L.G. Richardson, Bluetooth connectionless slave broadcast burst mode, US Patent App. 16/250,837, 19 March 2020 20. S.R. Islam, D. Kwak, M.H. Kabir, M. Hossain, K.-S. Kwak, The internet of things for health care: a comprehensive survey. IEEE Access 3, 678–708 (2015) 21. A.A. 
Mutlag, M.K. Abd Ghani, N.A. Arunkumar, M.A. Mohammed, O. Mohd, Enabling technologies for fog computing in healthcare IoT systems. Future Gener. Comput. Syst. 90, 62–78 (2019)
Combinatorial Optimization for Artificial Intelligence Enabled Mobile Network Automation

Furqan Ahmed, Muhammad Zeeshan Asghar, and Ali Imran

F. Ahmed (B): Elisa Corporation, Helsinki, Finland, e-mail: [email protected]
M. Z. Asghar: Aalto University, Espoo, Finland, e-mail: [email protected]; University of Jyväskylä, Jyväskylä, Finland
A. Imran: University of Oklahoma, Norman, OK, USA, e-mail: [email protected]

1 Introduction

The tremendous amount of data collected in mobile networks has led to increased interest in exploring the potential of artificial intelligence (AI) based approaches for improving the performance of mobile networks [1]. This motivates the development of intelligent network automation algorithms powered by cloud-based big data platforms, open-source software components, and emerging software development practices. As a consequence, machine learning approaches, including supervised, unsupervised, and reinforcement learning, have garnered enormous interest in the field of mobile network automation and self-organizing networks (SON) [44, 54–56]. These approaches are mostly useful for detection, estimation, and prediction problems related to the analysis of network performance data. Examples include monitoring of network key performance indicators (KPIs), performance prediction and analysis, and anomaly detection and root-cause analysis. However, the potential applications of other AI approaches related to combinatorial optimization, such as metaheuristics, have not yet been thoroughly investigated. A number of problems related to parameter configuration and optimization in mobile networks are discrete in nature, and therefore particularly suited to the application of combinatorial optimization techniques [8, 17]. In this regard, intelligent algorithms based on heuristics and metaheuristics are particularly promising for complex computational problems relevant to mobile networks. Moreover, networks are getting increasingly complex
and difficult to optimize, not only due to a continuous increase in subscribers but also due to emerging use-cases that require massive data rates, ultra-reliable communications, and machine type communications. Apart from the large number of configuration management parameters that need to be configured during roll-out and optimized in a continuous manner, heterogeneity in terms of technologies, frequencies, cell sizes, and usage patterns further emphasizes the need for more powerful and intelligent optimization methods. Challenges related to 5G spectrum sharing approaches and ultra-reliable communications are discussed in [10, 60], respectively. An overview of important SON and network automation use-cases for 5G and beyond is presented in [7]. In LTE, the self-organizing network (SON) paradigm was introduced to address network complexity via different use-cases for handling tasks related to network configuration, optimization, performance monitoring, and maintenance in an automated manner [38]. This leads not only to better performance but also to high agility with a reduced need for human intervention, thereby reducing network capital expenditure (CAPEX) and operational expenditure (OPEX). In the context of SON for cellular networks, network automation use-cases usually fall into three main categories, namely self-configuration, self-optimization, and self-healing [19]. Self-configuration involves automation of parameter configuration tasks, including download and installation of software followed by configuration of basic roll-out parameters such as neighbor cell lists, physical cell identifiers (PCIs), and physical random access channel (PRACH) parameters. In contrast, self-optimization entails analysis of configuration and performance data, followed by continuous optimization of network parameters.
Use-cases such as coverage and capacity optimization (CCO), inter-cell interference coordination (ICIC), mobility robustness optimization (MRO), mobility load balancing (MLB), and energy savings (ES) fall into this category. Use-cases such as PCI, PRACH, and automated neighbor relations (ANR) can also be included in this category, because continuous optimization of these parameters is important for efficient network operation. Self-healing use-cases aim at early detection and resolution of anomalies and potential performance degradation. Examples include sleeping cell and cell outage detection and compensation, as well as general anomaly detection and root cause analysis. Moreover, general principles of self-organization are applicable to emerging applications, including the internet of things and machine to machine communications [4, 58, 59]. Regarding the implementation of SON use-cases, several architectural approaches have been considered, e.g. centralized, distributed, and hybrid. Centralized approaches enable better network-wide decisions, but may not be suitable for demanding 5G applications with extreme latency requirements. Distributed SON solutions are usually vendor specific features integrated directly into radio base stations. These have faster response times and are particularly useful in certain cases (e.g. ANR). The combination of distributed and centralized approaches has led to the development of hybrid approaches as well, where functions working at shorter time cycles are handled using distributed SON, whereas a more intelligent centralized controller optimizes higher level parameters at a slower pace. The controller essentially works as an optimization engine, running on a powerful platform that has access to
network data. Apart from computing power and the availability of data, the performance is strongly dependent on the selected algorithms and the underlying models used for optimizing the network parameters. Regarding problem formulation, both discrete and continuous optimization models are used. Continuous optimization is often applied to problems such as interference coordination [9, 33], power control [6, 11, 15], resource allocation [5], and network synchronization [12], whereas discrete models are relevant to PCI assignment [13] and orthogonal resource assignment [14] in mobile networks. As a number of important problems can be modeled as combinatorial optimization problems, the use of metaheuristics in conjunction with a centralized cloud based controller seems a very promising research area, especially from the perspective of 5G and beyond. This chapter attempts to give an overview of state-of-the-art heuristics and metaheuristics based algorithms for combinatorial optimization problems relevant to mobile networks. The rest of the chapter is organized as follows. Section 2 describes important network automation use-cases that are well suited to the application of AI based approaches, including discrete and continuous optimization algorithms. Section 3 presents details regarding relevant metaheuristics for solving combinatorial optimization problems. In Sect. 4, we discuss specific algorithms and how they are applied to solve problems related to mobile network planning and optimization. A detailed case study on the PCI assignment use-case is presented in Sect. 5. Finally, we draw our conclusions and discuss possible directions for future work in Sect. 6.
2 Network Automation Use-Cases

2.1 Physical Cell ID (PCI) Assignment

The physical cell identifier (PCI) is an important parameter for Long Term Evolution (LTE) radio configuration. It is based on a reference signal sequence that acts as a unique physical layer identifier for an LTE cell. The reference signal sequences are used as identifiers because they are constructed in a way that ensures robustness against interference, which is important for fast and reliable identification. In fact, a PCI can be read within a very short timeframe (i.e., 5 ms), and enables the user equipment (UE) to uniquely identify neighboring cells in a given geographical area. Therefore, it is important to have a proper reuse distance so that no two neighbors measured by a UE have the same PCI. Essentially, two main types of constraints are taken into account for PCI assignment: collisions and confusions. A collision means that two neighboring cells are operating on the same frequency channel with the same PCI, whereas a confusion occurs when a given cell has two neighboring cells on the same frequency channel with the same PCI. It is worth mentioning that collisions and confusions result not only from planning errors, but also from the roll-out of new cells, leading to the degradation of KPIs and related network quality issues.
[Fig. 1 PCI assignment (collision and confusion): (a) PCI assignment (cells A, A, C); (b) confused PCI assignment (cells A, A, C); (c) confusion-free PCI assignment (cells A, B, C)]
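The collision and confusion constraints can be expressed as simple checks over a neighbor-cell graph. The following sketch (the toy topology, cell names, and function name are our own illustrative assumptions, not from the chapter) flags both conditions for same-frequency neighbors:

```python
def pci_conflicts(neighbors, pci):
    """neighbors: dict cell -> set of same-frequency neighbor cells;
    pci: dict cell -> assigned PCI. Returns (collisions, confused cells)."""
    collisions, confusions = set(), set()
    for cell, nbrs in neighbors.items():
        # Collision: a cell shares its PCI with a direct neighbor.
        for n in nbrs:
            if pci[n] == pci[cell]:
                collisions.add(frozenset((cell, n)))
        # Confusion: two distinct neighbors of `cell` share a PCI.
        seen = set()
        for n in nbrs:
            if pci[n] in seen:
                confusions.add(cell)
            seen.add(pci[n])
    return collisions, confusions

# Mirrors Fig. 1b: cell "X" has two neighbors that both use PCI 0.
neighbors = {"X": {"N1", "N2"}, "N1": {"X"}, "N2": {"X"}}
pci = {"X": 2, "N1": 0, "N2": 0}
print(pci_conflicts(neighbors, pci))  # no collisions; "X" is confused
```

In a real SON function the neighbor relations would come from ANR tables rather than a hand-written dictionary.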
The aim of the PCI use-case is to optimize the network-wide PCI assignment and ensure there are no collisions or confusions in the network. The concepts of collision and confusion are illustrated in Figs. 1 and 2. In Fig. 1a, there are two cells with PCI A; these would be in collision if they were neighbors. Then, a new cell with PCI C appears, which is confused as shown in Fig. 1b, because two of its neighbors have the same PCI, equal to A. This confusion is resolved in Fig. 1c by changing the PCI of one of its neighbors to B. Likewise, Fig. 2 illustrates a collision- and confusion-free PCI assignment at the network level. It can be seen that the same three PCIs can be used to achieve collision- and confusion-freeness in a larger network. Thus, intelligent assignment of PCIs is of paramount importance in achieving network-wide collision- and confusion-freeness. Moreover, the PCI also serves as a control parameter for uplink and downlink reference signal allocation [64]. For example, the frequency shift of the downlink reference signal is determined by the PCI modulo 3. Likewise, a modulo 30 principle is used for uplink reference signals. The PCI comprises two components, namely the primary synchronization signal (PSS) and the secondary synchronization signal (SSS). In LTE, the possible values for the PSS are 0 to 2 and for the SSS 0 to 167, which makes the total number of available PCIs in LTE equal to 3 × 168 = 504. Thus, reuse is inevitable in ultra-dense deployments. However, in 5G new radio (NR), the range of the SSS is increased to
[Fig. 2 Network-wide PCI assignment (collision-free and confusion-free), using cell IDs A, B, and C]
335 (i.e., 336 values), which makes the total number of PCIs equal to 1008. There are two phases in the PCI use-case, i.e., the deployment phase and the operational phase. The deployment phase can be considered a self-configuration use-case. When an eNB is deployed and activated for the first time, it detects the PCIs of neighboring cells using a network monitoring mode or radio environment monitoring function. This helps to exclude those PCIs from the list of allowed PCIs. In the operational phase, on the other hand, the eNB has usually already established connections with neighboring cells via the X2 interface, which enables it to fetch neighboring cell information from the ANR table. PCI optimization is run as a continuous process, mainly to avoid confusions. We defer the discussion of the PCI assignment problem to Sect. 5, where it is discussed in more detail as a case study.
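Because collisions and confusions are pairwise constraints, confusion-free PCI assignment can be viewed as coloring a conflict graph in which cells at distance one (collision) or distance two (confusion) must receive different PCIs. The greedy heuristic below is a minimal constructive sketch under that formulation; the topology and function names are hypothetical, and the metaheuristics discussed later in the chapter would be used to improve on such constructive solutions:

```python
from itertools import combinations

def assign_pcis(neighbors, num_pcis=504):
    """Greedy coloring of the PCI conflict graph.
    neighbors: dict cell -> set of same-frequency neighbor cells."""
    # Build the conflict graph: neighbors (collision constraints) and
    # pairs sharing a common neighbor (confusion constraints).
    conflict = {c: set() for c in neighbors}
    for cell, nbrs in neighbors.items():
        for n in nbrs:
            conflict[cell].add(n)
            conflict[n].add(cell)
        for a, b in combinations(nbrs, 2):
            conflict[a].add(b)
            conflict[b].add(a)
    pci = {}
    # Color highest-degree cells first, a common greedy ordering.
    for cell in sorted(conflict, key=lambda c: -len(conflict[c])):
        used = {pci[n] for n in conflict[cell] if n in pci}
        pci[cell] = next(p for p in range(num_pcis) if p not in used)
    return pci

# A small chain of four cells: 0 - 1 - 2 - 3.
print(assign_pcis({0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}))
```

Greedy coloring gives no optimality guarantee (PCI reuse minimization is a graph-coloring problem and hence NP-hard in general), which is precisely why metaheuristic refinement is of interest here.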
2.2 PRACH Configuration

In LTE, the random access procedure is used by the UE to acquire uplink synchronization and to access the network for transmitting signaling and data. It involves the transmission of a preamble by the UE to the eNB. The basic principle behind the random access process is shown in Fig. 3. There are two types of random access procedure: contention based and non-contention based. In contention based random access, the UE transmits a preamble in an available subframe, and the eNB responds on the physical downlink control channel (PDCCH) and physical downlink shared channel (PDSCH); timing advance information is also measured. Next, the UE sends its ID in message 3. If the UE is in idle state, Non-Access Stratum (NAS) information is provided; otherwise, the Access Stratum (AS) ID (the cell radio network temporary identifier) is used to perform contention resolution, which is followed by an acknowledgment from the UE. In non-contention based random access, a dedicated preamble is assigned to the UE during the handover process, since this access is collision-free.

[Fig. 3 Contention and non-contention based random access. Contention based: preamble on RACH; preamble response on PDCCH+PDSCH; message no. 3 on RACH; contention resolution message on PDSCH; acknowledgement. Non-contention based: assignment of dedicated preamble sequence; dedicated preamble on RACH; preamble response on PDCCH+PDSCH]

In order to maximize orthogonality between users performing the RACH procedure, Zadoff-Chu (ZC) sequences are used to generate the preambles in each cell. ZC sequences are chosen due to their constant amplitude and good auto- and cross-correlation properties. The total number of ZC root sequences according to 3GPP is 838. Therefore, it is important to plan them properly. Depending on the cell radius, more than one root sequence may be required. A UE needs to know which sequences it is allowed to use to generate the required preambles, and this information needs to be signaled in the cell in an efficient manner. To this end, only the index of the first sequence, i.e. the root sequence index (RSI), is broadcast in the cell. The order of sequences is predefined on the basis of criteria such as configuration management properties and the maximum number of cyclic shifts. Therefore, it is possible for the UE to derive the preambles using the RSI as a base index. Thus, the RSI is a logical parameter that is used as a base to calculate the preambles; this logical index is mapped to a physical RSI during the implementation phase. The UE gets information about the cyclic shift from the zero correlation zone configuration and applies it to the base index to generate the preambles. The resulting preamble sequences should not overlap with the sequences in neighbor cells. Overlapping preamble sequences lead to the reservation of PDCCH and physical uplink shared channel (PUSCH) resources in cells that receive such ghost preambles. Due to the limited number of RSIs, it is not possible to assign a unique RSI to every cell in the network. Under such practical constraints, a reuse distance is defined and the assignment of RSIs is done in such a way that RSIs within the reuse cluster do not conflict. Thus, the aim of the PRACH parameter optimization use-case is to optimize the RSI assignment at the network level, because non-conflicting RSIs are critical to the generation of correct preambles.
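The pressure on RSI planning can be illustrated with a back-of-envelope calculation. Assuming the standard LTE values (64 preambles per cell and ZC sequence length 839 for preamble formats 0–3), with the cyclic shift N_CS obtained from the zero correlation zone configuration, the number of root sequences a cell consumes grows with cell radius, since a larger round-trip delay forces a larger N_CS:

```python
import math

N_ZC = 839              # ZC sequence length for preamble formats 0-3
PREAMBLES_PER_CELL = 64  # preambles each LTE cell must provide

def roots_needed(n_cs):
    """Root sequences a cell consumes (unrestricted set, n_cs > 0)."""
    shifts_per_root = N_ZC // n_cs          # cyclic shifts per root
    return math.ceil(PREAMBLES_PER_CELL / shifts_per_root)

# Example N_CS values from small to large cells (illustrative picks).
for n_cs in (13, 59, 119, 419):
    print(n_cs, roots_needed(n_cs))
```

With N_CS = 13 a cell needs a single root sequence, whereas with N_CS = 419 it consumes 32 of the 838 available roots, which makes conflict-free reuse planning markedly harder in networks with large cells.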
2.3 Mobility Robustness Optimization

The mobility robustness optimization (MRO) use-case is meant for detecting handover problems (early, late, and ping-pong handovers) and reducing them by adjusting mobility configuration parameters in an automated manner. The general task of MRO is to ensure proper mobility, which requires proper handovers in connected mode and cell reselection in idle mode. The main goals of MRO are to minimize call drops, radio link failures (RLFs), unnecessary handovers, and other idle mode problems. A mobility problem can have several outcomes; the worst case is a call drop, which leads to poor user experience. Increased RLFs are also an indication of mobility problems; when an RLF happens, a connection re-establishment is required. Unnecessary handovers are those that occur too early, too late, or in a repeated manner. Repeated back-and-forth handovers between a given pair of cells within a short time are known as ping-pong handovers. Unnecessary handovers cause inefficient use of network resources and may impact the user experience as well. Examples of mobility parameters include cell level parameters such as handover offsets (e.g. A3 and A5), hysteresis, and time to trigger (TTT), as well as cell pair level parameters such as cell individual offsets (CIOs). In order to avoid problems such as RLFs and unnecessary handovers, it is important to configure these parameters according to coverage and user speed [47]. As discussed in [40], it is possible to mitigate these problems by changing the configuration of cell individual offsets for a given cell pair in a controlled manner. Furthermore, it is worth mentioning that the handover procedure in LTE is UE assisted: reference signal received power (RSRP) and reference signal received quality (RSRQ) levels of the serving cell are continuously measured, followed by the generation of measurement reports whenever the signal level crosses a certain threshold. In fact, a number of handover events corresponding to different thresholds have been defined for both intra- and inter-frequency handovers.
For example, an A3 event is entered when a neighbor cell becomes better than the source cell by an offset value. Consequently, it is important to identify the events causing mobility problems in the network, and devise a complete MRO strategy accordingly.
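As an illustration of how the too-late, too-early, and ping-pong cases described above might be separated in practice, the sketch below labels simplified mobility event records. The record format, field names, and the ping-pong window are our own illustrative assumptions, not taken from the chapter or the 3GPP specifications:

```python
PING_PONG_WINDOW = 5.0  # seconds; assumed threshold

def classify(event):
    """Label a simplified mobility event record (dict)."""
    if event["type"] == "rlf_before_ho":
        # Dropped in the source cell before any handover was
        # triggered: the handover came too late.
        return "too_late_ho"
    if event["type"] == "rlf_after_ho" and event.get("reestablished_in") == "source":
        # RLF shortly after the handover with re-establishment in the
        # old source cell: the handover came too early.
        return "too_early_ho"
    if event["type"] == "ho_back" and event["dwell_time"] < PING_PONG_WINDOW:
        # Handed back to the previous cell within a short window.
        return "ping_pong"
    return "normal"

print(classify({"type": "ho_back", "dwell_time": 2.1}))  # ping_pong
```

An MRO function would aggregate such labels per cell pair and then adjust hysteresis, TTT, or the CIO accordingly.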
2.4 Mobility Load Balancing

Mobility load balancing (MLB) is considered one of the most important use-cases for handling congestion and improving user experience in the network. The key steps involved are monitoring the cell load and moving UEs from highly loaded cells to lightly loaded cells. In active mode, this can be done by changing neighbor cell parameters such as the CIO, which essentially adjusts the border between the source cell and the destination cell. In idle mode, on the other hand, users can be pushed to a less congested target frequency layer by changing cell reselection parameters such as threshold high. The general handover process is shown in Fig. 4. The UEs continuously measure the signal strengths (RSRPs) from all neighboring eNBs and compare them with the RSRP of the serving eNB. Referring to Fig. 4, the UE is initially served by eNB-A. Then, the measured RSRP of eNB-B becomes higher than the measured RSRP of eNB-A at t0. However, to prevent the network from performing unnecessary handovers, the execution is delayed to make
[Fig. 4 Handover process with hysteresis and TTT parameters]
sure there is a certain difference between the RSRPs of eNB-A and eNB-B (i.e. hysteresis). This criterion is achieved at t1 in Fig. 4; however, the handover is not yet initiated. The hysteresis criterion must be met for a certain amount of time, i.e. the TTT, to mitigate the ping-pong effect. Finally, both the hysteresis and TTT conditions are fulfilled, and the handover is initiated at t2. The handover execution phase is then carried out between t2 and t3, and the UE is finally connected to eNB-B at t3. It is important to note that in addition to hysteresis and TTT, neighbor cell specific parameters such as the CIO (not shown in Fig. 4) also impact handovers. However, this impact is limited to the source cell and its neighbor target cell; in fact, cells may have different CIO values for different neighbors. The CIO parameter, in general, indicates the willingness of a cell for incoming and outgoing handovers. For instance, increasing the CIO makes incoming handovers harder and outgoing handovers easier. In LTE, there is an event based handover trigger mechanism, where handovers are triggered when certain events take place. For example, the A3 event, one of the trigger events, occurs when the UE's signal strength measurement from a neighboring eNB becomes better than that of the serving eNB. Therefore, the A3 event can be triggered at t0 in Fig. 4.
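The hysteresis-plus-TTT logic just described can be sketched as a small sample-based simulation. The RSRP traces, hysteresis value, and TTT (expressed in samples rather than milliseconds) below are purely illustrative:

```python
def handover_decision(serving_rsrp, neighbor_rsrp, hysteresis=3.0, ttt=3):
    """Return the sample index at which the handover triggers, or None.
    Triggers once the neighbor exceeds the serving cell by `hysteresis`
    dB for `ttt` consecutive samples (the Fig. 4 behavior)."""
    streak = 0
    for t, (s, n) in enumerate(zip(serving_rsrp, neighbor_rsrp)):
        streak = streak + 1 if n > s + hysteresis else 0
        if streak >= ttt:
            return t
    return None

serving  = [-90, -91, -92, -93, -94, -95, -96]  # dBm, UE moving away
neighbor = [-95, -92, -90, -89, -89, -88, -88]  # dBm, UE approaching
print(handover_decision(serving, neighbor))  # triggers at sample 5
```

Although the neighbor first exceeds the serving cell at sample 2, the hysteresis margin is only satisfied from sample 3 onward, and the TTT of three samples delays the trigger to sample 5, which is exactly the ping-pong protection the section describes.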
3 Combinatorial Optimization Algorithms

3.1 Preliminaries

Combinatorial optimization is an important area of mathematical optimization that involves searching for the best possible solution from a finite set of alternatives. It is a highly interdisciplinary field that lies at the intersection of a number of areas, including discrete mathematics, theoretical computer science, AI, and operations research. Practical applications include VLSI design, communication network planning and optimization, vehicle routing, and machine scheduling, to name a few [3, 68]. The complexity of these problems also differs widely. There are some
problems with a low degree of computational complexity that can be solved optimally in polynomial time, where optimality is defined in terms of a cost function that provides a quantitative measure of solution quality. However, a large number of problems are NP-hard and consequently difficult to solve. For such problems, it is generally believed that an optimal solution cannot be found within polynomially bounded computation time. These are tackled using three main approaches: enumerative methods, approximation algorithms that run in polynomial time, and heuristics [2, 3, 42]. The enumerative method guarantees optimality, but may not be feasible for large problem instances. Approximation algorithms are designed to run in polynomial time with a provable guarantee on solution quality; however, these guarantees may be weak or may not even exist for problems that are hard to approximate. Lastly, heuristics provide feasible solutions but do not offer any guarantee on optimality, solution quality, or solution time. In general, the algorithms used for combinatorial optimization problems are divided into two main categories, namely complete and approximate algorithms [3, 26]. Complete algorithms guarantee an optimal solution in bounded time for every finite-size instance of a combinatorial optimization problem. However, from the perspective of NP-hard problems, approximate algorithms capable of finding a near-optimal solution in a reasonable amount of time are more important. In this context, approximate algorithms are usually divided into two main types, namely constructive heuristics and local search methods. Constructive heuristics create a solution from scratch by adding elements to an initially empty partial solution. These methods are fast but often return solutions of inferior quality compared to local search methods.
In fact, local search based algorithms are often extremely effective, and therefore, widely used for solving combinatorial optimization problems as explained in the following discussion.
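As a concrete example of a local search method, the sketch below applies single-variable moves with a non-worsening acceptance rule to a toy conflict-minimization problem. All names, the cost function, and the parameter values are illustrative, not a specific algorithm from the chapter:

```python
import random

def local_search(cost, initial, values, max_iters=1000, seed=0):
    """Local search over single-variable moves, accepting any move
    that does not worsen the cost (so it can walk along plateaus)."""
    rng = random.Random(seed)
    current = list(initial)
    for _ in range(max_iters):
        if cost(current) == 0:               # optimum for this toy cost
            break
        i = rng.randrange(len(current))       # pick a variable at random
        candidate = list(current)
        candidate[i] = rng.choice(values)     # try a new value for it
        if cost(candidate) <= cost(current):  # non-worsening acceptance
            current = candidate
    return current

# Toy cost: number of adjacent equal values (a 1-D coloring conflict).
cost = lambda x: sum(a == b for a, b in zip(x, x[1:]))
solution = local_search(cost, [0, 0, 0, 0, 0], values=[0, 1, 2])
print(solution, cost(solution))
```

Accepting equal-cost moves is a mild diversification mechanism; full metaheuristics such as simulated annealing or tabu search, discussed next, add more principled ways of escaping local minima.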
3.2 Heuristics and Metaheuristics

Apart from general heuristics such as local search, more advanced and problem-specific approaches known as metaheuristics have been developed with wide ranging applications. First developed during the early 1990s, metaheuristics combine basic heuristics in a higher level framework to search the solution space (a set where each element is a possible solution) in an efficient manner [26]. The aim is to find a solution to the combinatorial optimization problem in a reasonable time, without any guarantees on solution quality. These methods are approximate and usually non-deterministic, and are particularly effective when the solution space is so huge that classical search techniques (e.g. complete algorithms) are too time consuming. Metaheuristics are not problem-specific and often rely on some mechanism for avoiding entrapment in confined areas of the search space. However, they may make use of domain-specific knowledge in the form of suitable heuristics. In order to search the solution space, metaheuristics make use of two key concepts called intensification and diversification. Intensification exploits the accumulated
F. Ahmed et al.
knowledge, thereby focusing on regions that give good quality solutions, whereas diversification ensures exploration of the search space. These two forces determine the behavior of metaheuristics, and are highly effective when balanced properly for the problem at hand. Combinatorial optimization essentially involves searching the solution space for an integer number, set, permutation, or graph structure. In most problems, not only is the number of states enormous, but there are also constraints between variables, which motivates the combinatorial optimization type of problem formulation. Thus, for NP-hard problems, metaheuristics based on intensification and diversification are often used to avoid entrapment in local minima, which enables searching the solution space in an efficient manner. More systematic search methods used for constraint programming can also be applied. Nevertheless, in practical problems, it is important to have an efficient search strategy that finds a good enough solution within a certain time bound, rather than exhaustively searching all possible states for the global optimum. These methods range from simple local search algorithms to complex and sophisticated learning processes, and perform reasonably well on a variety of problems of practical interest. Due to their diverse characteristics and interdisciplinary origins, metaheuristics can be classified in a number of ways, as summarized below [25, 26].

• single point versus population based approaches
• bio-inspired versus non-bio-inspired
• dynamic versus static objective function
• one versus variable neighborhood structures
• memory based versus memory-less algorithms.
Among these, the single point versus population based classification is considered the most important, as it emphasizes the algorithm design principles [25]. Accordingly, the two main categories are trajectory metaheuristics and population based metaheuristics, as explained in the following discussion.
3.3 Trajectory Metaheuristics

In trajectory based metaheuristics, the search process follows a trajectory in the search space. Starting from an initial state, the trajectory evolves over time as the algorithm searches for a solution. Thus, the algorithm typically uses a single solution at a time, and replaces it as the iterations proceed. This search technique can be understood as a discrete dynamical system following a trajectory. Basic algorithms follow relatively simple trajectories, which usually comprise a transient phase and an attractor phase. On the other hand, advanced algorithms may have very complex trajectories comprising multiple phases, designed to balance exploitation and exploration characteristics. The trajectory of the algorithm in the search space characterizes its behavior and effectiveness. In general, the outcome is influenced by the algorithm, the problem representation, as well as the problem instance. The problem representation and the particular instance under consideration determine the search landscape,
Combinatorial Optimization for Artificial Intelligence …
whereas the algorithm characterizes the strategy used to search the landscape for a solution. Simple algorithms based on iterative improvement of the current solution, such as iterative local search, often get stuck in a local optimum and are at times unable to find a satisfactory solution. Therefore, it is important to have a termination criterion in the algorithm. Examples include the maximum number of iterations, the number of iterations without improvement, and the algorithm execution time, to name a few. Important categories of trajectory metaheuristics include local search based metaheuristics (e.g. iterated local search, guided local search and variable neighborhood search), simulated annealing, and tabu search. Local search methods work by starting from a candidate solution, and improving it iteratively using local moves (in a suitably defined neighborhood of the current solution), until a good enough local optimum is found. Local search algorithms are not complete, in that they do not guarantee convergence. For instance, in iterative local search a new move is accepted only if the resulting solution is better than the current solution. Thus, it can get stuck in a local minimum. In contrast, the simulated annealing metaheuristic was designed to escape such situations. It essentially balances exploration and exploitation to search the configuration space effectively. Apart from plateau moves, a key feature of simulated annealing is uphill moves, which enable it to escape local minima in search of an optimal solution. Moreover, in simulated annealing, the probability of uphill moves is controlled by a noise parameter (temperature). A cooling schedule is defined (e.g. exponential, logarithmic), according to which the temperature is gradually lowered. On the other hand, tabu search is remarkably different, as it makes use of the history of the search both for escaping local minima and for exploring better solutions.
The basic tabu search can be viewed as a best improvement local search with an additional short-term memory of recently visited solutions, known as the tabu list, used to avoid getting stuck in local minima and cycles.
3.4 Population Based Metaheuristics

Population based metaheuristics make use of a population of solutions rather than a single solution, which then evolves over iterations, where each iteration corresponds to a generation. Algorithms inspired by natural and biological phenomena, such as evolution, are included in this category. The inspiration comes from the fact that these phenomena enable gradual changes in the characteristics of a population over time, leading to an improvement. Accordingly, population based optimization approaches work on a number of candidate solutions, where the search is guided using population related characteristics. The use of a population enables a more natural and effective approach toward the search for a solution. Evolutionary computation and ant colony optimization constitute this category of metaheuristics. Evolutionary computation algorithms work by transforming the current generation of solutions into a new generation using cross-over operators (applied to two or more solutions) and mutation operators (applied to individual solutions). Individuals for the next generation are selected based on a criterion involving quality or fitness. Thus, the aim is to gradually improve
the quality of solutions using the natural principle of survival of the fittest. Examples of evolutionary computation include genetic algorithms. On the other hand, metaheuristics such as swarm intelligence rely on the collective behavior of a population of decentralized agents that self-organize and produce intelligent behavior at the global level. Notable examples include ant colony optimization and particle swarm optimization. Ant colony optimization algorithms are based on the pheromone model (trail pheromone is a chemical produced by ants exhibiting foraging behavior), a parameterized probabilistic model which is subsequently used by artificial ants to construct solutions and update pheromone values, so as to converge to a high quality solution [32].
4 Applications in Mobile Network Automation

4.1 Local Search

The local search principle involves local changes by individual agents to move in the configuration space from one point to another, satisfying constraints or avoiding conflicts locally. The search space is explored by making perturbations to the existing configuration, known as local moves. Application of local search to the planning of base station locations and the optimization of cell parameters in LTE networks is discussed in [53]. The authors apply the well-known Shannon-Hartley formula, modified with appropriate bandwidth and signal-to-noise ratio (SNR) efficiency factors, to estimate the LTE downlink air interface data rate. This helps in characterizing the cell loads in large radio networks, thereby resulting in a system of conservative cell load and call acceptance probability equations. The resulting equations are solved using an iterative procedure to estimate the network capacity. In order to plan the network to fulfill the capacity needs, a simple local search algorithm for the selection of base station locations and the optimization of cell parameters is used. Likewise, planning and updating of tracking areas in LTE is addressed in [62]. The underlying problem is NP-hard, and is solved using an algorithm based on repeated local search. In order to update the tracking areas, re-optimization is performed, which reassigns cells to tracking areas while taking into account the traffic affected by the re-assignments. Variable neighborhood search for network slicing in 5G networks is discussed in [67]. The authors propose a robust heuristic based joint slice recovery and reconfiguration algorithm for handling uncertainty in the traffic demands.
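The modified Shannon-Hartley estimate of the kind described above can be sketched in a few lines. The efficiency factor values below are illustrative assumptions, not the calibrated values used in [53]:

```python
import math

def lte_downlink_rate(bandwidth_hz, snr_linear, bw_eff=0.6, snr_eff=1.25):
    """Estimate an LTE downlink data rate (bit/s) via a modified
    Shannon-Hartley formula: rate = bw_eff * B * log2(1 + SNR / snr_eff).
    The efficiency factors bw_eff and snr_eff are placeholder values."""
    return bw_eff * bandwidth_hz * math.log2(1.0 + snr_linear / snr_eff)

# Example: 10 MHz carrier at a linear SNR of 10 (i.e. 10 dB)
rate = lte_downlink_rate(10e6, 10.0)
```

Such per-cell rate estimates can then feed the cell load equations that the local search iterates over.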
Resource allocation in the context of vehicle-to-everything (V2X) communications, enabled by cellular device-to-device (D2D) links, is considered in [66], where the objective is to maximize the total throughput of non-safety vehicular users subject to satisfying the requirements of cellular users and of safety vehicular users. The resulting three-dimensional matching problem is subsequently solved using a local search based approximation algorithm. Simulation results show that the proposed
algorithm outperforms the existing scheme in terms of both throughput performance and computational complexity. Moreover, the use of local search algorithms for conflict-free resource allocation on network graphs is an important area of research. In fact, for certain configuration parameters, the optimization problem can be directly formulated as a graph coloring problem. For graph coloring, a local move is the change of color by one vertex. Two states that are connected by a local move are neighbors in the configuration space. The configuration space is searched by a sequence of local moves taken by vertices. To this end, a cost function is defined, e.g. in terms of the number of conflicts. A local move that decreases the cost is known as a downhill move. For graph coloring, a local move that reduces the number of conflicts of a given vertex is a downhill move. One of the key features of the configuration space of colorings on undirected graphs is the existence of plateaus, i.e., neighboring states with the same number of conflicts. A greedy local search in which each vertex picks the best move often gets trapped in local minima, which are often located at plateaus. A main feature of more capable local search algorithms is that they can move on the plateaus and avoid getting trapped in local minima. In this context, a plateau move is a local move that keeps the number of conflicts unchanged. These moves are important for escaping from local minima on plateaus, and help in reaching a conflict-free state. For instance, local search algorithms have been used for PCI assignment and primary component carrier selection, modeled as a graph coloring problem [16]. The algorithms take into account interference prices defined for the conflicts. In the case of real-valued interference pricing of conflicts, rapid convergence to a local optimum is achieved. On the other hand, binary interference pricing may lead to a global optimum.
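The downhill/plateau/uphill classification of local moves for graph coloring can be made concrete with a small sketch; the graph and function names are illustrative:

```python
def conflicts(graph, coloring, v):
    """Number of neighbors of v that share v's color."""
    return sum(1 for u in graph[v] if coloring[u] == coloring[v])

def classify_move(graph, coloring, v, new_color):
    """Classify recoloring vertex v with new_color as a
    downhill, plateau, or uphill move."""
    before = conflicts(graph, coloring, v)
    trial = dict(coloring)
    trial[v] = new_color
    after = conflicts(graph, trial, v)
    if after < before:
        return "downhill"
    if after == before:
        return "plateau"
    return "uphill"

# Triangle graph; vertices 0 and 1 currently conflict (same color)
g = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
c = {0: 0, 1: 0, 2: 1}
move = classify_move(g, c, 1, 2)  # recoloring 1 resolves its conflict
```

A greedy search takes only downhill moves; plateau-capable algorithms also accept moves that leave the conflict count unchanged.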
In [13], the PCI assignment problem is addressed using distributed local search algorithms. A heterogeneous networking scenario with multiple operators sharing spectrum in the small-cell layer is considered. The algorithms are based on local search combined with the focused search principle, which allows moves only to the cells with unsatisfied constraints related to collisions and confusions. The algorithms considered are fully distributed and do not involve message-passing between the vertices belonging to different operators.
4.2 Simulated Annealing

Another effective strategy for avoiding getting stuck in local minima is based on using random uphill moves with a small probability. This principle is used by the well-known simulated annealing metaheuristic, which essentially balances exploration and exploitation to search the solution space in an efficient manner. In particular, the hill-climbing feature and the plateau moves play a key role in escaping from local minima and long plateaus, whereas downhill moves ensure attraction to the global optimum. The frequency of uphill moves is controlled via a temperature parameter, which is set beforehand and reduced according to a cooling schedule as the algorithm progresses. In the context of SON for LTE, simulated annealing has
been applied to a number of use-cases. In [29], the authors propose simulated annealing for adjusting cell level MRO parameters such as A3-offset, TTT, and handover margin. The proposed solution comprises two main steps: initial setup of handover parameters using Page-Hinkley based handover decisions, followed by an iterative process based on simulated annealing that further optimizes the initial configuration taking into account current radio conditions. Simulation results show that the proposed approach is effective for different user speed profiles, particularly when compared to static manual settings. Existing work on simulated annealing for CCO includes [22], which proposes centralized SON for joint coverage and capacity optimization. In order to improve coverage and capacity, both transmit power and antenna downtilt are optimized at each eNB in the network. Due to the huge search space of the problem, finding an optimal solution is non-trivial. This motivates the use of simulated annealing for enabling efficient search while avoiding local minima, leading to significant improvements in coverage and capacity. A general discrete resource assignment framework for LTE small cells is proposed in [14]. It also includes a rigorous discussion of different variants of simulated annealing. The orthogonal resource allocation problem is modeled using a graph coloring formulation. A planar graph is used to model interference relations between randomly deployed, mutually interfering cells. Different variants of simulated annealing are investigated for coloring the planar graph. Apart from the standard and fixed temperature variants, these include simulated annealing with a focusing heuristic (i.e., limiting the local moves only to the cells that are in conflict). Moreover, algorithms for both static and dynamic topologies are presented. Distributed simulated annealing algorithms do not require any dedicated message-passing between the cells.
However, in the case of dynamic topology, a distributed temperature control protocol based on message-passing is needed. The performance metrics considered include the number of cells with resource conflicts, the number of resource reconfigurations required by the cells to resolve the conflicts, the requirements on dedicated message-passing between the cells, and the sensitivity to the temperature parameter that guides the stochastic search process. Optimization of base station planning is discussed in [69], where simulated annealing is used to determine the positions of base stations in a given coverage area for both macrocell and heterogeneous scenarios. Resource allocation optimization for downlink multiuser scheduling in LTE is considered in [21]. The high complexity of finding an optimal solution motivates the application of simulated annealing, which leads to a near optimal solution with reduced complexity. Likewise, a multi-objective optimization framework for downlink power allocation is proposed in [36]. In addition to high energy savings, the use of simulated annealing improves both network capacity and cell edge throughput. Furthermore, a centralized approach for enabling energy savings via base station on-off optimization is addressed in [46]. The optimization problem consists of an objective function that takes into account outage throughput and energy efficiency. Results show a significant reduction in energy consumption and increased throughput, particularly in heterogeneous networks.
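The simulated annealing principle discussed in this section (uphill moves accepted with temperature-controlled probability, temperature lowered by a cooling schedule) can be illustrated by a generic sketch on a toy integer problem; all parameter values here are arbitrary assumptions, not tuned settings from the cited works:

```python
import math
import random

def anneal(cost, neighbor, x0, t0=1.0, alpha=0.95, iters=2000, seed=1):
    """Generic simulated annealing: downhill and plateau moves are always
    accepted, uphill moves with probability exp(-delta / T); the temperature
    follows an exponential cooling schedule T <- alpha * T."""
    rng = random.Random(seed)
    x, t = x0, t0
    best, best_cost = x0, cost(x0)
    for _ in range(iters):
        y = neighbor(x, rng)
        delta = cost(y) - cost(x)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = y
            if cost(x) < best_cost:
                best, best_cost = x, cost(x)
        t *= alpha  # cooling schedule
    return best

# Toy problem: minimize (x - 3)^2 over the integers, starting far away
sol = anneal(lambda x: (x - 3) ** 2,
             lambda x, rng: x + rng.choice([-1, 1]), x0=10)
```

In the network use-cases above, `cost` would count resource conflicts or coverage/capacity penalties, and `neighbor` would reconfigure one cell's parameter.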
4.3 Tabu Search

Tabu search is a highly popular and widely used metaheuristic in the field of combinatorial optimization, with wide ranging applications. It makes use of the history of the search not only to avoid getting stuck in local minima, but also to enable exploration of the solution space. It consists of a local search component and a short term memory that is used to prevent the algorithm from revisiting the same states. To this end, a tabu list is constructed that keeps track of recently visited solutions. Thus, the neighborhood of the current solution is restricted to the solutions that are not in the tabu list. The list is updated in every iteration, in that the current solution is added and one of the previous ones is removed. In LTE networks, tabu search has been studied for network planning and optimization. For instance, a service-oriented optimization framework that takes into account both technical and economic factors is presented in [37]. The problem is formulated using a mixed integer programming model, and solved by Pareto front and multi-objective tabu search. A similar model is studied in [65] for the LTE network planning problem. The aim is to plan the locations of small cells for capacity increase in a heterogeneous network. In LTE networks, the cell loads are coupled in a non-linear way due to mutual interference, which makes the problem NP-hard. In this case, the mixed integer linear programming problem is solved using a dynamic tabu search algorithm. Throughput maximization for uplink LTE with multicast D2D underlay in a multicell network is discussed in [57]. A resource allocation and power control framework is used to mitigate the uplink interference. Moreover, a tabu search algorithm is developed, which maximizes the probability of finding an optimal solution, resulting in an improved sum throughput of the network.
In [41], the authors apply tabu search for optimizing the tracking area code (TAC) configuration, where the aim is to maximize the paging success rate while taking into account traffic load, TAC size, and handover patterns. Likewise, Safa and Ahmad [63] discuss tabu search for the timing advance reconfiguration problem in LTE networks. To this end, both signaling overhead and reconfiguration cost are taken into account. Results show that tabu search is able to outperform simpler memory-less metaheuristics such as genetic algorithms. Joint allocation of multiple resources (e.g. modulation and coding, subcarriers, and power) to users is discussed in [52]. The high complexity of the optimization problem motivates the application of metaheuristics. Results show that the application of decomposition in conjunction with tabu search yields near optimal solutions, alleviating the need for prior frequency planning for inter-cell interference mitigation. The approach is distributed and enables packing a higher number of users in the same bandwidth, thereby reducing the outage probability of the users. A similar joint resource allocation problem for D2D communication underlaying cellular systems, aimed at increasing spectral efficiency, is discussed in [20]. The optimization problem consists of joint power control and resource block assignment, while meeting the constraints on minimum rate requirements and the power budget of the users. A decomposition method is applied to decouple the problem into simpler problems, which are then solved individually. In particular, resource block assignment is reduced to the
three-dimensional matching problem, which is solved for a near-optimal solution using tabu search. Moreover, tabu search outperforms other approaches, including genetic algorithms. In [43], the authors propose tabu search for downlink packet scheduling in LTE networks, which is achieved by optimizing the scheduling and resource allocation scheme under quality-of-service (QoS) constraints. The aim is to maximize the system's sum throughput under a fair distribution of the available resource blocks. Simulation results show the effectiveness of the proposed approach in improving throughput while ensuring fairness.
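The short-term memory mechanism of tabu search can be sketched as follows. This is a deliberate simplification: the tabu list stores recently visited solutions rather than moves, aspiration criteria are omitted, and the toy problem is illustrative:

```python
from collections import deque

def tabu_search(cost, neighbors, x0, tenure=5, iters=100):
    """Best-improvement local search with a short-term tabu list of
    recently visited solutions (length bounded by the tabu tenure)."""
    x = x0
    best, best_cost = x0, cost(x0)
    tabu = deque([x0], maxlen=tenure)  # oldest entries evicted automatically
    for _ in range(iters):
        candidates = [y for y in neighbors(x) if y not in tabu]
        if not candidates:
            break
        x = min(candidates, key=cost)  # best non-tabu neighbor, even if worse
        tabu.append(x)
        if cost(x) < best_cost:
            best, best_cost = x, cost(x)
    return best

# Toy problem: minimize |x - 7| over the integers; neighborhood {x-1, x+1}
sol = tabu_search(lambda x: abs(x - 7), lambda x: [x - 1, x + 1], x0=0)
```

Note that the search keeps moving even after reaching the optimum; the best solution seen so far is tracked separately, which is the standard practice.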
4.4 Evolutionary Computation

Evolutionary computation is a broad area comprising evolutionary algorithms as well as a number of other approaches, including ant colony optimization (inspired by the foraging behavior of ants) [32], swarm intelligence (e.g. particle swarm optimization), and genetic and memetic algorithms. The main idea behind this approach is to use a population based metaheuristic inspired by nature's principle of survival of the fittest. In fact, evolutionary computation algorithms can be considered as computational models of evolutionary processes, where in each iteration cross-over and mutation operators are applied to the whole population of solutions to create the next generation of solutions. The cross-over operators are used to combine candidate solutions to form new solutions, whereas mutation operators enable self-adaptation of individual solutions. Joint power control and scheduling of resource blocks for ICIC in a multi-cell LTE scenario is discussed in [51]. The authors propose a distributed optimization framework based on evolutionary potential games, whereas the Lagrangian multiplier method is employed to solve the main constrained optimization problem. The particle swarm optimization method is applied subsequently to find the optimal power allocation and scheduling for each resource block. Results show that the proposed algorithm significantly improves a number of performance metrics, including the overall throughput, while maintaining fairness. Swarm intelligence algorithms have also been applied to the cell planning problem in LTE networks [35]. The objective is to find base station locations while taking into account both cell coverage and capacity constraints. To this end, the particle swarm optimization metaheuristic is applied, which gives suboptimal base station locations. The redundant base stations are subsequently eliminated using an iterative algorithm.
Results obtained using Monte Carlo simulations in different scenarios show that the proposed approach always meets QoS targets and can be effectively used for planning base station locations under strict requirements related to temporal traffic variations and location constraints due to limits on electromagnetic radiation. Likewise, base station placement in LTE heterogeneous networks has been studied using evolutionary algorithms [48], where an optimum network planning approach for large-scale LTE heterogeneous networks is presented. The original problem is decomposed into a number of subproblems, where subproblems are assigned cells that are correlated
due to mutual interference. This form of grouping leads to better performance in terms of system throughput compared to existing approaches. In [36], the authors discuss evolutionary algorithms for downlink power allocation in LTE/LTE-A networks. The aim is to minimize inter-cell interference, thereby improving network performance and users' QoS. For general resource allocation problems, ant colony optimization algorithms are particularly well-suited. For instance, in [49], ant colony optimization is used to find a Pareto-optimal solution to a multi-objective resource allocation problem, which consists of three objective functions, namely resource utilization efficiency, degree of fairness, and interference. Joint allocation of resources and modulation and coding schemes using an ant colony optimization based algorithm is discussed in [30], where the objective is to minimize the number of allocated resource blocks in a closed femtocell under constraints on the minimum throughput for each user. The optimization problem is formulated as an NP-hard integer linear programming problem, which is then solved using different ant colony optimization based algorithms. Similar approaches for joint resource allocation and relay selection in LTE networks are investigated in [71]. Resource sharing for network sum rate maximization while guaranteeing the QoS of D2D communications in V2V networks is discussed in [34]. In LTE networks with MIMO and coordinated multipoint transmissions, ant colony optimization can be employed to select the best solution from a set of possible solutions [31], resulting in better system throughput compared to a dynamic interference avoidance scheme. Furthermore, ant colony optimization algorithms can also be applied to mobility related use-cases such as MRO [70] and MLB [27]. In [70], an automated handover optimization framework is discussed for femtocells in LTE.
The proposed approach uses an ant colony optimization algorithm to differentiate between special areas and general areas. This is followed by the configuration of different handover parameters, such as hysteresis margin, for the different areas that exist in the same femtocell coverage. Results show that the proposed method leads to a significant reduction in the number of handovers, with no negative impact on network performance. On the other hand, ant colony optimization is used to resolve unbalanced distribution of traffic via load balancing in [27]. To this end, the load of all cells in the LTE network is estimated, and based on the load estimates, users are selected to be handed over to neighboring cells, thereby achieving load balancing. This leads to a significant reduction in the number of unsatisfied users and in the handover failure rate. A summary of different combinatorial optimization techniques and their use in mobile network automation use-cases is given in Table 1.
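The pheromone model underlying ant colony optimization can be illustrated by its update step: evaporation of all pheromone values, followed by a deposit on the components of constructed solutions proportional to solution quality. The sketch below is a generic simplification, not the algorithm of any cited work:

```python
def aco_pheromone_update(pheromone, solutions, cost, rho=0.1):
    """One ACO pheromone update: evaporate by factor (1 - rho), then
    deposit 1/cost(solution) on each edge used by each solution."""
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)          # evaporation
    for sol in solutions:
        deposit = 1.0 / cost(sol)               # better solutions deposit more
        for edge in zip(sol, sol[1:]):          # consecutive-node components
            if edge in pheromone:
                pheromone[edge] += deposit
    return pheromone

# Toy instance: two routes from node 0 to node 2
lengths = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 5.0}
route_cost = lambda path: sum(lengths[e] for e in zip(path, path[1:]))
pher = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
aco_pheromone_update(pher, [[0, 1, 2], [0, 2]], route_cost)
```

After the update, the edges of the shorter route carry more pheromone than the direct but longer edge, so artificial ants sampling from the pheromone model increasingly favor the better route.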
5 Greedy Heuristics for PCI Assignment: A Case Study

In this section, we discuss some well-known greedy heuristics for the PCI assignment use-case, which is modeled as a graph coloring problem. Graph coloring is a classic combinatorial optimization problem, well suited for the application of approximate methods in general, and heuristics and metaheuristics in particular. As mentioned
Table 1 Applications of combinatorial optimization in network automation

SON use-case | Combinatorial optimization technique | Algorithm approach | Reference work
PCI | Trajectory | Local search | [13, 16]
PCI | Trajectory | Tabu search | [23]
MRO | Trajectory | Simulated annealing | [29]
MRO | Population | Ant colony optimization | [70]
MLB | Trajectory | Simulated annealing | [18]
MLB | Population | Ant colony optimization | [27]
ES | Trajectory | Simulated annealing | [46]
CCO | Trajectory | Simulated annealing | [22]
TAC | Trajectory | Tabu search | [41, 63]
ICIC | Population | Particle swarm optimization | [51]
Network planning | Population | Swarm intelligence | [35]
Network planning | Population | Evolutionary algorithms | [48]
Resource allocation | Trajectory | Tabu search | [65]
Resource allocation | Population | Evolutionary algorithms | [36]
Resource allocation | Population | Ant colony optimization | [30, 49, 71]
previously, PCI assignment should be collision- and confusion-free. Here, we model these constraints by adding edges to first tier and second tier neighbors. The resulting graph is colored using greedy heuristics, where colors correspond to the set of PCIs. A successful coloring leads to a collision- and confusion-free PCI assignment. The performance is analyzed in terms of the number of PCIs needed to color the network graph, and the execution time of the algorithm.
5.1 Model

We model an LTE network by an undirected graph $G(\mathcal{V}, \mathcal{E})$ comprising a set $\mathcal{V}$ of vertices and a set $\mathcal{E}$ of edges. An edge $e_{i,j} \in \mathcal{E}$ connects a pair of vertices $v_i, v_j \in \mathcal{V}$, for $i \neq j$. Two vertices connected by an edge are said to be adjacent to each other. The set of vertices adjacent to vertex $v_i$ constitutes its neighborhood $\mathcal{N}_{v_i}$, and the number of edges incident to vertex $v_i$ is known as the degree of $v_i$, denoted by $\deg(v_i)$. The vertices and edges denote the LTE cells in the network, and the neighbor relations that exist between them, respectively. Neighbor relations or adjacencies
are based on mutual interference between cells, which is dependent on a number of propagation factors including spatial separation and antenna directions. According to graph coloring terminology, a N -coloring for a given set of colors N = {1, . . . , N }, is defined as function c : V → N . Two vertices vi , v j connected by an edge are said to be conflicting if they are assigned the same color, i.e., c(vi ) = c(v j ). In contrast, N -coloring is defined to be legal, if colors are assigned to the vertices in a way that there are no conflicting edges. Here, we are interested in legal N -coloring of G(V, E), where V = |V| is the cardinality of the set of cells (i.e. total number of cells in the network), and N = |N | is the cardinality of the set of valid PCIs. Thus, N -coloring problem is equivalent to a PCI assignment problem with N PCIs. It is possible to express the constraints in mathematical form as follows. Let xv,n be binary variable that is equal to one if cell v is assigned PCI n, and zero otherwise. Then, the constraint that every cell needs to be assigned a valid PCI can be expressed as: N
xv,n = 1, v ∈ V.
(1)
n=1
Here, V is the set of all the cells in the network. The constraint that no two adjacent cells should have the same PCI can be formulated as: xi,m + x j,m ≤ 1, m ∈ N ,
(2)
for all adjacent cell pairs (i.e. vertices connected by an edge ei, j ) in the network. Thus, these two constraints need to be satisfied for a valid PCI assignment. Next, we discuss construction of the network graph. To this end, we model edges between the vertices based on interference couplings. In order to calculate interference coupling or cost between the cells, it is important to take into account distances and antenna bearings. The first step is to calculate distance between cell locations (given in latitude, longitude). Using the distances, we calculate path losses between cells using COST-Hata urban model Received power is calculated considering path loss and transmit antenna bearing and received location, with Azimuth antenna pattern given by θ 2 , Am [dB] , (3) A(θ ) = min 12 θ3dB where Am = 20 and θ3dB = 70◦ .
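Constraints (1) and (2), together with the azimuth pattern of Eq. (3), can be checked with a short sketch; the data structures (cell list, edge list, assignment dictionary) are illustrative choices:

```python
def azimuth_attenuation(theta_deg, theta_3db=70.0, a_m=20.0):
    """Azimuth antenna attenuation in dB, per Eq. (3):
    A(theta) = min{12 (theta / theta_3dB)^2, A_m}."""
    return min(12.0 * (theta_deg / theta_3db) ** 2, a_m)

def valid_pci_assignment(cells, edges, assignment, num_pcis):
    """Check constraints (1) and (2): every cell carries exactly one
    valid PCI, and no two adjacent cells share a PCI."""
    # Constraint (1): each cell has exactly one PCI in {0, ..., N-1}
    if any(v not in assignment or not 0 <= assignment[v] < num_pcis
           for v in cells):
        return False
    # Constraint (2): adjacent cells must use different PCIs
    return all(assignment[i] != assignment[j] for (i, j) in edges)

# Toy three-cell chain: cells 0-1 and 1-2 are adjacent
cells = [0, 1, 2]
edges = [(0, 1), (1, 2)]
ok = valid_pci_assignment(cells, edges, {0: 0, 1: 1, 2: 0}, num_pcis=3)
```

In graph coloring terms, `valid_pci_assignment` simply tests whether the assignment is a legal coloring of the network graph.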
5.2 Algorithms

The NP-hardness of the graph coloring problem necessitates the use of fast heuristics for the determination of a (suboptimal) solution, especially in practical scenarios where the underlying graph is dynamic and mandates the use of polynomial time
algorithms. We consider heuristics based on the greedy principle: a highly intuitive approach where a locally optimal choice is made at every decision step. In the context of graph coloring, a greedy algorithm refers to the approach where colors are assigned to vertices in a direct sequential manner. The resulting algorithms are suboptimal but fast, making it possible to find a solution in polynomial time. This approach is also called sequential coloring, as it colors the vertices in a given sequence. Input arguments include a graph (i.e. sets of vertices and edges), and a sequence S = (v1, ..., vV) which the algorithm uses to assign the colors. The design of S is critical to the performance of the algorithm and can be done in a number of ways. In what follows, we briefly discuss a few classic greedy algorithms used for graph coloring. For a detailed discussion, the reader is referred to [45].
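The basic greedy sequential coloring described above can be sketched as follows, assigning each vertex in the sequence S the smallest color not already used in its neighborhood:

```python
def greedy_sequential_coloring(graph, sequence):
    """Color vertices in the order given by `sequence`; each vertex
    takes the smallest color not used by its already-colored neighbors."""
    coloring = {}
    for v in sequence:
        used = {coloring[u] for u in graph[v] if u in coloring}
        color = 0
        while color in used:  # first-fit: smallest free color
            color += 1
        coloring[v] = color
    return coloring

# Path graph 0-1-2: two colors suffice regardless of the sequence
g = {0: [1], 1: [0, 2], 2: [1]}
coloring = greedy_sequential_coloring(g, [0, 1, 2])
```

The variants discussed next (DSATUR, random sequential, independent set) differ essentially in how the sequence of vertices to color is chosen.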
5.2.1
DSATUR (DS)
The DSATUR algorithm, also known as saturation largest first, follows the principle of coloring harder vertices first. It takes into account the degree of saturation ρ(v) (hence the name DSATUR), which is defined as the number of distinct colors used in the neighborhood of vertex v. This works because the constraints on the set of available colors for a given vertex v depend on the set of colors used in the neighborhood Nv , rather than on the degree deg(v).
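A compact sketch of DSATUR, assuming the same adjacency-mapping graph representation as above (function name is ours): at every step it colors the uncolored vertex with the highest saturation ρ(v), breaking ties by degree.

```python
def dsatur_coloring(adj):
    """DSATUR: repeatedly color the uncolored vertex with the highest
    saturation (number of distinct neighbor colors), ties broken by degree."""
    color = {}
    uncolored = set(adj)
    while uncolored:
        # saturation rho(v) = distinct colors already used in the neighborhood
        v = max(uncolored,
                key=lambda u: (len({color[w] for w in adj[u] if w in color}),
                               len(adj[u])))
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
        uncolored.remove(v)
    return color
```

Like any greedy scheme, the number of colors used never exceeds the maximum degree plus one.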
5.2.2
Random Sequential (RS)
A basic variant of sequential coloring can be devised by simply using a random sequence of vertices S = (v1 , . . . , vV ). Accordingly, it is called random sequential, and is considered a baseline for evaluating the performance of sequential coloring algorithms on a given class of graphs. It is the simplest variant to implement, as it does not entail any logic for sequence generation.
5.2.3
Independent Set (IS)
Independent set coloring is based on the maximal independent set algorithm. In the first step, a maximal independent set of vertices is computed. All vertices in the set are assigned the first available color. The colored vertices are subsequently removed from the graph and the same procedure is repeated. For example, consider color ci , assigned to all possible vertices in a maximal independent set (an important property of such a set is that no two vertices are adjacent). In the next step, the algorithm removes all these vertices from the original set of vertices, and continues with the remaining subgraph and color ci+1 . Note that the maximal independent set computed in each step ensures that the maximum number of vertices is assigned the next available color.
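The procedure above can be sketched as follows (a minimal illustration with a hypothetical function name, assuming the adjacency-mapping representation and orderable vertex labels): each round builds one maximal independent set among the remaining vertices and assigns it one color.

```python
def independent_set_coloring(adj):
    """Color classes built as maximal independent sets: greedily pick a
    maximal independent set among the remaining vertices, give it one
    color, remove it from the graph, and repeat."""
    color = {}
    remaining = set(adj)
    c = 0
    while remaining:
        independent = set()
        for v in sorted(remaining):            # deterministic scan order
            if adj[v].isdisjoint(independent): # v not adjacent to the set
                independent.add(v)
        for v in independent:
            color[v] = c
        remaining -= independent
        c += 1
    return color
```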
Combinatorial Optimization for Artificial Intelligence …
683
5.3 Results

In order to construct the network graph, we use data from an operational LTE network consisting of 273 cells, deployed in Finland. The data comprises cell locations, antenna heights, bearings, channel frequencies, and other related configuration management parameters. The parameter values are given in Table 2. Based on the given parameters, the received signal-to-noise ratio (SNR) from interfering cells is calculated, and cells from which the received SNR exceeds a predefined threshold are considered potential interferers (i.e. neighbors that should not use the same PCI). Each cell is a vertex in the graph and adds edges between itself and the set of potential interferers, forming the adjacency matrix corresponding to first tier neighbors (for PCI collisions). Edges corresponding to second tier neighbors are added next, to include confusion constraints. Both first tier and second tier adjacency matrices are symmetrized and self-couplings are removed. The resulting two tier adjacency matrix represents the network graph that is subsequently used to run PCI assignment algorithms, as it includes both collision and confusion constraints. The cells correspond to the vertices connected by edges forming the graph; the PCIs are colors to be assigned to nodes so that no two nodes connected by an edge have the same color. Figure 5 shows instances of network graphs projected on geographical maps, where the graph in Fig. 5a corresponds to a one tier adjacency matrix that includes only collision constraints. On the other hand, the graph in Fig. 5b includes both collision and confusion constraints, which clearly makes it much harder to color. Note that in this case, the threshold to add neighbors as potential interferers is set to 30. It is worth noting that increasing this threshold means that, for a given vertex, edges are added only to those neighbors from which the received power is high. Thus, higher thresholds make the graph sparser and easier to color.
As a consequence, fewer colors are needed to color it, and fewer PCIs are needed to achieve collision- and confusion-freeness. As shown in Table 3, the average number of neighbors increases rapidly as the threshold is reduced. For instance, when the threshold is 10, the average number of neighbors is 272, which means that the graph is fully connected, since the total number of vertices is 273.
Table 2 Network parameters

Parameter               Value
No. of LTE cells        273
Frequency band          1800 MHz
Transmit power          36 dBm
System bandwidth        20 MHz
Noise spectral density  −176 dBm/Hz
Noise figure            9 dB
Pathloss model          COST-Hata
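The two-tier graph construction described above can be sketched in code. The following illustration is ours, not the chapter's implementation: the received-power matrix and threshold are hypothetical inputs, first-tier (collision) edges are thresholded and symmetrized, and second-tier (confusion) edges connect neighbors of neighbors, with self-couplings removed.

```python
import itertools

def two_tier_adjacency(rx_power, threshold):
    """Build the combined collision + confusion adjacency matrix from a
    hypothetical received-power matrix rx_power[i][j] (dB)."""
    n = len(rx_power)
    tier1 = [[False] * n for _ in range(n)]
    for i, j in itertools.permutations(range(n), 2):
        if rx_power[i][j] > threshold:
            tier1[i][j] = tier1[j][i] = True      # symmetrize tier-1 edges
    tier2 = [row[:] for row in tier1]
    for i, k, j in itertools.permutations(range(n), 3):
        if tier1[i][k] and tier1[k][j]:
            tier2[i][j] = tier2[j][i] = True      # second-tier (confusion) edges
    for i in range(n):
        tier2[i][i] = False                       # remove self-couplings
    return tier2
```

With this construction, two cells that share a strong common neighbor end up adjacent even if they do not interfere directly, which is exactly the confusion constraint.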
Fig. 5 Network graphs projected on geographical maps: (a) collision constraints, (b) collision and confusion constraints

Table 3 Coloring difficulty

Threshold                  10     20      30      40     50
No. of neighbors (mean)    272    228.55  104.80  32.65  8.82
No. of neighbors (min)     272    26      9       2      0
No. of neighbors (max)     272    272     188     103    39
In order to analyze the coloring performance, we run the following three greedy heuristic algorithms for collision- and confusion-free PCI assignment in the LTE network represented by the graph. The set of thresholds used is given in Table 3.

• DSATUR
• random sequential
• independent set.

The comparison of the three algorithms in terms of the number of PCIs and the algorithm execution time is shown in Fig. 6. It can be seen in Fig. 6a that the number of PCIs drops sharply when the threshold is increased, as this reduces the number of edges between the vertices. Moreover, DSATUR requires the least number of PCIs, followed by random sequential and independent set. For instance, when the threshold is
Fig. 6 Performance of greedy heuristic algorithms for PCI assignment: (a) number of PCIs, (b) execution time
set to 30, the number of PCIs needed for successful coloring by DSATUR, random sequential, and independent set is 97, 100, and 106, respectively. Interestingly, the difference vanishes at the extremes, i.e. when the graph is either trivial to color or fully connected. Figure 6b shows the execution times corresponding to Fig. 6a. Here, random sequential performs slightly better than DSATUR in the low threshold region, whereas the independent set algorithm needs significantly more time to successfully color the graph. For the high threshold case, the algorithms converge to the same value as the coloring problem becomes trivial. Nevertheless, all three algorithms are able to achieve a valid collision- and confusion-free PCI assignment.
6 Conclusions and Future Work

6.1 Conclusions

This chapter gives an overview of combinatorial optimization techniques for different use-cases related to self-organizing networks and mobile network automation. We discuss the motivation for enabling intelligent automation in mobile networks as well as specific use-cases that can be addressed using combinatorial optimization algorithms. To this end, we review important categories of metaheuristics and focus on trajectory metaheuristics. In particular, we discuss local search based metaheuristics, simulated annealing, and tabu search, and see how these can be used to solve combinatorial optimization problems in the field of mobile networks. A survey of the state of the art in the applications of metaheuristics to combinatorial optimization in the field of mobile network automation is presented. As a case study, we discuss the PCI assignment use-case using a graph coloring approach, and review different greedy heuristics (e.g. DSATUR, random sequential, independent set) that are applicable in this case. The graph coloring algorithms are tested on network conflict graphs created using configuration data from a real LTE network, where the graph edges are used
to realize collision and confusion constraints. Results show that greedy coloring heuristics are highly effective for solving the PCI assignment problem, and are able to achieve collision- and confusion-free PCI assignments in both sparse and highly dense scenarios.
6.2 Future Work

A highly promising area for future work is the use of machine learning approaches in conjunction with metaheuristics for solving optimization problems related to mobile network automation. It is important to note that the design of state-of-the-art algorithms for solving combinatorial optimization problems often requires handcrafting of a suitable heuristic or metaheuristic, with a correct selection of input parameters. Thus, the process is not only time consuming but requires specialized knowledge along with a trial and error approach. Machine learning based approaches have been proposed to mitigate these issues, especially for cases where the same optimization problem is solved repeatedly [24, 42]. However, such approaches have not yet been investigated from the perspective of mobile networks. Furthermore, in many real world problems the problem structure remains the same and only the input data changes, which makes it possible to discover and learn algorithm parameters. To this end, different types of machine learning techniques have been investigated. For instance, in [42] learning algorithms based on a combination of reinforcement learning and graph embedding are discussed for combinatorial optimization problems over graphs. The resulting framework can be applied to a wide range of problems and can learn parameters for well-known problems such as minimum vertex cover, maxcut, and traveling salesman problems. Deep learning heuristic tree search for the container pre-marshaling problem (CMPM) has been proposed in [39]. It makes use of deep neural networks and provides near optimal results to the CMPM problem on real-world sized instances. Reinforcement learning for designing a great deluge based hyper-heuristic is discussed in [61]. Results from a timetabling problem show improvements in performance when compared to the random great deluge based hyper-heuristic. Other approaches such as support vector machines and random forests are discussed in [28, 50].
References

1. A. Imran, A. Zoha, A. Abu-Dayya, Challenges in 5G: how to empower SON with big data for enabling 5G. IEEE Netw. 28(6), 27–33 (2014) 2. K. Aardal, A Decade of Combinatorial Optimization, vol. 1997 (Utrecht University, Information and Computing Sciences, 1997) 3. E. Aarts, E.H. Aarts, J.K. Lenstra, Local Search in Combinatorial Optimization (Princeton University Press, Princeton, 2003) 4. F. Ahmed, Self-organization: a perspective on applications in the internet of things, in Natural Computing for Unsupervised Learning (Springer, 2019), pp. 51–64
5. F. Ahmed, A. Dowhuszko, O. Tirkkonen, Distributed algorithm for downlink resource allocation in multicarrier small cell networks, in Proceedings of the IEEE International Conference on Communications (2012), pp. 6802–6808. https://doi.org/10.1109/ICC.2012.6364716 6. F. Ahmed, A. Dowhuszko, O. Tirkkonen, R. Berry, A distributed algorithm for network power minimization in multicarrier systems, in Proceedings of the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (2013), pp. 1914–1918. https://doi.org/ 10.1109/PIMRC.2013.6666456 7. F. Ahmed, J. Deng, O. Tirkkonen, Self-organizing networks for 5G: directional cell search in MMW networks, in 2016 IEEE 27th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC) (2016), pp. 1–5. https://doi.org/10.1109/PIMRC. 2016.7794591 8. F. Ahmed, A.A. Dowhuszko, O. Tirkkonen, Network optimization methods for selforganization of future cellular networks: models and algorithms, in Self-organized Mobile Communication Technologies and Techniques for Network Optimization (IGI Global, 2016), pp. 35–65 9. F. Ahmed, A.A. Dowhuszko, O. Tirkkonen, Self-organizing algorithms for interference coordination in small cell networks. IEEE Trans. Vehic. Technol. 66(9), 8333–8346 (2017). https:// doi.org/10.1109/TVT.2017.2695400 10. F. Ahmed, A. Kliks, L. Goratti, S.N. Khan, Towards spectrum sharing in virtualized networks: a survey and an outlook, in Cognitive Radio, Mobile Communications and Wireless Networks (Springer, 2019), pp. 1–28 11. F. Ahmed, O. Tirkkonen, Local optimum based power allocation approach for spectrum sharing in unlicensed bands, in Self-organizing Systems, Lecture Notes in Computer Science, ed. by T. Spyropoulos, K. Hummel (Springer, Berlin/Heidelberg, 2009), pp. 238–243 12. F. Ahmed, O. Tirkkonen, Topological aspects of greedy self-organization, in 2014 IEEE Eighth International Conference on Self-adaptive and Self-organizing Systems (2014), pp. 31–39. 
https://doi.org/10.1109/SASO.2014.15 13. F. Ahmed, O. Tirkkonen, Self organized physical cell id assignment in multi-operator heterogeneous networks, in 2015 IEEE 81st Vehicular Technology Conference (VTC Spring) (2015), pp. 1–5. https://doi.org/10.1109/VTCSpring.2015.7146077 14. F. Ahmed, O. Tirkkonen, Simulated annealing variants for self-organized resource allocation in small cell networks. J. Appl. Soft Comput. 38, 762–770 (2016) 15. F. Ahmed, O. Tirkkonen, A.A. Dowhuszko, M. Juntti, Distributed power allocation in cognitive radio networks under network power constraint, in 2014 9th International Conference on Cognitive Radio Oriented Wireless Networks and Communications (CROWNCOM) (2014), pp. 492–497. https://doi.org/10.4108/icst.crowncom.2014.255738 16. F. Ahmed, O. Tirkkonen, M. Peltomäki, J.M. Koljonen, C.H. Yu, M. Alava, Distributed graph coloring for self-organization in LTE networks. J. Electr. Comput. Eng. 1–10 (2010) 17. F. Ahmed, et al., Discrete and continuous optimization methods for self-organization in small cell networks-models and algorithms, PhD dissertation, 2016 18. A. Albanna, H. Yousefi’Zadeh, Congestion minimization of LTE networks: a deep learning approach. IEEE/ACM Trans. Netw. 28(1), 347–359 (2020) 19. O. Aliu, A. Imran, M. Imran, B. Evans, A survey of self organisation in future cellular networks. IEEE Commun. Surv. Tutor. 15(1), 336–361 (2013). https://doi.org/10.1109/SURV. 2012.021312.00116 20. A.F. Ashtiani, S. Pierre, Power allocation and resource assignment for secure D2D communication underlaying cellular networks: a tabu search approach. Comput. Netw. 178, 107–350 (2020). https://doi.org/10.1016/j.comnet.2020.107350 21. M.E. Aydin, R. Kwan, J. Wu, J. Zhang, Multiuser scheduling on the LTE downlink with simulated annealing, in 2011 IEEE 73rd Vehicular Technology Conference (VTC Spring) (2011), pp. 1–5 22. N.M. Balasubramanya, L. 
Lampe, Simulated annealing based joint coverage and capacity optimization in LTE, in 2016 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE) (2016), pp. 1–5
23. T. Bandh, G. Carle, H. Sanneck, Graph coloring based physical-cell-id assignment for LTE networks, in Proceedings of the 2009 International Wireless Communications and Mobile Computing Conference (IWCMC) (2009), pp. 116–120 24. Y. Bengio, A. Lodi, A. Prouvost, Machine learning for combinatorial optimization: a methodological tour d’horizon. Eur. J. Oper. Res. (2020) 25. C. Blum, Theoretical and Practical Aspects of Ant Colony Optimization, vol. 282 (IOS Press, Nieuwe Hemweg, 2004) 26. C. Blum, A. Roli, Metaheuristics in combinatorial optimization: overview and conceptual comparison. ACM Comput. Surv. 35(3), 268–308 (2003). https://doi.org/10.1145/937503.937505 27. W. Bo, S. Yu, Z. Lv, J. Wang, A novel self-optimizing load balancing method based on ant colony in LTE network, in 2012 8th International Conference on Wireless Communications, Networking and Mobile Computing, pp. 1–4 (2012) 28. P. Bonami, A. Lodi, G. Zarpellon, Learning a classification of mixed-integer quadratic programming problems, in International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research (Springer, 2018), pp. 595–604 29. V. Capdevielle, A. Feki, A. Fakhreddine, Self-optimization of handover parameters in lTE networks, in 2013 11th International Symposium and Workshops on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt) (2013), pp. 133–139 30. X. Chen, L. Li, X. Xiang, Ant colony learning method for joint MCS and resource block allocation in LTE femtocell downlink for multimedia applications with QoS guarantees. Multimed. Tools Appl. 76(3), 4035–4054 (2017). https://doi.org/10.1007/s11042-015-2991-9 31. W.C. Chung, C.Y. Chen, C.Y. Huang, C.J. Chang, F. Ren, Ant colony-based radio resource allocation for LTE—a systems with MIMO and coMP transmission, in 2013 IEEE/CIC International Conference on Communications in China (ICCC) (IEEE, 2013), pp. 780–785 32. M. Dorigo, C. 
Blum, Ant colony optimization theory: a survey. Theor. Comput. Sci. 344(2–3), 243–278 (2005) 33. A.A. Dowhuszko, F. Ahmed, O. Tirkkonen, Decentralized transmit beamforming scheme for interference coordination in small cell networks, in Proceedings of the IEEE International Black Sea Conference on Communications and Networking (2013), pp. 121–126. https://doi. org/10.1109/BlackSeaCom.2013.6623394 34. S. Feki, A. Belghith, F. Zarai, Ant colony optimization-based resource allocation and resource sharing scheme for V2V communication. J. Inf. Sci. Eng. 35(3) (2019) 35. H. Ghazzai, E. Yaacoub, M. Alouini, Z. Dawy, A. Abu-Dayya, Optimized LTE cell planning with varying spatial and temporal user densities. IEEE Trans. Veh. Technol. 65(3), 1575–1589 (2016) 36. G.D. González, M. García-Lozano, S. Ruiz, D.S. Lee, A metaheuristic-based downlink power allocation for LTE/LTE—a cellular deployments. Wirel. Netw. 20(6), 1369–1386 (2014). https://doi.org/10.1007/s11276-013-0659-9 37. F. Gordejuela-Sanchez, J. Zhang, LTE access network planning and optimization: a serviceoriented and technology-specific perspective, in GLOBECOM 2009—2009 IEEE Global Telecommunications Conference (2009), pp. 1–5 38. S. Hämäläinen, H. Sanneck, C. Sartori, LTE Self-organising Networks (SON): Network Management Automation for Operational Efficiency (Wiley, Chichester, New York, 2012) 39. A. Hottung, S. Tanaka, K. Tierney, Deep learning assisted heuristic tree search for the container pre-marshalling problem. Comput. Oper. Res. 113, 104781 (2020) 40. J. Joseph, F. Ahmed, T. Jokela, O. Tirkkonen, J. Poutanen, J. Niemela, Big data enabled mobility robustness optimization for commercial LTE networks, in 2020 IEEE Wireless Communications and Networking Conference (WCNC) (2020), pp. 1–6 41. H. Kang, H. Kang, S. Koh, Optimization of TAC configuration in mobile communication systems: a tabu search approach, in 16th International Conference on Advanced Communication Technology (2014), pp. 5–9 42. E. Khalil, H. 
Dai, Y. Zhang, B. Dilkina, L. Song, Learning combinatorial optimization algorithms over graphs, in Advances in Neural Information Processing Systems (2017), pp. 6348– 6358
43. R. Khdhir, K. Mnif, L. Kammoun, Downlink packet scheduling algorithm using tabu method in LTE systems, in International Conference on Wired/Wireless Internet Communication (Springer, 2015), pp. 3–17 44. P.V. Klaine, M.A. Imran, O. Onireti, R.D. Souza, A survey of machine learning techniques applied to self-organizing cellular networks. IEEE Commun. Surv. Tutor. 19(4), 2392–2431 (2017) 45. A. Kosowski, K. Manuszewski, Classical coloring of graphs. Contemp. Math. 352, 1–20 (2004) 46. G.P. Koudouridis, H. Gao, P. Legg, A centralised approach to power on-off optimisation for heterogeneous networks, in 2012 IEEE Vehicular Technology Conference (VTC Fall) (2012), pp. 1–5 47. F. Laakso, J. Puttonen, J. Kurjenniemi, B. Wang, D. Zhang, F. Ahmed, J. Niemelä, Sons3: a network data driven simulation framework for mobility robustness optimization, in 2018 IEEE 88th Vehicular Technology Conference (VTC-Fall) (2018), pp. 1–5 48. S. Lee, S. Lee, K. Kim, Y.H. Kim, Base station placement algorithm for large-scale LTE heterogeneous networks. PLoS ONE 10 (2015) 49. Y.L. Lee, J. Loo, T. Chuah, A. El-Saleh, Multi-objective resource allocation for LTE/LTE—a femtocell/HeNB networks using ant colony optimization. Wirel. Pers. Commun. (2016). https:// doi.org/10.1007/s11277-016-3557-5 50. K. Li, J. Malik, Learning to optimize neural nets. arXiv preprint arXiv:1703.00441 (2017) 51. Z. Lu, Y. Yang, X. Wen, Y. Ju, W. Zheng, A cross-layer resource allocation scheme for ICIC in LTE-advanced. J. Netw. Comput. Appl. 34(6), 1861–1868 (2011). https://doi.org/10.1016/ j.jnca.2010.12.019 52. D. López-Pérez, A. Ladányi, A. Jüttner, H. Rivano, J. Zhang, Optimization method for the joint allocation of modulation schemes, coding rates, resource blocks and power in self-organizing LTE networks, in 2011 Proceedings IEEE INFOCOM (2011), pp. 111–115. https://doi.org/10. 1109/INFCOM.2011.5934888 53. K. Majewski, M. 
Koonert, Conservative cell load approximation for radio networks with Shannon channels and its application to LTE network planning, in 2010 Sixth Advanced International Conference on Telecommunications (2010), pp. 219–225. https://doi.org/10.1109/AICT.2010. 9 54. J. Moysen, F. Ahmed, M. García-Lozano, J. Niemelä, Big Data-Driven Automated Anomaly Detection and Performance Forecasting in Mobile Networks (2020) 55. J. Moysen, F. Ahmed, M. García-Lozano, J. Niëmela, Unsupervised learning for detection of mobility related anomalies in commercial LTE networks, in 2020 European Conference on Networks and Communications (EuCNC) (2020), pp. 111–115 56. J. Moysen, L. Giupponi, From 4G to 5G: self-organized network management meets machine learning. Comput. Commun. 129, 248–268 (2018) 57. D.D. Ningombam, S. Shin, Throughput optimization using metaheuristic-tabu search in the multicast D2D communications underlaying LTE—a uplink cellular networks. Electronics 7(12), 440 (2018). https://doi.org/10.3390/electronics7120440 58. D. Ok, F. Ahmed, M. Agnihotri, C. Cavdar, Self-organizing mesh topology formation in internet of things with heterogeneous devices, in 2017 European Conference on Networks and Communications (EuCNC) (2017), pp. 1–5. https://doi.org/10.1109/EuCNC.2017.7980779 59. D. Ok, F. Ahmed, P. Di Marco, R. Chirikov, C. Cavdar, Energy aware routing for internet of things with heterogeneous devices, in 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC) (2017), pp. 1–5. https://doi. org/10.1109/PIMRC.2017.8292757 60. U. Oruthota, F. Ahmed, O. Tirkkonen, Ultra-reliable link adaptation for downlink miso transmission in 5G cellular networks. Information 7(1), 14 (2016) 61. E. Özcan, M. Misir, G. Ochoa, E.K. Burke, A reinforcement learning: great-deluge hyperheuristic for examination timetabling, in Modeling, Analysis, and Applications in Metaheuristic Computing: Advancements and Trends (IGI Global, 2012), pp. 34–55 62. 
S.M. Razavi, D. Yuan, Performance improvement of LTE tracking area design: a reoptimization approach, in Proceedings of the 6th ACM International Symposium on Mobility
Management and Wireless Access, MobiWac ’08 (Association for Computing Machinery, New York, NY, USA, 2008), pp. 77–84. https://doi.org/10.1145/1454659.1454673 63. H. Safa, N. Ahmad, Tabu search based approach to solve the TAs reconfiguration problem in LTE networks, in 2015 IEEE 29th International Conference on Advanced Information Networking and Applications (2015), pp. 593–599. https://doi.org/10.1109/AINA.2015.241 64. J. Salo, M. Nur-Alam, K. Chang, Practical Introduction to LTE Radio Planning. A White Paper on Basics of Radio Planning for 3GPP LTE in Interference Limited and Coverage Limited Scenarios (European Communications Engineering (ECE) Ltd, Espoo, Finland, 2010) 65. I. Siomina, D. Yuan, Optimization approaches for planning small cell locations in load-coupled heterogeneous LTE networks, in 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC) (2013), pp. 2904–2908 66. Q. Wei, W. Sun, B. Bai, L. Wang, E.G. Ström, M. Song, Resource allocation for V2X communications: a local search based 3D matching approach, in 2017 IEEE International Conference on Communications (ICC) (2017), pp. 1–6. https://doi.org/10.1109/ICC.2017.7996984 67. R. Wen, G. Feng, J. Tang, T.Q.S. Quek, G. Wang, W. Tan, S. Qin, On robustness of network slicing for next-generation mobile networks. IEEE Trans. Commun. 67(1), 430–444 (2019). https://doi.org/10.1109/TCOMM.2018.2868652 68. L.A. Wolsey, G.L. Nemhauser, Integer and Combinatorial Optimization, vol. 55 (Wiley, Chichester, New York, 1999) 69. E. Yaacoub, Z. Dawy, LTE radio network planning with HetNets: BS placement optimization using simulated annealing, in MELECON 2014—2014 17th IEEE Mediterranean Electrotechnical Conference (2014), pp. 327–333 70. Y. Wang, W. Li, P. Yu, X. Qiu, Automated handover optimization mechanism for LTE femtocells, in 7th International Conference on Communications and Networking in China (2012), pp. 601–605 71. A. Zainaldin, H.
Lambadaris, Ant colony optimization for joint resource allocation and relay selection in LTE-advanced networks, in 2014 IEEE Global Communications Conference (IEEE, 2014), pp. 1271–1277
Performance Optimization of PID Controller Based on Parameters Estimation Using Meta-Heuristic Techniques: A Comparative Study

Mohamed Issa
1 Introduction

Industrial control systems use a proportional–integral–derivative (PID) controller as a common control loop feedback mechanism that reduces the difference between the desired set point and the measured variable. The proportional part produces a response based on the current error, the integral part responds to the accumulated sum of errors, and the derivative part responds to the rate at which the error has been changing; all three responses influence the final control action on the process [1]. The PID controller is broadly applicable because it depends only on the measurable variables of the process, without requiring knowledge of the details of the process. Hence, it has been applied in many industrial applications [2–4]. Tuning the parameters of the PID controller aims to find the values that meet the performance specifications of a closed-loop system, while the robust performance of the control loop over a wide range of operating conditions should also be ensured. The main performance specifications are settling time, rise time, and overshooting. Experimentally, it is complex and time-consuming to simultaneously achieve all of these objectives [1]. If the control system is designed to be robust to disturbance by choosing conservative values for the PID controller, the result may be a slow closed-loop response. The Ziegler–Nichols (ZN) method [5] is the most common technique for finding suitable values of the PID controller gains. The ZN tuning technique computes the parameters, but does not yield optimal or near-optimal PID controller performance, where

M. Issa (B) Department of Computer and Systems Engineering, Faculty of Engineering, Zagazig University, Zagazig, Egypt e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 D. Oliva et al.
(eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_28
large overshooting, settling time, and rise time result. Hence, there is a need to enhance the response of the control system produced by the PID gains estimated by ZN. A meta-heuristic algorithm [6] can be used to find the values of the PID controller parameters that produce better process performance than that found by the ZN method. The main advantage of using meta-heuristic techniques for PID tuning is that they can be applied to higher-order systems without model reduction. A meta-heuristic algorithm is an approach to overcome the disadvantages of numerical and analytical methods. It searches for globally optimal solutions using a search strategy that mimics a behavior from nature, physics, or humans. Many algorithms have been developed, such as differential evolution (DE) [7], genetic algorithm (GA) [8], sine–cosine algorithm (SCA) [9], gravitational search algorithm (GSA) [10], particle swarm optimization (PSO) [11], artificial bee colony (ABC) [12], moth-flame optimization (MFO) [13], and ion motion optimization (IMO) [14]; hundreds of other algorithms exist, since one algorithm may not be suitable for all engineering problems, as stated by the No Free Lunch (NFL) theorem [15]. Meta-heuristic algorithms have succeeded in optimizing many applications, such as optimization of the radiative transfer function [16], induction motor design [17], local sequence alignment [18], global sequence alignment [19], feature selection [20], and multiple sequence alignment [21]. In this chapter, the Egyptian Vulture Optimization Algorithm (EVOA) [22] is used, and its performance is tested on the response of control systems controlled by a PID controller. Three control systems with different orders (2nd, 3rd, and 4th) are used as case studies to test the performance of EVOA in choosing suitable gain parameters of the PID controller that enhance the response of the systems.
Four meta-heuristic algorithms are used for the comparative study with EVOA: GA, SCA, PSO, and ASCA-PSO [18]. The rest of the chapter is organized as follows: Sect. 2 presents the definition of the PID controller and how its parameters are estimated using ZN. In Sect. 3, the procedure for enhancing the performance of the PID controller using a meta-heuristic algorithm is described. SCA, PSO, ASCA-PSO, and EVOA are described in Sect. 4.
2 PID Controller

The process plant is adapted to achieve the desired set point using the signal u(t), which represents e.g. a control valve, a damper, or a cooling or heating element. u(t) is computed using Eq. 1:

u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ + Kd de(t)/dt    (1)
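In discrete time, the integral in Eq. 1 becomes a running sum and the derivative a finite difference. The following sketch (class name and interface are ours, assuming a fixed sampling step dt) illustrates this:

```python
class PID:
    """Discrete-time PID controller implementing Eq. 1:
    u = Kp*e + Ki*integral(e) + Kd*de/dt, sampled with step dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                   # integral term
        derivative = (error - self.prev_error) / self.dt   # derivative term
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

Calling `update` once per sampling interval with the current setpoint and measurement yields the control signal u(t) delivered to the plant.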
Fig. 1 Interconnection between PID and process control
Figure 1 shows the interconnection between the process and the PID controller, where y(t) and r(t) are the output signal and the desired output (setpoint), respectively, and t is the time. e(t) is the error between the desired output r(t) and the output response y(t). u(t) is the control signal output from the PID controller delivered to the plant process. Kp , Ki , and Kd are the gains of the proportional, integral, and derivative controllers, respectively. Experimentally tuning the parameters of the PID controller together is a difficult operation if a stable process response is to be achieved. Poor tuning of the PID controller parameters will lead to oscillation of the system response without reaching the stable set point. The performance of the controlled system response is measured using criteria such as rise time, settling time, and overshooting, as shown in Fig. 2. Settling time (ts ) is the time elapsed before the system response enters and stays within a certain error band. Rise time (tr ) is the time the system response needs to reach the set point. Overshooting (Mp ) is the difference between the maximum point of
Fig. 2 Main criteria of the controlled system response
Table 1 Ziegler–Nichols formulas for P, PI and PID controller parameters

Control type   Kp        Ki            Kd
P              0.50 Ku   –             –
PI             0.45 Ku   0.54 Ku/Tu    –
PID            0.60 Ku   1.2 Ku/Tu     3 Ku Tu/40
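The formulas of Table 1 can be wrapped in a small helper (a sketch; the function name is ours) that maps the ultimate gain Ku and oscillation period Tu to the controller gains:

```python
def ziegler_nichols(ku, tu, control_type="PID"):
    """Gain formulas of Table 1, given the ultimate gain Ku (at which the
    P-only loop starts to oscillate) and the oscillation period Tu."""
    rules = {
        "P":   (0.50 * ku, 0.0,            0.0),
        "PI":  (0.45 * ku, 0.54 * ku / tu, 0.0),
        "PID": (0.60 * ku, 1.20 * ku / tu, 3.0 * ku * tu / 40.0),
    }
    return rules[control_type]  # (Kp, Ki, Kd)
```

For example, Ku = 10 and Tu = 2 give (Kp, Ki, Kd) = (6.0, 6.0, 1.5) for a PID controller.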
the system response and the setpoint. There are also two further criteria: the peak time (tp) and the delay time (td). The most common method for tuning the parameters of a PID controller is the Ziegler–Nichols (ZN) method [5]. To compute the PID parameters with the ZN method, Ki and Kd are first set to zero. Kp is then increased incrementally until it reaches the ultimate gain (Ku) at which the system output starts to oscillate. Table 1 gives the formulas for computing the gain parameters from Ku and the oscillation period (Tu). This tuning does not guarantee the optimum performance of the controlled system response, so after the default (ZN) tuning the parameter values need to be optimized further to produce better performance; this is the role of the meta-heuristic optimization algorithms. The objective function is the minimization of the integral of the absolute sum of error (IASE). The IASE performance criterion is given in Eq. 2 and is illustrated in Fig. 3:

IASE = ∫₀^∞ |r(t) − y(t)| dt    (2)

where r(t) and y(t) are the setpoint and the measured response, respectively.

Fig. 3 The definition of the IASE criterion
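The Table 1 formulas can be written directly in code. A minimal Python sketch (the function name and the example Ku and Tu values are illustrative, not taken from the chapter):

```python
def ziegler_nichols(Ku, Tu, controller="PID"):
    """Classic Ziegler-Nichols gains (Table 1) from the ultimate gain Ku
    and the oscillation period Tu."""
    if controller == "P":
        return {"Kp": 0.50 * Ku}
    if controller == "PI":
        return {"Kp": 0.45 * Ku, "Ki": 0.54 * Ku / Tu}
    if controller == "PID":
        return {"Kp": 0.60 * Ku, "Ki": 1.2 * Ku / Tu, "Kd": 3 * Ku * Tu / 40}
    raise ValueError("controller must be 'P', 'PI' or 'PID'")

# Example with hypothetical values Ku = 2.0 and Tu = 10 s:
gains = ziegler_nichols(2.0, 10.0)
print(gains)  # {'Kp': 1.2, 'Ki': 0.24, 'Kd': 1.5}
```

These gains are only the starting point; the meta-heuristics discussed below refine them further.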
Performance Optimization of PID Controller …
Fig. 4 The role of a meta-heuristic algorithm in enhancing the process response
3 Tuning Parameters of PID Controllers Using Meta-Heuristic Algorithms

The parameters of PID controllers (Kp, Ki, and Kd) are tuned by initializing the meta-heuristics with the values derived using the Ziegler–Nichols method; the meta-heuristics then try to optimize these parameters. The resulting controller is called a heuristic PID controller, and its working scheme is shown in Fig. 4. The flowchart in Fig. 5 shows the steps required to tune the parameters of the PID controller, starting with the parameter values found by Ziegler–Nichols as the initial best solution. A population of candidate solutions is generated within the specified range for each parameter. The solutions are then updated using the best solution found so far and the current value of each solution, according to the meta-heuristic used. For example, the Genetic Algorithm uses evolutionary operators to generate new solutions, whereas Particle Swarm Optimization relies on the local best solution found by each particle and the global best solution found during the iterations. The IASE objective function is computed for each solution in order to update the global best solution.
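As an illustration of the objective evaluation step, the following Python sketch computes a discretised IASE (Eq. 2) for a candidate (Kp, Ki, Kd) by simulating a PID loop around a hypothetical first-order plant. The plant model, step size, and horizon are illustrative stand-ins, not the chapter's test systems:

```python
def iase_objective(Kp, Ki, Kd, setpoint=1.0, dt=0.01, t_end=20.0, tau=2.0):
    """Discretised IASE (Eq. 2) for a PID loop around a hypothetical
    first-order plant  tau * dy/dt + y = u  (a stand-in process)."""
    y, integral, prev_err, iase = 0.0, 0.0, setpoint, 0.0
    for _ in range(int(t_end / dt)):
        err = setpoint - y
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = Kp * err + Ki * integral + Kd * deriv   # PID control law
        y += dt * (u - y) / tau                     # Euler step of the plant
        prev_err = err
        iase += abs(err) * dt                       # Eq. 2, rectangle rule
    return iase

# A better-tuned controller should give a smaller IASE:
print(iase_objective(2.0, 0.5, 0.1) < iase_objective(0.1, 0.0, 0.0))  # True
```

A meta-heuristic would call `iase_objective` as its fitness function, one evaluation per candidate solution.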
4 PSO, SCA, ASCA-PSO, and EVOA Algorithms

4.1 Particle Swarm Optimization (PSO) Algorithm [11]

PSO belongs to the group of swarm optimization algorithms that mimic the behavior of bird flocking, fish schooling, and other similar species. PSO is a population-based stochastic optimization method whose search strategy mainly depends on global
Fig. 5 The procedure of finding the gain parameters of PID using the meta-heuristic algorithm
communication between the search agents, where all search agents adjust their movements toward the agent that has found the global best solution. The PSO updating equations are given in Eqs. 3 and 4, where Pgbest is the global best position (solution) among all search agents and Pibest is the best personal position that search agent i has found during the previous iterations:

vi(t+1) = w · vi(t) + c1 · rand() · (Pibest − Pi(t)) + c2 · rand() · (Pgbest − Pi(t))    (3)

Pi(t+1) = Pi(t) + vi(t+1)    (4)
where vi represents the velocity of the ith particle, and c1 and c2 are the local and global best position weight coefficients, respectively. w is the inertia coefficient that controls the influence of the previous velocity on the newly estimated velocity, and rand() is a uniformly distributed random variable in the range (0–1). The steps of PSO are summarized in Algorithm (1).
Algorithm (1): PSO algorithm
1: Initialize a set of population solutions (Pi), initial velocities (vi), and the algorithm's parameters (c1, c2, and w)
2: Repeat
3:   Evaluate the objective function for each population solution
4:   Update the best local solution of each particle (Pibest)
5:   Update the best global solution over all particles (Pgbest)
6:   Update the positions of the population solutions using Eqs. 3 and 4
7: Until the maximum number of iterations (T) is reached
8: Return the best solution (Pgbest) obtained as the global optimum
The time complexity of PSO is O(T · n · cpso), where T is the number of iterations, n is the number of search agents, and cpso is the time cost of modifying the position of one search agent. PSO benefits from the information exchanged between search agents, which makes it reliable at reaching a near-optimal solution with reasonable convergence speed and robustness. PSO has succeeded in solving many optimization problems, such as solar cell design [23–25], electrical motor design [17], and surgical robot applications [26].
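A minimal Python sketch of Algorithm (1) and Eqs. 3 and 4; the search bounds, coefficient values, and the sphere test function are illustrative assumptions, not the chapter's settings:

```python
import random

def pso(f, dim, n=20, T=100, w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Minimal PSO: minimise f over [lo, hi]^dim with n particles."""
    P = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    V = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in P]
    pbest_f = [f(p) for p in P]
    g = min(range(n), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(T):
        for i in range(n):
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + c1 * random.random() * (pbest[i][d] - P[i][d])
                           + c2 * random.random() * (gbest[d] - P[i][d]))  # Eq. 3
                P[i][d] = min(hi, max(lo, P[i][d] + V[i][d]))              # Eq. 4
            fi = f(P[i])
            if fi < pbest_f[i]:
                pbest[i], pbest_f[i] = P[i][:], fi
                if fi < gbest_f:
                    gbest, gbest_f = P[i][:], fi
    return gbest, gbest_f

# Sphere function: minimum 0 at the origin
best, val = pso(lambda x: sum(u * u for u in x), dim=2)
```

For PID tuning, `f` would be the IASE objective and `dim` would be 3 (Kp, Ki, Kd).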
4.2 Sine–Cosine Optimization (SCA) Algorithm [9]

SCA [9] is an optimization algorithm that uses the sine and cosine mathematical functions as operators for adapting the movements of the search agents. The movements of the search agents are directed toward the global best solution (Pgbest) found so far, according to Eq. 5:

Pi(t+1) = Pi(t) + r1 · sin(r2) · |r3 · Pgbest − Pi(t)|,  if r4 < 0.5
Pi(t+1) = Pi(t) + r1 · cos(r2) · |r3 · Pgbest − Pi(t)|,  if r4 ≥ 0.5    (5)
where Pi denotes the position of particle i and Pgbest is the best global solution among all search agents. r1 determines how far the next solution is from the current one and sets the exploration scale through the search space, as shown in Fig. 6. r2 determines the direction of the next movement, toward or away from the best solution; its value is updated randomly each iteration in the range (0–2π). r3 controls the effect of the destination (Pgbest) on the current movement, and r4 balances the usage of the sine and cosine functions in Eq. 5; its value is updated randomly each iteration in the range (0–1). r1 is updated according to Eq. 6 to balance exploration and exploitation:

r1 = a (1 − t/T)    (6)
Fig. 6 The effect of values of r1 and r3 on updating the solutions
where t is the current iteration, T is the maximum number of iterations, and a is a constant that should be set by the user. Algorithm (2) shows the procedure of SCA.

Algorithm (2): Sine Cosine Optimization Algorithm
1: Initialize a set of population solutions (Pi) and the algorithm parameters (r1, r2, r3, and r4)
2: Repeat
3:   Evaluate the objective function for each population solution
4:   Update the best solution obtained so far (Pgbest)
5:   Update r1, r2, r3, and r4
6:   Update the positions of the population solutions using Eqs. 5 and 6
7: Until the maximum number of iterations (T) is reached
8: Return the best solution (Pgbest) obtained as the global optimum
The time complexity of SCA is O(T · n · csca), where n is the population size, csca is the time cost of updating all solutions in one iteration, and T is the number of iterations. In addition to its simplicity, SCA explores the search space more efficiently than it exploits it; however, because it has several parameters that are difficult to tune together, it produces poor results on some optimization problems. Nevertheless, SCA has succeeded in many optimization problems, such as data clustering [27] and 3D stereo reconstruction [28].
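A minimal Python sketch of Algorithm (2) and Eqs. 5 and 6; the bounds, the value of a, and the sphere test function are illustrative assumptions:

```python
import math, random

def sca(f, dim, n=30, T=200, a=2.0, lo=-5.0, hi=5.0):
    """Minimal Sine Cosine Algorithm: minimise f over [lo, hi]^dim."""
    P = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    fit = [f(p) for p in P]
    g = min(range(n), key=lambda i: fit[i])
    gbest, gbest_f = P[g][:], fit[g]
    for t in range(T):
        r1 = a * (1 - t / T)            # Eq. 6: exploration shrinks over time
        for i in range(n):
            for d in range(dim):
                r2 = random.uniform(0, 2 * math.pi)
                r3 = random.uniform(0, 2)
                r4 = random.random()
                step = r1 * abs(r3 * gbest[d] - P[i][d])
                step *= math.sin(r2) if r4 < 0.5 else math.cos(r2)  # Eq. 5
                P[i][d] = min(hi, max(lo, P[i][d] + step))
            fi = f(P[i])
            if fi < gbest_f:
                gbest, gbest_f = P[i][:], fi
    return gbest, gbest_f

best, val = sca(lambda x: sum(u * u for u in x), dim=2)
```

Note how all agents move relative to the single destination `gbest`, which is what gives SCA its strong exploration but weaker exploitation.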
4.3 Hybrid SCA and PSO Algorithm (ASCA-PSO) [18]

The main drawback of SCA is its poor exploitation of the search space, while its exploration performance is efficient. Hence, PSO was hybridized with SCA to combine the efficient exploitation of PSO with the exploration of SCA. In this chapter, the ASCA-PSO technique is presented, together with the experimental results of testing its performance on high-dimensional mathematical benchmark functions. The technique is structured in two layers of search agents, as shown in Fig. 7. The bottom layer is responsible for exploring the search space, where SCA adjusts the movement of the search agents toward the best solution found, based on the sine and cosine operators. The search agents in the bottom layer are divided into groups; the best solution found by each group forms a second layer, called the top layer, whose movements are adjusted using PSO to exploit the narrow region around them. The global best solution of all search agents in the two layers is denoted ygbest. As shown in Fig. 7, the top layer has M search agents (solutions) and each group of the bottom layer has N search agents; hence the bottom layer consists of M·N search agents. Each search agent in the bottom layer is denoted by xij, where i = 1, 2, …, M and j = 1, 2, …, N. In the same context, each solution of the top layer is denoted by yi, where i = 1, 2, …, M, which also represents the best solution found by group i in the bottom layer. The search agents of the bottom layer are updated by the SCA algorithm using Eq. 7, where the movements are directed toward the yi found by group i.
Fig. 7 The structure of the proposed SCA-PSO technique
xij(t+1) = xij(t) + r1 · sin(r2) · |r3 · yi − xij(t)|,  if r4 < 0.5
xij(t+1) = xij(t) + r1 · cos(r2) · |r3 · yi − xij(t)|,  if r4 ≥ 0.5    (7)
An element yi of the top layer, whose position represents a solution, is enhanced using the PSO update equations toward ygbest and yipbest, where ygbest is the global best solution found by all search agents and yipbest is the best solution found by group i of the bottom layer. Equations 8 and 9 are used for updating the movements of the search agents in the top layer based on the PSO algorithm:

vi(t+1) = w · vi(t) + c1 · rand() · (yipbest − yi(t)) + c2 · rand() · (ygbest − yi(t))    (8)

yi(t+1) = yi(t) + vi(t+1)    (9)
where vi is the velocity of the ith particle, t is the iteration number, and rand() is a uniformly distributed random variable in the range (0–1). c1 and c2 are the local and global best position weight coefficients, respectively, and w is the inertia coefficient that controls the effect of the previous velocity on the new velocity. The search agents of the groups in the bottom layer explore the search space and adjust their movements based on the SCA algorithm, whose parameters are set so that the search space is explored as widely as possible. If a search agent finds a solution better than ygbest, then ygbest is updated with the better solution. yi represents the best solution found by group i, and its movement is updated by the PSO algorithm based on the best solution it has found and on ygbest. If ygbest has been updated, then yi is updated immediately; otherwise, after the execution of each group finishes, yi is updated and ygbest is checked. The influence of the search agents on each other while a search agent's movement is updated is shown in Fig. 8. As shown in Fig. 9, the search agents xij of the bottom layer are influenced by the best solution yi found by group i. Moreover, yi is influenced by the best solution found among the whole set of search agents, ygbest. For the subsequent groups, ygbest is the best solution found by the previous groups, which permits performing exploration besides exploitation within the same iteration. The other groups are influenced by the updating of ygbest and therefore explore the search space further, increasing the diversity of the solutions found. Increasing the diversity of solutions enhances the quality of the solution and speeds up convergence (the number of iterations needed to reach the optimum solution). This algorithm is classified under the High-level Relay Hybridization (HRH) scheme (general, global, and heterogeneous).
It is classified as HRH because each algorithm is self-contained and the two algorithms execute in series, one after another, in each iteration. Algorithm (3) shows the detailed steps of the ASCA-PSO procedure. The time complexity of the ASCA-PSO algorithm is O(T · M · N · (csca + cpso)), where M and N are the sizes of the populations in the top and
Fig. 8 Influence of search agents on each other during updating movements of a search agent
Fig. 9 Influence of search elements during updating search agents movements
bottom layers, respectively; csca and cpso are the time costs of updating each search agent per iteration for SCA and PSO, respectively; and T is the number of iterations.
Algorithm (3): ASCA-PSO Optimization algorithm
1: Initialize the population solutions (x), the SCA parameters (r1, r2, r3, and r4), and the PSO parameters (w, c1, c2)
2: Evaluate the objective function for each population solution (x)
3: Find each yi as the solution with the best fitness in the related group of the bottom layer
4: Find ygbest as the solution with the best fitness among the swarm (yi) of the top layer
5: Repeat
6:   for i = 1 : M
7:     for j = 1 : N
8:       Update (xij) according to Eq. 7
9:       If (F(xij) < F(ygbest)) Then ygbest = xij
10:      Update (yi) according to Eqs. 8 and 9
11:      If (F(yi) < F(ygbest)) Then ygbest = yi
12:    end for j
13:    Update (yi) according to Eqs. 8 and 9
14:    If (F(yi) < F(ygbest)) Then ygbest = yi
15:  end for i
16: Until the maximum number of iterations (T) is reached
17: Return the best solution (ygbest) obtained as the global solution
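The two-layer procedure of Algorithm (3) can be sketched in Python as follows; the group sizes, bounds, coefficients, and test function are illustrative assumptions, not the settings used in the chapter:

```python
import math, random

def asca_pso(f, dim, M=5, N=6, T=100, a=2.0, w=0.7, c1=1.5, c2=1.5,
             lo=-5.0, hi=5.0):
    """Two-layer ASCA-PSO sketch: M groups of N SCA agents in the bottom
    layer, one PSO particle per group in the top layer."""
    clip = lambda v: min(hi, max(lo, v))
    x = [[[random.uniform(lo, hi) for _ in range(dim)]
          for _ in range(N)] for _ in range(M)]
    y = [min(g, key=f)[:] for g in x]          # top layer: best of each group
    v = [[0.0] * dim for _ in range(M)]
    ypbest = [yi[:] for yi in y]
    ygbest = min(y, key=f)[:]
    for t in range(T):
        r1 = a * (1 - t / T)                   # Eq. 6
        for i in range(M):
            for j in range(N):                 # Eq. 7: SCA step toward yi
                for d in range(dim):
                    r2 = random.uniform(0, 2 * math.pi)
                    r3, r4 = random.uniform(0, 2), random.random()
                    trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                    x[i][j][d] = clip(x[i][j][d]
                                      + r1 * trig * abs(r3 * y[i][d] - x[i][j][d]))
                if f(x[i][j]) < f(y[i]):
                    y[i] = x[i][j][:]
            if f(y[i]) < f(ypbest[i]):
                ypbest[i] = y[i][:]
            for d in range(dim):               # Eqs. 8-9: PSO step on top layer
                v[i][d] = (w * v[i][d]
                           + c1 * random.random() * (ypbest[i][d] - y[i][d])
                           + c2 * random.random() * (ygbest[d] - y[i][d]))
                y[i][d] = clip(y[i][d] + v[i][d])
            if f(y[i]) < f(ygbest):
                ygbest = y[i][:]
    return ygbest, f(ygbest)

best, val = asca_pso(lambda s: sum(u * u for u in s), dim=2)
```

The SCA bottom layer keeps injecting diverse candidates while the PSO top layer refines them, which is the relay hybridization described above.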
4.4 Egyptian Vulture Optimization Algorithm (EVOA) [22]

EVOA is a meta-heuristic technique whose search strategy mimics the behavior of the Egyptian vulture, used to optimize engineering problems. It can be applied to both constrained and unconstrained problems. The main advantage of EVOA is that it has no parameters that need to be tuned; besides, its solutions are updated randomly with no dependence on each other. EVOA was mainly used to enhance the performance of two engineering problems, the travelling salesman problem and the knapsack problem [22]. Algorithm (4) lists the main procedure for implementing EVOA.

Algorithm (4): Egyptian Vulture Optimization Algorithm (EVOA)
1: Initialize a single solution (P)
2: Repeat
3:   Tossing of pebbles
4:   Rolling of the solution
5:   Change the angle of tossing
6:   Evaluate the fitness function and update the best solution (Pgbest)
7: Until the maximum number of iterations (T) is reached
8: Return the best solution (Pgbest) obtained as the global optimum
Fig. 10 EVOA operations
The three main steps of EVOA for finding the optimal solution are pebble tossing, solution rolling, and modifying the angle of tossing. Pebble tossing is performed as in Fig. 10a: a random bit position (P) is selected and, starting from this position, L bits are chosen as the pebble size; exclusive-OR logic operators are then used to change these bits randomly. Solution rolling is performed as in Fig. 10b to shift the region of the search space that is examined, increasing diversity and thus the probability of reaching the global optimum; rolling is simulated by rotating the solution bits left or right randomly, a random number of times. The third operation, modifying the angle of tossing as in Fig. 10c, is simulated by mutating a random number of bits at random positions; the benefit of this step is again increased diversity.
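For binary-encoded solutions, the three operators can be sketched in Python as follows; the function names, the pebble size L, the mutation count k, and the toy ones-counting fitness are illustrative assumptions:

```python
import random

def toss_pebbles(bits, L=3):
    """Pebble tossing: XOR-flip L consecutive bits at a random position."""
    p = random.randrange(len(bits))
    return [b ^ 1 if p <= k < p + L else b for k, b in enumerate(bits)]

def roll(bits):
    """Rolling: rotate the solution a random number of positions
    left or right."""
    r = random.randrange(1, len(bits))
    r = r if random.random() < 0.5 else -r
    return bits[r:] + bits[:r]

def change_angle(bits, k=2):
    """Change of tossing angle: mutate k bits at random positions."""
    out = bits[:]
    for p in random.sample(range(len(out)), k):
        out[p] ^= 1
    return out

# A few EVOA-style iterations on a toy fitness (count of ones),
# keeping the best solution found so far:
best = [0] * 8
for _ in range(50):
    cand = change_angle(roll(toss_pebbles(best)))
    if sum(cand) > sum(best):
        best = cand
```

Rolling only reorders the bits, so it changes the region being searched without altering the number of set bits; the other two operators actually flip bits.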
5 Experimental Results

In this section, the performance of ASCA-PSO is tested against SCA, PSO, GA, and EVOA for optimizing the gains of the PID controller to reduce the overall error, in comparison with the performance of ZN [5]. Two experimental cases are used for validating the performance: the first is a liquid level system, and the second covers 3rd- and 4th-order representations of the process.
5.1 Liquid Level System

The liquid level system consists of three tanks arranged in cascade to supply liquid to the pump, as shown in Fig. 11. The liquid level in the tank is controlled by computing the error between the required liquid level and the actual level in the tank. According to this error, the PID controller releases a control signal that opens the electrical valve in the range (0–100%) to change the flow rate so that the desired liquid level is achieved. The liquid level system used as a case study for testing the performance of the ASCA-PSO optimization technique has the transfer function given in Eq. 10, as shown in [29]:

H(s)/V0(s) = 1 / (64s^3 + 9.6s^2 + 0.48s + 0.008)    (10)
As shown in Table 2, ASCA-PSO produces the lowest IASE in comparison with the other algorithms. The other performance criteria are not necessarily optimal, because the optimization problem is single-objective, with IASE as the fitness. Figures 12 and 13 show the response of controlling the level in the tank after estimating the parameters using ASCA-PSO versus the other meta-heuristic techniques.

Fig. 11 Three tanks liquid level system
Table 2 Comparison of transient response parameters for ZN, PSO, GA, SCA, ASCA-PSO, and EVOA

Method      Kp      Ki       Kd      Set time   Rise time   Over-shoot (%)   IASE
Z-N         0.038   0.0011   0.17    592        18          65               103
PSO         0.026   0.0003   0.222   213        25          11.2             9.17
GA          0.057   0.0019   1.704   111        8.8         22               9.90
SCA         0.041   0.0005   0.886   130        14          10               10.5
ASCA-PSO    0.040   0.0005   0.335   176        18          25               8.60
EVOA        0.098   0.006    2.01    86         7.4         43.4             9.6
Fig. 12 Transient response result ZN, PSO, and GA for liquid level tanks
5.2 3rd- and 4th-Order System Processes

In this subsection, the performance of ASCA-PSO was tested for optimizing the PID controller gains to control 3rd- and 4th-order processes with the transfer functions given in Eqs. 11 and 12:

G1(s) = 1.2 / (0.0007s^3 + 0.0539s^2 + 1.441s)    (11)

G2(s) = 27 / (s^4 + 12s^3 + 54s^2 + 98s + 71)    (12)
Tables 3 and 4 show the performance of using ASCA-PSO for setting the parameters of the controller against PSO, GA, SCA, EVOA, and ZN [5]. As shown in the
Fig. 13 Transient response result of ZN, SCA, and ASCA-PSO for the liquid level tanks

Table 3 Comparison of transient response parameters for ZN, PSO, GA, SCA, ASCA-PSO, and EVOA for the 3rd order system

Method      Kp      Ki      Kd      Set time   Rise time   Over-shoot   IASE
Z-N         18      0.045   0.018   0.254      0.0830      7.8          40.5
PSO         0.683   0.086   0.018   20.407     2.6339      12.57        13.3
GA          0.988   0.183   0.020   13.927     1.7970      12.80        12.8
SCA         1.245   0.211   0.033   13.368     1.540       13.98        11.2
ASCA-PSO    0.404   0.029   0.027   34.9       4.5         12.3         10.2
EVOA        1.25    0.3     0.4     12.2       1.9         13.5         11.1
Table 4 Comparison of transient response parameters for ZN, PSO, GA, SCA, ASCA-PSO, and EVOA for the 4th order system

Method      Kp      Ki      Kd      Set time   Rise time   Over-shoot   IASE
Z-N         3.072   2.272   1.038   7.695      0.5737      18.41        11.3
PSO         2.742   2.094   3.180   7.699      0.3785      6.975        9.89
GA          2.71    2.714   3.084   6.630      0.3807      10.89        10.3
SCA         2.705   2.226   3.718   7.718      0.3492      10.01        10.8
ASCA-PSO    2.64    2.17    3.054   7.333      0.3865      7.607        7.38
EVOA        2.74    1.8     2.180   86         8.5         9.5          154.7
tables, ASCA-PSO produces the lowest IASE. The performance responses of the various techniques are plotted in Figs. 14 and 15 for the third-order process.
Fig. 14 Transient response result of ZN, PSO, and GA for the 3rd order process
Fig. 15 Transient response result ZN, SCA, and ASCA-PSO for 3rd order process
6 Conclusion and Future Work

This chapter presents a comparison of a set of meta-heuristic algorithms (PSO, GA, SCA, ASCA-PSO, and EVOA) for tuning the gain parameters of the PID controller. Three systems of different orders are used to test the capability of the meta-heuristic algorithms to enhance the system responses beyond those obtained with ZN. The experimental results indicate that the ASCA-PSO algorithm gives the best performance, obtaining the lowest IASE among the tested algorithms. EVOA provides IASE values close to those of SCA and PSO. Hence, it is concluded that the performance of EVOA for enhancing the response of the PID controller still needs improvement, since its exploitation capability is not good enough for this engineering problem.
References

1. K. Ogata, Y. Yang, Modern Control Engineering, vol. 4 (Prentice Hall India, 2002)
2. N.H.A. Hamid, M.M. Kamal, F.H. Yahaya, Application of PID controller in controlling refrigerator temperature, in CSPA 2009, 5th International Colloquium on Signal Processing and Its Applications (IEEE, 2009)
3. K.J. Åström et al., Automatic tuning and adaptation for PID controllers - a survey, in Adaptive Systems in Control and Signal Processing 1992 (Elsevier, 1993), pp. 371–376
4. B.M. Vinagre et al., Fractional PID controllers for industry application. A brief introduction. J. Vibr. Control 13(9–10), 1419–1429 (2007)
5. J.G. Ziegler, N.B. Nichols, Optimum settings for automatic controllers. Trans. ASME 64(11) (1942)
6. E.-G. Talbi, Metaheuristics: From Design to Implementation, vol. 74 (John Wiley & Sons, 2009)
7. R. Storn, K. Price, Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11(4), 341–359 (1997)
8. J.H. Holland, Genetic algorithms. Sci. Am. 267(1), 66–73 (1992)
9. S. Mirjalili, SCA: a sine cosine algorithm for solving optimization problems. Knowl.-Based Syst. 96, 120–133 (2016)
10. S. Yazdani, H. Nezamabadi-pour, S. Kamyab, A gravitational search algorithm for multimodal optimization. Swarm Evol. Comput. 14, 1–14 (2014)
11. J. Kennedy, R. Eberhart, Particle swarm optimization, in Proceedings of the IEEE International Conference on Neural Networks (1995)
12. D. Karaboga, B. Basturk, Artificial bee colony (ABC) optimization algorithm for solving constrained optimization problems. Found. Fuzzy Logic Soft Comput. 789–798 (2007)
13. S. Mirjalili, Moth-flame optimization algorithm: a novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 89, 228–249 (2015)
14. B. Javidy, A. Hatamlou, S. Mirjalili, Ions motion algorithm for solving optimization problems. Appl. Soft Comput. 32, 72–79 (2015)
15. D.H. Wolpert, W.G. Macready, No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1(1), 67–82 (1997)
16. F.S. Lobato, V. Steffen Jr., A.J. Silva Neto, A comparative study of the application of differential evolution and simulated annealing in radiative transfer problems. J. Br. Soc. Mech. Sci. Eng. 32(SPE), 518–526 (2010)
17. M. Hannan et al., Optimization techniques to enhance the performance of induction motor drives: a review. Renew. Sustain. Energy Rev. (2017)
18. M. Issa et al., ASCA-PSO: adaptive sine cosine optimization algorithm integrated with particle swarm for pairwise local sequence alignment. Expert Syst. Appl. 99, 56–70 (2018)
19. M. Issa et al., Pairwise global sequence alignment using sine-cosine optimization algorithm, in International Conference on Advanced Machine Learning Technologies and Applications (Springer, 2018)
20. E. Emary, H.M. Zawbaa, A.E. Hassanien, Binary grey wolf optimization approaches for feature selection. Neurocomputing 172, 371–381 (2016)
21. M. Issa, A.E. Hassanien, Multiple sequence alignment optimization using meta-heuristic techniques, in Handbook of Research on Machine Learning Innovations and Trends (IGI Global, 2017), pp. 409–423
22. C. Sur, S. Sharma, A. Shukla, Egyptian vulture optimization algorithm: a new nature inspired meta-heuristic for knapsack problem, in The 9th International Conference on Computing and Information Technology (IC2IT 2013) (Springer, 2013)
23. V. Khanna et al., Estimation of photovoltaic cells model parameters using particle swarm optimization, in Physics of Semiconductor Devices (Springer, 2014), pp. 391–394
24. A. Harrag, Y. Daili, Three-diodes PV model parameters extraction using PSO algorithm. Revue des Energies Renouvelables 22(1), 85–91 (2019)
25. K. Ishaque et al., An improved particle swarm optimization (PSO)-based MPPT for PV with reduced steady-state oscillation. IEEE Trans. Power Electron. 27(8), 3627–3638 (2012)
26. W. Wang et al., A universal index and an improved PSO algorithm for optimal pose selection in kinematic calibration of a novel surgical robot. Robot. Comput.-Integr. Manuf. 50, 90–101 (2018)
27. V. Kumar, D. Kumar, Data clustering using sine cosine algorithm: data clustering using SCA, in Handbook of Research on Machine Learning Innovations and Trends (IGI Global, 2017), pp. 715–726
28. G. Kuschk, A. Božič, D. Cremers, Real-time variational stereo reconstruction with applications to large-scale dense SLAM, in Intelligent Vehicles Symposium (IV) (IEEE, 2017)
29. B. Kumar, R. Dhiman, Tuning of PID Controller for Liquid Level Tank System Using Intelligent Techniques, vol. 1 (2011)
Solar Irradiation Changes Detection for Photovoltaic Systems Through ANN Trained with a Metaheuristic Algorithm

Efrain Mendez-Flores, Israel Macias-Hidalgo, and Arturo Molina

E. Mendez-Flores (B) · I. Macias-Hidalgo · A. Molina
Tecnologico de Monterrey, School of Engineering and Sciences, Mexico City 14380, Mexico
e-mail: [email protected]
I. Macias-Hidalgo
e-mail: [email protected]
A. Molina
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
D. Oliva et al. (eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_29

1 Introduction

The constant growth of the world population, together with greater industrial productivity demand, has led to a constant increase in worldwide energy demand, as explored in [1]. Thus, the International Energy Agency (IEA) in [2] holds that modern economies worldwide depend on a reliable and affordable energy supply; therefore, clean energy from renewable sources has gained increasing relevance over the past decade. Moreover, among all renewable energy sources, Photovoltaic (PV) systems have gained particular popularity for power supply applications, due to their suitability and availability around the world (as highlighted in [3]); this is supported by the fact that global solar PV capacity additions grew by almost 14% in 2019 (compared to the previous year, according to [4]), reaching over 109 GW of new capacity. Therefore, [5] explains that, since energy demand is expected to grow 30% by 2040, a more sustainable future relies on a larger expansion of renewable energy sources, where PV sources are of critical importance for achieving a fossil-to-renewable transition (as explained in [3]). However, the global impact of Covid-19 on all sectors did not spare the energy sector; according to the renewable energy market analysis from the IEA presented in [4], the risks did not spread equally among all the generation
sources, since PV sources were more affected, owing to construction delays and unanticipated weaker investments, forcing the IEA to revise its 2020 capacity addition projections downward by more than 15% [4]. Yet, since the environmental need to reduce CO2 emissions will not undergo any change, the long-term sustainability goals should not change either. Hence, in order to overcome the unprecedented crisis triggered by the COVID-19 pandemic, [6] explains that innovation is the key tool for improving energy technologies, which would allow the addition of cleaner energy to be accelerated. Nowadays, energy supply through PV sources represents the second-largest absolute generation growth among all renewable sources; according to [7], PV generated more than 720 TWh of the global energy consumed in 2019. This, according to [3], attracts the attention of many researchers and causes more and more research topics to germinate, in order to meet the growth required to address the sustainability goals in spite of the risks due to Covid-19 in 2020. Hence, the increasing relevance of PV energy sources has led to research topics on DC/DC converter topologies (as shown in [8]), control algorithms (as explored in [9]), and optimization techniques to improve the energy harvested from PV sources (as validated in [10]). Consequently, as highlighted in [3], Maximum Power Point Tracking (MPPT) algorithms are widely studied, since they are critically important for properly tracking and acquiring the Maximum Power Point (MPP) of the energy that can be harvested from the PV source (as discussed in [11]). Yet, since the MPPT issue is a dynamic optimization problem (as stated in [11] and validated by [3]), a dynamic DC optimizer is required.
Therefore, [3] explains that MPPT algorithms are usually implemented through DC/DC converters, since their modulation properties allow the impedance at the output of the PV source to be modified (as detailed in [3]); this impedance variation allows the harvested energy to be tuned, owing to the relation between the voltage (Vpv) and current (ipv) of the PV array. Hence, MPPT algorithms are responsible for modulating the impedance at the output of the PV array through the DC/DC converter, seeking the dynamic Vpv and ipv combination that leads to the maximum energy transfer. Deepening into state-of-the-art MPPT solutions, [11] explains that among the most implemented and studied are the Perturb and Observe (P&O), Hill-Climbing (HC), and Incremental Conductance (IC) algorithms, which have been widely used due to their mathematical simplicity and their suitability for implementation; the P&O algorithm has been the epicenter and inspiration of many other solutions (including IC and HC). Furthermore, [12] shows that, in spite of correct tracking behavior with low dependence on calibration parameters and simple implementation, algorithms based on the classic P&O have a compromised trade-off between settling time and steady-state oscillations, as validated in [3]. Hence, in order to improve the MPP tracking efficiency and the dynamic performance of the system, [13] explains how the adaptation of metaheuristic optimization algorithms as MPPT solutions can improve the settling time with
steady-state oscillation reduction. Thereafter, [11] explores the implementation of Particle Swarm Optimization (PSO) in a PSO-based MPPT algorithm, which was first presented in [14]. [3] mentions that, just as P&O served as inspiration for the IC and HC algorithms, the PSO-based MPPT has served as inspiration for adapting many other metaheuristic algorithms into MPPT solutions, such as the Grey Wolf Optimization (GWO) based MPPT algorithm presented in [15], or the Earthquake Algorithm (EA) based MPPT studied in [3]. Nevertheless, some of the solutions based on the PSO-based MPPT may not always ensure the expected behavior, since according to [3] they are highly susceptible to, and dependent on, correct and precise parameter calibration. Moreover, the EA-based MPPT algorithm proposed in [3] claims to take the best of both worlds, leading to a solution with low calibration dependence and improved dynamic behavior. Additionally, metaheuristic-based MPPTs also improve the system's performance under partial shading conditions, compared to classical solutions like P&O (as explained in [11]). Yet, since no solution is perfect, many metaheuristic-based MPPTs have issues with an MPP that varies dynamically through time: under constant variations of the MPP (regardless of whether there are partial shading conditions), they may lose effectiveness, because metaheuristic-based MPPTs usually reduce their search radius in order to converge on a solution. An example of the problem is introduced in [16], where under a dynamically variable MPP the PSO-based MPPT is unable to find the new MPP after a period of time, since all the particles have gathered at the previous MPP. Hence, [16] highlights the importance of the reinitialization stage in metaheuristic-based MPPTs.
That being so, in [17] it is stated that two of the main strategies to overcome this issue are a periodically programmed reinitialization of the algorithm, or a reinitialization based on MPP changes measured through estimation of changes in the shading pattern. Thus, [16] explores in greater detail a reinitialization strategy based on the current and previous iterations of the tracked MPP. Nonetheless, the approach studied in [16] clearly validated that reinitializing metaheuristic-based MPPTs based on the shading conditions can significantly improve their performance; still, the approach showed that after aggressive condition changes were detected, the reinitialization stage also required time to settle down and identify that the conditions were no longer changing. This means that false activations and missed activations of the changes detector are a risk of that model, since an incorrect calibration or noise in the measurement could lead to detection errors. The above should not be taken lightly, since the converter's own ripple and the sensor measurements, among other practical factors, may induce uncertainty into the detection or complicate the calibration of the system. Subsequently, this work shows that the implementation of Artificial Intelligence for pattern recognition through power data acquisition can enable an efficient solution for detecting changes in solar irradiation, which provides a reliable source for determining the ideal time to reinitialize the MPP searching algorithm; the proposed solution is therefore a perfect fit for metaheuristic-based MPPTs. Nevertheless, the proposal can also be suited to classical MPPT algorithms, where the determination of solar irradiation changes can enable a state in which those algorithms stop oscillating around the global solution, whatever tracking precision they can achieve. It is well known that Artificial Neural Networks (ANN) are widely used for data classification and pattern recognition (as explored in [18]), due to their capacity to process information and adapt their behavior without previous knowledge of a particular environment. Moreover, [19] explains that ANNs are reliable in providing a general framework for representing non-linear mappings from several input variables to output variables. Yet, in order to ensure optimal performance of the ANN implemented as an irradiance changes detector, the training process is of critical importance for the network, and training is mostly a tuning procedure. Hence, since metaheuristic optimization algorithms are great tuning methods for locating optimal parameters (as discussed in [20]), they have also been widely exploited as training methods for ANNs; this is validated in [19], where a comparative analysis of metaheuristic algorithms as ANN training methods is provided, leading to a complete framework for the suitability of the algorithms as training methods.
The results of this work show that the EA achieved a correct training process for the ANN, since the ANN performed a correct detection of solar irradiation changes; the performed tests show that the solution achieved over 99% accuracy. In addition, this work seeks to enable other users to implement the designed ANN model through MATLAB/Simulink, in order to facilitate the implementation of the model for researchers that require a reliable simulation testbed with an equally reliable reinitialization signal based on the detection of changes in solar irradiation. Moreover, a MATLAB code example is provided and explained in order to enable other users to train ANN through the EA, or even to suit the EA to other optimization applications. This proposal also allows detecting MPP changes due to temperature variations, since the ANN change tracker performs the detection through the changes in the electrical Ppv (PV power) signal. Yet, since the irradiation has a more volatile behavior than the temperature, it makes more sense to focus on that variable for the MPPT reinitializations. Additionally, it is worth mentioning that this proposal was developed for fixed photovoltaic systems, but it could easily be suited to mobile tracking systems. Henceforth, Sect. 2 of this work introduces the Maximum Power Point concept for solar systems, which enables understanding the dynamic optimization issue addressed by MPPT algorithms; Sect. 3 explores the basic concepts of ANN, deepening into the Perceptron network, which is the selected topology for this application. Additionally, the main features of the EA are presented in Sect. 4, together with the framework of the EA as a tuning method. Then, the selected case study is presented in Sect. 5, and the obtained results in Sect. 6; those results are analyzed in terms of the performed Simulink simulations, the precision of the detection of solar irradiation changes, and the data quantification. Finally, the conclusion of this work is presented in Sect. 7.
2 Maximum Power Point for Photovoltaic Systems

Given the gradual increase in energy costs and a global interest in reducing atmospheric carbon emissions to mitigate the destabilizing effects of climate change, there has been an increasing interest in the use of renewable energy sources and alternate power systems. Nowadays, renewable energy sources can be integrated into all kinds of electric systems, from big, interconnected, continental networks to small autonomous systems and buildings. However, the problems presented by renewable energy integration are contextual and specific to each location, and they demand adjustments in electrical distribution systems. Regarding Photovoltaic (PV) generation systems, the maximum amount of energy that PV systems can deliver (Ppvmax) depends on the received solar irradiance (G) and the internal temperature (T) of the photovoltaic cells (PV cells), which are the input parameters of PV systems (as explained in [22]). According to [3], the best way to understand the dynamic behavior of a PV generating system is through the equivalent model that captures the key features allowing the PV system to convert solar energy into electricity; the basic electric diagram of this model is shown in Fig. 1. Solar energy is converted into the current Iph, which passes through the diode to close the circuit, producing the diode current Id; part of the generated power is lost in the parasitic elements RP and RS, which model the ohmic losses of the modules due to parallel and series connections. Hence, as explained
Fig. 1 Single-diode model with ohmic losses for PV modules
Fig. 2 Maximum power point changes due to irradiation (a and b) and temperature (c and d) parametric variations. Panels a and c show the Current [A] vs Voltage [V] curves, and panels b and d the Power [W] vs Voltage [V] profiles, for G = 400–1000 W/m² at T = 25 °C and for T = 0–75 °C at G = 1000 W/m²; the MPP is marked on each power curve
in [11], the power generated by the PV system, Ppv, is obtained through the relation between the resultant current of the PV module (Ipv) and the voltage obtained at the module output (Vpv). For a more detailed explanation of the equations and a step-by-step development, it is recommended to refer to [3]. On the one hand, the maximum current that PV systems can deliver (Ipvmax) decreases with the drop in solar irradiation, while the maximum achievable voltage (Vpvmax) decreases with the rise in temperature (as explained in [23]), which makes the Maximum Power Point (MPP) difficult to track due to all the parametric variations. The Current vs Voltage curves presented in Fig. 2a, c show how the current changes with solar irradiation variations and how the voltage changes with temperature variations, respectively; this is how the maximum voltage and current fluctuate constantly with the climatological conditions. On the other hand, as shown in Fig. 2a, c, the voltage delivered by a PV source depends on the non-linear load current, so the maximum energy that PVs can deliver depends, besides temperature and irradiance, on the load current (iL). Therefore, the Maximum Power Point (MPP) changes with ambient conditions and the load current; knowing that the PV power (Ppv) can be obtained through the relation Ppv = Vpv · Ipv, the resultant power curves can be observed in Fig. 2b, d, where the PV array behavior is shown through the Power vs Voltage profiles.
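The relation Ppv = Vpv · Ipv and the MPP dependence on G can be illustrated with a short, self-contained sketch. The single-diode model below is deliberately simplified (the series and shunt losses RS and RP are neglected), and all parameter values are illustrative assumptions, not datasheet values of any particular module:

```python
import math

def pv_current(v, g=1000.0, t_cell=25.0):
    """Approximate PV module current at voltage v (simplified single-diode
    model; series/shunt losses neglected, parameters are illustrative)."""
    k, q = 1.380649e-23, 1.602176634e-19   # Boltzmann constant, electron charge
    n, n_cells = 1.3, 24                   # ideality factor, cells in series
    vt = n_cells * n * k * (t_cell + 273.15) / q
    i_ph = 5.5 * g / 1000.0                # photo-current scales with irradiance
    i_0 = 1e-10                            # diode saturation current [A]
    return max(i_ph - i_0 * (math.exp(v / vt) - 1.0), 0.0)

def find_mpp(g=1000.0, t_cell=25.0, v_max=20.0, steps=2000):
    """Sweep the voltage, compute Ppv = Vpv * Ipv, return (Vmp, Imp, Pmax)."""
    best = (0.0, 0.0, 0.0)
    for j in range(steps + 1):
        v = v_max * j / steps
        i = pv_current(v, g, t_cell)
        p = v * i
        if p > best[2]:
            best = (v, i, p)
    return best

print(find_mpp(g=1000.0)[2], find_mpp(g=600.0)[2])
```

Sweeping G reproduces the qualitative behavior of Fig. 2: a lower irradiance lowers the available current and therefore the MPP power.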
Therefore, for the majority of practical applications, PV systems require an electronic interface between the PV modules and the load they supply energy to; on the one hand, this interface can regulate the amount of current demanded from the PV source (Ipv) by varying its output impedance (ZCVIN) according to MPP changes, and on the other hand, it can deliver a constant output voltage (Vo) regardless of Vpv and Ipv fluctuations, or of demand variations of the load current (IL), as shown in Fig. 3.
Solar Irradiation Changes Detection for Photovoltaic Systems Through ANN Trained …
Fig. 3 Conceptual scheme of a simple structure for a photovoltaic power system
Moreover, the electronic interface between the PV array and the load is known as a Photovoltaic Power System for systems that are not directly connected to the electric grid, and as a Grid Converter for systems that are. They are basically composed of power converters: DC-DC converters are used for direct current output applications, and DC-AC converters for applications that require an alternating current output. In addition, some photovoltaic converters with AC output have two stages: a DC-DC converter in the first stage and a DC-AC converter in the second one (as explained in [22]). Most applications use switching converters due to their high efficiency (as explained in [11]). Through MPPT algorithms, these power converters track the maximum achievable power point through time by internally modifying the commutation pattern of the switching semiconductors, so that the converter changes its input impedance ZCVIN; this allows maintaining a constant voltage at the output as long as the input voltage of the converter stays within certain margins (Vpvmax and Vpvmin) for a maximum current load, which in turn achieves the maximum power point of the energy that can be supplied.
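The impedance-matching role described above can be made concrete. For an ideal boost converter in CCM, a textbook relation (not stated explicitly in the chapter) is that the load resistance R appears at the converter input scaled by (1 − D)², so acting on the duty cycle D effectively moves the operating point seen by the PV source:

```python
def boost_input_resistance(r_load, duty):
    """Ideal CCM boost converter: the load R reflected to the input is
    R * (1 - D)^2 (losses neglected), so varying D changes the input
    impedance presented to the PV array."""
    return r_load * (1.0 - duty) ** 2

def duty_for_target_input_resistance(r_load, r_target):
    """Duty cycle that makes the converter present r_target at its input."""
    return 1.0 - (r_target / r_load) ** 0.5

# Example: a 12-ohm load seen through D = 0.5 looks like 3 ohms at the input
print(boost_input_resistance(12.0, 0.5))   # 3.0
```

This is why the MPPT loop can steer Ipv purely through the commutation pattern, without any extra hardware.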
Subsequently, after exploring the main components of photovoltaic systems and the dynamic optimization issues addressed by MPPT algorithms, this chapter focuses on the development of an Artificial Neural Network (ANN) capable of providing a solution for MPPT algorithms that need to know when to stop searching for the MPP (because there are no G or T changes anymore), or for algorithms that simply cannot work properly without a reset signal upon parametric changes of temperature and irradiance; such as metaheuristic-based MPPTs like the PSO-based (proposed in [14]) or the EA-based (proposed in [3]) MPPT algorithms, which after some iterations converge on a global solution and need to know when to start looking for the MPP again to avoid getting stuck at old MPP points.
E. Mendez-Flores et al.
The proposed solution receives Vpv and Ipv and generates a reference signal for a boost DC-DC converter, where the ANN detects the pattern of the solar irradiation changes, allowing it to track when the MPP has changed while discriminating the noise from the switching converter. Even though the algorithm's functionality is shown with a step-up DC-DC (boost) converter, the generated reference signal can easily be integrated as a sub-function in the control scheme of any kind of converter. Hence, the following section presents the main features of ANN, in addition to their suitability for the proposed application.
3 Artificial Neural Networks

As introduced in [24], Artificial Neural Networks (ANN) emulate neural events and the relations among them, where the logical model proposed in [24] seeks to imitate the behavior of biological neurons and the interaction between them (as discussed in [25]). In other words, ANN intend to emulate the way in which the human brain processes information (as explained in [18]); hence, ANN are bio-inspired computational models formed by artificial neurons interconnected with weighted coefficients, which according to [26] constitute the artificial neural structures. Moreover, as highlighted in [18], the processing units of an ANN are called neurons, where each neuron receives input data from other nodes and generates a scalar output, which directly depends on the weighted information taken from the inputs. Thus, as explained in [18], the general ANN components can be summarized as:

• Neurons set.
• Neural interconnections.
• Neural activation function.

These components are linked to each other as shown in Fig. 4, where the general structure of a single neuron with multiple inputs is presented. Fig. 4 shows how the inputs are processed with different weights, which are then combined through the weighted sum; afterwards, the output of the system is obtained through an activation function, which gives the final state of the output for the given input values. The mathematical expression that represents the neuron model of Fig. 4 is the weighted sum shown in (1), whose result is later introduced into the activation function:

u = Σ(i=1..n) xi ωi + θ = x1ω1 + x2ω2 + x3ω3 + ⋯ + xnωn + x0ω0   (1)
where the neuron has n inputs (x) and weights (ω); the output given by the weighted sum before the activation function is u. On the other hand, θ is an adjustment parameter for the network, which (according to [18]) can be defined as θ = x0ω0.

Fig. 4 General structure of a single neuron with multiple weighted inputs

Subsequently, [27] explains the importance of properly selecting an activation function according to the application, since (also according to [27]) the performance of the ANN is strongly affected by the selected function. Then, as highlighted in [18], the non-linear activation function f needed to obtain the representative output of the system can be denoted as shown by (2):

y = f(u)   (2)
where y is the output of the system and f(u) is the value of u evaluated through an activation function, which completes the mathematical structure of the neuron of Fig. 4. To properly select the activation function according to the expected behavior of the ANN, [18] classifies some of the most used activation functions as:

• Step function: mostly represented as the "hardlim" function, which is best suited to applications that require a binary output.
• Linear and mixed function: suitable for applications where the output of the neurons can be linearly related to the weighted sum within a defined range.
• Hyperbolic tangent function: as validated in [27], this function is selected when smooth positive or negative variations are expected at the output.
• Gauss function: as explored in [18], this function is mainly used in order to reduce the hidden mapping to one layer of neurons.

The interconnection and signal propagation between multiple neurons lead to different ANN topologies, which can be roughly classified into feedforward and feedback (recurrent) interconnection topologies (as explained in [18]). The ANN topology selected for this proposal (shown in Fig. 5) is a single-layer, single-neuron net with a feedforward structure, which is also known as the Perceptron structure.
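Equations (1) and (2) amount to a weighted sum followed by an activation. A minimal sketch, with purely illustrative weights:

```python
def hardlim(u):
    """Step ('hardlim') activation: 1 if u >= 0, else 0."""
    return 1 if u >= 0.0 else 0

def neuron(x, w, theta):
    """Single neuron from Eqs. (1)-(2): u = sum(x_i * w_i) + theta, y = f(u)."""
    u = sum(xi * wi for xi, wi in zip(x, w)) + theta
    return hardlim(u)

# Toy example: the neuron fires only when the weighted sum of its inputs
# exceeds the threshold encoded by theta (here theta = x0*w0 = -0.4).
print(neuron([1.0, 0.2], [0.5, 0.5], -0.4))  # 0.5 + 0.1 - 0.4 >= 0 -> 1
print(neuron([0.1, 0.1], [0.5, 0.5], -0.4))  # 0.05 + 0.05 - 0.4 < 0 -> 0
```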
Fig. 5 Perceptron structure of the implemented single layer ANN for the solar irradiation changes detection
Fig. 6 Data acquisition structure for the ANN inputs
The Perceptron ANN structure is a widely studied topology, which is perfectly suited for applications that require linear classification, since it produces a binary output estimated from the input data pattern. In this particular case, as shown in Fig. 5, the proposed input variables are ΔPpv(i), ΔPpv(i − 1) and ΔPpv(i − 2), which are the power changes through time in the current iteration (i) and in the two previous iterations (i − 1 and i − 2, respectively). Also from Fig. 5, it can be observed that the activation function of the proposed ANN is the hardlim function, since the expected output should be logic-1 if there is an irradiation change, and logic-0 if there is not. It is worth mentioning that ΔPpv(i − 1) and ΔPpv(i − 2) are data taken from the system's memory: a main feature of the proposal is that it does not require measurements additional to the data already acquired by the MPPT algorithm, in addition to the low computational cost of the proposed irradiance change detector. Henceforth, in order to clarify the data acquisition process, the memory-recovered data and the preprocessing operations for the proposed ANN, Fig. 6 presents the general structure of the data acquisition process for the input variables of Fig. 5. As analyzed from Fig. 6, the first step is to acquire the voltage (vpv) and current (ipv) data at a certain time instant from the PV system; the data is then used to estimate the power in the present iteration (Ppv(i)). In the following step, the estimated power is used to calculate the change in power (ΔPpv(i)) with respect to the previously stored power value (Ppv(i − 1)), where ΔPpv(i) is estimated through (3).
ΔPpv(i) = |Ppv(i) − Ppv(i − 1)|   (3)

where the absolute value of the change in power is taken, since the pattern to be identified through the ANN is in terms of the magnitude of the power disturbance behavior; this is also the reason why two historical ΔPpv values are taken, in order to achieve a correct pattern identification for finding irradiation changes through the power behavior of the system. Then, (1) can be translated into (4), which is also the objective function to be optimized:

u = [ΔPpv(i)]ω1 + [ΔPpv(i − 1)]ω2 + [ΔPpv(i − 2)]ω3 + x0ω0   (4)

where ω0, ω1, ω2 and ω3 are the parameters to be optimized, while the adjustment parameter x0 remains constant at x0 = −1 for this work. Nonetheless, as stated before and as validated in [18], the correct calibration of the weights is of critical importance for the performance of the ANN, since the weights are the parameters that relate the input power values to the expected behavior of the irradiance change detector. Therefore, since the training process (the weights calibration process) seeks to find the optimized values for the ANN, metaheuristic optimization algorithms have been widely used for the task (as validated in [28] and [18]), due to their reduced computational effort compared to classical training methods (such as backpropagation [18]) and their enhanced ability to find optimal solutions. Consequently, in this work the Earthquake Optimization Algorithm (EA) is selected as the tuning method, since the EA was validated as a training method for ANN in [18], where two examples were presented: an ANN for logic gates emulation and an ANN for the detection of mobile phone usage while driving. Hence, the validated results from [18] inspired confidence to implement the same algorithm for the training process of the ANN developed in this work.
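Putting Eqs. (3) and (4) together, the detector only needs the power history already kept by the MPPT loop. The sketch below assumes placeholder weights chosen by hand; in the chapter they come from the EA training stage:

```python
from collections import deque

class IrradianceChangeDetector:
    """Perceptron detector sketch from Eqs. (3)-(4): the inputs are the last
    three |power change| samples, and x0 = -1 is the fixed bias input.
    The weights passed in are placeholders, not the trained values."""

    def __init__(self, w0, w1, w2, w3):
        self.w = (w0, w1, w2, w3)
        self.p_prev = None
        self.dp = deque([0.0, 0.0, 0.0], maxlen=3)  # dP(i), dP(i-1), dP(i-2)

    def step(self, v_pv, i_pv):
        p = v_pv * i_pv                                              # Ppv(i)
        dp_now = 0.0 if self.p_prev is None else abs(p - self.p_prev)  # Eq. (3)
        self.p_prev = p
        self.dp.appendleft(dp_now)
        w0, w1, w2, w3 = self.w
        u = self.dp[0]*w1 + self.dp[1]*w2 + self.dp[2]*w3 + (-1.0)*w0  # Eq. (4)
        return 1 if u >= 0.0 else 0                                  # hardlim

det = IrradianceChangeDetector(w0=0.5, w1=1.0, w2=1.0, w3=1.0)
print(det.step(11.5, 5.0))   # first sample: no history, no change flagged
print(det.step(11.5, 3.0))   # large power jump -> detector fires
```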
4 Earthquake Algorithm

In this chapter, the weights of the proposed ANN are tuned through the Earthquake Optimization Algorithm (EA), which has been validated as a training method for ANN in [18]. The EA is the first geo-inspired (inspired by a geological phenomenon) metaheuristic optimization algorithm; it was first introduced in [21], where the algorithm was used to optimize the gains of a PID (Proportional-Integral-Derivative) controller for the optimal solution of an embedded speed controller for DC motors. As stated in [20], the algorithm was first validated as an ANN training method in [18]. The main feature that makes the EA a reliable optimizer for different optimization problems is the combination of the searching velocities achieved through the P- and S-wave velocities, which, according to [29], provide wide and fine searching paths, respectively. Nonetheless, the EA is easily implementable for different applications, since, as stated in [29], the EA requires less parametric calibration to suit different optimization applications, compared to other metaheuristic algorithms that require parameter adjustment. Henceforth, it is worth summarizing the main concepts and features of the EA's deduction, in order to better understand the implementation of the algorithm for the
ANN weights calibration. Since the EA is inspired by the behavior of real earthquakes, that is, by the geological phenomenon, the algorithm is mostly based on the aggressive propagation behavior of the P-wave and the finer propagation of the S-wave, which, as explained in [20], are the key features the algorithm takes advantage of. Hence, as explained in [29], Eq. (5) describes the velocity of the P-wave and (6) presents the equation that describes the S-wave velocity:

vp = √((λ + 2μ)/ρ)   (5)

vs = √(μ/ρ)   (6)
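Equations (5) and (6) can be sketched directly, using the parameter values the text reports for the EA (λ = μ = 1.5 GPa, density drawn from 2200–3300 kg/m³):

```python
import math
import random

def wave_velocities(lam=1.5e9, mu=1.5e9, rho=None, rng=random):
    """P- and S-wave velocities from Eqs. (5)-(6). With lam = mu, the P-wave
    is faster than the S-wave by a factor of sqrt(3), whatever the density."""
    if rho is None:
        rho = rng.uniform(2200.0, 3300.0)   # random density, kg/m^3
    v_p = math.sqrt((lam + 2.0 * mu) / rho)
    v_s = math.sqrt(mu / rho)
    return v_p, v_s

v_p, v_s = wave_velocities(rho=2700.0)
print(v_p, v_s)   # the P-wave is always the faster of the two
```

The random density is what injects the heuristic behavior: each epicenter draws a new ρ, and hence a slightly different step size, at every move.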
where vp and vs are the P- and S-wave velocities, respectively, while λ and μ are the Lamé parameters, which, as explained in [29] and [18], can be equal under some circumstances; thus, for this application λ = μ = 1.5 GPa is taken, since it is the value validated in different applications, such as [18, 20, 21, 29, 30]. On the other hand, also from (5) and (6), the density (ρ) of the solids where the waves propagate is taken as a random value, which according to [29] gives the heuristic behavior to the algorithm; so as not to leave the geo-inspiration of the algorithm behind, [29] explains that the density parameter should be selected between 2200 and 3300 kg/m³, since this is the range of the real geological properties of earth materials. It is of critical importance for the algorithm to decide whether to use vp or vs for the searching path to be followed; therefore, [21] introduces the S-range (Sr) concept, which is the operating range in which the algorithm will operate with the vs velocity. Since searching with the vs velocity achieves a finer search (as validated in [3]), the Sr is defined as an area around the global best solution, implying that the searching agents (epicenters) that are in the Sr, operating with vs, will (according to [29]) orbit around the global best solution. Fig. 7a highlights how the epicenters behave with respect to the Sr, where [25] recommends the Sr to be 2% around the global best solution. Moreover, since vp and vs are estimated through a square root in (5) and (6) respectively, the algorithm's framework contemplates randomly selecting whether to use a positive or negative velocity value. Thus, the epicenters' positions are updated through (7):

Xi(t) = Xi(t − 1) + Vi   (7)
where Xi(t) and Xi(t − 1) are the current and previous positions, respectively, and Vi is the epicenter's selected velocity (vs or vp). Moreover, [18] explains that, as an additional heuristic degree, the EA uses a random position selection through an exponential distribution, which statistically reduces the possibility of re-evaluating the same points. It is important to highlight that the concept of the exponential distribution is originally taken from [21], which, according to [29], addresses the natural relation of the geological phenomenon with the Poisson distribution explained in [20]. Hence, the exponential random generation of epicenter positions is made through (8):

Xi(t) = Xbest + Expμ(s)   (8)

Fig. 7 Epicenters behavior regarding the Sr and the exponential random positions generation
where Xbest is the global best solution and Expμ(s) is the random epicenter position generated through the exponential distribution from the value of μ, which according to [29] should be generated in a range of ±1.91. Fig. 7b shows the way the epicenters behave when some of them are generated through (8), which according to [20] improves the capability of escaping from local minima. To understand the EA optimization process in terms of the Sr and the exponential random generation, Fig. 7a shows an example of epicenters initially placed at random on a surface; after ranking the solutions in order to find the first global best epicenter, the searching agents can determine whether they are in or out of the S-range, where the ones inside the Sr will use vs and the remaining ones will take vp as their velocity. Consequently, after some iterations, Fig. 7b shows that the epicenters start to converge with a finer search around the current global best solution (as stated in [20]). Furthermore, the example in Fig. 7b shows the searching agents converging around a local minimum; yet there are four epicenters whose positions were generated through the exponential distribution, and one of those epicenters found a new global best solution, exemplifying how the EA escapes the local minimum where the other epicenters were getting trapped (as discussed in [20]).
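The pieces above (Eqs. (5)–(8), the S-range and the exponential regeneration) can be combined into a compact optimization loop. This is a simplified illustration, with the Lamé parameters rescaled so the step sizes fit the search domain; it is not the authors' reference implementation:

```python
import math
import random

def earthquake_minimize(f, dim, bounds, n_epi=20, iters=100, sr=0.02, seed=1):
    """EA-style loop sketch: epicenters move with a randomly signed P- or
    S-wave velocity (S-wave inside the S-range around the best solution),
    and a few epicenters per iteration are regenerated near the best via an
    exponential jump (Eq. (8))."""
    rng = random.Random(seed)
    lo, hi = bounds
    lam = mu = 900.0  # rescaled Lame parameters (assumption, see lead-in)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_epi)]
    best = min(pop, key=f)[:]
    for _ in range(iters):
        for x in pop:
            rho = rng.uniform(2200.0, 3300.0)          # random density
            in_sr = all(abs(a - b) <= sr * (hi - lo) for a, b in zip(x, best))
            v = math.sqrt(mu / rho) if in_sr else math.sqrt((lam + 2 * mu) / rho)
            for d in range(dim):                       # Eq. (7), signed velocity
                x[d] += rng.choice((-1.0, 1.0)) * v * rng.random()
                x[d] = min(max(x[d], lo), hi)
            if f(x) < f(best):
                best = x[:]
        for x in rng.sample(pop, 2):                   # exponential regeneration
            for d in range(dim):
                jump = rng.choice((-1.0, 1.0)) * rng.expovariate(10.0)
                x[d] = min(max(best[d] + jump, lo), hi)
    return best, f(best)

sphere = lambda x: sum(v * v for v in x)
best, err = earthquake_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(err)  # small positive value for this convex test function
```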
Fig. 8 EA general flowchart
Nonetheless, a detailed structure of the EA is presented in the flowchart shown in Fig. 8. Still, it is important to explain how the algorithm is implemented as the training method for the proposed ANN.
Fig. 9 Acquisitions system diagram
4.1 Earthquake Algorithm as ANN Tuning Method

This subsection details the structure of the metaheuristic weights optimization for the ANN from the perspective of the EA, but in a general way, so that it could be suited to any other metaheuristic algorithm for the ANN training process. First, the data acquisition is made through the complete PV system, where the variations in Ppv are measured through the vpv and ipv data acquired from the sensors. With previous knowledge of the irradiation conditions in the controlled environment, the irradiation input data (G in Fig. 9) is translated into a logical signal, where (as stated before) the output is in the on state (logic-1) when there is an irradiation change, and off (logic-0) if there is not. This is how the artificial neural network learns whether or not to activate according to the Ppv behavior, allowing it to discriminate the noise induced by the switching converter. Although this work takes the irradiation changes as reference, the proposal also allows detecting MPP changes due to temperature (T in Fig. 9) variations, since, as presented before, the ANN change tracker performs the detection through the changes in the electrical Ppv signals. Yet, since the irradiation has a more volatile behavior compared to the temperature, this work focuses on that variable for the EA optimization. Therefore, after processing the acquired data to obtain the input/output relation that will serve as training data for the ANN, (4) is taken as the objective function, where ω0, ω1, ω2 and ω3 are the variables to be optimized. It is important to highlight that, after evaluating a set of weights in (4), the fitness values are obtained after applying the activation function (hardlim in this case) to the output of (4). Hence, the error is evaluated as shown in (9):

erri = |yexp − yobt|   (9)
where erri is the error at the i-th sampling point, estimated through the absolute value of the expected logic value of the ANN (yexp) minus the obtained logic value (yobt). Meanwhile, the total error of the evaluated set of weights is obtained through (10).
E = Σ(i=1..n) erri   (10)
where E is the total error of the ANN with the evaluated ω0, ω1, ω2 and ω3, estimated through the sum of the erri points. Since the perceptron output is a logic output, (10) represents the number of times the ANN responded erroneously over n ΔPpv samples with the ω0, ω1, ω2 and ω3 configuration. Moreover, in order to ease the implementation of the EA as an ANN training method, the following subsection presents and explains a MATLAB code example of the EA used for this application.
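Equations (9) and (10) make the fitness a simple misclassification count. The sketch below illustrates this with hypothetical training tuples (the ΔPpv values and labels are invented for the example):

```python
def hardlim(u):
    return 1 if u >= 0.0 else 0

def ann_total_error(weights, samples):
    """Fitness from Eqs. (9)-(10): number of samples the perceptron
    misclassifies with the candidate weights (w0, w1, w2, w3), x0 = -1.
    'samples' holds (dP_i, dP_i1, dP_i2, expected_output) tuples."""
    w0, w1, w2, w3 = weights
    total = 0
    for dp0, dp1, dp2, y_exp in samples:
        u = dp0 * w1 + dp1 * w2 + dp2 * w3 + (-1.0) * w0   # Eq. (4)
        total += abs(y_exp - hardlim(u))                   # Eqs. (9)-(10)
    return total

# Illustrative data: big power changes -> label 1, ripple-sized ones -> 0
data = [(5.0, 4.0, 3.0, 1), (0.01, 0.02, 0.01, 0), (6.0, 0.1, 0.0, 1)]
print(ann_total_error((0.5, 1.0, 1.0, 1.0), data))   # 0: all classified right
```

The EA simply searches the 4-dimensional weight space for the set that drives this count to zero over the whole training record.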
4.2 MATLAB Implementation of the EA for ANN Training

With the main objective of enabling other users to implement the EA as a training method for ANN, this subsection explains a MATLAB example suited to this application. This code can also easily be adapted to other applications, since the general structure of the EA is explained through simple coding instructions, which makes it suitable for other applications with few changes required. Therefore, the first step in this approach is loading the test data, which can come from a real testbed or from a simulation:
%% Loading data from simulations
load('testData.mat', 'Ppv', 'tempo', 'irrad')
% Where: Ppv ...

...

            if global_best_error > err_best_actual
                global_best = wp(ind_best_actual, :);
                global_best_error = err_best_actual;
            end
    end % ending epicenters for loop
    itact = itact + 1;
end % ending main while loop
Hence, after ending the optimization loop, the best solution is stored in the variables w0, w1, w2 and w3, which represent the ω0, ω1, ω2 and ω3 coefficients.
% Determining whether the iteration or error condition was met
if itact > itmax
    disp('exited due to iteration limit')
elseif error_ciclo >= global_best_error
    disp('exited due to error condition')
    disp(itact)
end
% Saving final results
w0 = global_best(1); % adjustment weight
w1 = global_best(2); % weight of x1
w2 = global_best(3); % weight of x2
w3 = global_best(4); % weight of x3
% Convergence plot
figure(1)
plot(matriz_global_error)
% Tone to announce code end
load handel
sound(y, Fs)
Finally, the end of the code plots the convergence of the optimization process, and a tone is played to announce that the code has finished. The following section presents a case study where the proposal is applied, and where the ANN is trained and validated.
5 Case Study

In order to validate the proposed ANN structure for solar irradiation changes detection, a case study is presented, whose circuit topology is shown in Fig. 10. To test the ANN in a validated testbed, the component parameters are inspired by the converter design in [3], where the DC-DC boost converter was designed as an MPPT testbed in which a Particle Swarm Optimization (PSO) based, an EA-based, and the classic P&O MPPT algorithms were tested and compared.
Fig. 10 Circuit topology for the case study
Table 1 DC-DC boost converter parameters

Parameter | Variable | Value
Input voltage | Vg | 12 V
Switching frequency | f | 120 kHz
Duty cycle | d | 50%
Inductance | L | 47 mH
Capacitance | C | 220 µF
Load resistance | R | 12 Ω
Output power | Po | 48 W
Output voltage | V | 24 V
Output current | i | 2 A

Table 2 PV array parameters

Parameter | Variable | Value
Maximum power | MP | 57.96 W
Cells per module | CPM | 24
Open-circuit voltage | Voc | 14.5 V
Short-circuit current | Isc | 5.51 A
Voltage at MPP | Vmp | 11.5 V
Current at MPP | Imp | 5.04 A
Temperature coefficient of Voc | TVoc | −0.322%/°C
Temperature coefficient of Isc | TIsc | 0.072%/°C
Light-generated current | IL | 5.536 A
Diode saturation current | IO | 4.676e−11 A
Diode ideality factor | n | 0.92399
Shunt resistance | Rsh | 56.671 Ω
Series resistance | Rs | 0.2598 Ω
Parallel strings | Ps | 1
Series-connected strings | Ss | 1
Hence, the DC-DC converter is designed to operate in Continuous Conduction Mode (CCM), with a nominal input voltage of 12 V stepped up to an output voltage of 24 V at 2 A. Thus, as explained in [3], the converter was designed to operate at a nominal power of 48 W; Table 1 presents the main parameters of the DC-DC converter. In addition to the DC-DC converter, the model selected for the PV array connected to the converter is the CRM60S125S, since it has a nominal voltage at MPP (Vmp) of 11.5 V at 5.04 A, which allows a maximum achievable power of 57.96 W. Table 2 shows the main parameters of the selected PV array.
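The Table 1 operating point can be cross-checked with the ideal CCM boost conversion ratio Vout = Vin/(1 − D), a standard textbook relation (losses neglected):

```python
def boost_duty(v_in, v_out):
    """Ideal CCM boost conversion ratio Vout = Vin / (1 - D), solved for D."""
    return 1.0 - v_in / v_out

def output_power(v_out, i_out):
    """Output power delivered to the load."""
    return v_out * i_out

# Checking the Table 1 operating point: 12 V stepped up to 24 V at 2 A
d = boost_duty(12.0, 24.0)
print(d, output_power(24.0, 2.0))   # 0.5 and 48.0, matching the 50% / 48 W
```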
E. Mendez-Flores et al.
After defining the main parameters for the case study, the optimization and ANN training are performed in MATLAB, as explained in Sect. 4.1, where the optimization code is presented. The following subsection explores in greater detail the ANN training process through data acquisition from a Simulink/Simscape simulation (validated as a testbed in [3]).
5.1 ANN for Solar Irradiation Changes Detection

In order to perform the analysis in a controlled case study, the data was acquired from the simulated testbed validated in [3], which allows validating the behavior of MPPT-related algorithms. To provide enough data to train the ANN and validate its accuracy as an irradiance-change detector, ten test runs were performed with ten different irradiation curves, shown in Fig. 11. Moreover, as explored in [3] and validated in [29], MATLAB/Simulink provides a reliable testbed for power electronics, achieving high accuracy in representing the dynamic behavior of DC-DC converters and PV systems. Therefore, following the validated simulation model from [3], the data was acquired using components from the Simscape Power Systems™ library. As proposed in [3], the Ordinary Differential Equations solver 4 (ode4) was selected for the simulation; the tests were carried out using a 1 µs sampling time, while the ANN used a 1 ms sampling time for the change detection, since this is a common sample time for MPPT algorithms (as validated in [11]). After applying the solar irradiation input curves (Fig. 11) to the circuit topology described in Fig. 10, the resulting power profiles used as input values for the ANN tests are shown in Fig. 12. As explained in Sect. 3, the data must be pre-processed so that the ANN can distinguish whether a change in measured power is only noise (i.e., induced by the DC-DC converter ripple) or an actual change in solar irradiation. The profiles obtained after applying (3) are shown in Fig. 13.
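Equation (3) is not reproduced in this excerpt; assuming the pre-processing reduces to the absolute difference between consecutive power samples, it can be sketched as:

```python
# Illustrative pre-processing of the measured PV power into a |dP| signal.
# Assumption: Eq. (3) in the chapter reduces to the absolute difference
# between consecutive power samples; the actual definition may differ.
def abs_power_change(power_samples):
    """Return |P[k] - P[k-1]| for each sample (first value is 0)."""
    out = [0.0]
    for k in range(1, len(power_samples)):
        out.append(abs(power_samples[k] - power_samples[k - 1]))
    return out

# Ripple-only samples give small |dP|; an irradiation step gives a spike.
p = [48.0, 48.01, 47.99, 48.0, 45.0, 45.01]   # hypothetical power trace [W]
print(abs_power_change(p))
```

The large value at the 48 W to 45 W step is the kind of feature the detector must separate from the small ripple-induced differences.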
Fig. 11 Solar irradiation test curves (G [W/m²] versus time [s], Tests 1–10)
Fig. 12 Acquired power curves obtained through the measured Vpv and ipv data

Fig. 13 Input data tests for the ANN (|ΔPpv| signal)
It is also worth noting from Fig. 13 that, when zooming into the range between 8 and 9 s, both the converter ripple and the irradiation changes affect the power measurement: the Test 6 plot shows a rippling signal even though there is supposed to be no change at all at that point (as validated in Figs. 11 and 12). Fig. 13 also validates that the perceptron proposal can handle the classification, since the noise from Test 6 remains linearly separable from the higher |ΔPpv| values reached by the Test 1, Test 5, Test 7 and Test 8 curves, which do undergo irradiation changes at the analyzed point. Then, in order to avoid overtraining the ANN, the Test 1 profile was taken as training data, and the other nine tests were used as validation data, in order to prove that the obtained network is suitable for different irradiation changes.

Fig. 14 Isolated power and ΔPpv curves from Test 1, together with the expected output profile for the ANN

Hence, Fig. 14 shows, sequentially: the isolated power curve from Test 1 (taken from Fig. 12), the isolated ΔPpv profile (taken from Fig. 13), and finally the constructed profile taken as the expected output for the ANN. The logic-1 level represents the ANN detecting irradiation changes in the ΔPpv data, while logic-0 represents the ANN output deactivated because there are no irradiation changes.

Consequently, the EA was implemented as the training method for the weight optimization of the ANN (as discussed in Sect. 4.2). The EA was run with 100 epicenters and 100 iterations, and the optimization was repeated 10 times in order to reduce uncertainty in the results. In each of the 10 trials, the EA converged to a final error value of E = 8 [according to (10)], which can be verified in the convergence plot presented in Fig. 15. The weights obtained for the trained ANN are summarized in Table 3; this solution reached E = 7, which is lower than all the test runs, so the parameters from Table 3 are used for all the validation runs presented in this work. Moreover, the achieved final error value of E = 7 means that, over the 1000 testing points of the training process, the designed ANN made only 7 mistakes, achieving a
Fig. 15 EA convergence plot for the optimized values of the ANN

Table 3 ANN trained weights optimized through EA

| Weight | Value |
|---|---|
| ω0 | 0.0495 |
| ω1 | 11.0116 |
| ω2 | −2.1089 |
| ω3 | −0.3638 |
precision of over 99%. Therefore, in order to validate the designed ANN against the remaining 9000 irradiation test points from the other nine tests, the following subsection presents the Simulink implementation of the perceptron for solar irradiation changes detection, which seeks to enable other users to easily fit the designed ANN into their own PV simulations.
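The chapter's training code is in MATLAB (Sect. 4.1); as a language-agnostic illustration, the population-based loop that minimizes the error count E can be sketched as follows. This is a generic stand-in, not the Earthquake Algorithm's actual epicenter update rules, and the perceptron input mapping and toy data are assumptions:

```python
import random

# Generic population-based training loop that minimizes the error count E,
# as a stand-in for the Earthquake Algorithm (the EA's actual update rules
# are not reproduced here). Population of 100 candidates, 100 iterations,
# fitness = number of misclassified samples, as described in the text.
random.seed(1)

def perceptron(w, x):
    # x = (|dP[k]|, |dP[k-1]|, |dP[k-2]|): current and two past inputs
    s = w[0] + w[1] * x[0] + w[2] * x[1] + w[3] * x[2]
    return 1 if s >= 0 else 0

def error_count(w, data):
    return sum(perceptron(w, x) != y for x, y in data)

def train(data, pop=100, iters=100):
    best = [random.uniform(-1, 1) for _ in range(4)]
    for _ in range(iters):
        for _ in range(pop):
            cand = [b + random.gauss(0, 0.3) for b in best]
            if error_count(cand, data) < error_count(best, data):
                best = cand
    return best

# Hypothetical toy data: label 1 when the newest (scaled) |dP| exceeds 2.0.
data = [((dp, 0.0, 0.0), 1 if dp > 2.0 else 0)
        for dp in [0.1, 0.5, 1.0, 3.0, 5.0, 1.5, 4.0]]
w = train(data)
print(error_count(w, data))   # final error count E on the toy data
```

The discrete error count E as fitness, rather than a smooth loss, is what makes population-based metaheuristics a natural fit here: there is no gradient to follow.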
5.2 Simulink Implementation for Validation

As validated in [3, 29], power electronic system simulations have been widely used as conceptual testbeds for many algorithms regarding PV systems and DC-DC converter design, since they have proven in several works to reproduce the dynamic behavior of the real system with high fidelity. Hence, the PV system simulation was performed with components from the Simscape Power Systems™ library. As proposed in [3], the Ordinary Differential Equations solver 4 (ode4, Runge-Kutta) was selected for the simulation; the training data was obtained using a 1 µs sampling time, while the ANN used a 1 ms sampling time for the solar irradiation changes detection training runs, in order to match a more common sample time for MPPT algorithms (as validated in [11]). Nevertheless, the validation test runs in Simulink were
Fig. 16 Simulink circuit diagram for the case study
Fig. 17 Blocks diagram from the PV system located in the subsystem block from Fig. 16
performed using a 0.01 s sample time, in order to introduce another degree of uncertainty with respect to the training data. The connection diagram from MATLAB/Simulink is shown in Fig. 16, where the main elements of the circuit topology presented in Fig. 10 can easily be identified. It is worth mentioning that all the irradiation test runs begin and end at a solar irradiation of 800 W/m², which is why the duty cycle of the converter is initialized at 0.48 (the MPPT duty cycle when G = 800 W/m²), since those parameters allow calibrating the ANN in the range closest to the nominal operating values of the system. Moreover, in order to explore the applied subsystems in greater detail, Fig. 17 shows the block diagram inside the PV system subsystem block from Fig. 16, where the PV array block is calibrated as shown in Table 2. Meanwhile, also from Fig. 16, the Boost converter subsystem contains the connections shown in Fig. 18, where the components of the circuit were configured with the parameters presented in Table 1.
Fig. 18 Blocks diagram from the boost converter subsystem in Fig. 16
Fig. 19 Implemented ANN structure for the Simulink validation, which is in the ANN subsystem in Fig. 16
On the other hand, the most relevant block in Fig. 16 is the ANN irradiation detector, which is clearly highlighted there. Since Simulink does not provide an ANN perceptron block for power system simulations, the authors addressed the issue by developing the Simulink subsystem shown in Fig. 19, which implements the designed Artificial Neural Network for irradiation changes detection. Figure 19 highlights how the network takes the Ppv signal as input, which it uses to estimate the change in power through time, while the memory blocks save the previous values of the Ppv and ΔPpv signals, just as shown in Fig. 6. Additionally, Fig. 19 shows where ω0, ω1, ω2 and ω3 can be modified if needed. Meanwhile, the Zero-Order Hold blocks at the beginning and end of the ANN structure in Fig. 19 ensure the fixed sampling time for the validation tests (0.01 s). The following section presents the most relevant results of the validation test runs, together with the quantification of the detection precision obtained by the ANN.
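Outside Simulink, the forward pass of the Fig. 19 structure reduces to a single weighted sum and a threshold. A sketch using the Table 3 weights; the exact input ordering and the 0.5 activation threshold are assumptions for illustration, not confirmed by the chapter:

```python
# Forward pass of the irradiation-change perceptron with the Table 3 weights.
# Assumptions (not confirmed by the chapter): inputs are the current and two
# previous |dP| samples, and the output fires when the sum exceeds 0.5.
W0, W1, W2, W3 = 0.0495, 11.0116, -2.1089, -0.3638

def detect_change(dp_k, dp_k1, dp_k2):
    """Return 1 (irradiation change detected) or 0 (ripple only)."""
    s = W0 + W1 * dp_k + W2 * dp_k1 + W3 * dp_k2
    return 1 if s >= 0.5 else 0

print(detect_change(0.06, 0.0, 0.0))    # a large power change fires
print(detect_change(0.005, 0.0, 0.0))   # a ripple-sized change does not
```

Note how the negative weights on the two past samples suppress re-triggering on the tail of a change already being processed.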
6 Results

As explained before, after training the ANN through the EA to detect changes in solar irradiation, the training process required only 1000 simulation points to achieve the optimized behavior. This section presents the validated results of the ANN as a solar irradiation changes detector, obtained through the other nine test runs with different irradiation profiles (as shown in Fig. 11), which allows validating the trained ANN model against another 9000 data points. Moreover, in order to graphically expose the performance of the designed ANN, Fig. 20 shows the behavior of the network on the Test 3 profile, where the first curve is the power profile obtained after the irradiation changes, and the next plot shows the resulting ΔPpv profile (also analyzed in Fig. 13). The last plot compares the constructed profile taken as the expected output for the ANN with the actual ANN output through the simulated scenario; as stated before, the logic-1 level represents the ANN detecting irradiation changes in the evaluated ΔPpv data, while logic-0 represents the ANN output deactivated because there are no irradiation changes.
Fig. 20 Power and ΔPpv curves from the Test 3 profile, together with the comparison between the expected output and the ANN output
Focusing on the ANN output plot in Fig. 20, the continuous line represents the expected output of the ANN, while the segmented line corresponds to the profile the ANN achieved. Comparing the expected output of the system with the ANN output shows the high fidelity of the proposal with respect to what is expected; however, it can also be observed that most of the errors were not found in the activation of the ANN itself, but in the response time, since the change detection takes between one and two sampling periods. The error signal induced by these sampling periods can be seen in greater detail in Fig. 20 exactly at t = 2 s, where the irradiation change is detected, but only after two sampling periods; this supports the proposal of using two previous measurements of ΔPpv, since it completes the picture the ANN requires to determine whether the detection is noise or a real irradiation change. On the other hand, in order to avoid overexploiting the visual resources with figures similar to Fig. 20, which allow analyzing the fidelity of the neural network with respect to the expected behavior, a numerical quantification of the validation runs is presented below.
6.1 Quantification

To validate the performance of the designed ANN, Table 4 shows a complete summary of the test runs and the accuracy the network achieved on them. All the tests were performed with the same sample time, and the accuracy behaves consistently even under the different irradiation profiles evaluated. On the other hand, it can be seen that the greatest errors were obtained in Tests 6, 7 and 9, which share common elements in their irradiation profile variations (as shown in Figs. 11 and 12): all three of them have unrealistic irradiation
Table 4 Error data quantification

| Test run | Samples | Errors | Accuracy (%) |
|---|---|---|---|
| Test 1 | 1000 | 7 | 99.3007 |
| Test 2 | 1000 | 3 | 99.7003 |
| Test 3 | 1000 | 7 | 99.3007 |
| Test 4 | 1000 | 8 | 99.2008 |
| Test 5 | 1000 | 6 | 99.4006 |
| Test 6 | 1000 | 11 | 98.9011 |
| Test 7 | 1000 | 11 | 98.9011 |
| Test 8 | 1000 | 6 | 99.4006 |
| Test 9 | 1000 | 11 | 98.9011 |
| Test 10 | 1000 | 8 | 99.2008 |
steps, which enable testing the reliability of Maximum Power Point Tracking algorithms under very extreme conditions, as validated in [3]. Hence, to represent the aggressive step changes from Tests 6, 7 and 9, Fig. 21 shows, isolated from the other test curves, the results of the irradiation profile presented as Test 9 (from Fig. 11), which is mainly composed of step changes in solar irradiation. Such steps are non-typical of real sun irradiation, but they allow evaluating the performance of the ANN detector against very aggressive ΔPpv changes. It can be clearly seen that, in terms of the dynamic behavior of ΔPpv, the output of the ANN has a very accurate response. In an ideal step case, the solar irradiation change should be represented only by a peak in the power change; but in real PV systems ΔPpv also depends on how quickly the capacitor and inductor elements of the DC-DC converter can deliver energy or recharge themselves (as described in [31]). That is why the step changes in solar irradiation are not simple peaks in the ΔPpv of Fig. 21: the peak is translated into a slim pulse in the ΔPpv signal. Therefore, the ANN gets confused and gives false activation states during the time it takes the system to stabilize back into a normal state. Moreover, the ΔPpv pulse width after the step irradiation changes in Fig. 21 is on average around 0.08 s, which corresponds to 8 samples of the ANN in this case study; still, the detector achieved only 2 (averaged) errors against the step changes. Then, in order to provide a hard-data analysis of the number of mistakes, Table 5 breaks down the errors from Table 4, validating whether the mistakes come from false triggers or missed events.
Furthermore, it can be seen that, as analyzed before and validated through Fig. 21, the aggressive step changes in solar irradiation mostly present in Tests 6, 7 and 9 induced more false activation errors than the other tests with finer changes. Nevertheless, it is important to highlight that the aggressive and fast irradiation changes were detected 100% of the time, which translates into the ability to detect even the fastest changes in solar irradiance, pushing the boundaries of this algorithm to detect all changes with little noise at the output. Additionally, the information in Table 5 allows validating the plurality of the test profiles, as shown in Fig. 11. It can also be observed from Table 5 that the only irregular error behavior is found in the number of false activations, since false activations dominate the total error sum due to the already described aggressive irradiation changes present in some of the test profiles. On the other hand, in order to provide an overall quantification of all the tested data points and all the acquired errors, Table 6 summarizes the average performance of the ANN irradiation changes detector. It can be concluded from the overall data summary in Table 6 that the provided method for developing an ANN capable of detecting changes in solar irradiation using only the power measurements from the PV array was 99.1800% accurate in terms of the precision achieved by the detector, which makes it a reliable method for the task. In addition, it is worth mentioning that, as future work, the authors are working on the implementation of the designed ANN as part of the EA-based MPPT algorithm
Fig. 21 Power and ΔPpv curves from the Test 9 profile, together with the comparison between the expected output and the ANN output

Table 5 False activations and missed changes

| Test run | Errors | False activations | Missed changes |
|---|---|---|---|
| Test 1 | 7 | 2 | 5 |
| Test 2 | 3 | 0 | 3 |
| Test 3 | 7 | 3 | 4 |
| Test 4 | 8 | 6 | 2 |
| Test 5 | 6 | 4 | 2 |
| Test 6 | 11 | 10 | 1 |
| Test 7 | 11 | 7 | 4 |
| Test 8 | 6 | 4 | 2 |
| Test 9 | 11 | 11 | 0 |
| Test 10 | 8 | 5 | 3 |
| Total | 82 | 56 | 26 |
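The split in Table 5 can be computed by comparing the expected and actual output sequences sample by sample; a minimal sketch (Python illustration; the chapter's processing is done in MATLAB):

```python
# Splitting detector errors into false activations and missed changes,
# as in Table 5, by comparing expected and actual output sequences.
def error_breakdown(expected, actual):
    """Return (false_activations, missed_changes) for 0/1 sequences."""
    false_act = sum(1 for e, a in zip(expected, actual) if e == 0 and a == 1)
    missed = sum(1 for e, a in zip(expected, actual) if e == 1 and a == 0)
    return false_act, missed

exp = [0, 0, 1, 1, 0, 0, 1, 0]
act = [0, 1, 1, 0, 0, 0, 1, 0]   # one false trigger, one missed change
print(error_breakdown(exp, act))  # (1, 1)
```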
Table 6 Overall data quantification

| Total test points | 10,000 |
| Total number of errors | 82 |
| Average accuracy (%) | 99.1800 |
| Average error (%) | 0.8200 |
| Error due to false activations (%) | 0.5600 |
| Error due to missed changes (%) | 0.2600 |
tested in [3], which would enable an efficient solution that could be implemented for energy harvesters and proficient battery chargers fed from PV sources. Consequently, the following section presents the final conclusions of this work.
7 Conclusion

Starting from the primary objective of this chapter, it was first possible to demonstrate the power of metaheuristic algorithms for an optimal ANN training process, where the Earthquake Algorithm was once again validated as a tuning method. Consequently, the objective of developing a reliable solar irradiation detector was completely fulfilled, since the achieved accuracy of over 99% validates the proposal. This proposal takes advantage of the low mathematical complexity required for the ANN implementation: as shown by the structure presented in Fig. 19, the proposed Artificial Neural Network can be implemented with only a few Simulink blocks. The low implementation complexity and the modest mathematical operation requirements therefore make this proposal ideal for implementation in PV systems. Moreover, this work finds its greatest relevance in systems that require a reliable activation signal indicating when the Global Maximum Power Point is moving due to temperature or solar irradiation conditions; as validated in Table 6, the proposed ANN has great accuracy with low probabilities of missed events and false activations. This proposal is thus validated as a solution for MPPT algorithms that need to know when to stop searching for the MPP (once there are no more condition changes), or for algorithms that simply cannot work properly without a reset signal upon temperature and irradiance parametric changes; such algorithms, e.g. PSO-based and EA-based MPPT algorithms, converge on a global solution after some iterations and need to know when to start looking for the MPP again to avoid getting stuck in old MPP points.
Moreover, this work also enables other users to adapt the EA algorithm to other applications and to implement ANNs in Simulink environments. Both achievements are supported by the presentation and exploration of a complete MATLAB code example for the implementation of the EA as an ANN training method (used for
this particular application), in addition to the complete Simulink testbed design with a full description of the ANN implementation for the Simulink environment; this, in turn, enables adapting the EA for other optimization tasks across different research topics. On the other hand, this work proves the implementation capabilities of the solar irradiation changes detection method, with simple reproducibility of the results through the provided information. The applied ANN solution showed that, at low computational cost, an effectiveness of over 99% is achieved, which makes it an implementable algorithm for embedded systems due to the little mathematical complexity it requires. The main limitation of the proposal is that its speed in detecting changes in solar irradiance is bounded by the sampling period, given that in the event of sudden changes it takes approximately one or two sampling periods to detect the change. However, this limitation is not very critical, given that for these types of applications the sampling period is around 1 ms. Consequently, the features of this proposal enable implementing the ANN alongside metaheuristic algorithms as dynamic optimizers for MPPT, due to the high reliability shown by this algorithm. Hence, as future work, the authors are working on the implementation of the designed ANN as part of the EA-based MPPT algorithm tested in [3], which would enable an efficient and reliable solution suitable for energy harvesters and proficient battery chargers fed from PV sources.
References

1. IEA, Energy Policies Beyond IEA Countries: Mexico 2017. Tech. rep. (International Energy Agency, 2017). https://www.iea.org/reports/energy-policies-beyond-iea-countries-mexico-2017
2. IEA, Global Energy Review 2020. Tech. rep. (International Energy Agency, 2020). https://www.iea.org/reports/global-energy-review-2020
3. E. Mendez et al., Improved MPPT algorithm for photovoltaic systems based on the earthquake optimization algorithm. Energies 13(12), 3047 (2020)
4. IEA, Renewable Energy Market Update. Tech. rep. (International Energy Agency, 2020). https://www.iea.org/reports/renewable-energy-market-update
5. E. Dupont, R. Koppelaar, H. Jeanmart, Global available solar energy under physical and energy return on investment constraints. Appl. Energy 257, 113968 (2020)
6. IEA, Clean Energy Innovation. Tech. rep. (International Energy Agency, 2020). https://www.iea.org/reports/clean-energy-innovation
7. IEA, World Energy Outlook 2019. Tech. rep. (International Energy Agency, 2019). https://www.iea.org/reports/world-energy-outlook-2019
8. A. Nakpin, S. Khwan-on, A novel high step-up DC-DC converter for photovoltaic applications. Procedia Comput. Sci. 86, 409–412 (2016)
9. K.M.S.Y. Konara, M. Kolhe, A. Sharma, Power flow management controller within a grid connected photovoltaic based active generator as a finite state machine using hierarchical approach with droop characteristics, in Renewable Energy (2020)
10. S. Issaadi, W. Issaadi, A. Khireddine, New intelligent control strategy by robust neural network algorithm for real time detection of an optimized maximum power tracking control in photovoltaic systems. Energy 187, 115881 (2019)
11. N. Femia et al., Power Electronics and Control Techniques for Maximum Energy Harvesting in Photovoltaic Systems (CRC Press, Boca Raton, Florida, 2017)
12. M.A. Sahnoun, H.M. Ugalde, J.C. Carmona, J. Gomand, Maximum power point tracking using P&O control optimized by a neural network approach: a good compromise between accuracy and complexity. Energy Procedia 42, 650–659 (2013)
13. S. Meddour et al., A novel approach for PV system based on meta-heuristic algorithm connected to the grid using FS-MPC controller. Energy Procedia 162, 57–66 (2019)
14. K. Ishaque et al., An improved particle swarm optimization (PSO)-based MPPT for PV with reduced steady-state oscillation. IEEE Trans. Power Electron. 27(8), 3627–3638 (2012)
15. A.M. Eltamaly, H.M. Farh, Dynamic global maximum power point tracking of the PV systems under variant partial shading using hybrid GWO-FLC. Solar Energy 177, 306–316 (2019)
16. A.M. Eltamaly, M.S. Al-Saud, A.G. Abo-Khalil, Performance improvement of PV systems' maximum power point tracker based on a scanning PSO particle strategy. Sustainability 12(3), 1185 (2020)
17. A.M. Eltamaly, H.M.H. Farh, M.S. Al Saud, Impact of PSO reinitialization on the accuracy of dynamic global maximum power detection of variant partially shaded PV systems. Sustainability 11(7), 2091 (2019)
18. E. Mendez et al., Mobile phone usage detection by ANN trained with a metaheuristic algorithm. Sensors 19(14), 3110 (2019)
19. E. Alba, R. Martí, Metaheuristic Procedures for Training Neural Networks, vol. 35 (Springer Science & Business Media, 2006)
20. P. Ponce-Cruz et al., A Practical Approach to Metaheuristics Using LabVIEW and MATLAB® (Chapman and Hall/CRC, 2020). https://doi.org/10.1201/9780429324413
21. E. Mendez et al., Electric machines control optimization by a novel geo inspired earthquake metaheuristic algorithm, in Nanotechnology for Instrumentation and Measurement (NANOfIM) (IEEE, 2018), pp. 1–6
22. R. Teodorescu, M. Liserre, P. Rodriguez, Grid Converters for Photovoltaic and Wind Power Systems, vol. 29 (Wiley & Sons, London, 2011)
23. F.L. Luo, Y. Hong, Renewable Energy Systems: Advanced Conversion Technologies and Applications (CRC Press, Boca Raton, 2017)
24. W.S. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5(4), 115–133 (1943)
25. E. Mendez et al., ANN Based MRAC-PID Controller Implementation for a Furuta Pendulum System Stabilization
26. S. Agatonovic-Kustrin, R. Beresford, Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J. Pharm. Biomed. Anal. 22(5), 717–727 (2000)
27. C.E. Choong, S. Ibrahim, A. El-Shafie, Artificial Neural Network (ANN) model development for predicting just suspension speed in solid-liquid mixing system. Flow Meas. Instrum. 71, 101689 (2020)
28. B. Jamali et al., Using PSO-GA algorithm for training artificial neural network to forecast solar space heating system parameters. Appl. Therm. Eng. 147, 647–660 (2019)
29. E. Mendez et al., Novel design methodology for DC-DC converters applying metaheuristic optimization for inductance selection. Appl. Sci. 10(12), 4377 (2020)
30. E. Mendez-Flores et al., Design of a DC-DC converter applying earthquake algorithm for inductance selection, in ICAST 2019: 30th International Conference on Adaptive Structures and Technologies (2019), pp. 157–158
31. R.W. Erickson, D. Maksimovic, Fundamentals of Power Electronics (Springer Science & Business Media, 2007)
Genetic Algorithm Based Global and Local Feature Selection Approach for Handwritten Numeral Recognition

Sagnik Pal Chowdhury, Ritwika Majumdar, Sandeep Kumar, Pawan Kumar Singh, and Ram Sarkar
1 Introduction

Handwritten numeral recognition has been a widely acknowledged research field since the 1980s. In the domain of pattern recognition and image processing, this problem is considered among the major benchmark research problems, and development in this area is integral to the enhancement of the man-machine interface. The recognition of handwritten numerals differs greatly from that of printed numerals. Printed numerals are uniform in size, shape and position in a given font, whereas the same cannot be assured for their handwritten counterparts, since all of these parameters vary from one writer to another depending on writing style, background and educational qualifications. Hence, handwriting is, in general, non-uniform in nature. As a result, detection of handwritten numerals is among the most challenging yet popular research areas, and it has attracted researchers for a long time.

Handwritten numeral recognition can be broadly divided into two categories: offline and online. Offline numeral recognition utilizes a raster image taken from different digital input sources. Binarization of the image is done through appropriate threshold techniques on the basis of the grayscale patterns, such that each image pixel is either '1' (foreground) or '0' (background).

S. P. Chowdhury · R. Majumdar · S. Kumar
Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, Howrah 711103, West Bengal, India

P. K. Singh (B)
Department of Information Technology, Jadavpur University, Jadavpur University Second Campus, Plot No. 8, Salt Lake Bypass, LB Block, Sector III, Salt Lake City, Kolkata 700106, West Bengal, India

R. Sarkar
Department of Computer Science and Engineering, Jadavpur University, 188, Raja S.C. Mullick Road, Kolkata 700032, West Bengal, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
D. Oliva et al. (eds.), Metaheuristics in Machine Learning: Theory and Applications, Studies in Computational Intelligence 967, https://doi.org/10.1007/978-3-030-70542-8_30

In online recognition,
the system is fed the data as it is produced, and the recognition of an input numeral can be carried out simultaneously if necessary. A stylus, in contact with a pressure-sensitive input device, sends a string of (x, y) coordinate pairs. The recognition of offline handwritten numerals is trickier than the online case because of the noise introduced during image acquisition, and because information such as stroke sequence and pen positions is lost once the image of a handwritten document has been acquired. The sequential and dynamic information extracted from these pen movements acts as a significant feature for solving the online numeral recognition problem; the absence of such information makes offline recognition a challenging task. In our study, we deal with offline handwritten numerals, where a device optically scans the writing, which is then saved as an image. This kind of recognition is dedicated to automation in processing bank cheques, sorting postal mail and packages through the use of pin codes, evaluating examination tests with multiple-choice questions, and so on. Due to the vast differences in handwriting styles, there arise some key challenges in creating correct and precise recognition systems. Feature extraction thus develops into an arduous task, which drives researchers to find methods to recognize handwritten digits with the utmost accuracy. As a result, this problem is still an active research area seeking methods to increase recognition accuracy [1, 2]. Previous works on recognition of handwritten digits dealt with the Roman [3] script, mainly related to English, some European languages, and Asian languages like Chinese [4]. In India, languages like Devanagari, Bangla, Odia and Tamil have started to gain traction [5]. Hindi and Bangla are the two most popularly spoken languages in India; Hindi ranks third in the world in terms of the number of speakers.
Bangla, the second most popular language in India, is also the national language of Bangladesh, and it ranks sixth in the world with about 207 million speakers [6]. In India, Bangla is one of the commonly spoken languages in the eastern states. Despite all this popularity, there have been very few research works in the literature on handwritten numeral recognition for these two languages. In view of these facts, the present work concentrates on the recognition of Bangla and Hindi handwritten numerals. Due to our colonial past as well as the diversity of languages and scripts in India, English is used as a binding language. After Hindi, English ranks second as the most spoken language in the Indian sub-continent; with about 341 million speakers, it is also the fourth most spoken language in the world. Most official work is performed using both a regional language and English, and English is taught in most schools, colleges and universities. So, in the present work, we have also considered English numerals (written in the Roman script). Handwritten samples of Bangla, Hindi and Roman numerals considered in the present work are shown in Fig. 1. Generally, the steps in a handwritten numeral recognition system are preprocessing, followed by feature extraction and classification. Document pages containing handwritten numerals are first scanned, and each individual numeral is extracted from them. These extracted digits are then filtered (to remove any noisy
Genetic Algorithm Based Global …
747
Fig. 1 Examples of numeral sample written in: a Bangla, b Devanagari, and c Roman scripts
pixels) and resized to serve as the input data for our experiment. Next, feature extraction is done via the Histogram of Gradients (HOG) technique and local distancebased features. At first, features based on HOG are extracted from the handwritten numerals and the classification is done using Multi-Layer Perceptron (MLP) classifier. The confusion matrix generated, is analyzed to investigate the maximum number of misclassifications found among the numerals. Based on this matrix, the numerals showing the maximum number of misclassifications among each other are grouped together. The feature extraction is once again performed using the combination of HOG and local distance-based features. The running time and cost of a recognition method is increased due to each feature that is used for classification. Hence, we strive to create and execute a system which has minimized feature sets. On the other hand, we also have to include ample feature sets to obtain high recognition rates under challenging conditions. This has led to the development of a variety of optimization techniques for finding an “optimal” subset of features from a larger set of possible features. Machine learning techniques, which generate useful classification procedures, can be significantly improved if we can effectively find “optimal” feature subsets from the point of view of size and performance. In our research, we have used an adaptive feature selection strategy using GA which can significantly reduce the number of features required for intra-numeral classification of handwritten numerals formed within the groups. The MLP classifier is again used
748
S. P. Chowdhury et al.
Fig. 2 Flowchart of the proposed handwritten numeral recognition system
for the classification purpose. On obtaining the optimal feature sets using GA for each intra-numeral group, the numerals are separately classified to their respective numeral classes. The schematic diagram representing the key steps of the proposed approach is shown in Fig. 2. This proposed system is applied for three most popular scripts used in Indian subcontinent viz., Devanagari, Bangla, and Roman. The key advantage of this approach includes the ability to accommodate two important criteria such as number of features and accuracy of classifiers in order to adequately represent the handwritten numeral recognition problem. The rest of the paper is organized as follows: Sect. 2 describes a brief study of previous methods related to handwritten numeral recognition methods of Bangla, Devanagari and Roman scripts. Section 3 explains the collection process of input numeral databases for evaluating the proposed work. Section 4 presents the proposed methodology for recognition of handwritten numerals whereas experimental results are reported in Sect. 5. The final summary of the work and some possible future scopes are mentioned in Sect. 6.
2 Literature Survey Digit recognition is a subfield of character recognition and a subject of substantial importance since the early years of research in the field of handwriting recognition. There are a lot of methodologies to solve this problem, as proposed in the literature. Most of the knowledge provided by these investigations may also be applied to character and word recognition. In this section, we review some of the research articles related to the handwritten recognition of Bangla, Devanagari and Roman scripts.
Genetic Algorithm Based Global …
749
The method proposed by Basu et al. [7] divided the digit image into 9 overlapping sub-images of fixed size. Then, from each of these sub-images, they computed locally, the longest run-based feature. A MLP based classifier was used to test this approach on Bangla numeral dataset and achieved a recognition accuracy of 96.65%. To recognize handwritten Bangla numerals, Wen et al. [8] incorporated a kernel and Bayesian Discriminant based technique, whereas Nasir et al. [9] recommended a hybrid system for the same to be used for automated postal system. This carried out feature extraction using k-means clustering, Bayes theorem and maximum of posteriori and finally using Support Vector Machine (SVM) classifier. Surinta et al. [10] proposed the usage of a feature-set, for example, the outline of the handwritten image computed using 8-directional codes, distance measured between black pixels and hotspots, and the intensity of pixel points of small blocks. Now, these features were fed to a nonlinear SVM classifier individually, and the final conclusion was obtained on the basis of majority voting. For handwritten Bangla numeral recognition, a framework is presented in the study of Khan et al. [11], using Sparse Representation Classifier. To classify the Bangla digits, this classifier is applied on the image zone density, which is an image domain statistical feature extracted from the character image. This unique method for Bangla Optical Character Recognition (OCR) displays an outstanding accuracy of 94% on the off-line handwritten Bangla digit database CMATERdb 3.1.1. Akhand et al. in [12] investigated a Convolutional Neural Network (CNN) based handwritten Bangla numeral recognition system. The proposed system uses moderate preprocessing technique on the images of handwritten numbers by generating patterns from them, after which CNN is used to classify individual numerals. It does not utilize any feature extraction methods which is generally seen in other related works. 
Recently, Singh et al. in [13] reported a comprehensive survey especially for Bangla handwritten numeral recognition. The first research report on the recognition of Devanagari numerals was issued in 1977 [14]. However, for the next 30 years, less number of significant works has been reported. Some researchers have proposed various methods in recent times on handwritten Devanagari characters. Bhattacharaya et al. [15] put forward a classification approach based on MLP neural network for Devanagari numerals with an accuracy of 91.28%. They employed multi-resolution features based on wavelet transforms. Hanmndlu and Murthy [16] proposed a fuzzy model-based recognition for handwritten Devanagari numbers, where normalized distance was used as a feature for individual boxes and this resulted in 92.67% accuracy. For isolated handwritten Devanagari numerals, Singh et al. in [17] suggested an automatic recognition system. Feature extraction methods are based on the topological and geometrical properties of the character and the structure of the character image. Based on different levels of granularity, they have used the recursive subdivision of the handwritten image to extract the features. At each level, there are vertical and horizontal lines which split the handwritten image into 4 quadrants of sub-images, consisting of nearly the same quantity of foreground pixels. The intersection of these lines denotes a point, on the basis which, features are extracted. This image division method results in 4 and 16 sub-images. On each level initially, the features are calculated and the SVM Classifier is used to determine the highest recognition rate. The subsequent outcomes
750
S. P. Chowdhury et al.
are compared with the Quadratic and k-NN classifier. Another system for the recognition of isolated handwritten Devanagari numerals has been proposed by Aggarwal et al. [18]. The recommended method divides the sample image into sub-blocks. In these sub-blocks the strength of gradient is accumulated in 8 standard directions in which gradient direction is broken down resulting in a feature vector with a dimensionality of 200. For the classification, SVM classifier is utilized. In the research carried out by Singh et al. [19], a robust offline Devanagari handwritten recognition system is introduced using an amalgamation of global and local features. Structural features like cross point, endpoint, ‘C’ shaped structure, ‘U’ shaped structure, loop centroid, and inverted C shaped structure constitute global features. On the other hand, the zone-wise calculated distance of thinned image from geometric centroid and histogram-based features are combined to form the local features. Since the numerals are written by hand, there is bound to be variation in writing style. As a pre-processing step prior to feature extraction, this variation is managed by size normalization and normalization to constant thickness. For the classifier to be used for recognition, they have employed an Artificial Neural Network with an average accuracy rate of 95% or higher. Another feature extraction method as recommended by Prabhanjan et al. [20] is the Uniform Local Binary Pattern (ULBP) operator. Though this operator exhibited good performance in texture classification and object recognition, it is not used in Devanagari handwritten character/digit recognition. This suggested method works by extracting both the local and global features, and is carried out in two steps. The first step is noise-removal and this pre-processed image is converted to binary image and normalized to a fixed size of 48 by 48. 
The second step sees the application of the ULBP operator to the image in order to extract global features. The input image is thereafter split into 9 blocks and to extract local features, the operator is applied on each of the blocks. Lastly, global and local features are used for training the SVM classifier. Recently, Singh et al. [21] proposed a new feature extraction method known as Regional Weighted Run Length (RWRL) having a dimension of 196 elements, for handwritten Devanagari numeral recognition. For the recognition of handwritten Roman numerals, Cao et al. [22] propose zone-based direction histogram feature. This research was mainly driven by twostage classifier scheme consisting of two different neural networks. The lowest error rate was 0.17% with 14.5% rejection rate. To recognize offline handwritten English digits, Prasad et al. in [23] used a rotation invariant feature extraction scheme. Hybrid feature extraction method comprises features due to moment of inertia and projection features. They used a Hidden Markov Model (HMM) based classifier as the recognizer. Salouan et al. [24] presented isolated handwritten Roman numerals recognition on the basis of the combination of zoning method, Radon transform, Hough transform and Gabor filter. The performances of their individual and combined features are compared with respect to their accuracy and time complexity. The initial comparison is obtained between four hybrid methods used to extract the features from numbers. First, the zoning is combined with Radon transform, for the second, it is combined with Hough transform, for the third, it is next combined with Gabor and for the fourth, combined with all these three descriptors. On the other hand, the other comparison
Genetic Algorithm Based Global …
751
is implemented between three classifiers—first one is neuronal or the MLP classifier, second is probabilistic or HMM and the third classifier is a combination of the first and second classifiers. Qacimy et al. [25] examine the effectiveness of the four Discrete Cosine Transform (DCT) based feature extraction methodologies—first, the DCT upper left corner (ULC) coefficients, second, DCT zigzag coefficients, third, block based DCT ULC coefficients and finally, block based DCT zigzag coefficients. Feature extraction was conducted on the MNIST database [26] by inputting the coefficients of each DCT variant to the SVM classifier. It was determined that the recognition accuracy of block based DCT zigzag feature extraction was higher at a rate of 98.76%. It can be seen from the above literature analysis that a lot of works have been done for numerals written in a single script. However, some works, described in [27–31] have been performed for numerals written in multiple scripts. This is done in order to develop script invariant methods for handwritten numeral recognition. Furthermore, few works have also been reported based on feature selection for the aforementioned problem. In the year 2018, Ghosh et al. [32] proposed a feature selection method known as Histogram-Based Multi-objective GA (HMOGA) for handwritten Devanagari numeral recognition This feature selection approach was improved by Guha et al. [33], by introducing a modified version of HMOGA named Memory-Based HMOGA (M-HMOGA) to solve this handwritten digit classification problem for Bangla, Devanagari and Roman scripts. Ghosh et al. [34] tried an innovative approach by applying union-based ensemble approach of three popular filter methods, namely Mutual Information (MI), ReliefF and Chi-square to reduce the feature dimension of HOG descriptor extracted form Bangla, Hindi and Telugu script numerals. 
Few works, described in [35–37], have also been reported for feature selection in handwritten numeral recognition from multiple scripts. Recently, some researchers are also moving towards deep learning approach to tackle this state-of-the-art problem. Mukhoti et al. in [38] used deep learning to classify handwritten digits in Bangla and Hindi scripts. But there are limitations using this technique as it requires very large amount of data in order to perform better than other techniques. It is also extremely expensive to train due to complex data models. Moreover, deep learning requires expensive GPUs and hundreds of machines. Keeping this in mind, in the work done by Ghosh et al. [39], a script invariant feature vector is designed based on the concept of the DAISY descriptor and applied on handwritten digits written in four different scripts namely Arabic, Bangla, Devanagari and Roman. It is found to be computationally inexpensive approach when compared to other state-of-the-art prevalent deep learning architectures like Long Short Term Memory (LSTM) networks or (CNN). Motivated by the above facts, we are also using traditional machine learning techniques to tackle this problem of handwritten digit recognition.
752
S. P. Chowdhury et al.
3 Collection and Pre-processing of Numeral Databases Handwritten sample collection, containing differences in handwriting styles, is the main objective of gathering data. This is required in order to get an accurate assessment of any feature extraction methodology. 10,000 samples of handwritten numerals are considered for Roman script taken from the benchmark HDRC 2013 [40] database. Even though there are standard benchmark databases for Bangla and Devanagari numerals [25], the sample size is small. Hence, we have prepared an inhouse database of 10,000 handwritten numerals written in Bangla and Devanagari. Each person is asked to write 4 samples of numerals (0–9) each, and a total of about 250 people are involved in the data collection process who belong to varying age, sex, educational qualification, profession etc. Samples are collected on datasheets consisting of pre-defined rectangular grids in which they had to write the numerals using a blue or black colored pen. A datasheet sample containing Bangla numerals is demonstrated in Fig. 3. Each datasheet contains exactly 15 samples of handwritten numerals (0 through 9). For the present work, we have considered 10,000 numerals for every script namely, Devanagari, Bangla and Roman.
3.1 Pre-processing Pre-processing involves the initial processing of the image, so that it can be used to make the further processing easier for an input to the recognition system. At first, the datasheets containing Bangla or Devanaagri numerals are captured on a flatbed scanner with a resolution of 600 dpi and are stored in bitmap file format. Now,
Fig. 3 a Scanned datasheet containing handwritten Bangla numerals; b Image obtained after trimming the outer frame and column headers; c Image divided into 10 × 15 cells to get 150 isolated numeral images
Genetic Algorithm Based Global …
753
Fig. 4 Illustration of: a image resizing, b binarizing, and c image thinning for handwritten Bangla numeral image ‘1’
there are four below mentioned sub-processes involved for the processing of input numerals.
3.2 Extraction of Individual Digits Firstly, we get rid of the noise using Gaussian filter [41]. Then, both the column headers (located across the top of the datasheet) as well as outer frame margin are removed. This is done by dividing the frame margins into pre-defined cells of size 10 × 15 (shown in Fig. 3). The individual digits obtained are saved as “B_data#####.bmp” or “D_data#####.bmp” depending upon Bangla or Devanaagri numerals respectively. Here, “#####” represents the naming scheme of the numeral images.
3.3 Image Resizing The input numeral images may be of varying sizes which can affect the recognition results. Therefore, the minimally bounding rectangular box of each numeral image is normalized to 32 × 32 pixels separately, as shown in Fig. 4a.
3.4 Image Binarization The process of binarization converts a grayscale image to its binary counterpart using Otsu’s binarization methodology [41]. This turns out to be useful when features are extracted using the feature extraction module. An example is illustrated in Fig. 4b.
754
S. P. Chowdhury et al.
The following is a noteworthy point—the current process is applicable for extraction of local distance-based features.
3.5 Image Thinning The final step is used to reduce a handwritten digit of thick lines into thinner lines, thus making it easier for its feature extraction, as shown in Fig. 4c. In the present work, this technique is applied before the extraction of local distance-based features.
4 Design of Feature Set After pre-processing, the extraction of features is done to estimate the most pertinent characters of individual numeral classes to be used for the recognition stage. These features which have been extracted for our can be grouped into two main classes: global features and local features. Global features define the image in its entirety. Local features are extracted from the sub-regions or local regions and describe the most important sub regions in the image. The accuracy of the recognition process is improved by combining these local and global features subject to increase in computational overheads.
4.1 Global Features For detecting objects from images, Dalal and Triggs [42] initial explanation of the HOG descriptor was principally concentrated on recognition of pedestrians. The elementary guiding thought is that the outline and appearance of the object within an image can be described by the intensity gradient distribution or the edge directions. The HOG descriptor is processed by dividing an image into smaller component regions and for each of these regions, the gradient and orientation are computed. The histogram buckets are evenly spaced over 0–180° or 0–360° on the basis of signed or unsigned gradient values usage. The features are produced by combining the histogram of all the component regions. HOG features are well suited for this kind of challenge since it functions on the localized cells. It can also be used to describe the outline and appearance of the handwritten digits in the given context. Our study considers 8 buckets over 7 × 7 blocks for feature extraction, thus resulting in a 392-dimensional (8 * 7 * 7 = 392) feature vector. The HOG transformed images for handwritten Devanagari numerals ‘0’ and ‘3’ are shown in Fig. 5.
Genetic Algorithm Based Global …
755
Fig. 5 Illustration of HOG feature descriptor (images on the left side are the original Devanagari numerals ‘0’ and ‘3’ whereas the right side shows their corresponding HOG transformed images)
4.2 Local Distance Based Features These features mainly consider the distance of the first foreground pixel from the outer edge of the numeral calculated in 8 different directions as shown in the Fig. 5. Since all the numeral images are normalized to 32 × 32 pixels, the whole length is pigeonholed into 4 bins that is 0–7, 8–15, 16–23 and 24–31. Now, if we encounter a foreground pixel in the 1st bin, then the number of these foreground pixels present is calculated and stored in that bin. The same procedure is repeated for the other bins as well. After this process is done for one row (or column), we move on to the next row (or column) approaching in clockwise direction and the same procedure of counting foreground pixels for each bin is again repeated. Since we discretize these 32 values into 4 bins based on their count, thus a total of 32 (4*8) element feature vector is extracted using local distance features. Figure 6 shows an example of the estimation of the local distance features for handwritten Roman numeral ‘6’.
Fig. 6 Computation of local distance features for sample handwritten Roman numeral ‘6’. (ST , S B , S R , SL , ST R , ST L , S B R , S B L indicate the features values considered in all eight directions respectively)
756
S. P. Chowdhury et al.
4.3 Selection of Optimal Feature Subset Using GA The final confusion matrix produced as a result of the classification stage using MLP classifier, shows instances of overlap and misclassification among similar shaped numerals. The outputted confusion matrix is further examined in order to find the numeral classes having the highest number of classifications. These numerals are considered similar to each other in terms of discriminating features and grouped together. Subsequently, both HOG and local distance features are used to precisely classify numerals within these smaller groups of numerals. This methodology gives rise to a large number of such features, hence the task of exhaustive search for an optimal set of local features becomes a cumbersome one. To counter this situation, we choose GA based feature selection method, which lets us recognize an optimal set from both the global and local feature sets, thereby leading to better recognition performance. Objective of feature selection includes removal of irrelevant and misleading features, reduction of time required for learning the classification function, increasing overall accuracy of recognition and to preserve only the relevant features which provide broad understanding of every input pattern class. There are various traditional methods of feature selection used in the literature like Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), Exhaustive search and GA. In our work, we have used GA owing to the following advantages. It is not advisable to carry out extensive search for large feature space as the time complexity is high. SFS and SBS consider all the features to find the optimal set simultaneously but in these algorithms if a feature is deleted from the set, then the chances of that feature getting selected is zero, resulting in removal of some discriminating feature in some cases. 
GA differs from all these methods due to the ability to evolve optimal features from the selected features and a good exploration of search space for newer and better solutions. The evolving process is made possible using GA operators such as selection, crossover and mutation. The process continues until the fitness criteria is met to the user’s preference or if the number of iterations specified by the user is over. The various parameters of GA used in the present work are listed in Table 1. GA is a meta-heuristic iterative process of improvement which incorporates evolution features like Darwin’s principle of survival of the fittest [43], crossover and mutation. It is based on the premise that evolution arises due to the need for search of optimal solution set. Much like the biological process of adaptation, GA makes use of the historical runs to forecast the future solutions, in order to achieve optimal performance. In a specific chromosome, each bit is assigned a value of ‘1’, if the feature is selected on the basis of its fitness; otherwise it is assigned a value of ‘0’. The flowchart of GA used in our current research is illustrated in Fig. 7. Dividing the pattern image into a fixed set of identically divided regions is the easiest way to detect the regions with highest data discrimination. There are regions
Genetic Algorithm Based Global …
757
Fig. 7 Flowchart of the feature sampling technique using GA Table 1 Values of parameters for GA used in the present work
GA parameters
Value
Population size
20
Initial population
100
Iterations
100
Selection method
Roulette-wheel
Crossover probability
0.85
Crossover point
Random
Mutation probability
0.15
758
S. P. Chowdhury et al.
which may overlap with each other, for such instances, local features are extracted. The global features along with the local features are combined and sampled randomly to generate different subsets. The effectiveness of the recognition methodology is assessed with each of these generated subsets. The subset, giving rise to the best outcome (fitness), can be assumed as an ideal set of features where the pattern classes can be distinguished considerably [44]. On the basis of these feature set outcomes, GA is applied to obtain the ideal local regions sets. Each candidate solution or chromosome can be thought of as an n bit binary vector, where each bit represents a particular feature of a digit image. The initial population is created by generating random vectors, consists of 0 and 1 s. As per the standard definition of GA, if a feature is selected on the basis of its fitness, the bit is assigned a value of ‘1’, else it is ‘0’. The initial population or generation consists of 100 such random vectors. The fitness of the GA algorithm is calculated by training the MLP classifier using the training generated by taking the features selected (set as ‘1’) in the given vector and it is then checked against the verified data set. The MLP classifier determines the fitness value of each chromosome. The top 20 vectors with the highest fitness values are used to obtain the next generation of chromosomes. These undergo crossover and mutation in the subsequent runs of the GA and another set of 100 vectors are further created. The percentage of correctly classified data is the fitness value. This is repeated for all the vectors and the best 20 of these is run for the next iteration of GA. The fitness is calculated once again for these vectors, to get the best 20 out of them and the process continues, until the number of specified iterations is completed (which, in our case is 100) or there is no significant change in result in the subsequent iterations of GA. 
For the current research, we have followed a two-level classification for handwritten digit identification of three aforementioned scripts namely, Roman, Devanagari and Bangla. For English numerals, we start by grouping at the first classification level where we have used 424 features. Of these, 392 are the HOG features while the rest 32 are local features. Upon completion of the first level of classification, we obtain certain groups of similar numerals. There are four such groups namely, EG1 (8, 9), EG2 (3, 4, 7), EG3 (5, 6, 0) and EG4 (2, 1) which are also illustrated in Fig. 8a. The numbers of features selected by GA are used for their intra-group classification. These feature vectors consist of 180, 167, 171 and 167 elements for EG1, EG2, EG3 and EG4 groups respectively. These grouping so occurred since numerals within a particular group have similar shape structure. For example, the upper halves of the numerals ‘8’ and ‘9’ are almost identical and as such, these were grouped together as EG1. Similar trends are observed amongst the other group members as well. In case of Bangla numerals, we employ 424 features in the first level of classification. Following this first level of classification, we obtain five groups of similar numerals namely, BG1 (1, 2, 9), BG2 (0, 3, 5, 6), BG3(4), BG4(7) and BG5(8). This grouping is demonstrated in Fig. 8b. It can be observed from Fig. 8b that the first two groups namely, BG1 and BG2 contain three and four numeral classes respectively whereas the other three groups involve only one individual numeral as their member.
Genetic Algorithm Based Global …
759
Fig. 8 Illustration of inter-class grouping of: a English, b Bangla and c Devanagari handwritten numerals in the first level of classification after the application of both global and local features
The numbers of features selected by GA are used for the first two intra-group classifications which are 180 and 183 respectively. On the other hand, the two groupings so occurred since numerals within a group have similar characteristics. For example, the Bangla numerals ‘1’ and ‘9’ are almost fully identical except for the bottom left quadrant. Again, Bangla numerals ‘1’, ‘2’ and ‘9’ collectively have analogous shaped right halves. Hence, these are grouped together as BG1. Considering Devanagari numerals, the first classification level consists of 424 features in a similar manner. Due to similarity in the shapes of some numeral classes, a second-level of classification is carried out wherein we obtain certain groups of similar numerals. There are five such groups namely, HG1 (4, 5, 6), HG2 (0, 7), HG3 (1, 2, 3), HG4(8) and HG5(9) which is also shown in Fig. 8c. The first three groups (HG1, HG2, and HG3) contain three, two and three members in their grouping whereas the remaining two groups have one numeral each. The numbers of features selected by GA are used for their intra-group classifications which are found to be 185, 187 and 178 respectively. For Devanagari numerals ‘1’, ‘2’ and ‘3’, it is observed that these numerals have rounded upper halves, which make them identically shaped in the upper quadrant. Hence, this grouping occurred due to such similar characteristics. As a result, these were grouped together as HG3.
760
S. P. Chowdhury et al.
5 Experimental Analysis In this section, we present the detail experimental results to illustrate the suitability of the proposed approach to handwritten numeral recognition. All the experiments are implemented in MATLAB 2015a under a Windows 8 environment on an Intel Core2 Duo 2.4 GHz processor with 4 GB of RAM and performed on gray-scale digit images. The recognition accuracy, used as assessment criteria for measuring the recognition performance of the proposed system, is expressed as follows: Recognition Accuracy(%) #Corr ectly classi f ieddigits × 100% = #T otal digits
(1)
A classification scheme modelled on neural-networks is devised for the task of classification. The MLP is a type of Artificial Neural Network (ANN). MLP classifier (as described in [45]) has been identified to be used, because of its recognized abilities to generalize and imbibe human behaviour by modelling the biological neural networks of a human. MLP is a feed-forward layered network of artificial neurons. A sigmoid function of the weighted sum of inputs is calculated by each artificial neuron in the MLP. An MLP contains the following layers: 1 input layer, 1 output layer and numerous hidden/intermediate layers. For our research, we had 1 hidden layer and the number of iterations for our MLP classifier was 100. For example, an artificial neuron combines its input signals by a weighted sum. The output is a single numerical figure, calculated from an ‘activation’ function, nearly modelling a ‘firing’ of a coarsely modelled biological neuron. Each neuron has synaptic weight coefficients of its own, with each neuron having an operating activation function. For the classification of handwritten numerals, we have to design the MLP, where the Back Propagation (BP) learning algorithm has a learning rate (η) of 0.8 and momentum term (α) of 0.7 being used here with different amounts of neurons in its hidden/intermediate layers. We work with a training set of 6000 samples and another test set of 4000 samples, chosen for handwritten Bangla, Devanagari and Roman numerals; we consider equal quantities of digit samples from each class. The designed HOG feature set along with local distance-based features are then applied on the three script numerals and the MLP classifier is used for the identification purpose. The confusion matrix generated for handwritten Bangla, Devanagari and Roman numerals are reported in Tables 2, 3 and 4 respectively. Average recognition accuracies scored for all the three script numerals using MLP classifier are also illustrated in Table 5. 
It can be seen from Table 5 that recognition accuracies of 93.50%, 83.98% and 88.83% are achieved for handwritten Bangla, Devanagari and Roman numerals respectively. Based on the confusion matrices produced by this classification, the grouping of similar and overlapping numerals is performed. It can be observed from Table 2 that the Bangla numerals ‘1’, ‘2’ and ‘9’ are confused among each other and can be placed into one group. Similarly, the numerals ‘0’, ‘3’,
Genetic Algorithm Based Global …
761
Table 2 Confusion matrix produced by MLP classifier for handwritten Bangla numerals using both global and local features

Classified as —>     a     b     c     d     e     f     g     h     i     j
a = ‘0’            590     0     0     3     0     4     0     2     0     1
b = ‘1’              0   561    14     0     5     0     1     0     0    19
c = ‘2’              0     0   581     1     5     0     6     0     7     0
d = ‘3’             10     4     0   520     3    14    38     0     6     5
e = ‘4’              0     1     3     0   576     0     7    11     2     0
f = ‘5’             19     7     0     5    11   489    54     2    10     3
g = ‘6’              0     3     0     8     0     4   582     0     3     0
h = ‘7’              2     5     3     0    16     2     1   565     2     4
i = ‘8’              0     1     3     0     1     0     2     0   593     0
j = ‘9’              0    28     3     0     9     0     2     4     1   553
Table 3 Confusion matrix produced by MLP classifier for handwritten Devanagari numerals using both global and local features

Classified as —>     a     b     c     d     e     f     g     h     i     j
a = ‘0’            401     8    17    40    20    14    26    58     4    12
b = ‘1’              2   367    54    37    24     5    43     9    28    31
c = ‘2’              3    10   502    49     2     6     8     4     7     9
d = ‘3’              8    16    72   488     3     2     0     4     6     1
e = ‘4’              3     1     2     4   542     5    17    25     0     1
f = ‘5’              3     0     1     0     2   544    45     2     3     0
g = ‘6’              0     0     1     0     1     9   559     0    12    18
h = ‘7’             58    21     0     4     0     7     2   492     0    16
i = ‘8’              0     7     0     1     0     4     0     1   588     0
j = ‘9’              1    37    12     0     0     1     0     1     0   548
‘5’ and ‘6’ can be placed into another group. The remaining numerals (i.e., ‘4’, ‘7’ and ‘8’) are kept as singletons. In the case of Devanagari numerals, the maximum confusion is seen among the numerals ‘4’, ‘5’ and ‘6’, so we keep them in one group. For the same reason, a second group is formed from the numerals ‘0’ and ‘7’, whereas a third group consists of the numerals ‘1’, ‘2’ and ‘3’. The remaining numerals ‘8’ and ‘9’ are set aside as singletons. In a similar way, four main groups are formed for handwritten Roman numerals by observing the numbers of misclassifications. The first group consists of the numerals ‘8’ and ‘9’; the second group consists of the numerals ‘3’, ‘4’ and ‘7’. The third group comprises the numerals ‘0’, ‘5’ and ‘6’, whereas the final group contains the numerals ‘1’ and ‘2’. Here, no numeral is kept as a singleton.
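In the chapter, these groups were formed by inspecting the confusion matrices. A simple automated analogue (our sketch, not the authors' procedure) merges two classes whenever their mutual off-diagonal confusion exceeds a threshold, using union-find to collect the groups; the 4-class matrix and threshold below are illustrative values.

```python
# Group classes whose mutual confusion (sum of the two off-diagonal
# counts) reaches the given threshold, via union-find.
def confusion_groups(matrix, threshold):
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] + matrix[j][i] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy 4-class matrix: classes 0 and 1 confuse each other heavily.
m = [[90, 8, 1, 1],
     [9, 88, 2, 1],
     [0, 1, 97, 2],
     [1, 0, 2, 97]]
print(confusion_groups(m, threshold=10))  # [[0, 1], [2], [3]]
```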
762
S. P. Chowdhury et al.
Table 4 Confusion matrix produced by MLP classifier for handwritten Roman numerals using both global and local features

Classified as —>     a     b     c     d     e     f     g     h     i     j
a = ‘0’            591     1     1     0     3     0     4     0     0     0
b = ‘1’             14   569     8     2     3     0     0     1     0     3
c = ‘2’              5    84   492     0     3     0     7     5     3     1
d = ‘3’              3     0     9   543     0     4     6     9    21     5
e = ‘4’              0     7     6     3   567     1    11     1     0     4
f = ‘5’             14     0     0    12     1   537    23    11     0     2
g = ‘6’             14     1     3     2     5    79   492     1     2     1
h = ‘7’              1     4     2    18     7     0     5   534     2    29
i = ‘8’              7     2     0     2     3    17     1     0   545    23
j = ‘9’             21     2     9    28     5     8     0    13    54   460

Table 5 Average recognition accuracy achieved for handwritten Bangla, Devanagari and Roman numerals using both global and local features

                                    Bangla   Devanagari   Roman
Total number of instances             6000         6000    6000
Correctly classified instances        5610         5031    5330
Incorrectly classified instances       390          969     670
Recognition accuracy (%)             93.50        83.98   88.83
Now, for the classification of the intra-numeral classes present in each group for the three scripts, the combined feature set consisting of 424 features (HOG and local distance features) is applied to the individual groups of numeral classes written in each of the three scripts. The numerals selected as singletons do not undergo this procedure. GA, as described in Sect. 4.3, is then applied as a feature selection procedure. A summary of the results after applying GA to all groupings of the three script numerals is shown in Table 6, and the final results for all three script numerals are shown in Table 7. It can be observed from Table 7 that overall recognition accuracies of 98.49%, 97.65% and 97.96% are attained for the Bangla, Devanagari and Roman scripts respectively, which are much higher than the recognition accuracies reported in Table 5. Despite these convincing results, some misclassifications still occurred during the experimentation. Figure 9 shows some sample images of misclassified numerals. It can be seen from Fig. 9a, b that the Bangla numerals ‘0’ and ‘5’ are confused with each other. Similarly, the Bangla numerals ‘5’ and ‘6’ also show considerable confusion due to unusual handwriting styles (refer to Fig. 9c, d). For Devanagari script, a significant amount of confusion is found among the numerals ‘1’, ‘2’ and ‘9’; for illustration, see Fig. 9e–h. Similarly, a substantial number of Roman numerals ‘1’, ‘3’, ‘5’ and ‘9’ are misclassified as the numerals ‘7’, ‘5’, ‘3’ and ‘4’ respectively (shown in Fig. 9i–l). These misclassifications arise from structural similarities between these
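The GA-based feature selection can be sketched in highly simplified form. This is our illustrative sketch in the spirit of Sect. 4.3, not the authors' implementation: each chromosome is a binary mask over the feature set (reduced here from 424 to 20 features), and a toy fitness function rewarding a hypothetical "useful" subset stands in for the MLP recognition accuracy used in the chapter; all parameters are illustrative.

```python
import random

random.seed(0)

N_FEATURES = 20                        # reduced from 424 for the toy example
USEFUL = set(range(0, N_FEATURES, 2))  # hypothetical informative features

# Toy fitness: reward useful features, penalize redundant ones
# (a stand-in for evaluating the MLP on the selected subset).
def fitness(mask):
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(chosen & USEFUL) - 0.5 * len(chosen - USEFUL)

def crossover(a, b):
    cut = random.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.05):
    return [bit ^ (random.random() < rate) for bit in mask]

def select(population):
    a, b = random.sample(population, 2)  # binary tournament selection
    return a if fitness(a) >= fitness(b) else b

population = [[random.randint(0, 1) for _ in range(N_FEATURES)]
              for _ in range(30)]
for _ in range(40):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(len(population))]

best = max(population, key=fitness)
print(fitness(best) > 0)  # the GA finds a mostly useful feature subset
```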
Table 6 Recognition accuracies of the intra-numeral classes achieved by MLP classifier

Script       Group   Numerals in   Number of features   Total number   Correctly classified   Incorrectly classified   Recognition
                     the group     selected by GA       of instances   instances              instances                accuracy (%)
Bangla       BG1     1, 2, 9       180                  1800           1760                   40                       97.8
             BG2     0, 3, 5, 6    183                  2400           2370                   30                       98.75
Roman        RG1     8, 9          180                  1200           1186                   14                       98.8
             RG2     3, 4, 7       167                  1800           1775                   25                       98.6
             RG3     0, 5, 6       171                  1800           1758                   42                       97.67
             RG4     1, 2          167                  1200           1159                   41                       96.6
Devanagari   DG1     4, 5, 6       185                  1800           1763                   37                       97.93
             DG2     0, 7          187                  1200           1157                   43                       96.42
             DG3     1, 2, 3       178                  1800           1744                   56                       96.89
Table 7 Overall summary of the results for handwritten Bangla, Devanagari and Roman numerals (the overall accuracy is obtained by performing the weighted average of recognition accuracies achieved for each individual group)

Script       Numeral classes   Recognition accuracy (%)   Overall accuracy (%)
Bangla       1, 2, 9           97.8                       98.49
             0, 3, 5, 6        98.75
             4                 98.83
             7
             8
Roman        8, 9              98.8                       97.96
             3, 4, 7           98.6
             5, 6, 0           97.67
             2, 1              96.6
Devanagari   4, 5, 6           97.93                      97.65
             0, 7              96.42
             1, 2, 3           96.89
             8                 99.61
             9
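The weighted averaging behind the overall accuracies can be checked directly from the Roman group results in Table 6, since the Roman script has no singletons; the snippet below is our verification sketch, not the authors' code.

```python
# Overall accuracy = weighted average of per-group accuracies,
# weighted by the number of instances in each group (Roman, Table 6).
roman_groups = [        # (instances, recognition accuracy %)
    (1200, 98.8),       # digits 8, 9
    (1800, 98.6),       # digits 3, 4, 7
    (1800, 97.67),      # digits 0, 5, 6
    (1200, 96.6),       # digits 1, 2
]
total = sum(n for n, _ in roman_groups)
overall = sum(n * acc for n, acc in roman_groups) / total
print(round(overall, 2))  # 97.96, matching Table 7
```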
numerals, which stem from the peculiar handwriting styles of different sets of people. In the current research, we have compared our recommended approach with other contemporary numeral recognition techniques, as shown in Table 8. We can observe that our methodology performs better than most of the prior numeral recognition techniques, indicating that the present procedure is well suited to a variety of scripts.
Fig. 9 Sample numeral images misclassified by the present technique: a Bangla numeral ‘0’ (misclassified as ‘5’), b Bangla numeral ‘5’ (misclassified as ‘0’), c Bangla numeral ‘5’ (misclassified as ‘6’), d Bangla numeral ‘6’ (misclassified as ‘5’), e Devanagari numeral ‘1’ (misclassified as ‘9’), f Devanagari numeral ‘9’ (misclassified as ‘1’), g Devanagari numeral ‘1’ (misclassified as ‘2’), h Devanagari numeral ‘2’ (misclassified as ‘1’), i Roman numeral ‘1’ (misclassified as ‘7’), j Roman numeral ‘3’ (misclassified as ‘5’), k Roman numeral ‘5’ (misclassified as ‘3’), l Roman numeral ‘9’ (misclassified as ‘4’)
6 Conclusion

Recognition of handwritten digits is one of the most interesting and challenging areas of research in the fields of pattern recognition and image processing. Several research works have been carried out for non-Indian scripts, focusing on the development of new techniques aimed at achieving higher recognition accuracy. Still, comparatively little research attention has been paid to the identification of numerals written in the different Indian scripts. The present work reports the recognition of numerals written in three widely used official scripts of India: Bangla, Devanagari and Roman. In our study, we use global and local features in tandem to identify handwritten digits. The global features are extracted using the HOG descriptor, whereas the local distance-based features serve as the local features. The proposed system applies the global features to group numerals that are structurally similar in nature. Then, the combination of local and global features is used to select an optimal subset of features via GA. Finally, this optimal set of features is employed for the classification of the intra-numeral classes using the MLP classifier. Applying the proposed technique, we achieve recognition accuracies of 98.49%, 97.65% and 97.96% for Bangla, Devanagari and Roman numerals respectively, on a database of 10,000 handwritten numerals per script. Our research findings can be applied to other less-researched digit recognition problems for Indian scripts. Further study can be carried out by integrating HOG and local distance-based features with other texture-based features, with the objective of obtaining greater classification accuracy.
Table 8 Comparative study of proposed methodology with some state-of-the-art techniques (present recognition accuracies are highlighted in bold style)

Numeral script   Researchers               Database used        Size of the   Feature set used                                         Classifier    Recognition
                                                                database                                                                             accuracy (%)
Bangla           Basu et al. [7]           CMATERdb 3.1.1 [7]   6000          Shadow, centroid and longest run features                MLP           96.67
Bangla           Nasir et al. [9]          Own database         300           Hybridization of k-means clustering and Bayes' theorem                 99.33
Bangla           Surinta et al. [10]       Own database         10,920        Feature and pixel-based features                         SVM           96.8
Bangla           Khan et al. [11]          CMATERdb 3.1.1 [7]   6000          Zone density feature extraction                          SRC           94
Bangla           Proposed methodology      Own database         10,000        GA based HOG and local distance feature selection        MLP           98.49
Devanagari       Bhattacharya et al. [15]  Own database         Not known     Multi-resolution features based on wavelet transform     MLP           91.28
Devanagari       Hanmandlu et al. [16]     Own database         Not known     Vector distances                                         Fuzzy model   92.67
Devanagari       Singh et al. [19]         Own database         3,000         Global, local and profile based features                 ANN           95
Devanagari       Proposed methodology      Own database         10,000        GA based HOG and local distance feature selection        MLP           97.65
Roman            Prasad et al. [23]        CENPARMI             500           Moment of inertia based features                         HMM           91.2
Roman            Salouan et al. [24]       Own database         3,000         Radon transform, Hough transform and Gabor filter        MLP and HMM   96.60
Roman            Qacimy et al. [25]        MNIST                10,000        Discrete Cosine Transform                                SVM           96.61
Roman            Proposed methodology      HDRC 2013 [29]       10,000        GA based HOG and local distance feature selection        MLP           97.96
References

1. R. Hussain, A. Raza, I. Siddiqi, K. Khurshid, C. Djeddi, A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation. EURASIP J. Image Video Process. 2015(1), 46 (2015)
2. U. Bhattacharya, B.B. Chaudhuri, Handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals. IEEE Trans. Pattern Anal. Mach. Intell. 31(3), 444–457 (2008)
3. U. Pal, N. Sharma, T. Wakabayashi, F. Kimura, Handwritten numeral recognition of six popular Indian scripts, in Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 2 (IEEE, 2007), pp. 749–753
4. R.M. Bozinovic, S.N. Srihari, Off-line cursive script word recognition. IEEE Trans. Pattern Anal. Mach. Intell. 11(1), 68–83 (1989)
5. P.K. Wong, C. Chan, Off-line handwritten Chinese character recognition as a compound Bayes decision problem. IEEE Trans. Pattern Anal. Mach. Intell. 20(9), 1016–1023 (1998)
6. U. Pal, B.B. Chaudhuri, Indian script character recognition: a survey. Pattern Recogn. 37(9), 1887–1899 (2004)
7. S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, D.K. Basu, An MLP based approach for recognition of handwritten Bangla numerals. arXiv preprint arXiv:1203.0876
8. Y. Wen, L. He, A classifier for Bangla handwritten numeral recognition. Exp. Syst. Appl. 39(1), 948–953 (2012)
9. M.K. Nasir, M.S. Uddin, Hand written Bangla numerals recognition for automated postal system. IOSR J. Comput. Eng. (IOSR-JCE) 8(6), 43–48 (2013)
10. O. Surinta, L. Schomaker, M. Wiering, A comparison of feature and pixel-based methods for recognizing handwritten Bangla digits, in 2013 12th International Conference on Document Analysis and Recognition (IEEE, 2013), pp. 165–169
11. H.A. Khan, A. Al Helal, K.I. Ahmed, Handwritten Bangla digit recognition using sparse representation classifier, in 2014 International Conference on Informatics, Electronics and Vision (ICIEV) (IEEE, 2014), pp. 1–6
12. M.A.H. Akhand, M. Ahmed, M.H. Rahman, Convolutional neural network based handwritten Bengali and Bengali–English mixed numeral recognition. Int. J. Image Graph. Sig. Process. 8(9), 40 (2016)
13. P.K. Singh, R. Sarkar, M. Nasipuri, A comprehensive survey on Bangla handwritten numeral recognition. Int. J. Appl. Pattern Recogn. 5(1), 55–71 (2018)
14. I.K. Sethi, B. Chatterjee, Machine recognition of constrained hand printed Devanagari. Pattern Recogn. 9(2), 69–75 (1977)
15. U. Bhattacharya, B.B. Chaudhuri, R. Ghosh, M. Ghosh, On recognition of handwritten Devnagari numerals, in Proceedings of the Workshop on Learning Algorithms for Pattern Recognition (in conjunction with the 18th Australian Joint Conference on Artificial Intelligence), Sydney (2005), pp. 1–7
16. M. Hanmandlu, O.R. Murthy, Fuzzy model based recognition of handwritten numerals. Pattern Recogn. 40(6), 1840–1854 (2007)
17. M.J.K. Singh, R. Dhir, R. Rani, Performance comparison of Devanagari handwritten numerals recognition. Int. J. Comput. Appl. 22 (2011)
18. A. Aggarwal, R.R. Renudhir, Recognition of Devanagari handwritten numerals using gradient features and SVM. Int. J. Comput. Appl. 48(8), 39–44 (2012)
19. P. Singh, A. Verma, N.S. Chaudhari, Handwritten Devnagari digit recognition using fusion of global and local features. Int. J. Comput. Appl. 89(1) (2014)
20. S. Prabhanjan, R. Dinesh, Handwritten Devanagari characters and numeral recognition using multi-region uniform local binary pattern. Int. J. Multimedia Ubiquit. Eng. 11(3), 387–398 (2016)
21. P.K. Singh, S. Das, R. Sarkar, M. Nasipuri, Recognition of offline handwritten Devanagari numerals using regional weighted run length features, in 2016 International Conference on Computer, Electrical and Communication Engineering (ICCECE) (IEEE, 2016), pp. 1–6
22. J. Cao, M. Ahmadi, M. Shridhar, Recognition of handwritten numerals with multiple features and multistage classifier. Pattern Recogn. 28(2), 153–160 (1995)
23. B.K. Prasad, G. Sanyal, A hybrid feature extraction scheme for off-line English numeral recognition, in International Conference for Convergence for Technology (IEEE, 2014), pp. 1–5
24. R. Salouan, S. Safi, B. Bouikhalene, Isolated handwritten Roman numerals recognition using methods based on Radon, Hough transforms and Gabor filter. Int. J. Hybrid Inf. Technol. 8, 181–194 (2015)
25. B. El Qacimy, M.A. Kerroum, A. Hammouch, Feature extraction based on DCT for handwritten digit recognition. Int. J. Comput. Sci. Issues (IJCSI) 11(6), 27 (2014)
26. Y. LeCun, The MNIST database of handwritten digits (1998). http://yann.lecun.com/exdb/mnist/
27. P.K. Singh, S. Das, R. Sarkar, M. Nasipuri, Recognition of handwritten Indic script numerals using Mojette transform, in Proceedings of the First International Conference on Intelligent Computing and Communication (Springer, Singapore, 2017), pp. 459–466
28. P.K. Singh, R. Sarkar, M. Nasipuri, A study of moment based features on handwritten digit recognition. Appl. Comput. Intell. Soft Comput. (2016)
29. P.K. Singh, S. Das, R. Sarkar, M. Nasipuri, Script invariant handwritten digit recognition using a simple feature descriptor. Int. J. Comput. Vis. Rob. 8(5), 543–560 (2018)
30. S. Ghosh, A. Chatterjee, P.K. Singh, S. Bhowmik, R. Sarkar, Language-invariant novel feature descriptors for handwritten numeral recognition. Vis. Comput. (2020). https://doi.org/10.1007/s00371-020-01938-x
31. R. Samanta, S. Ghosh, A. Chatterjee, R. Sarkar, A novel approach towards handwritten digit recognition using refraction property of light rays. Int. J. Comput. Vis. Image Process. (IJCVIP) 10(3), 1–17 (2020)
32. M. Ghosh, R. Guha, R. Mondal, P.K. Singh, R. Sarkar, M. Nasipuri, Feature selection using histogram-based multi-objective GA for handwritten Devanagari numeral recognition, in Intelligent Engineering Informatics (Springer, Singapore, 2018), pp. 471–479
33. R. Guha, M. Ghosh, P.K. Singh, R. Sarkar, M. Nasipuri, M-HMOGA: a new multi-objective feature selection algorithm for handwritten numeral classification. J. Intell. Syst. 29(1), 1453–1467 (2019)
34. S. Ghosh, S. Bhowmik, K.K. Ghosh, R. Sarkar, S. Chakraborty, A filter ensemble feature selection method for handwritten numeral recognition. EMR 7213 (2016)
35. A. Roy, N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, An axiomatic fuzzy set theory based feature selection methodology for handwritten numeral recognition, in ICT and Critical Infrastructure: Proceedings of the 48th Annual Convention of Computer Society of India, vol. I (Springer, Cham, 2014), pp. 133–140
36. S. Sarkar, M. Ghosh, A. Chatterjee, S. Malakar, R. Sarkar, An advanced particle swarm optimization based feature selection method for tri-script handwritten digit recognition, in International Conference on Computational Intelligence, Communications, and Business Analytics (Springer, Singapore, 2018), pp. 82–94
37. S. Chakraborty, S. Paul, R. Sarkar, M. Nasipuri, Feature map reduction in CNN for handwritten digit recognition, in Recent Developments in Machine Learning and Data Analytics (Springer, Singapore, 2019), pp. 143–148
38. J. Mukhoti, S. Dutta, R. Sarkar, Handwritten digit classification in Bangla and Hindi using deep learning. Appl. Artif. Intell. 1–26 (2020)
39. A. Chatterjee, S. Malakar, R. Sarkar, M. Nasipuri, Handwritten digit recognition using DAISY descriptor: a study, in 2018 Fifth International Conference on Emerging Applications of Information Technology (EAIT) (2018), pp. 1–4
40. M. Diem, S. Fiel, A. Garz, M. Keglevic, F. Kleber, R. Sablatnig, ICDAR 2013 competition on handwritten digit recognition (HDRC 2013), in 2013 12th International Conference on Document Analysis and Recognition (IEEE, 2013), pp. 1422–1427
41. R.C. Gonzalez, R.E. Woods, Digital Image Processing, vol. I (Prentice-Hall, India, 1992)
42. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1 (IEEE, 2005), pp. 886–893
43. M. Srinivas, L.M. Patnaik, Genetic algorithms: a survey. Computer 27(6), 17–26 (1994)
44. Y.L. Wu, C.Y. Tang, M.K. Hor, P.F. Wu, Feature selection using genetic algorithm and cluster validation. Exp. Syst. Appl. 38(3), 2727–2732 (2011)
45. S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, D.K. Basu, Handwritten Bangla alphabet recognition using an MLP based classifier. arXiv preprint arXiv:1203.0882 (2012)
46. Languages spoken by more than 10 million people. Encarta Encyclopedia. Retrieved 23 February 2018