Advances in Intelligent Systems and Computing 1221
Valentina Emilia Balas Lakhmi C. Jain Marius Mircea Balas Shahnaz N. Shahbazova Editors
Soft Computing Applications Proceedings of the 8th International Workshop Soft Computing Applications (SOFA 2018), Vol. I
Advances in Intelligent Systems and Computing Volume 1221
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Nikhil R. Pal, Indian Statistical Institute, Kolkata, India Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba Emilio S. Corchado, University of Salamanca, Salamanca, Spain Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil Ngoc Thanh Nguyen , Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **
More information about this series at http://www.springer.com/series/11156
Valentina Emilia Balas • Lakhmi C. Jain • Marius Mircea Balas • Shahnaz N. Shahbazova
Editors
Soft Computing Applications Proceedings of the 8th International Workshop Soft Computing Applications (SOFA 2018), Vol. I
Editors
Valentina Emilia Balas
Faculty of Engineering, “Aurel Vlaicu” University of Arad, Arad, Romania

Marius Mircea Balas
Faculty of Engineering, “Aurel Vlaicu” University of Arad, Arad, Romania

Lakhmi C. Jain
Faculty of Engineering and Information Technology, Centre for Artificial Intelligence, University of Technology Sydney, Sydney, Australia; Faculty of Science, Liverpool Hope University, Liverpool, UK; KES International, Shoreham-by-Sea, UK

Shahnaz N. Shahbazova
Department of Information Technology and Programming, Azerbaijan Technical University, Baku, Azerbaijan
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-51991-9 ISBN 978-3-030-51992-6 (eBook) https://doi.org/10.1007/978-3-030-51992-6 © Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
These two volumes constitute the Proceedings of the 8th International Workshop on Soft Computing Applications (SOFA 2018), held on September 13–15, 2018, in Arad, Romania. This edition was organized by Aurel Vlaicu University of Arad, Romania, in conjunction with the Institute of Computer Science, Iasi Branch of the Romanian Academy, the IEEE Romanian Section, the Romanian Society of Control Engineering and Technical Informatics (SRAIT) – Arad Section, the General Association of Engineers in Romania – Arad Section, and BTM Resources Arad. The concept of soft computing was introduced by Lotfi Zadeh in 1991 and serves to highlight the emergence of computing methodologies in which the accent is on exploiting the tolerance for imprecision and uncertainty to achieve tractability, robustness and low solution cost. Soft computing facilitates the combined use of fuzzy logic, neurocomputing, evolutionary computing and probabilistic computing, leading to the concept of hybrid intelligent systems. The combination of such intelligent-systems tools with a large number of applications demonstrates the great potential of soft computing in all domains. The volumes cover a broad spectrum of soft computing techniques, together with theoretical and practical applications that provide solutions for problems in industry, economics and medicine. The conference papers included in these proceedings, published post-conference, were grouped into the following areas of research: • Soft computing and conventional techniques in power engineering; methods and applications in electrical engineering • Modeling, algorithms, optimization, reliability and applications • Machine learning, NLP and applications • Business process management • Knowledge-based technologies for Web applications, cloud computing, security algorithms and computer networks, smart city • Fuzzy applications, theory, expert systems, fuzzy control • Biomedical applications • Image, text and signal processing
• Computational intelligence techniques, machine learning and optimization methods in recent applications • Methods and applications in engineering and games • Wireless sensor networks, cloud computing, IoT At SOFA 2018, we had five eminent keynote speakers: Professor Michio Sugeno (Japan), Professor Oscar Castillo (Mexico), Academician Florin G. Filip (Romania), Professor Valeriu Beiu (Romania) and Professor Jeng-Shyang Pan (China). Their summary talks are included in this book. We especially thank the honorary chair of SOFA 2018, Prof. Michio Sugeno, who encouraged and motivated us, as he has at all the other SOFA editions. A special keynote, “In memoriam Lotfi A. Zadeh,” was presented by Professor Shahnaz Shahbazova (Azerbaijan), dedicated to the renowned founder of fuzzy set theory and, at the same time, honorary chair of the SOFA conferences, who passed away in September 2017. In fact, the whole conference was dedicated to the memory of Professor Zadeh. In our presentations and discussions, we all remembered his great personality and how he influenced our lives. We are thankful to all the authors who submitted papers for keeping the quality of the SOFA 2018 conference at a high level. The editors of this book would like to acknowledge all the authors for their contributions, as well as the reviewers. We received invaluable help from the members of the International Program Committee and from the chairs responsible for different aspects of the workshop. We also appreciate the role of the special sessions' organizers. Thanks to all of them, we were able to collect many papers on interesting topics, and during the workshop we had remarkably interesting presentations and stimulating discussions. For their help with the organizational issues of all SOFA editions, we express our thanks to the TRIVENT Company, Mónika Jetzin and Teodora Artimon, for having customized the software Conference Manager, handled the registration of conference participants and made all local arrangements.
Our special thanks go to Janusz Kacprzyk (Editor-in-Chief, Springer, Advances in Intelligent Systems and Computing Series) for the opportunity to organize this guest-edited volume. We are grateful to Springer, especially to Dr. Thomas Ditzinger (Senior Editor, Applied Sciences & Engineering, Springer-Verlag), for the excellent collaboration, patience and help during the evolvement of this volume. We hope that these volumes will provide useful information to professors, researchers and graduate students in the area of soft computing techniques and applications, and that all will find this collection of papers inspiring, informative and useful. We also hope to see you at a future SOFA event. Valentina Emilia Balas Lakhmi C. Jain Marius Mircea Balas Shahnaz N. Shahbazova
Invited Speakers
DSS, Classifications, Trends and Enabling Modern Information and Communication Technologies Florin Gheorghe Filip The Romanian Academy and INCE ffi[email protected]
Abstract. A decision support system (DSS) can be defined as an anthropocentric and evolving information system which is meant to implement the functions of a human support system that would otherwise be necessary to help the decision-maker overcome the limits and constraints he/she may encounter when trying to solve complex and complicated decision problems that count (Filip, 2008). The purpose of the talk is to present the impact of modern Information and Communication Technologies (I&CT) on the DSS domain, with emphasis on systems that support collaborative decision-making activities. Consequently, the talk is composed of three parts, as follows. In the first part, several basic aspects concerning decisions and decision-makers are reviewed in the context of modern business models and process and management automation solutions, including Intelligent Process Automation (IPA), which is meant to liberate the human from “robot-type” operations. The evolution of models of human–automation device systems, from “either/or automation” to “shared and cooperative” control solutions (Flemisch et al., 2012), receives particular attention, together with an explanation of the causes of wrong decisions (Power, Mitra, 2016). The second part of the talk addresses several aspects of the DSS domain, such as basic concepts, classifications and evolutions. Several classifications made in accordance with attributes such as purpose, dominant technology, number of users and real-time usage in crisis situations are presented. Collaborative systems (Nof, 2017; Filip et al., 2017) and “mixt knowledge” (Filip, 2008) solutions are described in detail. In the third part of the talk, several I&C technologies, such as big data (Shi, 2015), cloud and mobile computing, and cognitive systems (High, 2012; Tecuci et al., 2016), are presented from the perspective of their relevance to modern computer-supported collaborative decision making.
Two application examples are presented with a view to illustrating the usage of big data, and of cloud computing and service-oriented architectures, respectively. A list of concerns and open problems regarding the impact of new I&C technologies on human beings' personal and professional lives is eventually evoked.

Selected References
Filip F.G. (2008) Decision support and control for large-scale systems. Annual Reviews in Control, 32(1), pp. 62–70.
Filip F.G., Zamfirescu C.B., Ciurea C. (2017) Computer-Supported Collaborative Decision-Making. Springer.
Flemisch F., Heesen M., Hesse T. et al. (2012) Towards a dynamic balance between humans and automation: authority, ability, responsibility and control in shared and cooperative control situations. Cognition, Technology & Work, 14(1), pp. 3–8.
High R. (2012) The Era of Cognitive Systems: An Inside Look at IBM Watson and How It Works.
Nof S.Y. (2017) Collaborative control theory and decision support systems. Computer Science Journal of Moldova, 25(2), pp. 115–144.
Power D.J., Mitra A. (2016) Reducing “Bad” Strategic Business Decisions. Drake Management Review, 5(1/2), pp. 15–21.
Shi Y. (2015) Challenges to engineering management in the big data era. Frontiers of Engineering Management, pp. 293–303.
Tecuci G., Marcu D., Boicu M., Schum D.A. (2016) Knowledge Engineering: Building Cognitive Assistants for Evidence-based Reasoning. Cambridge University Press.
Florin Gheorghe Filip
Brief Bio Sketch: Florin Gheorghe Filip was born on July 25, 1947. He became a corresponding member of the Romanian Academy in 1991, when he was only 44 years old, and at 52 (1999) became a full member of the highest cultural and scientific forum of Romania. For ten years, during 2000–2010, he was Vice-President of the Romanian Academy (the national academy of sciences), and in 2010 he was elected President of its 14th Section, “Information Science and Technology” (re-elected in 2015). He was the managing director of the National Institute for R&D in Informatics (ICI) during 1991–1997. He has been a part-time researcher and member of the Scientific Council of INCE (the National Institute for Economic Research) of the Academy since 2004. His main scientific interests are optimization and control of complex systems, decision support systems, technology management and foresight, and IT applications in the cultural sector. He has authored/coauthored over 300 papers published in international journals (IFAC J. Automatica, IFAC J. Control Engineering Practice, Annual Reviews in Control, Computers in Industry, System Analysis Modeling Simulation, Large Scale Systems, Technological and Economic Development of Economy, and so on) and contributed to volumes printed by international publishing houses (Pergamon Press, North Holland, Elsevier, Kluwer, Chapman & Hall, etc.). He is also the author/coauthor of thirteen monographs (published by Editura Tehnica,
Bucuresti; Hermes-Lavoisier, Paris; J. Wiley & Sons, London; Springer) and editor/coeditor of 25 volumes of contributions (published by Editura Academiei Romane; Elsevier Science; Institute of Physics, Melville, USA; IEEE Computer Society, Los Alamitos, USA). He was an IPC member of more than 50 international conferences held in Europe, the USA, South America, Asia and Africa, and gave plenary papers at scientific conferences held in Brazil, Chile, China, France, Germany, Lithuania, Poland, Portugal, the Republic of Moldova, Spain, Sweden, Tunisia and the UK. F.G. Filip was the chairman of the IFAC (International Federation of Automatic Control) Technical Committee “Large Scale Complex Systems” (1991–1997). He is the founder and Editor-in-Chief of the Studies in Informatics and Control journal (1991), and cofounder and Editor-in-Chief of the International Journal of Computers Communications & Control (2006). He has received the Doctor Honoris Causa title from “Lucian Blaga” University of Sibiu (2000), “Valahia” University, Targoviste (2007), “Ovidius” University, Constanta (2007), École Centrale de Lille, France (2007), Technical University “Traian Vuia”, Timisoara (2009), “Agora” University of Oradea (2012), the Academy of Economic Studies, Bucharest (2014), the University of Pitesti (2017) and the “Petrol-Gaze” University of Ploiesti (2017). He is an honorary member of the Academy of Sciences of the Republic of Moldova (2007) and of the Romanian Academy of Technical Sciences (2007). More details can be found at: http://www.academiaromana.ro/sectii/sectia14_informatica/sti_FFilip.htm and http://univagora.ro/jour/index.php/ijccc/article/view/2960/1125.
Distorted Statistics based on Choquet Calculus Michio Sugeno Tokyo Institute of Technology [email protected]
Abstract. In this study, we discuss statistics with distorted probabilities, obtained by applying Choquet calculus, which we call “distorted statistics.” To deal with distorted statistics, we consider a distorted probability space on the non-negative real line. A (non-additive) distorted probability is derived from an ordinary additive probability by a monotone transformation with a generator. First, we explore some properties of Choquet integrals of non-negative, continuous and differentiable functions with respect to distorted probabilities. Next, we calculate elementary statistics, such as the distorted mean and variance of a random variable, for exponential and Gamma distributions. In addition, we introduce the concept of a density function for the distorted exponential distribution. Further, we deal with Choquet calculus of real-valued functions on the real line and explore its basic properties. Then, we consider a distorted probability space on the real line. We also calculate elementary distorted statistics for uniform and normal distributions. Finally, we compare distorted statistics with conventional skew statistics.
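As a worked illustration of the construction the abstract describes (a sketch under stated assumptions: the power-function generator below is our own illustrative choice, not necessarily the one used in the talk):

```latex
% Distorted probability via a monotone generator \psi with \psi(0)=0,\ \psi(1)=1:
\mu(A) \;=\; \psi\bigl(P(A)\bigr).
% Choquet integral of a non-negative random variable X with respect to \mu:
(C)\!\int X \, d\mu \;=\; \int_0^{\infty} \mu\bigl(\{X > t\}\bigr)\, dt .
% Example: X \sim \mathrm{Exp}(\lambda), so that P(X > t) = e^{-\lambda t}.
% With the illustrative generator \psi(u) = u^{q},\ q > 0, the distorted mean is
\mathbb{E}_{\psi}[X] \;=\; \int_0^{\infty} \bigl(e^{-\lambda t}\bigr)^{q}\, dt \;=\; \frac{1}{q\lambda},
% which recovers the ordinary mean 1/\lambda when q = 1 (no distortion).
```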
Michio Sugeno
Biography: After graduating from the Department of Physics, the University of Tokyo, he worked at a company for three years. Then, he served the Tokyo Institute of Technology as Research Associate, Associate Professor and Professor from 1965 to 2000. After retiring from the Tokyo Institute of Technology, he worked as Laboratory Head at the Brain Science Institute, RIKEN from 2000 to 2005 and then as Distinguished Visiting Professor at Doshisha University from 2005 to 2010. Finally, he worked as Emeritus Researcher at the European Centre for Soft Computing in Spain from 2010 to 2015. He is Emeritus Professor at the Tokyo Institute of Technology. He was
President of the Japan Society for Fuzzy Theory and Systems from 1991 to 1993, and also President of the International Fuzzy Systems Association from 1997 to 1999. He is the first recipient of the IEEE Pioneer Award in Fuzzy Systems, with Zadeh, in 2000. He also received the 2010 IEEE Frank Rosenblatt Award and the Kampé de Fériet Award in 2012.
Overview of QUasi-Affine TRansformation Evolutionary (QUATRE) Algorithm Jeng-Shyang Pan Fujian University of Technology, Harbin Institute of Technology [email protected]
Abstract. The QUasi-Affine TRansformation Evolutionary (QUATRE) algorithm is a swarm-based algorithm that uses a quasi-affine transformation approach for evolution. This talk discusses the relation between the QUATRE algorithm and other kinds of swarm-based algorithms, including particle swarm optimization (PSO) variants and differential evolution (DE) variants. Several QUATRE variants are described in this talk. Comparisons and contrasts are made among the proposed QUATRE algorithm, state-of-the-art PSO variants and DE variants on several test functions. Experimental results show the usefulness of the QUATRE algorithm not only for real-parameter optimization but also for large-scale optimization. In particular, the QUATRE algorithm can reduce time complexity and shows excellent performance not only on uni-modal functions but also on multi-modal functions, even on higher-dimensional optimization problems.
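A minimal sketch of the evolution scheme the abstract alludes to, assuming the commonly published QUATRE update X ← M∘X + M̄∘B (a binary "evolution matrix" M and a DE-style donor matrix B built around the global best); the population size, scale factor `F`, and the per-row mask construction are illustrative choices, not the exact variants compared in the talk:

```python
import numpy as np

def quatre_minimize(f, dim=10, pop=30, iters=300, lo=-5.0, hi=5.0, F=0.7, seed=0):
    """QUATRE-style minimizer sketch: U = M*X + (1-M)*B, greedy selection."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, (pop, dim))
    fit = np.array([f(x) for x in X])
    for _ in range(iters):
        gbest = X[np.argmin(fit)]
        r1, r2 = rng.permutation(pop), rng.permutation(pop)
        B = gbest + F * (X[r1] - X[r2])      # donor matrix, DE/best/1 style
        # Binary evolution matrix M: row i keeps (i % dim) + 1 coordinates of X,
        # positions shuffled per row, then rows permuted (quasi-affine mask).
        M = np.zeros((pop, dim))
        for i in range(pop):
            keep = rng.choice(dim, size=(i % dim) + 1, replace=False)
            M[i, keep] = 1.0
        rng.shuffle(M)                        # permute rows of the mask
        U = np.clip(M * X + (1.0 - M) * B, lo, hi)
        ufit = np.array([f(u) for u in U])
        better = ufit < fit                   # greedy one-to-one selection
        X[better], fit[better] = U[better], ufit[better]
    return X[np.argmin(fit)], float(fit.min())

def sphere(x):
    return float(np.sum(x * x))

best, val = quatre_minimize(sphere)
```

Under these illustrative settings the greedy selection guarantees monotone improvement of the best fitness; published QUATRE variants differ mainly in how M and the donor matrix B are generated.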
Jeng-Shyang Pan
Biography: Jeng-Shyang Pan is Assistant President of the Fujian University of Technology and Professor at the Harbin Institute of Technology. He received the B.S. degree in Electronic Engineering from the National Taiwan University of Science and Technology in 1986, the M.S. degree in Communication Engineering from the National Chiao Tung University, Taiwan, in 1988, and the PhD degree in Electrical Engineering from the University of Edinburgh, UK, in 1996. Currently, he is Assistant President and Dean of the College of Information Science and Engineering at the Fujian University of Technology. He has published more than 600 papers, of which 250 are indexed by SCI,
his H-index is 41, and his papers have been cited more than 7900 times. He is an IET Fellow, UK, and has been Vice Chair of the IEEE Tainan Section. He was awarded the Gold Prize at the International Micro Mechanisms Contest held in Tokyo, Japan, in 2010. He was also awarded a Gold Medal at the Pittsburgh Invention & New Product Exposition (INPEX) in 2010; a Gold Medal at the International Exhibition of Geneva Inventions in 2011; and the Gold Medal of IENA, the International “Ideas–Inventions–New Products” fair, Nuremberg, Germany. He was offered the Thousand Talent Program in China in 2010. He is on the editorial boards of the Journal of Information Hiding and Multimedia Signal Processing and the Chinese Journal of Electronics. His current research interests include soft computing, robot vision and big data mining.
Nature-Inspired Optimization of Type-2 Fuzzy Logic Controllers Oscar Castillo Tijuana Institute of Technology Tijuana, Mexico [email protected]
Abstract. The design of type-2 fuzzy logic systems is a complex task, and in general, achieving an optimal configuration of structure and parameters is time-consuming and rarely found in practice. For this reason, the use of nature-inspired meta-heuristics offers a good hybrid solution to find near-optimal designs of type-2 fuzzy logic systems in real-world applications. Type-2 fuzzy control offers a real challenge because the problems in this area require very efficient and accurate solutions; in particular, this is the case for robotic applications. In this talk, we present a general scheme for optimizing type-2 fuzzy controllers with nature-inspired optimization techniques, like ant colony optimization, the chemical reaction algorithm, bee colony optimization and others.
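To make the idea concrete, here is a minimal, hypothetical sketch: an interval type-2 Gaussian membership function with an uncertain mean, whose parameters are tuned by a simple stochastic hill climber standing in for the nature-inspired methods the talk names (ant colony, chemical reaction, bee colony); all function names and parameter values are illustrative assumptions, not the talk's actual scheme:

```python
import numpy as np

def it2_gauss(x, m1, m2, sigma):
    """Interval type-2 Gaussian MF with uncertain mean in [m1, m2]:
    returns (lower, upper) membership bounds for inputs x."""
    g = lambda m: np.exp(-0.5 * ((x - m) / sigma) ** 2)
    lo_m, hi_m = min(m1, m2), max(m1, m2)
    upper = np.where(x < lo_m, g(lo_m), np.where(x > hi_m, g(hi_m), 1.0))
    lower = np.minimum(g(lo_m), g(hi_m))
    return lower, upper

rng = np.random.default_rng(1)
xs = np.linspace(-3.0, 3.0, 61)
target_lo, target_up = it2_gauss(xs, -0.5, 0.5, 1.0)   # "plant" to recover

def cost(p):
    # Mean squared error between candidate and target footprint of uncertainty.
    m1, m2, sigma = p
    lo, up = it2_gauss(xs, m1, m2, abs(sigma) + 1e-6)
    return float(np.mean((lo - target_lo) ** 2 + (up - target_up) ** 2))

# Stochastic hill climber as a stand-in for a nature-inspired optimizer:
# propose a Gaussian perturbation, keep it only if the cost improves.
best_p = rng.uniform(-2.0, 2.0, 3)
best_c = cost(best_p)
for _ in range(2000):
    cand = best_p + rng.normal(0.0, 0.2, 3)
    c = cost(cand)
    if c < best_c:
        best_p, best_c = cand, c
```

Swapping in a real ant-colony or bee-colony optimizer only changes how `cand` is proposed; the fitness function over the type-2 membership parameters stays the same.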
Oscar Castillo
Biography: Oscar Castillo holds the Doctor in Science degree (Doctor Habilitatus) in Computer Science from the Polish Academy of Sciences (with the Dissertation “Soft Computing and Fractal Theory for Intelligent Manufacturing”). He is Professor of computer science in the Graduate Division, Tijuana Institute of Technology, Tijuana, Mexico. In addition, he is serving as Research Director of Computer Science and Head of the research group on Hybrid Fuzzy Intelligent Systems. Currently, he is President of HAFSA (Hispanic American Fuzzy Systems Association) and Past President of IFSA (International Fuzzy Systems Association). Prof. Castillo is also Chair of the Mexican Chapter of the Computational Intelligence Society (IEEE). He also belongs to the Technical Committee on Fuzzy Systems of IEEE and to the Task Force on “Extensions to Type-1 Fuzzy Systems.” He is
also a member of NAFIPS, IFSA and IEEE. He belongs to the Mexican Research System (SNI Level 3). His research interests are in type-2 fuzzy logic, fuzzy control, and neuro-fuzzy and genetic-fuzzy hybrid approaches. He has published over 300 journal papers, 7 authored books, 30 edited books, 200 papers in conference proceedings and more than 300 chapters in edited books, in total more than 740 publications according to Scopus and more than 840 according to ResearchGate. He has been Guest Editor of several successful special issues in the past, in journals such as Applied Soft Computing, Intelligent Systems, Information Sciences, Non-Linear Studies, Fuzzy Sets and Systems, JAMRIS and Engineering Letters. He is currently Associate Editor of the Information Sciences Journal, the Applied Soft Computing Journal, the Granular Computing Journal and the IEEE Transactions on Fuzzy Systems. Finally, he was elected an IFSA Fellow and a MICAI Fellow member last year. He was recognized as a Highly Cited Researcher in 2017 by Clarivate Analytics for having multiple highly cited papers in Web of Science.
Seeing Is Believing “It is very easy to answer many of these fundamental biological questions; you just look at the thing!” Richard P. Feynman, “There’s Plenty of Room at the Bottom,” Caltech, December 29, 1959 Valeriu Beiu Aurel Vlaicu University of Arad, Romania [email protected]
Abstract. This presentation is geared toward the latest developments in imaging platforms that are able to tackle biological samples. Visualizing living cells, single molecules and even atoms is crucially important, but unfortunately excruciatingly difficult. Still, recent progress reveals that a wide variety of novel imaging techniques have reached maturity. We will recap here the principles behind techniques that allow imaging beyond the diffraction limit and highlight both historical and fresh advances in the field of neuroscience (as a result of such imaging technologies). As an example, single-particle tracking is one of several tools able to study single molecules inside cells and reveal the dynamics of biological processes (receptor trafficking, signaling and cargo transport). Historically, the first venture outside classical optics was represented by X-ray and electron-based techniques. Of these, electron microscopy allows higher resolution by far. In time, it has diverged into transmission electron microscopy (TEM), scanning electron microscopy (SEM), reflection electron microscopy (REM) and scanning transmission electron microscopy (STEM), while lately these have started to merge with digital holography (scanning transmission electron holography, atomic-resolution holography and low-energy electron holography). Electron microscopy allows resolutions down to 40 pm, although it is not trivial to use such techniques on biological samples. The second departure from classical optics was represented by scanning probe techniques like the atomic force microscope (AFM), the scanning tunneling microscope (STM), the photonic force microscope (PFM) and the recurrence tracking microscope (RTM). All of these rely on the physical contact of a solid probe tip which scans the surface of an object (which is supposed to be quite flat). The third attempt has come full circle and is represented by super-resolution microscopy, which won the Nobel Prize in 2014.
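For reference, the classical diffraction limit the abstract refers to is given by Abbe's relation; the numerical values below are a standard illustrative case, not figures from the talk:

```latex
% Abbe diffraction limit for the minimum resolvable distance d:
d \;=\; \frac{\lambda}{2\, n \sin\theta} \;=\; \frac{\lambda}{2\,\mathrm{NA}} .
% For visible light with \lambda = 500\,\mathrm{nm} and a high-end objective
% with \mathrm{NA} = 1.4, this gives d \approx 179\,\mathrm{nm};
% several orders of magnitude above the \sim 40\,\mathrm{pm} reached by
% electron microscopy, which is why super-resolution ("nanoscopy") techniques
% sidestep the limit rather than beat it with conventional optics.
```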
The presentation will start from basic principles, emphasizing the advantages and disadvantages of different bio-imaging techniques. The development of super-resolution microscopy techniques in the 1990s and 2000s (https://en.wikipedia.org/wiki/Superresolution_microscopy) has allowed researchers to image fluorescent molecules at unprecedentedly small scales. This significant boost was properly acknowledged by replacing the term “microscopy” with “nanoscopy,” which was coined by Stefan Walter Hell in 2007. It distinguishes novel diffraction-unlimited techniques from conventional approaches, e.g., confocal or wide-field microscopy. An incomplete list includes (among others): binding-activated localization microscopy (BALM), cryogenic optical localization in 3D (COLD), fluctuation-assisted BALM (fBALM), fluorescence photo-activation localization microscopy (FPALM), ground-state depletion microscopy (GSDIM), light sheet fluorescence microscopy (LSFM), photo-activated localization microscopy (PALM), structured illumination microscopy (SIM), both linear and nonlinear, stimulated emission depletion (STED), stochastic optical reconstruction microscopy (STORM), single-molecule localization microscopy (SMLM), scanning near-field optical microscopy (SNOM) and total internal reflection fluorescence (TIRF). Obviously, with such improvements in resolving power, new avenues for studying synapses, and neurons more generally, are being opened, and a few of the latest experiments that highlight unique capabilities will be enumerated, briefly reviewed and compared.
Valeriu Beiu
Biography: VALERIU BEIU (S’92–M’95–SM’96) received the MSc in computer engineering from the University “Politehnica” Bucharest in 1980, and the PhD summa cum laude in electrical engineering from the Katholieke Universiteit Leuven in 1994. Since graduating in 1980, he has been with the Research Institute for Computer Techniques, University “Politehnica” Bucharest, Katholieke Universiteit Leuven, King’s College London, Los Alamos National Laboratory, Rose Research, Washington State University, United Arab Emirates University, and currently is with “Aurel Vlaicu” University of Arad. His research interests have constantly been on biological-inspired nano-circuits and brain-inspired nano-architectures for VLSI-efficient designs
(ultra-low power and highly reliable), being funded at over US$ 51M. On such topics, he has given over 200 invited talks, organized over 120 conferences, chaired over 60 sessions, edited two books and authored over 230 journal/conference articles (30 invited), as well as 8 chapters and 11 patents. Dr. Beiu has received five fellowships and seven best-paper awards, and is a senior member of the IEEE as well as a member of ACM, INNS, ENNS and MCFA. He was a member of the SRC-NNI Working Group on Novel Nano-architectures, the IEEE CS Task Force on Nano-architectures and the IEEE Emerging Technologies Group on Nanoscale Communications, and has been an Associate Editor of the IEEE Transactions on Neural Networks (2005–2008), of the IEEE Transactions on Very Large Scale Integration (VLSI) Systems (2011–2015) and of Nano Communication Networks (2010–2015).
Contents
Soft Computing and Conventional Techniques in Power Engineering Methods and Applications in Electrical Engineering I

Daily Load Curve Forecasting. Comparative Analysis: Conventional vs. Unconventional Methods
Constantin Barbulescu, Stefan Kilyeni, Violeta Chis, Mihaela Craciun, and Attila Simo

LoRaWAN Based Airport Runway Lights Monitoring System
Attila Simo, Constantin Barbulescu, and Stefan Kilyeni

The Enhanced Crow Search Algorithm for Fuel-Cost Function Parameters Assessment of the Cogeneration Units from Thermal Power Plants
Dinu-Calin Secui, Simona Dzitac, and Serban-Ioan Bunda

Modelling a Photovoltaic Power Station
Eva Barla, Dzitac Simona, and Carja Vasile

Identification of Fault Locations on Tree Structure Systems
Judith Pálfi and Adrienn Dineva

Probability Distribution Functions for Short-Term Wind Power Forecasting
Harsh S. Dhiman and Dipankar Deb

Modeling, Algorithms, Optimization, Reliability and Applications

Feedback Seminar Analysis - An Introductory Approach from an Intelligent Perspective
Adriana Mihaela Coroiu and Alina Delia Călin

Flexible Fuzzy Numbers for Likert Scale-Based Evaluations
József Dombi and Tamás Jónás

Inventory Optimization Model Parameter Search Speed-Up Through Similarity Reduction
Tomáš Martinovič, Kateřina Janurová, Jan Martinovič, and Kateřina Slaninová

On Posets for Reliability: How Fine Can They Be?
Valeriu Beiu, Simon R. Cowell, and Vlad-Florin Drăgoi

Comparison of Neural Network Models Applied to Human Recognition
Daniela Sánchez, Patricia Melin, and Oscar Castillo

Machine Learning, NLP and Applications

A Genetic Deep Learning Model for Electrophysiological Soft Robotics
Hari Mohan Pandey and David Windridge

Blockchain. Today Applicability and Implications
Dominic Bucerzan and Crina Anina Bejan

Integrating Technology in Training: System Dynamic-Based Model Considerations Regarding Smart Training and Its Relationship with Educational Components
Victor Tița, Doru Anastasiu Popescu, and Nicolae Bold

A Survey on Nonlinear Second-Order Diffusion-Based Techniques for Additive Denoising
Tudor Barbu

Cathedral and Indian Mughal Monument Recognition Using Tensorflow
Aniket Ninawe, Ajay Kumar Mallick, Vikash Yadav, Hifzan Ahmad, Dinesh Kumar Sah, and Cornel Barna

DDoS Attack Prevention Protocol Through Support Vector Machine and Fuzzy Clustering Mechanism on Traffic Flow with Harmonic Homogeneity Validation Technique
Kirti Joon, Namrata Agrawal, Hifzan Ahmad, Vikash Yadav, Dinesh Kumar Sah, and Cornel Barna

Business Process Management

Analysis of Interrelationship for Lean and Sustainability Principles and Implications
Ilie Mihai Taucean, Serban Miclea, Larisa Ivascu, and Mircea Liviu Negrut

Adjusting the Information Flow on Time to Perform Bicycles Daily Maintenance
Patriciu Ruset, Matei Tamasila, Andra Diaconescu, and Gabriela Prostean

A Survey of Cybersecurity Risk Management Frameworks
Olivia Giuca, Traian Mihai Popescu, Alina Madalina Popescu, Gabriela Prostean, and Daniela Elena Popescu

Remote Wind Energy Conversion System
Cezara-Liliana Rat, Octavian Prostean, Ioan Filip, and Cristian Vasar

Knowledge-Based Technologies for Web Applications, Cloud Computing, Security Algorithms and Computer Networks, Smart City

Software for Integral Security
Adi Sala, Marius Constantin Popescu, and Antoanela Naaji

Real Time Urban Traffic Data Pinpointing Most Important Crossroads
Dacian Avramoni, Alexandru Iovanovici, Cristian Cosariu, Iosif Szeidert-Subert, and Lucian Prodan

Priority Levels and Danger in Usage of Artificial Intelligence in the World of Autonomous Vehicle
Gabor Kiss and Csilla Berecz

Mobile Robot Platform for Studying Sensor Fusion Localization Algorithms
Paul-Onut Negirla and Mariana Nagy

Fuzzy Applications, Theory, Expert Systems, Fuzzy and Control

Design of Intelligent Stabilizing Controller and Payload Estimator for the Electromagnetic Levitation System: DOBFLC Approach
Ravi V. Gandhi and Dipak M. Adhyaru

On the Concept of Fuzzy Graphs for Some Networks
V. Yegnanarayanan, Noemi Clara Rohatinovici, Y. Gayathri Narayana, and Valentina E. Balas

Fuzzy Logic in Predictive Harmonic Systems
W. Bel, D. López De Luise, F. Costantini, I. Antoff, L. Alvarez, and L. Fravega

Particle Swarm Optimization Ear Identification System
B. Lavanya, H. Hannah Inbarani, Ahmad Taher Azar, Khaled M. Fouad, Anis Koubaa, Nashwa Ahmad Kamal, and I. Radu Lala

Puzzle Learning Trail Generation Using Learning Blocks
Doru Anastasiu Popescu, Daniel Nijloveanu, and Nicolae Bold

Design of an Extended Self-tuning Adaptive Controller
Ioan Filip, Florin Dragan, Iosif Szeidert, and Cezara Rat

Assessing the Quality of Bread by Fuzzy Weights of Sensory Attributes
Anca M. Dicu, Marius Mircea Balas, Cecilia Sîrghie, Dana Radu, and Corina Mnerie

Catalan Numbers Associated to Integer Partitions and the Super-Exponential
Tiberiu Spircu and Stefan V. Pantazi

Emotional Control of Wideband Mobile Communication
Vadim L. Stefanuk and Ludmila V. Savinitch

Author Index
Short CVs of Guest Editors
Valentina Emilia Balas is currently Full Professor in the Department of Automatics and Applied Software at the Faculty of Engineering, “Aurel Vlaicu” University of Arad, Romania. She holds a PhD in applied electronics and telecommunications from the Polytechnic University of Timisoara. Dr. Balas is the author of more than 300 research papers in refereed journals and international conferences. Her research interests are intelligent systems, fuzzy control, soft computing, smart sensors, information fusion, and modeling and simulation. She is Editor-in-Chief of the International Journal of Advanced Intelligence Paradigms (IJAIP) and of the International Journal of Computational Systems Engineering (IJCSysE), a member of the Editorial Boards of several national and international journals, and an expert evaluator for national and international projects and PhD theses. Dr. Balas is Director of the Intelligent Systems Research Centre at Aurel Vlaicu University of Arad and Director of the Department of International Relations, Programs and Projects at the same university. She served as General Chair of the International Workshop Soft Computing and Applications (SOFA) in eight editions held between 2005 and 2020 in Romania and Hungary. Dr. Balas has participated in many international conferences as Organizer, Honorary Chair, Session Chair and member of Steering, Advisory or International Program Committees.
She is a member of EUSFLAT and SIAM, a Senior Member of IEEE, a member of the Technical Committee on Fuzzy Systems (IEEE CIS), chair of Task Force 14 of the Technical Committee on Emergent Technologies (IEEE CIS), and a member of the Technical Committee on Soft Computing (IEEE SMCS). Dr. Balas was Vice-President (Awards) of the IFSA (International Fuzzy Systems Association) Council (2013-2015) and is Joint Secretary of the Governing Council of the Forum for Interdisciplinary Mathematics (FIM), a multidisciplinary academic body in India.

Professor Valentina Emilia Balas, PhD
Faculty of Engineering
Aurel Vlaicu University of Arad
B-dul Revolutiei 77
310130 Arad, Romania
[email protected]

Lakhmi C. Jain, BE (Hons), ME, PhD, Fellow (Engineers Australia), served as Visiting Professor at Bournemouth University, UK, until July 2018 and presently serves the University of Technology Sydney, Australia, and Liverpool Hope University, UK. Dr. Jain founded KES International to provide the professional community with opportunities for publication, knowledge exchange, cooperation and teaming. Involving around 5,000 researchers drawn from universities and companies worldwide, KES facilitates international cooperation and generates synergy in teaching and research. KES regularly provides networking opportunities for the professional community through one of the largest conferences of its kind. http://www.kesinternational.org/organisation.php His interests focus on artificial intelligence paradigms and their applications in complex systems, security, e-education, e-healthcare, unmanned air vehicles and intelligent systems design.
Professor Lakhmi C. Jain, PhD, ME, BE (Hons), Fellow (Engineers Australia)
Founder, KES International | http://www.kesinternational.org/organisation.php

Faculty of Engineering and Information Technology
Centre for Artificial Intelligence
University of Technology Sydney
Broadway, NSW 2007, Australia
[email protected]

and

Faculty of Science
Liverpool Hope University
Hope Park
Liverpool, L16 9JD, UK
[email protected]

and

KES International
PO Box 2115
Shoreham-by-Sea, BN43 9AF, UK
[email protected]
Marius Mircea Balas is currently Full Professor in the Department of Automatics and Applied Software at the Faculty of Engineering, “Aurel Vlaicu” University of Arad, Romania. He holds a doctorate in Applied Electronics and Telecommunications from the Politehnica University of Timisoara. Dr. Balas is an IEEE Senior Member.
He is the author of more than 150 papers in journals and conference proceedings and 7 invention patents. His research interests are intelligent and fuzzy systems, soft computing, electronic circuits, modeling and simulation, adaptive control and intelligent transportation. The main original concepts introduced by Prof. Marius M. Balas are: the fuzzy-interpolative systems, the passive green-house, the constant time-to-collision optimization of traffic, the imposed distance braking, the internal model bronze casting, the PWM inverter for railway coaches in tropical environments, the rejection of the switching controllers effect by phase trajectory analysis, the Fermat neuron, etc. He has been a mentor for many student research teams and challenges, with awards from Microsoft Imagine Cup, GDF Suez, etc.

Professor Marius Mircea Balas, PhD
Faculty of Engineering
Aurel Vlaicu University of Arad
B-dul Revolutiei 77
310130 Arad, Romania
[email protected]

Shahnaz N. Shahbazova received her Candidate of Technical Sciences degree in 1995 and has been Associate Professor since 1996. She has served for more than 34 years in the Information Technology and Programming Department of Azerbaijan Technical University. She has been an academician of the International Academy of Sciences named after Lotfi A. Zadeh since 2002 and Vice-President of the same academy since 2014. Since 2011, she has served as General Chair and Organizer of the World Conference on Soft Computing (WCSC), dedicated to preserving and advancing the scientific heritage of Professor Lotfi A. Zadeh. She is Honorary Professor of Obuda University, Hungary, and of Arad University, Romania, and Honorary Doctor of Philosophy in Engineering Sciences of the International Personnel Academy of UNESCO. She is an International Expert of UNESCO in the implementation of Information and Communication Technology (ICT) in educational environments in
Azerbaijan. She is a member of the Editorial Boards of the International Journal of Advanced Intelligence Paradigms (Romania), the Journal of Pure and Applied Mathematics (Baku) and the Journal of Problems of Information Technology (Baku). She has been invited to serve as a Program Committee member for over 30 international conferences and as a reviewer for nearly 40 international journals. Her fellowships are as follows: India, 3 months (1998); Germany (DAAD), 3 months (1999); Germany (DAAD), 3 months (2003); USA, California (Fulbright), 10 months (2007-2008); Germany (DAAD), 3 months (2010); USA, UC Berkeley, 6 months (2012, 2015, 2016, 2017); and USA, UC Berkeley, 12 months (2018). She is the author of more than 152 scientific articles, 8 method manuals, 6 manuals, 1 monograph and 5 Springer publications. Her research interests include artificial intelligence, soft computing, intelligent systems, machine learning techniques for decision making and fuzzy neural networks. Dr. Shahbazova is a member of the Board of Directors of the North American Fuzzy Information Processing Society (NAFIPS), the Berkeley Initiative in Soft Computing group (BISC), the New York Academy of Sciences, and the Defined Candidate Dissertation Society of the Institute of Applied Mathematics at Baku State University. She is also a member of the Defined Candidate Dissertation Society at the Institute of Management Systems of the National Academy of Sciences (NAS), Azerbaijan, a member of IEEE, and of the International Women Club, Azerbaijan.

Professor Shahnaz N. Shahbazova, PhD
Department of Information Technology and Programming
Azerbaijan Technical University
Baku, Azerbaijan
[email protected]
Soft Computing and Conventional Techniques in Power Engineering Methods and Applications in Electrical Engineering I
Daily Load Curve Forecasting. Comparative Analysis: Conventional vs. Unconventional Methods Constantin Barbulescu1(&), Stefan Kilyeni1, Violeta Chis2, Mihaela Craciun2, and Attila Simo1 1 Power Systems Department, Power Systems Analysis and Optimization Research Center, Politehnica University Timisoara, 2 Bd. V. Parvan, Timisoara, Romania {constantin.barbulescu,stefan.kilyeni, attila.simo}@upt.ro 2 Aurel Vlaicu University of Arad, 77 Bd. Revolutiei, Arad, Romania [email protected], [email protected]
Abstract. The current paper focuses on daily load consumption forecasting, a very important issue for every distribution system operator. Several methods are applied to achieve this goal: artificial neural networks and three conventional methods, the latter based on linear approximation, curve fitting and decision trees. The starting point is a large data set belonging to a real distribution system operator within our country; thus, the authors are dealing with real data extracted from the distribution network. A software tool has been developed in the Matlab environment, both for the artificial intelligence based method and for the conventional ones. A huge amount of data has been processed and the forecast has been performed for all the distribution branches belonging to that operator. Several indices have been computed in order to support the related comments. Once all the forecasts have been carried out, detailed analyses have been performed and a hierarchy based on performance characteristics has been provided.

Keywords: Load forecasting · Software tool · Artificial neural networks
1 Introduction

The background of the paper is a long-term collaboration between our University and an important distribution system operator within the Romanian Power System. The load forecasting problem, no matter whether short-term, mid-term or long-term, is a very important one for each distribution system operator. Its results have an impact on the evolution of the distribution network, on its current and future behavior, on related policy, etc. The authors are dealing with a large set of load consumption data provided by the distribution system operator involved. All the numerical results presented below have been obtained using software tools developed by the authors.

© Springer Nature Switzerland AG 2021. V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 3-18, 2021. https://doi.org/10.1007/978-3-030-51992-6_1

Practically, the
paper deals with the big data concept applied to power systems, particularly in the distribution networks field.

The related scientific literature from the last 4 years has been analyzed and the significant aspects are briefly presented within the current section. In [1], the authors performed a review of short-term load forecasting methods, analyzing around 30 papers focused only on artificial neural networks (ANN). They discussed ANNs based on the back-propagation algorithm, the main type of ANN, but also described papers combining ANNs with other artificial intelligence methods, such as genetic algorithms, fuzzy logic and particle swarm optimization [2]. In [3], the authors developed 8 different ANN-based algorithms, differing in the ANN learning mechanism and operation: multilayer perceptron, radial operation, counter-propagation, regression, self-organizing map. They performed a comparative analysis among these methods and another very popular one, concluding that for short-term applications, considering the data they used, the regression-based ANN is the most competitive. The current paper also deals with short-term load forecasting, more exactly the daily load curve. Another application of daily load curve forecasting can be found in [4], where the authors applied a harmonic model (grounded on Fourier series) in order to capture the daily consumption pattern. Their intent was to use only consumption evolution data, thereby avoiding other inputs, such as weather-related ones. In [5], the authors focused on hourly load forecasting, also using ANNs. A commercial software tool was used, a comparison between the obtained results and the publicly available ones was performed, and the ANN's architecture was presented.
The deep neural network concept is applied in [6]. It is grounded on the use of several hidden layers, with the ANN parameters tuned using genetic algorithms. The authors concluded that such an approach is necessary in order to capture most of the dependencies within the historical data, and the resulting forecasts were more accurate. [7] also addressed the daily load forecasting problem, but at minute level. The novelty refers to the load segmentation method: each minute load is divided into several segments and a set of common factors is extracted (weather, temperature, day type, etc.). The forecast is obtained using radial ANNs. In [8], the hourly load forecasting issue is tackled using an adaptive neuro-fuzzy inference system. The back-propagation algorithm, together with the 1st order Sugeno fuzzy model, has been used. The considered input variables are actual consumption, month and hour, and the next hourly load is forecasted for a real consumer (a university) in Turkey. A mixture of three artificial intelligence methods for load prediction is proposed in [9]: the Elman Neural Network is used for short-term load forecasting, while Particle Swarm Optimization (PSO) and Simulated Annealing are used to tune the ANN's parameters. This type of ANN is a dynamic recurrent one, having superior abilities in processing sequential input data.
A hybrid approach between a feed-forward ANN and the PSO algorithm is also presented in [10], applied to hourly load forecasting. The authors highlighted several advantages: better convergence of the PSO-optimized ANN compared with the back-propagation algorithm, and a training process twice as fast in the PSO case. [11] proposes an approach to establish the best correlation between the data and the combination weights. Several types of ANNs have been used, and a special algorithm has been proposed to establish the number of neurons in the hidden layer. The current paper is divided into 4 sections. The 1st one describes a few of the relevant scientific works from the last years. The 2nd one focuses on presenting the developed algorithms. The 3rd one presents the numerical results, together with several comments and a comparison of the applied methods. Finally, the conclusions are synthesized within the 4th section.
2 Developed Algorithms

All the algorithms briefly presented in the following have been implemented in the Matlab environment.

2.1 Artificial Neural Networks (ANN)
The authors are dealing with a feed-forward ANN, which uses a supervised learning mechanism based on the back-propagation algorithm.

• data collection: The following input variables have been considered: load, day type, hour, month and year. Bad data are identified and adjusted by applying a statistical method, in order to avoid model “contamination”. Scaling of the input data is recommended, because the variables range over very different domains and, without scaling, convergence issues appear.

• ANN architecture: It is a multilayer perceptron (MLP) ANN having one single hidden layer. The input layer neurons are the following:
  - month (values ranging from 1 to 12);
  - day (values ranging from 1 to 31);
  - hour (values ranging from 1 to 24);
  - week day (values ranging from 1 to 7);
  - load from the previous week at the same hour t: P(t-168);
  - load from the previous day at the same hour: P(t-24);
  - average hourly load from the previous day.

The output layer has one single neuron, representing the forecasted load value.
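The input vector described above can be sketched as follows. This is a minimal Python illustration (the authors' tool is in Matlab); the function name and the convention of passing calendar fields explicitly are assumptions for the sake of a self-contained example.

```python
def build_features(load, t, month, day, hour, weekday):
    """Return the ANN input vector for forecast hour t.
    `load` is a list of hourly MW values; indices t-168 and t-24 must exist.
    Calendar fields are passed in explicitly for simplicity."""
    prev_day = load[t - 24:t]          # the 24 hours of the previous day
    return [
        month,                         # 1..12
        day,                           # 1..31
        hour,                          # 1..24
        weekday,                       # 1..7
        load[t - 168],                 # same hour, previous week: P(t-168)
        load[t - 24],                  # same hour, previous day: P(t-24)
        sum(prev_day) / 24.0,          # average hourly load of previous day
    ]

# Example: a flat 100 MW history makes every load-derived feature 100.
history = [100.0] * 200
x = build_features(history, t=180, month=9, day=1, hour=13, weekday=6)
```

In practice each feature would then be scaled to a common range, as the data-collection step above recommends.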
The weights of each layer are randomly initialized when the ANN is configured. They are adjusted during the ANN learning phase, in order to obtain forecasted values as close as possible to the measured (known) ones. This process is performed using the available training set.

• ANN training: The MLP ANN training aims to minimize a function of the following form [12]:

E = \frac{1}{m} \sum_{p=1}^{m} \sum_{j=1}^{n} (t_{pj} - o_{pj})^2    (1)
where: n – number of output layer neurons; m – number of models within the training set; t_{pj} – desired output of neuron j for training model p; o_{pj} – current output of neuron j for training model p.

• simulation: The forecasted output is simulated using the trained ANN and the modeled input data. The ANN performance, following the training process, is assessed on another data set (different from the one used for the training process), applying the relation [12]:

deviation\% = \frac{m_{ANN} - m_{computed}}{m_{computed}} \cdot 100    (2)
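The training objective (1) and the deviation measure (2) can be sketched together as follows. This is a minimal Python/numpy stand-in for the authors' Matlab tool: a one-hidden-layer MLP trained by full-batch gradient descent on a synthetic target that replaces the real load data, so all sizes, learning rate and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 6))       # 6 inputs, as in Sect. 2.1
y = X.mean(axis=1, keepdims=True)              # synthetic "load" target

n_in, n_hid, n_out = 6, 8, 1
W1 = rng.normal(0.0, 0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0.0, 0.5, (n_hid, n_out)); b2 = np.zeros(n_out)

def forward(Z):
    h = np.tanh(Z @ W1 + b1)                   # hidden layer
    return h, h @ W2 + b2                      # single linear output neuron

lr = 0.1
for _ in range(3000):
    h, out = forward(X)
    err = out - y                              # o_p - t_p, shape (m, 1)
    gh = (2 * err) @ W2.T * (1.0 - h**2)       # back-propagated gradient
    W2 -= lr * h.T @ (2 * err) / len(X)
    b2 -= lr * (2 * err).mean(axis=0)
    W1 -= lr * X.T @ gh / len(X)
    b1 -= lr * gh.mean(axis=0)

E = float(((forward(X)[1] - y) ** 2).mean())   # Eq. (1) with n = 1

x_new = rng.uniform(0.0, 1.0, size=(1, 6))     # a "new model" to forecast
m_ann = float(forward(x_new)[1])
m_known = float(x_new.mean())
deviation = (m_ann - m_known) / m_known * 100.0  # Eq. (2)
```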
• ANN based forecast: All the weights are saved once the ANN training and simulation phases have ended. At this point the ANN is ready for use; the requested forecast is obtained by presenting new input models to the ANN.

2.2 Multiple Linear Regression (MLR)
Mathematical regression is a method used to predict the value of one variable based on the values of other variables. Simple regression is grounded on the correlation between the criterion and the predictor and has one single predictor variable; multiple regression involves multiple correlations and several predictors [12]. The multiple regression model has the following general form [12]:

Y(t) = f(X_1, X_2, \ldots, X_m) + e    (3)

where: Y – dependent variable; X_i – independent variables; e – error term.

The implemented algorithm is the following:
1. Predictors are generated:
   • consumption from the previous day, at the same hour;
   • hourly average consumption from the previous day;
   • day number (values ranging from 1 to 31);
   • week day number (values ranging from 1 to 7).
2. The input data are formed based on the predictors.
3. Initial values are established for the regression coefficients.
4. The MAPE error is computed for the modeled input data.
5. The regression coefficients are validated against the input data.
6. The regression coefficients are used for load forecasting for the next 24 h.
7. Graphical results are presented.
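Steps 1 to 4 above can be sketched with ordinary least squares. This is a hedged illustration, not the authors' implementation: the predictors follow the list above, but the synthetic data (with a known linear structure) stands in for the operator's real consumption records.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
prev_day_load = rng.uniform(90, 140, n)   # same hour, previous day [MW]
prev_day_avg = rng.uniform(95, 130, n)    # previous-day hourly average [MW]
day_no = rng.integers(1, 32, n)           # 1..31
weekday = rng.integers(1, 8, n)           # 1..7

# synthetic target with a known linear structure plus noise
y = 0.6 * prev_day_load + 0.3 * prev_day_avg + 0.2 * weekday + 5 \
    + rng.normal(0, 1, n)

# design matrix with an intercept column; coefficients via least squares
X = np.column_stack([np.ones(n), prev_day_load, prev_day_avg, day_no, weekday])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ coef
mape = float(np.mean(np.abs((y - y_hat) / y)) * 100)  # MAPE on modelled data
```

Forecasting the next 24 h (step 6) would then apply `X_new @ coef` to the predictor rows built for each future hour.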
This type of mathematical method is useful for predicting the general data tendency, but has weak performance in representing nonlinearities.

2.3 Curve Fitting (CF)
This mathematical method is part of the numerical approximation of functions. Usually, in power engineering, we are dealing with functions defined by points; only the function values at specific points are known [13]:

(X_i, Y_i), \quad i = 0, \ldots, n    (4)

where: X_i, Y_i – known point and function value at that point; n – number of points.

It is requested to approximate the function value at other points, different from the ones already known; the load forecast issue is such an example [13]:

g(x_i) = f(x_i), \quad x_i \in [a, b]    (5)

where: g(x) – approximation function; [a, b] – the considered interval.

A function with several variables has been used for the current paper. The involved variables are the following:
• year, month, day, week day, hour;
• consumption from the previous day, at the same hour;
• hourly average consumption from the previous day.

In the following, the model is validated and the MAPE error is computed for the modeled input data. To achieve this goal, the following criterion has been used:

SSE = \sum_{k=1}^{n} w_k (y_k - \hat{y}_k)^2    (6)

where: SSE – sum of squares due to error; w_k – weights (may be defined or initialized with 1); y_k – corresponding value for x_k; \hat{y}_k – approximated value for x_k. The smaller the SSE, the better the approximation. The last stage refers to the load forecast based on the validated model and the graphical representation of the results.
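The curve-fitting idea and the SSE criterion (6) can be illustrated with a one-variable polynomial fit. This is only a sketch of the mechanism (the paper's model has several variables); the quadratic "measurements" are an assumed toy data set.

```python
import numpy as np

# function known only at points (x_i, y_i), as in Eq. (4)
x = np.linspace(0, 10, 50)
y = 2.0 * x**2 - 3.0 * x + 1.0            # exact quadratic "measurements"

# least-squares fit of the approximation function g(x) = a x^2 + b x + c
coeffs = np.polyfit(x, y, deg=2)
g = np.poly1d(coeffs)

w = np.ones_like(x)                        # weights w_k, all 1 as in the paper
sse = float(np.sum(w * (y - g(x)) ** 2))  # Eq. (6): smaller is better
```

Because the data lie exactly on a quadratic, the recovered coefficients match (a = 2, b = -3, c = 1) and the SSE is numerically negligible; on real load data the SSE stays positive and is what the validation step monitors.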
2.4 Decision Trees (DT)
It is a mathematical method from automated learning systems theory: a type of classifier able to extract and synthesize the information from a database and, based on the extracted information, to provide an answer when requested to solve similar problems. Briefly, the implemented algorithm is the following [14]:
1. Predictors are generated and the input data are formed:
   • consumption from the previous day and from the previous week, at the same hour;
   • hourly average consumption from the previous day;
   • year, month, hour (values ranging from 1 to 24), day number (values ranging from 1 to 31), week day number (values from 1 to 7).
2. The decision tree is generated and validated [15].
3. The MAPE error for the modeled input data is computed.
4. The input data are graphically represented.
5. The load forecast is performed using the decision tree for the next 24 h.
6. The MAPE error is computed and the forecasted curve is graphically represented.
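The tree-generation step can be sketched as a greedy regression tree. This pure-Python illustration is an assumption about the mechanism, not the authors' Matlab implementation; the toy data (weekday and hour predictors, higher load on working days) is invented for the example.

```python
def sse(ys):
    """Sum of squared errors of predicting the mean of ys."""
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def build_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """Greedy binary regression tree: each split minimizes the summed
    squared error of the two children's mean predictions."""
    if depth >= max_depth or len(y) <= 2 * min_leaf:
        return {"leaf": sum(y) / len(y)}
    best = None
    for j in range(len(X[0])):                     # every feature
        for thr in sorted({row[j] for row in X}):  # every threshold
            left = [i for i, row in enumerate(X) if row[j] <= thr]
            right = [i for i, row in enumerate(X) if row[j] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, j, thr, left, right)
    if best is None:
        return {"leaf": sum(y) / len(y)}
    _, j, thr, left, right = best
    return {"feature": j, "threshold": thr,
            "left": build_tree([X[i] for i in left], [y[i] for i in left],
                               depth + 1, max_depth, min_leaf),
            "right": build_tree([X[i] for i in right], [y[i] for i in right],
                                depth + 1, max_depth, min_leaf)}

def predict(tree, x):
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] \
            else tree["right"]
    return tree["leaf"]

# toy data: week day (1..7) and hour (0..23); consumption is high on
# working days and lower at the weekend
X = [[d, h] for d in range(1, 8) for h in range(24)]
y = [120.0 if d <= 5 else 90.0 for d, h in X]
tree = build_tree(X, y)
```

Queried with a new (weekday, hour) pair, the tree routes the input to a leaf and returns that leaf's mean load, which is how step 5 produces the next-24 h forecast.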
3 Numerical Results. Discussions

The September 1st load curve is forecasted based on the curves recorded between January 1st and August 31st, i.e. 243 daily load curves. The results associated with the 2017 and 2018 years are presented. The first 243 daily load curves have been used to perform the forecast, and the September 1st curve has been used to validate it. All the load curve data belong to a real distribution system operator managing the distribution network in the Western part of the Romanian Power System. This operator runs 4 distribution branches; only the results corresponding to one of them (Distribution branch 1) are presented within the current paper. Several methods have been applied in order to perform the requested forecast:
• artificial neural networks (ANN);
• multiple linear regression (MLR);
• curve fitting (CF);
• decision trees (DT).

3.1 Forecast Performed for 2017 Year

The load curve associated with September 1st, 2017 is presented in Table 1; these values are used to validate the forecast. The results corresponding to the different forecasting methods are presented in Table 2. The following quantities are presented:
Table 1. Load curve associated to September the 1st, 2017

Hour  P [MW] | Hour  P [MW] | Hour  P [MW] | Hour  P [MW]
  1   111.1  |   7    99.8  |  13   113.1  |  19   105.8
  2   102.5  |   8   101.2  |  14   110.2  |  20   107.9
  3    99.8  |   9   107.0  |  15   108.5  |  21   121.5
  4    99.6  |  10   111.2  |  16   108.3  |  22   134.2
  5    99.3  |  11   111.1  |  17   107.6  |  23   129.4
  6    98.5  |  12   113.7  |  18   106.4  |  24   116.4
Table 2. The forecasted load curves for September the 1st, 2017 (for each of the four methods ANN, CF, MLR and DT: the hourly forecasted value [MW], the percent relative deviation [%] against the known value and the standard deviation; the resulting performance indices are PI2017 = 259.0 for ANN, 1753 for CF, 920.1 for MLR and 621.8 for DT).
• forecasted values;
• differences regarding the known values (percent relative deviation);
• standard deviation;
• the performance index (PI2017), the sum of the standard deviations corresponding to the 24 hourly values.
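These quantities can be computed as follows. The sketch assumes, based on the numbers printed in Table 2 (e.g. a deviation of 2.32 paired with 5.36, roughly its square), that the "standard deviation" column is the squared percent relative deviation, so the performance index is the sum of these squares over the 24 h; the function names are illustrative. Values computed from the rounded table data differ slightly from the printed ones, which presumably used unrounded measurements.

```python
def percent_deviation(forecast, known):
    """Percent relative deviation, Eq. (2)."""
    return (forecast - known) / known * 100.0

def performance_index(forecasts, knowns):
    """Sum of squared percent deviations over all hours (assumed PI)."""
    return sum(percent_deviation(f, k) ** 2 for f, k in zip(forecasts, knowns))

# first three hours of the ANN forecast vs. the known values of Table 1
known = [111.1, 102.5, 99.8]
ann = [113.7, 104.5, 99.88]
devs = [percent_deviation(f, k) for f, k in zip(ann, known)]
pi3 = performance_index(ann, known)   # partial PI over three hours
```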
The forecasted load curves applying the discussed methods are graphically presented in Fig. 1.
Fig. 1. Graphical representation – forecasted load curves for September the 1st, 2017
The performance indices and the forecast maximum errors are synthesized in Table 3.

Table 3. Performance indices and the forecast maximum errors

                     ANN     CF     MLR    DT
PI2017               259.0   1753   920.1  621.8
Maximum error [%]    5.37    16.9   10.1   8.42
The following comments can be made based on the results provided by the presented methods:
• the best results have been obtained applying the artificial neural networks based method;
• the conventional forecasting methods (curve fitting, multiple linear regression, decision trees) lead to less adequate results; the hierarchy based on the performance indices is: decision trees, multiple linear regression, curve fitting;
• the analysis of the maximum forecast errors highlights good results in case of the artificial neural networks (below 5.5%), while values around 8%, 10% and 17% have been obtained for the conventional methods.
3.2 Forecast Performed for 2018 Year
The load curve associated with September 1st, 2018 is presented in Table 4; these values are used to validate the forecast.

Table 4. Load curve associated to September the 1st, 2018

Hour  P [MW] | Hour  P [MW] | Hour  P [MW] | Hour  P [MW]
  1   117.5  |   7   107.0  |  13   113.1  |  19   113.4
  2   112.2  |   8   108.1  |  14   113.7  |  20   115.4
  3   107.4  |   9   110.2  |  15   114.2  |  21   126.8
  4   104.2  |  10   112.1  |  16   115.0  |  22   136.5
  5   100.1  |  11   112.8  |  17   114.2  |  23   132.4
  6   102.9  |  12   113.1  |  18   113.4  |  24   121.7
The forecast methods applied for the 2017 year are used for this case too. Table 5 has the same structure as Table 2, but the results correspond to the forecast performed for the 2018 year.

Table 5. The forecasted load curves for September the 1st, 2018
(for each method: forecasted value [MW], percent relative deviation [%], standard deviation)

Hour  Known   ANN                     CF                      MLR                     DT
      [MW]    [MW]    [%]     sd      [MW]    [%]     sd      [MW]    [%]    sd      [MW]    [%]     sd
 1    117.5   118.5    0.91    0.82   105.7  −10.05  101.1    107.3   −8.71  75.83   106.7   −9.21   84.80
 2    112.2   108.6   −3.21   10.29   102.6   −8.58   73.67   104.2   −7.13  50.84   106.7   −4.94   24.37
 3    107.4   104.1   −3.11    9.66    99.7   −7.16   51.27   102.9   −4.24  17.95    99.3   −7.52   56.60
 4    104.2   101.4   −2.65    7.04    94.9   −8.96   80.34    99.7   −4.28  18.32    99.1   −4.88   23.77
 5    100.1   101.7    1.63    2.66    93.4   −6.68   44.58    94.2   −5.90  34.78    97.4   −2.67    7.13
 6    102.9   103.8    0.86    0.74    93.9   −8.73   76.23    96.3   −6.42  41.19    97.2   −5.52   30.52
 7    107.0   104.5   −2.36    5.58    95.3  −10.93  119.5     99.3   −7.18  51.50    98.3   −8.13   66.09
 8    108.1   107.7   −0.35    0.12    96.0  −11.18  125.1    102.4   −5.27  27.80    98.5   −8.93   79.69
 9    110.2   110.0   −0.22    0.05    96.7  −12.21  149.2    102.9   −6.67  44.48    98.5  −10.66  113.7
10    112.1   112.1   −0.04    0.00    98.2  −12.38  153.3    103.7   −7.46  55.62   101.2   −9.69   93.85
11    112.8   112.5   −0.28    0.08    99.0  −12.22  149.2    104.2   −7.61  57.86   102.3   −9.31   86.65
12    113.1   113.4    0.30    0.09    99.1  −12.41  154.0    105.9   −6.36  40.41   103.1   −8.82   77.86
13    113.1   111.0   −1.82    3.30   100.0  −11.55  133.5    104.7   −7.43  55.16   103.8   −8.22   67.61
14    113.7   110.8   −2.57    6.62   101.5  −10.71  114.7    105.2   −7.46  55.63   103.4   −9.06   82.06
15    114.2   113.3   −0.76    0.58   103.7   −9.20   84.59   107.3   −6.04  36.51   103.9   −8.98   80.72
16    115.0   113.5   −1.30    1.70   105.3   −8.51   72.36   107.0   −7.00  49.04   104.2   −9.40   88.40
17    114.2   115.9    1.46    2.13   107.1   −6.21   38.59   105.3   −7.82  61.21   106.7   −6.57   43.13
18    113.4   114.8    1.22    1.50   108.6   −4.24   18.00   106.6   −6.00  36.02   108.9   −3.96   15.65
19    113.4   113.1   −0.24    0.06   109.9   −3.06    9.34   108.0   −4.70  22.08   110.1   −2.87    8.25
20    115.4   109.7   −4.98   24.78   110.6   −4.20   17.62   109.7   −4.96  24.62   105.7   −8.39   70.37
21    126.8   122.0   −3.76   14.10   125.0   −1.41    1.99   120.8   −4.73  22.39   118.6   −6.50   42.28
22    136.5   138.3    1.30    1.68   127.3   −6.76   45.71   127.5   −6.59  43.47   121.2  −11.17  124.8
23    132.4   134.3    1.43    2.06   122.9   −7.21   51.96   123.4   −6.77  45.81   125.2   −5.47   29.91
24    121.7   119.8   −1.53    2.33   117.7   −3.23   10.44   118.2   −2.83   8.01   122.3    0.56    0.31

PI2018: ANN 97.97, CF 1876, MLR 976.5, DT 1398
The forecasted load curves obtained by applying the discussed methods are graphically presented in Fig. 2. The real load curve, represented in blue, is used to validate the results. The figure clearly shows that the ANN based forecast is the closest one to the real load curve, while the results of the conventional DT, MLR and CF methods lie somewhat further from it. The performance indices and the forecast maximum errors are synthesized in Table 6. The following comments can be made for the September the 1st, 2018 forecasted load curve:
• the ANN based forecast proves to provide the best results (as for the 2017 year);
• the forecasts based on the conventional methods lead to improper results; among them, multiple linear regression provides the most acceptable ones;
• the analysis of the forecast maximum errors highlights good results for the artificial neural networks (below 5%), whereas values around 9%, 11% and 12% have been obtained for the conventional methods.
Fig. 2. Graphical representation – forecasted load curves for September the 1st, 2018

Table 6. Performance indices and the forecast maximum errors

                    ANN   CF     MLR    DT
PI2018              98.0  1876   976.5  1398
Maximum error [%]   5.0   12.4   8.7    11.2
3.3 Comparative Analysis for Several Forecasts
The September the 1st daily load curve forecast has been performed, for the 2017 and 2018 years, for each of the 4 distribution branches and for the entire assembly of the distribution system operator. The ANN and MLR (the most acceptable results among the conventional methods) based forecast methods are taken into consideration for each case. The corresponding results are presented in Table 7, which synthesizes the following information:
• the performance indices known from the performed forecasts (Tables 2, 5 and the corresponding tables for the other cases), in the column named Total index;
• the specific performance indices, computed by dividing each performance index by the 24 h, in the column named Specific index;
• the ratio between the specific indices of the MLR and ANN methods, in the column named Ratio;
• the 2017 and 2018 years appearing near the names of the distribution branches correspond to the forecast performed for those years.
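The two derived columns are simple transformations of the total indices; a short sketch (function names are illustrative, not code from the paper):

```python
HOURS = 24  # one index contribution per hourly value

def specific_index(total_index, hours=HOURS):
    """Specific performance index: total performance index divided by 24 h."""
    return total_index / hours

def mlr_to_ann_ratio(mlr_total, ann_total, hours=HOURS):
    """Ratio between the MLR and ANN specific indices (column 'Ratio')."""
    return specific_index(mlr_total, hours) / specific_index(ann_total, hours)
```

For Distribution branch1, 2017: 920.1/24 ≈ 38.34, 259.0/24 ≈ 10.79 and 38.34/10.79 ≈ 3.55, matching the first row of Table 7.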
Table 7. Specific performance indices

No.  Distribution branch            ANN total index  ANN specific index  MLR total index  MLR specific index  Ratio
1    Distribution branch1 2017      259.0            10.79               920.1            38.34               3.55
2    Distribution branch1 2018      98.0             4.08                976.5            40.69               9.97
3    Distribution branch2 2017      289.4            12.06               817.8            34.08               2.83
4    Distribution branch2 2018      420.6            17.52               2205             91.87               5.24
5    Distribution branch3 2017      592.5            24.69               729.2            30.38               1.23
6    Distribution branch3 2018      654.2            27.26               1627             67.91               2.49
7    Distribution branch4 2017      867.3            36.14               2617             109.0               3.02
8    Distribution branch4 2018      1593             66.37               6369             265.4               3.99
9    Entire network assembly 2017   266.0            11.08               865.4            36.06               3.25
10   Entire network assembly 2018   149.3            6.22                898.9            37.45               6.02
Two hierarchies are provided based on the information presented in Table 7. The specific performance indices have been used for this purpose, and the obtained hierarchies are presented in Tables 8 and 9.

Table 8. 2017 year performance indices

No.  Distribution branch            ANN total index  ANN specific index  MLR total index  MLR specific index  Ratio
1    Distribution branch1 2017      259.0            10.79               920.1            38.34               3.55
2    Entire network assembly 2017   266.0            11.08               865.4            36.06               3.25
3    Distribution branch2 2017      289.4            12.06               817.8            34.08               2.83
4    Distribution branch3 2017      592.5            24.69               729.2            30.38               1.23
5    Distribution branch4 2017      867.3            36.14               2617             109.0               3.02
Table 9. 2018 year performance indices

No.  Distribution branch            ANN total index  ANN specific index  MLR total index  MLR specific index  Ratio
1    Distribution branch1 2018      98.0             4.08                976.5            40.69               9.97
2    Entire network assembly 2018   149.3            6.22                898.9            37.45               6.02
3    Distribution branch2 2018      420.6            17.52               2205             91.87               5.24
4    Distribution branch3 2018      654.2            27.26               1627             67.91               2.49
5    Distribution branch4 2018      1593             66.37               6369             265.4               3.99
At this point, a single hierarchy is useful. Table 10 presents the hierarchy obtained after cumulating the performance indices for the 2017 and 2018 years (the specific indices are now computed over the cumulated 2 × 24 hourly values).

Table 10. Cumulated performance indices

No.  Distribution branch          ANN total index  ANN specific index  MLR total index  MLR specific index  Ratio
1    Distribution branch1         357.0            7.44                1897             39.52               5.31
2    Entire network assembly      415.3            8.65                1764             36.75               4.25
3    Distribution branch2         710.0            14.79               3023             62.98               4.26
4    Distribution branch3         1247             25.98               2356             49.08               1.89
5    Distribution branch4         2460             51.25               9013             187.8               3.59
The maximum forecast errors are presented in Table 11.

Table 11. Maximum forecast errors

[For each of the ten cases of Table 7 (the four distribution branches and the entire network assembly, 2017 and 2018), Table 11 lists the maximum forecast error [%] of the ANN, CF, MLR and DT methods; e.g. Distribution branch1 2018: 5.0, 12.4, 8.7 and 11.2, and Distribution branch1 2017: 5.37, 16.9, 10.1 and 8.42 (cf. Tables 6 and 3). The extraction scrambled the remaining rows beyond reliable reconstruction.]
The following analysis is provided based on the information presented within these tables:
• Distribution branch1 ranks 1st according to the ANN based forecast (Tables 8, 9 and 10). It is closely followed by the entire network assembly of the distribution system operator and, at a larger distance, by Distribution branch2 and Distribution branch3. Distribution branch4 ranks last, at a huge distance from the others;
• the explanation for Distribution branch4 lies in "special" specific elements of its consumption evolution and load curve pattern;
• the specific performance indices, in case of the ANN method, cover a large range, from 4.08 (Distribution branch1, 2018 forecast) to 66.37 (Distribution branch4, also for 2018). The known load curve data correspond to the entire week (all days);
• forecast maximum errors below 5% are considered the best, those ranging between 5–10% are considered acceptable, and those greater than 10% inadequate.
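The error-banding rule from the last bullet, written as a small helper (a sketch; the thresholds are exactly the ones stated above):

```python
def classify_max_error(max_error_percent):
    """Band a forecast's maximum error: <5% best, 5-10% acceptable, >10% inadequate."""
    if max_error_percent < 5.0:
        return "best"
    if max_error_percent <= 10.0:
        return "acceptable"
    return "inadequate"
```

By this rule the ANN forecasts fall in the best/acceptable bands (e.g. 5.0% and 5.37% for Distribution branch1), while most conventional-method errors are inadequate.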
4 Conclusion

The paper deals with daily load curve forecasting using real power system data. Conventional and unconventional methods have been applied in order to achieve this goal; the unconventional ones refer to ANNs. The ANN based load forecasting proves to be more efficient for real power system data such as the ones used within the current paper. The specific performance index value is related to the initial data (daily load curves for January the 1st – August the 31st) and to the load curves' correlation degree. High values of the specific performance index highlight the presence of bad consumption data (totally discrepant data); by eliminating (or "correcting") these data the situation could be improved.

Acknowledgment. This work was partially supported by the Politehnica University Timisoara research grants PCD-TC-2017.
LoRaWAN Based Airport Runway Lights Monitoring System

Attila Simo, Constantin Barbulescu, and Stefan Kilyeni

Power Systems Department, Power Systems Analysis and Optimization Research Center, Politehnica University Timisoara, Timișoara, Romania
{attila.simo,constantin.barbulescu,stefan.kilyeni}@upt.ro
Abstract. Airport traffic requires a high-performance radio-navigation and light beacon system to ensure safe aircraft operation in low visibility conditions, which also means lower flight times for airlines and therefore reduced fuel consumption. In many cases the airport runway light systems were designed and built decades ago, taking into consideration the technologies available at that time. Low power wide area networks give us other perspectives. These new technologies are unique because they make different trade-offs than the classical ones predominant in the IoT field, such as short-range wireless networks, legacy wireless networks and cellular networks. They offer unique sets of features and benefits, the most important being connectivity over an extended geographic area for devices with low power consumption and low data flow, which does not characterize the classical wireless technologies. This paper presents a novel perspective on light beacon system monitoring using non-invasive methods. The data collected by the developed devices is centralized and used for real-time visualization and alerting by the monitoring and intervention staff.

Keywords: LPWA · Airport runway lights · Safe aircraft operation
1 Introduction

This paper is part of a more complex work in the Low Power Wide Area (LPWA) networks field, ongoing in the Innovation Department of the ETA-2U Company, in collaboration with the Power Systems Department of Politehnica University Timisoara. The goal is to implement and test an airport runway lights non-invasive monitoring system using wireless sensors and the LoRa technology. LPWA is a new communication concept which complements the existing short-range wireless and classical cellular technologies, remaining in close contact with the Internet of Things (IoT) sector. LPWA networks offer unique sets of features and benefits, the most important being connectivity over an extended geographic area for devices with low power consumption and low data flow, which does not characterize the classical wireless technologies.

© Springer Nature Switzerland AG 2021. V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 19–29, 2021. https://doi.org/10.1007/978-3-030-51992-6_2

The low power wide area networks are special because they make different trade-offs than the traditional ones predominant in the IoT field, such as short-range wireless
A. Simo et al.
networks (Bluetooth, ZigBee, infrared transmission), legacy wireless networks (Wi-Fi) and cellular networks (GSM, LTE, CDMA). Their market is growing: it is expected that 25% of a total of 40–45 billion IoT/M2M devices will be connected to the Internet using LPWA networks [1, 2]. Because the coverage of the traditional non-cellular wireless technologies is limited to a few hundred meters in ideal conditions, and their energy consumption is considerably higher, they are not suitable for operating low power devices distributed over large geographical areas [4, 14]. With these traditional technologies it is also not practical to deploy devices randomly, or to move them anytime and anywhere, as is often the case in tracking applications or agriculture use cases. Extending the range of these technologies requires special schemes with multiple devices and base stations, which involves additional costs [2, 5]. Obviously, this field is attractive to the mobile industry, and initiatives have been ongoing for several years in an attempt to deliver standards that will enable mobile operators to offer LPWA-like connectivity. Due to device autonomy and coverage over large geographic areas, LPWA technologies promise to connect the Internet "things" with low energy consumption at a low price. Figure 1 shows the variety of use cases across several business sectors that can exploit these technologies to connect their devices. They target smart industry, smart environment, smart metering, pet localization, tracking, industrial and traffic monitoring, agriculture applications, etc.
Fig. 1. LPWA technologies presence in different sectors
Although these types of technologies achieve long range and low power operation as advantages, they also present disadvantages such as a low data rate (tens of kb/s) and higher latency (seconds or minutes). Obviously, the low power wide area technologies are not made to offer solutions for every type of IoT application. But for a use case that needs low power consumption devices, tolerates communication delays within certain limits and is satisfied with a low data rate, these low power wide area technologies are surely the best choice. Nowadays the Internet of Things (IoT) becomes even more fascinating because of Low Power Wide Area (LPWA) networks. Since the beginning, IoT has promised to change the way we live. To address challenges such as the energy crisis, resource depletion, environmental pollution, etc., it must provide smart solutions where "things" from multiple sectors are interconnected and the information reaching decision makers is as relevant as possible. Many studies discuss this sector, because it has risen rapidly and attracted much attention. There are voices saying that by 2020 the percentage of human subscribers using PCs, laptops, tablets, cellphones, etc. will be considerably smaller than the percentage of connected devices. Others say that by 2025 the whole IoT industry will bring about 4.5 trillion dollars of income through its different sectors [1, 3]. The 1st section provides an overview of this industry sector dominated by the low power wide area technologies. In the 2nd section the LoRaWAN communication protocol is presented briefly. The 3rd part of this paper shows the use case of creating a device that helps monitor an airport light beacon system. Finally, conclusions are synthesized within the 4th section.
2 LoRa Technology

LoRa (long range) is a proprietary physical layer used for low power wide area connectivity, a digital wireless data communication technology using license-free sub-gigahertz ISM radio frequency bands, i.e. 169 MHz, 433 MHz and 868 MHz in Europe. Using a chirp spread spectrum technique, it achieves bidirectional communication by spreading a narrow-band input signal over a wider channel bandwidth. LoRaWAN is a MAC layer protocol; practically, it represents the network on which LoRa operates. It is a network layer protocol that manages the communication between LPWA network base stations and end-node devices as a routing protocol (Fig. 2). It works similarly to ALOHA schemes, communicating with multiple devices at the same time via multiple communication channels (spreading factors). The end-devices may connect to any base station (gateway) without extra signaling overhead. Through these base stations, the devices actually connect via a backhaul to a network server, which acts as the central unit of the LoRaWAN system. The data from the network server reaches the application server. The application servers, being ready for user-defined tasks (Fig. 3), process the received data. Due to the diversity of applications and the multitude of features of end-devices, the standard defining the protocol divides devices into 3 major categories. Devices in all three classes allow bidirectional communication, but the classes differ on criteria such as downlink latency or power requirements. These classes are the following:
Fig. 2. Technology architecture
Fig. 3. Generic architecture of a LoRa type network
• Class A – end-devices have the largest autonomy, but the highest latency. This is possible because they listen for a downlink communication only shortly after an uplink transmission.
• Class B – in addition to the class A behavior, end-devices can schedule downlink receptions from the gateway at certain periods. This makes it possible for applications to send control requests to the final devices (for the possible realization of a drive function).
• Class C – devices in this category are generally powered continuously; thus they are continuously connected and receive downlink transmissions with the shortest possible latency at any time.

For device authentication with the network, the LoRaWAN standard uses symmetric-key cryptography and keeps the application's data confidential [2, 11, 12].
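To illustrate the range-versus-rate trade-off behind these classes: the time a frame spends on air grows steeply with the spreading factor. The sketch below implements the standard Semtech time-on-air formula (application note AN1200.13); the defaults (125 kHz bandwidth, coding rate 4/5, 8-symbol preamble, explicit header, CRC on) are typical European settings and are our assumptions, not parameters taken from this paper:

```python
import math

def lora_time_on_air(payload_bytes, sf=7, bw=125_000, cr=1,
                     preamble=8, explicit_header=True, crc=True,
                     low_dr_opt=False):
    """Time on air [s] of one LoRa frame (Semtech AN1200.13 formula).

    cr is the coding-rate index: 1 -> 4/5 ... 4 -> 4/8.
    """
    t_sym = (2 ** sf) / bw                          # symbol duration [s]
    t_preamble = (preamble + 4.25) * t_sym
    h = 0 if explicit_header else 1
    de = 1 if low_dr_opt else 0
    num = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * h
    n_payload = 8 + max(math.ceil(num / (4 * (sf - 2 * de))) * (cr + 4), 0)
    return t_preamble + n_payload * t_sym
```

With these defaults a 20-byte frame takes roughly 57 ms at SF7 but well over a second at SF12, which is why devices sending rare, short uplinks — like the lamp monitors discussed later — suit LoRaWAN so well.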
There are many studies in which this communication protocol is evaluated in real conditions, in outdoor or indoor settings; some results praise it, others criticize it. In [6] the authors have made a study in which they successfully test the LoRaWAN communication protocol with 15–30 km coverage on ground and water, in Oulu, Finland. An interesting study in [7] provides a comprehensive analysis of LoRa modulation, including a multitude of parameters such as spreading factor, data rate, etc. The authors propose a new testing structure to study the network performance. Following the study, they conclude that LoRa modulation, due to the chirp spread spectrum modulation and high receiver sensitivity, offers better resistance to interference than other similar technologies. Their tests show that this kind of system can offer network coverage up to 3 km in a suburban area with dense residential houses. The focus of [8] is on one of the most prominent LPWAN technologies: LoRa™. The authors implemented a new ns-3 module to study the performance of a LoRa-based IoT network in a typical urban scenario. Simulation results show that a LoRa network can scale well, achieving packet success rates above 95% in the presence of a number of end devices on the order of 10^4. Another interesting study, an IoT application in tactical troop tracking systems, developed by Thai authors and named Universal and Ubiquitous LoRa (U-LoRa), is presented in [9]. Their proposed system integrates four gateways (developed with the help of Raspberry Pi). For the end-device, it is possible to integrate different types of wireless sensors: temperature, GPS, humidity, other gas sensors. Data collected from the network server reaches the application server, where it can be viewed and analyzed. The proposed system provides not only emerging long-range communication but also low-power operation in a military campsite within 0.5 km, using a transmission power of 4 dBi. The authors also present some interesting results in [10]: a gateway having a 96.7% packet delivery ratio and an end-device transmitting information at 14 dBm to a base station at a 420 m distance. In another study, the authors question the reliability of LoRa technology [13]. They run tests in all sorts of environments, studying the impact of different factors on LoRa performance. For example, they have shown that higher temperatures significantly reduce the power of the received signal and can drastically affect packet reception. They have also shown that it is not always worth adjusting the parameters that reduce the data rate in an attempt to increase the probability of successful data reception, especially on links at the edge of their communication range. In many airports the runway light systems have been built decades ago, taking into consideration the technologies available at that time. Nowadays, with the increase in air traffic and the aging of the infrastructure, it becomes necessary to modernize these systems. The classic way to modernization involves redesigning and rebuilding, which implies effort, implementation time and significant costs. Technologies available at this time, such as LoRa, can create viable alternatives with considerably smaller effort, costs and deployment time, which lets us develop devices that bring the system to the level of current requirements.
3 Case Study and Results

The proposed solution is based on the LoRaWAN communication protocol. It has the following components:
• voltage sensors installed on the electric circuit of each lamp;
• gateways that receive the data provided by the sensors;
• an industrial server that collects the data. It is connected via an Ethernet connection; fiber optic is used for data transmission between the gateway and the server, with conversion devices to the Ethernet interface at both ends;
• a software-tool developed and customized according to the requirements of the involved airport;
• an e-mail based notification service installed on the server, able to be integrated into the IT infrastructure of the involved airport.

Practically, a set of non-invasive devices is added to an existing light beacon system or to a part of it. The received data are centralized and used to view, to analyze and to alert the intervention staff in real time. The implemented project may be described as follows:
• the installed sensors monitor the lamps' state, transmitting data to the receiver only when a problem occurs;
• the data sent by each sensor are received by the gateway and transmitted to the server;
• the software-tool installed on the server is designed to collect, manage and analyze the received data. Three colors are used to describe the incidents:
  – green, meaning that all the lamps are operating;
  – yellow, signifying that an incident requiring a programmed intervention has occurred;
  – red, highlighting that a major incident has occurred and needs to be solved with maximum priority;
• the software-tool is designed to have a user friendly interface. The lamps' map is displayed, with several information possibilities:
  – all the lamps on (normal operation), or all the incidents highlighted (alarm operation);
  – the collected parameters are displayed;
  – the history can be analyzed, with multiple sorting, filtering and exporting (csv, xls, etc.) possibilities;
  – daily reporting is performed into a document, over an interval selectable by the user;
  – incident reports are automatically generated at occurrence. Each report contains the map and the incident location and is attached to the alerting e-mail, so the on-field intervention team can receive it even on mobile devices;
  – the software-tool also collects and reports the sensors' operating parameters (state of the batteries, radio transmission parameters);
  – the map of the sensors can be edited through a dedicated interface (in the same application); thus, maintenance and future re-location of the sensors are supported;
  – login to the software-tool is performed through a user and password account.

An original device has been developed through the collaboration between Politehnica University Timisoara and the ETA2U Company, Innovation Department. It has been designed to monitor the light beacon system of airports. The monitoring device detects two aspects related to the lamp power supply:
• the presence of electric current in the power supply circuit; a magnetometer is used for this purpose, measuring the magnetic field generated by the conductor current;
• the detection of electric voltage in the power supply circuit; an impulse counter is used, electrically isolated by an optocoupler from the conductor supplying the lamp.

This device is supplied from the same power outlet as the airport light beacon system, using the same connectors (Fig. 4).
Fig. 4. Connectors used for the light beacon system
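The three-color convention described earlier maps onto a small server-side helper. This is a sketch with invented names; in particular, the criterion for what counts as a major incident (here, a fault on a lamp marked critical) is our assumption, since the paper does not spell it out:

```python
from enum import Enum

class Status(Enum):
    GREEN = "all lamps operating"
    YELLOW = "incident - schedule an intervention"
    RED = "major incident - solve with maximum priority"

def system_status(faulted_lamps, critical_lamps):
    """Color the overall system state from the currently faulted lamps."""
    if any(lamp in critical_lamps for lamp in faulted_lamps):
        return Status.RED
    if faulted_lamps:
        return Status.YELLOW
    return Status.GREEN
```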
The monitoring system device operating principle is presented in Table 1. The developed device incorporates a LoRa type radio module. A message is transmitted once a fault state is detected; thus, the error is signaled to the central receiver (a LoRa gateway, in our case). Additionally, at regular time intervals, information regarding the monitoring device state is sent.
Table 1. Monitoring system device operating principle

Electric current   Electric voltage   Lamp state
YES                YES                OPERATING
NO                 YES                FAULT
NO                 NO                 NOT SUPPLIED
YES                NO                 N/A
The messages sent through the gateways reach an IT system composed of a server and clients. Its goal is to process and save them and to generate alerts in critical situations. The current state of the light beacon system can be viewed on a display. The data transmission and processing subsystem architecture is presented in Fig. 5.
Fig. 5. Data transmission and processing subsystem
The developed device has two main components:
• the measurement module, represented by a PCB; its goal is to measure the voltage and current values of the monitored circuits;
• an adapted ED 1608 LoRa type module; its goal is to take decisions and to transmit data to the gateway. It has been programmed using a firmware specially designed for the airport light beacon system monitoring.

The monitoring device operation is described as follows:
• a "Boot" (0x00) type message is sent once the device is started;
• an "Alive" (0x01) type message is sent every 6 h (a time period configurable as the client requests), reporting that the device is operating, the battery state and the lamp state;
• an "Update" (0x02) type message is sent once a fault has occurred or has been cleared, according to the rules presented in Table 2.
Table 2. Rules used for the "Update" message

State     Voltage   Current            Update message
On        On        On                 No
Off       Off       Off                No
Error     On        Off                Yes
Recover   On        On (after error)   Yes
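Tables 1 and 2 together define a small state machine: classify the lamp from the two measurements, and transmit an "Update" only when a fault appears or clears. A sketch of that logic (function names are ours; the actual firmware is not published):

```python
def lamp_state(voltage_present, current_present):
    """Classify the lamp per Table 1."""
    if voltage_present and current_present:
        return "OPERATING"
    if voltage_present:
        return "FAULT"          # voltage present, but no current flows
    if current_present:
        return "N/A"            # current without voltage: measurement anomaly
    return "NOT SUPPLIED"

def should_send_update(previous_state, new_state):
    """Per Table 2: 'Update' is sent only on fault occurrence or recovery."""
    return new_state != previous_state and "FAULT" in (previous_state, new_state)
```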
The developed device includes a battery providing a 2-year power supply. Theoretically, it provides the energy necessary to send 100,000 messages, but it will degrade before reaching this number due to age and climate conditions (winter/summer temperature variations). The monitoring device is isolated against humidity and dust, being encapsulated into an IP68 certified box. The lamps are supplied at 10–30 V A.C., 50 Hz; the consumed power is 180 W (24–30 V). The optocoupler is damaged if a voltage greater than 30 V is applied. The developed device cannot be supplied with direct current (D.C.). The tests have been performed in laboratory conditions, using a 230/24 V, 50 Hz, 200 VA (8.33 A) transformer, as presented in Fig. 6.
Fig. 6. Laboratory testing of the developed device
The load consisted of three 60 W/24 V light bulbs connected in parallel. Different scenarios were thus simulated and the operating behavior of the device analyzed.
A. Simo et al.
4 Conclusions
Airport traffic requires a high-performance radio-navigation and light beacon system to ensure safe aircraft operation in low-visibility conditions, which means shorter flight times for airlines and therefore reduced fuel consumption. This kind of facility gives airlines further advantages, such as considerably reduced delays and avoided cancellations, so passengers can travel according to the flight schedule. In many cases, however, airport runway light systems were designed and built decades ago for the needs and technologies available at that time. Over time, with the increase in air traffic and the aging of the infrastructure, it becomes necessary to modernize these systems in order to improve operating times and to detect operating errors immediately. The classic way to modernize involves redesigning and rebuilding these systems, which requires significant effort, implementation time and cost. A viable alternative, with considerably smaller effort, cost and deployment time, is to equip the current infrastructure with additional devices that bring the system up to the level of current requirements. In view of these considerations, it is justified to implement an efficient beacon monitoring system, which can guarantee the continuous and safe operation of the system and allows rapid identification of unwanted situations. The developed device offers a novel perspective on light beacon system monitoring using non-invasive methods. The data collected by these devices is centralized and used for real-time visualization and alerting by the monitoring and intervention staff.
Acknowledgment. This work was partially supported by the Politehnica University Timisoara research grant PCD-TC-2017. The authors address special thanks to the members of Eta-2U, Department of Innovation.
The Enhanced Crow Search Algorithm for Fuel-Cost Function Parameters Assessment of the Cogeneration Units from Thermal Power Plants Dinu-Calin Secui, Simona Dzitac(&), and Serban-Ioan Bunda University of Oradea, University St. 1, 410087 Oradea, Romania [email protected], [email protected], [email protected]
Abstract. Optimal dispatching of the thermal power plants which simultaneously produce heat and electricity (the cogeneration units) is an important economic issue in power systems. To approach this problem, it is necessary to establish the parameters of the fuel-cost function as accurately as possible, these parameters describing the link between the production cost of a unit, the electrical power and the amount of heat produced. In this paper, an enhanced crow search (ECS) algorithm is applied to estimate the fuel-cost parameters of thermal units which produce only heat or operate in cogeneration mode. The ECS algorithm has the same framework as the original crow search (CS) algorithm, but it introduces a different relation to modify the solutions in the search space. The effectiveness of ECS and CS is verified on a three-unit thermal system, and the results obtained by the ECS and CS algorithms are compared with those obtained by other well-known algorithms.
Keywords: Crow search algorithm · Fuel-cost function · Cogeneration
1 Introduction
Cogeneration is a process in which both electricity and heat are produced simultaneously [1]. The efficiency of this process is higher than that of thermal units producing only electricity or only thermal energy [2, 3]. The economic dispatch (ED) [4, 5] and the combined heat and power economic dispatch (CHPED) [6] are important topics in power system operation, and both are seen as non-linear optimization problems. The CHPED problem involves determining the electrical power and the amount of heat generated by each unit (heat-only, power-only or combined heat and power), so that the fuel cost for the entire power system is minimal and the technical restrictions are satisfied [1]. Solving the CHPED problem requires knowledge of the model (function) which describes the relationship between the production cost, the electric power and the heat produced by a cogeneration unit. This relationship is called the fuel-cost function of the generating unit. Identifying the fuel-cost function of a cogeneration unit implies estimating the parameters of the considered model as accurately as possible.
© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 30–40, 2021. https://doi.org/10.1007/978-3-030-51992-6_3
In order to estimate the fuel-cost function
parameters of the thermal units, several metaheuristic methods have been used over time. Some of these methods are outlined below: particle swarm optimization (PSO) [7, 8], artificial bee colony (ABC) [9], teaching-learning based optimization [10], the cuckoo search algorithm [11], the radial movement optimization technique [12], and an improved differential evolution algorithm [13]. The results obtained by these algorithms show that metaheuristic methods can also be used successfully for estimating the fuel-cost function parameters of thermal units of various types (heat-only, power-only or combined heat and power). A new metaheuristic population-based technique, the crow search (CS) algorithm, was recently developed by Askarzadeh. The CS algorithm mimics the intelligent behavior of crows, which hide excess food and then, if necessary, recover it [14]. The CS algorithm has been used successfully in optimizing mathematical functions [14] and in solving engineering problems in various fields: economic dispatch [15], optimal selection of conductor size in radial distribution networks [16], the DNA fragment assembly problem [17], and image processing [18]. The main work in this article is testing the efficiency of the ECS and CS algorithms in solving the problem of estimating the fuel-cost function parameters of thermal units which produce only heat or operate in cogeneration mode.
2 Estimating the Fuel-Cost Function Parameters
Estimating the fuel-cost function parameters is an optimization problem with restrictions, in which the components of the vector of variables to be optimized, a = [a1, a2, …, ap, …, aP], are the parameters of the considered model, P being the number of model parameters. The objective function, denoted eTotal, minimizes the total estimation error according to the equation [8]:

eTotal = Σ_{v=1..V} |Ca,v − Ce,v|    (1)
where Ca,v, v = 1, 2, …, V are the actual (measured) values of the fuel-cost function; Ce,v, v = 1, 2, …, V are the estimated values of the fuel-cost function; V represents the number of values/observations for the particular generating unit studied. The Ca,v values are considered known, and the Ce,v values are calculated from the fuel-cost function with Eq. (2), where Ce,v = Cv(Hv), or with Eq. (3), where Ce,v = Cv(Hv, Pv). In this paper, two models are considered to describe the fuel-cost function of the thermal generating units. The first model (model 1 [2]) refers to thermal units that produce only heat and is defined by the equation:

Cv(Hv) = a1 + a2·Hv + a3·Hv^2    (2)

where a1, a2, a3 are the cost coefficients (the variables of the problem) for a unit that produces only heat; Hv is the heat produced by the thermal unit for measured value v; Cv(Hv) is the fuel-cost function for heat-only units.
The second model (model 2 [2]) refers to thermal units in cogeneration mode and is defined by the equation:

Cv(Hv, Pv) = a1 + a2·Pv + a3·Pv^2 + a4·Hv + a5·Hv^2 + a6·Pv·Hv    (3)
where a1, a2, a3, a4, a5, a6 are the cost coefficients (the variables of the problem) for a unit in cogeneration mode; Pv is the active power produced by the thermal unit for measured value v; Cv(Hv, Pv) is the fuel-cost function for a cogeneration unit. The restrictions imposed on the model parameters are:

amin,p ≤ ap ≤ amax,p,  p = 1, 2, …, P    (4)
where amin,p, amax,p are the minimum and maximum values of parameter p of the studied model; p is the index of the model parameter ap.
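Written out, the objective of this section is straightforward to evaluate. The following sketch assumes that Eq. (1) sums absolute errors (the operator was lost in typesetting), and uses invented illustrative data rather than the values of Table A1:

```python
def cost_model2(a, P, H):
    """Model 2, Eq. (3): fuel cost of a cogeneration unit for power P and heat H."""
    a1, a2, a3, a4, a5, a6 = a
    return a1 + a2*P + a3*P**2 + a4*H + a5*H**2 + a6*P*H

def e_total(a, observations):
    """Eq. (1): total absolute estimation error over the V observations.

    observations: iterable of (P_v, H_v, C_actual_v) triples.
    """
    return sum(abs(C - cost_model2(a, P, H)) for P, H, C in observations)

# With the exact coefficients the error vanishes; a perturbed a2 does not.
a_true = (1250.0, 36.0, 0.0435, 0.6, 0.027, 0.011)
obs = [(P, H, cost_model2(a_true, P, H)) for P, H in [(40, 75), (50, 10), (65, 47)]]
assert e_total(a_true, obs) == 0.0
assert e_total((1250.0, 37.0, 0.0435, 0.6, 0.027, 0.011), obs) > 0.0
```

Minimizing e_total over the vector a, subject to the bounds of Eq. (4), is exactly the problem handed to the CS and ECS algorithms below.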
3 The Crow Search Algorithm (CS)
The CS [14] is a metaheuristic algorithm which models the behavior of a flock of crows looking either for their own food, hidden in various places, or for the hidden food of other crows. Crows are considered smart birds with the ability to communicate, to memorize for a long time the locations in which they have hidden food, and to watch other crows in order to identify and steal their hidden food [14]. To build an artificial model, the following associations are made [14]: crows are search agents in a space with P dimensions; the positions of the crows in the search space are feasible solutions of the problem; the potential of the food source corresponding to the position (solution) of a crow is valued quantitatively/qualitatively by the objective function f. The best identified source of food is considered the best solution of the problem. We consider a flock (population) made up of NC crows. The position of a crow i in the search space at iteration t is represented by the vector Xi(t) = [xi1(t), …, xip(t), …, xiP(t)], i = 1, 2, …, NC; xip(t) is the p-th component of Xi(t). Also, every crow, at iteration t, memorizes the position of its hiding place, denoted mi(t) = [mi1(t), …, mip(t), …, miP(t)], i = 1, 2, …, NC; mip(t) is the p-th component of mi(t). The position of the hiding place, mi(t), represents the best position obtained and memorized by each crow up to iteration t. Moving a crow i from position Xi(t) at iteration t to position Xi(t + 1) at iteration t + 1 is done with the equation [14]:

Xi(t+1) = Xi(t) + ri·fli(t)·(mj(t) − Xi(t)),  if rj ≥ AFj(t)
Xi(t+1) = Xmin + r1i·(Xmax − Xmin),  otherwise    (5)
where mj(t) is the best memorized position of crow j up to iteration t; Xmin, Xmax are vectors whose components are the minimum, respectively the maximum, values of the vector of variables [X] = [a]; ri, rj, r1i are random numbers uniformly distributed in the interval (0, 1); fli(t) is the flight length of crow i at iteration t; AFj(t) is the awareness probability of crow j at iteration t.
The memory mi(t + 1) of each crow i, i = 1, 2, …, NC, is updated at iteration t + 1 using the equation:

mi(t+1) = Xi(t+1),  if f(Xi(t+1)) < f(mi(t))
mi(t+1) = mi(t),  otherwise    (6)
where f(Xi(t + 1)) and f(mi(t)) are the values of the objective function corresponding to the positions Xi(t + 1) and mi(t), respectively. A pseudo-code of the CS algorithm [14], with some specifications regarding the solving of the estimation problem, is presented below.
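The pseudo-code itself did not survive the page layout; the loop it describes can be sketched as follows, as a minimal Python illustration of Eqs. (5) and (6). The names are ours, and the clamping of out-of-bounds components is one simple option among several (the paper does not detail bound handling):

```python
import random

def crow_search(f, xmin, xmax, NC=20, tmax=300, fl=0.25, AP=0.02):
    """Minimal crow search (CS) sketch after Askarzadeh [14].

    f: objective to minimize; xmin/xmax: lists bounding each of the P
    variables. Returns the best memorized position and its objective value.
    """
    P = len(xmin)
    def rand_pos():
        return [xmin[p] + random.random() * (xmax[p] - xmin[p]) for p in range(P)]

    X = [rand_pos() for _ in range(NC)]   # crow positions
    m = [xi[:] for xi in X]               # memorized hiding places
    for _ in range(tmax):
        for i in range(NC):
            j = random.randrange(NC)      # crow i follows a random crow j
            if random.random() >= AP:     # Eq. (5), first branch: guided move
                ri = random.random()
                cand = [X[i][p] + ri * fl * (m[j][p] - X[i][p]) for p in range(P)]
                X[i] = [min(max(c, xmin[p]), xmax[p]) for p, c in enumerate(cand)]
            else:                         # Eq. (5), second branch: random restart
                X[i] = rand_pos()
            if f(X[i]) < f(m[i]):         # Eq. (6): memory update
                m[i] = X[i][:]
    best = min(m, key=f)
    return best, f(best)

# Example: minimize the sphere function on [-5, 5]^2
best, val = crow_search(lambda x: sum(v * v for v in x), [-5, -5], [5, 5])
```

With the paper's unit-1 settings (NC = 20, tmax = 300), this spends 6000 guided moves, matching the MaxFE budget reported in Sect. 5.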
The CS has four parameters: the size of the crow population (NC), the maximum number of iterations (tmax), the flight length (fl) and the awareness probability (AP). If fl is set to low values (fl < 1), the search is made close to the position Xi(t); if fl takes high values (fl > 1), the search is made in areas further away from Xi(t). The awareness probability maintains the balance between the exploitation phase and the exploration phase: small values of AP favor exploitation, and larger values favor exploration.
4 The Enhanced CS Algorithm (ECS)
The ECS algorithm has the same framework as the original crow search algorithm (CS) presented in Sect. 3. Strengthening the CS algorithm consists in introducing another equation to update the position of a crow in the search space. Thus, Eq. (5) of the original CS is replaced by Eq. (7) in the ECS, and updating the positions Xi(t + 1), i = 1, 2, …, NC, at iteration t + 1 is done with the equation:

Xi(t+1) = mbest(t) + ri·fli(t)·(mj(t) − Xi(t)),  if rj ≥ AFj(t)
Xi(t+1) = Xmin + r1i·(Xmax − Xmin),  otherwise    (7)
where mbest(t) is the best position (solution) memorized by any crow up to iteration t. From Eq. (7) we notice that if the condition rj ≥ AFj(t) is met, the search is done around the best solution identified up to iteration t; otherwise (rj < AFj(t)), a random solution is generated in the search space. The search around the solution mbest(t) is inspired by variants of the DE and ABC algorithms [19, 20].
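Relative to the CS update, the change is a single anchor term. An illustrative fragment, under the same assumptions as the CS sketch (names of our choosing, simple clamping to the search box):

```python
import random

def ecs_move(Xi, m_j, m_best, xmin, xmax, fl=0.25, AP=0.02):
    """One ECS position update, Eq. (7).

    The guided move is anchored at m_best(t), the best position memorized
    by any crow so far, instead of the crow's own position X_i(t) as in
    Eq. (5) of the original CS.
    """
    P = len(Xi)
    if random.random() >= AP:                      # r_j >= AF_j(t)
        ri = random.random()
        new = [m_best[p] + ri * fl * (m_j[p] - Xi[p]) for p in range(P)]
        return [min(max(v, xmin[p]), xmax[p]) for p, v in enumerate(new)]
    # otherwise: a random solution inside the search space
    return [xmin[p] + random.random() * (xmax[p] - xmin[p]) for p in range(P)]

pos = ecs_move([0.5, -0.2], [0.1, 0.1], [0.0, 0.0], [-1, -1], [1, 1])
```

Dropping this function into the CS loop in place of its Eq. (5) branch yields the ECS variant tested in the next section.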
5 Case Studies and Results
The CS and ECS algorithms are applied to identify the fuel-cost functions of three thermal units. One of them (unit 1) produces only heat, and the other two (units 2 and 3) operate in cogeneration mode. The fuel-cost function of unit 1 is modeled by Eq. (2), and those of units 2 and 3 by Eq. (3). For each unit the actual values (Pv, Cv(Pv) or Pv, Hv, Cv(Hv, Pv)) are known, being presented in Table A1. They are determined on the basis of the actual cost coefficient values ap, p = 1, 2, …, P, taken from [2]. Unit 1 has V = 16 observations, and units 2 and 3 have V = 80 observations each. A total of 50 runs were conducted in order to calculate the following indicators: minimum value min(eTotal), average value mean(eTotal), maximum value max(eTotal) and standard deviation SD(eTotal). Also, three well-known and frequently used algorithms are applied to compare the results: particle swarm optimization (PSO), harmony search (HS) and artificial bee colony (ABC). The maximum number of evaluated functions was limited to MaxFE = 6000 (for unit 1) and MaxFE = 30000 (for units 2 and 3). The algorithms were implemented in MathCAD, on a computer with an Intel i5 processor, 2.2 GHz CPU and 4 GB of RAM.
The setting values for the parameters of the algorithms ECS, CS, PSO, HS and ABC, obtained by experimental testing, are presented below: ECS and CS: NC = 20, tmax = 300, fl = 0.25, AF = 0.02 (for unit 1) and NC = 30, tmax = 1000, fl = 0.25, AF = 0.02 (for units 2 and 3); PSO: NC = 15, tmax = 400, c1 = 2, c2 = 2, w1 = 0.1, w2 = 0.9 (for unit 1) and NC = 50, tmax = 600, c1 = 2, c2 = 2, w1 = 0.1, w2 = 0.9 (for units 2 and 3); HS: NC = 25, tmax = 6000, HMCR = 0.99, PAR = 0.1, BW = 0.01 (for unit 1) and NC = 100, tmax = 30000, HMCR = 0.99, PAR = 0.1, BW = 0.01 (for units 2 and 3); ABC: NC = 15, tmax = 200, Limit = 0.4·tmax (for unit 1) and NC = 25, tmax = 600, Limit = 0.4·tmax (for units 2 and 3). The settings were chosen so that each applied algorithm obtains its best results. Table 1 presents the best estimates obtained for the parameters of model 1 (a1, a2, a3) and model 2 (a1, a2, a3, a4, a5, a6). Also, for each studied unit, the values of the indicators min(eTotal), mean(eTotal), max(eTotal) and SD(eTotal) are specified, as well as the average computing time (CPU). Analyzing the results in Table 1, it can be noticed that for all three studied units the best estimates are obtained by the ECS algorithm. From the point of view of estimate quality (min(eTotal)) and stability (measured by SD(eTotal)), the ECS algorithm outperforms the other algorithms (CS, PSO, ABC and HS). The computing times have low values for all applied algorithms; the close CPU values are due to the fact that the same number of evaluations (MaxFE) was imposed on all of them.
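The four indicators reported for each algorithm are the usual run statistics. A small sketch over an invented set of run results (not the paper's data):

```python
import statistics

def run_indicators(e_totals):
    """min/mean/max and sample standard deviation of eTotal over a set of
    independent runs, as reported in Table 1."""
    return {
        "min": min(e_totals),
        "mean": statistics.mean(e_totals),
        "max": max(e_totals),
        "SD": statistics.stdev(e_totals),
    }

# Five illustrative run results:
ind = run_indicators([1.33, 2.06, 1.70, 9.65, 2.40])
assert ind["min"] == 1.33 and ind["max"] == 9.65
```

A low SD across the 50 runs is what the paper means by a "stable" algorithm.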
Table 1. The estimated values of the parameters for model 1 (a1, a2, a3) and model 2 (a1, a2, a3, a4, a5, a6), and the statistical indicators obtained through the various algorithms

Unit 1 (the fuel-cost function represented by Model 1)
Parameter      PSO           HS            ABC           CS            ECS
a1             949.815583    950.049999    950.040102    950.175548    950.0009879
a2             2.049186683   2.003112057   2.005279865   1.996886400   2.003671045
a3             0.037017433   0.038097579   0.038061088   0.038187480   0.038102043
Min(eTotal)    5.0944        1.3690        1.3858        1.6887        1.3348
Mean(eTotal)   77.3962       75.2113       3.2053        5.9026        2.0576
Max(eTotal)    208.1066      333.1999      10.3235       11.3750       9.6462
SD(eTotal)     51.366        69.073        1.830         2.088         1.393
CPU(s)         0.19          0.22          0.21          0.18          0.18

Unit 2 (the fuel-cost function represented by Model 2)
Parameter      PSO           HS            ABC           CS            ECS
a1             1274.705860   1248.914507   1233.238670   1251.397125   1250.297143
a2             35.1617463    35.9910709    36.4007622    35.9590373    35.992900295
a3             0.050862088   0.043500405   0.041369716   0.043726366   0.043537582
a4             0.661889973   0.627188923   0.690876635   0.590301514   0.599258596
a5             0.026765934   0.026362881   0.026733939   0.026930447   0.026991881
a6             0.009410696   0.011617168   0.009976951   0.011234860   0.011018071
Min(eTotal)    568.1519      57.6877       86.7670       20.4274       3.0744
Mean(eTotal)   1364.6601     415.8396      285.3715      94.2421       17.9243
Max(eTotal)    3261.7949     1338.8835     673.6618      190.0802      59.9382
SD(eTotal)     531.516       283.137       144.808       32.263        13.196
CPU(s)         4.10          6.58          5.28          5.11          5.11

Unit 3 (the fuel-cost function represented by Model 2)
Parameter      PSO           HS            ABC           CS            ECS
a1             2538.593581   2633.525167   2661.156842   2640.652785   2650.060958
a2             15.5582272    14.6615204    14.3781947    14.612060     14.49920524
a3             0.032245367   0.033899172   0.035005812   0.034165520   0.034502243
a4             4.843098004   4.360937639   4.079443585   4.238857202   4.20022352
a5             0.026858952   0.028428402   0.031405829   0.030369841   0.02999819
a6             0.029613536   0.031611949   0.030378142   0.030561974   0.031000226
Min(eTotal)    775.5290      192.4993      173.8889      114.2028      2.4526
Mean(eTotal)   3117.4647     722.2832      372.1194      233.6267      17.1105
Max(eTotal)    5909.6713     1845.2264     742.5518      397.3194      112.3107
SD(eTotal)     1066.294      359.093       136.441       62.510        17.297
CPU(s)         4.10          6.58          5.28          5.11          5.11
Figures 1, 2 and 3 present the convergence curves of the ECS, CS, PSO and ABC algorithms. From these figures it can be noticed that the convergence process is fastest for the ECS algorithm (the number of evaluations MaxFE being fixed). Figure 4 shows the convergence curves of the HS algorithm for the three units (unit 1, unit 2, unit 3).
Fig. 1. The convergence curve for unit 1 (eTotal [$/h] vs. number of iterations; ECS, CS, PSO, ABC).

Fig. 2. The convergence curve for unit 2 (eTotal [$/h] vs. number of iterations; ECS, CS, PSO, ABC).
Fig. 3. The convergence curve for unit 3 (eTotal [$/h] vs. number of iterations; ECS, CS, PSO, ABC).

Fig. 4. The convergence of the HS algorithm (eTotal [$/h] vs. number of iterations; unit 1, unit 2, unit 3).
Table A1. The actual values for the thermal units [2]. (The table is listed column by column, as recovered from the original layout; the v-th entry of each list corresponds to observation No. v. Unit 1 has 16 observations; units 2 and 3 have 80 each.)

Unit 1, Pv [MW] (observations 1–16):
0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60

Unit 1, Ca,v [$/h]:
949.8569, 958.6226, 968.515, 979.7136, 991.7233, 1005.4957, 1020.1965, 1035.9687, 1053.2081, 1071.4884, 1091.1111, 1111.8944, 1134.0404, 1157.0044, 1181.7926, 1207.3886

Unit 2, Pv [MW] (observations 1–80):
40, 44, 44, 44, 44, 44, 50, 50, 50, 50, 55, 55, 55, 55, 55, 60, 60, 60, 60, 60, 60, 60, 65, 65, 65, 65, 65, 65, 65, 65, 70, 70, 70, 70, 70, 70, 70, 75, 75, 75, 75, 75, 80, 80, 80, 80, 80, 80, 80, 80, 85, 85, 85, 85, 85, 95, 95, 95, 95, 95, 95, 110, 110, 115, 115, 115, 115, 115, 115, 120, 120, 120, 120, 120, 120, 120, 125.8, 125.8, 125.8, 125.8

Unit 2, Hv [MWth]:
75, 0, 15.9, 31, 50, 75, 10, 25, 47, 70, 7, 18, 35, 71, 87, 9, 24, 39, 52, 71, 89, 91, 12, 27, 38, 47, 65, 71, 79, 94, 6, 23, 36, 57, 72, 80, 100, 20, 40, 60, 80, 100, 5, 15, 35, 65, 75, 82, 99, 108, 10, 25, 50, 75, 90, 11, 38, 62, 77, 91, 105, 50, 80, 5, 26, 42, 68, 86, 101, 11, 30, 51, 65, 70, 67, 62, 2, 15, 21, 32.4

Unit 2, Ca,v [$/h]:
2989.4717, 2918.2273, 2942.3036, 2977.8022, 3039.8982, 3151.3868, 3172.9801, 3204.4093, 3272.4435, 3371.5346, 3371.2987, 3392.0098, 3436.886, 3583.2735, 3670.829, 3580.0477, 3612.4357, 3656.7727, 3705.1198, 3792.1944, 3892.5484, 3904.8949, 3793.5369, 3829.0001, 3862.7802, 3895.2751, 3973.2947, 4003.271, 4046.1372, 4135.9529, 3992.3406, 4028.9205, 4067.4604, 4148.9012, 4221.6997, 4265.5618, 4390.1422, 4234.0111, 4294.8855, 4377.4378, 4481.4411, 4607.25, 4416.5172, 4436.7084, 4493.3279, 4618.6631, 4671.2561, 4711.2803, 4819.5312, 4883.1156, 4642.3526, 4679.5984, 4768.5525, 4891.2726, 4981.1889, 5083.8983, 5164.1028, 5268.4062, 5349.353, 5435.947, 5532.9292, 5894.402, 6053.9682, 5975.2038, 6032.0228, 6091.1842, 6216.8862, 6325.3663, 6429.0079, 6220.8048, 6278.3207, 6364.4865, 6435.3127, 6463.1372, 6446.2918, 6419.2228, 6471.2834, 6503.1296, 6520.8486, 6559.7864

Unit 3, Pv [MW]:
81, 85, 86, 90, 90, 90, 95, 95, 95, 98.8, 100, 100, 100, 100, 105, 105, 105, 105, 110, 110, 110, 110, 120, 120, 120, 120, 120, 125, 125, 125, 125, 125, 130, 130, 130, 130, 130, 135, 135, 135, 135, 140, 140, 140, 150, 150, 150, 150, 150, 160, 160, 160, 170, 170, 170, 180, 180, 180, 190, 190, 190, 200, 200, 200, 210, 210, 210, 210, 210, 220, 220, 220, 220, 230, 230, 230, 240, 240, 240, 247

Unit 3, Hv [MWth]:
104.8, 86.3, 100, 60, 80, 100, 50, 70, 98, 0, 10, 35, 65, 103.5, 23.4, 51.7, 87.6, 117.2, 15, 52.9, 104.2, 115.5, 8, 36.8, 75.8, 110.6, 121.2, 2.7, 27.3, 74.1, 88.9, 113.7, 16.7, 47.3, 73.4, 92.8, 126.2, 33.5, 88.2, 120, 132.4, 72.8, 100.6, 132.1, 3.7, 54.9, 91.7, 122.1, 138.9, 37.1, 66.4, 140, 27.4, 102.6, 135.8, 92.8, 120.5, 142.7, 112.6, 136.2, 145.9, 123.4, 147.1, 160, 31.6, 63.2, 108.4, 156.8, 171.3, 78.2, 139.1, 151.2, 148.1, 22.8, 54.4, 68.5, 2.7, 13.6, 36.3, 0

Unit 3, Ca,v [$/h]:
5083.6143, 4945.0783, 5138.8141, 4761.8659, 4985.6064, 5233.5240, 4771.0523, 4985.9953, 5327.2100, 4419.3058, 4521.0066, 4737.1785, 5046.2807, 5521.9162, 4743.7368, 5018.4063, 5436.0913, 5838.6466, 4783.3547, 5149.0038, 5781.1364, 5941.6254, 4952.0792, 5218.8748, 5659.4800, 6129.7272, 6287.3499, 5023.6454, 5244.3834, 5764.6903, 5956.4836, 6307.4947, 5263.8787, 5574.4510, 5883.7214, 6140.1455, 6634.4510, 5550.8425, 6209.1863, 6674.4981, 6872.3446, 6136.9387, 6519.0115, 7007.7743, 5634.4169, 6177.5347, 6665.0645, 7129.1735, 7409.3303, 6234.2475, 6593.6990, 7723.6076, 6394.1343, 7399.4417, 7951.3460, 7543.7538, 7991.9246, 8384.3352, 8166.8605, 8581.2562, 8761.1309, 8670.1931, 9108.9795, 9361.9884, 7584.8316, 8013.1055, 8729.9184, 9633.3334, 9931.3289, 8555.0006, 9623.2209, 9861.8851, 9799.8521, 8083.9720, 8515.1650, 8726.9653, 8148.8482, 8281.0755, 8579.2398, 8336.2378
6 Conclusions
In this paper, an enhanced crow search algorithm (ECS) is applied to estimate the fuel-cost function parameters of units which produce only heat and of units operating in cogeneration mode. Two models are considered for the fuel-cost functions; they are based on second-degree polynomials in one variable, respectively in two variables (P, H). The capacity of the ECS algorithm to solve parameter estimation problems is tested on three thermal units. The performance of the ECS algorithm is better than that of the original CS algorithm and of other well-known algorithms such as PSO, ABC and HS. The results show that the ECS algorithm is capable of obtaining good-quality and stable estimates in a reasonable amount of time (a few seconds) for the tested models.
References
1. Nguyen, T.T., Vo, D.N., Dinh, B.H.: Cuckoo search algorithm for combined heat and power economic dispatch. Electr. Power Energy Syst. 81, 204–214 (2016)
2. Vasebi, A., Fesanghary, M., Bathaee, S.M.T.: Combined heat and power economic dispatch by harmony search algorithm. Electr. Power Energy Syst. 29, 713–719 (2007)
3. Kádár, P.: Application of optimization techniques in the power system control. Acta Polytechnica Hungarica 10(5), 221–236 (2013)
4. Abbas, G., Gu, J., Farooq, U., Raza, A., Asad, M.U., El-Hawary, M.: Solution of an economic dispatch problem through particle swarm optimization: a detailed survey – part I. IEEE Access 5, 15105–15141 (2017)
5. Abbas, G., Gu, J., Farooq, U., Raza, A., Asad, M.U., El-Hawary, M.: Solution of an economic dispatch problem through particle swarm optimization: a detailed survey – part II. IEEE Access 5(1), 24426–24445 (2017)
6. Dai, Y., Chen, L., et al.: Integrated dispatch model for combined heat and power plant with phase-change thermal energy storage considering heat transfer process. IEEE Trans. Sustain. Energy 9(3), 1234–1243 (2018)
7. El-Naggar, K.M., AlRashidi, M.R., Al-Othman, A.K.: Estimating the input-output parameters of thermal power plants using PSO. Energy Convers. Manage. 50, 1767–1772 (2009)
8. AlRashidi, M.R., El-Naggar, K.M., Al-Othman, A.K.: Particle swarm optimization based approach for estimating the fuel-cost function parameters of thermal power plants with valve loading effects. Electr. Power Compon. Syst. 37, 1219–1230 (2009)
9. Sonmez, Y.: Estimation of fuel cost curve parameters for thermal power plants using the ABC algorithm. Turk. J. Electr. Eng. Comput. Sci. 21, 1827–1841 (2013)
10. Durai, S., Subramanian, S., Ganesan, S.: Improved parameters for economic dispatch problems by teaching learning optimization. Electr. Power Energy Syst. 67, 11–24 (2015)
11. AlRashidi, M.R., El-Naggar, K.M., AlHajri, M.F.: Convex and non-convex heat curve parameters estimation using Cuckoo search. Arab. J. Sci. Eng. 40, 873–882 (2015)
12. Vanithasri, V., Balamurugan, R., Lakshminarasimman, L.: Modified radial movement optimization (MRMO) technique for estimating the parameters of fuel cost function in thermal power plants. Eng. Sci. Technol. Int. J. 19(4), 2035–2042 (2016)
13. Sayah, S., Hamouda, A.: Efficient method for estimation of smooth and nonsmooth fuel cost curves for thermal power plants. Int. Trans. Electr. Energy Syst. 28(3), 1–14 (2017)
14. Askarzadeh, A.: A novel metaheuristic method for solving constrained engineering optimization problems: Crow search algorithm. Comput. Struct. 169, 1–12 (2016)
15. Mohammadi, F., Abdi, H.: A modified crow search algorithm (MCSA) for solving economic load dispatch problem. Appl. Soft Comput. 71, 51–65 (2018)
16. Abdelaziz, A.Y., Fathy, A.: A novel approach based on crow search algorithm for optimal selection of conductor size in radial distribution networks. Eng. Sci. Technol. Int. J. 20, 391–402 (2017)
17. Allaoui, M., Ahiod, B., El Yafrani, M.: A hybrid crow search algorithm for solving the DNA fragment assembly problem. Expert Syst. Appl. 102, 44–56 (2018)
18. Oliva, D., Hinojosa, S., Cuevas, E., Pajares, G., Avalos, O., Gálvez, J.: Cross entropy based thresholding for magnetic resonance brain images using Crow Search Algorithm. Expert Syst. Appl. 79, 164–180 (2017)
19. Gao, W.-F., Liu, S.-Y.: A modified artificial bee colony algorithm. Comput. Oper. Res. 39, 687–697 (2012)
20. Gao, W.-F., Liu, S.-Y., Huang, L.L.: A global best artificial bee colony algorithm for global optimization. J. Comput. Appl. Math. 236(11), 2741–2753 (2012)
Modelling a Photovoltaic Power Station Eva Barla, Dzitac Simona(&), and Carja Vasile University of Oradea, University St. 1, 410087 Oradea, Romania [email protected], [email protected], [email protected]
Abstract. The continuous development of various technical domains leads to diversity; likewise, new types of energy sources are appearing in the field of energy engineering, as well as new procedures for energy extraction, conversion and capture, and new, better-performing components used in many branches of industry. Nature offers us numerous energy resources; the only difficulty consists in transforming them into suitable sources. In this article, a method of modelling and simulating a photovoltaic plant is presented with the help of the MatLAB software, using the program's library as well as predefined configurations. Simulation with MatLAB and Simulink blocks leads to accurate results due to the program's high level of complexity. The program is able to take into account all the necessary conditions, such as the variation of irradiance and temperature, in order to obtain a good, accurate and feasible project.
Keywords: Specific irradiation · Modelling · Methods
1 Introduction
The continuous growth of the global population results in a high demand for electricity and thermal energy in every domain of life. Fossil fuels, such as coal, oil and natural gas, are not only finite but also expensive, with regard to their conversion into electrical or thermal energy as well as their transport and distribution to consumers. Renewable energy is an expression that refers to forms of energy which result from natural processes with a cycle of reproduction [1–3], such as wind, tidal, solar, biomass and geothermal energy sources; the problem is how to transform them into useful energy. Scientists have found many different solutions for converting sustainable energies into electrical or thermal power, and various specialists are searching for alternative energy sources in order to reduce greenhouse gases. Renewable energies, which through transformation procedures are converted into other, non-polluting forms of energy, have degradable residues, and their efficiency offers an economic advantage [7, 12]. The growing energy consumption requires the introduction of renewables, contributing also to the decreasing use of non-renewable sources. Reducing greenhouse gases is essential in preventing climate change, which has a critical impact on human life [8, 13, 14]. The simulation of technical projects with the MatLAB software is relatively simple, and its graphics are an advantage for the interpretation of results [11].
© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 41–47, 2021. https://doi.org/10.1007/978-3-030-51992-6_4
E. Barla et al.
2 Simulation of the Photovoltaic Panels Using MATLAB Software

Among the systems using renewable energies, photovoltaic (PV) systems have intrinsic qualities such as a reduced operating cost, limited maintenance needs, silence, easy installation and reliability. When designing a new project, the requirements and specification of the power station are usually not obvious; they are incomplete and are not integrated within the design process [4–6].

2.1 Simulink: Modelling of a PV Solar Cell
Solar cells are usually modelled using different types of equivalent circuits. Any PV model is based on the operation of a diode, which gives PV cells their exponential characteristic. With Simulink, PV cells may be modelled in three ways. The first possibility uses tools that can implement any differential equation or algebraic relation of a model with high mathematical complexity [9–11]. Another possibility is offered by Simscape, which allows direct modelling using the physical behavior of components of the electrical domain (resistance, capacitance, diode) to implement the same mathematical equations. The solar cell block from the MATLAB library is an electrical source containing solar-induced current and is temperature dependent.

2.2 The Intensity of the Electrical Current Induced by the Sun
The PV block models a PV cell as an electrical source with a series resistance Rs, two exponential diodes and a parallel resistance Rp. The produced current is calculated with the following equation:

I = Iph − Is·(e^((V + I·Rs)/(N·Vt)) − 1) − Is2·(e^((V + I·Rs)/(N2·Vt)) − 1) − (V + I·Rs)/Rp
(1)
where:
Iph - the current induced by the Sun, Iph = Iph0 · Ir/Ir0
Ir - solar irradiance on the surface of the cell, expressed in W/m2
Iph0 - current generated by the Sun for the irradiance Ir0
Is - saturation current of the first diode
Is2 - saturation current of the second diode
Vt = kT/q - thermal voltage, depending on the temperature of the device (T)
k - Boltzmann constant
q - elementary charge

2.3 Dependence on the Temperature
Modelling a Photovoltaic Power Station

The majority of the solar cell parameters, such as the induced current Iph, the saturation current Is of the first diode, the saturation current Is2 of the second diode, the series resistance Rs and the parallel resistance Rp, depend on the variation of temperature. The temperature of the PV cells is specified by the fixed-circuit temperature value TFIXED. The relation between the induced solar current Iph and the cell temperature T is:

Iph(T) = Iph · (1 + TIPH1 · (T − Tmeas))
(2)
where:
TIPH1 - the first temperature coefficient for Iph
Tmeas - the temperature at which the cell parameters were measured
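Equations (1) and (2) can be combined into a small numerical sketch. The Python model below is only a hedged illustration of the two-exponential-diode cell, not the paper's Simulink block: all parameter values (Iph0, Is, Rs, Rp, etc.) are assumptions chosen for demonstration, and the implicit equation is solved by damped fixed-point iteration.

```python
import math

# Numerical sketch of Eqs. (1)-(2): two-exponential-diode PV cell model.
# All parameter values below are illustrative assumptions, not measured data.
K = 1.380649e-23     # Boltzmann constant [J/K]
Q = 1.602176634e-19  # elementary charge [C]

def cell_current(V, T=298.15, Ir=1000.0, Ir0=1000.0, Iph0=7.6,
                 Is=1e-9, Is2=1e-9, N=1.0, N2=2.0,
                 Rs=0.005, Rp=15.0, TIPH1=0.0005, Tmeas=298.15):
    """Solve the implicit equation
       I = Iph - Is(e^((V+I*Rs)/(N*Vt)) - 1)
             - Is2(e^((V+I*Rs)/(N2*Vt)) - 1) - (V+I*Rs)/Rp
    by damped fixed-point iteration (Rs is small, so this converges)."""
    Vt = K * T / Q                                          # thermal voltage kT/q
    Iph = Iph0 * (Ir / Ir0) * (1.0 + TIPH1 * (T - Tmeas))   # Eq. (2)
    I = Iph                                                 # short-circuit guess
    for _ in range(200):
        Vd = V + I * Rs
        I_new = (Iph - Is * (math.exp(Vd / (N * Vt)) - 1.0)
                     - Is2 * (math.exp(Vd / (N2 * Vt)) - 1.0) - Vd / Rp)
        if abs(I_new - I) < 1e-12:
            break
        I = 0.5 * (I + I_new)                               # damping for stability
    return I

# At V = 0 the cell delivers roughly its photo-generated current.
print(round(cell_current(0.0), 3))
```

Halving the irradiance roughly halves the produced current, and the TIPH1 term shifts it with temperature, mirroring the dependencies described above.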
3 The Simulation of a PV Power Station

In the following, a PV power station is presented that contains 320 PV solar panels, divided into 64 parallel strings, each with 5 modules in series (see Fig. 1).
Fig. 1. Scheme of the station
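The layout above can be sanity-checked with a few lines; the ~100 kW total output is quoted from this paper's conclusions, while the per-module wattage is only our derived estimate.

```python
# Plant layout quoted in the text: 64 parallel strings of 5 series modules.
strings, modules_per_string = 64, 5
panels = strings * modules_per_string
print(panels)                       # 320 panels in total

# With the ~100 kW total output reported in the conclusions, each module
# would contribute roughly (our estimate, not a value from the paper):
print(round(100_000 / panels, 1))   # W per module
```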
Besides the construction of the PV park, an accurate simulation requires designing the distribution network; on the left side of the figure the oscilloscopes can be observed, which will display the graphical results obtained after the simulation. In the following figure (see Fig. 2), the parameters of the block named PV array 1 are given; it can be observed which parameters may or may not be edited, the latter using the values from the database of each module type. Among the editable parameters are the number of parallel strings and the number of modules in each string. This array has 2 inputs and 3 outputs. The input side is connected to the elements that supply the solar radiation, expressed in W/m2, and the temperature, expressed in degrees Celsius. The output side is represented by the + and − terminals of the direct current produced by the groups of arrays, and by an element connecting to the oscilloscope module.
Fig. 2. The parameters of the PV block
Also, from the PV block the parameters of each array existing in the MATLAB library can be observed (see Figs. 3 and 4), resulting from tests made at an irradiance of 1000 W/m2 for a minimal temperature of 25 °C and a maximal one of 45 °C. The lines represent the electrical cables, the transformers are labeled with their specific values, and the symbols are conventional but also contain information about each element. The simulation was made at pre-established values; in this case different values of temperature and solar irradiance were applied to the panels. When the project contains several PV groups, different values of temperature and irradiance can be used for each group, differences which appear in reality due to shadowing, the curvature of the terrain and the efficiency of the panels (see Fig. 5). For PV1: temperature = 40 °C, irradiance = 1250 W/m2. After the simulation, the graphs given by the oscilloscope show the values of the electrical voltage, the current intensity and the reactive power. The following settings can be modified: the number of inputs and the possibility to display several values in a single box (see Fig. 6).
Fig. 3. Specified irradiances in case of a module
Fig. 4. Specified temperatures in case of five modules
Fig. 5. The obtained values after simulation
Fig. 6. Graphs of the values V, I and PQ for the station
4 Conclusions

To design a photovoltaic (PV) power station, it is important to use software solutions able to model the final project. It is necessary to choose certain parameters, such as the required power, the solar radiation, the temperature of the arrays, and also economic factors. MATLAB is very complex, but it offers accurate results. A PV park can be designed based on mathematical computation, and even though this seems simple, software modelling helps us understand the complexity of such a plant. In our case a power station built with 320 PV solar panels was simulated, divided into 64 parallel strings, each with 5 modules in series. We assumed an irradiance between 1000 W/m2 and 1250 W/m2 within a year and a temperature between 25 °C and 45 °C. The obtained power is approximately 100 kW. The software may also be used to model and simulate a wind generator or a hybrid PV–wind system.
References
1. Shea, S.P.: Evaluation of Glare Potential for Photovoltaic Installations. Sunniva, Inc. (2012)
2. Kilyeni, S.: Optimization Techniques in Power Engineering. Orizonturi Universitare (2015). (in Romanian)
3. Photovoltaics: Design and Installation Manual. Solar Energy International (SEI) (2007)
4. http://www.solarenergy.org/bookstore/photovoltaics-design-installationmanual
5. https://ec.europa.eu/energy/intelligent/projects/sites/iee-projects/files/projects/documents/east-gsr_training_manual_romania.pdf
6. SolarBuzz.com: www.solarbuzz.com
7. North Carolina Solar Center, Raleigh, NC. http://www.ncsc.ncsu.edu/
8. Siting of Active Solar Collectors and Photovoltaic Modules (2001). Discusses evaluation of a building site for its solar potential. https://nccleantech.ncsu.edu/resource-center-2/factsheets-publications/
9. Dorf, R.C.: The Electrical Engineering Handbook. CRC Press, Boca Raton (2000)
10. Goswami, D.Y.: Principles of Solar Engineering. CRC Press, Taylor and Francis Group, Boca Raton (2015)
11. Nemes, C., Munteanu, F.: Potential solar irradiance assessment based on a digital elevation model. Adv. Electr. Comput. Eng. 11(4), 89–92 (2011)
12. Banu, I.V., Istrate, M.: Modeling and simulation of photovoltaic arrays in MATLAB. Gheorghe Asachi 3, 161–166 (2012)
13. Salmi, T.: Int. J. Renew. Energy Res. 2(2) (2012). http://uran.donetsk.ua/masters/2014/etf/petlevannaya/library/585.pdf
14. http://stiintasiinginerie.ro/wp-content/uploads/2013/12/51-OPORTUNITATEA-UTILIZ%C4%82RII-ENERGIEI.pdf
15. http://ec.europa.eu/energy/renewables/index_en.html
Identification of Fault Locations on Tree Structure Systems

Judith Pálfi(B) and Adrienn Dineva

Kálmán Kandó Faculty of Electrical Engineering, Óbuda University, Bécsi Str. 96/b, Budapest 1038, Hungary
{palfi.judith,dineva.adrienn}@kvk.uni-obuda.hu
Abstract. Many physical tree structure systems can be observed across a wide spectrum of the real world. Some were created by nature, such as the nervous and vascular systems of organisms, the shapes of trees, the veins of leaves, family trees and many more. Other systems were artificially created by humans in order to serve and improve their living conditions, such as the electricity, water and gas supply systems. Human-made systems produce failures during operation, and in order to carry out repair works the identification of the fault location is required. Modern technologies, such as smart sensors which send information to data recorders and data-processing centers where the information is processed, provide good opportunities for the development of new algorithms with a large variety of functions for localizing faults in these systems. In this paper the principle of operation of a new fault location identification algorithm is presented. Keywords: Fault localization · Low voltage network · Smart grid system · Tree graph system
1 Introduction

The various technical systems in our environment can be considered as complex networks [14] with different structures, for example star graphs, looped graphs or tree structure graphs. The low voltage (LV) electricity network, according to its topology, can be modeled by a tree structure graph. In order to identify the fault location on a LV network, a smart grid fault identification (SGFI) algorithm was developed. The algorithm is able to locate the position of the fault and identify the defective device based on the data collected from smart meters and consumer announcements. The power supply network records allow us to set up a database of network failure characteristics by which the accuracy of the fault localization can be further increased and the type of the fault can be determined.
2 Description of the Troubleshooting System Currently Used by the Power Suppliers

Faults in the LV electric networks cause significant losses to the power suppliers. The sources generating these losses have to be identified so that professional teams can be assigned to carry out the troubleshooting with the most appropriate tools. A solution for minimising the losses from blackouts is given by the improvement and development of the work management processes and systems. These systems include the central call centres and the work management departments. The central work management departments order the appropriate specialists and tools for the repair of the faults, review and control the troubleshooting process and account for the costs related to the repair works. Currently, the identification of faulty equipment is done manually [7], because there is no remote signalling equipment on the LV networks; the power suppliers are therefore informed about faults only via the notifications of the customers. Figure 1 shows the steps of the process from fault handling to the normal operating status. The employees working at the TeleCenter (TC) register the fault-related notifications in the MIRTUSZ [1] database. The dispatcher schedules the work manually (or with the assistance of the automatic resource allocator system) to a team close to the most optimal geographic position of the fault location. This team is usually working on a planned interruptible task (e.g. a low-priority network review). They stop the lower priority work and, following their arrival at the spot, determine the exact location of the fault and the reason behind it, and review the available tools for the repair. If the tools are available, the repair works can start. If not, then (based on the dispatcher's decision) the team returns to the site of the power supplier company to get the missing tool(s), and only after returning to the fault location do they start the repair works. Finally, the team resolves the problem and restores the normal operating status of the network.

© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 48–59, 2021. https://doi.org/10.1007/978-3-030-51992-6_5
The current fault repairing system contains several similar, less efficient steps and procedures which, if developed and improved, could result in a significant reduction of the troubleshooting time. In the following, the reduction of the duration of the fault localisation process, that is, the decrease of the time interval between the start of the blackout and the localization of the fault, and the optimization of the fault address scheduling are considered. In order to reduce the fault repair time on LV networks, new scientific methods and modern technologies have to be developed and applied.
3 Mathematical Background of the Proposed Method

The low voltage electricity network is modeled, according to its topology, by a tree graph [16]. The nodes represent the possible fault locations, the connections the overhead lines or cables. The graph's mathematical description is the following [3]: Let G(VG, EG, cG, lG) be an undirected graph, where VG indicates the set of nodes of graph G, EG stands for the set of edges of graph G, cG : EG → VG is a connection function and lG : EG → λ is a labelling function on the edges which fulfills lG(e) = abs(cG(e)) for all e ∈ EG. The V (vertex) nodes receive names according to the real physical system elements; Vj is the jth node of the graph, j = 1, .., n, where n indicates the number of nodes. A node can be considered isolated in VG if it is not connected to any edge. E (edges) denotes the edges connecting the nodes; Ei is the ith edge, i = 1, .., m, and as the graph is a tree, the number of edges
Fig. 1. Flow chart of the troubleshooting process [10]
is E = n − 1. I indicates the incidence matrix of graph G and, as the problem correlates with the alignment of nodes and edges, the combinatorial Laplacian is calculated as L = I · Iᵀ. The Laplace matrix is a square, real, positive semidefinite matrix. The sum of the entries in each row is zero. Each diagonal entry L(j, j) is the degree deg(G, j) of node j. The off-diagonal entries of L indicate the edges of G: L(i, j) = L(j, i) = −1 if there is an edge between nodes i and j, otherwise L(i, j) = L(j, i) = 0 [5, 13]. If a matrix is connected with a weighted undirected graph, then it generates an ordinary Laplace matrix. A normalized Laplace matrix is a Laplace matrix where the off-diagonal entries do not exceed 1/n in absolute value, where n is the order of the matrix [2, 8].
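The construction L = I · Iᵀ can be illustrated on a small example; the 4-node tree below is our own toy topology, not the paper's network.

```python
# Combinatorial Laplacian of a tree via L = inc * inc^T, where inc is the
# (oriented) incidence matrix. The 4-node tree is chosen for illustration.
edges = [(0, 1), (1, 2), (1, 3)]            # m = n - 1 = 3 edges
n = 4

inc = [[0] * len(edges) for _ in range(n)]  # n x m incidence matrix
for k, (u, v) in enumerate(edges):
    inc[u][k], inc[v][k] = 1, -1            # +1 / -1 at the edge endpoints

L = [[sum(inc[i][k] * inc[j][k] for k in range(len(edges)))
      for j in range(n)] for i in range(n)]

print(L)  # diagonal entries = node degrees; off-diagonal -1 marks an edge
```

The printed matrix has node degrees on the diagonal, −1 where an edge exists, and every row sums to zero, matching the stated properties.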
4 Fault Probability Data Bank

In order to run the algorithm, a fault probability data bank connected to the low voltage electricity network topology has been established, on which the algorithm was tested. The fault probability data bank was generated from real network data provided by the ELMŰ-ÉMÁSZ power supply company. This includes the GIS (Geographic Integrated System) based map (ELMŰ-ÉMÁSZ GIS) supported database [9] and all those electricity network elements that are needed and sufficient for fault localization in the low voltage electricity network. The fault probability data bank needed
for the fault localization was generated from this database using Big Data analysis. The data bank, according to the electricity network topology, contains the elements of the network and the failure probability parameters of the devices which could potentially fail: transformers, distribution boards, consumers, cables and real network elements. Figure 2 shows the low voltage electricity network model representing the real physical network. In Fig. 2 the circles with numbers represent the potential fault locations, namely the nodes of the graph [6] (e.g. fuses), while the cables and overhead lines in between represent the connections between the nodes [4].
Fig. 2. Illustration of the elements of a real LV network
The new data bank contains the following:
– the nodes of the logical graph [11], which are elements of the low voltage electricity network (transformers, distribution boards and consumers), LvnGraphVertex(Vi), where i = 1, 2, . . . , n, weighted according to their function in the network,
– the a priori fault probability assigned to the nodes, LvnGraphVertex(n1, n2, . . . , nn),
– the edges between the nodes, LvnGraphEdges(Ej), where j = 1, 2, . . . , m,
– the hierarchical correlation of the connections (1 : N), and
– the fault probability assigned to the edges based on the technical parameters (edge length, batch number on the edge length, cable or overhead line), LvnGraphEdges(e1, e2, . . . , en).

The new data bank mirrors the substantial indicators of the network structure, such as the most used average degree indicator and the graph Laplace spectrum used for representing the degree distribution. The data bank made the testing of the algorithm possible.
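A minimal sketch of how such a data bank could be represented in code follows; the class layout and field names are our own illustrative choices, only the concepts (weighted vertices, 1:N edges, a priori fault probabilities) come from the list above.

```python
# Hypothetical in-memory shape of the fault probability data bank;
# names are ours, not from the paper or the utility's database schema.
from dataclasses import dataclass

@dataclass
class LvnGraphVertex:
    node_id: int
    kind: str                 # 'transformer' | 'distribution_board' | 'consumer'
    weight: float             # weight according to function in the network
    fault_probability: float  # a priori fault probability

@dataclass
class LvnGraphEdge:
    src: int
    dst: int                  # hierarchical 1:N parent -> child link
    length_m: float
    is_cable: bool            # cable vs. overhead line
    fault_probability: float  # derived from the technical parameters

bank = [LvnGraphVertex(1, 'transformer', 1.0, 0.02),
        LvnGraphVertex(2, 'distribution_board', 0.8, 0.05)]
print(len(bank))
```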
5 SGFI Algorithm

The basic operating criterion of the SGFI algorithm is that a signal is received from the sporadically installed sensors on the graph about whether there is power or not. The other operating conditions are given by the associative logical rules of network fault location:
– (1) the “selective network protection logic”, that is, the breaker release logic from the consumer towards the electricity supplier (from the consumer via the distribution board to the transformer);
– (2) the probability that faults happen at the same time in two nodes independently from each other is negligibly small, therefore the algorithm jumps to the first common connection point of the two nodes, where the source of the fault is;
– (3) signals arrive from the fault location(s) via smart sensors or through the TeleCentrum;
– (4) the low voltage electricity network is provided with intelligent distribution boards containing smart sensors.

The steps of the algorithm can be seen in Fig. 3 (first procedure) and in Fig. 4 (second procedure). The algorithm loads the adjacency matrix A from the DB database.

5.1 Localization of the Fault Zone (Procedure 1)
Based on matrix A, the algorithm generates graph G with the following characteristics, labels and weights: V is the set of graph nodes, E the set of graph edges, cG a connectivity function interpreted on the edges of graph G in the incidence matrix, and WEdges the matrix containing the weights of the edges. The algorithm collects the fault-affected labelled nodes from the smart sensors' signals and interprets them as the fault vector X1. It computes and generates the Laplace spectrum of the fault-affected graph. It considers the built-in base rule-set no. 1, the reduction rule-set: if L(i, j) reaches the threshold δ1 and avr(L(i', j')) < δ2, then the algorithm cuts the corresponding edge, where δ1 and δ2 are thresholds. Here avr(L(vi)) means the average degree of the neighbouring nodes around node Vi within radius r (the radius may be considered as a number of nodes and edges in all directions). The algorithm prepares X2 according to δ1 and δ2. On the sub-graph S, the cutting of further extreme edges occurs according to rule base no. 2, which builds on the δ3 attributes. This forms the outer edges matrix O with normalized data of each edge. WEdge assigns to the edges the length of the cables or lines, the number of bindings on the given lengths, the fault probabilities according to this technical content, and the age of the cables or lines as parameters to be taken into account. The algorithm takes all these from the predefined data bank DB. At is the adjacency matrix of the fault zone section graph S, from which the algorithm creates the hierarchical tree object T, weighted and equipped with attributes according to the vector levels. From the point of view of network diagnostics, the section graph's adjacency matrix can be expanded with further special elements and attributes.

5.2 Fault Position Localization in the Fault Zone (Procedure 2)
Fig. 3. The steps of the algorithm (Procedure 1)

Fig. 4. The steps of the algorithm (Procedure 2)

The algorithm loads the hierarchical tree object T and from it creates the matrix containing the fault vector X2. It reviews on the fault tree whether an individual or a group fault has occurred. An individual fault is considered when only one consumer is affected; in this case the algorithm provides the position of the fault. In order to validate the individual fault position, the definition and processing of the level vector was introduced, by which it can be checked whether there is another fault on the same level or one level higher than the earlier determined fault. If the algorithm does not find another individual fault, it marks the original individual fault location as the fault source. However, if another fault is found, the algorithm checks whether there is any correlation between the two individual faults. If a correlation is found in relation to the problem source, then the fault location is on the higher level vector at the connection point of the two fault labels, and the algorithm marks this node as the fault address. The algorithm performs this verification and evaluation process by the associative logical rules of network fault location worked out by the authors of this paper. The dynamic matrix Q contains the results of all these decisions; it serves as the mathematical description of the fault tree [12, 15] and assigns the appropriate level vector matrix and the node number on the level, down from the intelligent fault signal level, to each node. Step S1 is the comparison of the node's level vector with the number of fault signals on level k. For the evaluation it is sufficient to use the reduced level vector QRed, l = l0, ..., lFaultlevel. If there are two individual fault signals on one level vector, then the fault location is on the highest level vector at their common connection point above them, according to the associative logical rules of network fault location, and the algorithm identifies this place
[21]. The S1 comparison runs on each level vector until it finds the k − 1 level where the faulty device is located. Then the algorithm assigns the individual identification of the device, the fault address and the fault location probability. If the fault location's probability is low, the algorithm sends out the message “check the smart sensors” along with the probability value. If not an individual but a group fault happened, the algorithm runs step S1 until it has reviewed the fault tree and localized the fault position and the faulty device. Following this, the algorithm assigns the individual identification of the device, the fault address and the fault location probability; again, if the fault location's probability is low, the algorithm sends the message “check the smart sensors” along with the probability value.

5.3 Result of the Proposed Algorithm

The result of running the SGFI algorithm (both for individual and group faults) is a fault location attribute table in the format of Table 1. This table contains the characteristics of the fault location X2. The algorithm's localization of the fault position using matrix Q and the level vectors is depicted in Fig. 6.

Table 1. Fault location attribute format as the result of the SGFI algorithm
1 Individual identification of a device
2 Fault location's address (H)
3 Probability of the precision of the fault position localization P(H)
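The output record of Table 1 and the low-probability warning can be sketched as a small helper; the threshold value and field names below are our assumptions, only the attribute set and the “check the smart sensors” message follow the text.

```python
# Hypothetical shape of the SGFI output record (Table 1). The power supplier
# defines what "low" means; 0.5 here is an arbitrary illustrative threshold.
def fault_record(device_id, address, probability, low_threshold=0.5):
    rec = {'device_id': device_id, 'address': address, 'P(H)': probability}
    if probability < low_threshold:
        rec['message'] = 'check the smart sensors'
    return rec

print(fault_record('DB-N19', 'N19', 0.35))
```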
The fault location attribute table (Table 1) contains the individual identification of the device, the fault location's address (H) and the probability of the precision of the fault position localization P(H). If P(H) is low (the power supplier company defines the “low” value), the algorithm sends a fault message to the decision makers: “check the smart sensors”. A localization of the fault position with low probability may have several causes: not enough smart sensors are available for localizing the fault address with a higher probability, smart sensor communication failure, etc. The simulation analysis of the SGFI algorithm was carried out using MATLAB R2017a.

5.4 Simulation Analysis of the SGFI Algorithm

In this section we present the performance of the SGFI algorithm through simulation investigations. Assume that there is an individual fault in node N19, and therefore a fault signal which gets indicated in the fault vector matrix X2. Figure 5 shows the algorithm's localization of the fault location using matrix Q and the level vectors. The Q matrix of the level vectors for the N19 fault location is then the following:

Q = [rows list the level vectors l0–l3 with the node labels N1, N2, N7, ..., N17, N18, N19 along each path]
Fig. 5. Example for an individual fault localization
The evaluation of example 1 is based on the associative logical rules of network fault location: if only the smart sensor in node N19 has sent a fault signal, then it is an individual fault and its position is 100% in N19, as per the network fault location level vector l0 l1 l2 l3 (path N1–N2–N7–...–N19). In practice this means that in node N19 a distribution board can be found in which, with high probability, the fuse needs to be changed. In the second example we assume that there is a group of faults, consisting of faults occurring in node N29 and in node N19, with fault signals realized in the fault vector matrix. The Q matrix of the level vectors for the N29 and N19 fault locations will then be the following:

Q = [rows list the level vectors l0–l3 with the node labels N1, N2, N7, ..., N29, N19 along each path]
Evaluating the different fault locations (when N29 and N19 send fault signals at the same time) based on the associative logical rules of network fault location, we get a group fault case, and the fault is at the first connection point of the two fault signals, at level l1, in node N2 (level vector l0, l1: N1, N2).
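Associative rule (2) — jumping to the first common connection point of two simultaneous fault signals — amounts to finding the lowest common ancestor in the supply tree. A sketch on a toy topology follows; the parent links and node numbers below are illustrative, not the paper's network.

```python
# Rule (2): for two simultaneous fault signals, the fault source is their
# first common connection point toward the supply (transformer = node 0).
parent = {1: 0, 2: 1, 7: 2, 19: 7, 29: 7}   # illustrative toy topology

def path_to_root(n):
    """Walk from a node up to the transformer, collecting the path."""
    p = [n]
    while p[-1] in parent:
        p.append(parent[p[-1]])
    return p

def first_common_point(a, b):
    """First node shared by the two paths toward the supply (the LCA)."""
    ancestors = set(path_to_root(a))
    for n in path_to_root(b):
        if n in ancestors:
            return n

print(first_common_point(19, 29))  # the shared connection point of N19, N29
```

On this toy tree the two fault signals from nodes 19 and 29 meet at node 7, which the rule marks as the fault source.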
Fig. 6. Example for group fault localization
6 Conclusions

The current fault localization practice of LV electricity networks has been presented, and solutions for its development have been proposed which can improve the quality of the services provided through these networks. The boundary condition of the recommendations is the installation of sporadically located smart sensors throughout the LV networks. The smart sensors, in case of a fault in the nodes of a tree structure network, send online signals to the operations control center. The potential development opportunities based on the information coming from the sensors' fault messages and the processing of the consumers' phone calls have been analysed. The graph theory model was constructed using the data provided by the ELMŰ-ÉMÁSZ power supply companies, taking into account the topology of the LV electricity network. Based on the graph model, a fault probability data bank was assigned to the topology, allowing the localization of the fault zone and the faulty device within the network. For the localization of the fault position, a priori probability parameters were introduced. Using the data bank, the SGFI algorithm was worked out, allowing the fast localization of a faulty device based on the sporadically installed smart sensors and the consumers' notifications. The output of the SGFI algorithm to the operations control center is a fault location attribute table that also contains the individual identification number of the faulty device. During the development of the algorithm, special attention was paid to the wide applicability of the results in industry. The advantage of the SGFI algorithm is that it is independent of the electrical parameters; consequently, it can be used universally in any tree structure network where online remote signals are sent to a data processing centre.

Acknowledgement. The authors thankfully acknowledge the support of the AD&TE Research Group of Óbuda University and the ELMŰ Network Ltd. (Hungary).
References
1. Geometria Kft.: MIRTUSZ workflow system, functions and their use. Geometria Kft., ELMŰ Network Ltd. and ÉMÁSZ Network Ltd., private edition, Budapest (2005)
2. Anderson, W., Morley, T.: Eigenvalues of the Laplacian of a graph. Linear Multilinear Algebra 18, 141–145 (1985)
3. Archdeacon, D.: Topological graph theory: a survey. Congressus Numerantium 115, 5–54 (1996)
4. Fiedler, M.: Algebraic connectivity of graphs. Czech. Math. J. 23, 298–305 (1973)
5. Fowles, G.: Analytical Mechanics, 4th edn. Saunders College Publishing, Philadelphia (1986)
6. Gösi, Z.: Value of the Manpower. Corvinus University of Budapest, Institute of Business Economics, Budapest (2007). ISSN 1786-3031
7. Holcsik, P., Pálfi, J.: Use SCADA features in low-voltage network operation management. In: Smart Cities Conference, Óbuda University, Budapest, Hungary, p. 156 (2015). ISBN 978-615-5460-57-9
8. Newman, W.M.: The Laplacian Spectrum of Graphs (thesis). University of Manitoba, Winnipeg, Canada, 0-612-57564-0 (1985)
9. P. H. M. T.: Electrician forms' evaluation using machine learning methods. In: Proceedings of the 9th International Scientific Symposium on Electrical Power Engineering Elektroenergetika, Stará Lesná, Slovak Republic, pp. 389–394. Technical University of Košice, Faculty of Electrical Engineering and Informatics, Department of Electrical Power Engineering, 12–14 September 2017. ISBN 978-80-553-3195
10. Pálfi, J., Tompa, M., Holcsik, P.: Analysis of the efficiency of the recloser function of LV smart switchboards. Acta Polytech. Hung. 14(2), 131–140 (2017)
11. Pokorádi, L.: Fault tree sensitivity analysis. In: 20th Scientific Meeting of Young Technicians, Cluj-Napoca, pp. 263–266 (2015)
12. Rausand, M., Høyland, A.: System Reliability Theory: Models, Statistical Methods and Applications, 2nd edn. Wiley-Interscience, Hoboken (2004)
13. Rozenblat, A.: Matrix Determinant with Graphs for Laplace's and Expansion Methods. AuthorHouse (1986). ISBN 978-1477293508
14. Souran, D.M., et al.: Smart grid technology in power systems. In: Souran, D.M., et al. (eds.) Soft Computing Applications, Advances in Intelligent Systems and Computing, vol. 357, pp. 1367–1381. Springer International Publishing, Switzerland (2016)
15. Stamatelatos, M., Vesely, W.: Fault Tree Handbook with Aerospace Applications. NASA Office of Safety and Mission Assurance, Washington, DC (2002)
16. Wilson, R.: Introduction to Graph Theory. Longman Group Ltd., Essex (1996). ISBN 0-582-24993-7
Probability Distribution Functions for Short-Term Wind Power Forecasting

Harsh S. Dhiman and Dipankar Deb

Institute of Infrastructure Technology Research and Management, Ahmedabad 380026, Gujarat, India
{harsh.dhiman.17pe,dipankardeb}@iitram.ac.in
Abstract. Wind energy estimation is pivotal to ensure grid-side management and optimal dispatch of wind power. The wind speed distribution for a given wind site can be modeled using various probability distribution functions (PDFs) like the Weibull, Gamma and Log-normal distribution functions. PDFs like the Weibull and Log-normal do not fit all real-time wind speed scenarios. In this paper we analyze PDFs for short-term wind power forecasting in a low wind speed regime, based on a combined wind speed and wind direction PDF. Short-term wind power forecasting based on ε-Support Vector Regression (SVR) and Artificial Neural Networks (ANN) was carried out for three wind farm sites in Massachusetts. The forecasting results were tested for mixture-density Weibull and Lindley PDFs; in terms of root mean squared error, the Weibull PDF outperformed the Lindley PDF.

Keywords: Wind speed distribution · Weibull distribution · Generalized Lindley · Wind power forecasting · SVM · ANN
1 Introduction
With the growing energy crisis across the globe, efficient and effective usage of renewable resources has gained a lot of importance recently [8]. Wind being a random variable, its accurate estimation and prediction are imperative for sufficient power generation [12]. Wind power forecasting helps to predict accurate power, which is important for market operators in order to clear energy imbalances caused by the stochastic nature of wind; it is also used to clear day-ahead markets. Wind speed and power forecasting methods are categorized based on the prediction interval, that is, very short-term forecasting, short-term forecasting and medium- to long-term forecasting [9]. The wind speed distribution of a particular regime depends on topographical features like air temperature, pressure, humidity, etc. Various wind speed distribution functions like the Weibull, Rayleigh, Log-normal and Gamma distributions have been proposed in the literature [4]. Amongst these probability distribution functions, the Weibull distribution is most commonly used owing to its simplicity.

© Springer Nature Switzerland AG 2021. V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 60–69, 2021. https://doi.org/10.1007/978-3-030-51992-6_6

The Weibull distribution is widely used in applications like
reliability analysis (failure rate), industrial engineering (to represent manufacturing and delivery times), and wind speed distribution [14]. The Weibull distribution holds several advantages, like flexibility, and generally gives a good fit to the observed wind speed [16]. However, the Weibull distribution cannot accurately estimate all wind regimes, including bimodal wind distributions involving two peaks. Various other probability functions like the Rayleigh and Log-normal distributions have also been quoted in the literature. Studies have shown that a two-component Weibull PDF works well with bimodal wind speed distributions. Similar to the Weibull distribution, mixture density functions have also been proposed [6]. This paper focuses on the effect of the Lindley cumulative probability distribution function on short-term wind power forecasting. Wind power is directly proportional to the cube of wind speed, and it depends on two random variables: wind speed and wind direction. We analyze the forecasting accuracy and study the effect of a joint probability distribution function on short-term wind power forecasting. A joint probability distribution function of Lindley and Weibull type, having wind speed and wind direction as its variables, is chosen. The forecasting is done using two methods, that is, Support Vector Regression (SVR) and Artificial Neural Networks (ANN). The paper is organized as follows: Sect. 1 is this introduction. In Sect. 2, various wind probability distribution functions are highlighted with case studies of three wind farms in Massachusetts (USA). Section 3 discusses wind power forecasting methods using support vector regression and artificial neural networks and highlights the features used in forecasting wind power, followed by the Conclusions.
2 Wind Probability Distribution Functions
Probability density functions are widely used in statistics and other computational fields where the probability of occurrence of a random variable, such as wind speed, is described mathematically. However, wind speed is not the only variable affecting the power captured by the wind turbine; the correlation between wind speed and wind direction is equally important. The literature also shows implementations of mixture density functions, which are composites of probability density functions, each with an associated weight [20]. The following subsections discuss different probability distribution functions for the wind farm sites Blandford, Paxton and Bishop & Clerks, all three located in western Massachusetts.

2.1 Weibull Distribution
The most widely used wind PDF is the Weibull, which fits well for almost all wind regimes. The two-parameter (k and λ) Weibull distribution is represented as

f(v; k, λ) = (k/λ) (v/λ)^(k−1) e^(−(v/λ)^k),    (1)
where k, λ > 0 and v > 0. In this study, v is a random variable, the wind speed (in m/sec), and k is known as the shape parameter. Estimation of the Weibull parameters can be done by the maximum likelihood method [17]. Similarly, joint Weibull PDFs can be used to model wind farms with bimodal wind regimes. A two-component joint density Weibull PDF can be expressed as

f_i(v; k_i, λ_i) = (k_i/λ_i) (v/λ_i)^(k_i − 1) e^(−(v/λ_i)^(k_i)),  i = 1, 2,
f(v; k, λ) = w1 f1(v; k1, λ1) + (1 − w1) f2(v; k2, λ2),    (2)
where k1, λ1 and k2, λ2 represent the shape and scale parameters of the individual PDFs, which in turn were estimated using Maximum Likelihood Estimation (MLE).
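As a concrete illustration of the MLE step (our own sketch, not code from the paper), the two-parameter Weibull of Eq. (1) can be fitted with SciPy. The sample below is synthetic; its shape and scale values merely mimic the Bishop & Clerks entry of Table 1:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic wind speeds (m/sec) standing in for measured data;
# k = 3.5, lambda = 4.3 approximate the Bishop & Clerks values in Table 1.
v = stats.weibull_min.rvs(c=3.5, scale=4.3, size=5000, random_state=rng)

# Maximum likelihood fit of the two-parameter Weibull (location fixed at 0).
k_hat, _, lam_hat = stats.weibull_min.fit(v, floc=0)
print(f"k = {k_hat:.3f}, lambda = {lam_hat:.3f}")
```

With a few thousand samples, the recovered parameters land close to the values used to generate the data.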
2.2 Lindley Distribution for Random Variables
Unimodal wind regimes can easily be described by Weibull and Gamma distributions [6]. In extremely low and high wind regions, Weibull and other PDFs are reported not to perform well, and in such cases the Lindley distribution function is suggested [2]. The Lindley distribution has two versions, the generalized Lindley and the power Lindley, as described in [2]. The generalized Lindley f_gl and power Lindley f_pl probability functions for a random wind speed v can be represented as

f_gl(v) = (αβ²/(β + 1)) (1 + v) [1 − ((1 + β + βv)/(1 + β)) e^(−βv)]^(α−1) e^(−βv),    (3)
f_pl(v) = (αβ²/(β + 1)) (1 + v^α) v^(α−1) e^(−βv^α),    (4)

where α and β represent the shape and scale parameters, respectively. The cumulative distribution function of the generalized Lindley is expressed as

F_cgl(v) = [1 − ((1 + β + βv)/(β + 1)) e^(−βv)]^α.    (5)
In [13], the authors discuss the extended generalized Lindley distribution for random variables and its superiority over well-known PDFs like the Weibull, Rayleigh and two-component Weibull distribution functions, as well as the generalized Lindley. However, not much literature is available on the extended generalized Lindley distribution. The parameters of the generalized Lindley distribution can be found using maximum likelihood estimation (MLE) [15]. Table 1 shows the parameters estimated by MLE. Figure 1 shows the Weibull and generalized Lindley PDF plots for a low wind speed regime (1–5 m/sec) in Bishop & Clerks and Paxton, MA. For lower wind speeds (below 6 m/sec), the Weibull fits the observed wind speed better than the Lindley.
Table 1. Parameters for cumulative probability distribution function, Weibull distribution

Wind farm        (k1, λ1)           (k2, λ2)
Bishop & Clerks  (3.5241, 4.3487)   (3.4719, 4.1371)
Paxton           (2.5053, 3.4373)   (2.4495, 3.2379)
Blandford        (3.1857, 2.8273)   (3.1247, 2.7098)
Fig. 1. Weibull & Lindley distribution for wind speed in Massachusetts (MA).
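To make Eqs. (3) and (5) concrete, here is a small sketch (ours, not from the paper) that evaluates the generalized Lindley PDF and CDF and verifies numerically that integrating the PDF recovers the CDF; the grid bounds are arbitrary:

```python
import numpy as np

def gen_lindley_pdf(v, alpha, beta):
    # Eq. (3): generalized Lindley PDF.
    base = 1.0 - (1.0 + beta + beta * v) / (1.0 + beta) * np.exp(-beta * v)
    return (alpha * beta**2 / (beta + 1.0)) * (1.0 + v) * base**(alpha - 1) * np.exp(-beta * v)

def gen_lindley_cdf(v, alpha, beta):
    # Eq. (5): generalized Lindley cumulative distribution function.
    return (1.0 - (1.0 + beta + beta * v) / (beta + 1.0) * np.exp(-beta * v))**alpha

# Sanity check: trapezoidal integration of the PDF on [0, 6] recovers the CDF at 6
# (the CDF is 0 at v = 0).
alpha, beta = 10.0, 2.0               # parameter values used later in Sect. 2.3
v = np.linspace(0.0, 6.0, 60001)
p = gen_lindley_pdf(v, alpha, beta)
integral = float(np.sum((p[1:] + p[:-1]) / 2.0) * (v[1] - v[0]))
print(integral, float(gen_lindley_cdf(6.0, alpha, beta)))
```

The two printed values agreeing confirms that Eq. (3) is indeed the derivative of Eq. (5).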
2.3 Joint Probability Distribution Function for Wind Forecasting
This paper focuses on improved wind power forecasting, taking into consideration the effect of wind speed and wind direction on the power extracted from the wind resource. In [5], a joint PDF is implemented in order to assess the goodness of fit with respect to other models like the isotropic Gaussian model and the anisotropic Gaussian model. Both these models take wind speed and direction into account as two random variables. Thus, in order to estimate accurate wind power for a given prediction interval, one must consider the dependence of wind direction on wind speed, as described by [21]. The joint probability distribution function is used as a feature while predicting short-term wind power using support vector machines. The generalized Lindley PDF with two weights w1 and w2 is used in this paper. The power extracted by a wind turbine at a given free stream wind speed v with the joint PDF is given as

P(v) = (1/2) ρ A f(v) f(θ) v³,
f(v) = w1 f(v; α1, β1) + (1 − w1) f(v; α2, β2),
f(θ) = w1 f(θ; α1, β1) + (1 − w1) f(θ; α2, β2),    (6)

where f(v, θ) = f(v) f(θ) is the joint PDF, used as a feature along with wind speed in wind power forecasting. In our study of short-term wind power using ε-SVR and ANN, we use a joint Lindley cumulative PDF which is a composite of the individual wind speed and wind direction Lindley functions, as described by [2], given as

F_cdv(v) = [1 − ((1 + β + βv)/(1 + β)) e^(−βv)]^α.    (7)
In our model, the parameters are taken as α1 = α2 = 10, β1 = β2 = 2 and w1 = 0.5.
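A sketch of how the joint feature of Eqs. (6)-(7) could be assembled (our illustration only; the rotor area, the direction value fed to the Lindley function, and the function names are assumptions, not values from the paper):

```python
import numpy as np

def lindley_cdf(x, alpha, beta):
    # Eq. (7): generalized Lindley cumulative distribution.
    return (1.0 - (1.0 + beta + beta * x) / (1.0 + beta) * np.exp(-beta * x))**alpha

def joint_pdf_feature(v, theta, w1=0.5, a1=10.0, b1=2.0, a2=10.0, b2=2.0):
    # Eq. (6)-style weighted mixtures for wind speed and wind direction,
    # using the parameter values stated in the text (alpha = 10, beta = 2, w1 = 0.5).
    f_v = w1 * lindley_cdf(v, a1, b1) + (1 - w1) * lindley_cdf(v, a2, b2)
    f_th = w1 * lindley_cdf(theta, a1, b1) + (1 - w1) * lindley_cdf(theta, a2, b2)
    return f_v * f_th

rho, A = 1.225, 5027.0   # air density (kg/m^3) and an assumed rotor area (m^2)
v, theta = 4.0, 0.8      # illustrative wind speed (m/sec) and direction value
P = 0.5 * rho * A * joint_pdf_feature(v, theta) * v**3   # Eq. (6)
print(P)
```

In forecasting, `joint_pdf_feature` would be evaluated per sample and stacked next to the raw wind speed as a regression input.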
3 Short-Term Wind Power Forecasting Using SVR and ANN
Since wind speed is a stochastic variable, its high variability reduces system reliability and stability, thereby increasing operational costs, as stated by [1, 7, 10]. Wind forecasting methods are categorized by time horizon: very short-term, short-term and medium-term forecasting [11, 18].

3.1 Wind Power Forecasting Using Support Vector Regressors
Developed by [19], the support vector regressor is a statistical learning algorithm that estimates a regression model between inputs (x1, x2, ..., xn) and outputs (y1, y2, ..., yn) by minimizing the risk function

min (1/2)‖w‖² + C Σ_{i=1}^{n} (ξi + ξi*),    (8)

s.t.  yi − ⟨w, xi⟩ − b ≤ ε + ξi,
      ⟨w, xi⟩ + b − yi ≤ ε + ξi*,    (9)

where w represents the weights associated with each input vector xi, b and ε are the bias and tolerance terms, and ξi and ξi* are the slack variables introduced to solve the convex optimization problem. We consider the wind farms Bishop & Clerks, Paxton and Blandford (all three located in western Massachusetts).
Table 2. Descriptive wind statistics of Bishop & Clerks, Paxton and Blandford

Wind farm        Max speed (m/sec)  Min speed (m/sec)  Std dev (m/sec)
Bishop & Clerks  13.31              0.36               2.5923
Paxton           11.663             1.345              2.0576
Blandford        13.73              0.3                2.1242
The wind farm data of the previous one week was taken for training the SVR model. The descriptive statistics of the three wind farms are listed in Table 2. The performance of the Weibull and Lindley PDFs was tested on three datasets, one per wind farm, each representing a low wind speed regime. Prediction accuracy of the ε-SVR and ANN models is calculated using

RMSE = sqrt( (1/N) Σ_{t=1}^{N} (P̂_t − P_t)² ),

where P̂_t represents the forecasted wind power at time interval t and P_t represents the actual wind power at that time interval. The support vector regression (ε-SVR) model here uses a quadratic kernel function for wind power forecasting. In terms of the feature set, the model was tested with the joint PDF and wind speed. The forecasting is done with two cases:

1. Case A: Wind power forecasting is carried out considering the Weibull joint PDF as one of the inputs to the SVR model.
2. Case B: Wind power forecasting is carried out considering the Lindley joint PDF as one of the inputs to the SVR model.

The SVR forecasting results depend on the choice of kernel function and its parameter values. Table 3 shows the kernel scale values for a polynomial kernel of degree 2 and the short-term wind power forecasting results for the three sites.

Table 3. Performance metrics (RMSE) and SVR kernel parameters
Wind farm        RMSE (Case A)  RMSE (Case B)  Kernel scale
Bishop & Clerks  0.9410         0.2671         0.3754
Paxton           1.9387         3.2098         0.4303
Blandford        0.8004         1.0145         0.6336
As seen in Fig. 2, the prediction performance for Case A (with the Weibull PDF) is better than that of Case B (with the Lindley PDF). In all three wind farm sites, the SVR model that considers the Weibull joint PDF outperforms the Lindley joint PDF. The joint PDF captures the correlation of wind speed and wind direction.
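The Case A/B setup can be sketched with scikit-learn's `SVR` using the quadratic (degree-2 polynomial) kernel named in the text. The data below is synthetic, since the Massachusetts measurements are not reproduced here, and the `C` and `epsilon` values are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
# Synthetic stand-in data: wind speed (m/sec) and a joint-PDF feature in [0, 1];
# the target mimics the cubic speed-power relation with additive noise.
v = rng.uniform(1, 5, 300)
joint_pdf = rng.uniform(0, 1, 300)
power = 0.5 * 1.2 * joint_pdf * v**3 + rng.normal(0, 0.5, 300)

X = np.column_stack([v, joint_pdf])
X_train, X_test = X[:240], X[240:]
y_train, y_test = power[:240], power[240:]

# epsilon-SVR with a degree-2 polynomial kernel, as described in the text.
model = SVR(kernel="poly", degree=2, C=10.0, epsilon=0.1)
model.fit(X_train, y_train)

rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
print(f"RMSE: {rmse:.4f}")
```

Swapping the Weibull-based joint-PDF column for the Lindley-based one is all that distinguishes Case A from Case B in this setup.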
[Figure 2 shows, for each of the three sites (Bishop & Clerks, Paxton and Blandford, MA), the actual wind power and the forecasted wind power for Cases A and B over 200 samples at 10-min resolution; wind power in kW.]

Fig. 2. Wind power forecasting with SVR.
3.2 Wind Power Forecasting Using ANN
Another machine learning algorithm used for wind power forecasting is the artificial neural network (ANN), whose working is similar to that of the brain, as described by [3]. ANNs find application in pattern recognition, prediction and optimization. A neural network can be represented as

f(x) = φ( Σ_{i=1}^{N} w_i x_i + b ),   φ(g) = 1/(1 + e^(−g)),    (10)

where w_i are the weights associated with each neuron in the input layer, x_i are the inputs, b is the bias term, and φ denotes the activation or transfer function (here the sigmoid function). To carry out the wind power forecast using ANN, 800 of the 1000 samples were chosen as training data for each of the three wind farms, and the Levenberg-Marquardt training algorithm was used. Figure 3 shows that using the Weibull joint PDF as one of the inputs to the ANN model gives better performance than the Lindley PDF, except for the Bishop & Clerks wind farm site.
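A rough scikit-learn counterpart of the network in Eq. (10) is sketched below. This is our illustration under stated assumptions: scikit-learn offers no Levenberg-Marquardt trainer, so `lbfgs` stands in, and the hidden-layer size and synthetic data are ours:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
# 1000 synthetic samples; the first 800 train the model, as in the paper's setup.
v = rng.uniform(1, 5, 1000)
joint_pdf = rng.uniform(0, 1, 1000)
y = 0.5 * 1.2 * joint_pdf * v**3     # noiseless cubic speed-power stand-in

X = np.column_stack([v, joint_pdf])
scaler = StandardScaler().fit(X[:800])
X_s = scaler.transform(X)

# One hidden layer with the sigmoid ("logistic") activation, echoing Eq. (10).
ann = MLPRegressor(hidden_layer_sizes=(20,), activation="logistic",
                   solver="lbfgs", max_iter=2000, random_state=0)
ann.fit(X_s[:800], y[:800])

rmse = np.sqrt(np.mean((ann.predict(X_s[800:]) - y[800:]) ** 2))
print(f"test RMSE: {rmse:.3f}")
```

Scaling the inputs matters here: the sigmoid saturates quickly on raw wind-speed magnitudes.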
[Figure 3 shows the corresponding actual and forecasted wind power (Cases A and B) for Bishop & Clerks, Paxton and Blandford, MA, over 200 samples at 10-min resolution; wind power in kW.]

Fig. 3. Short-term wind power forecasting using ANN.
Table 4 below shows the RMSE results for Case A and Case B.

Table 4. Performance metrics (RMSE) using ANN

Wind farm        Case A   Case B
Bishop & Clerks  4.2577   4.2285
Paxton           3.4667   3.7872
Blandford        3.9355   4.1558

4 Conclusion
This paper presents short-term wind power forecasting for three different wind farm sites. The joint PDF, comprising individual Lindley and Weibull distribution functions for wind speed and wind direction, was taken as a feature input for our ε-SVR and ANN models. Two cases were considered to examine the effect of the joint PDF on forecasting short-term wind power. The forecasting results show that Case A gives better prediction performance than Case B, suggesting that the Weibull PDF fits lower wind speed regimes better than the Lindley in real-time scenarios. Further, based on RMSE, the SVR model outperforms the ANN for all the wind farm sites.
References

1. Akçay, H., Filik, T.: Short-term wind speed forecasting by spectral analysis from long-term observations with missing values. Appl. Energy 191, 653–662 (2017)
2. Arslan, T., Acitas, S., Senoglu, B.: Generalized Lindley and power Lindley distributions for modeling the wind speed data. Energy Convers. Manage. 152, 300–311 (2017)
3. Ata, R.: Artificial neural networks applications in wind energy systems: a review. Renew. Sustain. Energy Rev. 49, 534–562 (2015)
4. Carta, J., Ramírez, P., Velázquez, S.: A review of wind speed probability distributions used in wind energy analysis. Renew. Sustain. Energy Rev. 13(5), 933–955 (2009)
5. Carta, J.A., Ramírez, P., Bueno, C.: A joint probability density function of wind speed and direction for wind energy analysis. Energy Convers. Manage. 49(6), 1309–1320 (2008)
6. Chang, T.P.: Estimation of wind energy potential using different probability density functions. Appl. Energy 88(5), 1848–1856 (2011)
7. Dhiman, H., Deb, D., Muresan, V., Balas, V.: Wake management in wind farms: an adaptive control approach. Energies 12(7), 1247 (2019). https://doi.org/10.3390/en12071247
8. Dhiman, H.S., Deb, D.: Decision and Control in Hybrid Wind Farms. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-0275-0
9. Dhiman, H.S., Deb, D., Balas, V.E.: Supervised Machine Learning in Wind Forecasting and Ramp Event Prediction. Elsevier (2020). https://doi.org/10.1016/c2019-0-03735-1
10. Dhiman, H.S., Deb, D., Foley, A.M.: Lidar assisted wake redirection in wind farms: a data driven approach. Renewable Energy (2020). https://doi.org/10.1016/j.renene.2020.01.027
11. Dhiman, H.S., Deb, D., Guerrero, J.M.: Hybrid machine intelligent SVR variants for wind forecasting and ramp events. Renew. Sustain. Energy Rev. 108, 369–379 (2019). https://doi.org/10.1016/j.rser.2019.04.002
12. Dhiman, H.S., Deb, D., Muresan, V., Unguresan, M.L.: Multi-criteria decision making approach for hybrid operation of wind farms. Symmetry 11(5), 675 (2019). https://doi.org/10.3390/sym11050675
13. Kantar, Y.M., Usta, I., Arik, I., Yenilmez, I.: Wind speed analysis using the extended generalized Lindley distribution. Renewable Energy 118, 1024–1030 (2018)
14. Lai, C.D., Murthy, D., Xie, M.: Weibull distributions and their applications. In: Springer Handbook of Engineering Statistics, pp. 63–78. Springer, London (2006)
15. Maiti, S.S., Mukherjee, I.: On estimation of the PDF and CDF of the Lindley distribution. Commun. Stat. Simul. Comput. 47, 1–12 (2017)
16. Ouarda, T., Charron, C., Shin, J.Y., Marpu, P., Al-Mandoos, A., Al-Tamimi, M., Ghedira, H., Hosary, T.A.: Probability distributions of wind speed in the UAE. Energy Convers. Manage. 93, 414–434 (2015)
17. Seguro, J., Lambert, T.: Modern estimation of the parameters of the Weibull wind speed distribution for wind energy analysis. J. Wind Eng. Ind. Aerodyn. 85(1), 75–84 (2000)
18. Soman, S.S., Zareipour, H., Malik, O., Mandal, P.: A review of wind power and wind speed forecasting methods with different time horizons. In: North American Power Symposium 2010. IEEE (2010)
19. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, New York (2000)
20. Wang, J., Hu, J., Ma, K.: Wind speed probability distribution estimation and wind energy assessment. Renew. Sustain. Energy Rev. 60, 881–899 (2016)
21. Yuan, X., Tan, Q., Lei, X., Yuan, Y., Wu, X.: Wind power prediction using hybrid autoregressive fractionally integrated moving average and least square support vector machine. Energy 129, 122–137 (2017)
Modeling, Algorithms, Optimization, Reliability and Applications
Feedback Seminar Analysis - An Introductory Approach from an Intelligent Perspective

Adriana Mihaela Coroiu and Alina Delia Călin

Babeș-Bolyai University, 400084 Cluj-Napoca, Romania
{adrianac,alinacalin}@cs.ubbcluj.ro
Abstract. This paper presents a mixture of Artificial Intelligence topics, automated learning with deep learning and natural language processing, as an advantage provided by intelligent and automated approaches in opinion mining. The main objective of this paper is to determine whether a written review contains a positive or a negative opinion and to discover which intelligent approach detects this best. The data set used in our experiments contains 180 opinions/reviews collected from students and related to a particular seminar. The methods used in our experiments are Multinomial Naive Bayes, Support Vector Machine with the RBF kernel, linear Support Vector Classification and a Recurrent Neural Network with Long Short-Term Memory units. The computed metrics used to evaluate the methods' performance are accuracy, precision, recall and F1-score.

Keywords: Text mining · Tokenization · Lemmatization · Feature engineering
1 Introduction

Nowadays we live in a world full of information. In the social media age, through internet platforms, people's opinions, reviews, and recommendations have become a valuable resource for political science and businesses in general and for each person in particular. These modern technologies also provide us with methods to collect and analyze such data most efficiently [13, 19]. With all these aspects in mind, it is important to know what methods are currently used, how they can be used, and what the advantages or disadvantages of each are. Data analysis is a prolific research and working field, and Machine Learning (ML) tools are becoming more and more known and used [14, 17]. Opinion Mining (also known as Sentiment Analysis) is a subfield of Natural Language Processing (NLP) and is concerned with analyzing the polarity of documents [1, 15]. Combining these two subfields of Artificial Intelligence (AI), we can obtain relevant information. This information can be used in real-life settings to improve our decision-making capacity and to classify or cluster any new items that we collect or find [9].
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 73–80, 2021. https://doi.org/10.1007/978-3-030-51992-6_7
We chose these two fields due to their relevance for our research goal: finding an automated way to objectively discern between positive and negative reviews of the teaching manner in a faculty seminar. The paper is structured as follows: first, a state-of-the-art section covering other notable papers on our topic; second, the Computational Experiments Description section presenting the steps of an experiment, a description of the data set, and the classification methods used. The paper proceeds with the evaluation methods used to check whether our classification methods work properly; finally, we present the results and conclude with some remarks and future work directions.
2 State of the Art

In the literature there are several papers relating ML approaches to NLP: emotion analysis through the analysis of emoticon use [3]; analysis of blog (Twitter, Facebook) posts [4]; or analysis of newspaper news [5]. Within those analyses, various methods are used to detect the attitude of the writer. Nevertheless, this mixed ML and NLP approach is quite new and still allows many other studies that can provide added value to the scientific community.
3 Computational Experiments Description

Step 1: Data preparation. This stage involves preprocessing the data and distributing it into three main sets:

• Training data: data used in the training process to determine the parameters of the classification model;
• Validation data: data used to analyze the behavior of the classification model throughout the learning algorithm; the performance obtained on the validation set during the learning process is used to decide whether learning should be continued or not;
• Test data: data used to analyze the performance of the trained classification model.

Step 2: Training the classification model. This means extracting the data classification model. During training, we consider the labels associated with each of the seminar reviews, so that we can train our machine learning algorithm based on the given labels. After training, when we pass any unseen seminar review, our trained machine learning algorithm predicts the opinion, that is, whether the provided seminar review indicates a positive or a negative opinion.

3.1 Data Set Description

The data set used in our experiments consists of 180 instances, the reviews of 180 students of the teaching manner in a seminar. The selected seminar is related to the discipline Logic and Functional Programming. In our data set we have only two features: the student's id and the review text itself.
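The three-way split of Step 1 can be realized with two calls to scikit-learn's `train_test_split`. The 60/20/20 proportions and the placeholder corpus below are our assumptions, since the paper does not state its split ratios:

```python
from sklearn.model_selection import train_test_split

# Placeholder corpus standing in for the 180 student reviews and their labels
# (1 = positive, 0 = negative); the real data set is not public.
reviews = [f"review text {i}" for i in range(180)]
labels = [i % 2 for i in range(180)]

# First carve out the test set, then split the remainder into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(
    reviews, labels, test_size=0.2, stratify=labels, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=42)

print(len(X_train), len(X_val), len(X_test))
```

Stratifying both splits keeps the positive/negative balance identical across the three sets.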
3.2 Preprocessing the Data Set
First, we created a vocabulary of unique tokens (words) from the entire set, and then we constructed a feature vector for each document (review) containing the counts of how often each word occurs in that particular document; the unique words in each document represent only a small subset of all the words in the bag-of-words vocabulary [10]. The next step is to assess word relevancy via term frequency-inverse document frequency, because when we analyze text data we often encounter words that occur across multiple reviews from both classes. These frequently occurring words typically don't contain useful or discriminatory information. In this case, a useful technique named term frequency-inverse document frequency (tf-idf) can be used to down-weight these frequently occurring words in the feature vectors. The tf-idf is defined as the product of the term frequency and the inverse document frequency:

tf-idf(t, d) = tf(t, d) · idf(t, d),    (1)

idf(t, d) = log( n_d / (1 + df(d, t)) ),    (2)

where n_d is the total number of documents and df(d, t) is the number of documents d that contain the term t. Adding the constant 1 to the denominator is optional and serves the purpose of assigning a non-zero value to terms that occur in all training samples; the log is used to ensure that low document frequencies are not given too much weight. Next, we have to split the text corpora into individual elements. One way to tokenize documents is to split the cleaned documents into individual words at their whitespace characters. Finally, we have to remove stop words and any other punctuation marks.
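This tf-idf pipeline can be sketched with scikit-learn's `TfidfVectorizer` (note that scikit-learn's smoothed idf differs slightly from Eq. (2); the toy corpus below is ours, not the student reviews):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny illustrative corpus; the real input is the 180 seminar reviews.
docs = [
    "the seminar was great and very well organized",
    "the seminar was boring and too long",
    "great examples, great teaching manner",
]

# Tokenize, drop English stop words, and weight the counts by tf-idf.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

print(X.shape)                         # (documents, vocabulary size)
print(sorted(vectorizer.vocabulary_))  # surviving tokens after stop-word removal
```

Each row of `X` is the sparse tf-idf feature vector for one review, ready to feed the classifiers of the next section.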
4 Learning Models Used as Classification Methods

The purpose of the experiments is to determine whether a review is positive or negative based on its content. To achieve this task, we investigated the behavior of different ML algorithms on our data set.

4.1 Feature Engineering
In the NLP domain, we need to convert raw text into a numerical format so that an ML algorithm can be applied to the numerical data. There are many techniques available, including indexing, count-based vectorization, and Term Frequency - Inverse Document Frequency (TF-IDF) [16].
4.2 Selecting the Machine Learning Model
Opinion mining is a classification problem, and some algorithms are particularly helpful for it. In seminar reviews, you may discover that certain phrases appear quite frequently. If these frequently used phrases indicate some kind of opinion, most likely they indicate either a positive or a negative opinion. We need to find phrases that indicate an opinion; once found, we just need to classify the opinion into either a positive or a negative opinion class [18]. In order to find the actual opinion class, we need to identify the probabilities of the most likely positive and most likely negative phrases, so that based on the higher probability value we can identify whether the given seminar review carries a positive or a negative opinion. The probabilities we take into account are the prior and posterior probability values. This is the fundamental basis of the Multinomial Naive Bayes (MNB) algorithm. In order to have a comparison and decide in a knowledgeable way, we also select a Support Vector Machine (SVM) algorithm, a linear support vector classifier, and an algorithm based on a Recurrent Neural Network (RNN). The MNB is suitable for classification with discrete features, which means that if the features are word counts or TF-IDF vectors, we can use this classifier. The multinomial distribution normally requires integer feature counts; however, fractional counts such as TF-IDF may work as well. So, we apply this algorithm to the training vectors. The fit() method is the step where the actual training is performed; here, we used all hyperparameters at their default values. The SVM with the RBF kernel is trained in the same way. The equation for the RBF kernel function is as follows:

rbf_kernel(x, x′) = exp(−γ ‖x − x′‖²)    (3)
The Linear SVC has more flexibility in the choice of penalties and loss functions, and should scale better to a large number of instances. Besides the model learning methods presented above, we also investigated the behavior of an RNN. To achieve this, we followed several phases: we used a GloVe pre-trained model and trained the model using RNN and LSTM networks. The GloVe model has been pre-trained on a large dataset so that it can generate more accurate vector values for words; that is the reason we use GloVe here [11]. We then load a precomputed ID matrix, meaning we generate the index for each word. This process is computationally expensive, but we will see that it improves the accuracy value. To build the neural network, we used a recurrent neural network (RNN) with Long Short-Term Memory (LSTM) cells as part of its hidden states. LSTM cells are used to store sequential information [12]. If there are multiple sentences, the LSTM stores the context of the previous sentences, which helps us improve the model. We then define the hyperparameters: we set the batch size to 64, the number of LSTM units to 64, the number of classes to 2, and we perform 100,000 iterations.
Once training is done, we can save the trained model and after loading this model, we can check its accuracy.
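The three baseline models above can be sketched as scikit-learn pipelines over tf-idf features (a toy corpus stands in for the real reviews; hyperparameters are left at their defaults, as in the text):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC, LinearSVC

# Toy labeled reviews (1 = positive, 0 = negative) standing in for the data set.
docs = ["great seminar", "very clear teaching", "boring seminar",
        "confusing and too fast", "well organized seminar", "poorly explained"] * 5
labels = [1, 1, 0, 0, 1, 0] * 5

# The three baseline models from the paper, each fed TF-IDF vectors.
models = {
    "MNB": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "SVM-RBF": make_pipeline(TfidfVectorizer(), SVC(kernel="rbf")),
    "L-SVC": make_pipeline(TfidfVectorizer(), LinearSVC()),
}
for name, model in models.items():
    model.fit(docs, labels)
    print(name, model.predict(["great teaching", "boring and confusing"]))
```

Keeping vectorizer and classifier in one pipeline ensures the same vocabulary is applied to unseen reviews at prediction time.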
5 Evaluation of the Classification Models

The quality of a classifier, from the perspective of the correct identification of a class, is measured using the information in the confusion matrix, containing [6]:

• the number of data correctly classified as belonging to the class of interest: true positive cases (TP);
• the number of data correctly classified as not belonging to the class of interest: true negative cases (TN);
• the number of data incorrectly classified as belonging to the class of interest: false positive cases (FP);
• the number of data incorrectly classified as not belonging to the class of interest: false negative cases (FN).

5.1 Metrics Used for Evaluation
The metrics that measure the quality of the evaluation used in this article are described below. Classification accuracy is determined by the ratio of the number of correctly classified instances to the total number of classified instances [7]:

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (4)
The sensitivity metric is the ratio between the number of data correctly classified as belonging to the class of interest and the sum of that number and the number of data incorrectly classified as not belonging to the class of interest:

Sensitivity = TP / (TP + FN)    (5)
The specificity metric is the ratio of the number of data correctly classified as not belonging to the class of interest to the sum of that number and the number of data incorrectly classified as belonging to the class of interest:

Specificity = TN / (TN + FP)    (6)
A. M. Coroiu and A. D. Călin

The precision metric is given by the ratio between the number of data correctly classified as belonging to the class of interest and the sum of that number and the number of data incorrectly classified as belonging to the class of interest [8]:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{7}$$
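The definitions in Eqs. (4)–(7) translate directly into code. The following small self-contained sketch uses illustrative confusion-matrix counts, not values from the paper's experiments:

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall), specificity and precision
    from confusion-matrix counts, per Eqs. (4)-(7)."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision":   tp / (tp + fp),
    }

# illustrative counts only
m = confusion_metrics(tp=80, tn=60, fp=10, fn=20)
print({k: round(v, 3) for k, v in m.items()})
# accuracy 140/170 ~ 0.824, sensitivity 0.8, specificity ~ 0.857, precision ~ 0.889
```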
The F1-score metric is the harmonic mean of precision and recall and can be computed as follows:

$$\mathrm{F1\text{-}score} = 2\,\frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{8}$$

5.2 Results and Discussion
In these experiments we first applied the classical machine learning models in order to obtain baseline results (MNB, SVM-RBF, L-SVC). Then, in order to improve on the baselines, we changed the modeling approach and applied deep learning, using GloVe embeddings with an RNN-LSTM network to achieve the best results. As Table 1 and Fig. 1 show, the deep learning approach works better here than the other three approaches and is therefore the recommended choice.

Table 1. Metrics values achieved for the selected methods

Model      Accuracy  Precision  Recall  F1-score
MNB        0.81      0.79       0.82    0.82
SVM-RBF    0.64      0.97       0.32    0.48
L-SVC      0.83      0.82       0.86    0.84
RNN-LSTM   0.91      0.93       0.89    0.91
The computed metrics show that any of the four learning models used for classification in our text mining approach could be applied successfully in other situations: the smallest accuracy, 0.64, was obtained with SVM-RBF, which may still be acceptable in some circumstances. The other three learning models reached higher accuracy values, so we can say that we found a way to optimize our results. Regarding computation time, an RNN is much more expensive than the other three methods, which can be seen as a limitation of our experiments; however, given the size of our data set, we accepted this time cost in order to achieve good and very good accuracy results.
Fig. 1. Computed metrics for the used models
6 Conclusions and Future Directions

This paper presents an approach based on machine learning models used in text mining; in other words, it uses intelligent techniques to analyze a text (a review in particular) and, based on a corpus, to predict whether that review expresses a positive or a negative opinion. The data set used in our experiments was collected from students as expressions of their opinion about the teaching manner. It contains simple phrases stating their opinions, and our goal was to obtain an automated way to predict whether an opinion is positive or negative. The methods used (Multinomial Naive Bayes, Support Vector Machine, Linear Support Vector Classification and LSTM Recurrent Neural Network) and the provided framework can also be extended and used in other fields where text analysis is necessary (genetics, medicine, biology). This is possible because our approach simply finds relevant entities in a corpus of text and classifies them into one of the entity types, the relevance being established from the topic or subject. We investigated the behavior of four learning models: we applied them to a training data set and then used the results achieved on the testing data set to select the best of the learning models. After establishing the model with the best accuracy, we can apply it to new, not yet classified instances of data. As new development directions, we propose to apply other classification methods that increase the accuracy metric, to be more robust with respect to the data type (regardless of the preprocessing methods applied) and, obviously, to increase the sample size of people questioned for a better generalization of the results.
References

1. Dhania, R., Yogesh, A.: Sentiment analysis using machine learning. Int. J. Eng. Sci. 11946 (2017)
2. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retrieval 2(1–2), 1–135 (2008)
3. Data mining and AI: Bayesian and Neural Networks, Santander Meteorology Group. http://www.meteo.unican.es/research/datamining
4. Loren, S.: Analyzing twitter. Blogs, MathWorks. http://blogs.mathworks.com/loren/2014/06/04/analyzing-twitter-with-matlab/. Accessed 4 June 2014
5. Padmaja, S., Sameen Fatima, S.: Evaluating sentiment analysis: identifying scope of negation in newspaper articles. UCE Osmania University, IJARAI (2016). Accessed 17 Dec 2016
6. Dwivedi, A.K.: Performance evaluation of different machine learning techniques for prediction of heart disease. Neural Comput. Appl. 29(10), 685–693 (2016)
7. Ding, S., Zhao, H., Zhang, Y., Xu, X., Nie, R.: Extreme learning machine: algorithm, theory and applications. Artif. Intell. Rev. 44(1), 103–115 (2013)
8. Nikam, S.S.: A comparative study of classification techniques in data mining algorithms. Orient. J. Comput. Sci. Technol. 8(1), 13–19 (2015)
9. Leuhu, T.: Sentiment analysis using machine learning (2015)
10. Vinodhini, G., Chandrasekaran, R.M.: Sentiment analysis and opinion mining: a survey. Int. J. 2(6), 282–292 (2012)
11. Fan, Y., et al.: TTS synthesis with bidirectional LSTM based recurrent neural networks. In: Fifteenth Annual Conference of the International Speech Communication Association (2014)
12. Su, B., Lu, S.: Accurate scene text recognition based on recurrent neural network. In: Asian Conference on Computer Vision. Springer, Cham (2014)
13. Roy, S.S., Sinha, A., Roy, R., Barna, C., Samui, P.: Spam email detection using deep support vector machine, support vector machine and artificial neural network. In: Balas, V., Jain, L., Balas, M. (eds.) Soft Computing Applications, SOFA 2016. Advances in Intelligent Systems and Computing, vol. 634. Springer, Cham (2018)
14. Isoc, D.: Expert system for predictive diagnosis (1) principles and knowledge base. In: Balas, V., Jain, L., Balas, M. (eds.) Soft Computing Applications, SOFA 2016. Advances in Intelligent Systems and Computing, vol. 633. Springer, Cham (2018)
15. Malik, M., Habib, S., Agarwal, P.: A novel approach to web-based review analysis using opinion mining. Procedia Comput. Sci. 132, 1202–1209 (2018)
16. Lilleberg, J., Zhu, Y., Zhang, Y.: Support vector machines and word2vec for text classification with semantic features. In: 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), pp. 136–140. IEEE (2015)
17. Veblen, T.: The Place of Science in Modern Civilization. Routledge, Abingdon (2017)
18. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Pearson Education Limited, Malaysia (2016)
19. Topirceanu, A., Udrescu, M., Avram, R., Mihaicuta, S.: Data analysis for patients with sleep apnea syndrome: a complex network approach. In: Balas, V.C., Jain, L., Kovačević, B. (eds.) Soft Computing Applications, SOFA 2014. Advances in Intelligent Systems and Computing, vol. 356. Springer, Cham (2016)
Flexible Fuzzy Numbers for Likert Scale-Based Evaluations

József Dombi¹ and Tamás Jónás²,³

¹ Department of Computer Algorithms and Artificial Intelligence, University of Szeged, Árpád tér 2, 6720 Szeged, Hungary
[email protected]
² Institute of Business Economics, Eötvös Loránd University, Egyetem tér 1-3, 1053 Budapest, Hungary
[email protected]
³ Research and Development Department, Flextronics International Ltd., Hangár u. 5-37, 1183 Budapest, Hungary
Abstract. In this paper, a novel class of fuzzy numbers called the flexible fuzzy numbers is introduced and its application in Likert scale-based evaluations is presented. We point out that depending on its shape parameter, the membership function of a flexible fuzzy number can take various forms. Next, we show that if the shape parameter is fixed, then the set of flexible fuzzy numbers is closed under the multiplication by scalar, fuzzy addition and weighted average operations. Then, as a generalization of the flexible fuzzy numbers, we introduce the extended flexible fuzzy numbers, which can have different left hand side and right hand side shape parameters. Here, we introduce an important asymptotic property of the extended flexible fuzzy numbers that allows us to perform approximate fuzzy arithmetic operations over them. The pliancy of these fuzzy numbers makes them well suited to multi-dimensional Likert scale-based fuzzy evaluations in many areas of management.

Keywords: Flexible fuzzy numbers · Fuzzy arithmetic operations · Extended flexible fuzzy numbers · Fuzzy Likert scale · Evaluations

1 Introduction
© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 81–101, 2021. https://doi.org/10.1007/978-3-030-51992-6_8

Likert scales are widely applied in many areas of business and management for evaluating business-related characteristics. These are discrete scales, on which the evaluator can express his or her opinion by selecting the most appropriate 'value' from a finite set of pre-defined categories. Applications of Likert scale-based evaluations result in ordinal data that are commonly coded by numeric values. Although the Likert scales are easy to use and can provide the users with ordinal numeric data, their application has certain limitations. In most of the cases, the evaluator makes complex decisions in a state of uncertainty; however, the evaluation result is a single crisp numeric value. Some researchers
have pointed out that individuals cannot appropriately express their opinion of a given situation by a particular crisp value, and have proposed linguistic approaches to represent a specific numeric value; see, for example, the articles [1–6]. Next, if the number of 'values' that the user can choose from on a Likert scale is not sufficiently large, then the real variability associated with the rating may be lost [7]. It is also worth mentioning that on a Likert scale, the differences between the codes cannot always appropriately represent the differences in their magnitude. Thus, purely statistical techniques dealing with ordinal data can be utilized to analyze the rating results, but this may lead to the loss of some relevant information [8]. Moreover, in situations where the evaluation is carried out for a given period of time, the rated item might be time dependent, and so a single crisp value cannot express the perceived variation (see, e.g. [9]). In addition, when heterogeneous preferences of the evaluators are aggregated into a single crisp score, the aggregate result tends to hide the variability associated with the rated characteristic [10]. Fuzzy set theory and fuzzy logic have been increasingly applied to situations where human perceptions, subjectivity and imprecision need to be taken into account during evaluations and managerial decisions [11–14]. The above-mentioned shortcomings of the traditional Likert scale-based evaluations can also be mitigated by utilizing the constructions and techniques of fuzzy set theory and fuzzy logic [8,15,16]. Hesketh et al. [17] proposed the fuzzy rating scale as a construction that is suitable for psychometric evaluations. Gil and Gonzalez Rodrigues [7] highlighted the advantages of fuzzy scales versus the traditional Likert scales. The fuzzy rating scales have the ability to model the imprecision and uncertainty of human evaluations through the application of fuzzy numbers (see, e.g. [6,7,18–21]).
This approach results in fuzzy-valued responses that are able to reflect human perceptions more precisely than a Likert scale. Although fuzzy numbers are able to model human judgments more precisely, their application may require some knowledge of fuzzy theory and a lot of computation (see, e.g. [22,23]). In this paper, we introduce the so-called flexible fuzzy number, the membership function of which can take various shapes depending on its shape parameter λ. Namely, it can be bell-shaped (λ > 1), triangular (λ = 1), or 'reverse' bell-shaped (0 < λ < 1). This feature of the flexible fuzzy number allows us to express the soft equation 'x is approximately equal to x₀' in various ways. We should also point out that the four-parameter membership function μ(x; l, x₀, r, λ) of a flexible fuzzy number may be viewed as a generalization of the triangular membership function. The parameters l and r determine the left hand side and right hand side limits of the flexible fuzzy number, respectively; that is, if x ≤ l or x ≥ r, then the membership value of x in the fuzzy set {x =_{(l,r,λ)} x₀} is zero. In other words, the truth of the statement that 'x is approximately equal to x₀' is zero if x ≤ l or x ≥ r. Next, we demonstrate that if the shape parameter λ is fixed, then the set of flexible fuzzy numbers is closed under the multiplication by scalar, fuzzy addition and weighted average operations. The pliancy of flexible fuzzy numbers and the above-mentioned properties of the operations over them
make the flexible fuzzy numbers suitable for Likert scale-based fuzzy evaluations. On the one hand, we can utilize them to deal with the vagueness that may originate from the uncertainty of the evaluators or from the variability of the perceived performance values. On the other hand, in multi-dimensional Likert scale-based evaluations, by having a rating result in each evaluation dimension given by a flexible fuzzy number, these results can be easily aggregated into one flexible fuzzy number by applying the above-mentioned arithmetic operations. Furthermore, as a generalization of the flexible fuzzy numbers we introduce the extended flexible fuzzy numbers which can have different left hand side and right hand side shape parameters. We demonstrate that the asymptotic flexible fuzzy number is just a quasi fuzzy number composed of an increasing left hand side and a decreasing right hand side sigmoid function. Exploiting this property of the extended flexible fuzzy numbers allows us to perform approximate fuzzy arithmetic operations over them in the same way as in Dombi’s pliant arithmetics [24]. It should be added that applications both of the flexible fuzzy numbers and the extended flexible fuzzy numbers require simple mathematical calculations and so the proposed fuzzy evaluation methods can be easily implemented and utilized in practice. The paper is structured as follows. In Sect. 2 below we introduce the flexible fuzzy numbers, the extended flexible fuzzy numbers and some arithmetic operations over them, which are important from the Likert scale-based evaluation point of view. In Sect. 3, we will demonstrate the applicability of flexible fuzzy numbers by means of a human performance evaluation example. Lastly, we will summarize the main conclusions and their managerial implications.
2
Flexible Fuzzy Numbers
Here, we will introduce the kappa function which we will later utilize to define the membership function of the flexible fuzzy number. 2.1
The Kappa Function
The triangular membership function μ_t(x; a, b, c) of a fuzzy set is commonly represented by

$$
\mu_t(x; a, b, c) =
\begin{cases}
0, & \text{if } x \le a\\
\frac{x-a}{b-a}, & \text{if } a < x \le b\\
\frac{c-x}{c-b}, & \text{if } b < x < c\\
0, & \text{if } c \le x,
\end{cases}
\tag{1}
$$

where a < b < c, x ∈ ℝ. The increasing left hand side linear component of μ_t(x; a, b, c) can be written as

$$
f_l(x; a, b) = \frac{x-a}{b-a} = \frac{1}{\frac{x-a+b-x}{x-a}} = \frac{1}{1+\frac{b-x}{x-a}},
\tag{2}
$$
where x ∈ (a, b). Now, we raise the term \(\frac{b-x}{x-a}\) to the power of λ and interpret the following function κ(x; a, b, λ) as a generalization of the linear function f_l(x; a, b).
Definition 1. The kappa function κ(x; a, b, λ) is given by

$$
\kappa(x; a, b, \lambda) = \frac{1}{1+\left(\frac{b-x}{x-a}\right)^{\lambda}},
\tag{3}
$$
where a < b, λ ∈ ℝ, x ∈ (a, b).

Main Properties of the Kappa Function. Here, we state the most important properties of the kappa function κ(x; a, b, λ), namely its domain, differentiability, monotonicity, limits, convexity and the role of the function parameters.

Domain. The domain of the function κ(x; a, b, λ) is the interval (a, b).

Differentiability. κ(x; a, b, λ) is differentiable in the interval (a, b), and its derivative function is

$$
\frac{\mathrm{d}\kappa(x; a, b, \lambda)}{\mathrm{d}x} = \lambda(b-a)\,\frac{\kappa(x; a, b, \lambda)\left(1-\kappa(x; a, b, \lambda)\right)}{(x-a)(b-x)}.
\tag{4}
$$
Monotonicity. Since κ(x; a, b, λ) ∈ (0, 1) for any x ∈ (a, b), and using the first derivative of κ(x; a, b, λ), we can state that if λ > 0, then κ(x; a, b, λ) is strictly monotonously increasing, and if λ < 0, then κ(x; a, b, λ) is strictly monotonously decreasing. Notice that if λ = 0, then κ(x; a, b, λ) has the constant value of 0.5 for any x ∈ (a, b).

Limits.

$$
\lim_{x \to a^{+}} \kappa(x; a, b, \lambda) =
\begin{cases}
0, & \text{if } \lambda > 0\\
1, & \text{if } \lambda < 0,
\end{cases}
\qquad
\lim_{x \to b^{-}} \kappa(x; a, b, \lambda) =
\begin{cases}
1, & \text{if } \lambda > 0\\
0, & \text{if } \lambda < 0.
\end{cases}
\tag{5}
$$
Convexity. It can be shown that the function κ(x; a, b, λ) changes its shape at x = (a + b)/2:
– from convex to concave, if λ < −1 or λ > 1
– from concave to convex, if −1 < λ < 0 or 0 < λ < 1.

Notice that if λ = 1, then

$$
\kappa(x; a, b, \lambda) = f_l(x; a, b) = \frac{x-a}{b-a}.
\tag{6}
$$

That is, κ(x; a, b, λ) is an increasing linear function. Furthermore, if λ = −1, then κ(x; a, b, λ) is the decreasing linear function

$$
\kappa(x; a, b, \lambda) = f_r(x; a, b) = \frac{b-x}{b-a}.
\tag{7}
$$
Role of Parameters. Parameters a and b determine the domain (a, b) of the function κ(x; a, b, λ). The function κ(x; a, b, λ) takes the value of 0.5 at x = (a + b)/2. Note that the linear functions fl (x; a, b) and fr (x; a, b) also take the value of 0.5 at x = (a + b)/2. The λ parameter determines the monotonicity and the shape of κ(x; a, b, λ). Moreover, the slope of function κ(x; a, b, λ) at x = (a + b)/2 is λ/(b − a). Figure 1 shows some typical kappa function plots.
Fig. 1. Plots of some kappa functions
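The stated properties of the kappa function are easy to verify numerically. The following is an illustrative sketch (the interval and λ values are arbitrary):

```python
import numpy as np

def kappa(x, a, b, lam):
    """Kappa function of Eq. (3) on the open interval (a, b)."""
    return 1.0 / (1.0 + ((b - x) / (x - a)) ** lam)

a, b = 2.0, 10.0
m = (a + b) / 2
for lam in (0.5, 1.0, 2.0):
    # the function takes the value 0.5 at the midpoint (a + b)/2
    assert abs(kappa(m, a, b, lam) - 0.5) < 1e-12
    # its slope at the midpoint is lam / (b - a), cf. Eq. (4)
    h = 1e-6
    slope = (kappa(m + h, a, b, lam) - kappa(m - h, a, b, lam)) / (2 * h)
    assert abs(slope - lam / (b - a)) < 1e-4
# lam = 1 reduces to the increasing linear function of Eq. (6)
x = np.linspace(a + 0.01, b - 0.01, 50)
assert np.allclose(kappa(x, a, b, 1.0), (x - a) / (b - a))
print("kappa properties verified")
```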
Connection with Continuous-Valued Logic. In fuzzy logic, linguistic modifiers like 'very', 'more or less', 'rather' and 'quite' over fuzzy sets that have strictly monotonously increasing or decreasing membership functions can be modeled by using the following unary operator, called the kappa modifier operator [25,26].

Definition 2. The kappa modifier operator (Dombi's kappa function) is given by

$$
\kappa_{\nu,\nu_0}^{(\lambda)}(x) = \frac{1}{1+\frac{1-\nu_0}{\nu_0}\left(\frac{\nu}{1-\nu}\,\frac{1-x}{x}\right)^{\lambda}},
\tag{8}
$$

where ν, ν₀ ∈ (0, 1), λ ∈ ℝ, and x is a continuous-valued logic variable. Notice that if ν = ν₀ = 0.5, then applying the linear transformation x = (x′ − a)/(b − a) to this function gives

$$
\kappa_{\nu,\nu_0}^{(\lambda)}(x') = \kappa(x'; a, b, \lambda) = \frac{1}{1+\left(\frac{b-x'}{x'-a}\right)^{\lambda}}.
\tag{9}
$$
Connection with Probability Theory. It is worth mentioning here that the kappa function κ(x; a, b, λ) with the parameter settings a = −d, b = d, where d > 0, can be employed for approximating the logistic probability distribution function and the standard normal probability distribution function [27]. 2.2
Representing Flexible Fuzzy Numbers
Utilizing the kappa function κ(x; a, b, λ), we define the membership function of the so-called flexible fuzzy number as follows. Definition 3. The membership function μ(x; l, x0 , r, λ) of the flexible fuzzy number {x =(l,r,λ) x0 } is given by
$$
\mu(x; l, x_0, r, \lambda) =
\begin{cases}
0, & \text{if } x \le l\\
\frac{1}{1+\left(\frac{x_0-x}{x-l}\right)^{\lambda}}, & \text{if } l < x \le x_0\\
\frac{1}{1+\left(\frac{r-x}{x-x_0}\right)^{-\lambda}}, & \text{if } x_0 < x < r\\
0, & \text{if } r \le x,
\end{cases}
\tag{10}
$$
where l < x₀ < r, λ > 0, x ∈ ℝ.

The definition of the membership function μ(x; l, x₀, r, λ) contains an increasing left hand side (LHS) kappa function κ(x; a, b, λ) with the parameters a = l, b = x₀, and a decreasing right hand side (RHS) kappa function κ(x; a, b, −λ) with the parameters a = x₀, b = r. Notice that since λ > 0, −λ has a negative value. In Fig. 2, there are some typical membership function plots of flexible fuzzy numbers. We can see that, depending on the value of parameter λ, the membership function plot of a flexible fuzzy number can take various shapes; namely, it can be bell-shaped (λ > 1), triangular (λ = 1), or 'reverse' bell-shaped (0 < λ < 1). Hence, we call the parameter λ the shape parameter of the flexible fuzzy number. This flexibility of the membership function μ(x; l, x₀, r, λ) allows us to use the flexible fuzzy number {x =_{(l,r,λ)} x₀} to express the soft equation 'x is approximately equal to x₀' in various ways. The membership function μ(x; l, x₀, r, λ) may also be viewed as a generalization of the triangular membership function. The parameters l and r determine the LHS limit and RHS limit of the flexible fuzzy number, respectively; that is, if x ≤ l or x ≥ r, then the membership value of x in the fuzzy set {x =_{(l,r,λ)} x₀} is zero. In other words, the truth of the statement that 'x is approximately equal to x₀' is zero if x ≤ l or x ≥ r. The next lemma demonstrates that the membership function μ(x; l, x₀, r, λ) in fact defines a fuzzy number.

Lemma 1. The function μ(x; l, x₀, r, λ) is the membership function of a fuzzy number.

Proof. The function μ(x; l, x₀, r, λ) is the membership function of a fuzzy number if it has the following properties [28]:
Fig. 2. Membership function plots of some flexible fuzzy numbers
(1) μ(x; l, x₀, r, λ) is normal; that is, ∃a ∈ ℝ for which μ(a; l, x₀, r, λ) = 1
(2) μ(x; l, x₀, r, λ) is fuzzy convex; that is, for any x₁, x₂ ∈ ℝ and for any t ∈ [0, 1], μ(tx₁ + (1 − t)x₂; l, x₀, r, λ) ≥ min(μ(x₁; l, x₀, r, λ), μ(x₂; l, x₀, r, λ))
(3) μ(x; l, x₀, r, λ) is upper semicontinuous
(4) μ(x; l, x₀, r, λ) is compactly supported; that is, the closure of the set {x ∈ ℝ : μ(x; l, x₀, r, λ) > 0} is compact.

Since μ(x₀; l, x₀, r, λ) = 1 holds by definition, μ(x; l, x₀, r, λ) satisfies property (1). It follows from the properties of the function κ(x; a, b, λ) and from the definition of μ(x; l, x₀, r, λ) that μ(x; l, x₀, r, λ) satisfies properties (2) and (3). Based on the definition of μ(x; l, x₀, r, λ), {x ∈ ℝ : μ(x; l, x₀, r, λ) > 0} = (l, r), the closure of which is compact, and so μ(x; l, x₀, r, λ) satisfies property (4).

We will apply the following lemma for interpreting some arithmetic operations over the set of flexible fuzzy numbers.

Lemma 2. The α-cut of the flexible fuzzy number {x =_{(l,r,λ)} x₀} is the interval [x_{α,l,x₀}, x_{α,r,x₀}], where

$$
x_{\alpha,l,x_0} = \frac{x_0 + l\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},
\qquad
x_{\alpha,r,x_0} = \frac{x_0 + r\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},
\tag{11}
$$

l < x₀ < r, λ > 0, α ∈ (0, 1).

Proof. Based on the definition of a fuzzy α-cut (see e.g. [29]), the α-cut of the fuzzy set {x =_{(l,r,λ)} x₀} given by the membership function μ(x; l, x₀, r, λ) is the set {x ∈ ℝ : μ(x; l, x₀, r, λ) ≥ α}. Exploiting the properties of the kappa
function, it can be shown that the membership function μ(x; l, x₀, r, λ) takes all the real values of the interval [0, 1], μ(x₀; l, x₀, r, λ) = 1, μ(x; l, x₀, r, λ) is strictly increasing if x ≤ x₀, and it is strictly decreasing if x > x₀. Furthermore, via simple calculations, we can see that μ(x_{α,l,x₀}; l, x₀, r, λ) = μ(x_{α,r,x₀}; l, x₀, r, λ) = α. That is, the inequality μ(x; l, x₀, r, λ) ≥ α holds if and only if x ∈ [x_{α,l,x₀}, x_{α,r,x₀}]. This means that the α-cut of the fuzzy number {x =_{(l,r,λ)} x₀} is the interval [x_{α,l,x₀}, x_{α,r,x₀}].

2.3
Some Arithmetic Operations
Next, we will describe some arithmetic operations which we define over the set of flexible fuzzy numbers. Since, from the Likert scale-based evaluation viewpoint, the multiplication by a scalar, the fuzzy addition and the weighted average calculation are the most important operations, we will concentrate on interpreting these operations.

Multiplication by Scalar

Lemma 3. For any α ∈ (0, 1) and c ∈ ℝ, if [x_{α,l,x₀}, x_{α,r,x₀}] is the α-cut of the flexible fuzzy number {x =_{(l,r,λ)} x₀}, then [c x_{α,l,x₀}, c x_{α,r,x₀}] is the α-cut of the flexible fuzzy number {x =_{(l′,r′,λ)} x₀′}, where l′ = cl, x₀′ = cx₀ and r′ = cr.

Proof. Based on Lemma 2, if [x_{α,l,x₀}, x_{α,r,x₀}] is the α-cut of the flexible fuzzy number {x =_{(l,r,λ)} x₀}, then

$$
[x_{\alpha,l,x_0}, x_{\alpha,r,x_0}] =
\left[\frac{x_0 + l\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},\;
\frac{x_0 + r\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}\right],
\tag{12}
$$

from which

$$
[c\,x_{\alpha,l,x_0}, c\,x_{\alpha,r,x_0}] =
\left[\frac{cx_0 + cl\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},\;
\frac{cx_0 + cr\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}\right]
= \left[\frac{x_0' + l'\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},\;
\frac{x_0' + r'\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}\right],
\tag{13}
$$

where l′ = cl, x₀′ = cx₀ and r′ = cr. Applying Lemma 2 again, we can see that the last interval is the α-cut of the flexible fuzzy number {x =_{(l′,r′,λ)} x₀′}.

Lemma 3 can be interpreted such that, for any α ∈ (0, 1), the multiplication of the α-cut of a flexible fuzzy number by a scalar is the α-cut of a flexible fuzzy number as well; that is, the set of flexible fuzzy numbers is closed under multiplication by a scalar. Exploiting this result, the following proposition holds.
Proposition 1. The multiplication of the flexible fuzzy number {x =_{(l,r,λ)} x₀} by the scalar c can be computed as

$$
c \odot \{x =_{(l,r,\lambda)} x_0\} = \{x =_{(l',r',\lambda)} x_0'\},
\tag{14}
$$

where l′ = cl, x₀′ = cx₀, r′ = cr and ⊙ denotes the multiplication by a scalar operator.

Proof. This proposition follows from Lemma 3.

Addition

Lemma 4. For any α ∈ (0, 1), if [x_{α,l₁,x_{0,1}}, x_{α,r₁,x_{0,1}}], [x_{α,l₂,x_{0,2}}, x_{α,r₂,x_{0,2}}], …, [x_{α,l_n,x_{0,n}}, x_{α,r_n,x_{0,n}}] are the α-cuts of the flexible fuzzy numbers {x =_{(l₁,r₁,λ)} x_{0,1}}, {x =_{(l₂,r₂,λ)} x_{0,2}}, …, {x =_{(l_n,r_n,λ)} x_{0,n}}, respectively, then

$$
\left[\sum_{i=1}^{n} x_{\alpha,l_i,x_{0,i}},\; \sum_{i=1}^{n} x_{\alpha,r_i,x_{0,i}}\right]
$$

is the α-cut of the flexible fuzzy number {x =_{(l,r,λ)} x₀}, where \(l = \sum_{i=1}^{n} l_i\), \(x_0 = \sum_{i=1}^{n} x_{0,i}\) and \(r = \sum_{i=1}^{n} r_i\).

Proof. Based on Lemma 2, if [x_{α,l_i,x_{0,i}}, x_{α,r_i,x_{0,i}}] is the α-cut of the flexible fuzzy number {x =_{(l_i,r_i,λ)} x_{0,i}}, then
$$
[x_{\alpha,l_i,x_{0,i}}, x_{\alpha,r_i,x_{0,i}}] =
\left[\frac{x_{0,i} + l_i\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},\;
\frac{x_{0,i} + r_i\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}\right],
\tag{15}
$$

i = 1, 2, …, n. Summing up the equations in (15) for i = 1, 2, …, n by making use of interval arithmetic operations, we get

$$
\left[\sum_{i=1}^{n} x_{\alpha,l_i,x_{0,i}},\; \sum_{i=1}^{n} x_{\alpha,r_i,x_{0,i}}\right] =
\left[\frac{\sum_{i=1}^{n} x_{0,i} + \sum_{i=1}^{n} l_i \left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},\;
\frac{\sum_{i=1}^{n} x_{0,i} + \sum_{i=1}^{n} r_i \left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}\right]
= \left[\frac{x_0 + l\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}},\;
\frac{x_0 + r\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}{1+\left(\frac{1-\alpha}{\alpha}\right)^{\frac{1}{\lambda}}}\right],
\tag{16}
$$

where \(l = \sum_{i=1}^{n} l_i\), \(x_0 = \sum_{i=1}^{n} x_{0,i}\) and \(r = \sum_{i=1}^{n} r_i\). Next, exploiting Lemma 2, the last interval is the α-cut of the flexible fuzzy number {x =_{(l,r,λ)} x₀}.
Lemma 4 tells us that, for any α ∈ (0, 1), the sum of α-cuts of flexible fuzzy numbers is the α-cut of a flexible fuzzy number; that is, the set of flexible fuzzy numbers is closed under fuzzy addition. Noting this result, the following proposition can be stated on the addition of flexible fuzzy numbers.

Proposition 2. The sum of the flexible fuzzy numbers {x =_{(l₁,r₁,λ)} x_{0,1}}, {x =_{(l₂,r₂,λ)} x_{0,2}}, …, {x =_{(l_n,r_n,λ)} x_{0,n}} can be calculated as

$$
\{x =_{(l_1,r_1,\lambda)} x_{0,1}\} \oplus \{x =_{(l_2,r_2,\lambda)} x_{0,2}\} \oplus \cdots \oplus \{x =_{(l_n,r_n,\lambda)} x_{0,n}\} = \{x =_{(l,r,\lambda)} x_0\},
\tag{17}
$$

where \(l = \sum_{i=1}^{n} l_i\), \(x_0 = \sum_{i=1}^{n} x_{0,i}\), \(r = \sum_{i=1}^{n} r_i\) and ⊕ denotes the fuzzy addition operator.

Proof. This proposition follows from Lemma 4.
Weighted Average

Proposition 3. The weighted average

$$
w_1 \odot \{x =_{(l_1,r_1,\lambda)} x_{0,1}\} \oplus w_2 \odot \{x =_{(l_2,r_2,\lambda)} x_{0,2}\} \oplus \cdots \oplus w_n \odot \{x =_{(l_n,r_n,\lambda)} x_{0,n}\}
\tag{18}
$$

of the flexible fuzzy numbers {x =_{(l₁,r₁,λ)} x_{0,1}}, {x =_{(l₂,r₂,λ)} x_{0,2}}, …, {x =_{(l_n,r_n,λ)} x_{0,n}} with the weights w₁, w₂, …, w_n ≥ 0, respectively (\(\sum_{i=1}^{n} w_i = 1\)), can be computed as

$$
w_1 \odot \{x =_{(l_1,r_1,\lambda)} x_{0,1}\} \oplus w_2 \odot \{x =_{(l_2,r_2,\lambda)} x_{0,2}\} \oplus \cdots \oplus w_n \odot \{x =_{(l_n,r_n,\lambda)} x_{0,n}\} = \{x =_{(l,r,\lambda)} x_0\},
\tag{19}
$$

where \(l = \sum_{i=1}^{n} w_i l_i\), \(x_0 = \sum_{i=1}^{n} w_i x_{0,i}\) and \(r = \sum_{i=1}^{n} w_i r_i\).

Proof. This proposition directly follows from Proposition 1 and Proposition 2.

Figure 3 shows plots of the membership functions of the flexible fuzzy numbers {x =_{(l₁,r₁,λ)} x_{0,1}} and {x =_{(l₂,r₂,λ)} x_{0,2}}, and a plot of the membership function of their weighted average w₁ ⊙ {x =_{(l₁,r₁,λ)} x_{0,1}} ⊕ w₂ ⊙ {x =_{(l₂,r₂,λ)} x_{0,2}}. Below, we will introduce some alternative notations for flexible fuzzy numbers and for their membership functions. These alternative notations may be useful in practical applications.
Fig. 3. Weighted average of two flexible fuzzy numbers
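Proposition 3 reduces the weighted average of flexible fuzzy numbers (with a shared λ) to a weighted average of their (l, x₀, r) parameter triples, which is trivial to implement. An illustrative sketch with arbitrary values:

```python
def weighted_average(numbers, weights):
    """Weighted average of flexible fuzzy numbers given as (l, x0, r) triples
    with a shared shape parameter; per Proposition 3, the parameters combine
    linearly."""
    l  = sum(w * n[0] for n, w in zip(numbers, weights))
    x0 = sum(w * n[1] for n, w in zip(numbers, weights))
    r  = sum(w * n[2] for n, w in zip(numbers, weights))
    return (l, x0, r)

A = (2.0, 4.0, 6.0)    # {x =_(2,6,lam) 4}
B = (5.0, 7.0, 10.0)   # {x =_(5,10,lam) 7}
l, x0, r = weighted_average([A, B], [0.25, 0.75])
print(l, x0, r)        # 4.25 6.25 9.0
```

This is what makes the construction attractive for multi-dimensional Likert scale-based evaluations: aggregating per-dimension ratings costs only a handful of multiplications and additions.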
2.4
Alternative Notations
As we pointed out before, the flexible fuzzy number {x =_{(l,r,λ)} x₀} is a representation of the soft equation 'x is equal to x₀'. Thus, the membership function μ(x; l, x₀, r, λ) of the flexible fuzzy number {x =_{(l,r,λ)} x₀} may be viewed as a function that expresses the truth of the statement that 'x is equal to x₀'. Hence, we will use the notation T(x = x₀; l, r, λ) for the truth of the statement that 'x is equal to x₀'; that is,

$$
T(x = x_0; l, r, \lambda) = \mu(x; l, x_0, r, \lambda).
\tag{20}
$$

The parameters l and r determine the LHS limit and RHS limit of the flexible fuzzy number {x =_{(l,r,λ)} x₀}, respectively; namely, the triple (l, x₀, r) determines {x =_{(l,r,λ)} x₀}, if the parameter λ has a given value. Notice that the properties of the above-mentioned arithmetic operations hold only if the parameter λ has a fixed value. Here, we will use the alternative notation

$$
N^{(\lambda)}(l, x_0, r)
\tag{21}
$$

for the flexible fuzzy number {x =_{(l,r,λ)} x₀}. With this notation, the above operations can be written in the following forms.

1. Multiplication by scalar:
$$
c \odot N^{(\lambda)}(l, x_0, r) = N^{(\lambda)}(cl, cx_0, cr).
\tag{22}
$$

2. Addition:
$$
N^{(\lambda)}(l_1, x_{0,1}, r_1) \oplus N^{(\lambda)}(l_2, x_{0,2}, r_2) \oplus \cdots \oplus N^{(\lambda)}(l_n, x_{0,n}, r_n) = N^{(\lambda)}(l, x_0, r),
\tag{23}
$$
where \(l = \sum_{i=1}^{n} l_i\), \(x_0 = \sum_{i=1}^{n} x_{0,i}\) and \(r = \sum_{i=1}^{n} r_i\).

3. Weighted average:
$$
w_1 \odot N^{(\lambda)}(l_1, x_{0,1}, r_1) \oplus w_2 \odot N^{(\lambda)}(l_2, x_{0,2}, r_2) \oplus \cdots \oplus w_n \odot N^{(\lambda)}(l_n, x_{0,n}, r_n) = N^{(\lambda)}(l, x_0, r),
\tag{24}
$$
where \(l = \sum_{i=1}^{n} w_i l_i\), \(x_0 = \sum_{i=1}^{n} w_i x_{0,i}\), \(r = \sum_{i=1}^{n} w_i r_i\), w₁, w₂, …, w_n ≥ 0 and \(\sum_{i=1}^{n} w_i = 1\).

2.5
Extended Flexible Fuzzy Numbers and Approximate Operations
Previously, we pointed out that the properties of the above-mentioned arithmetic operations over the set of flexible fuzzy numbers hold only if the parameter λ has a fixed value. However, there may be practical cases where arithmetic operations over flexible fuzzy numbers with various λ parameters should be performed. For example, imagine the case where the performance of a company's supplier is evaluated in multiple dimensions using flexible fuzzy numbers. Here, it may be advantageous if, in different dimensions of the supplier performance, the perceived performance could be represented by flexible fuzzy numbers which have different λ parameter values. Furthermore, in some situations it would also be helpful if the LHS and the RHS of a flexible fuzzy number had different λ parameters. Taking these considerations into account, we shall now introduce the so-called extended flexible fuzzy numbers.

Definition 4. The membership function μ(x; l, x₀, r, λ_l, λ_r) of the extended flexible fuzzy number N^{(λ_l,λ_r)}(l, x₀, r) is given by
$$
\mu(x; l, x_0, r, \lambda_l, \lambda_r) =
\begin{cases}
0, & \text{if } x \le l\\
\frac{1}{1+\left(\frac{x_0-x}{x-l}\right)^{\lambda_l}}, & \text{if } l < x \le x_0\\
\frac{1}{1+\left(\frac{r-x}{x-x_0}\right)^{-\lambda_r}}, & \text{if } x_0 < x < r\\
0, & \text{if } r \le x,
\end{cases}
\tag{25}
$$

where l < x₀ < r, λ_l, λ_r > 0, x ∈ ℝ.

The fuzzy arithmetic operations for the flexible fuzzy numbers, which we discussed in Sect. 2.3, exploit the fact that the λ parameter is fixed. Hence, these operations cannot be directly applied to the extended flexible fuzzy numbers. Here, we will use the sigmoid function and some of its key properties to implement approximate operations over the set of extended flexible fuzzy numbers.

Definition 5. The sigmoid function σ_m^{(λ_σ)}(x) is given by

$$
\sigma_m^{(\lambda_\sigma)}(x) = \frac{1}{1+e^{-\lambda_\sigma (x-m)}},
\tag{26}
$$

where x ∈ ℝ, λ_σ, m ∈ ℝ and λ_σ is nonzero.
The sigmoid function can neither take the value of 0, nor the value of 1, as these are its limits. However, if we choose λσ and m such that 1−ε 2 a+b ln , (27) λσ = , m= b−a ε 2 (λ )
where a < b, 0 < ε < 0.5 and ε ≈ 0, e.g. ε = 0.001, then σm σ (a) = ε and (λ ) σm σ (b) = 1 − ε. The next proposition demonstrates an important asymptotic connection between the kappa function κ(x; a, b, λ) and the sigmoid function (λ ) σm σ (x). Proposition 4. For any x ∈ (a, b), if m = (a + b)/2 and λσ = 4λ/(b − a), then (λσ ) lim σm (x) − κ(x; a, b, λ) = 0. (28) λ→∞
Proof. Utilizing the definitions and the first derivatives of the functions \sigma_m^{(\lambda_\sigma)}(x) and \kappa(x; a, b, \lambda), it can be shown that if m = (a + b)/2 and \lambda_\sigma = 4\lambda/(b - a), then \sigma_m^{(\lambda_\sigma)}(x) and \kappa(x; a, b, \lambda) are identical to first order at x = m. Furthermore, since \lambda_\sigma = 4\lambda/(b - a), if \lambda \to \infty, then \lambda_\sigma \to \infty as well. Let x have a fixed value in the interval (a, b), and let A_x and B_x be defined as follows:

A_x = e^{-(x - m)}, \qquad B_x = \left( \frac{b - x}{x - a} \right)^{\frac{b - a}{4}}. \qquad (29)

Here, we will examine three cases; namely (1) x < m, (2) x = m, (3) x > m.

(1) If x < m, then A_x > 1 and B_x > 1. As \lambda_\sigma \to \infty, we may exploit the fact that \lambda_\sigma > 0. Here, we will distinguish three sub-cases.

(a) If A_x > B_x > 1, then A_x^{\lambda_\sigma} > B_x^{\lambda_\sigma} > 1, and so

\lim_{\lambda_\sigma \to \infty} \left( \sigma_m^{(\lambda_\sigma)}(x) - \kappa(x; a, b, \lambda) \right)
= \lim_{\lambda_\sigma \to \infty} \left( \frac{1}{1 + A_x^{\lambda_\sigma}} - \frac{1}{1 + B_x^{\lambda_\sigma}} \right)
= \lim_{\lambda_\sigma \to \infty} \frac{\left( \frac{B_x}{A_x} \right)^{\lambda_\sigma} - 1}{\left( \frac{1}{A_x^{\lambda_\sigma}} + 1 \right) \left( 1 + B_x^{\lambda_\sigma} \right)} = 0. \qquad (30)

(b) If A_x = B_x, then (28) trivially follows.

(c) If 1 < A_x < B_x, then 1 < A_x^{\lambda_\sigma} < B_x^{\lambda_\sigma}, and so

\lim_{\lambda_\sigma \to \infty} \left( \sigma_m^{(\lambda_\sigma)}(x) - \kappa(x; a, b, \lambda) \right)
= \lim_{\lambda_\sigma \to \infty} \left( \frac{1}{1 + A_x^{\lambda_\sigma}} - \frac{1}{1 + B_x^{\lambda_\sigma}} \right)
= \lim_{\lambda_\sigma \to \infty} \frac{1 - \left( \frac{A_x}{B_x} \right)^{\lambda_\sigma}}{\left( 1 + A_x^{\lambda_\sigma} \right) \left( \frac{1}{B_x^{\lambda_\sigma}} + 1 \right)} = 0. \qquad (31)
94

J. Dombi and T. Jónás

(2) If x = m, then \sigma_m^{(\lambda_\sigma)}(x) = \kappa(x; a, b, \lambda) = 0.5, and so (28) trivially follows.

(3) If x > m, then 0 < A_x < 1 and 0 < B_x < 1, and so
\lim_{\lambda_\sigma \to \infty} A_x^{\lambda_\sigma} = \lim_{\lambda_\sigma \to \infty} B_x^{\lambda_\sigma} = 0. \qquad (32)

Hence, (28) has been proven for this case as well.
Based on Proposition 4, we may conclude that if m = (a + b)/2, \lambda_\sigma = 4\lambda/(b - a) and \lambda is large, then the kappa function \kappa(x; a, b, \lambda) can be approximated well by the sigmoid function \sigma_m^{(\lambda_\sigma)}(x). Figure 4 shows how the maximum of the absolute difference between the kappa and sigmoid functions decreases as the value of the parameter \lambda increases.
Fig. 4. Maximum of absolute difference between the kappa and sigmoid functions
Notice that if \lambda > 2, m = (a + b)/2 and \lambda_\sigma = 4\lambda/(b - a), then

\max_{x \in (a, b)} \left| \kappa(x; a, b, \lambda) - \sigma_m^{(\lambda_\sigma)}(x) \right| < 0.03. \qquad (33)
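The bound (33) can be checked numerically. The sketch below assumes the kappa function has the form \kappa(x; a, b, \lambda) = 1/(1 + ((b - x)/(x - a))^\lambda), which is the form implied by the proof of Proposition 4; the function names are ours:

```python
import math

def kappa(x, a, b, lam):
    # assumed form of the kappa function, consistent with Proposition 4's proof
    return 1.0 / (1.0 + ((b - x) / (x - a)) ** lam)

def max_abs_diff(a, b, lam, n=9999):
    """Maximum |kappa - sigmoid| over a grid of interior points of (a, b),
    with m = (a+b)/2 and lam_s = 4*lam/(b-a) as in Proposition 4."""
    m, lam_s = (a + b) / 2.0, 4.0 * lam / (b - a)
    diffs = []
    for i in range(1, n):
        x = a + (b - a) * i / n
        s = 1.0 / (1.0 + math.exp(-lam_s * (x - m)))
        diffs.append(abs(kappa(x, a, b, lam) - s))
    return max(diffs)
```

Consistently with Fig. 4 and the bound (33), the maximum difference stays below 0.03 for λ > 2 and shrinks as λ grows.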
With the sigmoid function \sigma_m^{(\lambda_\sigma)}(x), we can define the membership function of the sigmoid function-based quasi fuzzy number as follows.

Definition 6. The membership function \mu_\sigma(x; m_l, x_0, m_r, \lambda_{\sigma,l}, \lambda_{\sigma,r}) of the sigmoid function-based quasi fuzzy number N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r) is given by

\mu_\sigma(x; m_l, x_0, m_r, \lambda_{\sigma,l}, \lambda_{\sigma,r}) =
\begin{cases}
\sigma_{m_l}^{(\lambda_{\sigma,l})}(x), & \text{if } x \le x_0 \\
\sigma_{m_r}^{(-\lambda_{\sigma,r})}(x), & \text{if } x_0 < x,
\end{cases} \qquad (34)

where m_l < x_0 < m_r, \lambda_{\sigma,l}, \lambda_{\sigma,r} > 0 and x \in \mathbb{R}.
Note that since the membership function \mu_\sigma(x; m_l, x_0, m_r, \lambda_{\sigma,l}, \lambda_{\sigma,r}) does not take the value of 1, N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r) is not a fuzzy number. However, if \lambda_{\sigma,l} and \lambda_{\sigma,r} are sufficiently large, then \mu_\sigma(x_0; m_l, x_0, m_r, \lambda_{\sigma,l}, \lambda_{\sigma,r}) \approx 1. Here, we will state an important connection between the flexible fuzzy numbers and the sigmoid function-based quasi fuzzy numbers.

Proposition 5. Let N^{(\lambda_l, \lambda_r)}(l, x_0, r) be an extended flexible fuzzy number and let N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r) be a sigmoid function-based quasi fuzzy number. If m_l = (l + x_0)/2, m_r = (x_0 + r)/2, \lambda_{\sigma,l} = 4\lambda_l/(x_0 - l), \lambda_{\sigma,r} = 4\lambda_r/(r - x_0) and \lambda_l, \lambda_r are both large, then N^{(\lambda_l, \lambda_r)}(l, x_0, r) \approx N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r).

Proof. This proposition follows directly from Proposition 4.
Exploiting Dombi's pliant inequality model, which uses the sigmoid function to represent soft inequalities, and the properties of the pliant arithmetic operations over pliant inequalities, the following proposition can be stated.

Proposition 6. The multiplication by a scalar, the addition and the weighted average operations over the set of sigmoid function-based quasi fuzzy numbers can be performed as follows.

(1) Multiplication by a scalar:

c \odot N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r) = N^{(\lambda'_{\sigma,l}, \lambda'_{\sigma,r})}(m'_l, x'_0, m'_r), \qquad (35)

where \lambda'_{\sigma,l} = \lambda_{\sigma,l}/c, \lambda'_{\sigma,r} = \lambda_{\sigma,r}/c, m'_l = c m_l, x'_0 = c x_0, m'_r = c m_r and \odot denotes the multiplication by a scalar operator.

(2) Addition:

N^{(\lambda_{\sigma,l,1}, \lambda_{\sigma,r,1})}(m_{l,1}, x_{0,1}, m_{r,1}) \oplus \cdots \oplus N^{(\lambda_{\sigma,l,n}, \lambda_{\sigma,r,n})}(m_{l,n}, x_{0,n}, m_{r,n}) = N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r), \qquad (36)

where

\frac{1}{\lambda_{\sigma,l}} = \sum_{i=1}^{n} \frac{1}{\lambda_{\sigma,l,i}}, \qquad \frac{1}{\lambda_{\sigma,r}} = \sum_{i=1}^{n} \frac{1}{\lambda_{\sigma,r,i}}, \qquad (37)

m_l = \sum_{i=1}^{n} m_{l,i}, x_0 = \sum_{i=1}^{n} x_{0,i}, m_r = \sum_{i=1}^{n} m_{r,i} and \oplus denotes the fuzzy addition operator.

(3) Weighted average:

w_1 \odot N^{(\lambda_{\sigma,l,1}, \lambda_{\sigma,r,1})}(m_{l,1}, x_{0,1}, m_{r,1}) \oplus w_2 \odot N^{(\lambda_{\sigma,l,2}, \lambda_{\sigma,r,2})}(m_{l,2}, x_{0,2}, m_{r,2}) \oplus \cdots \oplus w_n \odot N^{(\lambda_{\sigma,l,n}, \lambda_{\sigma,r,n})}(m_{l,n}, x_{0,n}, m_{r,n}) = N^{(\lambda_{\sigma,l}, \lambda_{\sigma,r})}(m_l, x_0, m_r), \qquad (38)

where

\frac{1}{\lambda_{\sigma,l}} = \sum_{i=1}^{n} \frac{w_i}{\lambda_{\sigma,l,i}}, \qquad \frac{1}{\lambda_{\sigma,r}} = \sum_{i=1}^{n} \frac{w_i}{\lambda_{\sigma,r,i}}, \qquad (39)

m_l = \sum_{i=1}^{n} w_i m_{l,i}, x_0 = \sum_{i=1}^{n} w_i x_{0,i}, m_r = \sum_{i=1}^{n} w_i m_{r,i}, w_1, w_2, \ldots, w_n \ge 0 and \sum_{i=1}^{n} w_i = 1.
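Since a sigmoid function-based quasi fuzzy number is fully described by the parameter tuple (m_l, x_0, m_r, \lambda_{\sigma,l}, \lambda_{\sigma,r}), the operations of Proposition 6 reduce to simple arithmetic on these parameters: the centres combine linearly, while the slope parameters combine through the reciprocal sums of (37) and (39). A minimal sketch, with a tuple representation and names of our own choosing:

```python
def scalar_mult(c, N):
    # (35): c * N; N = (ml, x0, mr, lam_sl, lam_sr), c > 0 assumed
    ml, x0, mr, lsl, lsr = N
    return (c * ml, c * x0, c * mr, lsl / c, lsr / c)

def addition(Ns):
    # (36)-(37): centres add, slopes combine via reciprocal sums
    ml = sum(n[0] for n in Ns)
    x0 = sum(n[1] for n in Ns)
    mr = sum(n[2] for n in Ns)
    lsl = 1.0 / sum(1.0 / n[3] for n in Ns)
    lsr = 1.0 / sum(1.0 / n[4] for n in Ns)
    return (ml, x0, mr, lsl, lsr)

def weighted_average(Ns, ws):
    # (38)-(39): weights assumed nonnegative and summing to 1
    ml = sum(w * n[0] for n, w in zip(Ns, ws))
    x0 = sum(w * n[1] for n, w in zip(Ns, ws))
    mr = sum(w * n[2] for n, w in zip(Ns, ws))
    lsl = 1.0 / sum(w / n[3] for n, w in zip(Ns, ws))
    lsr = 1.0 / sum(w / n[4] for n, w in zip(Ns, ws))
    return (ml, x0, mr, lsl, lsr)
```

Note that averaging identical quasi fuzzy numbers with equal weights returns the same quasi fuzzy number, as expected of an averaging operator.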
Proof. This proposition follows from the properties of the pliant arithmetic operations over pliant inequalities (see [24]).

Based on Proposition 5, we may conclude that an extended flexible fuzzy number can be approximated well by an appropriate sigmoid function-based quasi fuzzy number. Moreover, in Proposition 6, we demonstrated how the multiplication by a scalar, addition and weighted average operations can be computed over the sigmoid function-based quasi fuzzy numbers. Exploiting these findings, the next proposition about the approximations of the examined operations over the set of extended flexible fuzzy numbers can be stated.

Proposition 7. The multiplication by a scalar, the addition and the weighted average operations over the set of extended flexible fuzzy numbers can be approximated as follows.

(1) Multiplication by a scalar: if \lambda_l and \lambda_r are sufficiently large, then

c \odot N^{(\lambda_l, \lambda_r)}(l, x_0, r) \approx N^{(\lambda'_l, \lambda'_r)}(l', x'_0, r'), \qquad (40)

where \lambda'_l = \lambda_l, \lambda'_r = \lambda_r, l' = cl, x'_0 = cx_0, r' = cr and \odot denotes the multiplication by a scalar operator.

(2) Addition: if \lambda_{l,1}, \lambda_{l,2}, \ldots, \lambda_{l,n} and \lambda_{r,1}, \lambda_{r,2}, \ldots, \lambda_{r,n} are sufficiently large, then

N^{(\lambda_{l,1}, \lambda_{r,1})}(l_1, x_{0,1}, r_1) \oplus N^{(\lambda_{l,2}, \lambda_{r,2})}(l_2, x_{0,2}, r_2) \oplus \cdots \oplus N^{(\lambda_{l,n}, \lambda_{r,n})}(l_n, x_{0,n}, r_n) \approx N^{(\lambda_l, \lambda_r)}(l, x_0, r), \qquad (41)

where

\lambda_l = \frac{x_0 - l}{\sum_{i=1}^{n} \frac{x_{0,i} - l_i}{\lambda_{l,i}}}, \qquad \lambda_r = \frac{r - x_0}{\sum_{i=1}^{n} \frac{r_i - x_{0,i}}{\lambda_{r,i}}}, \qquad (42)

l = \sum_{i=1}^{n} l_i, x_0 = \sum_{i=1}^{n} x_{0,i}, r = \sum_{i=1}^{n} r_i and \oplus denotes the fuzzy addition operator.

(3) Weighted average: if \lambda_{l,1}, \lambda_{l,2}, \ldots, \lambda_{l,n} and \lambda_{r,1}, \lambda_{r,2}, \ldots, \lambda_{r,n} are sufficiently large, then

w_1 \odot N^{(\lambda_{l,1}, \lambda_{r,1})}(l_1, x_{0,1}, r_1) \oplus w_2 \odot N^{(\lambda_{l,2}, \lambda_{r,2})}(l_2, x_{0,2}, r_2) \oplus \cdots \oplus w_n \odot N^{(\lambda_{l,n}, \lambda_{r,n})}(l_n, x_{0,n}, r_n) \approx N^{(\lambda_l, \lambda_r)}(l, x_0, r), \qquad (43)

where

\lambda_l = \frac{x_0 - l}{\sum_{i=1}^{n} w_i \frac{x_{0,i} - l_i}{\lambda_{l,i}}}, \qquad \lambda_r = \frac{r - x_0}{\sum_{i=1}^{n} w_i \frac{r_i - x_{0,i}}{\lambda_{r,i}}}, \qquad (44)

l = \sum_{i=1}^{n} w_i l_i, x_0 = \sum_{i=1}^{n} w_i x_{0,i}, r = \sum_{i=1}^{n} w_i r_i, w_1, w_2, \ldots, w_n \ge 0 and \sum_{i=1}^{n} w_i = 1.

Proof. The proposition follows from Proposition 5 and Proposition 6.
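Proposition 7 likewise reduces to parameter arithmetic on the tuple (l, x_0, r, \lambda_l, \lambda_r). The sketch below implements the approximate weighted average (43)-(44); consistently with Remark 1, feeding in a common fixed λ for every operand returns the same λ. The representation and names are ours:

```python
def weighted_average_ext(Ns, ws):
    """Approximate weighted average (43)-(44) of extended flexible fuzzy
    numbers given as tuples (l, x0, r, lam_l, lam_r); weights ws are assumed
    nonnegative and summing to 1."""
    l  = sum(w * n[0] for n, w in zip(Ns, ws))
    x0 = sum(w * n[1] for n, w in zip(Ns, ws))
    r  = sum(w * n[2] for n, w in zip(Ns, ws))
    lam_l = (x0 - l) / sum(w * (n[1] - n[0]) / n[3] for n, w in zip(Ns, ws))
    lam_r = (r - x0) / sum(w * (n[2] - n[1]) / n[4] for n, w in zip(Ns, ws))
    return (l, x0, r, lam_l, lam_r)
```

With the four fuzzy numbers of the demonstrative example in Sect. 3 and a fixed λ = 2, the result is (2.75, 5.0, 6.25, 2.0, 2.0).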
Remark 1. If we replace the parameters λl and λr in (40) by a fixed λ, then λl = λr = λ. Similarly, if we substitute the parameters λl,i and λr,i in (41) and in (43) with a fixed λ, then the right hand sides of the formulas for λl and λr in (42) and in (44) also become identical with λ, where i = 1, 2, . . . , n. At the same time, the exact operations for the flexible fuzzy numbers can be utilized instead of the approximate operations for the extended flexible fuzzy numbers.
3 A Demonstrative Example
Now, we will demonstrate the application of flexible fuzzy numbers by means of an example. In this example, the performance of an employee of a company is evaluated on Likert scales using flexible fuzzy numbers. The company evaluates human performance in four dimensions: the quality of work, the efficiency of work, the collaboration with other colleagues and the communication with other stakeholders, all assessed by the employee's supervisor. In each performance dimension, the evaluation is carried out on a Likert scale that includes the integers from 1 through 10. Number 1 represents the worst performance, while number 10 represents the best performance. The evaluator selects the values of l, x_0 and r on the Likert scale such that, in the evaluator's judgement, the employee's performance (1) is not worse than l, (2) mostly has the value of x_0, (3) is not better than r. These three numbers and a pre-set value of the parameter λ determine a flexible fuzzy number.
Fig. 5. An example of Likert Scale-based evaluations with flexible fuzzy numbers
Figure 5 shows an evaluation example in which the quality, the efficiency, the collaboration and the communication of an employee is represented by the
flexible fuzzy numbers N^{(λ)}(2, 5, 6), N^{(λ)}(5, 8, 9), N^{(λ)}(3, 4, 5) and N^{(λ)}(1, 3, 5), respectively, while the parameter λ has the value of 2. The flexible fuzzy number N^{(λ)}(l, x_0, r) represents the truth value of the statement that 'x is equal to x_0'. This truth value is 0 if x ≤ l or x ≥ r, and it is 1 if x = x_0. Therefore, for example, the fuzzy number N^{(λ)}(2, 5, 6), which represents the performance in the work quality dimension, expresses the evaluator's opinion that the employee's performance is around 5, not worse than 2, and not better than 6.

It should be mentioned that the flexible fuzzy number-based evaluation carries much more information than a traditional evaluation, which would represent the employee's work quality by the crisp value of 5. Here, the flexible fuzzy number-based evaluation expresses the fact that the perceived performance is between 2 and 6, and usually has the value of 5. It should also be added that since the left hand limit of the fuzzy number N^{(λ)}(2, 5, 6) is more distant from 5 than its right hand limit, the employee's performance is more likely to be less than 5 than to be greater than 5. In general, the width r − l of the flexible fuzzy number N^{(λ)}(l, x_0, r) may be viewed as an indication of performance instability, while the asymmetry of N^{(λ)}(l, x_0, r) provides information about the direction in which the performance is more likely to differ from the value of x_0. This means that utilizing a flexible fuzzy number for evaluation on a Likert scale allows the evaluator to express his or her uncertainty about the performance quite well.

Having determined the flexible fuzzy number-based evaluation for each performance dimension, the employee's aggregate performance is computed as the weighted average of the dimension-specific evaluation results. That is, the weighted average of four flexible fuzzy numbers gives the aggregate performance result.
Here, each performance dimension was given the same weight, so the aggregate result was computed as the arithmetic average of the dimension-specific evaluation results. The aggregate performance is shown in the lowest plot of Fig. 5. The flexible fuzzy number N^{(λ)}(2.8, 5.0, 6.2), which represents the aggregate performance of the evaluated employee, can be interpreted as follows: the aggregate performance of the employee lies between 2.8 and 6.2, and its most likely value is 5. The gray colored horizontal line segment in each plot of Fig. 5 represents the α-cut of the corresponding flexible fuzzy number for α = 0.8. In the case of the first performance dimension, the α-cut of the flexible fuzzy number N^{(λ)}(2, 5, 6) is the interval [4, 5.3], meaning that if the performance is in the interval [4, 5.3], then the truth of the statement that the perceived performance equals 5 is at least 0.8. Recall that we proved earlier that the weighted average of the α-cuts of flexible fuzzy numbers is the α-cut of the aggregate flexible fuzzy number. Note that the values of the parameters l, x_0 and r of the aggregate fuzzy number are independent of the value of the parameter λ, but the α-cuts of the individual flexible fuzzy numbers and the α-cut of the aggregate flexible fuzzy number all depend on the value of λ. It is worth mentioning that if the company carries out the performance evaluation for all its employees by applying the above-described approach,
then the company level aggregate performance can be computed as the average of the flexible fuzzy numbers representing the aggregate performance of the individuals. Since the weighted average calculation over the set of flexible fuzzy numbers is very simple (see Proposition 3), it is easy to determine the company level aggregate performance. It should be added that all the above-mentioned performance evaluations could also be carried out by applying extended flexible fuzzy numbers instead of flexible fuzzy numbers. Note that in this case, the approximate operators discussed in Proposition 7 should be used to perform calculations over the extended flexible fuzzy numbers.
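The aggregate in Fig. 5 can be reproduced directly: with equal weights, the parameters of the four fuzzy numbers average to (2.75, 5.0, 6.25), which the text rounds to 2.8 and 6.2, and the α = 0.8 cut of N^{(2)}(2, 5, 6) solves to [4, 16/3 ≈ 5.3]. A sketch, with names of our own choosing:

```python
# weighted average of (l, x0, r) triples, equal weights for the four dimensions
vals = [(2, 5, 6), (5, 8, 9), (3, 4, 5), (1, 3, 5)]
w = 0.25
l  = sum(w * v[0] for v in vals)   # 2.75, printed as 2.8 in the text
x0 = sum(w * v[1] for v in vals)   # 5.0
r  = sum(w * v[2] for v in vals)   # 6.25, printed as 6.2 in the text

def alpha_cut(l, x0, r, lam, alpha):
    """Interval where mu(x) >= alpha, solved from (25) with lam_l = lam_r = lam:
    setting the ratio in each branch to q = ((1-alpha)/alpha)**(1/lam)."""
    q = ((1 - alpha) / alpha) ** (1.0 / lam)
    return ((x0 + q * l) / (1 + q), (x0 + q * r) / (1 + q))
```

For the quality dimension, alpha_cut(2, 5, 6, 2, 0.8) indeed yields the left endpoint 4 and the right endpoint 16/3.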
4 Conclusions
In our study, we introduced the flexible fuzzy number, the membership function of which can take various shapes depending on its shape parameter. We pointed out that this feature of the flexible fuzzy number makes it suitable for expressing the soft equation 'x is approximately equal to x_0' in various ways. We showed that the triangular membership function is just a special case of the membership function of the flexible fuzzy number. Next, we proved that if the shape parameter λ is fixed, then the set of flexible fuzzy numbers is closed under the multiplication by a scalar, fuzzy addition and weighted average operations. Moreover, as a generalization of the flexible fuzzy numbers, we introduced the extended flexible fuzzy numbers, which can have different LHS and RHS shape parameters. Here, we pointed out that asymptotically the flexible fuzzy number is just a quasi fuzzy number composed of an increasing left hand side and a decreasing right hand side sigmoid function. Exploiting this finding and Dombi's pliant arithmetic operations [24], we implemented easy-to-compute approximate fuzzy arithmetic operations over the extended flexible fuzzy numbers. By means of a human performance evaluation example, we demonstrated that the flexible fuzzy numbers can handle the vagueness that may originate from the uncertainty of the evaluators or from the variability of the perceived performance values. This pliancy of the flexible fuzzy numbers and the above-mentioned properties of the operations over them make these fuzzy numbers suitable for Likert scale-based fuzzy evaluations. Since multiple flexible fuzzy numbers can easily be aggregated into one flexible fuzzy number by performing a weighted average operation over them, the flexible fuzzy numbers can be readily employed in multi-dimensional Likert scale-based evaluations.
References

1. Herrera, F., Herrera-Viedma, E.: Choice functions and mechanisms for linguistic preference relations. Eur. J. Oper. Res. 120(1), 144–161 (2000)
2. Herrera, F., López, E., Mendana, C., Rodríguez, M.A.: Solving an assignment-selection problem with verbal information and using genetic algorithms. Eur. J. Oper. Res. 119(2), 326–337 (1999)
3. Kacprzyk, J.: Towards 'human-consistent' multistage decision making and control models using fuzzy sets and fuzzy logic. Fuzzy Sets Syst. 18(3), 299–314 (1986)
4. Andayani, S., Hartati, S., Wardoyo, R., Mardapi, D.: Decision-making model for student assessment by unifying numerical and linguistic data. Int. J. Electr. Comput. Eng. 7(1), 363 (2017)
5. Carrasco, R.A., Villar, P., Hornos, M.J., Herrera-Viedma, E.: A linguistic multi-criteria decision making model applied to the integration of education questionnaires. Int. J. Comput. Intell. Syst. 4(5), 946–959 (2011)
6. Chen, C.T.: Applying linguistic decision-making method to deal with service quality evaluation problems. Int. J. Uncertainty Fuzziness Knowl.-Based Syst. 9(supp01), 103–114 (2001)
7. Gil, M.Á., González-Rodríguez, G.: Fuzzy vs. Likert scale in statistics. In: Combining Experimentation and Theory, pp. 407–420. Springer, Heidelberg (2012)
8. Lubiano, M.A., de Sáa, S.D.L.R., Montenegro, M., Sinova, B., Gil, M.Á.: Descriptive analysis of responses to items in questionnaires. Why not using a fuzzy rating scale? Inf. Sci. 360, 131–148 (2016)
9. Tóth, Z.E., Surman, V., Árva, G.: Challenges in course evaluations at Budapest University of Technology and Economics. In: Bekirogullari, Z., Minas, M.Y., Thambusamy, R.X. (eds.) 8th ICEEPSY 2017 The International Conference on Education and Educational Psychology, pp. 629–641 (2017)
10. Kuzmanovic, M., Savic, G., Popovic, M., Martic, M.: A new approach to evaluation of university teaching considering heterogeneity of students' preferences. High. Educ. 66(2), 153–171 (2013)
11. Lin, H.T.: Fuzzy application in service quality analysis: an empirical study. Exp. Syst. Appl. 37(1), 517–526 (2010)
12. Deng, W.J.: Fuzzy importance-performance analysis for determining critical service attributes. Int. J. Serv. Ind. Manag. 19(2), 252–270 (2008)
13. Lupo, T.: A fuzzy ServQual based method for reliable measurements of education quality in Italian higher education area. Exp. Syst. Appl. 40(17), 7096–7110 (2013)
14. Lupo, T.: A fuzzy framework to evaluate service quality in the healthcare industry: an empirical case of public hospital service evaluation in Sicily. Appl. Soft Comput. 40, 468–478 (2016)
15. Li, Q.: A novel Likert scale based on fuzzy sets theory. Exp. Syst. Appl. 40(5), 1609–1618 (2013)
16. Quirós, P., Alonso, J.M., Pancho, D.P.: Descriptive and comparative analysis of human perceptions expressed through fuzzy rating scale-based questionnaires. Int. J. Comput. Intell. Syst. 9(3), 450–467 (2016)
17. Hesketh, B., Pryor, R., Gleitzman, M., Hesketh, T.: Practical applications and psychometric evaluation of a computerised fuzzy graphic rating scale. Adv. Psychol. 56, 425–454 (1988). Elsevier
18. Gil, M.Á., Lubiano, M.A., De Sáa, S.D.L.R., Sinova, B.: Analyzing data from a fuzzy rating scale-based questionnaire: a case study. Psicothema 27(2), 182–192 (2015)
19. Jónás, T., Tóth, Z.E., Árva, G.: Applying a fuzzy questionnaire in a peer review process. Total Qual. Manag. Bus. Excellence 29(9–10), 1228–1245 (2018)
20. Stoklasa, J., Talášek, T., Luukka, P.: Fuzzified Likert scales in group multiple-criteria evaluation. In: Collan, M., Kacprzyk, J. (eds.) Soft Computing Applications for Group Decision-making and Consensus Modeling, pp. 165–185. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-60207-3_11
21. Talášek, T., Stoklasa, J.: A numerical investigation of the performance of distance and similarity measures in linguistic approximation under different linguistic scales. Multiple-Valued Logic Soft Comput. 29, 485–503 (2017)
22. Frühwirth-Schnatter, S.: On statistical inference for fuzzy data with applications to descriptive statistics. Fuzzy Sets Syst. 50(2), 143–165 (1992)
23. Amini, S., Jochem, R.: A conceptual model based on the fuzzy set theory to measure and evaluate the performance of service processes. In: 2011 15th IEEE International Enterprise Distributed Object Computing Conference Workshops (EDOCW), pp. 122–131. IEEE (2011)
24. Dombi, J.: Pliant arithmetics and pliant arithmetic operations. Acta Polytechnica Hungarica 6(5), 19–49 (2009)
25. Dombi, J.: On a certain type of unary operators. In: 2012 IEEE International Conference on Fuzzy Systems, pp. 1–7 (2012)
26. Dombi, J., Jónás, T., Tóth, Z.E.: The epsilon probability distribution and its application in reliability theory. Acta Polytechnica Hungarica 15(1), 197–216 (2018)
27. Dombi, J., Jónás, T.: Approximations to the normal probability distribution function using operators of continuous-valued logic. Acta Cybernetica 23(3), 829–852 (2018)
28. Bede, B.: Fuzzy Numbers, pp. 51–64. Springer, Heidelberg (2013)
29. Zimmermann, H.J.: Fuzzy Set Theory-and Its Applications. Springer, Heidelberg (2011)
Inventory Optimization Model Parameter Search Speed-Up Through Similarity Reduction

Tomáš Martinovič, Kateřina Janurová, Jan Martinovič, and Kateřina Slaninová

IT4Innovations, VŠB - Technical University of Ostrava, 17. listopadu 15/2172, 708 33 Ostrava-Poruba, Czech Republic
{tomas.martinovic,katerina.janurova}@vsb.cz
Abstract. This paper is concerned with finding near-optimal parameters for an inventory optimization model on a large dataset. It is shown that the proposed method allows for very good model parameter estimation with a great reduction in computation time. The model, developed in cooperation with the K2 atmitec s.r.o. company, has four input parameters which must be set before each run. These parameters are estimated through computationally complex simulations by hyperparameter search. Since it is impossible to perform a grid search for the optimal parameters for all the input time series, it is necessary to approximate the parameter settings. This approximation is done through similarity search and computation of the optimal parameters on the most central objects. Additionally, the parameter estimation is improved by clustering the time series, and the results are refined by the new estimates.

Keywords: Hyperparameter search · Recurrence Quantitative Analysis · Clustering · Inventory control
1 Introduction
Inventory optimization is a very important issue for companies; however, with thousands of different products it may be difficult to find the optimal model settings for all of them. Models for inventory control focus on minimizing the inventory stock while preventing stockout (for further explanation see Sect. 2). Numerous models for the optimization of inventory stocks have been created, such as the Economic Order Quantity model [3], the newsvendor model [1] or the fixed time review period model with variable demand [16], and their modern variations. In the case of an optimization model with two input parameters which can take 10 values each, it is necessary to test 100 combinations for every product. Therefore it is important to develop methods which are able to bring near-optimal results in a fraction of the original computing time.

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 102–114, 2021. https://doi.org/10.1007/978-3-030-51992-6_9
Parameter Search Speed-Up Through Similarity Reduction
103
In cooperation with K2 atmitec s.r.o., a new inventory control algorithm was developed. This algorithm has four input parameters, which considerably affect the results. The best setting for the algorithm is based on the simulation of inventory movements. It is very time consuming to find the best combination of parameters for every individual product due to the multiple runs of simulations. In this paper, a method of approximating the optimal settings for the inventory control algorithm over thousands of products is presented. The approximation consists of two steps. The first step is to compute the distance matrix of the product sales time series. Based on this, it is possible to compute the first approximate parameter settings on the most central products. The second step is to divide the time series into clusters, compute the optimal parameter settings for the central products of these clusters, and improve the overall results using the new sets of parameters.
2 Automated Inventory Control

2.1 Problem Description
In real-world practice, companies usually face the challenge of optimizing inventory control. Given that this procedure usually involves the control of thousands of products, it is not possible to handle this problem without an automated approach. In our previous work [10], a new inventory control algorithm was designed and implemented in cooperation with K2 atmitec s.r.o. This algorithm simplifies the decision making process of sales managers. It is very flexible thanks to its four input parameters: different levels of inventory control can be achieved with the proper combination of different input parameter values.

2.2 Verification of Automated Inventory Control
Our approach was verified by a simulation which imitates the behaviour of a sales manager on the assumption of complete trust in our algorithm. The results of this simulation were compared with the real historical sales and inventory time series of two different companies from the Czech Republic, to which we refer in the following text as Dataset 1 and Dataset 2. The data was provided by the K2 atmitec s.r.o. company. The following output parameters are important for the overall evaluation of the results: the mean stockout, calculated as the number of days for which the product was unavailable in the simulated inventory (hereinafter referred to as stockout), and the mean relative difference, calculated as the relative percentage difference between the mean level of the real inventory and the mean level of the simulated inventory (hereinafter referred to as difference). The higher the difference, the more successful the simulation. The task was: keep the stockout below 5% with the maximum possible difference.
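The two output measures can be sketched as follows. The exact normalisation of the difference is not fully specified in the text, so the percentage form below is only an illustrative reading, and all names are ours:

```python
def evaluate(real_stock, sim_stock):
    """Stockout: share of days (in %) with empty simulated inventory.
    Difference: relative percentage difference of the mean real inventory
    level vs. the mean simulated level (illustrative reading of the text)."""
    n = len(sim_stock)
    stockout = 100.0 * sum(1 for s in sim_stock if s <= 0) / n
    mean_real = sum(real_stock) / len(real_stock)
    mean_sim = sum(sim_stock) / n
    difference = 100.0 * (mean_real - mean_sim) / mean_real
    return stockout, difference
```

A simulation that halves the held stock while being out of stock on one day out of four would score a stockout of 25% and a difference of 50%.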
104
T. Martinovič et al.
2.3 Best Setting Search
As mentioned in Sect. 2.1, our algorithm has four input parameters. By combining a small number of reasonable input parameter values, determined on the basis of a previous analysis of real data, we arrived at a total of 192 input combinations. The best setting of the algorithm for a specific dataset was then found by a grid search over the input combinations as the combination with the maximum difference and a stockout below 5%. Such a setting is the same for all time series in the entire dataset, and we call the outcome the globally optimal results.
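The grid search itself is a plain exhaustive loop over the input combinations, keeping the combination with the maximum difference subject to the stockout constraint. A sketch with a hypothetical `simulate` interface of our own design:

```python
from itertools import product

def grid_search(simulate, grid):
    """simulate(params) -> (stockout, difference); return the combination
    with the maximum difference subject to stockout < 5 (hypothetical
    interface, not the paper's actual code)."""
    best, best_diff = None, float("-inf")
    for params in product(*grid):
        stockout, diff = simulate(params)
        if stockout < 5.0 and diff > best_diff:
            best, best_diff = params, diff
    return best, best_diff
```

With four parameters the `grid` argument would hold four value lists whose sizes multiply to the 192 combinations mentioned above.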
3 Approximation of the Global Optimum
The grid search process of obtaining the globally optimal results was illustrated in the previous section. To reduce the computation time of the grid search for the optimal parameters, it is possible to choose only a small number of the time series to act as the characteristic elements. For this purpose it is best to compute the dissimilarity matrix of the individual time series. Traditionally, a few methods are used for computing the distances between time series, e.g. Dynamic Time Warping [5] and the Longest Common Subsequence [4]. In this article, Recurrence Quantitative Analysis [18] is used to create time series features, which are subsequently used for the computation of the dissimilarity matrix.

3.1 Recurrence Quantitative Analysis
Recurrence Quantitative Analysis (RQA) is the quantification of the structures found in the Recurrence Plot (RP) [12]. The simplest form of the RP is basically a thresholded version of the distance matrix of subsequences of the time series. First, Takens embedding [17] is used to create the subsequences. Two parameters must be set for the Takens embedding: the embedding dimension (m) and the lag (l). Let x_n be a time series with n observations; then the embedded vector at time i is defined as X_i = \{x_i, x_{i+l}, \ldots, x_{i+(m-1)l}\}. Afterwards, a distance matrix of all the vectors X_i is computed. In this paper the maximum metric is used and the distance between two vectors is

D_{i,j} = d(X_i, X_j) = \max_k |x_{i+kl} - x_{j+kl}|, \qquad (1)
for k = 1, \ldots, m. The recurrence plot is then defined as

RP_{i,j}^{\varepsilon} =
\begin{cases}
1, & D_{i,j} < \varepsilon \\
0, & \text{otherwise.}
\end{cases} \qquad (2)
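The construction of the RP from (1) and (2) can be sketched directly. This is a pure-Python illustration with names of our own choosing, not the authors' implementation:

```python
def recurrence_plot(x, m, lag, eps):
    """Takens embedding X_i = (x_i, x_{i+lag}, ..., x_{i+(m-1)lag}),
    then the maximum-metric distance (1) thresholded at eps as in (2)."""
    n_vec = len(x) - (m - 1) * lag
    X = [[x[i + k * lag] for k in range(m)] for i in range(n_vec)]
    return [[1 if max(abs(a - b) for a, b in zip(X[i], X[j])) < eps else 0
             for j in range(n_vec)]
            for i in range(n_vec)]
```

For a strictly periodic series the RP shows unbroken diagonal lines of ones, which is exactly the structure the RQA features below quantify.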
To increase the usability of RP and to allow more robust analysis quantification of the structures, namely diagonal and vertical lines of consecutive
recurrences, a method called RQA was developed [23]. The following features were used for the computation of the dissimilarity matrix:

DET = \frac{\sum_{l=l_{min}}^{N} l P(l)}{\sum_{i,j=1}^{N} RP(i,j)}, \qquad (3)

LAM = \frac{\sum_{v=v_{min}}^{N} v P(v)}{\sum_{i,j=1}^{N} RP(i,j)}, \qquad (4)

DIV = 1 / l_{max}, \qquad (5)

L = \frac{\sum_{l=l_{min}}^{N} l P(l)}{\sum_{l=l_{min}}^{N} P(l)}, \qquad (6)

TT = \frac{\sum_{v=v_{min}}^{N} v P(v)}{\sum_{v=v_{min}}^{N} P(v)}, \qquad (7)
where DET is determinism, DIV is divergence, LAM is laminarity, L is the average diagonal line length, TT is the average vertical line length, or trapping time, l is the length of diagonal lines, v is the length of vertical lines, P(l) is the histogram of diagonal lines, P(v) is the histogram of vertical lines, l_{min} and v_{min} stand for the minimal length of the diagonal/vertical lines which should be considered for the computation, and l_{max} is the maximal length of diagonal lines. These features provide different information about the time series dynamics, such as the amount of repetitions of similar sequences and the consecutive repetition of the same values. Therefore a dissimilarity matrix created from the feature vectors should provide a good distribution of the time series based on their dynamics. Additionally, the dissimilarity matrix may later be used for the clustering of the time series. RQA was previously used for time series clustering in [22]. A more exhaustive description of the RQA is in [12], and a fast algorithm for the computation of the RQA was proposed in [11].

3.2 k-ball Reduction
Based on the dissimilarity matrix described in Subsect. 3.1, it is possible to find the most central time series with the minimal sum of distances to all the other time series. Finding the optimal parameters of the optimization function for this central time series should provide a relatively good approximation of the parameters for every other time series in the dataset. Most likely, this solution will not be the best solution for the whole dataset, but it should provide reasonable results for a massive time reduction. It is possible to select more than one time series for the grid search to improve the solution, starting with those closest to the central time series. Such an addition of time series is similar to the ε-k-ball search proposed in [8]. However, in this case, instead of choosing ε, a k is chosen. Let such a selection be a k-ball selection. It is important to keep in mind that every time series included in the grid search will linearly increase the time needed for the grid search of the optimal
parameters. If ε were set instead, it could happen that many other time series would be included at once and the grid search would take an enormous amount of time. Another consequence of this approach is that, in general, it slowly approaches the globally optimal results presented in Sect. 2. Combining these facts, it is possible to set in advance how much time, for a specific number of elements in the k-ball, is given for the optimization of the parameters, and to know what results may be expected from such an optimization. Figure 1 illustrates how the addition of more elements approaches the global solution.
Fig. 1. k − ball and its approach to the global solution
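The k-ball selection described above is simple once the dissimilarity matrix is available: take the object with the minimal row sum and its nearest neighbours. A sketch, with names of our own choosing:

```python
def k_ball(D, k):
    """Pick the most central object (minimal row sum of the dissimilarity
    matrix D) and its k-1 nearest neighbours: the k-ball selection."""
    n = len(D)
    center = min(range(n), key=lambda i: sum(D[i]))
    by_dist = sorted(range(n), key=lambda j: D[center][j])
    return by_dist[:k]   # includes the center itself (distance 0)
```

The returned indices are the time series on which the grid search of Sect. 2.3 is actually run; increasing k trades computation time for closeness to the global solution.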
3.3 Clustering
Of course, when the parameters found on the k-ball are applied to all the time series, it is possible that appropriate parameters of the inventory control model will not be found from the small number of objects selected in the k-ball. This might be improved by creating multiple clusters of time series with similar features. Due to the similarity of medoid selection to the k-ball selection, Partitioning Around Medoids [15] was chosen as the clustering algorithm. Afterwards, the grid search for the optimal parameters on the k-ball products of each cluster is computed. The most important advantage of the clustering is that the user does not need to depend on the results of the clustering alone. It is possible to combine the parameter settings of the initial solution made on the whole dataset with the settings found for the cluster. In the case of two clusters this results in having two clusters and two parameter settings for each cluster: one found from the whole dataset search and one from the cluster search. It is easy to compute the results for each combination of the data and the parameters and then use the parameters which provide the best results for the given cluster. This methodology is based on the idea that for the best results it is necessary to find the optimal settings for each product individually. Finding the optimal setting
Parameter Search Speed-Up Through Similarity Reduction
for every product is time consuming; however, it is enough to find better results for a part of the products compared to the benchmark results, i.e., the results found with the k-ball solution over all the products. Therefore, it suffices to find parameter settings which improve the results for at least one product in order to improve the initial solution. Here, clustering is used to improve the results for a whole batch of products at once. Since the learning phase grows linearly in time, for two clusters the final time will be twice that of a single computation using the k-ball over the whole dataset. The combination of the parameters over the clusters is illustrated in Fig. 2. The upper part of the figure shows the dataset, with red dots representing the objects selected for the k-ball. The lower part shows how combining the best parameters for each cluster may improve the final score.
Fig. 2. Improvement of the stockout when the combination of the parameters is used
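The per-cluster selection of parameter settings described above can be sketched as follows. This is a hypothetical illustration, not the authors' implementation; the `evaluate` callable stands in for the inventory-control simulation that returns an average stockout for a parameter setting on a subset of products.

```python
def combine_cluster_parameters(clusters, settings, evaluate):
    """For each cluster, keep whichever candidate parameter setting scores best.

    clusters: dict cluster_id -> products belonging to that cluster
    settings: dict setting_id -> parameters (e.g. the whole-dataset setting P0
              plus the per-cluster settings P1, P2, ...)
    evaluate: callable(parameters, products) -> average stockout (lower is better)
    """
    best = {}
    for cid, products in clusters.items():
        # Score every candidate setting on this cluster's products.
        scored = {sid: evaluate(params, products) for sid, params in settings.items()}
        # Keep the setting with the lowest average stockout.
        best[cid] = min(scored, key=scored.get)
    return best
```

Because every candidate setting is evaluated on every cluster, the combined result can never be worse than using the whole-dataset setting alone.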
3.4 Partitioning Around Medoids
Partitioning Around Medoids (PAM) is a clustering method that minimizes the sum of the dissimilarities among the input objects. The method consists of two steps. In the first step, medoids are chosen. For n clusters, the first chosen medoid is the object with the smallest sum of dissimilarities to every other object. Afterwards, objects that minimize the sum of dissimilarities are selected iteratively until n medoids are chosen. In the second step, the initial solution is improved by swapping a medoid with an unselected object. For the selected medoid, the effect of a swap
T. Martinovič et al.
on the sum of the dissimilarities is considered. The object whose swap most decreases the sum of the dissimilarities becomes the new medoid, replacing the selected one. This operation is repeated until no further improvement is possible.
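The two phases described above can be sketched as follows. This is a minimal illustrative implementation on a precomputed dissimilarity matrix; the paper itself uses the R implementation from the cluster package [7].

```python
import numpy as np

def pam(D, n_clusters):
    """Partitioning Around Medoids on a dissimilarity matrix D (n x n)."""
    n = D.shape[0]
    # BUILD: the first medoid minimizes the total dissimilarity to all objects;
    # further medoids greedily minimize the cost of the partial solution.
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < n_clusters:
        best_obj, best_cost = None, np.inf
        for cand in range(n):
            if cand in medoids:
                continue
            cost = D[:, medoids + [cand]].min(axis=1).sum()
            if cost < best_cost:
                best_obj, best_cost = cand, cost
        medoids.append(best_obj)
    # SWAP: replace a medoid with an unselected object whenever that decreases
    # the sum of dissimilarities; repeat until no further improvement is possible.
    improved = True
    while improved:
        improved = False
        current = D[:, medoids].min(axis=1).sum()
        for i in range(len(medoids)):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = D[:, trial].min(axis=1).sum()
                if cost < current:
                    medoids, current, improved = trial, cost, True
    # Assign each object to its nearest medoid.
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels
```

On a toy 1-D example with two obvious groups, the build and swap phases recover the expected partition.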
3.5 Clustering Computation and the k-ball
To provide the k-ball objects and the clusters for the simulations, the RQA was computed for every time series in the dataset. The embedding dimension was set to 5 (one working week) and the lag to 1 for all input time series. Since the datasets contained many zero values, indicating no sales on the given day, long sequences of zeros were reduced to a maximum of 5 consecutive zeros. The longest allowed sequence of 5 was set to equal the product of the embedding dimension and the lag. This should reduce the extreme number of recurrences due to days without sales, while still creating a unique embedded vector for such a case. The threshold value ε was set dynamically, so that the recurrence rate (the percentage of recurrences in the RP) lies between 10% and 20%. In practice, the RQA is computed with ε set to one tenth of the range of the time series; if the recurrence rate is too low/high, ε is increased/decreased by one percent of the range of the time series and the RQA is recomputed. This process is repeated until the desired recurrence rate is achieved. The exact value of the recurrence rate differs based on the length and the variability of the input time series. The range of 10% to 20% was selected so that the RP has enough recurrences to evaluate the dynamics, while not being oversaturated with recurrences. Before computing the dissimilarity matrix, the features need to be normalized. Since all the features are positive numbers, and the recurrence rate, determinism and laminarity are defined on the interval [0, 1], only the average diagonal line length and the average vertical line length need to be normalized. This was easily achieved by dividing all the values by the maximum values of the individual features. The dissimilarity matrix is computed as the Euclidean distance between the features of all pairs of input time series. This dissimilarity matrix is used for the selection of the k-ball objects and for the clustering by the PAM algorithm. In this paper the number of clusters was set to 2.
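The two preprocessing steps above — capping runs of zeros and tuning ε to hit the target recurrence rate — can be sketched as follows. This is an illustrative sketch only: the recurrence-rate computation itself is assumed to be available (the authors use a scalable RQA implementation [11]) and is passed in as a callable here.

```python
def cap_zero_runs(series, max_zeros=5):
    """Reduce runs of consecutive zeros to at most max_zeros entries."""
    out, run = [], 0
    for v in series:
        run = run + 1 if v == 0 else 0
        if run <= max_zeros:
            out.append(v)
    return out

def tune_epsilon(series, recurrence_rate, low=0.10, high=0.20, max_iter=200):
    """Adjust eps until the recurrence rate falls within [low, high].

    Starts at one tenth of the series range and moves in steps of one
    percent of the range, as described in the text.
    """
    rng = max(series) - min(series)
    eps = rng / 10.0
    step = rng / 100.0
    for _ in range(max_iter):
        rr = recurrence_rate(series, eps)
        if rr < low:
            eps += step       # too few recurrences: loosen the threshold
        elif rr > high:
            eps -= step       # too many recurrences: tighten the threshold
        else:
            return eps
    return eps
```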
The cluster distributions were validated by the computation of silhouettes [14]. The silhouette of object i is defined as

S(i) = (b(i) − a(i)) / max(a(i), b(i)),    (8)
where a(i) is the average distance of object i from the other objects in its own cluster, and b(i) is the lowest average distance from i to all the points in any other cluster. Ultimately, the silhouette measures how similar object i is to its own cluster compared to the other clusters. Values close to 0 and negative values indicate that the object might belong to another, close cluster. Figure 3 shows the cluster distribution of the products over the first two principal components, together with the silhouettes, for Dataset 1 (left) and Dataset 2 (right). The silhouette values for both datasets are high for most of the objects, and only a very small number of objects has negative silhouette values.
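Equation (8) can be computed directly from the dissimilarity matrix. A small sketch (hypothetical; in R this mirrors what the cluster package's silhouette computation does):

```python
import numpy as np

def silhouette(D, labels):
    """Silhouette values per Eq. (8) from a dissimilarity matrix D.

    Assumes at least two clusters; a singleton cluster gets silhouette 0.
    """
    labels = np.asarray(labels)
    n = len(labels)
    s = np.zeros(n)
    for i in range(n):
        own = (labels == labels[i])
        own[i] = False                     # exclude the object itself
        if not own.any():                  # singleton cluster
            continue
        a = D[i, own].mean()               # a(i): avg distance within own cluster
        b = min(D[i, labels == c].mean()   # b(i): lowest avg distance to another cluster
                for c in set(labels.tolist()) if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s
```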
4 Results
The computations used the R statistical software environment [13] with the following packages [2,6,7,9,19–21]. A brief description of the globally optimal results for both datasets is given in Table 1. The stockout values given here should not be exceeded, and the values of difference should be matched as closely as possible by the reduced parameter search simulations.

Table 1. Globally optimal results on both datasets

                          Dataset 1   Dataset 2
Number of products        516         1468
Difference                65.16%      27.52%
Stockout                  4.87%       4.98%
Computing time in hours   17:29:36    41:44:48
The calculated values of the difference and stockout for Dataset 1 with an increasing number of k-ball elements are plotted in Figs. 4 and 5, respectively. As can be seen from Fig. 4, with an increasing number of k-ball elements the difference values for all approaches approach the required difference value given in Table 1. The Clusters approach shows slightly better results than the other two approaches for higher numbers of k-ball elements. On the other hand, the Combined approach gives better results for smaller numbers of k-ball elements. This justifies the use of the Combined approach for a smaller number of k-ball elements, even though it is computationally more demanding than the other two approaches. The same behaviour can be seen in Fig. 5 for the stockout values. We can conclude that our simulations approach the required solution closely enough for a fraction of the computing time. The computing times for all approaches and both datasets are given in Table 2. The calculated values of the difference and stockout for Dataset 2 with an increasing number of k-ball elements are plotted in Figs. 6 and 7, respectively. Unlike on Dataset 1, two of the approaches behave somewhat differently on Dataset 2: their stockout values are slightly above the allowed stockout value given in Table 1. Only the Combined approach keeps the stockout values below the required value. Again, we can conclude that our simulations approach the required solution closely enough for a fraction of the computing time.
Fig. 3. Cluster distribution over first two principal components and silhouettes for the clusters of Dataset 1 (left) and Dataset 2 (right)

Fig. 4. Improvement of difference by combination of optimal parameters on Dataset 1. The dashed black line represents the difference from globally optimal results given in Table 1
Fig. 5. Improvement of stockout by combination of optimal parameters on Dataset 1. The dashed black line represents the stockout from globally optimal results given in Table 1
Fig. 6. Improvement of difference by combination of optimal parameters on Dataset 2. The dashed black line represents the difference from globally optimal results given in Table 1
Fig. 7. Improvement of stockout by combination of optimal parameters on Dataset 2. The dashed black line represents the stockout from globally optimal results given in Table 1
Table 2. Computing times for different approaches in hours

No. of k-ball   Dataset 1                        Dataset 2
elements        No cluster  Clusters  Comb       No cluster  Clusters  Comb
1               0:02:48     0:06:24   0:09:12    0:09:12     0:16:24   0:25:36
2               0:04:24     0:07:36   0:12:00    0:10:48     0:18:24   0:29:12
3               0:07:36     0:14:00   0:21:36    0:10:48     0:22:00   0:32:48
4               0:08:48     0:18:48   0:27:36    0:11:36     0:24:00   0:35:36
5               0:12:24     0:22:00   0:34:24    0:13:36     0:27:12   0:40:48
6               0:14:00     0:27:36   0:41:36    0:14:24     0:30:00   0:44:24
7               0:15:12     0:30:00   0:45:12    0:16:00     0:31:36   0:47:36
8               0:15:12     0:35:36   0:50:48    0:17:12     0:33:12   0:50:24
9               0:17:12     0:38:48   0:56:00    0:17:36     0:36:00   0:53:36
10              0:20:48     0:44:00   1:04:48    0:19:36     0:39:36   0:59:12
11              0:22:00     0:50:24   1:12:24    0:21:12     0:42:24   1:03:36
12              0:24:48     0:55:36   1:20:24    0:22:24     0:44:00   1:06:24
13              0:26:00     1:02:00   1:28:00    0:24:24     0:47:12   1:11:36
14              0:27:36     1:07:36   1:35:12    0:25:36     0:50:48   1:16:24
15              0:30:24     1:13:12   1:43:36    0:26:48     0:52:24   1:19:12
20              0:35:36     1:30:48   2:06:24    0:36:24     1:03:36   1:40:00
30              0:55:12     2:12:00   3:07:12    1:00:00     1:24:24   2:24:24
40              1:24:00     3:01:12   4:25:12    1:11:36     1:54:00   3:05:36
50              1:46:00     3:39:36   5:25:36    1:30:24     2:16:00   3:46:24
5 Conclusion
A way to speed up the estimation of near-optimal parameters for the inventory control model was presented. The estimation consists of selecting the most central objects for the parameter estimation, and additionally improving the results by estimating near-optimal parameters for subsets of the whole dataset. This approach reduces the computation time significantly, with only a small effect on the results: from 17 h and 30 min to 5 h and 25 min for Dataset 1, and from 41 h and 45 min to 3 h and 46 min for Dataset 2. Thanks to the reduced estimation time, the inventory control system can provide good results on large datasets for which it is impossible to compute the optimal settings for all of the products. In the future, additional optimization may be explored through different clustering methods, which may further improve the results. Additionally, it is possible to test different distance matrix computations and other inventory
control algorithms. One other important factor which should be studied is the handling of outliers.

Acknowledgements. This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPS II) project "IT4Innovations excellence in science - LQ1602" and by the "IT4Innovations infrastructure which is supported from the Large Infrastructures for Research, Experimental Development and Innovations project IT4Innovations National Supercomputing Center - LM2015070", and partially supported by the SGC grant No. SP2018/173 "Dynamic systems problems and their implementation on HPC", VŠB - Technical University of Ostrava, Czech Republic.
References

1. Axsäter, S.: Inventory Control (International Series in Operations Research & Management Science). Springer (2015)
2. Chen, W.C., Ostrouchov, G., Schmidt, D., Patel, P., Yu, H.: pbdMPI: Programming with big data – interface to MPI (2012). R package. https://cran.r-project.org/package=pbdMPI
3. Choi, T.M. (ed.): Handbook of EOQ Inventory Problems: Stochastic and Deterministic Models and Applications (International Series in Operations Research & Management Science). Springer (2013)
4. Das, G., Gunopulos, D., Mannila, H.: Finding similar time series. In: Komorowski, J., Zytkow, J. (eds.) Principles of Data Mining and Knowledge Discovery, pp. 88–100. Springer, Heidelberg (1997)
5. Jeong, Y.S., Jeong, M.K., Omitaomu, O.A.: Weighted dynamic time warping for time series classification. Pattern Recogn. 44(9), 2231–2240 (2011). https://doi.org/10.1016/j.patcog.2010.09.022
6. Kassambara, A., Mundt, F.: factoextra: Extract and Visualize the Results of Multivariate Data Analyses (2017). R package version 1.0.5. https://CRAN.R-project.org/package=factoextra
7. Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.: cluster: Cluster Analysis Basics and Extensions (2018). R package version 2.0.7-1
8. Martinovič, J., Snášel, V., Dvorský, J., Dráždilová, P.: Search in documents based on topical development. In: Snášel, V., Szczepaniak, P.S., Abraham, A., Kacprzyk, J. (eds.) Advances in Intelligent Web Mastering - 2, pp. 155–166. Springer, Heidelberg (2010)
9. Martinovic, T.: Chaos01: 0-1 Test for Chaos (2016). R package version 1.1.0. https://CRAN.R-project.org/package=Chaos01
10. Martinovič, T., Janurová, K., Slaninová, K., Martinovič, J.: Automated application of inventory optimization. In: Saeed, K., Homenda, W. (eds.) Computer Information Systems and Industrial Management, pp. 230–239. Springer International Publishing, Cham (2016)
11. Martinovič, T., Zitzlsberger, G.: Highly scalable algorithm for computation of recurrence quantitative analysis. J. Supercomput. (2018). https://doi.org/10.1007/s11227-018-2350-5
12. Marwan, N., Romano, M.C., Thiel, M., Kurths, J.: Recurrence plots for the analysis of complex systems. Phys. Rep. 438(5), 237–329 (2007). https://doi.org/10.1016/j.physrep.2006.11.001
13. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2018). https://www.R-project.org/
14. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987). https://doi.org/10.1016/0377-0427(87)90125-7
15. Rousseeuw, P.J., Kaufman, L.: Finding Groups in Data. Wiley, Hoboken (1990)
16. Russell, R., Taylor, B.: Operations and Supply Chain Management. Wiley (2016)
17. Takens, F.: Detecting strange attractors in turbulence, pp. 366–381. Springer, Heidelberg (1981). https://doi.org/10.1007/BFb0091924
18. Webber, C.L., Zbilut, J.P.: Dynamical assessment of physiological systems and states using recurrence plot strategies. J. Appl. Physiol. 76(2), 965–973 (1994)
19. Wickham, H.: ggplot2: Elegant Graphics for Data Analysis. Springer, New York (2009). http://ggplot2.org
20. Wickham, H.: tidyverse: Easily Install and Load the 'Tidyverse' (2017). R package version 1.2.1. https://CRAN.R-project.org/package=tidyverse
21. Wickham, H., Francois, R., Henry, L., Müller, K.: dplyr: a Grammar of Data Manipulation (2017). R package version 0.7.4. https://CRAN.R-project.org/package=dplyr
22. Witt, C.: Clustering Recurrence Plots. Master's thesis, Humboldt-Universität zu Berlin, Germany (2016)
23. Zbilut, J.P., Webber, C.L.: Embeddings and delays as derived from quantification of recurrence plots. Phys. Lett. A 171(3), 199–203 (1992). https://doi.org/10.1016/0375-9601(92)90426-M
On Posets for Reliability: How Fine Can They Be?

Valeriu Beiu1, Simon R. Cowell1, and Vlad-Florin Drăgoi1,2

1 Faculty of Exact Sciences, "Aurel Vlaicu" University of Arad, Arad, Romania
{valeriu.beiu,simon.cowell,vlad.dragoi}@uav.ro
2 University of Rouen Normandy, LITIS, Mont-Saint-Aignan, France
[email protected]
Abstract. This article investigates the structure of several posets over the set of reliability polynomials of minimal two-terminal networks. We expand on the work of Drăgoi et al. [14], who focused on posets of compositions of networks. Our simulations show that other classes of two-terminal networks are amenable to this approach, including series-and-parallel networks and consecutive-k-out-of-n:F networks. We perform a finer analysis of the reliability polynomials associated with compositions of series-and-parallel networks. We also introduce here three different threshold points that define a new ordering on the set of reliability polynomials. This ordering is finer than the existing ones, and is total.

Keywords: Poset of reliability · Minimal two-terminal networks · Compositions of series-and-parallel networks · Consecutive-k-out-of-n:F networks

1 Introduction
The reliability of two-terminal networks is a well-established research topic. It was initiated by von Neumann [33] and Moore and Shannon [25,26] in the context of the theory of switching circuits. In [25,26] the authors analyzed designs for improving reliability under various constraints, e.g., the number of devices and a uniform distribution of device failures. The reliability of a two-terminal network, as per Moore and Shannon, is defined as the (s, t)-connectedness (also known in [2] as the reliability polynomial) of the network, which is the most common interpretation in network reliability theory. It was only after several years that researchers started to realize the difficulty of computing the reliability polynomial for a given network [32]. Valiant proved that several combinatorial problems related to network reliability, including the computation of the reliability polynomial of a random two-terminal network, are #P-complete [32]. Thus, different directions emerged, e.g., obtaining sharper bounds for the reliability polynomials, decreasing the complexity of the state-of-the-art algorithms for computing the reliability polynomials, and determining

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 115–129, 2021.
https://doi.org/10.1007/978-3-030-51992-6_10
V. Beiu et al.
new families of networks for which efficient algorithms for computing the reliability polynomials exist. All of these concepts are detailed in well-known books, such as [2,11]. One of these new directions in network reliability was given in 1991 by Boesch et al. [7] and Myrvold et al. [28]. When networks are used for communication they are expected to maintain connectivity as much as possible, i.e., the probability that s and t are connected should be as high as possible. Informally this property can be expressed as follows. A two-terminal network made of n devices and ω wires is called uniformly most reliable if and only if its reliability polynomial is greater than or equal to the reliability polynomial of any other two-terminal network of n devices and ω wires (see [7] for an equivalent definition). These studies were continued in [1,6,8,34]. From such results the idea of a possible poset over the set of reliability polynomials of networks having n devices has emerged [14]. The main motivation for analyzing such structures is that they might offer a better understanding of how networks could be ranked. On one hand, they should confirm the existence of uniformly most reliable networks as the supremum of one of the posets. On the other hand, when searching for networks behaving as a switch [25,26], we should rather check inside the poset for optimum networks. In this article we continue to investigate the existence of such structures and start from the recent work of Drăgoi et al. [14]. There the authors have proposed and analyzed the structure of a poset over the set of reliability polynomials of compositions of series-and-parallel networks. The order used in [14] (⪯) was borrowed from other fields, e.g., coding theory [4,19,24,27,30], cryptography [3,18], and poset theory [31].

Our Contribution. This article brings two novel contributions to the analysis of poset structures over the set of reliability polynomials of two-terminal networks.
First of all, we extend the analyses to a larger class of minimal two-terminal networks, namely the family of series-and-parallel networks. We show that series-and-parallel networks allow for a finer poset when compared to compositions. Next, we evaluate for the first time the poset of reliability polynomials of consecutive-k-out-of-n:F networks. In this case the simulations reveal a totally ordered set (i.e., the reliability polynomials of such networks are ordered in a chain). These results support new theorems regarding uniformly most reliable two-terminal networks. Secondly, we investigate possible poset structures for minimal two-terminal networks that are envisaged for Boolean computations, i.e., as switching devices. This was also the motivation of Moore and Shannon's article [25], but their results seem to have been forgotten, as this direction of research has not been followed up properly. For determining networks that are optimal from the computational point of view, we analyze the properties of reliability polynomials for compositions of series-and-parallel networks. We propose three threshold/crossing points that define three isomorphic totally ordered sets. Our results verify the
simulations in [15], but with significantly fewer computations. They could also be used for emulating networks larger than those considered in [15].

Outline of the Article. The article is organized as follows. Section 2 sets the technical background required for the rest of the paper. In Sect. 3 we define the existing poset over the set of reliability polynomials of compositions of series-and-parallel networks, and we illustrate the theory through some examples. Section 4 is composed of three different subsections that detail all of our aforementioned results. We conclude the article in Sect. 5.
2 Preliminaries
The networks that we consider here are made of switches/devices, while they act like a single switch/device. We fix the total number of devices of a network at a strictly positive integer n. We say that N is a two-terminal network of size n (or an n-network) if N is a network made of n identical devices that has two distinguished terminals: an input or source S, and an output or terminus T. For any N the number of devices satisfies

n ≥ wl,    (1)

where w is the size of a "minimal cut" separating S from T, and l is the size of a "minimal path" from S to T (see Theorem 3 in [25]). Commonly l is the length and w is the width of N. One might define different types of network minimality, but here we stick to minimality with respect to the number of devices. Hence, a network for which n = wl will be called a minimal network.

Starting from the set of minimal networks, there are several ways of generating a larger set of minimal networks. For example, if this set contains networks with the same l, we might arrange in parallel all the elements of any subset of this set to obtain a new minimal network having a larger w. Equivalently, networks having the same w could be connected in series. It follows that series-and-parallel arrangements can be used (under certain conditions) to generate larger minimal networks. Another solution is to use composition of (minimal) networks, an operation that we define below.

Definition 1. Let C represent a composition of two networks. If we start from the device itself, the smallest possible compositions are: two devices in series, C^(0), and two devices in parallel, C^(1). At the next level, a composition of C^(0) with C^(1) is C^(0,1) = C^(0) • C^(1), obtained by replacing each device in C^(0) by C^(1). Any binary vector u ∈ {0, 1}^m will be identified by its support Supp(u), that is, the set of positions i where u_i ≠ 0. For a given binary vector u = (u_0, . . . , u_{m−1}) ∈ {0, 1}^m, C^u represents the composition C^(u_0) • · · · • C^(u_{m−1}). For a fixed m we denote the set of all compositions by C_{2^m} = {C^u | u ∈ {0, 1}^m}. It is already known that such compositions are minimal networks.
Proposition 1 ([16]). Let m be a strictly positive integer and C^u ∈ C_{2^m}. Then C^u is a minimal network of size n = 2^m, length l = 2^{m−|u|} and width w = 2^{|u|}, where |u| denotes the Hamming weight of u.

We are interested in one particular aspect of an electrical circuit equivalent to a two-terminal network, namely its duality. The dual of a network N will be denoted N^⊥. In order to formally express the dual of C^u we need to define the bitwise complement of a binary vector, ū = (1, . . . , 1) ⊕ u.

Proposition 2. Let m be a strictly positive integer and u ∈ {0, 1}^m. Then C^ū has width w = 2^{m−|u|} and length l = 2^{|u|}, and

(C^u)^⊥ = C^ū.    (2)

3 A Poset for the Reliability Polynomials of Minimal Networks

3.1 Reliability Polynomials

The reliability of any two-terminal network is defined as the probability that the source S and the terminus T are connected (also known as (s, t)-connectedness). We will use here an established notation for the reliability polynomial, namely Rel(N; p), where p ∈ [0, 1] is the probability that a device is closed. The reliability polynomial has several written forms, including:

Rel(N; p) = Σ_{i=0}^{n} P_i(N) p^i    (3)

Rel(N; p) = Σ_{i=0}^{n} N_i(N) p^i (1 − p)^{n−i}    (4)

Rel(N; p) = 1 − Σ_{i=0}^{n} C_i(N) p^{n−i} (1 − p)^i.    (5)
In general, computing Rel(N; p) is #P-complete [32]. Still, there are particular classes of two-terminal networks for which this task becomes easier, including the class of compositions of series-and-parallel.

Theorem 1 ([16]). Let m be a strictly positive integer and let u = (u_0, . . . , u_{m−1}) ∈ {0, 1}^m. Then

Rel(C^u; p) = (Rel(C^(u_0)) ∘ · · · ∘ Rel(C^(u_{m−1})))(p),    (6)

where Rel(C^(0); p) = p^2 and Rel(C^(1); p) = 1 − (1 − p)^2.

As for the reliability polynomial of the dual of a network, it was shown in [25] that

Rel(N^⊥; p) = 1 − Rel(N; 1 − p).    (7)
3.2 Partial Orders for Compositions
Usually, when comparing two networks we say that N1 is more reliable than N2 if

∀ 0 ≤ p ≤ 1: Rel(N2; p) ≤ Rel(N1; p).    (8)

Using this convention we say that u and v are comparable (u ≤ v or v ≤ u) if and only if for any p ∈ [0, 1] we have either Rel(C^u; p) ≤ Rel(C^v; p) or Rel(C^u; p) ≥ Rel(C^v; p). We will abuse the notation and omit the parameter p, therefore

u ≤ v ⇔ Rel(C^u) ≤ Rel(C^v).    (9)

The orders that we will define here (⪯1, ⪯2, and ⪯) are borrowed from [14], where they were used in the same context.

Definition 2. Let u and v be two binary vectors of size m. We define u ⪯1 v if and only if Supp(u) ⊆ Supp(v). Equality holds only for u = v.

Definition 3. Let l < m be two strictly positive integers and u, v be two binary vectors of size m such that |u| = |v| = l. Let Supp(u) = {s1, . . . , sl} and Supp(v) = {t1, . . . , tl}, with si < si+1 and ti < ti+1 for all 1 ≤ i ≤ l − 1. We define u ⪯2 v if and only if si ≤ ti for all 1 ≤ i ≤ l.

These two orders can be combined in a natural way to define

⪯ = ⪯1 when comparing vectors with different Hamming weights, and ⪯ = ⪯2 when comparing vectors with identical Hamming weights.    (10)

Theorem 2 ([14]). Let u and v be two binary vectors of size m. Then we have u ⪯ v ⇒ u ≤ v.

3.3 An Illustrative Example
We present simulation results for the particular case when m = 4 (n = 2^m = 16). The reliability polynomials for all the possible 16 compositions have been computed using (6), and are reported in Table 1. In Fig. 1, we show the associated poset, and plots of all the Rel(C^u; p). This poset has the following properties:
– it admits an infimum (0, 0, 0, 0) and a supremum (1, 1, 1, 1);
– the poset can be partitioned into 11 distinct ranks (see graded posets for details);
– each level of the poset is an antichain, i.e., the elements of this set are not comparable;
– the number of elements in each level of this poset is L16 = 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1;
Table 1. Reliability polynomials for C^u

w, l    N              Rel(N; p)
16, 1   C^(1,1,1,1)    1 − (1 − p)^16
8, 2    C^(0,1,1,1)    (1 − (1 − p)^8)^2
        C^(1,0,1,1)    1 − (1 − (1 − (1 − p)^4)^2)^2
        C^(1,1,0,1)    1 − (1 − (1 − (1 − p)^2)^2)^4
        C^(1,1,1,0)    1 − (1 − p^2)^8
4, 4    C^(0,0,1,1)    (1 − (1 − p)^4)^4
        C^(0,1,0,1)    (1 − (1 − (1 − (1 − p)^2)^2)^2)^2
        C^(0,1,1,0)    (1 − (1 − p^2)^4)^2
        C^(1,0,0,1)    1 − (1 − (1 − (1 − p)^2)^4)^2
        C^(1,0,1,0)    1 − (1 − (1 − (1 − p^2)^2)^2)^2
        C^(1,1,0,0)    1 − (1 − p^4)^4
2, 8    C^(0,0,0,1)    (1 − (1 − p)^2)^8
        C^(0,0,1,0)    (1 − (1 − p^2)^2)^4
        C^(0,1,0,0)    (1 − (1 − p^4)^2)^2
        C^(1,0,0,0)    1 − (1 − p^8)^2
1, 16   C^(0,0,0,0)    p^16
– L16 is symmetric (due to duality);
– the longest chain of this poset has cardinality 11, which equals the number of levels in the poset;
– the middle of the poset is the antichain {(0, 1, 1, 0), (1, 0, 0, 1)}.

The simulations show that the reliability polynomials for compositions are totally ordered for n = 16. Indeed, one can see that there are no crossover points between any pairs of reliability polynomials (see Fig. 2). This implies that u ≤ v or v ≤ u for any u, v ∈ {0, 1}^4. If we associate with any u = (u0, u1, u2, u3) its corresponding integer u = Σ_{i=0}^{3} u_i 2^i, we obtain the following totally ordered set: {0, 1, 2, 4, 8, 3, 5, 6, 9, 10, 12, 7, 11, 13, 14, 15}. These results are a clear improvement over the partial order given by ⪯, which for n = 16 (m = 4) generates several non-comparable elements, e.g., 3 and 4, 5 and 8, etc. Through simulations this behavior was observed for larger values of n. Starting from m = 5, {Rel(C^u; p), ≤} no longer forms a totally ordered set. That is why other analytical properties of the reliability polynomials will be considered in the next section for defining new and finer orderings on such sets.
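The partial order ⪯ of Definitions 2 and 3 is straightforward to implement on binary vectors; the following sketch (illustrative, not from the paper) also exhibits the incomparable middle antichain {(0, 1, 1, 0), (1, 0, 0, 1)} mentioned above.

```python
def supp(u):
    """Support of a binary vector: positions of its nonzero entries."""
    return {i for i, b in enumerate(u) if b}

def preceq(u, v):
    """The combined order of Definitions 2 and 3 (written ⪯ in the text)."""
    su, sv = sorted(supp(u)), sorted(supp(v))
    if len(su) != len(sv):
        # Different Hamming weights: use ⪯1, i.e. support inclusion.
        return supp(u) <= supp(v)
    # Identical Hamming weights: use ⪯2, i.e. position-wise comparison.
    return all(s <= t for s, t in zip(su, sv))
```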
4 Other Posets for Reliability

4.1 Threshold Points
(1, 1, 1, 1)
(0, 1, 1, 1)
(1, 0, 1, 1)
(1, 1, 0, 1)
(1, 1, 1, 0)
(0, 0, 1, 1)
(0, 1, 0, 1)
(0, 1, 1, 0)
(1, 0, 0, 1)
(1, 0, 1, 0)
(1, 1, 0, 0)
(0, 0, 0, 1)
(0, 0, 1, 0)
(0, 1, 0, 0)
(1, 0, 0, 0)
(0, 0, 0, 0)
Fig. 1. The poset of the reliabilities of all compositions for m = 4.
While the first two threshold points exist and are unique, the third one exists and is unique except for a few particular reliability polynomials.
Lemma 1. Let N be a two-terminal network of width w ≥ 1 and length l ≥ 1.
1. If l > 1, then there is a δ > 0 such that for all p with 0 < p < δ, Rel(N; p) < p.
2. If w > 1, then there is a δ > 0 such that for all p with 1 − δ < p < 1, Rel(N; p) > p.

Proof.
1. In general, Rel(N; p) ∼ N_l(N) p^l as p → 0. In particular, there exists a δ1 > 0 such that 0 < p < δ1 ⇒ |Rel(N; p)/(N_l(N) p^l) − 1| < 1. In the case that l > 1, we also have that N_l(N) p^l / p = N_l(N) p^{l−1} → 0 as p → 0. In particular, there exists a δ2 > 0 such that 0 < p < δ2 ⇒ |N_l(N) p^l / p| < 1/2. Setting δ = min(δ1, δ2), we have that 0 < p < δ implies Rel(N; p)/p = [Rel(N; p)/(N_l(N) p^l)] · [N_l(N) p^l / p] < (1 + 1) · (1/2) = 1, hence 0 < p < δ ⇒ Rel(N; p) < p, as required.
2. In general, 1 − Rel(N; 1 − p) ∼ C_w(N) p^w as p → 0. In particular, there exists a δ1 > 0 such that 0 < p < δ1 ⇒ |(1 − Rel(N; 1 − p))/(C_w(N) p^w) − 1| < 1. In the case that w > 1, we also have that C_w(N) p^w / p = C_w(N) p^{w−1} → 0 as p → 0. In particular, there exists a δ2 > 0 such that 0 < p < δ2 ⇒ |C_w(N) p^w / p| < 1/2. Setting δ = min(δ1, δ2), we have that 0 < p < δ implies (1 − Rel(N; 1 − p))/p = [(1 − Rel(N; 1 − p))/(C_w(N) p^w)] · [C_w(N) p^w / p] < (1 + 1) · (1/2) = 1, hence 0 < p < δ ⇒ 1 − Rel(N; 1 − p) < p. Replacing p by 1 − p yields 1 − δ < p < 1 ⇒ Rel(N; p) > p, as required.
Theorem 3. Let N be a two-terminal network of width w > 1 and length l > 1. Then there is exactly one p_1 ∈ (0, 1) such that Rel(N; p_1) = p_1.

Proof. By Lemma 1, there exist p′ and p′′ such that 0 < p′ < p′′ < 1, Rel(N; p′) − p′ < 0 and Rel(N; p′′) − p′′ > 0. By the Intermediate Value Theorem, it follows that there exists p_1 ∈ (p′, p′′) such that Rel(N; p_1) − p_1 = 0, as required. The uniqueness of p_1 follows from Theorem 1 in [25].

Now, let us see how these different threshold points influence the point-wise order of the reliability polynomials.

Proposition 3. Let u, v ∈ {0, 1}^m be such that u ≤ v. Then we have
– Rel(C^u; 1/2) ≤ Rel(C^v; 1/2);
– p_0(u) ≥ p_0(v);
– p_1(u) ≥ p_1(v).

A direct consequence of Proposition 3 is that when u ≤ v and Rel(C^v; 1/2) ≤ 1/2 we have

d((1/2, 1/2), (p_0(u), Rel(C^u; 1/2))) ≥ d((1/2, 1/2), (p_0(v), Rel(C^v; 1/2))),   (11)

where d(·, ·) is the Euclidean distance. Due to duality we also have
On Posets for Reliability: How Fine Can They Be?
123
Proposition 4. Let u ∈ {0, 1}^m, and let ū denote its dual (complement). Then
– d((1/2, 1/2), (p_0(u), Rel(C^u; 1/2))) = d((1/2, 1/2), (p_0(ū), Rel(C^ū; 1/2)));
– p_1(u) + p_1(ū) = 1.

Proof. Use Eq. (7).

In Table 2 we have computed Rel(C^u; 1/2), p_0(u), d((1/2, 1/2), (p_0(u), Rel(C^u; 1/2))), as well as p_1(u) ∈ (0, 1), for all u which satisfy Rel(C^u; 1/2) ≤ 1/2. Additionally, all these results are also plotted in Fig. 2.

Table 2. Threshold points for Rel(C^u; p) and distance to the middle point (1/2, 1/2).

u   u (tuple)      Rel(C^u; 1/2)   p_0(u)   d((1/2, 1/2), (p_0(u), Rel(C^u; 1/2)))   p_1(u)
0   (0, 0, 0, 0)   0.000015        0.958    0.678                                    –
1   (1, 0, 0, 0)   0.008           0.858    0.608                                    0.9823
2   (0, 1, 0, 0)   0.015           0.823    0.583                                    0.9649
4   (0, 0, 1, 0)   0.037           0.775    0.539                                    0.9311
8   (0, 0, 0, 1)   0.100           0.712    0.453                                    0.8670
3   (1, 1, 0, 0)   0.228           0.632    0.302                                    0.7244
5   (1, 0, 1, 0)   0.346           0.568    0.168                                    0.6180
6   (0, 1, 1, 0)   0.467           0.514    0.036                                    0.5248
Fig. 2. Reliability polynomials for C^u (u ∈ {0,1}^4) and their associated threshold points: (a) Rel(C^u) with Rel(C^u; 1/2) and p_0(u); (b) Rel(C^u) with p_1(u).
Remark 1. Notice that both p_0 and p_1 define the same totally ordered set for the reliability polynomials of C^u, while Rel(C^u; 1/2) defines exactly the opposite totally ordered set. In a narrow sense this might seem redundant and of little value, as they both lead to the same ordering. However, each of these two threshold points reveals specific information about the reliability polynomials. Lastly, these threshold points are valid for any minimal network, except for the self-dual ones, where Rel(C^u; 1/2) = 1/2 and hence the three threshold points are all equal to 1/2.

4.2 Series-and-Parallel
A more general subclass of minimal networks is represented by the family of minimal two-terminal series-and-parallel networks; compositions of C^0 and C^1 are only particular examples from this family. Two-terminal series-and-parallel networks were studied in [9, 10, 17, 22, 29], and efficient algorithms for computing their reliability polynomials are known [21]. We have used such an algorithm to compute the reliability polynomials of all minimal two-terminal series-and-parallel networks of w = 2, l = 8; w = 4, l = 4; and w = 8, l = 2. These are plotted in Fig. 3. From this figure it can be seen that any two-terminal series-and-parallel network N of w = 2, l = 8 satisfies

Rel(C^(1,0,0,0)) ≤ Rel(N) ≤ Rel(C^(0,0,0,1))   (12)
and any series-and-parallel network N of w = 4, l = 4 satisfies

Rel(C^(1,1,0,0)) ≤ Rel(N) ≤ Rel(C^(0,0,1,1)).   (13)
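The bounds above can be checked numerically. The sketch below is our reconstruction: the composition convention is assumed (bits applied from the last position to the first), starting from a single device and applying a series step (bit 0: R → R²) or a parallel step (bit 1: R → 1 − (1 − R)²). Under this convention the sketch reproduces the values of Table 2, e.g. Rel(C^(0,0,0,0); 1/2) = (1/2)^16 ≈ 0.000015 and Rel(C^(0,0,0,1); 1/2) ≈ 0.100.

```python
# Sketch (assumed convention, ours): Rel(C^u; p) for a composition
# u in {0,1}^m built from 2^m identical devices of working probability p.

def rel_composition(u, p):
    r = p
    for bit in reversed(u):          # bits applied from last to first
        if bit == 0:
            r = r * r                # series composition: R -> R^2
        else:
            r = 1.0 - (1.0 - r) ** 2 # parallel composition: R -> 1-(1-R)^2
    return r
```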
4.3 Consecutive Networks
To complement the results reported above, we present additional simulations for a quite different type of network, known as consecutive-k-out-of-n:F [20] (linked to Harary graphs). These networks are neither planar nor minimal; the reason for considering them is that there are proofs of their optimality [13], which imply that consecutive-k-out-of-n:F networks could be expected to achieve very high reliability at low redundancy factors [5]. A consecutive-k-out-of-n:F network has devices placed in a sequence (e.g., in a row or in a circle), and fails if and only if at least k consecutive devices fail. This type of network was formalized by Kontoleon in 1980 [20], while the scheme itself was already in use for oil pipeline systems and, apparently even earlier, for rows of street lights.

For the simulations performed here we have considered consecutive-k-out-of-n:F networks made of n identical and statistically independent (i.i.d.) devices, where all devices are characterized by a constant probability of failure q, the probability of functioning correctly being p = 1 − q. It was startling to realize [12] that the reliability of such consecutive-k-out-of-n:F networks had been calculated exactly (without intending to) by de Moivre: Problem LXXIV on pages 254−259 of [23] computes the probability P_{n,k} of having a run of k successes in n trials.
Fig. 3. Reliability polynomials for minimal two-terminal series-and-parallel networks of n = 16.
Proposition 5 ([23]). By introducing the notation

β_{n,k} = Σ_{j=0}^{⌊n/(k+1)⌋} (−1)^j C(n − jk, j) (p q^k)^j,   (14)

where C(·, ·) denotes the binomial coefficient, the probability P_{n,k} of a run of length k in n trials is P_{n,k} = 1 − β_{n,k} + q^k β_{n−k,k}.

From Proposition 5 it follows that the reliability of a consecutive-k-out-of-n:F network is

1 − P_{n,k} = β_{n,k} − q^k β_{n−k,k}.   (15)

We have based our simulations on Eqs. (14) and (15) by setting n to a fixed value and varying k from 1 to n − 1. Obviously, when k = 1 the resulting network is nothing else but a series network. As k is increased, the consecutive-k-out-of-n:F networks grow more connections, hence they approach a parallel network more and more. Finally, when k = n − 1 each node is connected to all the other n − 1 nodes, which means that a consecutive-(n − 1)-out-of-n:F network should, reliability-wise, behave like the all-parallel network. Simulation results for n = 16, 32, 64 are presented in Fig. 4 beside simulations for the corresponding compositions of series-and-parallel. It can be seen that compositions of series-and-parallel are reasonably uniformly distributed in-between the all-series (red) and all-parallel (blue) networks, while consecutive-k-out-of-n:F networks are shifted towards the reliability of the all-parallel network. It is also interesting to remark that while compositions are not totally ordered (except for n ≤ 16), consecutive-k-out-of-n:F networks are always ordered.
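Eqs. (14) and (15) translate directly into code. The sketch below (a minimal implementation of ours, with function names of our choosing) computes the reliability of a consecutive-k-out-of-n:F network of i.i.d. devices:

```python
# Sketch of Eqs. (14)-(15): reliability of a consecutive-k-out-of-n:F
# network with i.i.d. devices (working probability p, failure q = 1 - p).

import math

def beta(n, k, p):
    """de Moivre's beta_{n,k} from Eq. (14)."""
    q = 1.0 - p
    return sum((-1) ** j * math.comb(n - j * k, j) * (p * q ** k) ** j
               for j in range(n // (k + 1) + 1))

def rel_consecutive(n, k, p):
    """Eq. (15): 1 - P_{n,k} = beta_{n,k} - q^k * beta_{n-k,k}."""
    q = 1.0 - p
    return beta(n, k, p) - q ** k * beta(n - k, k, p)
```

As sanity checks, k = 1 recovers the all-series reliability p^n, and k = n recovers the all-parallel reliability 1 − q^n.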
[Fig. 4 plots, two columns (left: compositions of C^(0) and C^(1); right: consecutive-k-out-of-n:F), one row per n = 16, 32, 64; both axes range from 0 to 1.]
Fig. 4. Reliability of consecutive and composition networks for n = 16, 32, 64.
5 Conclusions
This article has focused on posets defined over the set of reliability polynomials of several particular classes of two-terminal networks, e.g., compositions of series-and-parallel as well as consecutive networks. Our first contribution enhances previous results obtained for compositions of series-and-parallel: we were able to identify three different threshold points, each leading to a different ordering of the set of reliability polynomials, one of which is finer than the existing orderings, as it is total. Our second contribution extends such results to consecutive-k-out-of-n:F networks. Overall, the results we have obtained confirm well-known theorems regarding uniformly most reliable two-terminal networks. Lastly, departing from the customary view of considering networks (only) for communication, we have investigated possible poset structures for two-terminal networks intended for Boolean computations. For identifying optimal two-terminal networks for Boolean computations we have (once again) relied on the properties of reliability polynomials for compositions of series-and-parallel. We expect that results like the ones presented here should prove useful both for understanding the reliability of larger networks for Boolean computations and as enabling design concepts for Boolean gates/circuits.

Acknowledgement. Research supported by the EU through the European Research Development Fund under the Competitiveness Operational Program (BioCellNanoART = Novel Bio-inspired Cellular Nano-architectures, POC-A1-A1.1.4-E-2015 nr. 30/01.09.2016).
References

1. Ath, Y., Sobel, M.: Some conjectured uniformly optimal reliable networks. Prob. Eng. Inf. Sci. 14(3), 375–383 (2000)
2. Ball, M.O., Colbourn, C.J., Provan, J.S.: Network reliability. In: Handbook of Operations Research: Network Models, Chap. 11, pp. 673–762. Elsevier, Amsterdam (1995)
3. Bardet, M., Chaulet, J., Drăgoi, V., Otmani, A., Tillich, J.P.: Cryptanalysis of the McEliece public key cryptosystem based on polar codes. In: Proceedings of International Workshop on Post-Quantum Cryptography (PQCrypto), Fukuoka, Japan, pp. 118–143 (2016)
4. Bardet, M., Drăgoi, V., Otmani, A., Tillich, J.P.: Algebraic properties of polar codes from a new polynomial formalism. In: Proceedings of IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, pp. 230–234 (2016)
5. Beiu, V., Dăuş, L.: Deciphering the reliability scheme of the neurons one ion channel at a time. In: Proceedings of International Conference on Bioinspired Information and Communications Technologies (BICT), Boston, MA, pp. 182–187 (2014)
6. Bertrand, H., Goff, O., Graves, C., Sun, M.: On uniformly most reliable two-terminal graphs. Networks 72(2), 200–216 (2017)
7. Boesch, F.T., Li, X., Suffel, C.: On the existence of uniformly optimally reliable networks. Networks 21(2), 181–194 (1991)
8. Brown, J.I., Cox, D.: Nonexistence of optimal graphs for all terminal reliability. Networks 63(2), 146–153 (2014)
9. Brylawski, T.H.: A combinatorial model for series-parallel networks. Trans. Am. Math. Soc. 154, 1–22 (1971)
10. Carlitz, L., Riordan, J.: The number of labeled two-terminal series-parallel networks. Duke Math. J. 23(3), 435–445 (1956)
11. Colbourn, C.J.: The Combinatorics of Network Reliability. Oxford University Press, New York (1987)
12. Dăuş, L., Beiu, V.: Lower and upper reliability bounds for consecutive-k-out-of-n:F systems. IEEE Trans. Reliab. 64(3), 1128–1135 (2015)
13. Deng, H., Chen, J., Li, Q., Li, R., Gao, Q.: On the construction of most reliable networks. Discrete Appl. Math. 140(1–3), 19–33 (2004)
14. Drăgoi, V., Cowell, S.R., Beiu, V.: Ordering series and parallel compositions. In: Proceedings of IEEE International Conference on Nanotechnology (IEEE-NANO), Cork, Ireland, pp. 1–4 (2018)
15. Drăgoi, V., Cowell, S.R., Beiu, V., Hoară, S., Gaşpar, P.: How reliable are compositions of series and parallel networks compared with hammocks? Int. J. Comput. Commun. Control 13(5), 772–791 (2018)
16. Drăgoi, V., Cowell, S.R., Hoară, S., Gaşpar, P., Beiu, V.: Can series and parallel compositions improve on hammocks? In: Proceedings of International Conference on Computers Communications and Control (ICCCC), Oradea, Romania, pp. 124–130 (2018)
17. Duffin, R.J.: Topology of series-parallel networks. J. Math. Anal. Appl. 10(2), 303–318 (1965)
18. Gordon, D.M., Miller, V.S., Ostapenko, P.: Optimal hash functions for approximate matches on the n-cube. IEEE Trans. Inf. Theory 56(3), 984–991 (2010)
19. He, G., Belfiore, J., Land, I., Yang, G., Liu, X., Chen, Y., Li, R., Wang, J., Ge, Y., Zhang, R., Tong, W.: β-expansion: a theoretical framework for fast and recursive construction of polar codes. In: Proceedings of IEEE Global Communications Conference (GLOBECOM), Singapore, art. 8254146, pp. 1–6 (2017)
20. Kontoleon, J.M.: Reliability determination of a r-successive-out-of-n:F system. IEEE Trans. Reliab. R-29(5), 437 (1980)
21. Lee, C.Y.: Analysis of switching networks. Bell Syst. Tech. J. 34(6), 1287–1315 (1955)
22. Lomnicki, Z.A.: Two-terminal series-parallel networks. Adv. Appl. Prob. 4(1), 109–150 (1972)
23. de Moivre, A.: The Doctrine of Chances, 1st edn., London (1718)
24. Mondelli, M., Hassani, S.H., Urbanke, R.: Construction of polar codes with sublinear complexity. In: Proceedings of IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, pp. 1853–1857 (2017)
25. Moore, E.F., Shannon, C.E.: Reliable circuits using less reliable relays - Part I. J. Franklin Inst. 262(3), 191–208 (1956)
26. Moore, E.F., Shannon, C.E.: Reliable circuits using less reliable relays - Part II. J. Franklin Inst. 262(4), 281–297 (1956)
27. Mori, R., Tanaka, T.: Performance and construction of polar codes on symmetric binary-input memoryless channels. In: Proceedings of IEEE International Symposium on Information Theory (ISIT), Seoul, South Korea, pp. 1496–1500 (2009)
28. Myrvold, W., Cheung, K.H., Page, L.B., Perry, J.E.: Uniformly-most reliable networks do not always exist. Networks 21(4), 417–419 (1991)
29. Riordan, J., Shannon, C.E.: The number of two-terminal series-parallel networks. J. Math. Phys. 21(1–4), 83–93 (1942)
30. Schürch, C.: A partial order for the synthesized channels of a polar code. In: Proceedings of IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, pp. 220–224 (2016)
31. Stanley, R.P.: Enumerative Combinatorics. Cambridge University Press, Cambridge (2012)
32. Valiant, L.: The complexity of enumeration and reliability problems. SIAM J. Comput. 8(3), 410–421 (1979)
33. von Neumann, J.: Probabilistic logics and the synthesis of reliable organisms from unreliable components. Automata Stud. 34, 43–98 (1956)
34. Wang, G.: A proof of Boesch's conjecture. Networks 24(5), 277–284 (1994)
Comparison of Neural Network Models Applied to Human Recognition

Daniela Sánchez, Patricia Melin, and Oscar Castillo

Tijuana Institute of Technology, Tijuana, Mexico
[email protected], {pmelin,ocastillo}@tectijuana.mx
Abstract. In this paper a comparison among conventional Artificial Neural Networks (ANNs), Ensemble Neural Networks (ENNs) and Modular Granular Neural Networks (MGNNs) is performed. The comparison uses 10-fold cross-validation with from 1 to 12 images for the training phase. Some parameters of the neural networks are randomly established, such as the number of sub-modules (for ensemble and modular granular neural networks), the number of neurons in the two hidden layers of each sub-module, and the learning algorithm. A benchmark database is used to observe the performance of the neural networks.
1 Introduction

Nowadays, artificial intelligence techniques have many areas of application, among the most important of which is human recognition [6, 8, 14]. Biometric measurements are important for information security. Using physiological characteristics of humans, such as the face [5, 24], fingerprint [23], iris [15], and hand geometry [3], among others, biometric measures allow each individual to be identified and distinguished from others [1], providing greater control over access to areas or information. Human recognition based on biometric techniques has achieved more precise results than traditional techniques. It is important to remember the disadvantages of traditional techniques such as passwords or access cards: these kinds of identification means are easily forgotten, duplicated, or even stolen. One of the main intelligent techniques responsible for performing this task is artificial neural networks. At present, a great variety of artificial neural networks have arisen with the purpose of learning more data, offering precision, and improving performance [13]. This paper is organized as follows: in Sect. 2, the basic concepts needed to understand this research work are described. In Sect. 3, the general architecture of each kind of neural network is presented. Experimental results are shown in Sect. 4, and in Sect. 5 the conclusions of this comparison are presented.
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 130–142, 2021. https://doi.org/10.1007/978-3-030-51992-6_11
2 Basic Concepts

In this section, a brief overview of the basic concepts needed to understand this research work is presented.

2.1 Artificial Neural Networks
An artificial neural network (ANN) is inspired by biological neural networks and can be defined as an information-processing system that tries to mimic certain characteristics of the brain. One of the most important characteristics of an ANN is its ability to derive meaning from complicated information. ANNs are highly recommended for extracting patterns and finding trends that would be too complex for other techniques, or even humans, to notice. Once an ANN has been given information in its training phase, it can be considered an expert on that information [9, 19].

2.2 Ensemble Neural Networks
An ensemble neural network (ENN) is a kind of multiple ANN. This kind of neural network uses more than one conventional ANN to solve a problem. Each individual artificial neural network is trained for the same task, and the responses obtained from each neural network are combined to obtain a final result [12, 17].

2.3 Modular Neural Network
The concept of modularity is an extension of the principle of divide and conquer [2]. A modular neural network (MNN) is composed of several simple neural networks: a task or problem is divided into modules, and each neural network represents one of those modules. Each module can have its own architecture and learns specific information, making it an expert on that information. Because each module handles a different subtask, an integrating unit is needed to combine the responses of the modules. MNNs have been shown to achieve better results than conventional artificial neural networks, as dividing a task into subtasks allows better learning [11].

2.4 Granular Computing
Granular computing (GrC) was proposed by Zadeh [29]. This area is based on the theory of subsets or classes and involves human reasoning, problem solving, and different research fields such as data mining and machine learning. Granulation consists of dividing a whole into sub-parts named granules [28]. Each granule contains information grouped according to some interpretation, for example similarity or proximity. GrC is a useful tool for solving complex problems, as it avoids errors and saves problem-solving time [4, 18].
3 General Descriptions of Architectures and Experiments

The proposed comparison consists of observing the performance of 3 different types of neural networks applied to human recognition using the iris as a biometric measure. In this section, the experiments and the neural network architectures are described.

3.1 Description of the Experiments
For the experiments, 10-fold cross-validation is performed, using from 1 to 12 images for the training phase. Figure 1 shows an example of the 10-fold cross-validation when 2 images are used for the training phase.
Fig. 1. An example of the 10-fold cross-validation
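The fold construction can be sketched as follows (an assumption on our part: the paper does not state how the training images are chosen in each fold, so the sketch draws them at random):

```python
# Sketch (assumed procedure): each person has 14 iris images; in each of
# the 10 folds, k_train images are drawn for training and the remaining
# 14 - k_train images are used for testing.

import random

def make_folds(k_train, n_images=14, n_folds=10, seed=0):
    rng = random.Random(seed)
    folds = []
    for _ in range(n_folds):
        train = sorted(rng.sample(range(1, n_images + 1), k_train))
        test = [i for i in range(1, n_images + 1) if i not in train]
        folds.append((train, test))
    return folds
```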
3.2 Description of the Neural Networks
The architecture of each type of neural network is described below. It is important to mention that for each type of neural network, the database of images is divided into a set of images for the training phase and a set of images for the testing phase. For the conventional artificial neural networks, a single ANN learns the whole set of training images. For this kind of neural network, the number of neurons in the two hidden layers and the learning algorithm are randomly determined in each training. Figure 2 shows an example of the ANN architecture.
Fig. 2. The architecture of the ANNs
For the ensemble neural networks, each module learns the whole set of training images. For this kind of neural network, the number of modules (ANNs), the number of neurons in the two hidden layers of each module, and the learning algorithm are randomly determined in each training. In Fig. 3 an example of the ENN architecture is shown.
Fig. 3. The architecture of the ENNs
For this kind of neural network, the recognition error can be expressed as

error = (Σ_{b=1}^{D} Y_b) / D,   (1)

where Y_b is 0 if the artificial neural network gave a correct result and 1 otherwise, and D is the total number of images used for the testing phase. For the modular granular neural networks, the number of sub-granules (modules), the number of persons learned by each module, the number of neurons in the two hidden layers of each module, and the learning algorithm are randomly determined in each training. Figure 4 shows an example of the MGNN architecture.
Fig. 4. The architecture of the MGNNs
For this kind of neural network, the recognition error can be expressed as

error = Σ_{a=1}^{m} ( (Σ_{b=1}^{D_m} Y_b) / D_m ),   (2)
where m is the total number of modules, Y_b is 0 if the module gave a correct result and 1 otherwise, and D_m is the total number of images used for testing in the corresponding module. The randomly established parameters (the number of sub-modules of the ENNs and MGNNs, the number of neurons in the two hidden layers of each sub-module, and the learning algorithm) have the ranges shown in Table 1. Only two hidden layers are used in each neural network because the best results have been achieved with this configuration in other works [20, 21]. The learning algorithms used are: gradient descent with adaptive learning rate and momentum (GDX), gradient descent with adaptive learning rate (GDA), and scaled conjugate gradient (SCG). These learning algorithms have already been used in previous works [20–22].

Table 1. Parameters for NN architectures

Parameter            Value
Sub-modules (m)      1–10
Neurons              40–300
Learning algorithm   trainscg, traingdx, traingda
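The random choices of Table 1 can be sketched as follows (function and variable names are ours; the paper only specifies the ranges):

```python
# Sketch (our naming): draw a random architecture within the Table 1 ranges.

import random

def random_architecture(seed=0):
    rng = random.Random(seed)
    m = rng.randint(1, 10)                    # number of sub-modules
    layers = [(rng.randint(40, 300),          # neurons, hidden layer 1
               rng.randint(40, 300))          # neurons, hidden layer 2
              for _ in range(m)]
    algorithm = rng.choice(["trainscg", "traingdx", "traingda"])
    return m, layers, algorithm
```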
3.3 Iris Database
The database of human irises is from the Institute of Automation of the Chinese Academy of Sciences (CASIA) [7]. The first 77 persons were used; each person has 14 images in JPEG format (7 for each eye). Figure 5 shows examples from the CASIA database.
Fig. 5. Examples of the CASIA database
The pre-processing of each image of this database was developed by Masek and Kovesi [10]. With this pre-processing, the coordinates and radii of the iris and pupil are obtained in order to crop the iris, and the image is then resized to 21 × 21.
4 Experimental Results

To compare the performance of the neural networks, 10-fold cross-validation was performed for each type (ANNs, ENNs and MGNNs) using from 1 to 12 images for the training phase. The average results of each 10-fold cross-validation are shown in this section.

4.1 Artificial Neural Networks Results
The best, average and worst recognition rates of the 10-fold cross-validation for each number of images used for the training phase (from 1 to 12) using conventional artificial neural networks are shown in Table 2. For each artificial neural network, the number of neurons in the two hidden layers and the learning algorithm were randomly established. The best average is obtained when 10 images are used for the training phase.

Table 2. Results of ANNs (10-fold cross-validation)

Number of images (training phase)   Best     Average   Worst
1                                   44.46%   31.02%     1.30%
2                                   69.37%   48.02%    35.06%
3                                   78.04%   55.63%    34.59%
4                                   83.90%   66.06%    21.82%
5                                   89.61%   78.21%    69.99%
6                                   87.99%   76.62%    19.97%
7                                   90.54%   85.81%    78.85%
8                                   93.29%   88.31%    85.93%
9                                   92.73%   85.43%    72.73%
10                                  92.21%   88.38%    83.44%
11                                  92.21%   85.11%    63.20%
12                                  95.45%   80.52%    29.22%
The 10-fold cross-validation results for each number of images are shown in Fig. 6. The architecture of the best result is shown in Table 3.
Fig. 6. 10-fold cross-validation results of ANNs

Table 3. The best ANN architecture

Images for training:  1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13 and 14
Learning algorithm:   trainscg
Neurons:              117, 236
Persons per module:   1 to 77
Rec. rate:            95.45%

4.2 Ensemble Neural Networks Results
The best, average and worst recognition rates of the 10-fold cross-validation for each number of images used for the training phase (from 1 to 12) using ensemble neural networks are shown in Table 4. For each ensemble neural network, the number of modules (ANNs), the number of neurons in the two hidden layers of each module, and the learning algorithm were randomly established. The best average is obtained when 12 images are used for the training phase.

Table 4. Results of ENNs (10-fold cross-validation)

Number of images (training phase)   Best     Average   Worst
1                                   44.56%   37.58%    32.17%
2                                   71.65%   49.42%    37.99%
3                                   80.17%   57.97%    35.66%
4                                   83.77%   72.43%    42.99%
5                                   84.27%   72.63%     1.30%
6                                   89.45%   86.28%    80.68%
7                                   89.61%   87.64%    83.30%
8                                   91.77%   88.55%    82.68%
9                                   94.03%   89.90%    83.64%
10                                  92.21%   89.77%    87.01%
11                                  94.37%   90.87%    87.88%
12                                  96.10%   91.49%    87.01%
The 10-fold cross-validation results for each number of images are shown in Fig. 7. The architecture of the best result is shown in Table 5.

Fig. 7. 10-fold cross-validation results of ENNs

Table 5. The best ENN architecture

Images for training:   1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13 and 14
Learning algorithm:    trainscg
Neurons (per module):  203,59; 63,272; 210,278; 258,35; 182,83; 16,237; 76,92; 205,45; 269,96; 88,102
Persons per module:    1 to 77
Rec. rate:             96.10%

4.3 Modular Neural Networks Results
The best, average and worst recognition rates of the 10-fold cross-validation for each number of images used for the training phase (from 1 to 12) using modular granular neural networks are shown in Table 6. For each modular granular neural network, the number of sub-granules (modules), the number of neurons in the two hidden layers of each module, the number of persons learned by each module, and the learning algorithm were randomly established. The best average is obtained when 11 images are used for the training phase.
Table 6. Results of MGNNs (10-fold cross-validation)

Number of images (training phase)   Best     Average   Worst
1                                   65.63%   55.57%    45.45%
2                                   77.27%   62.36%    48.70%
3                                   87.49%   73.62%    52.66%
4                                   91.56%   78.65%    39.87%
5                                   92.64%   86.10%    77.78%
6                                   94.64%   82.76%    15.91%
7                                   93.88%   91.91%    87.20%
8                                   97.62%   92.40%    86.15%
9                                   95.84%   86.94%    37.92%
10                                  95.78%   91.75%    85.06%
11                                  98.27%   93.25%    89.18%
12                                  98.05%   88.31%    33.12%
The 10-fold cross-validation results for each number of images are presented in Fig. 8. The best architecture is shown in Table 7.
Fig. 8. 10-fold cross-validation results of MGNNs
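The "Persons per module" column of Table 7 suggests contiguous random granules over the 77 persons. A split of this kind can be sketched as follows (our construction; the paper does not give the exact procedure used to draw the granules):

```python
# Sketch (assumed procedure): split persons 1..77 into m contiguous
# random granules, one granule per module.

import random

def random_granules(n_persons=77, seed=0):
    rng = random.Random(seed)
    m = rng.randint(1, 10)                          # number of granules
    cuts = sorted(rng.sample(range(2, n_persons + 1), m - 1))
    bounds = [1] + cuts + [n_persons + 1]
    # granule i holds persons bounds[i] .. bounds[i+1] - 1
    return [(bounds[i], bounds[i + 1] - 1) for i in range(m)]
```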
Table 7. The best MGNN architecture

Images for training:  1, 2, 3, 4, 5, 6, 10, 11, 12 and 13
Learning algorithm:   traingda
Rec. rate:            98.27%

Module   Neurons     Persons per module
1        180, 139    1 to 4
2        39, 279     5 to 13
3        187, 195    14 to 26
4        78, 24      27 to 33
5        295, 182    34 to 35
6        120, 103    36 to 43
7        65, 306     44 to 53
8        308, 177    54 to 65
9        180, 296    66 to 68
10       98, 119     69 to 77

4.4 Comparison Results
The final average for each kind of neural network is shown in Table 8. As the results show, the modular granular neural networks achieved better results than the others.

Table 8. Comparison of averages

           ANNs     ENNs     MGNNs
Averages   72.43%   76.21%   81.97%
In Fig. 9, the best results previously shown for each type of neural network are compared, where a better performance of the MGNNs can be observed.
Fig. 9. Neural networks best results
In Fig. 10, the averages previously shown for each type of neural network are compared, where a better performance of the MGNNs can be observed, especially when fewer images are used for the training phase.
Fig. 10. Neural networks average results
In Fig. 11, the worst results previously shown for each type of neural network are compared.
Fig. 11. Neural networks worst results
5 Conclusions

In this paper, comparisons among conventional neural networks, ensemble neural networks and modular granular neural networks were performed, and their best, average and worst values were shown. Each type of neural network performed human recognition using the iris as a biometric measure. Different trainings were performed (10-fold cross-validation) using from 1 to 12 images for the training phase. The results show a better performance when modular granular neural networks are used to perform human recognition based on the iris biometric measure. As future work, an optimization technique will be applied to design the neural network architectures, optimizing the hidden layers, the neurons, and also the type of neural network. It is important to mention that comparisons with other benchmark databases and with convolutional neural networks will also be performed. As the optimization method, a meta-heuristic could be considered, as in [16, 25–27].
References

1. Abiyev, R., Altunkaya, K.: Personal iris recognition using neural network. Department of Computer Engineering, Near East University, Lefkosa, North Cyprus (2008)
2. Azamm, F.: Biologically inspired modular neural networks. Ph.D. thesis, Virginia Polytechnic Institute and State University, Blacksburg, Virginia (2000)
3. Bakshe, R.C., Patil, A.M.: Hand geometry as a biometric for human identification. Int. J. Sci. Res. 4(1), 2744–2748 (2015)
4. Bargiela, A., Pedrycz, W.: The roots of granular computing. In: IEEE International Conference on Granular Computing (GrC), pp. 806–809 (2006)
5. Ch'ng, S.I., Seng, K.P., Ang, L.: Modular dynamic RBF neural network for face recognition. In: 2012 IEEE Conference on Open Systems (ICOS), pp. 1–6 (2012)
6. Chen, S.H., Jakeman, A.J., Norton, J.P.: Artificial intelligence techniques: an introduction to their use for modelling environmental systems. Math. Comput. Simul. 78, 379–400 (2008)
7. Database of Human Iris. Institute of Automation of Chinese Academy of Sciences (CASIA). http://www.cbsr.ia.ac.cn/english/IrisDatabase.asp
8. Dilek, S., Çakır, H., Aydın, M.: Applications of artificial intelligence techniques to combating cyber crimes: a review. Int. J. Artif. Intell. Appl. (IJAIA) 6(1), 21–39 (2015)
9. Khan, A., Bandopadhyaya, T., Sharma, S.: Classification of stocks using self organizing map. Int. J. Soft Comput. Appl. 4, 19–24 (2009)
10. Masek, L., Kovesi, P.: MATLAB Source Code for a Biometric Identification System Based on Iris Patterns. The School of Computer Science and Software Engineering, The University of Western Australia (2003)
11. Melin, P., Castillo, O.: Hybrid Intelligent Systems for Pattern Recognition Using Soft Computing: An Evolutionary Approach for Neural Networks and Fuzzy Systems, 1st edn. Springer, Heidelberg (2005)
12. Mohamad, M., Mohd Saman, M.Y.: Comparison of diverse ensemble neural network for large data classification. Int. J. Adv. Soft Comput. 7(3), 67–83 (2015)
13. Moreno, B., Sanchez, A., Velez, J.F.: On the use of outer ear images for personal identification in security applications. In: IEEE 33rd Annual International Carnahan Conference on Security Technology, pp. 469–476 (1999)
14. Pannu, A.: Artificial intelligence and its application in different areas. Int. J. Eng. Innov. Technol. (IJEIT) 4(10), 79–84 (2015)
15. Patil, A.M., Patil, D.S., Patil, P.S.: Iris recognition using gray level co-occurrence matrix and Hausdorff dimension. Int. J. Comput. Appl. 133(8), 29–34 (2016)
16. Pintea, C.M., Matei, O., Ramadan, R.A., Pavone, M., Niazi, M., Azar, A.T.: A fuzzy approach of sensitivity for multiple colonies on ant colony optimization. In: Soft Computing Applications, SOFA 2016. Advances in Intelligent Systems and Computing, vol. 634. Springer (2016)
17. Pulido, M., Castillo, O., Melin, P.: Genetic optimization of ensemble neural networks for complex time series prediction of the Mexican exchange. Int. J. Innov. Comput. Inf. Control 9(10), 4151–4166 (2013)
18. Qian, Y., Zhang, H., Li, F., Hu, Q., Liang, J.: Set-based granular computing: a lattice model. Int. J. Approximate Reasoning 55, 834–852 (2014)
19. Ruano, M.G., Ruano, A.E.: On the use of artificial neural networks for biomedical applications. In: Soft Computing Applications. Advances in Intelligent Systems and Computing, vol. 195. Springer
20. Sánchez, D., Melin, P.: Optimization of modular granular neural networks using hierarchical genetic algorithms for human recognition using the ear biometric measure. Eng. Appl. Artif. Intell. 27, 41–56 (2014)
21. Sánchez, D., Melin, P., Castillo, O.: Optimization of modular granular neural networks using a firefly algorithm for human recognition. Eng. Appl. Artif. Intell. 64, 172–186 (2017)
22. Sánchez, D., Melin, P., Castillo, O.: A grey wolf optimizer for modular granular neural networks for human recognition, vol. 2017, pp. 4180510:1–4180510:26 (2017)
23. Sankhe, A., Pawask, A., Mohite, R., Zagade, S.: Biometric identification system: a finger geometry and palm print based approach. Int. J. Adv. Res. Electr. Electron. Instrum. Eng. 5(3), 1756–1763 (2016)
24. Solanki, K., Pittalia, P.: Review of face recognition techniques. Int. J. Comput. Appl. 133(12), 20–24 (2016)
25. Sombra, A., Valdez, F., Melin, P., Castillo, O.: A new gravitational search algorithm using fuzzy logic to parameter adaptation. In: IEEE Congress on Evolutionary Computation 2013, Cancun, Mexico, pp. 1068–1074 (2013)
26. Valdez, F., Melin, P., Castillo, O.: A survey on nature-inspired optimization algorithms with fuzzy logic for dynamic parameter adaptation. Expert Syst. Appl. 41(14), 6459–6466 (2014)
27. Valdez, F., Melin, P., Castillo, O.: Evolutionary method combining particle swarm optimization and genetic algorithms using fuzzy logic for decision making. In: FUZZ-IEEE 2009, pp. 2114–2119 (2009)
28. Yao, Y.Y.: On modeling data mining with granular computing. In: 25th International Computer Software and Applications Conference (COMPSAC), pp. 638–649 (2001)
29. Zadeh, L.A.: Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets Syst. 90(2), 111–127 (1997)
Machine Learning, NLP and Applications
A Genetic Deep Learning Model for Electrophysiological Soft Robotics

Hari Mohan Pandey1 and David Windridge2

1 Faculty of Arts and Science, Edge Hill University, St Helens Rd, Ormskirk L39 4QP, UK
[email protected]
2 Middlesex University, The Burroughs, London NW4 4BT, UK
[email protected]
Abstract. Deep learning methods are modelled by means of multiple layers of a predefined set of operations. In recent years, deep learning techniques that use unsupervised learning for training neural network layers have shown effective results in various fields. Genetic algorithms, by contrast, are search and optimization algorithms that mimic the evolutionary process. The scientific literature reveals that genetic algorithms have been successfully applied to training three-layer neural networks. In this paper, we propose a novel genetic approach to evolving deep learning networks. The performance of the proposed method is evaluated in the context of an electrophysiological soft robot-like system, the results of which demonstrate that our proposed hybrid system is capable of effectively training a deep learning network.

Keywords: Deep learning · Genetic algorithm · Evolutionary algorithm · Meta-heuristics · Neural networks
1 Introduction

Deep learning networks are composed of multiple processing layers drawn from a predefined set of operations [6]. They have significantly improved the state of the art across domains including text mining, logical and symbolic reasoning, speech processing, pattern recognition, robotics and big data. Training deep learning networks is known to be hard [5]. Many standard learning algorithms randomly initialize the weights of the neural network (NN) and apply gradient descent using backpropagation. However, this gives poor solutions for networks with 3 or more hidden layers. Hence, fine-tuning of deep network parameters is an important aspect of learning and can be treated as a problem in which the fitness (or objective) function is considered as a criterion for optimization, alongside the parameters required to construct an efficient deep learning network architecture. In recent years, meta-heuristic algorithms have been applied to the problem of Restricted Boltzmann Machine (RBM) model selection. Kuremoto et al. [7] used a Particle Swarm Optimization (PSO) algorithm to optimize the size of neural networks (the number of input (visible) and hidden neurons) and the learning rate for a 3-layer deep network of RBMs. Liu et al. [8] suggested a Genetic Algorithm (GA) based system for optimization of RBMs. Later, Levy et al. [9] proposed a hybrid approach
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 145–151, 2021. https://doi.org/10.1007/978-3-030-51992-6_12
(GA + RBM) for unsupervised feature learning, which was used for automatic painting classification. In [9], a GA was applied to evolve the weights of the RBM. Rodrigues et al. [10] employed the Cuckoo Search (CS) algorithm for fine-tuning the parameters of a Deep Belief Network (DBN); to validate its effectiveness, results were compared against other meta-heuristic algorithms such as Harmony Search (HS), Improved Harmony Search (IHS) and PSO. Rosa et al. [11] utilized a Firefly algorithm for learning the parameters of a DBN, again taking other optimization algorithms (HS, IHS and PSO) for performance comparison. Papa et al. [12] proposed an HS-based method for fine-tuning the parameters of a DBN, obtaining more accurate results than comparable methodologies. Horng [13] demonstrated the use of Artificial Bee Colony (ABC) algorithms for calibrating the parameters of DBNs; experimental results showed the superiority of the ABC and Firefly algorithms over the HS, IHS and PSO algorithms. Other authors [21, 22] have shown the utility of fuzzy controller systems for parameter tuning, but in this paper we have not included a fuzzy-based system for quantifying the DNN parameters. The aforementioned results reveal that meta-heuristic algorithms can be employed successfully for fine-tuning the parameters of deep learning networks. A comprehensive work on parameter calibration was presented in [12], though the authors suggest that better results can be achieved through Evolutionary Algorithms (EAs). GAs have been found very effective in several areas including grammar inference [14–17, 20], function optimization [18] and timetabling [19]. In this light, we propose a hybrid deep learning mechanism which exploits the merits of GAs to enhance gradient descent in backpropagation learning.
The main contributions of this paper are therefore threefold: (a) introducing a GA-based approach to deep auto-encoder learning, (b) enhancing the working of gradient descent in backpropagation and (c) filling the gap in research regarding the application of meta-heuristic algorithms to deep learning model selection. The remainder of the paper is organized as follows: Sect. 2 presents background on deep auto-encoders. Section 3 presents our methodology for applying Genetic Algorithms to deep learning networks. Computational simulation and results are shown in Sect. 4. Finally, Sect. 5 states conclusions and future plans.
2 Training of Deep Autoencoder

In this section, we set the context for the deep learning network used to create the current system. An auto-encoder is an unsupervised neural network in which input and output neurons are kept equal so as to follow a certain optimization goal: output neuron i is set to y_i = x_i, where x_i and y_i respectively represent the values of the input and output neurons. A hidden layer is introduced between the input and output layers following the convention that the number of neurons in the hidden layer is less than in the input and output layers, which helps the neural network learn a higher-level representation by introducing an information bottleneck. Backpropagation methods are usually employed for training an auto-encoder. Once training is over, the decoder layer can be discarded and the values of the encoder layer fixed, so that they cannot be modified further. At this stage, the output of the hidden layer is taken as input to a new auto-encoder, which can be trained in a similar fashion. The whole structure
encompasses a stack of layers referred to as a deep auto-encoder or deep belief network. Deep belief networks can be utilized for supervised and unsupervised classification, exploiting the implicit higher-level representation.
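The greedy layer-wise scheme described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the layer sizes, learning rate and epoch count are arbitrary choices, and `train_autoencoder`/`train_stack` are our own helper names. Each sigmoid auto-encoder layer is trained by batch backpropagation against its own input, the decoder is discarded, and the hidden codes become the input of the next layer.

```python
import numpy as np

def train_autoencoder(X, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train one sigmoid auto-encoder layer by batch backpropagation;
    return the encoder weights and the hidden codes."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))   # encoder weights
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))   # decoder weights (discarded after training)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        H = sig(X @ W1)                  # encode
        Y = sig(H @ W2)                  # decode: the target is the input itself (y_i = x_i)
        dY = (Y - X) * Y * (1.0 - Y)     # squared-error gradient at the output
        dH = (dY @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dY)
        W1 -= lr * (X.T @ dH)
    return W1, sig(X @ W1)

def train_stack(X, layer_sizes):
    """Greedy layer-wise stacking: each layer trains on the codes of the previous one."""
    codes, encoders = X, []
    for n_hidden in layer_sizes:
        W, codes = train_autoencoder(codes, n_hidden)
        encoders.append(W)
    return encoders, codes
```

For example, `train_stack(X, [40, 30, 20])` would mirror a 50-40-30-20 stack, with the 50-dimensional input held in `X`.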
3 Methodology Adapted for Training Deep Autoencoder

In this paper, we introduce a GA-based method for training a deep neural network (a deep autoencoder in our case). The GA is a metaheuristic search and optimization algorithm proposed by Holland [2] that has been successfully implemented for training neural networks [3]; more specifically, GAs have been employed as a substitute for backpropagation methods. By contrast, we here propose to use GAs in conjunction with backpropagation to enhance the overall performance of deep neural networks. We thus implement, as a proof of concept, a simple GA-based deep learning network for the electrophysiological soft robot-like system described in [1]. During the training phase of the auto-encoder, we store multiple sets of weights (W) for each layer; these weights are used to create a population for the GA, where each chromosome represents one set of weights. We determine the fitness of each chromosome using Eq. (1):

F = Σ_{t=1}^{T} [ f_initial + f_diff (1 − P_m / P′_M) ]   (1)

where f_initial is the initial fitness value (= 0 at the beginning of the execution); f_diff is the difference in the position of the organism (green dot) after eating food (blue dot) between initialization and the end of T actuation cycles/time steps (in our case T = 130); P_m is the penalty matrix; and P′_M is the maximum penalty matrix. The fitness values of all chromosomes are determined and then sorted in descending order. Next, we utilize backpropagation to update the weights of the high-ranking chromosomes and discard the lower-ranked chromosomes from the pool by removing them from the population. We apply a uniform selection strategy to select chromosomes, so that all chromosomes have equal probability of selection for the next generation regardless of their fitness values; in our system, the fitness value is used only to determine which chromosomes are removed from the population. To perform the crossover operation, a pair of parents is selected; the new offspring is then created by selecting weights randomly from each parent. The mutation operation, on the other hand, is performed by replacing a selection of weights in the offspring with zero values. We demonstrate the crossover and mutation operations via the simple example depicted in Fig. 1.
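The fitness accumulation of Eq. (1) can be written as a short sketch. Note this is a hedged reading: the paper does not spell out the per-step bookkeeping, so treating `f_diff` and the penalties as per-cycle sequences is our interpretation, and the function name and signature are ours.

```python
def fitness(f_initial, f_diff, penalties, max_penalty, T=130):
    """Accumulate fitness over T actuation cycles: each cycle contributes
    the initial fitness plus the position gain, discounted by the ratio
    of the penalty incurred to the maximum penalty."""
    return sum(f_initial + f_diff[t] * (1.0 - penalties[t] / max_penalty)
               for t in range(T))
```

With no penalties the organism keeps its full position gain; a cycle at maximum penalty contributes nothing.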
Fig. 1. A simple example of crossover and mutation operations used during simulation.
Crossover and mutation operations are powerful mechanisms for introducing diversity into the population. David and Greental [4] indicate that gradient descent methods such as backpropagation are susceptible to trapping in local minima; by adding the merits of a GA (in particular its recombination operations), we can alleviate the propensity of the system to get stuck at local optima. In the preceding we set a maximum number of generations as the termination criterion. At the end of this process, the best chromosome values are selected and shared among all the chromosomes of the new layer of the auto-encoder. Hence, the new layer currently being trained contains only the best chromosome values, helping to improve the performance of the overall system.
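When a chromosome is a flat list of weights, the crossover and mutation operators illustrated in Fig. 1 reduce to a few lines. The helper names below are ours, and the default 0.4 mutation rate simply mirrors the configuration reported in Sect. 4:

```python
import random

def crossover(parent_a, parent_b):
    """Uniform crossover: each offspring weight is picked at random
    from one of the two parents."""
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]

def mutate(weights, rate=0.4):
    """Mutation: replace a random selection of weights with zero."""
    return [0.0 if random.random() < rate else w for w in weights]
```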
4 Computational Simulation and Results

All experiments were conducted in Anaconda Spyder (TensorFlow) with Python 3.5. For our experiments we used a simple electrophysiological robot-like system as presented in [1]. The problem setup in our case consists of a DNN with a stack of 4 layers (50 neurons in the 1st layer, while the other three layers consist of 40, 30 and 20 neurons). We train each layer separately: we start training with the 40–30 layers, then utilize the 30 output neurons as inputs to the 30–20 layers. We used a simple GA (SGA) with the following configuration: population size = 100, chromosome size = 15, crossover rate = 0.6, mutation rate = 0.4 and termination condition = maximum number of generations = 100. We executed the GA-based deep learning network 30 times (independent runs with identical initial conditions) and collated the results. The objective function is the cost function in our experimental setup; the cost function is called once every generation (a generation is said to be complete after a cycle of 130 time steps). The fitness function value depends on both the collisions between the organism (green dot) and the food particles (blue dot) and the collisions between organisms, as shown in Fig. 2.
When an organism coincides with a food particle, the fitness function value of that organism is updated and the food particle reappears at a new random location. In the second case, when an organism collides with any other organism, the system is penalized. In each iteration, the GA trains the network in a layered manner, identifies the closest food particle, determines its direction and, based on the response, updates the position and velocity of the organism. We record the best, average and worst fitness values for each generation (shown graphically in Fig. 3).
Fig. 2. Simulation results of the GA-based deep learning network in different generations (panels shown at generations 0, 10, 15, 17, 20, 26, 38, 55, 65, 70, 85, 90, 92, 95, 97 and 99, at various time steps).
Fig. 3. Average fitness value vs. generation (first 20 iterations) for the best, average and worst fitness recorded over 30 independent runs with a total of 130 time steps.
5 Concluding Remarks and Future Plans

This paper has presented a GA-based approach to applying evolution to a deep learning network problem. Initial results suggest that GAs can be utilized for the training of deep learning networks not just as an alternative to backpropagation methods, as in previous work, but rather in conjunction with backpropagation to effectively solve the deep learning optimization problem. Although our experiments utilize an auto-encoder, we believe that the same method can be generalized to other deep learning network architectures. In regard to future work, we aim to compare the performance of GA-based training methods with other meta-heuristic approaches and gradient descent methods, to extend the method to de-noising auto-encoders, and to implement a similar system for training deep Boltzmann machines. In addition, we aim to develop a GA-based deep learning system for autonomous driving.

Acknowledgment. The authors would like to acknowledge financial support from the Horizon 2020 European Research project DREAM4CARS (#731593).
References 1. Cheney, N., MacCurdy, R., Clune, J., Lipson, H.: Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding. In: Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, pp. 167–174. ACM (2013) 2. Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence. MIT Press, Cambridge (1992) 3. Schaffer, J.D., Whitley, D., Eshelman, L.J.: Combinations of genetic algorithms and neural networks: A survey of the state of the art. In: International Workshop on Combinations of Genetic Algorithms and Neural Networks, 1992, COGANN-92. IEEE (1992)
4. David, O.E., Greental, I.: Genetic algorithms for evolving deep neural networks. In: Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation, pp. 1451–1452. ACM (2014) 5. Larochelle, H., Bengio, Y., Louradour, J., Lamblin, P.: Exploring strategies for training deep neural network. J. Mach. Learn. Res. 10(Jan), 1–40 (2009) 6. Pandey, H.M., Windridge, D.: A comprehensive classification of deep learning libraries. In: International Congress on Information and Communication Technology, London, UK (2018) 7. Kuremoto, T., Kimura, S., Kobayashi, K., Obayashi, M.: Time series forecasting using restricted Boltzmann machine. In: International Conference on Intelligent Computing, pp. 17–22. Springer, Heidelberg (2012) 8. Liu, K., Zhang, L.M., Sun, Y.W.: Deep Boltzmann machines aided design based on genetic algorithms. Appl. Mech. Mater. 568, 848–851 (2014). Trans Tech Publications 9. Levy, E., David, O.E., Netanyahu, N.S.: Genetic algorithms and deep learning for automatic painter classification. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, pp. 1143–1150. ACM (2014) 10. Rodrigues, D., Yang, X.S., Papa, J.P.: Fine-tuning deep belief networks using cuckoo search. In: Bio-Inspired Computation and Applications in Image Processing, pp. 47–59 (2017) 11. Rosa, G., Papa, J., Costa, K., Passos, L., Pereira, C., Yang, X.S.: Learning parameters in deep belief networks through firefly algorithm. In: IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pp. 138–149. Springer, Cham (2016) 12. Papa, J.P., Scheirer, W., Cox, D.D.: Fine-tuning deep belief networks using harmony search. Appl. Soft Comput. 46, 875–885 (2016) 13. Horng, M.H.: Fine-tuning parameters of deep belief networks using artificial bee colony algorithm. DEStech Transactions on Computer Science and Engineering (aita) (2017) 14. 
Pandey, H.M., Chaudhary, A., Mehrotra, D., Kendall, G.: Maintaining regularity and generalization in data using the minimum description length principle and genetic algorithm: case of grammatical inference. Swarm Evol. Comput. 31, 11–23 (2016) 15. Pandey, H.M.: Natural language grammar induction using genetic and parallel genetic algorithms and load balancing. Int. J. Comput. Sci. Technol 1(1), 28 (2012) 16. Pandey, H.M., Chaudhary, A., Mehrotra, D.: Grammar induction using bit masking oriented genetic algorithm and comparative analysis. Appl. Soft Comput. 38, 453–468 (2016) 17. Choubey, N.S., Pandey, H.M., Kharat, M.U.: Developing genetic algorithm library using Java for CFG induction. Int. J. Adv. Technol. (2011) 18. Pandey, H.M., Rajput, M., Mishra, V.: Performance comparison of pattern search, simulated annealing, genetic algorithm and jaya algorithm. In: Data Engineering and Intelligent Computing, pp. 377–384. Springer, Singapore (2018) 19. Pandey, H.M.: Solving lecture time tabling problem using GA. In: 2016 6th International Conference Cloud System and Big Data Engineering (Confluence). IEEE (2016) 20. Pandey, H.M., Chaudhary, A., Mehrotra, D.: Bit mask-oriented genetic algorithm for grammatical inference and premature convergence. Int. J. Bio-Inspired Comput. 12(1), 54– 69 (2018) 21. Cervantes, L., et al.: Fuzzy dynamic adaptation of gap generation and mutation in genetic optimization of Type 2 fuzzy controllers. Adv. Oper. Res. 2018, 1–13 (2018) 22. Lagunes, M.L., et al.: Parameter optimization for membership functions of type-2 fuzzy controllers for autonomous mobile robots using the firefly algorithm. In: North American Fuzzy Information Processing Society Annual Conference. Springer, Cham (2018)
Blockchain. Today Applicability and Implications

Dominic Bucerzan and Crina Anina Bejan

Aurel Vlaicu University of Arad, Arad, Romania
[email protected], [email protected]
Abstract. Blockchain is an emergent technology with a very rapid evolution that seems set to radically reshape industry, economy and society [2]. Blockchain technology appears to trigger the beginning of the second era of the digital economy: the first era was the result of the convergence of computing and communications technologies, while the second tends to be a combination of computer science, mathematics, cryptography and behavioral economics [10]. It started back in 2008, when it was introduced as the backbone of cryptocurrencies by a person or group of people known by the pseudonym Satoshi Nakamoto. This paper aims to give an overview of what blockchain currently involves; it also discusses its potential applications in different industries and its implications for society and economy in the context of the next generation of the internet.

Keywords: Blockchain · Consensus model · Cryptocurrency · Cryptographic hash · Distributed ledger · Mining
1 Introduction

New technology frontiers have been developed in recent years which allow a more accurate, less time-consuming, more connected and more decentralized world [1]. Blockchain is an emergent technology with a very rapid evolution that seems set to radically reshape industry, economy and society [2]. It started back in 2008, when it was introduced as the backbone of cryptocurrencies [3] by a person or group of people known by the pseudonym Satoshi Nakamoto. Figure 1 briefly shows the history of blockchain. Blockchain combines in a new and original way technologies that existed before: it links cryptography with distributed computing to create a new paradigm, the "internet of value, ownership and trust" [5]. Trust and collaboration seem to be the main advantages that this technology offers to our society. Users all over the world can trust each other and transact data (digital money, votes, medical records, property agreements, etc.) using large decentralized peer-to-peer networks without a third-party authority involved. Trust is established by protocols, cryptography and computer code [2], strengthening in this way collaboration and cooperation on a global scale. Blockchain is inspired by accounting procedures, using ledgers to store every piece of data transacted in the network. According to the National Institute of Standards and
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 152–164, 2021. https://doi.org/10.1007/978-3-030-51992-6_13
Fig. 1. History of blockchain [4]
Technology (NIST) [6], blockchains are immutable digital ledger systems implemented in a distributed environment, without a central repository and usually without a central authority. Blockchain technology enables a community of users to record their transactions in a public digital ledger in such a way that no transaction can be modified once published. The ledgers are shared, trusted and public: every user can inspect them, yet at the same time no one controls them. The ledgers are cryptographically secured from tampering and revision [7]. This structure makes it highly difficult, indeed impossible with today's technology, to change its rules. A transaction in the ledger is valid when the users of the network "reach a consensus". There are different models of consensus, each with advantages and drawbacks for a specific environment. Inspired by digital currencies, which allow trusted financial transactions between two users without the presence of a bank (the authority which validates the transaction), blockchain technology improves the efficiency of different economic, social and industrial domains.
2 Blockchain Components, Actors and Principles

From a macro perspective, a blockchain system can include the following components (see Fig. 2) [8]:
• Ledger: stores the transactions;
• Peer Network: stores and maintains the ledger. Each node of the network keeps its own copy of the ledger. The role of the network as a whole is to reach a consensus on every new transaction in the ledger, ensuring that every copy of the ledger is identical;
• Membership Services: manages user identity, authorisation and authentication;
• Smart Contract: protocols in the shape of programs and data (sometimes referred to as functions and state) that run on the blockchain;
• Wallet: stores users' credentials and tracks the digital assets associated with a user; technically, it stores the user's private key;
• Events: notifications of updates and actions on the blockchain, for example the creation and dispersion of a new transaction across the network, the addition of a new block, or notifications from smart contracts;
• Systems Management: allows management of blockchain components (creation, monitoring, modification);
• Systems Integration: integration of the blockchain with external systems.
Fig. 2. Blockchain components [8]
Blockchain solutions involve multiple actors with different roles [8]:
• Architect: a person or group with a background in business analysis, project management and modern programming languages, who designs the blockchain system;
• Operators: develop and maintain the peer network; they may be considered the keepers of the ledger, storing, maintaining and updating it;
• Developers: create the smart contracts that run on the blockchain system and implement the applications that interact with it;
• Regulator: depending on their rights in the organization, regulators can have increased visibility over the ledger compared to the rest of the users;
• End User: end users are unlikely to interact directly with the blockchain structure; they are the consumers of the services built around it;
• Data Storage: traditional databases that hold data off-chain. Usually only the hash of the data kept off-chain is stored on-chain;
• Data Processing: external software and devices used to extend the processing power of the network.
Blockchain seems to be the engine of the new digital economy, and it also outlines some basic principles for it [10]:
• Network integrity: by using cryptography in an original way, blockchain technology solves two major problems: user integrity and double spending. User integrity is validated by the network itself through a mathematical algorithm, with no third authority involved. The blockchain mechanism minimizes fraud by making it very costly in terms of hardware resources and time. In blockchain there is no need for a third authority to guarantee user transactions; this is done by the system itself, and a transaction is made directly between the sender and the intended receiver;
• Distributed power: blockchain technology uses distributed networks with no central database or central server. In contrast to traditional databases, if a participant leaves the network the system goes on with no loss of information. If more than half of the network tried to hijack it, everyone else would know what they were doing. Moreover, the energy costs of dominating a blockchain network would exceed its financial advantages, making that goal worthless;
• Value as an incentive: blockchain systems are designed to pay the ones who work for the system, while at the same time the system belongs to those who own and use its value units. In this way both parties want system reliability;
• Security: security measures are embedded in the blockchain network itself, which has no single point of failure and guarantees the confidentiality, authenticity, non-repudiation and immutability of a transaction;
• Confidentiality: everyone should be able to control their own data; if there is no private life, there is no freedom. People should have the right to decide what, when, how and how much of their identities they want to bring to someone's attention.
By eliminating the need to trust others, blockchain eliminates the need to know the identity of users in order to interact with them. Blockchain technology is an open-source algorithm that can be used by anyone without having to authenticate; the system itself does not need personal data to work properly;
• Rights preserved: using the proof-of-work protocol, blockchain technology ensures that someone can trade only things (real property, digital property) they own. It is impossible even to trade things on somebody else's behalf (e.g. a lawyer trading something for a client). Using smart contracts, blockchain guarantees the right of property;
• Inclusion: blockchain technology tries to reduce the barriers for everyone who wants to use it. It was designed to be available to everyone, not only the rich.
3 How Blockchain Works

If user A wants to send some asset to user B, they initiate the transaction, which is broadcast to the network and validated. The transaction is put into a block and executed (the asset moves from A to B) and added to the chain. The network approves the block and the transaction is sealed (see Fig. 3).
Fig. 3. Blockchain transaction [4]
The NIST describes in [6] how blockchain works. Technically, a blockchain system keeps track of transactions, where a transaction represents a recording of a transfer of assets between users. Collections of transactions are stored in ledgers, and every ledger is copied and distributed to all nodes of the system (see Fig. 4) [6].
Fig. 4. Blockchain distributed copy of ledgers [6]
A new transaction is submitted to a node, which alerts the other nodes of the network that a new transaction has arrived. Eventually, after consensus is reached, the transaction will be included by a node in a ledger and distributed to the network as completed. Users may initiate transactions by sending them to nodes of the blockchain, which propagate the initiated transactions as pending to the other nodes of the network. The distributed transactions wait until they are included in a ledger by mining nodes [6]. Mining nodes are the ones who maintain the blockchain by publishing new blocks. Blocks contain ledgers with validated transactions; the validity of a transaction is established by checking that the providers of the funds engaged in the transaction have digitally signed it. Mining nodes verify the validity of every transaction in a block, and the block is accepted into the blockchain only if the transactions it contains are all valid. When a block is created, a hash value is generated and stored in the block [6]. Each new block stores two hashes: the header keeps the hash of the previous block and the end keeps its own hash. These hashes are the chains between the blocks (see Fig. 5). Each block also stores the nonce [6].
Fig. 5. Chaining process [6]
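The chaining process of Fig. 5 can be illustrated with a minimal sketch. This is greatly simplified (a real blockchain hashes a structured block header containing a Merkle root of the transactions, a timestamp, a difficulty target, etc.), and the function names are ours:

```python
import hashlib
import json

def block_hash(header):
    """Hash a block's contents: previous hash, transactions and nonce."""
    payload = json.dumps(header, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, transactions, nonce=0):
    """Append a block whose header stores the previous block's hash,
    chaining it to the existing blocks."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    block = {"prev_hash": prev, "transactions": transactions, "nonce": nonce}
    block["hash"] = block_hash(
        {k: block[k] for k in ("prev_hash", "transactions", "nonce")})
    chain.append(block)
    return chain
```

Because each block stores the previous block's hash, tampering with any past transaction changes that block's hash and breaks every link after it.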
The nonce is used in the mining process and represents the value used by mining nodes to solve the hash puzzle; the mining node that finds it gains the right to publish the block. A block is published when a consensus is reached [6]. Figure 6 shows how the blockchain flow works. The mining process is based on solving a cryptographic hash puzzle: a block is ready to be chained into the blockchain if miners find a certain value (the nonce) that, combined with the block data, generates a SHA-256 code respecting some targeted criteria, which change periodically. For example [6]:

HV = SHA256(block data ‖ nonce)   (1)

where the hash value HV must start with n zeros, n > 0 (e.g. HV must start with "000000"). Hash functions are mathematical one-way functions: from the obtained hash value it is very difficult to recover the input that generated it, and doing so requires hard computational work. If even a single unit of the input changes, the generated value will not be the same. In contrast with the difficulty of inverting the hash value, it is very easy to verify it. Consensus models are programs that enable the users to work together: they define what happens in conflict situations and which mining node will publish a new block. The method of agreement must work even under possible attacks from malicious users. There are several models of consensus, among which the most popular are proof of work, proof of stake and round robin [6].
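The hash puzzle described above can be sketched directly, with a toy difficulty of a few leading zeros (real Bitcoin-scale targets require specialised hardware); the `mine` helper is ours:

```python
import hashlib

def mine(block_data, difficulty=3):
    """Search nonces until SHA-256(block data || nonce) starts with
    `difficulty` leading zeros; return the nonce and the hash value."""
    nonce = 0
    while True:
        hv = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if hv.startswith("0" * difficulty):
            return nonce, hv
        nonce += 1
```

Finding the nonce takes on average 16^difficulty hash evaluations here, yet any node can verify the result with a single hash: this asymmetry is exactly what proof of work relies on.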
Fig. 6. Blockchain flow [8]
The proof-of-work consensus model is based on the "proof" represented by the nonce: finding the nonce is the proof of the work engaged in finding it. Some particular aspects of this model are [6]:
• past work does not make future hash puzzles any easier to solve;
• users are incentivised to accept new validated blocks, since otherwise it is very likely that other nodes will accept the block and start building on it; refusing to build on the new block means building a shorter chain, and the blockchain system accepts only the longest valid chain.
The main advantage of the proof-of-work model is that it suits networks where there is little or no trust between users. Its main disadvantage is that the mining process is expensive in terms of energy consumption. The proof-of-stake consensus model uses the amount of stake a user has to decide who creates a new block. There are several methods by which the system uses the stakes; regardless of these aspects, it is certain that users with more stake will be selected to produce new blocks. This consensus model is not as costly as proof of work because it does not need resource-intensive computations. In this model the "rich" can stake more and obviously gain more assets; for all that, it is cost-prohibitive to gain control of the system [6]. The round-robin consensus model is based on the level of trust between mining nodes. This type of consensus is suited to private blockchain networks, where the mining nodes take turns in creating blocks. Round robin needs no cryptographic puzzle and has low resource consumption [6].
Blockchain. Today Applicability and Implications
159
In this type of model there is a protocol which involves a degree of randomness, which solves the situation in which a node is not available when it is its turn to create a block, so that the creation of new blocks does not stop. No node creates the majority of blocks [6]. Updating the technology in a blockchain is called forking, and changes to a blockchain software are called forks. These operations are extremely difficult given that they target distributed networks, cryptographic functions and user consensus.
4 Applicability and Implications Today blockchain technology is considered to be like the internet back in 1992, before the World Wide Web. Cryptocurrency is the first implementation of blockchain, like email was the first popular application of the internet [9]. Designed primarily for the bitcoin solution, the blockchain system has become the base of several digital currencies, each of them with specific properties and design. Figure 7 shows the main cryptocurrencies circulating on the internet and their evolution in US dollars over the last five years, according to their exchange rate evolution.
Fig. 7. Evolution of Cryptocurrencies in USD [11]
The most valuable cryptocurrency remains bitcoin. However, it can’t be mined on classic computers. This is the main reason for which other cryptocurrencies that can be mined on ordinary home computers were created [11]. Industry and business have recognized the blockchain potential, and between 2013 and 2017 capital was invested in 120 blockchain start-ups [9]. As seen in Fig. 8, blockchain is not yet suited to replace all existing traditional technology. Figure 8 shows some questions that someone should ask before trying to
implement blockchain technology [8]. In many cases, because the technology is relatively new and not correctly understood, there will be attempts to integrate blockchain technology even if it is unnecessary [6].
Fig. 8. How to determine if blockchain is appropriate [8]
Blockchain is a data structure that allows only appending operations: visualising the past history of the blockchain and adding new blocks to it. Every block keeps the hash value of the previous block. The history of the blockchain is public to all nodes of the network [8]. Depending on their permission models, blockchains are [8]: • Permissioned blockchain systems assume that only particular users can write to or read from them (like an intranet); proper for: banking, supply chain, insurance and healthcare; • Permissionless blockchain systems assume that any user can write and read (like the public internet); proper for: trusted timestamping and the energy industry. Table 1 shows a comparison between specific attributes of public and private blockchain structures. Until today blockchain has proved that it is suitable for Business to Consumer (B2C) and Business to Business (B2B) transactions [8]. Table 2 shows some characteristics that blockchain improves in these types of business. Digital cash (cryptocurrencies) was the first implementation of blockchain technology. Besides it, there are some domains that could adopt blockchain easily: finance, business and economy. Some particular areas we can mention are [2, 12]:
Table 1. Comparison between private and public blockchain [7]

           Public                          Private
Access     Open read/write                 Permissioned read and/or write
Speed      Slower                          Faster
Security   Proof of work, proof of stake   Pre-approved participants, other consensus mechanisms
Identity   Anonymous, pseudonymous         Known identities
Asset      Native asset                    Any asset
Table 2. Blockchain – B2C and Blockchain – B2B [8]

Business to Consumer (B2C)
  Advantages: Transparency to the consumer; Responsibility from the supplier; Labor verification; Immutable shared view
  Areas of focus: Food supply; Traceability; Procurement; Labour; Logistics; Compliance

Business to Business (B2B)
  Challenges: Repetition of process; Heavy dependence on paper; Heavy dependence on people; Excessive fees and charges
  Areas looking for transparency: Financial; Logistics; Charity funding; Agriculture; Precious metals; Tracking
  Advantages: Efficiency of process; Increased security of documents; Reduced dependence on people
  Immediate implementation: Financial; Global trade & commerce; Verification of ownership
• Context globalization: although computers and telecommunications have expanded our way of doing business, the economic globalization process still excludes some potential users. Blockchain is a solution that may complete this cycle and offer business opportunities at a global scale, minimizing cost and enhancing collaboration and trust in a trustless environment; • Property rights: property is easier to track in a decentralized system where everyone can see the trail of an asset. Blockchain offers solutions in this direction; • Supply chain: supply chains involve collaboration between many organizations, which is why blockchain technology found in this environment a prime area of implementation. Embedding blockchain technology in supply chain systems brings several advantages: it can reduce fraud and corruption, it offers control of authentication and trust, and it can increase the transparency and visibility of the finalized product to the end user;
• Commerce: blockchain technology streamlines the whole process of commerce, bringing sellers and buyers closer by excluding the middle authority, minimizing costs and enhancing trust and transparency in distributed and decentralized networks; • Finance: blockchain may revolutionize payment systems, making them clearer, more transparent and cheaper by using digital money; • Elections: blockchain offers an open voting system that excludes a central authority that counts the votes. Also, by using smart contracts it could somehow commit politicians to their governance program, making it not only a promise but also a fact; • Public administration and government: blockchain offers the possibility of governance with seamless and efficient interactions. Government transactions and services may be conducted in a paperless and cashless manner without the need to visit government offices; • Education: blockchain technology may be integrated in many institutional research programs that can include: the proof of learning, management of credentials and transcripts, management of students’ records, management of reputation and payments [3]. Some interesting national projects regarding the implementation of blockchain technology are: • Smart Dubai 2021: a project that aims to make Dubai a smart city with a greater impact on customers, the financial environment, resources and infrastructure, having as objective the full transformation of areas like: living, economy, people, mobility, environment and governance [13]; • Estcoin: the project of the Estonian government to launch a government-backed cryptocurrency called estcoin. Although the project was launched in 2017, it faces national and international obstacles from institutions that will accept the change with great difficulty, namely local and international banking authorities.
Maybe Estonia will not fulfill its plan of issuing digital national money yet; still, it was an example for other countries which take seriously into consideration the opportunity of a state-backed digital currency [14]. Blockchain technology is at its beginnings. With today’s diversity of smart mobile devices and with the variety of the applications distributed on them [16], blockchain technology will shortly be adopted by this environment. With every future implementation it shows its potential and applicability that waits to be revealed. In time, the performance of smart devices increases constantly [17] and they will include blockchain technology, making an optimized smart blockchain things environment. Blockchain technology creates an environment free of “trust issues”, where business transactions are guaranteed by the blockchain itself, replacing other existing mechanisms that signal or convey trust [18].
5 Conclusions and Future Works Blockchain technology evolved very fast in a relatively short period of time. Its advantages and implications for technical progress are obvious. Even so, blockchain is not a perfect technology and has its limitations, leaving an open door for future research on topics like: system control, malicious users, identity management, lack of trust, resource usage, credential storage and many more.
References
1. Sharma, N., Shamkuwar, M., Singh, I.: The history, present and future with IoT. In: Balas, V., Solanki, V., Kumar, R., Khari, M. (eds.) Internet of Things and Big Data Analytics for Smart Generation. Intelligent Systems Reference Library, vol. 154. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04203-5_3. ISBN 978-3-030-04203-5
2. Colchester, J.: Blockchain. An Overview. A Complexity Labs Publication (2018)
3. Holotescu, C.: Understanding blockchain technology and how to get involved. In: The 14th International Scientific Conference eLearning and Software for Education, Bucharest (2018). https://doi.org/10.12753/2066-026x-18-000
4. Heutger, M., Kückelhaus, M., Chung, G.: Blockchain in Logistics: Perspectives on the upcoming impact of blockchain technology and use cases for the logistics industry. DHL Customer Solutions and Innovation (2018)
5. Kinnaird, C., Geipel, M., Bew, M.: Blockchain Technology: How the Inventions Behind Bitcoin are Enabling a Network of Trust for the Built Environment. Arup, 13 Fitzroy Street, London W1T 4BQ (2017). arup.com
6. Yaga, D., Mell, P., Roby, N., Scarfone, K.: Blockchain Technology Overview. National Institute of Standards and Technology Internal Report 8202 (2018)
7. Voshmgir, S., Kalinov, V.: Blockchain: A Beginners Guide. BlockchainHub (2017). https://blockchainhub.net/blockchain-technology/. Accessed 1 Aug 2018
8. Blockchain: Understanding Its Uses and Implications. LinuxFoundationX: LFS170x. https://courses.edx.org/courses/course-v1:LinuxFoundationX+LFS170x+2T2018/courseware/d73f86a49d654374b8ce911d960085c8/5ab7244cd4924d46bff0f03759857b67/8?activate_block_id=block-v1%3ALinuxFoundationX%2BLFS170x%2B2T2018%2Btype%40vertical%2Bblock%40bbf9460a071f4dd8bfee6bc489ce31b8. Accessed 1 Aug 2018
9. Barnas, N.B.: Blockchains in National Defense: Trustworthy Systems in a Trustless World. Maxwell Air Force Base, Alabama (2017).
http://www.jcs.mil/Portals/36/Documents/Doctrine/Education/jpme_papers/barnas_n.pdf?ver=2017-12-29-142140-393. Accessed 1 Aug 2018
10. Tapscott, D., Tapscott, A.: Blockchain Revolution: How the Technology Behind Bitcoin Is Changing Money, Business, and the World. Reprint Edition. Published by Portfolio (2016). ISBN-13 978-1101980149, ISBN-10 1101980141
11. Bucerzan, D., Bejan, C.A.: Challenges of securing digital information. In: ISREIE Conference, 7th edn. Mathematics & Computer Science, pp. 1–6, 17th–20th May 2018. ISSN 2065-2569
12. Boucher, P., Nascimento, S., Kritikos, M.: How blockchain technology could change our lives. European Parliamentary Research Service, Scientific Foresight Unit, PE 581.948 (2017)
13. Smart Dubai 2021: Preparing Dubai to embrace the future, now. https://2021.smartdubai.ae/impact-areas/. Accessed 1 Aug 2018
14. Alexandre, A.: Estonia Rolls Back Its Plan to Issue National Digital Currency (2018). https://cointelegraph.com/news/estonia-rolls-back-its-plan-to-issue-national-digital-currency. Accessed 1 Aug 2018
15. Khan, F., Altaf, A.: Android personal digital life software agent. In: Balas, V., Fodor, J., Várkonyi-Kóczy, A., Dombi, J., Jain, L. (eds.) Soft Computing Applications. Advances in Intelligent Systems and Computing, vol. 195. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-33941-7_38. ISBN 978-3-642-33941-7
16. Gal, A., Filip, I., Dragan, F.: IoThings: a platform for building up the internet of things. In: Balas, V., Jain, L., Balas, M. (eds.) Soft Computing Applications. SOFA 2016. Advances in Intelligent Systems and Computing, vol. 633. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-62521-8_15. ISBN 978-3-319-62521-8
17. Bucerzan, D., Raţiu, C.: Image processing with android steganography. In: Balas, V., Jain, L.C., Kovačević, B. (eds.) Soft Computing Applications. SOFA 2014. Advances in Intelligent Systems and Computing, vol. 356. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-18296-4_3. ISBN 978-3-319-18296-4
18. Beck, R., Stenum Czepluch, J., Lollike, N., Malone, S.: Blockchain – The Gateway to Trust-Free Cryptographic Transactions (2016). Research Papers 153. http://aisel.aisnet.org/ecis2016_rp/153. Accessed 1 Aug 2018
Integrating Technology in Training: System Dynamic-Based Model Considerations Regarding Smart Training and Its Relationship with Educational Components

Victor Tița1, Doru Anastasiu Popescu2, and Nicolae Bold2,3

1 Faculty of Management, Economic Engineering in Agriculture and Veterinary Medicine, University of Agricultural Sciences and Veterinary Medicine Bucharest, Slatina, Romania
[email protected]
2 Department of Mathematics and Computer Science, University of Pitești, Pitești, Romania
[email protected], [email protected]
3 Proeuro-Cons Association, Pitești, Romania
Abstract. Dissociating training and education is not the best approach when we refer to the integration of the technological novelties, because, up to a point, these two components of the educational system are closely related. However, their partially-different target group separation leads to a different approach of these group constituents regarding receptivity to technological integrations in educational processes. Based on the particularities of the training market and its dynamic, in this paper we present a close image of the training system by representing its main components and the impact of the technology both in the training process and the training curricula. The model construction is based on a survey conducted on a training platform and built using system dynamics principles. Finally, links with the two main output groups of the training market and the lower education are made using the same principles. Keywords: Training
· Smart education · System dynamics
1 Introduction The Romanian training market still has weak points and a low level of development. One of the greatest improvements that can be added to both the methodology and the topics of training is the aid of technology. Methodology refers mainly to the training framework, represented by organisational characteristics (planning, assessment), and topics refer to the effective study of technology. The international factor influences the Romanian training market through the movement of various trainers facilitated by Erasmus projects. The differences in technology development and usage between various educational systems lead to examples of good practice regarding technology intrusion in education. Also, technology integration is a hot point in the educational environment at a worldwide level, given the large © Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 165–176, 2021. https://doi.org/10.1007/978-3-030-51992-6_14
166
V. Tița et al.
amount of literature on this matter. Besides physical costs, this integration also requires educational costs, which can be less visible or prominent, and this is an important aspect that makes the integration debatable. Issues within training can be solved using various methods being developed nowadays. These issues are largely organisational, either based on the trainer’s teaching needs related to course preparation, or on training methodology, technology being used as a tool for presenting information or integrating practice within training. The methods that are used are related to artificial intelligence (AI) and its subsequent branches (neural networks, genetic algorithms, natural language processing, classifiers etc.). For example, genetic algorithms are used to generate assessment tests for various purposes. The extended model comprises a wider view of the technological influences within the training market in relation to the educational area. Its main purpose is to identify the main vectors of the technology in educational areas (university and training for entrepreneurs). In other words, the purpose of the model presented in this paper can be rephrased as finding to what extent technology helps at creating smart education training centers in relation to university and enterprise media. Section 2 presents literature references related to the technology and a previous model related to the paper topic, proposed in a previous paper. Section 3 proposes a statistical approach to the information and communication technologies (ICT) competences needed within the training and educational processes. Section 4 presents some methodological data regarding the model implementation and Sect. 5 continues the model description by detailing the model construction steps based on the traditional system dynamics approach. Section 6 contains the description of the simulation approach of the model. Section 7 presents some conclusions and draws future lines of development.
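As an illustration of how genetic algorithms can assemble an assessment test, a minimal elitist GA is sketched below. The item bank, fitness function and parameters are invented for the example and are far simpler than the specific algorithms of [4] and [5].

```python
import random

rng = random.Random(7)
# Hypothetical item bank: 30 items with difficulties in [1, 5].
ITEMS = [rng.uniform(1, 5) for _ in range(30)]
TEST_LEN, TARGET = 10, 30.0   # a 10-item test with total difficulty near 30

def fitness(test):
    # Higher is better: negated distance of total difficulty from the target.
    return -abs(sum(ITEMS[i] for i in test) - TARGET)

def evolve(pop_size=40, generations=60):
    pop = [rng.sample(range(len(ITEMS)), TEST_LEN) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        for parent in survivors:                  # mutation: swap one item
            child = parent.copy()
            child[rng.randrange(TEST_LEN)] = rng.choice(
                [i for i in range(len(ITEMS)) if i not in child])
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()   # indices of the selected items
```

Real test-generation GAs add crossover and multi-criteria fitness (topic coverage, discrimination, time limits); the skeleton above only shows the selection/mutation loop.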
2 Previous Work and Literature
Regarding the technology intrusion in education, we have created a previous model that offers a global view comprising the educational, training and entrepreneurial components, shown in Fig. 1. The issue of technology intrusion in education and training is widely studied in the literature. More practical approaches were taken by the authors in papers [4] and [5], where the problem of generating assessment items using specific genetic algorithms was studied. The particular issue of training was also studied in various studies and books [6]. These present in detail the methodologies and procedures to integrate online technologies in education.
Integrating Technology in Training
167
Fig. 1. Main model comprising three components and the technology problem
Further, all the technology that can help develop educational processes is based on AI developments made in recent years and described in the literature. Various papers link AI developments with the possibility of integration within education and training processes [7, 8] based on AI principles. Some branches of the AI field (decision support systems [9], intelligent systems, system dynamics, neural networks, genetic algorithms, natural language processing, classifiers etc.) can help create tools for enhancing training processes. In the end, technology helps create educational hubs [2], environments [1] and platforms [3], which are basically creating new ways of understanding, teaching and training.
3 Statistical Data Regarding Training and IT In order to check the importance of technology within training, we conducted a study that also contained data related to the behavior of the trainees towards technology. The study consisted of applying a survey of 20 questions on various themes to 165 trainees. A part of these questions refers to technology-related issues, such as the preferred type of learning or subjects of training related to technology. Data regarding the participants is presented in the graphs in Fig. 2.
[Bar charts: participant counts by function (teacher: 243, uncategorised: 17, school psychologist: 6, methodic teacher: 3, special-education teacher: 3), by school level (kindergarten, primary, secondary, high school – theoretical, technological and professional, uncategorised) and by gender (female, male, not answered).]
Fig. 2. Statistical data regarding the participants: function, school level and gender
As we can see, the majority of the participants work in the educational domain, being either kindergarten, primary, secondary or high-school teachers. Thus, the sample represents the target audience for the study of technology influence in education. Another important characteristic of the target group is the average length of service, which is somewhere near 19 years, meaning that the average participant age is between 35 and 40 years. The main issues that were addressed to the group referred to the importance of the training, the duration and organisational aspects of the courses, the main areas required for self-improvement, changes in educational approaches, types of methods used in the training processes and skills required at the workplace.
Regarding the first issue, related to the importance of training, over 95% of the participants confirmed its necessity, while the rest either disagreed or did not respond to the question. As regards the organisational aspects of the training, the attitudes were balanced, as seen in Fig. 3.
A: Data acquisition and aggregation (per intersection IID: JFI and SUI accumulate the sampled jam factor and speed, CFI a confidence vote, CNT the number of samples)
    …
    CFI ← CFI + ((CF >= 5) ? +1 : −1)
    CNT ← CNT + 1
  ENDIF
ENDFOR
B: Intersection JamScore computation
JamScore(IID, JFI, SUI, CFI, CNT) {
  IF (CFI > 0.5 * CNT) THEN
    IJS ← JFI / CNT
  ELSE
    IF (SUI / CNT >= 35) THEN
      corr ← SUI / CNT
      IJS ← IJS + 10 / corr
    ENDIF
  ENDIF
}
Output: numerical quality IJS (intersection JamScore).
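The listing of Algorithm 1 translates into a short runnable sketch. The accumulation of JF and SU per sample is inferred from the averages used in part B, and the initial value of IJS in the low-confidence branch is assumed to be 0; both assumptions are flagged in the comments.

```python
def jam_score(samples, cf_threshold=5, speed_threshold=35):
    """samples: (jam_factor, speed, confidence) tuples for one intersection.
    Follows Algorithm 1: average JF when most samples pass the confidence
    vote, otherwise a small score inversely proportional to the average speed."""
    jfi = sui = cfi = cnt = 0.0
    for jf, su, cf in samples:                    # A: accumulation (inferred)
        jfi += jf
        sui += su
        cfi += 1 if cf >= cf_threshold else -1    # confidence vote
        cnt += 1
    ijs = 0.0                                     # B: initial IJS assumed 0
    if cfi > 0.5 * cnt:                           # mostly confident samples
        ijs = jfi / cnt
    elif sui / cnt >= speed_threshold:            # fast free-flow traffic
        corr = sui / cnt
        ijs += 10 / corr
    return ijs

# A congested intersection: high jam factors, confident measurements.
assert jam_score([(8.0, 12.0, 7), (9.0, 10.0, 8), (7.5, 15.0, 9)]) > 3.0
```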
depict a typically crowded intersection and consequently these are intersections which should receive optimization efforts, including any ITS solution.

3.3 Complex Network Analysis of the Road Network
As previously stated, in this paper we introduce complex network analysis principles and metrics to measure the importance of a road intersection based only on the topology of the network. The data set derived in Sect. 3.1 from OpenStreetMap regarding the topology of the network allows us to calculate the network metrics, such as network size (nodes and edges), average path length, clustering coefficient, average degree (and degree distribution), network diameter (the length of the longest shortest path), density (the ratio between the number of edges in the network and the maximum number of edges corresponding to the complete graph), modularity (a numerical representation of the fact that we can decompose the network into communities), and centrality (the property of a node of being “important” to the network) in its different forms (betweenness, closeness and eigenvector) [2,15]. The data are presented in Table 2 and the values are consistent with those referenced in the literature for street networks [6].
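The metrics above can be computed directly from an adjacency list; the sketch below uses a tiny invented graph in place of the OSM-derived network (which is not reproduced here) and plain breadth-first search instead of a graph library.

```python
from collections import deque

# Toy undirected street graph: intersections A..E and the segments between them.
graph = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D"],
}

def bfs_distances(g, src):
    """Hop distances from src to every reachable node."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

n = len(graph)
m = sum(len(vs) for vs in graph.values()) // 2          # each edge stored twice
avg_degree = 2 * m / n
density = 2 * m / (n * (n - 1))                         # edges vs complete graph
dists = [d for u in graph for d in bfs_distances(graph, u).values() if d > 0]
diameter = max(dists)                                   # longest shortest path
avg_path_length = sum(dists) / len(dists)
```

On the toy graph this yields an average degree of 2.0 and a diameter of 3; running the same computation on the real street graph produces values of the kind reported in Table 2.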
300
D. Avramoni et al.
The network interpretation of the urban topology relates to road networks as it is important to determine, or even predict when possible, which areas are critical for the traffic congestion and which nodes are most influential to the topology (are most prone to be used by drivers when traversing the city) defining possible hot-spots for congestion. The results can help mitigate the congestion and predict the impact of increased traffic flow. Previously our goal was to find congestion-prone intersections in a given road network, by pointing their relatively high betweenness centrality values and validating the results by using computer simulations and observation of the real traffic values [3]. In Fig. 2 we have plotted the betweenness distribution of the nodes inside our study area and even if it differs from a power-law as we are accustomed when dealing with other complex networks, it’s consistent with the particularities of road networks, where there the long-link specific to other types of networks such as social one (friends across continents) or Internet (hyperlinks to servers across distant locations), and which in case of transportation networks would correspond to point-to-point highways. Optimizing communities will naturally lead to a more balanced and fluent traffic city-wide. Table 2. Static topological characteristics of the network presented in our study. The average degree between 1.3 and 1.4 is specific for transportation networks which are inherently flat while the average path length is correlated with the size of the road network. The high modularity number mean there is a strong, well defined, community structure. Metric
Value
No. of nodes
4170
No. of edges
6212
Diameter
63
Modularity
0.938
Average path length 43.127
3.4
Average degree
1.44
Population density
2.91
Correlating Congestion with Betweenness
As stated previously, we consider the metric of betweenness centrality from complex network analysis a good indicator of the importance of the node to the city grid. As the term implies, centrality is a general way of speaking of how “central” a node is to a network, and depending on the case it might even map to the node’s importance in that particular network. Any of the centrality metrics
Real Time Urban Traffic Data Pinpointing Most Important Crossroads
301
Fig. 2. The distribution of the betweenness centrality metric in our data-set shows a typical distribution for small to medium sized urban networks (it relates to the 10×10 km area of study). Using a logarithmic interpolation of the data-set one can observe the almost power law distribution specific to numerous complex networks.
is designed to provide a ranking of the nodes, but they do this well enough only for the top tier of the nodes and lose their power for the rest [15]. In the case of betweenness centrality we are speaking of the number of paths which pass through the node. Informally, the betweenness of a node is the number of shortest paths, between any two nodes in a network, which pass through that particular node. In a mathematical way, this can be described as:

C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}} (1)

where \sigma_{st} denotes the total number of shortest paths between any pair of nodes s and t in the graph and \sigma_{st}(v) is the number of those paths which pass through node v. When using betweenness we take into account a much better semantics of the network flow, because of its close connection with the paths in a graph (Fig. 3).
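Equation (1) can be evaluated by brute force on small graphs with BFS-based shortest-path counting; production tools use Brandes’ algorithm instead, and the graphs below are invented toy examples.

```python
from collections import deque
from itertools import combinations

def path_counts(g, src):
    """BFS distances and numbers of shortest paths (sigma) from src."""
    dist, sigma = {src: 0}, {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(g, v):
    """C_B(v): sum over pairs s, t (both != v) of sigma_st(v) / sigma_st."""
    total = 0.0
    for s, t in combinations([u for u in g if u != v], 2):
        ds, ss = path_counts(g, s)
        if t not in ds:
            continue                 # s and t are disconnected
        dt, st = path_counts(g, t)
        # shortest s-t paths through v must satisfy d(s,v) + d(v,t) = d(s,t)
        through = 0
        if v in ds and v in dt and ds[v] + dt[v] == ds[t]:
            through = ss[v] * st[v]
        total += through / ss[t]
    return total

# On a path graph A-B-C every shortest A-C path passes through B.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
assert betweenness(path, "B") == 1.0 and betweenness(path, "A") == 0.0
```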
Fig. 3. Our area of study, the city of Timisoara in the Western part of Romania, on a lazy Sunday afternoon, showing almost all the major boulevards in green, which corresponds to a JF metric of less than 3.
In our previous research we tried to link the betweenness centrality metric with the importance of the node to the network, but always strictly by using simulation tools [4]. In this case we wish to investigate the actual facts with data from real-time sensing provided by Here Maps. Running Algorithm 1 on the data corresponding to 48 days of observation during the months of April and May 2018, we obtain the data depicted as a graph in Fig. 4.
Fig. 4. Distribution of the JamScore as calculated by Algorithm 1 over 48 days of observation in city of Timisoara, Romania
In Fig. 5(a) we have shown the distribution of the betweenness centrality metric across the city of Timisoara for each of the intersection nodes. The nodes have a shade of gray dependent on the betweenness value. Thus we are taking into consideration a few troublesome areas, which are depicted in the yellow spots. In the Northern part of the city we have constant jams caused both by the commuters from the neighboring dormitory village of Dumbravita and by the big residential and commercial development of United Business Center, which spans across 4 major road arteries and where any restriction in traffic translates into long queues and delays. Furthermore, in the South-West there is a smaller but also classical spot, represented by one of the major arteries which passes under a rail bridge, introducing delays; it is also the heavy lorry bypass route for the E70, together with one of the city’s malls with retail stores. The last interesting spot is represented by the South-Eastern region, where there is a big accumulation of businesses and a large employer, Continental Automotive. There are frequent jams and delays, symmetrically, during the morning and afternoon rush hours. These observations are supported by the data shown in Fig. 5(b), where we presented the same qualitative areas taken under scrutiny earlier, but with real data gathered via the Here Maps API. In this case, each of the spots corresponds to a major intersection, while the color intensity is proportional to the Jam Score. For visualization purposes we filtered out intersections with a Jam Score of less than 3.0.
Fig. 5. (a) Heat map of the Jam Score overlaid on top of the Timisoara map using the GeoLayout plug-in in Gephi shows the areas of the city known as traffic-congestion hot-spots from empirical observations; (b) Heat map of the Jam Score overlaid on top of the Timisoara map using the GeoLayout plug-in in Gephi shows the areas of the
4 Discussions and Conclusions
Our investigations were geared towards providing strong experimental proof of the link between the importance of a road intersection, as given by the betweenness centrality metric taken from complex network analysis, and the amount of traffic flow through the specific intersection, which in turn is a cause of congestion. Using experimental data collected through the Here Maps API regarding all the road junctions in the area of experimentation (city of Timisoara, Romania), we computed the newly introduced JamScore - a numerical expression of the
degree to which a junction is jammed - and performed a numerical and empirical correlation with the real-life driving experience. We managed to confirm that the intersections prone to congestion have both high betweenness and high JamFactor. At present we are collecting more than car-flow data: also metadata regarding external conditions such as weather, public road closures caused by construction works, holidays and other free days. Thus Algorithm 1 can be run several times on different but short windows of time over each of the intersections in order to build a better picture of the city traffic quality. By fusing this metric with external factors we intend to forecast the behavior of the traffic in case of local disruptions such as traffic incidents. Acknowledgment. This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS/CCCDI - UEFISCDI, project number PN-III-P2-2.1-PED-2016-1518, within PNCDI III. This work was also supported by research grant GNaC2018 - ARUT, no. 1349/01.02.2019, financed by Politehnica University of Timisoara. The authors would also like to thank Here Maps, who kindly granted access to the historical data related to road traffic within the city of Timisoara.
References 1. Here traffic API (2018). https://developer.here.com/documentation/traffic/ 2. Albert, R., Barabasi, A.L.: Statistical mechanics of complex networks. Rev. Mod. Phys. 74(1), 47 (2002) 3. Baban, G., Iovanovici, A., Cosariu, C., Prodan, L.: Determination of the critical congestion point in urban traffic networks: a case study. In: 2017 IEEE 14th International Scientific Conference on Informatics, pp. 18–23. IEEE (2017) 4. Baban, G., Iovanovici, A., Cosariu, C., Prodan, L.: High betweeness nodes and crowded intersections: an experimental assessment by means of simulation. In: 2018 IEEE 12th International Symposium on Applied Computational Intelligence and Informatics SACI, pp. 31–37. IEEE (2018) 5. Bastian, M., Heymann, S., Jacomy, M.: Gephi: an open source software for exploring and manipulating networks. In: ICWSM (2009) 6. Cantarella, G.E., Pavone, G., Vitetta, A.: Heuristics for urban road network design: lane layout and signal settings. Eur. J. Oper. Res. 175(3), 1682–1695 (2006) 7. Chen, M., Nguyen, T., Szymanski, B.K.: On measuring the quality of a network community structure. In: 2013 International Conference on Social Computing (SocialCom), pp. 122–127. IEEE (2013) 8. Dong, T., Hu, W., Liao, X.: Dynamics of the congestion control model in underwater wireless sensor networks with time delay. Chaos Solitons Fractals 92, 130–136 (2016) 9. Fowler, H.J., Leland, W.E.: Local area network characteristics, with implications for broadband network congestion management. IEEE J. Sel. Areas Commun. 9(7), 1139–1149 (1991) 10. Haklay, M., Weber, P.: Openstreetmap: user-generated street maps. IEEE Pervasive Comput. 7(4), 12–18 (2008) 11. Hanif, M.K., Aamir, S.M., Talib, R., Saeed, Y.: Analysis of network traffic congestion control over tcp protocol. IJCSNS 17(7), 21 (2017)
D. Avramoni et al.
12. Ho, D., Park, G.S., Song, H.: Game-theoretic scalable offloading for video streaming services over LTE and WiFi networks. IEEE Trans. Mob. Comput. 17, 1090–1104 (2017)
13. Holme, P.: Congestion and centrality in traffic flow on complex networks. Adv. Complex Syst. 6(02), 163–176 (2003)
14. Low, S.H.: Analytical methods for network congestion control. Synth. Lect. Commun. Netw. 10(1), 1–213 (2017)
15. Newman, M.E.: The structure and function of complex networks. SIAM Rev. 45(2), 167–256 (2003)
16. Olsen, J.R., Mitchell, R., Ogilvie, D., Study Team, M., et al.: Effect of a new motorway on social-spatial patterning of road traffic accidents: a retrospective longitudinal natural experimental study. PLoS ONE 12(9), e0184047 (2017)
17. Pattara-Atikom, W., Pongpaibool, P., Thajchayapong, S.: Estimating road traffic congestion using vehicle velocity. In: 2006 6th International Conference on ITS Telecommunications Proceedings, pp. 1001–1004. IEEE (2006)
18. Pizano, C.O.: Mitigating network congestion: analytical models, optimization methods and their applications. Ph.D. thesis, École Polytechnique Fédérale de Lausanne (2010)
19. Stubbs, P.C., Tyson, W.J., Dalvi, M.Q.: Transport Economics, vol. 21. Routledge, Abingdon (2017)
20. Topirceanu, A., Iovanovici, A., Cosariu, C., Udrescu, M., Prodan, L., Vladutiu, M.: Social cities: redistribution of traffic flow in cities using a social network approach. In: Soft Computing Applications, pp. 39–49. Springer (2016)
21. Verhoef, E.T.: Time, speeds, flows and densities in static models of road traffic congestion and congestion pricing. Reg. Sci. Urban Econ. 29(3), 341–369 (1999)
22. Welzl, M.: Network Congestion Control: Managing Internet Traffic. Wiley, Hoboken (2005)
23. Williams, K.: Spatial Planning, Urban Form and Sustainable Transport. Routledge, Abingdon (2017)
24. Zhao, L., Lai, Y.C., Park, K., Ye, N.: Onset of traffic congestion in complex networks. Phys. Rev. E 71(2), 026125 (2005)
25. Zhao, X., Cheng, X., Zhou, J., Xu, Z., Dey, N., Ashour, A.S., Satapathy, S.C.: Advanced topological map matching algorithm based on D-S theory. Arab. J. Sci. Eng. 43(8), 3863–3874 (2018)
Priority Levels and Danger in Usage of Artificial Intelligence in the World of Autonomous Vehicle Gabor Kiss(&) and Csilla Berecz Óbuda University, Bécsi út 96/b, Budapest 1034, Hungary [email protected], [email protected]
Abstract. Nowadays more and more vehicles on the road offer driver-assistance functions, and some are already capable of self-driving under special conditions. Autonomous vehicles are being developed in numerous countries, so at a later stage of development these vehicles will appear on the roads en masse; they are expected to spread within ten years, because numerous manufacturers are working on their development and testing. What they have in common is that they all use Artificial Intelligence as the basis of the system. Some kind of priority is needed for traffic situations; for example, cars should clear the way for an ambulance in an emergency. This article proposes a recommendation for such a hierarchy to make transport safer. The behaviour of an AI depends on the teaching method used when it was trained, and this can be a major risk for the humans in the vehicle. The article highlights teaching methods that could put people in danger. Before an AI is integrated into a vehicle, its behaviour and teaching method should be tested extensively. Considering the recent development of this area, the vehicles of state leaders and their escorts could also become autonomous in the future. The article analyses this scenario, points out its disadvantages, and suggests avoiding fully autonomous vehicles in the case of state leaders. Keywords: Autonomous vehicles · Priority levels · AI · Manchurian AI · State leaders
1 Introduction
People have always dreamed about things they cannot have in their own lives. This is what brought into being fantasies about technologies and opportunities that were realized later. Twentieth-century science-fiction literature often entertained the idea of a world where robots work instead of humans, and this has already happened; just think of the assembly lines of the automotive industry. These science-fiction books also hint at fully automated vehicles and transport. Owing to technical evolution, vehicles with driver assistance are now commonplace on the road, and testing of autonomous driving is under way in several countries around the world. The technological leap from basic vehicles to autonomous driving has been supported by recent developments and discoveries in the field of AI.
© Springer Nature Switzerland AG 2021. V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 307–316, 2021. https://doi.org/10.1007/978-3-030-51992-6_24
The discovery of the
advanced building blocks of artificial neural networks (Aizenberg et al. 2000) and the multilayered neural network created by André de Carvalho in 1994 (Carvalho et al. 1994) were the preconditions of Deep Learning, a phrase first used by Rina Dechter (Dechter 1986). Using Deep Learning algorithms, different abstractions of data can be retrieved and modelled; a huge amount of data is needed to reach appropriate results. Owing to the newest algorithms and models, higher efficiency can be reached in image recognition, which can be used in medicine and astronomy as well (Julia Computing 2017), and there are open-source AI solutions too, such as OpenAI. The use of Deep Learning is an obvious choice for autonomous driving, because the sensors provide the necessary huge amount of incoming data (Aiordachioaie 2016). The AI must learn the highway code, must be capable of recognizing road signs and keeping its lane, and is taught through a long learning process how to act correctly in traffic situations, even when this requires breaking the rules to avoid an accident. Human lives are entrusted to autonomous vehicles, so the system must function fail-safe and its decision making must be error-free. Our goal in this article is to explore risky circumstances and situations in the world of autonomous vehicles. Section 2 describes the conditions of autonomous vehicles, Sect. 3 presents our recommendation on priority levels in autonomous driving, and Sect. 4 discusses the danger of a modified AI (Manchurian AI) in autonomous vehicles.
2 Conditions of Autonomous Vehicles

2.1 Categories
In 2014, SAE (Society of Automotive Engineers) International defined the categories of autonomous vehicles in standard J3016, "Taxonomy and Definitions for Terms Related to On-Road Motor Vehicle Automated Driving Systems". It has since been updated, and standard J3016_201806 is now in use in the automotive industry (SAE International 2018) (Table 1). It defines six levels of autonomous driving. Level 0 is the baseline: cars on this level have no driver assistance or automatic systems. Level 5 is the fully automated vehicle: there is no driver, only passengers, and all driving tasks are handled by the system. The differences between the levels are demonstrated, for example, in a video by BMW (BMW Group 2017). Autonomous driving is likely to appear first on highways, because there is no oncoming traffic and the lane markings are clearer and lanes wider than in the city; therefore more and more manufacturers promise this function in upcoming models. On highways the system has less to attend to, because traffic is calmer than in a city with its many junctions and lane changes. It is also an advantage that there are no buildings that could block the monitoring of traffic further ahead.
Table 1. SAE International J3016: the six levels of autonomous vehicles

| Level | Name | Steering, acceleration and deceleration | Monitoring of driving environment | Fallback performance of dynamic driving task |
| 0 | No automation | Human | Human | Human |
| 1 | Driver assistance | Human and system | Human | Human |
| 2 | Partial automation | System | Human | Human |
| 3 | Conditional automation | System | System | Human |
| 4 | High automation | System | System | System |
| 5 | Full automation | System | System | System |

2.2 Standards and Protocols
Components are made by suppliers all over the world, because in-house manufacturing would be loss-making for the brands. These components must comply with the functional-safety standard ISO 26262. It requires, for example, that ABS have a failure rate of at most 10^-6, which means high reliability. Communication between built-in units also uses standardized protocols. Many producers use FlexRay communication, described in standard ISO 17458-1:2013; the second most common is the Controller Area Network (CAN), which was originally developed for in-vehicle networking.

2.3 Sensors
LiDAR, radar, cameras and ultrasonic sensors are currently used in the automotive industry. With these sensors, a vehicle is capable of mapping its environment.
LiDAR
LiDAR (Light Detection and Ranging) creates a 3D representation of the environment within about 100 m and can separate the objects within it. It helps avoid hazardous situations such as a bright-coloured vehicle in a luminous environment, which may be imperceptible to a camera; an example is the accident of a Tesla Model S on 7 May 2016 (The Guardian 2016). LiDAR has worse resolution than an HD camera but provides more data because it is 3D. An advantage is that it can be used at night without public lighting, as shown in a test video by Ford Motor Company (Ford test 2018). Its disadvantages are that it is expensive and cannot be used in all weather conditions, since it does not provide reliable data in snow or fog.
Radar
Radar is cheaper than LiDAR but provides much less data, since it has worse resolution than an HD camera. Its advantage is that it can be used at any time, unaffected by
weather conditions or visibility. Radar produces a list of object sizes and distances. Reflections normally come only from significant objects, but in traffic, multi-reflection and other effects have to be taken into account during data processing, such as reflections from the road surface or guard rails. Radar is useful for keeping distance and for braking prediction: thanks to multi-reflection, in December 2016 a Tesla started braking before its camera could see that a vehicle ahead was braking hard, because its radar had detected this from the reflection of that car on the road, and so it avoided an accident (Koerber 2016).
Camera
HD cameras are cheap, can distinguish colours (in adequate conditions) and can be mounted on the outside of a vehicle; about 10 cameras can monitor the vehicle's entire surroundings. Due to the resolution and frame rate (30 frames/s), they produce a lot of data that has to be processed by the central unit.
Ultrasonic Sensor
Ultrasonic sensors have a short range (about 10 m), so they are typically used to assist parking or lane changing, when they can monitor objects behind the vehicle. They are cheap and reliable devices.
Central Unit
The self-driving function requires large computational resources, since data arrive from several sensors. In the case of the camera, the data must be processed within a deadline: there is no time to spare when deciding whether to stop at a red light or to turn the steering wheel. These supercomputers must be placed inside the car, but they should not take useful space from passengers and baggage. The Michigan Micro Mote (M3) is the smallest functioning computer in the world, but its computing capacity would not be sufficient to control an autonomous vehicle (Michigan Micro Mote 2015). For this task a GPU architecture, which allows parallel processing of data, is much more useful, because concurrent data from several sensors and cameras must be processed.
In recent years the GPU became more popular because of its use for bitcoin mining, which doubled card prices and seriously slowed their adoption in research, for example in the SETI project (Baraniuk 2018). On the other hand, the profit from this growth in popularity was used to develop GPUs further: NVIDIA presented a new AI-targeted system computing at 2 petaflops at the GPU Technology Conference in San Jose in March 2018 (Chin 2018). However, this device would need too much space in a car, so automobile manufacturers working on autonomous cars should either wait for a size reduction or choose the Drive PX Pegasus system presented by Nvidia at the GTC Europe 2017 conference in autumn 2017. The base of this system has the area of a licence plate; its performance of 320 trillion operations per second is less than the prior option, but Nvidia promises that it can serve level-5 self-driving functions (Nvidia 2017).
There are also experiments in controlling a car based only on camera data, as was done in February 2018 with a Porsche Panamera driven through a Huawei Mate 10 Pro. This device includes a Kirin 970 system-on-chip with a neural processing unit (NPU) that helps speed up the calculations needed by the AI (Newton 2018). NPUs typically shine in image processing, like the one in the Apple A11 chip. In the test, at a maximum speed of 30 miles/h, pictures from a single roof camera were sent to the cell phone, which recognized the situations and made the vehicle act. It would be a stretch to say that this cell phone could control a car in all circumstances, because during the test pictures from only one camera were used, which allows examining only a small number of the situations that can occur while driving. It did point out, however, how much an improved NPU can help alongside a GPU, and that it could possibly contribute to decreasing the power consumption of the built-in units.

2.4 Information Security in Communication Between Vehicles
The ability to communicate with other vehicles could solve numerous problems; for example, pot-holes could be reported. It would be an advance if vehicles could share their knowledge to prepare each other for traffic or irregular conditions on a road, or to prevent accidents. The question is with whom the differences between map and reality should be shared. Should they be sent to a centre (or to road maintenance, if needed) or only to the surroundings? If the second option is chosen, the load on the centre is reduced, there is no need to spread every piece of information to everyone, and network traffic is lower. In this case a built-in device is needed that can send and receive information over a standardized short-range protocol, together with a regulation that separates information by addressee (centre or surroundings). Communication could thus solve many problems, but it carries risks too: if it can be hacked, fake warnings or information could create accidents or make vehicles change their route.
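One well-known way to make such warnings resistant to forgery is message authentication. The sketch below illustrates the idea with a symmetric HMAC tag; this is our own illustration, not part of the paper's proposal, and a real vehicular network would use per-vehicle certificates and a PKI rather than a single shared key (which is an assumption made here purely for brevity).

```python
import hmac, hashlib, json

# Hypothetical shared key, for illustration only; a deployed V2V system
# would use asymmetric signatures and certificates instead.
SHARED_KEY = b"demo-key-not-for-production"

def sign_warning(payload: dict) -> dict:
    """Attach an HMAC tag so receivers can reject forged warnings."""
    body = json.dumps(payload, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return {"body": payload, "tag": tag}

def verify_warning(message: dict) -> bool:
    """Recompute the tag over the received body and compare in constant time."""
    body = json.dumps(message["body"], sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["tag"])

msg = sign_warning({"type": "pothole", "lat": 45.75, "lon": 21.23})
assert verify_warning(msg)          # authentic message accepted
msg["body"]["type"] = "road_clear"  # tampering changes the body
assert not verify_warning(msg)      # forged message rejected
```

A receiver would drop any message whose tag does not verify, which blocks the fake-warning attack described above as long as the key material is not compromised.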
3 Priority Levels in Autonomous Driving (KISS Priority)
The use of right-of-way signals is a knotty question in autonomous driving: could an ambulance or a fire engine be autonomous? In this article we study a world with only autonomous vehicles, excluding bicycles and motorcycles. How can speed be guaranteed for such vehicles? Our first solution is to notice the sound signal and get out of their way. The second is to introduce priority levels into autonomous driving (KISS Priority). If vehicles communicated, they could tell each other their level, making the control of city traffic easier. Low-level cars must move from the offside lane to the inside lane if a higher-level vehicle approaches from behind; in a traffic jam, they can clear the road for it. Ambulances, fire engines, police cars, and public transport or taxis in case of need could be on a higher level. This could solve the problem of drivers who do not give way to an ambulance and thereby risk a patient's life. Special cases may justify changing the level of a vehicle, for example a childbirth, when arriving at the hospital by car is faster than calling an ambulance. If this
vehicle does not have higher-level priority, other cars will not let it pass, and the autonomous driving system cannot speed up or change its behaviour to arrive at the hospital sooner. An increased level for a single trip could save lives, because there is no need to wait for an ambulance if the patient can be transported by car. Increasing the level could also be reasonable in case of an assault, to save the victims' lives by escaping. On the other hand, the possibility of increasing the level invites abuse, so regulation and terms of use are needed before deployment: for example, calling a centre and describing the situation before the increase, so that logging can separate rightful requests from abusers. Another aspect of this method is time: how long should a car keep its higher-level priority? If the centre gives the permit, it can take it back when the vehicle arrives at its destination (the centre might suggest the closest hospital and its address). In the case of a robbery, the car could detect the malice and increase its level automatically to protect its passengers and their belongings (Table 2).

Table 2. Priority levels by categories of vehicles (KISS Priority)

| Priority level | Category |
| 1–300 | Cars, vans, trucks, public transport |
| 401–500 | Cars of politicians and diplomats |
| 601–700 | Vehicles of disaster management |
| 801–900 | Ambulances, fire engines, police cars |
| 901–1000 | Cars of state leaders and their escort vehicles |
| 1000–10000 | Military vehicles |
On the lowest level are passenger vehicles. The priority level of vans cannot be lower than that of passenger cars, because their routes are bound to specific lanes by the highway code: passenger cars may overtake vans in common lanes, but a passenger car must not be permitted to force a van to stop. Slow autonomous vehicles should not be permitted in the offside lane, to prevent holding up traffic. Overtaking on a road with one lane per direction should work the same way, so a faster car can overtake a slower one if the road markings permit. A wide range is nevertheless assigned to this level because of the situations mentioned above and for forward planning (changes in transportation). In some countries, politicians below state-leader level are permitted to use distinctive signals to arrive in time to an appointment or at the airport, so the second level is for them and their escorts. The next level is the disaster-relief category. We rate into this category vehicles that do not travel directly for life-saving purposes but, for example, have to resolve a traffic situation such as a bus on fire, or are heading to repair something, such as a broken channel. In their case the wide band allows a further refinement: if there are several catastrophic situations, one of the responding vehicles can be assigned a higher value within the category, so that it arrives earlier than the others.
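The proposed priority scheme can be sketched in a few lines of code. The band boundaries below follow Table 2; the yield rule (a vehicle gives way when the approaching vehicle's priority is strictly higher) is our reading of the text, and the function names are illustrative, not part of the paper.

```python
# KISS priority bands as (low, high, category), following Table 2.
PRIORITY_BANDS = [
    (1, 300, "cars, vans, trucks, public transport"),
    (401, 500, "politicians and diplomats"),
    (601, 700, "disaster management"),
    (801, 900, "ambulance, fire engines, police"),
    (901, 1000, "state leaders and escorts"),
    (1000, 10000, "military"),
]

def category_of(level: int) -> str:
    """Map a numeric priority level to its vehicle category."""
    for low, high, name in PRIORITY_BANDS:
        if low <= level <= high:
            return name
    raise ValueError(f"level {level} falls outside all bands")

def must_yield(own_level: int, approaching_level: int) -> bool:
    """A vehicle clears the lane only for strictly higher priority."""
    return approaching_level > own_level

assert category_of(150) == "cars, vans, trucks, public transport"
assert must_yield(own_level=150, approaching_level=850)      # car yields to ambulance
assert not must_yield(own_level=850, approaching_level=150)  # ambulance keeps its lane
```

The gaps between bands (e.g. 301–400) mirror the table and leave room for the future adjustments the text mentions; a temporary level increase granted by the centre would simply move a vehicle into a higher band for the duration of one trip.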
We again assigned a wide band in the case of ambulances, fire engines and police cars, so that, for example, the fire chief can arrive at the location earlier to assess the situation and decide where the other vehicles should park for the most effective firefighting. The state leaders' vehicles and their escorts are the next level. Although arriving earlier at an appointment is not more important for a politician than a human life, we rate them above the ambulance category because there will be only a few vehicles on this level, and we cannot allow a hacked ambulance to stop such a vehicle. Of course the question arises: if a vehicle carries a state leader, can it be autonomous at all? We return to this issue later. We rated military vehicles into the highest level. We do not see large numbers of military vehicles on the roads on weekdays in a democratic country, and they are usually used for goods traffic anyway, so in those cases their priority should be decreased to the level of a basic van; if such a vehicle is attacked, the priority level would be increased again for escape, as mentioned above. This category was purposely left with wide bands because of the large differences between vehicles and future technological improvements. If autonomous vehicles are given an option to modify their priority in order to handle the situations above, that option must be well protected against abuse. Although it seems that communication between vehicles can avert several dangerous situations, there is danger in it too: if the communication is hacked, a vehicle could be given false notifications, steering it into an accident or forcing it to change direction.
4 Manchurian AI – Danger in the Usage of Artificial Intelligence in the World of Autonomous Vehicles
The use of artificial intelligence is visibly spreading among developer companies, because it can be a better solution for self-driving than earlier methods such as the continually learning system of Tesla. An AI-based system is a hard challenge for developers, because it must be prepared for, and taught to handle, every possible situation and to avoid its dangers; the teaching method makes the difference. In 2018, researchers at MIT presented a psychopath AI called Norman (Yanardag and Cebrian 2018). Norman took a Rorschach test (Rorschach 1927) alongside a standard AI, allowing the differences between their answers (what they see in the inkblots) to be analysed and highlighting the importance of the teaching method in the decision making of an AI (Exner 1995). The MIT study shows that an AI could be built with killing functions, like "Christine" in the novel by Stephen King (King 1983). In the world of autonomous vehicles this could be very important: consider the example of Tesla in June 2018, where one disgruntled staffer was enough to cause trouble (Kolodny 2018). Of course, an evil car would be caught during a long testing period, but such a plan could succeed if it were based on the idea of the film "The Manchurian Candidate" (Condon 1959). In this case the killing function would be linked to the sight of a rare road sign or a rare traffic situation; this trigger would activate the malice and switch the protective functions to attack, which
could create an accident, an injury or a terror event, and an investigation would need a long time to find the cause and then recall all the Manchurian AIs. The investigation could be made even more difficult if the killing function activates at the sign but acts only a random time or distance later, so that every incident looks different. This would delay detection, although such a Manchurian AI could not be used for mass destruction.
5 Autonomy of State Leaders' and Military Vehicles
The question raised earlier was whether a vehicle that carries, for example, the President of the United States can be fully autonomous at all. Before answering, it is worth recalling what we know about autonomous vehicles: a complex system with redundant facilities (such as the brakes) that keep the vehicle in motion in case of a malfunction, or stop it without an accident. Control is exercised by an AI, which is improving rapidly nowadays and can perform increasingly complicated tasks in other fields as well, as seen in one of the keynote presentations of Google I/O 2018 (Callaham 2018). We have also seen that these systems can be hacked, and no autonomous vehicle is an exception. The question is which carries less risk: a hacked autonomous presidential car, or a blackmailed driver in a regular car? Drivers with these assignments go through multiple levels of examination and training before they get the job, and several other people travel in the car who can intervene to keep it on its planned route. In a hacked car, people lose control, making it more dangerous. The same is true for military vehicles. For this reason, according to today's knowledge, it is not recommended from a safety point of view to replace state leaders' and military vehicles with autonomous cars; instead, equipment needs to be added that reports their priority to the other, already autonomous, vehicles. Of course, it cannot be excluded that with time these vehicles will become fully autonomous too.
6 Conclusion
We have seen that there are several fields that the developers of autonomous vehicles must mind. We suggest creating priority levels for autonomous transport to provide a fast path for marked vehicles: low-level cars must move from the offside lane to the inside lane if a higher-level vehicle approaches from behind, and in a traffic jam they can clear the road for it. Ambulances, fire engines, police cars, and public transport or taxis in case of need could be on a higher level. This could solve the problem of drivers who do not give way to an ambulance and thereby risk a patient's life. We defined the categories of autonomous vehicles and their priority bands, leaving room on every level for the adjustments that the development of this technology will require. We have seen that the use of artificial intelligence is spreading among developer companies, because it can be a better solution for self-driving than earlier methods such as the continually learning system of Tesla. The teaching method behind the decision making of an AI is very important: in 2018, MIT researchers presented a psychopath AI called
Norman. The MIT study shows that an AI could be built with killing functions (a Manchurian AI), where an object, a word, a situation, etc. could activate the malice and switch protective functions to attack, creating an accident, an injury or a terror event. Finally, considering the current status and the available information, it is not yet advisable to replace the vehicles of the military and of state leaders with autonomous vehicles; instead, a device must be built into them to report their priority level to the other participants in traffic.
Acknowledgements. The research presented in this paper was carried out as part of the EFOP-3.6.2-16-2017-00016 project in the framework of the New Széchenyi Plan. The completion of this project is funded by the European Union and co-financed by the European Social Fund.
References
Aiordachioaie, D.: On time-frequency image processing for change detection purposes. In: Soft Computing Applications, Advances in Intelligent Systems and Computing, vol. 633. Springer (2016). ISBN 978-3-319-62521-8
Aizenberg, I., Aizenberg, N.N., Vandewalle, J.P.L.: Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications. Springer Science & Business Media (2000)
Baraniuk, C.: Crypto-currency craze hinders search for alien life (2018). http://www.bbc.com/news/technology-43056744. Accessed 14 July 2018
BMW Group: Fully autonomous driving (2017). https://youtu.be/E8xg5I7hAx4. Accessed 14 July 2018
Callaham, C.: Google Assistant demo gets a bit creepy as its AI voice calls and speaks to real people (2018). https://www.androidauthority.com/google-assistant-duplex-862971/. Accessed 14 July 2018
Carvalho, A.C.L.F., Fairhurst, M.C., Bisset, D.: An integrated Boolean neural network for pattern classification. Pattern Recogn. Lett. 15(8), 807–813 (1994)
Chin, M.: Nvidia just unveiled a terrifying AI supercomputer (2018). https://mashable.com/2018/03/27/nvidia-unveils-ai-supercomputer/. Accessed 14 July 2018
Condon, R.: The Manchurian Candidate. McGraw-Hill (1959). ISBN 978-0743482974
Dechter, R.: Learning while searching in constraint-satisfaction problems. In: 5th National Conference on Artificial Intelligence, pp. 178–183 (1986)
Exner, J.E.: The Rorschach: A Comprehensive System, vol. 1: Basic Foundations. Wiley, New York (1995). ISBN 0-471-55902-4
Ford test (2018). https://www.youtube.com/watch?v=cc15Ox8UzEw&feature=youtu.be. Accessed 14 July 2018
Julia Computing (2017). https://juliacomputing.com/domains/ml-and-ai.html. Accessed 14 July 2018
King, S.: Christine. Viking (1983). ISBN 978-0-670-220267
Koerber, B.: Tesla predicts a car crash (2016). https://mashable.com/2016/12/27/tesla-predicts-crash-ahead-video/. Accessed 14 July 2018
Kolodny, L.: Elon Musk emails employees about 'extensive and damaging sabotage' by employee. CNBC (2018). https://www.cnbc.com/2018/06/18/elon-musk-email-employee-conducted-extensive-and-damaging-sabotage.html. Accessed 14 July 2018
Michigan Micro Mote: Michigan Micro Mote (M3) makes history (2015). https://www.eecs.umich.edu/eecs/about/articles/2015/Worlds-Smallest-Computer-Michigan-Micro-Mote.html. Accessed 14 July 2018
Newton, T.: Watch a Huawei Mate 10 Pro drive a Porsche Panamera. PC Magazine (2018). https://www.pcmag.com/news/359487/watch-a-huawei-mate-10-pro-drive-a-porsche-panamera. Accessed 14 July 2018
Nvidia: World's first functionally safe AI self-driving platform (2017). https://www.nvidia.com/en-us/self-driving-cars/drive-px/. Accessed 14 July 2018
Rorschach, H.: Rorschach Test – Psychodiagnostic Plates. Hogrefe, Cambridge (1927). ISBN 3-456-82605-2
SAE International: Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles (J3016_201806) (2018). https://www.sae.org/standards/content/j3016_201806/. Accessed 31 July 2018
The Guardian (2016). https://www.theguardian.com/technology/2016/jun/30/tesla-autopilot-death-self-driving-car-elon-musk. Accessed 14 July 2018
Yanardag, P., Cebrian, M., Rahwan, I.: Norman, world's first psychopath AI (2018). http://norman-ai.mit.edu. Accessed 14 July 2018
Mobile Robot Platform for Studying Sensor Fusion Localization Algorithms Paul-Onut Negirla(&) and Mariana Nagy Arad, Romania
Abstract. A major risk in developing control loops for autonomous mobile robots is relying on a single piece of measuring equipment. This paper describes the construction and application of a multi-sensor platform on which sensor-fusion algorithms can be implemented and analyzed. The proposed application combines signals from rotary encoders and inertial sensors: wheel encoders are prone to errors due to slipping and skidding, while inertial information gathered from an electronic gyroscope and an accelerometer rapidly accumulates error due to numerical integration. Preliminary results show that the performance of robot localization increases by merging the above-mentioned sensors through a complementary filter. Moreover, this paper aims to provide a supporting framework for future analysis of mapping and path-finding algorithms for autonomous mobile systems. Keywords: Sensor fusion · Noise reduction · Sensor analysis · Complementary filter · Odometry · Inertial localization · Differential steering
1 Introduction
The main goal of robotics is the development of autonomous robots that can easily replace the human workforce. In order to make the right decisions, robots use sensors that perceive the environment and provide data to a computing centre where specialized algorithms decide how to perform certain tasks. Without sensors, a robot cannot make decisions, which means that it can only be controlled manually by an operator. Mobile robots are systems with electrical motors that can move and perform tasks in a real environment, where they need to plan their route and avoid the obstacles they encounter. A single type of sensor cannot be satisfactory in all situations encountered in autonomous robot navigation, especially in industrial environments where the sensor signal is affected by interference. Combining information from several types of sensors can compensate for these drawbacks through a process called sensor fusion (Waltz 1990). The paper presents the application of this principle on a robot platform with differential steering, equipped with an accelerometer, a compass, a gyroscope and rotary position encoders. In a classic closed loop, if the sensors used to guide a mobile robot fail, the decision in most cases is to stop the equipment or to fall back to open-loop control. Acquiring data from a variety of different but redundant sensors makes it
P.-O. Negirla and M. Nagy
possible to inter-validate measurements and choose a higher-confidence result with specific calculation methods (Fig. 1).
Fig. 1. Differential two wheeled mobile robot platform.
2 Types of Sensors for Relative Positioning of the Robot Platform
An electronic, mechanical or chemical device that transforms an environmental attribute into a quantitative measurement is called a sensor. In industrial environments, sensors that depend on a satellite signal or are affected by electromagnetic interference cannot be used, which rules out sensors such as the magnetic compass or GPS receivers. The most common sensors used in mobile robots are rotary encoders, accelerometers and gyroscopes.
2.1 Rotary Position Encoders
Each wheel has a position encoder that measures direction, position and rotation speed by converting mechanical motion into electrical signals. Using two Hall sensors, or an optically unique code pattern, it is possible to determine the direction of rotation or even the absolute position of the motors and, implicitly, the position of the wheels that move the robot.
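As a sketch (not from the paper), the direction-sensing logic of a two-channel quadrature encoder such as the one described above can be written as a small transition table; the function name and sampling scheme are illustrative:

```python
# Sketch: decoding direction and position from two quadrature Hall sensors.
# The 2-bit state (A, B) advances 00 -> 01 -> 11 -> 10 in one direction and
# reverses in the other; each valid transition is one signed count.

# Transition table: (previous state, new state) -> +1 / -1 count
_STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}

def decode(samples):
    """Accumulate a signed tick count from successive (A, B) bit pairs."""
    count = 0
    prev = samples[0]
    for state in samples[1:]:
        count += _STEP.get((prev, state), 0)  # ignore repeats/invalid jumps
        prev = state
    return count

# Forward sequence: four transitions -> +4 counts
forward = decode([0b00, 0b01, 0b11, 0b10, 0b00])
```

The signed count, scaled by the distance per tick, gives the per-wheel travelled distances used by the odometry equations in Sect. 3.1.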
Mobile Robot Platform for Studying Sensor Fusion Localization Algorithms
2.2 Inertial Accelerometer and Gyroscope
An accelerometer and a gyroscope were mounted on the back of the chassis. The accelerometer is a sensor that measures acceleration on the three axes of a body based on inertia. It is composed of a mobile silicon structure that deflects with body movements, producing a cumulative variation of several capacitive values. Gyroscopes (from Greek gyros, rotation, and skopeein, to see) are motion sensors that measure the angular velocity of an object relative to an inertial reference frame. MEMS gyroscopes exploit the Coriolis effect and are built around a polycrystalline silicon resonant mass fixed in a frame of the same material, so that resonance occurs in only one direction. Although inferior in precision to optical gyroscopes, MEMS gyroscopes have the advantages of reduced cost, size and power consumption.
3 Localization Methods
The mobile platform is based on dead-reckoning sensors that feed different localization methods in order to add robustness to the final solution. The drawbacks of each method can be avoided by relying on an alternative solution when required. Modern approaches for two-wheeled robots based on differential-steering odometry describe optimized path-planning and localization algorithms that can use Taylor series to estimate the next spatial and angular position.
3.1 Odometry
For a turn, the distances travelled by each wheel and by the midpoint of the axle can be calculated as follows, where r is the turning radius of the inner wheel, b is the wheel separation and θ is the turning angle (Fig. 2).
Fig. 2. Route calculation on robot turnings.
s_L = r·θ (1)
s_R = (r + b)·θ (2)
s_M = (r + b/2)·θ (3)
Turning the robot to a different angle is done using:
dθ/dt = (v_R − v_L)/b (4)
By integrating (4), the orientation of the robot based on the speed of each wheel can be calculated as follows:
θ(t) = (v_R − v_L)·t/b + θ(0) (5)
dx/dt = ((v_R + v_L)/2)·cos(θ(t)) (6)
dy/dt = ((v_R + v_L)/2)·sin(θ(t)) (7)
Integrating (6) and (7) again, the relative (x, y) Cartesian coordinates can be calculated:
x(t) = x(0) + [b·(v_R + v_L)/(2·(v_R − v_L))]·[sin((v_R − v_L)·t/b + θ(0)) − sin(θ(0))] (8)
y(t) = y(0) − [b·(v_R + v_L)/(2·(v_R − v_L))]·[cos((v_R − v_L)·t/b + θ(0)) − cos(θ(0))] (9)
Using these relationships to calculate the position of the robot is difficult when the robot moves straight ahead, because the denominator (v_R − v_L) becomes null; therefore the following approximation is usually used in practice (Jha and Kumar 2014) [9]:
s_M = (s_R + s_L)/2 (10)
θ = (s_R − s_L)/b + θ(0) (11)
x = s_M·cos(θ) + x(0) (12)
y = s_M·sin(θ) + y(0) (13)
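The practical approximation of Eqs. (10)–(13) can be sketched as a short pose-update function; the variable names and the numeric wheel separation below are illustrative, not taken from the platform:

```python
import math

def odometry_step(pose, s_left, s_right, b):
    """Update (x, y, theta) from per-wheel travelled distances.

    Implements the approximation of Eqs. (10)-(13):
    s_M = (s_R + s_L) / 2, theta += (s_R - s_L) / b,
    where b is the wheel separation.
    """
    x, y, theta = pose
    s_m = (s_right + s_left) / 2.0
    theta = theta + (s_right - s_left) / b
    x = x + s_m * math.cos(theta)
    y = y + s_m * math.sin(theta)
    return (x, y, theta)

# Straight segment: both wheels travel 0.10 m -> pure translation along x
pose = odometry_step((0.0, 0.0, 0.0), 0.10, 0.10, b=0.15)
```

Calling this once per encoder sampling period accumulates the robot's relative pose, which is the encoder-only path compared against the fused estimate in Sect. 6.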
3.2 Inertial Localization
The movement in (x, y) coordinates can also be determined from accelerometer readings, but double integration quickly accumulates errors, and the sensor is also prone to noise. Equations of movement:
v = v_0 + a·t
x(t) = x(0) + v_x(0)·t + a_x·t²/2
y(t) = y(0) + v_y(0)·t + a_y·t²/2
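A minimal one-dimensional sketch of this double integration illustrates the procedure; sample values are illustrative only:

```python
def integrate_acceleration(acc, dt, v0=0.0, p0=0.0):
    """Dead-reckon 1-D position from acceleration samples, applying
    v = v0 + a*t and x = x0 + v0*t + a*t^2/2 per step. Any bias in `acc`
    grows quadratically in position, which is why the accelerometer
    alone accumulates localization error so quickly."""
    v, p = v0, p0
    for a in acc:
        p += v * dt + 0.5 * a * dt * dt
        v += a * dt
    return v, p

# Constant 1 m/s^2 for 1 s (100 samples at 10 ms): v -> 1 m/s, p -> 0.5 m
v, p = integrate_acceleration([1.0] * 100, dt=0.01)
```

The same loop run on biased input (e.g. a constant 0.01 m/s² offset) shows the drift that the complementary filter in Sect. 4 is meant to suppress.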
4 Fusing Sensor Readings with Complementary Filtering
Complementary filtering is a simple fusion method for two sensors that can be used in environments where the Kalman filter cannot be implemented due to computational difficulty or hardware constraints (Madhira 2016) [10], (Bostani 2008) [11].
Fig. 3. Complementary filter block diagram.
The block diagram of the complementary filter is shown in Fig. 3, where x and y are the output data of two noise-affected sensors and ẑ is the filtered estimate. With high-frequency noise on the y signal and low-frequency noise on the x signal, we basically have two filters: a low-pass filter G(s) and a complementary high-pass filter [1 − G(s)]. The estimate ẑ is then calculated as:
ẑ = x·[1 − G(s)] + y·G(s)
Such a filter is commonly encountered in the fusion of gyroscope and accelerometer data to estimate the inclination angle, because the accelerometer provides data for correcting the drift error encountered in gyroscopes (Chen 2013) [12] and (Pei 2017) [13]. In a raw variant, the estimated angle can have the following form:
angle_est = (angle + gyroData·dt)·[1 − G(s)] + accData·G(s)
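In discrete time this update is commonly reduced to a single blending coefficient; the sketch below assumes that standard first-order form, where α stands in for the high-pass weight on the gyro path and the value 0.98 is illustrative:

```python
def complementary_filter(angle, gyro_rate, acc_angle, dt, alpha=0.98):
    """One update of the discrete complementary filter: the gyro path
    (angle + rate*dt) is weighted by alpha (high-pass role), and the
    accelerometer-derived angle by (1 - alpha) (low-pass role)."""
    return alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * acc_angle

# With a zero gyro rate, the estimate converges toward the accelerometer
# angle, illustrating how accelerometer data corrects gyro drift.
angle = 0.0
for _ in range(400):
    angle = complementary_filter(angle, gyro_rate=0.0, acc_angle=10.0, dt=0.01)
```

Larger α trusts the gyro more over short horizons; smaller α pulls the estimate toward the accelerometer faster but admits more of its high-frequency noise.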
5 Robot Platform for Acquisition
For data acquisition, a differential-drive robot was used with two DC motors with a 1:53 reducer, an encoder for each wheel, an ADXL345 accelerometer, and a Raspberry Pi development board. The developed application drives the electrical motors over a set route and transmits filtered and unfiltered data from the accelerometer and encoders in real time over a wireless link to a server for analysis, using a clock-synchronization algorithm between the platform and the server. Several modern methods of securing the communication between the mobile platform and the server were taken into consideration (Arif, Wang, Balas and Peng, 2018) [14, 15] and need to be taken into account when using remote sensor-collection servers with autonomous robots. However, for the scientific gathering of localization data this was out of scope, and minimal security through SSH communication between the robot and the server was used (Fig. 4).
Fig. 4. Main components of the mobile platform were installed on the back.
The ADXL345 is an accelerometer sensor that provides acceleration information on three axes. The range in which values can be read is [−16 g, 16 g]. The read values are 16-bit and can be accessed through either SPI or I2C. The ADXL345 measures both dynamic accelerations resulting from shocks or movements and static gravity. In this way the accelerometer can be used to detect inclination as well as movement on all three axes (Fig. 5).
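As a hedged illustration of the data format (the byte pairing and the roughly 4 mg/LSB full-resolution scale come from the ADXL345 datasheet; the helper name and sample bytes are hypothetical, not read from hardware):

```python
def adxl345_raw_to_g(lo, hi, scale=0.004):
    """Combine the two data-register bytes of one axis into a signed
    16-bit value and convert to g (~4 mg/LSB in full-resolution mode)."""
    raw = (hi << 8) | lo
    if raw & 0x8000:              # two's-complement sign extension
        raw -= 1 << 16
    return raw * scale

# 0x0100 = 256 LSB -> 1.024 g (roughly one g on a resting axis)
g_reading = adxl345_raw_to_g(0x00, 0x01)
```

The same conversion, applied per axis, yields the acceleration samples fed to the inertial localization and the complementary filter.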
Fig. 5. Accelerometer and gyroscope positioning at the back of the robot platform chassis.
On the motor axles, each of the two Hall sensors emits 53 pulses per revolution of the wheel. Two sensors are used to determine the direction of travel (Hamilton 2010) [16] (Fig. 6).
Fig. 6. Encoders at the end of the motor axle with two Hall sensors.
6 Results
By applying a complementary filter over the accelerometer and encoder measurements, both the skid and slip errors of the encoders and the error accumulation caused by the double integration of acceleration were filtered out (Figs. 7 and 8).
Fig. 7. Encoder-based path calculation.
Fig. 8. Results of measured distances for a predefined path. Green – Accelerometer, Purple – Encoders, Red – Real Path and in Blue the path calculated through the complementary filter.
Figure 8 shows the measured and real paths over time (cm/s). The red line represents the actual route, measured with a camera above the robot moving in a room. In light green we can see the route measured by the inertial sensors, which accumulate errors rapidly, and in purple the route calculated using rotary encoders, which
accumulate errors due to wheel skid. Finally, the blue line shows the filtered signal, which compensates for the skid on the encoders and also filters out the errors of the inertial integration, resulting in a precise low-cost sensor for robot positioning.
7 Conclusion
In conclusion, the use of several sensor types can improve the performance of the entire system by finding a way to combine information so that each sensor compensates for the deficiencies of the others according to the situation. By merging the sensors, a more efficient and cost-effective system is achieved than by replacing them with dedicated solutions.
References 1. Waltz, E., Llinas, J.: Multisensor Data Fusion. Artech House, Norwood (1990) 2. Kim, H.D., Seo, S.W., Jang, I.H., Sim, K.B.: SLAM of mobile robot in the indoor environment with digital magnetic compass and ultrasonic sensors. In: 2007 International Conference on Control, Automation and Systems, Seoul, pp. 87–90 (2007) 3. Kim, J.H., Seong, P.H.: Experiments on orientation recovery and steering of an autonomous mobile robot using encoded magnetic compass disc. IEEE Trans. Instrum. Meas. 45(1), 271– 274 (1996) 4. Suksakulchai, S., Thongchai, S., Wilkes, D.M., Kawamura, K.: Mobile robot localization using an electronic compass for corridor environment. In: Smc 2000 Conference Proceedings. 2000 IEEE International Conference on Systems, Man and Cybernetics, Nashville, TN (cat. no. 0), vol. 5, pp. 3354–3359 (2000) 5. Lee, D., Son, S., Yang, K., Park, J., Lee, H.: Sensor fusion localization system for outdoor mobile robot. In: 2009 ICCAS-SICE, Fukuoka, pp. 1384–1387 (2009) 6. Hsu, C., Lai, C., Kanamori, C., Aoyama, H., Wong, C.: Localization of mobile robots based on omni-directional ultrasonic sensing. In: SICE Annual Conference 2011, Tokyo, pp. 1972–1975 (2011) 7. Ko, D.W., Yi, C., Suh, I.H.: Semantic mapping and navigation with visual planar landmarks. In: 2012 9th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), Daejeon, pp. 255–258 (2012) 8. Ishii, K., Ishida, A., Saul, G., Inami, M., Igarashi, T.: Active navigation landmarks for a service robot in a home environment. In: 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Osaka, pp. 99–100 (2010) 9. Jha, A., Kumar, M.: Two wheels differential type odometry for mobile robots. In: Proceedings of 3rd International Conference on Reliability, Infocom Technologies and Optimization, Noida, pp. 1–5 (2014) 10. Madhira, K., Gandhi, A., Gujral, A.: Self-balancing robot using complementary filter: Implementation and analysis of complementary filter on SBR. 
In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), Chennai (2016) 11. Bostani, A., Vakili, A., Denidni, T.A.: A novel method to measure and correct the odometry errors in mobile robots. In: 2008 Canadian Conference on Electrical and Computer Engineering, Niagara Falls, ON (2008)
12. Chen, C., Zhang, J.: Using odometry for differential wheeled robots. In: 2013 International Symposium on Next-Generation Electronics, Kaohsiung, pp. 569–571 (2013) 13. Pei, Y., Kleeman, L.: A novel odometry model for wheeled mobile robots incorporating linear acceleration. In: 2017 IEEE International Conference on Mechatronics and Automation (ICMA), Takamatsu, pp. 1396–1403 (2017) 14. Arif, M., Wang, G., Balas, V.E.: Secure VANETs: trusted communication scheme between vehicles and infrastructure based on fog computing. Stud. Inf. Control 27(2), 235–246 (2018) 15. Arif, M., Wang, G., Peng, T.: Track me if you can? Query based dual location privacy in VANETs for V2V and V2I. In: 2018 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE). IEEE (2018) 16. Hamilton, B.R., Ma, X., Baxley, R.J., Walkenhorst, B.: Node localization and tracking using distance and acceleration measurements. In: 2010 2nd International Workshop on Cognitive Information Processing, Elba, pp. 399–404 (2010)
Fuzzy Applications, Theory, Expert Systems, Fuzzy and Control
Design of Intelligent Stabilizing Controller and Payload Estimator for the Electromagnetic Levitation System: DOBFLC Approach
Ravi V. Gandhi and Dipak M. Adhyaru
1 Indian Institute of Technology Gandhinagar (IITGN), Palaj 382355, Gujarat, India, [email protected]
2 Institute of Technology, Nirma University, S.G. Highway, Ahmedabad 382481, Gujarat, India, [email protected]
Abstract. This article presents the design of an intelligent stabilizing controller and payload estimator using the Disturbance Observer Based Fuzzy Logic Controller (DOBFLC) approach for the Electromagnetic Levitation System (EMLS). The EMLS is an unstable and nonlinear benchmark system with a wide range of applications. Investigation reveals that the payload is one of the prime sources of vertical disturbance for the EMLS. Initially, the stabilizing controller is designed as a pre-filter-based Fuzzy-PID controller tuned by the ITAE criterion in the absence of payload. Next, the payload estimator is developed in the presence of vertical disturbance due to payload, for sinusoidal and variable step-change patterns. Finally, the disturbance corrector is hybridized with the primary FLC to achieve stabilizing control as well as disturbance-rejection control using the proposed methodology. Simulation results are presented to validate the efficacy of the proposed design under payload variation of 0–20%, similar to medium-speed MAGLEV trains. To confirm the superiority of the proposed approach, the results obtained by the DOBFLC approach are compared with the DOBC plus LQR approach for fixed as well as variable step changes of set-point and payload.
Keywords: Electromagnetic Levitation System (EMLS) · Disturbance Observer Based Control (DOBC) · Fuzzy Logic Controller (FLC) · Payload · LQR control
1 Introduction
MAGnetic LEVitation (MAGLEV) technology has been extensively used for a broad range of real-time applications like magnetic trains, magnetic bearings, wind turbines, weighing systems, magnetic suspensions, etc., due to its contactless features [1]. Its open-loop unstable and nonlinear behaviour has made this technology a benchmark for validating the effectiveness of control algorithms [2]. The MAGLEV train is one of the renowned applications of this technology.
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 329–343, 2021. https://doi.org/10.1007/978-3-030-51992-6_26
To investigate and to control the
levitation phenomena in MAGLEV trains, different prototype structures have been proposed in the literature. The Electromagnetic Levitation System (EMLS) is one of the prototype structures that has been used to investigate multiple control requirements for MAGLEV trains with an almost fixed amount of clearance or airgap [3]. Due to this feature, the linearized model has been widely used to design stabilizing controllers for the EMLS. The phase-lead compensator, in a single or cascade loop, is one of the most popular control approaches for stabilizing the EMLS, as applied in [4–7]. Conventional structures of PID controllers like LQR-based PID [8], IMC-based PID [9] and fractional PID [10] have been investigated to enhance performance. Because of its inherent intelligence, the performance of the fuzzy controller is superior to the conventional controller [11]. The fuzzy controller has also proved superior to LQR controllers for stabilizing the EMLS [12]. Many researchers proposed composite structures like the fuzzy-PD controller [13–15], the fuzzy-PID controller [16, 17, 32] and the Takagi-Sugeno fuzzy regulator [33] to achieve stabilizing control for the EMLS. Effective stabilizing control in the presence of random payload disturbance has been obtained using the Pre-fuzzy-PID controller [18]. The payload in a range of 0–20% is one of the prime sources of disturbance for low- to medium-speed MAGLEV trains [19]. It has been shown that the performance of the MAGLEV train can be drastically improved using Disturbance Observer-Based Control (DOBC) [20]. An LQR-based stabilizing controller has been integrated with DOBC to suppress the effect of a step change in payload disturbance due to the weight change of passengers [21]. A broad range of applications based on the fuzzy controller plus DOBC approach is discussed in [22–25].
In this paper, a similar DOBC has been designed and integrated with the Pre-fuzzy-PID type of controller [18] to obtain smooth and effective stabilizing control as well as disturbance-rejection control for the EMLS. This paper may be used to estimate the payload or an unknown disturbance for a wide range of applications. The organization of the chapter is as follows: Sect. 2 describes the modelling of the voltage-controlled mode of the EMLS. In Sect. 3, the requirements of the Disturbance Observer (DO) for payload estimation and the design of the Disturbance Observer-Based Fuzzy Logic Controller (DOBFLC) are discussed. In Sect. 4, simulation results are presented to confirm the efficacy of the proposed design, followed by the conclusion.
2 Electromagnetic Levitation System (EMLS)
The EMLS is one of the open-loop unstable and nonlinear benchmark systems [27, 33]. A schematic of the EMLS is shown in Fig. 1. The prime objective of the controller is to regulate the ball position or airgap (x) under variation of the vertical disturbance (d) due to the payload in the range of 0–20%, by manipulating the electromagnetic coil voltage (VC). The nonlinear dynamical Voltage Controlled Quarter Model (VC-QM) for the EMLS can be expressed by (1) [26, 27]:
ẋ1(t) = x2(t)
ẋ2(t) = g + d(t)/M − C1·x3²(t)/x1²(t)
ẋ3(t) = −C2·x3(t) + C3·u(t),  y(t) = x1(t) (1)
Here, x(t) ∈ R^(3×1), u(t) ∈ R^(1×1) and y(t) ∈ R^(1×1) are the state, input and output variables, respectively, for the EMLS. In this research, the vertical payload of 0–20% acting against the steel-ball acceleration dynamics is considered the major source of disturbance or uncertainty. Other disturbances or uncertainties are not considered.
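Assuming C1 = Cf/M, C2 = R/L and C3 = 1/L (an interpretation consistent with the linearized matrices quoted later in this section, not stated explicitly in the text), Eq. (1) can be evaluated directly; the equilibrium check below is an illustrative sketch:

```python
import math

# Parameters from Table 1; C1, C2, C3 are assumed to be Cf/M, R/L, 1/L.
R, L, Cf, M, g = 11.0, 0.4125, 6.5308e-5, 0.068, 9.82
C1, C2, C3 = Cf / M, R / L, 1.0 / L

def emls_dynamics(x, u, d=0.0):
    """Right-hand side of Eq. (1): x = (airgap, velocity, coil current)."""
    x1, x2, x3 = x
    return (x2,
            g + d / M - C1 * x3 ** 2 / x1 ** 2,
            -C2 * x3 + C3 * u)

# At the operating point (x0 = 7 mm, i0 = x0*sqrt(g/C1), u0 = R*i0)
# all three derivatives should vanish.
i0 = 0.007 * math.sqrt(g / C1)
dx = emls_dynamics((0.007, 0.0, i0), u=R * i0)
```

With these assumed constants, the computed operating voltage R·i0 also lands close to the u0 = 7.7821 V reported in Table 1, which supports the interpretation.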
Fig. 1. Schematic of Electromagnetic Levitation System [26].
Matrices of the linearized state model for (1) around the operating conditions (x0, u0) with nominal payload can be obtained as follows [26]:
A = [0, 1, 0; 2·C1·x30²/x10³, 0, −2·C1·x30/x10²; 0, 0, −C2], B = [0; 0; C3], C = [1 0 0] (2)
The linearized state-space model and transfer function can be expressed by (3) and (4), respectively, using the simulation parameters of the EMLS listed in Table 1:
ẋ(t) = [0, 1, 0; 2803, 0, −27.7; 0, 0, −26.7]·x(t) + [0; 0; 2.4242]·u(t), y(t) = x1(t) (3)
G_m(s) = X(s)/V_c(s) = 67.23/(s³ + 26.67·s² − 2803·s − 74740) (4)
Table 1. Simulation parameters of EMLS [27].
Parameter | Symbol | Value
No. of turns | N | 2450
Coil resistance | R | 11 Ω
Coil inductance | L | 0.4125 H
Magnet force constant | Cf | 6.5308 × 10⁻⁵ N·m²/A²
Mass of steel ball | M | 0.068 kg
Gravitational constant | g | 9.82 m/s²
Operating airgap | x0 | 0.007 m
Operating voltage | u0 | 7.7821 V
The eigenvalues of the given system are +52.9420, −52.9420 and −26.67, which represent an unstable nature due to one pole (+52.9420) in the right half of the s-plane. The problem statement and control requirements can be formulated based on the above analysis of the state-space model, transfer function and eigenvalues as follows:
(i) A stabilizing controller is required to maintain the steel-ball position as per the requirement for the given open-loop unstable EMLS.
(ii) A Disturbance Observer Based Controller (DOBC) is recommended to estimate and compensate for the effect of the vertical disturbance due to the payload.
(iii) Hybridization of the stabilizing controller and the DOBC is mandatory to achieve stabilizing as well as payload-rejection control for the EMLS.
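The instability claim can be checked numerically from the A matrix of Eq. (3); this is an illustrative verification, not part of the paper:

```python
import numpy as np

# Linearized A matrix of Eq. (3); one positive eigenvalue confirms the
# open-loop instability noted in the text.
A = np.array([[0.0, 1.0, 0.0],
              [2803.0, 0.0, -27.7],
              [0.0, 0.0, -26.7]])
eig = sorted(np.linalg.eigvals(A).real)
```

The block-triangular structure makes the result easy to read off: the position-velocity block contributes ±√2803 ≈ ±52.94 and the coil dynamics contribute −26.7.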
3 Disturbance Observer Based Fuzzy Logic Control (DOBFLC)
This section presents the proposed control approach for the problem statements formulated in the previous section, as shown by the block-diagram structure in Fig. 2. The main components of the proposed approach are: (i) the Fuzzy Logic Controller (FLC), (ii) the Disturbance Observer (DO), and (iii) the Disturbance Corrector. The hybridized approach synchronizing these three components is termed the Disturbance Observer Based Fuzzy Logic Controller (DOBFLC) in this paper.
Fig. 2. Schematic block diagram of DOBFLC.
Assumption 1: For the given EMLS, the considered source of disturbance is the payload, which changes very smoothly in the range of 0–20% of the weight of the steel ball. Other vertical disturbances on the steel ball are neglected.
Assumption 2: Information on all the states is available to design the payload estimator using the Disturbance Observer (DO).
Assumption 3: During the simulation, all the control signals like u_f and u_d remain within the specified range to avoid falling or sticking of the steel ball.
The design of the DOBFLC is three-fold:
(i) Stabilizing-controller design for fixed and variable step-change set-points using the FLC,
(ii) Payload-estimator design, separately for staircase and sinusoidal disturbances, utilizing a separation-type principle [28, 31],
(iii) Disturbance corrector integrated with the FLC for stabilizing and disturbance-rejecting control.
3.1 Fuzzy Logic Controller (FLC)
The configuration of the Fuzzy-PID controller is shown in Fig. 3. Its dynamics in the time domain (u_f(t)) and transfer-function format (G_c(s)) can be expressed by (5) and (6) [29]:
Fig. 3. Configuration of the Fuzzy-PID controller [29].
u_f(t) = G_u·[G_e·e(t) + G_ce·de(t)/dt + G_i·∫e(t)dt] (5)
The open-loop transfer function of the EMLS (4) can be expressed in polynomial form by (6):
G_m(s) = K_m/(s³ + a2·s² + a1·s + a0) (6)
The closed-loop transfer function T(s) of the Fuzzy-PID-controlled EMLS can be expressed by (7) [18]:
T(s) = G_m(s)·G_c(s)/(1 + G_m(s)·G_c(s)), G_c(s) = U_f(s)/E(s) (7)
The optimal values of the controller gains can be obtained by comparing the denominator of (7) to the 4th-order characteristic equation based on the ITAE criterion, given by (8) [30]:
s⁴ + 2.1·w_n·s³ + 3.4·w_n²·s² + 2.7·w_n³·s + w_n⁴ = 0 (8)
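A small numeric sketch of this tuning step: for an illustrative w_n (the paper's actual value is implied by the gains reported in Sect. 4, not chosen here), the ITAE target polynomial of Eq. (8) can be formed and checked for stability before matching coefficients against the closed-loop denominator of Eq. (7):

```python
import numpy as np

# ITAE 4th-order target polynomial of Eq. (8) for an illustrative wn.
wn = 40.0
coeffs = [1.0, 2.1 * wn, 3.4 * wn ** 2, 2.7 * wn ** 3, wn ** 4]
roots = np.roots(coeffs)

# A valid target polynomial must place every root in the left half-plane.
stable = all(r.real < 0 for r in roots)
```

Equating the closed-loop denominator's coefficients term-by-term with `coeffs` then yields the Fuzzy-PID gains for the chosen w_n.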
As mentioned in [18], proper selection of w_n provides the values of the gains for the Fuzzy-PID controller. The rule base for the fuzzy-PD controller, using the error (e) and the rate of change of error (de/dt) with the controller output (u_f), is shown in Table 2. The triangular membership functions (MF) and the 3-D surface for the considered inputs and output are shown in Figs. 4 and 5, respectively [29].
Table 2. Rules for the Fuzzy-PD controller
Fig. 4. Input and output membership function.
To reduce the aggressive integral action during the transient and to smooth the levitation of the steel ball, a pre-filter with a first-order transfer function is added after the set-point, forming a structure similar to the Pre-Fuzzy-PID controller [18, 30]:
G_f(s) = f/(s + f) (9)
Simulation of the EMLS based on the above stabilizing FLC has been performed without and with a payload of 0–20%, for variable-step and constant set-points, as shown in Figs. 6 and 7, respectively.
Fig. 5. 3-D control surface of the Fuzzy-PD controller.
Without payload, it can be observed from Fig. 6 that the FLC approach may be recommended for the stabilizing control of the EMLS. Smooth, quick, overshoot-free and steady-state-error-free stabilizing control for variable-step as well as constant set-points has been obtained using the FLC approach.
Fig. 6. FLC based stabilizing control of EMLS without payload.
Fig. 7. FLC based stabilizing control of EMLS with 0–20% payload.
On the other hand, with a 0–20% payload, it can be observed from Fig. 7 that the FLC approach alone may not be suitable for the stabilizing control of the EMLS. Under a medium to high payload, the FLC stabilizes the steel ball with a large overshoot or undershoot, which may force the ball either to fall out of the magnetic field or to stick to the electromagnet. The above investigation demands the integration of a payload estimator and compensator with the FLC-based stabilizing controller, as discussed in the next section.
3.2 Disturbance Observer and Corrector
In this section, the payload estimator of 0–20% of the system weight for the EMLS is designed for staircase and sinusoidal patterns of the vertical disturbance. As discussed in Assumption 1, the considered source of the disturbance is the payload, which is mainly due to the weight change of the passengers in the case of the MAGLEV system [19, 21, 33]. Let d(t) = (ΔM)·g, ∀d(t) ∈ [0, 0.14], where ΔM is the unknown change of mass acting on the steel ball. The linearized model of the EMLS with the change in mass (ΔM) as a disturbance and B_d = [0 14.705 0]ᵀ as the disturbance matrix can be expressed by (10):
ẋ(t) = [0, 1, 0; 2803, 0, −27.7; 0, 0, −26.7]·x(t) + [0; 0; 2.4242]·u(t) + [0; 14.705; 0]·d(t) (10)
Remark 1: From (10) it can be seen that the control action (u) and the vertical disturbance (d) enter the dynamics of the EMLS through different channels. This kind of disturbance is termed "mismatching" in [20]. The payload estimator for (10) can be designed using the DOBC concept for the mismatching disturbance. The dynamics of the disturbance-observer-based payload estimator can be expressed by (11) [21]:
φ̇(t) = −k_d·B_d·[φ(t) + k_d·x(t)] − k_d·[A·x(t) + B·u(t)]
d̂(t) = φ(t) + k_d·x(t) (11)
where d̂ is the estimated lumped payload in terms of ΔM, φ is an auxiliary vector and k_d is the gain matrix of the observer, which needs to be designed.
Lemma 1 [21]: The estimate of the disturbance observer (11) asymptotically follows the lumped disturbance if the observer gain matrix k_d is chosen such that −k_d·B_d is Hurwitz. k_d has been designed per the recommendation of Lemma 1 by locating the eigenvalues of −k_d·B_d sufficiently far into the left half of the s-plane. Smooth and quick estimation of the payload for two different patterns is shown in Fig. 8.
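A minimal simulation sketch of the observer of Eq. (11) on the linear model of Eq. (10), with an illustrative gain k_d = [0, 2, 0] (chosen so that −k_d·B_d ≈ −29.41 is Hurwitz) and a constant true payload term; gains, horizon and step size are not from the paper:

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [2803.0, 0.0, -27.7],
              [0.0, 0.0, -26.7]])
B = np.array([0.0, 0.0, 2.4242])
Bd = np.array([0.0, 14.705, 0.0])
kd = np.array([0.0, 2.0, 0.0])      # illustrative observer gain row

dt, d_true, u = 1e-4, 0.10, 0.0     # constant payload term, zero input
x = np.zeros(3)
phi = 0.0                           # auxiliary observer state
d_hat = 0.0
for _ in range(3000):               # 0.3 s of forward-Euler simulation
    d_hat = phi + kd @ x                                  # Eq. (11), output
    phi += dt * (-(kd @ Bd) * d_hat - kd @ (A @ x + B * u))  # Eq. (11), state
    x += dt * (A @ x + B * u + Bd * d_true)               # plant, Eq. (10)
```

The estimation error obeys ė = −(k_d·B_d)·e for a constant disturbance, so d_hat converges to d_true with time constant 1/(k_d·B_d) ≈ 34 ms.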
Fig. 8. Response of DO for the sinusoidal and stair-case type of payload.
The transient and steady-state performance of the designed estimator recommend it for integration with the primary FLC-based stabilizing controller, as in Fig. 8. The combined control law (u_c) can be described as follows:
u_c = u_f + u_d (12)
Here, the disturbance corrector is designed based on the gains (i.e., G_u and G_ce) of the FLC, as described by (13), which was similarly proposed in [21] for the LQR gains:
u_d = −[C·(A + B·[G_u G_ce 0])⁻¹·B]⁻¹ · C·(A + B·[G_u G_ce 0])⁻¹·B_d·d̂(t) (13)
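The static corrector gain of Eq. (13) can be evaluated numerically with the FLC gains reported in Sect. 4; this sketch reproduces the structure of the corrector only, and the resulting number is not validated against the paper:

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [2803.0, 0.0, -27.7],
              [0.0, 0.0, -26.7]])
B = np.array([[0.0], [0.0], [2.4242]])
Bd = np.array([[0.0], [14.705], [0.0]])
C = np.array([[1.0, 0.0, 0.0]])
K = np.array([[-620.107, 0.0554, 0.0]])   # [Gu, Gce, 0] from Sect. 4

Acl_inv = np.linalg.inv(A + B @ K)
# Static mapping u_d = gain * d_hat, per the structure of Eq. (13)
gain = -np.linalg.inv(C @ Acl_inv @ B) @ (C @ Acl_inv @ Bd)
```

The corrector is thus a static feed-through from the estimated disturbance to the control input, sized so that the disturbance's steady-state effect on the output cancels.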
4 Simulation Results
This section focuses on the results obtained by integrating the DO, the disturbance corrector and the FLC for fixed as well as variable step-changes in set-point under the payload variation d(t) = {5, 12.5, 20, 10, 2.5}% ∈ [0, 20]% with an interval of 3 s. During the simulation, G_u, G_ce and G_ie for the FLC were found to be −620.107, 0.0554 and 0.2861, respectively, by comparing (8) with the denominator polynomial of (7), as discussed in [18]. Simulation results of the DOBFLC for a fixed and a variable step-change in set-point have been compared with one of the popular approaches, DOBLQR, to judge the performance superiority of the proposed approach, as presented in Figs. 9 and 10. In the case of the fixed set-point of 0.007 m, which is 50% of the total airgap, the DOBFLC gives an offset-free, smooth response compared to DOBLQR, as observed from Fig. 9. It is also seen that for a high amount of payload, DOBLQR may not be suitable for the EMLS due to a large deviation from the set-point, whereas DOBFLC quickly stabilizes the EMLS very near to the set-point irrespective of the amount of payload. However, the rise time of the response during the transient for DOBLQR is better than for DOBFLC, as observed from Fig. 9.
Fig. 9. Airgap stabilization under payload variation using DOBFLC approach.
In the next case, a variable step change of set-point over {0.005, 0.007, 0.01} m with a time interval of 5 s is considered, with the payload variation d(t) = {5, 12.5, 20, 10, 2.5}% ∈ [0, 20]% at an interval of 3 s. DOBFLC gives an offset-free, satisfactory response compared to DOBLQR, as observed from Fig. 10. It is also seen that for a high amount of payload (i.e., for the high-speed MAGLEV with d(t) ∈ [0, 40]%) [3, 4], DOBLQR may not be suitable for the tracking-control problem of the EMLS due to a large deviation from the tracking signal, whereas DOBFLC quickly tracks and stabilizes the EMLS very near to the reference tracking signal irrespective of the amount of payload. Similar to the previous case, the rise time of the response during the transient for DOBLQR is better than for DOBFLC, as observed from Fig. 10.
Fig. 10. Airgap tracking under a variable step set-point and payload variation using the DOBFLC approach.
The results obtained with the single FLC for stabilizing control, with and without the payload disturbance, were discussed in Sect. 3.1 for the same fixed and variable step-changes in set-point. Also, the performance effectiveness of the payload estimator was confirmed for the sinusoidal and staircase kinds of payload variation in the range of 0–20% in Sect. 3.2.
5 Conclusion
In this article, the stabilizing control of the EMLS has been investigated using the combined efforts of the Pre-Fuzzy-PID controller and the DOBC under various conditions. Initially, the FLC was designed and implemented under nominal payload. It was observed that the FLC alone may not be capable of withstanding the disturbance. As a solution, a payload estimator was designed to track sinusoidal and step-change kinds of disturbance using the well-known Disturbance Observer method. The designed payload estimator quickly and smoothly tracks the payload variation. At last, the EMLS is protected using the disturbance corrector based on the information of the estimated payload. Finally, effective stabilizing control as well as disturbance-rejection control for the EMLS with constant or slowly varying disturbance has been guaranteed using the hybridization of the FLC and the disturbance corrector, which is termed DOBFLC in this paper. To validate the efficacy of the designed control approach, the results obtained by the DOBFLC were compared with those obtained by the DOBLQR for diverse conditions like a fixed set-point and a step-change in set-point under payload variation (0–20%). For both cases, the DOBFLC approach showed superior and satisfactory performance compared to the DOBLQR approach.
Acknowledgments. The proposed research work is a part of the full-time Ph.D. of Ravi V. Gandhi under the Visvesvaraya Ph.D. Scheme from Nirma University, which is governed by the M.H.R.D., India.
References 1. Yaghoubi, H.: Practical applications of Magnetic Levitation Technology. Iran Maglev Technology (IMT), Iran, pp. 1–56 (2012) 2. Gandhi, R.V., Adhyaru, D.M.: Novel approximation based dynamical modelling and nonlinear control of electromagnetic levitation system. In: International Joint Computational System Engineering, Inderscience, vol. 4, no. 4, pp. 224–237 (2018) 3. Goodall, R.: Dynamics and control requirements for EMS Maglev suspensions. In: Proceedings of 18th International Conference on Magnetically Levitated Systems and Linear Drives, Shanghai, China, pp. 926–934 (2004) 4. Banerjee, S., Prasad, D., Pal, J.: Design, implementation, and testing of an attraction type electromagnetic suspension system. In: National Power Systems Conference, Kharagpur, India, pp. 621–625 (2002) 5. Shawki, N., Alam, S., Gupta, A.: Design and implementation of a magnetic levitation system using phase lead compensation technique. In: 9th International Forum on Strategic Technology (IFOST), pp. 294–299 (2014) 6. Chuguang, F., Hengkun, L., Hu, C., Ruihao, L.: On Both flux and current feedback control technique for maglev suspension system. In: Proceedings of the 33rd Chinese Control Conference, Nanjing, China, pp. 180–183 (2014)
R. V. Gandhi and D. M. Adhyaru
Design of Intelligent Stabilizing Controller and Payload Estimator for the EMLS
On the Concept of Fuzzy Graphs for Some Networks

V. Yegnanarayanan1, Noemi Clara Rohatinovici2, Y. Gayathri Narayana3, and Valentina E. Balas4

1 Advisory Board, RNB Global University, Rajasthan, India
[email protected]
2 Polytechnic University of Timișoara, Timișoara, Romania
[email protected]
3 Department of Electronic and Communication Engineering, SSN College of Engineering, Chennai, India
[email protected]
4 Department of Automation, Industrial Engineering, Textiles and Transport, University 'Aurel Vlaicu', Arad, Romania
[email protected]
Abstract. In this paper we provide a brief expository note on graphs that are fuzzy in nature and on how the fuzzy concept, when combined with graphs, tends to provide better solutions in application areas as diverse as traffic management, telecommunication and brain networks. The concept of graph coloring and the utility of neutrosophic and intuitionistic structures in a graph setting are discussed with illustrations. We also probe the possibility of applying existing methods associated with normal graphs to graphs that are fuzzy, so that the tough task of comprehending the functioning of the human brain can be attempted in the near future.

Keywords: Fuzzy graphs · Fuzzy coloring · Intuitionistic neutrosophic graph · Traffic lights problem
1 Introduction

Mathematical models using graphical representation are vital tools for finding solutions to combinatorial tasks in various fields such as topology, algebra, etc. Graph models that are fuzzy are closer to nature, as vagueness and ambiguity occur in nature. Complex phenomena are many in real life and most of them come with incomplete information. To handle such situations we require a treatment apart from the routine way of doing mathematics. Generalized simple graphs are used as graph structures to understand practical issues and are helpful in large domains of computer science [19-21, 23]. But systems such as networks, routes and images cannot be depicted fully with graphs in view of the uncertainty of the associated parameters describing the system. For instance, a network of social relations represented as a graph points to the
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 344–357, 2021. https://doi.org/10.1007/978-3-030-51992-6_27
relations between persons or institutions, etc. If the relations among them are quantified as better or worse depending on the contact frequency, then the fuzziness associated with them can be modeled. This aspect leads to the definition of fuzzy graphs [3, 6, 7]. A common property of all fuzzy graphs is that the membership value of a link is smaller than the least of the membership values of its end points. Suppose that a network of social relations is to be represented as a fuzzy graph. Deem all units of social relations as fuzzy vertices. The membership values of the nodes depend on various constraints; they are quantified as per the knowledge sources, and the connections among them are considered as fuzzy links. However, knowledge propagation may be larger compared with the associated social elements, and this aspect cannot be captured in fuzzy graphs, where edge membership values must be lower than the combined membership values of the end points. It means fuzzy graphs fall short of representing all images. To get rid of this constraint, the introduction of generalized fuzzy graphs assumes significance [10, 14, 15, 18]. Zadeh [35] introduced fuzzy sets to tackle uncertainty. In [32] Turksen coined the interval-valued fuzzy set. Atanassov [9] suggested the notion of the intuitionistic fuzzy set in 1986; it dealt with incomplete information. Smarandache [30] suggested neutrosophic sets. Later Wang et al. [32] proposed the single-valued neutrosophic set, which is deemed an extension of the intuitionistic fuzzy set. The authors in [11, 12] studied the notions of intuitionistic and neutrosophic sets and their relations in their work. Kauffman [17] regarded fuzzy graphs as depending on Zadeh's fuzzy relations [36]. The work of Rosenfeld [22] is a notable one on the fuzzy nature of some graph-theoretic concepts. In [13] the authors probe graphs that are M-strong fuzzy.
In 2011, it was noted in [16] that graph structures that are fuzzy arise in a number of applications, and their properties were developed. The authors in [1] made known the idea of bipolar fuzzy graph structures. Then in [14] single-valued neutrosophic graphs were considered. In [2] the authors coined neutrosophic soft graphs with applications. [4] mentioned mistakes in the works of [14] and [23]. [5] gave the idea of single-valued neutrosophic hypergraphs. Graph representations of intuitionistic neutrosophic soft sets were dealt with in [3]. The authors in [24-28] have done a good amount of work on different types of fuzzy graphs. For notations, terminologies and other applications see [29, 31, 33, 34, 36]. In this paper we provide a bird's-eye view of the power and usefulness of fuzzy graph representations for practical applications. We also prove some results as a motivating factor to look deeper into them for fruitful results in our future endeavors.
2 Fuzziness in Graphs

Fuzziness in graphs is of the following types. Type A: G whose edge set is fuzzy; Type B: G whose vertex set is fuzzy; Type C: G with normal vertex and edge sets but where the heads and tails of edges are fuzzy; Type D: G = {Gᵢ} with fuzziness in every Gᵢ; Type E: G with normal vertex and edge sets but where the edge weights are fuzzy. A fuzzy graph is a triple G = (V, σ, μ) where V ≠ ∅, σ is a fuzzy set on V and μ is a fuzzy set on E ⊆ V × V, with μ({u₁, u₂}) ≤ min{σ(u₁), σ(u₂)} for every u₁, u₂ ∈ V. Note that
by defining σ(u₁) = 1 for every u₁ ∈ V and μ(uᵢ, uⱼ) = 1 if (uᵢ, uⱼ) ∈ E and μ(uᵢ, uⱼ) = 0 otherwise, we can assert that all normal graphs are fuzzy graphs but not conversely. By the degree of a vertex u₁ ∈ V of a fuzzy graph G = (V, σ, μ) we mean d(u₁) = Σ_{u ≠ u₁} μ(u, u₁) over all u ∈ V. Two vertices u, u₁ ∈ V(G) are said to be adjacent if min{σ(u), σ(u₁)} ≤ μ(u, u₁). By a directed fuzzy graph G⃗ = (V, σ, μ⃗) we mean a set V ≠ ∅ together with a fuzzy set σ on V taking values in [0, 1] and μ⃗ : V × V → [0, 1] such that μ⃗(u₁, u₂) ≤ σ(u₁) ∧ σ(u₂) for all u₁, u₂ ∈ V. Here μ⃗(u₁, u₂) stands for the membership value of the directed edge (u₁, u₂).
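The defining constraint and the degree formula above can be sketched directly in code. This is an illustrative sketch, not code from the paper: the function names and the small example graph are our own, with σ and μ held in dictionaries.

```python
# Fuzzy graph sketch: sigma maps vertices to membership values, mu maps edges
# to membership values; every edge must satisfy mu(e) <= min(sigma(u), sigma(v)).

def is_fuzzy_graph(sigma, mu):
    """Check mu({u, v}) <= min(sigma(u), sigma(v)) for every edge."""
    return all(m <= min(sigma[u], sigma[v]) for (u, v), m in mu.items())

def degree(vertex, mu):
    """d(u) = sum of mu over all edges incident on u."""
    return sum(m for (a, b), m in mu.items() if vertex in (a, b))

sigma = {"u1": 0.8, "u2": 0.6, "u3": 0.9}
mu = {("u1", "u2"): 0.5, ("u2", "u3"): 0.6, ("u1", "u3"): 0.7}

print(is_fuzzy_graph(sigma, mu))  # True: every edge obeys the constraint
print(degree("u2", mu))           # mu(u1, u2) + mu(u2, u3)
```

Setting every σ value to 1 and every μ value to 1 or 0 recovers a normal graph, matching the remark above.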
3 Vertex Coloring of Fuzzy Graph

Graph vertex coloring is a pertinent concept with several real-time applications, for instance in mobile phone networks. A vertex coloring of a graph is called proper if no two adjacent vertices receive the same color; the minimum number of colors in a proper coloring is called the chromatic number, denoted χ. The traffic lights problem can be thought of as a normal graph vertex coloring problem by relaxing certain practical constraints. Suppose that there is a road junction with traffic flowing from East to West and from North to South where all right and left turns are allowed. Also assume that traffic is equally heavy in all directions. See Fig. 1. This situation can be modelled using a graphical representation where every vertex stands for a traffic flow. Here two vertices are connected if the respective traffic flows cross each other. As the traffic flows in eight directions, the graph will have 8 vertices, say uⱼ, 1 ≤ j ≤ 8.
Fig. 1. A road network
A normal graph visualization of the road network in Fig. 1 is given in Fig. 2. Note that (u₃, u₈) ∈ E(G) as the respective directions intersect. The other edges can be drawn likewise, yielding the graph in Fig. 2. We know that two adjacent vertices are assigned different colors, or in other words different light signals. It is an easy exercise to deduce that χ(G) for the graph shown in Fig. 2 is 4. As we have assumed traffic is equally heavy in all directions, our task has become
Fig. 2. The graph of the road network
simple. Suppose that t₀ is the common cycle time for all traffic lights. If the quantum of traffic in u₂ and u₆ is more than the quantum of traffic in u₄ and u₈, then the traffic flow for u₂ and u₆ will require more time than t₀. This means the total waiting time of traffic on all routes will increase, leading to traffic jams or accidents. So fuzziness occurs. It can be handled by assigning a membership value to each of the 8 traffic flow directions and creating a pattern so that vehicles move freely without clashing with other traffic flow directions. The fuzzy coloring task aims to find the chromatic number for any level α of a fuzzy graph and the associated colour function. The fuzzy chromatic number here turns out to be a fuzzy number through its α-cuts. One can also see [8] for the analysis of a similar problem. That is, given a fuzzy graph G = (V, μ), a normal way to acquire some idea about it is to study the sequence of α-cuts. A fuzzy set S defined on V can be described by its α-cuts S_α = {u ∈ V : μ_S(u) ≥ α}, α ∈ I. Observe that whenever α ≤ β for some α, β ∈ I we have S_β ⊆ S_α. Conversely, given a finite family obeying this monotone property, {S_αⱼ : j ∈ {1, ..., q}}, a fuzzy set S can be recovered from μ_S(u) = lub{αⱼ : u ∈ S_αⱼ} ∀ u ∈ V. Let {G_α = (V, E_α) : α ∈ I} be the α-cut sets of G; the α-cut of a fuzzy graph is a normal graph G_α = (V, E_α) with E_α = {(r, s) : r, s ∈ V, μ_rs ≥ α}. Note that any k-coloring for a normal graph can be defined on G_α, and from this a k-coloring function of the fuzzy graph G is attained via this sequence. So for each α ∈ I we can let χ_α be the chromatic number of G_α, and the chromatic number χ of the fuzzy graph G is obtained via a family of monotone sets. So given G = (V, μ), χ(G) = {(u, h(u)) : u ∈ V₁} with V₁ = {1, ..., |V|}, h(u) = lub{α ∈ I : u ∈ S_α} ∀ u ∈ V₁ and S_α = {1, ..., χ_α} ∀ α ∈ I. Hence the chromatic number of a fuzzy graph is a normalized fuzzy number.
Low values of α correspond to cases with several incompatible links among the vertices, and so more colors are required to overcome these incompatibilities. But when α is high there are practically no incompatible links, and fewer colors are required. The beauty of this is that χ sums all this up to solve the problem that is fuzzy in nature. For a fuzzy graph G* = (V, σ, μ), an edge e = (u, v) with u, v ∈ V(G) is called a strong edge if (1/2)(σ(u) ∧ σ(v)) ≤ μ(u, v), and a weak edge otherwise, where by u ∧ v we mean min{u, v}.
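The α-cut procedure above can be sketched as follows. This is an illustrative sketch under our own naming (alpha_cut, greedy_chromatic) and a toy edge-membership dictionary; the greedy colouring gives an upper bound on χ_α rather than the exact chromatic number in general.

```python
# For each level alpha, keep only edges with membership >= alpha and colour
# the resulting normal graph G_alpha greedily.

def alpha_cut(mu, alpha):
    """E_alpha = {(r, s) : mu(r, s) >= alpha}."""
    return [e for e, m in mu.items() if m >= alpha]

def greedy_chromatic(vertices, edges):
    """Number of colours used by a greedy proper colouring (upper bound on chi)."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    colour = {}
    for v in vertices:
        used = {colour[n] for n in adj[v] if n in colour}
        colour[v] = next(c for c in range(len(vertices)) if c not in used)
    return max(colour.values()) + 1 if colour else 0

vertices = ["u1", "u2", "u3", "u4"]
mu = {("u1", "u2"): 0.9, ("u2", "u3"): 0.6, ("u3", "u4"): 0.3,
      ("u1", "u4"): 0.3, ("u1", "u3"): 0.3}
for alpha in (0.3, 0.6, 0.9):
    # lower alpha keeps more incompatible links, so more colours are needed
    print(alpha, greedy_chromatic(vertices, alpha_cut(mu, alpha)))
```

On this toy data the cut at α = 0.3 contains a triangle and needs three colours, while the cuts at α = 0.6 and α = 0.9 need only two, matching the remark that low α requires more colours.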
Theorem 1. If G is a normal graph and G* is its fuzzy version, then χ(G) ≥ χ(G*).

Proof. Let G = (V, E) and G* = (V, σ, μ). Clearly E = {(u₁, u₂) : μ(u₁, u₂) > 0}. Let |E(G)| = q and χ(G) = k. Then clearly k ≤ q. So the maximum number of strong edges that G* can have is q. If each edge of G* is strong, then χ(G) = χ(G*). Further, observe that if an edge is weak then its end vertices can be assigned two fuzzy colors corresponding to the same normal color. So if G* has fewer than q strong edges, then χ(G*) < k. So χ(G) ≥ χ(G*).
4 Intuitionistic and Neutrosophic Fuzzy Graph Structure

Neutrosophic sets (NSs), suggested by Smarandache, are a useful tool for tackling incomplete, indeterminate and inconsistent information about the real world. Neutrosophic sets are characterized by a truth-membership function t, an indeterminacy-membership function i and a falsity-membership function f independently, each taking values within ]⁻0, 1⁺[. Fuzzy graphs and intuitionistic fuzzy graphs fail when the relations among nodes in problems are indeterminate. For this purpose, Smarandache defined four categories of neutrosophic graphs, two based on literal indeterminacy I: the I-edge neutrosophic graph and the I-vertex neutrosophic graph. The two other graphs, based on (t, i, f) components, are the (t, i, f)-edge neutrosophic graph and the (t, i, f)-vertex neutrosophic graph. These concepts are at a nascent stage. The study of single-valued neutrosophic graphs (SVN-graphs) is very new and upcoming. The following definitions are adapted from [11, 12], restricted from an arbitrary size to a size of two.

Definition 1. Let G₁ = (H₁, H₂) and G₂ = (H₁′, H₂′) be two graph structures. The Cartesian product of G₁ and G₂, written G₁ × G₂, is G₁ × G₂ = (H₁ × H₁′, H₂ × H₂′) with Hⱼ × Hⱼ′ = {(u₁v)(u₂v) : v ∈ H₁′, u₁u₂ ∈ Hⱼ} ∪ {(uv₁)(uv₂) : u ∈ H₁, v₁v₂ ∈ Hⱼ′}, j = 1, 2.

Definition 2. Let V be a given set. By a generalized intuitionistic fuzzy set J of V we mean an object of the form J = {(u, f_J(u), g_J(u)) : u ∈ V} with f_J : V → [0, 1] and g_J : V → [0, 1] denoting the degrees of membership and non-membership of u ∈ V, such that min{f_J(u), g_J(u)} ≤ 0.5 ∀ u ∈ V.

Definition 3. We call a set J = {(T_J(u), J_J(u), F_J(u)) : u ∈ V} intuitionistic neutrosophic if T_J(u) ∧ J_J(u) ≤ 0.5, J_J(u) ∧ F_J(u) ≤ 0.5, F_J(u) ∧ T_J(u) ≤ 0.5 and T_J(u) + J_J(u) + F_J(u) ∈ [0, 2].

Definition 4. By an intuitionistic neutrosophic graph we mean a pair G = (H₁, H₂) with V as underlying set, with T_{H1}, F_{H1}, J_{H1} : V →
[0, 1] denoting the True, False and Indeterminate membership values of the vertices of V, and T_{H2}, F_{H2}, J_{H2} : E ⊆ V × V → [0, 1] denoting the True, False and Indeterminate membership values of the edges uv ∈ E, such that T_{H2}(uv) ≤ T_{H1}(u) ∧ T_{H1}(v), F_{H2}(uv) ≤ F_{H1}(u) ∧ F_{H1}(v), J_{H2}(uv) ≤
J_{H1}(u) ∧ J_{H1}(v), T_{H2}(uv) ∧ J_{H2}(uv) ≤ 0.5, T_{H2}(uv) ∧ F_{H2}(uv) ≤ 0.5, J_{H2}(uv) ∧ F_{H2}(uv) ≤ 0.5, and T_{H2}(uv) + F_{H2}(uv) + J_{H2}(uv) ∈ [0, 2], ∀ u, v ∈ V.

Definition 5. Gˢ = (L₁, L₂) is termed an intuitionistic neutrosophic graph structure of G = (H₁, H₂) if L₁ = ⟨v, T(v), J(v), F(v)⟩ and L₂ = ⟨uv, T(uv), J(uv), F(uv)⟩ are intuitionistic neutrosophic sets based on the sets H₁, H₂ with T(uv) ≤ T(u) ∧ T(v), J(uv) ≤ J(u) ∧ J(v), F(uv) ≤ F(u) ∧ F(v), T(uv) ∧ J(uv) ≤ 0.5, T(uv) ∧ F(uv) ≤ 0.5, J(uv) ∧ F(uv) ≤ 0.5, and T(uv) + J(uv) + F(uv) ∈ [0, 2] ∀ uv ∈ L₂, with L₁, L₂ as the underlying vertex and edge sets of Gˢ.

We know that billions of people all over the globe are still in the clutches of poverty. Poverty differs region by region. It is quite difficult for poor countries to engage in social exchanges with rich nations, and hence they may opt to engage among similar countries for their social prospects. Various types of social exchanges are available; to name a few: medicines, minerals, textile materials and so on. Using intuitionistic neutrosophic graph structures we can find out which trades among poor countries are relatively larger than others. We can also find out which nation has more resources for specific goods and a better environment for its social exchange. Moreover, one can make out for which social exchange a rich country can invest its currency, and in which areas these low-income countries have the potential to grow better. A general step-by-step method to make use of the intuitionistic neutrosophic graph structure is the following. First input a vertex set V* = {V₁, V₂} and an intuitionistic neutrosophic set L defined on V*. Then input an intuitionistic neutrosophic set of social exchange of any node with the rest and determine T, F, J of each pair of nodes with T(V₁, V₂) ≤ min{T(V₁), T(V₂)}, F(V₁, V₂) ≤ max{F(V₁), F(V₂)}, J(V₁, V₂) ≤ min{J(V₁), J(V₂)}. Repeat this for every vertex in V*.
Introduce relations V₁, V₂ on V* so that (V*, V₁, V₂) is the identifying structure. Arbitrarily pick a member of this relation whose T value is comparatively high and whose corresponding F and J values are less than those of the other relations. Finally, list out all members that are in conformity with the T, F, J values; the respective sets L₁, L₂ are intuitionistic neutrosophic on V₁, V₂, and (V*, V₁, V₂) is an intuitionistic neutrosophic graph structure.
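The pairwise checks in the step-by-step method above can be sketched in code. This is a minimal illustrative sketch: the (T, J, F) triples, the function name and the "country" example are our own assumptions, combining the edge conditions of the method (T ≤ min, F ≤ max, J ≤ min) with the threshold and range conditions of Definition 3.

```python
# Validate a proposed (T, J, F) relation between two nodes of an
# intuitionistic neutrosophic graph structure.

def valid_relation(node_a, node_b, rel):
    """node_a, node_b, rel are (T, J, F) triples; rel is proposed for the pair."""
    T, J, F = rel
    Ta, Ja, Fa = node_a
    Tb, Jb, Fb = node_b
    return (T <= min(Ta, Tb)                      # T(V1, V2) <= min{T(V1), T(V2)}
            and J <= min(Ja, Jb)                  # J(V1, V2) <= min{J(V1), J(V2)}
            and F <= max(Fa, Fb)                  # F(V1, V2) <= max{F(V1), F(V2)}
            and min(T, J) <= 0.5 and min(T, F) <= 0.5 and min(J, F) <= 0.5
            and 0.0 <= T + J + F <= 2.0)

# (T, J, F) triples for two hypothetical trading countries
country_a = (0.7, 0.3, 0.2)
country_b = (0.6, 0.4, 0.5)
print(valid_relation(country_a, country_b, (0.6, 0.3, 0.4)))  # True
print(valid_relation(country_a, country_b, (0.9, 0.3, 0.4)))  # False: T exceeds min
```

Repeating this check over every pair of nodes, and then keeping the relations with comparatively high T and low F, J, mirrors the selection step described above.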
5 Fuzzy Graph for Telecommunication

Communication is vital for human society and culture. Of the various modes available to communicate, telecommunication has now become indispensable in our daily life. China Mobile, Vodafone and Airtel are some popular telecom companies worldwide. Service providers of these companies are interested in developing a database of their customers to find the VIPs among them (by which we mean those who use their service quite often) as well as those who are unhappy with the service and contemplate quitting the system. For this they depend on complaints data. Despite sincere attempts, finding the exact reason for leaving is still open. The concept of fuzziness is adopted to determine the probability of frustrated customers. For this purpose a fuzzy telecommunication network was designed through fuzzy graph theory. In a real-life scenario each group of people will have users belonging to various mobile networks. Data related to specific customers of those other networks who are in
touch with the customers of the Fuzzy Telecommunication Network are of paramount importance. It is an uphill task to gather data on each and every one. However, data concerning the number and duration of calls of the customers belonging to other networks who are in touch with the customers of the Fuzzy Telecommunication Network are available. Every service provider prepares his/her plans by observing how often calls are made within the same network utilizing the low call tariff. If a customer of a network speaks to a customer of another network X for a large amount of time, then it means that the customer of the said network has a lot of friends within the network X. Membership values are assigned to the customers for easy identification in the Fuzzy Telecommunication Network. Suppose a customer outside of the Fuzzy Telecommunication Network calls customers of the Fuzzy Telecommunication Network for a certain amount of time; then the outside customer is also precious for the Fuzzy Telecommunication Network. If the call duration of an outside customer to the Fuzzy Telecommunication Network is more than a threshold t, then the customer is considered precious. This t is called the scd (satisfied call duration). If A calls B using a cell phone, where A, B ∈ V, then μ : V × V → [0, 1] is a mapping with μ(A, B) = (t₁/t)(σ(A) ∧ σ(B)) if t₁ ∈ [0, t], and μ(A, B) = σ(A) ∧ σ(B) if t₁ > t. Here t₁ is the duration of calls per unit interval of time and t is the satisfied call duration, a fixed positive real number for the fuzzy telecommunication network. A digraph representation of a telecom network is as follows. Let V₁ = {uᵢ : 1 ≤ i ≤ s₁} be the set of confirmed customers, where s₁ is a huge integer, and V₂ = {uⱼ : 1 ≤ j ≤ s₂} be the set of unconfirmed customers in a telecom network. Set V = V₁ ∪ V₂. The membership values of the customers are given by f₁ : V → [0, 1] and the membership values of the links between the customers by f₂ : V × V → [0, 1]. Then the telecom network is G⃗ = (V, f₁, f₂). The normal graph
version of this fuzzy graph G⃗ is G = (V, σ, μ), where μ(u₁, u₂) = (μ⃗(u₁, u₂) + μ⃗(u₂, u₁))/2, ∀ u₁, u₂ ∈ V.

5.1 Churn Speculation
Churn, or the turning away of customers from one service provider to another, is a burning issue in telecommunication. Often turning away occurs in the prepaid system. Hence preparing a list of customers likely to turn away is a pertinent task. Excellent offers from telecom service providers are easily accepted; this motivates us to change our service provider citing non-major issues. Moreover, calling within the same service provider has many benefits. Hence if a very popular hero changes his mobile service provider, then his followers repeat the same. Also, the stability of a network is determined by the outgoing and incoming calls of a phone number. If one's outgoing calls increase or remain fixed compared to the previous interval of time, then the service provider need not worry. A further factor which measures a customer is the number of different phone numbers to which he/she is connected. Suppose the number of connected customers decreases in a particular interval of time; then the customer may turn away in the future. If the number of outgoing calls per unit interval of time decreases, then one can
calculate the rate of decrease as R₁ = outgoing-call reduction in the previous time interval / total time of outgoing calls in the previous time interval. If the number of close phone mates (cpm) decreases, then the decrease rate is measured as R₂ = cpm reduction in the previous time interval / total number of cpm in the previous time interval. If the incoming calls (ic) decrease, then the measure is R₃ = ic reduction in the previous time interval / total number of ic in the previous time interval. A realistic measure (rm) for the turning away of a customer C is given by rm(C) = (w₁R₁ + w₂R₂ + w₃R₃)/(w₁ + w₂ + w₃), where the wᵢ stand for the weights assigned as per the relative importance of the Rᵢ. Normally this value lies in [0, 1]. If the value approaches 1 for any customer, it means he is turning away; if it is less than 0.5, it means that the service provider is safe.
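The churn measure rm(C) above translates directly into code. The example rates and the equal weights are hypothetical; only the formula itself comes from the text.

```python
# rm(C) = (w1*R1 + w2*R2 + w3*R3) / (w1 + w2 + w3)

def churn_measure(R1, R2, R3, w1=1.0, w2=1.0, w3=1.0):
    """Weighted average of the three reduction rates; result lies in [0, 1]."""
    return (w1 * R1 + w2 * R2 + w3 * R3) / (w1 + w2 + w3)

# R1: outgoing-call reduction rate, R2: close-phone-mate (cpm) reduction rate,
# R3: incoming-call (ic) reduction rate, each already normalised to [0, 1]
at_risk = churn_measure(R1=0.8, R2=0.9, R3=0.7)
print(round(at_risk, 3))  # 0.8 -> approaching 1, the customer is turning away
safe = churn_measure(R1=0.1, R2=0.2, R3=0.0)
print(round(safe, 3))     # 0.1 -> below 0.5, the service provider is safe
```

Unequal weights let the provider emphasise whichever of R₁, R₂, R₃ it considers the strongest churn signal.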
6 Graphs and Fuzzy Graphs for Brain Networks

Right from the days of the Newtonian split, every discipline of human behavior has sailed in its own direction. Earlier, people like Leonardo da Vinci could paint, design the first helicopter and write philosophical and theological papers as well. But in today's world engineers limit engineering in a narrow way and tend to become inaccessible to other disciplines, thereby becoming responsible for narrower implementations of areas such as computers. Actually it is a fact that our thought processes are infinitely more complex than yes/no reasoning. Fuzzy logic permits a decision-making process like our own, and neural networks offer avenues where we can learn what to do in typical situations by mimicking our own neural structures. The human brain is the best computational "machine". It consists of innumerable neurons which, when working together, can perform tasks far beyond what computers do today. Since the 1940s scientists have been trying hard to create a machine that works like the human brain's neurons.

6.1 Some Interesting Facts About Biological Neurons
• A typical brain contains between 10¹⁰ and 10¹² neurons.
• Neurons are connected to each other in a complex spatial arrangement to and from the central nervous system.
• The cell body is typically a few microns in diameter.
• The 'hair-like' dendrites are tens of microns in length and receive incoming signals.
• The axon is the neuron output and is 1 mm to 1 m in length. It can branch at its extremity, allowing it to connect with a number of other neurons.
• A single neuron may be directly connected to hundreds or even tens of thousands of other neurons.
• Signals are transmitted by electrochemical means.
• Pulse propagation speed ranges from 5 to 125 m/s.
• A delay of 1 ms exists for pulses to traverse the synapse, via the generation of chemical substances called 'neurotransmitters'.
• It is thought that, due to increased cell activity, metabolic growth takes place at the synaptic junction, which can increase its area and hence its 'weight'.
• If the 'sum' of the signals received by a neuron exceeds a threshold, the neuron 'fires'.
• Neurons can fire over a wide range of frequencies but always fire with the same amplitude.
• After carrying a pulse, an axon fiber is in a state of complete non-excitability for the duration of the 'refractory period', about 10 ms.
• Information is frequency-encoded on the emitted signals.
6.2 Neural Network
A main function of a neural network is to produce an output pattern when presented with an input pattern. When a neural network such as ADALINE is in its learning phase, three things are to be kept in mind:
♦ From a pre-designed training set, pick the inputs to be applied, for which the desired response of the system is known.
♦ Compare the actual output generated with the desired output to find the error.
♦ Adjust the weights to reduce the error.
This kind of training is called supervised learning.

In the skull of vertebrates the brain is a conglomeration of soft nervous tissue. Within the brain, information is processed continuously and passed between interdependent regions. DWMRI (diffusion-weighted magnetic resonance imaging) methods are adopted to determine the chance of connection between any given two areas of cortical and subcortical brain gray matter. Also, fMRI (functional magnetic resonance imaging) studies have confirmed a small-world topology by correlating the inter-region functional connectivity. This further elucidates the existence of clustered subnetwork arrangements in the brain. One can form distinct connected fuzzy graphs from all cortical and sub-cortical voxels exhibiting inter-voxel functional connectivity. A point in 3-D space that points to a pixel in an image can be characterized by a voxel, a unit of graphical data. Graph theory in general, and fuzzy graph theory in particular, is an ideal framework to represent the pattern of connectivity of the human brain network. This perspective gives new insights into the assertion of the small-world concept through artificial and real human data. Graph-theoretical analysis of brain networks is available in the literature, but using fuzzy graphs for this analysis is still at a nascent stage. Pavlopoulos in [37] has declared that the brain is the final frontier of science. Most of the brain's energy is spent in the body's information collection and interpretation, and in response regulation of the body.
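The three ADALINE learning steps listed at the start of this subsection (apply a training input, compare actual and desired output, adjust the weights) can be sketched as a delta-rule update. This is an illustrative sketch; the learning rate, epoch count and AND-like training data are our own choices, not from the text.

```python
# ADALINE-style supervised learning: linear output, error = desired - actual,
# weights nudged along the error gradient for each training sample.

def train_adaline(samples, n_features, lr=0.1, epochs=200):
    """samples: list of (inputs, desired_output) pairs."""
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x)) + b      # actual output
            err = d - y                                        # compare with desired
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]   # adjust weights
            b += lr * err
    return w, b

# Learn an AND-like target on binary inputs
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_adaline(samples, n_features=2)
pred = sum(wi * xi for wi, xi in zip(w, (1, 1))) + b
print(pred > 0.5)  # True: the trained unit fires for input (1, 1)
```

With a threshold at 0.5 the trained linear unit separates the (1, 1) input from the others, mirroring the "fires when the weighted sum exceeds a threshold" behaviour of the biological neuron facts above.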
fMRI's low-frequency oscillations play a part in functional networks such as the motor, visual and auditory regions. Lately, an analysis of resting-state fMRI confirmed the formation of a connection graph from an individual dataset. In this context, a graph, or a fuzzy graph, consists of a set of nodes and links standing for the adjacency or paths between the nodes. To describe the number of connections between nodes, two classifications of networks exist: networks that are small-world or scale-free. The former exhibits short distances among nodes and dense clustering, whereas the latter shows a low average number of connections per node. The power-law scaling property is employed to determine whether networks are small-world or scale-free based on the connection pattern that governs them. Pertinent graph-theoretic models such as Watts-Strogatz and Barabasi-Albert aid much in the understanding of the brain's functional connection pattern. Magnetic resonance methods are employed to form diagnostic opinions of the human body. The image generated depends on proton density, relaxation time T1 and relaxation time T2. Gray matter, white matter and cerebro-spinal fluid are the three tissue regions present in MRI. White matter links gray matter regions and facilitates the passage of impulses between neurons. MRI methods together with graph theory explain connectivity between the gray matter areas through three steps. First use a brain graph G = [N0, A0, W0], where a set of vertices pointing to cerebral tissue (N0), a set of white-matter arcs (A0) between vertices in N0 and a set of real arc weights (W0) are used to solve the
On the Concept of Fuzzy Graphs for Some Networks
353
cerebral volume. Then apply existing algorithms to compute the likely path trajectory between any two vertices. Finally, redefine the initially obtained brain graph as one with non-overlapping gray matter regions after clustering, and call it H = [V, A, W], where V indicates the gray matter areas, A the direct white-matter links among gray matter regions, and W the probabilities of links among the gray matter areas. Anatomical connection strength (ACS), anatomical connection density (ACD), and anatomical connection probability (ACP) are the measures meant for determining links between any gray matter subgroups. ACD applies when any two zones have almost the same amount of link density with respect to another set of zones. ACS determines the information flow between any two regions. ACP finds the most likely probability of two regions being linked. It was established through graph-theoretic methods that during the resting state the clustering coefficient is greater than that of a random graph, pointing to the small-world phenomenon of the brain. Further, the connection pattern exhibits a distribution that obeys a power law for a preassigned set of fixed values. These observations point to a scale-free topology of the brain. Neurological diseases like stroke affect the complete brain framework, so a fuzzy network approach is the need of the hour to probe neurological deficits in the cerebrum. In future work we propose to collect more relevant data concerning white and gray matter regions with the help of fuzzy graph theory. We will endeavor to focus our attention on the voxel scale in the brain and obtain insight into neurological disorders such as schizophrenia, Alzheimer's disease, and Parkinson's disease (PD). There is no screening test for early detection of PD. The Hurst coefficient could be used for the characterization of EEG signals, as they can be treated as processes with extended memory [38]. Fuzzy logic may be employed in this analysis, and we will revert to this elsewhere.
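As a minimal sketch of the fuzzy-graph view of H = [V, A, W], the strength of connectedness between two regions can be taken as the maximum over paths of the minimum arc membership, a simple stand-in for an ACP-style most-likely-connection measure; the region names and weights below are invented.

```python
# Max-min transitive closure of a fuzzy brain graph H = [V, A, W]:
# s[u][v] is the strength of the strongest path from u to v, where a path is
# only as strong as its weakest arc. Region names and memberships are invented.

def maxmin_closure(nodes, w):
    s = {u: {v: w.get(u, {}).get(v, 0.0) for v in nodes} for u in nodes}
    for k in nodes:                     # Floyd-Warshall with (max, min)
        for u in nodes:
            for v in nodes:
                s[u][v] = max(s[u][v], min(s[u][k], s[k][v]))
    return s

regions = ["M1", "S1", "V1"]
arcs = {"M1": {"S1": 0.9}, "S1": {"M1": 0.9, "V1": 0.4}, "V1": {"S1": 0.4}}
strength = maxmin_closure(regions, arcs)
# M1 and V1 are linked only through S1, so their strength is capped at 0.4.
```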
7 Limitations and Further Scope
“Intuitionistic Fuzzy Set Theory (IFS)” due to Takeuti and Titani is a set theory developed within intuitionistic logic. It is an extension of intuitionistic logic in which all formulas achievable in intuitionistic logic are achievable in their logic. The name “intuitionistic” in Atanassov’s theory of IFSs was motivated by a pair of membership functions (f+, f−), where f+(v) is the degree of membership of v, f−(v) is the degree of non-membership, and f+(v) + f−(v) ≤ 1. Note that the former is a legitimate approach and has nothing in common with the latter; calling the latter theory intuitionistic leads to misunderstanding. Consider interval-valued fuzzy sets, for example: the notion there is that membership grades can hardly be precise. As fuzzy sets are meant to model ill-defined concepts, demanding precision in membership grades sounds illogical, which leads naturally to interval-valued fuzzy sets as a clear departure from standard fuzzy sets. It is common in engineering to use intervals to indicate the values of quantities under uncertainty. Consider one more real-life example, namely voting, where “yes”, “no”, and “abstain” votes are possible. Abstention votes can be dealt with by means of IFSs if they are considered to mean “unclassifiable votes”. Here abstention should be viewed as an expression of a voter’s discomfort with the available political options and not
354
V. Yegnanarayanan et al.
her uncertainty about what to do. So a pure IFS can be a more intuitive model for such a situation than its equivalent interval-valued formulation. In Atanassov’s theory, the personification of imprecise concepts through membership and non-membership degrees that do not add up to 1 may be useful for the proper development of applications. Neutrosophic sets are a generalization of the notions of fuzzy sets and intuitionistic fuzzy sets. Neutrosophic models provide more precision, flexibility, and compatibility to the system than classical, fuzzy, and/or intuitionistic fuzzy models. In the parallel neural-fuzzy possibilistic classifier model of [39], an independent neural network and a fuzzy system work in parallel on a set of inputs and yield output classes.
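A minimal illustration of the Atanassov pair discussed above, with the constraint f+(v) + f−(v) ≤ 1 and the leftover degree read as hesitation (the "abstain" share in the voting example); the numbers are invented.

```python
# Atanassov-style intuitionistic fuzzy pair: membership f_plus plus
# non-membership f_minus must not exceed 1; the remainder is the hesitation
# margin (the "abstain" share in the voting example). Values are invented.

def ifs_pair(f_plus, f_minus):
    if not (0 <= f_plus <= 1 and 0 <= f_minus <= 1 and f_plus + f_minus <= 1):
        raise ValueError("invalid IFS pair: need f+ + f- <= 1")
    return {"member": f_plus, "non_member": f_minus,
            "hesitation": round(1 - f_plus - f_minus, 10)}

vote = ifs_pair(0.5, 0.3)   # 0.2 of the voters hesitate / abstain
```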
8 Conclusion
Our traditional techniques for modeling, reasoning, and computing are largely crisp, deterministic, and precise. Precision presumes that the parameters of a model represent our perception of the modeled phenomenon exactly, or depend on the characteristics of the real system being modeled. But as the complexity of a system increases, our ability to make correct and valid statements about its behavior decreases, and we are led to a threshold beyond which precision and significance become almost mutually exclusive. Further, while constructing a model we tend to maximize its usefulness. This goal is closely linked with the relationship among the vital characteristics of every system model, viz., complexity, credibility, and uncertainty. Uncertainty plays a crucial part in endeavors to maximize the usefulness of system models. All traditional logic presumes that precise symbols are being employed. A graph is a user-friendly way of depicting information involving relationships between objects. In several real-world problems we get incomplete information, so there is vagueness in the knowledge of the objects, in their relationships, or in both. To express this type of relation we need a fuzzy graph model. Fuzzy graph coloring is perhaps an important problem of fuzzy graph theory. The first definition of a fuzzy graph is attributed to Kaufmann in 1973, based on Zadeh's fuzzy relations, but the actual credit goes to Azriel Rosenfeld [22], who probed fuzzy relations on fuzzy sets and developed the theory of fuzzy graphs in 1975. During the same period, R.T. Yeh and S.Y. Bang also came out with various concepts connected to fuzzy graphs. The control policy of a traffic light depends mostly on the number of vehicles in the intersecting lanes. The concepts of accident likelihood and number of vehicles in each lane are fuzzy and need to be graduated.
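The traffic-light reading of graph coloring can be sketched as follows: lanes become vertices, crossing flows become edges, and each color class is one phase of the light cycle; the lane layout below is a made-up example.

```python
# Greedy coloring of a lane-crossing graph: lanes that do not cross share a
# color and may move in the same slot of the light cycle. Layout is invented.

def greedy_coloring(vertices, edges):
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    for v in vertices:
        used = {color[n] for n in adj[v] if n in color}
        color[v] = next(c for c in range(len(vertices)) if c not in used)
    return color

lanes = ["N", "S", "E", "W"]
crossings = [("N", "E"), ("N", "W"), ("S", "E"), ("S", "W")]  # N-S vs E-W flows
slots = greedy_coloring(lanes, crossings)
# Two color classes ({N, S} and {E, W}) give a two-phase cycle; a complete
# crossing graph would need one slot per lane, and an empty edge set only one.
```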
We represented each traffic flow by a fuzzy edge whose membership value depends on the number of vehicles in that path. Two fuzzy vertices are adjacent if the corresponding traffic flows cross each other, in which case there is a likelihood of accident; the likelihood-of-accident value depends on the vertex membership value. The maximum security level is realized when all lanes intersect each other and the number of vehicles in each lane is very high, so the resulting graph is a complete graph. In this case the chromatic number equals the number of lanes, and the control policy of the lights ensures that only one movement is allowed in any slot of the cycle. The minimum security level is attained when the intersection edge set is empty
and in that case the chromatic number is 1 and all movements are permitted at any instant. We have provided an illustration of handling a traffic management task through graph coloring and of how the same can be extended to situations that are uncertain and fuzzy. We have discussed some examples to get a flavor of fuzzy logic and basic neural networks. In the area of neural networks many other examples could be examined, covering areas such as weighting, self-learning, and other neural network models. We briefly touched upon the concepts of neutrosophic and intuitionistic graph structures, which can handle fuzziness effectively, and the utility of fuzzy representations for telecommunication networks and human brain networks. We hope to return to this with more results and analysis elsewhere.
Acknowledgements. The first author gratefully acknowledges the Tata Realty–SASTRA Srinivasa Ramanujan Research Grant for its support. The second and third authors acknowledge research supported in part by the EU through the European Research Development Fund under the Competitiveness Operational Program (BioCellNanoART = Novel Bio-inspired Cellular Nano-architectures, POC-A1-A1.1.4-E nr. 30/01.09.2016).
References
1. Akram, M., Akmal, R.: Application of bipolar fuzzy sets in graph structures. Appl. Comput. Intell. Soft Comput. 2016, 13 (2016)
2. Akram, M., Shahzadi, S.: Neutrosophic soft graphs with application. J. Intell. Fuzzy Syst. 32, 841–858 (2017)
3. Akram, M., Shahzadi, S.: Representation of graphs using intuitionistic neutrosophic soft sets. J. Math. Anal. 7(6), 31–53 (2016)
4. Akram, M., Shahzadi, G.: Operations on single-valued neutrosophic graphs. J. Uncertain Syst. 11, 1–26 (2017)
5. Akram, M., Shahzadi, S., Borumand Saeid, A.: Single-valued neutrosophic hypergraphs. TWMS J. Appl. Eng. Math. 8, 122–135 (2016)
6. Akram, M., Sitara, M.: Novel applications of single-valued neutrosophic graph structures in decision making. J. Appl. Math. Comput. 56, 501–532 (2018)
7. Akram, M., Adeel, A.: Representation of labeling tree based on m-polar fuzzy sets. Ann. Fuzzy Math. Inform. 13(2), 1–9 (2017)
8. Arif, M., Wang, G., Balas, V.E.: Secure VANETs: trusted communication scheme between vehicles and infrastructure based on fog computing. Stud. Inform. Control 27(2), 235–246 (2018)
9. Atanassov, K.: Intuitionistic fuzzy sets. Fuzzy Sets Syst. 20, 87–96 (1986)
10. Bhattacharya, P.: Some remarks on fuzzy graphs. Pattern Recogn. Lett. 6(5), 297–302 (1987)
11. Bhowmik, M., Pal, M.: Intuitionistic neutrosophic set. J. Inf. Comput. Sci. 4(2), 142–152 (2009)
12. Bhowmik, M., Pal, M.: Intuitionistic neutrosophic set relations and some of its properties. J. Inf. Comput. Sci. 5(3), 183–192 (2010)
13. Bhutani, K.R., Rosenfeld, A.: Strong arcs in fuzzy graphs. Inform. Sci. 152, 319–326 (2003)
14. Broumi, S., Talea, M., Bakali, A., Smarandache, F.: Single-valued neutrosophic graphs. J. New Theory 10, 86–101 (2016)
15. Dinesh, T.: A study on graph structures, incidence algebras and their fuzzy analogues. Ph.D. thesis, Kannur University, Kannur, India (2011)
16. Dinesh, T., Ramakrishnan, T.V.: On generalised fuzzy graph structures. Appl. Math. Sci. 5(4), 173–180 (2011)
17. Kaufmann, A.: Introduction à la Théorie des Sous-ensembles Flous. Masson et Cie 1 (1973)
18. Karunambigai, M.G., Buvaneswari, R.: Degrees in intuitionistic fuzzy graphs. Ann. Fuzzy Math. Inform. 13, 1–13 (2017)
19. Mordeson, J.N., Chang-Shyh, P.: Operations on fuzzy graphs. Inform. Sci. 79, 159–170 (1994)
20. Mondal, T.K., Samanta, S.K.: Generalized intuitionistic fuzzy sets. J. Fuzzy Math. 10(4), 839–862 (2002)
21. Peng, J.J., Wang, J.Q., Zhang, H.Y., Chen, X.H.: An outranking approach for multicriteria decision-making problems with simplified neutrosophic sets. Appl. Soft Comput. 25, 336–346 (2014)
22. Rosenfeld, A.: Fuzzy graphs. In: Zadeh, L.A., Fu, K.S., Shimura, M. (eds.) Fuzzy Sets and Their Applications, pp. 77–95. Academic Press, New York (1975)
23. Shah, N., Hussain, A.: Neutrosophic soft graphs. Neutrosophic Sets Syst. 11, 31–44 (2016)
24. Samanta, S., Pal, M.: Irregular bipolar fuzzy graphs. Int. J. Appl. Fuzzy Sets 2, 91–102 (2012)
25. Samanta, S., Pal, M.: Fuzzy k-competition graphs and p-competition fuzzy graphs. Fuzzy Eng. Inf. 5(2), 191–204 (2013)
26. Samanta, S., Pal, M.: Some more results on bipolar fuzzy sets and bipolar fuzzy intersection graphs. J. Fuzzy Math. 22(2), 253–262 (2014)
27. Samanta, S., Pal, M.: Fuzzy planar graphs. IEEE Trans. Fuzzy Syst. 23(6), 1936–1942 (2015)
28. Samanta, S., Pal, M., Akram, M.: m-step fuzzy competition graphs. J. Appl. Math. Comput. 32, 827–842 (2014)
29. Sampathkumar, E.: Generalized graph structures. Bull. Kerala Math. Assoc. 3(2), 65–123 (2006)
30. Smarandache, F.: Neutrosophy: Neutrosophic Probability, Set, and Logic. American Research Press, Rehoboth, USA (1998)
31. Smarandache, F.: A Unifying Field in Logics. Neutrosophy: Neutrosophic Probability, Set and Logic. American Research Press, Rehoboth (1999)
32. Turksen, I.: Interval-valued fuzzy sets based on normal forms. Fuzzy Sets Syst. 20, 191–210 (1986)
33. Wang, H., Smarandache, F., Zhang, Y.Q., Sunderraman, R.: Single valued neutrosophic sets. Multispace Multistruct. 4, 410–413 (2010)
34. Ye, J.: Single-valued neutrosophic minimum spanning tree and its clustering method. J. Intell. Syst. 23(3), 311–324 (2014)
35. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
36. Zadeh, L.A.: Similarity relations and fuzzy orderings. Inform. Sci. 3, 177–200 (1971)
37. Pavlopoulos, G.A., Hooper, S.D., Sifrim, A., Schneider, R., Aerts, J.: Medusa: a tool for exploring and clustering biological networks. BMC Res. Notes 4(1), 384 (2011)
38. Toderean (Aldea), R., Geman, O., Chiuchisan, J., Balas, V.E., Beiu, V.: Novel method for neurodegenerative disorders screening using Hurst coefficients on EEG delta rhythm. In: Balas, V.E., Jain, L.C., Balas, M.M. (eds.) Soft Computing Applications, Proceedings of the 7th International Workshop Soft Computing Applications (SOFA 2016), vol. 1, pp. 327–338 (2016)
39. Pratap, A., Kanimozhiselvi, C.S., Vijayakumar, R., Pramod, K.V.: Parallel neural fuzzy-based joint classifier model for grading autistic disorder. In: Balas, V.E., Jain, L.C., Kovačević, B. (eds.) Soft Computing Applications, Proceedings of the 6th International Workshop Soft Computing Applications (SOFA 2014), vol. 1 (2014)
Fuzzy Logic in Predictive Harmonic Systems
W. Bel1, D. López De Luise1, F. Costantini2, I. Antoff2, L. Alvarez2, and L. Fravega2
1 CI2S Lab, Pringles 10 2nd FL, C1183ADB Buenos Aires, Argentina
[email protected], [email protected]
2 IDTI Lab, 25 de Mayo 385, 3260 Concepción del Uruguay, Entre Ríos, Argentina
Abstract. From the risk prediction perspective it is important to be able to access any relevant information that may improve accuracy. Many concerns arise regarding how well such information is managed. From the traditional point of view, data can be numeric or nominal, and these considerations affect the efficiency of the predictive model. Furthermore, an additional problem arises when the semantic interpretation is also taken into account: how precise would the model become? This paper introduces an approach called Fuzzy Harmonic Systems (FHS), an extension of Harmonic Systems (HS), an approach for performing adaptive predictions. The fuzziness may be a characteristic of the data, or part of the process to improve flexibility. In the latter case, relaxing the original data yields FHS patterns. This paper evaluates the efficiency of FHS compared to HS. Results indicate that even though the model is less precise, the predictive power increases, at a very low accuracy cost.
Keywords: Fuzzy logic · Harmonic system · Fuzzy harmonic system · Risk prediction
1 Introduction
Currently, expert systems (ES) for traffic risk prediction differ in perspective and functioning. Certain proposals make intensive use of technology and design networks to control streets and highways [1]. Others use wireless communication between vehicles [2–6] as a prevention complement. But all of them have limited success and are very sensitive to meteorological conditions. Other proposals allow toggling a GPS browser, or considering certain vehicle conditions, for instance the use of belts and airbags. There are further risk causes related to the impossibility of foreseeing the future (for example the sudden displacement of other vehicles or pedestrians, animals crossing at that moment, etc.). The reaction time in those cases is usually not quick enough for an appropriate maneuver without any risk. On average, a person's reaction time is 1.5 s after a risk stimulus, and it can be lengthened by alcohol, fatigue, or thermal stress [7–9], which increase the risk level and are not well considered in models. Furthermore, there are many other facts that are not considered by expert systems and other predictive
© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 358–371, 2021. https://doi.org/10.1007/978-3-030-51992-6_28
models. Among them one can mention certain weather conditions, street conditions, vehicle color, speed, bottlenecks and obstacles in the neighborhood, etc. [10]. ES usually have if-then rules where risk is the result of a specific combination of heuristically selected values and variables, leaving aside the specific relationships between them and the impact of such variability and its semantic interpretation. HS belong to a family of data mining approaches focused on rhythm, acceleration, stationary status, and many other time-varying behaviors of patterns related to real-time events. Their main characteristic is the ability to react in real time [11, 12]. They are suitable for problems where quick reactions under restricted criteria are required (in this context, usually a driver or pedestrian reacting to environment changes or conditions) [13]. FHS extends HS by adding fuzzy predictions centered on the semantic interpretation of environment variables. It replaces the deterministic recognition of parameter configuration patterns with an inferential ability, which increases the bias of the model. The model is based on fuzzy patterns, resulting in more flexibility in the parameter-configuration process and better human comprehension of the predictor context [14]. This way, by Occam's razor [15], a greater inference power on new situations is expected. The rest of this paper presents the Fuzzy Harmonic System model and fuzzy patterns (Sect. 2), efficiency and performance statistics of the HS and FHS models (Sect. 3), findings from the FHS testing (Sect. 4), and finally conclusions and future work (Sect. 5).
2 Fuzzy Harmonic Systems Model and Fuzzy Patterns
The FHS model extends the HS model, since both function on resonance and pattern-matching principles. The main feature that FHS adds is the fuzzy application of the patterns and the corresponding adaptation of the inference engine, as depicted in Fig. 1 [14, 16]. The fuzziness approach gives the model the ability to contextualize variables, adding implicit knowledge of the context as membership functions. It is possible to handle a set of logical and subtle concepts of the problem through customized operators and numbers. All of this results in the numeric weighting of non-numeric parameters, enhanced with non-evident information [14].
-Goal of FHS
FHS can be applied to problems with the following two main characteristics [14]:
i. A real-time answer restriction.
ii. For non-numeric data, contextualization requirements.
Both HS and FHS can be applied to problems involving the time evaluation of events. This implies the need to include a "timestamp" for the events in order to be within the scope of these models.
Fig. 1. FHS model
-Characteristics of FHS
The flexibility of the FHS model reduces its computational complexity, and its fuzzy inference engine makes it simple and appropriate to apply the benefits of HS without affecting performance:
i. Firing subroutines: Detection of certain characteristics of an event during resonance can trigger subroutines. That is the case in Kronos [16]: when a risk case is detected while the user is in a risky zone, the zone's risk level can influence the resulting level calculated by the system. For instance, if the risk level is "Low" and the zone risk is "High", the subroutine weights the level as "Medium".
ii. Filters: According to [11, 12] it is possible to apply different filters to reduce the number of comparisons and improve the speed of data processing. Furthermore, it is possible to apply filters to certain patterns, for instance patterns for pedestrians or drivers, or for weather conditions such as raining or snowing.
iii. Unsupervised: Unlike other prediction approaches, FHS does not require a training step in order to start functioning. It adjusts its parameters automatically and adapts by itself to the context of the problem.
iv. Robust data model: As said in the previous item, FHS needs no training to get to work. It is possible to change the data model in a very simple way while the system keeps working, which makes the system very robust and allows data to be added very quickly.
To use the fuzzy-logic FHS model it is important to perform a fuzzification step for every variable in the model; to do that, just take the relevant properties of the problem [17]. Although it is possible to fuzzify any parameter, it does not make sense to work on all of them [10].
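The firing-subroutine rule in item (i) can be sketched as an ordinal weighting; the four-level scale and the midpoint rule below are assumptions extrapolated from the single Low + High -> Medium example in the text.

```python
# Hypothetical sketch of a Kronos-style firing subroutine: combine the event's
# risk level with the zone's risk level by taking the rounded-up midpoint on an
# ordinal scale. Scale and combination rule are assumptions based only on the
# Low + High -> Medium example.

LEVELS = ["No Risk", "Low", "Medium", "High"]

def weight_risk(event_level, zone_level):
    i, j = LEVELS.index(event_level), LEVELS.index(zone_level)
    return LEVELS[-(-(i + j) // 2)]      # ceiling of the midpoint index

adjusted = weight_risk("Low", "High")    # -> "Medium"
```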
Through fuzzy patterns it is possible to capture implicit knowledge in the shape of membership functions (for instance, low pressure, high pressure, etc.), as can be seen in Fig. 2. There is also a bag of words that can manage categorical attributes, as in Table 1.
Fig. 2. Membership function for pressure
Table 1. Bag of words

Bag                        | Words
bag_color_risk             | {“Black”, “Grey”, “Brown”, “Silver”, “Green”, “Blue”, “Red”}
bag_weather_rain           | {“Rain”, “Drizzle”, “Hail”, “Thunderstorm”, “Precipitation”, “Overcast”, “Squalls”}
bag_weather_lowvisibility  | {“Smoke”, “Fog”, “Mist”, “Dust”, “Sand”, “Haze”, “Ash”}
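The two fuzzification devices above can be sketched together: a trapezoidal membership function for a numeric variable (the shape used for "low pressure" is invented, since Fig. 2 is not reproduced here) and crisp bag-of-words membership for categorical attributes as in Table 1.

```python
# Numeric fuzzification via a trapezoidal membership function plus categorical
# matching against a bag of words (bag contents copied from Table 1; the
# pressure breakpoints are invented, not taken from Fig. 2).

def trapezoid(x, a, b, c, d):
    # 0 outside [a, d], 1 on [b, c], linear ramps on the shoulders
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

bag_weather_rain = {"Rain", "Drizzle", "Hail", "Thunderstorm",
                    "Precipitation", "Overcast", "Squalls"}

def in_bag(word, bag):
    return 1.0 if word in bag else 0.0    # crisp membership for categories

mu_low = trapezoid(1008.0, 1000.0, 1005.0, 1010.0, 1015.0)   # fully "low"
mu_rain = in_bag("Drizzle", bag_weather_rain)                # matches the bag
```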
These patterns can be parameterized in a deterministic way, precisely or by intervals. This way it is possible to use FHS in problems where data must be handled semantically rather than numerically. For the FHS implementation, fuzzy patterns were defined using the process described earlier. To compare FHS against the HS model, the same rules used in HS were fuzzified (see Tables 2 and 3) according to fuzzy patterns used for drivers and pedestrians, respectively. Each fuzzy pattern has the same initial values as the ones in HS [14]. The risk levels for the inference rules are as follows:
i. Risk High: There is a probability of over 50% for an incident. Many factors, such as weather, time of day, day of the week, vehicle speed, blood alcohol level, lack of safety equipment, type of zone, etc., strongly influence the driving experience or the pedestrian's status.
ii. Risk Medium: The probability of an incident is between 25% and 35%, with some of the factors previously mentioned present.
iii. Risk Low: The probability of an incident is below 10%, and a few factors slightly alter the optimum conditions.
iv. No Risk: The probability of an incident is below 2%, with very good conditions.
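The four bands above can be read as a threshold map from incident probability to rule outcome; note that they leave gaps (10–25% and 35–50%), which this sketch folds into the adjacent lower band, an assumption not stated in the text.

```python
# Threshold map from incident probability to risk level, using the published
# bands; the unstated gaps (10-25% and 35-50%) are folded into the band below
# them, which is an assumption.

def risk_level(p):
    if p > 0.50:
        return "Risk High"      # stated: over 50%
    if p >= 0.25:
        return "Risk Medium"    # stated: 25%-35%
    if p >= 0.02:
        return "Risk Low"       # stated: below 10%
    return "No Risk"            # stated: below 2%

label = risk_level(0.30)        # -> "Risk Medium"
```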
Table 2. Fuzzy patterns for drivers
Drivers
Rule                                                                                                 | Result
Vehicle.Type = Motorcycle and Alcohol = High and Weather.Condition = Rain                            | Risk Medium
Vehicle.Type = Motorcycle and Alcohol = High and Weather.Condition = Rain and Vehicle.Color = Risk   | Risk High
User.Age = Old and Time = Nightfall                                                                  | Risk Medium
Weather.Visibility = Little and Vehicle.Color = Risk and Time = Nightfall                            | Risk Low
Weather.Visibility = Little and Vehicle.Color = Risk and Time = Dawn                                 | Risk Low
Vehicle.Type = Big and Weather.Visibility = Little and Current_Speed = High                          | Risk Medium
Current_Speed = High and Belt_Helmet = False and Weather.Condition = Rain                            | Risk Medium
Table 3. Fuzzy patterns for pedestrians

Pedestrians
Rule                                                                                                     | Result
Weather.Condition = Rain and HeadPhone = True and Time > Nightfall                                       | Risk High
Alcohol = High and Time = Nightfall                                                                      | Risk Medium
Alcohol = High and Time = Dawn and User.Age = Old and Weather.Visibility = Little and Weather.Temperature = Low | Risk Low
The inference engine for fuzzy rules applies an acceptance threshold of 0.7 to every membership function, but this is a parameter that can be adjusted whenever required.
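A hedged sketch of how a fuzzy pattern such as the first driver rule might fire under the 0.7 acceptance threshold; using min as the t-norm for the combined degree is an assumption, since the text only fixes the per-membership threshold.

```python
# Rule firing under the 0.7 acceptance threshold: every antecedent membership
# must reach 0.7; the combined degree uses the min t-norm (an assumption).

THRESHOLD = 0.7

def fire(memberships, result):
    if all(m >= THRESHOLD for m in memberships.values()):
        return result, min(memberships.values())
    return None

# Degrees for "Vehicle.Type = Motorcycle and Alcohol = High and
# Weather.Condition = Rain" on a hypothetical event:
event = {"vehicle_motorcycle": 1.0, "alcohol_high": 0.85, "weather_rain": 0.75}
outcome = fire(event, "Risk Medium")     # fires, with degree 0.75
```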
3 Tests, Statistics and Performance/Efficiency Metrics
In order to test the prediction model with the Kronos system, a set of samplings was performed; for a detailed description see [14, 16]. Table 4 shows the models working in the real world on a test set of 516488 samples, together with the specific details of every case. As can be seen from Table 4, FHS detects 42.27% more cases than HS, because the fuzzy patterns give more flexibility when events
Table 4. Comparison of performance and extra cases of the models

Predictor    | Time     | Detected cases | Number of extra cases detected | Duration reduction
hs_predict   | 3132700  | 6753           | –                              | 2 min (−3.7%)
fhs_predict  | 3254698  | 12562          | 5809 (+42.27%)                 | –
are not clearly defined. As a counterpart, HS shows a 3.7% reduction in processing time; this is because FHS carries an overhead due to the fuzzification process. In order to validate the model, metrics like Cohen's kappa coefficient, ECM (mean squared error), and RECM (root mean squared error) are evaluated. Table 5 presents the confusion matrix for the performance of FHS; sensibility, specificity, and kappa are in Table 6.

Table 5. Confusion matrix for FHS performance

Predicted    | Expected V (risk) | Expected F (no risk)
V (risk)     | 1860 (VP)         | 230 (FP)
F (no risk)  | 370 (FN)          | 7540 (VN)
Table 6. Indicators for FHS precision

Kappa        | 0.8194113995140571
Sensibility  | 0.8296391404243817
Specificity  | 0.9703266384981961
From Table 6 it is possible to say that FHS has an acceptable kappa (0.819) according to [18] and [19], since it is over the 0.81 threshold. Sensibility (0.829) and specificity (0.970) confirm this trend and indicate a good prediction model. Table 7 shows the ECM and RECM values for FHS in the context of the current sample. The ECM value is good (12%), indicating a low error margin for this type of event.
Table 7. Indicators ECM and RECM for FHS

Value     | Sample | Mean         | ECM          | RECM
Absolute  | 66629  | 0.4699155023 | 0.3621996428 | 0.6018302442
Relative  | 66629  | 15.6%        | 12.0%        | 20.0%
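The Table 6 indicators can be recomputed directly from the Table 5 confusion matrix; the values land close to the published ones, with small differences that presumably come from rounding somewhere in the pipeline.

```python
# Recomputing sensitivity, specificity, and Cohen's kappa from the Table 5
# confusion matrix (VP=1860, FP=230, FN=370, VN=7540).

TP, FP, FN, TN = 1860, 230, 370, 7540
N = TP + FP + FN + TN

sensitivity = TP / (TP + FN)     # ~0.834 (Table 6 reports 0.8296)
specificity = TN / (TN + FP)     # ~0.970, matching Table 6

p_observed = (TP + TN) / N
p_expected = ((TP + FP) * (TP + FN) + (FN + TN) * (FP + TN)) / N**2
kappa = (p_observed - p_expected) / (1 - p_expected)   # ~0.82
```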
Figure 3 shows the ROC (Receiver Operating Characteristic) curves [20–22] for HS and FHS, comparing the performance of both models as predictors.
Fig. 3. ROC fhs_predict and hs_predict
Table 8 presents a summary of the ROC curve analysis. The most relevant indicator is the AUC (Area Under the Curve) [23, 24]. The FHS model has a value of 0.90; according to [25, 26] this score is really good, as it sits at the top of the interval [0.80, 0.90). Meanwhile, HS has an AUC of 0.71, meaning fair efficiency, since its value lies in [0.70, 0.80).
Table 8. Summary of ROC values

Predictor    | AUC       | SE.AUC      | LI        | LS        | Z        | p-value
fhs_predict  | 0.9000694 | 0.001600384 | 0.8969328 | 0.9032061 | 249.9833 | 0
hs_predict   | 0.7111321 | 0.002090535 | 0.7070347 | 0.7152295 | 100.9943 | 0
4 Findings from the Samplings
The following is a summary of the main results and findings using the Kronos prototype on Android with the FHS model [16]. Tests were performed in the cities of Concepción del Uruguay (CDU) and Colón (COL). All the information is in the public domain and available at the following links: https://drive.google.com/file/d/1EwcnRGLSEcx-sljBG5aUID8Z9v8QrUcj/view?usp=sharing (Concepción del Uruguay, Entre Ríos, Argentina) and https://drive.google.com/file/d/1aZshGdLuz3NP8dadlVMuQ3LQMjE3UXxP/view?usp=sharing (Colón, Entre Ríos, Argentina). Tables 9 and 10 present summary data of the tests with mobile FHS in the two cities. The first observation is that the preliminary results from lab simulations show the same risk-level percentages as the field tests [14], indicating that the prediction performs as expected in real environments.
Table 9. Summary of field tests in CDU city

Description          | Total | Percentage
Total cases          | 10751 | 100%
No risk              | 10426 | 96.9%
Low risk             |    63 | 0.58%
Medium risk          |   153 | 1.42%
High risk            |   109 | 1.01%
Any risk (combined)  |   325 | 3.01%
Table 10. Summary of field tests in COL city

Description          | Total | Percentage
Total cases          | 17397 | 100%
No risk              | 16779 | 96.4%
Low risk             |   349 | 2.0%
Medium risk          |   196 | 1.12%
High risk            |    73 | 0.41%
Any risk (combined)  |   618 | 3.6%
The following presents some main variables of the problem and interesting findings from the field tests using mobile Kronos:
-Visibility
Figures 4 and 5 show histograms of visibility and its relationship with risk levels. For CDU (Fig. 4) the higher risk levels appear mainly in places with visibility lower than 15 km. For distances lower than 1 km, the number of risk situations increases significantly, and for visibility below 0.5 km the highest risk level reaches 20%. For COL (Fig. 5) the highest risk levels occur at visibilities below 12 km.
Fig. 4. Histogram for visibility in CDU
Fig. 5. Histogram for visibility in COL
-Dew Point
Figures 6 and 7 show histograms for dew point according to the risk distribution. For CDU (Fig. 6) the highest risk is in the range 5–10 °C; for COL (Fig. 7) the worst range is 10–15 °C [27, 28]. The remaining situations carry lower risks.
Fig. 6. Histogram for dew point in CDU
Fig. 7. Histogram for dew point in COL
-Weather Condition
Figures 8 and 9 are histograms for weather condition according to risk levels. For CDU (Fig. 8) most of the highest risk levels occur during partly cloudy, mostly cloudy, and rainy conditions. For COL (Fig. 9) the highest levels occur for mostly cloudy and cloudy conditions. This difference indicates that bad weather conditions affect traffic security.
Fig. 8. Histogram for condition in CDU
Fig. 9. Histogram for condition in COL
-Atmospheric Pressure
Figures 10 and 11 present histograms for atmospheric pressure according to risk levels. For CDU (Fig. 10) there is a threshold at 1015 hPa; higher values increase the risk level considerably, and in the range from 1025 to 1030 hPa the percentage rises to 30%. For COL (Fig. 11) the highest levels have a different threshold, 1022 hPa, and conversely to the previous finding, the higher risks occur below that value.
Fig. 10. Histogram for atmospheric pressure in CDU
Fig. 11. Histogram for atmospheric pressure in COL
5 Conclusions and Future Work
This article briefly presents FHS and its management of fuzzy patterns, currently implemented as the Kronos prototype. It has a mobile subsystem that handles the risk levels for pedestrians and drivers. It constitutes a lightweight and simple solution for evaluating complex and dynamic environment situations, giving extra flexibility and fault tolerance for the descriptive variables. Those characteristics indicate that FHS outperforms other approaches. It remains to fully implement mobile FHS, considering an intelligent approach to assess the risk levels of zones in a city, weightings with events' feedback, and many other features of the environment (for instance the presence of stray animals, street conditions, flood zones, illumination conditions, street closures, etc.).
References
1. Zeadally, S., Hunt, R., et al.: Vehicular ad hoc networks (VANETs): status, results, and challenges. Telecommun. Syst. 50, 217–241 (2010)
2. Fujii, H., Seki, K., Nakagata, N.: Experimental research on protocol of inter-vehicle communication for vehicle control and driver support. In: 2nd World Congress on Intelligent Transport Systems, Yokohama, Japan, 9–11 November 1995, pp. 1600–1605 (1995)
3. Sasaki, I., Hirayama, T., Hatsuda, T.: Vehicle information network based on inter-vehicle communication by laser beam injection and retro-reflection techniques. In: IEEE Vehicle Navigation and Information Systems Conference, Yokohama, Japan, 31 August–2 September 1994, pp. 165–169 (1994)
4. Mizui, K., Uchida, M., Nakagawa, M.: Vehicle to vehicle communication and ranging system using spread spectrum technique: proposal of double boomerang transmission system. In: IEEE Vehicle Navigation and Information Systems Conference, Yokohama, Japan, 31 August–2 September 1994, pp. 153–158 (1994)
5. Kremer, D.W., Hubner, D., Holf, S., Benz, T., Schafer, W.: Computer aided design and evaluation of mobile radio local area networks in RTI/IVHS environments. IEEE J. Sel. Areas Commun. 11, 406–421 (1993)
6. Valade, J.-M.: Vehicle to vehicle communications: experimental results and implementation perspectives. In: 2nd World Congress on Intelligent Transport Systems, Yokohama, Japan, 9–11 November 1995, pp. 1606–1613 (1995)
7. http://www.cesvi.com.ar/seguridadvial/Recomendaciones/SeguridadRecomendaciones3.aspx (2017)
8. Luna Mendaza, P.: NTP 322: Valoración del riesgo de estrés térmico: índice WBGT (1999)
9. Monroy Martí, E.: Estrés térmico y sobrecarga térmica: evaluación de los riesgos (2001)
10. Bel, W., López De Luise, D.: Parametric prediction model using expert system and fuzzy harmonic system. In: 7th International Workshop on Soft Computing Applications (SOFA), Arad, Romania (2016)
11. Acuña, I., García, E., López De Luise, D., Paredes, C., Celayeta, A., Sandillú, M., Bel, W.: Traffic & pedestrian risk inference using harmonic systems. SOFA, Romania (2014)
12. López De Luise, D., Bel, W., Mansilla, D., Lobatos, A., Blanc, L., Malca Larosa, R.: Risk prediction on time with GPS information. In: Kulczycki, P., Kóczy, L., Mesiar, R., Kacprzyk, J. (eds.) Information Technology and Computational Physics. Springer (2017)
13. López De Luise, D., Bel, W.: Cálculo de Riesgo en Tráfico y Peatón usando Sistemas Armónicos. ISBN 978-3-639-53739-0. Editorial Académica Española (2017)
14. Bel, W., López de Luise, D.: Fuzzy harmonic systems for traffic risk. In: XXIII Congreso Argentino de Ciencias de la Computación (CACIC), La Plata, Argentina (2017)
15. Domingos, P.: The role of Occam's razor in knowledge discovery. Data Min. Knowl. Disc. 3, 409 (1999). https://doi.org/10.1023/a:1009868929893
16. Bel, W., López De Luise, D., Ledesma, E.: Fuzzy harmonic systems: ability for traffic risk measurement in Android. In: IEEE Primera Conferencia Uruguay (URUCON), Montevideo, Uruguay (2017)
17. López De Luise, D.: Introducción a los Sistemas Armónicos. In: Advanced Research and Trends in New Technologies, Software, Human-Computer Interaction and Communicability. IGI Global (2013)
18. Altman, D.G.: Practical Statistics for Medical Research. Chapman and Hall, New York (1991)
Fuzzy Logic in Predictive Harmonic Systems
Particle Swarm Optimization Ear Identification System

B. Lavanya¹, H. Hannah Inbarani¹, Ahmad Taher Azar²,³, Khaled M. Fouad³, Anis Koubaa², Nashwa Ahmad Kamal⁴, and I. Radu Lala⁵

¹ Department of Computer Science, Periyar University, Salem, India
{lavanyab26,hhinba}@gmail.com
² Robotics and Internet-of-Things Lab (RIOTU), Prince Sultan University, Riyadh, Saudi Arabia
{aazar,akoubaa}@psu.edu.sa
³ Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
{ahmad.azar, kmfi}@fci.bu.edu.eg
⁴ Faculty of Engineering, Cairo University, Giza, Egypt
[email protected]
⁵ Vasile Goldis West University, Arad, Romania
[email protected]
Abstract. Biometric identification methods have proven to be more useful, natural, and easy for users than conventional methods of human identification. Ear biometrics is used for personal identification and verification, and ear recognition is a prominent line of ongoing biometric research. The ear image has rich edges due to its intricate curves; therefore, in this work, edges are revealed using a Canny edge detector. In this paper, a simple and fast algorithm based on Particle Swarm Optimization (PSO) is developed for ear matching. The PSO method is exploited to match ear images against the other ear images in the database. For ear matching, the AMI ear database is used, and the results show that the accuracy of the proposed method was 98% when testing 50 images and 96.6% when testing 150 images, which surpasses other benchmark methods, namely Principal Component Analysis (PCA) and Scale Invariant Feature Transform (SIFT).

Keywords: Biometrics · Gaussian filter · Canny edge detector · PSO · PCA · SIFT
1 Introduction

Biometrics identifies and verifies individuals using their behavioral and physiological characteristics [12]. Biometrics was first initiated in 1879, when Alphonse Bertillon (1853–1914), a French criminologist and biometric researcher, introduced his anthropometrical signalment system for identifying convicted criminals [1]. Biometrics is a fast-growing technology that can be used in legal applications such as forensics, mug shots, and post-event analysis. Biometrics provides security in computer networks and cellular phones, prevents ATM theft, etc.

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 372–384, 2021. https://doi.org/10.1007/978-3-030-51992-6_29

The biometric-based personal recognition system is categorized into two kinds: verification (one-to-one) systems and identification (one-to-many) systems. Biometric recognition can be either passive or active. The human ear has a rich and stable structure that is preserved from childhood into old age. Ear biometrics is a relatively new modality that has become very popular [28]. Figure 1 gives the anatomy of the human ear. Ear recognition is a passive form of recognition: the subject does not take an active part in the process. It has several advantages compared to face, iris, and fingerprint biometrics. Face and ear are closely related, since both are located on the human head, and the recognition methods are largely the same. There are several benefits [12, 24] of using the ear as a data source for human identification: the shape of the ear does not change through human life, and the ear is not influenced by facial expression; accordingly, the ear provides robust and reliable features for human identification. A further advantage is that the ear image can be readily captured from a distance.
Fig. 1. Anatomy of ear
In recent years, many researchers have begun considering the problem of developing a computational method for human identification based on the ear. Identification/matching is found in various applications such as image registration, stereo vision, and object recognition [29–32]. Researchers prefer to handle a set of local features rather than pixel arrays to enhance accuracy and decrease execution time when two images are matched. PSO is used for the optimization of nonlinear continuous processes, and it emulates bird flocking, fish schooling, and general swarm behavior [16]. In this research, ear recognition is based on Gaussian smoothing, and features extracted from edges are matched using the PSO algorithm. The proposed method is evaluated on the AMI ear database. The remainder of the paper is organized as follows: Sect. 2 depicts the related work, methods are introduced in Sect. 3, results and discussion are presented in Sect. 4, and the conclusion and future directions are given in Sect. 5.
2 Related Work

Researchers have established many techniques for human recognition using the human ear. Kumar [19] proposed an approach to automate human identification using 2D ear images, based on robust segmentation of the curved region of interest utilizing morphological operators and Fourier descriptors. Yin [34] introduced a method of point pattern matching (PPM) using particle swarm optimization (PSO), evaluated on both synthetic datasets and real fingerprint images. Arunkumar [5] proposed multimodal recognition of palmprint and face, performed through PSO-dependent feature-level fusion; PSO relies on bio-inspired cooperative behavior modeled on the social behavior of bird flocking or fish schooling. That work proposed two techniques for feature selection: the Discrete Cosine Transform and the Discrete Wavelet Transform. Logannathan [21] proposed a wavelet probabilistic neural network (WPNN) as a classifier for iris biometrics, with PSO used to train the single neuron for WPNN model optimization. Mussi [23] presented a PSO variant for pattern matching in image analysis. Sibai [25] proposed a method for ear recognition based on a feed-forward neural network, extracting 7 features from each ear image. Anam and Akram [28] proposed personal identification using ear recognition, where features are acquired using Haar wavelets and ear identification uses fast normalized cross-correlation. Fooprateepsiri and Kurutach [14] proposed a robust method for ear-based personal identification under different scaling, rotation, and image reflection. This approach consists of two components: the first identifies ear features utilizing the multi-resolution Trace transform and the Fourier transform; the second is a modification of the Hausdorff distance, which measures the degree of similarity between the tested images and the models. Tayyaba [33] proposed three components for ear identification: a database component, an image processing component, and an identification component; the identification module compares the image with the records in the database using template matching. Abaza [2] analyzed the similarity of human ears; such analysis is significant for assessing the potential of identifying the left and right ears of an individual. Al-Ta'i [3] compared the PSO and Firefly algorithms for fingerprint authentication. Dong [13] presented a novel algorithm for 3D ear recognition using SIFT descriptors, where test ear images are recognized by applying a new weighted keypoint matching algorithm. Table 1 shows the outline of the related work.
Table 1. Summary of related work

Authors | Year | Database | Approach | Methods
Hurley et al. [15] | 2005 | XM2VTS face database | Feature extraction | Force field transform
Fooprateepsiri and Kurutach [14] | 2011 | CMU PIE | Feature extraction and matching | Trace transform, Fourier transform and modified Hausdorff distance
Dong and Guo [13] | 2011 | University of Notre Dame public dataset (UND dataset) | Feature extraction | Scale-invariant feature transform (new weighted key point matching)
Kumar and Wu [19] | 2012 | IIT and UND | Feature extraction | Log-Gabor filter
Kumar and Chan [20] | 2012 | IIT and UND | Feature extraction, matching | Sparse representation of local texture descriptors, Euclidean distance
Chan and Kumar [9] | 2012 | IIT and UND | Feature extraction | 2D-quadrature filter
Anam and Akram [28] | 2012 | IIT and USTB | Feature extraction and matching | Haar wavelet, fast normalized cross-correlation
Chi and Ying [10] | 2013 | USTB | Feature extraction, matching | Scale-invariant feature transform and Forstner corner detection, Euclidean distance
Jayachandra and Patel [17] | 2013 | CVL-database | Feature extraction, matching | Canny edge detector, KNN
Sibai et al. [25] | 2013 | Nature database (51 images) | Feature extraction | Feed-forward artificial neural network
Benzaoui et al. [7] | 2014 | IIT and USTB | Feature extraction, matching | Local texture descriptors, KNN, SVM
Chidananda et al. [11] | 2015 | Color FERET, UMIST, CMU-PIE, and FEI | Feature extraction, matching | Entropy-cum-Hough-transform, PSO
3 Methodology

This section presents the proposed ear recognition technique, which is implemented and tested using MATLAB. Figure 2 illustrates the flow diagram of the proposed approach.
Fig. 2. Flow diagram of proposed method
As shown in Fig. 2, the ear image is acquired from the database. Next, the ear image is cropped, the color image is converted to grayscale, and noise is removed using a Gaussian filter. The edges of the ear image are detected utilizing the Canny edge detector, and finally the extracted edges are matched with those of the test image.

A) AMI Database

The proposed algorithm is tested on the AMI ear database (http://www.ctim.es/research_works/ami_ear_database). It consists of 700 ear images, seven per person: six images of the right ear and one of the left ear. Each ear image is 492 × 702 pixels, stored in JPEG format. Figure 3 shows sample ear images [4].
Fig. 3. Sample ear images
B) Region Cropping and Gray Conversion

The region cropping method is employed to crop the ear region from the ear image; the ear image is not given directly as input to the proposed technique. The input color image is first transformed into a grayscale image by excluding the hue and saturation while conserving the luminance. Figure 4 shows the cropped grayscale image.
Fig. 4. Gray conversion of cropped image
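The conversion described above (dropping hue and saturation while conserving luminance) corresponds to a weighted sum of the RGB channels; the BT.601 weights below are the ones MATLAB's rgb2gray uses. A minimal sketch in Python/NumPy — the crop coordinates are a hypothetical example, since the paper does not specify how the ear region is located:

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB image to grayscale by conserving
    luminance (BT.601 weights, as used by MATLAB's rgb2gray)."""
    weights = np.array([0.2989, 0.5870, 0.1140])
    return rgb @ weights

def crop_region(img, top, left, height, width):
    """Crop a rectangular region (e.g. the ear) from an image; the
    coordinates here are assumed to come from some ear locator."""
    return img[top:top + height, left:left + width]

# Tiny synthetic example: a 2 x 2 color image
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=float)
gray = to_grayscale(img)
```

The weighted sum keeps perceived brightness: a pure-red pixel maps to about 76.2 and a white pixel to about 255, matching the luminance each contributes.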
C) Gaussian Filter

Image denoising is a fundamental task in image processing, and many techniques are available for it; traditional methods are either linear or nonlinear. Noise introduces undesired visual effects into the image [26, 27]. The main goals of denoising are eliminating noise from the image and preserving the edges. Filtering is used for modifying or enhancing an image. The Gaussian filter is a linear low-pass filter [6]; its operations include smoothing, sharpening, and edge enhancement [26, 27]. Gaussian smoothing is a two-dimensional convolution operation used to blur images, with a kernel whose weights follow the Gaussian shape. The two-dimensional Gaussian filter has the form:

G(x, y) = (1 / (2πσ^2)) * exp(−(x^2 + y^2) / (2σ^2))   (1)
In Eq. 1, σ is the standard deviation of the distribution, which is assumed to have zero mean. In ear matching, we extract and compare edge features. Given an ear image (shown in Fig. 4), we wish to get a clear ear map by using a Gaussian filter to smooth out the noise; a Canny operator is then applied to extract edges from the segmented ear image. Denoising is done to remove noise while protecting edges. Noise removal is necessary because noise disturbs the original image and may lead to wrong results. Therefore, in this work, noise was removed from the input image with the Gaussian filter, as illustrated in Fig. 5, by smoothing the ear image using Eq. (1).
Fig. 5. Denoised image
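Eq. (1) can be turned into a discrete smoothing filter as sketched below; the kernel size of 5 and σ = 1 are illustrative assumptions, not values from the paper, and the kernel is normalized so its weights sum to 1 (the 1/(2πσ²) factor cancels under normalization):

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Discrete 2-D Gaussian kernel from Eq. (1), normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def smooth(image, sigma=1.0, size=5):
    """Blur a 2-D image by convolving it with the Gaussian kernel,
    using edge padding so the output has the same shape as the input."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.empty_like(image, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + size, j:j + size] * k)
    return out
```

Because the kernel is normalized, smoothing a constant image returns it unchanged, while isolated noise pixels are averaged toward their neighborhood.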
D) Canny Edge Detection

The Canny edge detector algorithm is utilized to reveal a broad range of edges in the image; the detector was developed by John F. Canny in 1986 [8]. The edges of the denoised ear image were detected using the Canny edge detector. The probability of detecting true edge points should be maximized, whereas the probability of falsely detecting non-edge points should be minimized, and the detected edges should lie as close as possible to the real edges. Figure 6 shows the edges detected in the image. Edges are the essential features of an ear image, and the extracted edge features are used for ear matching by PSO.
Fig. 6. Edge extracted image
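As a rough illustration of the edge-revealing step, the sketch below computes a Sobel gradient-magnitude map and thresholds it. This is only the first stage of the Canny pipeline: the full detector additionally applies non-maximum suppression and hysteresis thresholding, which are omitted here for brevity, and the threshold value is an assumption:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2(img, k):
    """'Valid' 2-D correlation with a 3x3 kernel (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

def edge_map(img, threshold=1.0):
    """Binary edge map from the gradient magnitude (Canny's first stage)."""
    gx = conv2(img, SOBEL_X)
    gy = conv2(img, SOBEL_Y)
    mag = np.hypot(gx, gy)
    return mag > threshold

# A vertical intensity step should be marked as an edge in the middle columns
img = np.zeros((5, 6))
img[:, 3:] = 10.0
edges = edge_map(img, threshold=5.0)
```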
E) Particle Swarm Optimization (PSO)

PSO was presented by Eberhart and Kennedy [16, 18]; it is a population-based method inspired by the cooperative social behavior of bird flocking and fish schooling [16]. The PSO method has been effectively utilized in several domains, such as training artificial neural networks, linearly constrained function optimization, wireless network optimization, data clustering, and many other domains where a GA can be utilized [29]. The image classification process depends on the descriptor utilized to represent an object [22]. PSO is initialized with random solutions (a particle swarm) and finds optima by updating generations. The particle search exploits a mixture of deterministic and
probabilistic rules, based on data sharing, that enhance the search process. Over the search space, each particle evolves its candidate solution through time, using its own memory and the knowledge acquired by the swarm. The global best particle is established within the swarm, and its information is shared among the particles; it is a one-way information sharing mechanism.

PSO Algorithm

To tackle an optimization problem, PSO uses a swarm of computational elements named particles, which explore the solution space for the optimum. Each particle represents a candidate solution and is determined by coordinates in the D-dimensional search space. The position of the i-th particle is stated as Xi = (xi1, xi2, …, xiD) and its velocity as Vi = (vi1, vi2, …, viD). For each particle, the fitness function is evaluated and compared with the fitness of that particle's best previous result and with the fitness of the best particle among all particles in the swarm. After these two best values are found, the particles evolve by updating their velocities and positions through Eqs. 2 and 3.

v_i^(t+1) = w * v_i^t + c1 * rand() * (pbest_i − x_i^t) + c2 * rand() * (gbest − x_i^t)   (2)

x_i^(t+1) = x_i^t + v_i^(t+1)   (3)
Here i = 1, 2, …, N, where N is the swarm size; pbest_i is the particle's best reached solution and gbest is the global best solution. c1 and c2 are the cognitive and social parameters, which range between 0 and 2, and w is the inertia weight. rand() generates random numbers with uniform distribution on (0, 1). The velocity is clamped so that −Vmax ≤ v_i^(t+1) ≤ Vmax, where Vmax is the maximum velocity. The inertia weight manages the balance of the search algorithm between exploration and exploitation. The "cognitive" component represents the private experience of the particle itself, while the "social" component represents cooperation among the particles. The recursive procedure continues until the termination condition (a maximum number of iterations t) is reached. In PSO, the solution effectiveness, or fitness, is calculated for each particle. The matching is measured by F(x, y) in Eq. (4) as the distance between the current particle i and a neighbor particle j; the mean distance is then calculated using Eq. (5) [22].

F(x, y) = sqrt((x_i − x_j)^2 + (y_i − y_j)^2)   (4)

MD = (1/s) * Σ_{i=1..s} MD_i   (5)
Where ‘s’ is swarm size. The pbest and gbest can be identified through the fitness value of all particles. The algorithm is summarized in Algorithm 1.
Initialize parameters
Initialize population
Repeat until the stop criterion is met:
  Calculate the fitness of each particle
  Determine the pbest values: update pbest if the current fitness is better
  Determine the gbest values: find the best-fitness particle and compare it with the current best for each particle
  Modify the particle velocity through Eq. (2)
  Modify the particle position through Eq. (3)
Output pbest and its fitness value as the optimum transformation

Algorithm 1: PSO algorithm for the matching problem
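Algorithm 1 together with Eqs. (2)–(3) can be sketched as below on a toy fitness: the Euclidean distance of Eq. (4) to a single target point, standing in for a real edge-feature match. The parameter values (w = 0.7, c1 = c2 = 1.5, Vmax = 1) and the search range are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(x, target):
    # Eq. (4): Euclidean distance between a candidate position and the
    # reference point it should match (lower is better).
    return np.sqrt(np.sum((x - target) ** 2))

def pso(target, n_particles=20, dims=2, iters=100,
        w=0.7, c1=1.5, c2=1.5, vmax=1.0):
    x = rng.uniform(-10, 10, (n_particles, dims))   # positions
    v = np.zeros((n_particles, dims))               # velocities
    pbest = x.copy()
    pbest_f = np.array([fitness(p, target) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1 = rng.random((n_particles, dims))
        r2 = rng.random((n_particles, dims))
        # Eq. (2): velocity update, clamped to [-vmax, +vmax]
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -vmax, vmax)
        x = x + v                                   # Eq. (3)
        f = np.array([fitness(p, target) for p in x])
        improved = f < pbest_f
        pbest[improved] = x[improved]
        pbest_f[improved] = f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, pbest_f.min()

best, best_f = pso(np.array([3.0, -2.0]))
```

With a swarm of 20 particles, the global best rapidly converges toward the target, which mirrors how the matcher drives candidate edge positions toward their database counterparts.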
After extracting the edge features, matching is done using the PSO matching technique, which gives good results for pattern matching. Ear matching is performed based on Eq. (2), and the fitness function is calculated using Eq. (4). The proposed ear biometric system relies on the matching operation to identify persons. Ear matching is a significant process for judicial applications due to the uniqueness of individuals. One-to-many matching is used to match the person against all persons enrolled in the database. The matched image is illustrated in Fig. 7.
Fig. 7. Ear edge matched image
4 Results and Discussion

In this experimental analysis, ear identification using the PSO method is applied to the AMI dataset. We used a sample of the AMI database [4], consisting of 50 images in the first test and 150 images in the second test. The experimental evaluation
Table 2. Comparison of various ear identification methods with 50 images

Methods | Total no. of images | No. of images identified | Performance
ED   | 50 | 42 | 84%
PCA  | 50 | 44 | 88%
SIFT | 50 | 45 | 90%
SVM  | 50 | 46 | 92%
PSO  | 50 | 49 | 98%
Fig. 8. Experimental analysis for different Ear identification methods
Table 3. Comparison of various methods to the proposed PSO with 150 images

Methods | Total no. of images | No. of images identified | Performance
ED   | 150 | 121 | 81%
PCA  | 150 | 130 | 87.3%
SIFT | 150 | 135 | 90.6%
SVM  | 150 | 139 | 93.33%
PSO  | 150 | 145 | 96.6%
of the proposed method and the other benchmark methods is depicted in Fig. 8, which shows the matches found per person and reflects the high recognition rate. Tables 2 and 3 provide the performance of ear identification based on PSO. From these tables, it can be noticed that PSO gives better accuracy than PCA and SIFT; the proposed PSO matching gives the best result. The performance of the ear identification algorithm is evaluated by computing the identification rate. In general, the identification (or recognition) rate is
Identification Rate = (Number of ear images correctly identified) / (Total number of ear images)   (6)
The obtained identification rates are shown in Tables 2 and 3. In this study, PSO has been proposed for human identification from ear images, and the recognition rate reflects the performance of Euclidean distance (ED), PCA, SIFT, SVM, and PSO. Canny edge detection effectively improves the ear image recognition rate, and the PSO method, used to match an ear image against the other ear images, gives the best accuracy.
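Eq. (6) is straightforward to compute; the figures below reproduce the PSO rows of Tables 2 and 3:

```python
def identification_rate(correct, total):
    """Eq. (6): fraction of ear images correctly identified."""
    return correct / total

# PSO figures reported in Tables 2 and 3
rate_50 = identification_rate(49, 50)     # 0.98, reported as 98%
rate_150 = identification_rate(145, 150)  # 0.9666..., reported as 96.6%
```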
5 Conclusions and Future Directions

Ear matching is a significant process in judicial applications because of the uniqueness of the ear. The identification is based on the shape of the ear edges, and the ear image has rich edges. This paper focuses on filtering with a Gaussian filter and edge detection with the Canny edge detector, which yields clear edges in ear images. In this study, PSO has been proposed for human identification from ear images: an algorithm is presented for matching a given image with trained images based on particle swarm optimization, and we conclude that the proposed algorithm gives a better identification rate. As future work, the PSO matching algorithm needs to be tested on more samples per person, and another preprocessing step should be integrated to enhance ear matching.

Acknowledgement. Special acknowledgement to the Robotics and Internet-of-Things Lab (RIOTU), Prince Sultan University, Riyadh, Saudi Arabia. We would like to show our gratitude to Prince Sultan University, Riyadh, Saudi Arabia.
References

1. Amirthalingam, G., Radhamani, G.: A multimodal approach for face and ear biometric system. Int. J. Comput. Sci. Issues (IJCSI) 10(5), 234–241 (2013)
2. Abaza, A., Ross, A.: Towards understanding the symmetry of human ears: a biometric perspective. In: 2010 Fourth IEEE International Conference on Biometrics: Theory Applications and Systems (BTAS), 27–29 September 2010, Washington, DC, USA (2010)
3. Al-Ta'i, Z.T.M., Abd Al-Hameed, O.Y.: Comparison between PSO and firefly algorithms in fingerprint authentication. Int. J. Eng. Innov. Technol. (IJEIT) 3(1), 421–425 (2013)
4. AMI Database: http://www.ctim.es/research_works/ami_ear_database
5. Arunkumar, M., Valarmathy, S.: Palmprint and face based multimodal recognition using PSO dependent feature level fusion. J. Theor. Appl. Info. Technol. 57(3), 337–346 (2013)
6. Kumar, B.K.S.: Image denoising based on Gaussian/bilateral filter and its method noise thresholding. Signal Image Video Process. 7(6), 1159–1172 (2013)
7. Benzaoui, A., Hadid, A., Boukrouche, A.: Ear biometric recognition using local texture descriptors. J. Electron. Imaging 23(5), 053008 (2014)
8. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8(6), 679–698 (1986)
9. Chan, T.S., Kumar, A.: Reliable ear identification using 2-D quadrature filters. Pattern Recogn. Lett. 33(14), 1870–1881 (2012)
10. Chi, M., Ying, T.: Ear recognition based on Forstner and SIFT. Indonesian J. Electr. Eng. Comput. Sci. 11(12), 7131–7137 (2013)
11. Chidananda, P., Srinivas, P., Manikantan, K., Ramachandran, S.: Entropy-cum-Hough-transform-based ear detection using ellipsoid particle swarm optimization. Mach. Vis. Appl. 26(2–3), 185–203 (2015)
12. Li, S.Z., Jain, A.K.: Encyclopedia of Biometrics. Springer, Boston, MA (2015). ISBN: 978-1-4899-7487-7
13. Dong, X., Guo, Y.: 3D ear recognition using SIFT keypoint matching. Energy Procedia 11, 1103–1109 (2011)
14. Fooprateepsiri, R., Kurutach, W.: Ear based personal identification approach forensic science tasks. Chiang Mai J. Sci. 38(2), 166–175 (2011)
15. Hurley, D.J., Nixon, M.N., Carter, J.N.: Force field feature extraction for ear biometrics. Comput. Vis. Image Underst. 98(3), 491–512 (2005)
16. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: IEEE International Conference on Neural Networks, 27 November–1 December 1995, Perth, WA, Australia (1995)
17. Jayachandra, C., Patel, B.S.: An ear recognition approach using edge detection. Technology 1(1), 38–45 (2013)
18. Kennedy, J.: Particle Swarm Optimization. In: Encyclopedia of Machine Learning, pp. 760–766. Springer, US (2010)
19. Kumar, A., Wu, C.: Automated human identification using ear imaging. Pattern Recogn. 45(3), 956–968 (2012)
20. Kumar, A., Chan, T.C.: Robust ear identification using a sparse representation of local texture descriptors. Pattern Recogn. 46(1), 73–85 (2013)
21. Logannathan, B., Marimuthu, A.: Iris authentication using PSO. Int. J. Comput. Organ. Trends 2(1), 10–15 (2012)
22. López-Franco, C., Villavicencio, L., Arana-Daniel, N., Alanis, A.Y.: Image classification using PSO-SVM and an RGB-D sensor. Math. Probl. Eng. 2014(695910), 17 (2014). https://doi.org/10.1155/2014/695910
23. Mussi, L., Cagnoni, S.: Particle swarm for pattern matching in image analysis. In: Artificial Life and Evolutionary Computation: Proceedings of Wivace 2008, Venice, Italy, 8–10 September 2008, pp. 89–98 (2008)
24. Anwar, A.S., Ghany, K.K., Elmahdy, H.: Human ear recognition using geometrical features extraction. Procedia Comput. Sci. 65, 529–537 (2015)
25. Sibai, F.N., Nuaimi, A., Maamari, A., Kuwair, R.: Ear recognition with feed-forward artificial neural networks. Neural Comput. Appl. 23(5), 1265–1273 (2013)
26. Ben Abdallah, M., Azar, A.T., Guedri, H., Malek, J., Belmabrouk, H.: Noise-estimation-based anisotropic diffusion approach for retinal blood vessel segmentation. Neural Comput. Appl. 29(8), 159–180 (2017)
27. Ben Abdallah, M., Malek, J., Azar, A.T., Belmabrouk, H., Krissian, K.: Adaptive noise-reducing anisotropic diffusion filter. Neural Comput. Appl. 27(5), 1273–1300 (2016)
28. Anam, T., Akram, M.U.: Personal identification using ear recognition. TELKOMNIKA (Telecommun. Comput. Electron. Control) 10(2), 321–326 (2012)
29. Azar, A.T., El-Said, S.A., Balas, V.E., Olariu, T.: Linguistic hedges fuzzy feature selection for differential diagnosis of Erythemato-Squamous diseases. In: Balas, V., Fodor, J., Várkonyi-Kóczy, A., Dombi, J., Jain, L. (eds.) Soft Computing Applications. Advances in Intelligent Systems and Computing, vol. 195. Springer, Berlin, Heidelberg (2013)
30. Kumar, S.S., Inbarani, H.H., Azar, A.T., Own, H.S., Balas, V.E., Olariu, T.: Optimistic multi-granulation rough set-based classification for neonatal jaundice diagnosis. In: Balas, V., Jain, L.C., Kovacevic, B. (eds.) Soft Computing Applications. SOFA 2014. Advances in Intelligent Systems and Computing, vol. 356. Springer, Cham (2016)
31. Pintea, C.M., Matei, O., Ramadan, R.A., Pavone, M., Niazi, M., Azar, A.T.: A fuzzy approach of sensitivity for multiple colonies on ant colony optimization. In: Balas, V., Jain, L., Balas, M. (eds.) Soft Computing Applications. SOFA 2016. Advances in Intelligent Systems and Computing, vol. 634. Springer, Cham (2018)
32. Azar, A.T., Ali, H.S., Balas, V.E., Olariu, T., Ciurea, R.: Boosted decision trees for vertebral column disease diagnosis. In: Balas, V., Jain, L.C., Kovacevic, B. (eds.) Soft Computing Applications. SOFA 2014. Advances in Intelligent Systems and Computing, vol. 356. Springer, Cham (2016)
33. Tayyaba, S., Ashraf, M.W., Afzulpurkar, N., Hussain, A., Imran, M., Noreen, A.: A novel ear identification system for security applications. Int. J. Comput. Commun. Eng. 2(2), 125–128 (2013)
34. Yin, P.Y.: Particle swarm optimization for point pattern matching. J. Vis. Commun. Image Represent. 17(1), 143–162 (2006)
Puzzle Learning Trail Generation Using Learning Blocks

Doru Anastasiu Popescu¹, Daniel Nijloveanu², and Nicolae Bold¹

¹ Department of Mathematics and Computer Science, University of Pitesti, Pitești, Romania
[email protected], [email protected]
² Faculty of Management, Economic Engineering in Agriculture and Rural Development, University of Agronomic Sciences and Veterinary Medicine Bucharest, Slatina, Romania
[email protected]
Abstract. Many training tools and methods have been developed in recent years. Many of them, based on various mathematical models, have proven their trustworthiness and applicability in practice, helping the trainer to use information technology to the advantage of class work. One method that was established recently and is extremely useful in teaching is training by using blocks. In this paper we present a method of automatically generating a sequence of blocks, generally called a learning trail, using a set of given blocks. These blocks have special characteristics of connection with the other blocks, due to the fact that the trail has a quadratic form. The trail generation has a genetic basis and is built using a genetic-algorithm-based implementation.

Keywords: Genetic · Tool · Web application · Education · Assessment
1 Introduction

While many novel methods of training appear, the subject is always the same: gaining knowledge in order to face real issues. A wide variety of approaches have been devised to fulfill this purpose, and many of them use information technology tools to accomplish it [2, 3, 12]. Learning by blocks is one of the newer approaches in many areas of expertise, especially in learning programming [13]. In many cases, the studied problem is the generation of learning paths or trails [1]. The idea is already used in programming software such as Scratch [6] or Lego Mindstorms [7]. A similar idea can be used to generate paths [8] of tests or even lessons. The generation is implemented using a genetic algorithm, part of the evolutionary optimization algorithm class [18, 19]. The usage of trails or paths is not limited to training: it can also be applied to areas such as organizational management [4], where optimal paths of activities can be generated. This paper studies the possibility of using learning trail generation in educational processes for organisational and curricular purposes.

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 385–391, 2021. https://doi.org/10.1007/978-3-030-51992-6_30

The initial data that is
D. Popescu et al.
provided consists of bits or blocks of information that can be linked to one another. The information in each block consists of either a step within the process of solving a particular problem given to the student or a piece of organisational information (a topic or a chapter within a course). The paper is structured in several sections. Section 2 presents the description of our proposed model for generating learning trails using learning blocks. The third section shows some theoretical results regarding the implementation of the model presented in Sect. 2. Finally, the conclusions draw some guidelines for future work.
2 Model Description The model of learning that uses building blocks is widely used as a teaching tool in various educational systems [5, 14, 17]. The idea is similar to generating items for tests in an assessment context [10]. Creating a block trail requires organizational skills, and this process has characteristics that permit its usage within education process planning, which is very important in the context of adaptive learning systems [9]. This can be done by the human mind, but also using specific algorithmic methods, due to their heuristic characteristics [11]. The whole concept also links to distributed cognition [15], because solving a problem can be done by taking bits of information from a larger common pool of separate bits of information. Also, related to problem solving using block-based learning, the concept borrows ideas from constructionism in learning [16], which claims that learning proceeds by using the known bits of information in order to obtain other, unknown pieces of information. Basically, having at hand a number of learning blocks, we would like to create a learning trail that satisfies specific requirements. The structural components of the model are the building blocks and the trail that must be built. But, firstly, we shall define a block and a trail of blocks. A block B_i is a structure of information that contains a learning sequence. Also, every block has four matchings, one on each side: north, south, east and west. A connection between two blocks can be made using various data structures. The connections can be codified as numbers, keywords, booleans or complex data structures.

B_i = {n(B_i), e(B_i), s(B_i), w(B_i)}, i = 1, …, N   (1)
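For illustration only (the paper's implementation is in Java; the names below are ours), a block with one connection code per side, as in relation (1), can be sketched as:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    """A learning block with one connection code per side, as in relation (1)."""
    n: int  # north-side connection code
    e: int  # east-side connection code
    s: int  # south-side connection code
    w: int  # west-side connection code

# block 14 of the example in Sect. 3: north = 1, east = 2, south = 1, west = 2
b14 = Block(n=1, e=2, s=1, w=2)
```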
A trail of blocks is a two-dimensional tabular structure formed of K * K blocks, built based on given requirements:

TB = | B_1           B_2           …  B_K   |
     | B_{K+1}       B_{K+2}       …  B_{2K} |
     | …             …             …  …     |
     | B_{(K-1)K+1}  B_{(K-1)K+2}  …  B_{KK} |   (2)
We can now formulate the statement of the problem that needs to be solved. There are given N initial blocks that can be connected with other blocks on the four edges,
Puzzle Learning Trail Generation Using Learning Blocks
with every block containing a learning sequence. For every block, the four connections on the edges are known. Based on these, it is required to determine a quadratic learning trail of size K * K, where the component blocks are connected on the four sides. The input data of the model consists of: – N (the number of initial blocks); – K (the desired dimension of the trail). Now that we have settled the components of the model, we will describe the main requirement for the given problem: – for a given sequence of blocks, every side of a block must match the corresponding sides of the blocks in the four directions:

w(B_i) = e(B_{i-1}), e(B_i) = w(B_{i+1}), n(B_i) = s(B_{i-K}), s(B_i) = n(B_{i+K})   (3)
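The matching condition (3) can be checked programmatically. The sketch below is our Python illustration (not the authors' code), with the trail stored row-major as a list of (n, e, s, w) tuples; it verifies one block against whichever neighbours exist:

```python
def block_matches(trail, i, K):
    """Check relation (3) for the block at 0-based position i of a K*K trail."""
    n, e, s, w = trail[i]
    ok = True
    if i % K > 0:                            # a west neighbour exists
        ok = ok and w == trail[i - 1][1]     # w(B_i) = e(B_{i-1})
    if i % K < K - 1:                        # an east neighbour exists
        ok = ok and e == trail[i + 1][3]     # e(B_i) = w(B_{i+1})
    if i >= K:                               # a north neighbour exists
        ok = ok and n == trail[i - K][2]     # n(B_i) = s(B_{i-K})
    if i < K * (K - 1):                      # a south neighbour exists
        ok = ok and s == trail[i + K][0]     # s(B_i) = n(B_{i+K})
    return ok
```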
– an obvious numerical requirement is:

K * K ≤ N   (4)
In practice, it is enough that requirements (3) and (4) are respected during the process of building the learning trail.
3 Implementation and Results In order to solve the problem of generating learning trails, we chose a genetic algorithm. Basically, the blocks can codify bits of information, as follows: – in the first case, for solving a particular given problem, the blocks codify steps of solving the problem; for example, describing a controlled robot movement in order to accomplish a task; – setting up a learning path; in this case, the blocks codify topics or sections within a course and help structure them into learning paths for organisational purposes. These blocks have four connections on the four sides, like puzzle pieces, and can be connected to each other. Given the blocks, a learning path or trail is required that maximizes the number of connections within the path. We chose to implement a learning path with topics codified by order numbers. Also, the connection type for each side of each block is known. The implementation that we chose is based on the following components: • input data: N (the number of available blocks), K (the desired size of the trail), NrG (the number of generations), NrPOP (the size of the initial population), rm (mutation rate), ri (crossover ratio)
• for every block, four numbers are given that represent the codification of the four connections, in order: up, right, down and left • output data: the learning trail in the form of a matrix • genetic structures: genes represented by blocks and chromosomes represented by a one-dimensional representation of the trail matrix • genetic operations: – generation of the initial population – mutation:

(B_1, B_2, B_3, …, B_i, …, B_{KK}), i = random(1, K*K) → (B_1, B_2, B_3, …, M, …, B_{KK}), M = random(1, N)   (5)
– crossover:

B^s = (B^s_1, B^s_2, B^s_3, …, B^s_i, B^s_{i+1}, …, B^s_{KK}), B^q = (B^q_1, B^q_2, B^q_3, …, B^q_i, B^q_{i+1}, …, B^q_{KK}) →
B^{s'} = (B^s_1, B^s_2, B^s_3, …, B^s_i, B^q_{i+1}, …, B^q_{KK}), B^{q'} = (B^q_1, B^q_2, B^q_3, …, B^q_i, B^s_{i+1}, …, B^s_{KK}),
s, q = 1, …, NrPOP; i = random(1, K*K)   (6)
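In Python (an illustrative transcription of ours; the paper's implementation is in Java), operators (5) and (6) on chromosomes stored as lists of block indices could look like:

```python
import random

def mutate(chromosome, n_blocks, rng=random):
    """Relation (5): replace the gene at a random position by a random block."""
    child = list(chromosome)
    i = rng.randrange(len(child))           # i = random(1, K*K)
    child[i] = rng.randrange(n_blocks)      # M = random(1, N)
    return child

def crossover(parent_s, parent_q, rng=random):
    """Relation (6): one-point crossover exchanging the tails of two parents."""
    i = rng.randrange(1, len(parent_s))     # cut point
    return (parent_s[:i] + parent_q[i:],
            parent_q[:i] + parent_s[i:])
```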
• fitness function:

S = {s(B_i) | s(B_i) = n(B_{i+K}), 1 ≤ i ≤ K(K−1)}
N = {n(B_i) | n(B_i) = s(B_{i−K}), K < i ≤ K*K}
E = {e(B_i) | e(B_i) = w(B_{i+1}), 1 ≤ i ≤ K*K − 1}
W = {w(B_i) | w(B_i) = e(B_{i−1}), 1 < i ≤ K*K}
f(TB) = Σ_{i=1}^{K*K} |S ∪ N ∪ E ∪ W|   (7)
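One possible Python reading of the fitness (7) — counting each matched connection between adjacent blocks once, so the maximum is 2*K*(K−1), i.e. 12 for K = 3 — is the following sketch of ours; note that the paper's exact counting convention may differ slightly:

```python
def fitness(trail, blocks, K):
    """Count matched connections in a K*K trail of block indices.

    blocks[j] is an (n, e, s, w) tuple of connection codes for block j.
    """
    f = 0
    for i in range(K * K):
        n, e, s, w = blocks[trail[i]]
        if i % K < K - 1 and e == blocks[trail[i + 1]][3]:
            f += 1          # east side matches west side of the right neighbour
        if i < K * (K - 1) and s == blocks[trail[i + K]][0]:
            f += 1          # south side matches north side of the lower neighbour
    return f
```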
The implementation was made using the Java programming language. As genetic structures, we use genes codifying blocks and chromosomes codifying learning trails. The chosen genetic operations are the mutation and the one-point crossover. These were chosen because they are the most used and proved to give the best results in previous genetic algorithm implementations of the authors. As fitness function, we use the number of correct connections that every block has, given that the connections we use are numerical. The genetic algorithm has the following form:

int i;
read();
generation_init_population();
for (i = 0; i < NrG; i++) {
    mutation();
    crossover();
    sort();
}
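Joining the operators above, a compact Python counterpart of this loop (illustrative only; the population handling and default parameter values are our assumptions, not the authors' Java implementation) could read:

```python
import random

def run_ga(blocks, K, NrG=100, NrPOP=20, seed=1):
    """Evolve K*K trails of block indices; fitness counts matched connections."""
    rng = random.Random(seed)
    size = K * K

    def fit(t):
        f = 0
        for i in range(size):
            n, e, s, w = blocks[t[i]]
            if i % K < K - 1 and e == blocks[t[i + 1]][3]:
                f += 1
            if i < K * (K - 1) and s == blocks[t[i + K]][0]:
                f += 1
        return f

    pop = [[rng.randrange(len(blocks)) for _ in range(size)] for _ in range(NrPOP)]
    for _ in range(NrG):
        s, q = rng.sample(range(len(pop)), 2)         # pick two parents
        i = rng.randrange(1, size)
        pop.append(pop[s][:i] + pop[q][i:])           # crossover, relation (6)
        child = list(rng.choice(pop))
        child[rng.randrange(size)] = rng.randrange(len(blocks))
        pop.append(child)                             # mutation, relation (5)
        pop.sort(key=fit, reverse=True)               # sort by fitness
        pop = pop[:NrPOP]                             # keep the best NrPOP trails
    return pop[0], fit(pop[0])
```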
The sorting is made by fitness, the most suitable trail being the one with the maximum number of correct connections. We now take a set of input data as an example showing how a trail is determined. For a set of N = 40 initial blocks, 100 generations and a trail size of K = 3 (K * K = 9), the next table presents the input data for the example.

Table 1. Input data for the example

Block  North  East  South  West     Block  North  East  South  West
 1.      1      1     1      1       21.     1      1     1      1
 2.      1      2     2      1       22.     1      2     2      1
 3.      1      2     1      2       23.     1      2     1      2
 4.      2      2     1      1       24.     2      2     1      1
 5.      1      2     1      2       25.     1      2     1      2
 6.      1      1     1      2       26.     1      1     1      2
 7.      2      2     2      2       27.     2      2     2      2
 8.      1      2     2      2       28.     1      2     2      2
 9.      2      2     2      1       29.     2      2     2      1
10.      2      1     2      2       30.     2      1     2      2
11.      1      1     1      1       31.     1      1     1      1
12.      1      2     1      2       32.     1      2     1      2
13.      2      2     1      1       33.     2      2     1      1
14.      1      2     1      2       34.     1      2     1      2
15.      1      1     1      2       35.     1      1     1      2
16.      2      2     2      2       36.     2      2     2      2
17.      1      2     2      2       37.     1      2     2      2
18.      2      2     2      1       38.     2      2     2      1
19.      2      1     2      2       39.     2      1     2      2
20.      1      1     1      1       40.     1      1     1      1
The numerical values for the four sides of the blocks shown in Table 1 codify the type of connection with the other blocks on that side. They can be of any type (e.g., strings – keywords [11], booleans etc.), but we chose numerical values for the easy understanding of the example. For example, block 14 can be connected with block 15 on the north side, block 13 on the south, block 2 on the left and block 16 on the right. For the given example, a suitable trail found by the algorithm was 10 15 31 2 14 16 9 13 7, with f = 8. The numbers represent the order numbers of the blocks in reading order. The next figure presents the output trail in its quadratic form (Fig. 1).
Fig. 1. Block sequence for the given input data (the 3 * 3 trail 10, 15, 31 / 2, 14, 16 / 9, 13, 7, with the connection codes marked on each side of every block)
The obtained sequence is the one with the greatest number of correct connections over the run. For each block, each matched connection must be counted a single time across the four sides. The obtained sequence is a solution to the problem that can be used either as guidance for the student in solving the problem or, in practice, to organize an educational process, such as a course or a lesson.
4 Conclusions The usage of genetic algorithms is nowadays a top trend in the evolutionary algorithms area, due to their applicability to almost any problem. Learning by building blocks is also a promising educational method under development, because of its beneficial effects on the organizational and structural acquisition of skills by the learners. The next step of this development is to adapt the method to a real training environment, in order to show its applicability in real life. In the future, we would like to test the method in a learning environment, where it can prove its potential by increasing the efficiency of learning in terms of teacher assistance and learning organization and planning.
References
1. Popescu, D.A., Bold, N., Domsa, O.: A generator of sequences of hierarchical tests which contain specified keywords. In: SACI 2016, pp. 255–260 (2016)
2. Holotescu, C.: A conceptual model for open learning environments. In: International Conference on Virtual Learning – ICVL, pp. 54–61 (2015)
3. Gaytan, J., McEwen-Beryl, C.: Effective online instructional and assessment strategies. Am. J. Distance Educ. 21(3), 117–132 (2007)
4. Goh, S.C.: Toward a learning organization: the strategic building blocks. S.A.M. Adv. Manag. J. 63(2), 15–22 (1998)
5. Gabow, H.N., Maheshwari, S.N., Osterweil, L.J.: On two problems in the generation of program test paths. IEEE Trans. Softw. Eng. SE-2(3), 227–231 (1976)
6. Massachusetts Institute of Technology educational resources. https://scratch.mit.edu/
7. https://www.lego.com/en-us/mindstorms
8. Nijloveanu, D., Bold, N., Bold, A.C.: A hierarchical model of test generation within a battery of tests. In: International Conference on Virtual Learning, pp. 147–153 (2015)
9. Popescu, E.: Adaptation provisioning with respect to learning styles in a web-based educational system: an experimental study. J. Comput. Assist. Learn. 26(4), 243–257 (2010)
10. Popescu, D.A., Bold, N., Nijloveanu, D.: A method based on genetic algorithms for generating assessment tests used for learning. In: International Conference on Intelligent Text Processing and Computational Linguistics, Conference Proceedings, 3–9 April, Konya, Turkey (2016)
11. Defta, C.L., Şerb, A., Iacob, N.M., Baron, C.: Threats analysis for E-learning platforms. Knowl. Horizons Econ. 6(1), 132–135 (2014)
12. Choi, B.: Playful assessment for evaluating teachers' competencies for technology integration. In: Langran, E., Borup, J. (eds.) Proceedings of Society for Information Technology & Teacher Education International Conference, pp. 406–410. Association for the Advancement of Computing in Education (AACE), Washington, D.C. (2018)
13. Maloney, J.H., Peppler, K., Kafai, Y., Resnick, M., Rusk, N.: Programming by choice: urban youth learning programming with scratch. In: Proceedings of the 39th SIGCSE Technical Symposium on Computer Science Education (SIGCSE 2008), pp. 367–371. ACM, New York (2008)
14. Weintrop, D., Wilensky, U.: To block or not to block, that is the question: students' perceptions of blocks-based programming. In: Proceedings of the 14th International Conference on Interaction Design and Children (IDC 2015), pp. 199–208. ACM, New York (2015)
15. Hollan, J., Hutchins, E., Kirsh, D.: Distributed cognition: toward a new foundation for human-computer interaction research. ACM Trans. Comput. Hum. Interact. 7(2), 174–196 (2000)
16. Harel, I., Papert, S. (eds.): Constructionism. Ablex Publishing, Westport (1991)
17. Roque, R.V.: OpenBlocks: An Extendable Framework for Graphical Block Programming Systems. MIT (2007)
18. Rotar, C., Iantovics, L.B., Arik, S.: A novel osmosis-inspired algorithm for multiobjective optimization. In: Liu, D., et al. (eds.) 24th International Conference on Neural Information Processing (ICONIP 2017), Guangzhou, China, 14–18 November 2017. LNCS 10637, pp. 80–88 (2017). https://doi.org/10.1007/978-3-319-70093-9_9
19. Rotar, C., Iantovics, L.B.: Directed evolution – a new metaheuristic for optimization. J. Artif. Intell. Soft Comput. Res. 7(3), 183–200 (2017). https://doi.org/10.1515/jaiscr-2017-0013. ISSN 2083-2567
Design of an Extended Self-tuning Adaptive Controller Ioan Filip, Florin Dragan, Iosif Szeidert(&), and Cezara Rat Department of Automation and Applied Informatics, University Politehnica Timisoara, Bvd. V. Parvan, No. 2, Timisoara 300223, Romania {ioan.filip,florin.dragan,iosif.szeidert}@aut.upt.ro, [email protected]
Abstract. This paper presents an extended strategy for a self-tuning minimum variance control system, using an additional external integral component. In order to estimate the process parameters, a parameter estimator based on the Givens orthogonal transformation was used. The strategy was tested and validated for the case of a synchronous generator connected to a power energy system, considered as the controlled process. The proposed solution tries to achieve two main goals: control system stability and a zero steady-state error. For certain plants, the second objective can be achieved by supplementing the control with an external integral component operating in parallel with the minimum variance controller. The paper proves the necessity of this external integral component under the conditions when the minimum variance control law cannot contain an integral term, which would destabilize the control system. Keywords: Self-tuning minimum variance controller · External integral component · Modelling and simulation · Synchronous generator
1 Introduction Although the minimum variance control techniques are already considered classic solutions, their applicability to nonlinear processes still represents a serious challenge, due to the process complexity and various specific issues that must be solved [1, 2]. Ensuring the control system stability while achieving good control performance raises such issues, which (for particular plants) may require a change of the control strategy by adding supplementary control components. The present paper considers such a nonlinear process, consisting of a synchronous generator connected to a power system. The technical literature shows that even though this plant is exactly described by a 7th order nonlinear mathematical model (Park's equations), its behavior around an operating point can be approximated by a 4th order linearized model [3, 4]. As a starting point, based on this simplified linearized model, a minimum variance control strategy can be designed.
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 392–402, 2021. https://doi.org/10.1007/978-3-030-51992-6_31
2 The Design of the Self-tuning Adaptive Control System The design of a minimum variance control system, particularized for the case of a synchronous generator connected to a power system through a long transmission line, requires the following initial information [5, 6]: – The 4th order linear model of the process (synchronous generator connected to a power system), described by a discrete stochastic equation:

A(z⁻¹)y_t = z⁻¹B(z⁻¹)u_t + C(z⁻¹)e_t + d   (1)
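To make the model concrete, one simulation step of (1) with C(z⁻¹) = 1 can be written directly from the polynomial definitions. This is an illustrative Python sketch of ours, not the authors' simulator (which is a nonlinear synchronous generator model):

```python
import random

def arx_step(a, b, y_past, u_past, sigma=0.0, rng=random):
    """One step of A(z^-1) y_t = z^-1 B(z^-1) u_t + e_t, i.e.
    y_t = -(a1*y_{t-1} + ... + a4*y_{t-4})
          + b0*u_{t-1} + ... + b3*u_{t-4} + e_t.
    y_past and u_past hold the most recent samples first."""
    y = -sum(ai * yi for ai, yi in zip(a, y_past))
    y += sum(bi * ui for bi, ui in zip(b, u_past))
    return y + (rng.gauss(0.0, sigma) if sigma > 0 else 0.0)
```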
where: y_t - the process output (terminal voltage variation), u_t - the process input (excitation voltage variation), e_t - the stochastic disturbance (zero average and dispersion σ²), d - the process output for zero input, z⁻¹ - the one-step delay operator, A(z⁻¹), B(z⁻¹) - process polynomials (of known order): A(z⁻¹) = 1 + a₁z⁻¹ + a₂z⁻² + a₃z⁻³ + a₄z⁻⁴, B(z⁻¹) = b₀ + b₁z⁻¹ + b₂z⁻² + b₃z⁻³, and C(z⁻¹) = 1 (in order to avoid the parameter estimation of the noise filtering polynomial). Also, the process output for zero input was considered d = 0 (without affecting the generality of the foregoing). The following criterion function is considered for minimization:

J = E{[y_{t+1} − w_t]² + [Q′(z⁻¹)u_t]²}   (2)
where: y_{t+1} – the controlled output at discrete time t + 1, u_t – the controller output at discrete time t, w_t – the set point, Q′ – a chosen polynomial that penalizes the control, E{·} – the mean operator [5, 7]. The design of the one-step-ahead optimal predictor starts from relation (1), rewritten for the time moment t + 1, and leads to the relation:

ŷ_{t+1/t} = y_{t+1} − e_{t+1} = (F(z⁻¹)/C(z⁻¹))y_t + (B(z⁻¹)/C(z⁻¹))u_t + d/C(z⁻¹)   (3)

where:

F(z⁻¹) = z[C(z⁻¹) − A(z⁻¹)] = f₁ + f₂z⁻¹ + … + fₙz⁻ⁿ⁺¹   (4)
Substituting yt þ 1 from relation (3) in relation (2), the criterion function becomes:
J = E{[ŷ_{t+1/t} + e_{t+1} − w_t]² + [Q′(z⁻¹)u_t]²}   (5)
However, since E{e²_{t+1}} = σ² (the noise dispersion), the minimization of the criterion function J is equivalent to the minimization of the following criterion function:

I = [ŷ_{t+1/t} − w_t]² + [Q′(z⁻¹)u_t]² + σ²   (6)

So ∂J/∂u_t = 0 involves ∂I/∂u_t = 0, resulting in the next equation:
2[ŷ_{t+1/t} − w_t](∂ŷ_{t+1/t}/∂u_t) + 2Q′(0)Q′(z⁻¹)u_t = 0   (7)

Taking into account relation (3), there can be written: ∂ŷ_{t+1/t}/∂u_t = b₀.
Noting

Q(z⁻¹) = Q′(0)Q′(z⁻¹)/b₀   (8)
and replacing relation (3) in (6), results:

(F(z⁻¹)/C(z⁻¹))y_t + (B(z⁻¹)/C(z⁻¹))u_t + d/C(z⁻¹) − w_t + Q(z⁻¹)u_t = 0   (9)
Considering d = const., the following control law results:

u_t = [C(z⁻¹)w_t − F(z⁻¹)y_t − d] / [B(z⁻¹) + Q(z⁻¹)C(z⁻¹)]   (10)
where F(z⁻¹) = z[C(z⁻¹) − A(z⁻¹)]. For C(z⁻¹) = 1 and d = 0, F(z⁻¹) = z[1 − A(z⁻¹)] results, and the control law becomes:

u_t = (w_t − z[1 − A(z⁻¹)]y_t) / (B(z⁻¹) + Q(z⁻¹))   (11)
The control law described by relation (11) will be used in the next case studies, considering Q(z⁻¹) = q(1 − q₁z⁻¹) (where q and q₁ are off-line tuning parameters of the controller). Taking into account that the coefficients of the A(z⁻¹) and B(z⁻¹) polynomials are parameter estimates of the process model, the control law can be rewritten:

u_t = (w_t − z[1 − Â(z⁻¹)]y_t) / (B̂(z⁻¹) + Q(z⁻¹))   (12)
where ^ denotes the estimates. For the next implementation and case studies, an on-line parameter estimator based on the Givens orthogonal transformation was used [8, 9]. It can be noticed that in the quadratic criterion function (relation (2)) another polynomial, noted Q′(z⁻¹), is used. The relationship between the two polynomials is given by relation (8). The following generalized form of the Q′(z⁻¹) polynomial can be considered: Q′(z⁻¹) = q′(1 − q₁z⁻¹). Taking into account relation (8), results:

Q(z⁻¹) = Q′(0)Q′(z⁻¹)/b₀ = (q′²/b₀)(1 − q₁z⁻¹) = q(1 − q₁z⁻¹)   (13)
So:

u_t = (w_t − z[1 − A(z⁻¹)]y_t) / (B(z⁻¹) + Q(z⁻¹)) = (w_t − z[1 − A(z⁻¹)]y_t) / (B(z⁻¹) + q(1 − q₁z⁻¹))   (14)
and for the particular polynomials of the discrete Eq. (1), the control law becomes:

u_t = (w_t + [â₁ + â₂z⁻¹ + â₃z⁻² + â₄z⁻³]y_t) / (b̂₀ + q + [b̂₁ − qq₁]z⁻¹ + b̂₂z⁻² + b̂₃z⁻³)   (15)
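Read as a difference equation solved for the current command u_t, the law (15) can be applied with the estimated coefficients; the following Python sketch is our illustration (the argument conventions are our assumptions, not the paper's implementation):

```python
def mv_control(w, y_hist, u_hist, a_hat, b_hat, q=0.0001, q1=0.0):
    """Control law (15) as a difference equation solved for u_t.

    y_hist = [y_t, y_{t-1}, y_{t-2}, y_{t-3}] (most recent first),
    u_hist = [u_{t-1}, u_{t-2}, u_{t-3}],
    a_hat = [a1, a2, a3, a4], b_hat = [b0, b1, b2, b3] (estimates).
    """
    num = w + sum(ai * yi for ai, yi in zip(a_hat, y_hist))
    # denominator polynomial (b0+q) + (b1 - q*q1) z^-1 + b2 z^-2 + b3 z^-3
    past = ((b_hat[1] - q * q1) * u_hist[0]
            + b_hat[2] * u_hist[1]
            + b_hat[3] * u_hist[2])
    return (num - past) / (b_hat[0] + q)
```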
Obviously, b₀ is practically a parameter estimate of the process model. However, the technical literature [5, 10] and the performed case studies show that in relation (13) the term q = q′²/b₀ can be considered a constant parameter (known as the control penalty factor). The literature also recommends, as usual values of q, the range [0.0001…0.01]. Depending on the q and q₁ values, some cases can be noticed: – If q₁ = 0, then Q(z⁻¹) = q, which ensures a control penalization. In the case of non-minimum-phase processes, an adequate choice of the q control penalty factor can increase the control system stability. The disadvantage of such a choice consists in the occurrence of a steady-state error. – If q₁ = 1 and Q(z⁻¹) = q(1 − z⁻¹), an internal integral element is added to the control law, in order to achieve a zero steady-state error. Unfortunately, for processes whose linearized models have transfer function zeros close to the unit circle (the case of the synchronous generator), such an approach leads to an unstable system. – If q₁ ∈ (0…1), the polynomial Q(z⁻¹) = q(1 − q₁z⁻¹) ensures a control penalty (by the term q), simultaneously with a decrease of the steady-state error (by the term q₁). The closer q₁ is to 1, the more the steady-state error decreases (q₁ = 1 being equivalent to an integral element). Such a choice of the Q(z⁻¹) polynomial is justified by the control system stability, and the q₁ parameter acts only on the steady-state error. Analyzing the above-mentioned facts, none of the three possible solutions for the controller tuning is completely satisfactory, as will be proven in the following case studies.
The proposed solution extends the designed minimum variance control law with an additional PI component (leading to a zero steady-state error).
3 The Design of the Extended Adaptive Controller. Case Studies The case studies outlined below will highlight the problems encountered in controller tuning (related to the system stability and the steady-state error), and will propose and validate a solution to resolve these issues. All variables in the following case studies are expressed in per unit (p.u.) and the abscissa axis represents time (seconds). Case 1) The following values of the simulation parameters were considered: q = 0.0001, q₁ = 1, σ² = 10⁻⁸ (q₁ = 1 meaning practically an integral component in the control law). The set point variance is presented in Fig. 1a (with a slope variation of 0.05 relative units/second).
Fig. 1a. Set point variance.
Fig. 1c. Controller output variance.
Fig. 1b. Terminal voltage variance (controlled output).
Fig. 1d. Parameter estimations of polynomial A.
Fig. 1e. Parameter estimations of polynomial B.
In Fig. 1b there can be noticed a good tracking of the reference by the controlled output, the performances being very good from this point of view. A zero steady-state error can also be noticed. However, the controller output variance is too high (Fig. 1c), reaching values that are above the saturation limit of the execution element. A retuning of the controller is required in order to achieve a significant decrease of the control variance. Figures 1d and 1e show the process parameter estimates. It can be noticed that a change of the set point (and thus of the operating point) involves a change in the parameter estimates, and a return to the previous operating point does not involve a restoration of the previous values of the estimates. This statement is even more evident in the present case, where the linearized model was used only for the design phase of the adaptive control law and not for the process implementation. Case 2) Based on the above, the control penalty factor is increased (q = 0.01) in order to achieve a smaller control variance. Also, the integral term is maintained (q₁ = 1).
Fig. 2a. Parameter estimations.
Fig. 2b. Controller output variance.
All performed tests (Figs. 2a, 2b) show that for such a controller tuning, the system becomes unstable. Figure 2a shows the parameter estimates' time variation, numerically stable until time 3.5 s, whilst the controller output oscillates and increases uncontrolled,
even before this moment and also before the set point change (Fig. 2b). The estimates and the controller output show that the system instability is not due to a numerical instability of the estimator (the estimates still succeed in tracking the evolution of the real parameters until time t = 3.5 s), the cause being the improper values of the two controller tuning parameters (q and q₁). This behavior was expected, the control system being unstable also in the absence of control penalty (q = 0). The explanation consists in the fact that the plant model zeros (the B polynomial roots) are very close to the unit circle. These plant model zeros can become poles of the controller (within the control structure - see the control law). Certain estimation errors, or a variation of the controller pole as a consequence of the tuning, can pull these poles out of the unit circle (so the system becomes unstable). In another paper of the authors [11], the same control law (designed through the minimization of the criterion function (2)) was also tested for an adaptive control system using a recursive least squares (RLS) estimator. In this paper, by using another type of estimator (based on the Givens orthogonal transformation), the relatively similar results validate the stated conclusion: the system instability is caused by an improper tuning of the controller (and therefore is not caused by the parameter estimator). Case 3) As a consequence, this new case study considers q₁ = 0.6, ensuring a "weak" integral component in order to stabilize the system, respectively q = 0.01, to increase the control penalization (Fig. 3a).
Fig. 3a. Terminal voltage variance (controlled output).
Fig. 3b. Controller output variance.
Fig. 3c. Parameter estimations of polynomial A.
Fig. 3d. Parameter estimations of polynomial B.
There can be noticed a decrease of the controller output variance, which reaches acceptable values (Fig. 3b). Unfortunately, a non-zero steady-state error is present, simultaneously with an increase of the overshoot and of the settling time, resulting in a slight performance degradation. Figures 3c and 3d show the estimates, and it can be seen that the controller parameters influence the parameter estimates (see Case 1 versus Case 3). Case 4) For the same control penalty factor q = 0.01, q₁ = 0 is considered, completely rejecting the integral component. There can be noticed a clear decrease of the controller output variance (Fig. 4b), but at the same time, removing the integral component leads to an increase of the steady-state error (Fig. 4a).
Fig. 4a. Terminal voltage variance (controlled output).
Fig. 4b. Controller output variance.
Case 5) For the same values of the controller parameters as in the previous case (q = 0.01 and q₁ = 0), a relatively simple solution is proposed to achieve a zero steady-state error, by adding an external integral loop operating in parallel with the adaptive controller (see Fig. 5).
Fig. 5. Extended self-tuning adaptive control system.
In many cases, such an external integral loop ensures control performances that the main controller alone (in this case, the minimum variance controller) does not succeed in fulfilling (a zero steady-state error). Therefore, the action of the main controller is accompanied by the action of an external control loop. In this way, in the present case, the role of the adaptive controller is much diminished in the stationary regimes (remaining important only for transient regimes). In Fig. 6a a zero steady-state error can be noticed, simultaneously with a decreased control variance (Fig. 6b).
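Per sample, the structure of Fig. 5 reduces to summing the two commands; a minimal Python sketch of ours (the integral gain ki and sampling step dt are hypothetical values, not taken from the paper):

```python
def extended_control(w, y, u1, i_state, ki=0.05, dt=0.001):
    """Extended controller of Fig. 5: u = u1 + u2, where u1 is the adaptive
    minimum variance command and u2 is the external integral loop output."""
    i_state = i_state + (w - y) * dt     # accumulate the control error
    u2 = ki * i_state                    # external integral component
    return u1 + u2, i_state
```

Once the error reaches zero, the integrator state stops changing, so in steady state the external integral component alone can hold the output at the set point.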
Fig. 6a. Terminal voltage variance.
Fig. 6b. Controller output variance.
Fig. 6c. Controllers outputs.
Figure 6c depicts the overall control variable (u) and its components delivered by the adaptive controller itself (u₁), respectively by the external integral loop (u₂) (see also Fig. 5). The predictive character of the adaptive controller ensures a faster response time for transient regimes. As a result, the external integral loop renders the internal integral component of the minimum variance control law unnecessary. So, the controller tuning q = 0.01, q₁ = 0 together with the external integral loop is a good solution to solve the problem. As already mentioned, for the same controlled process, the results obtained by using the Givens estimator are similar to the ones obtained by using the RLS estimator [11], thus proving the validity of the proposed control solution. An advantage of this solution is the much reduced effort required for tuning the integral component (practically a PI controller), without demanding a high precision. It is well known that the tuning of a PI controller (especially for a nonlinear process) is a quite difficult task [12]. In the present case, the role of detecting an operating point change, followed by controller retuning (using parameter estimates), belongs to the self-tuning minimum variance controller.
4 Conclusions The proposed control strategy extends the minimum variance control law by using an additional external integral component. The goal of this component is to ensure a zero steady-state error, simultaneously with the stabilization of the control system. The extended control solution has been tested and validated for the case of a synchronous generator connected to a power energy system, considered as the controlled process. The important feature of this process is that the linear model zeros are very close to the edge of the unit circle. Based on this particular linearized model, the minimum variance control law (obtained by minimizing a criterion function) may lead either to system instability or to a non-zero steady-state error (for any setting of the tuning parameters). With a proper tuning of the control penalty parameters ensuring the system stability, the solution proposes an external integral loop only in order to obtain a zero steady-state error (a task that, for the considered process, cannot be achieved by the minimum variance controller).
Although the control strategy has been tested and validated for a particular case, the proposed solution is also valid for other processes, where the minimum variance control law cannot simultaneously ensure both control system stability and a zero steady-state error.
References
1. Wellstead, P.E., Zarrop, M.B.: Self-Tuning Systems-Control and Signal Processing. Wiley, Oxford (1991)
2. Åström, K.J., Kumar, P.R.: Control: a perspective. Automatica 1(50), 3–43 (2014)
3. Filip, I., Prostean, O., Szeidert, I., Vasar, C.: An improved structure of an adaptive excitation control system operating under short-circuit. Adv. Electr. Comput. Eng. 16(2), 43–50 (2016)
4. Budisan, N., Prostean, O., Robu, N., Filip, I.: Revival by automation of induction generator for distributed power systems, in Romanian academic research. Renewable Energy 32(9), 1484–1496 (2007)
5. Åström, K.J., Wittenmark, B.: Adaptive Control. Addison-Wesley, Reading (1989)
6. Filip, I., Vasar, C.: About initial setting of a self-tuning controller. In: Proceedings of the 4th International Symposium on Applied Computational Intelligence and Informatics, SACI 2007, Timisoara, Romania, pp. 251–256 (2007)
7. Yanou, A., Minami, M., Matsuno, T.: Strong stability system regulating safety for generalized minimum variance control. In: Proceedings of the 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Limassol, Cyprus, pp. 1–8 (2017)
8. Filip, I., Szeidert, I.: Givens orthogonal transformation based estimator versus RLS estimator – case study for an induction generator model. In: Soft Computing Applications, Advances in Intelligent Systems and Computing, vol. 357, pp. 1287–1299. Springer (2015)
9. Mesloub, A., Abed-Meraim, K., Belouchrani, A.: A new algorithm for complex non-orthogonal joint diagonalization based on shear and Givens rotations. IEEE Trans. Sig. Process. 62(8) (2014). https://doi.org/10.1109/tsp.2014.2303947
10. Filip, I., Prostean, O., Szeidert, I., Vasar, C.: Consideration regarding the convergence and stability of an adaptive self-tuning control system. In: Proceedings of the 5th IEEE International Conference on Computational Cybernetics, Gammarth, Tunisia, pp. 75–79 (2007)
11. Filip, I., Prostean, O., Szeidert, I., Vasar, C., Prostean, G.: Self-tuning control using external integrator loop for a synchronous generator excitation system. In: IEEE Conference on Emerging Technologies and Factory Automation, Prague, Czech Republic, 20–22 September 2006. https://doi.org/10.1109/etfa.2006.355251
12. O'Dwyer, A.: Handbook of PI and PID Controller Tuning Rules. Imperial College Press, London (2006)
Assessing the Quality of Bread by Fuzzy Weights of Sensory Attributes

Anca M. Dicu, Marius Mircea Balas, Cecilia Sîrghie, Dana Radu, and Corina Mnerie

Aurel Vlaicu University of Arad, Arad, Romania
[email protected], [email protected], [email protected], [email protected]
Abstract. Assessing the quality of food products is very subjective, being based on the personal perceptions of the judges. That is why fuzzy sets and fuzzy logic are very often used in such cases. However, dedicated fuzzy software is not very accessible, especially for non-specialists. That is why we developed an assessing/sorting procedure that consists of a simple statistical calculus of scores for each sensory attribute of the product, and the weighted adding of the scores in order to obtain a global quality assessment. Fuzzy processing appears only when setting the weights of the attributes, by introducing a fuzzy weight for each sensory attribute. This procedure was implemented by Matlab script functions and applied to a local product of the Arad region of Romania, the Pecica traditional bread.

Keywords: Food products assessment · Sensory attribute · Matlab script function · Fuzzy weight
1 Introduction

Assessing the quality of food products using sensory analysis is a very subjective process, based on the personal perceptions of the judges. Sensory evaluation (SE) has been defined as a scientific method used to evoke, measure, analyze, and interpret those responses to products as perceived through the senses of sight, smell, touch, taste, and hearing [1]. SE is the ultimate criterion for judging the quality of food [2], and it provides important and useful information to the food industry and food scientists about the sensory characteristics of food [3].

Currently, the world's most consumed food, bread, accounts for 95% of bakery sales, with global bread consumption being over 9 billion kg (20 billion lb) [4]. When we talk about the history of humans, it is impossible not to think about their diet and not to mention bread, knowing that baking bread is one of people's oldest activities [5]. However, the decrease in the production of raw food materials caused by the growth of the global population has drawn special attention to the management of food preservation, which has an important role in reducing losses [6]. One of the main causes affecting the quality of bread is staling, so the shelf life of bread is relatively short, causing economic losses around the world [5, 7]. The staling of bread, considered to be a complex phenomenon, refers to both crust staling

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 403–411, 2021. https://doi.org/10.1007/978-3-030-51992-6_32
and crumb staling. Crust staling is due to the transfer of moisture from crumb to crust, while crumb staling is related to physicochemical alterations in the starch [5]. Unfortunately, the staling process has negative repercussions in terms of the shelf life of bread [7]. According to Giménez et al. [8], the shelf life is influenced by time, environment, and the susceptibility of the product to quality change, changes that might compromise nutritional, microbiological or sensory quality. The shelf life of food products can be regarded as the period of time during which a product can be stored until it becomes unacceptable from safety, nutritional, or sensory perspectives. Shelf-life estimation of food products and beverages has become increasingly important in recent years due to technological developments and the increase in consumer interest in eating fresh, safe and high-quality products [8, 9].

Shortcomings related to the shelf life of food products have been technically addressed by packaging. In this context, new strategies and new packaging materials have been developed for food preservation, to meet current consumer requirements for safe and sustainable food with high nutritional and sensory value [8, 10]. For bread, recent studies have highlighted the applicability of a multitude of packing methods: active packaging [10–13] or modified atmosphere packaging (MAP) [14–16]. Taking into account the indirect indicators used to appreciate the staling of bread (such as texture, crust moisture and loss of moisture), the influence of the packaging material on the staling process of bread was studied. In references [17, 18], films of different thicknesses were used as packaging material, thus obtaining a moisture barrier. Determinations reported in ref. [19] have demonstrated the advantages of using cellulose textile materials, made of two layers of cotton and one layer of polyethylene, in comparison with classic packaging (PP, PE, etc.) in order to slow down the bread's staling.

In this context, different methodologies have been developed to estimate the shelf life of food products. According to ref. [8], sensory evaluation is a key factor for determining the shelf life of several food categories, including bread. The objective of this study was to understand the influence of cellulose textile packaging on the sensory qualities of packaged traditional bread. That is why we developed an assessing/sorting procedure that consists of a simple statistical calculus of scores for each sensory attribute of the product, and the weighted adding of the scores in order to obtain a global quality assessment. The fuzzy processing enables users to set the weights of each attribute in a linguistic manner, by means of a fuzzy-interpolative decision controller [20]. This procedure was implemented through Matlab script functions, and is currently applied to a local product of the Arad region of Romania, the Pecica traditional bread.
2 Scoring the Bread Samples

The traditional bread of Pecica is processed according to the ancient local recipe. Because the great wood-heated bakery ovens were put in function only once a week, people stored the bread in textile bags for at least seven days. Our objective is to scientifically demonstrate the efficiency and the correctness of this practice and to promote its revival.

Twelve non-smoking judges, aged between 22 and 50 years, were selected from staff members and students of the Food Engineering, Tourism and Environmental Protection Faculty of "Aurel Vlaicu" University of Arad. The panellists were instructed to rinse their mouths with water between samples [2, 21]. The bread samples were stored at room temperature, in different packages. The breads were evaluated for five assessment variables (Var): smell (Mr), taste (Gs), crust crispness (Dr), crumb resilience (Sf) and crumb firmness (Fr). Sensory evaluation of the bread samples was performed using a five-point scale, reported against the control, seven days after baking [7, 22]:

0 – altered parameter
1 – highly modified parameter
2 – moderately modified parameter
3 – slightly modified parameter
4 – unmodified parameter
For instance, the daily data issued by the J = 12 judges for Gs (taste), over D = 10 days, are (columns: judges 12 down to 1):

Day 1:  3 3 3 3 3 3 3 3 3 3 3 3
Day 2:  3 3 3 3 3 3 3 3 3 3 3 3
Day 3:  3 3 3 3 3 3 3 3 3 3 3 3
Day 4:  3 3 3 3 3 3 3 3 3 3 3 3
Day 5:  3 2 3 3 3 3 3 2 3 3 3 3
Day 6:  3 2 3 3 3 3 3 2 3 3 3 3
Day 7:  3 2 3 3 3 3 3 2 3 2 3 2
Day 8:  3 2 3 3 2 3 3 2 3 2 3 2
Day 9:  2 2 2 2 2 2 2 2 2 2 2 2
Day 10: 2 1 1 2 2 2 2 1 2 2 1 1
The score of each variable per day is calculated as the arithmetic mean of the judgements (see Fig. 1). If we denote the judgement of judge j in day d for taste as Gsj(d), the daily taste score will be:
Fig. 1. The scores of the assessment variables
$$\mathrm{ScoreGs}(d) = \frac{\sum_{j=1}^{J} Gs_j(d)}{J} \qquad (1)$$
The same holds for ScoreDr(d), ScoreFr(d), ScoreSf(d) and ScoreMr(d).
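The daily mean scores of Eq. (1) can be sketched in a few lines of code (plain Python rather than the paper's Matlab scripts; the variable names are ours):

```python
# Daily taste judgements Gs_j(d): rows = days 1..10, columns = the 12 judges,
# transcribed from the table above.
GS = [
    [3] * 12, [3] * 12, [3] * 12, [3] * 12,
    [3, 2, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3],
    [3, 2, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3],
    [3, 2, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2],
    [3, 2, 3, 3, 2, 3, 3, 2, 3, 2, 3, 2],
    [2] * 12,
    [2, 1, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1],
]

def daily_scores(judgements):
    """Eq. (1): arithmetic mean of the J judgements for each day d."""
    return [sum(day) / len(day) for day in judgements]

score_gs = daily_scores(GS)
print(score_gs[0])   # day 1: 3.0
print(score_gs[9])   # day 10: 19/12, roughly 1.583
```

The same function, applied to the Dr, Fr, Sf and Mr matrices, yields the remaining daily score vectors.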
3 Assessing the Overall Quality

Given the daily scores of the five assessment variables, the overall quality assessment Qual(d) is computed from them, essentially by weighted summing. The weights of the assessment variables are WDr, WFr, WSf, WGs and WMr. One can define the final quality score as

$$\mathrm{ScoreQual}(d) = \frac{\sum_{Var} W_{Var}\,\mathrm{ScoreVar}(d)}{\sum_{Var} W_{Var}} \qquad (2)$$
However, the quality analysis may be enriched by adding new assessing variables in a hierarchical way. We will cluster the five variables as follows:

– Mec = {Dr, Fr, Sf} is an assessing variable characterizing the bread's physical-mechanical properties.
– Org = {Mr, Gs} is an assessing variable characterizing the bread's organoleptic properties.

The scores are computed simply by weighted sums:

$$\mathrm{ScoreQual}(d) = \frac{W_{Mec}\,\mathrm{ScoreMec}(d) + W_{Org}\,\mathrm{ScoreOrg}(d)}{W_{Mec} + W_{Org}} \qquad (3)$$
Figures 2 and 3 show the results obtained for WDr = WFr = WSf = WGs = WMr = WMec = WOrg = 1. This basic setting may be useful when ranking several different bread samples.
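The hierarchical weighted aggregation of Eqs. (2) and (3) can be sketched as follows (a plain-Python illustration of ours, not the authors' Matlab scripts; the sample daily scores are invented for the example):

```python
def weighted_score(scores, weights):
    """Weighted mean of per-variable scores, as in Eqs. (2)-(3)."""
    num = sum(weights[v] * scores[v] for v in scores)
    den = sum(weights[v] for v in scores)
    return num / den

# Illustrative daily scores for the five assessment variables (one day d)
day_scores = {"Dr": 2.5, "Fr": 3.0, "Sf": 2.75, "Gs": 3.0, "Mr": 2.9}
w = {v: 1.0 for v in day_scores}   # basic setting: all weights equal to 1

# Flat aggregation, Eq. (2)
flat = weighted_score(day_scores, w)

# Hierarchical aggregation, Eq. (3): cluster into Mec and Org first
mec = weighted_score({k: day_scores[k] for k in ("Dr", "Fr", "Sf")}, w)
org = weighted_score({k: day_scores[k] for k in ("Gs", "Mr")}, w)
qual = weighted_score({"Mec": mec, "Org": org}, {"Mec": 1.0, "Org": 1.0})

print(flat, qual)   # 2.83 vs 2.85
```

Note that even with all weights equal, the flat and hierarchical aggregations differ slightly: clustering changes the effective weight of each variable (here, Gs and Mr each count for 1/4 instead of 1/5).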
Fig. 2. The mechanical, organoleptic and overall qualities
Using different weights, one can detail and refine the assessment according to the personal preferences of the judges or of the bread producers. Moreover, different bread types demand specific weights of the assessment variables to define their best
Fig. 3. Comparing assessment data
quality: soft bread or sandwich bread (Fr. pain de mie) presupposes a very good crumb resilience Sf, while the baguette is all about crust crispness Dr. Setting numerical weight values is immediate, but offers only a poor explanation of the reasons that justify the setting. Here the linguistic side of fuzzy sets is welcome.
4 Fuzzy Weighted Summing

When comparing the previous numeric algorithm with a fuzzy equivalent implemented with the Matlab FIS toolkit, obvious advantages show up: simplicity, transparency, full control over every bit of the algorithm. Each Matlab simulation needs initial data and generates resulting data, all nested into the corresponding Matlab Workspace. The Workspace stands as a platform enabling us to perform further, deeper data manipulations and statistics, in Matlab or in other software environments. The Fig. 3 comparison overlaps three scores. Similar functions help us compare the daily scores of different bread packages, in order to draw the final conclusions of the research on Pecica bread. The only advantage of the fuzzy algorithm, its good handling of the subjective side of the assessment process, is eclipsed in this case by the sheer size of the inference rule base. Considering 5 linguistic labels for each assessment variable, one would have to write 5^5 = 3125 rules. By hierarchizing the variables' structure, one reduces the rule base size and can introduce new assessing variables, as shown above. Now the maximum number of inference rules diminishes to 5^3 = 125 rules for Mec, 5^2 = 25 rules for Org and finally 5^2 = 25 rules for Qual.
Since we already use the weighted sum to aggregate the assessment variables, there is no need for a proper fuzzy inference rule base. The Center of Gravity (COG) defuzzification is implemented by weighted sums and, vice versa, as in our case, the weighted sum enables us to describe fuzzy concepts [20]. The simplest way to introduce fuzzy content into our method is to turn each weight, namely WDr, WFr, WSf, WGs, WMr, WMec and WOrg, into a fuzzy linguistic variable. The Fuzzy Weights of Sensory Attributes (FWSA) introduced in this way are computed separately, based on the linguistically expressed opinions of bread experts. Figure 4 illustrates the design of a five-linguistic-term FWSA:
Fig. 4. A fuzzy weight of sensory attribute
0 – no importance
0.5 – low importance
1 – medium importance
1.5 – very important
2 – essential importance
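A minimal sketch of how such a fuzzy weight could be computed is given below (our illustration, assuming singleton memberships at the five term centres of Fig. 4 and a COG-style aggregation of expert votes; the vote counts are invented):

```python
# Fuzzy Weight of a Sensory Attribute (FWSA), sketched: experts rate the
# importance of an attribute linguistically, and the crisp weight is the
# centre-of-gravity of their votes over the five term centres of Fig. 4.
TERM_CENTRE = {
    "no importance": 0.0,
    "low importance": 0.5,
    "medium importance": 1.0,
    "very important": 1.5,
    "essential importance": 2.0,
}

def fuzzy_weight(opinions):
    """COG-style aggregation: `opinions` maps each term to its vote count."""
    total = sum(opinions.values())
    return sum(TERM_CENTRE[t] * n for t, n in opinions.items()) / total

# e.g. 5 hypothetical experts rating crust crispness (Dr) for a baguette
w_dr = fuzzy_weight({"essential importance": 3, "very important": 2})
print(w_dr)  # (3*2.0 + 2*1.5) / 5 = 1.8
```

This reflects the point made above: since the aggregation is itself a weighted sum, no inference rule base is needed at all.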
5 Discussion

The paper describes an assessment method for food products, currently used in a research project focused on bread packaging, aiming to delay crust and crumb staling and to prolong shelf life. The Pecica bread was traditionally preserved in textile bags, and this practice is intended to be revived and valued. During ten days, twelve judges compare a set of bread samples preserved in different packages, in order to establish the best packaging solution. The assessment data are processed by numerical Matlab script functions, and the assessment scores are aggregated by weighted adding. Fuzzy weights are used to set the assessment's features in a linguistic manner, according to the opinions of experts and producers.
References

1. Croitoru, C. (ed.): Sensory Analysis of Agro-Food Products, vol. 1. AGIR, Bucharest (2013)
2. Kaushik, N., Gondi, A.R., Rana, R., Rao, P.S.: Application of fuzzy logic technique for sensory evaluation of high pressure processed mango pulp and litchi juice and its comparison to thermal treatment. Innov. Food Sci. Emerg. Technol. 32, 70–78 (2015)
3. Debjani, C., Das, S., Das, H.: Aggregation of sensory data using fuzzy logic for sensory quality evaluation of food. J. Food Sci. Technol. 50(6), 1088–1096 (2013)
4. Pico, J., Bernal, J., Gómez, M.: Wheat bread aroma compounds in crumb and crust: a review. Food Res. Int. 75, 200–215 (2015)
5. Noshirvani, N., Ghanbarzadeh, B., Mokarram, R.R., Hashemi, M.: Novel active packaging based on carboxymethyl cellulose-chitosan-ZnO NPs nanocomposite for increasing the shelf life of bread. Food Packag. Shelf Life 11, 106–114 (2017)
6. Fazeli, M.R., Shahverdi, A.R., Sedaghat, B., Jamalifar, H., Samadi, N.: Sourdough-isolated Lactobacillus fermentum as a potent anti-mould preservative of a traditional Iranian bread. Eur. Food Res. Technol. 218(6), 554–556 (2004)
7. Diaconescu, D., Zdremtan, M., Mester, M., Halmagean, L., Balint, M.: A study on the influence of some biogenic effectors on bread staling sensory evaluation. J. Agroalim. Process. Technol. 19(2), 247–252 (2013)
8. Giménez, A., Ares, F., Ares, G.: Sensory shelf-life estimation: a review of current methodological approaches. Food Res. Int. 49, 311–325 (2012)
9. Alamprese, C., Cappa, C., Ratti, S., Limbo, S., Signorelli, M., Fessas, D., Lucisano, M.: Shelf life extension of whole-wheat breadsticks: formulation and packaging strategies. Food Chem. 230, 532–539 (2017)
10. Soares, N.F.F., Rutishauser, D.M., Melo, N., Cruz, R.S., Andrade, N.J.: Inhibition of microbial growth in bread through active packaging. Packag. Technol. Sci. 15, 129–132 (2002)
11. Balaguer, M.P., Lopez-Carballo, G., Catala, R., Gavara, R., Hernandez-Munoz, P.: Antifungal properties of gliadin films incorporating cinnamaldehyde and application in active food packaging of bread and cheese spread foodstuffs. Int. J. Food Microbiol. 166, 369–377 (2013)
12. Otoni, C.G., Pontes, S.F.O., Medeiros, E.A.A., Soares, N.D.F.F.: Edible films from methylcellulose and nanoemulsions of clove bud (Syzygium aromaticum) and oregano (Origanum vulgare) essential oils as shelf life extenders for sliced bread. J. Agric. Food Chem. 62(22), 5214–5219 (2014)
13. Mihaly Cozmuta, A., Peter, A., Mihaly Cozmuta, L., Nicula, C., Crisan, L., Baia, L., et al.: Active packaging system based on Ag/TiO2 nanocomposite used for extending the shelf life of bread: chemical and microbiological investigations. Packag. Technol. Sci. 28(4), 271–284 (2014)
14. Del Nobile, M.A., Matoriello, T., Cavella, S., Giudici, P., Masi, P.: Shelf life extension of durum wheat bread. Ital. J. Food Sci. 15(3), 383–393 (2003)
15. Degirmencioglu, N., Gocmen, D., Inkaya, A.N., Aydin, E., Guldas, M., Gonenc, S.: Influence of modified atmosphere packaging and potassium sorbate on microbiological characteristics of sliced bread. J. Food Sci. Technol. 48(2), 236–241 (2011)
16. Muizniece Brasava, S., Dukalska, L., Murniece, I., Dabina Bicka, I., Kozlinskis, E., Sarvi, S., et al.: Active packaging influence on shelf life extension of sliced wheat bread. World Acad. Sci. Eng. Technol. 67, 1128–1134 (2012)
17. Ahmadi, E., Azizi, M.H., Abbasi, S., Hadian, Z., Sareminezhad, S.: Extending bread shelf-life using polysaccharide coatings containing potassium sorbate. J. Food Res. 21(2), 209–217 (2011)
18. Salehifar, M., Beladi Nejad, M.H., Alizadeh, R., Azizi, M.H.: Effect of LDPE/MWCNT films on the shelf life of Iranian Lavash bread. Eur. J. Exp. Biol. 3(6), 183–188 (2013)
19. Cioban, C., Alexa, E., Sumalan, R., Merce, I.: Impact of packaging on bread physical and chemical properties. Bull. UASVM Agric. 67(2), 212–217 (2010)
20. Balas, M.M.: The fuzzy interpolative methodology. In: Balas, V.E., Fodor, J., Várkonyi-Kóczy, A.R. (eds.) Soft Computing Based Modeling in Intelligent Systems. Studies in Computational Intelligence, pp. 145–167. Springer (2009)
21. Singh, K.P., Mishra, A., Mishra, H.N.: Fuzzy analysis of sensory attributes of bread prepared from millet-based composite flours. LWT - Food Sci. Technol. 48, 276–282 (2012)
22. Dicu, A.M., Perța-Crișan, S. (eds.): Quality and Sensory Analysis of Foods. Aurel Vlaicu University of Arad (2012)
Catalan Numbers Associated to Integer Partitions and the Super-Exponential

Tiberiu Spircu¹ and Stefan V. Pantazi²

¹ Institute of Mathematics of the Romanian Academy, 21, Calea Griviței, 010736 Bucharest, Romania
[email protected]
² Conestoga College ITAL, 299 Doon Valley Drive, Kitchener, ON N2G 4M4, Canada
Abstract. The main goal of this paper is to analyse the possible extension of the exponential generating function (of a complex sequence) to the sequences indexed by partitions of integers. Some possible properties of such functions, such as multiplicativity, are investigated.
Keywords: Integer partition · Catalan numbers · Generating function

AMS Classification: 05A10 · 05A15
1 Introduction
The present paper was largely influenced by the second volume of Stanley’s treatise Enumerative Combinatorics [9]. In fact, we extend some classical results about Catalan numbers and exponential generating functions to partitions and functions defined on partitions. Section 2 is devoted to exploiting a special lattice of partitions of integers where each integer subpartition has a “complement” and can be “extended”. The lattice structure and subpartition extension form the basis for the recursive generation of all integer partitions. Further, the convolution of complex functions defined on partitions is defined, and – connected to this convolution – in Sect. 3 the super-exponential generating function of such functions is introduced and exploited. Section 4 is devoted to studying some special complex-valued functions, which are called quasi-multiplicative. The given examples show the interesting variety of such functions.
2 Definitions, Notations and Preliminaries
It is well known that prime numbers generate integers. In the same way, integers can be used to generate partitions. In this paper we will investigate this extension more closely. We start by considering the super-exponential function, denoted Exp, which is a complex function defined over the complex vector space C(P) of complex sequences (x1, x2, ..., xt, ...) indexed by the positive integers, such that all

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 412–425, 2021. https://doi.org/10.1007/978-3-030-51992-6_33
components, except a finite number, are equal to 0. More precisely,

$$\mathrm{Exp}(x_1, x_2, \dots, x_t, \dots) = e^{x_1 + x_2 + \dots + x_t + \dots}.$$

The notion of partition, more precisely k-parts partition of a positive integer n, is traditionally presented as a sequence λ = (λ1, λ2, ..., λs, ...) of non-negative integers – called the parts of λ – satisfying the following conditions:

(i) λ1 + λ2 + ... + λk = n,
(ii) λ1 ≥ λ2 ≥ ... ≥ λk ≥ 1,
(iii) λs = 0 for s > k.

The following notation is used (see [8]): λ ⊢ n. Given partitions λ = (λ1, λ2, ..., λs, ...) ⊢ n and μ = (μ1, μ2, ..., μs, ...) ⊢ m, we will say that μ is (lexicographically) below λ if m < n, or if m = n and there exists an index j such that μi = λi for i < j but μj < λj. The set of partitions is totally ordered by this lexicographic order.

It is well known that the sequences λ satisfying conditions (i)–(iii) correspond to sequences c = (c1, c2, ..., ct, ...) of non-negative integers satisfying the conditions:

(a) ct ≥ 0 for any t,
(b) ct = 0 for t > n,
(c) c1·1 + c2·2 + ... + cn·n = n.

We will write c ⊢ n. Here ct is the number of parts λs equal to t, and the number k of strictly positive parts is recovered from the sequence as k = c1 + c2 + ... + cn.

The sequence (0, 0, ..., 0, ...), in both representations, determines a (unique) partition of zero; it will be called the void partition and it will be denoted by ∅. The partitions r = (0, ..., 0, 1, 0, ...), i.e. such that rt = 0 for all t except t = r, where rr = 1, will be called elementary.

A first coefficient associated to the partition c ⊢ n, denoted by Kc, is defined as follows:

$$K_{\mathbf{c}} = C_{k-1} \binom{c_1 + c_2 + \dots + c_n}{c_1, c_2, \dots, c_n},$$

where $C_n = \frac{1}{n+1}\binom{2n}{n}$ denotes the usual Catalan numbers. By default, K∅ = 0, and Kr = 1 for all elementary partitions r.

A partition μ, with l parts, of a number m = μ1 + μ2 + ... + μl will be called a subpartition of λ if every μi is a λa(i), where a : [l] → [k] is a one-to-one mapping.
It is much easier to work with subpartitions if we adopt the alternative description. Namely, a partition d = (d1, d2, ..., dt, ...) ⊢ m ≤ n is called a subpartition of c = (c1, c2, ..., ct, ...) ⊢ n if dt ≤ ct for each t. In this case we will use the notation d ⊑ c. (The notation d ≤ c will be reserved for the lexicographic order, if this is the case.) This last approach allows an easy evaluation of the number of subpartitions.

Proposition 1. The partition c = (c1, c2, ..., ct, ...) has exactly (c1 + 1) · (c2 + 1) · ... · (ct + 1) · ... subpartitions.
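Proposition 1 is easy to check computationally. The sketch below (ours, not from the paper) enumerates subpartitions in the alternative multiplicity representation:

```python
from itertools import product

def subpartitions(c):
    """All subpartitions d of c, i.e. all tuples with 0 <= d_t <= c_t."""
    return list(product(*(range(ct + 1) for ct in c)))

# c = (1, 0, 2) encodes the partition 1 + 3 + 3 = 7 (one part 1, two parts 3)
c = (1, 0, 2)
subs = subpartitions(c)

expected = 1
for ct in c:
    expected *= ct + 1            # Proposition 1: product of (c_t + 1)

print(len(subs), expected)        # 6 6
```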
Moreover, it is easy to establish that the sequence (c1 − d1, c2 − d2, ..., ct − dt, ...) constitutes a partition of n − m, perfectly determined by the partition d of m and the partition c of n. This partition, denoted by c − d, will be called the complementary subpartition of d in c. And, obviously, ∅ is a subpartition of every other partition c.

Proposition 2. For a non-void and non-elementary partition c, we have

$$K_{\mathbf{c}} = \sum_{\mathbf{d}\ \mathrm{subpartition\ of}\ \mathbf{c}} K_{\mathbf{d}}\, K_{\mathbf{c}-\mathbf{d}}.$$
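As a numerical sanity check (a Python sketch of ours, not part of the paper), Kc can be computed directly from its multinomial definition and compared against the convolution of Proposition 2; since K∅ = 0, the trivial terms d = ∅ and d = c vanish automatically:

```python
from math import comb, factorial
from itertools import product

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def K(c):
    """K_c = C_{k-1} * multinomial(c_1+...+c_n; c_1,...,c_n); K of void is 0."""
    k = sum(c)                    # number of strictly positive parts
    if k == 0:
        return 0                  # void partition
    m = factorial(k)
    for ct in c:
        m //= factorial(ct)
    return catalan(k - 1) * m

def subpartitions(c):
    return product(*(range(ct + 1) for ct in c))

# c = (2, 1) encodes the partition 1 + 1 + 2 = 4
c = (2, 1)
conv = sum(K(d) * K(tuple(ci - di for ci, di in zip(c, d)))
           for d in subpartitions(c))
print(K(c), conv)   # 6 6
```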
The proof is technical. It takes into account: (a) the well-known property of Catalan numbers

$$C_{n+1} = \sum_{i=0}^{n} C_i\, C_{n-i}$$

(see [2], Chap. 29); (b) the relation between multinomial coefficients

$$\sum_{d_1+d_2+\dots=l} \binom{l}{d_1, d_2, \dots} \binom{k-l}{c_1-d_1, c_2-d_2, \dots} = \binom{k}{c_1, c_2, \dots},$$

obtained by evaluating the coefficient of x1^{c1} x2^{c2}... in the development of (x1 + x2 + ...)^k (see [8], Sect. 1.2); and (c) the obvious decomposition

$$\sum_{\mathbf{d}\ \mathrm{subpartition\ of}\ \mathbf{c}} = \sum_{l=1}^{k-1}\ \sum_{d_1+d_2+\dots=l}.$$

The "recursive" relation in the Proposition above is analogous to that satisfied by the Catalan numbers Cn. By analogy, the numbers Kc will be called the Catalan numbers associated to partition c.

We will use the notation c|c′ for the partition given by componentwise addition of the components of c and c′, i.e. given by (c1 + c′1, c2 + c′2, ..., ct + c′t, ...). Therefore c = d|(c − d) where d is a subpartition of c.

The subordination relation defined above transforms the set P of partitions into a lattice. This lattice is (uniquely) complemented, and the operations "meet" resp. "join" are induced by componentwise applying max resp. min. This lattice is neither the classical Young lattice (see [8]) nor the lattice of Brylawski based on dominance ordering [1]; see Fig. 1.
Fig. 1. Integer partition lattices
This notion of partition decomposition can be easily generalized. Namely, if dj = (dj1, dj2, ..., djt, ...) is a partition of the positive integer mj (j = 1, ..., s), and
if Σ_{j=1}^{s} djt = ct for each t, then we say that d1, d2, ..., ds is a decomposition of the partition c, and we use the notation c = d1|d2|...|ds. (In such a decomposition all components dj are supposed non-void.) Any partition c = (c1, c2, ..., ct, ...) can obviously be decomposed into elementary components, as follows: c = 1^{c1}|2^{c2}|...|t^{ct}|..., where r^c means r|r|...|r (c times) for c > 0, and r^0 = ∅.

Two decompositions c = d1|d2|...|ds and c = d′1|d′2|...|d′s of the same partition c, with the same number of components, will be called equivalent if there exists a bijection σ : [s] → [s] such that dj = d′σ(j) for each j ∈ [s]. Thus the decompositions of partition c are grouped into (equivalence) classes. Each such class is represented by a distinguished decomposition d1|d2|...|ds, in which the components are lexicographically ordered d1 ≤ d2 ≤ ... ≤ ds. The set of all such distinguished decompositions will be denoted by Δ(c, s). For example, c = d|(c − d) and c = (c − d)|d are equivalent; one of these two is distinguished, except when c − d coincides with d.

Let us now consider the possibility that some consecutive components dj in a decomposition δ from Δ(c, s) coincide; in other words, consider the decomposition

c = e1|e1|...|e1 (g1 times) | e2|e2|...|e2 (g2 times) | ... | eu|eu|...|eu (gu times)

in which e1 < e2 < ... < eu (lexicographically). Obviously, the number of components in the decomposition is g1 + g2 + ... + gu = s.

We will associate to a distinguished decomposition δ = (e1|...|eu) of c two coefficients. Each one has two groups of factors: (1) one group generated by passing from the subpartitions ej to the partition c, and (2) the other determined by the number of decompositions equivalent to δ. Namely, if ej = (ej1, ej2, ..., ejt, ...), then at each level t a multinomial factor $\binom{c_t}{e_{1t}, \dots, e_{ut}}$ appears. As for (2), it is easy to evaluate how many different decompositions are equivalent to δ; their number is g1!...gu!. Hence, the second coefficient, Lc,δ, associated to δ ∈ Δ(c, s), will be defined as follows:

$$L_{\mathbf{c},\delta} = \binom{c_1}{e_{11}, \dots, e_{u1}} \cdots \binom{c_t}{e_{1t}, \dots, e_{ut}} \cdots \Big/ (g_1! \cdots g_u!).$$

The third coefficient, Mc,δ, associated to δ ∈ Δ(c, s), will be defined as Lc,δ · (s − 1)!.

Let c = (c1, c2, ..., ct, ...) be a partition of n, and let r ∈ P be a positive integer. By setting c′t = ct for t ≠ r and c′r = cr + 1, we obtain a new partition c′ ⊢ n + r having a supplementary component. This partition will be denoted by εr(c) and will be called the extension of c by r. Of course, every partition c is obtained by repeated extensions, starting from the void partition, and this forms the basis of the recursive generating algorithm described in the following section.
2.1 Recursive Generation of All Integer Partitions
Generating all integer partitions is a well-researched topic (for an overview see [4]). The recursive algorithm described here is based on subpartition extension and aims to maintain as many similarities as possible to the recursive description and generation of Catalan monoids introduced in an earlier article [5]. As in [5], we maintained the convenient way to generate and organize all integer partitions of 1, ..., n by attaching a path in a labelled rooted tree to every integer partition. Subsequently, the enumeration of all partitions is easily done in a variety of orders (e.g., lexicographic, anti-lexicographic) using the usual tree traversal algorithms. Another way to look at this approach is as a restriction of the (max-min) lattice introduced earlier. By choosing either the min or the max subpartition ordering, the partial order defined by the subordination relation is reduced to a total order. Given the usual, standard representation of integer partitions with parts in decreasing order, in what follows the max subpartition ordering is preferred. The rooted tree Tn describing all integer partitions of 1, ..., n will be constructed from the "previous" tree Tn−1 by applying the following rules:

(R0) The initial tree T0 has only one vertex, its root, labelled 0.
(R1) All the labelled nodes of Tn−1 are kept in Tn.
(R2a) For 0 < k ≤ ⌊n/2⌋, attach a new node labelled k to all nodes in Tn−1 with labels ≥ k.
(R2b) Attach a new node labelled n to the root node 0.

If we denote by q̂(n, k) the number of nodes in tree Tn representing restricted integer partitions with no part smaller than k, the number of new nodes added to Tn according to rule (R2a), for each k, corresponds to the following relations:

(i) q̂(n, k) = 0 for k > n;
(ii) q̂(n, n) = 1;
(iii) q̂(n, k) = q̂(n, k + 1) + q̂(n − k, k) for 1 ≤ k < n;
(iv) q̂(n, 1) = P(n), where P(n) is the partition function (OEIS A000041) counting all unrestricted partitions of n.
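The recurrence (i)–(iii) translates directly into code; this short Python sketch (ours, not from the paper) also confirms relation (iv) against the first values of OEIS A000041:

```python
def qhat(n, k):
    """Partitions of n with no part smaller than k, via relations (i)-(iii)."""
    if k > n:
        return 0          # relation (i)
    if k == n:
        return 1          # relation (ii)
    return qhat(n, k + 1) + qhat(n - k, k)   # relation (iii)

print(qhat(8, 2))         # 7: the set {8, 44, 53, 422, 2222, 332, 62}
# relation (iv): qhat(n, 1) = P(n)
print([qhat(n, 1) for n in range(1, 8)])     # [1, 2, 3, 5, 7, 11, 15]
```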
Clearly, q̂(n, k) is an analogue of the usual partition function q(n, k) (OEIS A026820) counting integer partitions of n with no part larger than k.

Proof. Relations (i), (ii), (iv) are obvious. Proving the recursive relation (iii) uses the same approach as other well-known recurrence relations, such as the one calculating the number of restricted partitions of n into k parts (see [7], p. 94). The set of all restricted integer partitions of n with no part smaller than k, counted by q̂(n, k), is classified into two disjoint sets: a set P^n_{≥k+1} of partitions of n with no part smaller than k + 1, and another set P^n_k of partitions of n whose smallest part is k. The cardinality of P^n_{≥k+1} is q̂(n, k + 1). By removing one term k from each of the partitions in P^n_k, we can define a one-to-one correspondence to the set P^{n−k}_{≥k} of integer partitions of n − k having no part smaller than k. Therefore, the cardinality of the set P^n_k is q̂(n − k, k).
Example. The q̂(8, 2) = q̂(8, 3) + q̂(6, 2) = 7 integer partitions of 8 with no parts smaller than 2 are {8, 44, 53, 422, 2222, 332, 62} = {8, 44, 53} ∪ {422, 2222, 332, 62}.

Note that rules (R0)–(R1) above are identical to those described in [5], forming the recursive component of both algorithms. Rule (R2a) extends partitions by k and reflects the preferred choice of total order. Had we adopted the alternative (i.e., the min subpartition ordering), rule (R2a) would be changed such that new nodes labelled k would be attached to all nodes in Tn−1 with labels ≤ k, and their count would be given by the usual partition function q. Obviously, by extending the void partition by n, rule (R2b) is responsible for the nodes that represent elementary partitions. Since at each step of the recursion the algorithm must add P(n) nodes to the tree Tn−1, the following identity must be satisfied:
$$P(n) = 1 + \sum_{k=1}^{\lfloor n/2 \rfloor} \hat{q}(n - k, k)$$
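Before proving it, the identity can be verified numerically; the sketch below (ours) counts P(n) independently with the classic coin-change dynamic program and compares it against the sum:

```python
def qhat(n, k):
    """Partitions of n with no part smaller than k (relations (i)-(iii))."""
    if k > n:
        return 0
    if k == n:
        return 1
    return qhat(n, k + 1) + qhat(n - k, k)

def P(n):
    """Independent count of all partitions of n (classic DP over part sizes)."""
    dp = [1] + [0] * n
    for part in range(1, n + 1):
        for s in range(part, n + 1):
            dp[s] += dp[s - part]
    return dp[n]

for n in range(1, 16):
    assert P(n) == 1 + sum(qhat(n - k, k) for k in range(1, n // 2 + 1))
print("identity holds for n = 1..15")
```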
Proof. Identities whose sums add up to the partition function P(n) are sometimes referred to in the literature as "intermediate functions", and proving them follows a similar logic. In our case, the entire set of unrestricted integer partitions of n can be classified into n disjoint sets P^n_k, 1 ≤ k ≤ n, of restricted integer partitions of n whose smallest part is k. The cardinality of the set P^n_n is always 1. For ⌊n/2⌋ < k < n there can be no integer partition of n whose smallest part is k, and therefore such sets, having zero cardinality, are excluded from the sum. As before, by removing one part k from each of the partitions in the remaining sets P^n_k, we obtain the one-to-one correspondence to the sets P^{n−k}_{≥k} of integer partitions of n − k having no part smaller than k, whose cardinalities are given by q̂(n − k, k).

2.2 Algorithm for the Recursive Generation of Integer Partitions
The algorithm for generating integer partitions follows closely the rules (R0)–(R2) that form the approach to recursively generate the tree Tn. In order to represent the nodes in the tree Tn, the implementation relies on a node data structure that stores information about a node's parent and label. The entire collection of nodes is stored in the node_g global variable, as an array of node data structures whose length is given by the summatory function Σ_{i=0}^{n} P(i) (OEIS A000070) of the integer partition function P. One additional global variable, index_g, is responsible for indicating the position of the current node in the node_g array (Line 23). After initializing the data structures, the algorithm generates the tree T0, whose only vertex is the root labelled 0, corresponding to the void partition (Lines 24–25). This is followed by the recursive portion of the algorithm. The recursive procedure GenerateTree(n) implements rule (R1) by calling itself n times in order to generate all previous Tn−1 trees. For each generated tree Tn−1, rules
418
T. Spircu and S. V. Pantazi
Algorithm 1. Generate Integer Partition Tree Tn

1:  function AddNode(parent, label)
2:    index_g ← index_g + 1                  ▷ Increment the current node index
3:    node_g[index_g].parent ← parent        ▷ Child node stores parent index
4:    node_g[index_g].label ← label          ▷ Child node stores label
5:    return index_g
6:  end function
7:  procedure GenerateTree(n)
8:    if n > 0 then
9:      GenerateTree(n − 1)                  ▷ Apply rule R1 to generate Tn−1
10:     parentIndex ← index_g
11:     for k ← 1 ... ⌊n/2⌋ do               ▷ Apply rule R2a
12:       for l ← 1 ... q̂(n − k, k) do
13:         AddNode(parentIndex, k)
14:         parentIndex ← parentIndex − 1
15:       end for
16:       parentIndex ← parentIndex − [P(n − k) − q̂(n − k, k)]   ▷ Skip nodes that cannot be extended by k
17:     end for
18:     AddNode(0, n)                        ▷ Apply rule R2b
19:   end if
20: end procedure
21: procedure GenerateAllIntegerPartitions(1 ... n)
22:   node_g ← [parent, label, partition]    ▷ Global array of size Σ_{i=0}^{n} P(i)
23:   index_g ← 0                            ▷ Global index of the current node in the node_g array
24:   node_g[0].parent ← −1                  ▷ Root node does not have a parent
25:   node_g[0].label ← 0                    ▷ Apply rule R0
26:   GenerateTree(n)                        ▷ Apply rule R1
27: end procedure
R2a and R2b are applied. After attaching q̂(n − k, k) new nodes labeled k to their respective parent nodes, for each k, the implementation of rule R2a takes advantage of the ordering of nodes in the global array node_g and skips over P(n − k) − q̂(n − k, k) nodes, which represent partitions that cannot be extended by k.

Example. In Fig. 2, the trees T1, T2, T3 and T4 are presented, containing 2, 4, 7 and 12 nodes respectively. The number of nodes in each tree is given by the summatory function of the integer partition function P. The labelled rooted trees represent all integer partitions of 1 ... n. As each partition is identified by a path starting at the root, an arrow → k means "extension by integer k". Figure 3 shows the tree with all 12 integer partitions of n ≤ 4, i.e., 0, 1, 11, 2, 21, 111, 3, 31, 1111, 211, 22, 4, listed in the order they are generated by Algorithm 1. The connection with the recursive Catalan monoid generation algorithm presented in [5] is evident. Integer partitions of n are a subset of the 2^{n−1} integer
Catalan Numbers Associated to Integer Partitions
419

Fig. 2. Trees T1, T2, T3 and T4
compositions (where the order of parts matters) of n. In turn, integer compositions of n are the idempotent subset of the Catalan monoid Cn. The essential difference from the Catalan monoid generation algorithm is that the addition of new nodes is restricted to single nodes whose label is no greater than that of their parent node. In essence, rule R2a avoids the creation of nodes that represent invalid integer partitions.

Fig. 3. Tree of integer partitions of n ≤ 4
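Algorithm 1 can be transcribed almost line for line into Python; the sketch below replaces the global array node_g with a Python list and reads each partition back by following parent pointers (helper names are ours, not from the paper):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def qhat(n, k):
    """Number of integer partitions of n with no part smaller than k."""
    if n == 0:
        return 1
    if k > n:
        return 0
    return qhat(n, k + 1) + qhat(n - k, k)

def P(n):
    """Unrestricted partition function, P(n) = qhat(n, 1)."""
    return qhat(n, 1)

def generate_all_integer_partitions(n):
    """Transcription of Algorithm 1; returns the node array of the tree T_n."""
    nodes = [(-1, 0)]                        # rule R0: the root / void partition

    def generate_tree(m):
        if m > 0:
            generate_tree(m - 1)             # rule R1: first generate T_{m-1}
            parent = len(nodes) - 1
            for k in range(1, m // 2 + 1):   # rule R2a
                for _ in range(qhat(m - k, k)):
                    nodes.append((parent, k))
                    parent -= 1
                # skip the nodes that cannot be extended by k
                parent -= P(m - k) - qhat(m - k, k)
            nodes.append((0, m))             # rule R2b: elementary partition

    generate_tree(n)
    return nodes

def partition_of(nodes, i):
    """Read the partition of node i off its path back to the root."""
    parts = []
    while i > 0:
        parent, label = nodes[i]
        parts.append(label)
        i = parent
    return tuple(sorted(parts, reverse=True))

nodes = generate_all_integer_partitions(4)
# T4 has 12 nodes, listing 0, 1, 11, 2, 21, 111, 3, 31, 1111, 211, 22, 4
# in exactly the order stated in the text.
assert [partition_of(nodes, i) for i in range(len(nodes))] == [
    (), (1,), (1, 1), (2,), (2, 1), (1, 1, 1), (3,), (3, 1),
    (1, 1, 1, 1), (2, 1, 1), (2, 2), (4,)]
```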
3 Functions Defined on Partitions
In what follows we will study the functions ϕ : P → C, where of course C is the field of complex numbers. By analogy with what is obtained in Sect. 5.5.1 from
[9], given a function ϕ : P → C, we define its super-exponential generating function (in short, segf) as follows:

SE_ϕ(Y) = Σ_{c∈P} ϕ(c) · Y1^{c1} Y2^{c2} ... Yt^{ct} ... / (c1! c2! ... ct! ...)

where Y = (Y1, Y2, ..., Yt, ...) is a family of indeterminates, indexed by the positive integers. Notice that, formally,

∂SE_ϕ(Y)/∂Yr = Σ_{c∈P} ϕ(ε_r(c)) · Y1^{c1} Y2^{c2} ... Yt^{ct} ... / (c1! c2! ... ct! ...)    (1)
Perhaps the simplest example of a super-exponential generating function is obtained for π : P → C given by π(c) = c1! c2! ... ct! .... In this case SE_π(Y) = Π_{t≥1} 1/(1 − Yt).

The Catalan numbers provide another interesting example. More precisely, consider the function κ : P → C given by κ(c) = K_c · π(c) and denote by Z its segf. Then it is easy to find out that the product Z · Z is exactly Z − (Y1 + Y2 + ...), i.e., formally, SE_κ(Y) = (1 − √(1 − 4(Y1 + Y2 + ...)))/2.

The sum of functions ϕ, ψ : P → C is defined componentwise; in addition, we define their convolution ϕ ∗ ψ = χ : P → C by

χ(c) = Σ_{d ⊆ c} (c1 choose d1)(c2 choose d2) ... (ct choose dt) ... ϕ(d) ψ(c − d)   for c ∈ P, c ⊢ n.

(Notice that by Proposition 1 above this sum has a finite number of terms, namely (c1 + 1) · (c2 + 1) · ... · (ct + 1) · ....) Of course, the convolution is a commutative and associative operation; thus the sum and the convolution turn the set C^P of functions P → C into a commutative ring with unit, the latter being the function ι : P → C given by ι(c) = 0 for each non-void partition c, and ι(∅) = 1. The invertible functions ϕ : P → C are precisely those for which ϕ(∅) ≠ 0. Moreover, the functions ϕ : P → C such that ϕ(∅) = 0 constitute a maximal ideal J of this ring. An important role in this ring C^P is played by the so-called zeta function ζ : P → C given by ζ(c) = 1 for each c, whose inverse is called the Moebius function.

Proposition 3. (a) (analogue to Prop. 5.1.1 from [9]) If χ is the convolution of ϕ and ψ, then SE_χ(Y) = SE_ϕ(Y) · SE_ψ(Y).
(b) The Moebius function μ : P → C is given by μ(c) = (−1)^{c1+...+ct+...}.

Proof. (a) follows from the obvious equality, involving binomial coefficients, valid for each partition c ⊢ n and each subpartition d ⊢ m of it:

(c1 choose d1)(c2 choose d2) ... (cn choose dn) · Y1^{c1} Y2^{c2} ... Yn^{cn} / (c1! c2! ... cn!) = [Y1^{d1} Y2^{d2} ... Yn^{dn} / (d1! d2! ... dn!)] · [Y1^{c1−d1} ... Yn^{cn−dn} / ((c1 − d1)! ... (cn − dn)!)]
(b) It can easily be seen that the super-exponential generating function of the zeta function, SE_ζ(Y), is precisely the super-exponential Exp(Y), and the super-exponential generating function of the identity ι is the constant 1. From ζ ∗ μ = ι, using (a), we obtain the equality SE_ζ(Y) · SE_μ(Y) = 1, i.e., formally, SE_μ(Y) = Exp(−Y) = SE_ζ(−Y). It follows that μ(c), the coefficient of Y1^{c1} Y2^{c2} ... Yt^{ct} ... / (c1! c2! ... ct! ...), is exactly ζ(c) · (−1)^{c1+c2+...+ct+...}.

As an immediate generalization, let us notice that the multiple convolution ϕ1 ∗ ϕ2 ∗ ... ∗ ϕs = ψ is defined as follows:

ψ(c) = Σ_{δ=(d1|d2|...|ds)=c} K · ϕ1(d1) ϕ2(d2) ... ϕs(ds)

where K = (c1 choose d11, d21, ..., ds1)(c2 choose d12, d22, ..., ds2) ... (ct choose d1t, d2t, ..., dst) ... is a product of multinomial coefficients. It is rather obvious that between the corresponding super-exponential generating functions we have the relation

SE_ψ(Y) = SE_ϕ1(Y) · SE_ϕ2(Y) · ... · SE_ϕs(Y).

In particular, if ψ = ϕ ∗ ϕ ∗ ... ∗ ϕ (s times), then SE_ψ(Y) = [SE_ϕ(Y)]^s.

Proposition 4 (analogue to Theorem 5.1.4 from [9]). Given a function ϕ : P → C with ϕ(∅) = 0, let us define χ : P → C as follows: χ(∅) = 0,

χ(c) = Σ_{1≤s≤c1+...+ct+...} Σ_{δ=(d1|d2|...|ds)∈Δ(c,s)} L_{c,δ} ϕ(d1) ϕ(d2) ... ϕ(ds)    (2)

where the sum is taken over all the possible distinguished decompositions of partition c, and L_{c,δ} is as above. Then

1 + SE_χ(Y) = exp(SE_ϕ(Y)).    (3)
Proof. Of course, in SE_χ(Y) the coefficient of a monomial Y1^{c1} ... Yt^{ct} ... / (c1! ... ct! ...) is exactly χ(c). Considering s fixed, let us evaluate the possible components of the power [SE_ϕ(Y)]^s determined by this monomial. Such a component appears when decomposing the monomial as a product of s monomials Y1^{dj1} ... Yt^{djt} ... / (dj1! ... djt! ...), which monomials identify in fact a decomposition δ = (d1 | ... | ds). Rewriting this decomposition as δ = (e1 | ... | eu) with each ej of multiplicity gj, notice that the coefficient of ej in SE_ϕ(Y) is precisely ϕ(ej). When reconstructing the monomial Y1^{c1} ... Yt^{ct} ... / (c1! ... ct! ...), the product ϕ(e1) · ... · ϕ(eu) is preceded by the product of multinomial coefficients (c1 choose e11, ..., eu1) ... (ct choose e1t, ..., eut) .... The same number appears when permuting the monomials. Namely, e1 appears in g1 factors out of s in the product [SE_ϕ(Y)]^s, then e2 appears in g2 factors out of the remaining s − g1, and so on. Thus, there are s!/(g1! ... gu!) ways in which we obtain in [SE_ϕ(Y)]^s the same component, i.e. the monomial Y1^{c1} ... Yt^{ct} ... / (c1! ... ct! ...) is affected in [SE_ϕ(Y)]^s by the coefficient s! L_{c,δ} ϕ(e1) · ... · ϕ(eu), thus in [SE_ϕ(Y)]^s / s! by the coefficient L_{c,δ} ϕ(e1) · ... · ϕ(eu); summing over all decompositions and over s gives Σ L_{c,δ} ϕ(e1) · ... · ϕ(eu) = χ(c). The following equality is established:

SE_χ(Y) = SE_ϕ(Y)/1! + [SE_ϕ(Y)]^2/2! + [SE_ϕ(Y)]^3/3! + ...

i.e. the formula (3). By formal derivation of (3), with respect to the r-th indeterminate Yr, we obtain the relation

∂SE_χ(Y)/∂Yr = [1 + SE_χ(Y)] · ∂SE_ϕ(Y)/∂Yr    (4)
Now, given ϕ, and considering the relations (4) for r = 1, 2, 3, ..., the function χ defined by (2) can be obtained by a sequence of recursive relations, as follows:

χ(ε_r(c)) = ϕ(ε_r(c)) + Σ_{c=d|e} (c1 choose d1)(c2 choose d2) ... (ct choose dt) ... ϕ(ε_r(d)) χ(e).    (5)

These relations can be reversed; thus, given χ, the "original" function ϕ can be obtained by recursions. However, it can be noticed that formula (3) allows us to consider the function χ associated to ϕ as in Proposition 4 as the exponential of ϕ. As such, we could say that ϕ is the logarithm of χ, and we could write, formally,

SE_ϕ(Y) = log[1 + SE_χ(Y)] = SE_χ(Y)/1 − [SE_χ(Y)]^2/2 + [SE_χ(Y)]^3/3 − ...

In this way the formula (2) can be reversed as follows:

ϕ(c) = Σ_{1≤s≤c1+...+ct+...} Σ_{δ=(d1|d2|...|ds)∈Δ(c,s)} (−1)^{s+1} M_{c,δ} χ(d1) χ(d2) ... χ(ds).    (2')
4 Multiplicative and Quasi-Multiplicative Functions

An interesting family of functions ϕ : P → C is composed of those functions that are perfectly determined by the values ϕ(r) taken on the elementary partitions r. In general, if χ = ϕ ∗ ψ and ϕ(∅) = ψ(∅) = 1, then for an elementary partition r we have χ(r) = ϕ(r) + ψ(r). A function ϕ : P → C is called multiplicative in case ϕ(c) = ϕ(d)ϕ(e) for each partition c and for each decomposition c = d|e. In particular, considering the trivial decompositions c = ∅|c, it follows that ϕ(∅) = 1 for non-null multiplicative functions ϕ. In addition, the formula ϕ(1^{c1}|2^{c2}|...) = (ϕ(1))^{c1} (ϕ(2))^{c2} ... is immediate. The zeta function and the Moebius function above are examples of multiplicative functions. As another example, let us consider ξ(c) = 2^{c1+c2+...+ct+...}.
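The convolution ring of Sect. 3 can be explored numerically by encoding each partition c as its multiplicity vector (c1, c2, ...); the sketch below verifies that ζ ∗ μ = ι (Proposition 3(b)) and that ξ is the convolutional square of ζ (a fact established below via SE_ξ(Y) = [SE_ζ(Y)]^2). Function names are ours:

```python
from itertools import product
from math import comb

def conv(phi, psi, c):
    """The convolution of Sect. 3 on multiplicity vectors c = (c1, c2, ...)."""
    total = 0
    for d in product(*(range(ci + 1) for ci in c)):
        w = 1
        for ci, di in zip(c, d):
            w *= comb(ci, di)
        total += w * phi(d) * psi(tuple(ci - di for ci, di in zip(c, d)))
    return total

zeta = lambda c: 1                        # the zeta function
mu = lambda c: (-1) ** sum(c)             # the Moebius function, Proposition 3(b)
iota = lambda c: 1 if sum(c) == 0 else 0  # the unit of the ring
xi = lambda c: 2 ** sum(c)                # xi(c) = 2^(c1+c2+...)

for c in [(0,), (1,), (2, 1), (1, 0, 2), (3, 2, 1)]:
    assert conv(zeta, mu, c) == iota(c)    # zeta * mu = iota
    assert conv(zeta, zeta, c) == xi(c)    # xi is the convolutional square of zeta
```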
Proposition 5. (i) If ϕ, ψ : P → C are multiplicative functions, then χ = ϕ ∗ ψ is multiplicative too. (ii) If ϕ : P → C is multiplicative, then its convolutional inverse ϕ^{−1} exists and is multiplicative. (iii) Any multiplicative function ϕ : P → C is perfectly determined by a complex function f : P → C (defined on the positive integers).

Proof. Obvious. In fact, χ(1^{c1}|2^{c2}|...) = (ϕ(1) + ψ(1))^{c1} (ϕ(2) + ψ(2))^{c2} .... As for the inverse of ϕ, we have ϕ^{−1}(r^c) = [−ϕ(r)]^c. The complex function f associated to ϕ is given by f(r) = ϕ(r). The constant 1 is associated to the zeta function ζ, and the constant 2 to ξ above.

Another easy result is expressed in the following.

Proposition 6. The super-exponential generating function SE_ϕ(Y) of a multiplicative ϕ : P → C is a product of exponentials.

Proof. In fact, each component of the product is associated to an elementary partition r, i.e. to a positive integer r, and is the exponential exp(ϕ(r)Yr); thus SE_ϕ(Y) = exp(Σ_{r∈P} ϕ(r)Yr). As a consequence, SE_{ϕ^{−1}}(Y) = SE_ϕ(−Y).

In particular, since ξ(r) = 2 for all positive integers r, the super-exponential generating function of the multiplicative function ξ given as example above is SE_ξ(Y) = SE_ζ(2Y) = [SE_ζ(Y)]^2, hence ξ is the convolutional square of the zeta function ζ. Its convolutional inverse is thus the square of the Moebius function μ.

Also, the odd cardinality partition problem (see [9]) can be described by a multiplicative function θ : P → C determined by o : P → C such that o(2m − 1) = 1, o(2m) = 0. Its segf is SE_θ(Y) = Π_{t∈P} exp(Y_{2t−1}), and consequently its convolutional inverse has Π_{t∈P} exp(−Y_{2t−1}) as segf.

A very interesting multiplicative function ω : P → C is obtained when defining ω(r) = 1/r. Its segf

SE_ω(Y) = Σ_{c⊢n, n∈N} 1/(1^{c1} 2^{c2} ... n^{cn}) · Y1^{c1}/c1! · Y2^{c2}/c2! ... Yn^{cn}/cn!

is in fact the cycle index (originally Zyklenzeiger [6]) of the permutation group G∞ (see also [3], Appendix 3). It equals exp(Σ_{n∈P} Yn/n); after replacing each Yn/n by Yn, it is clear that SE_ω(Y) becomes SE_ζ(Y).

Let us now consider a family of functions ϕ : P → C slightly larger than that of the multiplicative ones. We will say that ϕ is quasi-multiplicative if there is a complex function f : P × P → C (defined on pairs of positive integers) such that: (a) ϕ(r^c) = f(r, c) for any r, c ∈ P, and
(b) ϕ(c) = f(1, c1) f(2, c2) ... f(t, ct) ... in case the partition c decomposes as 1^{c1}|2^{c2}|...|t^{ct}|....

Proposition 7. (i) If ϕ, ψ : P → C are quasi-multiplicative functions, then χ = ϕ ∗ ψ is quasi-multiplicative. (ii) If ϕ : P → C is quasi-multiplicative, then ϕ^{−1} is quasi-multiplicative.

Proof. (i) First, it is clear that ϕ(∅) = 1 for any quasi-multiplicative ϕ. Then, if χ = ϕ ∗ ψ, we have χ(∅) = 1 provided ϕ(∅) = ψ(∅) = 1. For an elementary partition r, the partition r^c has as subpartitions r^d with 0 ≤ d ≤ c; it is clear that χ(r^c) = Σ_{d=0}^{c} (c choose d) ϕ(r^{c−d}) ψ(r^d).
The expression of χ(1^{c1}|2^{c2}|...|t^{ct}|...) contains terms

(c1 choose d1)(c2 choose d2) ... (ct choose dt) ... ϕ(1^{c1−d1}|2^{c2−d2}|...|t^{ct−dt}|...) ψ(1^{d1}|2^{d2}|...|t^{dt}|...).

On the other hand, the product χ(1^{c1}) χ(2^{c2}) ... χ(t^{ct}) ... contains products ϕ(1^{c1−d1}) ψ(1^{d1}) ϕ(2^{c2−d2}) ψ(2^{d2}) ... ϕ(t^{ct−dt}) ψ(t^{dt}) ... affected by the coefficients (c1 choose d1)(c2 choose d2) ... (ct choose dt) .... Thus it is immediate that χ is quasi-multiplicative provided ϕ and ψ are quasi-multiplicative.

(ii) Let us notice first that ϕ^{−1}(r) = −ϕ(r) for any elementary partition r. Given two different elementary partitions r and s, from (ϕ ∗ ϕ^{−1})(r|s) = (ϕ ∗ ϕ^{−1})(r) = (ϕ ∗ ϕ^{−1})(s) = 0 it follows easily that ϕ^{−1}(r|s) = ϕ^{−1}(r) ϕ^{−1}(s) provided ϕ(r|s) = ϕ(r) ϕ(s). In general, in a development (ϕ ∗ ϕ^{−1})(1^{c1}|2^{c2}|...|t^{ct}|...) = 0 we find terms ϕ(1^{c1−d1}|2^{c2−d2}|...|t^{ct−dt}|...) ϕ^{−1}(1^{d1}|2^{d2}|...|t^{dt}|...) affected by the coefficient (c1 choose d1)(c2 choose d2) ... (ct choose dt) .... On the other hand, in the product (ϕ ∗ ϕ^{−1})(1^{c1}) (ϕ ∗ ϕ^{−1})(2^{c2}) ... (ϕ ∗ ϕ^{−1})(t^{ct}) ... the terms are as follows: ϕ(1^{c1−d1}) ϕ(2^{c2−d2}) ... ϕ(t^{ct−dt}) ... ϕ^{−1}(1^{d1}) ϕ^{−1}(2^{d2}) ... ϕ^{−1}(t^{dt}) .... By supposing the equality ϕ^{−1}(1^{d1}|2^{d2}|...|t^{dt}|...) = ϕ^{−1}(1^{d1}) ... ϕ^{−1}(t^{dt}) ... valid for all exponents such that d1 + d2 + ... < c1 + c2 + ..., an induction argument shows that the equality ϕ^{−1}(1^{c1}|2^{c2}|...|t^{ct}|...) = ϕ^{−1}(1^{c1}) ϕ^{−1}(2^{c2}) ... ϕ^{−1}(t^{ct}) ... is valid, and thus ϕ^{−1} is quasi-multiplicative.

Proposition 8. The super-exponential generating function SE_ϕ(Y) of a quasi-multiplicative ϕ : P → C is a product of ordinary (single-indeterminate) exponential generating functions.

Proof. In fact, SE_ϕ(Y) = E_{f1}(Y1) E_{f2}(Y2) ... E_{ft}(Yt) ... where ft : P → C is given by ft(a) = f(t, a) for each a ∈ P, and f is the complex function that determines ϕ.
Many examples of quasi-multiplicative functions that are not multiplicative can be constructed easily enough. Apart from π(r^a) = a! given in Sect. 3, another similar nice example is σ(r^a) = (a − 1)!/r^a, for which the segf is Π_{t∈P} [1 − log(1 − Yt/t)].

A very interesting quasi-multiplicative function τ is obtained from a complex function f : P → C by setting

τ(r^c) = (f(r) + c − 1 choose c)   for r, c ∈ P.

By connecting f with τ by the relation

f(r + 1) = τ(r + 1) = Σ_{c ⊢ r} τ(1^{c1}) τ(2^{c2}) ... τ(t^{ct}) ...
and accepting that f(1) = 1, we obtain exactly the recursive relation connecting the numbers of types of (rooted) trees (see [3]). By connecting f with τ by the relation

f(r + 1) = Σ_{c ⊢ r, c a genuine partition} τ(1^{c1}) τ(2^{c2}) ... τ(t^{ct}) ...

and accepting that f(1) = 1, f(2) = 0, we obtain exactly the recursive relation connecting the numbers of types of reduced (i.e. without nodes of out-degree 1) rooted trees; this sequence goes as follows: 1, 0, 1, 1, 2, 3, 6, 10, 19, ....
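Reading "genuine partition" as a partition with at least two parts (our assumption), both recursions are easy to implement and check against the stated sequences; a Python sketch with helper names of our own choosing:

```python
from math import comb

def partitions(n, max_part=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for p in range(min(n, max_part), 0, -1):
        for rest in partitions(n - p, p):
            yield (p,) + rest

def tree_counts(terms, reduced=False):
    """f(r+1) = sum over partitions c of r of prod_i C(f(i)+c_i-1, c_i).
    With reduced=True only partitions with at least two parts are kept,
    which automatically forces f(2) = 0."""
    f = [0, 1]                               # f[1] = 1; f[0] is unused
    for r in range(1, terms):
        total = 0
        for c in partitions(r):
            if reduced and len(c) < 2:
                continue
            w = 1
            for part in set(c):
                ci = c.count(part)
                w *= comb(f[part] + ci - 1, ci)
            total += w
        f.append(total)
    return f[1:]

assert tree_counts(8) == [1, 1, 2, 4, 9, 20, 48, 115]            # rooted trees
assert tree_counts(9, reduced=True) == [1, 0, 1, 1, 2, 3, 6, 10, 19]
```

The first assertion reproduces the classical rooted-tree counts of [3]; the second matches the sequence 1, 0, 1, 1, 2, 3, 6, 10, 19, ... stated in the text.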
References

1. Brylawski, T.: The lattice of integer partitions. Discrete Math. 6, 201–219 (1973)
2. Grimaldi, R.P.: Fibonacci and Catalan Numbers: An Introduction. Wiley, Hoboken (2012)
3. Harary, F., Palmer, E.M.: Graphical Enumeration. Academic Press, New York and London (1973)
4. Kelleher, J.: Encoding partitions as ascending compositions. Ph.D. thesis, Department of Computer Science, University College Cork, December 2005. http://www.jeromekelleher.net/downloads/k06.pdf. Accessed 24 Mar 2018
5. Pantazi, S.V., Spircu, T.: Relations between consecutive Catalan monoids. In: Proceedings of the 48th Southeastern International Conference on Combinatorics, Graph Theory and Computing - SEICCGT 2017, Boca Raton, Florida (2017)
6. Pólya, G.: Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta Math. 68, 145–254 (1937)
7. Ruskey, F.: Combinatorial Generation. University of Victoria (2003). http://www.1stworks.com/ref/RuskeyCombGen.pdf. Accessed 24 Mar 2018
8. Stanley, R.P.: Enumerative Combinatorics, vol. 1. Cambridge University Press, Cambridge (1997). (9th printing, 2008)
9. Stanley, R.P.: Enumerative Combinatorics, vol. 2. Cambridge University Press, Cambridge (1999). (Reprinted 2005)
Emotional Control of Wideband Mobile Communication

Vadim L. Stefanuk(1,2) and Ludmila V. Savinitch(1)

(1) Institute for Information Transmission Problems, Bolshoi Karetny per. 19, 127051 Moscow, Russia
{stefanuk,savinitch}@iitp.ru
(2) Peoples' Friendship University of Russia, Miklucho-Maklaya str. 6, 117198 Moscow, Russia
Abstract. To avoid overloading in Wideband Mobile Communication Systems, a new approach is proposed, based on the Elastic Power Control that was studied within a mathematical model of WMCS. Both concepts, the model of a collective of radio stations and the elasticity, are theoretical ones. They have been studied mostly by mathematical means on the basis of the theory of collective behavior of learning automata. In the present paper, however, a practical implementation of elasticity is proposed to guarantee the stability of mobile communication. In this new approach a Natural Language Base, consisting of emotive phrases, is used as feedback for the Power Control in order to avoid crossing the edge of instability, namely the border of the area of admissible vectors of communication qualities.

Keywords: Wideband communication · Power control · Communication quality · Elasticity · Emotions · Natural Language · WMCS · UWB
1 Introduction

The paper consists of three main blocks. The first block gives a detailed description of the elasticity of power control in Wideband Mobile Communication Systems (WMCS). The second block shows how the spoken utterances of the users may be used to provide the feedback needed for the elasticity control to take place. The third block describes the project of implementing WMCS as a game of a collective of radio stations, which produce interference to other stations during their normal communication.

1.1 Wideband Mobile Communication Systems
The wideband mobile communication system (WMCS), modeling the joint activity of n pairs of communicating users, was first proposed in [1, 2] in 1967 with the goal of designing a wireless version of ordinary telephone communication, making communication convenient, mobile, distance independent, etc.
© Springer Nature Switzerland AG 2021 V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 426–436, 2021. https://doi.org/10.1007/978-3-030-51992-6_34
Some modern mobile communication systems, like WCDMA, extensively use the wideband approach proposed in [1, 2], either in the whole network or within each of its Wi-Fi cells. Probably Motorola of the USA was the first company to produce handheld devices for this type of communication, some years after the documents [1, 2] and some other papers had been handed over to the chairs of the company during their visit to Moscow, Soviet Union. Later the IEEE worked out certain standards (IEEE 802.11, etc.) for such systems. Presently almost all mobile communication systems throughout the world follow the IEEE standards.

The authors of [1, 2] are actually proud to see that the WMCS proposed over half a century ago has been implemented in close correspondence with our mathematical model and has provided people with a distance-independent means of communication. Actually, the US engineers undoubtedly knew about our discovery from our extensive publication activity in important journals and proceedings, and from many of our presentations in the USA (MIT, Texas University), France (University Paris VI), Sweden and other countries.

In cellular mobile systems there is actually a mixture of the wideband connection (within a cell) and the trunk radio connection (among cells). The repeaters needed for it were also described in [1, 2]. People say that market principles and general engineering approaches were never too fair towards original inventors. For example, Karl Hellwig in his survey of 2007 [3] never mentioned even the existence of the area of admissibility K^(n) of quality vectors, discovered in [1, 2] for power control in wideband mobile communication. He never mentioned our papers [1, 2]. Similar facts are typical for an engineering approach to communication.
That is why we decided to present in the current paper some old facts, which obviously slipped away from the attention of current engineers, and to demonstrate a solution to the problem of appropriate stability in wideband mobile communication. It was decided to concentrate in this paper on a particular communication problem related to the stability of its performance, as the communication quality may occasionally be reduced if, for instance, the number of speakers increases. Typically, on the eve of holidays mobile contacts are occasionally broken and must be re-established anew. This phenomenon was first described and carefully studied theoretically in our original mathematical model of wideband multi-radio communication, starting with [1, 2] and with the research mentioned in footnote 1 below. We will go into some details of the problem and propose an original way to reduce network overloading using the Natural Language approach typical for Artificial Intelligence and for Collective Behavior problems of various kinds.
1 Many of the initial ideas were put forward by V.L. Stefanuk in his PhD thesis: "Collective automata behavior and the task of stable local control in the net of radio-stations", defended in the Institute for Control Problems, Russian Academy of Science, Moscow, 1968.
1.2 Modeling of Wideband Communication
In the above mentioned publications [1, 2] a completely new model was described, demonstrating that Power Control is the only way to achieve an appropriate result of communication among n pairs of people talking from arbitrary sites. Before us, the idea of power control had been proposed by some other authors [4, 5] for the reduction of the spectrum needed for ordinary communication. However, the application of power control as a means of reducing the mutual interference among several communicating pairs operating simultaneously within a common frequency band was a principally novel idea at that time. Its basic concept is contained in the following noise-to-signal (NTS) relation at the input of the i-th receiver:

λ_i = (N_i + Σ_{j=1, j≠i}^{n} a_ij P_j) / (a_ii P_i),   (i = 1, ..., n)    (1)
In the above expression P_i is the power of the i-th transmitter, N_i is the additive noise at the input of the i-th receiver, λ_i is the NTS_i, the product a_ij P_j is the noise power at the input of the i-th receiver from foreign transmitters, and a_ii P_i is the equivalent power from the communicating partner in the i-th pair. The NTS_i takes into account the useful signal in the pair (i, i) received from the transmitting party of the same pair. Generally, the first subscript in the coefficient a_ij refers to the receiving party, while the second one always refers to the transmitting party. The wideband channel from the i-th transmitting agent to the i-th receiving agent and the reversed channel of the same pair (i, i) are treated as two independent one-way wideband channels; most of the time each is a half-duplex channel, actually implementing a full-duplex one.

In our first publications [1, 2], introducing the concept of wideband communication, it was shown that in the space of vectors λ = (λ_1, ..., λ_n) there exists an open set K^(n) with the property that only the vectors λ (for any power vectors P) satisfying

λ ∈ K^(n)    (2)

may be reached in the WMCS. Moreover, it was proved in our publications that, provided (2) is valid, an arbitrary vector of values λ° may be reached in the collective of radio stations (1) as the solution of the following set of differential equations:

dP_i(t)/dt = κ_i P_i(t) [λ_i(t) − λ_i°]    (3)
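A small numerical experiment illustrates how the independent controls (3) drive the NTS values λ_i(t) to a target λ° lying inside K^(n). All concrete values below (the coefficients a_ij, the noises N_i, the gain κ and the target λ°) are illustrative choices for this sketch, not taken from [1, 2]:

```python
n = 2
a = [[1.0, 0.1], [0.1, 1.0]]   # a[i][j]: coupling from transmitter j to receiver i
N = [0.01, 0.01]               # additive noise at each receiver
lam0 = [0.5, 0.5]              # target NTS vector, assumed to lie in K^(n)
kappa, dt = 1.0, 0.01          # control gain and Euler time step
P = [1.0, 1.0]                 # initial transmitter powers

def ntsr(P, i):
    """Noise-to-signal ratio lambda_i at receiver i, relation (1)."""
    interference = sum(a[i][j] * P[j] for j in range(n) if j != i)
    return (N[i] + interference) / (a[i][i] * P[i])

for _ in range(20000):
    lam = [ntsr(P, i) for i in range(n)]
    # relation (3): each pair adjusts its power using only its own lambda_i
    P = [P[i] + dt * kappa * P[i] * (lam[i] - lam0[i]) for i in range(n)]

assert all(abs(ntsr(P, i) - lam0[i]) < 1e-2 for i in range(n))
```

With these weakly coupled parameters the powers settle at the unique fixed point of (1) and (3), so each λ_i reaches its target; for targets outside K^(n) no such equilibrium exists, which is exactly the instability discussed next.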
Since the algorithms (3) run independently (see footnote 2), instability might take place at any moment, because the values a_ij and the powers P_j, j ≠ i, are not known. The corresponding differential equations for the power control and their study may be found in [1, 2, 5, 6]. In the thesis (see footnote 1) and in [2, 5] a flexible way of power control was proposed that we called elastic control, which treats the restriction (2) in a special way.

1.3 Elastic Power Control
Elastic power control implements the idea of keeping an acceptable quality of transmission at the receiving points and avoiding breaking the communication in the process of power control. The following algorithm of guaranteed stability was proposed. Algorithm (3) provides the necessary NTS_i when it is possible in correspondence with (2). Let us consider an arbitrary collective of n power control automata in which P_i(t) and λ_i(t) are, correspondingly, the action of the i-th automaton and the resulting NTS_i obtained. And let the value C_i(P_i, λ_i, d_i) be a penalty function that depends only on values known within the pair (i, i) and on some parameter d_i(t). In the thesis (see footnote 1) it was shown that the collective eventually reaches the deterministic Nash equilibrium (or Nash party) satisfying the following relations:

∂C_i(λ_1, ..., λ_n)/∂λ_i = 0,   ∂²C_i(λ_1, ..., λ_n)/∂λ_i² > 0   (i = 1, ..., n)    (4)
provided that the following penalty function was used:

C_i = λ_i + d_i² P_i    (5)
In accordance with (5), the i-th transmitter would reduce its NTS_i, i.e. λ_i, by increasing its power, until an insignificant reduction of the value λ_i would require too large an increase of the transmitting power. In some special cases the stability achieved with the use of functions (4) and (5) places the point of equilibrium for λ_i at a distance d_i from the border of K^(n). To implement the elasticity criterion (5), the i-th receiver of the pair (i, i) has to inform the i-th transmitter of the current value λ_i(t). In the present paper it will be shown that this important feedback information from the i-th receiver to its i-th transmitter may be extracted from the words of the speaking person of the same pair. (Note that each person in the pair may take the role of transmitter or receiver at various moments t.)
2 It is assumed that a feedback for each power control subsystem is provided.
2 Using Emotional Reactions as Feedback

2.1 Introduction
Replicas are frequently used by people for various purposes. A replica is a saying that is not directly related to the topic under discussion. In our case replicas add an emotional dimension to the talk. What is said by a person over the mobile device may be addressed to two different tasks. One task is related to the content to be exchanged by the speakers. Yet another task is to say something concerning the quality of communication when it is not good enough. This second type of information, we propose, may be used by the technical system that supports the communication. (The two-plans idea in a written text has been studied in [14].) To recognize various replicas one may use the speech recognition technique that is usually available in handheld devices. In this chapter an attempt to build a Natural Language Knowledge Base (NLKB) for the recognition system, which may be used in elastic power control, is described in some detail. To be applicable, the NLKB must also contain numerical estimations of the quality of communication.

2.2 Extracting Signal-to-Noise Ratios from Replicas
The extraction of NTS is very important for the organization of appropriate power control, as discussed in the previous chapter, since it provides the feedback for the control. The transmitter would then be able to decide in which way its power should be changed, up or down, in order to keep the signal at the currently receiving part of the pair at an appropriate level of quality. For this purpose let us first consider RASTI (RApid Speech Transmission Index), which may be found among various standards: ISO/TR 4870, ANSI S3.2, S3.5, IEC 60268-16, etc. This criterion is a number between 0 and 1. The number corresponds to the subjective estimation of intelligibility in the popular table dependences [7–10] (Table 1):
Table 1. Rapid speech transmission index

RASTI values   Intelligibility
0.75–1.00      Excellent
0.60–0.75      Good
0.45–0.60      Satisfactory
0.30–0.45      Bad
0.00–0.30      Too bad
In the next table, Table 2, three columns have been added (in this paper) to the above traditional RASTI table. These additions take into account the well-known fact that a speech signal is assumed to be excellent when the SNR is about 30 dB at the input of the receiving device. Due to this we designed the next table, demonstrating the correspondence between Intelligibility and NTS, or λ.
Table 2. Correspondence between intelligibility and NTS

RASTI   Intelligibility   Decibels   SNR     NTS (λ)
1       Excellent         30 dB      1000    0.001
0.75    Good              16 dB      39.8    0.033
0.50    Satisfactory      0 dB       1.00    1.00
0.25    Bad               −16 dB     0.033   39.8
0       Too bad           −30 dB     0.001   1000
The above table allows extracting the corresponding NTS for each of the phrases presented in the next table, Table 3. The NTS values for the arbitrary intelligibility numbers contained in Table 3 may be calculated by interpolation of the numbers shown in Table 2. It means that the corresponding NTS may be built for any phrase.(3) Table 2 permits finding the value of λ whenever an emotional phrase is pronounced by the currently receiving part of each pair in the collective of radio stations. This value may be recognized by the transmitting part of the communicative pair, and an appropriate decision would be taken concerning the power of transmission: should it be increased or decreased at this moment. In the next chapter we will describe the way the power of each transmitting part may be made optimal, assuming that at each moment t any transmitter of the collective knows the value of its corresponding NTS. The moments of time t are the instances of pronouncing the emotional phrases by the receiving party and of their recognition by the currently transmitting party, using the base of all the emotional phrases collected in Table 3 (which is assumed to be known to every station of the WMCS).
3 Naturally, these calculations may be performed in advance. Yet, it is not desirable at this stage, in order to be able to update Table 3 when some new emotional phrases start to be used by people, or when some other language is used by the users of the WMCS.
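One way to perform this interpolation is piecewise-linearly in the decibel domain, using the rows of Table 2 as breakpoints (a sketch; the function name is ours, and λ is computed exactly from the dB column, so it may differ slightly from the rounded SNR/NTS entries printed in Table 2):

```python
# Breakpoints taken from the RASTI and Decibels columns of Table 2.
RASTI_PTS = [0.0, 0.25, 0.50, 0.75, 1.0]
DB_PTS = [-30.0, -16.0, 0.0, 16.0, 30.0]

def nts_from_rasti(r):
    """Map a RASTI intelligibility index (0..1) to an NTS ratio (lambda)."""
    r = min(max(r, 0.0), 1.0)
    for k in range(len(RASTI_PTS) - 1):
        lo, hi = RASTI_PTS[k], RASTI_PTS[k + 1]
        if r <= hi:
            # linear interpolation of the decibel value inside the segment
            db = DB_PTS[k] + (DB_PTS[k + 1] - DB_PTS[k]) * (r - lo) / (hi - lo)
            break
    snr = 10.0 ** (db / 10.0)
    return 1.0 / snr          # NTS is the reciprocal of SNR
```

For a phrase rated, say, 0.6 by the students, the returned λ is the value fed back to the transmitter for the elastic control of Sect. 1.3.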
2.3 Collecting of Emotional Replicas from Users of Mobile Communication
The problems raised in this part of our paper have a certain relation to L.V. Savinitch's previous research on the effect of noise on spoken communication [11]. She observed that in a noisy environment the intensity of the speaker's voice naturally increases and the fundamental tone frequency f0 also changes. This phenomenon is referred to as the Lombard Effect [12]. Similar effects should be expected in our WMCS, where we have a somewhat unusual noisy environment. Indeed, besides the environmental noise, the loss of speech intelligibility may be due to the interference from different wideband communication channels, causing instability during power control. In this paper we will not consider acoustic or prosodic features of speech communication, limiting ourselves to the practical problem of the stability of the WMCS. Our goal is the analysis of the emotional reactions [13] of the communicating persons, in order to extract the data for the feedback needed for the elastic control of the powers.

It was decided to build a realistic natural language knowledge base (NLKB) in two rounds. In the first round we collected a number of words and expressions that people constantly use in mobile communication. A poll was organized among 25 students at a Moscow university, asking them to produce a collection of phrases expressing their emotional estimation of the quality of mobile communication. The results are shown in the first two columns of Table 3 (see footnote 4). At the second step of the experimental research the same 25 students were asked to independently estimate the intelligibility of each of the phrases, using numbers from 1 to 10, where 10 corresponds to the highest intelligibility. In the end we took the average of their estimations and obtained statistically realistic numerical estimations of the emotional phrases found in the first round of our research. To obtain the RASTI values discussed in Sect. 2.2 above, all the numerical values were multiplied by a factor of 0.1.
The resulting Table 3 may be used directly in our approach to stability, as the third column now contains RASTI indexes for NTS ratios.
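The two-round scoring procedure described above can be sketched as follows. This is only an illustration of the arithmetic, assuming per-phrase ratings on the 1..10 scale; the function name and the sample ratings are our own, not data from the experiment:

```python
# Sketch of the second round of the NLKB construction: each phrase receives
# intelligibility ratings on a 1..10 scale from the 25 students; the average
# is scaled by the factor 0.1 to obtain a RASTI-like index in (0, 1].

def rasti_index(ratings):
    """Average a list of 1..10 ratings and scale by 0.1."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return 0.1 * sum(ratings) / len(ratings)

# Hypothetical ratings from several students for one phrase.
print(round(rasti_index([6, 7, 5, 6, 7]), 3))  # 0.62
```

With a full set of 25 ratings per phrase this yields exactly the kind of values listed in the third column of Table 3.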
4 We used Russian-language speakers to build Table 3 and give the translations in the table only for the readers' convenience. Note that for another language one has to build a separate table, similar to Table 3, in the way described above.
Emotional Control of Wideband Mobile Communication
Table 3. Estimates of the disturbance intelligibility indexes in modern mobile communication

Phrases and their translations:
- Не могли бы вы повторить, вас плохо слышно. — Could you repeat that, please? The connection is bad.
- Алло-алло, вас не слышно. — Hello, hello, I can't hear you.
- Чёрт, ты где там пропал? — Dammit, where have you disappeared to?
- Ё-моё, я тебя совсем не слышу. — Man, I can't hear you at all!
- Что за дыра там у тебя, связи нет?! — What kind of a hole are you in? There is no connection!
- Мегафон, чёрт бы тебя побрал! Вообще ничего не слышно. — Megafon, damn you! I can't hear anything at all.
- Говори погромче. — Speak a bit louder.
- Что-то шипит в трубке. — There is some hissing on the line.
- Ты пропадаешь! — You are breaking up!
- Что? Не поняла, тебя не слышно. — What? I didn't get that; I can't hear you.
- В трубке что-то заикается. — Something is stuttering on the line.
- Блин, чёртов бункер, сети нет! — Damn, this wretched bunker, there is no network!
- Громче говори, я не слышу! — Speak up, I can't hear you!
- Блин, плохо тебя слышу. — Damn, I can hardly hear you.
- Алло! Алло! Что молчишь? — Hello! Hello! Why are you silent?
- Блин, бесит меня эта связь дурацкая! — Damn, this stupid connection is driving me nuts!
- Алло! Абсолютно вас не слышу. Перезвоните мне! — Hello! I can't hear you at all. Call me back!
- Вы пропадаете! Что? Ещё раз, пожалуйста! — You are breaking up! What? Once again, please!
- Да, чёрт возьми, связь, как в пещере! — Damn it, the connection is like in a cave!
- Всё, меня это уже бесит, позже созвонимся. — That's it, this is driving me nuts; let's talk later.
- Ты меня слышишь? Ты пропадаешь… Ку-ку! — Can you hear me? You are breaking up... Hello-o!
- Чёртов мегафон! — Damn Megafon!
- Вы меня слышите? — Can you hear me?
- Я перезвоню. — I will call you back.
- Мой телефон плохо ловит. — My phone has poor reception.
- Блин, связь плохая, ты где находишься, что тебя так плохо слышно?! — Damn, the connection is bad; where are you that I can hear you so badly?!
- Ты можешь говорить громче? И чётче? — Can you speak louder? And more distinctly?
- Что-то со связью. Перезвони. — Something is wrong with the connection. Call me back.
- Ты совсем пропадаешь, не могу разобрать, блин, what the fuck! — You are breaking up completely; I can't make it out, damn it, what the fuck!
- Что? Я тебя не слышу! Ты слышишь? — What? I can't hear you! Can you hear me?
- Щас, я перезвоню, что-то со связью. Опять связь тупит. — Hold on, I will call back; something is wrong with the connection. The connection is acting up again.
- Напиши СМС, а то второй раз такая фигня. — Text me instead; this is the second time this nonsense happens.
- У тебя что-то с телефоном, я же говорила: переходи на Мегафон. — Something is wrong with your phone; I told you before to switch to Megafon.
- Я в метро. Ничего не слышно. — I am in the subway. I can't hear anything.
- Повтори ещё раз, я не поняла. Блин, что-то со связью. — Say it once again, I didn't get it. Damn, something is wrong with the connection.
- Чёртово метро! Отойди уже куда-нибудь, где потише. — Damn subway! Go somewhere quieter already.
- Ой, что-то связь пропала. — Oh, the connection seems to have dropped.
- Какой-то начался треск. — Some crackling has started.
- Стало тихо. — It has gone quiet.
- Что-то начало щёлкать. — Something has started clicking.
- Слышимость ухудшилась. — The audibility has deteriorated.

Average intelligibility values (third column, in reading order): 0.620, 0.528, 0.368, 0.404, 0.320, 0.303, 0.380, 0.724, 0.600, 0.632, 0.512, 0.344, 0.472, 0.504, 0.544, 0.588, 0.432, 0.364, 0.508, 0.588, 0.352, 0.280, 0.584, 0.588, 0.472, 0.368, 0.620, 0.532, 0.512, 0.456, 0.584, 0.424, 0.536, 0.424, 0.492, 0.324, 0.424, 0.348, 0.328, 0.384, 0.600, 0.488, 0.380, 0.520, 0.484, 0.456, 0.472, 0.476, 0.516.
V. L. Stefanuk and L. V. Savinitch
3 Automata that Implement the Nash Point

It is assumed that the KB obtained in the previous section is known to every user of the WMCS and is used so that the user's gadget is able to recognize the emotive phrases pronounced and to convert each of them into the corresponding number shown in Table 3. This number is used as feedback in the power control algorithms. In the present section it will be shown that, in order to implement the proposed elastic criterion, we have to use the collection of power control algorithms (3) in the collective of radio stations of the WMCS, which converge if and only if $\vec{k}^0 = (k_1^0, \ldots, k_n^0) \in K^{(n)}$.

For the elastic criterion we have to consider an automata game, where the penalty/reward function for the i-th automaton in (3) is defined by an automaton of the second layer that implements the penalty function (5). We might use the approach of [15, 16], intended for finding the extreme points of a function of many variables, provided that the calculation of the derivative along the i-th coordinate is essentially local. However, we prefer to obtain the necessary result in the following way.

Consider a time interval $l$. Let the bottom automaton (3) initially keep the constant value $k_i^{(1)}$. Let it then keep another constant value, $k_i^{(1)} + r$, during the first half of the interval $l$, and the constant value $k_i^{(1)} - r$ during the second half. This allows the derivative $\partial P_i(k_1, \ldots, k_n) / \partial k_i$ to be found approximately at two points. To achieve this, it is necessary to take the difference of the powers $P_i$ at the start of the interval $l$ and at its middle point. Using these values it is possible to compute two values of the criterion (5) at the middle of the time interval $l$. Then the upper-level automaton orders the bottom automaton to keep constant whichever of the two values above corresponds to the smaller value of the criterion $C_i$.

Thus, the value $\partial P_i(k_1, \ldots, k_n) / \partial k_i$ may be calculated approximately at any moment $t$, and correspondingly the value of $C_i$ may be found as well. Note that the precision of the above calculations increases as $r \to 0$, in analogy with the search for a minimum in the papers [15, 16] on extremal regulation. Thus, at each step the smaller of the two values of the criterion $C_i$ is taken, and our two-layer algorithm continues to move in the direction of the minimum of $C_i$; i.e., the system follows the elastic criterion.

In our model the action of the elasticity criterion may be influenced by a feedback signal from the receiving agent of the same pair, in the form of a phrase from Table 3, reporting the actual quality of communication, i.e. $NTS_i(t)$. If $NTS_i(t) > k_i(t)$, our algorithm continues its operation, raising the quality of reception to the maximum possible within the frame of the elastic criterion. However, if $NTS_i(t) \le k_i(t)$, the algorithm stops increasing the power for the sake of power economy (and of saving the batteries of mobile devices). For the same reason of economy, if no feedback arrives for a certain period of time, the algorithm stops moving toward the minimum by increasing power, since the speaker is obviously satisfied with the reception quality. Yet, if the algorithm is moving toward the minimum by decreasing power, it has to continue the process.
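One cycle of the two-level construction just described can be sketched as follows. This is a minimal illustration under our own naming, not the authors' implementation: the criterion C_i is abstracted as a callable, and the toy quadratic criterion below merely stands in for the quantity computed from the measured powers P_i:

```python
# Illustrative sketch of one cycle of the two-level automaton: the bottom
# automaton tries k + r and k - r over the two halves of the interval l,
# and the upper automaton keeps whichever setting gives the smaller value
# of the criterion C_i.

def two_level_step(k, r, criterion):
    """One finite-difference step toward the minimum of `criterion`."""
    c_plus = criterion(k + r)    # first half of the interval l
    c_minus = criterion(k - r)   # second half of the interval l
    return k + r if c_plus < c_minus else k - r

# Toy criterion with a minimum at k = 0.7; repeated steps approach it,
# with precision improving as r -> 0.
c = lambda k: (k - 0.7) ** 2
k = 0.2
for _ in range(20):
    k = two_level_step(k, 0.05, c)
print(round(k, 2))  # stays near 0.7
```

Once near the minimum the setting oscillates within the perturbation r of the optimum, which matches the remark that the precision of the calculation improves as r tends to zero.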
Thus, the elastic criterion never admits instability, as the vector $\vec{k}$ never crosses the boundary of the admissible area; in such a system one has guaranteed stability. Note that in the case of the elastic criterion we had to use a two-level automata construction. The first level maintains a given value of $NTS_i$ (first taken from the third column of Table 3). The automaton of the second level calculates the value of the criterion and formulates a new task for the automaton of the first level: to increase or to decrease the power. Note also that the partners may change their roles as communication continues. However, the partner currently representing the transmitting part expects an emotional utterance from its counterpart in order to continue the algorithm described above, and adjusts its power correspondingly. Obviously, all power control processes for each pair, as well as within any pair of the collective of radio stations, run in parallel and completely independently of each other, as there is no contradiction in their activity.
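The feedback gate governing the power-raising branch can be sketched as follows. All identifiers and the timeout value are our own illustrative assumptions; the rule itself follows the text: power may be raised only while the reported quality index exceeds the current target and while emotional feedback keeps arriving:

```python
# Hedged sketch of the feedback gate described above: the algorithm keeps
# raising power only while the reported index NTS_i(t) exceeds the current
# target k_i(t); it also stops raising power when no emotional feedback has
# arrived for `timeout` seconds, since silence is read as satisfaction
# with the reception quality.

def may_raise_power(nts, k, last_feedback_age, timeout=30.0):
    """Return True if the power-raising branch may continue."""
    if last_feedback_age > timeout:  # no complaints: user is satisfied
        return False
    return nts > k                   # raise only while the target is exceeded

print(may_raise_power(0.62, 0.5, last_feedback_age=4.0))   # True
print(may_raise_power(0.45, 0.5, last_feedback_age=4.0))   # False
print(may_raise_power(0.62, 0.5, last_feedback_age=60.0))  # False
```

Note that, as in the text, the gate only blocks the power-raising branch; a step that moves toward the minimum by decreasing power would proceed regardless.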
4 Conclusion

The numbers in Table 3 become known to the technical system of a transmitter right after an emotional phrase pronounced by the receiving user is recognized by the transmitting system. The transmitter raises the power if the number found in Table 3 is high enough. If the corresponding increase of communication quality is not essential, the power should not be increased further. Likewise, if the communication quality remains low (the next number found in the third column is again high), the power should not be increased further. Actually, the changes of power should be made in small steps (see footnote 1). After each step the result should be checked against the numbers obtained from Table 3, or after some waiting time. The details are in (see footnote 1) and in [5]. The attempts to change the power stop at the point where there is no essential progress in the quality of communication. In fact this means that the values $k_1, \ldots, k_n$ have reached the Nash equilibrium point in correspondence with (4), and all values of the vector $\vec{k}$ remain admissible: during the process and at its final equilibrium point.

Note that the partners may change their roles as communication continues. However, the partner currently representing the transmitting part expects an emotional utterance from its counterpart in order to continue the algorithm described in Sect. 1, and adjusts its power correspondingly. Obviously, all the processes for each pair, as well as within any concrete pair of the collective of radio stations, run in parallel and completely independently of each other, as there is no contradiction in their activity. Due to the linguistic feedback described in this paper, there will be no overloading or any kind of instability in the Wideband Mobile Communication System.
The results obtained in this paper show that natural language may be used to adjust the powers in a way that guarantees the stability of mobile media. Our approach to the collective behavior of learning systems relies on the so-called elastic approach. What
should be specially stressed is that the automata have continuous states and require a differential technique for their analysis, which is not typical for the area of Artificial Intelligence. The resulting system with emotional feedback not only guarantees stability, but also saves some energy and spares the user equipment when the conditions of communication are good enough.

Acknowledgments. The present study was partially supported by the Russian Foundation for Basic Research, grants 18-07-00736 and 17-29-07053. We would like to thank the anonymous reviewers of this paper for their valuable remarks.
References
1. Stefanuk, V.L.: On the stability of power control in a net of radio stations. In: Proceedings of the 3rd Conference on the Theory of Coding and Information Transmission, pp. 64–71. IITP, Moscow (1967)
2. Stefanuk, V.L., Tzetlin, M.L.: On power control in the collective of radio stations. Inf. Transm. Probl. 3(4), 59–67 (1967)
3. Wrotten, L.J.: Power control to minimize communication spectrum requirements. IEEE Int. Convention Rec. Part 1, 260–265 (1965)
4. Barrious, A.A.: Spectrum economy through controlled transmitter power levels. IEEE Int. Convention Rec. 13, 29–37 (1965)
5. Stefanuk, V.L.: On the local criteria of stable choice of the power level. Inf. Transm. Probl. 5(3), 46–63 (1969)
6. Stefanuk, V.L.: Collective behavior of automata and the problems of stable local control of a large scale system. In: Proceedings of the 2nd International Joint Conference on Artificial Intelligence, pp. 51–56, London (1971)
7. Houtgast, T., Steeneken, H.J.M., et al.: Past, Present and Future of the Speech Transmission Index. TNO Human Factors, Soesterberg, The Netherlands (2002). ISBN 90-76702-02-0
8. Barnett proposed a reference Common Intelligibility Scale (CIS), based on the mathematical relation CIS = 1 + log(STI) (1995/1999)
9. Houtgast, T., Steeneken, H.J.M.: Evaluation of speech transmission channels by using artificial signals. Acustica 25, 355–367 (1971)
10. Steeneken, H.J.M., Houtgast, T.: A physical method for measuring speech-transmission quality. J. Acoust. Soc. Am. 67, 318–326 (1980)
11. Savinitch, L.V.: Intelligence system behavior in changing communicative environment. In: Proceedings of the 15th National Conference on Artificial Intelligence, pp. 295–301, Smolensk (2016). (in Russian)
12. Lombard, É.: Le signe de l'élévation de la voix. Annales des Maladies de l'Oreille et du Larynx XXXVII(2), 101–119 (1911)
13. Fominich, I.B., Alekseev, V.V.: Classification of emotions: an information approach. In: Proceedings of the IV International Workshop on Integrated Models and Soft Computing in Artificial Intelligence, pp. 63–71. Nauka, Moscow (2007). (in Russian)
14. Savinitch, L.V.: Pragmatic and communicative strategies in journalistic discourse under censorship. In: Power Without Domination: Dialogism and the Empowering Property of Communication, pp. 107–137. John Benjamins Publishing Company, Amsterdam/Philadelphia (2005)
15. Krasovsky, A.A.: Dynamics of Continuous Self-Adjusting Systems (1963). (in Russian)
16. Rastrigin, L.A.: Systems of Optimizing Control. Nauka, Moscow (1974). (in Russian)
Author Index
A
Adhyaru, Dipak M., 329
Agrawal, Namrata, 197
Ahmad, Hifzan, 186, 197
Alvarez, L., 358
Antoff, I., 358
Avramoni, Dacian, 294
Azar, Ahmad Taher, 372
B
Balas, Marius Mircea, 403
Balas, Valentina E., 344
Barbu, Tudor, 177
Barbulescu, Constantin, 3, 19
Barla, Eva, 41
Barna, Cornel, 186, 197
Beiu, Valeriu, 115
Bejan, Crina Anina, 152
Bel, W., 358
Berecz, Csilla, 307
Bold, Nicolae, 165, 385
Bucerzan, Dominic, 152
Bunda, Serban-Ioan, 30
C
Călin, Alina Delia, 73
Castillo, Oscar, 130
Chis, Violeta, 3
Coroiu, Adriana Mihaela, 73
Cosariu, Cristian, 294
Costantini, F., 358
Cowell, Simon R., 115
Craciun, Mihaela, 3
D
Deb, Dipankar, 60
Dhiman, Harsh S., 60
Diaconescu, Andra, 228
Dicu, Anca M., 403
Dineva, Adrienn, 48
Dombi, József, 81
Dragan, Florin, 392
Drăgoi, Vlad-Florin, 115
Dzitac, Simona, 30
F
Filip, Ioan, 273, 392
Fouad, Khaled M., 372
Fravega, L., 358
G
Gandhi, Ravi V., 329
Giuca, Olivia, 240
I
Inbarani, H. Hannah, 372
Iovanovici, Alexandru, 294
Ivascu, Larisa, 217
J
Janurová, Kateřina, 102
Jónás, Tamás, 81
Joon, Kirti, 197
K
Kamal, Nashwa Ahmad, 372
Kilyeni, Stefan, 3, 19
Kiss, Gabor, 307
Koubaa, Anis, 372
L
Lala, I. Radu, 372
Lavanya, B., 372
López De Luise, D., 358
M
Mallick, Ajay Kumar, 186
Martinovič, Jan, 102
Martinovič, Tomáš, 102
Melin, Patricia, 130
Miclea, Serban, 217
Mnerie, Corina, 403
N
Naaji, Antoanela, 285
Nagy, Mariana, 317
Narayana, Y. Gayathri, 344
Negirla, Paul-Onut, 317
Negrut, Mircea Liviu, 217
Nijloveanu, Daniel, 385
Ninawe, Aniket, 186
P
Pálfi, Judith, 48
Pandey, Hari Mohan, 145
Pantazi, Stefan V., 412
Popescu, Alina Madalina, 240
Popescu, Daniela Elena, 240
Popescu, Doru Anastasiu, 165, 385
Popescu, Marius Constantin, 285
Popescu, Traian Mihai, 240
Prodan, Lucian, 294
Prostean, Gabriela, 228, 240
Prostean, Octavian, 273
R
Radu, Dana, 403
Rat, Cezara, 392
Rat, Cezara-Liliana, 273
Rohatinovici, Noemi Clara, 344
Ruset, Patriciu, 228
S
Sah, Dinesh Kumar, 186, 197
Sala, Adi, 285
Sánchez, Daniela, 130
Savinitch, Ludmila V., 426
Secui, Dinu-Calin, 30
Simo, Attila, 3, 19
Simona, Dzitac, 41
Sîrghie, Cecilia, 403
Slaninová, Kateřina, 102
Spircu, Tiberiu, 412
Stefanuk, Vadim L., 426
Szeidert, Iosif, 392
Szeidert-Subert, Iosif, 294
T
Tamasila, Matei, 228
Taucean, Ilie Mihai, 217
Tița, Victor, 165
V
Vasar, Cristian, 273
Vasile, Carja, 41
W
Windridge, David, 145
Y
Yadav, Vikash, 186, 197
Yegnanarayanan, V., 344

© Springer Nature Switzerland AG 2021
V. E. Balas et al. (Eds.): SOFA 2018, AISC 1221, pp. 437–438, 2021. https://doi.org/10.1007/978-3-030-51992-6