Advances in Intelligent Systems and Computing Volume 1168
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, and life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, perception and vision, DNA and immune based systems, self-organizing and adaptive systems, e-learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia.

The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results.

** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **
More information about this series at http://www.springer.com/series/11156
Shailesh Tiwari · Munesh C. Trivedi · Krishn Kumar Mishra · A. K. Misra · Khedo Kavi Kumar · Erma Suryani

Editors
Smart Innovations in Communication and Computational Sciences Proceedings of ICSICCS 2020
Editors

Shailesh Tiwari
Computer Science Engineering Department, ABES Engineering College, Ghaziabad, Uttar Pradesh, India

Munesh C. Trivedi
Department of Computer Science and Engineering, National Institute of Technology Agartala, Agartala, Tripura, India

Krishn Kumar Mishra
Computer Science Engineering Department, Motilal Nehru National Institute of Technology Allahabad, Allahabad, Uttar Pradesh, India

A. K. Misra
Computer Science Engineering Department, Motilal Nehru National Institute of Technology Allahabad, Allahabad, Uttar Pradesh, India

Khedo Kavi Kumar
Department of Digital Technologies, University of Mauritius, Reduit, Moka, Mauritius

Erma Suryani
Information Systems Department, Faculty of Information and Communication Technology, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, Indonesia
ISSN 2194-5357  ISSN 2194-5365 (electronic)
Advances in Intelligent Systems and Computing
ISBN 978-981-15-5344-8  ISBN 978-981-15-5345-5 (eBook)
https://doi.org/10.1007/978-981-15-5345-5

© Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The 3rd International Conference on Smart Innovations in Communications and Computational Sciences (ICSICCS 2020) was held at Ayodhya, Uttar Pradesh, India, during February 27–28, 2020. ICSICCS 2020 was organized and supported by the Institute of Engineering and Technology, Dr. Rammanohar Lohia Avadh University, Ayodhya, Uttar Pradesh, India.

The main purpose of ICSICCS 2020 is to provide a forum for researchers, educators, engineers, and government officials involved in the general areas of communication, computational sciences and technology to disseminate their latest research results and exchange views on the future research directions of these fields. It is organized specifically to help the computer industry derive the advances of next-generation computer and communication technology. Invited researchers present the latest developments and technical solutions.

The field of communications and computational sciences always deals with finding innovative solutions to problems by proposing different techniques, methods, and tools. Generally, innovation refers to finding new ways of doing usual things or doing new things in a different manner; but given rapidly growing technological advances, smart innovations are needed. 'Smart' refers to how intelligent the innovation is. Nowadays, there is a massive need to develop new 'intelligent' ideas, methods, techniques, devices, and tools. The proceedings cover those systems, paradigms, techniques, and technical reviews that employ knowledge and intelligence in a broad spectrum.

ICSICCS 2020 received around 150 submissions from around 403 authors of five countries: India, China, Bangladesh, Taiwan, and Bulgaria. Each submission went through a plagiarism check. On the basis of the plagiarism report, each submission was rigorously reviewed by at least two reviewers; some submissions received more than two reviews. On the basis of these reviews, 29 high-quality papers were selected for publication in this proceedings volume, with an acceptance rate of 19.3%. In this book, the selected manuscripts have been subdivided into various tracks named advanced communications and security, intelligent computing techniques, intelligent hardware and software design, and intelligent image processing.
A sincere effort has been made to make this volume an immense source of knowledge for all; it includes 29 manuscripts. The selected manuscripts have gone through a rigorous review process and were revised by the authors after incorporating the suggestions of the reviewers.

We are thankful to the speakers: Prof. Srikanta Patnaik, SOA University, Bhubaneswar, India, and Mr. Vinit Goenka, Member-Governing Council-CRIS, Ministry of Railways; Member-IT. We are also thankful to the delegates and the authors for their participation and their interest in ICSICCS as a platform to share their ideas and innovations. We are also thankful to Prof. Dr. Janusz Kacprzyk, Series Editor, AISC, Springer, and Mr. Aninda Bose, Senior Editor, Hard Sciences, Springer Nature, India, for providing continuous guidance and support. We also extend our heartfelt gratitude to the reviewers and technical program committee members for their concern and effort in the review process. We are indeed thankful to everyone directly or indirectly associated with the conference organizing team for leading it toward success. We hope you enjoy the conference proceedings and wish you all the best.

Ayodhya, Uttar Pradesh, India
Shailesh Tiwari
Munesh C. Trivedi
Krishn Kumar Mishra
A. K. Misra
Khedo Kavi Kumar
Erma Suryani
Editors
About This Book
Over the last few decades, smart innovations in communication and computational sciences have reached an impressive level by drawing the attention of researchers, academicians, and technocrats. The techniques, methods, and tools developed under the aegis of communication and computational sciences improve not only very common areas of our daily life, but also areas of education, health, transportation, robotics, data sciences, data analytics, production industries, and many more. Thus, smart innovations in communication and computational sciences are the need of the society of today as well as tomorrow.

Nowadays, various cores in the field of communication and computational sciences have entered a new era of technological innovation and development, which we call 'Smart Computing'. Smart computing provides intelligent solutions to problems based on human-like thinking, and it is more complex compared to other computing technologies. We can say that the main objective of smart computing is to make software and computing devices usable, smaller, mobile, and smarter.

Keeping this ideology in preference, this book includes insights that reflect the advances in the field of smart communication and computational sciences from upcoming researchers and leading academicians across the globe. It contains the high-quality peer-reviewed papers of the 'International Conference on Smart Innovations in Communication and Computational Sciences (ICSICCS 2020)', held at the Institute of Engineering and Technology, Dr. Rammanohar Lohia Avadh University, Ayodhya, Uttar Pradesh, India, during February 27–28, 2020. These papers are arranged in the form of chapters. The contents of this book cover four areas: Advanced Communications and Security, Intelligent Computing Techniques, Intelligent Hardware and Software Design, and Intelligent Image Processing. This book helps prospective readers from the computer industry and academia to derive the advances of next-generation smart computing technologies and shape them into real-life applications.
Contents
Advanced Communications and Security

Trust-Based Fuzzy Bat Optimization Algorithm for Attack Detection in Manet . . . 3
Rakesh Kumar and Shashi Shekhar

Multipath Routing Using Improved Grey Wolf Optimizer (IGWO)-Based Ad Hoc on-Demand Distance Vector Routing (AODV) Algorithm on MANET . . . 13
Abhay Chaturvedi and Rakesh Kumar

Comparative Analysis of Consensus Algorithms and Issues in Integration of Blockchain with IoT . . . 25
Ashok Kumar Yadav and Karan Singh

Use of Hadoop for Sentiment Analysis on Twitter's Big Data . . . 47
Sandeep Rathor

Enhanced Mobility Management Model for Mobile Communications . . . 55
Ashok Kumar Yadav and Karan Singh

Multi-objective-Based Lion Optimization Algorithm for Relay Selection in Wireless Cooperative Communication Networks . . . 69
Anand Ranjan, O. P. Singh, G. R. Mishra, and Himanshu Katiyar

Intelligent Computing Techniques

EmbPred30: Assessing 30-Days Readmission for Diabetic Patients Using Categorical Embeddings . . . 81
Sarthak, Shikhar Shukla, and Surya Prakash Tripathi

Prediction of Different Classes of Skin Disease Using Machine Learning Techniques . . . 91
Anurag Kumar Verma, Saurabh Pal, and Surjeet Kumar

Feature Selection and Classification Based on Swarm Intelligence Approach to Detect Malware in Android Platform . . . 101
Ashish Sharma and Manoj Kumar

Sentiment Classification Using Hybrid Bayes Theorem Support Vector Machine Over Social Network . . . 111
Shashi Shekhar and Narendra Mohan

Improved Cuckoo Search with Artificial Bee Colony for Efficient Load Balancing in Cloud Computing Environment . . . 123
Rakesh Kumar and Abhay Chaturvedi

Modified Genetic Algorithm with Artificial Neural Network Algorithm for Cancer Gene Expression Database . . . 133
Narendra Mohan and Neeraj Varshney

Transfer Learning: Survey and Classification . . . 145
Nidhi Agarwal, Akanksha Sondhi, Khyati Chopra, and Ghanapriya Singh

Machine Learning Approaches for Accurate Image Recognition and Detection for Plant Disease . . . 157
Swati Vashisht, Praveen Kumar, and Munesh C. Trivedi

Behavior Analysis of a Deep Feedforward Neural Network by Varying the Weight Initialization Methods . . . 167
Geetika Srivastava, Shubham Vashisth, Ishika Dhall, and Shipra Saraswat

A Novel Approach for 4-Bit Squaring Circuit Using Vedic Mathematics . . . 177
Shatrughna Ojha, Vandana Shukla, O. P. Singh, G. R. Mishra, and R. K. Tiwari

Smart Educational Innovation Leads to University Competitiveness . . . 185
Zornitsa Yordanova and Borislava Stoimenova

Analysis and Forecast of Car Sales Based on R Language Time Series . . . 197
Yuxin Zhao

Intelligent Hardware and Software Design

Distributed Database Design and Heterogeneous Data Fusion Method Applied to the Management and Control Platform of Green Building Complex . . . 211
Wen-xia Liu, Wen-jing Wu, Wei-hua Dong, Dong-lin Wang, Chao Long, Chen-fei Qu, and Shuang Liu

Analysis and Prediction of Fuel Oil in a Terminal . . . 223
Yuxin Zhao

An Android Malware Detection Method Based on Native Libraries . . . 233
Qian Zhu, Yuanqi Xu, Chenyang Jiang, and Wenling Xie

Fire Early Warning System Based on Precision Positioning Technology . . . 247
Hanhui Lin, Likai Su, and Yongxia Luo

The Construction of Learning Resource Recommendation System Based on Recognition Technology . . . 255
Hanhui Lin, Yiyu Huang, and Yongxia Luo

Review of Ultra-Low-Power CMOS Amplifier for Bio-electronic Sensor Interface . . . 263
Geetika Srivastava, Ashish Dixit, Anil Kumar, and Sachchidanand Shukla

Intelligent Image Processing

An Enhancement of Underwater Images Based on Contrast Restricted Adaptive Histogram Equalization for Image Enhancement . . . 275
Vishal Goyal and Aasheesh Shukla

Hybridization of Social Spider Optimization (SSO) Algorithm with Differential Evolution (DE) Using Super-Resolution Reconstruction of Video Images . . . 287
Shashi Shekhar and Neeraj Varshney

Enhanced Convolutional Neural Network (ECNN) for Maize Leaf Diseases Identification . . . 297
Rohit Agarwal and Himanshu Sharma

Facial Expression Recognition Using Improved Local Binary Pattern and Min-Max Similarity with Nearest Neighbor Algorithm . . . 309
Narendra Mohan and Neeraj Varshney

Visibility Improvement of Hazy Image Using Fusion of Multiple Exposure Images . . . 321
Subhash Chand Agrawal and Anand Singh Jalal
About the Editors
Dr. Shailesh Tiwari currently works as a Professor in the Computer Science and Engineering Department, ABES Engineering College, Ghaziabad, India. He is an alumnus of Motilal Nehru National Institute of Technology Allahabad, India. His primary areas of research are software testing and the implementation of optimization algorithms and machine learning techniques in various problems. He has published more than 70 publications in international journals and in proceedings of international conferences of repute. He has edited special issues of several Scopus-, SCI- and E-SCI-indexed journals. He has also edited several books published by Springer. He has organized several international conferences under the banner of IEEE, ACM and Springer. He is a Senior Member of IEEE, a member of the IEEE Computer Society, and a Fellow of the Institution of Engineers (FIE).

Dr. Munesh C. Trivedi is currently working as an Associate Professor in the Department of CSE, NIT Agartala, Tripura, India. He has published 20 textbooks and 80 research publications in different international journals and proceedings of international conferences of repute. He has received the Young Scientist award and numerous other awards from different national as well as international forums. He has organized several international conferences technically sponsored by IEEE, ACM and Springer. He is on the review panel of the IEEE Computer Society, International Journal of Network Security, Pattern Recognition Letters and Computers & Education (Elsevier). He is an Executive Committee Member of the IEEE India Council and IEEE Asia Pacific Region 10.

Dr. Krishn Kumar Mishra is currently working as an Assistant Professor in the Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology Allahabad, India. He has also worked as Visiting Faculty in the Department of Mathematics and Computer Science, University of Missouri, St. Louis, USA. His primary areas of research include evolutionary algorithms, optimization techniques and design, and analysis of algorithms. He has also
published more than 60 publications in international journals and in proceedings of international conferences of repute. He is serving as a program committee member of several conferences and has also edited Scopus- and SCI-indexed journals. He is also a member of the reviewer board of Applied Intelligence (Springer).

Prof. A. K. Misra retired as Professor in the Computer Science and Engineering Department, Motilal Nehru National Institute of Technology Allahabad, India. Presently, he is associated with SPMIT, Allahabad, India, as an Advisor. He has more than 45 years of experience in teaching, research and administration. His areas of specialization are software engineering and nature-inspired algorithms. He has obtained grants as coordinator, co-investigator and chief investigator for several research projects and completed them successfully, such as the Indo-UK REC Project, Development of a Framework for Knowledge Acquisition and Machine Learning for Construction of Ontology for Traditional Knowledge Digital Library (TKDL), a Semantic Web Portal for Tribal Medicine, and many more. He has guided 148 PG and 20 doctorate students. He has published more than 90 research articles in international journals and in proceedings of international conferences of repute. He is a Fellow of the Institution of Engineers (FIE) and a Member of ISTE, IEEE and CSI. He has organized several national and international conferences in the capacity of General Chair under the flagship of ACM and IEEE.

Dr. Khedo Kavi Kumar is an Associate Professor in the Department of Computer Science and Engineering, University of Mauritius, Mauritius. His research interests are directed toward wireless sensor networks, mobile ad hoc networks, context-awareness, ubiquitous computing and the Internet of Things. He has published research papers in renowned international conferences and high-impact journals and has presented his research works at reputed international conferences around the world. He has also served on numerous editorial boards of distinguished international journals and on technical program committees of popular international conferences (IEEE Africon 2013, IEEE ICIT 2013, InSITE 2011, IEEE WCNC 2012, ICIC 2007, WCSN 2007, COGNITIVE 2010, Wiley International Journal of Communication Systems, International Journal of Sensor Networks, International Journal of Computer Applications). He has also served as Head of Department at the Department of Computer Science and Engineering, Faculty of Engineering, University of Mauritius. He was awarded the UoM Research Excellence Award in February 2010 and was ICT Personality of the Year 2013 (Runner Up); these recognize, respectively, an outstanding young academic who has contributed significantly to promoting research at the University, and an individual who has demonstrated exemplary growth and performance in the ICT industry in Mauritius in 2013.
Dr. Erma Suryani currently works as an Associate Professor in the Information Systems Department, Faculty of Information and Communication Technology, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, Indonesia. She is an alumna of the National Taiwan University of Science and Technology (NTUST), Taiwan. Her primary areas of research are system dynamics, model-driven decision support systems, supply chain management, enterprise systems, as well as modelling and simulation in several fields. She has published more than 75 publications in international journals and in proceedings of international conferences of repute. She is also a reviewer for some SCI journals and has published several books with Springer and national publishers. She has organized several international conferences under the banner of Procedia Computer Science.
Advanced Communications and Security
Trust-Based Fuzzy Bat Optimization Algorithm for Attack Detection in Manet Rakesh Kumar and Shashi Shekhar
Abstract Mobile nodes dynamically form multi-hop wireless networks that are termed mobile ad hoc networks (MANETs). Centralized infrastructure is not required for the operation of MANETs. They are non-fixed-infrastructure networks, and they face many issues because of their dynamic topology, mobile nodes, security, bandwidth constraints, limited battery backup, etc. Trust expresses the association, trustworthiness, reliability and faithfulness of the nodes in the network. This paper discusses trust and trust computation. In this work, a trust-based fuzzy bat (TFB) optimization model is proposed and implemented to mitigate the effects of attacks. Various network environments are used to perform sensitivity analysis. Performance metrics like normalized routing overhead and packet delivery ratio are used for evaluation. The values of the trust update interval, the weight of the trust components and the distrust threshold are varied. A security solution is ensured using fuzzy bat optimization with a trust evaluation model in the untrusted environment of MANETs. The results conclude that the proposed TFB algorithm provides higher network lifetime and throughput and lower energy consumption and end-to-end delay than the existing approaches.

Keywords MANET · Trust metric · Trust-based fuzzy bat (TFB) · Security · Attack detection
R. Kumar · S. Shekhar (B)
Department of Computer Engineering & Applications, GLA University, Mathura, UP 281406, India
e-mail: [email protected]

R. Kumar
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_1

1 Introduction

A mobile ad hoc network is a multi-hop, self-organized, peer-to-peer network. Due to their open and unmanaged network characteristics, nodes in MANETs are affected by various types of attacks. MANETs are widely used in information sharing and distributed collaboration. These functions require that the nodes in MANETs
cooperate among themselves [1]. Generally, MANETs are deployed in resource-constrained environments, which increases the number of compromised nodes. Due to their ad hoc infrastructure, MANETs are greatly affected by denial-of-service (DoS) attacks. To improve the survivability of the nodes in such networks, a great deal of research has been performed. There are two types of attacks [2] in wireless networks: routing disruption and resource depletion attacks. A routing disruption attack performs modification of the routing paths, as in the Sybil attack and the wormhole attack. In a resource depletion attack, on the other hand, the main focus is on the consumption of various resources such as battery power; for example, the most prominent DoS attack aims to consume the battery power of a node completely. The digital world is created by changes in technology, but security in these advanced technologies is an important concern [3]. New security challenges for IoT applications are created by the limited radio range, dynamic topology, resource constraints and inherent characteristics of MANETs. There are other challenges as well, including scalability, reliability, optimal resource management and improvement of quality of service (QoS). Researchers are majorly concentrating on secure routing in ad hoc networks. The classical routing protocols of such networks assume a cooperative, trusted setting between mobile devices [4]. Attackers can compromise legitimate internal mobile nodes very easily, and DoS attacks are launched through misbehaviour in packet forwarding. In classical MANET routing protocols, prominent DoS attacks correspond to sequence number attacks. In the data transmission phase, data packets may be lost by this attack, and in the process of route discovery, protocol rules are broken. Further damage can be reduced by preventing the malfunctioning of protocols at an early stage. In recent days, researchers have been aiming to use trust-based secure routing to address different security issues.
2 Related Work

Khan et al. [5] enhanced MANET security by extending the optimized link state routing (OLSR) protocol to form a multi-attribute trust framework (MATF). The packet dropping rate, false positive rate, adversary detection rate and bootstrapping time are improved. In order to expedite the trust-building process, multiple attributes are used by MATF instead of a single trust attribute. The attributes include data packet forwarding, control packet forwarding and control packet generation. Second-hand information from watchdog nodes is used instead of first-hand information; nodes having a trust value greater than a threshold are termed watchdog nodes. Effective results are produced for link-withholding attacks, packet modification and packet dropping.

Thorat and Kulkarni [6] used an uncertainty analysis framework (UAF) for addressing packet dropping attacks. The ad hoc on-demand distance vector (AODV)
routing protocol is extended to form the UAF. Belief, disbelief and uncertainty (BDU) metrics are computed in the UAF from indirect and direct observations, using the packet forwarding probability of neighbouring nodes. The simulation results show that BDU values are affected by selfish nodes, network density and mobility models. Better knowledge about network security and performance can be given by the UAF once the network converges; information about benign and distrusted nodes is also provided.

Jawhar et al. [7] devised a trust-based routing protocol for ad hoc and sensor networks (TRAS) by extending the dynamic source routing (DSR) protocol. For communicating data, the node with the highest trust factor is selected, and if the primary route fails, backup routes can be used. If a node participates in the packet forwarding process, its trust factor is increased; likewise, if a node receives a positive acknowledgement during data transmission, its trust factor is enhanced. In various conditions, better performance is shown by the two evaluated versions of TRAS, TRAS-25 and TRAS-50, as shown by the simulation results.

Sethuraman and Kannan [8] formed a trusted and reliable path with less energy consumption for sending data packets by proposing the refined trust energy-ad hoc on-demand distance vector (ReTE-AODV) scheme. Indirect and direct trust values are computed along with the node's energy value. Bayesian probability is incorporated into the trust model to acquire a refined trust value and to handle ambiguity. With respect to end-to-end delay and packet delivery ratio, this scheme shows better performance when compared with the existing methods.
3 Proposed Methodology

3.1 System Model

A pure MANET, without a trusted centralized entity, is considered. Multiple hops are used for communication between nodes. Mobile devices are carried by dismounted soldiers; for node mobility, a walking speed of (0, x) m/s is considered, where x represents the upper limit of the speed. Nodes have various speed ranges and amounts of energy in order to reflect the characteristics of a heterogeneous network. Every node periodically beacons its location and id. Node failures can be detected easily by neighbouring nodes because of this beacon message, so a valid group membership is maintained in a timely manner. Here, a single group is assigned a single task. The joining or leaving of a node corresponds to a change in the network topology, which causes involuntary reconnection or disconnection of nodes. In the energy consumption computation, this rekeying cost is also considered. The disconnection or reconnection of a node is incorporated into the computation of the node's trust value.
The closeness trust component is used for this computation; it represents each node's degree of 1-hop neighbours [9]. Due to operational conditions, the environment or its inherent nature, a node may behave selfishly or maliciously. In addition to its specified nature, operating conditions also affect a node: when the energy level of a node is low, the node is willing to save its energy by showing selfishness. A node may also be compromised, and the node compromise rate can be related to the node's energy level: when the energy level of a node is low, it is more likely to be compromised, whereas a node with a high energy level can defend against attacks by performing high-energy-consuming defence techniques. The association between a node's behaviour and its status is related to the node's own inherent nature in triggering bad character. Based on the status of a node, its energy level can be adjusted. Only the energy consumption of the communication modes is considered in the calculation: the consumption of the transmitting and receiving modes is counted, while idle listening energy consumption is not. The energy consumption rate of a node slows down if the node is selfish. Node energy consumption increases if the node is compromised and is not detected by the IDS; in this condition, nodes may launch attacks and consume more energy. For selfish nodes, redemption mechanisms are considered, along with a re-evaluation period. Based on its own energy level, a node may behave normally or in a selfish manner. When a node becomes a member, it will consume more energy. As in GCSs, the group member leaving and joining mechanism is modelled; based on a distributed key agreement protocol, individual rekeying is performed when membership changes because of eviction/leave/join.
3.2 Trust Metric

The trust metric considered spans two aspects of the trust relationship. Social trust is evaluated first in order to account for social relationships [10]; for this, closeness and honesty are considered. Honesty gives the degree of uncompromise, and the closeness of a node is given by its number of 1-hop neighbours in the social network. QoS trust accounts for a node's ability to complete a specified task; a node's QoS trust level is computed from its degree of cooperation and energy level. In order to account for trust decay, the trust value of a node changes dynamically. This is caused by node failure or mobility, a long trust chain, changes in the node's energy level, whether the node is compromised or uncompromised, and whether it is cooperative or selfish. A continuous real number ranging from 0 to 1 is assigned for the trust level of a node: a value of 0 represents complete distrust, 0.5 represents ignorance, and 1 represents complete trust. Four components showing the status of a node are used to derive the overall trust value: energy level, honesty, closeness and cooperation. SPN-based analytical models are developed for computing trust values probabilistically and for computing every trust component. Trust values are derived probabilistically using the computed values of the trust components in the proposed trust metric. Binary values of 0 or 1 are assigned to
the trust components honesty and energy. During the trust update interval, the statistical data available on dropped packets defines the trust component cooperation, which may be a probability. Closeness is defined by the number of 1-hop neighbours and shows the relative largeness of the 1-hop neighbourhood. As mentioned above, four trust components are reflected by the trust metric: honesty, closeness, energy and cooperation. Over n hops, node i evaluates node j's trust value as

T_{i,j}^{n-hop}(t) = w1 T_{i,j}^{n-hop,cooperation}(t) + w2 T_{i,j}^{n-hop,honesty}(t) + w3 T_{i,j}^{n-hop,closeness}(t) + w4 T_{i,j}^{n-hop,energy}(t)   (1)

Equal weights (w1 = w2 = w3 = w4) are given to the four trust components in expression (1). The computation of each trust component T_{i,j}^{n-hop,X}(t), where X stands for cooperation, honesty, closeness or energy, is described below.
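As a concrete illustration of expression (1), the weighted combination can be computed directly once the four component values are available. The following is a minimal sketch in Python, not the authors' implementation; the component values and the equal weights are illustrative assumptions.

# Minimal sketch of the weighted trust aggregation in expression (1).
# Component values and the equal weights are illustrative assumptions.
def aggregate_trust(cooperation, honesty, closeness, energy,
                    weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine the four trust components into one trust value in [0, 1]."""
    components = (cooperation, honesty, closeness, energy)
    trust = sum(w * c for w, c in zip(weights, components))
    return min(max(trust, 0.0), 1.0)  # clamp to the valid trust range

# Example: a cooperative, honest node with moderate closeness and energy.
print(aggregate_trust(cooperation=0.9, honesty=1.0, closeness=0.6, energy=0.7))
# -> 0.8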
3.3 Trust Model Computation Through Fuzzy Bat Optimization (TFB)

In any distributed network where cooperation among the nodes is very important, trust is considered to be an essential requirement. Trust is described as the reliance of one node on another node, and the trust value shows the reliability of one node with respect to the other. A node may completely depend on another node, but this is not feasible in a real-time environment; thus, the trust value can be represented as a probability that lies in the range of 0 and 1. Hence, trust calculation is best described by a fuzzy approach [11]. In the dynamic environment of MANETs, the location of nodes is not fixed; hence, to calculate the trust value, the mobility of nodes is taken as an important parameter. As nodes are mobile, the trust information also updates rapidly. Thus, trust is a dynamic entity which can be represented as a continuous variable whose value ranges between 0 and 1. In the proposed system, the trust value of the nodes is generated periodically using two different perspectives of trust: the trust value is calculated using the previous individual experience of the nodes and the current behaviour of their neighbours. After calculating a node's trust value, a fuzzy membership function is utilized that defines five fuzzy levels: Very High, High, Medium, Low and Very Low; the trust value lies between 0 and 1. In MANETs, if a node A is willing to transmit data or packets through node B, node A may not be confident of the trustworthiness of node B. Hence, node A relies on its previous experiences with node B, which are stored as one of the fuzzy levels, and node A then takes a decision based on the obtained information; a minimal sketch of such a level mapping is given below.
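The paper does not list the boundaries of the five fuzzy levels, so the crisp cut-offs below are assumptions made only for illustration; a complete fuzzy system would use overlapping membership functions rather than hard thresholds.

# Map a trust value in [0, 1] to one of the five fuzzy levels.
# The crisp boundaries are illustrative assumptions; a complete fuzzy
# system would use overlapping membership functions instead.
def fuzzy_trust_level(trust):
    if trust < 0.2:
        return "Very Low"
    elif trust < 0.4:
        return "Low"
    elif trust < 0.6:
        return "Medium"
    elif trust < 0.8:
        return "High"
    return "Very High"

# Node A consults its stored experience with node B before forwarding.
print(fuzzy_trust_level(0.73))  # -> "High"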
According to the bats' behaviour, fitness values are calculated to reduce the sparse features efficiently for the given network data. A band of bats is developed to search for food/prey using their echolocation capability. Using Eqs. (2), (3) and (4), the frequency, velocity and position of the bats are updated in order to produce their virtual movement:

f_i = f_min + (f_max − f_min)β   (2)

v_i^j(t) = v_i^j(t − 1) + [x̂^j − x_i^j(t − 1)] f_i   (3)

x_i^j(t) = x_i^j(t − 1) + v_i^j(t)   (4)
where β is a randomly generated number lying in the interval [0, 1]. The value of decision variable j for bat i at time step t is denoted by x_i^j(t). The pace and movement range are controlled by the resulting f_i. For each decision variable, the current global best location is represented by the variable x̂^j; all solutions given by the m bats are compared in order to find the best location. Some rules are set in modelling this algorithm:

1. Distance is sensed using echolocation by all bats. In a seemingly magical way, bats are able to distinguish between background and prey/food.
2. A bat b_i flies randomly from position x_i with velocity v_i at frequency f_min, with varying wavelength and loudness A_0, in order to find prey. The wavelength of the emitted pulse can be adjusted automatically by the bats, and, based on target proximity, the rate of pulse emission r ∈ [0, 1] is adjusted.
3. Loudness can be varied in various ways. In general, it varies from a large (positive) A_0 down to a minimum constant value A_min.

Using bat optimization, the sparse features are reduced as follows (a minimal runnable sketch is given after the listing):

Objective function f(x), x = (x_1, …, x_d).
Initialize the bat population x_i and v_i, i = 1, 2, …, m.
Define the pulse frequency f_i for all i = 1, 2, …, m.
Initialize the loudness A_i and pulse rates r_i, i = 1, 2, …, m.

1. Input the network node features
2. While t < T
3. For each bat b_i, do
4. Generate new solutions using Eqs. (2), (3) and (4)
5. If rand > r_i, then
6. Reduce the attacker nodes
7. Apply the trust model using (1)
8. Find the attacker nodes using the objective function values
9. Select a solution among the best solutions
10. Generate local solutions around the best solution
11. If rand < A_i and f(x_i) < f(x*), then
12. Accept the new solutions
13. Increase the value of r_i and reduce the value of A_i
14. Repeat the fuzzy bat procedure from step 2
15. Rank the bats and compute the current best x*
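The listing above can be turned into a short runnable program. The Python sketch below implements the generic bat update rules of Eqs. (2)-(6) on a toy objective; the trust-model steps (6)-(8) of the listing are application-specific and are represented here only by a placeholder objective function, which is an assumption.

import random

# Minimal runnable sketch of the bat algorithm loop (Eqs. (2)-(6)).
# The objective is a toy placeholder standing in for the trust-based
# attacker-node score used in the paper.
def objective(x):
    return sum(v * v for v in x)  # lower value means fitter solution

DIM, BATS, T = 4, 10, 50
F_MIN, F_MAX, ALPHA = 0.0, 2.0, 0.9

pos = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(BATS)]
vel = [[0.0] * DIM for _ in range(BATS)]
loud = [1.0] * BATS            # loudness A_i
rate = [0.5] * BATS            # pulse emission rate r_i
best = min(pos, key=objective)

for t in range(T):
    avg_loud = sum(loud) / BATS
    for i in range(BATS):
        beta = random.random()
        f = F_MIN + (F_MAX - F_MIN) * beta                              # Eq. (2)
        vel[i] = [v + (b - x) * f for v, b, x in zip(vel[i], best, pos[i])]  # Eq. (3)
        cand = [x + v for x, v in zip(pos[i], vel[i])]                  # Eq. (4)
        if random.random() > rate[i]:
            # local random walk around the best solution, Eq. (5)
            cand = [b + random.uniform(-1, 1) * avg_loud for b in best]
        if random.random() < loud[i] and objective(cand) < objective(pos[i]):
            pos[i] = cand                       # accept the new solution (step 12)
            loud[i] *= ALPHA                    # Eq. (6): reduce loudness A_i
            rate[i] = min(1.0, rate[i] / ALPHA) # simplified increase of r_i
    best = min(pos + [best], key=objective)     # rank and keep the current best

print("best fitness:", objective(best))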
For every bat b_i, the frequency f_i, velocity v_i and initial position x_i are initialized. The maximum number of time steps is represented by T. The network data are given as input, and the sparse features are processed using the pulse frequency. Using the objective function, the best fitness value is computed optimally. One solution is selected from among the current best solutions, and then for every bat a new solution is generated that satisfies the condition

x_new = x_old + ε Ā(t)   (5)

where Ā(t) represents the average loudness of all bats at time t, and ε ∈ [−1, 1] controls the strength and direction of the random walk. The loudness is updated as

A_i(t + 1) = α A_i(t)   (6)
The emission pulse rate r_i and loudness A_i are updated in every iteration of the algorithm, and the reduced sparse feature set is obtained; the resulting features are then taken into the classification phase. This is used to find and discard the attacker nodes and provides security over the network.
4 Experimental Results

The performance of the existing methods AODV and MATF is used for comparison with the proposed TFB method. The NS-2 simulator is used for conducting the experiments. Network lifetime, packet delivery ratio, energy consumption, throughput and end-to-end delay are used for comparing performance. Table 1 shows the simulation settings.
Table 1  Simulation parameters

Parameter            Value
Number of nodes      100
Area size            1100 × 1100 m
MAC                  802.11
Radio range          250 m
Simulation time      60 s
Packet size          80 bytes
5 Performance Evaluation

1. End-to-end delay. The average time taken by a packet to travel from the source node to the destination node across the network is termed end-to-end delay. The end-to-end delay of the proposed TFB and the existing AODV and MATF is shown in Fig. 1; the number of nodes is plotted on the x-axis and the end-to-end delay on the y-axis. In the experiments, the number of nodes is increased from 20 to 100. The proposed TFB shows a low end-to-end delay, while the existing methods AODV and MATF produce high values of end-to-end delay.

Fig. 1 End-to-end delay comparison

2. Throughput. The speed at which data are effectively sent across the network or communication links defines throughput. It is computed in bits per second (bit/s or bps); alternatively, it is defined as the number of packets processed in a specified time. The throughput of the proposed TFB and the existing AODV and MATF is shown in Fig. 2; nodes are represented on the x-axis and throughput on the y-axis. High throughput is shown by the proposed TFB, while the existing methods AODV and MATF produce lower performance.

Fig. 2 Throughput comparison

3. Energy consumption. This corresponds to the power required for a node to send and receive a packet within the network in a specified time slot. The energy consumption of the proposed TFB and the existing AODV and MATF is shown in Fig. 3; nodes are represented on the x-axis and energy consumption on the y-axis. The proposed work performs better in terms of energy consumption, while the existing methods produce lower performance.

Fig. 3 Energy consumption comparison

4. Network lifetime. A system is considered better when the method provides a longer network lifetime. The network lifetime comparison of the proposed TFB and the existing AODV and MATF is shown in Fig. 4; packet sizes are represented on the x-axis and network lifetime on the y-axis. High performance is shown by the proposed work, while the existing methods produce lower performance. When the packet size is increased, the repeated usage of the same nodes is avoided by the proposed method, which enhances the performance of the network. The network lifetime is therefore enhanced by the proposed method when compared to the existing AODV and MATF methods.

Fig. 4 Network lifetime comparison
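For reference, the delay, throughput and delivery metrics compared above can be computed from per-packet simulation traces. The Python sketch below assumes a simplified, hypothetical trace format (send time, receive time, packet size in bytes) rather than the actual NS-2 trace layout used in the experiments.

# Compute PDR, average end-to-end delay and throughput from a trace.
# The tuple format (send_t, recv_t, size_bytes) is a hypothetical
# simplification; recv_t is None for dropped packets.
def summarize(trace):
    delivered = [(s, r, b) for s, r, b in trace if r is not None]
    pdr = len(delivered) / len(trace)                       # packet delivery ratio
    delay = sum(r - s for s, r, _ in delivered) / len(delivered)
    span = max(r for _, r, _ in delivered) - min(s for s, _, _ in delivered)
    throughput = 8 * sum(b for _, _, b in delivered) / span  # bit/s
    return pdr, delay, throughput

trace = [(0.0, 0.8, 80), (0.5, 1.6, 80), (1.0, None, 80), (1.5, 2.9, 80)]
pdr, delay, thr = summarize(trace)
print(f"PDR={pdr:.2f}, avg delay={delay:.2f}s, throughput={thr:.1f} bit/s")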
6 Conclusion

This work detects attacker nodes by proposing a novel trust-based TFB scheme that provides high security on MANETs. Challenges must be identified before attack patterns launch a packet drop attack; by isolating such challenges at an early stage, this method enhances the quality of service. This work identified the optimum values of important parameters by performing sensitivity analysis. As the experimental results show, an important role is played by the distrust threshold in any trust-based scheme, and the detection rate of the scheme is optimized by setting the value of the distrust threshold conservatively. The experimental results also show that the proposed TFB exhibits a higher network lifetime and throughput and lower energy consumption and end-to-end delay than existing methods like AODV and MATF.
References

1. Popli, R., Garg, K., Batra, S.: Outlier detection using data mining in trust based clustered MANETs. Int. J. Electr. Electron. Comput. Sci. Eng. 5(1), 2454–1222
2. Li, W., Joshi, A.: Outlier detection in ad hoc networks using Dempster-Shafer theory. In: Tenth International Conference on Mobile Data Management: Systems, Services and Middleware, pp. 112–121. IEEE (2009)
3. Liu, Q., Wang, G., Li, F., Yang, S., Wu, J.: Preserving privacy with probabilistic indistinguishability in weighted social networks. IEEE Trans. Parallel Distrib. Syst. 28(5), 1417–1429 (2017). https://doi.org/10.1109/tpds.2016.2615020
4. Jhaveri, R.H., Patel, N.M., Jinwala, D.C.: A composite trust model for secure routing in mobile ad-hoc networks. In: Ortiz, J.H., de la Cruz, A.P. (eds.) Ad Hoc Networks, pp. 19–45. InTech, Rijeka, Croatia (2017)
5. Khan, M.S., Khan, M.I., Malik, S.-U.-R., Khalid, O., Azim, M., Javaid, N.: MATF: a multi-attribute trust framework for MANETs. EURASIP J. Wirel. Commun. Netw. 2016(1), 197 (2016). https://doi.org/10.1186/s13638-016-0691-4
6. Thorat, S.A., Kulkarni, P.J.: Uncertainty analysis framework for trust based routing in MANET. Peer Peer Netw. Appl. 10(4), 1101–1111 (2017)
7. Jawhar, I., et al.: TRAS: a trust-based routing protocol for ad hoc and sensor networks. In: 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS). IEEE (2016)
8. Sethuraman, P., Kannan, N.: Refined trust energy-ad hoc on demand distance vector (ReTE-AODV) routing algorithm for secured routing in MANET. Wirel. Netw. 23(7), 2227–2237 (2017)
9. Joseph, J.F.C., et al.: CRADS: integrated cross layer approach for detecting routing attacks in MANETs. In: 2008 IEEE Wireless Communications and Networking Conference. IEEE (2008)
10. Golbeck, J. (ed.): Computing with Social Trust. Springer Science & Business Media (2008)
11. Luo, J., et al.: Fuzzy trust recommendation based on collaborative filtering for mobile ad-hoc networks. In: 2008 33rd IEEE Conference on Local Computer Networks (LCN). IEEE (2008)
Multipath Routing Using Improved Grey Wolf Optimizer (IGWO)-Based Ad Hoc on-Demand Distance Vector Routing (AODV) Algorithm on MANET Abhay Chaturvedi and Rakesh Kumar
Abstract Mobile ad hoc network (MANET) is a type of wireless network without any infrastructure in which each node has the capability to seek out the best route. In the proposed system, an improved grey wolf optimizer (IGWO)-based AODV protocol is proposed. It is used to optimize the AODV parameters so as to provide energy-efficient nodes along with multiple best routing paths. The proposed method contains three phases: the network model, the AODV routing protocol, and the generation of the objective function in IGWO-based AODV. In the network model, nodes are connected to send and receive the packets. In the second phase, the AODV protocol focuses on calculating the energy utilization through the distance among nodes. In the third phase, IGWO is used to optimize parameters such as energy, hop count, throughput, and delay using the fitness function values. The optimized search criteria of GWO help to decrease the amount of time needed for the search process. The simulation results conclude that the proposed IGWO-AODV algorithm is better than the existing methods in terms of lower energy utilization, excellent throughput, relatively lower end-to-end delay, and better network lifespan.

Keywords Distance vector · Routing · MANET · AODV
A. Chaturvedi
Department of Electronics and Communication Engineering, GLA University, Mathura, UP 281406, India
e-mail: [email protected]

R. Kumar (B)
Department of Computer Engineering and Applications, GLA University, Mathura, UP 281406, India
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_2

1 Introduction

A MANET is an infrastructure-less, dynamic system that encloses a set of wireless mobile nodes which communicate with one another without the use of any kind of centralized authority. Every node existing in a MANET is free to travel
freely in any direction and hence often modifies its links to other nodes [1, 2]. In a MANET, every node has its own radio broadcast range. Two nodes can communicate with each other directly only if they are in the same radio transmission range; if they are not, they can still communicate using intermediate nodes. Every time nodes in the MANET are involved in transmission, the battery energy of those nodes gets reduced. The issue of power utilization in MANETs can be identified at various levels. Recently, several researchers have focused on the optimization of the power efficiency of mobile nodes from diverse viewpoints. The objective of a power-aware routing method is to help decrease the power consumed in packet broadcasts between a source and a destination, and to prevent packets from being routed via nodes with low residual energy. It is used to optimize the flooding of routing data in the given system and to avoid interference and intermediate conflicts. Power-efficient routing is achieved via proper routing approaches that choose a correct route having more energy to transmit information among the nodes, as studied in [3]; this helps balance the amount of traffic carried through each node, but battery power is limited. In [4], data are provided concerning energy-efficient packet routing in a multi-hop wireless scheme, where the mobility feature is taken into account by adopting a suitable model. The goal considered is to reduce the power utilization of packet delivery subject to a packet delay restriction. The heuristic method involves solely the shortest-path calculation and consequently scales better with the size of the network and the online traffic density; however, it has problems with the network lifetime. Routing in these systems is highly complex because of the moving nodes, and therefore several protocols have been developed. The aim of routing is to track the modified topology of a constantly varying network so as to discover an accurate route to a particular node in the MANET. To start communication between mobile nodes, a collection of algorithms must be followed; the protocols used include the link-state routing approach and the distance vector protocol. Although routing and transmission through these protocols in a MANET are restricted by various aspects such as quality of service (QoS), energy utilization, throughput and bandwidth, the system uses proactive protocols to preserve the routing data and node history. Routing information such as node id, address, power, and routing cost is saved in the routing table. Routing tables should be updated along with the changes in the system over time [5].
2 Literature Review

In [6], Kai, A.A. and Jan, H.R. (2011) used an adaptive topology control protocol designed for mobile nodes. This system permits every node to choose whether to participate in energy-efficient routing or to preserve its energy. It can significantly minimize the broadcast energy consumed in the beacon messages sent for mobile nodes. It utilizes an energy-efficient preservation rule to decrease the power consumed in beaconing. It has been verified that any sort of rebuilding and power variation can be exposed within four to five beacon time intervals. An adaptive configuration rule is formulated to organize the parameter for every node depending on the mobility and energy levels of the node. However, it suffers from poor energy utilization.
exposure in four and five beacon time intervals. An adaptive configuration rule is formulated to organize the parameter for every node depending on the mobility and energy levels of the node. However, it has poor energy utilization. In [7], P.S. Karadge, Dr. S.V. Sankpal (2012) suggested maximum energy level ad hoc distance vector (MEL-AODV) protocol that has chosen the power resourceful route through residual energy of every node connected to the path. The source that wants to transmit information packages to the intended destination node initiates the process of route request. The route with the most energy is chosen as the best route, and then, the source transmits the data packets along that path. In [8], Istikmal (2013) uses the optimized method in DSR routing algorithm. It employs ant algorithm which is utilized to estimate, and the estimation is completed via few statistical equations. This work examined and computed the performance achieved of the routing algorithm in diverse circumstances and also performed the output along with the DSR routing approach. In accordance with the comparison analysis, it can be noted that the DSR-ant exhibits 48% lesser delay, 1.37 times lesser hop count, and 3.6 times greater throughput. The important disadvantage of this approach was the higher routing overhead.
3 Proposed Methodology

3.1 Network Model

The proposed system considers a network environment that is structured with no central authority. The nodes placed in the network can send and receive packets via multiple indirect hops. Assume a MANET with N identical mobile nodes, n neighbour nodes, a source node, a destination node, and links and routing paths with particular distances. A packet needs to be sent from a source node to another destination node; intermediary nodes can then be employed in the form of relay nodes. The aim is to discover the allocation of the packet delay, hop count, lifetime, PSNR, throughput, and the distribution of the energy during the time when the message gets sent to the destination node. The routing algorithm determines which nodes must be chosen for a certain communication. Therefore, routing algorithms play a significant role in preserving energy in a communication system and improving the lifespan of the nodes and also of the entire network. One plausible way to combine these metrics into a single route score is sketched below.
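The following is a minimal Python sketch of such a combined route fitness. The weights, normalizing constants, and the specific functional form are illustrative assumptions, not the paper's exact objective function.

# Illustrative route fitness combining the metrics named above.
# Weights and normalizing constants are assumptions for the sketch;
# a higher score means a more desirable route.
def route_fitness(energy_j, delay_s, hop_count, throughput_bps,
                  w=(0.4, 0.2, 0.1, 0.3)):
    score = (
        w[0] * (1.0 / (1.0 + energy_j))          # prefer low energy use
        + w[1] * (1.0 / (1.0 + delay_s))         # prefer low delay
        + w[2] * (1.0 / (1.0 + hop_count))       # prefer short routes
        + w[3] * min(throughput_bps / 1e6, 1.0)  # prefer high throughput
    )
    return score

# Compare two candidate routes from source to destination.
print(route_fitness(2.0, 0.05, 4, 6e5))
print(route_fitness(3.5, 0.02, 6, 9e5))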
3.2 Mobility Model

The mobility model [9] denotes the motion of the MANET nodes in the system, depending on their position, speed, and pace. This scheme determines the
implementation of the scheme in the network for data communication. Consider n1 and n2 to be two MANET nodes located at (a1, b1) and (a2, b2), respectively. At a certain point of time l = 1, both of the nodes travel to new positions (a1′, b1′) and (a2′, b2′) such that the association of the nodes stays within a particular place. The Euclidean distance between the MANET nodes is given as

d(0) = √(|a1 − a2|² + |b1 − b2|²)   (1)

The distance between the MANET nodes at any point of time l, in the new positions, is calculated as

d(l) = √(|a1′ − a2′|² + |b1′ − b2′|²)   (2)

where (a1′, b1′) and (a2′, b2′) are the new locations reached by the nodes n1 and n2, respectively.

The grey wolf optimizer (GWO) is a meta-heuristic algorithm [10] that mimics the hunting structure of grey wolves. In the GWO algorithm, grey wolves are clustered into alpha (α), beta (β), delta (δ), and omega (ω) according to the hierarchy they occupy in the social sphere. The alpha wolves are the most powerful ones, and every other grey wolf follows their commands. The second category of wolves, the betas, is responsible for supporting the alphas in their decision-making. The omegas are the lowest-ranked grey wolves.
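Returning to the mobility model, Eqs. (1) and (2) can be computed directly; the Python sketch below tracks the distance between two moving nodes using illustrative coordinates.

import math

# Euclidean distance between two MANET nodes, per Eqs. (1) and (2).
def distance(p, q):
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

n1, n2 = (0.0, 0.0), (30.0, 40.0)     # positions at time l = 0
print(distance(n1, n2))                # d(0) = 50.0

n1, n2 = (5.0, 2.0), (28.0, 38.0)      # new positions at time l
print(round(distance(n1, n2), 2))      # d(l) ~= 42.72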
3.3 AODV Protocol

AODV is quite a simple, effective, and efficient routing protocol designed for mobile ad hoc networks with no static topology. The inspiration behind this algorithm is the low amount of bandwidth available in the communication medium used for wireless communications. On-demand route discovery, route maintenance, hop-by-hop routing, and the application of node sequence numbers help the algorithm to deal with the topology and routing information. Obtaining routes just on demand makes AODV an efficient and popular algorithm for MANETs. Every mobile host present in the network functions like an expert router, and routes are found as required, thereby giving the network a self-starting aspect. Every node in the network has a routing table that maintains routing information entries pertaining to its neighbouring nodes, and two individual counters: a node sequence number and a broadcast id. If a node (say, source node 'S') needs to communicate with another (say, destination node 'D'), it increments its broadcast id and then begins the process of path discovery by multicasting a route request packet RREQ to its neighbours, as illustrated in the sketch below.
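The route discovery step can be illustrated with a small flooding simulation in Python. The sketch below is a simplified model of RREQ propagation over an adjacency graph, not the full AODV protocol (no sequence-number handling, timers, or RREP unicast); the node names and topology are hypothetical.

from collections import deque

# Simplified RREQ flooding: breadth-first search over the neighbour
# graph, recording reverse paths the way intermediate AODV nodes do.
# Sequence numbers, timers and the RREP phase are omitted.
def discover_route(graph, src, dst):
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:                       # destination reached
            path, cur = [], dst
            while cur is not None:            # walk the reverse pointers
                path.append(cur)
                cur = parent[cur]
            return list(reversed(path))
        for nbr in graph.get(node, []):
            if nbr not in parent:             # each node rebroadcasts once
                parent[nbr] = node
                queue.append(nbr)
    return None                               # no route found

topology = {"S": ["A", "B"], "A": ["C"], "B": ["C", "D"], "C": ["D"], "D": []}
print(discover_route(topology, "S", "D"))     # -> ['S', 'B', 'D']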
3.4 IGWO with AODV Routing Protocol in MANET

The GWO algorithm is a swarm intelligence scheme that replicates the social hierarchy and hunting pattern of grey wolves and offers an optimization solution for the multi-objective problem. To model the social hierarchy mathematically, the solution with the best fitness is labelled alpha; the second- and third-best solutions are labelled beta and delta; the remaining solutions are treated as omega. IGWO is used to obtain a set of solutions showing the best compromises between the objective functions. IGWO inherits the encircling behaviour of GWO: a circular neighbourhood surrounds each solution, which can be extended to higher dimensions, and the random parameters $\vec{D}$ and $\vec{B}$ let candidate solutions occupy hyper-spheres of diverse random radii. It also inherits the hunting procedure of GWO, in which the search agents estimate the probable location of the prey. Exploration and exploitation are balanced through the adaptive values of $\vec{d}$ and $\vec{D}$, which let IGWO transition efficiently between the two, so its convergence is guaranteed: as $\vec{D}$ decreases, part of the iterations is devoted to exploration ($|\vec{D}| \geq 1$) and the rest to exploitation ($|\vec{D}| < 1$). Only two key parameters ($\vec{d}$ and $\vec{B}$) must be adjusted. This mechanism, together with the leader-selection step, preserves the diversity of the archive during optimization, so the method yields more optimal solutions for the given network. In this algorithm, hunting is guided by alpha, beta, and delta, while the omega wolves are responsible for encircling the prey to reveal improved solutions. The chase is led by the alpha; the beta and delta occasionally take part. In other words, alpha, beta, and delta estimate the position of the prey, and the other wolves update their own positions randomly around it. The grey wolves end the hunt by attacking the prey once it stops moving; to model this approach towards the prey, the value of $\vec{d}$ is reduced over the iterations. The mathematical model of the hunting behaviour is as follows:

$\vec{O} = |\vec{B} \cdot \vec{Y}_k(m) - \vec{Y}(m)|$  (3)

$\vec{Y}(m+1) = \vec{Y}_k(m) - \vec{D} \cdot \vec{O}$  (4)
where $\vec{Y}_k$ stands for the position of the prey and $\vec{Y}(m)$ for the position of the grey wolf at the mth iteration. $\vec{D}$ and $\vec{B}$ are coefficient vectors computed with Eqs. (5) and (6), respectively:

$\vec{D} = 2\vec{d} \cdot \vec{r}_1 - \vec{d}$  (5)

$\vec{B} = 2 \cdot \vec{r}_2$  (6)
where $\vec{d}$ is a coefficient vector decremented linearly from 2 to 0 as the iterations proceed, and $\vec{r}_1$, $\vec{r}_2$ are random vectors in the range [0, 1].

$\vec{O}_\alpha = |\vec{C}_1 \cdot \vec{Y}_\alpha(m) - \vec{Y}|$  (7)

$\vec{O}_\beta = |\vec{C}_2 \cdot \vec{Y}_\beta(m) - \vec{Y}|$  (8)

$\vec{O}_\delta = |\vec{C}_3 \cdot \vec{Y}_\delta(m) - \vec{Y}|$  (9)

$\vec{Y}_1 = \vec{Y}_\alpha - \vec{D}_1 \cdot \vec{O}_\alpha$  (10)

$\vec{Y}_2 = \vec{Y}_\beta - \vec{D}_2 \cdot \vec{O}_\beta$  (11)

$\vec{Y}_3 = \vec{Y}_\delta - \vec{D}_3 \cdot \vec{O}_\delta$  (12)

$\vec{Y}(m+1) = \dfrac{\vec{Y}_1 + \vec{Y}_2 + \vec{Y}_3}{3}$  (13)
Equations (7)–(9) give the estimated distance between the current position and alpha, beta, and delta, respectively. Once these distances are computed, the final location of the ω wolves is obtained from Eqs. (10)–(13). The procedure is used to optimize the network parameters and obtain the shortest path: the best fitness values are chosen and the throughput is increased. However, plain GWO has issues with optimal multipath route selection, and the delay and hop-count metrics over a MANET still need improvement. In this research, an IGWO-based AODV algorithm is therefore introduced to improve energy consumption, delay, network lifetime, and hop count over the MANET. The proposed approach avoids the link-failure problem and prevents the message from being rebroadcast from the source node. During route evaluation in the AODV algorithm, IGWO is applied to select the multipath route through nodes with better energy consumption: it favours nodes with lower energy consumption, lower delay, higher network lifetime, and higher throughput. In some cases, the nodes on a candidate routing path cannot satisfy all of these criteria at once; in such cases the lower-energy nodes take priority and the other metrics are treated as optional by IGWO. This policy yields a significant improvement in node energy. The AODV routing protocol is used to improve power utilization, while IGWO improves the network lifespan through multipath routing. The overall block diagram is shown in Fig. 1.
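To make the update rule concrete, the sketch below runs one GWO/IGWO-style iteration over Eqs. (3)–(13). It is a minimal illustration under assumed conditions: the toy two-dimensional fitness function, the population size, and the iteration count are invented for demonstration and are not the authors' NS-2 implementation.

```python
import random

def gwo_step(wolves, fitness, d):
    """One GWO iteration: rank alpha/beta/delta, then move every wolf
    towards their estimated prey position (Eqs. (7)-(13))."""
    ranked = sorted(wolves, key=fitness)
    alpha, beta, delta = ranked[0], ranked[1], ranked[2]
    new_wolves = []
    for y in wolves:
        pos = []
        for j in range(len(y)):
            cand = []
            for leader in (alpha, beta, delta):
                r1, r2 = random.random(), random.random()
                D = 2 * d * r1 - d                 # Eq. (5)
                C = 2 * r2                         # Eq. (6), written B in Eq. (3)
                O = abs(C * leader[j] - y[j])      # Eqs. (7)-(9)
                cand.append(leader[j] - D * O)     # Eqs. (10)-(12)
            pos.append(sum(cand) / 3.0)            # Eq. (13)
        new_wolves.append(pos)
    return new_wolves

# Toy run: minimise the squared distance to the origin in 2-D.
wolves = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(8)]
f = lambda y: y[0] ** 2 + y[1] ** 2
for it in range(30):
    d = 2 - 2 * it / 30                 # d decays linearly from 2 to 0
    wolves = gwo_step(wolves, f, d)
print(min(wolves, key=f))               # converges near [0, 0]
```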
Fig. 1 Overall block diagram of the proposed system: packets are generated from source to destination over the deployed nodes; the IGWO-based AODV algorithm runs route discovery and route maintenance for energy efficiency, computes the hop count, applies IGWO for multipath routing (threshold and objective computation), tunes the AODV parameters, and selects the best node
In a MANET, all nodes must cooperate efficiently by sharing information about the quality of their links and partial routes. The IGWO-based AODV uses the GWO algorithm to search for multipath routes in an on-demand manner, employing the metrics above to select the best route; the AODV routing information is used by IGWO to locate the probable optimal paths.
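A possible shape for the route-scoring step is sketched below. The paper does not give its exact fitness function, so the weighted objective, the weights, and the field names are hypothetical; the only property taken from the text is that lower energy and delay and higher lifetime and throughput are preferred, with energy carrying the most weight.

```python
def route_fitness(route, w=(0.4, 0.2, 0.2, 0.2)):
    """Score a candidate path: lower energy use and delay, higher lifetime
    and throughput are better (weights are illustrative assumptions)."""
    # Energy gets the largest weight, matching the text's priority on
    # low-energy nodes when not all metrics can be satisfied at once.
    return (w[0] * route["energy"] + w[1] * route["delay"]
            - w[2] * route["lifetime"] - w[3] * route["throughput"])

candidates = [
    {"path": ["S", "A", "D"], "energy": 2.1, "delay": 14.0,
     "lifetime": 90.0, "throughput": 310.0},
    {"path": ["S", "B", "C", "D"], "energy": 1.6, "delay": 17.0,
     "lifetime": 120.0, "throughput": 280.0},
]
best = min(candidates, key=route_fitness)   # lowest score = best compromise
print(best["path"])
```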
4 Simulation Settings

In this section, the performance of the newly introduced IGWO-AODV method is evaluated and compared with existing techniques, DSR and ABC. The experiments are carried out with the NS-2 simulator. The comparison between the earlier techniques and the proposed one is done in terms of end-to-end delay, throughput, energy consumption, packet delivery ratio, and network lifespan. Table 1 provides the simulation settings.

Table 1 Simulation parameters
No. of nodes: 100
Area size: 1100 × 1100 m
MAC: 802.11
Radio range: 250 m
Simulation time: 60 s
Packet size: 80 bytes
4.1 Performance Evaluation

1. End-to-end delay. The average time taken by a packet to travel from a source node to the destination node across the network is called the end-to-end delay. Figure 2 compares the end-to-end delay of the newly introduced IGWO-AODV against the existing ABC and DSR techniques. The number of nodes is plotted along the x-axis and the end-to-end delay (ms) along the y-axis, with the number of nodes varying between 20 and 100. The graph shows that the proposed IGWO-AODV algorithm attains a much lower end-to-end delay than the existing ABC and DSR algorithms.

Fig. 2 End-to-end delay comparison

2. Throughput. Throughput is the rate at which data packets are successfully transmitted over the network or communication links. It is measured in bits per second (bit/s or bps), or in units of information processed over a given time period. Figure 3 compares the throughput achieved by the proposed IGWO-AODV and the existing ABC and DSR techniques. The number of nodes is plotted along the x-axis and the throughput along the y-axis. The graph shows that the proposed IGWO-AODV algorithm yields much better throughput than the available ABC and DSR methods.

Fig. 3 Throughput comparison

3. Energy consumption. Energy consumption is the average energy a node needs for transmitting, receiving, or forwarding a packet during a given time span. Figure 4 compares the energy usage of the proposed IGWO-AODV and the existing ABC and DSR techniques. The number of nodes is plotted along the x-axis and the energy consumption along the y-axis. The graph shows that the proposed IGWO-AODV scheme uses energy much more efficiently than the available methods.

Fig. 4 Energy consumption comparison

4. Network lifetime. A system is considered better when it provides a higher network lifetime. Figure 5 depicts the network lifespan for a range of packet sizes, with the packet size along the x-axis and the network lifespan along the y-axis. The existing systems show less efficient network performance, while the newly introduced system performs much better. The proposed system also maximizes the lifespan of the network by preventing redundant utilization of nodes as the packet size increases. The IGWO-AODV algorithm thus yields a higher network lifespan than the other available ABC and DSR approaches.

Fig. 5 Network lifetime
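For reference, the two headline metrics above can be computed from per-packet records of the kind an NS-2 trace provides. The snippet below is a sketch under that assumption; the record format (send time, receive time, payload size) and the sample values are illustrative, not data from the paper.

```python
def summarise(records, duration_s):
    """Average end-to-end delay (ms) and throughput (bps) from
    (send_time_s, recv_time_s, size_bytes) tuples of delivered packets."""
    delays_ms = [(recv - send) * 1000.0 for send, recv, _ in records]
    delivered_bits = sum(size * 8 for _, _, size in records)
    return sum(delays_ms) / len(delays_ms), delivered_bits / duration_s

# Hypothetical records for three delivered 80-byte packets.
records = [(0.10, 0.13, 80), (0.20, 0.26, 80), (0.40, 0.44, 80)]
avg_delay_ms, throughput_bps = summarise(records, duration_s=60.0)
print(round(avg_delay_ms, 1), round(throughput_bps, 1))   # 43.3 ms, 32.0 bps
```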
5 Conclusion

In this work, the proposed IGWO-AODV algorithm is used to find the best multipath routing nodes over MANETs. To determine the optimal paths among the transmitting nodes, the AODV algorithm is optimized by the IGWO algorithm. As the number of nodes in the network rises, energy efficiency improves under this protocol. The method imitates the social hierarchy and hunting behaviour of grey wolves, and the IGWO algorithm is focused on finding the optimal values for increasing the network lifetime. The results show that the proposed IGWO-AODV algorithm provides higher throughput and network lifespan, and lower end-to-end delay and energy usage, compared with the earlier DSR and ABC approaches.
References

1. El Defrawy, K., Tsudik, G.: Privacy-preserving location-based on-demand routing in MANETs. IEEE J. Sel. Areas Commun. 29(10), 1926–1934 (2011)
2. Mayadunna, H., et al.: Improving trusted routing by identifying malicious nodes in a MANET using reinforcement learning. In: 2017 Seventeenth International Conference on Advances in ICT for Emerging Regions (ICTer). IEEE (2017)
3. Chang, J.-H., Tassiulas, L.: Energy conserving routing in wireless ad-hoc networks. In: Proceedings of IEEE INFOCOM, Tel-Aviv, Israel (2000)
4. Zhang, J., Zhang, Q., Li, B., Luo, X., Zhu, W.: Energy-efficient routing in mobile ad hoc networks: mobility-assisted case. IEEE Trans. Veh. Technol. 55(1), 369–379 (2006)
5. Helen, D., Arivazhagan, D.: Application, disadvantages, challenges of adhoc network. J. Acad. Ind. Res. 2, 2278–5213 (2014) 6. Kai, A.A., Jan, H.R.: Adaptive topology control for mobile ad hoc networks. IEEE Trans. Parallel Distrib. Syst. 22(12), 1960–1963 (2011) 7. Karadge, P.S., Sankpal, S.V.: A performance comparison of energy efficient AODV protocols in mobile ad hoc networks. Int. J. Adv. Res. Comput. Commun. Eng. 2(1) (2012) 8. Istikmal: Analysis and evaluation optimization dynamic source routing (DSR) protocol in mobile adhoc network based on ant algorithm. In: 2013 International Conference of Information and Communication Technology (ICoICT), IEEE, pp. 400–404 (2013) 9. Yadav, A.K., Tripathi, S.: QMRPRNS: design of QoS multicast routing protocol using reliable node selection scheme for MANETs. Peer Peer Netw. Appl. 10, 1–13 (2016) 10. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61 (2014)
Comparative Analysis of Consensus Algorithms and Issues in Integration of Blockchain with IoT Ashok Kumar Yadav and Karan Singh
Abstract In today's era of big data and machine learning, IoT plays a crucial role in areas such as the social, economic, political, educational, and healthcare domains, resulting in a drastic increase in data. This abundance of data creates security, privacy, and trust issues in the internet era. The responsibility of IT is to ensure privacy and security for the huge volume of incoming information and data produced by the drastic evolution of the IoT in the coming years. Blockchain has emerged as one of the major technologies with the potential to transform the way this huge information is shared while maintaining trust. Building trust in a distributed and decentralized environment without calling on a trusted third party is a technological challenge for researchers. With the emergence of IoT, huge amounts of critical information have become available over the internet; trust in this information has fallen drastically, and security and privacy concerns grow day by day. Blockchain is one of the best emerging technologies for ensuring privacy and security through cryptographic algorithms and hashing. We discuss the basics of blockchain technology, consensus algorithms, a comparison of the important consensus algorithms, the integration of blockchain with IoT, integration issues, and areas of application.

Keywords Consensus algorithm · Merkle tree · Hashing · DLT · Hash cash · IoT · Integration of IoT with blockchain
1 Introduction

Recent advancements in wireless communication, computing power, the internet, big data, and cloud computing have greatly increased the amount of data. This drastic increase creates many problems concerning security, privacy, trust, and authentication. The responsibility of IT is
to ensure privacy and security for the huge volume of incoming information and data resulting from the drastic evolution of the IoT in the coming years. Blockchain has emerged as one of the major technologies with the potential to transform the way huge amounts of information are shared and trusted. Building trust in a distributed, decentralized environment without a trusted third party is a technological advancement that can change the upcoming scenarios of society, industries, and organizations. In today's era of big data and machine learning, IoT plays a crucial role in nearly all areas: social, economic, political, education, health care, and so on. Disruptive technologies such as big data and cloud computing have benefited from IoT. With the emergence of IoT, huge amounts of critical information are available over the internet; trust in that information has dropped drastically, increasing security and privacy concerns. Blockchain is one of the best emerging technologies for ensuring privacy and security using cryptographic algorithms.

Blockchain technology grew out of the concept of time-stamping a digital document, published by Stuart Haber and W. Scott Stornetta in 1991; time-stamping is used to maintain the dignity and integrity of a digital document held by a particular node [1]. The cryptocurrency Bitcoin, implemented by Satoshi Nakamoto in 2009, brought the technology great fame [2]. Blockchain technology is an effective application of existing techniques such as decentralization, Hashcash, public ledgers, consensus, Merkle trees, public-key encryption, and hashing algorithms. Decentralization can be considered the most important perspective of blockchain technology: blockchain is a platform where various peers with the same authority participate in generating blocks and cooperate. Every connected peer has the same authority to make changes to the public ledger where applicable. A network failure during the execution of a transaction does not affect the transaction much, because every peer maintains its own copy of the network state. The public ledger is the documentation of every successful transaction, available to and shared among all peers in the peer-to-peer network (Fig. 1).

The structure and size of a block are implementation dependent. The maximum number of transactions a block can contain depends on the block size and the size of each transaction. Blockchain cannot guarantee transaction privacy, since the values of all transactions and the balance of each public key are publicly visible. A block has a block header and a block body. The block header contains the block version, Merkle tree root hash, timestamp, n-bits target threshold of a valid block hash, nonce, and parent block hash; the block body contains the transaction counter and the transactions. A blockchain is a chain of such blocks, and the longer a chain of blocks, the higher its priority for receiving a new block, which provides privacy and security. The blocks chained together form a secure, decentralized, permissionless system in which many users can participate without revealing their identities [3]. Malicious activity in the transaction phase is handled through mining: miners decide the block size and transaction probability, whether a block will be added to the blockchain, and, if so, which chain will receive it.
Finally, after this investigation, the new block is added to the longest chain. Changing even one bit of the nonce affects the hashes of all successor blocks, which makes the real hash value very difficult to forge. Miners receive incentives to remain honest about block size and transactions. A change in a transaction propagates a replica of that block to every peer connected with that transaction, and variations in transactions can be made by miners; this raises an honesty problem and can defeat the design of a secure and private system. A miner with a high-powered computing machine earns more incentive, but such machines consume a huge amount of electricity, which is a major concern for miners. The solution is an effective consensus algorithm. Several consensus algorithms have been proposed; we investigate the most important ones and present a comparative analysis [4].

Fig. 1 Components of blockchain technology (Merkle tree, public-key encryption, hashing algorithm, decentralized technique, distributed ledger, consensus protocol) and what they provide (data decentralization, transparency, security and privacy, a tamper-proof replicated ledger, an immutable ledger, automation and smart contracts, a new way of storage)
2 Background and Types of Consensus Protocol

The concept of Hashcash was suggested by Adam Back in 1997 [5]. Hashcash is a mining scheme used as the proof-of-work consensus algorithm (used in permissionless blockchains such as Bitcoin). It was originally used to throttle email and protect systems from denial-of-service attacks, and brute force is the only way to compute it. The consensus algorithm is the heart of blockchain technology and is considered the pillar of the blockchain network. Many consensus algorithms have been suggested to keep the system safe from malicious activity in blockchain technology: PoW (Proof-of-Work), PoS (Proof-of-Stake), DPoS (Delegated Proof-of-Stake), and PBFT (Practical Byzantine Fault Tolerance) are some of them. Consensus ensures that a logical decision is reached, so that every peer agrees on whether a transaction should be committed to the database or not. Blockchain uses hash functions, Merkle trees, and nonces (to make the hash function harder to retrace) to provide data decentralization, transparency, security and privacy, a tamper-proof replicated ledger, an immutable ledger, non-repudiation, irreversibility of records, automation and smart contracts, and a new way of storing data [6, 7].
2.1 Proof-of-Work (PoW) Consensus Algorithm

Proof of work was invented by Dwork and Naor in 1993 and formalized by Markus Jakobsson and Ari Juels in 1999. It provides an economic measure to prevent denial-of-service attacks, which keep legitimate users from using a service. It is asymmetric: hard on the requester side but easy to check for the service provider. Proof of work prevents fraudulent nodes from getting a grip on true nodes, and the concept is used beyond blockchain. The idea is to pose a challenge to the user, who must produce a solution demonstrating work done against that challenge; once the solution is validated, the user is accepted. This eliminates slow entities that are not capable of generating the proof. In blockchain, PoW is used to generate a value that is difficult to produce but easy to verify. To generate a block hash with n leading zeros, a nonce is required, and the brute-force method is applied to find it: the combination of the nonce and the generated block data, including the hash value of the previous block, must yield the required leading zeros. The larger n is, the higher the complexity. In the blockchain context, PoW means the computation required is exponential in the number of leading zeros required; since the blocks are chained, redoing one block requires the entire chain to be redone. It also proves that some amount of computation and effort has been invested in finding the solution [4]. Since many miners work in the same permissionless network, it is difficult to know in advance which miner will commit the block and verify the committed transactions. In PoW, miners take around 10 min to gather the committed transactions and generate a new block for them. The metadata of a block contains the previous block hash, the current block hash, the Merkle tree root, and the nonce. Miners are awarded incentives in the form of cryptocurrency when they generate a new block [8] (Fig. 2).

PoW also makes it difficult for a miner to propose a new block (i.e., to find a nonce consistent with the previous block hash), and a miner must show the work done before proposing a new block to the other nodes. Time-stamping is also part of the block, so that a peer cannot later disagree about when its transaction was made. The major effort for the miners is obtaining the required number of leading zeros in the generated hash code. In PoW, a node with a high-powered machine will commit more transactions and generate more new blocks, which leads to higher incentives for nodes with more powerful machines [9].

Fig. 2 Flowchart of the PoW algorithm: transactions are grouped into a block to be verified; the network poses a mathematical puzzle; miners compete to solve it; the first to find a solution sends the proof of work to the network; all other nodes verify the solution; a valid block is added to the existing chain, the information is updated, and the miner is rewarded, otherwise the block is cancelled
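The "find a nonce with n leading zeros" loop is small enough to show in full. This sketch uses leading zero hex digits as a teaching simplification of the leading-zero target described above; the block data string is a placeholder, not a real block header.

```python
import hashlib

def mine(block_data: str, leading_zeros: int):
    """Brute-force a nonce so that SHA-256(data + nonce) starts with
    n leading zero hex digits; checking the result needs just one hash."""
    target = "0" * leading_zeros
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1

nonce, digest = mine("prev_hash|merkle_root|txs", leading_zeros=4)
print(nonce, digest[:12])
# Difficulty grows exponentially with n: each extra zero multiplies the
# expected work by 16, while verification remains a single hash.
```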
2.2 Proof-of-Stake (PoS) Consensus Algorithm

In proof of work, miners must solve a complex cryptographic hash puzzle and compete to be the first to find the nonce; the first miner to solve the puzzle gets the reward. Mining under proof of work therefore requires a lot of computing power and resources (Fig. 3).
Fig. 3 Flowchart of the PoS algorithm: a validator stakes some amount of money and is chosen to create a block; if the block is valid, it is added to the blockchain and the validator receives the reward and the staked amount; otherwise the reward and staked amount are withheld
All the energy goes into solving the puzzle. The higher the computational power, the higher the hash rate, and thus the higher the chance of mining the next block. This leads to the formation of mining pools, where miners come together, share their computational power to solve the puzzle, and split the reward among themselves. Proof of work consumes a huge amount of electricity and encourages mining pools, which push the blockchain towards centralization [10]. To solve these issues, a new consensus algorithm called proof of stake was proposed. A validator is chosen randomly to validate the next block; to become a validator, a node deposits a certain amount of coins in the network as a stake, a process called staking, minting, or forging. The chance of becoming the validator is proportional to the stake: the bigger the stake, the higher the chance of validating the block, so the algorithm favours rich stakeholders. When a validator approves an invalid block, he or she loses part of the stake; when a validator approves a valid block, he or she earns the transaction fees and the stake is returned. The stake should therefore be higher than the total transaction fees, so that adding a fraudulent block never pays: a fraudulent validator loses more coins than he or she gains. If a node no longer wishes to be a validator, its stake and transaction fees are released only after a certain period, not immediately, since the network must still be able to punish the node if it was involved in a fraudulent block. Proof of stake does not demand huge amounts of electrical power, is more decentralized, and is more environmentally friendly than proof of work. A 51% attack is also less likely, as the attacker would need at least 51% of all the coins, which is a very large amount. Proof of stake thus operates more protectively and consumes less power per transaction. A person holding more cryptocurrency has a higher probability of mining the new block, which again raises the problem of dominance when someone holds 50% or more; the usual remedy is a randomization protocol in which random nodes are selected to mine the new block. Some systems also start with PoW and later move to PoS for better and smoother usage. In PoW, a miner can mine only one branch, and choosing the wrong fork is costly: if another branch ends up being the longest chain, the resources spent mining the abandoned block are wasted. In PoS, validators can forge on multiple forks, and choosing a wrong fork costs nothing, since no expensive resources are spent; every validator can work on multiple branches. A fraudulent validator can then double-spend: a node can include a fraudulent block in one branch, wait for the service to confirm it, and then spend the same money again by including the block in the other branch [6]. A malicious validator can approve a "bad" block in one fork and a "good" block in the other; if the same validator gets the chance to validate again, he or she might extend the "bad" branch until it is longer than the "good" one, so that other validators also start working on the longest chain, which contains a fraudulent block. In PoS, validators must already own coins for the stake, which raises the question of how validators could acquire money when PoS was at its initial stage: proof of stake needs the coins to be distributed initially, as coins are needed for forging. In PoS, an attacker can also go back to previous blocks and rewrite history by buying old private keys from former validators who have lost interest in forging. A node staking a larger amount of money than the other nodes has a higher chance of becoming the validator.
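Stake-proportional selection and stake slashing are both simple to express. The sketch below is illustrative: the stake amounts, fee, and penalty values are invented, and the weighted draw merely demonstrates that the chance of selection is proportional to the stake.

```python
import random

def pick_validator(stakes: dict) -> str:
    """Choose the next block validator with probability proportional
    to the coins each node has locked up as stake."""
    nodes = list(stakes)
    return random.choices(nodes, weights=[stakes[n] for n in nodes], k=1)[0]

def settle(stakes, validator, block_valid, fee=1.0, penalty=10.0):
    """Reward an honest validator; slash part of a fraud validator's stake."""
    stakes[validator] += fee if block_valid else -penalty

stakes = {"A": 50.0, "B": 30.0, "C": 20.0}
wins = {n: 0 for n in stakes}
for _ in range(10_000):
    wins[pick_validator(stakes)] += 1
print(wins)            # roughly 5000 / 3000 / 2000: the bigger stake wins more often
settle(stakes, "A", block_valid=True)
print(stakes["A"])     # 51.0 after one honest validation
```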
2.3 Proof-of-Activity (PoA) Consensus Algorithm

PoA can be considered a combination of proof of stake (PoS) and proof of work (PoW), and it modifies the weaknesses of PoS. When a miner wants to commit the transactions of a newly mined block to the database, a majority of the nodes must sign the block for validation [11] (Fig. 4).
2.4 Proof-of-Elapsed-Time (PoET) Consensus Algorithm

PoET prescribes common steps for selecting the miner of the next block. Every miner that mined a prior block had to wait a random time quantum to do so. A miner proposing a new block must likewise wait for a random amount of time, and it is easy to determine whether the miner has actually waited, because the waiting is performed through a special CPU instruction set (Fig. 5).
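The election itself reduces to "shortest random wait wins", as the sketch below shows. In real PoET the wait time is drawn and attested inside a trusted execution environment so that it is verifiable; plain pseudo-randomness stands in for that here, which is the key simplification of this illustration.

```python
import random

def poet_round(nodes):
    """Every node draws a random wait time (in real PoET, from the trusted
    CPU environment, verifiably); the shortest wait mines the block."""
    waits = {n: random.expovariate(1.0) for n in nodes}
    winner = min(waits, key=waits.get)
    return winner, waits[winner]

winner, wait = poet_round(["n1", "n2", "n3", "n4"])
print(winner, round(wait, 3))
```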
2.5 Practical Byzantine Fault Tolerance (PBFT)

The PBFT algorithm addresses the case where one or more nodes in a network become faulty and behave maliciously, disrupting communication among all nodes connected to that network. Such problems cause delay, and time is a serious concern in an asynchronous system, where consensus becomes impossible to solve if even one fault occurs; faults also produce discrepancies in the responses of the various nodes. PBFT works in the permissioned model. In PBFT, state-machine replication occurs at multiple nodes, and the client waits for f + 1 matching responses, where f is the maximum number of faulty nodes, since fewer replies cannot establish a majority for the client. PBFT applies to asynchronous systems [12]. PBFT came after PAXOS and RAFT, which both tolerate crash faults in up to roughly half of the nodes (about n/2 − 1 of n) [7, 13], whereas PBFT requires 3f + 1 replicas in total to tolerate f faulty nodes. Since we are discussing state-machine replication, it is important to understand it (Fig. 6).
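The quorum arithmetic the text cites is worth writing out; the function below simply tabulates the standard PBFT sizes as a sketch, with the prepare-phase quorum included as an assumption from the standard protocol description rather than from this paper.

```python
def pbft_sizes(f: int) -> dict:
    """Replicas needed to tolerate f Byzantine nodes, and the number of
    matching replies a client must collect before accepting a result."""
    return {"replicas": 3 * f + 1,        # total nodes required
            "client_quorum": f + 1,       # matching replies the client waits for
            "prepare_quorum": 2 * f + 1}  # standard prepare/commit quorum (assumption)

for f in (1, 2, 3):
    print(f, pbft_sizes(f))   # e.g. f=1 -> 4 replicas, 2 matching replies
```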
Fig. 4 Flowchart of the PoA algorithm: miners group transactions into a block to be verified; the network poses a mathematical puzzle; miners compete to solve it; the first solution is sent to the network; validators who have staked money are randomly chosen and assigned the mined block to verify; a valid block is added to the blockchain and both miners and validators receive the reward
2.6 Delegated Proof-of-Stake (DPoS) Consensus Algorithm

Delegated proof of stake is similar to the PoS algorithm, but it operates the blockchain network in a more decentralized fashion and changes how energy is used, often consuming far less to perform the same work.
Fig. 5 Flowchart of the PoET algorithm: the network assigns each node a waiting period; the node with the shortest waiting period mines the block and sends it to the network; all nodes verify the block; a valid block is added to the blockchain and the miner is rewarded
Fig. 6 Flowchart of the Practical Byzantine Fault Tolerance (PBFT) algorithm: nodes are sequentially ordered; the client sends a request to the leader node (selected in a round-robin manner), which invokes a service operation and multicasts the request to the replicas; the replicas execute the request and reply to the client; the client waits for f + 1 matching results from different nodes (f is the maximum number of potentially faulty nodes), and this result is the result of the requested service operation
In delegated proof of stake, stakeholders are given the chance to vote for whom they want to mine the next block. Cryptocurrency holders thus have the opportunity to select the miner of the next block: the stakeholders choose delegates responsible for mining new blocks, and witnesses are also elected by the currency holders to perform the actual work, such as searching for the nonce and validating blocks. The delegates decide how much incentive the witnesses receive and set factors such as block size and power. The final decision on what the delegates propose rests with the stakeholders.
Table 1 Comparison of permissioned network consensus algorithms

- BFT: closed network; used for business working based on smart contracts; state-machine replication is used; good transactional throughput; tolerates Byzantine faults; based on traditional notions
- RBFT: closed network; used for business working based on smart contracts; the proper authorities handle the work; good transactional throughput; tolerates Byzantine faults (e.g. the Hyperledger body); based on traditional notions
- PBFT: synchronous; smart-contract dependent; state-machine replication is used; greater transactional throughput; tolerates Byzantine faults; can bear f − 1 tolerance
- PAXOS: synchronous; smart-contract dependent; sender, proposer, and acceptor work jointly; greater transactional throughput; tolerates crash faults; can bear f/2 − 1 faults
- RAFT: synchronous; smart-contract dependent; works by collecting a selected ledger on some agreement; greater transactional throughput; tolerates crash faults; can bear f/2 − 1 faults
Witnesses change within some time duration, typically within a week, and must perform the transactions allotted to them within the given time. Their reputation is at stake: the more efficiently they perform transactions within the given duration, the better their chance of being selected again for mining by the selectors (i.e., the cryptocurrency holders). DPoS is thus more decentralized than what was proposed in PoS: in PoS, whoever holds the larger amount of currency has the more dominating effect on the whole network, whereas DPoS modifies this into a more distributed system that removes the centralization [11, 14, 15].
3 Comparison of Consensus Algorithms

See Tables 1 and 2.
4 Application Areas

At present, blockchain-based applications are not limited to the financial domain; they have grown into accounting, voting, energy supply, quality assurance, self-sovereign identity (KYC), health care, logistics, agriculture and food, law enforcement, industrial data spaces, digital identification and authentication, gaming and gambling, government and organizational governance, the job market, market forecasting, media
Table 2 Comparison of permissionless network consensus algorithms

- Proof of Work (PoW): used for industries working at the financial level; uses public-key encryption (e.g. Bitcoin); miners who have done more work by investing more power have a higher probability of mining the new block; power inefficient; open environment; Bitcoin Script is used
- Proof of Stake (PoS) [11]: used for industries working at the financial level; uses the RSA algorithm for encryption; an election-type selection of the miner of the next block; power efficient; open environment; mostly Golang is used
- Proof of Burn (PoB): used for industries working at the financial level; uses the RSA algorithm for encryption; acquires some cryptocurrency (wealth) to mine a new block using a virtual resource; power efficient; open environment; mostly Golang is used
- PoET: used for industries working at the financial level; uses the RSA algorithm for encryption; a node spends some time and power to mine a new block, and whoever finishes the prior task first becomes the next miner; power efficient; open environment
and content distribution, network infrastructure, philanthropy transparency and community services, real estate, reputation verification and ranking, ride-sharing services, social networks, and supply-chain certification in the food industry. Blockchain still has problems with scalability and security, which must be tackled.
5 Internet of Things (IoT)

The Internet of Things (IoT) is a recent technology for connecting and communicating among numerous things simultaneously, providing various benefits to consumers and changing how users interact with technology. The concept of IoT is not new: the Internet of Things has more potential than the internet itself and is changing the world just as the internet did in recent years. The core concept behind IoT is establishing a system that stores all data in the cloud without requiring human effort to collect it. The impact of IoT on the world is believed to be immense in the coming years. Though IoT offers cutting-edge opportunities in fully automated systems, traffic management, and related solutions, it comes with certain limitations that cannot be ignored. IoT offers a universal model of sharing information to enhance society, enabling advanced services by interconnecting things over existing wireless communication technologies. A study confirms that approximately 60% of companies are already engaged in developing IoT projects, and recently more than 30% of startups have reached an early stage of IoT deployment. More than 69% of these IoT-based companies now focus on problems such as how IoT operational costs can be reduced.
Cisco says that 74% of organizations have failed with their IoT startups. This happens because of human involvement in IoT implementation, beyond the functional elements of sensors and network complexity. Data reliability, security, and privacy are rigorous issues in cloud-based IoT applications.
6 Challenges of IoT

The issues with IoT concern security, privacy, scalability, reliability, and maintainability. Blockchain offers decentralization, persistence, anonymity, immutability, identity and access management, resiliency, autonomy, and cost savings as attractive features, which may address the challenges of IoT, big data, and machine learning: blockchain pushes the centralized, network-based IoT towards a blockchain-based distributed ledger and can mainly tackle the scalability, privacy, and reliability issues in IoT systems. A trusted, reliable, and flexible architecture for IoT service systems based on blockchain technology can facilitate self-trust, data integrity, auditability, data resilience, and scalability [16]. Authentication, authorization, and encryption alone are not enough to tackle the issues of IoT's resource-limited nodes, and IoT systems are also vulnerable to malicious attacks, often because security firmware updates fail to arrive in time. Major challenges in IoT are therefore the compatibility and interoperability of different IoT nodes, which require roughly 40–60% of the development effort. IoT contains many nodes with very low computing power and different working mechanisms, and IoT devices use many different protocols. Identification and authentication of IoT devices are very difficult because of the huge diversity and number of devices: currently about 20 billion devices are connected in IoT systems. The working of an IoT system also depends entirely on the speed of internet connectivity: IoT needs very high-speed internet to work properly and efficiently, but the internet is still not available everywhere at a seamless and appropriate speed, so seamless connectivity may be one of the biggest challenges in IoT deployment [17]. Advances in wireless technology, computing, and CPU processing speed have made life easy and flexible, but they also generate huge amounts of structured and semi-structured data at very high speed and with great variation in a very short time [8]; about 80% of the data generated by IoT systems is unstructured, and handling unstructured data is a challenge for deploying IoT in real-time applications. Conventional security and privacy approaches are not applicable to IoT because of the decentralization and resource constraints of the majority of IoT devices. By 2020, 25% of cyber-attacks were projected to target IoT devices, yet 54% of IoT device owners do not use any third-party tool and 25% of those do not even change the default password on their device. Devices do not know their neighbours in advance, there is no cooperative data exchange, an adversary can compromise a device's data, protocols are energy-inefficient, and data capture, intelligent analytics, and delivering value all create challenges for deploying IoT systems.
IoT nodes look to blockchain to store their state, manage multiple writers, and remove the need for a trusted third party [18, 19].
7 Integration of Blockchain with IoT

This is the era of peer-to-peer interaction. The unprecedented growth of IoT calls for new mechanisms for storing, accessing, and sharing information, and these new mechanisms raise issues of security, privacy, and data availability. The centralized architecture of IoT accelerated its evolution, but it creates a single point of failure and hands control of the entire system to a few powerful authorized users. Blockchain is one of the latest emerging technologies with the capability to store, access, and share information in a decentralized and distributed system. Blockchain can be integrated with IoT to resolve these problems of storage, access, sharing, and central-point failure; in other words, blockchain can complement IoT with reliable and secure information, and it may be the key to solving the scalability, reliability, and privacy problems of the IoT paradigm. The interaction of IoT with blockchain can take three main forms: IoT-IoT, IoT-Blockchain, and a hybrid of the two. In IoT-IoT, data are stored in the blockchain while the interactions take place directly between devices; this is useful where reliable IoT interaction with low latency is required. In IoT-Blockchain, all interactions and their data go through the blockchain, building an immutable and traceable record of interactions; this is helpful in scenarios like trade and rental that need reliability and security. In the hybrid platform, some interactions and data are stored in the blockchain and the rest are shared directly between the IoT devices; this approach is a perfect way to leverage the benefits of both blockchain and real-time IoT interaction. The interaction method can be chosen based on parameters such as throughput, latency, number of writers, number of interested writers, data media, interactive media, consensus mechanism, security, and resource consumption. Blockchain integration with IoT has the potential to overcome the challenges of controlling IoT systems [20, 21].
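The hybrid pattern can be sketched concretely: raw readings travel on the fast device-to-device path, while only their fingerprints are anchored in an append-only chain. The class and method names below are hypothetical, and the in-memory lists stand in for a real off-chain store and a real blockchain.

```python
import hashlib, json, time

class HybridGateway:
    """Sketch of the hybrid interaction above: raw readings stay off-chain
    between devices; only their digests go through the blockchain."""
    def __init__(self):
        self.offchain = []     # fast device-to-device store (assumed)
        self.ledger = []       # append-only chain of anchored digests

    def record(self, reading: dict):
        payload = json.dumps(reading, sort_keys=True)
        self.offchain.append(payload)                       # low-latency IoT-IoT path
        digest = hashlib.sha256(payload.encode()).hexdigest()
        parent = self.ledger[-1]["hash"] if self.ledger else "0" * 64
        entry = {"digest": digest, "parent": parent, "ts": time.time()}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self.ledger.append(entry)                           # traceable IoT-Blockchain path

    def verify(self, i: int) -> bool:
        """Anyone can later check that a stored reading was not tampered with."""
        return hashlib.sha256(self.offchain[i].encode()).hexdigest() == self.ledger[i]["digest"]

gw = HybridGateway()
gw.record({"sensor": "t1", "value": 21.4})
print(gw.verify(0))   # True
```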
The fundamental issues of IoT such as security, privacy, scalability, reliability, maintainability can be solved by blockchain because blockchain provides many attractive features such as decentralization, persistence, anonymity, immutability, identity and access management, resiliency, autonomy, apriority, cost-saving. These features may address challenges of IoT, Big data and machine learning. Blockchain pushes network-based IoT to blockchain-based one. IoT system facing challenges such as heterogeneity of IoT system, poor interoperability, resource-limited IoT devices, privacy and security vulnerabilities. Blockchain can support the IoT system with the enhancement of heterogeneity, interoperability, privacy and security vulnerability. Blockchain can also improve the reliability and scalability of the IoT system. We can say blockchain can be integrated with the IoT system [23–25].
8 Advantages of Integration of Blockchain with IoT

1. Enhanced interoperability: heterogeneous data can be converted, processed, extracted, compressed, and finally stored in the blockchain.
2. Improved security and privacy: IoT data transactions can be encrypted and digitally signed with cryptographic keys (see the sketch after this list), and blockchain supports automatic updating of IoT device firmware to heal vulnerable breaches, thereby improving system security.
3. Traceability and reliability: all historical transactions stored in the blockchain are traceable, since they can be verified anywhere and at any time, and the immutability of blockchain makes IoT data reliable, because the information stored in the blockchain cannot be tampered with.
4. Decentralization and use of a public ledger: after integrating IoT with blockchain, the central governing body is eliminated, removing the single-point-of-failure problem; moreover, the trust issue is resolved, since a majority of the nodes in the network must agree on the validation of a particular transaction. Blockchain thus provides a secure platform for IoT devices.
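The signing step in item 2 can be illustrated with Ed25519 from the third-party `cryptography` package (`pip install cryptography`). This is an illustrative sketch of "sign, then verify against tampering", not a prescription for any particular blockchain platform; the transaction payload is invented.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

device_key = Ed25519PrivateKey.generate()
tx = b'{"device":"pump-7","reading":3.2,"ts":1700000000}'   # hypothetical transaction
signature = device_key.sign(tx)             # the device signs its own transaction

verifier = device_key.public_key()          # distributed alongside the ledger
try:
    verifier.verify(signature, tx)          # raises InvalidSignature on any tampering
    print("transaction accepted")
except InvalidSignature:
    print("transaction rejected")
```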
9 Challenges of Integration of IoT with Blockchain

The integration of blockchain with IoT is 99% secure against any attacker outside the network or system, but miners inside the blockchain can be a threat when they create their own virtual blockchain network to communicate and begin malicious activities. Since the blockchain ledger is regularly updated, this may lead to the problem of centralization. As time passes, incentives gradually decrease because the currency available to pay the miners for mining is limited, and there may be a chance of no incentive at all. To overcome the deficiency of currency, the blockchain authority may charge nodes a fee, just like a trusted third party; this, however, goes against the concept of blockchain, since one of its primary goals is to reduce transaction charges.
Fig. 7 Performance issues in the integration of IoT with blockchain: autonomous operation (self-regulated, self-managed), high throughput (less communication overhead, high transaction rate), scalability for expansion of the network, low communication complexity, fast transaction confirmation time, and low memory demand
In the permissionless model, the first cryptocurrency that comes to mind is Bitcoin. A major problem with the permissionless model is 'one address per user': a user can enter the network directly and participate in a transaction, so it is very difficult for the receiver to authenticate the sender. Digital signatures are applied to solve this, but then the question becomes how digital signatures can be identified and validated; public-key verification of the sender is possible, but it suffers from the multiple-address problem, because public and private keys are changeable and dynamic in nature. Another point in the permissionless model is that nodes inactive for almost three hours in the Bitcoin network are disconnected from it (Fig. 7).

Blockchain implementations also have problems of scalability, block size, and transactions per second, and they are not applicable to high-frequency fraud. There may be a trade-off between block size and security. Under a selfish mining strategy, miners can hide their mined blocks for more revenue in the future. Privacy can leak even when users transact only with their public and private keys, and a user's real IP address can be tracked. The number of blocks mined per unit time cannot satisfy the requirement of processing millions of transactions in real time. Open questions remain: what are the maximum chain length and the maximum number of miners, and is there any possibility of reverting to a centralized system using blockchain? A larger block size can slow down propagation speed and may cause the blockchain to branch, and small transactions can be delayed because miners give preference to high transaction fees [10, 13]. To ensure the immutability and reliability of the information, the blockchain must store all transaction data for verification and validation. Storing all of this, together with the exponential growth of data from resource-limited IoT devices, leads to a drastic increase in the size of the blockchain, which is the primary drawback of integration with IoT. Some blockchain implementations process very few transactions per second, while some real-time IoT applications need very high transaction rates. This may also be a major bottleneck of integrating IoT with blockchain.
Fig. 8 Security issues in the integration of IoT with blockchain: a trust-less environment (distributed computing, distributed storage, decentralization), key management (issuance, revocation, safe storage), data security (data integrity, data authentication, data privacy), device security (device authentication, authorization, software integrity, tamper proofing), user security (enrolment, identity management, authentication, authorization, privacy), and network access (restricted network, access control)
To resolve the storage-scalability issue, researchers have proposed optimized blockchain storage based on removing old transaction records from the chain [26]. The biggest challenges in integrating blockchain with IoT are the scalability of the ledger and the rate of transaction execution in the blockchain: a huge number of IoT devices generate transactions at a rate that current blockchain solutions cannot handle. Implementing blockchain peers on IoT devices is very difficult because of resource constraints, and integration is further limited by the scalability, expensive computation, and high transaction cost of public blockchains. Industries will face many challenges after integrating IoT with blockchain, such as connecting huge numbers of IoT devices over the next five years and controlling and managing those devices in a decentralized system. The integration also raises questions for industry: how to enable peer-to-peer communication between globally distributed devices, how to provide compliance and governance for autonomous systems, and how to address the security complexities of the IoT landscape [12]. Integrating blockchain into resource-limited IoT devices is not suitable where computational resources and bandwidth are lacking and power must be preserved (Fig. 8).
9.1 Open Research Issues in BCIoT

Because of the resource constraints of IoT devices, conventional big-data analysis schemes may not apply to them, so it is very difficult to conduct big-data analysis in IoT-blockchain systems. Security, vulnerability, and privacy mechanisms are the most important research areas in the field of integrating blockchain with IoT (Fig. 9).
Fig. 9 Open research issues in the integration of IoT with blockchain (BCIoT): privacy leakage, resource constraints, scalability, security vulnerability, difficulty of big-data analysis in BCIoT, and incentive mechanisms
IoT tries to make systems fully automated, but ensuring the verifiability, authentication, and immutability of the blockchain requires an efficient incentive mechanism; designing one is another research area in the integration of blockchain with IoT. The drastic increase in network complexity and the addition of huge numbers of IoT devices require an effective, scalable mechanism for integrating blockchain with IoT, which is also one of the future research areas in this field.
10 Technical Challenges

Blockchain is an emerging technology that records transactions in a way that ensures security, transparency, auditability, immutability, and anonymity. Decentralization is one of its key features, because it can solve problems such as extra cost, performance limitations, and central-point failure: blockchain allows the validation of transactions through the nodes without authentication, jurisdiction, or intervention by a central agency, reducing the service cost, mitigating the performance limitations, and eliminating the central point of failure. Despite its transparency, blockchain can maintain a certain level of privacy by making blockchain addresses anonymous, though it must be noted that blockchain only maintains pseudonymity and may not preserve full privacy. Features such as decentralization, immutability, and transparency make blockchain a suitable technology to integrate with IoT and to provide a solution to IoT's major challenges. Despite being a very efficient emerging technology for recording transactions, blockchain still faces challenges such as storage and scalability: a long chain may harm the performance of IoT and increase the synchronization time for new users, and the consensus protocol has a direct impact on the scalability of the blockchain. Blockchain also faces security weaknesses and threats. The most common is the 51% (majority) attack, in which the consensus in the network can be controlled, especially where consensus is centralized among fewer nodes. Double-spending attacks, race attacks, denial of service, and man-in-the-middle attacks are further challenges faced by blockchain technology. Blockchain infrastructure is vulnerable to man-in-the-middle attacks because it depends strongly on communication: in this attack, the attacker can monopolize
the node's connections, isolating it from the rest of the network [19]. In summary, blockchain implementations can face the following issues: scalability; block size; transactions per second; inapplicability to high-frequency fraud; the trade-off between block size and security; selfish mining strategies, where miners hide their mined blocks for more future revenue; privacy leakage even when users transact only with their public and private keys; traceability of users' real IP addresses; the inability of the number of blocks mined per unit time to meet the requirement of processing millions of transactions in real time; the open questions of maximum chain length, maximum number of miners, and whether a centralized system could re-emerge on top of blockchain; larger block sizes slowing propagation speed and causing the blockchain to branch; and the delay of small transactions because miners give preference to high transaction fees [4, 27, 28].
11 Conclusion

Today, with the emergence of IoT and big data, huge, diverse, and critical information is available over the internet. Trust in this information has dropped drastically, increasing the security and privacy concerns of industries and organizations. Blockchain technology has the potential to change the upcoming scenarios of society, industries, and organizations, and it is one of the best emerging technologies for ensuring privacy and security through cryptographic algorithms and hashing. In this paper we discussed the basics of blockchain technology, consensus algorithms, a comparison and analysis of the important consensus algorithms, and areas of application. We also discussed IoT and its issues, the integration of blockchain with IoT, and its implementation challenges. In future work, we will cover implementation platforms such as Ethereum and Hyperledger.
References

1. Haber, S., Stornetta, W.S.: How to time-stamp a digital document. J. Cryptology 3(2), 99–111 (1991)
2. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. Self-published paper (May 2008). https://bitcoin.org/bitcoin.pdf
3. Castro, M., Liskov, B.: Practical Byzantine fault tolerance and proactive recovery. ACM Trans. Comput. Syst. 20(4), 398–461 (2002)
4. Fernández-Caramés, T.M., Fraga-Lamas, P.: A review on the use of blockchain for the internet of things. IEEE Access 6, 32979–33001 (2018)
5. Back, A.: Hashcash: a denial of service counter-measure. http://www.cypherspace.org/hashcash/
6. Sankar, L.S., Sindhu, M., Sethumadhavan, M.: Survey of consensus protocols on blockchain applications. In: 2017 4th International Conference on Advanced Computing and Communication Systems (ICACCS). https://doi.org/10.1109/icaccs.2017.8014672
7. Vukolić, M.: The quest for scalable blockchain fabric: proof-of-work vs. BFT replication. In: IFIP WG 11.4 International Workshop on Open Problems in Network Security (iNetSec 2015), pp. 112–125 (2016)
8. Jaag, C., Bach, C.: Blockchain technology and cryptocurrencies: opportunities for postal financial services. In: The Changing Postal and Delivery Sector, pp. 205–221. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-46046-8_13
9. Bentov, I., Gabizon, A., Mizrahi, A.: Cryptocurrencies without proof of work. In: International Conference on Financial Cryptography and Data Security, pp. 142–157. Springer, Berlin, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53357-4_10
10. Zheng, Z., Xie, S., Dai, H.-N., Wang, H.: Blockchain challenges and opportunities: a survey. School of Data and Computer Science, Sun Yat-sen University, Tech. Rep. (2016)
11. Larimer, D.: DPOS consensus algorithm—the missing whitepaper. Steemit (2018). https://steemit.com/dpos/dantheman/dpos-consensus-algorithm-this-missing-whitepaper. Accessed 3 Feb 2018
12. Correia, M., Veronese, G., Lung, L.: Asynchronous Byzantine consensus with 2f + 1 processes. In: Proceedings of the 2010 ACM Symposium on Applied Computing (SAC '10) (2010)
13. Zheng, Z., Xie, S., Dai, H., Chen, X., Wang, H.: An overview of blockchain technology: architecture, consensus, and future trends. In: 2017 IEEE International Congress on Big Data (BigData Congress), Honolulu, pp. 557–564. IEEE (2017)
14. Zheng, Z., Xie, S., Dai, H.N., Chen, X., Wang, H.: Blockchain challenges and opportunities: a survey. Int. J. Web Grid Serv. (IJWGS) 14(4) (2018)
15. Larimer, D.: Delegated proof-of-stake consensus (2018)
16. Dai, H.-N., Zheng, Z., Zhang, Y.: Blockchain for internet of things: a survey. IEEE Internet Things J. (2019). https://doi.org/10.1109/jiot.2019.2920987
17. Bentov, I., Gabizon, A., Mizrahi, A.: Cryptocurrencies without proof of work. In: International Conference on Financial Cryptography and Data Security, Christ Church, Barbados, pp. 142–157 (2016)
18. Pilkington, M.: Blockchain technology: principles and applications. In: Research Handbook on Digital Transformations, p. 225 (2016). https://papers.ssrn.com/sol3/Papers.cfm?abstract_id=2662660
19. Samaniego, M., Deters, R.: Blockchain as a service for IoT. In: 2016 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 433–436. IEEE (2016)
20. Dai, H., Zheng, Z., Zhang, Y.: Blockchain for internet of things: a survey. IEEE Internet Things J. https://doi.org/10.1109/jiot.2019.2920987
21. Lin, J., Yu, W., Zhang, N., Yang, X., Zhang, H., Zhao, W.: A survey on internet of things: architecture, enabling technologies, security and privacy, and applications. IEEE Internet Things J. 4(5), 1125–1142 (2017)
22. Reyna, A., Martín, C., Chen, J., Soler, E., Díaz, M.: On blockchain and its integration with IoT: challenges and opportunities. Future Gener. Comput. Syst. (2018)
23. Atlam, H.F., Alenezi, A., Alassafi, M.O., Wills, G.: Blockchain with internet of things: benefits, challenges, and future directions. Int. J. Intell. Syst. Appl. 10(6), 40–48 (2018). https://doi.org/10.5815/ijisa.2018.06.05
24. Zheng, Z., Xie, S., Dai, H.N., Chen, X., Wang, H.: Blockchain challenges and opportunities: a survey. Int. J. Web Grid Serv. 14(4) (2018)
25. Liang, X., Zhao, J., Shetty, S., Li, D.: Towards data assurance and resilience in IoT using blockchain. In: MILCOM 2017–2017 IEEE Military Communications Conference (MILCOM) (2017). https://doi.org/10.1109/milcom.2017.8170858
26. Panarello, A., Tapas, N., Merlino, G., Longo, F., Puliafito, A.: Blockchain and IoT integration: a systematic survey. Sensors 18(8), 2575 (2018). https://doi.org/10.3390/s18082575
27. Song, J.C., Demir, M.A., Prevost, J.J., Rad, P.: Blockchain design for trusted decentralized IoT networks. In: 2018 13th Annual Conference on System of Systems Engineering (SoSE), pp. 169–174. IEEE (2018)
28. Conoscenti, M., Vetrò, A., De Martin, J.C.: Blockchain for the internet of things: a systematic literature review. In: IEEE/ACS 13th International Conference of Computer Systems and Applications, pp. 1–6 (2016)
Use of Hadoop for Sentiment Analysis on Twitter’s Big Data Sandeep Rathor
Abstract Nowadays, everyone is connected with social media such as WhatsApp, Facebook, Instagram and Twitter. Social media is an important platform for sharing ideas, information, opinions and knowledge among human beings. Beyond these basic uses, we can also analyze sentiments and emotions from it. However, handling such big data is a challenging task; this type of analysis is efficiently possible only through Hadoop. In the proposed research, we analyze the nature of a particular person on the basis of their behavior on social sites using Hadoop, taking Twitter data for the analysis. Tweets are classified as positive, negative or neutral using the concept of a decision dictionary, and the results show sentiment analysis with good accuracy. Keywords Social site sentiment analysis · Big data · Hadoop · Hadoop ecosystem
1 Introduction

Social media is a platform for sharing information; it can reflect emotions, sentiments and domains [1]. In today's era, everyone depends on social media: we are connected to our friends and family members through platforms such as Facebook, WhatsApp and Twitter. Twitter, one of the biggest web-based social platforms, receives millions of tweets daily, in the range of terabytes per day. Our morning starts with messages on different social media. A huge amount of data is therefore generated every day, and this data is unstructured and unmanaged. To manage such a huge amount of unstructured data, big data approaches are used; they manage and analyze this kind of data efficiently. Working with big data simply means handling a variety of data formats and structures [2]. Generally, big data refers to large amounts of data such as flight data, share price data, medical data, population data, etc. Analysis of such data has been proposed by various researchers.
For analyzing and processing a large amount of data, we need software that can handle it; Hadoop is best suited for this purpose because of its open-source nature. Various Hadoop tools are available for analyzing big data in a distributed environment [3]. In the proposed research, the Flume and Pig tools are used. Correct sentiment analysis needs a large data set, so Twitter data is used: Twitter carries a huge number of text posts on a daily basis, it is a common and popular platform for sharing ideas or opinions on a specific topic, and its participants belong to different countries. The most popular tools of Hadoop are Pig and Flume [4]. Pig is generally used for analysis, so Apache Pig performs the data manipulation operations in the proposed research, while Flume, a reliable tool for collecting huge amounts of data, implements the aggregation operations. Various sources of big data are represented in Fig. 1; all the components of Fig. 1 involve huge amounts of data, and the manipulation of such large data is possible only through Hadoop. The objective of the proposed work is to analyze sentiments on a large data set, i.e., Twitter data, using Hadoop and big data tools. The rest of this paper is organized as follows: Sect. 2 discusses the techniques proposed by various researchers in this context. The proposed approach is discussed in Sect. 3. The results of the proposed approach are shown in Sect. 4, and finally, Sect. 5 presents conclusions.
Fig. 1 Sources of big data [5]
2 Related Work

In this section, we discuss related works and techniques proposed by various researchers. A preference-based sentiment analysis is proposed by Kumar et al. [2]. It analyzed various products using Hadoop technology and shows the necessity of analysis on a huge amount of data; however, the data size and accuracy of its tweet-level sentiment analysis are very low. For detecting emotion in sentential data, Wu et al. [6] proposed an emotion recognition approach consisting of emotion generation rules and semantic labels to represent emotional states; a mixture model is used to obtain the similarities between the input sentence and the association rules. The emotional states used in that paper are limited to only three, while other states of emotion are also possible. Analysis of text to recognize emotion and sentiment is proposed by Yadollahi et al. [7]. In that paper, opinion mining and emotion mining are used: opinion mining refers to recognition of attitude, using basic NLP techniques, while emotion mining detects emotions using machine learning. The paper considers sentiment analysis in different ways; however, the data set used is very limited. Therefore, in the proposed paper, we determine emotion on a large data set, i.e., Twitter data. Sentiment analysis of stock blogs to predict stock prices is proposed by [8]. The authors developed a model for semantic analysis of social networking blogs, especially financial blogs of the companies listed on the National Stock Exchange of India. To predict stock behavior, they used two main components: prices (opening and closing) and the sentiment blog of that day. The model predicts stocks with an acceptable accuracy of 84%; however, it is limited to the stock exchange, and the volume of data used for sentiment analysis is also limited. Sentiment analysis of Twitter data using basic machine learning algorithms is proposed by [9, 10]. In these papers, the authors used algorithms such as KNN and SVM to classify text data: sentiment classification is first performed using one technique such as KNN, the results are passed to the other machine learning technique to train the model, and finally sentiments are classified into various defined categories [11]. This model took a long time to perform the analysis; therefore, the latest tools and techniques are required to perform sentiment analysis in minimum time. Surveys of various emotion recognition techniques are presented in [12, 13], where the authors discuss various existing approaches and techniques for information retrieval.
3 Proposed Methodology

In the proposed research, Hadoop is used because it is open-source software and an efficient tool for storing and manipulating large amounts of data in a distributed fashion [14]; it applies concepts of functional programming.

Core components of Hadoop: When a Hadoop cluster is set up, two services form the basic parts of the Hadoop installation, HDFS and YARN: one stores the large amount of data, and the other processes it. When data is passed to HDFS, MapReduce jobs process this data, while YARN manages all the resources required at that time. The following components are used in the proposed system.
Node Manager: a node-level component responsible for maintaining and monitoring resources.
Name Node: responsible for maintaining the metadata of all stored files.
Data Node: provides services such as read and write to all clients; it runs on each machine independently.
Resource Manager: works at cluster level; it is responsible for scheduling applications and executes on the master machine.

Use of Apache open source in the proposed system: To make efficient use of Hadoop, the Apache open-source ecosystem is used, which can support multiple projects at a time. The roles of various components in the Apache ecosystem are as follows:
Spark: a computational and mathematical programming model; it works as a gateway between memory and Hadoop.
Hive: responsible for managing and structuring big data and for answering queries over data in distributed Hadoop clusters.
HBase: responsible for storing data in tabular form; it consists of many rows and columns because the stored data has many features.
Sqoop: a command-line interface used to move large amounts of data into a relational database, or from a warehouse into a Hadoop cluster.

In the proposed approach, data is first collected from Twitter through the Twitter Streaming API using Flume. The tweets are collected from the original sources and sent to the sinks, which write into the Hadoop distributed file system. Pig queries are then applied to these tweets, first to structure the data (as it arrives unstructured) and then to analyze it with further queries; a sketch of the structuring step follows below. The step-by-step process is represented in Fig. 2.
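As an illustration of the structuring step, the following Python sketch turns raw JSON records written by a Flume sink into (id, text) pairs. The field names id_str and text are assumptions based on the classic Twitter streaming payload; the actual work here is done with Pig queries over HDFS files rather than in-memory records.

import json

def structure_tweets(raw_lines):
    """Yield (id, text) tuples from raw JSON records collected by Flume,
    mirroring the structuring performed with Pig queries in this paper.
    The 'id_str'/'text' field names follow the classic Twitter streaming
    payload and may need adjusting to the actual sink format."""
    for line in raw_lines:
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed or partial records
        if "id_str" in tweet and "text" in tweet:
            yield tweet["id_str"], tweet["text"]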
4 Result and Discussion

The AFINN dictionary is the analysis tool: a word list containing positive as well as negative (including obscene) words, used to rate words in a positive, negative or neutral manner. Using the AFINN dictionary, the tweets are classified into positive, negative and neutral by taking
Fig. 2 Framework of the proposed system
the rating of the tweet. The dictionary is uploaded into Pig using the following statement:

Dictionary = load '/AFINN.txt' using PigStorage('\t') AS (word:chararray, rating:int);

The final output is shown in Fig. 3. It represents the result of sentiment analysis in the form of a value: a positive value represents positive sentiment, a negative value represents negative sentiment, and a zero value represents neutral sentiment. If the output is NULL, the tweet is invalid.
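For readers without a Pig installation, the following Python sketch mirrors what the Pig queries compute: each tweet's words are looked up in the AFINN list and their ratings summed, and the sign of the sum gives the label. The whitespace tokenizer and punctuation stripping are simplifying assumptions, and multi-word AFINN entries are ignored.

def load_afinn(path="AFINN.txt"):
    """Read the tab-separated AFINN word list (the file loaded into Pig
    above) into a {word: rating} dictionary."""
    ratings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, rating = line.rstrip("\n").split("\t")
            ratings[word] = int(rating)
    return ratings

def classify(text, ratings):
    """Sum the ratings of rated words in a tweet; the sign gives the
    label, and a tweet with no rated word is treated as invalid (NULL)."""
    words = (w.strip(".,!?#@\"'").lower() for w in text.split())
    scores = [ratings[w] for w in words if w in ratings]
    if not scores:
        return None
    total = sum(scores)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"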
5 Conclusion

Hadoop is the best-suited fault-tolerant tool nowadays because it can efficiently handle a data set as large as Twitter's. In the proposed approach, the Apache framework is used to fetch the data from tweets, and the Pig tool is used to analyze the Twitter data. It provides efficient results with lower computation time.
Fig. 3 Sentiments: positive, negative, and neutral rated tweets
References

1. Rathor, S., Jadon, R.S.: The art of domain classification and recognition for text conversation using support vector classifier. Int. J. Arts Technol. 11(3), 309–324 (2019)
2. Kumar, M., Bala, A.: Analyzing Twitter sentiments through big data. In: 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, pp. 2628–2631 (2016)
3. Barskar, A., Phulre, A.: Opinion mining of Twitter data using Hadoop and Apache Pig. Int. J. Comput. Appl. 158(9) (2017)
4. Jain, A., Bhatnagar, V.: Crime data analysis using Pig with Hadoop. Procedia Comput. Sci. 78, 571–578 (2016)
5. https://www.smartdatacollective.com/big-data-20-free-big-data-sources-everyone-should-know/. Accessed 10 Jan 2020
6. Wu, C.-H., Chuang, Z.-J., Lin, Y.-C.: Emotion recognition from text using semantic labels and separable mixture models. ACM Trans. Asian Lang. Inf. Process. 5(2), 165–183 (2006)
7. Yadollahi, A., Shahraki, A.G., Zaiane, O.R.: Current state of text sentiment analysis from opinion to emotion mining. ACM Comput. Surv. (2017)
8. Ranjan, S., Singh, I., Dua, S., Sood, S.: Sentiment analysis of stock blog network communities for prediction of stock price trends. Indian J. Financ. 12(12), 7–21 (2018)
9. Huq, M.R., Ali, A., Rahman, A.: Sentiment analysis on Twitter data using KNN and SVM. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(6), 19–25 (2017)
10. Jain, A.P., Katkar, V.D.: Sentiments analysis of Twitter data using data mining. In: 2015 International Conference on Information Processing (ICIP). IEEE (2015)
11. Neethu, M.S., Rajasree, R.: Sentiment analysis in Twitter using machine learning techniques. In: 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT) (2013)
12. Nahar, L., Sultana, Z., Iqbal, N., Chowdhury, A.: Sentiment analysis and emotion extraction: a review of research paradigm. In: 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1–8. IEEE (2019)
13. Wu, Y., Ren, F.: Learning sentimental influence in Twitter. In: 2011 International Conference on Future Computer Sciences and Application. IEEE (2011)
14. Rodrigues, A.P., Chiplunkar, N.N.: Real-time Twitter data analysis using Hadoop ecosystem. Cogent Eng. 5(1), 1534519 (2018)
Enhanced Mobility Management Model for Mobile Communications Ashok Kumar Yadav and Karan Singh
Abstract Providing anytime, anywhere roaming while maximizing quality of service (QoS) is one of the crucial challenges in mobile communication, and completing ongoing transactions while users roam is a challenge for researchers. The high roaming rate of mobile hosts (MH) may degrade the performance, efficiency and reliability of the mobile communication system, and frequently generated handover requests may fail due to the scarcity of free channels in the next base station. To enhance the performance, efficiency and reliability of mobile commerce systems, we have combined the GE/GE/C/N/FCFS and GE/GE/C/N/PR priority queuing mechanisms to improve performance in terms of HTR drop rate, NTR blocking rate, mean queue length of incoming requests and channel utilization. Keywords Handover · GE-type distribution · Queuing models · HTR · NTR · QoS
1 Introduction In today’s world, the processing speed, computing power and storage capacity of portable devices such as mobile phone, laptop, palmtop, tablet, smart phone, notebook, etc. have increased. In parallel, many advanced wireless technologies such as WCDMA, UTMS, DMA2000, HSDPA, WiMAX, LTE etc. also developed and deployed. The nature of accessing and processing data is changing from fixed to mobile. Now it is become geographically independent, which is known as a mobile information system. Mobile communication is in growing demands due to anytime and anywhere connectivity. Wireless cellular networks provide the freedom to roam from cell to cell. The geographical area is divided into a small area to improve the utilization of the frequency spectrum. Such a small service area is called a cell [1]. A. K. Yadav (B) · K. Singh School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India e-mail: [email protected] K. Singh e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_5
A mobile communication system mainly consists of database servers, which are fixed hosts (FH); base stations (BS), also called mobile support stations (MSS); and mobile hosts (MH). Database servers are fixed hosts connected by high-speed wired networks and used as permanent data storage repositories; sharable data are distributed over the FHs. A mobile unit (MU) or MH can cache a limited amount of information and may move or remain stationary. The BS or MSS provides additional processing services to the mobile hosts currently residing in its cell [2, 3]. Figure 1 describes the mobile communication system.
Fig. 1 Architecture of mobile communication system [2]
Fig. 2 Handover between BS1 and BS2 [1]
1.1 Handover

A mobile communication system combines a distributed system with a wireless communication system. The handover process transfers the ongoing transaction request from the current BS to a new BS while a user roams from one cell to another; it causes a change in the point of connection during communication. Handover can be categorized into two broad categories, hard handover and soft handover [1, 4]. The steps involved in the handover process are as follows [1].
Step 1: Base station 1 (BS1) continuously sends messages to the mobile host (MH), which are used in the execution of the context-aware transaction.
Step 2: The mobile host receives the message from BS1 and determines whether it is in BS1 or in BS2.
Step 3: If the mobile host finds that it is in BS1, no handover process needs to start; the user's ongoing transaction continues to execute without interruption through the facilities provided by BS1.
Step 4: If the mobile host finds itself in BS2, it requires a new address in BS2 so that it can continue to execute the transactions, and it must give its new address to BS1.
Step 5: When a user roams from BS1 and registers with BS2, any communication or information corresponding to the context-aware transaction is redirected through BS1 to the new address.
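A minimal sketch of the decision embodied in Steps 2-5 is given below. The use of received signal strength as the trigger and the hysteresis margin are assumptions for illustration; the steps above only state that the MH determines which cell it is in.

def handover_check(rss_bs1: float, rss_bs2: float, current: str,
                   hysteresis: float = 3.0) -> str:
    """Illustrative decision for Steps 2-5: stay with the current base
    station unless the neighbour's received signal strength (in dB)
    exceeds it by a hysteresis margin. On a cell change the MH would
    acquire a new address in the new BS and report it to the old BS so
    the ongoing context-aware transaction can be redirected (Step 5)."""
    if current == "BS1" and rss_bs2 > rss_bs1 + hysteresis:
        return "BS2"  # Step 4: handover to BS2
    if current == "BS2" and rss_bs1 > rss_bs2 + hysteresis:
        return "BS1"
    return current    # Step 3: no handover needed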
1.2 The GE Distribution

The Poisson distribution is the most suitable distribution for modeling non-blocking request systems such as telephone networks, but the Poisson arrival process is not suitable for modeling the growing internet traffic in cellular networks because it assumes one arrival at a random time. Measurement of actual interarrival or service times is limited to a few parameters that can be easily computed. In cellular networks, request blocking occurs because insufficient resources are available, and cellular traffic is bursty in nature. To properly model such blocking traffic with the burstiness property, the most suitable distribution is the GE distribution, which has the distribution function with parameters (1/ϑ, c²):

F(t) = prob(T ≤ t) = 1 − τ e^(−σt),  t ≥ 0,   (1)

where τ = 2/(c² + 1), σ = 2ϑ/(c² + 1), 0 ≤ τ ≤ 1, and T, c² and 1/ϑ are an interevent-time random variable, its squared coefficient of variation (SCV) and its mean, respectively. The SCV measures the variability of the distribution. The memoryless property is the main feature of the GE distribution; the solutions of queuing systems using GE distributions in network analysis remain tractable because of this memoryless behavior [5–7].

The high roaming rate of MHs may degrade the efficiency, performance and reliability of mobile communication [4, 8]. If no channel is free at the time of the handover process, the request is dropped, and this dropping of executing transaction requests has a negative effect on efficiency, reliability and performance. An excessive number of handover transaction requests creates network overhead, and a scarcity of channels causes failure of ongoing transactions. Due to the limitations of mobile communication systems, such as real-time movement, change of location, lower bandwidth, unstable networks, frequent disconnections, limited battery backup, limited storage and limited functionality, context-aware transactions do not support the ACID properties; instead, they are required to satisfy a new set of properties: relaxed atomicity, consistency, context and durability (RACCD) [9]. To reduce network overhead, an efficient handover technique is essential, so we have proposed a context-aware transaction scheme for mobility management.

The paper is organized as follows. Section 2 reviews important related work. Section 3 describes the EMMM and the combination of queuing models proposed by us. Section 4 analyzes the GE/GE/C/N/FCFS queue, Section 5 presents the flow chart of the EMMM model, Section 6 explains the simulation and results, and Section 7 gives the conclusion and future work.
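To make the GE model of Eq. (1) concrete, a minimal sampler is sketched below: with probability 1 − τ the interevent gap is zero (modeling a batched, bursty arrival) and otherwise it is exponential with rate σ, which reproduces the mean 1/ϑ and SCV c². This is an illustrative aid added here, not part of the analytical model.

import random

def ge_interarrival(mean: float, scv: float) -> float:
    """Draw one interevent time from the GE distribution of Eq. (1) with
    mean 1/theta = `mean` and SCV c^2 = `scv` (scv >= 1): with probability
    1 - tau the gap is zero (a batch arrival), otherwise it is exponential
    with rate sigma = 2*theta/(c^2 + 1) = tau/mean."""
    tau = 2.0 / (scv + 1.0)
    sigma = tau / mean
    if random.random() >= tau:
        return 0.0
    return random.expovariate(sigma)

# sanity check: the sample mean approaches `mean` and the sample SCV approaches `scv`
samples = [ge_interarrival(1.0, 4.0) for _ in range(100000)]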
2 Related Works

Several different techniques have been proposed by many researchers to achieve enhanced performance, reliability and efficiency in mobile transactions. It has been observed that the number of messages generated at different nodes for a CAT has a great effect on performance, reliability and efficiency, and several existing transaction models reduce the number of messages [10]. Data availability is enhanced by the pre-write transaction model [11]; the high-commit mobile transaction model enhances the commitment rate of transaction requests [12]; the timeout-based commit protocol improves performance and throughput [13]; and the kangaroo transaction model considers both data and movement to enhance the commitment rate of mobile transaction requests [14]. However, until now there has been little investigation of the underlying network on which mobile transactions are carried out [15]. The proposed work enhances the handover process for mobility management of context-aware transactions in the mobile
communication system. The performance of mobile transactions can be improved by considering the active network capacity and the context of a transaction; a customized program can be inserted into the network nodes to improve the performance of mobile transactions in active networks [9]. Channel reservation and queuing of handovers are the two broad categories of handover strategies [16]. In the channel reservation scheme, handover requests are serviced only by reserved channels; this scheme gives handover requests higher priority than new requests, but it cannot provide fair QoS to different types of services [17] and does not perform efficiently when the underlying traffic is variable. Queuing of handover requests is possible because cells overlap [18, 19]: the overlapping area of two base stations is known as the handover area, and requests can be queued there until a channel becomes free or the signal exceeds a threshold [20, 21]. Both HTR and NTR access free channels to execute context-aware transactions. In our proposed model, HTR and NTR are given equal preference once they hold a channel; that is, both transaction types have equal service rates but different arrival rates. We have combined the GE/GE/C/N/FCFS and GE/GE/C/N/PR priority queuing mechanisms to reduce the dropping probability of handover transaction requests. This provides uninterrupted connectivity and ensures reliable execution of CAT transactions while users roam. The EMMM reduces the dropping rate of HTR and the blocking rate of NTR and improves performance in terms of HTR drop rate, NTR blocking rate, mean queue length of incoming requests and channel utilization.
3 EMMM Description

Queuing network models are a performance analysis tool for complex networks of queues and servers. A GE-type distribution can be used to model the burstiness of the interarrival times of incoming HTR and NTR requests [22, 23]; it is characterized by various values of the squared coefficient of variation (SCV). Our proposed enhanced scheme uses the GE distribution for servicing the incoming HTR and NTR requests, as described in Fig. 3. A cellular network can be treated as a queuing network model (QNM) [24, 25], and one cell is enough for modeling, calculating and analyzing the effectiveness of the proposed model. Each cell is represented as a queuing node with a finite-capacity buffer for queuing the incoming HTR and NTR requests; the buffer capacity is finite because of the scarcity of channel bandwidth at the base station. Every cell is treated as a queue and every channel as a server [26]. Due to the scarcity of the available frequency spectrum, each queue has limited capacity for handling the incoming HTR and NTR requests. Queuing of HTR and NTR requests is possible because cell areas overlap; this overlapping area is known as the handover area [27]. An HTR or NTR request remains in the queue until a channel becomes free or the signal falls to a threshold value. The EMMM integrates GE/GE/C/N/FCFS and GE/GE/C/N/PR/CBS, two different
Fig. 3 Queuing model
Fig. 4 Modeling of HTR and NTR with GE/GE/C/N/PR queue
queuing techniques, to improve the HTR dropout rate, NTR blocking rate, mean queue length of incoming requests and channel utilization (Fig. 4). Combining the queues in the proposed model makes it possible to create a pool of communication channels for data-sensitive applications, which can tolerate delay but cannot tolerate any data loss. The burstiness of the interarrival times of incoming HTR and NTR requests from such applications is modeled by the GE-type distribution, and the scheme ensures uninterrupted connectivity for high-priority, loss-sensitive requests. In our model, the GE/GE/C/N/FCFS queue is dedicated to handling only HTR requests, under the assumption that all HTR requests have the same priority; HTR requests are served in FCFS discipline. In Fig. 4, Queue-1 handles HTR requests, while Queue-2 (GE/GE/C/N/PR/CBS) is a shared queue that handles both HTR and NTR requests; it also accepts HTR requests when Queue-1 is full [28]. In this queuing technique, incoming requests are served using the pre-emptive resume discipline. Since HTR requests have higher priority by assumption,
HTR requests are scheduled first for service even when NTRs are already waiting in Queue-2. Based on buffer capacity, the sharing queuing technique may be of two types: complete buffer share (CBS) and partial buffer share (PBS). We use complete buffer sharing for the NTR class's interarrival and service times. In GE/GE/C/N/PR/CBS, C and N represent the number of channels in the cell and the total capacity of the queue, respectively, and requests for available channels are served under FCFS and priority scheduling disciplines. In the proposed algorithm, the total number of channels C is divided into two channel subsystems, C1 (1 ≤ C1 < C) and C2 = C − C1: the C1 channels are dedicated to Queue-1, and the C2 channels serve both HTR and NTR requests through the GE/GE/C/N/PR queue.
4 GE/GE/C/N/FCFS Analysis

Let i be the class of an incoming HTR/NTR request; (λi, C²ai) are the mean arrival rate and interarrival SCV of the ith class, and (μi, C²si) are the mean service rate and service-time SCV of the ith class. We assume all channels have the same service rate for all classes of requests but different arrival rates. With offered load ρ = λ/μ and ρc = ρ/c, the probability that the system is empty at steady state is

P0 = [ Σ_{n=0}^{c−1} ρⁿ/n! + (ρᶜ/c!) · (1 − ρc^(N−c+1))/(1 − ρc) ]^(−1)   (2)
4.1 Expected Queue Length
Lq = ( P0 ρ^(c+1) / ((c − 1)! (c − ρ)²) ) · [ 1 − ρc^(N−c+1) − (N − c + 1)(1 − ρc) ρc^(N−c) ]   (3)
By applying Little's law, we can then calculate the expected waiting times in the queue and in the system.
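Equations (2) and (3) can be evaluated numerically as sketched below, assuming ρc = ρ/c ≠ 1; dividing Lq by the effective (non-blocked) arrival rate then yields the mean waiting time via Little's law.

from math import factorial

def p0(rho: float, c: int, N: int) -> float:
    """Empty-system probability of Eq. (2), with rho = lambda/mu and
    rho_c = rho/c (assumed != 1)."""
    rho_c = rho / c
    head = sum(rho ** n / factorial(n) for n in range(c))
    tail = (rho ** c / factorial(c)) * (1 - rho_c ** (N - c + 1)) / (1 - rho_c)
    return 1.0 / (head + tail)

def mean_queue_length(rho: float, c: int, N: int) -> float:
    """Expected queue length of Eq. (3); dividing by the effective
    arrival rate gives the mean waiting time (Little's law)."""
    rho_c = rho / c
    bracket = (1 - rho_c ** (N - c + 1)
               - (N - c + 1) * (1 - rho_c) * rho_c ** (N - c))
    return (p0(rho, c, N) * rho ** (c + 1)
            / (factorial(c - 1) * (c - rho) ** 2) * bracket)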
4.2 Blocking Probability

Consider a tagged request of class i (i = HTR/NTR) arriving in a batch and finding the GE/GE/C/N/PR/CBS queue in state nj = (0, …, 0, nj, nj+1, …, nR), with nk = 0 for k = 1, 2, …, j. The number of requests in the queue is then v = Σ_{k=1}^{R} nk, and the free buffer space is (N − v) [29]. The blocking probability of the specified class is estimated as

Bi = Σ_{j=1}^{N} Σ_{k=0}^{R} δi^L (1 − Si)^(N−k) prob(nj)   (4)

where Si = 2/(C²ai + 1) is the selection probability of the interarrival-time stage, ri = 2/(C²si + 1) is the corresponding quantity for the service-time stage, and ri/(ri(1 − Si) + Si) is the rejection probability of the interarrival-time and service-time stages.
4.3 Utilizations
The channel utilization of class i can be expressed through the carried (non-blocked) load of the c servers:

Ui = λi (1 − Bi) / (c μi)   (5)

5 Flow Chart
Flow chart of the EMMM handover scheme: an incoming request is first classified as HTR or NTR. An HTR request is served immediately when Queue-1 is empty and a channel is free; otherwise it is inserted into Queue-1, overflows into the shared Queue-2 when Queue-1 is full, and is dropped when both queues are full. An NTR request is served immediately when Queue-2 is empty and a channel is free; otherwise it is inserted into Queue-2 and is dropped when Queue-2 is full.
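The admission logic of the flow chart can be sketched as follows. Service completions, channel release and the pre-emptive-resume ordering inside Queue-2 are omitted for brevity, so this is an illustrative fragment rather than the full simulation.

from collections import deque

def admit(req: str, free_channels: int, q1: deque, q2: deque,
          n1: int, n2: int) -> str:
    """One admission decision of the flow chart. Serving only when the
    class queue is empty preserves FCFS order; n1 and n2 are the finite
    capacities of the dedicated Queue-1 (HTR only) and the shared
    Queue-2."""
    if req == "HTR":
        if not q1 and free_channels > 0:
            return "served"
        if len(q1) < n1:
            q1.append(req); return "queued in Queue-1"
        if len(q2) < n2:
            q2.append(req); return "queued in Queue-2"  # overflow of Queue-1
        return "dropped"
    # NTR requests use only the shared queue
    if not q2 and free_channels > 0:
        return "served"
    if len(q2) < n2:
        q2.append(req); return "queued in Queue-2"
    return "blocked"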
6 Experimental Setup and Results

The performance of the model has been evaluated by creating different experimental scenarios for the queuing handover schemes with the help of the Java Modelling Tools (JMT) simulator, modeling a situation in which many requests for context-aware transactions (CAT) arise from smartphones while users are moving. JMT is a freely available simulator; it is used here to analyze the performance of our model in terms of dropout rate, blocking rate and channel utilization in mobile communication networks executing context-aware transactions. We have analyzed the effect of the HTR load on the dropping probability, the impact of the HTR load on queue occupancy, the effect of bulk HTR load on channel utilization, the impact of HTR on the blocking probability of NTR, and the impact of traffic overhead on HTR and NTR dropping probabilities. The outcome of the experiments implies that the devised approach reduces the probability of dropping HTR requests during the handover process. The result shown in Fig. 5 implies that an increase in HTR traffic load has less effect on the dropping probability of HTR under the proposed mobility management scheme than under the existing schemes. This reduction in the dropping probability rate shows the effectiveness of the proposed mobility management model: the enhanced handover process reduces the dropping probability of HTR requests. The second performance parameter of the proposed model can be verified with the help of Fig. 6, which indicates that the HTR traffic load is approximately proportional to the mean queue length (MQL) up to a certain point and strictly proportional beyond it. The mean queue length grows once the dedicated buffer (GE/GE/C/N/FCFS) becomes completely full: when the first queue is full, incoming HTR transactions join the second queue (GE/GE/C/N/PR). Therefore, a smaller MQL is obtained for HTR than for NTR due to the combination of Queue-1 and Queue-2. Fig. 5 Handover traffic load versus dropping probability
Fig. 6 HTR traffic load versus mean queue length
Fig. 7 Utilization for HTR and NTR
The channel utilization can be seen in Fig. 7, which shows the effect of the mean arrival rate on channel utilization. The proposed work mainly focuses on the behavior of HTR and NTR, and the result shows that channel utilization increases with the mean arrival rate of HTR and NTR. This increase in shared channel utilization verifies that the proposed enhanced mobility management model is more efficient than the existing mobility management schemes.
7 Conclusion

In mobile communication, ensuring the consistency of context-aware transaction data and the reliability of data-loss-sensitive applications when communication fails due to a user roaming from one area to another is a challenge for researchers. Our model enables the execution of context-aware transactions in an anytime, anywhere environment. The proposed work focuses on HTR to make context-aware mobile communication systems efficient; in particular, we concentrate on reducing the dropping rate of HTR requests. The enhanced model improves performance in terms of HTR drop rate, NTR blocking rate, mean queue length of incoming requests and channel utilization. It could be made even more efficient if HTR and NTR were considered simultaneously.
References

1. Sori, N.: Handoff initiation and performance analysis in CDMA cellular systems. Ph.D. thesis, Addis Ababa University, Ethiopia (2007)
2. Le, H.N.: A transaction processing system for supporting mobile collaborative work. Ph.D. dissertation, Norwegian University of Science and Technology, Trondheim, Norway (2006)
3. Kumar, V.: Mobile Database Systems, 10th edn. University of Missouri-Kansas City, Wiley, Hoboken, New Jersey
4. Guerin, R.: Channel occupancy time distribution in a cellular radio system. IEEE Trans. Veh. Technol. VT-35(3) (1987)
5. Awan, I., Kouvatsos, D.D.: Entropy maximization and open queuing networks with priorities and blocking. Perform. Eval. 51(2–4), 191–227 (2003)
6. Ahmed, A.N.: Characterization of beta, binomial and Poisson distributions. IEEE Trans. Reliab. 40(3), 290–295 (1991)
7. Ross, S.M.: Introduction to Probability Models, 10th edn. Elsevier (2010)
8. Madria, S.K., Bhargava, B.: A transaction model for mobile computing. In: Proceedings of IEEE IDEAS'98, International Database Engineering and Applications Symposium, pp. 92–102 (1998)
9. Younas, M., Awan, I.: Mobility management schemes for context-aware transactions in pervasive and mobile cyberspace. IEEE Trans. Ind. Electron. 60(3), 1108–1115 (2013)
10. Awan, I., Singh, S.: Performance evaluation of e-commerce requests in wireless cellular networks. Inf. Technol. 48(6), 393–401 (2006)
11. Madria, S.K., Baseer, M., Bhowmick, S.S.: A multi-version transaction model to improve data availability in mobile computing. LNCS, vol. 2519, pp. 322–338. Springer, Heidelberg (2002)
12. Lee, M., Helal, S.: HiCoMo: high commit mobile transactions. Distrib. Parallel Databases 11(1), 73–92 (2002)
13. Kumar, V., Prabhu, N., Dunham, M.H., Seydim, A.Y.: TCOT-A timeout-based mobile transaction commitment protocol. IEEE Trans. Comput. 51(10), 1212–1218 (2002)
14. Dunham, M.H., Helal, A., Balakrishnan, S.: A mobile transaction model that captures both the data and movement behavior. Mobile Netw. Appl. 149–162 (1997)
15. Younas, M., Awan, I., Chao, K.M.: Network-centric strategy for mobile transactions. J. Interconnection Netw. 5(3), 329–350 (2004)
16. Awan, I.: Mobility management for m-commerce requests in wireless cellular networks. Inf. Syst. Frontiers 8(4), 285–295 (2006)
17. Akyildiz, I., Ho, J., Wang, W.: Mobility management in next-generation wireless systems. Proc. IEEE 87(8), 1347–1384 (1999)
18. Abdullah, R.M., Zukarnain, Z.A.: Enhanced handover decision algorithm in heterogeneous wireless networks. Sensors 17 (2017)
19. Tai, W.L., Chang, Y.F., Chen, Y.C.: A fast-handover-supported authentication protocol for vehicular ad hoc networks. J. Inf. Hiding Multimed. Signal Process. 7, 960–969 (2016)
20. Chen, Y., Chen, C.: Performance analysis of non-preemptive GE/G/1 priority queuing of LER system with bulk arrival. Comput. Electr. Eng. 35, 764–789 (2009)
21. Nguyen, M.T., Kwon, S., Kim, H.: Mobility robustness optimization for handover failure reduction in LTE small-cell networks. IEEE Trans. Veh. Technol. 67(5), 4672–4676 (2018)
22. Guerin, R.: Queuing blocking system with two arrival streams and guard channels. IEEE Trans. Commun. 36(2) (1988)
23. Jain, J.L., Mohanty, G., Böhm, W.: A Course on Queuing Models. Chapman & Hall/CRC, Taylor & Francis Group, Boca Raton (2007)
24. Cooper, R.B.: Introduction to Queuing Theory, 2nd edn. Elsevier North Holland, Inc. (1981)
25. Tomaras, P.J.: Decomposition of general queuing network models. Ph.D. thesis, University of Bradford (1989)
26. Posner, E.C., Guerin, R.: Traffic policies in cellular radio that minimize blocking of handoff calls. In: 11th International Teletraffic Congress (ITC 11), Kyoto, Japan (1985)
27. Tang, F., Guo, M., Li, M., You, I.: An adaptive context-aware transaction model for mobile and ubiquitous computing. Comput. Inform. 27(5), 785–798 (2008)
28. Fantacci, R.: Performance evaluation of prioritized handoff schemes in mobile cellular networks. IEEE Trans. Veh. Technol. 49(2), 485–493 (2000)
29. Kouvatsos, D.D., Tabet-Aoul, N.: An ME-based approximation for multi-server queues with preemptive priority. Eur. J. Oper. Res. 77(3), 496–515 (1994)
Multi-objective-Based Lion Optimization Algorithm for Relay Selection in Wireless Cooperative Communication Networks Anand Ranjan, O. P. Singh, G. R. Mishra, and Himanshu Katiyar
Abstract In recent years, cooperative relaying has been considered an important technique for mesh networks, wireless sensor networks (WSN), ad hoc networks and cellular networks. In a cooperative communication network, relay nodes are allocated between source and destination to facilitate data transmission, and the selection of the relay node remains one of the main challenges for improving the performance of the cooperative communication network. In this paper, a multi-objective-based lion optimization algorithm (LOA) is presented for multiuser cooperative communication networks. For optimal relay node selection, multi-objective functions based on signal-to-noise ratio (SNR) and channel gain, in addition to power consumption, are defined in this approach. With these objective functions, the LOA algorithm executes and selects an optimal relay node. The simulation results evaluate the efficiency of the developed approach in terms of SNR, network lifetime and energy consumption. Keywords Cooperative communication network · Relay node selection · Multi-objective-based lion optimization algorithm · Power efficiency
1 Introduction

Cooperative communication in wireless communication is a technique in which data is sent to the receiver with the assistance of intermediate relays [1]. In the majority of cases, a relay cannot send and receive data in the same frequency band at the same time. In recent years, cooperative communication has been used as an efficient approach to deal with fading and improve system performance in wireless systems [2]. When a user sends data to another user in a wireless network, nearby users also receive the signal because of its broadcast nature, and these nearby users can function as relays to forward the received data from the source to the destination. Multiple antennas per user are not required to obtain multiple transmission paths or a virtual multiple-input multiple-output system to the destination [3]. Such cooperative networks are known as relay networks. At present, cooperative communication exploits time and spatial diversity gains in wireless networks using the broadcast property of the wireless medium [4]. Relay selection is considered one of the crucial parts of cooperative communication, as it is an effective way to attain considerable performance gains [5]: when the source-destination link has poor quality, one or multiple relays have to be chosen to cooperate with the source so as to guarantee the reliability of the overall transmission. Based on various constraint functions, design issues and channel information assumptions, a variety of relay selection schemes have been proposed by numerous researchers in recent years [6–13]. Selecting a specific relay node from many potential relay nodes is a crucial task in enhancing the performance of cooperative relaying; to ease the data transmission from the source node to the destination node, the relay nodes must be allocated to the relay stations or other networks [14]. There are three primary strategies for cooperative relaying: compress-and-forward (CF), decode-and-forward (DF) and amplify-and-forward (AF). In the AF approach, an amplified form of the previously received signal is transmitted by the relay node to the destination node. In the DF approach, the data is regenerated by the relay node and a new form of the data is forwarded to the destination [15]. In the CF approach, the relay cannot decode the message sent by the source, but it contributes by compressing and forwarding its observation to the destination [16]. An AF approach may outperform a DF approach or vice versa depending on the performance criterion and channel conditions [17], so designing a relay selection scheme based on these criteria is a difficult task. Based on destination feedback, the relay will or will not cooperate with the communication link: when the direct transmission is successful, the feedback indicates that the relay should stay idle; when the feedback indicates unsuccessful direct transmission, the relay retransmits the overheard signal using AF or DF [18]. For cooperative communications, many relay selection approaches use channel gain or signal-to-noise ratio as the particular standard for relay selection and assume that partial or full channel state information exists at the destination, at the
source and at all of the potential relays [19]. To satisfy these criteria and solve some of the issues mentioned above, a new approach for relay selection in cooperative communication networks is presented. In this paper, a multi-objective-based lion optimization algorithm, modeled on the hunting behavior of lions, is used for the selection of optimal relay nodes: a relay node is selected based on objective functions of power and communication cost, that is, the node with minimum power consumption and minimum cost is selected as the best relay node.
2 Problem Statement and Solution

Cooperative communication networks have a significant issue: the cooperation process must make participation attractive to the users of the system. Cooperation requires users to share their own resources, such as power and channel, with others in the network, yet in practice users may have no incentive to join the cooperation pool. In this work, optimal relay nodes are chosen in the multiuser cooperative communication network by considering power consumption, channel gain and SNR. For optimal relay node selection, a multi-objective-based lion optimization algorithm is presented; this algorithm is modeled on the hunting behavior of lions, and a relay node is selected based on objective functions such as power consumption.
3 Optimal Relay Node Selection Using Multi-objective-Based Lion Optimization Algorithm

3.1 System Model

Figure 1 illustrates the system model of the multiuser cooperative communication network. In this model, N users are placed randomly; one user is fixed as the source node, and the base station is considered the destination node. All other nodes (N − 1) can perform as relay nodes. The figure distinguishes two types of communication, direct and cooperative. In direct communication, the source node transfers the data to the destination directly, which degrades the power efficiency of the system. To improve the power efficiency of the system, cooperative communication is chosen, in which a user acting as a relay node supports the transfer of data from source to destination. The received signal at the relay node is processed using the AF protocol.
Fig. 1 System model
Before selecting the optimal relays, the source node forwards a request-to-send (RTS) frame toward the destination through the relay nodes; this frame is used to estimate the channel gain between the source and each relay. After receiving this frame, the destination replies to the source with a clear-to-send (CTS) frame, which is used to estimate the channel gain between each relay and the destination. Through this initial exchange, information about the relays, such as channel gain, SNR and power consumption, is collected, and based on this information the best relay between source and destination is selected. In this approach, the following objective functions are used to select an efficient relay node (RN).

Signal-to-noise ratio (SNR): The SNR (η) of the signal transmitted from the source to the destination can be determined as

η_{S−D} = |g_{S−D}|² / n_D   (1)
where n_D is the noise at the destination, η_{S−D} is the SNR between the source and destination, and g_{S−D} is the channel coefficient between the source and destination. At the first hop, the SNR between the source node and the ith RN is denoted η_{S−RNi} and is defined as

η_{S−RNi} = |g_{S−RNi}|² / n_{RNi}   (2)

where g_{S−RNi} is the channel coefficient between the source and RNi and n_{RNi} is the noise at RNi. At the second hop, g_{RNi−D} is the channel coefficient between RNi and the destination, and α is the factor used to amplify the received signal at RNi. Then,
η_{RNi−D} denotes the SNR of the amplified signal from RNi to the destination and is defined as

η_{RNi−D} = α² |g_{S−RNi}|² |g_{RNi−D}|² / [ n_{RNi} (α² |g_{S−RNi}|² + 1) ]   (3)

Finally, using Eqs. (2) and (3), one of the objective functions is defined: an RNi with the minimum SNR over the two hops is selected as an optimal RNi. This objective function (O1) is defined as

O1 = ηi = min( 1/η_{S−RNi}, 1/η_{RNi−D} )   (4)
Channel gain: The second objective function (O2) selects the optimal RNi by minimizing the channel gain over the two hops:

O2 = gi = min( g_{S−RNi}, g_{RNi−D} )   (5)
Power consumption: Depending on the following third objective function (O3), the RNi with maximum residual power is selected as the optimal RNi. The residual power P_Re of an RNi is calculated as

P_Re(RNi) = P_In(RNi) − P_Con(RNi)   (6)

where P_In(RNi) is the initial power of RNi and P_Con(RNi) is the power consumed by RNi. Using Eq. (6), the third objective function, the power score of RNi, is defined as

O3 = Pi = 1 / max( P_Re(RNi) )   (7)

Depending on these objective functions (4), (5) and (7), the cooperative RN is selected using the lion optimization algorithm (LOA) in the communication network. The selection of the RN using LOA is described in the following section.
3.2 Implementation Steps

The implementation of the proposed LOA approach is described as follows:
• Initially, the multi-objective functions are derived for selecting the optimal relay node: the SNR, channel gain and power consumption of the relay nodes are the objective functions used for relay node selection.
• For optimal relay node selection, the lion optimization algorithm (LOA) is applied: among the available relay nodes between source and destination, an optimal relay node is selected by evaluating the multi-objective functions as a fitness value (a sketch follows below).
• The performance of the developed approach is compared with the existing game-based relay node selection.
• The performance of the developed approach is judged in terms of SNR, energy consumption and network lifetime.
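The evaluation of the objective functions of Eqs. (1)–(7) can be sketched as below. The weighted-sum scalarization and the plain minimization over candidates are simplifications standing in for the LOA search, whose pride/nomad update rules the paper does not detail; LOA would evolve candidate selections against this same fitness. All variable names are illustrative.

def objectives(g_sr: float, g_rd: float, alpha: float,
               n_r: float, p_res: float) -> tuple:
    """Objective values O1-O3 for one candidate relay, following
    Eqs. (2)-(7). g_sr and g_rd are channel coefficients estimated from
    the RTS/CTS exchange, alpha the AF amplification factor, n_r the
    noise power at the relay and p_res its residual power."""
    snr_sr = abs(g_sr) ** 2 / n_r                              # Eq. (2)
    snr_rd = (alpha ** 2 * abs(g_sr) ** 2 * abs(g_rd) ** 2
              / (n_r * (alpha ** 2 * abs(g_sr) ** 2 + 1)))     # Eq. (3)
    o1 = min(1 / snr_sr, 1 / snr_rd)                           # Eq. (4)
    o2 = min(abs(g_sr), abs(g_rd))                             # Eq. (5)
    o3 = 1 / p_res        # Eq. (7): favours maximum residual power
    return o1, o2, o3

def select_relay(candidates, weights=(1.0, 1.0, 1.0)):
    """Pick the candidate minimizing a weighted sum of O1-O3. The equal
    default weights and the exhaustive minimization are illustrative
    stand-ins for the LOA search over candidate selections."""
    def fitness(c):
        return sum(w * o for w, o in zip(weights, objectives(*c)))
    return min(candidates, key=fitness)

# hypothetical candidates: (g_sr, g_rd, alpha, n_r, p_res) per relay
best = select_relay([(0.9, 0.7, 2.0, 0.01, 0.45),
                     (0.5, 0.8, 2.0, 0.01, 0.30)])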
4 Results and Discussions

The proposed optimal relay node selection using the multi-objective-based lion optimization algorithm is implemented in MATLAB, and the experimentation is done with the help of a computer with a common configuration. Table 1 shows the simulation parameters and the assumptions used in this model. For the simulation of the proposed approach, 200 mobile nodes are used, of which 198 are considered RNs while the remaining two are the source and destination. Among these RNs, the optimal RN is selected using the multi-objective-based LOA algorithm. The initial energy of the source node is 0.3 J, and the initial energy of the RNs is uniformly assigned in [0.1 J, 0.5 J]. A Rayleigh fading channel with 1 MHz bandwidth is considered. In addition, the receiver utilizes an additive white Gaussian noise (AWGN) model, and amplify-and-forward (AF) is used as the relaying protocol.

4.1 Performance Analysis

This section evaluates the performance of the developed approach in terms of SNR, network lifetime and energy consumption. Then, the relay
4.1 Performance Analysis This section shows that the performance of the developed approach are evaluated in terms of SNR, lifetime of the network and energy consumption. Then, the relay Table 1 Summary of the performance of ML algorithms
Parameters
Assumptions
Number of nodes
200
Initial energy of source
0.3 J
Initial energy of RNs
[0.1 J, 0.5 J]
Channel model
Rayleigh fading channel
Bandwidth of channel
1 MHz
Relaying protocol
Amplify-and-forward (AF)
Receiver noise
Additive white Gaussian noise
Data rate
100–500 kb/s
Fig. 2 Trade-off between distance of source/RN link and SNR of source/RN link and RN/destination link
selection based on the game approach [20] is compared with the performance of the developed approach. The performance metrics of the developed approach are analyzed by varying the distance between source and RN and by varying the data rate.

Performance for varying distance between source and RN: The SNR between source and RN (η_{S−RNi}) and the SNR between RN and destination are analyzed by varying the distance between source and RNs, as shown in Fig. 2. The SNR of the source-RN link and the distance between RN and destination decrease as the distance between source and RN increases, while the SNR of the RN-destination link increases with the source-RN distance. In the figure, the optimal RN is located at the intersection point of the curves, approximately 1450 m from the source.

Performance for varying data rate: Figures 3 and 4 compare the network lifetime and energy consumption of the proposed relay selection algorithm with those of the existing relay selection algorithm for varying data rates. In the proposed approach, the optimal RN is the one satisfying the multi-objective condition, so the selected RN can stay alive in the network for a long time; thus, the network lifetime of the developed approach is increased by 51% compared with the existing approach across data rates, as shown in Fig. 3. Figure 4 compares the energy consumption of the developed approach and the existing approach [20] for varying data rates: energy consumption increases with the data rate, but compared to the existing approach, the energy consumption of the proposed approach is reduced by 18%. Since one of the objective functions for selecting the optimal RN is the power consumption of the RN, a node satisfying this condition is selected as the optimal RN, improving the energy efficiency of the network.
Fig. 3 Comparison of network lifetime for varying data rates with the existing approach [20]
Fig. 4 Comparison of energy consumption for varying data rates with the existing approach [20]
5 Conclusion
The power efficiency of the cooperative communication network has been improved by selecting the optimal relay node using the multi-objective-based lion optimization algorithm. An optimal relay node between source and destination has been selected by evaluating the SNR, channel gain and power consumption of the relay nodes. The existing method, in which relay node selection is based on a game approach, is compared with the developed approach. In addition, the performance of the developed approach has been evaluated with respect to SNR, network lifetime and energy consumption. The simulation results over various data rates show a significant reduction in energy consumption (18%) as well as a substantial increase (51%) in network lifetime of the developed approach compared with the existing approach [20].
References
1. Liang, X., Min, C., Balasingham, I., Leung, V.C.M.: Cooperative communications with relay selection for wireless networks: design issues and applications. Wirel. Commun. Mob. Comput. 13(8), 745–759 (2013)
2. Guo, L., Zhu, Z., Lau, F.C.M., Zhao, Y., Yu, H.: SSCSMA-based random relay selection scheme for large-scale relay networks. Comput. Commun. 127, 13–19 (2018)
3. Liang, X., Chen, M., Leung, V.C.M.: Relay assignment for cooperative communications over distributed wireless networks: a game-theoretic approach. Technical Report: DCG12 (2010)
4. Shah, V.K., Gharge, A.P.: A review on relay selection techniques in cooperative communication. Int. J. Eng. Innov. Technol. (IJEIT) 2(4), 65–69 (2012)
5. Xu, F., Lau, F.C.M., Zhou, Q.F., Yue, D.-W.: Outage performance of cooperative communication systems using opportunistic relaying and selection combining receiver. IEEE Sig. Process. Lett. 16(2), 113–136 (2009)
6. Wang, B., Han, Z., Ray Liu, K.J.: Distributed relay selection and power control for multiuser cooperative communication networks using Stackelberg game. IEEE Trans. Mob. Comput. 8(7), 975–990 (2009)
7. Geng, K., Gao, Q., Fei, L., Xiong, H.: Relay selection in cooperative communication systems over continuous time-varying fading channel. Chin. J. Aeronaut. 30(1), 391–398 (2017)
8. Sousa, I., Queluz, M.P., Rodrigues, A.: A smart scheme for relay selection in cooperative wireless communication systems. EURASIP J. Wirel. Commun. Netw. 2013(1), 146 (2013)
9. Nam, S., Vu, M., Tarokh, V.: Relay selection methods for wireless cooperative communications. In: 2008 42nd Annual Conference on Information Sciences and Systems, pp. 859–864. IEEE (2008)
10. Nazir, M., Rajatheva, N.: Relay selection techniques in cooperative communication using game theory. In: 2010 2nd International Conference on Computational Intelligence, Communication Systems and Networks, pp. 130–136. IEEE (2010)
11. Ahn, Y.-S., Choi, S.-B., Jeon, J., Song, H.-K.: Adaptive relay selection and data transmission scheme for cooperative communication. Wirel. Pers. Commun. 91(1), 267–276 (2016)
12. Ding, W., Fei, L., Gao, Q., Liu, S.: Relay selection based on MAP estimation for cooperative communication with outdated channel state information. Chin. J. Aeronaut. 26(3), 661–667 (2013)
13. Bouallegue, T., Sethom, K.: New threshold-based relay selection algorithm in dual hop cooperative network. Procedia Comput. Sci. 109, 273–280 (2017)
14. Lee, J., Rim, M., Kim, K.: On the outage performance of selection amplify-and-forward relaying scheme. IEEE Commun. Lett. 18(3), 423–426 (2014)
15. Bai, Z.Q., Jia, J., Wang, C.X., Yuan, D.F.: Performance analysis of SNR-based incremental hybrid decode-amplify-forward cooperative relaying protocol. IEEE Trans. Commun. 63(6), 2094–2106 (2015)
16. Raja, A., Viswanath, P.: Compress-and-forward scheme for relay networks: backward decoding and connection to bisubmodular flows. IEEE Trans. Inf. Theory 60(9), 5627–5638 (2014)
17. Li, Y., Louie, R.H.Y., Vucetic, B.: Relay selection with network coding in two-way relay channels. IEEE Trans. Veh. Technol. 59(9), 4489–4499 (2010)
18. Cho, S.R., Choi, W., Huang, K.: QoS provisioning relay selection in random relay networks. IEEE Trans. Veh. Technol. 60(6), 2680–2689 (2011)
19. Ouyang, F., Ge, J., Gong, F., Hou, J.: Random access based blind relay selection in large-scale relay networks. IEEE Commun. Lett. 19(2), 255–258 (2015)
20. Saghezchi, F.B., Radwan, A., Rodriguez, J.: Energy-aware relay selection in cooperative wireless networks: an assignment game approach. Ad Hoc Netw. 56, 96–108 (2017)
Intelligent Computing Techniques
EmbPred30: Assessing 30-Days Readmission for Diabetic Patients Using Categorical Embeddings Sarthak, Shikhar Shukla, and Surya Prakash Tripathi
Abstract Hospital readmission is a crucial healthcare quality measure that helps in determining the level of quality of care that a hospital offers to a patient and has proven to be immensely expensive. It is estimated that more than $25 billion are spent yearly due to the readmission of diabetic patients in the USA. This paper benchmarks existing models and proposes a new embedding-based state-of-the-art deep neural network (DNN). The model can identify whether a hospitalized diabetic patient will be readmitted within 30 days or not with an accuracy of 95.2% and an Area Under the Receiver Operating Characteristic curve (AUROC) of 97.4% on data collected from 130 US hospitals between 1999 and 2008. The results also indicate that patients who had changes in medication while admitted have a high chance of getting readmitted. Identifying prospective patients for readmission could help hospital systems improve their inpatient care, thereby saving them from unnecessary expenditures. Keywords Hospital readmission · Health care · 30-day readmission · Deep neural network · Diabetes
1 Introduction
Diabetes is a disease causing high levels of blood sugar. In type 1 diabetes, the body does not produce insulin but can use it if injected from an external source; in type 2 diabetes, the body neither produces nor properly uses insulin. It is estimated that 30.3 million
Sarthak (B)
Analytics Quotient, Bangalore, India
e-mail: [email protected]; [email protected]
S. Shukla
Samsung R&D Institute India-Bangalore, Bangalore, India
e-mail: [email protected]; [email protected]
S. Prakash Tripathi
Institute of Engineering and Technology, Lucknow, India
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_7
people of all ages in the US are suffering from diabetes as of 2015, out of which 7.2 million are unaware [10]. As of 2016, diabetes is ranked seventh in the list of global causes of mortality. Diabetes can be an underlying cause of many cardiovascular diseases, retinopathy, and nephropathy, leading to frequent readmission to the hospital. The Centers for Medicare and Medicaid Services (CMS) adopted the 30-day readmission rate as a measure of the healthcare quality offered by a hospital, in order to promote the best inpatient care and improve healthcare quality. Hospitals with high readmission rates are penalized as per the Patient Protection and Affordable Care Act (ACA) of 2010 [18]. Recent studies [16] observed that the 30-day readmission rate for patients with diabetes ranges between 14.4 and 22.7%, which is significantly higher than the rate for all 30-day readmitted patients (8.5–13.5%). In 2012, expenditure incurred due to hospital admissions amounted to $124 billion, of which $25 billion were spent on readmitted patients [19]. The main purpose of this research is to help healthcare institutions predict the readmission of a diabetic patient by allowing the model to learn the relations among features and their importance in determining whether the patient will be readmitted or not. This helps hospitals provide the best inpatient treatment and improves the cost efficiency of healthcare centers. At the same time, it is important to identify the key factors responsible for the readmission of a diabetic patient. The structure of the work is as follows. In Sect. 2, we discuss the previous related studies in this field. Section 3 deals with the motivation behind choosing the current techniques and model architecture for predicting 30-day readmission, the dataset description and the data preprocessing. In Sect. 4, we present the experimental results. Finally, in Sect. 5, we discuss the conclusions drawn from the experiment and future work.
2 Related Work
Over the years, several machine learning and deep learning models have been developed in an effort to reduce hospital readmission rates. The LACE index (Length of stay, Acuity of admission, Charlson comorbidity index and Emergency visits) was the most preferred model due to its ease of implementation [4, 12, 14]. However, due to an unbalanced dataset, these models achieved low c-statistics or ROC scores, ranging from 56 to 63%. Munnangi et al. [15] used stepwise regression, forward regression, LARS [13] (more commonly known as least angle regression) and LASSO [22] for feature selection and trained several models based on logistic regression, decision trees, gradient boosting, and SVMs. Their research showed that the SVM outperformed all other models with an ROC of 63.3%. Rubin et al. [19] proposed the Diabetes Early Readmission Risk Index (DERRI), a multivariate logistic regression with an ROC of 72.0%. DERRI was trained on 13 statistically significant features selected from 43 features and first proposed that the HbA1c level has little relation to 30-day readmission. According to this research, lower socio-economic status, racial/ethnic
minority status, comorbidity burden, public insurance, emergent or urgent admission, and a history of recent prior hospitalization are some of the important factors responsible for the 30-day readmission of patients. Mingle [14] also demonstrated that while HbA1c levels are important, they might not be crucial in predicting the readmission of a patient. He built a different model for each of the age groups [0–30), [30–70) and [70–99). An ensemble model comprising extreme gradient boosted trees, gradient boosted greedy trees, and extra-trees classifiers [6], designed for the age group [0–30), achieved an accuracy of 84.8%. For the age group [30–70), an ensemble model containing random forest using the Gini function and gradient and extreme gradient boosted trees with early stopping achieved an accuracy of 78.5%. For the age group [70–99), an ensemble model containing extra-trees classifiers [6] and extreme gradient boosted trees with early stopping achieved an average accuracy of 68.5%. The average accuracy across the three models comes out to be 77.2%. Hammoudeh et al. [8] designed a model using convolutional neural networks (CNN), which was trained after data preprocessing, feature selection, and feature engineering. SMOTE [2] was used to tackle the problem of an unbalanced dataset by generating random data of patients readmitted within 30 days. The model achieved an accuracy of 92% with an ROC of 95%. That research points out the need for data pruning techniques such as removing duplicate patient data and outliers in the dataset. Goudjerkan and Jayabalan [7] benchmark several existing models employing LACE, machine learning, and deep learning, and compile the best of every research into a single multilayer perceptron model consisting of two hidden layers with dropout [20] to achieve high overall accuracy and ROC. Their research makes use of comprehensive data cleaning, data reduction, and data transformation techniques; the random forest algorithm is used for feature selection and the SMOTE [2] algorithm for data balancing. The model reports a k-fold cross-validation accuracy and an ROC of 95%. In Table 1, we summarize the quantitative results of the above previous studies. The table lists the model basis, whether the dataset was augmented (Data Aug.), and the AUROC. The results of our proposed model are also included (at the top of the table).
3 Proposed Method

3.1 Dataset

This research uses the dataset obtained from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine [5], covering 10 years (1999–2008) of diabetes patient data gathered from 130 US hospitals, with 70,000 distinct patients [21] and over 100,000 records. Every record was labeled as to whether the patient was readmitted within 30 days, readmitted after 30 days, or not readmitted at all.
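For orientation, the records can be loaded and the binary target derived as below. This is an illustrative sketch only; diabetic_data.csv and the readmitted column are the names used in the public UCI distribution of this dataset, not names specified by this paper.

```python
import pandas as pd

# Load the UCI "Diabetes 130-US hospitals" encounters
df = pd.read_csv("diabetic_data.csv")

# The original target has three levels: "<30", ">30" and "NO";
# the paper's binary task is readmission within 30 days vs. not
df["readmit_30d"] = (df["readmitted"] == "<30").astype(int)

print(df.shape)                    # roughly 100,000 encounters, 50 columns
print(df["readmit_30d"].mean())    # about 11% positive class
```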
Table 1 Quantitative review of previous studies along with our results

| Year | Model basis | Data Aug. | AUROC | Remarks | References |
|---|---|---|---|---|---|
| 2019 | Categorical embeddings and neural networks | Yes | 97.4 | Embeddings for categorical features concatenated with normalized continuous features were fed into a neural network | Self |
| 2019 | Multilayer perceptron | Yes | 95.0 | Extensive data augmentation and feature engineering were used to train a multilayer perceptron network | [7] |
| 2019 | Ensemble of logistic regression, decision tree, neural network and augmented naive Bayes network | No | 63.5a | An ensemble of models was used to improve model AUROC and sensitivity on an unbalanced dataset | [17] |
| 2018 | Random forest | No | 94.0 | After feature selection and feature engineering, a random forest model was trained using the Gini function | [11] |
| 2018 | Convolutional neural network | Yes | 95.0 | Feature selection and feature engineering coupled with convolutional and linear layers were used | [8] |
| 2017 | Recurrent neural network | No | 81.1 | Proposed a recurrent neural network (RNN) that performed better than logistic regression, SVMs and multilayer perceptrons on an unbalanced dataset | [3] |
| 2016 | Ensemble model using extreme gradient boosted trees with early stopping, SVM, random forest and extra-trees classifier | No | 77.2a | Built three different ensemble machine learning models for three age groups, 0–30, 30–70 and 70–100 | [14] |
| 2016 | Neural networks | No | 23.0b | A single hidden layer was used for prediction based on features selected and features engineered | [1] |
| 2015 | Multivariate logistic regression | No | 72.0 | Feature selection and feature engineering were used to train a logistic regression model | [19] |
| 2015 | SVM | No | 63.3 | Optimal parameters estimated using LARS [13] were used in training an SVM | [15] |
a Accuracy; b Area under the precision-recall curve. All the results are on the UCI [5] dataset.
3.2 Data Preprocessing
(i) Filling missing values: The missing values in the categorical columns are replaced by 'nan', which is treated as another category. For filling the missing data in continuous-valued columns, we took the median of those columns, since it is the best representative of the distribution of the data.
(ii) Removing Outliers and Inconsistencies: As suggested in recent research [11], it is important to maintain a single record for every patient ID. Following this research, only the first encounter with the patient is kept, and the rest of the patient's records are dropped. As the dataset description [5] suggests, the data should contain records of those patients who were diagnosed with diabetes in at least one of the three diagnoses done by hospital staff. If none of the columns diag_1, diag_2 and diag_3 (which represent three different diagnoses done by the hospital) contained the diabetes ICD9 code, which is of the format 250.xx, then those records were dropped. Also, if a patient died or was referred to end-of-life care, the patient cannot be readmitted, and such records were deleted from the dataset.
(iii) Feature Encoding: The dataset contains three classes, with 11.2% of the 70,000 patients readmitted within 30 days, 34.9% readmitted after 30 days, and 53.9% not readmitted at all. Since we have to predict whether a patient will be readmitted within 30 days or not, dropping patients readmitted after 30 days would result in the loss of one-third of the data. Therefore, we define two classes, Yes and No, where Yes indicates that the patient will be readmitted within 30 days and No otherwise. Levels of HbA1c and glucose serum test results are also relabeled into
three categories, namely "normal" (Normal), "abnormal" (values >7, >8 for HbA1c, and >300, >200 for the glucose serum test) and "not tested" (None). All 23 columns related to the amount of medicines administered to an admitted patient are also relabeled into two categories, Yes or No: if the level of a medicine is "Steady" or "None", it is labeled as No, and Yes if the level is "Up" or "Down".
(iv) Data Synthesis: To deal with the imbalance in the original dataset, which has 88.8% negative samples, we generated data through the Synthetic Minority Oversampling Technique (SMOTE) [2]. This statistical technique takes neighboring samples from the minority target class and generates new samples whose features are a combination of the original samples' feature space. After using this technique, our synthetic dataset has a 1:1 ratio of samples from both target classes.
(v) Normalizing continuous variables: Many columns, such as "number_inpatient", "number_outpatient" and "number_emergency", were highly skewed with high kurtosis. To reduce the skewness, a log(x+1) transformation was used. The features in the dataset vary in scale, unit, and range; due to this, features with a broad range can have a disproportionate impact on the prediction. To overcome this problem, the data are normalized so that each feature has a mean of 0 and a standard deviation of 1. (A minimal sketch of steps (iv) and (v) is given after this list.)
(vi) Feature Engineering:
• Feature Synthesis: In previous research [7, 8, 11], two new features were created, service_utilization and count_meds, which proved influential in determining whether the patient will be readmitted within 30 days or not. Service utilization is the sum of the number of times a patient has used the services of the hospital; it is calculated by summing the number of inpatient visits, the number of outpatient visits, and the number of emergency visits. Count_Meds is a feature counting the number of medication changes that happened while a patient was admitted; it was computed by counting the number of "Yes" values in the 23 medication columns.
• Feature Selection: Out of the 49 columns present in the dataset, columns that had more than 50% of their values missing, such as weight, were dropped. Two columns, examide and citoglipton, had a cardinality of 1, the only value being No; as a result, these two columns were also dropped. The importance of the rest of the columns was analyzed after training our model for 70 epochs, and columns with an importance of 0 were dropped. One of the engineered features, Count_Meds, was determined to be the most important feature among all. In Fig. 1 we present the importance of the final features fed into our model.
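Steps (iv) and (v) can be sketched as follows. This is a hedged illustration using scikit-learn and imbalanced-learn, not the authors' code; X is the encoded feature matrix (a NumPy array), y the binary readmission label, and skewed_cols the indices of the visit-count columns.

```python
import numpy as np
from imblearn.over_sampling import SMOTE           # SMOTE [2]
from sklearn.preprocessing import StandardScaler

def balance_and_normalize(X, y, skewed_cols):
    X = X.astype(float).copy()
    # (v) reduce skew of visit-count columns with a log(x + 1) transform
    X[:, skewed_cols] = np.log1p(X[:, skewed_cols])

    # (iv) oversample the minority class to a 1:1 ratio with SMOTE
    X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

    # (v) standardize each feature to zero mean and unit variance
    return StandardScaler().fit_transform(X_bal), y_bal
```

In practice the oversampling would be applied only to the training split so that evaluation reflects the original class distribution.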
Fig. 1 Input features of our model
3.3 Model Details
Categorical variables are passed through an embedding layer. This layer converts each categorical value to a float vector with a dimension given by the formula [9]:

$$\text{Embedding dimension} = \left[\, 1.6 \cdot n^{0.56} \right], \tag{1}$$
where n is the cardinality of the column. For each categorical column, the embedding matrix has rows corresponding to unique categorical values in the dataset. After this, dropout is applied to the values obtained from the previous operation. The continuous variables are passed through a batch normalization layer. The results from both the continuous and categorical variables are then concatenated and fed to a feedforward network whose architecture is described in Table 2.
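Equation (1) and the layout of Table 2 can be written compactly in PyTorch. The sketch below is our reading of the architecture, not the authors' released code; the hidden width of 512 and the dropout rates follow Table 2, the helper emb_dim implements Eq. (1) with the bracket taken as rounding, and the list of column cardinalities is supplied by the caller.

```python
import torch
import torch.nn as nn

def emb_dim(n_cat: int) -> int:
    """Embedding dimension rule of Eq. (1): [1.6 * n**0.56], bracket taken as rounding."""
    return round(1.6 * n_cat ** 0.56)

class EmbPred30(nn.Module):
    def __init__(self, cardinalities, n_cont=13, hidden=512, n_classes=2):
        super().__init__()
        # one embedding matrix per categorical column (race, gender, diag_1, ...)
        self.embeds = nn.ModuleList(
            [nn.Embedding(n, emb_dim(n)) for n in cardinalities])
        self.emb_drop = nn.Dropout(0.05)
        self.bn_cont = nn.BatchNorm1d(n_cont)
        n_emb = sum(e.embedding_dim for e in self.embeds)
        self.head = nn.Sequential(
            nn.Linear(n_emb + n_cont, hidden), nn.ReLU(),
            nn.BatchNorm1d(hidden), nn.Dropout(0.15),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.BatchNorm1d(hidden), nn.Dropout(0.15),
            nn.Linear(hidden, n_classes))

    def forward(self, x_cat, x_cont):
        # embed each categorical column, apply dropout, and concatenate
        # with batch-normalized continuous features before the feedforward head
        x = torch.cat([e(x_cat[:, i]) for i, e in enumerate(self.embeds)], dim=1)
        x = torch.cat([self.emb_drop(x), self.bn_cont(x_cont)], dim=1)
        return self.head(x)
```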
4 Results and Discussion
This paper's approach achieves a state-of-the-art result on the UCI dataset [5] in determining whether a patient will be readmitted within 30 days or not. In Fig. 2, we plot the ROC curve for the two classes in our dataset, and in Fig. 3, we present the confusion matrix of our trained model. Overall, the model's accuracy is 95.2% with a standard deviation of 0.34%, and the ROC is 97.4% with a standard deviation of 0.42%, after evaluating it using k-fold cross-validation (k = 6).
Table 2 Architecture of the embedding model

| Layer name | Output features | Momentum/eps | No. of parameters |
|---|---|---|---|
| (Embedding layer) Race | (1, 4) | | 24 |
| Gender | (1, 3) | | 9 |
| Age | (1, 6) | | 66 |
| payer_code | (1, 8) | | 144 |
| medical_specialty | (1, 17) | | 1,207 |
| diag_1 | (1, 62) | | 42,532 |
| diag_2 | (1, 63) | | 43,911 |
| diag_3 | (1, 65) | | 47,840 |
| max_glu_serum | (1, 3) | | 12 |
| A1Cresult | (1, 3) | | 12 |
| Metformin | (1, 3) | | 9 |
| Nateglinide | (1, 3) | | 9 |
| Chlorpropamide | (1, 3) | | 9 |
| Glimepiride | (1, 3) | | 9 |
| Pioglitazone | (1, 3) | | 9 |
| Rosiglitazone | (1, 3) | | 9 |
| Acarbose | (1, 3) | | 9 |
| Insulin | (1, 3) | | 9 |
| Change | (1, 3) | | 9 |
| DiabetesMed | (1, 3) | | 9 |
| (Embedding dropout) Dropout(0.05) | (283) | | 283 |
| (Batch normalization for continuous variables) BatchNorm1d | (13) | 0.1 / 1e−05 | 13 |
| (Sequential) Linear | (512) | | 144,896 |
| RELU | (512) | | 512 |
| BatchNorm1d | (512) | 0.1 / 1e−05 | 512 |
| Dropout(0.15) | (512) | | 512 |
| Linear | (512) | | 262,144 |
| RELU | (512) | | 512 |
| BatchNorm1d | (512) | 0.1 / 1e−05 | 512 |
| Dropout(0.15) | (512) | | 512 |
| Linear | (2) | | 1,024 |
| Total parameters | | | 547,279 |
Fig. 2 ROC curve
Fig. 3 Confusion matrix
5 Conclusion and Future Scope
The original dataset was severely imbalanced; the availability of data with more samples of the underrepresented class could help mitigate this problem and would provide more realistic data points than the ones generated by the SMOTE [2] algorithm. Furthermore, further analysis of the embedding matrices could help interpret and visualize the distinguishing features, and the effect of varying the dimensions of each embedding is another potential area of study. This research proposes a novel approach of generating embeddings for categorical features such as the ICD9 codes used while diagnosing the patient. To exceed the ROC score of the best existing model, we combined our approach with the best of previous studies to obtain a state-of-the-art ROC of 97.4% and an accuracy of 95.2%. We found that changes in the medications administered to a patient play a vital role in determining whether the patient will be readmitted to a hospital within 30 days or not.
References
1. Bhuvan, M.S., et al.: Identifying diabetic patients with high risk of readmission. arXiv:1602.04257 (2016)
2. Bowyer, K.W., et al.: SMOTE: synthetic minority over-sampling technique. CoRR abs/1106.1813. http://arxiv.org/abs/1106.1813 (2011)
3. Chopra, C., et al.: Recurrent neural networks with non-sequential data to predict hospital readmission of diabetic patients, pp. 18–23, Oct 2017. https://doi.org/10.1145/3155077.3155081
4. Damery, S., Combes, G.: Evaluating the predictive strength of the LACE index in identifying patients at high risk of hospital readmission following an inpatient episode: a retrospective cohort study. BMJ Open 7(7) (2017). ISSN: 2044-6055. https://doi.org/10.1136/bmjopen-2017-016921
5. Diabetes 130-US hospitals for years 1999–2008 data set. https://archive.ics.uci.edu/ml/datasets/diabetes+130-us+hospitals+for+years+1999-2008 (2008)
6. Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Machine Learning 63(1), 3–42 (2006). ISSN: 1573-0565. https://doi.org/10.1007/s10994-006-6226-1
7. Goudjerkan, T., Jayabalan, M.: Predicting 30-day hospital readmission for diabetes patients using multilayer perceptron. Int. J. Adv. Comput. Sci. Appl. 10, 268–275 (2019). https://doi.org/10.14569/IJACSA.2019.0100236
8. Hammoudeh, A., et al.: Predicting hospital readmission among diabetics using deep learning (2018)
9. Howard, J., et al.: fastai. https://github.com/fastai/fastai (2018)
10. International Diabetes Federation: IDF Diabetes Atlas, 8th edn. International Diabetes Federation, Brussels, Belgium (2017)
11. Lin, C.Y.: What are predictors of medication change and hospital readmission in diabetic patients? (2018)
12. Low, L., et al.: Predicting 30-day readmissions: performance of the LACE index compared with a regression model among general medicine patients in Singapore. BioMed Research International 2015, 169870 (2015). https://doi.org/10.1155/2015/169870
13. Mander, A.: LARS: Stata module to perform least angle regression. Statistical Software Components, Boston College Department of Economics (2006)
14. Mingle, D.: Predicting diabetic readmission rates: moving beyond HbA1c. Curr. Trends Biomed. Eng. Biosci. 7(3), 555707 (2015). https://doi.org/10.19080/CTBEB.2017.07.555715
15. Munnangi, H., Chakraborty, G.: Predicting readmission of diabetic patients using the high-performance Support Vector Machine algorithm of SAS Enterprise Miner (2015)
16. Ostling, S.: The relationship between diabetes mellitus and 30-day readmission rates. Clin. Diab. Endocrinol. 3(1), 3 (2017). https://doi.org/10.1186/s40842-016-0040-x
17. Pham, H.N., et al.: Predicting hospital readmission patterns of diabetic patients using ensemble model and cluster analysis. In: 2019 International Conference on System Science and Engineering (ICSSE), pp. 273–278 (2019). https://doi.org/10.1109/ICSSE.2019.8823441
18. Readmissions Reduction Program. https://www.cms.gov/Medicare/medicare-fee-for-service-payment/acuteinpatientPPS/readmissions-reduction-program.html
19. Rubin, D.J.: Correction to: Hospital readmission of patients with diabetes. Curr. Diab. Rep. 18(4), 21 (2018). ISSN: 1539-0829. https://doi.org/10.1007/s11892-018-0989-1
20. Srivastava, N., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
21. Strack, B., et al.: Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records. BioMed Research International 2014, 781670 (2014). https://doi.org/10.1155/2014/781670
22. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1994)
Prediction of Different Classes of Skin Disease Using Machine Learning Techniques Anurag Kumar Verma, Saurabh Pal, and Surjeet Kumar
Abstract Skin diseases are very common nowadays and are spreading widely among people. With the growth of computer-based technology and the relevance of different machine learning methods in the current decade, skin disease prediction using classifier methods has become analytical and exact. Therefore, the development of data mining techniques that can efficiently distinguish classes of skin disease is important. In this research paper, a new method was developed using four types of classification methods. We use a skin disease dataset for analyzing various machine learning algorithms to classify the different classes of skin disease. The proposed data mining techniques were checked on skin disease datasets for analyzing six types of skin disease: psoriasis, seborrheic dermatitis, lichen planus, pityriasis rosea, chronic dermatitis, and pityriasis rubra. The results of the base learners used in this paper are more accurate than the results obtained by previous studies. Keywords Health care · Skin disease · MLP · RF · KNC
A. K. Verma MCA Department, VBS Purvanchal University, Jaunpur, UP, India e-mail: [email protected] S. Pal (B) · S. Kumar Department of Computer Applications, VBS Purvanchal University, Jaunpur, UP, India e-mail: [email protected] S. Kumar e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_8
1 Introduction
The skin is one of the most important parts of the human body. Skin protects our body from heat, cold, dust particles, wounds, germs, and harmful radiation from the sun, and also generates vitamin D for our body. The skin also controls body temperature; therefore, skin is significant for maintaining good health, and the body must be protected from various types of skin disease. The fast development of computer technology in recent decades and the use of data mining technology play a crucial role in the analysis of skin diseases. Researchers are constantly developing various prediction methods, but most researchers use only a few classification algorithms instead of ensemble methods. The ensemble method uses different data mining techniques and combines them to find predictions. In this research paper, we have used four types of data mining algorithms to evaluate their performance on skin disease. The data mining algorithms used for evaluation are (i) decision tree classifier (DT), (ii) K neighbors classifier (KNC), (iii) multilayer perceptron classifier (MLPC), and (iv) random forest classifier (RFC). These four classifiers are of different types, i.e., heterogeneous: the decision tree and random forest make decisions based on a tree structure, the multilayer perceptron is a neural network-based classifier, and the K neighbors classifier is based on distance-based classification. Heterogeneous data mining algorithms are used in this study so that the predictions of each algorithm represent different types of data mining techniques. These days, feature selection techniques are also applied to enhance the performance of prediction. We know that if we put garbage in, we get garbage out; i.e., if we give non-significant attributes to a data mining algorithm, the performance of the algorithm will not be good. Therefore, it is necessary to select only those attributes which play an important role in prediction. There are various feature selection techniques which can be used to find the important attributes.
2 Related Work
Ramya and Rajeshkumar [1] discussed the GLCM method for obtaining attributes from segmented disease regions and classifying skin disease using fuzzy logic, and proved that the GLCM method is more accurate than others. Verma et al. [2] applied six different base learner classifiers to examine the performance of predicting erythemato-squamous disease. The authors applied three ensemble methods to improve the accuracy of prediction. The feature importance feature selection technique was applied to choose the most effective attributes of the skin disease data, forming a dataset of important attributes; an enhanced accuracy of 99.68% was achieved with the GB ensemble method using the RNC base learner.
Ahmed et al. [3] used clusters of preprocessed data, created with k-means clustering, to find groups of related and non-related data on skin disease. The MAFIA algorithm was used for examining frequent patterns, and the authors extracted frequent patterns from the clustered datasets using decision tree and AprioriTid algorithms. Vijaya [4] discussed skin disease and its classification by applying an SVM classifier to predict skin disease classes accurately; texture and chrominance attributes were selected from the preprocessed datasets used for testing and training. Chang and Chen [5] focused on a decision tree classifier jointly with neural network classifier techniques to build the most predictive model for skin disease. All classifiers used to predict the disease are accurate; the highest accuracy of 92.62% was achieved by the neural network model. In the study of Verma et al. [6], the different machine learning methods used to predict skin diseases are classification and regression tree (CART), random forest (RF), decision tree (DT), support vector machine (SVM), and gradient boosting decision tree (GBDT). The best accuracy was found to be 95.90%, for GBDT. Then, five data mining techniques were combined using an integrated method to achieve a maximum accuracy of 98.64%. Chaurasia and Pal [7] established models built on linear discriminant analysis, logistic regression, classification and regression trees, k nearest neighbors, support vector machines, and Gaussian Bayes to predict skin diseases. After that, various integrated data mining classifiers such as gradient boosting, AdaBoost, extra tree, and random forest were used to improve the performance of the model; the authors obtained a maximum accuracy of 98.64% with gradient boosting. Fernando et al. [8] developed DOCAID to predict various diseases like typhoid fever, malaria, tuberculosis, gastroenteritis, and jaundice from records of patients' conditions and symptoms by applying Gaussian Naive Bayesian data mining techniques; they found an accuracy of 91%. Theodoraki et al. [9] designed a new model for predicting the final outcome of a severely injured patient who has met with an accident. The examination contains an assessment of machine learning techniques using classification, clustering, and association rule mining; they reported the results in terms of sensitivity, specificity, false positive rate, and false negative rate and compared the results between these classifier models. Sharma et al. [10] used support vector machine and artificial neural network techniques to analyze a variety of skin diseases. The authors applied a CWV scheme for merging the two methods, obtaining best accuracies of 99.25% in training and 98.99% in testing. Rambhajani et al. [11] applied a Bayesian classifier to examine the skin disease dataset. They applied the best first search attribute selection technique to remove 20 non-important attributes from the original dataset obtained from the UCI Irvine repository, and used the Bayesian method to obtain an accuracy of 99.31%. Verma and Pal [12] applied a new comparison of three different feature selection techniques, namely univariate, feature importance, and correlation matrix with heat map, to improve the results obtained by previous studies. The 15 most important attributes were selected and applied to skin disease, and
the highest accuracy of 99.86% was obtained with the correlation matrix with heat map feature selection technique. Yadav and Pal [13] discussed the prediction of thyroid disease using bagging, AdaBoostM1, stacking, and voting ensemble techniques; the accuracy obtained is 99.86%, sensitivity 99.87%, and specificity 99.77%. Verma et al. [14] discussed a new method to improve the precision of prediction for skin disease. The authors used the feature importance feature selection technique to obtain 15 important features, compared the results with and without feature selection, and proved that feature selection gives better results. With feature selection, the highest accuracy obtained is 99.86%.
Table 1 summarizes the main previous work on the differential analysis of erythemato-squamous disease.

Table 1 Previous studies done

| Author | Year | Method | Classification accuracy (%) |
|---|---|---|---|
| Chang and Chen | 2009 | Decision tree | 80.33 |
| | | Neural network | 90.62 |
| Parikh et al. | 2015 | ANN | 97.17 |
| | | SVM | 94.04 |
| Pravin and Jafai | 2017 | Multi-SVM | 97.4 |
| | | KNN | 90 |
| | | Naive Bayesian | 55 |
| Zhang et al. | 2018 | ANN | 96.8 |
| Verma et al. | 2019 | Bagging | 98.556 |
| | | AdaBoost | 99.25 |
| | | Gradient Boosting | 99.86 |
| Chaurasia and Pal | 2019 | LR | 97.94 |
| | | LDA | 96.22 |
| | | KNN | 85.57 |
| | | CART | 93.50 |
| | | NB | 89.02 |
| | | SVM | 92.10 |
| Verma et al. | 2019 | CART | 94.17 |
| | | SVM | 96.93 |
| | | DT | 93.82 |
| | | RF | 97.27 |
| | | GBDT | 96.25 |
| | | Ensemble method | 98.64 |

This research paper is an effort using data mining algorithms to compute the performance of four machine learning classifiers, which are decision tree (DT), K
neighbors classifier (KNC), multilayer perceptron classifier (MLPC), and random forest classifier (RFC). All four classifiers are used on the skin disease dataset. The performances of four classifiers are measured on the basis of accuracy.
3 Methods
Machine learning is the technique of developing new algorithms which provide computers the capability to learn from previously stored information. The methodology used in this paper uses (i) decision tree (DT), (ii) K neighbors classifier (KNC), (iii) multilayer perceptron classifier (MLPC), and (iv) random forest classifier (RFC) [15, 16].
3.1 Dataset Analysis
The dataset for this study is obtained from the UCI machine learning repository [17]. This dataset was created to inspect skin disease and its different types. The dataset includes 35 attributes; 34 attributes are linear, and 1 attribute is nominal. Skin disease can be classified into six classes. The dataset, with 366 records and 35 attributes, is shown in Table 2.
3.2 Data Preprocessing
The method used in this research paper starts with data preprocessing. The data preprocessing step includes (i) a data-driven technique for selecting records and significant features for analysis; (ii) data cleaning: the collected data may include noisy, erroneous, missing, or inconsistent values, so techniques such as replacing fields with the mean value, a moving average, or nil values are applied; and (iii) data transformation: even after cleaning, the dataset is not ready for mining, because data obtained from various sources come in different formats, so the data must be transformed into one format suitable for data mining. The transformation step applies normalization, smoothing, aggregation, etc.
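As a rough illustration of these cleaning and transformation steps on this dataset, the snippet below imputes the one attribute with missing values (age, marked '?') and standardizes the attributes. It is a sketch under our assumptions, not the authors' code; dermatology.data is the file name used by the UCI repository.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# UCI dermatology data: 34 attribute columns plus the class label;
# '?' marks the missing age values
df = pd.read_csv("dermatology.data", header=None, na_values="?")
X, y = df.iloc[:, :-1], df.iloc[:, -1]

# data cleaning: fill missing ages with the column mean
X = X.fillna(X.mean())

# transformation: bring all attributes onto a common scale
X_scaled = StandardScaler().fit_transform(X)
```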
4 Results and Discussion
After preprocessing, the data is visualized as shown in Fig. 1. From the figure, all attributes show the range of the diseases; each attribute is represented by a box and whisker plot.
Table 2 Skin disease dataset

Classes (No. of instances):
C1: psoriasis (112); C2: seborrheic dermatitis (61); C3: lichen planus (72); C4: pityriasis rosea (49); C5: chronic dermatitis (52); C6: pityriasis rubra (20)

Clinical attributes:
f1: erythema; f2: scaling; f3: definite borders; f4: itching; f5: koebner phenomenon; f6: polygonal papules; f7: follicular papules; f8: oral mucosal involvement; f9: knee and elbow involvement; f10: scalp involvement; f11: family history; f34: age

Histopathological attributes:
f12: melanin incontinence; f13: eosinophils in the infiltrate; f14: PNL infiltrate; f15: fibrosis of the papillary dermis; f16: exocytosis; f17: acanthosis; f18: hyperkeratosis; f19: parakeratosis; f20: clubbing of the rete ridges; f21: elongation of the rete ridges; f22: thinning of the suprapapillary epidermis; f23: spongiform pustule; f24: munro microabscess; f25: focal hypergranulosis; f26: disappearance of the granular layer; f27: vacuolization and damage of basal layer; f28: spongiosis; f29: saw-tooth appearance of rete ridges; f30: follicular horn plug; f31: perifollicular parakeratosis; f32: inflammatory mononuclear infiltrate; f33: band-like infiltrate
The smoothed, continuous version of a histogram estimated from the data is known as a density map. Kernel density estimation is the most familiar form of this estimation. A density map is built by drawing a continuous curve (the kernel) at each data point; after drawing all data points, the curves are added together to produce a smooth density estimate. The Gaussian kernel (which produces a Gaussian bell curve at each data point) is the most commonly used kernel. Density maps of the attributes are illustrated in Fig. 2.
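Such an estimate can be computed directly with SciPy, as sketched below; X_scaled is reused from the preprocessing sketch above, and the column index 33 (age) is an illustrative choice.

```python
import numpy as np
from scipy.stats import gaussian_kde

values = X_scaled[:, 33]                 # one continuous attribute, e.g., age
kde = gaussian_kde(values)               # a Gaussian kernel is placed at each point

grid = np.linspace(values.min(), values.max(), 200)
density = kde(grid)                      # summed kernels give the smooth curve of Fig. 2
```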
Fig. 1 Box and whisker plot of skin disease dataset
Fig. 2 Density map representation of dataset

Python code is used to make predictions on the skin disease dataset and to calculate the train CV accuracy, train accuracy, and test accuracy of the four machine learning classifiers; Table 3 shows the values obtained by the four classifiers.

Table 3 Base learners accuracy

| Algorithms | Train CV accuracy | Train accuracy | Test accuracy |
|---|---|---|---|
| DT | 0.93 (±0.03) | 1.00 | 0.97 |
| KNC | 0.96 (±0.03) | 0.96 | 0.99 |
| MLPC | 0.97 (±0.03) | 1.00 | 0.99 |
| RFC | 0.98 (±0.03) | 1.00 | 0.99 |
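The accuracies of Table 3 can be reproduced along the following lines. This is a hedged sketch of the evaluation, not the authors' code; the 80/20 split, the tenfold cross-validation and the default hyperparameters are assumptions, and X_scaled, y come from the earlier preprocessing sketch.

```python
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, test_size=0.2, random_state=0)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "KNC": KNeighborsClassifier(),
    "MLPC": MLPClassifier(max_iter=1000, random_state=0),
    "RFC": RandomForestClassifier(random_state=0),
}

for name, model in models.items():
    cv = cross_val_score(model, X_tr, y_tr, cv=10)      # train CV accuracy
    model.fit(X_tr, y_tr)
    print(f"{name}: CV {cv.mean():.2f} (±{cv.std():.2f}), "
          f"train {model.score(X_tr, y_tr):.2f}, test {model.score(X_te, y_te):.2f}")
```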
Fig. 3 Comparison of different classifiers accuracy
Another way of visualizing the results of the four classifiers is the box and whisker plot. The box spans the 25th to 75th percentiles, capturing the middle 50% of the observations; a line is drawn at the 50th percentile (median), and whiskers above and below the box summarize the general range of the observations. Points are drawn for outliers beyond the whiskers. The accuracy achieved by the four base learners is represented by the box and whisker plot in Fig. 3.

The accuracy, confusion matrix, precision, recall, f1-score, and support evaluated using the random forest classifier are given below. The accuracy is 0.9864864864864865, and the confusion matrix is:

| 24 | 0 | 0 | 0 | 0 | 0 |
| 0 | 10 | 0 | 0 | 0 | 0 |
| 0 | 0 | 11 | 0 | 0 | 0 |
| 0 | 1 | 0 | 13 | 0 | 0 |
| 0 | 0 | 0 | 0 | 11 | 0 |
| 0 | 0 | 0 | 0 | 0 | 4 |

| Class | Precision | Recall | f1-score | Support |
|---|---|---|---|---|
| 1 | 1.00 | 1.00 | 1.00 | 24 |
| 2 | 0.91 | 1.00 | 0.95 | 10 |
| 3 | 1.00 | 1.00 | 1.00 | 11 |
| 4 | 1.00 | 0.93 | 0.96 | 14 |
| 5 | 1.00 | 1.00 | 1.00 | 11 |
| 6 | 1.00 | 1.00 | 1.00 | 4 |
| avg/total | 0.99 | 0.99 | 0.99 | 74 |
5 Conclusion
Data mining plays an important role in healthcare organizations. Knowledge gained using data mining techniques can be used to make successful and effective decisions that improve and develop healthcare organizations. Nowadays, various expert systems have been developed to predict diseases like thyroid disorders, breast cancer, and diabetes. This paper describes different classification techniques for skin disease prediction. Four data mining techniques, (i) decision tree classifier (DT), (ii) K neighbors classifier (KNC), (iii) multilayer perceptron classifier (MLPC), and (iv) random forest classifier (RFC), are used, and their performance in predicting skin disease is evaluated. After evaluation, we obtained the highest accuracy of 98.64% with the random forest classifier. Various metrics, such as precision, f1-score, and recall, are evaluated to check the performance of the model. We can further apply ensemble techniques and feature selection methods to improve the performance of skin disease prediction.
References
1. Ramya, G., Rajeshkumar, J.: Novel method for segmentation of skin lesions from digital images. Int. Res. J. Eng. Technol. 02(8), 1544–1547 (2015)
2. Verma, A.K., Pal, S., Kumar, S.: Prediction of skin disease using ensemble data mining techniques and feature selection method—a comparative study. Appl. Biochem. Biotechnol. 1–19 (2019)
3. Ahmed, K., Jesmin, T., Rahman, M.Z.: Early prevention and detection of skin cancer risk using data mining. Int. J. Comput. Appl. 62(4), 1–6 (2013)
4. Vijaya, M.S.: Categorization of non-melanoma skin lesion diseases using support vector machine and its variants. Int. J. Med. Imaging 3(2), 34–40 (2015)
5. Chang, C.L., Chen, C.H.: Applying decision tree and neural network to increase quality of dermatological diagnosis. Expert Syst. Appl. 36(2), 4035–4041 (2009)
6. Verma, A.K., Pal, S., Kumar, S.: Classification of skin disease using ensemble data mining techniques. Asian Pac. J. Cancer Prev. 20(6), 1887–1894 (2019)
7. Chaurasia, V., Pal, S.: Skin diseases prediction: binary classification machine learning and multi model ensemble techniques. Res. J. Pharm. Technol. 12(August), 3829–3832 (2019)
8. Fernando, Z.T., Trivedi, P., Patni, A.: DOCAID: predictive healthcare analytics using Naive Bayes classification. In: Second Student Research Symposium (SRS), International Conference on Advances in Computing, Communications and Informatics (ICACCI'13) (2013)
9. Theodoraki, E.M., Katsaragakis, S., Koukouvinos, C., Parpoula, C.: Innovative data mining approaches for outcome prediction of trauma patients. J. Biomed. Sci. Eng. 3(08), 791–798 (2010)
10. Sharma, D.K., Hota, H.S.: Data mining techniques for prediction of different categories of dermatology diseases. Acad. Inf. Manage. Sci. J. 16(2), 103–115 (2013)
11. Rambhajani, M., Deepanker, W., Pathak, N.: Classification of dermatology diseases through Bayes net and Best First Search. Int. J. Adv. Res. Comput. Commun. Eng. 4(5) (2015)
12. Verma, A.K., Pal, S.: Prediction of skin disease with three different feature selection techniques using stacking ensemble method. Appl. Biochem. Biotechnol. 1–20 (2019)
13. Yadav, D.C., Pal, S.: Thyroid prediction using ensemble data mining techniques. Int. J. Inf. Technol. 1–11 (2019)
14. Verma, A.K., Pal, S., Kumar, S.: Comparison of skin disease prediction by feature selection using ensemble data mining techniques. Inform. Med. Unlocked 16, 100202 (2019)
15. Chaurasia, V., Pal, S., Tiwari, B.B.: Prediction of benign and malignant breast cancer using data mining techniques. J. Algorithms Comput. Technol. 12(2), 119–126 (2018)
16. Chaurasia, V., Pal, S., Tiwari, B.B.: Chronic kidney disease: a predictive model using decision tree. Int. J. Eng. Res. Technol. 11(11), 1781–1794 (2018)
17. Güvenir, H.A., Demiröz, G., Ilter, N.: Learning differential diagnosis of erythemato-squamous diseases using voting feature intervals. Artif. Intell. Med. 13(3), 147–165 (1998)
Feature Selection and Classification Based on Swarm Intelligence Approach to Detect Malware in Android Platform Ashish Sharma and Manoj Kumar
Abstract In recent years, deep learning methods have been used to detect Android malware, owing to the inefficiency of manual checking and the increase in Android malware. The quality of the data set defines the performance of these models: a low-quality training data set will result in reduced performance. In the real world, manual checking is what guarantees data set quality, and malicious applications in Google Play may cause a trained model to fail. An artificial bee colony (ABC) algorithm based on a selective swarm intelligence technique is proposed to rectify this issue; it is a robust Android malware detection technique whose effectiveness is not affected by the data set quality. In the proposed model, a better combination of component learners can be computed by a genetic algorithm, which enhances the robustness of the model. The proposed technique exhibits better robust performance than other methods in the same area. Keywords Reverse engineering · Android malware analysis · Genetic algorithm · Feature selection · Artificial bee colony (ABC) algorithm · Machine learning
1 Introduction
The Android operating system has become the most popular smartphone operating system due to its open source nature, compatibility and market openness. A variety of applications provide great convenience for people's lives, but many types of Android malware have followed. The security problems brought by Android smart terminals are becoming more and more serious, causing problems such as user privacy leakage and economic loss and bringing a lot of trouble to users. Therefore, research on malware detection is of great significance, and effective malware detection is a trending topic in research.
A. Sharma · M. Kumar (B)
Department of Computer Engineering & Applications, GLA University, Mathura 281406, India
e-mail: [email protected]
A. Sharma
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_9
At present, there are several approaches to Android malware detection, mainly static, dynamic and hybrid detection methods that combine static and dynamic analysis [1]. With the widespread use of machine learning algorithms, many researchers have attempted to detect Android malware using machine learning methods. In [2], a set of Android malware detection methods based on privilege correlation is proposed; Naive Bayes is improved to form a classifier, which achieves initial rapid detection of malware, but it uses only a single feature, which is likely to give poor detection results. Reference [3] proposed a method based on the Apriori frequent-pattern mining algorithm over permissions to detect Android malware. Reference [4] proposed a support vector machine (SVM)-based detection method for Android malware, which uses dangerous permission combinations and vulnerable API calls as feature attributes and establishes an SVM classifier to automatically distinguish malware from benign software. Reference [5] proposed a sandbox-based dynamic analysis scheme for Android malware, which designs and implements a recoverable Android sandbox through a virtual machine and uses a reverse tool to insert API monitoring code into the Android installation package. The software runs in the sandbox and its API call information is monitored, so as to simulate the real running process of the application; malware detection is achieved, but at the cost of expensive resources. Reference [6] proposed an efficient, system-wide information flow tracking tool, TaintDroid, which can simultaneously track the diffusion paths of multiple pieces of sensitive data in Android applications and trace multiple sources of sensitive data leakage. Reference [7] designed and implemented a dynamic monitoring framework for Android application behaviour; the experiments evaluated the four algorithms of SVM, decision tree, k-nearest neighbour and Naive Bayes. In [8], dynamic and static methods are used to extract three types of features, a triple hybrid ensemble algorithm (THEA) is designed for the three types of features, and the Androdect tool is realized with the THEA algorithm, which is complicated in terms of technical implementation. The current research results have many shortcomings with respect to implementation complexity, false positive rate and detection accuracy. This paper proposes an Android malware detection approach optimized by ABC, which processes the Android application through reverse engineering, extracts features such as permissions, intents and APIs, and applies a feature selection algorithm to obtain an optimal feature subset. The classifier is trained using the tenfold cross-validation method. Experimental results show that the proposed method effectively enhances the accuracy and reduces the false positive rate of Android malware detection.
2 Proposed Methodology
The features are extracted from two sets of reverse engineered Android apps (APKs), namely goodware and malware. The features include content providers, services, activities, app component counts and permissions. Feature vectors are formed from
these features. The vector has class label 0 to represent malware and 1 to represent goodware. The CSV is given to the artificial bee colony (ABC) algorithm to reduce the dimension of the feature set and to select an optimized set of features. Neural network and support vector machine classifiers are the machine learning classifiers, and they are trained using this optimized set of features. The proposed approach is shown in Fig. 1. There are two units: Androguard tool-based feature extraction and artificial bee colony algorithm-based feature selection. For evaluating the selected features, they are fed to machine learning algorithms as input.

A. Reverse Engineering of Android APKs

The proposed method obtains static features from AndroidManifest.xml, which contains the app information required by the Android platform. APKs are disassembled using the Androguard tool, which is also used to obtain the static features.

B. Feature Vector

Features are extracted and mapped to a feature vector as follows.

App Components: The counts of app components such as activities, services, broadcast receivers and content providers are entries of the feature vector. A vector space with |S| dimensions is used to map the permission feature set: if app x has a feature, the corresponding dimension is set to 1, and to 0 otherwise. For every feature extracted from app x, a vector ψ(x) is constructed in this manner by setting the corresponding dimension to 1. Equation (1) summarizes this as

$$\psi : X \rightarrow \{0, 1\}^{|S|} \tag{1}$$
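A sketch of this extraction and of the binary mapping of Eq. (1) is given below. It assumes the Androguard Python API (AnalyzeAPK and the APK accessor methods shown); all_features stands in for the fixed 99-feature vocabulary used in the paper and is an assumption of this sketch.

```python
from androguard.misc import AnalyzeAPK

def extract_features(apk_path):
    """Reverse engineer one APK and collect its static features."""
    a, _, _ = AnalyzeAPK(apk_path)   # parses AndroidManifest.xml and DEX code
    feats = set(a.get_permissions())
    # app component counts are also available from the manifest
    counts = (len(a.get_activities()), len(a.get_services()),
              len(a.get_receivers()), len(a.get_providers()))
    return feats, counts

def to_vector(feats, all_features):
    """Map an app's feature set onto the |S|-dimensional binary space of Eq. (1)."""
    return [1 if f in feats else 0 for f in all_features]
```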
C. Discriminatory Feature Selection

It is important to select the most significant features for the detection of malware, since the quality of the experimental results is highly influenced by this choice. The computational complexity of the learning classifier can be greatly reduced by using a low-dimensional feature vector with discriminative features. The best feature subset is produced by the artificial bee colony (ABC) algorithm using all the features of the CSV; this subset of features is given to the machine learning-based classifiers. There are three phases in the ABC algorithm: the employed bee phase, the onlooker bee phase and the scout bee phase.
Fig. 1 Methodology block diagram
In the search space Ω, a bee population N is created by placing the bees randomly in the search space. The population corresponds to a group of food sources. The attraction of bees varies with the volume of a food source: a food source with a high volume attracts more bees than a food source with a low volume. Finding the food source with the greatest volume of food is the main objective of the ABC algorithm. After initializing the bee population, an objective-function-based fitness function is computed as

$$\mathrm{fit}(X_i) = \begin{cases} \dfrac{1}{1 + f(X_i)}, & \text{if } f(X_i) \geq 0 \\ 1 + |f(X_i)|, & \text{otherwise} \end{cases} \tag{2}$$
where solution’s function value is represented as f (X i ) where X i , i = 1, 2, . . . , N P . Employed Bee Phase The food source with high quantity of food is computed by sending swarm of employee’s bees in this stage. Every employed bee will find a food source. i is used to represent employed and vi is used to represent the food source computed by an employed bee i and it is expressed as vi, j =
xi, j + ϕi, j xi, j − xr 1, j , if j = j1 xi, j , Otherwise
(3)
where the jth variable of v_i is given by v_{i,j}, that of x_i by x_{i,j} and that of x_{r1} by x_{r1,j}, and φ_{i,j} ∈ [−1, 1] is a random number. After the computation of a new food source with a higher amount of food, the position of the old food source x_i is updated as

$$x_i = \begin{cases} v_i, & \text{if } f(v_i) \leq f(x_i) \text{ or } \mathrm{fit}(v_i) \geq \mathrm{fit}(x_i) \\ x_i, & \text{otherwise} \end{cases} \tag{4}$$
Onlooker Bee Phase: The food sources found by the employed bees are filtered based on their volume and quality by the onlooker bees. A roulette wheel selection process is used, in which food sources are chosen according to the probability values

$$p_i = \frac{\mathrm{fit}_i}{\sum_{j=1}^{N_P} \mathrm{fit}_j} \tag{5}$$
A greedy selection process is used to select a food source depending on this probability value; between the old and the newly produced solution, the solution with the higher quality is selected.

Scout Bee Phase: Scout bees are used when the optimum solution cannot be computed. If the employed and onlooker bees are not able to compute the optimum solution within a particular time, scout bees are sent out. A parameter called 'limit', defined with respect to the time unit, controls the employment of scout bees. The limit counter is computed as

$$l_i = \begin{cases} 0, & \text{if } f(v_i) \leq f(x_i) \text{ or } \mathrm{fit}(v_i) \geq \mathrm{fit}(x_i) \\ l_i + 1, & \text{otherwise} \end{cases} \tag{6}$$
The position update computed by the onlooker and employed bees is expressed as

$$v_{i,j} = \begin{cases} x^{\mathrm{nbest}}_{i,j} + \phi_{i,j}\,(x^{\mathrm{nbest}}_{i,j} - x_{r1,j}), & \text{if } j = j_1 \\ x_{i,j}, & \text{otherwise} \end{cases} \tag{7}$$
where x_i^{nbest} is the best solution amongst the neighbours of x_i and itself. The Euclidean distance d(i, m) is used to determine the best solution amongst different solutions, via the mean distance

$$md_i = \frac{\sum_{m=1}^{N_P} d(i, m)}{N_P - 1} \tag{8}$$
The basic ABC algorithm lacks a solution update at the intermediate level: no update is done there, nor in mutation and crossover, so optimum solutions may not be identified. This can be enhanced by integrating the cuckoo search algorithm. The major steps of the algorithm are as follows (a simplified sketch of the resulting selection loop is given at the end of this section):
1. The ABC sub-systems are initialized, respectively.
2. The employed bees' search, the onlooker bee selection and the search process are executed on the bee colony.
3. The scouts' search process is executed. The starting point of the search is computed based on a given probability, which may be produced randomly using Eq. (7) or obtained through Information Exchange Process 1.
4. The cats' search process is executed. Information Exchange Process 2 is adapted for every cat when a given small probability is satisfied, and the position and velocity are updated based on Eqs. (6) and (2); otherwise, the position and velocity are updated using Eqs. (1) and (2).
5. The final solution corresponds to the best solution, and it is memorized. If the termination condition is satisfied by any one of the sub-systems' individual bests, the algorithm is stopped.

D. Classification
Machine learning techniques are used to detect the zero-day threats posed by the ever-increasing variants of Android malware. ML techniques are better suited than signature-based techniques, because the latter require the signature database to be updated regularly. The following algorithms are used to train as well as test the classifiers with the selected features: neural network (NN) and support vector machine (SVM) classifiers.
3 Experimental Results

A data set with 40,000 APKs having two classes is used to evaluate the proposed work: 20,000 malicious (malware) and 20,000 benign (goodware) applications. To extract features, the APKs are reverse engineered. A CSV file with 99 features is generated, with class labels goodware (1) and malware (0). The primary aim of this research work is to use the ABC algorithm to select an optimized feature subset. The neural network and support vector machine classifiers are trained by feeding the discriminative features selected by the ABC algorithm as input. The kernel function of the support vector machine is set to the radial basis function (RBF), and tenfold cross-validation is used. The feed-forward neural network uses one hidden layer of size 40. The algorithms are tested on an Intel(R) Xeon(R) Silver 4114 CPU operating at 2.20 GHz with 64 GB RAM and a 64-bit operating system. Malware and goodware are compared in order to analyse the performance of the two classifiers before and after feature selection. The features selected by the ABC algorithm for the various classifiers are shown in Table 1, together with the classification accuracy produced by the genetic and ABC algorithms with the selected feature subsets. As shown in Table 1, both classifiers preserve AUC with the reduced feature set, and the ROC curves of the classifiers before and after feature selection are shown in Fig. 2: the support vector machine classifier's ROC curve in Fig. 2a and the neural network classifier's in Fig. 2b. The ROC curves show that the classifiers exhibit better performance with the selected features.
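A minimal sketch of this evaluation setup using scikit-learn, assuming the selected features are already available as a NumPy matrix X with labels y; the RBF-kernel SVM, tenfold cross-validation and the single hidden layer of 40 units follow the text, while the placeholder data and variable names are illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Placeholder data: rows are APKs, columns are the ABC-selected
# features, labels are 1 (goodware) and 0 (malware).
rng = np.random.default_rng(0)
X = rng.random((200, 30))
y = rng.integers(0, 2, size=200)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
nn = make_pipeline(StandardScaler(),
                   MLPClassifier(hidden_layer_sizes=(40,), max_iter=1000))

# Tenfold cross-validated accuracy, as in the experimental setup.
print("SVM accuracy:", cross_val_score(svm, X, y, cv=10).mean())
print("NN accuracy:", cross_val_score(nn, X, y, cv=10).mean())
```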
Table 1  Selected features by the ABC algorithm and accuracy obtained with the selected features for different classifiers

Classifier | Algorithm | No. of features before feature selection | Accuracy before feature selection | Features selected | Accuracy after feature selection
Support vector machine | GA | 99 | 0.9891 | 33 | 0.9803
Support vector machine | Proposed ABC | 99 | 0.9903 | 30 | 0.9852
Neural network | GA | 99 | 0.9876 | 40 | 0.9828
Neural network | Proposed ABC | 99 | 0.9899 | 35 | 0.9895
The performance of the neural network and support vector machine classifiers before and after feature selection is shown in Table 2. Enhanced accuracy results are obtained when the proposed ABC and the genetic feature selection algorithm are used in combination with the neural network and support vector machine classifiers. The feature vector space is much smaller, which reduces the time complexity of classifier training.
4 Conclusion

Day by day, the Android platform faces an increasing number of threats, spread through malicious applications. It is necessary to detect such malware by designing a framework with highly accurate results. Zero-day threats posed by malware variants cannot be detected by signature-based methods, so machine-learning-based techniques are utilized. In the proposed work, the most optimized feature subset is obtained using an artificial bee colony algorithm, and machine learning algorithms are trained effectively using this subset of features. On the low-dimensional feature set, the neural network and support vector machine classifiers produced more than 95% accurate results in the experimentation, and the training complexity of the classifiers is reduced. In future, a larger data set can be used to produce enhanced results, and other machine learning algorithms can also be analysed.
Fig. 2  ROC curve comparison before and after feature selection. a Support vector machine. b Neural network classifier
Table 2  Performance comparison using various metrics

Approach | Sensitivity (%) GA | Sensitivity (%) ABC | Specificity (%) GA | Specificity (%) ABC | Accuracy (%) GA | Accuracy (%) ABC | Training time complexity (s) GA | Training time complexity (s) ABC
SVM, with 99 features | 95.6 | 96.01 | 95.4 | 95.23 | 96.6 | 97.61 | 22.92 | 21.01
SVM, with 40 features (post feature selection) | 94.3 | 94.9 | 94.9 | 95.1 | 95 | 96 | 10.20 | 9.50
Neural network, with 99 features | 95.5 | 94.1 | 93.9 | 94 | 95.2 | 96.01 | 8.57 | 8.01
Neural network, with 40 features (post feature selection) | 94.6 | 92.25 | 97.6 | 96.02 | 94.1 | 95 | 3.76 | 2.98
Sentiment Classification Using Hybrid Bayes Theorem Support Vector Machine Over Social Network Shashi Shekhar and Narendra Mohan
Abstract Opinions or information can be shared in text form on social media sites including LinkedIn, blogs, Facebook, Twitter, etc. Opinions or views about movies, products, politics or any topic of interest to a user can be shared on social networking sites in the form of comments, feedback or pictures. Individuals' opinions about political events, social issues and products can be gathered and analysed by sentiment analysis. The proposed system includes preprocessing, feature extraction and sentiment classification using a hybrid Bayes theorem support vector machine (HBSVM) algorithm. Preprocessing is used to remove unnecessary data, and it helps to improve the classification accuracy on the given dataset. Then, feature extraction is performed to select the prominent features based on the frequent terms. Then, the HBSVM model is applied to classify neutral and non-neutral posts; negative and positive are the classes of non-neutral posts. Group members are classified based on their responses to posts on various aspects. The proposed HBSVM exhibits high performance when compared with existing techniques, as proven by experimental results. Keywords Feature extraction · Hybrid Bayes theorem support vector machine (HBSVM) algorithm · Classification · Sentiment analysis
1 Introduction

Sentiment analysis corresponds to the management of subjective text, opinions and sentiment [1]. It provides comprehensive information on public views and is used to analyse various reviews and tweets. Various significant events can be predicted using this verified tool [2]. A certain entity is evaluated using reviews from the public.

S. Shekhar (B) · N. Mohan
Department of Computer Engineering and Applications, GLA University, Mathura 281406, India
e-mail: [email protected]
N. Mohan
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_10
Opinions fall into neutral, negative and positive classes. The expressive direction of user reviews can be computed automatically by sentiment analysis. The increase in hidden information to be structured and analysed increases the need for sentiment analysis. Users give reviews as short texts, which makes the process of sentiment analysis more difficult. The hidden sentiment of texts is analysed using sentiment analysis on computer systems, and texts are classified into negative or positive attitudes [3]. Subject classification is based on some keywords, whereas sentiment analysis requires more semantic information. Machine learning and sentiment dictionary methods are the major sentiment analysis methods. In sentiment dictionary methods, the sentiment polarity of words is computed using an extended sentiment inclination point mutual information algorithm, and the text's sentiment tendencies are judged by this. The most famous micro-blogging website is Twitter. Tweets correspond to messages with 140 characters which can be sent or received by users, and the interaction of people can be analysed through them. Opinions on various topics like elections, brand impact and politics can be collected from Twitter. With advancements in machine learning methods, users can obtain efficient solutions for a plethora of problems [4]. Sentiment classification classifies an opinion document into negative or positive sentiment, without extracting or studying information within the document; document-level sentiment classification represents the same. Document sentiment classification is formulated as a supervised learning problem with two rating scores or classes [5]. Standard supervised learning techniques such as support vector machines (SVM) and Bayesian classification can be applied directly for classification. Unsupervised methods also exist for document sentiment classification, based on language patterns and sentiment words. Sentiment words, which show negative or positive sentiments, play an important role in sentiment classification [6]. The major issue of this research work is the identification of the sentiment associated with the features. Till now, limited research has been done toward sentiment detection on social media, and other forms of solutions produce multiple settings. To overcome the above-mentioned issues, in this research, preprocessing is initially applied to filter repeated and noisy content from the given dataset. Then, feature extraction is performed to choose prominent features, and HBSVM is used for sentiment classification, which splits the posts into neutral and non-neutral. It is proposed to improve the overall system accuracy for positive, negative and neutral posts.
2 Literature Review

Rathi et al. [7] considered social networks like Facebook and Twitter, which are loaded with data and opinions. Twitter is the most commonly used micro-blogging site, and people share tweets to express their ideas. For sentiment analysis, it is used
as a good source. Positive, negative and neutral are the classes of opinions, and sentiment analysis corresponds to analysing and grouping various opinions. Their research concentrates on classifying the emotions of tweet data collected from Twitter. Ensemble machine learning methods are used to enhance the classification results of sentiment analysis, and the reliability as well as the efficiency of the analysis is also enhanced; their work combines a decision tree with a support vector machine (SVM). Better results are produced with respect to accuracy and f-measure, as shown by experimentation.

Gautam et al. [8] implemented a classification method for customer reviews based on sentiment analysis. In the analysed information, opinions correspond to tweets; these are unstructured and may be negative, positive or neutral. The data set is preprocessed, and adjectives are extracted from it; an adjective corresponds to a feature vector with some meaning. A list of feature vectors is then selected. SVM, maximum entropy and naive Bayes classification algorithms, which are based on machine learning, are applied along with WordNet-based semantic orientation for synonym extraction and content feature similarity. The performances of the classifiers are measured with respect to accuracy, precision and recall.
3 Proposed Methodology

In this research, HBSVM is proposed to improve the sentiment classification performance more effectively and accurately. To improve accuracy, a sentiment classification algorithm is implemented for the given Facebook group that identifies neutral and non-neutral posts efficiently.
3.1 Preprocessing

In order to perform effective analytical operations, noise in the documents has to be removed. Preprocessing consists of lowercase conversion, non-English term filtering, tokenization and replacement. In the replacement step, URL links are replaced by a URL tag; for example, http://example.com is replaced by URL. In tokenization, punctuation marks and spaces are used to split the text into a bag of words. Contracted forms of words like "I'm" and "don't" are retained as they are. Non-English terms are removed directly by filtering. All words are converted to their lowercase forms in lowercase conversion. Texts are then clustered by K-means clustering, in which the features correspond to bigrams and unigrams [9]; texts with the same words fall in the same cluster. The steps are as follows (a short code sketch is given after the list):

1. The desired number of clusters k is selected
2. Every text is assigned to a cluster randomly
3. Cluster centres are computed
4. Every text is reassigned to its closest cluster
5. Cluster centres are re-computed
6. Steps 4 and 5 are repeated until convergence.
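A compact sketch of this clustering step with scikit-learn, assuming the cleaned posts are plain strings; unigram and bigram counts serve as features, matching the description above, and the cluster count k = 2 is an illustrative choice.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans

posts = [
    "great movie loved the acting",
    "loved the plot great acting",
    "terrible service never again",
    "never again terrible experience",
]

# Unigram and bigram features, as described in the text.
vectorizer = CountVectorizer(ngram_range=(1, 2), lowercase=True)
X = vectorizer.fit_transform(posts)

# Steps 1-6: choose k, assign, recompute centres until convergence.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels)   # texts sharing words end up in the same cluster
```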
3.2 Feature Extraction

The vector space model is used to transform the text document dataset into a document-term table for analysis. It is very difficult to extract features from text documents, and it is even more difficult when social media behaves informally [10]. Suitable feature set selection plays an important role in polarity classification. TFIDF of n-grams, word n-grams and character-gram features are most commonly used in sentiment analysis. Various micro-blogging features are extracted, including question marks, elongated words, all caps, user names, URLs, punctuation marks, emoticons and hashtags. Figure 1 shows the proposed system's overall block diagram.
3.2.1
Sentiment Lexicons
Sentiment lexicons are lists of n-gram words with polarity values assigned either automatically or manually. Manually created lexicons have low coverage, containing a few thousand terms. The National Research Council Canada (NRC) emotion lexicon, Bing Liu's lexicon and the Multi-Perspective Question and Answering (MPQA) lexicon are the most commonly used. Automatically created lexicons re-use Twitter messages; they have high coverage, and classification is highly impacted by them. The NRC hashtag sentiment lexicon and the Sentiment140 lexicon are used very often.
3.2.2
Negation Features
A message's polarity can be changed by negation. Various methods are used to handle negation, including negation context handling, valence shifters and simple polarity switching. The context of negation starts from a negation word, for example can't, don't or wouldn't, and continues till the end of the sentence. All words in the context of negation are appended with the negation suffix "_NEG".
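A minimal sketch of the "_NEG" suffix scheme described above; the negation word list is a small illustrative subset, and sentence-end detection is reduced to simple punctuation.

```python
NEGATIONS = {"not", "no", "can't", "don't", "wouldn't", "never"}

def mark_negation(tokens):
    # Append "_NEG" to every token between a negation word and
    # the end of the sentence (here: simple punctuation).
    out, in_scope = [], False
    for tok in tokens:
        if tok in {".", "!", "?"}:
            in_scope = False
            out.append(tok)
        elif in_scope:
            out.append(tok + "_NEG")
        else:
            out.append(tok)
            if tok.lower() in NEGATIONS:
                in_scope = True
    return out

print(mark_negation("i don't like this movie .".split()))
# ['i', "don't", 'like_NEG', 'this_NEG', 'movie_NEG', '.']
```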
3.2.3
Part of Speech Tagging
A word's part of speech (PoS) tag changes the word's sentiment. The occurrences of every PoS tag are counted, and various PoS taggers are used as classification features. The TFIDF weight is computed as

TFIDF = TF × IDF
(1)
Fig. 1 Proposed system’s overall block diagram
TF = \log_e (1 + f_{ij})   (2)

IDF = \log_e \left( \dfrac{N}{N_i} \right)   (3)
where f_{ij} is the frequency of token i in document j, N is the number of documents in the corpus, and N_i is the number of documents containing token i.
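A direct transcription of Eqs. (1)-(3), kept separate from library TF-IDF implementations (which use slightly different smoothing); the token counts in the usage line are illustrative.

```python
import math

def tfidf(f_ij, n_docs, n_docs_with_token):
    # Eq. (2): log-damped term frequency.
    tf = math.log(1 + f_ij)
    # Eq. (3): inverse document frequency.
    idf = math.log(n_docs / n_docs_with_token)
    # Eq. (1): final weight.
    return tf * idf

# Token appearing 3 times in a document, in 10 of 1000 documents.
print(tfidf(f_ij=3, n_docs=1000, n_docs_with_token=10))
```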
3.3 Sentiment Classification Using HBSVM

The proposed HBSVM enhances the sentiment analysis classification accuracy for the Facebook group dataset. It is generally used for text categorization and achieves better performance in high-dimensional feature spaces. The SVM algorithm represents examples as points in space, separated so that the examples of different classes are divided by as wide a margin as possible. The naive Bayes algorithm also produces good results for sentiment classification. A vector w, representing the hyperplane that separates the document vectors of the two classes, is computed. The decision boundary of the SVM is

w · x − b = 0
(4)
The distance between an input vector x_i and the hyperplane is computed to achieve classification. SVMs classify data by constructing a hyperplane, but data lying on different sides of a nonlinear hypersurface cannot be separated by a linear SVM, so the SVM is applied with the kernel technique [11]. The major difference between the proposed algorithm and the standard SVM is that each dot product is replaced by a nonlinear kernel function; the kernel is related to the transform ϕ through k(x_i, x_j) = ϕ(x_i) · ϕ(x_j), and w exists in the transformed space. Every kernel has one adjusting parameter, which gives the kernel flexibility to tailor itself to the practical data. This work incorporates Bayes' theorem to enhance the SVM classification process. The posterior probability is computed from the likelihood and the prior probability using Bayes' theorem:

P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}
(5)
Conditional independence between events is assumed. P(A|B) represents the posterior probability, P(B|A) the likelihood, P(A) the prior probability of A, and P(B) the prior probability of B. The posterior probability is computed for every class using the naïve Bayesian equation, and the prediction outcome corresponds to the class with the highest posterior probability.
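The paper does not spell out how the Bayes posterior and the SVM are coupled; one plausible reading, sketched below as an assumption rather than the authors' exact method, appends the per-class naive Bayes posteriors of Eq. (5) to the feature vector before training an RBF-kernel SVM.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

def fit_hybrid(X_train, y_train):
    # Estimate class posteriors with Bayes' theorem (Eq. (5)).
    nb = MultinomialNB().fit(X_train, y_train)
    # Augment features with the posteriors, then train an RBF SVM.
    X_aug = np.hstack([X_train, nb.predict_proba(X_train)])
    svm = SVC(kernel="rbf").fit(X_aug, y_train)
    return nb, svm

def predict_hybrid(nb, svm, X):
    return svm.predict(np.hstack([X, nb.predict_proba(X)]))

# Toy term-count features; labels: 0 neutral, 1 positive, 2 negative.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(60, 20))
y = rng.integers(0, 3, size=60)
nb, svm = fit_hybrid(X, y)
print(predict_hybrid(nb, svm, X[:5]))
```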
HBSVM
Input: Dataset with labelling
Output: Popularity of extracted reviews, whether they are neutral, negative or positive.
Step 1: Reviews are extracted
Step 2: Feature vectors are obtained
Step 3: The feature vector list and the training dataset are combined, and the probability of the features is computed using (5)
Step 4: Training and testing are performed using the HBSVM algorithm
Step 5: The similarity of the feature vectors is computed
Step 6: Positive, negative and neutral posts are provided.
4 Experimental Result

The experimental set-up is chosen from online social networks in a real-world environment. The dataset is collected from Cheltenham Facebook groups [12], and experimentation is performed using three open groups. This dataset provides a stringent set-up: the like-share rating yields a sparse data matrix, and there is a limited amount of content for linguistic analysis. Experiments are conducted in two sets. The first set proves that the proposed method provides effective customized suggestions and a clutter-free group environment. The second set proves that applying the proposed method can identify the popularity of members within the community. The performance of the proposed HBSVM algorithm is compared with the existing artificial neural network and SVM algorithms.

Accuracy
Accuracy is defined as the overall accurateness of the detection results, and it is expressed as

Accuracy = \dfrac{tp + tn}{tp + tn + fp + fn}
(6)
where tp is the number of true positives, tn the number of true negatives, fp the number of false positives, and fn the number of false negatives. The accuracy comparison results of the proposed and existing methods are shown in Fig. 2. The methods are represented on the x-axis, and the accuracy value on the y-axis of the plot. For the specified dataset, the existing ANN and SVM algorithms exhibit low accuracy values, while the proposed HBSVM method produces a high accuracy value.
Fig. 2 Accuracy
Thus, the results conclude that the proposed HBSVM improves the sentiment classification process by identifying the neutral and non-neutral posts efficiently for the given dataset.

Precision
Precision is defined in binary classification, assuming positive samples as posts, as
Precision = \dfrac{tp}{tp + fp}
(7)
The precision comparison results of the proposed and existing methods are shown in Fig. 3. The methods are represented on the x-axis, and the precision value on the y-axis of the plot. For the specified dataset, the existing ANN and SVM algorithms exhibit low precision values, while the proposed HBSVM method produces a high precision value. Thus, the results conclude that the proposed HBSVM improves the sentiment classification process by identifying the neutral and non-neutral posts efficiently for the given dataset.
Fig. 3 Precision
Fig. 4 Recall
Recall
The recall value is calculated as follows:

Recall = \dfrac{tp}{tp + fn}
(8)
The recall comparison results of the proposed and existing methods are shown in Fig. 4. The methods are represented on the x-axis, and the recall value on the y-axis of the plot. For the specified dataset, the existing ANN and SVM algorithms exhibit low recall values, while the proposed HBSVM method produces a high recall value. Thus, the results conclude that the proposed HBSVM improves the sentiment classification process by identifying the neutral and non-neutral posts efficiently for the given dataset.

F-measure
The F1-score is expressed as:

F1\text{-}score = \dfrac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}
(9)
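The four measures of Eqs. (6)-(9) computed from raw confusion-matrix counts, as a small self-contained sketch; the example counts are illustrative.

```python
def classification_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (6)
    precision = tp / (tp + fp)                           # Eq. (7)
    recall = tp / (tp + fn)                              # Eq. (8)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (9)
    return accuracy, precision, recall, f1

# Example counts from a hypothetical post classifier.
print(classification_metrics(tp=80, tn=90, fp=10, fn=20))
```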
The F-measure comparison results of the proposed and existing methods are shown in Fig. 5. The methods are represented on the x-axis, and the F-measure on the y-axis
Fig. 5 F-measure
of the plot. For the specified dataset, the existing ANN and SVM algorithms exhibit low F-measure values, while the proposed HBSVM method produces a high F-measure value. Thus, the results conclude that the proposed HBSVM improves the sentiment classification process by identifying the neutral and non-neutral posts efficiently for the given dataset.
5 Conclusion

In this research work, the HBSVM algorithm is proposed for efficient sentiment classification on the given Facebook group dataset. A frequent-term measure is used to extract important features in the feature selection process; hence, prominent features are selected from the dataset and used to classify the posts effectively. The proposed HBSVM algorithm classifies posts as neutral and non-neutral and accurately determines whether posts are neutral, negative or positive in the given dataset. Thus, the proposed HBSVM algorithm provides greater performance in f-measure, recall, precision and accuracy values. In future, a hybrid optimization algorithm can be developed to improve different aspects of the solution for stance detection.
References 1. Cambria, E., Schuller, B., Xia, Y., Havasi, C.: New avenues in opinion mining and sentiment analysis. IEEE Intell. Syst. 28(2), 15–21 (2013) 2. Melville, P., Gryc, W. and Lawrence, R.D.: Sentiment analysis of blogs by combining lexical knowledge with text classification. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2009) 3. Lee, S.H., Cui, J., Kim, J.W.: Sentiment analysis on movie review through building modified sentiment dictionary by movie genre. J. Intell. Inform. Syst. 22(2), 97–113 (2016) 4. Suresh, H.: An unsupervised fuzzy clustering method for twitter sentiment analysis. In: 2016 International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS). IEEE (2016) 5. Narayanan, R., Liu, B., Choudhary, A.: Sentiment analysis of conditional sentences. In: Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP2009), Singapore (2009) 6. Karampiperis, P., Koukourikos, A., Stoitsi, G.: Collaborative filtering recommendation of educational content in social environments utilizing sentiment analysis techniques. In: Recommender Systems for Technology Enhanced Learning: Research Trends & Applications, vol. RecSysTEL Edited Volume. Springer, (2013) 7. Rathi, M., et al.: Sentiment analysis of tweets using machine learning approach. In: 2018 Eleventh International Conference on Contemporary Computing (IC3). IEEE, (2018) 8. Gautam, G., Yadav, D.: Sentiment Analysis of Twitter Data Using Machine Learning Approaches and Semantic Analysis. Department of Computer Science & Engineering, IEEE (2014) 9. Roychowdhury, S.: Expediting K-means Cluster Analysis Data Mining using Subsample Elimination Preprocessing. U.S. Patent No. 8,229,876. 24 Jul. 2012
10. Mukherjee, S., Bhattacharyya P.: Feature specific sentiment analysis for product reviews. In: International Conference on Intelligent Text Processing and Computational Linguistics. Springer, Berlin, Heidelberg (2012) 11. Zheng, W., Ye, Q.: Sentiment classification of Chinese traveler reviews by support vector machine algorithm. In: 2009 Third International Symposium on Intelligent Information Technology Application, vol. 3. IEEE (2009) 12. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
Improved Cuckoo Search with Artificial Bee Colony for Efficient Load Balancing in Cloud Computing Environment Rakesh Kumar and Abhay Chaturvedi
Abstract Task scheduling and load balancing are major issues in cloud environments, directly affecting resource utilization. The load balancing aspect has to be considered in cloud research, since it has a serious impact on both the front end and the back end. For balancing load, an improved cuckoo search + artificial bee colony (ICSABC) algorithm is proposed in this work. It is a bio-inspired load balancing scheduling algorithm that improves the load balance level on VMs by enabling the scheduling process efficiently. Task scheduling is the most important work in cloud computing: users pay based on resource usage time, so the load must be distributed evenly among the resources of the system, which increases resource utilization and reduces execution time. The proposed algorithm computes the best task-to-virtual machine mapping, which is affected by the VM processing speed and the length of the submitted workload. As shown by experimental results, the proposed ICSABC achieves the best average VM load and also results in high accuracy and low time complexity when compared with existing methods. Keywords Cloud computing · Improved cuckoo search + artificial bee colony (ICSABC) · Load balancing
1 Introduction

Cloud computing is related to the Internet; the cloud corresponds to the Internet as a whole. A business can be scaled up without much investment: storage, infrastructure, platform and software requirements are fulfilled by the cloud at very reasonable cost. By using
R. Kumar (B) · A. Chaturvedi Department of Computer Engineering and Applications, GLA University, Mathura 281406, India e-mail: [email protected] Department of Electronics, GLA University, Mathura 281406, India A. Chaturvedi e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_11
cloud computing, the cost of operation is reduced [1]. The virtualization concept is used in cloud computing: various services can be used by the end user via virtualization, and cloud computing data centers deliver services in virtualized form. A number of virtual machines (VMs) are created on the data center. In order to effectively utilize the capacity of these virtual machines, incoming requests must be allocated efficiently [2, 3]. Various parameters of the cloud system should be made optimum; response time, quality of service, utilization of resources, scalability and performance are among the parameters to be considered. Cloud computing provides on-demand services: at a particular time, based on the client's need, software, information, shared resources and devices are provided. Balancing the load is a challenging task in cloud computing, and it requires a distributed solution. Maintaining many idle servers is not cost effective, yet fulfilling the required demand is not feasible when jobs are not assigned to appropriate servers. In the cloud, components are spread over a large area with a complex structure, so an individual node cannot do effective load balancing [4]. Virtual machines are allocated tasks by task-level scheduling, and hosts are allocated virtual machines by resource-level scheduling [5]. The resources of the cloud should be balanced: incoming application requests are scheduled to virtual machines so as to complete the tasks in the specified time and in a balanced manner. A self-adaptive global search-based optimization method is proposed in [6]. This algorithm resembles population-based genetic algorithms (GA), but individuals in the population are not recombined directly; instead, the particles' social behavior defines the search. In every iteration, particles adjust their positions depending on their own best position and the best particle's position in the entire population. This increases the particles' stochastic nature, and the convergence speed is enhanced to produce good solutions.
2 Related Work

Singh et al. [7] handled the hierarchical and multi-dimensional resources of a cloud system using load balancing algorithms; server and storage virtualization in the data center are constituted by this. The algorithm tracks computation and application data and monitors the usage of network switches, servers and storage nodes.

Mondal et al. [8] performed load balancing using a stochastic hill climbing technique, a soft-computing-based technique that optimizes resource utilization in cloud computing. Simulation of the proposed technique shows its performance; compared with the Round Robin (RR) and First Come First Serve (FCFS) methods, this algorithm shows less response time.

Ajit et al. [9] balanced pre-emptive independent tasks of a heterogeneous cloud system by using a load balancing algorithm termed honey bee behavior inspired load balancing (HBB-LB). Load is balanced among the virtual machines and
throughput is maximized. Task priority on the virtual machines is also balanced, which reduces the tasks' waiting time. Existing scheduling and load balancing algorithms are compared with the proposed algorithm using simulation, and the proposed method shows better results. The major quality of service (QoS) parameter considered here is priority; other QoS factors may also be considered to enhance the algorithm.

Lu and Gu [10] maximized CPU utilization by proposing an ACO-inspired load-adaptive cloud resource scheduling algorithm. Two issues are solved by this: adaptive resource scheduling and hotspot node detection. The proposed method monitors the bandwidth, memory and CPU usage of all VMs. The scheduling process starts if a hotspot VM is detected; an idle node, having light load compared to the hotspot node, is found by performing the resource scheduling process. Faster convergence of the ants is enabled by adding an expansion factor. VMs with overloading are detected easily by the proposed algorithm, and the nearest idle node is also computed.

Singh and Singh [11] implemented a node duplication genetic algorithm (NGA)-based GA method, used in heterogeneous multiprocessor systems. Application completion time and communication delay time are considered by this algorithm. The NGA algorithm computes the fitness function in two stages: task fitness is computed in the first stage, where tasks are scheduled and executed in a legal order on a single processor, and processing time is minimized in the second stage of fitness.
3 Proposed Methodology

3.1 System Model

Cloud computing is a dynamic process, but load balancing may be formulated as a static one in which N jobs submitted by cloud users are allocated to M processing units in the cloud. The load is distributed to the various virtual machines using a load balancer, to avoid overloading or underloading of nodes [12]. If load balancing is not done properly, node failures may cause data unavailability; hence, cloud-based virtual machines are introduced.
3.2 Clustering Virtual Machines

Memory and CPU usage are used in the VM placement process to indicate a proper physical machine search. There are three groups of VM resource requests. In the first group, the memory and CPU demands are the same. The second group of clusters is formed by grouping VMs having the same communication cost [13]. The third group of VMs
Fig. 1 Cluster mechanism
corresponds to VMs that are not communicating with other VMs. Figure 1 shows the clustering mechanism. VMs 1, 2, 3 and 4 have the same memory demand and the same cost of communication with each other; they will be placed in the same rack or in one physical machine. Because of communication cost and memory, VM5 and VM6 are grouped in one group. VM7 does not communicate with other VMs and can be initiated on any physical machine.
3.3 Improved ICSABC Algorithm for Load Balancing

The CS algorithm is inspired by the brood parasitism of a few cuckoo species, which lay their eggs in the nests of other host birds [14]. In this algorithm, every egg in a nest symbolizes a candidate solution, and a cuckoo egg signifies a fresh solution. The number of cuckoos X_{ij} (i = 1, ..., n; j = 1, ..., m) is represented as the number of data instances (i = 1, ..., n) of the scheduling problem with their attributes (j = 1, ..., m). The strategy is to utilize the fresh, and potentially better, solutions to replace not-so-good solutions in the nests, subject to the following three rules.

Rule 1: Each cuckoo lays one egg at any specified moment and dumps its egg in a randomly picked nest, which corresponds to the initialization procedure.
Rule 2: The best nests with high-quality eggs will carry over to the next generation.
Rule 3: The number of nests (chosen solutions) is fixed, and an alien egg is discovered by the host bird with a specific probability p_a ∈ (0, 1), in which case a new position is constructed.

A basic portion of the CS calculation is the utilization of a random walk in both the local and the global search steps. In the global search step, the new best solution at iteration t + 1 is produced through Lévy flights:

X_{ij}^{(t+1)} = X_{ij}^{(t)} + \alpha \times L_{ij}^{(t)} \oplus Rand_{ij}^{(t)} \oplus \left( X_{ij}^{(t)} - X_{global} \right)   (1)
where L_{ij}^{(t)} is a random walk based on Lévy flights, Rand_{ij}^{(t)} is a random vector with unit standard deviation and zero mean, and X_{global} is the global optimal solution. A Lévy flight is a random walk whose step lengths follow a probability density function (PDF) with a power-law tail; Lévy flights are used to obtain a straightforward as well as effective route. The (i, d) component of L is computed as in Eq. (2):

L_{ij,d} = \dfrac{u_{ij,d}}{|v_{ij,d}|^{1/\beta}}, \quad (i = 1, \ldots, n),\ j = 1, \ldots, m,\ d = 1, \ldots, D   (2)

where D is the dimensionality of the search space, the parameter \beta lies in the range [1, 2], and u_{ij,d} and v_{ij,d} are random numbers drawn from a normal distribution N(\mu, \sigma^2) with zero mean (\mu = 0) and standard deviations

\sigma_u = \left[ \dfrac{\Gamma(\beta + 1) \sin(\pi \beta / 2)}{\Gamma\!\left( \frac{\beta + 1}{2} \right) \beta \, 2^{(\beta - 1)/2}} \right]^{1/\beta}, \quad \sigma_v = 1   (3)
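A short sketch of the Lévy step of Eqs. (2)-(3), which correspond to Mantegna's algorithm; β = 1.5 is a common illustrative choice, and the usage line applies the step in the manner of Eq. (1).

```python
import math
import numpy as np

def levy_step(size, beta=1.5, rng=None):
    # Eq. (3): Mantegna's standard deviations.
    rng = rng or np.random.default_rng()
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    # Eq. (2): ratio of two normal draws gives heavy-tailed steps.
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

# Eq. (1): move a candidate relative to the global best.
rng = np.random.default_rng(0)
x, x_global, alpha = np.zeros(4), np.ones(4), 0.01
x_new = x + alpha * levy_step(4, rng=rng) * (x - x_global)
```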
On the other hand, in the local (neighborhood) search step, a fraction p_a of the worst solutions is discovered and replaced by new ones, whose positions are produced via a random walk as follows:

X_{ij}^{(t+1)} = X_{ij}^{(t)} + r_{1ij}^{(t)} \oplus H\!\left( p_a - r_{2ij}^{(t)} \right) \oplus \left( X_{ij}^{(t)} - X_{jk}^{(t)} \right)   (4)

where r_{1ij}^{(t)} and r_{2ij}^{(t)} are random vectors uniformly distributed in the range (0, 1), H(\cdot) is the Heaviside step function (H(\cdot) = 0 for non-chosen solutions and H(\cdot) = 1 for chosen solutions), p_a is the discovery likelihood, and X_{ij}^{(t)} and X_{jk}^{(t)} are two distinct solutions chosen randomly. After creating the new solution X_{ij}^{(t+1)}, it is evaluated and compared with X_{ij}^{(t)}.
If the fitness of the newly chosen solution is better than that of the previous one, X_{ij}^{(t+1)} corresponds to the new basic solution; else the previous solution X_{ij}^{(t)} is kept as it is. Take the set of N virtual machines as VM = {VM1, VM2, ..., VMn}; the series of tasks planned and to be implemented on the VMs is taken as Task = {Task1, Task2, ..., Taskm}. Karaboga [15] presented the ABC algorithm, which computes the right values to solve a problem. The ABC algorithm uses three classes of honey bees for searching food sources: scout bees search for food sources; employed bees explore food sources and share information with onlooker bees; onlooker bees compute the fitness value and find the best food sources.

(1) Begin
(2) In the population initialization step, randomly place n scout bees on the VMs in the cloud.
(3) Objective function f(x), x = (x_1, ..., x_d)^T
(4) Generate the initial population of n host nests x_i (i = 1, 2, ..., n)
(5) while (t < MaxGeneration)
(6) Randomly obtain a cuckoo using Lévy flights
(7) Compute its quality or fitness F_i
(8) Randomly select a nest among the n nests (say, j)
(9) If (F_i > F_j)
(10) The new solution replaces j;
(11) In the fitness evaluation step, compute the fitness value of the population.
(12) Select the scout bees having the highest fitness values and, from the neighborhood of m VMs, select the sites visited by them
(13) end
(14) Abandon the worst nests and build new ones;
(15) Keep the best solutions;
(16) Rank the solutions and compute the current best solution
(17) end while
(18) Post-process the results and visualize them
(19) end
Based on the demands of the VMs, clusters are computed, and VMs are clustered based on the characteristics of their resource demand. In the load balance computation step, processing time is used to quantify the VM load. The system is in a balanced state if the mean load is greater than the standard deviation of the loaded VMs' loads; otherwise, the system is in an imbalanced state.
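A small sketch of this balance criterion, taking per-VM processing times as the load measure; the threshold rule (mean greater than standard deviation) follows the text, while the sample loads are illustrative.

```python
import numpy as np

def is_balanced(vm_loads):
    # Balanced if the mean load exceeds the standard deviation
    # of the per-VM loads, as described in the text.
    loads = np.asarray(vm_loads, dtype=float)
    return loads.mean() > loads.std()

print(is_balanced([5.1, 4.9, 5.0, 5.2]))   # True: evenly loaded
print(is_balanced([0.1, 0.2, 0.1, 9.5]))   # False: one hotspot VM
```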
4 Experimental Result

In this section, the existing GA and ACO algorithms are considered to evaluate the performance metrics against the proposed ICSABC algorithm. Performance metrics like time complexity and accuracy are considered for evaluation.
Accuracy
The system is said to be better if the proposed algorithm exhibits higher accuracy. The accuracy comparison results of the proposed and existing methods are shown in Fig. 2, with the methods represented on the x-axis and the accuracy value on the y-axis. For the specified data, the proposed ICSABC algorithm produces highly accurate results, while the existing ACO and GA produce less accurate results. Thus, the results conclude that the proposed ICSABC improves the resource allocation process over cloud computing.

Time complexity
The system is superior when the proposed method exhibits a low time complexity. The time complexity comparison results of the proposed and existing methods are shown in Fig. 3, with the methods represented on the x-axis and the time complexity value on the y-axis. For the specified data, the proposed ICSABC algorithm results in a low time complexity, while the existing ACO and GA produce high time complexity. Thus, the results conclude that the proposed ICSABC improves the resource allocation process over cloud computing.
Fig. 2 Accuracy
Fig. 3 Time complexity
5 Conclusion

The proposed ICSABC algorithm is used to implement a load balancing task scheduling algorithm in a cloud computing environment. Cloud users' application load requests are balanced over the virtual machines in the cloud, and VM utilization is enhanced by the proposed ICSABC algorithm in the cloud system. In this research, Lévy flights are combined with the metaheuristic cuckoo search to form a new search algorithm, and the ABC algorithm is used together with the cuckoo species' breeding technique. Ant colony optimization and genetic algorithms are used for validating and comparing the performance of the proposed algorithm.
References 1. Mauch, V., Kunze, M., Hillenbrand, M.: High performance cloud computing. Future Gener. Comput. Syst. Int. J. Grid Comput. eScience 29(6), 1408–1416 (2012). Elsevier, ISSN: 0167739 2. Singh, A., Korupolu, M., Mohapatra, D.: Server-storage virtualization: Integration and load balancing in data centers. In: International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE, pp. 1–12. E-ISBN: 978-1-4244- 2835-9, Print ISBN: 978-1-4244-2834-2 (2008) 3. Ni, J., Huang, Y., Luan, Z., Zhang, J., Qian, D.: Virtual machine mapping policy based on load balancing in private cloud environment. In: International Conference on Cloud and Service Computing, IEEE, pp. 292–295. EISBN: 978-1-4577-1636-2, Print ISBN: 978-1-4577-1635-5 (2011) 4. Gao, R., Wu, J.: Dynamic load balancing strategy for cloud computing with ant colony optimization. Future Internet 7(4), 465–483. doi: https://doi.org/10.3390/fi7040465, ISSN: 1999-5903 (2015) 5. Ramezani, F., Lu, J., Hussain, F.K.: Task-based system load balancing in cloud computing using particle swarm optimization. Int. J. Parallel Program. 42(5), 739–754 (2014). https://doi. org/10.1007/s10766-013-0275-4 6. Ramezani, Fahimeh, Jie, Lu, Hussain, FarookhKhadeer: Task-based system load balancing in cloud computing using particle swarm optimization. Int. J. Parallel Prog. 42(5), 739–754 (2014) 7. Singh, A., Korupolu, M., Mohapatra, D.: Server-storage virtualization: integration and load balancing in data centers. In: Proceedings of the 2008 ACM/IEEE conference on Supercomputing. IEEE Press (2008) 8. Mondal, B., Dasgupta, K., Dutta, P.: Load balancing in cloud computing using stochastic hill climbing-a soft computing approach. Procedia Technol. 1(4), 783–789 (2012) 9. Ajit, M., Vidya, G.: VM level load balancing in cloud environment. In: 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT). IEEE (2013) 10. Lu, X., Gu, Z.,: A load-adapative cloud resource scheduling model based on ant colony algorithm. In: 2011 IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), pp. 296–300, IEEE (2011) 11. Singh, J., Singh, H.: Efficient tasks scheduling for heterogeneous multiprocessor using genetic algorithm with node duplication. Indian J. Comput. Sci. Eng. 2, 402 (2011) 12. Fang, Y., Wang, F., Ge, J.: A task scheduling algorithm based on load balancing in cloud computing. In: International Conference on Web Information Systems and Mining. Springer, Berlin, Heidelberg (2010)
13. Hu, J., et al.: A scheduling strategy on load balancing of virtual machine resources in cloud computing environment. In: 2010 3rd International Symposium on Parallel Architectures, Algorithms and Programming. IEEE (2010) 14. Wang, G.-G., et al.: Hybridizing harmony search algorithm with cuckoo search for global numerical optimization. Soft Comput. 20(1), 273–285 (2016) 15. Karaboga, D., Bahriye, A., Ozturk C.: Artificial bee colony (ABC) optimization algorithm for training feed-forward neural networks. In: International Conference on Modeling Decisions for Artificial Intelligence. Springer, Berlin, Heidelberg (2007)
Modified Genetic Algorithm with Artificial Neural Network Algorithm for Cancer Gene Expression Database Narendra Mohan and Neeraj Varshney
Abstract Cancer research is both biological and clinical in nature, and data-driven statistical research is becoming increasingly important. Predicting disease outcome is one of the most challenging and difficult tasks in the medical field, and data mining applications are developed for this purpose. For medical research groups, a high volume of medical data is collected and stored using computers with automated tools. In this research, a modified genetic algorithm + artificial neural network (MGA+ANN) algorithm is proposed for earlier cancer detection from the cancer features as well as for improving the accuracy of cancer classification results. The proposed method has three major stages: pre-processing, feature selection and classification. Pre-processing is done using the K-means algorithm to reduce noisy data in the given dataset; it handles missing features and redundant features using K-means centroid values and min-max normalization, respectively, which effectively enhances classification accuracy. The pre-processed features are given to the feature subset selection process to obtain more informative features from the cancer dataset. This is performed using the MGA algorithm, and the objective function is used to compute the necessary and prominent features based on the best fitness values. Then the ANN algorithm is applied for classification through a training and testing model; it classifies the features more accurately with the proposed method than with previous algorithms. With respect to accuracy, f-measure, recall and precision, the proposed MGA+ANN algorithm gives better results than existing algorithms, as shown by experimental results. Keywords Modified genetic algorithm (MGA) · Artificial neural network (ANN) · Cancer data · Feature extraction · Classification
N. Mohan · N. Varshney (B) Department of Computer Engineering and Applications, GLA University, Mathura 281406, India e-mail: [email protected] N. Mohan e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_12
1 Introduction

Cancer is a heterogeneous disease with various subtypes. It is very important to diagnose cancer in its early stages in order to determine the medical treatment required for a patient. There are numerous studies in the bioinformatics and biomedical fields about classifying cancer cells as malignant or benign. Detection and curing of cancer have been attempted over the last few decades. There are various types of cancer, including blood, throat, lung and breast cancer [1]; cancer causes death, and there is no universally curative treatment for it. There are six levels of cancer, and the first two levels can be cured if detected at an early stage within some time period. Advancements in technology have opened new ways to find techniques to cure cancer. There are three types of problems in classifying cancer. The first problem is predicting patient survival at a specific stage of cancer. The second is, if a person is affected by cancer, predicting the future course of the disease and how it can be cured [2]. The third problem is detecting early stages of cancer in the domain being handled. Cancer is caused by the growth of malignant tissue resulting from rapid division of cells. Datasets with different patient parameters are used to classify a tumour as benign or malignant. We have worked in the machine learning domain, focusing on various standard algorithms including neural networks, decision trees, naïve Bayes and logistic regression [3]. The presence of breast cancer is predicted from a given dataset; data of 500 patients with different parameters, including radius mean, area mean and perimeter mean, is collected from websites like Kaggle. Data mining is the process of discovering interesting knowledge in databases by applying intelligent methods to extract data patterns; it refers to mining or extracting knowledge from huge amounts of data [4, 5]. The use of data mining has attracted much attention in the information industry: huge amounts of data can be collected due to the growth of information technology, and methods are needed to extract useful information from those data.
2 Related Work

Aldryan et al. [6] used modified back propagation (MBP) with the conjugate gradient Polak-Ribiere method to propose a classification system, with gene selection done using ant colony optimization (ACO). MBP with conjugate gradient Polak-Ribiere classifies microarray data using the fundamental functioning of the human body's neural network. MBP is optimized by selecting important genes with ACO as the gene selector, and microarray data with large dimension can be processed by MBP. ACO is a newly developed feature selection method. MBP classification produces an f-measure value of 0.7297, and the score increases to 0.8635 when it is combined with ACO for feature selection.
Zhu et al. [7] implemented null space-based linear discriminant analysis (NSLDA) for cancer classification. The first-order derivative of the mass spectrometry profile is extracted, the data dimension is reduced by NSLDA according to a null space strategy, and discriminative features are extracted. The prostate cancer database PC-H4 and the ovarian cancer database OC-WCX2a are used for testing and evaluating the performance of this method, which produces better results than the LD and PC methods.

Bouazza et al. [8] used a filtering method to present the effect of feature selection methods on accuracy and discussed supervised classification error in cancer classification. The ReliefF, SNR, T-statistics and Fisher methods are compared, using colon cancer, prostate cancer and leukaemia cancer datasets in the experimentation. The hybrid method formed by SVM and SNR produces better results when compared to the support vector machine (SVM) and K-nearest neighbours (KNN) methods.

Wang et al. [9] made a way to understand cancer signal pathways by using a novel modelling technique applied to cancer classification. A regulatory network is constructed between biomarkers for a specific cancer group, and an energy function, defined as the disagreement between the network's output and input, is minimized by optimizing this network. A sigmoid kernel function is imposed to develop a nonlinear version of the network. Nasopharyngeal carcinoma protein profiling data is used to test this method, and support vector machines are used to compare performance.
3 Proposed Methodology

3.1 Pre-processing Using K-Means Algorithm

In this research, the K-means clustering algorithm is used to perform pre-processing in order to increase the cancer gene dataset classification accuracy. In the given dataset, inconsistencies are corrected, noise is smoothed out, and missing values are filled by this algorithm. Similar data are grouped according to the clusters' initial centroids using the K-means clustering technique [10], which uses the Euclidean distance to calculate the cluster centroids. Random partitioning is done at the start. The current cluster centres are computed repeatedly, and every data point is reassigned to the cluster whose centre is nearest to it; if there is no reassignment, the algorithm terminates. This minimizes the intra-cluster variance, given by the sum of squared differences between the feature data and their cluster centres. The runtime of the K-means algorithm is linear with respect to the number of data elements, and it can be implemented easily. The number of clusters equals the number of classes in this research.
Algorithm 1: K-means algorithm

1. From the dataset D, the number of clusters k is selected
2. Cluster centres µ1, ..., µk are initialized
3. k data points are chosen, and the cluster centres are set to these points
4. Points are assigned to clusters randomly, and each cluster's mean value is computed
5. For every data point, the closest cluster centre is computed, calculating the distance measure used for finding missing values with the formula

d(i, j) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}   (1)
where x_i and y_i are two points in Euclidean n-space

6. Redundant features are found using min-max normalization:

v' = \dfrac{v - \min(D)}{\max(D) - \min(D)}   (2)
where min and max represent the minimum and maximum values of the features in D

7. Data points are assigned to this cluster
8. Cluster centres are recomputed
9. If there are no new reassignments, the process terminates.

The instances with missing attributes are separated from the dataset, dividing it into two sets: one set has complete instances without any missing values, and the other has incomplete instances with missing attributes. K-means clustering is applied to the complete instances. The incomplete instances are then taken sequentially, and possible values are used to fill the missing values. K-means clustering is applied to the dataset formed from the resultant clusters, and each newly added instance is validated as to whether it is clustered in the proper cluster. If it is in the proper cluster, the assigned value becomes permanent, and the process is repeated for the next instances. If an instance is not in the proper cluster, values are reassigned and the same process is repeated until it is clustered in the proper cluster. The redundancy value is computed, and min-max normalization is used to reduce it: the minimum and maximum values of every feature are computed from the dataset, and repeated values are removed using the min-max values. The classification accuracy is enhanced by using these pre-processing techniques with the K-means algorithm.
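A condensed sketch of this pre-processing idea, under the assumption that missing entries are NaNs filled from the centroid of the nearest cluster and that min-max scaling follows Eq. (2); the two-cluster setting and the sample matrix are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def preprocess(X, k=2):
    X = np.asarray(X, dtype=float)
    complete = ~np.isnan(X).any(axis=1)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[complete])
    # Fill each incomplete row from its nearest centroid's values.
    for row in np.where(~complete)[0]:
        cols = np.isnan(X[row])
        filled = np.where(cols, km.cluster_centers_.mean(axis=0), X[row])
        c = km.predict(filled.reshape(1, -1))[0]
        X[row, cols] = km.cluster_centers_[c, cols]
    # Eq. (2): min-max normalization per feature.
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)

X = [[1.0, 2.0], [1.1, 2.1], [8.0, 9.0], [np.nan, 8.9]]
print(preprocess(X))
```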
3.2 Feature Selection Using MGA Algorithm

Features are extracted using a genetic algorithm (GA), based on the importance of each feature computed by the fitness equation. Feature selection reduces the computational complexity, and the analysis performance is enhanced by it. Features are extracted from the cancer dataset using a newly designed genetic algorithm, which solves assignment and arrangement problems by computing optimum values. The fitness values of the features are computed using the modified genetic algorithm, and genes are exchanged between parents for optimization. The five steps of the GA are as follows (a code sketch is given at the end of this section):

1. At the start, a population with M chromosomes is generated randomly, where M is the population size and x is the chromosome length.
2. For every chromosome x in the population, the fitness value ϕ(x) is computed.
3. Until M offspring are created, repeat the following:
   3.1. From the current population, a pair of chromosomes is selected probabilistically using the fitness function value.
   3.2. Offspring y_i are produced using mutation and crossover operators, where i = 1, 2, ..., N.
4. The newly created population replaces the current population.
5. Repeat from the second step.

The genetic algorithm is an iterative procedure in which a population represents the search space for finding the problem's solutions. Every individual is represented by a genome, which is a finite symbol string, and the GA proceeds as follows. An initial population of individuals is generated randomly. In each evolutionary step, the individuals in the current population are decoded and evaluated based on a fitness function, which describes the problem optimization in the search space. Individuals are selected to form a new population based on their fitness values. The fitness function is an objective function that measures how close a given individual comes to achieving the problem's set aims [11]. Various selection procedures exist; fitness proportionate selection is the simplest one, in which individuals are selected with probability proportional to their relative fitness. Relative fitness ensures that the number of times an individual is expected to be selected is proportional to its relative performance in the population. Individuals with high fitness values bring new individuals to the population and have a high chance of being reproduced, while individuals with low fitness values are not considered. Genetic algorithms are stochastic iterative processes whose convergence is not guaranteed; a maximum number of generations or a selected fitness level may be given as a stopping condition. A fuzzy variable is characterized by its membership function, which gives the membership value for a given real value u ∈ R. Fuzzy logic resembles the process of logical reasoning by humans. It is a computational method which manipulates
A fuzzy variable is characterized by its membership function, which gives the degree of membership for a given real value u(R). Fuzzy logic resembles the process of logical reasoning by humans; it is a computational method that manipulates information. Fuzzy logic has four major components: the knowledge base, the fuzzifier, the inference engine and the defuzzifier. The fuzzifier translates inputs into fuzzy values; the inference engine computes fuzzy outputs by applying a fuzzy reasoning mechanism; the defuzzifier converts the output back into crisp values; and the knowledge base contains the membership function groups and the fuzzy rules. A GA can search a large and complex search space and, on various problems, yields optimum or near-optimum solutions, so fuzzy genetic algorithms are considered for optimization. The parameters of the fuzzy system constitute the search space: the GA tunes the knowledge in the fuzzy system by computing the values of the membership functions. An expert defines the initial fuzzy system, the membership functions are encoded in a genome, and systems with better performance are found using the GA.
3.3 Artificial Neural Network (ANN) for Cancer Classification

An artificial neural network (ANN) is a mathematical model that simulates the structure and functionality of a biological neural network. The artificial neuron, a simple mathematical function, forms the basic building block of an ANN. Three sets of rules govern it: multiplication, summation and activation. At the initial stage of the artificial neuron, the inputs are weighted: each input value is multiplied by an individual weight [12]. In the middle section of the artificial neuron, a sum function adds the bias and all the weighted inputs. At the final stage, this sum of the bias and the weighted inputs is passed through an activation function, also termed a transfer function, as shown in Fig. 1. All artificial neurons follow the same set of rules and working principles, but they must be connected into artificial neural networks in order to extract their full computational power and potential, as shown in Fig. 2. A few basic and simple rules are used to reduce the complexity of artificial neural networks. Every feature vector in the training and testing datasets has nine features, and classification is into two classes: cancerous and non-cancerous cells. Classification is performed by selecting proper features, and the artificial neural network is used to obtain a highly accurate classification.
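The multiplication–summation–activation behaviour of a single artificial neuron described above can be written in a few lines of NumPy. This is a generic sketch, not the authors' network; the sigmoid activation and the example weights are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def artificial_neuron(x, w, b, activation=sigmoid):
    """Multiplication: each input is scaled by its weight.
    Summation: the weighted inputs and the bias are added.
    Activation: the sum is passed through the transfer function."""
    return activation(np.dot(w, x) + b)

x = np.array([0.2, 0.7, 0.1])   # inputs (e.g. three feature values)
w = np.array([0.5, -1.2, 0.8])  # one weight per input
b = 0.1                         # bias
print(artificial_neuron(x, w, b))
```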
4 Experimental Results

In this system, the existing SVM and KNN methods are evaluated on the breast cancer dataset alongside the proposed MGA+ANN algorithm.
Fig. 1 Artificial neuron’s working principle
Fig. 2 Simple artificial neural network example
The metrics of accuracy, f-measure, recall and precision are used to evaluate the performance. The Breast Cancer Wisconsin (Diagnostic) dataset, which contains 569 instances and 32 attributes as features, is considered for experimentation.

Precision Precision is computed as

Precision = True positive / (True positive + False positive)    (3)
Fig. 3 Precision
Precision expresses quality or exactness, whereas recall expresses quantity or completeness. An algorithm with a high precision value returns more relevant results. Precision is defined as the ratio of the number of true positives to the total number of elements classified as positive. The precision comparison of the existing and proposed methods is shown in Fig. 3, with the methods on the x-axis and the precision values on the y-axis. For the given breast cancer dataset, the existing SVM and KNN algorithms provided low precision, whereas the proposed MGA+ANN algorithm provided high precision. The result thus shows that the proposed MGA+ANN increases breast cancer classification accuracy through the optimal selection of a feature subset.

Recall Recall is computed as

Recall = True positive / (True positive + False negative)    (4)
Recall is defined as the ratio of the number of relevant documents retrieved to the total number of relevant documents that exist, while precision is the ratio of the number of relevant documents retrieved to the total number of documents retrieved. The recall comparison of the existing and proposed methods is shown in Fig. 4, with the methods on the x-axis and the recall values on the y-axis. For the given breast cancer dataset, the existing SVM and KNN algorithms provided low recall, whereas the proposed MGA+ANN algorithm provided high recall. The result thus shows that the proposed MGA+ANN increases breast cancer classification accuracy through the optimal selection of a feature subset.

F-measure The F1-score is expressed as
Fig. 4 Recall
F1-score = (2 × precision × recall) / (precision + recall)    (5)
The f-measure comparison of the existing and proposed methods is shown in Fig. 5, with the methods on the x-axis and the f-measure values on the y-axis. For the given breast cancer dataset, the existing SVM and KNN algorithms provided a lower f-measure, whereas the proposed MGA+ANN algorithm provides a high f-measure. The result thus shows that the proposed MGA+ANN increases breast cancer classification accuracy through the optimal selection of a feature subset.

Accuracy Accuracy defines the overall correctness of the model and is expressed as

Accuracy = (Tp + Tn) / (Tp + Tn + Fp + Fn)    (6)
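A small sketch showing how Eqs. (3)–(6) follow from the confusion-matrix counts; the counts used below are placeholders, not the paper's experimental numbers.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and accuracy, Eqs. (3)-(6)."""
    precision = tp / (tp + fp)               # Eq. (3)
    recall = tp / (tp + fn)                  # Eq. (4)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (5)
    accuracy = (tp + tn) / (tp + tn + fp + fn)            # Eq. (6)
    return precision, recall, f1, accuracy

# placeholder confusion-matrix counts (not the paper's results)
print(classification_metrics(tp=50, fp=5, fn=7, tn=100))
```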
The accuracy comparison of the existing and proposed methods is shown in Fig. 6, with the methods on the x-axis and the accuracy values on the y-axis.
Fig. 5 F-measure
Fig. 6 Accuracy
For the given breast cancer dataset, the existing SVM and KNN algorithms provided lower accuracy, whereas the proposed MGA+ANN algorithm exhibited high accuracy. The result thus shows that the proposed MGA+ANN increases breast cancer classification accuracy through the optimal selection of a feature subset.
5 Conclusion

In this research work, the MGA+ANN algorithm is proposed for prominently improving the accuracy of the classification results on the given gene cancer dataset. In this work, the K-means clustering algorithm is used to perform pre-processing; it enhances the classification accuracy by reducing missing values and redundant features in the given breast cancer dataset, and feature selection is then performed on the pre-processed data. The MGA optimization algorithm is used for selecting the subset of features, with the cuckoo's best fitness function used for selecting useful and important features. The classification is then done by the ANN algorithm, which enhances the accuracy of gene cancer classification. The results conclude that the proposed MGA+ANN algorithm provides higher classification performance, with greater accuracy, f-measure, recall and precision values, than the existing SVM and KNN algorithms.
References 1. Tan, A.C., Gilbert, D.: Ensemble Machine Learning on Gene Expression Data for Cancer Classification (2003) 2. Cruz, J.A., Wishart, D.S.: Applications of machine learning in cancer prediction and prognosis. Cancer Inform. 2, 117693510600200030 (2006) 3. Jørgensen, T.M., et al.: Machine-learning classification of non-melanoma skin cancers from image features obtained by optical coherence tomography. Skin Res. Technol. 14(3), 364–369 (2008) 4. Kharya, S.: Using data mining techniques for diagnosis and prognosis of cancer disease. arXiv preprint arXiv:1205.1923 (2012)
5. Chauhan, D., Jaiswal, V.: An efficient data mining classification approach for detecting lung cancer disease. In: 2016 International Conference on Communication and Electronics Systems (ICCES). IEEE (2016) 6. Aldryan, D.P., Annisa, A.: Cancer detection based on microarray data classification with ant colony optimization and modified backpropagation conjugate gradient Polak-Ribiére. In: 2018 International Conference on Computer, Control, Informatics and its Applications (IC3INA), pp. 13–16. IEEE (2018) 7. Zhu, L., et al.: Null space LDA based feature extraction of mass spectrometry data for cancer classification. In: 2009 2nd International Conference on Biomedical Engineering and Informatics. IEEE (2009) 8. Bouazza, S.H., et al.: Gene-expression-based cancer classification through feature selection with KNN and SVM classifiers. In: 2015 Intelligent Systems and Computer Vision (ISCV). IEEE (2015) 9. Wang, H.Q., et al.: Exploring protein regulations with regulatory networks for cancer classification. In: 2008 International Conference on BioMedical Engineering and Informatics, vol. 1. IEEE (2008) 10. Mohamad, I.B., Usman, D.: Standardization and its effects on K-means clustering algorithm. Res. J. Appl. Sci. Eng. Technol. 6(17), 3299–3303 (2013) 11. Jiang, S., et al.: Modified genetic algorithm-based feature selection combined with pre-trained deep neural network for demand forecasting in outpatient department. Expert Syst. Appl. 82, 216–230 (2017) 12. Kaymak, S., Helwan, A., Uzun, D.: Breast cancer image classification using artificial neural networks. Procedia Comput. Sci. 1(120), 126–131 (2017)
Transfer Learning: Survey and Classification Nidhi Agarwal, Akanksha Sondhi, Khyati Chopra, and Ghanapriya Singh
Abstract A key assumption in numerous data mining and machine learning (ML) algorithms is that the training data and testing data lie in the same feature space and have the same probability distribution function (PDF). However, in several real-life applications, this assumption may not hold true. There are problems where training data is costly or difficult to gather. Thus, there is a need to build high-performance classifiers trained using more commonly found data from distinct domains. This methodology is termed transfer learning (TL). TL is usually beneficial when enough data is not available in the target domain but a large dataset is available in the source domain. This survey paper explains transfer learning along with its categorization and provides examples and perspective related to transfer learning. Negative transfer learning is also discussed in detail, along with its effects on the accomplishment of learning in the target domain. Keywords Transfer learning (TL) · Machine learning (ML) · Survey · Negative transfer
N. Agarwal (B) Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial University, Barabanki, India e-mail: [email protected] A. Sondhi · K. Chopra Department of Electrical and Electronics Engineering, G D Goenka University, Gurugram, India e-mail: [email protected] K. Chopra e-mail: [email protected] G. Singh Department of Electronics Engineering, National Institute of Technology Uttarakhand, Srinagar, Uttarakhand, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_13
1 Introduction

ML and data mining have been broadly and effectively used in many applications where training data can be obtained in order to foresee future outcomes [1]. However, many ML techniques work well only under the hypothesis that the training data and the testing data are drawn from the same PDF and feature space. When this PDF changes, most models have to be reconstructed from scratch using newly gathered training data. In numerous applications, it is costly or not viable to re-collect the required training data and rebuild the models; in such instances, TL, or knowledge transfer among task domains, is required. TL is utilized to enhance a classifier by transferring useful parameters learned in one domain to another. In transfer learning, there are three main questions: "what to transfer," "how to transfer" and "when to transfer." "What to transfer" asks which portion of knowledge should be transmitted across domains or tasks: knowledge can be specific to certain domains and tasks, where it might or might not be useful, while other knowledge may be common among various domains and may help improve performance on the target domain or task. Once the part of knowledge to be transmitted is ascertained, learning algorithms must be built to transfer the useful knowledge; this is the second problem, "how to transfer." The third problem, "when to transfer," asks in which circumstances the transfer should take place. Most existing work on TL emphasizes "what to transfer" and "how to transfer" by simply supposing that the source domain and the target domain are related to each other. Many instances in data engineering can be found where TL is being used. One example is recognizing target plants [2] given knowledge of some similar plants. If the targeted plants are peppers, with fruits of complicated shapes and variable colors comparable to the plant canopy, then the aim is to trace and count green and red pepper fruits on large, dense pepper plants growing in a greenhouse. The method uses the transfer learning approach in two steps: (1) the fruits are traced in a single image (YOLO method), and (2) numerous views are combined to increase the detection rate of the fruits. As a real-time application from the field of machine learning, a technique in which transfer learning is used is the speech recognition system [3]. The source audio model is trained using a huge dataset of call-center telephone records, and meeting recordings of the Grand National Assembly of Turkey are set as the target domain. The aim is to use the transfer learning approach to relate the target data to the source model; the work also evaluates the effects of different transferred layers, feature extractors and target training data sizes on transfer learning. As a third example, consider sensor-based human activity recognition [4], which aims to learn high-level knowledge about human activity. Successful human activity recognition (HAR) applications include video surveillance, behavior analysis, gesture recognition and gait analysis. HAR is divided mainly into two types: sensor-based and video-based human activity recognition.
Sensor-based human activity recognition emphasizes gesture data from MEMS sensors, for example the gyroscope, accelerometer, barometer, magnetometer, sound sensors, Bluetooth and so on [5], while video-based HAR investigates images or videos of human movements captured by a camera. A proper definition of TL can be expressed as follows: given a source domain DS with a learning task TSL and a target domain DT with a learning task TTL, the aim of TL is to help improve the learning of the target predictive function fTP(.) in the target domain using the knowledge in the source domain, where DS ≠ DT or TSL ≠ TTL. The above definition uses two terms, "domain" and "task." For this survey, we consider a domain D to be made up of two constituents, namely a feature space Ẋ and a marginal PDF P(X), where X = {x1, x2, …, xn} ∈ Ẋ; that is, a domain is defined as the pair D = {Ẋ, P(X)}, and DS ≠ DT implies that either ẊS ≠ ẊT or PS(X) ≠ PT(X) [6]. For a given particular domain D = {Ẋ, P(X)}, a task contains two constituents, namely a label space Ẏ and an objective predictive function f(.); i.e., a task T is represented by T = {Ẏ, f(.)}. This survey paper delivers a thorough overview of the existing methods used in the field of transfer learning as it relates to the data mining tasks of regression, classification and clustering. A lot of research on transfer learning has been done in the ML literature [7]; this paper, though, emphasizes the categorization of TL techniques into inductive, transductive and unsupervised TL. Through this survey, we hope to deliver a beneficial resource for machine learning, data mining and TL. The remainder of the paper is structured as follows: Sect. 2 gives an overview of the classification of TL techniques into three divisions, each of which is explained using recent works in that particular field. A brief survey of the area of "negative transfer," which occurs when the transfer of knowledge has a negative influence on the target domain, is then given in Sect. 3. In the fourth section, we bring together some effective applications of TL, supported by published data along with software tools for TL research. Lastly, the paper is concluded, along with an analysis of future work, in the last section.
2 Distinctive Settings in Transfer Learning

This section surveys papers on the categorization of TL techniques and is divided into subsections that classify TL under three settings, namely inductive, transductive and unsupervised transfer learning. Figure 1 displays an overview of the different settings of TL, explicitly representing the three types of TL and all the possible cases under those categories. Inductive transfer learning consists of two cases, depending upon the availability of labeled data in the source domain and the target domain, and covers both regression and classification tasks. Transductive TL also exhibits two cases.
Fig. 1 Summary of distinctive settings of TL
In the first case, the feature spaces of the source and target domains differ; in the second, the feature spaces are the same but the marginal probability distributions differ, while the task remains the same in both cases. In the unsupervised transfer learning technique, labeled data is unavailable in both the source domain and the target domain.
2.1 Inductive TL

The idea of inductive transfer learning algorithms is to improve the approximation of the target predictive function fT(.) in the target domain, given that the target tasks are dissimilar from the source tasks; the source and target domains may or may not be the same. Depending on the availability of labeled and unlabeled data, inductive TL covers the following two cases: (1) multi-task learning and (2) self-taught learning. In the multi-task learning case, the source domain contains a large labeled database. The inductive TL approach focuses on attaining good performance on the target task by conveying knowledge from the source task; in multi-task methods, however, numerous tasks, including both source and target tasks, are learned concurrently [8].
In the case of self-taught learning (STL), labeled data is unavailable in the source domain, whereas labeled data is available in the target domain. Self-taught learning is a deep learning methodology that consists of two stages for categorization. First, a feature representation is learnt from a huge collection of unlabeled data; in the second stage, this learnt representation is applied to labeled data to perform the classification task [9]. In STL, the label spaces of the source domain and the target domain may be dissimilar, which implies that side knowledge of the source domain cannot be used directly. Inductive TL comprises a source domain DS with a learning task TSL, along with a target domain DT and a learning task TTL. The objective of inductive TL is to improve the learning of the target predictive function fTP(.) in the target domain by utilizing the knowledge in the source domain and task, given that TSL is not equal to TTL. The different approaches used in inductive TL are instance transfer [10], feature representation transfer [11], parameter transfer [12] and relational knowledge transfer [13].
2.2 Transductive TL

In the transductive TL technique, a lot of labeled data is present in the source domain, whereas no labeled data is present in the target domain [14]. In this setting, the source and target tasks are the same, and only the domains differ. Two further cases arise in transductive transfer learning, depending on the circumstances relating the source and target domains. In the first case, the feature spaces of the source and target domains are different, i.e., ẊS ≠ ẊT; in the second case, the feature spaces are the same but the marginal PDFs differ, i.e., P(XS) ≠ P(XT). The second case is associated with domain adaptation for knowledge transfer. Transductive TL has been used in the recognition of electroencephalogram signals [15] and in spectrum optimization [16]. Transductive transfer learning consists of a given source domain DS with a learning task TSL and a target domain DT with a learning task TTL. The aim of transductive TL is to improve the learning of the target predictive function fTP(.) in the target domain via the knowledge in the source domain and task, given DS ≠ DT and TSL = TTL. The approaches used in transductive TL are instance transfer and feature representation transfer.
2.3 Unsupervised TL

Unsupervised TL is similar to inductive TL, but the main difference is that labeled data is absent in both the source and the target domains. The unsupervised TL setting instead emphasizes solving unsupervised tasks such as spoken language understanding [17] and person re-identification [18].
In training, only unlabeled data is present in both the source domain and the target domain. Unsupervised TL comprises a source domain DS along with a learning task TSL, and a target domain DT with a learning task TTL. Its objective is to improve the learning of the target predictive function fTP(.) in the target domain by consuming the knowledge in the source domain, where it is considered that TSL ≠ TTL. As per the definition of unsupervised TL, labeled data is unavailable in both the source domain and the target domain. In recent years, a glitch classification algorithm [19] and an image quality assessment (IQA) algorithm [20] have been applied to the unsupervised clustering of LIGO data and to distortion-generic blind image quality assessment, respectively. The approach used in unsupervised transfer learning is feature representation transfer, which deals with TL for relational domains. Feature representation has application in person re-identification [21]: when inadequate training samples are given, feature representations learned from a larger auxiliary dataset need to be transferred. In [22], a different instance of the clustering problem, known as self-taught clustering (STC), is recognized. STC is an example of unsupervised TL that aims at clustering a small collection of unlabeled data in the target domain, given a huge amount of unlabeled data in the source domain.
3 Negative Transfer

In the TL setting, labeled data is rare for the precise target task, and using data from a related source task often provides a practical solution. However, when knowledge is conveyed from a less related source, it may instead damage the target performance; this occurrence is known as negative transfer. Despite its severity, negative transfer is typically described in a casual manner, lacking a rigorous definition, careful analysis or systematic treatment [23]. Three main goals must be achieved to avoid negative transfer: (1) remove "knowledge" obtained in the source domain that is harmful to the target domain, (2) retain "knowledge" that is useful in the target domain and (3) recognize a "cutting point" at which the retraining process is initiated for the best knowledge transfer [24]. TL transfers knowledge from a source domain, in which many labeled training samples are available, to a target domain, in which labeled training samples are absent. But the data may follow another distribution in the target domain even when the domains are interrelated; it then becomes difficult to find a strong degree of relationship between the source domain and the target domain, which gives rise to negative transfer. As per recent studies, a few strategies can be used to prevent negative transfer, viz. circular validation, the maximum mean discrepancy (MMD) similarity metric and similar additional methods [25]. Negative transfer is also seen in sentence construction, vocabulary, culture and coherence when translating from Chinese to English. Mi et al. [26] experimented with 150 sophomores from Xidian University, using a writing test and interviews to complete the study.
By arbitrarily sampling students' writing and interviewing them, the errors produced by negative transfer were classified and compared. Gui et al. [27] observed that class noise accumulated during learning iterations leads to negative transfer in the transductive transfer learning setting; a novel method was proposed to detect negative transfer by identifying noise samples for noise reduction.
4 Survey of Transfer Learning

In recent times, TL methods have been used effectively in numerous applications. Esteva et al. [28] and Xia [29] suggested using TL methods for deep learning in health care and in healthcare text analytics, respectively. The former presents TL methods for health care using computational techniques that influence some chief areas of medicine and surveys how to build end-to-end systems. In the latter case, TL has been utilized in healthcare analytics tasks to recognize health issues, through three distinct types of knowledge transfer. The first approach is word embedding, used to evaluate the resemblance between words; in the second approach, domain discrepancies are reduced by domain knowledge transfer; lastly, the Unified Medical Language System (UMLS), which holds a list of key terminology and related resources, is utilized to enhance the text analytics. In [30, 31], the transfer learning technique is used for the detection and categorization of breast cancer and for medical imaging, respectively. Three different architectures, namely GoogLeNet, ResNet and VGGNet, were analyzed for the categorization of breast cancer. The TL concept is used in the detection and categorization of cancer tumors, which involves a few steps: data preprocessing, augmentation, a pretrained CNN architecture intended for feature extraction, and classification. Raghu et al. used the transfer learning technique for medical imaging, as there is a small difference between the features and task specifications of natural image categorization and those of the target medical tasks. The ResNet50 [32] and Inception-v3 [33] CNN architectures trained on the standard ImageNet database are used in the field of medical transfer learning. Component-based face recognition using the transfer learning technique for forensic applications is presented in [34]. It explains that the knowledge acquired from one domain, i.e., overall face images, is used to categorize elements of the face (lips, ears and nose), i.e., another domain. The face and all its components are from distinct domains, which imparts the conventional information used to transfer the knowledge acquired from one domain to the other. Apart from this, half-faces, such as the right, left, lower and upper halves and the right, left, lower and upper diagonals, are used for the union between partial and complete face images. These associations are used for component-based face recognition, partial face recognition and holistic face recognition. A novel way to adapt source domain information to suit the target domain is put forward in [35]: it considers partial samples of the target domain as seeds for commencing the transfer of source information. Seeded transfer learning is also used for regression problems [36].
The approach differs from the existing literature in that it uses a small amount of target-domain data as seeds for initializing the transfer of source information. The transfer learning technique is also used for auto-tuning exascale applications in the field of machine learning [37]. It aims at originating methods for use in auto-tuning, where the objective is to discover the best-performing parameters of an application given as a black-box function. The optimization methods taken into account were black-box optimization, non-Bayesian optimization and Bayesian optimization for auto-tuning applications. The paper presents two ways: (1) auto-tuners tune the application on a finite set of inputs to collect knowledge, which is then applied to all input problems, thus speeding up the tuning process; and (2) the optimal parameters are obtained on the initial set of problems. Transfer learning for auto-tuning 1 (TLA1) is applied if an approximation to the optimum parameter configuration of a new task is deemed enough, whereas transfer learning for auto-tuning 2 (TLA2) is applied if a higher-quality optimal parameter configuration of a new task is anticipated. In [38], TL is used in convolutional neural networks (CNNs) for biomedical image categorization. Generic image features are obtained from natural image datasets, and a part of the generic image features is used on smaller datasets. For microscopic image classification, various TL architectures are used: the PAP-smear and 2D-Hela datasets are used in experimentation, with three state-of-the-art convolutional neural networks, namely Inception-v3, ResNet152 [39] and Inception-ResNet-v2 [40], employed and their extracted features concatenated. Zanini et al. [41] propose the concept of TL in the milieu of electroencephalogram (EEG)-based brain–computer interface (BCI) categorization. It uses datasets from previous sessions and calibrates the classifier to allow a calibration-less BCI mode of operation. The methods involve representing the data by means of spatial covariance matrices of EEG signals and/or exploiting current effective techniques based on the Riemannian geometry of symmetric positive definite (SPD) matrices. To make the data comparable between different EEG sessions, the affine transform of the spatial covariance matrices can be centered with reference to other sessions. For additional details about the experimental results, the reader is referred to the papers listed in the reference section of this survey. From the included examples, it is inferred that TL methods are designed suitably for real-world applications and that the TL technique can enhance performance appreciably in contrast to non-transfer-learning approaches. Various researchers have also provided MATLAB toolkits for transfer learning; furthermore, these toolkits offer a standard platform for testing and developing new TL algorithms.
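A generic sketch of the feature-extraction style of transfer learning discussed throughout this section, using a Keras ResNet50 pretrained on ImageNet, is shown below. The class count, input shape and training settings are placeholder assumptions, not values from any of the surveyed papers.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5            # placeholder target-task class count
IMG_SHAPE = (224, 224, 3)

# Source knowledge: convolutional features learned on ImageNet.
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=IMG_SHAPE)
base.trainable = False     # freeze: transfer the representation as-is

# Target task: a small classification head trained on target-domain data.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(target_images, target_labels, epochs=10)  # target-domain data
```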
5 Conclusions

This survey paper paints a picture of transfer learning by reviewing its numerous existing settings. TL is categorized into three special settings, namely inductive, transductive and unsupervised TL.
Moreover, these three settings of TL can be summed up into four cases based on "what to transfer": the instance transfer, parameter transfer, feature representation transfer and relational knowledge transfer methodologies. The first three of these assume that the data is independent and identically distributed, whereas relational knowledge transfer deals with TL on relational data. In non-transfer-learning techniques, it is expected that the training and test data come from the same domain, although this is not the case with transfer learning. This survey includes numerous studies focusing on the area of negative transfer, which remains a highly demanding research area. Several TL algorithms assume that the source and the target domains are linked together; nevertheless, if this hypothesis does not hold, negative transfer may occur, which may degrade the performance of the system. The conception of optimal transfer is to select information from a source domain and transfer it so as to attain the maximum achievable performance in the target domain. There is overlap between the conceptions of optimal transfer and negative transfer; still, optimal transfer endeavors to locate the best-performing target learner, which goes beyond the concept of negative transfer. Furthermore, prevailing transfer learning algorithms generally concentrate on reconciling the different distributions of the source domain and the target domain. In countless applications, however, we might desire to transfer information across tasks or domains that have distinct feature spaces, and to transfer from numerous such source domains. In the forthcoming years, transfer learning methods may be extensively utilized to resolve additional exciting applications. TL methods have been applied to various applications, viz. health care, face recognition, auto-tuning, human–computer interaction, context awareness, etc.
References 1. Weiss, K., Khoshgoftaar, T.M., Wang, D.D.: A survey of transfer learning. J. Big Data 3(1), 9 (2016) 2. Mure¸san, H., Oltean, M.: Fruit recognition from images using deep learning. Acta Univ. Sapientiae Inform. 10(1), 26–42 (2018) 3. Asefisaray, B., Haznedaro˘glu, A., Erden, M., Arslan, L.M.: Transfer learning for automatic speech recognition systems. In: 2018 26th Signal Processing and Communications Applications Conference (SIU), May 2018, pp. 1–4. IEEE 4. Wang, J., Chen, Y., Hao, S., Peng, X., Hu, L.: Deep learning for sensor-based activity recognition: a survey. Pattern Recogn. Lett. 119, 3–11 (2019) 5. Chowdhary, M., Kumar, A., Singh, G., Meer, K.R., Kar, I.N., Bahl, R.: STMicroelectronics International NV and STMicroelectronics lnc.: Method and apparatus for determining probabilistic context awareness of a mobile device user using a single sensor and/or multi-sensor data fusion. U.S. Patent 9,870,535 (2018) 6. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2009) 7. Taylor, M.E., Stone, P.: Cross-domain transfer for reinforcement learning. In: Proceedings of the 24th International Conference on Machine learning, pp. 879–886. ACM (2007)
8. Simões, R.S., Maltarollo, V.G., Oliveira, P.R., Honorio, K.M.: Transfer and multi-task learning in QSAR modeling: advances and challenges. Front. Pharmacol. 9, 74 (2018) 9. Javaid, A., Niyaz, Q., Sun, W., Alam, M.: A deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), May 2016, pp. 21–26 10. Alothman, B.: Similarity based instance transfer learning for botnet detection. Int. J. Intell. Comput. Res. (IJICR) 9, 880–889 (2018) 11. Wang, W., Wang, H., Zhang, C., Xu, F.: Transfer feature representation via multiple kernel learning. In: Twenty-Ninth AAAI Conference on Artificial Intelligence, Feb 2015 12. Kumagai, W.: Learning bound for parameter transfer learning. In: Advances in Neural Information Processing Systems, pp. 2721–2729 (2016) 13. Wang, D., Li, Y., Lin, Y., Zhuang, Y.: Relational knowledge transfer for zero-shot learning. In: Thirtieth AAAI Conference on Artificial Intelligence (2016) 14. Rajesh, M., Gnanasekar, J.M.: Annoyed realm outlook taxonomy using twin transfer learning. Int. J. Pure Appl. Math. 116(21), 549–558 (2017) 15. Xie, L., Deng, Z., Xu, P., Choi, K.S., Wang, S.: Generalized hidden-mapping transductive transfer learning for recognition of epileptic electroencephalogram signals. IEEE Trans. Cybern. 49(6), 2200–2214 (2018) 16. Yao, Q., Yang, H., Yu, A., Zhang, J.: Transductive transfer learning-based spectrum optimization for resource reservation in seven-core elastic optical networks. J. Lightw. Technol. (2019) 17. Siddhant, A., Goyal, A., Metallinou, A.: Unsupervised transfer learning for spoken language understanding in intelligent agents. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 4959–4966, July, 2019 18. Lv, J., Chen, W., Li, Q., Yang, C.: Unsupervised cross-dataset person re-identification by transfer learning of spatial-temporal patterns. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7948–7956 (2018) 19. George, D., Shen, H., Huerta, E.A.: Classification and unsupervised clustering of LIGO data with Deep Transfer Learning. Phys. Rev. D 97(10), 101501 (2018) 20. Bianco, S., Celona, L., Napoletano, P., Schettini, R.: On the use of deep learning for blind image quality assessment. SIViP 12(2), 355–362 (2018) 21. Geng, M., Wang, Y., Xiang, T., Tian, Y.: Deep transfer learning for person re-identification (2016). arXiv preprint arXiv:1611.05244 22. Dai, W., Yang, Q., Xue, G.-R., Yu, Y.: Self-taught clustering. In: Proceedings of the 25th International Conference on Machine Learning, pp. 200–207. ACM (2008) 23. Wang, Z., Dai, Z., Póczos, B., Carbonell, J.: Characterizing and avoiding negative transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11293–11302 (2019) 24. Ulmer, A., Kapach, Z., Merrick, D., Maiya, K., Sasidharan, A., Alikhan, A., Dang, D.: Actively Preventing Negative Transfer (2019) 25. Paul, A., Vogt, K., Rottensteiner, F., Ostermann, J., Heipke, C.: A comparison of two strategies for avoiding negative transfer in domain adaptation based on logistic regression. Int. Arch. Photogram. Remote Sens. Spat. Inf. Sci. ISPRS Arch. 42(2), 845–852 (2018) 26. Mi, X., Chen, W., Zhang, Y.: A study on negative transfer of Non-English majors' writing under the error-analysis theory. In: 2018 8th International Conference on Management, Education and Information (MEICI 2018).
Atlantis Press, Dec (2018) 27. Gui, L., et al.: Negative transfer detection in transductive transfer learning. Int. J. Mach. Learn. Cybern. 9(2), 185–197 (2018) 28. Esteva, A., Robicquet, A., Ramsundar, B., Kuleshov, V., DePristo, M., Chou, K., Cui, C., Corrado, G., Thrun, S., Dean, J.: A guide to deep learning in healthcare. Nat. Med. 25(1), 24 (2019) 29. Xia, L., Goldberg, D.M., Hong, S., Garvey, P.: Transfer Learning in Knowledge-Intensive Tasks: A Test in Healthcare Text Analytics (2019)
30. Khan, S., Islam, N., Jan, Z., Din, I.U., Rodrigues, J.J.C.: A novel deep learning based framework for the detection and classification of breast cancer using transfer learning. Pattern Recogn. Lett. 125, 1–6 (2019) 31. Raghu, M., Zhang, C., Kleinberg, J., Bengio, S.: Transfusion: Understanding transfer learning with applications to medical imaging (2019). arXiv preprint arXiv:1902.07208 32. He, K., Girshick, R., Dollár, P.: Rethinking imagenet pre-training (2018). arXiv preprint arXiv: 1811.08883 33. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift (2015). arXiv preprint arXiv:1502.03167 34. Kute, R.S., Vyas, V., Anuse, A.: Component-based face recognition under transfer learning for forensic applications. Inf. Sci. 476, 176–191 (2019) 35. Salaken, S.M., Khosravi, A., Nguyen, T., Nahavandi, S.: Seeded transfer learning for regression problems with deep learning. Expert Syst. Appl. 115, 565–577 (2019) 36. Chakraborty, S., Gilyén, A., Jeffery, S.: The power of block-encoded matrix powers: improved regression techniques via faster Hamiltonian simulation (2018). arXiv preprint arXiv:1804. 01973 37. Sid-Lakhdar, W.M., Aznaveh, M.M., Li, X.S., Demmel, J.W.: Multitask and Transfer Learning for Autotuning Exascale Applications (2019). arXiv preprint arXiv:1908.05792 38. Nguyen, L.D., Lin, D., Lin, Z., Cao, J.: Deep CNNs for microscopic image classification by exploiting transfer learning and feature concatenation. In: 2018 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5 (2018) 39. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015) 40. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Thirty-First AAAI Conference on Artificial Intelligence, Feb 2017 41. Zanini, P., Congedo, M., Jutten, C., Said, S., Berthoumieu, Y.: Transfer learning: Riemannian geometry framework with applications to brain–computer interfaces. IEEE Trans. Biomed. Eng. 65(5), 1107–1116 (2017)
Machine Learning Approaches for Accurate Image Recognition and Detection for Plant Disease Swati Vashisht, Praveen Kumar, and Munesh C. Trivedi
Abstract The world population has reached 7.7 billion and is still increasing at a rapid rate; it is expected to reach approximately 9.8 billion by 2050. With this increase in population, several concerns grow in parallel, such as the environment, development, and food security and availability, and a growing population leads to a growing demand for food. There is only 510.1 million km2 of area on Earth in total, of which a very small portion is used for crop cultivation. Hence, it becomes important to make the best possible use of the land available for growing crops and to feed our population accordingly [1]. With the fast improvement of the Internet and information technology, maintaining large records about any specific dataset and processing them according to requirements has become very easy and efficient [2]. A recent survey predicts that 30% of jobs will disappear in the next nine years, as machine learning algorithms replace them with automation and machinery requiring no human interaction. Many technologies come and go over the years, but some basic technologies can never be replaced [3]. In this paper, we analyze different machine learning approaches involved in plant disease detection. The paper also explains the working of various approaches and algorithms and their comparative measures against other methods and technologies. Keywords Machine learning approaches · Advanced farming · Plant disease detection · Classifiers
S. Vashisht (B) · P. Kumar Computer Science and Engineering, Amity University, Noida, Uttar Pradesh, India e-mail: [email protected] P. Kumar e-mail: [email protected] M. C. Trivedi Computer Science and Engineering, National Institute of Technology Agartala, Agartala, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_14
1 Introduction

We rely upon edible plants just as we rely upon oxygen. Without harvests there is no nourishment, and without nourishment there is no life; it is no accident that human progress began to flourish with the invention of farming. Today, modern technology enables us to grow crops in the quantities necessary for a steady food supply for billions of individuals. However, diseases remain a significant threat to this supply, and a large fraction of yields is lost every year to disease. The situation is particularly critical for the 500 million smallholder farmers around the world, whose livelihoods depend upon their harvests doing well. In Africa alone, 80% of the agricultural yield originates from smallholder farmers. With billions of cell phones worldwide, would it not be valuable if the cell phone could be turned into a diagnostics instrument [4], recognizing disease from the pictures it captures with its camera? This challenge is the first of many steps toward turning that vision into a reality. We have gathered, and continue to gather, a huge number of pictures of diseased and healthy crops. The objective of this challenge is to create algorithms that can precisely diagnose a disease based on a picture [5].
1.1 Computer Vision and Machine Learning

Traditional computer vision deals with image and video processing, extracting information from images and video in a reliable manner by investigating the various parts of the picture [5]. A basic example is finding the edges in an image: techniques are applied to compute the intensity differences across the high-dimensional pixel grid, and a threshold is then set where the derivative is high, marking the edges. A basic undertaking in computer vision is image recognition and classification. Owing to the use of deep learning in image recognition and classification, computers can automatically generate and learn high-dimensional attributes and properties [6]. Furthermore, by interpreting every given dimension, machines anticipate what is in the picture and indicate the degree of likelihood as a percentage of certainty [7].
1.2 Computer Vision Techniques Used for Image Recognition and Classification Following are some computer vision techniques that are used in this paper for image processing and classification [8].
1.2.1 Canny Edge Detection
The Canny edge detector is an edge detection algorithm that uses a multi-stage computation to identify a wide range of edges in images [9]. It is a technique to extract useful structural information from different vision objects while greatly reducing the amount of data to be processed [10], and it has been applied extensively in various computer vision frameworks. Canny found that the requirements for the application of edge detection on diverse vision systems are relatively similar; as a result, an edge detection solution meeting these requirements can be applied in a wide range of situations. The general criteria for edge detection include: (1) detection of edges with a low error rate, which means that the detection should accurately catch as many of the edges shown in the image as possible; (2) the edge point detected should accurately localize on the center of the edge; and (3) a given edge in the image should only be marked once, and, where possible, image noise should not create false edges [11].
1.2.2 Steps for Performing Canny Edge Detection
• Noise reduction. Since edge detection results are highly sensitive to image noise, a way to get rid of the noise and slightly smooth the image is to blur it with a Gaussian filter. The kernel of a (2k + 1) × (2k + 1) Gaussian filter is

H(i, j) = (1 / (2πσ²)) exp(−[(i − (k + 1))² + (j − (k + 1))²] / (2σ²)),  1 ≤ i, j ≤ 2k + 1,

where k determines the kernel size [12, 13].
• Gradient calculation. This step detects the edge intensity and direction.
• Non-maximum suppression. Ideally, the final picture should have thin edges; hence, non-maximum suppression is performed to thin out the edges.
• Double threshold. In this step, strong, weak and non-relevant pixels are detected:
1: Strong pixels: pixels which have high intensity.
2: Weak pixels: pixels whose intensity is not high enough for them to be considered strong.
3: Non-relevant pixels: the remaining pixels, which are not part of any edge.
• Edge tracking by hysteresis. In this final step, weak pixels are kept (promoted to strong) only if at least one neighbouring pixel is strong; the remaining weak pixels are suppressed. A usage sketch of the complete pipeline is given below.
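As an illustration (not code from the paper), OpenCV's cv2.Canny performs the gradient, non-maximum suppression, double-threshold and hysteresis stages internally; the file name, kernel size and thresholds below are assumed values.

```python
import cv2

# Load a leaf image in grayscale (file name is a placeholder).
img = cv2.imread("leaf.jpg", cv2.IMREAD_GRAYSCALE)

# Step 1: noise reduction with a 5x5 Gaussian kernel (k = 2).
blurred = cv2.GaussianBlur(img, (5, 5), sigmaX=1.4)

# Steps 2-5: gradient calculation, non-maximum suppression,
# double thresholding and edge tracking by hysteresis are all
# performed inside cv2.Canny; 100 and 200 are the weak/strong
# thresholds of the double-threshold step.
edges = cv2.Canny(blurred, threshold1=100, threshold2=200)

cv2.imwrite("leaf_edges.jpg", edges)
```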
1.2.3 Probabilistic Neural Network
A probabilistic neural network (PNN) is a feed-forward neural network closely related to the Parzen window PDF estimator. A PNN consists of several sub-networks, each of which is a Parzen window PDF estimator for one of the classes. Using the PDF of each class, the class-conditional likelihood of a new input is estimated, and Bayes' rule is then utilized to assign the class with the highest posterior likelihood to the new input. By this strategy, the probability of misclassification is minimized. PNNs are widely used in classification and pattern recognition problems [14].
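A compact sketch of the PNN idea follows: Gaussian Parzen windows estimate each class-conditional density, and Bayes' rule (with equal priors) picks the class with the highest posterior. The bandwidth and toy data are assumptions.

```python
import numpy as np

def parzen_density(x, samples, sigma=0.5):
    """Gaussian Parzen-window estimate of p(x | class)."""
    d = samples.shape[1]
    norm = (2 * np.pi * sigma**2) ** (d / 2)
    k = np.exp(-np.sum((samples - x) ** 2, axis=1) / (2 * sigma**2))
    return k.mean() / norm

def pnn_classify(x, class_samples):
    """Assign x to the class with the highest posterior (equal priors)."""
    scores = [parzen_density(x, s) for s in class_samples]
    return int(np.argmax(scores))

rng = np.random.default_rng(1)
class0 = rng.normal(loc=0.0, size=(30, 2))  # training samples of class 0
class1 = rng.normal(loc=3.0, size=(30, 2))  # training samples of class 1
print(pnn_classify(np.array([2.8, 3.1]), [class0, class1]))  # -> 1
```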
1.2.4 K-Nearest Neighbor
K-Nearest Neighbor (KNN) is one of many supervised learning algorithms utilized in data mining and machine learning; it is a classifier algorithm in which learning is based on how similar a data point (a vector) is to other points [15].
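A minimal scikit-learn usage sketch of a KNN classifier on feature vectors follows; the synthetic data and the choice of k = 3 are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=9, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)  # vote among the 3 nearest vectors
knn.fit(X_tr, y_tr)
print("KNN accuracy:", knn.score(X_te, y_te))
```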
1.2.5 Support Vector Machine
The support vector machine (SVM) helps in analyzing and classifying data. Sometimes we come across datasets whose classes are so similar that even we as humans cannot distinguish them; for a machine it is an equally difficult task, and it is addressed by one of machine learning's most famous classification algorithms, the SVM [16]. The SVM algorithm looks at the extremes of the datasets and draws a decision boundary, also known as a hyperplane, near the extreme points in the dataset.
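A corresponding scikit-learn sketch of an SVM fitting its maximum-margin hyperplane follows, again on assumed synthetic data; the linear kernel is a deliberate simplification.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=9, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="linear", C=1.0)  # hyperplane fit near the extreme points
svm.fit(X_tr, y_tr)
print("support vectors:", svm.support_vectors_.shape[0])
print("SVM accuracy:", svm.score(X_te, y_te))
```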
1.2.6 Feature Extraction
In image processing and pattern recognition, feature extraction starts from an initial set of measured image data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps and, in some cases, leading to better human interpretation [17]. Determining a reduced subset of the initial features is known as feature selection, and the selected features are expected to contain all the relevant information from the input data [18].
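A short sketch contrasting feature extraction (deriving new values, here via PCA) with feature selection (keeping a subset of the original features, here via a univariate score) follows; both are generic scikit-learn utilities, not the paper's specific method.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Feature extraction: derive 2 new, non-redundant values per sample.
X_extracted = PCA(n_components=2).fit_transform(X)

# Feature selection: keep the 2 most informative original features.
X_selected = SelectKBest(f_classif, k=2).fit_transform(X, y)

print(X_extracted.shape, X_selected.shape)  # (150, 2) (150, 2)
```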
2 Machine Learning Approaches: A Comparative Study

We have studied and analyzed various existing architectures that have already been proposed for different crop diseases and prepared a summary of the data so that we can propose an architecture (Table 1).

Table 1 Comparison of different existing architectures and their accuracy results

| Approach | Crop analyzed | Description | Result (accuracy) (%) |
|---|---|---|---|
| VGGNet with 13 convolutional layers [19] | Tomato | A deep learning framework, Caffe, developed by the Berkeley Vision and Learning Center, is used to perform deep CNN classification [19] | 96.3 |
| Raspberry Pi and deep convolutional neural networks [20] | Tomato | Classification is done by a DCNN using Keras and TensorFlow. The CNN can automatically recognize interesting areas in images, which reduces the need for image processing [20] | 97.2 |
| Overlapping pooling method with SVM and KNN [16] | American cotton leaf | Overlapping pooling is implemented, combining multilayered perceptrons (MLP) for detection of affected plants and a graph-based MLP for transforming the feature orientation with KNN and SVM [21]. To minimize errors by double-layered modeling, techniques like morphological segmentation, pattern matching and hue matching are combined [16] | 96 |
| KNN classifier [15] | Groundnut leaf | Image acquisition, image preprocessing, segmentation, feature extraction and classification are done using K-Nearest Neighbor (KNN) [15] | 95 |
| Artificial neural network [22] | Cotton | Image preprocessing is done, the affected leaf is highlighted based on color changes, and classification is done by an ANN [22, 23] | 90 |
| Super-resolution convolutional neural network [24] | Tomato | A super-resolution architecture is used: video is transformed into still frames used for detection, and the frames are then converted back into video [30] | 96 |
| CNN and particle swarm optimization [31] | Maize | The orthogonal learning particle swarm optimization (OLPSO) algorithm is utilized to optimize a number of hyperparameters of the CNN models, finding optimal values for these hyperparameters rather than using traditional methods such as manual trial and error [31] | 98.2 |
3 Conclusion

Machine involvement in our day-to-day life has been increasing exponentially over the last few years. From a farmer's point of view, checking each individual plant and observing it precisely is not technically possible, yet it can be accomplished easily by applying a computer-aided approach to a large dataset. Machine learning has already been taken up in many aspects and research areas for better results and for designing better decision support systems. We will be using these approaches for advanced farming processes, and the estimation of crop health will provide a measure of healthy crop yield. Machine learning approaches enable precise agricultural inputs, encouraging automated variable-rate application of fertilizers, weed control and pesticides, which allows us to produce healthy crops and also helps detect infected crops with much greater ease.
In this paper, we have discussed the methods developed and suggested by researchers for various crops, evaluated these deep learning approaches, and found that the LeNet architecture, which consists of two layers of convolution, activation and pooling, provides higher accuracy in disease detection.
References 1. Luo, G., Pan, S., Zhang, Y., Jia, H., Chen, L.: Research on establishing numerical model of geo material based on CT image analysis. Eurasip J. Image Video Process. Open Access 2019(1), 1 Dec 2019 (2019) 2. Ali, M.M., Bachik, N.A., Muhadi, N., Tuan Yusof, T.N., Gomes, C.: Non-destructive techniques of detecting plant diseases: a review. Physiol. Mol. Plant Pathol. 108 (2019) 3. Ling, X., Zhou, H., Deng, W., Li, C., Gu, C., Sun, F.: Model ensemble for click prediction in bing search ads. In: 26th International World Wide Web Conference 2017, WWW 2017 Companion, pp. 689–698 (2017) 4. Sladojevic, S., Arsenovic, M., Anderla, A., Culibrk, D., Stefanovic, D.: Deep neural networks based recognition of plant diseases by leaf image classification. In: 2016, Computational Intelligence and Neuroscience 5. Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease detection. Front. Plant Sci. (2016) 6. Zhang, C., Zhou, P., Li, C., Liu, L.: A convolutional neural network for leaves recognition using data augmentation. In: 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing 7. Sullca, C., Molina, C., Rodríguez, C., Fernández, T.: Diseases detection in blueberry leaves using computer vision and machine learning techniques. Int. J. Mach. Learn. Comput. 9(5) (2019) 8. Gohad, P.R., Khan, S.: A study of crop leaf disease detection using image processing techniques. Int. J. Sci. Technol. Res. 8(10), 215–217 (2019) 9. Bakhsh, N., Shin, J.Y., Gotway, M.B., Liang, J.: Computer-aided detection and visualization of pulmonary embolism using a novel, compact, and discriminative image representation. Medical 58 (2019) 10. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Abinovich, A.: Going deeper with convolutios. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1–9. 7–12 June 2015 11. Gavhale, K.R., Gawande, U., Hajari, K.O.: Unhealthy region of citrus leaf detection using image processing techniques. In: International Conference for Convergence of Technology, I2CT (2014) 12. Gadi,V., Garg, A., Manogaran, I., Sekharan, S., Zhu, H.: Understanding soil surface water content using light reflection theory: A novel color analysis technique considering variability in light intensity. J. Test. Eval. 48(5) (2020) 13. Zhao, Z., Li, B., Chen, L., Xin, M., Gao, F., Zhao, Q.: Interest point detection method based on multi-scale Gabor filters. IET Image Process. 13(12) (2019) 14. De Shuang, H.: Radial basis probabilistic neural networks: model and application. Int. J. Pattern Recognit Artif Intell. 13(07), 1083–1101 (1999) 15. Vaishnnave, M.P., Suganya Devi, K., Srinivasan, P., ArutPerumJothi, G.: Detection and classification of groundnut leaf diseases using KNN classifier. In: 2019 Proceedings of International Conference on Systems, Computation, Automation and networking 16. Prashar, K., Talwar, R., Kant, C.: CNN based on overlapping pooling method and multi-layered learning with SVM & KNN for American cotton leaf disease recognition. In: 2019 International Conference on Automation, Computational and Technology Management
17. Wang, S.H., Lv, Y.D., Sui, Y., Liu, S., Wang, S.J., Zhang, Y.D.: Alcoholism detection by data augmentation and convolutional neural network with stochastic pooling. J. Med. Syst. (2018) 18. Visin, F., Kastner, K., Cho, K., Matteucci, M., Courville, A., Bengio, Y.: ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks 19. Charoenvilairisi, S., Seepiban, C., Phironrit, N., Phuangrat, B., Yoohat, K., Deeto, R., Chatchawankanphanich, O., Gajanandana, O.: Occurrence and distribution of begomoviruses infecting tomatoes, peppers and cucurbits in Thailand. Crop Prot. 127 (2020) 20. Emebo, O., Fori, B., Victor, G., Zannu, T.: Development of tomato septoria leaf spot and tomato mosaic diseases detection device using raspberry Pi and deep convolutional neural networks. In: Journal of Physics, 3rd International Conference on Science and Sustainable Development (ICSSD 2019) 21. Udawant, P., Srinath, P.: Diseased portion classification & recognition of cotton plants using convolution neural networks. Int. J. Eng. Adv. Technol (IJEAT) 8(6), ISSN: 2249-8958 (2019) 22. Rothe, P.R., Kshirsagar, R.V.: Cotton leaf disease identification using pattern recognition techniques. In: International Conference on Pervasive Computing: Advance Communication Technology and Application for Society, ICPC (2015) 23. Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text classification. In: AAAI Publications, Twenty-Ninth AAAI Conference on Artificial Intelligence (2015) 24. Yamamoto, K., Togami, T., Yamaguchi, N.: Super-resolution of plant disease images for the acceleration of image-based phenotyping and vigor diagnosis in agriculture. Sensors 17, 2557 (2017) 25. Liang, S., Zhang, W.: Accurate image recognition of plant diseases based on multiple classifiers integration. In: Jia, Y., Du, J., Zhang, W. (eds.) Proceedings of 2019 Chinese Intelligent Systems Conference. CISC 2019. Lecture Notes in Electrical Engineering, vol. 594. Springer, Singapore (2020) 26. ak Entuni, C.J., Afendi Zulcaffle, T.M.: Simple screening method of maize disease using machine learning. Int. J. Innov. Technol. Exploring Eng (IJITEE) 9(1), ISSN: 2278-3075 (2019) 27. Hu, J., Li, D., Chen, G., Duan, Q., Han, Y.: Image segmentation method for crop nutrient deficiency based on fuzzy C-means clustering algorithm. Intell. Autom. Soft Comput. (2012) 28. Deva Hema, D., Dey, S., Krishabh, Saha, Anubhav Saha.: Mulberry leaf disease detection using deep learning” Int. J. Eng. Adv. Technol (IJEAT) 9(1), ISSN: 2249-8958 (2019) 29. Abdalla, A., Cen, H., Wan, L., Rashid, R., Weng, H., Zhou, W., He, Y.: Fine tuning convolutional neural network with transfer learning for semantic segmentation of ground-level oilseed rape images in a field with high weed pressure. Comput. Electron. Agric. (2019) 30. Li, D., Wang, R., Xie, C., Liu, L., Zhang, J., Li, R., Wang, F., Zhou, M., Liu, W.: A recognition method for rice plant diseases and pests video detection based on deep convolutional neural network. Sensors (2020) 31. Darwish, A., Ezzat, D., Hassanien, A.E.: An optimized model based on convolutional neural networks and orthogonal learning particle swarm optimization algorithm for plant disease diagnosis. Swarm Evol. Comput. (2020)
Behavior Analysis of a Deep Feedforward Neural Network by Varying the Weight Initialization Methods Geetika Srivastava, Shubham Vashisth, Ishika Dhall, and Shipra Saraswat
Abstract An artificial neural network is one of the principal machine learning models and can successfully model complex data; by appending a greater number of layers to the network, the model can minimize the cost over the training epochs. Weight initialization techniques play a vital role in the network's performance. The way the weight matrix is initialized affects how the network learns from the input data, whether the model achieves an optimal set of weights, and how fast it converges, which in turn governs the network's performance on the defined problem. This paper explores the behavior of a deep neural network by varying the weight initialization method on the MNIST dataset for the classification problem. Our experimentation shows that He initialization can outclass other methods like Xavier initialization as well as more conventional methods like random initialization when the ReLU nonlinear activation is employed. Keywords Weight initialization techniques · Machine learning · Deep neural network
G. Srivastava (B) Department of Physics and Electronics, Dr. Rammanohar Lohia Avadh University Ayodhya, Ayodhya, Uttar Pradesh, India e-mail: [email protected] S. Vashisth · I. Dhall · S. Saraswat Department of Computer Science, Amity University, Sector-125, Noida, Uttar Pradesh, India e-mail: [email protected] I. Dhall e-mail: [email protected] S. Saraswat e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_15
1 Introduction A tuned neural network with the finest configuration of parameters is essential to let the loss converge to a minimum. The intention behind carefully initializing the weights is to avoid vanishing or exploding outputs during the forward pass of a deep neural network model, as explained in [1]. This prevents the gradient of the loss from attaining a considerably large or small magnitude as it flows backward, which further permits the network model to converge [2] at a much faster rate. Matrix multiplication is the fundamental operation in an artificial neural network. In the case of a deep neural network with a large number of hidden layers, a matrix multiplication is carried out at every consecutive layer, and together these operations define a single forward pass. Each multiplication product serves as the input of the subsequent layer of the network. There are various weight initialization techniques [3] that can be applied to a deep neural network, and they affect the convergence and testing scores of the network model on supervised machine learning problems. The key contribution of this paper is the analysis of the learning behavior of an artificial neural network under various weight initialization techniques. The constructed neural network models with the different weight initialization strategies are used to solve the classification problem on the MNIST dataset. The methods included in this experimentation are zero initialization, random initialization, Xavier/Glorot initialization and the He initialization technique. This paper also contributes by performing hyperparameter tuning of the constructed network models, followed by a comparative analysis of their testing accuracy. The rest of the paper is structured as follows: Sect. 2 discusses the literature review; Sect. 3 presents the methodology; Sect. 4 discusses the results. Finally, the conclusion is presented in Sect. 5.
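To make the forward-pass computation described above concrete, the following minimal Python/NumPy sketch shows the repeated matrix multiplication that defines a single forward pass; the layer sizes, ReLU nonlinearity and random weights are illustrative assumptions, not the authors' exact network.

import numpy as np

def forward_pass(x, weights, biases):
    # One matrix multiplication per layer; each product is the input
    # of the subsequent layer, as described in the Introduction.
    activation = x
    for W, b in zip(weights, biases):
        activation = np.maximum(0.0, W @ activation + b)  # ReLU
    return activation

# Illustrative 784 -> 128 -> 10 network with small random weights.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((128, 784)) * 0.01,
           rng.standard_normal((10, 128)) * 0.01]
biases = [np.zeros(128), np.zeros(10)]
output = forward_pass(rng.standard_normal(784), weights, biases)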
2 Literature Review To enable a deep neural network to learn, i.e., to minimize the loss and converge to a minimum, many optimization algorithms are used, such as RMSprop, AdaGrad, Adam, etc. [4] presents an experimental study of weight initialization methods using gradient descent, one of the fundamental optimization algorithms used to train an artificial neural network model. The analysis focused on the weight initialization methods presented by Nguyen and Widrow [5] and by Chen and Nutter [6]. After defining six function approximation problems, the results of that paper suggest that the weight initialization strategy proposed by Nguyen and Widrow [5] is relatively better fitted to the gradient descent optimizer. Weight initialization methods directly impact the model's convergence rate. Studies in [7] propose an approach for defining weights for an artificial neural network
that operates with nonlinear rather than linear activation functions. The approach starts by defining a general weight initialization technique that works with any standard artificial neural network model using activation functions that are differentiable at zero. Further, the paper illustrates that for the specific nonlinear activation function rectified linear unit (ReLU), Xavier initialization [8] fails to provide convergence of the neural network model. In [9], a weight normalization technique is depicted wherein the length of the parameter vector is dissociated from its direction. The weight reparameterization in that work was introduced to speed up the convergence of gradient descent. The motivation was to mimic the working of batch normalization in a form that can also be applied to recurrent models, for instance LSTMs. The adopted method yields a much lower computational overhead. The technique is meant to be applied to generative machine learning models, image detection, and reinforcement learning models. Learning problems faced by deep neural networks, such as exploding and vanishing gradients, have been widely discussed and can be troublesome. According to Koturwar and Merchant [10], the best remedy is to use an initialization technique that has shown good outcomes on standard datasets. Since a high-dimensional loss function is to be optimized, a vigilant way of computing the initial weights has to be adopted. They perform a data-dependent initialization and then analyze the performance of the proposed technique against standard techniques (Xavier and He-normal). The proposed algorithm was implemented on a dataset and the results showed excellent classification precision. The local minima and maxima of a machine learning model are influenced by various parameters like weights and biases. The weights are initialized and the algorithm is trained over epochs; this impacts the convergence speed and convergence probability. As per [11], a successful convergence is much more probable with appropriate weight variations and initialization methods. That paper presents a pragmatic assessment of different weight initialization methods based on three measurements and their aspects. A method to determine the optimum initial weights for a feedforward neural network, which further assists in improving the convergence rate and removing redundant steps in the process, is shown in [12]. After a rigorous series of experiments based on linear algebra techniques and the Cauchy inequality, it was found that the proposed methodology required a number of iterations that was only 3.03% of the baseline. It aims to ensure that the output neurons are in an active region, which increases the probability of the model converging faster. The literature review highlights the vital role that weight initialization strategies play in the performance of a deep neural network. The selection of the weight matrix for a neural network model directly impacts the convergence rate; hence, it serves as one of the hyperparameters that require tuning depending upon the nature of the problem.
3 Methodology In order to analyze the performance of the deep neural network model under varying weight initialization methods, four different neural network models were constructed. Keeping the number of nodes, layers, optimizer, and learning rate constant, attention was concentrated on the initialization of the weight matrix. The four deep neural network models were evaluated following the concepts of supervised machine learning. The construction of the four distinctive neural network models with their distinguished weight initialization methods, as described in Fig. 1, includes the following phases:
1. Data preparation and processing
2. Model selection and training
3. Hyperparameter tuning and testing.
3.1 Data Preparation and Processing The initial phase of building the deep network models consisted of data collection and preprocessing procedures. This research uses the MNIST dataset, which consists of 60,000 samples of 28 × 28-pixel gray-scale images of handwritten digits from zero to nine, so no major operations had to be applied directly to the dataset. However, this phase included operations like reshaping and resizing of the acquired gray-scale images in order to make the data compatible with the subsequent phases.
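As a hedged illustration of this phase, the sketch below loads and reshapes MNIST with Keras' built-in loader; the authors do not state exactly how the data was acquired and preprocessed, so the loader, flattening and rescaling shown here are assumptions.

from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Flatten each 28 x 28 gray-scale image into a 784-dimensional vector
# and rescale pixel intensities to [0, 1].
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
# One-hot encode the ten digit classes.
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)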
3.2 Model Selection and Training This phase can be considered the most crucial phase in any machine learning problem. The machine learning model used to perform the analysis and comparison is a deep neural network constructed using open-source machine learning and deep learning libraries compatible with the Python programming language, such as Keras and TensorFlow. In this step, the dataset is split into testing and training sets in a 2:8 ratio: the neural network model is trained using 80% of the data, and the validation score is estimated on the held-out 20% with respect to the trained model. Keeping the other hyperparameters constant, the weight initialization method (zero initialization, random initialization, Xavier/Glorot initialization or He initialization) was altered to build four unique deep models. These models were trained using batch (offline) machine learning for a certain number of iterations.
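A minimal sketch of how the four models could be built in Keras follows; the architecture matches Table 1 (three hidden layers of 128 ReLU units, stochastic gradient descent with learning rate 0.001), while the initializer names ("zeros", "random_normal", "glorot_normal", "he_normal") are Keras' built-in counterparts of the four compared methods and are our assumption about the exact variants used.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

def build_model(init):
    # Same architecture for all four models; only the weight
    # initialization of the Dense layers changes.
    model = Sequential([
        Dense(128, activation="relu", kernel_initializer=init,
              input_shape=(784,)),
        Dense(128, activation="relu", kernel_initializer=init),
        Dense(128, activation="relu", kernel_initializer=init),
        Dense(10, activation="softmax", kernel_initializer=init),
    ])
    model.compile(optimizer=SGD(learning_rate=0.001),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

models = {name: build_model(init) for name, init in [
    ("zero", "zeros"), ("random", "random_normal"),
    ("xavier", "glorot_normal"), ("he", "he_normal")]}
# Training on 80% of the data with a held-out 20% validation split:
# models["he"].fit(x_train, y_train, epochs=10, validation_split=0.2)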
Fig. 1 Construction of deep neural network models: zero, random, Xavier/Glorot, and He weight initialization, each followed by hyperparameter tuning and testing
3.3 Hyperparameter Tuning and Testing Depending upon the nature of the problem the trained model is trying to solve, which in our case is the classification of handwritten digits, there are various hyperparameters that need to be tuned in order to maximize the validation accuracy of the model. These hyperparameters strongly influence the behavior of the deep neural network model. They include the optimizing or learning algorithm, learning rate, activation functions, number of layers, nodes, epochs, etc.
Testing allows one to compare and analyze the performance of the finalized models. This phase serves as the final report on the defined problem. The four constructed network models attain specific testing accuracy scores and losses which help in determining the best suited model with the optimal parameters.
4 Result By application of the following weight initializing methods:
(i) Zero initialization
(ii) Random initialization
(iii) Xavier/Glorot initialization
(iv) He initialization.
Four distinctive models were constructed, keeping the other hyperparameters constant after tuning for the considered dataset. These other hyperparameters, as described in Table 1, were determined by tuning the network and assessing its performance on the classification problem (Table 2).
In practice, if all weights are initialized to zero, the derivative with respect to the loss is the same for every weight; therefore, as illustrated in Fig. 2a, the weights do not change in subsequent epochs, which prevents any further learning by the model. Random initialization, Fig. 2b, is a relatively better option than zero initialization; however, if the initialized weights are assigned much larger or smaller values,

Table 1 Hyperparameters for the four neural network models
Hyperparameter                       | Configuration
Number of hidden layers              | 3
Number of nodes per layer            | {128, 128, 128}
Activation function on hidden layers | Rectified linear unit (ReLU)
Learning rate                        | 0.001
Optimizer                            | Stochastic gradient descent

Table 2 Accuracy comparison of the models
Classifying model         | The accuracy achieved (in %)
K-nearest neighbors       | 92.73542600896862
Random forest             | 96.95067264573991
Decision tree             | 97.75784753363229
Logistic regression       | 97.9372197309417
Support vector machine    | 98.7443946188341
Naïve Bayes               | 98.83408071748879
Artificial neural network | 98.9237668161435
Fig. 2 Training and testing accuracy using a zero initialization, b random initialization, c Xavier initialization, d He initialization

the network model may suffer from the problem of vanishing or exploding gradients. Xavier initialization [8] is an advancement over the previous random initialization method: the randomly drawn weights are multiplied by the constant √(1/size^[l−1]), i.e.,

W[l] = np.random.randn(size_l, size_(l−1)) * np.sqrt(1/size_(l−1))

There is a considerable improvement in the learning of the network model, as shown in Fig. 2c. The results for Xavier initialization are more prominent when using the tanh() activation function rather than the ReLU activation function [7]. In the case of He initialization, Fig. 2d illustrates an even smoother learning curve. It follows the same approach as Xavier initialization except that the constant 1 is changed to 2 in the formula, i.e., the scaling constant becomes √(2/size^[l−1]):

W[l] = np.random.randn(size_l, size_(l−1)) * np.sqrt(2/size_(l−1))
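The two scaling rules translate directly into NumPy. The sketch below (layer sizes illustrative) draws standard-normal weights and rescales them by √(1/size^[l−1]) for Xavier or √(2/size^[l−1]) for He initialization.

import numpy as np

def init_weights(size_l, size_prev, method="he"):
    # Draw from a standard normal and rescale the variance:
    # Xavier uses sqrt(1/size^[l-1]); He uses sqrt(2/size^[l-1]).
    factor = 1.0 if method == "xavier" else 2.0
    return np.random.randn(size_l, size_prev) * np.sqrt(factor / size_prev)

W1 = init_weights(128, 784, method="xavier")
W2 = init_weights(128, 128, method="he")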
Fig. 3 Loss comparison for constructed models
Figure 3 shows the loss comparison for the constructed models along with the number of epochs involved in the process. As depicted in Fig. 3, the zero initialization method yields the maximum validation loss compared to the remaining initialization techniques. The loss for random initialization tends to decrease with the number of epochs and becomes nearly constant after epoch 2.0. The Xavier and He initialization models show almost similar results, but when observed thoroughly, He initialization gives a noticeably better testing score. The report of the obtained losses and testing accuracies is tabulated in Table 3. As observed in the experiments, the testing accuracy of the model using He initialization is the highest among the initialization techniques, including Xavier initialization, when the ReLU nonlinear activation function is applied.

Table 3 Loss and testing accuracy report
Weight initialization method | Validation loss     | Testing accuracy score
Zero initialization          | 2.301039514541626   | 0.1135
Random initialization        | 0.31143197042942045 | 0.9081
Xavier/Glorot initialization | 0.20179649783670903 | 0.9419
He initialization            | 0.17004146452993155 | 0.9538
5 Conclusion In order to attain faster convergence for a given machine learning problem, a model needs to be trained with appropriate parameters. The experiment performed in this work shows the vital role that the weight initialization technique plays in a network's performance. The behavior of the model under varying weight initialization techniques on the MNIST image classification problem has been successfully explored in this work. By observing the various weight initialization techniques under the rectified linear unit nonlinear activation, we can conclude that He weight initialization prominently outclasses other techniques like Xavier/Glorot initialization and random initialization. The proposed hypothesis has successfully been confirmed. This paper can be extended by including more initialization techniques and by varying the activation function of the model. This can be a foot in the door for more work analyzing how the behavior of a machine learning model changes when changing its relevant parameters.
References 1. Pascanu, R., Mikolov, T., Bengio, Y.: Understanding the exploding gradient problem. CoRR abs/1211.5063 (2012). http://arxiv.org/abs/1211.5063 2. Wu, W., Wang, J., Cheng, M., Li, Z.: Convergence analysis of online gradient method for BP neural networks. Neural Netw. 24(1), 91–98 (2011) 3. De Sousa, C.A.: An overview on weight initialization methods for feedforward neural networks. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 52–59. IEEE (2016) 4. Masood, S., Doja, M.N., Chandra, P.: Analysis of weight initialization techniques for gradient descent algorithm. In: 2015 Annual IEEE India Conference (INDICON). IEEE (2015) 5. Nguyen, D., Widrow, B.: Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. In: 1990 IJCNN International Joint Conference on Neural Networks, pp. 21–26. IEEE (1990) 6. Chen, C.L., Nutter, R.S.: Improving the training speed of three-layer feedforward neural nets by optimal estimation of the initial weights. In: Proceedings 1991 IEEE International Joint Conference on Neural Networks, pp. 2063–2068. IEEE (1991) 7. Kumar, S.K.: On weight initialization in deep neural networks. Preprint at arXiv:1704.08863 (2017) 8. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010) 9. Salimans, T., Kingma, D.P.: Weight normalization: a simple reparameterization to accelerate training of deep neural networks. In: Advances in Neural Information Processing Systems (2016) 10. Koturwar, S., Merchant, S.: Weight initialization of deep neural networks (DNNs) using data statistics. Preprint at arXiv:1710.10570 (2017) 11. Fernández-Redondo, M., Hernández-Espinosa, C.: Weight initialization methods for multilayer feedforward. In: ESANN (2001) 12. Yam, J.Y.F., Chow, T.W.S.: A weight initialization method for improving training speed in feedforward neural network. Neurocomputing 30(1–4), 219–232 (2000)
A Novel Approach for 4-Bit Squaring Circuit Using Vedic Mathematics Shatrughna Ojha, Vandana Shukla, O. P. Singh, G. R. Mishra, and R. K. Tiwari
Abstract In ancient mathematics, one complete section is based on a set of sixteen sutras, called Vedic mathematics. This approach provides faster and more efficient methods to perform mathematical calculations. In this paper, we propose a novel approach for a high-speed 4-bit squaring circuit based on the Vedic approach. The proposed architecture utilizes a combination of a 2-bit Vedic multiplier, Vedic squaring circuits, binary adder and half-adder circuits. Further, this work has been compared with and analyzed against existing approaches to a 4-bit binary squaring circuit using the Vedic mathematical approach. It is concluded that the proposed approach is more optimized in terms of the number of design units as compared to the existing ones. Keywords Vedic mathematics · Urdhva Triyakbhyam · Antyayor Dasakepi · Squaring circuit
1 Introduction Vedic mathematics, a set of 16 sutras (formulae), is one of the most interesting developments in the field of speed calculation in the twentieth century. It was discovered by Swami Bharti Krishna Tirtha, supposedly from one of the four Vedas, the Atharva Veda [1]. Here, a sutra is a short word-formula that is lucid to understand and memorize. These sutras are significantly useful for fast calculations. Vedic mathematical sutras are very versatile and can be utilized in an effective manner to improve the power efficiency of digital logic circuits [2, 3]. Squaring circuits for binary numbers find application in signal processing, cryptography, high-speed hardware design, etc. Moreover, a squaring circuit may also be utilized to improve the overall efficiency and performance of VLSI designs. In this paper, a 4-bit squaring circuit is proposed with a reduced number of digital logic gates as compared to existing designs. We have utilized a combination of two significant sutras of Vedic mathematics named Urdhva Triyakbhyam and Antyayor Dasakepi. This paper is organized in a total of five sections. Section 1 provides a brief introduction to the work presented in this paper. After that, Sect. 2 elaborates some fundamental concepts of Vedic mathematics utilized in the proposed work. Our proposed design approach for the 4-bit binary squaring circuit is given in Sect. 3 in brief. The result and analysis of our proposed design as compared to the existing approach are presented in Sect. 4. The paper is concluded in Sect. 5.

S. Ojha · V. Shukla (B) · O. P. Singh · G. R. Mishra Amity School of Engineering and Technology, Amity University, Lucknow, Uttar Pradesh, India e-mail: [email protected] S. Ojha e-mail: [email protected] O. P. Singh e-mail: [email protected] G. R. Mishra e-mail: [email protected] R. K. Tiwari Department of Physics and Electronics, Dr. Rammanohar Lohia Avadh University, Faizabad, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_16
2 Some Fundamental Concepts of Vedic Mathematics Utilized in the Proposed Work In the proposed work, we have utilized the following significant concepts and approaches of Vedic mathematics [4, 5]: (i) Urdhva Triyakbhyam This sutra is one of the most significant among all sixteen sutras of Vedic mathematics. It provides faster and more efficient multiplication of two numbers. Here, multiplication of two binary numbers is based on cross-multiplication of bits and an addition process. This approach is somewhat similar to the Japanese parallel-lines multiplication technique. Figure 1 depicts the multiplication of two 2-bit binary numbers using the Urdhva Triyakbhyam sutra. Here, the multiplicand and multiplier bits are named AB and CD, respectively, and the final product generated after multiplying AB and CD is named MNO. Fig. 1 Example of 2-bit multiplication using the Urdhva Triyakbhyam sutra
Following points elaborate the steps involved in the Urdhva Triyakbhyam approach as shown in Fig. 1:
i. Multiply the unit place bits of the multiplicand (AB) and multiplier (CD); the result is stored in bit O. In equation form: O = B × D
ii. Cross-multiply the unit place bit of the multiplicand with the tens place bit of the multiplier and vice versa. The sum bit of the addition of these two products is stored in bit N, whereas the carry is forwarded to the next step. In equation form: N = [(B × C) + (A × D)]
iii. In the last step of the 2-bit binary multiplication, the tens place bits of multiplicand and multiplier are multiplied. The product bit is the sum of this multiplication and the carry forwarded from the previous step. In equation form: M = (A × C) + carry of N
This method is applicable to numbers of any bit length. Figure 2 shows the application of the Urdhva Triyakbhyam method to the multiplication of two 3-bit binary numbers.
(ii) Antyayor Dasakepi The Antyayor Dasakepi sutra is based on a special combination of the unit place digits of the multiplicand and multiplier. It is applicable when the unit place digits of the two numbers sum to a value ending in 0 (10 in decimal) and the remaining digits are identical. In the binary number system, the condition for applying this method is that the unit place bits be equal, i.e., either both 0 or both 1, since 1 + 1 = 10 and 0 + 0 = 0, while the tens and higher bit positions have equal values. Fig. 2 Multiplication of two 3-bit binary numbers using Urdhva Triyakbhyam sutra
Fig. 3 Multiplication of decimal numbers using Antyayor Dasakepi
Let the two numbers to be multiplied be XY (multiplicand) and XZ (multiplier), where the unit digits satisfy Y + Z = 10. Assume the product of XY and XZ is named LM. As shown in Fig. 3, the following steps elaborate the Antyayor Dasakepi method for these numbers:
Step 1: Multiply Y and Z, and store the result in M.
Step 2: Multiply X and (X + 1). The result is stored in L.
Step 3: The output is in the form LM.
For the example given in Fig. 3, X = 45, Y = 2 and Z = 8. Therefore, L = 45 × 46 = 2070 and M = 2 × 8 = 16, so the product is 207016. Figure 4 describes the multiplication of two 3-bit binary numbers using the Antyayor Dasakepi method. Moreover, this method is applicable in building a binary squaring circuit. Figure 5 shows the squaring of a 2-bit binary number. Here, multiplicand and multiplier are equal, i.e., AB, and the final result of squaring this number is stored in PQRS. (iii) 2 × 2 Multiplier Circuit For any 2-bit multiplication, a product of at most 4-bit size is produced. As shown in Fig. 6, two 2-bit binary numbers AB and CD are multiplied to generate a 4-bit product named X3X2X1X0. Here, six AND gates are utilized in combination with two XOR gates. Fig. 4 Example of binary multiplication using Antyayor Dasakepi method
Fig. 5 Two bit binary squaring circuit using Antyayor Dasakepi method
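The binary form of the Antyayor Dasakepi squaring identity used here can be verified with a few lines of Python: for an odd binary number n whose higher bits form X (so n = 2X + 1), n² equals X(X + 1) followed by the two bits 01, since (2X + 1)² = 4X(X + 1) + 1. This check is ours, added for illustration; it is not the authors' circuit.

def antyayor_square_odd(x):
    # x holds the higher bits; the full number is n = 2*x + 1.
    # Square = x*(x + 1) shifted left by two bit positions, plus 01.
    return (x * (x + 1) << 2) | 0b01

for x in range(8):
    n = 2 * x + 1
    assert antyayor_square_odd(x) == n * n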
Fig. 6 Circuit diagram of 2 × 2 binary multiplier
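Based on the Urdhva Triyakbhyam equations of Sect. 2 and the gate counts stated above (six AND gates and two XOR gates), a plausible Boolean sketch of the Fig. 6 multiplier is given below; the exact netlist of the figure is our reconstruction, not taken from the paper.

def vedic_mul_2x2(a, b, c, d):
    # Inputs are the single bits of AB and CD; outputs are X3..X0.
    x0 = b & d                # O = B x D
    p, q = b & c, a & d       # cross products
    x1 = p ^ q                # N = sum bit of (B x C) + (A x D)
    carry = p & q             # carry of N
    r = a & c                 # tens-place product A x C
    x2 = r ^ carry            # M = sum bit
    x3 = r & carry            # final carry
    return x3, x2, x1, x0     # six ANDs and two XORs in total

# Exhaustive check against ordinary multiplication:
for n in range(4):
    for m in range(4):
        x3, x2, x1, x0 = vedic_mul_2x2(n >> 1, n & 1, m >> 1, m & 1)
        assert (x3 << 3) | (x2 << 2) | (x1 << 1) | x0 == n * m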
3 Proposed Four-Bit Binary Squaring Circuit As shown in Fig. 7, the main idea behind the proposed four-bit binary squaring circuit is to minimize the micro-operations as much as possible. Here, a 4-bit binary number ABCD is the input to the squaring operation, and the result is stored in X7X6X5X4X3X2X1X0. The Vedic relation to the design of a two-bit squaring circuit is that in Antyayor Dasakepi, the summation of the unit digits of the two numbers to be multiplied should give 0 in its unit place while the other places hold similar digits [6–15]. This helps in reducing the time required to find the product of two numbers by having predetermined unit and tens places. In the binary number system, the summation of two identical bits always has a 0 in its unit place [16, 17]: the summation of the unit digits can only be 10 when both unit digits are 1, and 0 when both are 0, i.e., 1 + 1 = 10 and 0 + 0 = 0; since all the other places must also be similar, the condition holds exactly when the two numbers are the same. Therefore, it can be inferred that the steps involved in finding the square of a binary number can be reduced by having predetermined results for the unit and tens places. In our proposed circuit, we have utilized two 2-bit squaring circuits based on the Antyayor Dasakepi method, as shown in Fig. 5 in the earlier section. Moreover, one 2 × 2 Vedic multiplier, shown in Fig. 6, is also utilized here. Our proposed architecture also uses one four-bit binary adder circuit along with one full-adder and three half-adder circuits.
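A behavioral sketch of the decomposition that the proposed architecture appears to implement is shown below: splitting ABCD into a high pair H = AB and a low pair L = CD gives n = 4H + L, so n² = 16H² + 8(H·L) + L², which maps onto the two 2-bit squaring circuits, the 2 × 2 Vedic multiplier, and the adder stage. The exact wiring of Fig. 7 is the authors'; this sketch only checks the arithmetic.

def square_4bit(n):
    h, l = n >> 2, n & 0b11
    hh = h * h                         # 2-bit squaring circuit (Antyayor Dasakepi)
    ll = l * l                         # 2-bit squaring circuit
    hl = h * l                         # 2 x 2 Vedic multiplier (Urdhva Triyakbhyam)
    return (hh << 4) + (hl << 3) + ll  # adder stage: 16*H^2 + 8*H*L + L^2

for n in range(16):
    assert square_4bit(n) == n * n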
4 Result and Analysis Earlier, in the year 2012, Sethi et al. proposed an improved approach to design a squaring circuit for binary numbers [6]. In that work, they utilized one 2 × 2 Vedic multiplier and two 2-bit squaring circuits along with one carry save adder, one 4-bit binary adder and one 5-bit binary adder. In contrast, our proposed circuit utilizes only
Fig. 7 Proposed 4-bit binary squaring circuit
one 2 × 2 Vedic multiplier and two 2-bit squaring circuits along with one 4-bit binary adder and a set of one full-adder and three half-adders. Figure 8 shows the comparison of the existing approach and the proposed approach for 4-bit binary squaring circuits in terms of digital logic gates. A 2 × 2 Vedic multiplier and a 2-bit squaring circuit require a total of 8 and 3 digital logic gates (AND, OR, XOR), respectively. Similarly, 40 and 20 logic gates (AND, OR, XOR) are required for a 4-bit carry save adder and a 4-bit binary adder, respectively. Moreover, a 4-bit ripple carry adder and a 5-bit binary adder require 20 and 25 logic gates (AND, OR, XOR), respectively. By calculating the total number of digital logic gates (AND, OR, XOR) utilized in the proposed and existing circuits, we may conclude that a total of 99 logic
Fig. 8 Comparison chart of the total number of digital logic gates: existing design (Sethi et al., 2012) versus proposed design
gates (AND, OR, XOR) are required for the existing approach proposed by Sethi et al. in the year 2012, whereas only 45 digital logic gates (AND, OR, XOR) are required for our proposed 4-bit binary squaring circuit. These totals are consistent with counting a full-adder as 5 and a half-adder as 2 such gates: existing, 8 + 2 × 3 + 40 + 20 + 25 = 99; proposed, 8 + 2 × 3 + 20 + 5 + 3 × 2 = 45. The comparison chart shown in Fig. 8 clearly indicates that our proposed design may be considered the more optimized approach, as the total number of digital logic gates (AND, OR, XOR) utilized is reduced by more than half.
5 Conclusion Mathematical calculations affect almost every field of today's life, and fast, frequent calculations are a demand of this era. For fast calculations, ancient mathematics and the Vedic sutras may provide significant solutions. In this paper, we have presented an improved circuit with fewer computations to perform the squaring operation on a 4-bit number. This approach is mainly based upon two major sutras of Vedic mathematics. Our proposed design is also compared with the existing design. The comparison shows that the number of digital logic gates used in the proposed architecture of a 4-bit binary squaring circuit is reduced to a good extent as compared to the earlier approach. This reduction in design entities will lead to an overall performance and efficiency improvement of digital computational circuits. Our proposed approach may be further utilized to optimize and design other high-performance circuits.
References 1. Swami Bharati Krisna Tirthaji Maharaja: Vedic Mathematics: Sixteen Simple Mathematical Formulae from the Veda. Delhi (1965) 2. Katgeri, A.V.: Effectiveness of Vedic mathematics in the classrooms. SJIF 4/36 (2017) 3. Krishna Prasad, K.: An empirical study on role of Vedic mathematics in improving the speed of basic mathematical operations. Int. J. Manage. IT Eng. (IJMIE) 6(1), 161–171 (2016) 4. Williams, K.: Discover Vedic Mathematics. Inspiration Books, Skelmersdale (1984) 5. Dhaval, B.: Vedic Mathematics Made Easy. Jaico Publishing House, New Delhi (2015) 6. Sethi, K., Panda, R.: An improved squaring circuit for binary numbers. Int. J. Adv. Comput. Sci. Appl. 3(2) (2012) 7. Deepa, A., Marimuthu, C.N.: Squaring using Vedic mathematics and its architectures: a survey. Int. J. Intell. Adv. Res. Eng. Comput. 6(1) 8. Mano, M.M.: Computer System Architecture 9. Hwang, K.: Computer Arithmetic: Principles, Architecture and Design. Wiley, New York (1979) 10. Akhter, S.: VHDL implementation of fast N × N multiplier based on Vedic mathematics. In: Proceedings of the IEEE Conference, pp. 472–475 (2007) 11. Dhillon, H.S., Mitra, A.: A reduced-bit multiplication algorithm for digital arithmetic. Int. J. Comput. Math. Sci., pp. 64–69 (2008) 12. Chidgupkar, P.D., Karad, M.T.: The implementation of Vedic algorithms in digital signal processing. Glob. J. Eng. Educ. 8, 153–158 (2004) 13. Mehta, P., Gawali, D.: Conventional versus Vedic mathematical method for hardware implementation of a multiplier. In: Proceedings of the International Conference on Advances in Computing, Control, and Telecommunication Technologies, pp. 640–642. Trivandrum, Kerala, India (2009) 14. Chidgupkar, P.D., Karad, M.T.: The implementation of Vedic algorithms in digital signal processing. Glob. J. Eng. Educ. 8(2) (2004) 15. Thapliyal, H., Srinivas, M.B.: High speed efficient N × N bit parallel hierarchical overlay multiplier architecture based on ancient Indian Vedic mathematics. Enformatika Trans. 2, 225–228 (2004) 16. VedicMaths.Org (2004). http://www.vedicmaths.org 17. Dani, S.G.: Myth and reality: on "Vedic Mathematics". Frontline 10(21), 90–92 (22 Oct 1993) and 10(22), 91–93 (5 Nov 1993)
Smart Educational Innovation Leads to University Competitiveness Zornitsa Yordanova
and Borislava Stoimenova
Abstract The paper aims at identifying the boundaries and fundamentals of university competitiveness built on the development and application of smart educational innovations. The study proceeds through a literature analysis for the purpose of establishing the state of the art of the main objective—the linkage between educational innovation and university competitiveness. Educational innovation is analyzed from the prism of its application in higher education in the main fields of its manifestation. Competitiveness is analyzed from the point of view of its importance for universities and encompasses the most popular global rankings. As a result, the paper shortlists the main educational innovation types and the existing metrics measuring university competitiveness, and critically analyzes the higher education rankings worldwide. The study brings valuable insights for audiences engaged in education, higher education and university management, as well as for potential stakeholders in all aspects of educational innovation application. The results may be of use in implementing regulations for stimulating, and in further developing theoretical and applied models for measuring, higher education competitiveness and the application of educational innovation in higher education. Keywords Educational innovation · Innovation management · Competitiveness · Higher education · University rankings
1 Introduction In recent years, higher education has been placed among the core values of each country. A significant moment in this direction for the European Union, for instance, was the "Bologna Declaration," which aimed at establishing quality standards in higher education and at supporting the spread of the European higher education system to the rest of the world [1, 2]. Innovation has been one of the most discussed and hottest topics of the last 20 years, and despite its widespread popularity, integration within all industries, recognized significance and the research done, applications of innovations still do not bring the desired results and indicative values set by the EU for the member countries. Innovation has been interpreted as a major driving wheel of national economic growth and a determinant of productivity and competitiveness by many scholars [3–5]. Other researchers argue that innovation is the engine of growth for today's economy and guarantees growth regardless of the economic environment [6]. Educational innovations are the main focus of this research, and the reason why they are so important is the competitiveness of tertiary education [7]. The internationalization of university education, the growing number and diversity of educational service providers, and the increasing demand for transparency and information on academic quality have increased the competitive pressure on universities worldwide to improve their performance. The only available sources that illustrate complex university performance in a tangible format are the university ranking systems or league tables, also referred to as "rankings." University rankings may influence the perceptions and priorities of different user groups [8]. Governments, potential students, granting organizations and agencies as well as media use university rankings as a basis and a tool for assessing the competitiveness of universities. Governments also use universities' ranks to set their national higher education agendas and as a signal of the competitiveness of their economies [9]. Students and parents use them as a selection tool when deciding where to continue education. Employees consult rankings when looking at career opportunities [10]. Researchers may rely on rankings to find new collaborative partners [11]. Institutions from the private sector use rankings to identify mutually beneficial partnerships with universities. Rankings are used by universities to measure themselves in terms of global competitiveness and to address and redesign the institutions' structure for improvement [12]. In such cases, rankings serve as marketing and management tools to define universities' strategic development toward the ideal "world-class university" and to attract better students, employees and partners. In the contemporary world, strengthening the connection between science, education and business is vital for building world-class universities and competitive economies.

Z. Yordanova (B) · B. Stoimenova University of National and World Economy, Sofia, Bulgaria e-mail: [email protected] B. Stoimenova e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_17
2 Theoretical Background The theoretical background in this research is focused on the two main areas of the study: educational innovations and competitiveness of higher educational institutions.
2.1 Educational Innovations Innovations are critically important when it comes to education, as education is a recognized growth engine for humanity. According to Taylor et al. [13], educational innovations are any modern and still effective teaching technique, strategically new attempt, application tooling, or learning source which may bring benefits to students' learning and better engagement. Fullan [14] claimed that educational innovation uses new or revised materials, uses new teaching approaches, and leads to an alteration of beliefs. Educational innovation has long been a hot topic, but in many studies the approaches for researching it are still experimentation and case-study presentation. The managerial approach of problem-driven innovation [15] in this paper aims at eliciting commonly identified problems and challenges in education so that these can be addressed to secure the future of educational innovation. OECD stated that the major cornerstones of innovation in the education sector are those transforming it into a more productive and effective system [16]. According to Kozma [17], educational innovation means facilitating the transition from conventional and ordinary models to emerging concepts relying on information and communication technology solutions, such as promoting learning-oriented and constructivist processes. Innovations keep higher education a transforming component of civilization and are an essential ingredient of its adaptive nature [19]. Schröder and Krüger [19] view social innovation as a new pathway for educational innovations. Staley and Trinkle [20] summarized ten trends in the management of higher education and accordingly focused on innovation in education. Some of them are: differentiation leading to transformation; updating of the education curriculum; agility of faculty; boosting global academic and student mobility; the changing perception of the ordinary student; individual versus public good; lifelong collaboration between students and universities. Posselt et al. [21] addressed the university challenges by exploring them from the business model perspective. Wai [18], from his perspective, identified collaboration and globalization as major challenges for educational innovation, especially the potential of cross-disciplinary collaboration. Incorporating sustainability within education has also been identified as a pivotal factor for its development [22]. According to Budzinskaya [23], among the key criteria for assessing the competitiveness of universities are the commercialization of developments and the possibility to export educational services. Tsvetkova and Lomer [24] summarized that academic excellence is the competitiveness enhancement in higher education. The recently increased amount of research discussing higher education, rankings and educational innovation puts even more pressure on universities' management [25]. Smart educational innovations are seen from the perspective of emerging technologies. These include smart classrooms, smart learning environments, computer-assisted assessment, smart edutainment, etc., but are still not limited to information and communication technologies [26].
2.2 Competitiveness of Universities According to Tee [27], the concept of a "world-class" university refers to universities that are able to: (1) demonstrate good performance in the international university rankings; (2) produce excellent results measured by licenses, patents and publications in high-quality top-ranked journals; (3) produce skilled and professional graduates; (4) attract the best academics and students; (5) have abundant and varied sources of funding; (6) provide a rich learning and research environment; (7) ensure favorable and autonomous management; (8) promote strategic vision and innovation development; (9) respond effectively to the accelerating and changing global demand. As per Hazelkorn [28], the literature on university rankings can be broken down into two divisions: methodological concepts and theoretical practices. The ranking systems used to assess the quality of education and the competitiveness of universities are of two types: national and global. National ranking systems are typically resource distribution systems based on "input–output" performance indicators [29]. They include more comprehensive and a larger number of indicators because of their access to information and thorough knowledge of local institutions. National ranking indicators focus mainly on the parameters of education and institutions, while global ranking systems have fewer indicators and generally focus on research achievements [30]. Global ranking systems use information from different sources to develop league tables. The most widely used information sources are internationally accessible bibliometric databases (the Web of Science and Scopus databases); reputation surveys across main stakeholders; information from third parties, such as government and local databases; and institutional datasets. The lack of internationally relevant data remains a significant problem for any reliable comparison [31]. The International Association of Universities [32] monitors over 16,000 universities, and still global rankings focus only on the top 1000. This is the reason why this research was performed with a main focus on global indices and rankings.
3 Methodology For researching the link between educational innovation and university competitiveness and to establish the state of the art for further exploration, a literature review on the competitiveness of universities was performed [33]. We started from a compilation of one hundred sixty-two articles indexed in the Scopus database, selected using keywords such as university rankings, university competitiveness and university success metrics; we reviewed their summaries, outlined key areas for a new search, and selected 68 articles for further study of the theoretical and methodological grounds for measuring the competitiveness of universities. The selection criteria
included the relevance of the content, the citation impact of the article, and the author's recognition. The literature review had two goals:
• to identify the dimensions of university competitiveness based on the available global rankings
• to highlight whether educational innovation is included and measured in some way, given its mainstream recognition for university competitiveness.
4 Findings and Discussion The results of the systematic literature analysis demonstrated that each of the researched rankings has specific characteristics as well as methodological limitations arising from their diverse objectives and organizational uniqueness. The ranking systems differ in terms of the metrics they use to measure quality, as well as the methodology they apply. With the exception of the Academic Peer Review criterion used by QSWUR, the "Reputation Survey" used by THE-TR and some U-Multirank indicators, most of the indicators are easily compared and widely recognized (i.e., ISI Database, Scopus, WoS, Nobel Prize websites, etc.). Another common feature is their emphasis mainly on research findings. Dependence on data obtained from universities calls into question the validity of the tables of THE-TR, QS and U-Multirank. The literature review on league tables identifies ten major global rankings, recognized worldwide and described in Table 1. The results of the comparative analysis of the global university rankings revealed that, no matter the differentiation between indicators across rankings, indicators related to the scientific performance of academics play a leading role. Research is evaluated on data administered by Web of Science and Scopus. According to Wächter et al. [34], compared to research indicators, teaching-related ingredients vary much more, ranging from input indicators (such as student selectivity, entry scores, resources per student), across process-related indicators (like faculty/student ratio, teaching/learning environment, international student number/ratio, student satisfaction with teaching and assessment), to output indicators (i.e., graduation rate/time, graduate salary and job placement, alumni giving). Global rankings are more research driven, while national rankings are more teaching driven. Unfortunately, data sources for teaching assessment are considered subjective and unreliable. Although teaching quality is a main criterion determining university competitiveness, university policy makers rarely stimulate individual professors to lead and coach the overall teaching quality in the university. The European Commission has encouraged a shift in political attention from education to direct economic impact, popularly known as the "knowledge triangle," which encompasses education, research and innovation [35]. Universities are seen as a central player in the whole economic system because they are a major contributor to innovation, both in science and industry [35]. As a result of this research, we outlined the main global rankings with their main characteristics as a
Table 1 Global university rankings
Ranking | Year | Publisher | Country
Academic Ranking of World Universities | 2003 | Shanghai Jiao Tong University | China
Times Higher Education Quacquarelli Symonds World University Rankings | 2004–2009 | Britain's Times Higher Education Supplement | UK
Webometrics Ranking of World Universities | 2004 | Spanish National Research Council | Spain
Performance Ranking of Scientific Papers for World Universities | 2007 | Higher Education Evaluation and Accreditation Council of Taiwan | Taiwan
Leiden Ranking | 2008 | Center for Science and Technology Studies, University of Leiden | Netherlands
World's Best Colleges and Universities | 2008 | US News and World Report | USA
SCImago Institutions Rankings | 2009 | SCImago Lab | Spain
Quacquarelli Symonds World University Rankings | 2010 | Quacquarelli Symonds Limited | UK
starting point for further changes and extensions regarding educational innovations. The Academic Ranking of World Universities (ARWU), released by Shanghai Jiao Tong University (see Table 1), is a survey on university education. Liu [36] mentioned it as "the most widely used ranking of universities." ARWU is inclined to include in its scope universities with outstanding research and award-winning faculty, by using quantitative indicators such as highly cited researchers and numbers of Nobel Prize winners. The ranking compares more than 1300 universities and publishes the best 500 in the ARWU World Top 500 Universities. From 2017, universities ranked between 501 and 800 are known as ARWU World Top 500 Candidates. The Times Higher Education and Quacquarelli Symonds World University Rankings use four main criteria: research and science quality, teaching capacity and potential, employment of graduates and international performance. The Webometrics Ranking, or the Ranking Web, is performed by the Cybermetrics Lab. It measures universities' web presence and impact, taking into account universities' activities and outputs, their relevance and impact. The Performance Ranking of Scientific Papers for World Universities, also known as the National Taiwan University Ranking, was first published in 2007. This ranking system emphasizes research performance, employing bibliometric data to assess scientific output based on both Science Citation Index (SCI) and Social Science Citation Index (SSCI) journals. Similar to ARWU, it centers exclusively on the research outputs of universities on a comparative principle. The Leiden Ranking is based exclusively on bibliometric indicators about publication performance, citation impact and scientific
collaboration. This ranking builds on publication indicators, citation impact indicators, and scientific collaboration indicators, the latter measuring the extent to which a university participates in scientific collaboration with other organizations [37]. It uses the Science Citation Index Expanded, the Social Sciences Citation Index, and the Arts and Humanities Citation Index. Only article publications and double-blind peer-reviewed Web of Science papers are taken into account; the Leiden Ranking does not consider book publications, conference reports, or journals that are not indexed in the aforementioned citation indexes. The World's Best Colleges and Universities is the leading US-based ranking. It measures a wide range of university activities, and this ranking system emphasizes a set of intangible indicators that make up the college experience. In addition to universities, the SCImago Institutions Rankings also include other types of research institutions, and also arts and humanities publications, including articles, letters and reviews. The overall index combines three different sets of indicators: research productivity, innovation results and societal impact. The SIR uses both size-dependent and size-independent indicators, which enables comparisons between institutions of different sizes. Quacquarelli Symonds, in cooperation with Times Higher Education, published a world university ranking from 2004 to 2010. Currently, this ranking uses six metrics that capture university performance and draws data from four different sources: (1) self-reported data from the universities; (2) citations and papers for each university produced from Elsevier's Scopus database; (3) a global survey of academics; (4) a global survey of employers [25, 38]. The U-Multirank system was developed for the needs of the European Commission. U-Multirank's multidimensional format allows users to compare institutions on each dimension, thus allowing analysis at two levels: institutional and field-based. The ranking includes five areas of metrics: teaching and learning, research, knowledge exchange, regional engagement, and internationalization, and does not hold all institutions to research-dominated criteria [39]. Due to a number of limitations, rankings had better not be the only source of information about universities' quality and performance. Rankings measure a reduced part of universities' multiple functions and do not capture all tangible and intangible factors influencing the performance of a university. Potentially important factors might be location, campus lifestyle, range of courses, different activities and sports, cost of living, and educational funds and grants. Universities have different profiles, which a single parameter within a ranking cannot measure. Another major criticism of global rankings is their concentration on research, whereas the quality of learning and teaching, student experience, practical application of knowledge and technology transfer to regions, industries and citizens, which are major goals of governments and students themselves, are skipped [40]. The science-focused model reduces education diversity and undermines the possibility for the university as a concept to contribute to society in novel areas [41]. The influence of a research-intensive environment on teaching activities is complex and debatable.
Depending on the disciplines and types of institutions, some researchers demonstrate a positive impact of science-related activities on education, others conclude that there is no relationship between the two areas, while some authors even find them to be competing activities.
Global rankings are generally tailored to science-strong and English-speaking universities. They tend to neglect some science areas, as well as the performance of many universities from non-English-speaking, developing countries [40]. More external funding and more articles published by researchers in the medical and technical sciences, for instance, bring such institutions a significant advantage in the global rankings. Some researchers claim that university rankings are not so influential when it comes to students' choice of a university. For example, Broecke [42] finds that the influence of a university's rank on the number of accepted applicants is modest and suggests that there are other factors which are more significant for attracting students. Similarly, Elken et al. [43] show that rankings do not strongly influence the process of decision-making and strategic action at northern universities. Important methodological problems of global rankings are seen in their combinational approach of merging multiple metrics of university performance into a single aggregate index [44] and in the absence of consideration of the input–output structure [44, 45]. Thus, they do not favor the most productive universities, for which the share of output exceeds the share of input. Another criticism concerns the choice of dimensions and their weights, the quality, reliability and comparability of the data, and its capability to measure and compare complex and diverse higher education institution contexts.
5 Conclusion The research reviewed all university rankings and outlined the global ones. This finding may be of use for further exploration of how global university rankings measure different aspects of university competitiveness and how variables and a changing environment may affect the competitiveness of a university. From the current research on how educational innovation is included in the outlined global university rankings, we may conclude that, despite the proclaimed development and application of educational innovations, they do not participate in and are not measured by the metrics of university competitiveness. The only metric in the rankings' scope is knowledge transfer. In all of the researched university competitiveness indexes and rankings, there is a big gap between the core mission of education, preparing students for the future as a main basis for innovating in education, and the currently used metrics for university competitiveness. Future research of the authors will empirically examine how universities develop and apply educational innovations according to teachers and researchers as well as in the students' opinion, and whether strengthening this practice would improve the competitiveness of universities. The final goal of the authors is to create a new ranking for universities based on the application of educational innovation in different areas [46]: quality; information and communication technologies; educational process; acquisition of new skills; culture; collaboration; sustainability; efficiency; motivation and engagement; educational modeling; transformation; general university management; mobility; boosting creativity; e-learning; managing complexity; globalization; accreditation; leadership. As a module of the
ranking, smart education will also be under consideration, with categorization and factor analysis of its impact on overall education. Acknowledgements The paper and the study are supported by BG NSF Grant No DM 15/1-2017 (M 15/4); KP-06 OPR01/3-2018 and NID NI 14/2018
References 1. Wende, M.: The Bologna declaration: enhancing the transparency and competitiveness of European higher education. J. Stud. Int. Educ. 4(2), 3–10 (2000) 2. European Commission: Joint declaration of the European ministers of education. The Bologna Declaration of 19 June (1999) 3. Kathoeffer, D., Leker, J.: Knowledge transfer in academia: an exploratory study of the not-invented-here syndrome. J. Technol. Transfer 37(5), 658–675 (2012) 4. Arvanitis, S., Kubli, U., Woerter, M.: University-industry knowledge and technology transfer in Switzerland: what university scientists think about co-operation with private enterprises. Res. Policy 37, 1865–1883 (2008) 5. Bekkers, R., Freitas, I.M.: Analyzing knowledge transfer channels between universities and industry: to what degree do sectors also matter? Res. Policy 37, 1837–1853 (2008) 6. Tidd, J., Bessant, J.: Managing Innovation: Integrating Technological, Market and Organizational Change. Wiley, West Sussex (2009) 7. Istvan, L., Darabos, E., Orsolya, V.: Competitiveness—higher education. Studia Universitatis "Vasile Goldis" Arad, Econ. Ser. 26(1), 11–25 (2016) 8. Hazelkorn, E.: The impact of league tables and ranking systems on higher education decision making. J. High. Educ. Policy Manag. 19(2) (2007) 9. Antonio, M., Øerberg, J.: Active instruments: on the use of university rankings in developing national systems of higher education. Policy Rev. High. Educ. 1(1), 91–108 (2017) 10. Presbitero, A., Roxas, B., Chadee, D.: Looking beyond HRM practices in enhancing employee retention in BPOs: focus on employee–organisation value fit. Int. J. Hum. Resour. Manag. 27(6), 635–652 (2016) 11. Campbell, J.W.: Efficiency, incentives, and transformational leadership: understanding collaboration preferences in the public sector. Public Perform. Manag. Rev. 41(2), 277–299 (2018) 12. Ulucan, A., Atici, K.B., Ozkan, A.: Croatian operational research review. Zagreb 9(2), 301 (2018) 13. Taylor, C., et al.: Propagating the adoption of CS educational innovations. In: ITiCSE'18, Larnaca, Cyprus (2018) 14. Fullan, M.: The New Meaning of Educational Change, 5th edn. Teachers College Press, New York (2007) 15. Coccia, M.: Theorem of not independence of any technological innovation. J. Econ. Bibliogr. 5(1), 29–35 (2018) 16. OECD: Innovating Education and Educating for Innovation: The Power of Digital Technologies and Skills. OECD, Paris (2016) 17. Kozma, R.B.: Technology, innovation, and educational change: a global perspective. A Report of the Second Information Technology in Education Study, Module 2. ISTE (2003) 18. Wai, C.: Innovation and social impact in higher education: some lessons from Tohoku University and the Open University of Hong Kong. Open J. Soc. Sci. 5, 139–153 (2017) 19. Schröder, A., Krüger, D.: Social innovation as a driver for new educational practices: modernising, repairing and transforming the education system. Sustainability 11(4), 1070 (2019)
20. Staley, D.J., Trinkle, D.A.: The changing landscape of higher education. Educ. Review 46, 15–31 (2011)
21. Posselt, T., et al.: Opportunities and challenges of higher education institutions in Europe: an analysis from a business model perspective. Special issue: Policy turnaround: towards a new deal for research and higher education. Evaluation, rankings and governance for a new era of data science 73(1), 100–115 (2019)
22. Yordanova, Z.: Educational innovation: bringing back fads to fundamentals. In: ISPIM Innovation Conference—Celebrating Innovation: 500 Years Since daVinci, Florence, Italy (2019)
23. Budzinskaya, O.: Competitiveness of Russian education in the world educational environment. Astra Salvensis-Rev. Istor. Si Cult. VI/2018(11), 565–576 (2018)
24. Tsvetkova, E., Lomer, S.: Academic excellence as "competitiveness enhancement" in Russian higher education. Int. J. Comp. Educ. Dev. 21(2), 127–144 (2019)
25. Erkkilä, T., Piironen, O.: Rankings and Global Knowledge Governance: Higher Education, Innovation and Competitiveness. Springer, Berlin (2018). ISBN 331968941X, 9783319689418
26. Tikhomirov, V., Dneprovskaya, N., Yankovskaya, E.: Three dimensions of smart education. In: Uskov, V.L., Howlett, R.J., Jain, L.C. (eds.) Smart Education and Smart e-Learning. Smart Innovation, Systems and Technologies, vol. 41. Springer, Cham (2015)
27. Tee, K.F.: Suitability of performance indicators and benchmarking practices in UK universities. Benchmarking Int. J. 23(3), 584–600 (2016)
28. Hazelkorn, E.: Rankings and the Reshaping of Higher Education: The Battle for World-Class Excellence. Palgrave Macmillan, London (2015)
29. Tochkov, K., Nenovsky, N., Tochkov, K.: University efficiency and public funding for higher education in Bulgaria. Post Communist Econ. 24(4), 517–534 (2012)
30. Çakır, M.P., Acartürk, C., Alaşehir, O., Çilingir, C.: A comparative analysis of global and national university ranking systems. Scientometrics 103(3), 813–848 (2015)
31. Hazelkorn, E.: World-class universities or world-class systems? Rankings and higher education policy choices. In: Rankings and Accountability in Higher Education: Uses and Misuses. UNESCO, Paris (2013)
32. International Association of Universities: International Handbook of Universities. Palgrave Macmillan, Basingstoke, UK (2014)
33. Downe-Wamboldt, B.: Content analysis: method, applications, and issues. Health Care Women Int. 13(3), 313–321 (1992)
34. Wächter, B., Kelo, M., Lam, Q.K.H., Effertz, P., Jost, C., Kottowski, S.: University quality indicators: a critical assessment. Eur. Parliam. (2015)
35. Minola, T., Donina, D., Meoli, M.: Students climbing the entrepreneurial ladder: does university internationalization pay off? Small Bus. Econ. 47(3), 565–587 (2016)
36. Liu, N.C.: The academic ranking of world universities and its future direction. In: Rankings and Accountability in Higher Education: Uses and Misuses. UNESCO, Paris (2013)
37. Waltman, L., et al.: The Leiden ranking 2011/2012: data collection, indicators, and interpretation. J. Am. Soc. Inform. Sci. Technol. 63(12), 2419–2432 (2012)
38. Huang, M.H.: Opening the black box of QS world university rankings. Res. Eval. 21(1), 71–78 (2012)
39. Ziegele, F., Vught, F.: „U-Multirank" und „U-Map" als Ansätze zur Schaffung von Transparenz im europäischen und globalen Hochschulsystem – Konzepte und Erfahrungen. Beiträge zur Hochschulforschung, 35. Jahrgang, 2/2013
40. Marginson, S., Wende, M.: To rank or to be ranked: the impact of global rankings in higher education. J. Stud. Int. Educ. 11(3–4), 306–329 (2007)
41. Boulton, G.: University rankings: diversity, excellence and the European initiative. Procedia Soc. Behav. Sci. 13, 74–82 (2011)
42. Broecke, S.: University rankings: do they matter in the UK? Educ. Econ. 23(2), 137–161 (2012)
43. Elken, M., Hovdhaugen, E., Stensaker, B.: Global rankings in the Nordic region: challenging the identity of research-intensive universities? High. Educ. 72(6), 781–795 (2016)
44. Daraio, C., Bonaccorsi, A., Simar, L.: Rankings and university performance: a conditional multidimensional approach. Eur. J. Oper. Res. 244(3), 918–930 (2015)
45. Kivinen, O., Hedman, J.: World-wide university rankings: a Scandinavian approach. Scientometrics 74(3), 391–408 (2007)
46. Yordanova, Z.: A model for evaluation of innovative universities. In: 2nd Eurasian Conference on Educational Innovation 2019, Educational Innovations and Applications, pp. 459–462. Tijus, Meen, Chang. ISBN 978-981-14-2064-1 (2019)
Analysis and Forecast of Car Sales Based on R Language Time Series Yuxin Zhao
Abstract The data come from the Data Hall website, and the R language is used as the data analysis tool. The analysis focuses on the time series of monthly sales of a particular sedan: the monthly sales are first extracted as historical data, the series is then analyzed and forecast, and conclusions are drawn. Keywords R language · Data analysis · Car sales · Time series analysis
1 Introduction A time series is data recorded at a fixed time interval; the most common examples are stock prices, daily stock price charts, and daily weather data. Time series analysis is an important part of statistical analysis because predictions can be made from historical data, so almost every statistical analysis package offers time series analysis and forecasting functions. Common time series analysis methods are the simple average method, the weighted average method, and the moving average method. Two powerful algorithms dominate time series modeling: Holt-Winters and ARIMA. The ARIMA model is the autoregressive integrated moving average model. ARIMA(p, d, q) is called the differenced autoregressive moving average model, where AR denotes autoregression and p is the number of autoregressive terms; MA denotes the moving average and q is the number of moving average terms; and d is the degree of differencing needed to make the series stationary [1]. The R language has powerful packages for numerical computation, statistical analysis, and data mining, and this article describes how statistical analysis and prediction of car sales time series data are implemented in R.
Y. Zhao (B) Beijing Information Technology College, Beijing 100018, China e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_18
Fig. 1 Volkswagen Langyi monthly sales data (part)
2 Data Situation We want to analyze and predict sales of the Langyi, a popular Volkswagen model, so the Langyi sales data are first filtered out in Excel (Fig. 1). The data run from April 2011 to October 2013, about two and a half years of sales. For ease of use, we sort the records by month in ascending order. The result is shown in Fig. 2.
3 Data Processing First of all, we import the Excel data into R: we copy the Excel file data to the clipboard and then use the read.table function to import it:

> a <- read.table("clipboard", header = TRUE, sep = "\t")
> a

(Fig. 3) Based on the historical data, we first draw the time series plot:

> plot.ts(a$monthsales, xlab = "month")

(Fig. 4)
Fig. 2 Volkswagen Langyi sales data in ascending order (part)
As can be seen from the figure, which shows 31 months of Volkswagen Langyi sales data, there is no obvious cyclic or seasonal trend. January 2013 set a sales record of 48,267 units, presumably because the period before the Spring Festival is the peak vehicle sales season. July 2011 saw a sales trough of only about 3000 units.
4 Time Series Test Analysis 4.1 Autocorrelation Test For a non-stationary series, the ACF autocorrelation plot does not tend to zero, or tends to zero very slowly. The two dashed lines in the autocorrelation plot represent the confidence bounds, i.e., the upper and lower bounds of the autocorrelation coefficients.
Fig. 3 The Volkswagen Langyi models sales data (part) in R
Fig. 4 Langyi monthly sales data timing chart
Fig. 5 The autocorrelation graph of the original sequence
Draw the autocorrelation plot of the original sequence:

> acf(a$monthsales, lag.max = 30)

(Fig. 5).
4.2 Unit Root Test

> unitrootTest(a$monthsales)

(Fig. 6) Analyzing the figures above: the timing diagram in Fig. 4 shows a trend across the years, indicating a non-stationary sequence. The autocorrelation test shows that the autocorrelation coefficients remain greater than zero for a long time, which further indicates a non-stationary sequence. The p value of the unit root test is significantly greater than 0.05, so the sequence is judged to be non-stationary.
Fig. 6 Unit root test results
5 ARIMA Modeling Analysis 5.1 Non-stationary Sequence Differencing According to the results of the unit root test and the autocorrelation and partial autocorrelation functions, the series becomes stationary after a first-order difference [2]. Differencing is the "integrated" part of ARIMA. The first-order difference subtracts from each term of the original sequence the value of the previous term; the second-order difference is a further difference applied to the first-order difference. Repeated differencing eventually yields a stationary sequence. In R, the diff() function performs the differencing operation on a time series.
> diffsale <- diff(a$monthsales)
> diffsale
[1]   -837    378 -19389  13702   6559   -731    950  -8353   6278  -1733
[11]  -263   -295  -9380   1146   8768   4361   6037   1060  -7776  -3885
[21] 28231 -13378   3445  -4401  -3399  -4924  -1369   2855   2856    320
Fig. 7 Timing diagram of first-order difference
After differencing, we test again:

> plot.ts(diffsale)
> acf(diffsale, lag.max = 30)
> unitrootTest(diffsale)

(Figs. 7, 8, 9 and 10) After the first-order difference, the timing diagram fluctuates smoothly around the mean, the autocorrelation shows strong short-term correlation, and the p value of the unit root test is much less than 0.05. So the first-order differenced sequence is stationary.
5.2 Time Series Model Order Identification The autocorrelation plot after the first-order difference shows that the ACF values quickly fall into the confidence interval with no convergence trend, i.e., the ACF tails off. Therefore, an autoregressive-based model is fitted to the first-order differenced sequence; that is, the ARIMA(14, 1, 3) model is used for the original sequence.
Fig. 8 Autocorrelation test chart
Fig. 9 Partial autocorrelation test chart
Fig. 10 Unit root test chart
> arima <- arima(a$monthsales, order = c(14, 1, 3))
> Box.test(arima$residuals, lag = 5, type = "Ljung-Box")

        Box-Ljung test

data:  arima$residuals
X-squared = 0.84129, df = 5, p-value = 0.9743

Since p = 0.9743 is greater than 0.05, the residuals pass the white noise test.
6 ARIMA Model Prediction R can predict future values of the sequence through the forecast package. To predict the Langyi monthly sales for the next five months together with the confidence bounds, the statement is as follows:
> forecast <- forecast:::forecast.Arima(arima, h = 5, level = c(80, 90))
> forecast
   Point Forecast    Lo 80    Hi 80    Lo 90    Hi 90
31       18603.21 13394.50 23811.93 11917.90 25288.52
32       18323.80 13114.89 23532.71 11638.24 25009.36
33       26284.64 20792.30 31776.99 19235.29 33333.99
34       31365.17 24487.27 38243.08 22537.48 40192.87
35       29068.91 21297.64 36840.18 19094.59 39043.22

The predictions can be seen clearly. The original series and the forecast can also be drawn together using plot:

> plot(forecast)

(Fig. 11).
7 Concluding Remarks The above is the author's analysis of the monthly sales data of the Langyi car, mainly using the time series analysis methods of R: drawing the sequence diagram, testing whether the sequence is stationary, and differencing the non-stationary sequence until it becomes stationary. The ARIMA method is then used for analysis and modeling, and the forecast is completed on that basis. After analysis and comparison, it is concluded that the ARIMA(14, 1, 3) model is suitable for predicting the short-term sales trend of the Langyi sedan. The ARIMA model is also used in many other settings, such as the analysis and forecast of tourist numbers [3] and price index forecasting [4]. Better prediction methods remain to be explored.
Fig. 11 Time series prediction
References
1. Cong, Y., Dongmin, W., Huimin, Z.: Analysis and forecast of Zhengzhou corn purchase price based on ARIMA model. Modern Business 28, 31–33 (2019)
2. Tian, Z., Ershun, P.: Capacitor degradation model based on time series analysis. J. Shanghai Jiaotong University 53(11), 1316–1325 (2019)
3. Ruoyu, L., Libo, L.: Analysis and prediction of tourist numbers based on ARIMA model. Comput. Telecommun. (Z1), 1–4 (2019)
4. Xiaowei, Y., Xianguo, L., Youtian, T.: An empirical analysis of Anhui consumer price index based on ARIMA model. J. Chaohu University 20(06), 1–7 (2018)
Intelligent Hardware and Software Design
Distributed Database Design and Heterogeneous Data Fusion Method Applied to the Management and Control Platform of Green Building Complex Wen-xia Liu, Wen-jing Wu, Wei-hua Dong, Dong-lin Wang, Chao Long, Chen-fei Qu, and Shuang Liu Abstract Ecological, green development has become a basic national policy in China. Green buildings are springing up in large numbers, but there are few corresponding solutions. In view of the varied, complex data produced by a green building complex with many business forms and green technologies, and the difficulty of unified management and control, this paper proposes establishing a two-level distributed database for the heterogeneous data and processing the data with a service-data-object-based middleware method and a data warehouse method. According to the response requirements and data requirements of the platform's different functions, data fusion is performed in different ways, and data query speed is optimized, thereby achieving unified management of the green building complex. Keywords Green building complex · Two-level control · Distributed database · Service data object · Middleware · Data warehouse · Data fusion
1 Introduction With the upsurge of the smart city, related solutions such as smart transportation and smart shopping have been derived. However, for buildings, an important part of the city, there are relatively few data solutions for smart buildings, especially green buildings, owing to the complexity of their data [1, 2]. In this paper, to meet the demand for coordinated, decentralized, and unified management and control of green buildings, a set of data processing solutions for green buildings is established. Firstly, according to the demand for two-level management and control of green buildings, a distributed database is established. Then the data processing of the building complex is discussed, and the various complex data of the green building complex are de-heterogenized. The data are integrated, queried, and called by using XML-based middleware and data warehouse methods. Finally, depending on the requirements for data update speed and response
W. Liu (B) · W. Wu · W. Dong · D. Wang · C. Long · C. Qu · S. Liu Tian Jin Architecture Design Institute, Tian Jin 300074, China e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_19
frequency, the appropriate data access method is identified and invoked. This data processing method addresses the needs of two-level management of the green building complex and thus realizes its unified management [3, 4].
2 Heterogeneous Data of Green Buildings Generally, there are many intelligent systems in a green building complex, involving a wide range of disciplines and management content. Most of the subsystems use proprietary communication protocols for internal data transmission. This prevents information sharing between subsystems and makes linkage and interoperability impossible. This situation obviously cannot satisfy the requirements of unified management and control of the building complex, nor can it cope with the many situations a green building complex objectively faces. Without integration under an overall management plan, the subsystems cannot execute the various complex instructions in an orderly manner and cannot achieve the goals of monitoring, predicting, correcting, and controlling the building. In view of the complex situation of the green building complex, with multiple functions and multiple properties co-owned and co-managed, a design scheme for an integrated management and control platform of the green building complex is proposed. The key problem in the scheme is to solve the structuring and integration of the various structured, semi-structured, and unstructured data, such as video, access control, light control, and IP network data. Only by solving this fundamental problem can we provide the data basis for unified management by the integrated management and control platform.
3 Database Design of the Management and Control Platform for the Green Building Complex Owing to the demand for coordinated, decentralized, and unified management and control of the properties of a complex green building complex, the green intelligent management and control platform is designed in a two-level management and control mode. The intelligent general control management platform of the building complex is set as the first-level management, and the intelligent sub-control management platforms of the individual units are configured as the second-level management.
Fig. 1 Distributed database management system
3.1 Overall Database Scheme Design The design must adapt to the two-level control design of the green intelligent control platform. The data scale of a dense green building complex is large, and the usage scenarios are complex and diverse, so a traditional single, centralized database gradually exposes many defects: a huge data volume, heavy load, congested data transmission links, large data communication overhead, low data query efficiency, and so on. Therefore, a distributed database solution is put forward. A distributed database is characterized by physical spatial distribution, logical integrity, and the autonomous cooperation of single buildings [5–9]. It is sometimes difficult to ensure the security and confidentiality of distributed database data; in this paper, the equipment network based on the internal local area network is used for network isolation, and this physical isolation assures the security and reliability of the information. The distributed database management system is illustrated in Fig. 1. The regional integrated management database of the green building complex serves the first-level management of the platform, and the individual-building databases serve the second-level management requirements. A single-building database can be interconnected with the regional integrated management database.
3.2 Single Building Database Design According to the management requirements of the different intelligent systems and equipment of each single building in the green building complex, and following standard database design methodology, the single-building database design is divided into four stages: demand analysis, conceptual design, logical design, and physical design. (1) Demand analysis Demand analysis of the database covers two aspects: analysis of the data to be processed and analysis of the operations on the data. The data to be handled in building equipment management
Fig. 2 Database one to many diagram
includes static data, dynamic data, and behavioral data. Static data mainly includes equipment asset management and space management data. Dynamic data includes real-time operational status, operational data, and energy consumption statistics. Behavioral data mainly refers to data generated by human operation, such as work order distribution and mode setting. (2) Conceptual design The entity types of the system include the building monomer, equipment, management personnel, etc. Building monomer and equipment are related by a "space location" relationship, which is a one-to-many relationship, as shown in Fig. 2. Equipment and administrators are related by a many-to-many relationship, as shown in Fig. 3. (3) Logical design The task of logical design is to transform the conceptual structure into the corresponding logical structure. Here, the conceptual structure diagram, shown in Fig. 4, is transformed into a relational model; that is, each entity and relation in the conceptual diagram is transformed into a relational table, and the tables are normalized. Normal forms are applied as the specific situation requires; the third normal form is used here. According to the above requirement analysis and the system's modules, the database consists of tables such as a user table, a spatial information table, and an equipment information table. According to actual requirements and processing convenience, the tables can be expanded or cut accordingly.
Fig. 3 Database many to many diagram
Fig. 4 Database concept diagram
(4) Physical design Physical design is the detailed design of the data tables. The tasks of this stage include determining the names of all database tables and the names, types, and widths of the fields they contain, and determining the indexes to be established for each table.
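As a minimal illustration of this stage, the following Python/SQLite sketch defines one possible physical design for the equipment information table named above; the field names, types, and index are hypothetical examples, not the project's actual schema.

import sqlite3

# Hypothetical physical design of the equipment information table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE equipment_info (
    equipment_id INTEGER PRIMARY KEY,   -- unique device identifier
    name         TEXT    NOT NULL,
    building_id  INTEGER NOT NULL,      -- "space location": one building, many devices
    status       TEXT,                  -- dynamic data: operating status
    last_reading REAL                   -- dynamic data: latest measurement
);
CREATE INDEX idx_equipment_building ON equipment_info(building_id);
""")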
4 Heterogeneous Data Fusion Method Because the data of the green building complex come from different single buildings and different intelligent systems, such as video, light control, access control, and energy consumption, the data sources, formats, and structures all differ, and the management and control platform needs to fuse these multimodal heterogeneous data. In this paper, two main methods are used for heterogeneous data fusion: the middleware method and the data warehouse method. The middleware method shields the heterogeneity of the systems so that software can run on different platforms; its advantages are convenient management, easy maintenance, and cost saving, but it lacks common standards and has high coupling. The data warehouse method solves the problem of data distribution and has the advantages of high access efficiency
and low dependence on the network, but it has poor real-time performance, a long development cycle, and high cost, and it is difficult to update. In view of the above problems, this paper adopts a middleware data fusion method and a data warehouse method based on the service data object. The specific methods are as follows.
4.1 Middleware Method Data Fusion The service data object (SDO) is a programming standard specification. It is a data programming architecture for the Java platform and a standard API [10–13] that simplifies and unifies access to heterogeneous data; it is also a standard for representing data in enterprise applications. Middleware data fusion is based on a "common data model": in essence, the data remain stored in the individual data sources participating in the fusion, and the data are encapsulated and transformed by a "wrapper" developed for each data source, which virtualizes them into a common data model. The user's query, access, and other operations are performed uniformly on the shared data model through the SDO API. The middleware decomposes a query submitted against the common data model into queries against one or more data sources, synthesizes the query results into data of the shared data model, and returns the results to the user. In this method, the differences between the underlying data sources are shielded from users, so a user's query is ostensibly aimed at a single data source while in fact it combines the subquery results from each data source; this is therefore also called the virtual view method. The middleware heterogeneous data fusion model is presented in Fig. 5.
Fig. 5 Middleware data integration diagram
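A minimal, language-agnostic sketch of this virtual view idea is given below in Python; the wrapper names, record fields, and in-memory "sources" are invented for illustration, and a real implementation would expose the shared data model through the SDO API rather than plain dictionaries.

class Wrapper:
    """Encapsulates one heterogeneous source behind a common interface."""
    def __init__(self, name, records):
        self.name, self.records = name, records

    def query(self, predicate):
        # Translate the global query into the local format and run it;
        # here the "translation" is just a filter over in-memory dicts.
        return [r for r in self.records if predicate(r)]

class Mediator:
    """Decomposes a global query into per-source subqueries and merges results."""
    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        results = []                      # data stays in the sources
        for w in self.wrappers:
            results.extend(w.query(predicate))
        return results

access = Wrapper("access_control", [{"door": 3, "alarm": True}])
light = Wrapper("light_control", [{"zone": 7, "alarm": False}])
print(Mediator([access, light]).query(lambda r: r.get("alarm")))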
Fig. 6 Data warehouse fusion diagram
4.2 Data Fusion of the Data Warehouse Method The data fusion method of the data warehouse is to build a data warehouse, load copies of the data from the different information sources into the warehouse, and synthesize a global schema; users can then query and access the data in the warehouse through the API provided by SDO. The data warehouse integration method is shown in Fig. 6.
5 Heterogeneous Data Fusion of the Green Building Complex In this paper, the two heterogeneous data fusion methods based on the service data object are employed in the management and control platform of the green building complex. Depending on the requirements of the platform's different functions, the middleware data fusion method and the data warehouse data fusion method are adopted at the same time. According to the differing requirements of the platform for data transmission scale, update frequency, and response speed, the two data processing methods are assigned as follows. The data warehouse is applied to data sources with low real-time requirements, such as access control history data, parking history data, and report data. Middleware is applied to data requiring fast response, such as the transmission of single-building alarm data to the regional integration platform. Because the SDO data object adopts the XML data format, and the size of XML data is several times that of the original data, loading a large amount of data places a heavy burden on server memory and can easily cause memory overflow, so this paper uses a data buffer, Redis, to solve the problem. In the middleware data fusion method, the data remain saved in each single data source. The management and control platform provides a virtual view (i.e., a global
schema) and a processing mechanism for global-schema queries. Users submit requests directly against the global schema; the control platform processes them and converts them into requests that each data source can execute against its local data view. The data fusion method of the data warehouse is based on the idea of data replication: the data from each data source are copied to the other data sources related to it, the overall data consistency of the sources is maintained, and the efficiency of information sharing and utilization is improved. The application architecture of the SDO-based heterogeneous data fusion method in the management and control platform of the green building complex is illustrated in Fig. 7. The architecture is divided into three layers: the query and processing layer, the middle layer, and the data source layer. The data source layer mainly includes the various types of multimodal data of the intelligent systems in the platform, such as the structured, semi-structured, and unstructured data of video, access control, elevator control, light control, etc.; it is the most basic layer. In order to shield the differences between data types in different systems, a wrapper is set up to encapsulate and transform the underlying multimodal data. Following the SDO standard, different wrappers are developed for different data sources, providing a unified external interface. The wrapper encapsulates the transformed data into SDO data objects and provides them to the middle-tier middleware. Middleware components differ according to the programming language, such as the J2EE middleware WebLogic Server used in this paper.
Fig. 7 Data processing architecture of control platform
Fig. 8 Flow chart of data identifier of control platform
The query processing layer mainly provides the data interface for the control platform. The control platform passes data access requests related to building equipment to the parser, which reads the request information and distributes the request to the separate wrappers. An identification module is added in the design to make clear whether a given application query of the management and control platform should access the data warehouse or the middleware. The flow chart of the identification module is given in Fig. 8.
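The decision made by the identification module can be pictured with the short sketch below; the set of real-time request types is an assumption for illustration, not the platform's actual configuration.

# Requests whose data must be fresh go through the middleware (live sources);
# history and report queries go to the data warehouse copy.
REALTIME = {"alarm", "equipment_status"}          # assumed real-time types

def route(request_type):
    return "middleware" if request_type in REALTIME else "data_warehouse"

assert route("alarm") == "middleware"
assert route("access_history") == "data_warehouse"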
6 Project Case A cultural center project (hereinafter referred to as the project) consists of six parts: the exploration museum, the art gallery, the library, the performing arts center, the civic activity center, and the cultural corridor, plus an underground parking lot including the civil air defense project. The total construction area of the project is 316,000 square meters. It is a representative case of a large-scale cultural complex and a green two-star building complex. The cultural complex is illustrated in Fig. 9. In this project, the management and control platform records all the information of the regional integration center and the six building units in real time. It can not only integrate and manage the information of the many subsystems but also compare the otherwise discrete single-subsystem information along horizontal and vertical dimensions, helping the owner analyze, make decisions, and set up corresponding plans. At the same time, it realizes hierarchical management of the single buildings and thus fine-grained management of the complex. The database design method and data fusion method of this paper are applied to the management and control platform of this project. For example, the middleware data
Fig. 9 Green building complex
fusion method is used in the alarm management of the platform: based on XML, the alarm information of a single building is transmitted through middleware message subscription, as shown in Fig. 10. The data warehouse data fusion method is used to query the historical door access records in the single-building management and control platform, as shown in Fig. 11.
Fig. 10 Alarm information display using middleware data fusion method
Fig. 11 Access control history query using data warehouse data fusion method
7 Conclusions In this paper, the distributed design of the database of the management and control platform for a green building complex satisfies the requirements of hierarchical and refined management of the complex. At the same time, the data of the many intelligent systems are de-heterogenized, and data fusion is carried out in different ways according to the response requirements of the platform's diverse functions and data requirements. Data query speed is optimized. This lays a foundation for the design of the integrated management and control platform of the green building complex.
References
1. Deren, L., Yuan, Y., Zhenfeng, S.: Big data in smart city. Geomatics Inf. Sci. Wuhan Univ. 39(6), 631–640 (2014)
2. Jingyuan, W., Chao, L., Zhang, X., et al.: A review of data-based smart city research. Comput. Res. Dev. 2, 5–25 (2014)
3. Pengfei, Z.: Analysis of operation and maintenance management of Shanghai World Expo green smart ecological park. Green Build. 6, 15–18 (2017)
4. Liu, W.X., Wang, D.L., Wu, W.J., et al.: The method of random generation of electronic patrol path based on artificial intelligence
5. Huiyang, F.: Research on integrated management and control technology of intelligent hydropower plant (2015)
6. Jiang, F.: Research and application of Oracle based data replication in SHXEPC CMS. Taiyuan Univ. Technol. (2008)
7. Guolu, L.: Query processing in distributed database system. J. Qinghai Univ. Nationalities Edu. Sci. Ed. 25(S3), 68–69 (2005)
8. Yining, M., Xia, Z., Jingru, C.: Gradual optimization strategy of small and medium-sized websites based on distributed database. Comput. Knowl. Technol. 12, 11–12 (2012)
9. Ling, X., Jihong, L., Jianchu, Y.: Research and application of distributed database system. Comput. Eng. 1 (2001)
10. Zhigang, N., Mei, H., Jia, L.: Research on data integration scheme of heterogeneous system based on service data object. Comput. Appl. 27(S1), 21–23 (2007)
11. Panqing, W., Zengliang, L., Yuan, T., et al.: Research on SDO based data integration platform. Comput. Meas. Control 18(7), 1657–1659 (2010)
12. Fuguijie: Research on heterogeneous data integration and data access technology based on service data object. Shenyang Univ. Technol. (2011)
13. Liu, M.: Research on data synchronization technology of heterogeneous system of railway information sharing platform. Southwest Jiaotong University (2013)
Analysis and Prediction of Fuel Oil in a Terminal Yuxin Zhao
Abstract A time series is data recorded at a fixed time interval; the most common examples are stock prices, daily stock price charts, and daily weather data. Time series analysis is an important part of statistical analysis because predictions can be made from historical data, so almost all statistical analysis software offers time series analysis and prediction functions. The common analysis methods for time series are the simple average method, the weighted average method, and the moving average method. There are two powerful algorithms for time series: Holt-Winters and ARIMA. The R language has powerful packages and excels at numerical computation, statistical analysis, and data mining. Based on the fuel data of a terminal, this paper analyzes and forecasts the time series of daily fuel volume. Keywords R language · Data analysis · Time series analysis
1 Introduction The ARIMA family of models, including AR(p), MA(q), and ARMA(p, q), is suitable for stationary time series. However, in statistical practice, the time series we obtain are usually non-stationary, sometimes with systematic upward or downward trends. Non-stationary time series must first be made stationary, and the difference transformation is the most commonly used stabilizing method. On the basis of the stationary series, the autocorrelation function (ACF) and partial autocorrelation function (PACF) sequences are obtained to determine the values of the p, d, and q parameters. After stabilization, if the partial autocorrelation function cuts off and the autocorrelation function tails off, an AR model is established, and the order at which the PACF cuts off is the value of p;
Y. Zhao (B) Beijing Information Technology College, Beijing 100018, China e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_20
If the autocorrelation function cuts off and the partial autocorrelation function tails off, an MA model is established, and the order at which the ACF cuts off is the value of q. If both the partial autocorrelation function and the autocorrelation function tail off, the sequence is suitable for an ARMA model. That is, the value of p is determined by the PACF, the value of q is determined by the ACF, and d is the order of differencing. In the modeling process, parameter estimation tests whether the model is statistically significant, and hypothesis testing judges whether the residual sequence is white noise. Finally, the model that has passed the tests is used for prediction.
2 Innovation In typical ARIMA applications, the parameters p and q are no more than 2, but in this project we found that the AIC value is minimized when p = 5 and q = 13, and the resulting model is ideal. The R language is a classical statistical analysis tool with complete time series modeling functionality; this project uses R to complete the analysis and prediction.
3 Project Analysis and Prediction Process The project data are the refueling volumes of a terminal over the past year, a typical non-stationary time series. Because there are no cyclic or seasonal characteristics, the ARIMA model is selected: the original time series is differenced, the parameters are determined after it becomes stationary, the model is created, and the model is applied to predict the refueling volume on subsequent dates, so that the terminal's fuel supplier can prepare accurately in advance and preset the number of refueling pumps to be started.
3.1 Process the Original Data and Read the Daily Fuel Volume Time Series into R

> data <- read.table("clipboard", header = T, sep = "\t")
> plot.ts(data$oil)

(Fig. 1)
Fig. 1 Time sequence chart of fuel filling volume of the day
The time series plot of a stationary sequence shows values that fluctuate randomly around a constant within a bounded range; if there is a clear trend or periodicity, the sequence is usually not stationary. From the plot above, we can initially judge the series to be non-stationary.
3.2 Unit Root Test on the Original Sequence to Verify Its Stationarity The stationarity of a time series is usually checked with the unit root test [1]. In R, the unit root test can be performed with the unitrootTest() function in the fUnitRoots package. The function form is:

unitrootTest(x, lags = 1, type = c("nc", "c", "ct"), title = NULL, description = NULL)

Here, the input parameter x is the observation sequence, lags is the maximum lag term used to correct the error term, and type is the regression type of the unit root test. Among the returned values, a p value of less than 0.05 means the series passes the unit root test.
> unitrootTest(data$oil)

Title: Augmented Dickey-Fuller Test
Test Results:
  PARAMETER:
    Lag Order: 1
  STATISTIC:
    DF: 0.4338
  P VALUE:
    t: 0.8068
    n: 0.7901

Because p > 0.05, the series is shown to be a non-stationary time series.
3.3 Autocorrelation and Partial Autocorrelation Tests An autocorrelation plot in which the autocorrelation coefficients remain greater than zero for a long time indicates strong long-term correlation in the sequence, which can be judged as non-stationary [2]. The autocorrelation and partial autocorrelation plots of the original time series are drawn as follows:

> acf(data$oil, lag.max = 25)

(Fig. 2)

> pacf(data$oil, lag.max = 25)

(Fig. 3) The autocorrelation plot of the original sequence further shows that it is non-stationary.
3.4 ACF and PACF Plots After the Second-Order Difference A non-stationary time series must be differenced until a stationary series is obtained. We performed a second-order difference on the original time series (Figs. 4 and 5).
Fig. 2 Autocorrelation graph of original time series
Fig. 3 Partial autocorrelation graph of original time series
Fig. 4 ACF diagram after second-order difference
Fig. 5 PACF diagram after second-order difference
3.5 Unit Root Test to Determine Whether the Sequence is Stationary

> unitrootTest(diffjiayou)

Title: Augmented Dickey-Fuller Test
Test Results:
  PARAMETER:
    Lag Order: 1
  STATISTIC:
    DF: -25.9314
  P VALUE:
    t: < 2.2e-16
    n: 0.0002966

After the second-order difference, p < 0.05, so the differenced sequence is stationary.

3.6 Establish the Model

> arima <- forecast::Arima(data$oil, order = c(5, 2, 13))
> arima
Series: data$oil
ARIMA(5,2,13)
Coefficients:
          ar1      ar2      ar3      ar4      ar5      ma1     ma2      ma3
      -0.0904  -0.6050  -0.0716  -0.8664  -0.3080  -1.6488  1.3037  -1.0731
s.e.   0.1692   0.0569   0.1306   0.0544   0.1655   0.2079  0.3271   0.3359
          ma4      ma5      ma6     ma7      ma8     ma9     ma10    ma11
       1.2174  -1.1747  -0.0769  0.7593  -0.8342  0.7188  -0.6577  0.7834
s.e.   0.3279   0.3608   0.3300  0.1930   0.2127  0.2302   0.1962  0.2021
         ma12    ma13
      -0.6251  0.3081
s.e.   0.1745  0.1196

sigma^2 estimated as 50688: log likelihood = -1861.44
AIC = 3760.88  AICc = 3763.89  BIC = 3829.39

After establishing the model, the AIC value is 3760.88, which is better than that of the other candidate models.
Fig. 6 Time series diagram of fuel quantity with predicted value
3.7 Forecast the Next Five Days

> forecast:::forecast.Arima(arima, h = 5, level = c(80, 90))
    Point Forecast    Lo 80    Hi 80    Lo 90    Hi 90
275       6188.096 5897.655 6478.537 5815.318 6560.873
276       6110.409 5810.039 6410.778 5724.888 6495.929
277       6056.936 5736.841 6377.030 5646.099 6467.772
278       5976.990 5642.181 6311.799 5547.268 6406.712
279       6124.432 5781.763 6467.102 5684.621 6564.244

> forecast <- forecast:::forecast.Arima(arima, h = 5, level = c(80, 90))
> forecast

See Fig. 6.
4 Summary Time series analysis and prediction are widely used in daily life, for example in the prediction of road traffic conditions [3] and of infectious diseases [4]. According to the characteristics of different time series, it is our constant goal to choose the
appropriate model so as to predict with high accuracy. This article mainly discusses practical experience with the ARIMA model, in the hope of exchanging ideas with readers.
References
1. Zhang, J.: Application of time series model in economic data analysis. Liaoning Normal University (2017)
2. Lu, X., Wang, J.: Live STAR user card opening prediction for time series analysis. Sci. Surveying Mapp. 41(12), 39–42+74 (2016)
3. Xu, D.-W., Wang, Y.-D., Jia, L.-M., Qin, Y., Dong, H.-H.: Real-time prediction of road traffic conditions based on ARIMA and Kalman filtering. Front. Inf. Technol. Electron. Eng. 18(02), 287–303 (2017)
4. Yanfei, Y.: Study on epidemic trend and prediction of infectious diseases based on time series model. Changchun University of Technology (2016)
5. Liu, L.: Application of time series analysis in China's GDP forecast. Times Finance (24), 20–24 (2017)
6. Bing, S., Xiaoming, Y., Weihui, B., Lei, S., Xiaofen, N., Wei, M., Jie, G.: Application of time series analysis in prediction and early warning of influenza like cases in Jing'an District, Shanghai. Environ. Occup. Med. 33(02), 156–159 (2016)
An Android Malware Detection Method Based on Native Libraries Qian Zhu, Yuanqi Xu, Chenyang Jiang, and Wenling Xie
Abstract Addressing the problem that few existing Android malware detection studies focus on malicious code in native libraries, an Android malware detection method based on native libraries is proposed in this paper. Firstly, ARM assembly instructions and grayscale images are extracted from the native libraries. Secondly, N-gram features, GLCM features, and fusion features are extracted from the ARM assembly instructions and grayscale images. Finally, different types of machine learning algorithms are trained with these features to establish the best malware detection classifier. The experimental results show that the proposed method achieves an accuracy of 86.3% and can detect malicious code in native libraries effectively. Keywords Android malware · Native libraries · N-gram · GLCM
1 Introduction Different kinds of Android malware have grown explosively in recent years. As mobile phones are the most important carriers of the mobile Internet, their security situation is becoming more and more complex, and mobile security has become a focus of the security field. APK is the abbreviation for Android application package. Generally, an APK file consists of the following files:
Q. Zhu (B) · Y. Xu · C. Jiang · W. Xie Software College, Northeastern University, Shenyang 110819, China e-mail: [email protected] Y. Xu e-mail: [email protected] C. Jiang e-mail: [email protected] W. Xie e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_21
(i) AndroidManifest.xml: it includes information about the whole application such as the package name, permissions, API version, and components.
(ii) classes.dex: it includes the entire code logic of the application, stored as Dalvik bytecode.
(iii) lib: it includes native libraries compiled from C/C++.
(iv) META-INF: it mainly stores the application certificate and signature information files.
(v) assets: it includes image and text resource files.
(vi) resources.arsc: it includes application resource files.
(vii) res: it includes some resource files that cannot be packed into resources.arsc.
At present, studies of Android malware detection mainly extract relevant information from the aforesaid files of the APK and process it to detect malware. Some studies use features from classes.dex. MalDozer [1] is an automatic Android malware detection and family attribution framework; it extracts the raw sequence of the app's API method calls from classes.dex and automatically extracts and learns the malicious and benign patterns from actual samples to detect Android malware. Fan et al. [2] proposed a novel approach that constructs frequent subgraphs from classes.dex to represent the common behaviors of malware samples that belong to the same family. Some studies use features from AndroidManifest.xml. Chen et al. [3] presented a method to measure the similarity of software behaviors by means of static analysis and dynamic operation; their method statically analyzes features from AndroidManifest.xml such as the package name, components, and class names. Some studies use features from resources. Faruki et al. [4] presented AndroSimilar, an approach that generates signatures by extracting statistically robust features from resources to detect Android malware. Some studies use features from the MANIFEST folder. Lee et al. [5] proposed a novel malware classification method for Android malware using stacked RNNs and CNNs; their method extracts the application's package name from AndroidManifest.xml and the developer certificate from the MANIFEST folder. Some studies use features from the lib folder. Lang et al. [6] proposed a malicious code classification method based on multi-feature fusion using three features: N-gram sequences from disassembled x86 libraries, and GIST and GLCM features from source and disassembled x86 libraries. Sun et al. [7] proposed an Android malicious code detection mechanism based on the native layer; their method converts native libraries into assembly instructions, generates control flow graphs, and then optimizes them. The problem, however, is that few current studies extract features from native libraries, which means they cannot detect malicious code in native libraries effectively. Because of this problem, an Android malware detection method based on native libraries is proposed in this paper.
2 Methodology The malware detection model of this paper is shown in Fig. 1. Firstly, the training set is processed through raw data extraction and feature extraction. Secondly, four types of machine learning classifiers (KNN, LR, SVM, and RF) are trained with the processed training set, and the best classifier is selected by comparison. Finally, the test set is processed in the same way and is tested by the classifier selected in the previous step to determine whether each sample is malware.
2.1 Raw Data Extraction

2.1.1 Native Libraries Extraction

When loading native libraries, an app looks up the corresponding files in the lib folder according to the platform it is running on. In this section, native libraries that run on the ARM platform are unzipped from the APK.
2.1.2 Disassembling Native Libraries

IDA Pro is a cross-platform, multi-processor disassembler and debugger [8] that can disassemble native libraries into ARM assembly instructions. The batch mode of IDA is used to perform the disassembly in this section.
2.1.3 Grayscale Image Extraction

In a color image, the color of every pixel depends on three parameters, each between 0 and 255. Compared with a color image, a grayscale image is relatively simple: only one value decides the gray level of each pixel, and it describes textural features clearly.
Fig. 1 Malware detection model
From the perspective of data storage, native libraries are binary streams consisting of 0s and 1s. During grayscale image extraction, each native library is converted into a binary stream, and the streams are spliced into one long stream in file order. Every 8 bits of the long stream then generate the gray level of one pixel. The long stream is reshaped into a two-dimensional array of equal length and width with minimal loss according to its length, and after normalization the array values are graded according to the set number of levels. For example, consider the array [187, 255, 31, 74, 99]. It is very difficult to extract accurate features from messy data, so the range 0 to 255 is divided into eight equally spaced levels; after this, the array is transformed to [160, 224, 0, 64, 64]. To better describe the extraction process, here is a detailed example of extracting a grayscale image from an Android malware sample whose SHA-256 is 7fb636bb6d33969c51d7d44745bb65fb09b43913a3bee0d5d8629e6dc7f0e0c8. The steps are as follows:
(i) There are three native libraries in the lib folder of this APK: libhellojni.so, liblocSDK4b.so, and libminivenus.so. Each of them is converted to a binary stream, and the three binary streams are spliced into a long stream in file-read order.
(ii) The length of the long stream is divided by 8 and the square root of the result is taken; this gives the side length of the square grayscale image.
(iii) A two-dimensional array is generated using the method above. Since the grayscale images in this section are single-channel, each number in the two-dimensional array is the gray value of one pixel, each row of the array is a row of the grayscale image, and a complete grayscale image is generated by traversing the two-dimensional array.

Figure 2 shows the result of grayscale image extraction and marks the three grayscale image parts corresponding to each native library.
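A minimal Python sketch of steps (i)–(iii) is given below; the function name is invented, and the eight-level quantization shown is one plausible reading of the grading step (the exact bin mapping of the example above may differ).

import math
import numpy as np
from PIL import Image

def native_libs_to_grayscale(lib_paths, levels=8):
    # Splice the .so files into one byte stream in file order.
    stream = b"".join(open(p, "rb").read() for p in lib_paths)
    pixels = np.frombuffer(stream, dtype=np.uint8)    # every 8 bits -> one pixel
    side = math.isqrt(len(pixels))                    # square with minimal loss
    img = pixels[: side * side].reshape(side, side)
    step = 256 // levels                              # quantize to `levels` bins
    img = (img // step) * step                        # e.g. 187 -> 160, 255 -> 224
    return Image.fromarray(img.astype(np.uint8), mode="L")

# e.g. native_libs_to_grayscale(
#     ["libhellojni.so", "liblocSDK4b.so", "libminivenus.so"]).save("sample.png")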
2.2 Feature Extraction

2.2.1 N-gram Feature Extraction
N-gram is a language model and one of the most important concepts in natural language processing (NLP). The N-gram algorithm slides a window of fixed length N over the extracted content from left to right, one symbol at a time, to form subsequences in which each symbol depends on the preceding N-1 symbols. For instance, the 2-gram sequence of the string ANDROID is AN, ND, DR, RO, OI, ID, and its 3-gram sequence is AND, NDR, DRO, ROI, OID. An N-gram sequence usually occurs more frequently in the dataset than the raw content, and its statistics are more stable, which is why the N-gram algorithm is used for feature extraction in this section.
Fig. 2 Result of grayscale image extraction
Because there are many different ARM assembly instructions, this section classifies and simplifies them into five instruction classes, M, B, J, L, and S, which represent moving, jumping, judging, loading data, and storing data, respectively; only the opcode field is kept and the parameters are removed. The ARM assembly instruction classification is shown in Fig. 3. When N-gram sequences are extracted from the ARM assembly instructions, the number of features grows rapidly with N: when N = 5, there are 3125 possible features, and when N is larger than 5, the number exceeds 15,000. An excessive number of features would actually reduce the detection of malicious code, so this section extracts 2-gram, 3-gram, 4-gram, and 5-gram sequences from the ARM assembly instructions.
Fig. 3 ARM assembly instructions classification
Fig. 4 ARM assembly instructions disassembled by IDA Pro
For instance, the process of extracting N-gram sequences from the ARM assembly instructions disassembled by IDA Pro in Fig. 4 is shown in Fig. 5. Following the method above, this section takes every disassembled function as a unit and extracts its 2-gram, 3-gram, 4-gram, and 5-gram sequences. It then takes every APK as a unit and sums the counts of every N-gram sequence extracted in the previous step to generate the 2-gram, 3-gram, 4-gram, and 5-gram features.
Fig. 5 Process of extracting N-gram sequences
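A compact Python sketch of this per-function extraction follows; the opcode-to-class mapping shown is a small illustrative subset, not the paper's full classification table from Fig. 3.

from collections import Counter

# Simplified instruction classes: M(ove), B(ranch), J(udge), L(oad), S(tore).
OPCODE_CLASS = {"MOV": "M", "B": "B", "BL": "B", "CMP": "J",
                "LDR": "L", "STR": "S"}           # illustrative subset only

def ngram_counts(opcodes, n):
    # Map raw opcodes to their class letters, then slide a window of length n.
    s = "".join(OPCODE_CLASS.get(op, "") for op in opcodes)
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

# Per-APK feature: sum the per-function counters, then read the counts of all
# 5**n possible n-grams in a fixed order to form the feature vector.
print(ngram_counts(["MOV", "LDR", "CMP", "B", "STR"], 2))
# Counter({'ML': 1, 'LJ': 1, 'JB': 1, 'BS': 1})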
2.2.2 GLCM Feature Extraction
Texture is a pattern of many similar elements or structures with strong or weak regularity in an image; it can be understood as variation and repetition of the image in gray space, or as recurring texture units and their arrangement rules. Texture features have good rotation invariance and noise immunity, but they deviate considerably when the resolution or lighting changes, and the texture reflected in a two-dimensional image is not the real texture of an actual object surface. In this section, however, the texture features extracted from the grayscale images are matched without changes of resolution or lighting and without any correspondence to real object surfaces, so texture features suit the task well. The common texture features are GIST, GLCM, Tamura texture features, and autoregressive texture features. This section uses the GLCM, the most widely used and most effective of these, to describe the spatial characteristics of the grayscale images.

The GLCM computation produces a gray-level co-occurrence matrix whose size is the square of the number of gray levels of the original image; to reduce the dimension of the GLCM and raise computational efficiency, the gray levels of the original image are compressed, a step explained in detail in the normalization part of the grayscale image generation section. The GLCM statistically describes the relationship between two gray levels at a certain distance, in adjacent image units, in a local area, or over the whole image. The elements of the GLCM represent the joint probability density P(i, j | d, θ) between gray levels: P(i, j | d, θ) is the frequency of starting from gray level i and ending at gray level j at distance d and direction θ, so grayscale images with similar texture will have similar frequencies for a given pair of gray levels.

Before the GLCM is generated, two groups of parameters need to be set: the distance and the direction. If the numbers of distances and directions are set to n and m, respectively, n × m GLCMs will be generated. As for the distance parameter, a small distance between the two pixels works best; 1 pixel is used below. As for the direction parameter, the commonly used angles are 45, 90, 135, and 180°; two symmetric matrices would result if 135 and 315° were both used, so the direction parameters are set only to 45, 90, 135, and 180°.

An example illustrates the GLCM generation process. The distance d and direction θ are set to 1 pixel and 135°. The 5 × 5 matrix (Fig. 6) is the gray-level matrix after normalization, in which the gray values are divided into 8 levels represented by the numbers 1 to 8. The 8 × 8 matrix (Fig. 9) is the gray-level co-occurrence matrix (GLCM). There are clearly two texture patterns: one consists of levels 7 and 5 (Fig. 7) and the other of level 3 (Fig. 8). For instance, in the statistics marked by the red line, the value at (7, 5) of the co-occurrence matrix (i.e., i = 7, j = 5) is 2, because there are two positions where a pixel of level 5 lies in the 135° direction of a pixel of level 7. Similarly, in the position indicated by the blue line, the value at (5, 3) of the GLCM is 3.
Fig. 6 Gray-level matrix
Fig. 7 One part of texture features
Fig. 8 Other part of texture features
Fig. 9 GLCM
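To make the construction above concrete, the following is a minimal Python sketch of GLCM counting with NumPy. The offset used to encode the 135° direction at distance 1 and the example matrix values are illustrative assumptions; they are not the exact data of Fig. 6.

```python
import numpy as np

def glcm(gray, levels=8, offset=(1, -1)):
    """Count co-occurrences of gray levels (i, j) at a fixed pixel offset.

    `gray` holds integer levels 1..levels (as after normalization);
    `offset` = (row step, column step) encodes distance and direction.
    """
    m = np.zeros((levels, levels), dtype=int)
    rows, cols = gray.shape
    dr, dc = offset
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Levels are 1-based in the text, so shift to 0-based indices.
                m[gray[r, c] - 1, gray[r2, c2] - 1] += 1
    return m

# An illustrative 5 x 5 gray-level matrix with 8 levels.
img = np.array([[7, 5, 3, 3, 3],
                [5, 7, 5, 3, 3],
                [3, 5, 7, 5, 3],
                [3, 3, 5, 7, 5],
                [3, 3, 3, 5, 7]])
print(glcm(img))  # 8 x 8 co-occurrence counts; the pair (i=7, j=5) is index [6, 4]
```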
2.2.3 Fusion Feature Extraction
In this section, the N-gram features and GLCM features extracted above for the same app are simply combined without further processing. Four kinds of fusion features are obtained by combining each of the four kinds of N-gram features with the one kind of GLCM features.
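Since the fusion is a plain concatenation, it reduces to a one-line NumPy operation; the vector sizes below are placeholders, not the actual feature dimensions used in the paper.

```python
import numpy as np

ngram_features = np.random.rand(500)  # placeholder 5-gram feature vector
glcm_features = np.random.rand(64)    # placeholder GLCM feature vector

# The fusion feature is the two vectors of the same app joined end to end.
fusion_features = np.concatenate([ngram_features, glcm_features])
print(fusion_features.shape)          # (564,)
```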
2.3 Machine Learning Classifiers

The machine learning algorithms used in our experiments include:

• K-nearest neighbors (KNN): In the feature space, if the majority of a sample's k nearest neighbors belong to one category, then this sample belongs to that category as well. The KNN algorithm decides the category of a sample only according to the categories of its one or several nearest neighbors.
• Logistic regression (LR): Logistic regression assumes the data obey a Bernoulli distribution. It uses maximum likelihood estimation and gradient descent to solve for the parameters and the categorical probability and classifies data through threshold filtering. Normally, the default threshold is 0.5: the prediction is 1 when the probability is equal to or larger than 0.5 and 0 when it is less than 0.5, and the threshold can also be set manually.
• Support vector machine (SVM): Normally, the decision boundary between categories in the feature space is not unique. SVM studies how to find the optimal decision boundary, one that not only separates the current samples but also generalizes well enough to separate unknown samples. The samples of different categories nearest to the decision boundary are called support vectors.
• Random forest (RF): Random forest is an ensemble learning model. Its base estimators are decision trees, and it adopts bagging to select samples. Every base estimator selects part of the sample data and features for training, which increases the diversity among base estimators. The classification result is decided by the votes of the decision-tree base estimators.

In this section, the four algorithms mentioned above and the features obtained from the previous process are applied to perform the classification task (see the sketch after this paragraph). To train a machine learning classifier, features from the training set are provided along with the class label of each training instance. After training the four classifiers, the measurement metrics of each classifier are evaluated, and the best classifier is selected by comparison. Then, features from the test set, along with the class labels of each test instance, are provided to the best classifier to determine whether each app is malware.
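A minimal scikit-learn sketch of this train-compare-select loop is shown below; the synthetic data stands in for the real feature matrices, and the classifier hyperparameters are left at library defaults as an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the extracted features (1 = malware, 0 = benign).
X, y = make_classification(n_samples=2000, n_features=100, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

classifiers = {
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
}
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)                       # train on labeled features
    scores[name] = f1_score(y_test, clf.predict(X_test))

best = max(scores, key=scores.get)                  # select the best classifier
print(best, round(scores[best], 4))
```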
3 Experiments and Results

3.1 Dataset and Experimental Environment

In our experiments, we used 1000 benign apps containing native libraries, collected from various app stores, and 1000 malware samples containing native libraries, collected from Koodous [9]. The machine configuration was Windows 10 with an Intel Core i7-7500U CPU and 16 GB of memory. The experiments were done in Python, and the machine learning classifiers were implemented using scikit-learn [10].
3.2 Measurement Metrics

• True positives (TP) are positive cases correctly predicted as positive.
• False positives (FP) are negative cases incorrectly predicted as positive.
• True negatives (TN) are negative cases correctly predicted as negative.
• False negatives (FN) are positive cases incorrectly predicted as negative.
• Accuracy is the proportion of correct predictions among all predictions:

Accuracy = (TP + TN) / Total

• Precision is the proportion of predicted positive cases that are truly positive:

Precision = TP / (TP + FP)

• Recall, or true positive rate (TPR), is the proportion of actual positive cases that are correctly identified:

Recall = TP / (TP + FN)

• False positive rate (FPR) is the proportion of actual negative cases that are incorrectly predicted as positive:

FPR = FP / (FP + TN)

• F1-score is the harmonic mean of precision and recall:

F1-score = (2 × Precision × Recall) / (Precision + Recall)
Table 1 Results of different N-gram features

g    Accuracy (%)    Precision (%)    Recall (%)    F1-score (%)
2    72.16           65.90            88.69         75.62
3    77.16           80.39            70.20         74.95
4    79.50           75.68            85.27         80.19
5    80.83           78.09            84.24         81.05
• Area under curve (AUC) is the area under the ROC curve, a graph showing the relationship between TPR and FPR; it describes the overall performance of a classifier. When AUC = 0.5, the classifier has no ability to classify. When 0.7 ≤ AUC < 0.8, the classifier has acceptable ability to classify. When 0.8 ≤ AUC < 0.9, the classifier has good ability to classify. When AUC ≥ 0.9, the classifier has excellent ability to classify.
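The metrics above can be computed directly from the confusion matrix; the sketch below does so with scikit-learn on toy predictions (the arrays are illustrative only).

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

def report(y_true, y_pred, y_score):
    # For binary labels {0, 1}, ravel() yields counts in the order TN, FP, FN, TP.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                   # true positive rate
    fpr = fp / (fp + tn)                      # false positive rate
    f1 = 2 * precision * recall / (precision + recall)
    auc = roc_auc_score(y_true, y_score)      # area under the ROC curve
    return accuracy, precision, recall, fpr, f1, auc

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
y_score = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6]      # e.g. predicted probabilities
print(report(y_true, y_pred, y_score))
```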
3.3 N-gram Features Selection

In this section, we used the N-gram features obtained from the previous process to train the machine learning SVM classifier and evaluated the performance of different N-gram features by comparison. We can observe from the results in Table 1 that the 5-gram features achieved an accuracy of 80.83%, a precision of 78.09%, a recall of 84.24%, and an F1-score of 81.05%, performing best in each metric. Thus, we selected the 5-gram features for the next experiment.
3.4 5-gram Features Results Analysis

In this section, we used the 5-gram features selected in the last experiment to train KNN, RF, LR, and SVM classifiers and evaluated the performance of the different classifiers by comparison. The ROC curves of the classifiers trained with 5-gram features in Fig. 10 show that the AUC of each classifier was greater than 0.8 and that the random forest classifier achieved the highest AUC of 0.9030. This means that each classifier had good classification ability and that the random forest classifier could classify malware better than the other classifiers using 5-gram features. We can observe from the results in Table 2 that the random forest classifier achieved the highest accuracy of 85.16%, the highest precision of 88.88%, and the highest F1-score of 83.90%. Thus, random forest is the best machine learning classifier trained with 5-gram features for detecting malware.
Fig. 10 ROC curve of different classifiers trained with 5-gram features
Table 2 Results of different classifiers trained with 5-gram features

       Accuracy (%)    Precision (%)    Recall (%)    F1-score (%)
KNN    80.50           82.28            76.36         79.21
RF     85.16           88.88            79.45         83.90
LR     81.33           78.84            84.24         81.45
SVM    80.83           78.09            84.24         81.05
3.5 GLCM Features Results Analysis

In this section, we used the GLCM features obtained from the previous process to train KNN, RF, LR, and SVM classifiers and evaluated the performance of the different classifiers by comparison. The ROC curves of the classifiers trained with GLCM features in Fig. 11 show that the AUC of each classifier was greater than 0.8 and that the random forest classifier achieved the highest AUC of 0.9050. This means that each classifier had good classification ability and that the random forest classifier could classify malware better than the other classifiers using GLCM features. We can observe from the results in Table 3 that the random forest classifier achieved the highest accuracy of 84.00% and the highest F1-score of 82.73%. Thus, random forest is the best machine learning classifier trained with GLCM features for detecting malware.
Fig. 11 ROC curve of different classifiers trained with GLCM features
Table 3 Results of different classifiers trained with GLCM features

       Accuracy (%)    Precision (%)    Recall (%)    F1-score (%)
KNN    81.83           88.60            71.91         79.39
RF     84.00           87.12            78.76         82.73
LR     74.66           71.47            79.79         75.40
SVM    72.33           67.21            84.24         74.77
3.6 Fusion Features Results Analysis

In the previous two experiments, the random forest classifier trained with 5-gram or GLCM features performed better than the other classifiers. Thus, in this section, the random forest classifier was applied to perform the classification task using the 5-gram fusion features obtained from the previous process. We can observe from the results in Table 4 that each metric received a small increase compared with the best metrics in the previous two experiments. The classifier trained with fusion features achieved a better accuracy of 86.33%, a better precision of 89.77%, a better recall of 81.16%, and a better F1-score of 85.25%. This means that the random forest classifier trained with fusion features could detect Android malware more effectively and was the best malware detection classifier in this paper.

Table 4 Results of the random forest classifier trained with fusion features

            Accuracy (%)    Precision (%)    Recall (%)    F1-score (%)
5 g+GLCM    86.33           89.77            81.16         85.25
4 Conclusion

In this paper, we propose an Android malware detection method based on native libraries. We extracted ARM assembly instructions and grayscale images from native libraries, extracted N-gram, GLCM, and fusion features from them, and trained and selected the best detection classifier. We performed four experiments, and the best detection classifier in this paper is the random forest classifier trained with 5-gram fusion features. Compared with the other classifiers in this paper, it achieved an accuracy of 86.33%, a precision of 89.77%, a recall of 81.16%, and an F1-score of 85.25%. As a result, our method is an effective way to detect malicious code in native libraries. In the future, we will add new features, such as features from native libraries running on the x86 platform, and apply more machine learning classifiers to the current method.

Acknowledgment This work was supported by the National Natural Science Foundation of China (Grant No. 61902057), the Fundamental Research Funds for the Central Universities (Grant Nos. N181703005, N181704004), and the Doctoral Start-up Funds of Liaoning Province, China (Grant No. 2019-BS-084).
References

1. Karbab, E.B., Debbabi, M., Derhab, A., Mouheb, D.: MalDozer: automatic framework for Android malware detection using deep learning. Digital Investigation 24, S48–S59
2. Fan, M., Liu, J., Luo, X., Chen, K., Tian, Z., Zheng, Q., Liu, T.: Android malware familial classification and representative sample selection via frequent subgraph analysis. IEEE Trans. Inf. Forensics Secur. 13(8), 1890–1905 (2018)
3. Chen, P., Zhao, R., Shan, Z., Han, J., Meng, X.: Android malware behavior similarity detection based on dynamic and static combination. Appl. Res. Comput. (5), 1534–1539
4. Faruki, P., Laxmi, V., Bharmal, A., Gaur, M.S., Ganmoor, V.: ANDROSIMILAR: robust signature for detecting variants of Android malware. J. Inf. Secur. Appl. 22, 66–80 (2015)
5. Lee, W.Y., Saxe, J., Harang, R.: SeqDroid: obfuscated Android malware detection using stacked convolutional and recurrent neural networks. In: Deep Learning Applications for Cyber Security, pp. 197–210. Springer, Cham
6. Lang, D., Ding, W., Jiang, H., Chen, Z.: Research on malicious code classification algorithm based on multi-feature fusion. Appl. Res. Comput. 39(8), 2333–2338
7. Sun, B., Zhuang, Y.: An Android malicious code detection mechanism based on native layer. Comput. Modernization 285(05), 5–10+16
8. IDA Pro (2019). Online. Available: https://www.hex-rays.com/products/ida/
9. Koodous (2019). Online. Available: https://koodous.com
10. Scikit-learn (2019). Online. Available: https://scikit-learn.org/
Fire Early Warning System Based on Precision Positioning Technology

Hanhui Lin, Likai Su, and Yongxia Luo
Abstract This paper first introduces China's "Internet Plus" policy guidance: the degree of informatization in most industries is high, but the foundation of fire information device construction is relatively weak compared with other industries, so it is necessary to strengthen research and system construction in this regard. Then, the components of a fire early warning system based on precision positioning technology, such as the smoke detector, environmental detector, and acoustic and optical alarm, are introduced; the fire early warning system based on precision positioning technology is designed; and the operation mode of the system is described. Finally, the paper summarizes the work and explores the next research direction.

Keywords Targeting technology · Fire protection · Early warning system
H. Lin · L. Su (B)
Center of Faculty Development and Educational Technology, Guangdong University of Finance and Economics, Guangzhou, China
e-mail: [email protected]
H. Lin · Y. Luo
College of Educational Information Technology, South China Normal University, Guangzhou, China
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_22

1 Research Background

Under China's "Internet Plus" policy guidance, information construction in various industries has been fruitful, and "Internet + fire protection" has also received attention. However, because the foundation of fire information device construction is relatively weak, the construction level of fire early warning information devices lags behind, and the level at which Internet of Things and information technology serve the public needs to be improved. For fire safety hazards, practice is still stuck at raising the alarm only after a dangerous situation has appeared, relying on whoever discovers the danger to call the police; as a result, hazards cannot be eliminated in the bud, and in an emergency the caller finds it very difficult to describe the specific location. In order to better serve the people, ensure the safety of people's property, and quickly and accurately obtain the specific location of fire hazards, there is an urgent need to use Internet of Things and information technology to establish a fire early warning system based on precision positioning technology.
2 Related System Analysis

Domestic and foreign experts and scholars have carried out research related to fire alarm systems. Sowah et al., in "A Web-based communication software module of a real-time multi-sensor fire detection and notification system," proposed a Web-based fire alarm system for remote transmission of fire alarms that allows fire departments and rescue personnel to receive real-time fire conditions [1]. Li Shibin et al., in "Fire Safety Management Information Monitoring System Research and Implementation," designed a fire safety management information monitoring system for use in the Web environment, supporting alarm monitoring, fault follow-up, supervision and management, and alarm statistics and analysis [2]. Lu Wei et al., in "Ping An campus intelligent fire monitoring system design and implementation," designed an edge-based terminal perception system and used the GSM cellular network for abnormal-signal alarms, but the perception terminals consume energy relatively quickly, making practical application difficult [3]. Qin Long et al., in "GSM-based environmental monitoring system implementation," proposed a fire warning system based on the GSM cellular network, but GSM communication costs are high, so it is also difficult to apply at large scale [4].

We can see that existing systems, because they cannot obtain the exact location of an accident in time, or their sensing terminals consume too much energy, or their production costs are high, cannot be promoted effectively at large scale. Therefore, it is necessary to research and develop a fire warning system with a high degree of automation, sensitivity, accurate positioning, and a friendly price, which can provide timely and accurate early warning information for fire management departments and users and help them eliminate fire hazards in the bud. At the same time, in a dangerous situation, the system should provide the fire management department with accurate geographic information and the severity of the danger, providing timely and accurate data for handling the danger and achieving precise fire protection. Research and development of a fire early warning system based on precision positioning technology is of great significance for improving the quality of fire services and helping users eliminate fire hazards in a timely manner.
3 Introduction to the Components of Fire Early Warning System

3.1 Environmental Detectors

Environmental detectors are among the most important components of a fire detection system. Each detector contains at least one sensor that, continuously or at a certain frequency, detects the various physical and chemical phenomena accompanying the combustion of matter and provides a suitable signal to the control and indication equipment. Its basic function is to sense the gas, smoke, heat, light (flame), and other physical and chemical parameters that characterize a fire and convert them into electrical signals that a computer can receive for analysis and processing. Environmental detectors generally consist of sensitive-component sensors, processing units, and judgment and indication circuits; the sensitive-component sensors monitor one or more fire parameters, make an effective response, process the result electronically or mechanically, and convert it into an electrical signal [5].
3.2 Smoke Detectors

Smoke detectors, also known as smoke-sensitive fire detectors or smoke sensors, are mainly used in fire protection systems and are also used in the construction of security systems; they are a typical civilian fire-protection device for protected spaces. A smoke detector achieves fire prevention mainly by monitoring the smoke concentration. Internally it uses an ionization smoke sensor, a technologically advanced, stable, and reliable sensor widely used in a variety of fire alarm systems, whose performance is far better than that of gas-resistance-type fire alarms [6].
3.3 Sound and Light Alarm

A sound and light alarm is an alarm signaling device configured to meet the customer's particular requirements for alarm loudness and installation position. Explosion-proof sound and light alarms are suitable for installation in explosive gas environments of class IIC, temperature group T6; they can also be used in Zone 1 and Zone 2 explosion-proof sites in the petroleum, chemical, and other industries with explosion-proof requirements, as well as in open and outdoor locations. The non-coding type can be used with any manufacturer's fire alarm controller at home and abroad. When an accident, fire, or other emergency occurs at a production site, the fire alarm controller sends a control signal to start the sound and light alarm circuit, which issues a sound and light alarm signal to complete the alarm [7].
4 Fire Early Warning System Design

In order to overcome at least one defect of the existing systems described above, this work provides a fire early warning system based on precision positioning technology, which has a high degree of automation, sensitivity, and accurate positioning and can provide timely and accurate early warning information for fire management departments and users. The technical solution adopted is as follows: a fire warning system based on precision positioning technology includes a wireless network connection module connected to environmental detectors and smoke detectors, where the environmental detectors detect flammable gas concentrations and the smoke detectors detect smoke concentrations. The wireless network connection module is connected to an alarm device and to a data processing storage module, and the data processing storage module is connected to the user terminal and the management terminal.

The environmental detector detects the concentration of flammable gas and transmits the monitoring data to the wireless network connection module, which forwards it to the data processing storage module. The smoke detector achieves fire warning by monitoring the smoke concentration and likewise transmits its monitoring data through the wireless network connection module to the data processing storage module. The alarm devices are used to signal monitored hazards. The data processing storage module sends early warning signals to the user terminal and the management terminal.

The alarm device is equipped with a sound and light alarm, an alarm with both sound and light signals: when a flammable gas leak exceeds the lower explosion limit, the sound and light alarm issues a sound and light prompt and transmits the measured value through the wireless network connection module. The alarm device is also equipped with a leakage alarm; in the event of leakage, the leakage alarm reports it through the wireless network connection module. The Wi-Fi module is connected to an emergency button, so that when a person discovers a hidden danger or dangerous situation, the data can be transmitted through the wireless network connection module by pressing the emergency button, improving the response flexibility of the system. The data processing storage module is equipped with a processor and memory; the memory stores the data, and the processor processes it.
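The warning logic of this design reduces to threshold checks on the detector readings followed by location-tagged notifications to both terminals. The sketch below illustrates that flow in Python; the function names, threshold values, and location format are hypothetical and not part of the system's specification.

```python
GAS_ALARM_FRACTION = 0.25  # assumed alarm point, fraction of the lower explosion limit
SMOKE_THRESHOLD = 0.10     # assumed smoke-concentration alarm point

def notify(terminal, alert):
    # Stand-in for the push through the wireless network connection module.
    print(f"[{terminal}] {alert}")

def check_reading(sensor_type, value, location):
    """Compare one detector reading against its threshold and raise alerts."""
    limit = GAS_ALARM_FRACTION if sensor_type == "gas" else SMOKE_THRESHOLD
    if value >= limit:
        alert = {"type": sensor_type, "value": value, "location": location}
        notify("user terminal", alert)
        notify("management terminal", alert)

check_reading("gas", 0.31, "Building A, Floor 2, Room 203")
```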
5 Operation of Fire Early Warning System

As shown in Fig. 1, the system is a fire warning system based on precision positioning technology that includes wireless network connection module 5, a TP-LINK module, specifically a TL-WDR5620 1200M 5G dual-frequency intelligent wireless router. Wireless network connection module 5 is connected to environmental detector 3 and smoke detector 1; environmental detector 3 is a Qinkai environmental probe (color screen, GSM), and smoke detector 1 is a Panasonic smoke detector (JYT-GW-PEW001/B).

Fig. 1 System operation model

Environmental detector 3 is used to detect the concentration of flammable gases, and smoke detector 1 is used to detect the concentration of smoke. When a flammable gas leak occurs, environmental detector 3 detects the change in flammable gas concentration and forwards the relevant monitoring information through wireless network connection module 5 to user terminal 9 and management terminal 7. When a fire occurs and smoke appears, smoke detector 1 detects the change in smoke concentration and likewise transmits the relevant monitoring information through wireless network connection module 5 to user terminal 9 and management terminal 7.

As shown in Fig. 1, the alarm device is equipped with sound and light alarm 4 (KERUI P6). When a flammable gas leak exceeds the lower explosion limit, sound and light alarm 4 issues a sound and light prompt and transmits the value through wireless network connection module 5 to user terminal 9 and management terminal 7, enabling monitoring. The alarm device also carries leakage alarm 6, a Qinkai leakage alarm; in the event of leakage, leakage alarm 6 reports the situation through wireless network connection module 5 to user terminal 9 and management terminal 7, realizing the monitoring function.

As shown in Fig. 1, wireless network connection module 5 is further connected to emergency button 2, a Siemens fire emergency button (Ruizhi series); when a person discovers a hidden danger or dangerous situation, pressing emergency button 2 transmits the data through wireless network connection module 5, improving the response flexibility of the system. The management terminal is a personal computer or smart terminal, such as a smartphone or tablet. The management terminal and the user's intelligent terminal are connected to the various monitoring devices through wireless network connection module 5 to achieve the goal of fire warning. The wireless network connection module is also connected to data processing storage module 8, which is used to store and process data.
6 Conclusion

In recent years, all industries in our country have achieved fruitful construction results in terms of informatization and intelligence, but the foundation of fire information construction is relatively weak compared with other industries, and the construction level of fire early warning information devices lags behind. In this context, the research team developed a fire warning system based on precision positioning technology. Research and analysis show that the system can provide fire management departments and users with timely and accurate fire warning information, helping them eliminate fire hazards in the bud as far as possible. It provides accurate geographic information and a severity report for fire management departments in the event of a dangerous situation and provides timely and accurate data for handling the danger, so as to achieve precise fire protection. The next study will focus on the practical application of the system and optimize the system iteratively.

Acknowledgements This work was supported by the Education Project of Industry University Cooperation (201801186008), the Guangdong Provincial Science and Technology Program (2017A040405051), the Higher Education Teaching Reform Project of Guangdong in 2017, the "Twelfth Five-Year" Plan Youth Program of National Education Information Technology Research (146242186), the Undergraduate Teaching Quality and Teaching Reform Project of Wuyi University (JX2018007), and the Features Innovative Program in Colleges and Universities of Guangdong (2018GXJK177, 2017GXJK180).
References

1. Sowah, R., Ofoli, A.R., Krakani, S., et al.: A web-based communication module design of a real-time multi-sensor fire detection and notification system. In: 2014 IEEE Industry Applications Society Annual Meeting (2014)
2. Li, S., Jin, Y., Yao, W., et al.: Research and implementation of fire safety management information monitoring system. Comput. Eng. Appl. 53(18), 213–217 (2017)
3. Lu, W., Bai, Y., Chen, H.: Design and implementation of an intelligent fire control monitoring system in Ping'an Campus. Electron. Meas. Technol. 36(5), 97–100 (2015)
4. Qin, L., Qian, L., Wang, Y.: Implementation of environmental monitoring system based on GSM network. Comput. Eng. Des. 27(16), 1033–1035 (2006)
5. Tian, S., Zhang, T.: An intelligent fire detector system design discussion. Fire Sci. Technol. 36(10), 1407–1409 (2017)
6. Qi, W.: The achievements of the wonderful fire detector's three smoke alarms. Western China Sci. Technol. 11 (2005)
7. Zhang, F.: The device runs a safety warning sound light alarm. Chin. Foreign Wine Ind. Beer Technol. 03, 55–58 (2017)
The Construction of Learning Resource Recommendation System Based on Recognition Technology

Hanhui Lin, Yiyu Huang, and Yongxia Luo
Abstract Under the guidance of Education Informatization 2.0 and China's Education Modernization 2035, teaching information resources will show exponential growth. It is difficult to manually process and recommend to learners rich media learning resources including text, pictures, audio, and video. This study makes an in-depth analysis of existing learning resource recommendation systems and shows that they are unsatisfactory and unable to accurately recommend learning resources for learners. Therefore, it is urgent to establish a recommendation system for rich media learning resources based on recognition technology. In this study, we analyze the key technologies, construct a learning resource recommendation system based on recognition technology by combining the feature analysis of these technologies, and introduce the operation mode of this system in detail, finally summarizing the research and discussing the next steps. Keywords Media · Learning resources · Recommendation
H. Lin
Center of Faculty Development and Educational Technology, Guangdong University of Finance and Economics, Guangzhou, China
Y. Huang (B)
School of Humanities and Communication, Guangdong University of Finance and Economics, Guangzhou, China
e-mail: [email protected]
H. Lin · Y. Luo
College of Educational Information Technology, South China Normal University, Guangzhou, China
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_23

1 Research Background

At present, under the guidance of educational informatization policy, information construction in various industries has been fruitful. "Internet + education" has been booming, and network learning resources show exponential growth. However, it is difficult to classify and process them manually, especially the multifarious learning resources in the form of text, images, audio, video, and so on; in this situation, learners easily get lost. On the other hand, the strategic task of China's Education Modernization 2035 clearly calls for "constructing the intelligent campus, building intelligent teaching, management, and service platforms, and using modern technology to accelerate the reform of the personnel training mode, to achieve the organic combination of large-scale education and personalized training." In order to meet the personalized learning needs of this strategic task, it is urgent to research and develop a learning resource recommendation system based on recognition technology, which can quickly and accurately recommend learning resources to learners, so that learners can obtain the resources they like and need from the sea of resources.
2 Literature Review

Thorat et al. made a comparative analysis of content-based recommendation and collaborative filtering recommendation and proposed a recommendation model for intelligent learning systems that acquires learners' learning feedback [1]. Wang et al. realized a personalized recommendation function by matching a learner's feature model with course content labels [2]. Through modeling users' learning preferences and domain knowledge, Professor Yu Shengquan of Beijing Normal University proposed a "learning meta-platform" construction scheme with personalized content recommendation [3]. Yang Lina et al. attempted to solve the personalized resource recommendation problem of intelligent learning systems by using a multi-role agent cooperation framework [4]. Wang Yonggu et al. proposed a personalized recommendation method for learning resources based on collaborative filtering technology [5]. He Jieyue et al. showed that collaborative filtering algorithms can effectively reduce the complexity of model construction but suffer from problems such as sparse matrices and cold start [6]. Liu Zhongbao et al. proposed a bipartite-network hybrid recommendation method for learning resources based on heat conduction and material diffusion theory [7].

We can see that learning resource recommendation has become a focus of education and teaching research, but existing work has not given full play to emerging rich media recognition technology and cannot meet the current demand for recommending vast learning resources. Learning resource recommendation systems based on recognition technology are scarcely mentioned. It is therefore necessary to study technology that connects text recognition, image recognition, speech recognition, and video recognition, so as to accurately identify learning resources, extract their features, and then match this information. The benefit is to recommend learning resources to learners accurately, save the time learners spend selecting learning resources, and improve learning efficiency.
3 Key Technologies

3.1 Text Information Extraction

There are many methods of information extraction. This system uses the latent Dirichlet allocation (LDA) model. LDA is a document topic generation model, also known as a three-layer Bayesian probability model, comprising a three-layer structure of words, topics, and documents [8]. "Generative model" means that every word in an article is considered to be obtained through the process of "selecting a topic with a certain probability and then selecting a word from that topic with a certain probability." The document-to-topic distribution and the topic-to-word distribution both obey multinomial distributions.
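As a concrete illustration, the sketch below fits a small LDA model with scikit-learn and reads off each document's topic mixture; the toy corpus and the choice of two topics are assumptions for demonstration only.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "machine learning course on neural networks",
    "history of modern art and oil painting",
    "deep learning tutorial with python examples",
]
counts = CountVectorizer().fit_transform(docs)  # word counts per document
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_mix = lda.fit_transform(counts)           # per-document topic distribution
print(topic_mix.round(2))
```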
3.2 Identification of Picture Resources

Image recognition refers to the use of computer image processing, analysis, and understanding to identify targets and images of various patterns. In typical industrial use, an industrial camera takes pictures, and software then performs further recognition and processing according to the grayscale differences of the pictures; image recognition software is represented abroad by Kangnaishi and domestically by graph intelligence. In geography, it is a technique for classifying remote sensing images. In this system, image teaching resources are recognized and their text information is extracted.
3.3 Speech Resource Identification

Speech recognition, also known as automatic speech recognition (ASR), aims to convert the lexical content of human speech into computer-readable input, such as keystrokes, binary codes, or character sequences. It contrasts with speaker recognition and speaker verification, which attempt to identify or confirm the speaker who made the sound rather than the words contained therein. People pay more and more attention to voice data. Speech recognition is an interdisciplinary subject; over nearly twenty years, speech recognition technology has made significant progress and has begun to move from the laboratory to the market. In this system, speech teaching resources are recognized and extracted into text information.
3.4 Video Resource Identification

Through an embedded intelligent analysis module, the video stream is identified, detected, analyzed, and filtered for interference, and abnormal conditions in the video are marked with targets and tracks. The intelligent video analysis module is an algorithm based on artificial intelligence and pattern recognition principles. In this system, video teaching resources are recognized and their text information is extracted.
3.5 Content User Characteristics Matching

User information and learning resource contents are identified, user characteristics are analyzed, and specific learner groups are identified based on the analyzed characteristics. First, the identified learners are described, and a set of keyword lists is abstracted according to the characteristics of the learners: the learners' feature words. Second, the detected learner is identified and the user node belonging to that learner is found. In the learner behavior filtering process, regular-expression string matching is adopted to compare the learner's personal attributes with the features of the learning resource content. If a learner's personal attributes and the features of a learning resource match successfully, the learning resource is recommended to the corresponding learner.
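A minimal sketch of this regular-expression matching step is shown below; the learner feature words and the resource description are hypothetical.

```python
import re

learner_keywords = ["python", "machine learning"]   # learner feature words
resource_features = "an introductory machine learning course in Python"

def matches(keywords, features):
    # Case-insensitive whole-word match of each learner feature word
    # against the extracted resource content features.
    return any(re.search(rf"\b{re.escape(k)}\b", features, re.IGNORECASE)
               for k in keywords)

if matches(learner_keywords, resource_features):
    print("recommend this resource to the learner")
```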
4 System Construction

This system provides a learning resource recommendation system based on recognition technology to overcome the practical problems of excessive rich media learning resources and the difficulty of selecting appropriate ones. It is characterized by quick classification of multi-format learning resources, fast matching of learning resources with learners, and accurate recommendation of learning resources. The technical scheme adopted is as follows: a learning resource recommendation system based on recognition technology includes a data processing module. The data processing module is connected to the text recognition module, the image resource recognition module, the speech recognition module, the video resource recognition module, and the wireless network connection module. The data processing module is connected to the content feature extraction module and the user feature extraction module. The data processing module is connected to the content user feature matching module, and the content user feature matching module is connected to the user intelligent terminal.
The system is equipped with a data processing module, which processes and stores data. The data processing module is connected to a text resource identification module, which identifies and classifies text resources in Word, TXT, and other formats, stored through the data processing module. The data processing module is connected to the image resource identification module, which identifies and classifies JPG, PNG, JPEG, GIF, and other multi-format image resources and stores them through the data processing module. The data processing module is connected to the voice resource recognition module, which identifies and classifies MP3, WMA, M4R, and other multi-format voice resources and stores them through the data processing module. The data processing module is connected to the video resource identification module, which identifies and classifies video resources in MP4, WMV, AVI, and other formats, stored through the data processing module. The data processing module connects to the server or the Internet through the wireless network connection module. The data processing module is connected to the content feature extraction module, which comprehensively analyzes and extracts resource features from text, pictures, voice, and video and stores them through the data processing module. The data processing module is connected to the user feature extraction module, which comprehensively analyzes learner features and stores them through the data processing module. The data processing module is connected to the content user feature matching module, which compares the resource content features with the learner features and recommends the results to the learner's intelligent terminal. The content user feature matching module is connected to the user intelligent terminal, which receives and displays the learning resources recommended by the matching module.
5 System Operation

The operation process of this system is described in Fig. 1. The system is a learning resource recommendation system based on recognition technology, including data processing module 5, which is equipped with a processor and memory; the memory stores data, and the processor processes it.

Fig. 1 System operation model

As shown in Fig. 1, data processing module 5 is connected to text resource identification module 3, which identifies text resources in Word, TXT, and other formats, stored through data processing module 5. Data processing module 5 is connected to image resource identification module 1, an OCR module used to identify JPG, PNG, JPEG, GIF, and other multi-format image resources, convert them into file information, and store them through data processing module 5. Data processing module 5 is connected to voice resource recognition module 4, an AI speech recognition module used to identify MP3, WMA, M4R, and other multi-format voice resources, convert them into file information, and store them through data processing module 5. Data processing module 5 is connected to video resource identification module 6, a video identification module that converts MP4, WMV, AVI, and other multi-format video resources into file information and stores them through data processing module 5.

As shown in Fig. 1, data processing module 5 is connected to content feature extraction module 7, which is equipped with an LDA-model-based extraction module used to extract features from text, picture, voice, and video resources in preparation for matching. Data processing module 5 is connected to user feature extraction module 8, which is equipped with a term frequency-inverse document frequency (TF-IDF) extraction module used to extract user features in preparation for matching. Data processing module 5 is connected to wireless network connection module 2, which realizes data acquisition and transmission. Data processing module 5 is connected to content user feature matching module 9, which adopts a regular-expression string matching module to compare resource content features with learner features. Content user feature matching module 9 is connected to user intelligent terminal module 10, which is equipped with display and interaction modules and receives and displays the learning resources recommended by matching module 9.
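The TF-IDF user feature extraction mentioned above can be sketched with scikit-learn; the per-learner documents below are hypothetical stand-ins for, say, the concatenated titles of resources a learner has viewed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

learner_docs = [
    "python programming data analysis pandas",
    "oil painting color theory art history",
]
vec = TfidfVectorizer(max_features=10)
weights = vec.fit_transform(learner_docs)   # TF-IDF weight matrix
# The highest-weighted terms can serve as each learner's feature words.
print(vec.get_feature_names_out())
```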
6 Conclusion

Based on an in-depth study of previous recommendation systems, it can be seen that rich media learning resource recommendation systems based on recognition technology have rarely been studied. This system uses text recognition, image recognition, speech recognition, and video recognition for content identification; the identified content information is used by the feature extraction module, which can accurately extract the characteristics of text, image, audio, and video information. In addition, the user feature extraction module can comprehensively analyze learners' learning interests and learning needs, and the matching of learning resources with user features recommends learning resources accurately for learners, saving learners' time in selecting learning resources and improving learning efficiency. The learning resource recommendation system based on recognition technology in this study still lies at the conceptual research and construction stage. Next, the effects and shortcomings of the system in practical application will be studied, and the system will be optimized iteratively.

Acknowledgements This work was supported by the Education Project of Industry-University Cooperation (201801186008), the Guangdong Provincial Science and Technology Program (2017A040405051), the Higher Education Teaching Reform Project of Guangdong in 2017, the "Twelfth Five-Year" Plan Youth Program of National Education Information Technology Research (146242186), the Undergraduate Teaching Quality and Teaching Reform Project of Wuyi University (JX2018007), and the Features Innovative Program in Colleges and Universities of Guangdong (2018GXJK177, 2017GXJK180).
References

1. Thorat, P.B., Goudar, R.M., Barve, S.S.: Survey on collaborative filtering, content-based filtering and hybrid recommendation system. Int. J. Comput. Appl. 110(4), 31–36 (2015)
2. Wang, L.: Cross-domain personalized learning resources recommendation method. Math. Probl. Eng. 5, 206–232 (2013)
3. Yu, S., Chen, M.: The characteristics and the trend of ubiquitous learning resources construction, exemplified by the learning cell resource model. Mod. Distance Educ. Res. 6, 14–22 (2011)
4. Yang, L., Liu, K., Yan, Z.: Research on individualized learning resource recommendation under Agent Cooperative Framework of case-based reasoning. China Educ. Technol. 12, 105–109 (2009)
5. Wang, Y., Qiu, F., Zhao, J., Liu, H.: Research on personalized recommendation of learning resources based on collaborative filtering recommendation technology. J. Distance Educ. 3, 66–71 (2011)
6. He, J., Ma, B.: A collaborative filtering recommendation algorithm for Boltzmann machine by using the real-value condition of social relationship. Chin. J. Comput. 1, 183–195 (2016)
7. Liu, Z., Li, H., Song, W., Kong, X., Li, H., Zhang, J.: Research on mixed recommendation method of learning resources based on bipartite network. e-Educ. Res. 39(8), 385–390 (2018)
8. Momtazi, S.: Unsupervised Latent Dirichlet Allocation for supervised question classification. Inf. Process. Manag. 54(3), 380–393 (2018)
Review of Ultra-Low-Power CMOS Amplifier for Bio-electronic Sensor Interface

Geetika Srivastava, Ashish Dixit, Anil Kumar, and Sachchidanand Shukla
Abstract Deep brain stimulation (DBS) was approved by the Food and Drug Administration (FDA) in 2018 for the treatment of epilepsy. It is a procedure in which a neurostimulator is placed deep inside the brain to target specific sections of the brain and provide electrical stimulation. DBS has proven very effective in the treatment of various brain-related disorders such as Parkinson's disease, tremors, and movement disorders. With nearly 7.5% of the Indian population suffering from one or another type of neurological disorder, there is a pressing need to focus on fast and affordable diagnosis of neural disorders together with the newest methods of treatment. DBS relies on brain signal processing and stimulation. Advancements in brain signal processing, integrated with the predictive models of AI and ultra-low-power VLSI, may lead to the next level of human–computer interface (HCI) with complete mobility. The main challenges of any bio-potential sensing are the sensitivity of the bio-electronic sensor array and its integration density, the amplifier array in such a low-frequency high-density sensor configuration, and an appropriate transmitter with large data handling capability and signal integration. Conventional methods of neural signal recording are critical in modern neuroscience research and emerging neural prosthesis programs. Neural recording requires precise, low-noise amplifier systems to acquire and condition the weak neural signals transduced through electrode interfaces. Neural amplifiers and amplifier-based systems are available commercially or can be designed in-house and fabricated using integrated circuit (IC) technologies, resulting in very large-scale integration or application-specific integrated circuit solutions. IC-based neural amplifiers are now used to acquire untethered/portable neural recordings, as they meet the requirements of a miniaturized form factor, light weight, and low power consumption. Furthermore, such miniaturized and low-power IC neural amplifiers are now being used in emerging implantable neural prosthesis technologies. This review focuses on neural amplifier-based devices and is presented in two interrelated parts. First, neural signal recording is reviewed and practical challenges are highlighted, featuring current amplifier designs with increased functionality and performance without penalties in chip size and power. Second, applications of IC-based neural amplifiers in basic science experiments (e.g., cortical studies using animal models), neural prostheses (e.g., brain/nerve machine interfaces), and treatment of neuronal diseases (e.g., DBS for treatment of epilepsy) are highlighted. This review also deals with different challenges in neural recorder design, from amplifier design to signal processing techniques, as well as on-chip/off-chip computation challenges.

Keywords Neural amplifier · VLSI · EEG

G. Srivastava (B) · S. Shukla
Department of Physics and Electronics, Dr. Rammanohar Lohia Avadh University Ayodhya (Uttar Pradesh), Ayodhya, Uttar Pradesh, India
e-mail: [email protected]
S. Shukla
e-mail: [email protected]
A. Dixit · A. Kumar
Department of Electronics and Communication Engineering, Amity University Uttar Pradesh, Ayodhya 224028, India
e-mail: [email protected]
A. Kumar
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_24
1 Introduction

Closed-loop therapeutic devices are required to efficiently sense, amplify, and process neural signals using proper signal conditioning and low-noise amplifiers, and then to generate electrical stimulation. Even the strongest brain signals, those related to action potentials (APs), need correct discrimination from the background activity. The on-chip decision control block has many limitations, as it is more power-hungry and requires frequent power-up, which restricts mobility. Also, the on-chip controller has limited memory, and large data volumes cannot be processed (Fig. 1).
Fig. 1 Block diagram of closed-loop on-chip neural therapeutics (scalp electrodes → front-end amplifier array → gain multiplier amplifiers → analog-to-digital converter → on-chip data processor → decision block → control signal generator → signal conditioner → stimulator for scalp electrodes)
The complex AI algorithms that need large memory for implementation cannot be used with an on-chip processor. However, an off-chip processing system can implement complex signal processing algorithms and, through neural-network processing capability, can take good decisions and provide better control (Fig. 2). The major limitation of such an off-chip processing block is its signal transmission protocol: transmitting such high-dimensional data requires large power, and although the complex decisions are taken off-chip, an on-chip processor is still needed for proper data synchronization and demultiplexing. The efficiency of AI-based neural prosthetics depends greatly on efficient processing of these signals. The overall system performance and its utility depend on the power budget.
Fig. 2 Block diagram of closed-loop off-chip neural therapeutics (scalp electrodes → front-end amplifier array → gain multiplier amplifiers → analog-to-digital converter → wireless data transmission to an off-chip data processor and control signal generator; control signals return wirelessly through an on-chip controller, digital-to-analog converter, and signal conditioner to the stimulator for scalp electrodes)
Fig. 3 Frequency and amplitude range of human bio-potentials
With the availability of high-end processing, there is a need to acquire as much data as possible to improve efficiency. High-channel-count devices are required in complex brain analysis to record the smallest neuron signals, but a higher channel count always results in a more power-hungry system and may lead to poor data integration.
2 On-Chip Front-End Neural Amplifier

CMOS-based VLSI circuits are currently the best candidates for neural signal recording and processing and may solve the power consumption problems of multichannel neural recording implants. Although separate circuits are available for the amplifier block and the transmitter block, custom-designed neural recording ICs have many advantages in terms of miniaturized form factor, small weight, low power budget, and improved noise immunity. Neural amplifier design involves the amplification of bio-potentials: signals that are small in amplitude and range from 1 Hz to a few tens of kHz (Fig. 3).
3 General Neural Amplifier System Architecture and Design Considerations

Input to neural amplifiers comes either from scalp electrodes or from subdermal electrodes. Both types of electrodes have advantages and drawbacks, but each is suitable for specific applications.
Fig. 4 General architecture of the neural signal processing unit (sensor electrode → electrode multiplexer stage → front-end amplifier → cascaded amplifiers → voltage swing adjustment block → ADC → on-chip processing unit → transmitter circuit)
Surface electrodes are either dry or wet electrodes. The signals acquired from these electrodes first pass through a preamplifier stage, often called the front-end amplifier. The signal then passes through a few stages of cascaded amplifiers for additional gain. The cascaded amplifier block is followed by the digitization block, which digitizes the acquired signal for the processing block, where the signals are processed before being sent to the transmitter stage (Fig. 4). For further analysis, the amplified, digitized signal is processed either on-chip or off-chip. AI-based algorithms require high hardware specifications and hence favor off-chip processing. To enable system mobility, the signals must be transmitted wirelessly, and the protocol adopted plays an important role. In implanted closed-loop systems, heat dissipation must also be low, as even a temperature rise of more than 2 °C can cause severe tissue damage. Thus, a low-power neural amplifier customized for such systems is highly recommended. Other important requirements for such an amplifier are small chip area and ESD protection.
4 Power Reduction Strategies

VLSI system design for such neural sensing and closed-loop control requires optimization at both the system level and the component level. For system-level optimization, resources are shared among the various blocks; hence, the amplifiers are associated with a large sensor array rather than assigning one amplifier to each sensor electrode. For component-level optimization, the power supply voltage is reduced. Recently, many researchers have reported interfacing a large number of EEG recording electrodes to an amplifier array one-tenth their number; however, assigning more than four electrodes with such spatially blurred sources to one amplifier raises crosstalk issues that still need to be addressed. The power gating technique and the inclusion of sleep transistors are other very common approaches to reducing power consumption in low-power amplifier design.
Another important challenge is the varying range of background noise, which depends on the recording site, the environment, motion, and, moreover, the poor contact of non-invasive electrodes. Many innovative tunable background-noise cancelation techniques have been reported recently. Using different supply voltages for different operating blocks is another possible solution for optimizing power reduction; however, it requires complex system adjustments and also puts a supply overhead on the system. Dynamic voltage swing adjustment is another approach, which adapts the amplifier output swing to the limited voltage range available to the ADC.
5 The Neural Amplifier Design

The performance of the front-end amplifier largely governs the overall system performance. The front-end amplifier should have a gain in the range of 200–10,000, depending on the CNS signal selected for processing, and it must be a low-noise amplifier whose introduced noise level is much lower than the electrode noise. The operational transconductance amplifier (OTA) is a voltage-controlled current source that produces a current proportional to the applied differential voltage, and it is the most suitable building block for the neural signal processing unit. Although the standard circuit topology may be suitable for driving capacitive loads, the sizing of the transistors is critical for achieving low noise at low current levels.

In the feedback path of a neural signal amplifier, a high-value resistor is required, often on the order of a few gigaohms. Such a high feedback resistance is needed for an extremely low cutoff frequency, below 1 Hz, suitable for neural signals. The difficulty of fabricating such large resistors leads to a different class of resistor made from active components such as MOSFETs; these high-value active resistors, which require little on-chip area, are called pseudo-resistors. A variety of pseudo-resistor architectures are available, differing in their tuning capacity and operating frequency range. Input capacitors couple the input of the shunt feedback amplifier, and the feedback is provided through feedback capacitors (Fig. 5). The closed-loop gain is the ratio of the input capacitance to the feedback capacitance; here, the −3 dB low cutoff frequency is determined by the pseudo-resistor. The low-power low-noise CMOS amplifier in [1] employs two different MOSFET pseudo-resistors, used for amplification of low-frequency signals down to the milli-hertz range. Capacitors C1 are used for coupling and C2 for feedback at the input of the OTA (Fig. 6). The ratio C1/C2 governs the mid-band gain, and when the two input capacitors are much larger than the load capacitors, the bandwidth depends on the transconductance of the OTA [1]. A PMOS device is used as the pseudo-resistor to provide proper negative dc feedback (Fig. 7). There are many types of pseudo-resistors available for providing dc negative feedback.
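As a quick numerical check of these closed-loop relations, the sketch below evaluates the mid-band gain C1/C2 and the pseudo-resistor-set low cutoff 1/(2πRC2) for illustrative component values; the values are assumptions, not taken from any cited design.

```python
import math

C1 = 20e-12       # input coupling capacitance, 20 pF (assumed)
C2 = 200e-15      # feedback capacitance, 200 fF (assumed)
R_PSEUDO = 1e12   # effective pseudo-resistor value, ~1 Tohm (assumed)

midband_gain = C1 / C2                          # closed-loop gain = C1/C2
f_low = 1 / (2 * math.pi * R_PSEUDO * C2)       # -3 dB low cutoff frequency

print(f"gain = {midband_gain:.0f} ({20 * math.log10(midband_gain):.1f} dB)")
print(f"low cutoff = {f_low:.3f} Hz")           # ~0.8 Hz, i.e. below 1 Hz
```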
Fig. 5 Front-end differential amplifier architecture
Fig. 6 Low-power low-noise CMOS amplifier
6 Neural Amplifier Performance Metrics

The noise efficiency factor (NEF) measures the noise an amplifier produces compared to a single BJT transistor operating on the same bias current. It is defined as

NEF = V_{in,rms} \cdot \sqrt{\frac{2 I_{tot}}{\pi \cdot U_t \cdot 4\kappa T \cdot BW}}

where V_{in,rms} is the input-referred RMS noise voltage, I_{tot} the total amplifier bias current, U_t the thermal voltage, κ Boltzmann's constant, T the absolute temperature, and BW the effective noise bandwidth of the amplifier.
Fig. 7 DDA-based approach of pseudo-resistor amplifier architecture
The power efficiency factor (PEF), proposed by Muller et al., is another measure of amplifier performance; it also accounts for the supply voltage and is given by

PEF = NEF² · V_{DD}.
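Both figures of merit are simple enough to evaluate directly. The sketch below computes NEF from the quantities defined above and checks it against the first row of the review table in the next section (the Harrison amplifier [1]); the 5 V supply used for the PEF line is an assumption for illustration.

```python
# A minimal sketch of the NEF and PEF figures of merit defined above,
# checked against the first row of the review table ([1]).
import math

def nef(v_in_rms, i_total, bw, ut=25.85e-3, four_kt=1.66e-20):
    """Noise efficiency factor: input-referred RMS noise relative to a
    single BJT drawing the same total bias current (T ~ 300 K)."""
    return v_in_rms * math.sqrt(2 * i_total / (math.pi * ut * four_kt * bw))

def pef(nef_value, vdd):
    """Power efficiency factor: NEF squared, scaled by the supply voltage."""
    return nef_value ** 2 * vdd

# Table values for [1]: 2.2 uVrms noise, 16 uA bias, 7.2 kHz bandwidth
n = nef(2.2e-6, 16e-6, 7.2e3)
print(f"NEF = {n:.1f}")            # ~4.0, matching the review table
print(f"PEF = {pef(n, 5.0):.1f}")  # assuming a 5 V supply (illustrative)
```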
7 IC Design Challenges of Neural Amplifiers

As shrinking device sizes and improved efficiency make individual CMOS devices ever better candidates for neural amplifier design, the demand for more channels is increasing as well, which leads to higher power dissipation. Combined with the requirement of separate power supplies for different blocks, this places further constraints on the portability of the device [2–6]. Another main challenge is to send data off-chip using low-power techniques. Although many low-power data schemes such as BLE are available, they suit applications that transmit only small amounts of data. In CNS signal processing, the data generated per second is very high and, coupled with a large number of sensing electrodes, data handling becomes very difficult. The only way out is to reduce the noise and the power consumed by the system. The conventional approaches of low-power VLSI design are no longer sufficient. Recently, several good techniques have been proposed:
1. The power gating technique
2. The state retention technique
3. The sleepy keeper and stack approach
The total power dissipation in CMOS-based chips is basically divided into three categories, namely: short-circuit dissipation, switching dissipation, and leakage power dissipation.

Review table:

| Paper | Gain (dB) | I_AMP | V_IN,RMS (µV) | NEF | Bandwidth |
|---|---|---|---|---|---|
| A low-power low-noise CMOS amplifier for neural recording applications [1] | 39.5 | 16 µA | 2.2 | 4.0 | 0.025 Hz to 7.2 kHz |
| A low-power CMOS neural amplifier with amplitude measurements for spike sorting [7] | 42.5 | – | 20.6 | – | 22 Hz to 6.7 kHz |
| A 2.2 µW 94 nV/√Hz chopper-stabilized instrumentation amplifier for EEG detection in chronic implants [8] | 45.5 | 1.2 µA | 0.93 | 4.9 | 0.5–250 Hz |
| A 1 V 2.3 µW biomedical signal acquisition IC [9] | 40.2 | 330 nA | 0.94 | 3.8 | 3 mHz to 245 Hz |
| A sub-microwatt low-noise amplifier for neural recording [10] | 36.1 | 0.805 µA | 3.6 | 1.8 | 0.3 Hz to 4.7 kHz |
| Design of ultra-low-power bio-potential amplifiers for bio-signal acquisition applications [11] | 40 | 12.1 µA | 2.2 | 2.9 | 0.5 Hz to 10.5 kHz |
| An ultra-low-power low-noise CMOS bio-potential amplifier for neural recording [2] | 58.7 | 2.85 µA | 3.04 | 1.93 | 0.49 Hz to 10.5 kHz |
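The three dissipation categories named before the review table can be made concrete with a first-order estimate. The sketch below uses the textbook expressions for switching, short-circuit, and leakage power; every numeric value in it is an illustrative assumption.

```python
# A minimal sketch decomposing total CMOS power into the three categories
# named above. All numeric values are illustrative assumptions.
alpha = 0.1        # activity factor (fraction of gates switching per cycle)
C = 50e-12         # total switched capacitance (F), assumed
VDD = 1.0          # supply voltage (V)
f = 10e6           # clock frequency (Hz)
I_sc = 2e-6        # mean short-circuit current during a transition (A), assumed
t_sc = 1e-9        # duration of simultaneous conduction per transition (s), assumed
I_leak = 0.5e-6    # total leakage current (A), assumed

P_switching = alpha * C * VDD**2 * f          # dynamic (switching) dissipation
P_short     = alpha * f * VDD * I_sc * t_sc   # short-circuit dissipation
P_leakage   = VDD * I_leak                    # static leakage dissipation

P_total = P_switching + P_short + P_leakage
print(f"switching {P_switching * 1e6:.2f} uW, short-circuit {P_short * 1e9:.3f} nW, "
      f"leakage {P_leakage * 1e6:.2f} uW, total {P_total * 1e6:.2f} uW")
```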
Selecting different supply voltages for different blocks is another approach to system power optimization; however, the integrity of such a system is critical. Non-conventional approaches include adiabatic switching, which operates at low power, and the use of reversible logic for power saving.
8 Conclusion

This paper attempts to give an overall system view of low-power neural recorder design, from amplifier design to signal processing block design, and compares different low-noise CMOS amplifiers for neural recorders on several feature parameters. A suitable low-noise CMOS amplifier design yields better prediction from neural recordings, as background noise cancelation is the biggest challenge in neural recorders. Large input noise demands more aggressive noise cancelation in the signal processing block and thus results in the loss of useful information. All currently available techniques process the neural signal through an OTA whose feedback path requires a high-value resistor implemented as a pseudo-resistor. The paper reviews popular pseudo-resistor design techniques and concludes that NEF can be further improved when the reference amplifier is shared by more channels. Additionally, a 1-V supply is well suited for integration with low-power digital circuitry in complex systems-on-chip.
References

1. Harrison, R.R., Charles, C.: A low-power low-noise CMOS amplifier for neural recording applications. IEEE J. Solid-State Circuits 38, 958–965 (2003)
2. Yang, T., Holleman, J.: An ultralow-power low-noise CMOS biopotential amplifier for neural recording. IEEE Trans. Circuits Syst. II Express Briefs 62(10), 927–931 (2015)
3. Wattanapanitch, W., Fee, M., Sarpeshkar, R.: An energy-efficient micropower neural recording amplifier. IEEE Trans. Biomed. Circuits Syst. 1(2), 136–147 (2007)
4. Szuts, T.A., et al.: A wireless multi-channel neural amplifier for freely moving animals. Nat. Neurosci. 14(2), 263 (2011)
5. Lee, J., et al.: A 64 channel programmable closed-loop neurostimulator with 8 channel neural amplifier and logarithmic ADC. IEEE J. Solid-State Circuits 45(9), 1935–1945 (2010)
6. Ng, K.A., et al.: Implantable neurotechnologies: a review of integrated circuit neural amplifiers. Med. Biol. Eng. Comput. 54(1), 45–62 (2016)
7. Horiuchi, T., et al.: A low-power CMOS neural amplifier with amplitude measurements for spike sorting. In: 2004 IEEE International Symposium on Circuits and Systems (IEEE Cat. No. 04CH37512), vol. 4. IEEE (2004)
8. Denison, T., Consoer, K., Kelly, A., Hachenburg, A., Santa, W.: A 2.2 µW 94 nV/√Hz chopper-stabilized instrumentation amplifier for EEG detection in chronic implants. In: International Solid-State Circuits Conference Digest of Technical Papers (2007)
9. Wu, H., Xu, Y.: A 1 V 2.3 µW biomedical signal acquisition IC. In: 2006 IEEE International Solid-State Circuits Conference Digest of Technical Papers, pp. 119–128 (2006)
10. Holleman, J., Otis, B.: A sub-microwatt low-noise amplifier for neural recording. In: 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 3930–3933. IEEE (2007)
11. Zhang, F., Holleman, J., Otis, B.P.: Design of ultra-low power biopotential amplifiers for biosignal acquisition applications. IEEE Trans. Biomed. Circuits Syst. 6(4), 344–355 (2012)
Intelligent Image Processing
An Enhancement of Underwater Images Based on Contrast Restricted Adaptive Histogram Equalization for Image Enhancement

Vishal Goyal and Aasheesh Shukla
Abstract Light scattering and color absorption degrade underwater images, reducing their visibility and contrast. The dark channel prior is typically used for restoration. Underwater images exhibit poor resolution and contrast because light is scattered and absorbed in the underwater environment, which also produces a color cast. This makes it difficult to analyze underwater images efficiently for object identification. In this paper, a new adaptive histogram equalization (AHE)-based underwater image enhancement technique is proposed to obtain enhanced results. The AHE algorithm introduces a parameter β into the gray-level mapping formula. In the new histogram, the spacing between two adjacent gray levels is adjusted adaptively, taking information entropy as the target function. This avoids over-enhancement of local areas and the merging of gray pixels in the image. Validation shows that camera settings do not affect the performance of AHE, and the accuracy of various image processing applications is enhanced. The results of the image enhancement methods are measured using metrics such as the underwater image quality measure (UIQM), underwater color image quality evaluation (UCIQE), and patch-based contrast quality index (PCQI). Keywords Image enhancement · Color contrast · Adaptive histogram equalization (AHE) · Underwater dehazing approach · Underwater images · Image fusion · White balancing
V. Goyal (B) · A. Shukla
Department of Electronics and Communication Engineering, GLA University, Mathura 281406, India
e-mail: [email protected]
A. Shukla
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_25
1 Introduction

In marine environments, remotely operated and autonomous underwater vehicles are used for exploration and interaction, so producing clear underwater images is important in ocean research and engineering. Raw underwater images seldom satisfy expectations of visual quality. Scattering and absorption by particles in the water degrade underwater images. The light received by a camera underwater has three components: direct light, backscattered light, and forward-scattered light. Direct light suffers attenuation, resulting in information loss in underwater images. Forward-scattered light contributes only negligibly to the blurring of image features. Backscattered light reduces the contrast of underwater images and suppresses fine details and patterns. Scattering and absorption hinder the understanding of underwater scenes and also affect applications in marine environmental surveillance and aquatic robot inspection. Efficient solutions must therefore be developed to enhance the color, contrast, and visibility of underwater images and produce superior visual quality. Various methods enhance and restore the visibility of degraded images. Enhancement methods such as histogram equalization and gamma correction have shown their limitations here because underwater scene deterioration is produced by both additive and multiplicative processes [1]. Polarization filters, specialized hardware, and tailored acquisition strategies using multiple images have been used to solve these issues in existing works [2–4]. These techniques produce better results but suffer from various issues that reduce their applicability. In this paper, a new adaptive histogram equalization (AHE)-based underwater image enhancement technique is proposed to obtain enhanced results. The AHE algorithm introduces a parameter β into the gray-level mapping formula. In the new histogram, the spacing between two adjacent gray levels is adjusted adaptively, taking information entropy as the target function. This avoids over-enhancement of local areas and the merging of gray pixels in the image. Validation shows that camera settings do not affect the performance of AHE, and the accuracy of various image processing applications is enhanced.
2 Literature Review

Güraksin et al. [5] implemented an underwater image enhancement technique using a differential evolution algorithm. Contrast is enhanced in red-green-blue (RGB) space, reducing the effects of absorption and scattering. Gao et al. [6] addressed these problems with the following method. A color-balanced image, produced by a local adaptive proportion fusion algorithm, serves as the first input image; the second input image corresponds to an edge-
enhanced image, and the third to a proportion fusion image. These three images are merged by a local triple fusion method based on an image formation model to produce the final image. Objective and subjective evaluations show superior performance compared to existing methods. Anwar et al. [7] implemented a convolutional neural network-based image enhancement model termed UWCNN, trained effectively on a database of synthetic underwater images. Ten different marine image databases were synthesized in accordance with the optical properties of underwater scenes and imaging models, and a separate UWCNN model was trained for each underwater image formation type. Experimental results demonstrate that the proposed method generalizes well to various underwater scenes and outperforms existing methods both quantitatively and qualitatively. Wong et al. [8] proposed a solution that enhances contrast and removes color cast in underwater images. The gray world (GW) method removes the color cast, but its results alone are not good enough, so differential gray-levels histogram equalization (DHE) and adaptive GW (AGW) are combined into a method that operates in a parallel manner. Quantitative and qualitative measures are used for performance comparison. Zhang et al. [9] used an underwater image model to obtain a restored image, on which contrast enhancement and white balance are performed. A multi-scale fusion technique blends the derived images, weighting every input by contrast and saturation metrics. This method enhances the underwater image effectively, reduces execution time, and produces images of high visual quality. Ancuti et al. [10] enhanced images captured underwater and degraded by medium absorption and scattering with an effective single-image technique that requires no knowledge of the scene structure or underwater conditions. A multi-scale fusion method avoids the artifacts created by sharp weight-map transitions in low-frequency components. Better exposedness of dark regions yields improved image and video quality, as well as sharper edges and better global contrast. Li et al. [11] proposed a model that enhances underwater images using a contrast enhancement algorithm and an underwater image dehazing algorithm. The dehazing algorithm, based on the principle of minimum information loss, restores the natural appearance, color, and visibility of underwater images. Experimental results show highly accurate restoration of color and value information and the best visual quality.
3 Proposed Work

In this paper, a new adaptive histogram equalization (AHE)-based underwater image enhancement technique is proposed to obtain enhanced results. The AHE algorithm introduces a parameter β into the gray-level mapping formula. In the new histogram, the spacing between two adjacent gray levels is adjusted adaptively, taking information entropy as the target function. This avoids over-enhancement of local areas and the merging of gray pixels in the image. Validation shows that camera settings do not affect the performance of AHE, and the accuracy of various image processing applications is enhanced. The results of the proposed work are measured using the PCQI, UCIQE, and UIQM metrics, and the proposed technique is compared with existing specialized underwater enhancement methods. The proposed work includes three main steps: color contrast enhancement, multi-scale fusion, and the description of weight maps and input images. The image improvement approach adopts a two-stage technique, combining white balancing and image fusion, to enhance images without resorting to explicit inversion of the optical model. In the proposed approach, white balancing compensates for the color cast caused by the selective absorption of colors with depth, while image fusion enhances the edges and details of the scene and mitigates the contrast loss resulting from backscattering. Color casting, caused by varying illumination and the attenuation properties of the medium, is removed by white balancing while the appearance of the image is enhanced. Depth is highly correlated with color perception underwater, and it is necessary to rectify the green-bluish appearance. As light penetrates water, the attenuation process selectively affects the wavelength spectrum, altering the intensity and appearance of colored surfaces. In deeper water, color perception is affected by scattering, which strongly attenuates long wavelengths. The total distance between the scene and the observer determines the color loss and attenuation. Domain stretching introduces quantization artifacts, which are reduced by the white-balancing technique. As the red channel is well balanced, the reddish appearance of high-intensity regions is also well corrected. In underwater images, gray mapping of pixels is performed by the histogram equalization algorithm, which is based on probability theory. Gray operations perform this mapping and form clear, smooth, and uniform gray levels in the histogram; this is used to enhance the image [1]. If r (0 ≤ r ≤ 1) denotes a pixel gray value of the original image and p_r(r) its probability density, then s (0 ≤ s ≤ 1) denotes the corresponding pixel gray value of the enhanced image with probability density p_s(s), and s = T(r) is the mapping function.
According to the physical meaning of the histogram, an equalized histogram yields bars of equal height:

p_s(s)\,ds = p_r(r)\,dr   (1)

If s = T(r) is monotonically increasing on the interval, its inverse is also a monotonic function. In the discrete case, the relation between f_i and i is expressed as

f_i = (m-1)\,T(r) = (m-1) \sum_{k=0}^{i} \frac{q_k}{Q}   (2)
where m is the number of gray levels in the original image, q_k the number of pixels with the kth gray level, and Q the total number of pixels in the image. Consider an image with various gray levels, and let p_i denote the occurrence probability of the ith gray level. The entropy of a gray level is expressed as

e(i) = -p_i \log p_i   (3)

The entropy of the whole image is

E = \sum_{i=0}^{n-1} e(i) = -\sum_{i=0}^{n-1} p_i \log p_i   (4)
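Equations (2) and (4) are straightforward to compute. The sketch below applies the cumulative mapping of Eq. (2) to an 8-bit image and evaluates the entropy of Eq. (4) before and after; note that plain equalization with rounding can merge adjacent gray levels and thereby lower the entropy, which is exactly the effect the adaptive level spacing of the proposed AHE is designed to avoid.

```python
# A minimal sketch of the gray-level mapping of Eq. (2) and the entropy
# of Eq. (4) for an 8-bit image, using numpy only.
import numpy as np

def equalize(img, m=256):
    """Map gray levels by the cumulative distribution, Eq. (2)."""
    q, _ = np.histogram(img, bins=m, range=(0, m))   # q_k: pixels at level k
    Q = img.size                                     # total number of pixels
    f = np.round((m - 1) * np.cumsum(q) / Q).astype(np.uint8)
    return f[img]                                    # apply the mapping per pixel

def entropy(img, m=256):
    """Whole-image entropy, Eq. (4); zero-probability levels contribute 0."""
    p = np.bincount(img.ravel(), minlength=m) / img.size
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

rng = np.random.default_rng(0)
img = rng.integers(80, 160, size=(64, 64), dtype=np.uint8)  # low-contrast test image
eq = equalize(img)
print(f"entropy before {entropy(img):.3f} bits, after {entropy(eq):.3f} bits")
```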
If the image's histogram follows a uniform distribution, the entropy of the whole image reaches its maximum. After equalization, the dynamic range is enlarged as given in Eq. (2), and the quantization interval is expanded. The sharpened image S is defined by the unsharp masking formula S = I + β(I − G ∗ I), where I is the image to be sharpened, G ∗ I its Gaussian-filtered version, and β a parameter. A low value of β produces an under-sharpened image, while a high value produces over-saturated regions, with darker shadows and brighter highlights. To rectify this issue, the sharpened image S is instead defined as

S = (I + N\{I − G ∗ I\})/2,   (5)

where N{·} denotes the linear normalization operator, also termed histogram stretching. With a single suitable pair of shifting and scaling factors, this operator shifts and scales the color pixel intensities so that the transformed pixel values cover the entire available dynamic range. In blending, weight maps ensure that pixels with high weight values receive more representation in the final image.
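A minimal sketch of the normalized unsharp masking of Eq. (5) follows, assuming images scaled to [0, 1] and a Gaussian blur from scipy; the blur width sigma is an illustrative assumption.

```python
# A minimal sketch of the normalized unsharp masking of Eq. (5):
# S = (I + N{I - G*I}) / 2, where N{.} is histogram stretching.
import numpy as np
from scipy.ndimage import gaussian_filter

def stretch(x):
    """Linear normalization N{.}: shift and scale to the full [0, 1] range."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)

def sharpen(I, sigma=2.0):
    """Blend the input with its stretched high-pass residual."""
    residual = I - gaussian_filter(I, sigma)   # I - G*I
    return (I + stretch(residual)) / 2.0

rng = np.random.default_rng(1)
I = rng.random((128, 128))   # stand-in for a white-balanced input, in [0, 1]
S = sharpen(I)
print(S.min(), S.max())      # stays within [0, 1] by construction
```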
A Laplacian filter is applied to every input luminance channel and its absolute value is used as the Laplacian contrast weight (WL) to estimate global contrast. This straightforward indicator is used in applications such as depth-of-field extension and tone mapping [12], and it assigns high values to edges and texture. However, this weight alone cannot recover contrast in the underwater dehazing task, because it does not effectively distinguish between ramp and flat regions. Complementary, additional contrast-assessment metrics are therefore introduced. The saliency weight (WS) emphasizes salient objects that lose prominence in the underwater scene. A saliency estimator measures the level of saliency; this computationally efficient algorithm is inspired by the biological concept of center-surround contrast. The saliency map favors highlighted areas, in which saturation is decreased, so an additional weight map is introduced to overcome this limitation. The saturation weight (WSat) enables the fusion algorithm to adapt to chromatic information and advantages highly saturated regions. This weight map is calculated from the differences between the luminance L_k of the kth input and its R_k, G_k, and B_k color channels. The exposedness weight map, introduced in the exposure fusion context [12], reduces the weight of pixels that are over- or under-exposed: pixels whose values lie close to the middle of the dynamic range receive large weight values, and vice versa. The exposedness weight map may penalize the gamma-corrected input in favor of the sharpened image if the latter exploits the entire dynamic range; this can forfeit some contrast enhancement and introduce a few sharpening artifacts.
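Three of these weight maps reduce to short per-pixel computations. The sketch below gives one plausible reading for an RGB image in [0, 1]; the saliency weight is omitted because it follows a separate published estimator, and the Gaussian spread of 0.25 in the exposedness weight is the value commonly used in exposure fusion [12], adopted here as an assumption.

```python
# A minimal sketch of three of the weight maps described above, for one
# input image with channel values in [0, 1].
import numpy as np
from scipy.ndimage import laplace

def weight_maps(rgb):
    L = rgb.mean(axis=2)                       # luminance channel
    w_laplacian = np.abs(laplace(L))           # Laplacian contrast weight WL
    # WSat: RMS deviation of the color channels from the luminance
    w_saturation = np.sqrt(((rgb - L[..., None]) ** 2).mean(axis=2))
    sigma = 0.25                               # assumed spread around mid-range
    w_exposedness = np.exp(-((L - 0.5) ** 2) / (2 * sigma ** 2))
    return w_laplacian, w_saturation, w_exposedness

rng = np.random.default_rng(2)
rgb = rng.random((64, 64, 3))                  # stand-in input image
for name, w in zip(("WL", "WSat", "WE"), weight_maps(rgb)):
    print(name, float(w.mean()))
```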
4 Results and Discussion

In this section, an extensive validation of the proposed AHE with white-balancing approach is first performed. The proposed system is then compared against existing specialized underwater improvement methods, and the use of this method in applications is demonstrated. Specialized existing underwater image enhancement and restoration techniques are compared with the dehazing method. Figure 1 shows the video used for reference. The fusion-based algorithm automatically sets the reduced parameter set to be employed. In all experiments, a single parameter α defines the white-balancing process and is set to 1. The image size determines the number of decomposition levels for multi-scale fusion, so that the smallest resolution reaches a few tens of pixels. Figure 1 shows several video frames processed by the proposed color enhancement for underwater images. The proper segmentation of the underwater image by the proposed method is shown in Fig. 2. In Table 1, higher results indicate better quality enhancement for the related images (same order). The results of the proposed AHE with white-balancing approach are
Fig. 1 Underwater video dehazing
Fig. 2 Image enhancement results
compared with existing methods such as the weight fusion framework (WFF) [10] and a multi-fusion underwater dehazing approach (MFDA). WFF additionally accounts for temporal coherence between nearby frames by performing an effective edge-preserving noise reduction. The enhanced images and videos are characterized by a reduced noise level, better exposedness of the dark regions, and improved contrast, while the finest details and edges are enhanced significantly.
Table 1 Underwater dehazing evaluation depending on PCQI, UCIQE, and UIQM metrics

| Image | WFF PCQI | WFF UCIQE | WFF UIQM | MFDA PCQI | MFDA UCIQE | MFDA UIQM | AHE PCQI | AHE UCIQE | AHE UIQM |
|---|---|---|---|---|---|---|---|---|---|
| Shipwreck | 1.1454 | 0.65 | 0.62 | 1.21 | 0.65 | 0.69 | 1.86 | 0.69 | 0.79 |
| Reef1 | 0.983 | 0.69 | 0.71 | 1.15 | 0.72 | 0.72 | 1.41 | 0.75 | 0.769 |
| Reef3 | 2.12 | 0.75 | 0.76 | 1.41 | 0.74 | 0.82 | 1.58 | 0.82 | 0.86 |
| Galdran1 | 1.141 | 0.69 | 0.69 | 1.21 | 0.68 | 0.85 | 1.29 | 0.73 | 0.90 |
| Galdran9 | 1.142 | 0.67 | 0.65 | 1.28 | 0.66 | 0.69 | 1.36 | 0.76 | 0.75 |
| Ancuti 1 | 1.21 | 0.65 | 0.59 | 1.16 | 0.65 | 0.57 | 1.43 | 0.82 | 0.63 |
Moreover, the utility of the enhancement method is demonstrated for several challenging applications. Three metrics, namely UIQM, UCIQE, and PCQI, are used to evaluate performance; the results are provided in Table 1 [13–15]. PCQI is a general-purpose image contrast metric, while UIQM and UCIQE are dedicated to the assessment of underwater images. The UCIQE metric quantifies the parameters that characterize underwater images, including low contrast, blurring, and non-uniform color cast; UIQM addresses the contrast, sharpness, and colorfulness quality of underwater images. Figures 3, 4, and 5 show graphical representations of the results of these metrics from Table 1.
Fig. 3 PCQI comparison versus methods
Fig. 4 UCIQE comparison versus methods
Fig. 5 UIQM comparison versus methods
5 Conclusion and Future Work

Scattering and absorption affect the quality of images captured underwater, and such images show certain limitations in analysis and display. In marine biology recognition and underwater object detection, such color-cast, low-contrast underwater images decrease accuracy. For
the enhancement of such underwater images, a method is proposed that requires no scene structure, no knowledge of the underwater conditions, and no specialized hardware. Images derived from a white-balanced version of the original degraded image and from adaptive histogram equalization (AHE) are blended directly in this method. In underwater images, gray mapping of pixels is performed by the histogram equalization algorithm, which is based on probability theory; gray operations perform this mapping, forming clear, smooth, and uniform gray levels in the histogram, and this is used to enhance the image. In the output image, edge and color contrast transfer are defined by the fusion of these two images with their associated weight maps. The proposed method enhances the structural details, color, and global contrast of the image with improved perceptual quality. There are a few limitations: it is not always possible to fully restore color, and some haze remains when the camera is far from the scene region. These issues can be addressed in future extensions.
References

1. Schettini, R., Corchs, S.: Underwater image processing: state of the art of restoration and image enhancement methods. EURASIP J. Adv. Signal Process. 2010, Art. no. 746052 (2010)
2. Narasimhan, S.G., Nayar, S.K.: Contrast restoration of weather degraded images. IEEE Trans. Pattern Anal. Mach. Intell. 25(6), 713–724 (2003)
3. He, D.-M., Seet, G.G.L.: Divergent-beam LiDAR imaging in turbid water. Opt. Lasers Eng. 41, 217–231 (2004)
4. Schechner, Y.Y., Averbuch, Y.: Regularized image recovery in scattering media. IEEE Trans. Pattern Anal. Mach. Intell. 29(9), 1655–1660 (2007)
5. Güraksin, G.E., Köse, U., Deperlioğlu, Ö.: Underwater image enhancement based on contrast adjustment via differential evolution algorithm. In: International Symposium on INnovations in Intelligent SysTems and Applications (INISTA), pp. 1–5 (2016)
6. Gao, Y., Wang, J., Li, H., Feng, L.: Underwater image enhancement and restoration based on local fusion. J. Electron. Imaging 28(4), 043014 (2019)
7. Anwar, S., Li, C., Porikli, F.: Deep underwater image enhancement. arXiv:1807.03528v1 [cs.CV], pp. 1–13 (2018)
8. Wong, S.L., Paramesran, R., Taguchi, A.: Underwater image enhancement by adaptive gray world and differential gray-levels histogram equalization. Adv. Electr. Comput. Eng. 18(2), 109–117 (2018)
9. Zhang, C., Zhang, X., Tu, D.: Underwater image enhancement by fusion. In: International Workshop of Advanced Manufacturing and Automation, pp. 81–92. Springer, Singapore (2017)
10. Ancuti, C.O., Ancuti, C., De Vleeschouwer, C., Bekaert, P.: Color balance and fusion for underwater image enhancement. IEEE Trans. Image Process. 27(1), 379–393 (2017)
11. Li, C., Guo, J., Cong, R., Pang, Y., Wang, B.: Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior. IEEE Trans. Image Process. 25(12), 5664–5677 (2016)
12. Mertens, T., Kautz, J., Van Reeth, F.: Exposure fusion: a simple and practical alternative to high dynamic range photography. Comput. Graph. Forum 28(1), 161–171 (2009)
13. Wang, S., Ma, K., Yeganeh, H., Wang, Z., Lin, W.: A patch structure representation method for quality assessment of contrast changed images. IEEE Signal Process. Lett. 22(12), 2387–2390 (2015)
14. Yang, M., Sowmya, A.: An underwater color image quality evaluation metric. IEEE Trans. Image Process. 24(12), 6062–6071 (2015)
15. Panetta, K., Gao, C., Agaian, S.: Human-visual-system-inspired underwater image quality measures. IEEE J. Ocean. Eng. 41(3), 541–551 (2015)
Hybridization of Social Spider Optimization (SSO) Algorithm with Differential Evolution (DE) Using Super-Resolution Reconstruction of Video Images

Shashi Shekhar and Neeraj Varshney

Abstract In video super-resolution reconstruction, the pixel correlation of continuous multi-frame image sequences is used to establish an effective video image super-resolution reconstruction model. An optimization problem is formed by transforming the pixel sequence of a low-resolution image into the pixel sequence of a high-resolution image. To reconstruct super-resolution video images effectively, this research proposes two techniques involving the application of soft computing as an intelligent hybridization: differential evolution (DE) is hybridized with the social spider optimization (SSO) algorithm to form the SSO-DE algorithm, which achieves an effective balance between exploitation and exploration. The search space of the SSO algorithm is controlled and adjusted dynamically by introducing a weighting factor that changes with iteration. Trapping at local optima is avoided by introducing a mutation operator after the completion of the social spider search, which strengthens the global search ability. Experimentation on a set of standard benchmark functions demonstrates the efficiency of the proposed method for super-resolution reconstruction of video images, and a set of video super-resolution reconstruction examples verifies the effectiveness and feasibility of the proposed algorithm. Keywords Social spider optimization (SSO) algorithm · Image/video · Glowworm swarm optimization (GSO) · Differential evolution (DE) · Super-resolution reconstruction
S. Shekhar (B) · N. Varshney
Department of Computer Engineering & Applications, GLA University, Mathura 281406, India
e-mail: [email protected]
N. Varshney
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
S. Tiwari et al. (eds.), Smart Innovations in Communication and Computational Sciences, Advances in Intelligent Systems and Computing 1168, https://doi.org/10.1007/978-981-15-5345-5_26
1 Introduction

The super-resolution reconstruction (SRR) technique is used to improve the quality of digital images. In SRR, one or more high-resolution (HR) images are obtained by combining multiple low-resolution (LR) images of the same scene, thereby overcoming image sensor limitations. SRR is used in various fields, such as surveillance video in forensics, images from low-cost digital sensors in standard end-user systems, and the reconstruction of satellite images in remote sensing. Important concepts and results of SRR are reviewed in [1, 2]. SRR algorithms fall into two groups: image SRR, which reconstructs HR images from multiple observations, and video SRR, which reconstructs entire HR image sequences. Video SRR algorithms include temporal regularization, which constrains the norm of changes in the solution between adjacent time instants [3, 4]. This introduces correlation between adjacent frames, enhancing the quality of the reconstructed sequence while ensuring the consistency of the video [5]. SRR algorithms have a high computational cost. Recent advancements aim to increase the quality of the reconstructed image using proper prior information about the image and the LR–HR relationship. Variational Bayesian methods [6], non-local methods [7, 8], deep learning-based methods [9], and spatial kernel regression [10] are examples of nonparametric methods. Video super-resolution has been implemented through image reconstruction and motion estimation in various frameworks [11]. The basic GSO algorithm has drawbacks such as poor stability and premature convergence. These drawbacks are overcome in this work by proposing a pixel-adaptive intelligent optimization algorithm, applied to video super-resolution reconstruction. In simulation, the proposed algorithm performs more effectively than basic GSO and is highly feasible for video super-resolution.
2 Super-Resolution Image Reconstruction Model

2.1 Video Super-Resolution Reconstruction Model

In the image observation model, the observed low-resolution image is formed from the high-resolution scene; the reconstruction system takes the low-resolution target image as input and produces the high-resolution image as output. Motion displacement, atmospheric blur, lens blur, and noise govern the formation of the low-resolution image from the high-resolution image. The super-resolution reconstruction model is the inverse process of this observation model. Figure 1 shows the process of video super-resolution reconstruction.
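The observation model just described can be written as a chain of operators: a warp for motion displacement, a convolution for blur, decimation, and additive noise. The sketch below is one plausible rendering of that chain; the operator choices and all parameter values are assumptions for illustration.

```python
# A minimal sketch of the observation model: each low-resolution frame is
# a warped, blurred, downsampled and noisy view of the high-resolution
# image. Operator shapes and parameter values are assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def observe(hr, dx=0.5, dy=0.3, blur=1.0, factor=2, noise=0.01, seed=0):
    warped = shift(hr, (dy, dx), order=1)        # motion displacement
    blurred = gaussian_filter(warped, blur)      # lens/atmospheric blur
    lr = blurred[::factor, ::factor]             # downsampling
    rng = np.random.default_rng(seed)
    return lr + noise * rng.standard_normal(lr.shape)  # additive noise

hr = np.zeros((64, 64)); hr[24:40, 24:40] = 1.0  # toy high-resolution scene
lr = observe(hr)
print(hr.shape, "->", lr.shape)                  # (64, 64) -> (32, 32)
```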
Fig. 1 Process of video super-resolution reconstruction
2.2 Reconstruction Optimization Model

A group of low-resolution video sequences is optimized to obtain a high-resolution video sequence; based on the above analysis, this process is termed super-resolution reconstruction of a video sequence. In this research, video super-resolution is defined as a transformation over image sequences whose frames have been magnified by interpolation, used to initialize the swarm starting from the top-left corner pixel [0, 0]. Across frames, one or more pixels at corresponding positions are selected and amplified in order to maximize the combined function value.
This research concentrates on producing a high-clarity output image whose grey distribution corresponds to the source image. The sum of differences between the 256-level grey histogram frequencies of the high-resolution image and those of the low-resolution image is computed using the statistical method of grey histogram frequency. Formula (1) shows the objective function of the image optimization [11]:

g = \sqrt{\frac{1}{2(m-1)(n-1)} \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \left[ (I_{i,j} - I_{i+1,j})^2 + (I_{i,j} - I_{i,j+1})^2 \right]} - \sum_{i=0}^{255} \left| \frac{a_i}{a \times b} - \frac{b_i}{m \times n} \right|   (1)
where a_i is the number of pixels with grey level i in the low-resolution image and b_i the corresponding count in the high-resolution image, a × b is the size of the low-resolution image, and I is the reconstructed image of size m × n.
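A direct implementation of this objective is short. The sketch below follows the reconstruction of Eq. (1) above, a sharpness (average-gradient) term minus a histogram-frequency mismatch term, for 8-bit images; reading the mismatch as an absolute difference is an assumption.

```python
# A minimal sketch of the objective function g of Eq. (1).
import numpy as np

def objective(I, L, levels=256):
    m, n = I.shape                       # reconstructed high-resolution image
    a, b = L.shape                       # low-resolution source image
    If = I.astype(np.float64)
    dx = If[:-1, 1:] - If[:-1, :-1]      # horizontal differences
    dy = If[1:, :-1] - If[:-1, :-1]      # vertical differences
    grad = np.sqrt((dx ** 2 + dy ** 2).sum() / (2 * (m - 1) * (n - 1)))
    ha = np.bincount(L.ravel(), minlength=levels) / (a * b)   # LR frequencies
    hb = np.bincount(I.ravel(), minlength=levels) / (m * n)   # HR frequencies
    return grad - np.abs(ha - hb).sum()

rng = np.random.default_rng(3)
L = rng.integers(0, 256, (32, 32), dtype=np.uint8)
I = np.kron(L, np.ones((2, 2), dtype=np.uint8))   # naive 2x upscale as a test point
print(float(objective(I, L)))
```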
3 Solution to Video Super-Resolution Problem Based on SSO

Social spider optimization (SSO) is based on the cooperative behaviour of social spiders in finding food. There are two types of spiders, male and female [12], with female spiders making up 70% of the population. Every spider is characterized by its position, weight, fitness, and the vibrations received from other spiders [12]. Every spider receives vibrations from the nearest better spider, the nearest female spider, and the globally best spider. The next position of a male spider is determined by the vibrations from the nearest female spider, while the next position of a female spider is determined by the vibrations of the nearest best spider and the globally best spider. Male spiders are divided into two classes, dominant and non-dominant: a dominant male spider's weight is higher than the median fitness of the male spiders, and the non-dominant males are the remaining ones. Female spiders may repel or attract other spiders. The spiders' weights are computed using Eq. (2):

weight[s] = (fitness(s) − fitness(worst spider)) / (fitness(best spider) − fitness(worst spider))   (2)
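Equation (2) is a simple min–max rescaling of the fitness values, as the sketch below shows for a maximization problem.

```python
# A minimal sketch of the weight assignment of Eq. (2): fitness values
# rescaled to [0, 1] between the worst and best spiders.
import numpy as np

def spider_weights(fitness):
    best, worst = fitness.max(), fitness.min()
    return (fitness - worst) / (best - worst) if best > worst else np.ones_like(fitness)

print(spider_weights(np.array([3.0, 7.0, 5.0])))  # -> [0.  1.  0.5]
```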
In this technique, every spider represents a single cluster with two components: a centroid and the list of documents nearest to that centroid. The algorithm returns the spider with the best fitness value, whose clusters together cover all the text documents present in the data set.
In the SSO algorithm, a spider's movement is determined by the global best position and the local best position in the population. This can trap spiders at the local best position and reduce the speed of convergence. To solve this issue, the optimization performance is enhanced by introducing an adaptive weighting factor, and premature phenomena are avoided within the SSO algorithm framework by using a mutation operator.

• Initiation: All the spiders are empty in the initial stage and are initialized randomly with a centroid.
• Assignment of the image data set: Every image in the data set is assigned to the nearest centroid present in a spider. Each document is assigned to exactly one spider in the population; if a document is present in one spider, it is not present in the others. The sum of the distances of the documents from the spider centroid provides the spider's fitness value, also termed the intra-cluster distance of the cluster.
• Finding female and male spiders by updating the weight factor: Random numbers are generated to determine the positions of the male and female spiders.
3.1 Weighting Factor

The next positions of the male and female spiders are influenced by their positions in the current iteration. The impact of a spider's current position on its next position is described by introducing a weighting factor W into the SSO algorithm. The enhanced search technique is as follows:

F_i^{k+1} = W \cdot F_i^k \pm α \cdot Vibc_i \cdot (s_c − F_i^k) \pm β \cdot Vibb_i \cdot (s_b − F_i^k)   (3)

M_i^{k+1} = W \cdot M_i^k \pm α \cdot Vibf_i \cdot (s_f − M_i^k) + rand   (4)

Vibc_i = ω \cdot e^{−d_{i,c}^2}   (5)

Vibf_i = ω \cdot e^{−d_{i,f}^2}   (6)
where α and β are random numbers in the range [0, 1], k is the iteration number, s_c is the local optimum position at the current generation, s_b is the global optimum position at the current generation, Vibc_i is the information transmitted from s_c, and the information transmitted
from s_b is represented as Vibb_i, both transmitted by vibration on the common web. The value of the parameter W is adjusted to balance the exploration and exploitation of SSO; a high value of W favours exploration. A qualitative analysis of the parameter W:

1. W = 0: The spiders in the colony reach new positions defined solely by the global best position s_b and the local best position s_c. A spider at the global best position does not change its position.
2. W ≠ 0: The current position also affects the spider's position update.

A dynamic weighting factor W is assigned to explore promising areas of the search space and to balance exploration and exploitation well. Assigning a high value of W in the initial stage allows promising areas of the search space to be explored, which is very useful in fuzzy, dynamic, and complex optimization problems, while a small value in later stages achieves more refined exploitation. Considering the above statements, the weighting factor is designed to change at every iteration as follows:

W = W_{max} − \frac{iter}{Max_{iter}} (W_{max} − W_{min})   (7)
Here, W_{max} and W_{min} are the maximum and minimum values of the weighting factor W.

• Mutation strategy: Introducing a mutation technique allows a more accurate solution to be computed in the current solution area, enhancing the algorithm's performance. Convergence of the SSO algorithm can be accelerated by an adaptive mutation technique with mutation probability μ, defined as

x_{i,G} = \begin{cases} x_{gbest,G} + γ (x_{p,G} − x_{q,G}), & rand ≤ μ \\ x_{i,G}, & \text{otherwise} \end{cases}   (8)
The mutation operator is applied after the completion of the mating search, which is carried out between male and female spiders. Applying this mutation operator with probability μ provides fine exploitation of the computed solution. The search process is repeated until the predefined criteria are satisfied.
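The pieces of this section, the decaying weighting factor of Eq. (7), the female-spider move of Eq. (3), and the DE-style mutation of Eq. (8), combine into a short update step. The sketch below is illustrative only: the parameter values, the coin flip used for the ± attraction/repulsion sign, and the vibration magnitudes are all assumptions.

```python
# A minimal sketch of one SSO-DE update step: linearly decaying weight
# (Eq. 7), weighted female-spider move (Eq. 3), and DE-style mutation
# applied with probability mu (Eq. 8). All parameter values are assumed.
import numpy as np

rng = np.random.default_rng(4)

def weighting_factor(it, max_it, w_max=0.9, w_min=0.4):
    return w_max - it * (w_max - w_min) / max_it          # Eq. (7)

def female_move(F, s_c, s_b, vib_c, vib_b, W):
    alpha, beta = rng.random(2)
    sign = 1 if rng.random() < 0.5 else -1                # attraction or repulsion
    return W * F + sign * (alpha * vib_c * (s_c - F) + beta * vib_b * (s_b - F))

def de_mutation(x, x_gbest, x_p, x_q, gamma=0.5, mu=0.1):
    if rng.random() <= mu:                                # Eq. (8)
        return x_gbest + gamma * (x_p - x_q)
    return x

# one illustrative update of a single female spider in a 4-D search space
F = rng.random(4); s_c = rng.random(4); s_b = rng.random(4)
W = weighting_factor(it=10, max_it=100)
F = female_move(F, s_c, s_b, vib_c=0.8, vib_b=0.6, W=W)
F = de_mutation(F, s_b, rng.random(4), rng.random(4))
print(F)
```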
4 Proposed Algorithm

This work addresses the optimization of super-resolution images: the pixel sequence of the low-resolution input image is used to optimize the pixel sequence of the high-resolution output image and compute the solution. The grey value of the pixels
is optimized using a grey image. Based on the sequence of grey pixels, the image is arranged row by row into columns. A one-dimensional vector L = [l_1, l_2, l_3, …, l_m] represents the low-resolution image, which has m pixels. A one-dimensional vector H = [h_1, h_2, h_3, …, h_n] represents the high-resolution image, which has n pixels. The solution for the high-resolution image is computed by the SSO algorithm. To construct super-resolution images, the SSO algorithm follows the steps below.

Step 1: In the video image, select a continuous multi-frame image sequence of low-resolution image vectors L_1 = [l_11, l_12, l_13, …, l_1m], L_2 = [l_21, l_22, l_23, …, l_2m], …, L_i = [l_i1, l_i2, l_i3, …, l_im], where i is the frame number.
Step 2: Initialize the swarm size and the initial values of each individual L_i. High-resolution image sequences are generated by multiple amplification and interpolation: H_1 = [h_11, h_12, h_13, …, h_1n], H_2 = [h_21, h_22, h_23, …, h_2n], …, H_i = [h_i1, h_i2, h_i3, …, h_in].
Step 3: Initialize the parameters.
Step 4: Initialize the iteration counter to 1.
Step 5: While (Iteration