Lecture Notes in Networks and Systems Volume 173
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
S. Smys • Valentina Emilia Balas • Khaled A. Kamel • Pavel Lafata
Editors
Inventive Computation and Information Technologies Proceedings of ICICIT 2020
Editors
S. Smys
RVS Technical Campus, Coimbatore, India

Valentina Emilia Balas
“Aurel Vlaicu” University of Arad, Arad, Romania

Khaled A. Kamel
Computer Science Department, Texas Southern University, Houston, TX, USA

Pavel Lafata
Czech Technical University, Praha, Czech Republic
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-33-4304-7 ISBN 978-981-33-4305-4 (eBook) https://doi.org/10.1007/978-981-33-4305-4 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021, corrected publication 2021, 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
We are honored to dedicate this book to all the participants and editors of ICICIT 2020.
Preface
This conference proceedings volume contains the written versions of most of the contributions presented during ICICIT 2020. The conference provided a setting for discussing recent developments in a wide variety of topics, including cloud computing, artificial intelligence and fuzzy neural systems. The conference has been a good opportunity for participants from various destinations to present and discuss topics in their respective research areas. The conference aims to collect the latest research results and applications on computation technology, information, and control engineering. This volume includes a selection of 71 papers out of the 266 papers submitted to the conference from universities and industries all over the world. All of the accepted papers were subjected to strict peer review by two to four expert referees, and the papers have been selected for this volume because of their quality and relevance to the conference. We would like to express our sincere appreciation to all authors for their contributions to this book. We would also like to extend our thanks to all the referees for their constructive comments on all papers; in particular, we would like to thank the organizing committee for their hard work. Finally, we would like to thank Springer for producing this volume.

Coimbatore, India S. Smys
Arad, Romania Valentina Emilia Balas
Houston, USA Khaled A. Kamel
Praha, Czech Republic Pavel Lafata
Contents
A Heuristic Algorithm for Deadline-Based Resource Allocation in Cloud Using Modified Fish Swarm Algorithm . . . 1
J. Uma, P. Vivekanandan, and R. Mahaveerakannan
Dynamic Congestion Control Routing Algorithm for Energy Harvesting in MANET . . . 15
M. M. Karthikeyan and G. Dalin
Predictable Mobility-Based Routing Protocol in Wireless Sensor Network . . . 27
G. Sophia Reena and M. Punithavalli
Novel Exponential Particle Swarm Optimization Technique for Economic Load Dispatch . . . 39
Nayan Bansal, Surendrabikram Thapa, Surabhi Adhikari, Avinash Kumar Jha, Anubhav Gaba, and Aayush Jha
Risk Index-Based Ventilator Prediction System for COVID-19 Infection . . . 53
Amit Bhati
IoT-Based Smart Door Lock with Sanitizing System . . . 63
M. Shanthini and G. Vidya
Aspect-Based Sentiment Analysis in Hindi: Comparison of Machine/Deep Learning Algorithms . . . 81
T. Sai Aparna, K. Simran, B. Premjith, and K. P. Soman
Application of Whale Optimization Algorithm in DDOS Attack Detection and Feature Reduction . . . 93
P. Ravi Kiran Varma, K. V. Subba Raju, and Suresh Ruthala
Social Media Data Analysis: Twitter Sentimental Analysis on Kerala Floods Using R Language . . . 103
Madhavi Katamaneni, Geeta Guttikonda, and Madhavi Latha Pandala
RETRACTED CHAPTER: Intrusion Detection Using Deep Learning . . . 113 Sanjay Patidar and Inderpreet Singh Bains Secure Trust-Based Group Key Generation Algorithm for Heterogeneous Mobile Wireless Sensor Networks . . . . . . . . . . . . . . . 127 S. Sabena, C. Sureshkumar, L. Sai Ramesh, and A. Ayyasamy A Study on Machine Learning Methods Used for Team Formation and Winner Prediction in Cricket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 Manoj S. Ishi and J. B. Patil Machine Learning-Based Intrusion Detection System with Recursive Feature Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 Akshay Ramesh Bhai Gupta and Jitendra Agrawal An Optical Character Recognition Technique for Devanagari Script Using Convolutional Neural Network and Unicode Encoding . . . . . . . . 173 Vamsi Krishna Kikkuri, Pavan Vemuri, Srikar Talagani, Yashwanth Thota, and Jayashree Nair A Machine Learning-Based Multi-feature Extraction Method for Leather Defect Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 Malathy Jawahar, L. Jani Anbarasi, S. Graceline Jasmine, Modigari Narendra, R. Venba, and V. Karthik Multiple Sclerosis Disorder Detection Through Faster Region-Based Convolutional Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 Shrawan Ram and Anil Gupta Retaining Named Entities for Headline Generation . . . . . . . . . . . . . . . . 221 Bhavesh Singh, Amit Marathe, Ali Abbas Rizvi, and Abhijit R. Joshi Information Hiding Using Quantum Image Processing State of Art Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 S. Thenmozhi, K. BalaSubramanya, S. Shrinivas, Shashank Karthik D. Joshi, and B. Vikas Smart On-board Vehicle-to-Vehicle Interaction Using Visible Light Communication for Enhancing Safety Driving . . . . . . . . . . . . . . . . . . . . 247 S. Satheesh Kumar, S. Karthik, J. S. Sujin, N. Lingaraj, and M. D. Saranya A Novel Machine Learning Based Analytical Technique for Detection and Diagnosis of Cancer from Medical Data . . . . . . . . . . . . . . . . . . . . . 259 Vasundhara and Suraiya Parveen Instrument Cluster Design for an Electric Vehicle Based on CAN Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 L. Manickavasagam, N. Krishanth, B. Atul Shrinath, G. Subash, S. R. Mohanrajan, and R. Ranjith
Ant Colony Optimization: A Review of Literature and Application in Feature Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285 Nandini Nayar, Shivani Gautam, Poonam Singh, and Gaurav Mehta Hand Gesture Recognition Under Multi-view Cameras Using Local Image Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 Kiet Tran-Trung and Vinh Truong Hoang Custom IP Design for Fault-Tolerant Digital Filters for High-Speed Imaging Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305 Somashekhar Malipatil, Avinash Gour, and Vikas Maheshwari A Novel Focused Crawler with Anti-spamming Approach & Fast Query Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315 Ritu Sachdeva and Sachin Gupta A Systematic Review of Log-Based Cloud Forensics . . . . . . . . . . . . . . . 333 Atonu Ghosh, Debashis De, and Koushik Majumder Performance Analysis of K-ELM Classifiers with the State-of-Art Classifiers for Human Action Recognition . . . . . . . . . . . . . . . . . . . . . . . 349 Ratnala Venkata Siva Harish and P. Rajesh Kumar Singular Value Decomposition-Based High-Resolution Channel Estimation Scheme for mmWave Massive MIMO with Hybrid Precoding for 5G Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365 V. Baranidharan, N. Praveen Kumar, K. M. Naveen, R. Prathap, and K. P. Nithish Sriman Responsible Data Sharing in the Digital Economy: Big Data Governance Adoption in Bancassurance . . . . . . . . . . . . . . . . . . . . . . . . 379 Sunet Eybers and Naomi Setsabi A Contextual Model for Information Extraction in Resume Analytics Using NLP’s Spacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 Channabasamma, Yeresime Suresh, and A. Manusha Reddy Pediatric Bone Age Detection Using Capsule Network . . . . . . . . . . . . . . 405 Anant Koppar, Siddharth Kailasam, M. Varun, and Iresh Hiremath Design High-Frequency and Low-Power 2-D DWT Based on 9/7 and 5/3 Coefficient Using Complex Multiplier . . . . . . . . . . . . . . . 421 Satyendra Tripathi, Bharat Mishra, and Ashutosh Kumar Singh Fuzzy Expert System-Based Node Trust Estimation in Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435 K. Selvakumar and L. Sai Ramesh
Artificial Neural Network-Based ECG Signal Classification and the Cardiac Arrhythmia Identification . . . . . . . . . . . . . . . . . . . . . . 445 M. Ramkumar, C. Ganesh Babu, G. S. Priyanka, B. Maruthi Shankar, S. Gokul Kumar, and R. Sarath Kumar CDS-Based Routing in MANET Using Q Learning with Extended Episodic Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463 D. S. John Deva Prasanna, D. John Aravindhar, and P. Sivasankar A Graphical User Interface Based Heart Rate Monitoring Process and Detection of PQRST Peaks from ECG Signal . . . . . . . . . . . . . . . . . 481 M. Ramkumar, C. Ganesh Babu, A. Manjunathan, S. Udhayanan, M. Mathankumar, and R. Sarath Kumar Performance Analysis of Self Adaptive Equalizers Using Nature Inspired Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497 N. Shwetha and Manoj Priyatham Obstacle-Aware Radio Propagation and Environmental Model for Hybrid Vehicular Ad hoc Network . . . . . . . . . . . . . . . . . . . . . . . . . . 513 S. Shalini and Annapurna P. Patil Decision Making Among Online Product in E-Commerce Websites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529 E. Rajesh Kumar, A. Aravind, E. Jotheeswar Raghava, and K. Abhinay A Descriptive Analysis of Data Preservation Concern and Objections in IoT-Enabled E-Health Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 537 Anuj Kumar Applying Deep Learning Approach for Wheat Rust Disease Detection Using MosNet Classification Technique . . . . . . . . . . . . . . . . . . . . . . . . . 551 Mosisa Dessalegn Olana, R. Rajesh Sharma, Akey Sungheetha, and Yun Koo Chung A Decision Support Tool to Select Candidate Business Processes in Robotic Process Automation (RPA): An Empirical Study . . . . . . . . . 567 K. V. Jeeva Padmini, G. I. U. S. Perera, H. M. N. Dilum Bandara, and R. K. Omega H. Silva Feature-Wise Opinion Summarization of Consumer Reviews Using Domain Ontology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583 Dushyanthi Vidanagama, Thushari Silva, and Asoka S. Karunananda Machine Learning-Based Approach for Opinion Mining and Sentiment Polarity Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601 H. K. S. K. Hettikankanama, Shanmuganathan Vasanthapriyan, and Kapila T. Rathnayake
Early Detection of Diabetes by Iris Image Analysis . . . . . . . . . . . . . . . . 615 P. H. A. H. K. Yashodhara and D. D. M. Ranasinghe A Novel Palmprint Cancelable Scheme Based on Orthogonal IOM . . . . 633 Xiyu Wang, Hengjian Li, and Baohua Zhao Shape-Adaptive RBF Neural Network for Model-Based Nonlinear Controlling Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647 Kudabadu J. C. Kumara and M. H. M. R. S. Dilhani Electricity Load Forecasting Using Optimized Artificial Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665 M. H. M. R. Shyamali Dilhani, N. M. Wagarachchi, and Kudabadu J. C. Kumara Object Detection in Surveillance Using Deep Learning Methods: A Comparative Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 677 Dharmender Saini, Narina Thakur, Rachna Jain, Preeti Nagrath, Hemanth Jude, and Nitika Sharma MaSMT4: The AGR Organizational Model-Based Multi-Agent System Development Framework for Machine Translation . . . . . . . . . . 691 Budditha Hettige, Asoka S. Karunananda, and George Rzevski Comparative Study of Optimized and Robust Fuzzy Controllers for Real Time Process Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 703 Ajay B. Patil and R. H. Chile Ant Colony Optimization-Based Solution for Finding Trustworthy Nodes in a Mobile Ad Hoc Network . . . . . . . . . . . . . . . . . . . . . . . . . . . 719 G. M. Jinarajadasa and S. R. Liyanage Software Development for the Prototype of the Electrical Impedance Tomography Module in C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729 A. A. Katsupeev, G. K. Aleksanyan, N. I. Gorbatenko, R. K. Litvyak, and E. O. Kombarova Information Communication Enabled Technology for the Welfare of Agriculture and Farmer’s Livelihoods Ecosystem in Keonjhar District of Odisha as a Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739 Bibhu Santosh Behera, Rahul Dev Behera, Anama Charan Behera, Rudra Ashish Behera, K. S. S. Rakesh, and Prarthana Mohanty CHAIN: A Naive Approach of Data Analysis to Enhance Market Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 751 Priya Matta, Sparsh Ahuja, Vishisth Basnet, and Bhasker Pant
Behavioural Scoring Based on Social Activity and Financial Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 763 Anmol Gupta, Sanidhya Pandey, Harsh Krishna, Subham Pramanik, and P. Gouthaman An Optimized Method for Segmentation and Classification of Apple Leaf Diseases Based on Machine Learning . . . . . . . . . . . . . . . . . . . . . . . 781 Shaurya Singh Slathia, Akshat Chhajer, and P. Gouthaman A Thorough Analysis of Machine Learning and Deep Learning Methods for Crime Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 795 J. Jeyaboopathiraja and G. Maria Priscilla Improved Density-Based Learning to Cluster for User Web Log in Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 813 N. V. Kousik, M. Sivaram, N. Yuvaraj, and R. Mahaveerakannan Spatiotemporal Particle Swarm Optimization with Incremental Deep Learning-Based Salient Multiple Object Detection . . . . . . . . . . . . . . . . . 831 M. Indirani and S. Shankar Election Tweets Prediction Using Enhanced Cart and Random Forest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 851 Ambati Jahnavi, B. Dushyanth Reddy, Madhuri Kommineni, Anandakumar Haldorai, and Bhavani Vasantha Flexible Language-Agnostic Framework To Emit Informative Compile-Time Error Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 859 Malathy Nagalakshmi, Tanya Sharma, and N. S. Kumar Enhancing Multi-factor User Authentication for Electronic Payments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 869 Md Arif Hassan, Zarina Shukur, and Mohammad Kamrul Hasan Comparative Analysis of Machine Learning Algorithms for Phishing Website Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 883 Dhiman Sarma, Tanni Mittra, Rose Mary Bawm, Tawsif Sarwar, Farzana Firoz Lima, and Sohrab Hossain Toxic Comment Classification Implementing CNN Combining Word Embedding Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 897 Monirul Islam Pavel, Razia Razzak, Katha Sengupta, Md. Dilshad Kabir Niloy, Munim Bin Muqith, and Siok Yee Tan A Comprehensive Investigation About Video Synopsis Methodology and Research Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 911 Swati Jagtap and Nilkanth B. Chopade
Effective Multimodal Opinion Mining Framework Using Ensemble Learning Technique for Disease Risk Prediction . . . . . . . . . . . . . . . . . . 925 V. J. Aiswaryadevi, S. Kiruthika, G. Priyanka, N. Nataraj, and M. S. Sruthi Vertical Fragmentation of High-Dimensional Data Using Feature Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 935 Raji Ramachandran, Gopika Ravichandran, and Aswathi Raveendran Extrapolation of Futuristic Application of Robotics: A Review . . . . . . . 945 D. V. S. Pavan Karthik and S. Pranavanand AI-Based Digital Marketing Strategies—A Review . . . . . . . . . . . . . . . . 957 B. R. Arun Kumar NoRegINT—A Tool for Performing OSINT and Analysis from Social Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 971 S. Karthika, N. Bhalaji, S. Chithra, N. Sri Harikarthick, and Debadyuti Bhattacharya Correction to: Effective Multimodal Opinion Mining Framework Using Ensemble Learning Technique for Disease Risk Prediction . . . . . V. J. Aiswaryadevi, S. Kiruthika, G. Priyanka, N. Nataraj, and M. S. Sruthi Retraction Note to: Intrusion Detection Using Deep Learning . . . . . . . . Sanjay Patidar and Inderpreet Singh Bains
C1 C3
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 981
Editors and Contributors
About the Editors Dr. S. Smys received his M.E. and Ph.D. degrees all in Wireless Communication and Networking from Anna University and Karunya University, India. His main area of research activity is localization and routing architecture in wireless networks. He serves as Associate Editor of Computers and Electrical Engineering (C&EE) Journal, Elsevier, and Guest Editor of MONET Journal, Springer. He has served as Reviewer for IET, Springer, Inderscience and Elsevier journals. He has published many research articles in refereed journals and IEEE conferences. He has been General Chair, Session Chair, TPC Chair and Panelist in several conferences. He is a member of IEEE and a senior member of IACSIT wireless research group. He has been serving as Organizing Chair and Program Chair of several international conferences, and in the Program Committees of several international conferences. Currently, he is working as Professor in the Department of Information Technology at RVS Technical Campus, Coimbatore, India. Dr. Valentina Emilia Balas is currently Full Professor at “Aurel Vlaicu” University of Arad, Romania. She is the author of more than 300 research papers. Her research interests are in intelligent systems, fuzzy control and soft computing. She is Editor-in-Chief to International Journal of Advanced Intelligence Paradigms (IJAIP) and to IJCSE. Dr. Balas is a member of EUSFLAT, ACM and a SM IEEE, a member in TC – EC and TC-FS (IEEE CIS), TC – SC (IEEE SMCS), and Joint Secretary FIM. Dr. Khaled A. Kamel is currently Chairman and Professor at Texas Southern University, College of Science and Technology, Department of Computer Science, Houston, TX. He has published many research articles in refereed journals and IEEE conferences. He has more than 30 years of teaching and research experience. He has
been General Chair, Session Chair, TPC Chair and Panelist in several conferences and acted as Reviewer and Guest Editor in refereed journals. His research interest includes networks, computing and communication systems. Dr. Pavel Lafata received his M.Sc. degree in 2007 and the Ph.D. degree in 2011 from the Department of Telecommunication Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague (CTU in Prague). He is now Assistant Professor at the Department of Telecommunication Engineering of the CTU in Prague. Since 2007, he has been actively cooperating with several leading European manufacturers of telecommunication cables and optical network components performing field and laboratory testing of their products as well as consulting further research in this area. He also cooperates with many impact journals as Fellow Reviewer, such as International Journal of Electrical Power & Energy Systems, Elektronika ir Elektrotechnika, IEEE Communications Letters, Recent Patents on Electrical & Electronic Engineering, International Journal of Emerging Technologies in Computational and Applied Sciences and China Communications.
Contributors K. Abhinay Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India Surabhi Adhikari Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India Jitendra Agrawal School of Information Technology, RGPV, Bhopal, India Sparsh Ahuja Computer Science and Engineering, Graphic Era University, Dehradun, India V. J. Aiswaryadevi Dr NGP Institute of Technology, Coimbatore, India G. K. Aleksanyan Department of Informational and Measurement Systems and Technologies, Platov South-Russian State Polytechnic University (NPI), Novocherkassk, Russia A. Aravind Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India Md Arif Hassan Faculty of Information Technology, Center for Cyber Security, National University Malaysia (UKM), Bangi, Selangor, Malaysia B. R. Arun Kumar Department of Master of Computer Applications, BMS Institute of Technology and Management (Affiliated to Vivesvaraya Technological University, Belagavi), Bengaluru, Karnataka, India
B. Atul Shrinath Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India A. Ayyasamy Department of Computer Engineering, Government Polytechnic College, Nagercoil, Tamil Nadu, India Inderpreet Singh Bains Delhi Technological University, New Delhi, India K. BalaSubramanya ECE, Dayananda Sagar College of Engineering, Bangalore, India Nayan Bansal Department of Electrical Engineering, Delhi Technological University, New Delhi, India V. Baranidharan Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Vishisth Basnet Computer Science and Engineering, Graphic Era University, Dehradun, India Rose Mary Bawm Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh Anama Charan Behera Faculty Green College, Odisha, India Bibhu Santosh Behera OUAT, Bhubaneswar, Odisha, India; International Researcher, LIUTEBM University, Lusaka, Zambia Rahul Dev Behera OUAT, Bhubaneswar, Odisha, India Rudra Ashish Behera Faculty Green College, Odisha, India N. Bhalaji Department of Information Technology, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India Amit Bhati Institute of Engineering and Technology, Dr. RML Awadh University, Ayodhya, UP, India Debadyuti Bhattacharya Department of Information Technology, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India Channabasamma VNRVJIET, Hyderabad, India Akshat Chhajer Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India R. H. Chile Department of Electrical Engineering, SGGS Institute of Engineering and Technology, Vishnupuri, Nanded, M.S, India S. Chithra Department of Information Technology, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India
Nilkanth B. Chopade Department of Electronics and Telecommunication, Pimpri Chinchwad College of Engineering, Pune, India Yun Koo Chung Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia G. Dalin Associate Professor, PG and Research Department of Computer Science, Hindusthan College of Arts & Science, Coimbatore, Tamil Nadu, India Debashis De Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India M. H. M. R. S. Dilhani Department of Interdisciplinary Studies, Faculty of Engineering, University of Ruhuna, Galle, Sri Lanka H. M. N. Dilum Bandara Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka B. Dushyanth Reddy Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India Sunet Eybers University of Pretoria, Hatfield, Pretoria, South Africa Anubhav Gaba Department of Electrical Engineering, Delhi Technological University, New Delhi, India C. Ganesh Babu Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India Shivani Gautam Department of Computer Science and Applications, Chitkara University School of Computer Applications, Chitkara University, Himachal Pradesh, India Atonu Ghosh Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India S. Gokul Kumar Department of Technical Supply Chain, Ros Tech (A & D), Bengaluru, Karnataka, India N. I. Gorbatenko Department of Informational and Measurement Systems and Technologies, Platov South-Russian State Polytechnic University (NPI), Novocherkassk, Russia Avinash Gour Department of Electronics & Communication Engineering, Sri Satya Sai University of Technology & Medical Sciences (SSSUTMS), Sehore, Madhya Pradesh, India P. Gouthaman Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India
S. Graceline Jasmine School of Computer Science and Engineering, VIT University, Chennai, India Akshay Ramesh Bhai Gupta School of Information Technology, RGPV, Bhopal, India Anil Gupta Department of Computer Science and Engineering, MBM Engineering College, Jai Narain Vyas University, Jodhpur, Rajasthan, India Anmol Gupta Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India Sachin Gupta Department of Computer Science, MVNU, Palwal, India Geeta Guttikonda Department of IT, VRSEC, Vijayawada, India Anandakumar Haldorai Department of Computer Science and Engineering, Sri Eshwar College of Engineering, Coimbatore, Tamil Nadu, India Ratnala Venkata Siva Harish Department of Electronics and Communications Engineering, Au College of Engineering (Autonomous), Visakhapatnam, Andhrapradesh, India Budditha Hettige Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa, Sri Lanka H. K. S. K. Hettikankanama Department of Computing and Information Systems, Faculty of Applied Sciences, Sabaragamuwa University of Sri Lanka, Balangoda, Sri Lanka Iresh Hiremath Computer Science Engineering Department, Engineering Department, PES University, Bengaluru, Karnataka, India Vinh Truong Hoang Ho Chi Minh City Open University, Ho Chi Minh City, Vietnam Sohrab Hossain Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh M. Indirani Assistant Professor, Department of IT, Hindusthan College of Engineering and Technology, Coimbatore, India Manoj S. Ishi Department of Computer Engineering, R. C. Patel Institute of Technology, Shirpur, MS, India Swati Jagtap Department of Electronics and Telecommunication, Pimpri Chinchwad College of Engineering, Pune, India Ambati Jahnavi Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
Rachna Jain Department of Computer Science Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India L. Jani Anbarasi School of Computer Science and Engineering, VIT University, Chennai, India Malathy Jawahar Leather Process Technology Division, CSIR-Central Leather Research Institute, Adyar, Chennai, India K. V. Jeeva Padmini Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka J. Jeyaboopathiraja Research Scholar, Department of Computer Science, Sri Ramakrishna College of Arts and Science, Coimbatore, India Aayush Jha Department of Civil Engineering, Delhi Technological University, New Delhi, India Avinash Kumar Jha Department of Civil Engineering, Delhi Technological University, New Delhi, India G. M. Jinarajadasa University of Kelaniya, Kelaniya, Sri Lanka D. John Aravindhar CSE, HITS, Chennai, India D. S. John Deva Prasanna CSE, HITS, Chennai, India Abhijit R. Joshi Department of Information Technology, D.J. Sanghvi College of Engineering, Mumbai, India Shashank Karthik D. Joshi ECE, Dayananda Sagar College of Engineering, Bangalore, India E. Jotheeswar Raghava Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India Hemanth Jude Department of ECE, Karunya University, Coimbatore, Tamil Nadu, India Siddharth Kailasam Computer Science Engineering Department, Engineering Department, PES University, Bengaluru, Karnataka, India Mohammad Kamrul Hasan Faculty of Information Technology, Center for Cyber Security, National University Malaysia (UKM), Bangi, Selangor, Malaysia S. Karthik Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, India V. Karthik Leather Process Technology Division, CSIR-Central Leather Research Institute, Adyar, Chennai, India
S. Karthika Department of Information Technology, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India M. M. Karthikeyan Ph.D Research Scholar, PG and Research Department of Computer Science, Hindusthan College of Arts & Science, Coimbatore, Tamil Nadu, India Asoka S. Karunananda Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa, Sri Lanka Madhavi Katamaneni Department of IT, VRSEC, Vijayawada, India A. A. Katsupeev Department of Informational and Measurement Systems and Technologies, Platov South-Russian State Polytechnic University (NPI), Novocherkassk, Russia Vamsi Krishna Kikkuri Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India S. Kiruthika Sri Krishna College of Technology, Coimbatore, India E. O. Kombarova Department of Informational and Measurement Systems and Technologies, Platov South-Russian State Polytechnic University (NPI), Novocherkassk, Russia Madhuri Kommineni Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India Anant Koppar Computer Science Engineering Department, Department, PES University, Bengaluru, Karnataka, India
Engineering
N. V. Kousik Galgotias University, Greater Noida, Uttarpradesh, India N. Krishanth Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Harsh Krishna Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India Anuj Kumar Department of Computer Engineering and Applications, GLA University, Mathura, India Kudabadu J. C. Kumara Department of Mechanical and Manufacturing Engineering, Faculty of Engineering, University of Ruhuna, Hapugala, Galle, Sri Lanka N. S. Kumar Department of Computer Science, PES University, Bengaluru, India Hengjian Li School of Information Science and Engineering, University of Jinan, Jinan, China
Farzana Firoz Lima Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh N. Lingaraj Department of Mechanical Engineering, Rajalakshmi Institute of Technology, Chennai, India R. K. Litvyak Department of Informational and Measurement Systems and Technologies, Platov South-Russian State Polytechnic University (NPI), Novocherkassk, Russia S. R. Liyanage University of Kelaniya, Kelaniya, Sri Lanka R. Mahaveerakannan Department of Information Technology, Hindusthan College of Engineering and Technology, Otthakkalmandapam, Coimbatore, India Vikas Maheshwari Department of Electronics & Communication Engineering, Bharat Institute of Engineering & Technology, Hyderabad, India Koushik Majumder Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India Somashekhar Malipatil Department of Electronics & Communication Engineering, Sri Satya Sai University of Technology & Medical Sciences (SSSUTMS), Sehore, Madhya Pradesh, India L. Manickavasagam Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India A. Manjunathan Department of Electronics and Communication Engineering, K. Ramakrishnan College of Technology, Trichy, India A. Manusha Reddy VNRVJIET, Hyderabad, India Amit Marathe Department of Electronics and Telecommunication, Xavier Institute of Engineering, Mumbai, India G. Maria Priscilla Professor and Head, Department of Computer Science, Sri Ramakrishna College of Arts and Science, Coimbatore, India B. Maruthi Shankar Department of Electronics and Communication Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India M. Mathankumar Department of Electrical and Electronics Engineering, Kumaraguru College of Technology, Coimbatore, India Priya Matta Computer Science and Engineering, Graphic Era University, Dehradun, India Gaurav Mehta Department of Computer Science and Engineering, Chitkara University Institute of Engineering and Technology, Chitkara University, Himachal Pradesh, India
Bharat Mishra MGCGV, Chitrakoot, India Tanni Mittra Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh S. R. Mohanrajan Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Prarthana Mohanty OUAT, Bhubaneswar, Odisha, India Munim Bin Muqith Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh Malathy Nagalakshmi Department of Computer Science, PES University, Bengaluru, India Preeti Nagrath Department of Computer Science Vidyapeeth’s College of Engineering, New Delhi, India
Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India
Jayashree Nair Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India Modigari Narendra Department of Computer Science and Engineering, VFSTR Deemed to be University, Guntur, India N. Nataraj Bannari Amman Institute of Technology, Sathyamangalam, India K. M. Naveen Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Nandini Nayar Department of Computer Science and Engineering, Chitkara University Institute of Engineering and Technology, Chitkara University, Himachal Pradesh, India Md. Dilshad Kabir Niloy Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh K. P. Nithish Sriman Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Mosisa Dessalegn Olana Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia Madhavi Latha Pandala Department of IT, VRSEC, Vijayawada, India Sanidhya Pandey Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India Bhasker Pant Computer Science and Engineering, Graphic Era University, Dehradun, India
Suraiya Parveen Department of Computer Science, School of Engineering Science and Technology, Jamia Hamdard, New Delhi, India Sanjay Patidar Delhi Technological University, New Delhi, India Ajay B. Patil Department of Electrical Engineering, SGGS Institute of Engineering and Technology, Vishnupuri, Nanded, M.S, India Annapurna P. Patil RIT Bangalore, Bangalore, India J. B. Patil Department of Computer Engineering, R. C. Patel Institute of Technology, Shirpur, MS, India D. V. S. Pavan Karthik Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering and Technology, Secunderabad, Telangana, India Monirul Islam Pavel Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, The National University of Malaysia, 43600 Bangi, Selangor, Malaysia G. I. U. S. Perera Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka Subham Pramanik Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India S. Pranavanand Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering and Technology, Secunderabad, Telangana, India R. Prathap Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India N. Praveen Kumar Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India B. Premjith Center for Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India G. Priyanka Sri Krishna College of Technology, Coimbatore, India G. S. Priyanka Department of Electronics and Communication Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India Manoj Priyatham Department of ECE, APS College of Engineering, Bangalore, Karnataka, India M. Punithavalli Department of Computer Applications, School of Computer Science and Engineering, Bharathiar University, Coimbatore, India E. Rajesh Kumar Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India
P. Rajesh Kumar Department of Electronics and Communications Engineering, Au College of Engineering (Autonomous), Visakhapatnam, Andhrapradesh, India R. Rajesh Sharma Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia K. S. S. Rakesh LIUTEBM University, Lusaka, Zambia Shrawan Ram Department of Computer Science and Engineering, MBM Engineering College, Jai Narain Vyas University, Jodhpur, Rajasthan, India Raji Ramachandran Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India M. Ramkumar Department of Electronics and Communication Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India D. D. M. Ranasinghe Department of Electrical & Computer Engineering, The Open University of Sri Lanka, Nawala, Nugegoda, Sri Lanka R. Ranjith Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Kapila T. Rathnayake Department of Physical Sciences and Technology, Faculty of Applied Sciences, Sabaragamuwa University of Sri Lanka, Balangoda, Sri Lanka Aswathi Raveendran Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India P. Ravi Kiran Varma MVGR College of Engineering, Vizianagaram, AP, India Gopika Ravichandran Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India Razia Razzak Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh Ali Abbas Rizvi Department of Information Technology, D.J. Sanghvi College of Engineering, Mumbai, India Suresh Ruthala MVGR College of Engineering, Vizianagaram, AP, India George Rzevski Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa, Sri Lanka S. Sabena Department of Computer Science Engineering, Anna University Regional Campus, Tirunelveli, Tamil Nadu, India Ritu Sachdeva Department of Computer Science, MVNU, Palwal, India T. Sai Aparna Center for Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
L. Sai Ramesh Department of Information Science and Technology, Anna University, Chennai, Tamil Nadu, India Dharmender Saini Department of Computer Science Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India M. D. Saranya Department of Electronics and Communication Engineering, KPR Institute of Engineering and Technology, Coimbatore, India R. Sarath Kumar Department of Electronics and Communication Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India Dhiman Sarma Department of Computer Science and Engineering, Rangamati Science and Technology University, Rangamati, Bangladesh Tawsif Sarwar Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh S. Satheesh Kumar Department of Electronics and Communication Engineering, KPR Institute of Engineering and Technology, Coimbatore, India K. Selvakumar Department of Computer Applications, NIT, Trichy, India Katha Sengupta Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh Naomi Setsabi University of Pretoria, Hatfield, Pretoria, South Africa S. Shalini RIT Bangalore, Bangalore, India S. Shankar Professor, Department of CSE, Hindusthan College of Engineering and Technology, Coimbatore, India M. Shanthini PSG Institute of Technology and Applied Research, Coimbatore, India Nitika Sharma Department of Computer Science Vidyapeeth’s College of Engineering, New Delhi, India
Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India
Tanya Sharma Department of Computer Science, PES University, Bengaluru, India S. Shrinivas ECE, Dayananda Sagar College of Engineering, Bangalore, India Zarina Shukur Faculty of Information Technology, Center for Cyber Security, National University Malaysia (UKM), Bangi, Selangor, Malaysia N. Shwetha Department of ECE, Dr. Ambedkar Institute of Technology, Bangalore, Karnataka, India M. H. M. R. Shyamali Dilhani Department of Interdisciplinary Studies, University of Ruhuna, Hapugala, Galle, Sri Lanka
R. K. Omega H. Silva Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka Thushari Silva Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa, Sri Lanka
K. Simran Center for Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Ashutosh Kumar Singh IIT-A, Prayagraj, India Bhavesh Singh Department of Information Technology, D.J. Sanghvi College of Engineering, Mumbai, India Poonam Singh Department of Computer Science and Applications, Chitkara University School of Computer Applications, Chitkara University, Himachal Pradesh, India M. Sivaram Research Center, Lebanese French University, Erbil, Iraq P. Sivasankar NITTTR, Chennai, India Shaurya Singh Slathia Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India K. P. Soman Center for Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India G. Sophia Reena Department of Information Technology, PSGR Krishnammal College for Women, Coimbatore, India N. Sri Harikarthick Department of Information Technology, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India M. S. Sruthi Sri Krishna College of Technology, Coimbatore, India G. Subash Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India K. V. Subba Raju MVGR College of Engineering, Vizianagaram, AP, India J. S. Sujin Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, India Akey Sungheetha Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia Yeresime Suresh BIT&M, Ballari, India C. Sureshkumar Faculty of Information and Communication Engineering, Anna University, Chennai, Tamil Nadu, India
Srikar Talagani Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India Siok Yee Tan Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, The National University of Malaysia, 43600 Bangi, Selangor, Malaysia Narina Thakur Department of Computer Science Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India Surendrabikram Thapa Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India S. Thenmozhi ECE, Dayananda Sagar College of Engineering, Bangalore, India Yashwanth Thota Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India Kiet Tran-Trung Ho Chi Minh City Open University, Ho Chi Minh City, Vietnam Satyendra Tripathi MGCGV, Chitrakoot, India S. Udhayanan Department of Electronics and Communication Engineering, Sri Bharathi Engineering College for Women, Pudukkottai, India J. Uma Department of Information Technology, Hindusthan College of Engineering and Technology, Otthakkalmandapam, Coimbatore, India M. Varun Computer Science Engineering Department, Engineering Department, PES University, Bengaluru, Karnataka, India Bhavani Vasantha Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India Shanmuganathan Vasanthapriyan Department of Computing and Information Systems, Faculty of Applied Sciences, Sabaragamuwa University of Sri Lanka, Balangoda, Sri Lanka Vasundhara Department of Computer Science, School of Engineering Science and Technology, Jamia Hamdard, New Delhi, India Pavan Vemuri Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India R. Venba Leather Process Technology Division, CSIR-Central Leather Research Institute, Adyar, Chennai, India Dushyanthi Vidanagama Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa, Sri Lanka G. Vidya PSG Institute of Technology and Applied Research, Coimbatore, India
B. Vikas ECE, Dayananda Sagar College of Engineering, Bangalore, India P. Vivekanandan Department of Computer Science and Engineering, Park College of Engineering and Technology, Kaniyur, Coimbatore, India N. M. Wagarachchi Department of Interdisciplinary Studies, University of Ruhuna, Hapugala, Galle, Sri Lanka Xiyu Wang School of Information Science and Engineering, University of Jinan, Jinan, China P. H. A. H. K. Yashodhara Department of Electrical & Computer Engineering, The Open University of Sri Lanka, Nawala, Nugegoda, Sri Lanka N. Yuvaraj ICT Academy, Chennai, Tamilnadu, India Baohua Zhao School of Information Science and Engineering, University of Jinan, Jinan, China
A Heuristic Algorithm for Deadline-Based Resource Allocation in Cloud Using Modified Fish Swarm Algorithm J. Uma, P. Vivekanandan, and R. Mahaveerakannan
Abstract Virtualization plays an indispensable role in improving the efficacy and agility of cloud computing. It involves assigning resources to cloud application users based on their requirements, where most of the resources are virtual in nature. These resources are utilized by the users for executing tasks over a certain time period. Virtualization assists in the effective usage of hardware resources. Depending on the application, users may require a definite amount of resources to be utilized within a definite time period. Thus, a deadline, that is, a start time and an end time for every resource, needs to be considered. The deadline specifically relates to the time limit for the execution of tasks in the workflow. In this paper, the resource allocation is optimized based on deadline as the optimization parameter using a modified fish swarm algorithm (FSA). Keywords Cloud computing · Virtualization · Deadline and fish swarm algorithm (FSA)
J. Uma · R. Mahaveerakannan (B)
Department of Information Technology, Hindusthan College of Engineering and Technology, Otthakkalmandapam, Coimbatore 641032, India
e-mail: [email protected]
J. Uma
e-mail: [email protected]
P. Vivekanandan
Department of Computer Science and Engineering, Park College of Engineering and Technology, Kaniyur, Coimbatore 641659, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_1

1 Introduction

Cloud computing [1] is a resource-constrained environment in which the allocation of resources plays a major role. The requirement of these virtual resources is defined by means of certain parameters that specify resources like applications, services, CPU,
processors, I/O, networks, storage and servers. It is imperative that these resources are used effectively in the cloud environment. With varying resource availability and workloads, maintaining the quality of service (QoS) while simultaneously sustaining effective resource usage and system performance is a critical task. This gives rise to a tension between the cloud resource provider and the user over maximizing effective resource usage. Resource allotment is therefore a basic tenet of cloud computing [2]. Some of the key issues in cloud computing have been resolved using metaheuristic algorithms, which hold a strong position in this research field owing to their efficacy and efficiency. Although resource allocation in cloud computing has garnered considerable attention from the global research community, several recent studies have drawn attention to the progress made in this area. The objective of resource allocation [3] is to find an optimal and feasible allocation scheme for a certain service. Effective resource assignment schemes that efficiently utilize the constrained resources in the cloud environment have been classified. Resource assignment and its concentration in distributed clouds are the chief issues among the challenges faced in the cloud paradigm. The issues extend to resource discovery, availability, selection of the appropriate resource, treating and offering the resource, monitoring the resource, etc. Despite the various open issues, the distributed cloud is promising for use across different contexts. Provisioning of resources is done by means of virtual machine (VM) technology [4]. A virtual environment has the potential to decrease the mean job response time and to execute tasks according to resource availability. The VMs are assigned to users based on the nature of the job to be executed. A production environment involves several tasks being submitted to the cloud, so the job scheduler software should provide interfaces for defining workflows and/or job dependencies so that submitted tasks are executed automatically. All of the VM images needed for running the user-related tasks are preconfigured by the cloud broker and stored in the cloud. All incoming jobs are placed in a queue. These jobs and a pool of machines are managed by a system-level scheduler running on a particular system, which also decides whether new VMs have to be provisioned from the clouds and/or jobs are assigned to the VMs. This scheduler runs periodically and performs five tasks at every instant: (1) forecasting the possible workloads, (2) provisioning the required VMs beforehand from the cloud, (3) assigning tasks to VMs, (4) releasing a VM if its billing time unit (BTU) is close to expiring and (5) starting the required number of VMs when many unassigned jobs are present. Cloud computing is based on the progress in virtualization and distributed computing for supporting the cost-effective utilization of computing resources, with an emphasis on the scalability of resources as well as services on demand. Based on the business requirements, the resources can be scaled up or down in the cloud paradigm. On-demand resource allocation raises issues because the needs of the customers must be factored in. Thus, based on the features of the task that requires the resources, the VMs are assigned accordingly.
The execution of jobs of higher priority must not be delayed by low-priority jobs. Such a situation can cause resource access
contention between low-priority and high-priority jobs. The chief input to our allocation is the information contained in the resource request tuple. Nonetheless, the benefit of DCloud's resource allocation algorithm for effectively using the cloud resources is severely undercut if a selfish user keeps declaring short deadlines, since this adversely affects the balance between VM and bandwidth allocation [5]. A job-based, strategy-proof charging mechanism has been formulated for DCloud, which induces users to declare their deadlines honestly so that their costs are minimized. Several meta-heuristic algorithms [6] are in use, and new variations are often proposed for resource assignment in several fields. Meta-heuristic algorithms that are popular in the cloud computing arena include the firefly algorithm (FA), league championship algorithm (LCA), immune algorithm (IA), harmony search (HS), cuckoo search (CS), differential evolution (DE), memetic algorithm (MA) and ant colony optimization (ACO), among others [7]. The artificial FSA (AFSA) has several benefits, including its global search ability, strength and intense robustness, as well as rapid convergence and better precision in global search. For enabling flexible as well as effective usage of resources in the data centres, DCloud leverages the deadlines of cloud computing jobs. In this work, an FSA-based optimization algorithm is proposed for minimizing the overall workflow execution cost while meeting deadline constraints. Section 2 briefly reviews the literature related to this work, Sect. 3 presents the techniques used in the methodology, Sect. 4 explains the results and discussion, and Sect. 5 concludes the work.
2 Literature Survey On the basis of evaluating job traits, Saraswathi et al. [2] focussed on the assignment of VM resources to the user. The objective is that jobs of low importance (whose deadlines are long) should not affect the execution of highly important jobs (whose deadlines are short). The VM resources have to be allocated dynamically to a user job within the available deadline. A resource and deadline-aware Hadoop job scheduler (RDS) has been suggested by Cheng et al. [8]. It takes into consideration the future availability of resources while simultaneously decreasing misses of job deadlines. The job scheduling issue is formulated as an online optimization problem and solved using an efficient receding horizon control algorithm. A self-learning prototype has been designed for estimating job completion times to aid the control, and a simple yet effective model is used for predicting the availability of resources. RDS has been implemented on the open-source Hadoop implementation and analysed under varying benchmark workloads. Experimental outcomes have shown that using
RDS decreases the penalty of missing deadlines by at least 36% and 10% when compared with the fair scheduler and the EDF scheduler, respectively. Cloud infrastructure permits active users to demand cloud services simultaneously. Thus, effective provisioning of resources for fulfilling the user requirements is becoming imperative; when resources are used effectively, they cost less. In the virtualization scheme, the VMs are the resources to which incoming user requests/tasks are mapped prior to executing the tasks on the physical machines. An analysis of greedy approach algorithms for effectively mapping tasks to virtual machines and decreasing VM usage costs has been presented by Kumar and Mandal [9]. The allocation of resources is crucial in several computational areas like operating systems and data centre management. As per Mohammad et al. [10], resource allocation in cloud-based systems involves assuring the users that their computing requirements are fully and appropriately satisfied by the cloud server set-up. The efficient utilization of resources is paramount for servers that provide cloud services, so that maximum profit is generated. This makes resource allocation and task scheduling the primary challenges in cloud computing. A review of the AFSA algorithm has been presented by Neshat et al. [11], describing the evolution of the algorithm, its improvements, its combinations with several methods and its applications. Several optimization schemes can be used in combination with the suggested scheme, which may improve the performance of the technique. There are, however, some drawbacks, including high time complexity, the absence of balance between local search and global search, and the lack of use of the experiences of the group members for forecasting movements. The proposed deadline-aware two-stage scheduling schedules the VMs for the jobs submitted by users. Every job is specified to need two types of VMs in sequence for completing its respective tasks, as per Raju et al. [12]. This prototype takes into consideration the deadlines with regard to the response time and the waiting time, and it allocates the VMs as resources to the jobs that require them, based on the processing time and the job scheduling. The prototype has been evaluated in a simulation environment by analysing several metrics like deadline violations, mean turnaround time and mean waiting time. It has been contrasted with the first come first serve (FCFS) and shortest job first (SJF) scheduling strategies; in comparison with these schemes, the suggested prototype has been shown to decrease the evaluation metrics by a constant factor. From the CSP's perspective, the issue of global optimization of the cloud system has been addressed by Gao et al. [13]. It considers lowering the operating expenses by maximizing energy efficiency while simultaneously fulfilling the user-defined deadlines in the service-level agreements. For the workload to be modelled, viable approaches should be considered for optimizing cloud operation. Two models are currently available: batch requests and task graphs with dependencies; the latter has been adopted. This micro-managed approach to workloads allows the optimization of energy as well as performance. Thus, the CSP can meet the user deadlines at lower operational expenses. Yet, some added efforts
are required by these optimizations with regard to resource provisioning, placement of VMs and scheduling of tasks. The suggested framework addresses these issues holistically. It has been conveyed by Rodriguez and Buyya [14] that existing schemes either cannot fulfil the QoS requirements or fail to include the elastic and heterogeneous requirements of computing services in cloud environments. Their paper suggests a strategy for resource provisioning and scheduling for scientific workflows on infrastructure-as-a-service clouds. For minimizing the overall workflow execution expense while fulfilling the deadline constraint, an algorithm based on particle swarm optimization (PSO) has been suggested. CloudSim and several popular scientific workflows of variable sizes have been used for evaluating the heuristics, and the outcomes suggest that the scheme performs better than some of the state-of-the-art schemes. An auto-adaptive, deadline-aware resource control framework has been suggested by Xiang et al. [15]. This framework can be executed in a totally distributed fashion, which makes it suited to unreliable environments where a single point of failure is unacceptable. The concept is based on Nash bargaining in non-cooperative game theory; based on this concept, the framework assigns cloud resources optimally to maximize the Nash bargaining solutions (NBS) with regard to the priority of a job as well as its deadline for completion. It additionally allows resource allocation to be auto-adaptive and deadline-aware, rebalancing when exposed to cyber or physical threats that may compromise the ability of cloud systems. Experiments on the Hadoop framework have validated the suggested scheme.
3 Methodology Most work in the literature focusses on job completion time or job deadline along with bandwidth constraints and VM capabilities. The challenge, however, is to map the deadline against the job completion time so that deadlines are met with minimum cost and job completion time. A novel allocation algorithm that benefits from the added information in the resource request has been formulated. It is based on two schemes: time sliding and bandwidth scaling. In time sliding, a delay between job/task submission and execution is permitted; this smooths the peak demand on the cloud and decreases the number of excluded users at busy intervals. In bandwidth scaling, dynamic adaptation of the bandwidth assigned to the VMs is allowed. Deadline, greedy-deadline and FSA-based deadline schemes are detailed in this section.
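As an illustration only (not the authors' implementation), the following Python sketch shows how a broker might apply the two schemes just described: a request is first admitted by sliding its start time within the deadline slack (time sliding) and, if capacity is still short, by scaling down the bandwidth assigned to its VM (bandwidth scaling). The request fields, the capacity interface and the assumption that halving bandwidth doubles duration are placeholders chosen for the example.

def admit(request, cloud, max_slide, min_bw_fraction=0.5):
    """Try to place a request using time sliding, then bandwidth scaling.

    request: dict with 'submit_time', 'duration', 'deadline', 'bandwidth'
    cloud:   object exposing free_bandwidth(start, end)
    Returns (start_time, bandwidth) or None if the request must be rejected.
    """
    # Time sliding: delay the start as long as the job still meets its deadline.
    latest_start = request['deadline'] - request['duration']
    for start in range(request['submit_time'],
                       min(latest_start, request['submit_time'] + max_slide) + 1):
        if cloud.free_bandwidth(start, start + request['duration']) >= request['bandwidth']:
            return start, request['bandwidth']

    # Bandwidth scaling: shrink the VM bandwidth (stretching the run time
    # proportionally, an assumption of this sketch) while the deadline still holds.
    scaled_bw = request['bandwidth'] * min_bw_fraction
    scaled_duration = request['duration'] / min_bw_fraction
    if (request['submit_time'] + scaled_duration <= request['deadline'] and
            cloud.free_bandwidth(request['submit_time'],
                                 request['submit_time'] + scaled_duration) >= scaled_bw):
        return request['submit_time'], scaled_bw
    return None  # reject: no congestion-free placement found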
3.1 Deadline In cloud computing, for fulfilling a user’s task, a job needs cloud resources. One of the available models for resource scheduling is deadline-aware two-stage scheduling
model. The cloud resources are present as virtual machines. After scheduling the given n job requests, the scheduler allocates the needed cloud resources/VMs to every job that requests them [12]. In deadline-aware two-stage scheduling, the scheduler, on receiving the n jobs from different users, allocates the VMs as resources by means of job scheduling. In this prototype, a job needs several VMs of various types sequentially for task completion. The overall workflow deadline is distributed across the tasks: a part of the deadline is allocated to every task based on the VM that is the most cost-effective for that particular task.
3.2 Greedy-Deadline Owing to their greedy approach, resource allocation algorithms (RAAs) are extremely well suited to dynamic and heterogeneous cloud resource environments. These are linked to a process scheduler by means of cloud communication [9]. For the task scheduling problem, the greedy approach is effective for optimizing profit. The greedy-deadline resource allocation algorithm [16] can be explained as follows:
1. The virtual machine input is taken as the input.
2. Every resource in the resource cache is checked to see whether it is in the suspended or waking state. If yes, the remaining capacity of the resource is found and checked.
3. The remaining capacity of the resource is also found if it is in the sleeping state.
4. The function is processed to obtain the resource from the cache.
The priorities of the incoming tasks are evaluated, and the newly allocated priority is compared with the previously allocated ones. The tasks are then placed into the previously formulated priority queues. After allocation, tasks in the high-priority queues are selected and executed, followed by the transfer of tasks from the medium-priority queues to the high-priority queues. The remaining tasks in the queues are executed until the queues are exhausted. A sketch of this procedure is given below.
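The Python sketch below is a minimal illustration of the greedy-deadline idea under simplified assumptions of our own (a resource cache with per-resource remaining capacity and deadline-ordered priorities); it is not the exact algorithm of [16].

import heapq

def greedy_deadline_allocate(tasks, resources):
    """Greedily map tasks to cached resources, tightest deadline first.

    tasks:     list of dicts with 'id', 'deadline', 'demand'
    resources: list of dicts with 'id', 'state', 'capacity'
               (state in {'waking', 'suspended', 'sleeping'})
    Returns a list of (task_id, resource_id) assignments.
    """
    # Priority queue: tasks with tighter deadlines are treated as high priority.
    queue = [(t['deadline'], t['id'], t) for t in tasks]
    heapq.heapify(queue)

    assignments = []
    while queue:
        _, _, task = heapq.heappop(queue)
        # Check every resource in the cache; any state may still hold capacity.
        candidates = [r for r in resources if r['capacity'] >= task['demand']]
        if not candidates:
            continue  # task cannot be served now; it stays unscheduled
        # Greedy choice: the resource whose remaining capacity fits most tightly.
        best = min(candidates, key=lambda r: r['capacity'] - task['demand'])
        best['capacity'] -= task['demand']
        assignments.append((task['id'], best['id']))
    return assignments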
3.3 Fish Swarm Algorithm (FSA) The AFSA is another population-based optimizer. The process begins with a randomly generated set of probable solutions, and an interactive search is performed for obtaining an optimum solution. An artificial fish (AF) [17] refers to the fictitious entity used for analysing and explaining the problem, and it may be understood through the concept of animal ecology. An object-oriented analytical scheme has been employed, considering the artificial fish as an object enclosing its own data
as well as a series of behaviours. These fish grasp an extraordinary amount of data regarding their surroundings through the use of their senses, and they control their tail and fins in response to stimuli. The solution space constitutes the environment in which the AF resides, along with the states of the other fish. Its present state and the state of its environment (including the current food concentration as well as the states of its companions) determine its next behaviour. Its environment is influenced not only by its own activities but also by the activities of its companions [18]. The external perception of the AF is realized by means of its vision. Let X denote the present state of the AF, let the visual distance be denoted by 'Visual' and let the visual position at a certain moment be denoted by Xv. If the visual position is better than the current one, the fish advances by one step in that direction and arrives at the subsequent state, denoted by Xnext; otherwise, it goes on a tour of inspection. The more inspection tours the artificial fish goes on, the greater the amount of knowledge it gains about its environment. It does not need to move through complex or infinite states; this aids in finding the global minimum by tolerating certain local optima along with some uncertainty. Let X = (x1, x2, ..., xn) and Xv = (x1v, x2v, ..., xnv); then this process can be expressed as in Eqs. (1) and (2):

xiv = xi + Visual · rand(), i ∈ (0, n)    (1)

Xnext = X + ((Xv − X) / ||Xv − X||) · Step · rand()    (2)
where rand() is a random number between 0 and 1, Step is the step length, xi is the optimization variable and n is the number of variables. There are two components included in the AF model: variables and functions. The variables are as follows: the existing position of the AF is denoted by X, the moving step length is denoted by 'Step', the visual distance is denoted by 'Visual', the try number is given by try_number, and the crowd factor, whose value is between 0 and 1, is given by δ. The functions comprise the behaviours of the AF: preying, swarming, following, moving, leaping and evaluating. The flow chart for artificial fish swarm optimization is shown in Fig. 1.
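To make Eqs. (1) and (2) concrete, the short Python sketch below implements one visual inspection and one step of an artificial fish; the objective function and dimensionality are placeholders chosen only for illustration, and minimization is assumed.

import random
import math

def af_move(X, objective, visual=2.5, step=0.1):
    """One AFSA move: look at a random visual position (Eq. 1) and, if it is
    better, advance one step towards it (Eq. 2); otherwise stay put."""
    # Eq. (1); some AFSA variants use a symmetric random factor in [-1, 1]
    Xv = [x + visual * random.random() for x in X]
    if objective(Xv) < objective(X):                           # better position seen
        dist = math.sqrt(sum((v - x) ** 2 for v, x in zip(Xv, X))) or 1e-12
        return [x + (v - x) / dist * step * random.random()    # Eq. (2)
                for v, x in zip(Xv, X)]
    return X  # no improvement: the caller may retry (a tour of inspection)

# Example: one move of a fish on a simple sphere function
print(af_move([1.0, -2.0, 0.5], lambda X: sum(x * x for x in X)))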
3.4 Proposed Fish Swarm Algorithm (FSA)-Deadline The FSA is easily trapped in local optimal solutions, so an improved FSA that avoids local optima by using an appropriate cost function for evaluating the solutions is proposed. AFSA also finds increasing usage in complex optimization fields; it offers an alternative to the well-known evolutionary computing methods and can be applied across domains. The service is considered to be a supererogatory one that a cloud service provider
Fig. 1 Artificial fish swarm algorithm (AFSA) flow chart: initialize all fish; apply the following and swarming behaviours, implementing the better of the two on every fish; repeat until the termination criterion is met; output the final solution
offers, as potential tenants may deliberately make use of the room to submit a job run time lower than the actual profiled outcome (anticipating that the cloud service provider will protract it by relaxing the deadline). Practically, however, there may be job requests for which the profiling error is greater than the profiling relaxation index provided by the cloud service provider. Two schemes are employed by the cloud provider to deal with such tasks. The first approach is to kill the jobs at the presumed end time. In the second approach, the cloud provider uses a small part of the cloud resources specifically for servicing those jobs: the virtual machines associated with those jobs are moved at once to the specific servers at their expected end times, and thereafter they are run on a best-effort basis. The algorithmic procedure of the implementation is described below: (1) Fish are positioned at random on a task node; that is, each fish represents a solution towards meeting the objectives of deadline and job completion time. (2) Fish choose a path to a resource node with a certain probability, determining whether the limits of the optimization model are met. If they are met, the node is
included in the list of solutions by the fish; else, the fish goes on to search for another node. If Xi is the current state of a fish, a state Xj is chosen randomly within visual distance, and Y = f(X) is the food concentration of the fish:

Xj = Xi + af_visual · rand()    (3)
If Yi < Yj, then the fish moves forward a step in the direction of the vector sum of Xj and Xbest_af, where Xbest_af is the best fish available:

Xi(t+1) = Xi(t) + [ (Xj − Xi(t)) / ||Xj − Xi(t)|| + (Xbest_af − Xi(t)) / ||Xbest_af − Xi(t)|| ] · af_step · rand()    (4)
Else, state Xj is chosen randomly again and checked for compliance with the forward requirement. If the forward requirement is not satisfied, then the fish moves a step randomly; this helps to avoid local minima:

Xi(t+1) = Xi(t) + af_visual · rand()    (5)
(3) Fish move arbitrarily towards the next task node for the assignment of their next task. (4) Assigning all the tasks is regarded as an iterative procedure; the algorithm terminates when the number of iterations reaches its maximum. A sketch of this loop is given below.
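The following Python sketch illustrates the iterative FSA-deadline loop of steps (1)-(4) under simplified assumptions of our own (a fish encodes a task-to-VM assignment, and the cost function is assumed to add a penalty for every missed deadline); it is not the authors' exact implementation.

import random

def fsa_deadline(tasks, vms, cost, pop=30, max_gen=50, visual=2.5, try_number=5):
    """Each fish encodes a task-to-VM assignment as a vector of VM indices.
    cost(assignment) should return execution cost plus a deadline-miss penalty
    (both assumptions of this sketch)."""
    def random_fish():
        return [random.randrange(len(vms)) for _ in tasks]

    def neighbour(fish):
        # 'Visual' neighbourhood: re-assign a few random tasks to other VMs.
        new = fish[:]
        for _ in range(max(1, int(visual))):
            new[random.randrange(len(tasks))] = random.randrange(len(vms))
        return new

    school = [random_fish() for _ in range(pop)]
    best = min(school, key=cost)
    for _ in range(max_gen):
        for i, fish in enumerate(school):
            for _ in range(try_number):          # preying: look for a better node
                cand = neighbour(fish)
                if cost(cand) < cost(fish):
                    fish = cand
                    break
            else:
                fish = neighbour(fish)           # random step to escape local optima
            school[i] = fish
            if cost(fish) < cost(best):
                best = fish
    return best  # best task-to-VM assignment found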
4 Results and Discussion Table 1 displays the parameters of the FSA. Tables 2, 3 and 4 and Figs. 2, 3 and 4 show the makespan, VM utilization and percentage of successful job completion, respectively, for deadline, greedy-deadline and FSA-deadline. It is seen from Table 2 and Fig. 2 that the makespan for FSA-deadline performs better by 8.7% and by 10.9% than deadline and greedy-deadline, respectively, for 200 jobs.

Table 1 Parameters of FSA
Parameter        Value
Population       30
Max generation   50
Visual           2.5
Try number       5
Step             0.1
Crowd            0.618
Table 2 Makespan (in seconds) for FSA-deadline
Number of jobs   Deadline   Greedy-deadline   FSA-deadline
200              44         43                48
400              94         92                102
600              152        149               164
800              201        197               218
1000             250        246               271

Table 3 VM utilization (in percentage) for FSA-deadline
Number of jobs   Deadline   Greedy-deadline   FSA-deadline
200              76         78                79
400              77         79                78
600              79         81                83
800              80         82                83
1000             76         79                79

Table 4 Percentage of successful job completion for FSA-deadline
Number of jobs   Deadline   Greedy-deadline   FSA-deadline
200              79.9       82.3              83.5
400              81.4       83.4              81.9
600              83.7       85.3              87.6
800              84.7       86.4              87.4
1000             80.1       83.1              83.6
Fig. 2 Makespan (in sec) for FSA-deadline (makespan vs. number of jobs for deadline, greedy-deadline and FSA-deadline)
Fig. 3 VM utilization (in percentage) for FSA-deadline (VM utilization vs. number of jobs for deadline, greedy-deadline and FSA-deadline)
Fig. 4 Percentage of successful job completion for FSA-deadline (successful job completion vs. number of jobs for deadline, greedy-deadline and FSA-deadline)
The makespan for FSA-deadline performs better by 7.6% and by 9.6% than deadline and greedy-deadline, respectively, for 600 jobs, and by 8.1% and by 9.7%, respectively, for 1000 jobs. It is seen from Table 3 and Fig. 3 that the VM utilization for FSA-deadline performs better by 3.87% and by 1.27% than deadline and greedy-deadline, respectively, for 200 jobs; by 4.94% and by 2.44%, respectively, for 600 jobs; and by 3.9% over deadline with no change over greedy-deadline for 1000 jobs. It is seen from Table 4 and Fig. 4 that the percentage of successful job completion for FSA-deadline performs better by 4.41% and by 1.45% than deadline and greedy-deadline, respectively, for 200 jobs. The percentage of successful job
completion for FSA-deadline performs better by 4.6% and by 2.7% than deadline and greedy-deadline, respectively, for 600 jobs, and by 4.28% and by 0.599%, respectively, for 1000 jobs.
5 Conclusion In the cloud computing paradigm, the computation as well as the storage of resources is migrated to the "cloud," and these resources can be accessed anywhere by any user, on demand. Careful tuning of the optimization parameters in meta-heuristic algorithms is needed in order to find better solutions without excessive computational time. The artificial fish swarm algorithm (AFSA) is regarded as one of the top optimization methods within the set of swarm intelligence algorithms. AFSA is chosen because it has global search ability, good robustness and tolerance of parameter settings. This work proposes a heuristic algorithm for deadline-based resource allocation in the cloud using a modified fish swarm algorithm. Outcomes have shown that the makespan for FSA-deadline performs better for 200 jobs by 8.7% and by 10.9% than deadline and greedy-deadline, respectively. For 600 jobs, the FSA-deadline makespan is better by 7.6% compared to deadline and by 9.6% compared to greedy-deadline; for 1000 jobs, the corresponding figures are 8.1% and 9.7%. In future, this task could be implemented by a trusted third party that is reliable to both tenant and provider.
References
1. Wei W, Fan X, Song H, Fan X, Yang J (2016) Imperfect information dynamic Stackelberg game based resource allocation using hidden Markov for cloud computing. IEEE Trans Serv Comput 11(1):78–89
2. Saraswathi AT, Kalaashri YR, Padmavathi S (2015) Dynamic resource allocation scheme in cloud computing. Procedia Comput Sci 47:30–36
3. Chen X, Li W, Lu S, Zhou Z, Fu X (2018) Efficient resource allocation for on-demand mobile-edge cloud computing. IEEE Trans Veh Technol 67(9):8769–8780
4. Jin S, Qie X, Hao S (2019) Virtual machine allocation strategy in energy-efficient cloud data centres. Int J Commun Netw Distrib Syst 22(2):181–195
5. Li D, Chen C, Guan J, Zhang Y, Zhu J, Yu R (2015) DCloud: deadline-aware resource allocation for cloud computing jobs. IEEE Trans Parallel Distrib Syst 27(8):2248–2260
6. Madni SHH, Latiff MSA, Coulibaly Y (2016) An appraisal of meta-heuristic resource allocation techniques for IaaS cloud. Indian J Sci Technol 9(4)
7. Asghari S, Navimipour NJ (2016) Review and comparison of meta-heuristic algorithms for service composition in cloud computing. Majlesi J Multimedia Process 4(4)
8. Cheng D, Rao J, Jiang C, Zhou X (2015) Resource and deadline-aware job scheduling in dynamic Hadoop clusters. In: 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp 956–965
9. Kumar D, Mandal T (2016) Greedy approaches for deadline based task consolidation in cloud computing. In: 2016 International Conference on Computing, Communication and Automation (ICCCA), pp 1271–1276
10. Mohammad A, Kumar A, Singh LSV (2016) A greedy approach for optimizing the problems of task scheduling and allocation of cloud resources in cloud environment
11. Neshat M, Sepidnam G, Sargolzaei M, Toosi AN (2014) Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications. Artif Intell Rev 42(4):965–997
12. Raju IRK, Varma PS, Sundari MR, Moses GJ (2016) Deadline aware two stage scheduling algorithm in cloud computing. Indian J Sci Technol 9(4)
13. Gao Y, Wang Y, Gupta SK, Pedram M (2013) An energy and deadline aware resource provisioning, scheduling and optimization framework for cloud systems. In: Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, p 31
14. Rodriguez MA, Buyya R (2014) Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Trans Cloud Comput 2(2):222–235
15. Xiang Y, Balasubramanian B, Wang M, Lan T, Sen S, Chiang M (2013) Self-adaptive, deadline-aware resource control in cloud computing. In: 2013 IEEE 7th International Conference on Self-Adaptation and Self-Organizing Systems Workshops (SASOW), pp 41–46
16. Wu X, Gu Y, Tao J, Li G, Jayaraman PP, Sun D et al (2016) An online greedy allocation of VMs with non-increasing reservations in clouds. J Supercomput 72(2):371–390
17. Shen H, Zhao H, Yang Z (2016) Adaptive resource schedule method in cloud computing system based on improved artificial fish swarm. J Comput Theor Nanosci 13(4):2556–2561
18. Li D, Chen C, Guan J, Zhang Y, Zhu J, Yu R (2016) DCloud: deadline-aware resource allocation for cloud computing jobs. IEEE Trans Parallel Distrib Syst 27(8):2248–2260
Dynamic Congestion Control Routing Algorithm for Energy Harvesting in MANET M. M. Karthikeyan and G. Dalin
Abstract Energy harvesting (EH) is seen as the key enabling technology for the mass deployment of mobile ad hoc networks (MANETs) for IoT applications. Effective EH methodologies could remove the need for frequent energy source replacement, thereby offering a nearly perpetual network operating condition. Advances in EH systems have moved the design of routing protocols for EH-MANETs from "energy-aware" to "energy-harvesting-aware." In this work, a dynamic congestion control routing algorithm using energy harvesting in MANET is presented. The performance of the proposed dynamic congestion control routing scheme for MANET is evaluated using various metrics, for instance, energy consumption ratio, routing overhead ratio and throughput ratio. Keywords Dynamic congestion · Energy harvesting · Routing overhead · Throughput · Energy consumption · MANET
1 Introduction Congestion occurs in impromptu systems with compelled resources. In such a system, package transmission habitually encounters crash, impedance, and blurring, as a result of shared radio and dynamic topology. Transmission botches inconvenience the system load. Starting late, there is a growing interest for supporting sight and sound correspondences in specially appointed systems [1]. The immense constant arrangements are in impacts, information move limit concentrated, and M. M. Karthikeyan (B) Ph.D Research Scholar, PG and Research Department of Computer Science, Hindusthan College of Arts & Science, Coimbatore, Tamil Nadu, India e-mail: [email protected] G. Dalin Associate Professor, PG and Research Department of Computer Science, Hindusthan College of Arts & Science, Coimbatore, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_2
congestion committed. Congestion in specially appointed systems prompts bundle mishap, transfer speed corruption, and lounges around and energy for congestion recovery. Congestion-mindful routing show jolts preemptively settle congestion through bypassing the blocked associations. To restrict congestion in organize routing calculations are used. The vehicle is a system layer answer to offer isolated help in stopped up sensor systems. In proactive conventions, courses between each two centers are developed ahead of time in spite of the way that no transmission is sought after. This philosophy isn’t sensible for tremendous systems considering the way that various unused courses regardless of everything need be kept up and the incidental reviving may obtain overwhelming taking care of and correspondence overhead. In any case, when an association is isolates on account of disillusionment or center point conveys ability, which as often as possible occurs in MANETs, the deferral and overhead in view of the new process exposed may be tremendous [2]. To view this problem, different approaches of objective is used in multipath route conventions. A substitute way can be found quickly if the present way is broken. In addition, the use of different ways doesn’t change route performance is smarter than singlepath aside from on the off chance that we use an outstandingly huge no of ways are more expensive and subsequently infeasible [3]. Their estimation for arranging route conventions: congestion-versatile routing Vs congestion un-adaptive routing. Most by far of the current routing conventions have a spot with the consequent social event. A bit of the current routing conventions are congestion-mindful, and a very few are congestion-versatile (Fig. 1). In congestion-mindful routing systems, congestion is contemplated just while developing another course which proceeds as before until convey ability or dissatisfaction achieves withdrawal. In congestion-versatile routing, the course is adaptively factor reliant on the congestion status of the system. Routing may let congestion happen which is later recognized and dealt with by congestion control [4]. In multirate impromptu systems, different data rates will probably prompt certain courses having different associations with different data rates. In the occasion that lower data rate joins follow higher data rate joins, groups will create at the center heading the lower data rate associate, prompting long coating delays. A further explanation behind the congestion is interfaced constancy [5]. If associations break, congestion is extended on account of group salvage. Introductory a congestion-mindful routing metric that uses the retransmission check weighted channel deferral and pad lining delay, with a tendency of less stopped up high throughput interfaces with improve channel use. By then, to avoid the congestion issue we can apply the Congestion Aware Routing show for portable specially appointed systems (CARM) CARM applies an association data rate plan approach to manage hinder courses with screwed up association data rates [6]. In CRP, every center point appearing on a course alerts its past center point when slanted to be blocked. The past center by then uses an "avoid" course bypassing the potential congestion to the first non-blocked center on the course. Traffic will be part probabilistically over these two courses, fundamental and avoid, right now diminishing the chance of congestion occasion. 
CRP is a congestion-versatile unicast routing show for MANETs. Every center appearing on a course alerts its past center point when slanted to be obstructed [7]. The past
Fig. 1 Architecture of congestion-aware routing algorithm
center uses an "evade" course for bypassing the potential congestion region to the first non-blocked center on the basic course. Traffic is part probabilistically over these two courses, fundamental and evades, therefore sufficiently decreasing the chance of congestion occasion. The congestion watching will use various measurements to screen the congestion status of the center points. When the no of packs setting off to the centers outperforms its passing on limit, the center gets stopped up and its beginnings losing packages. Manager among these is the degree of all packages discarded for the nonappearance of support space, the run of the mill line length, the quantity of groups facilitated out and retransmitted, the ordinary package delay, and the customary deviation of pack delay. In all cases, extending numbers exhibit creating congestion. Any of these strategies can work with CRP by and by [8].
2 Literature Survey Adaptive congestion control is an instrument with a learning limit. This learning limit empowers the system to adjust to dynamically changing system conditions to keep up relentlessness and staggering execution. At this moment, sent to the sender to change the sending rate, as indicated by the present system conditions [9]. It is
versatile regarding creating deferrals, data move limit and different clients using the system. ACP is depicted by its learning limit which connects with the show to adjust to the fundamentally dynamic system condition to keep up security and unbelievable execution. This learning limit is appeared by a novel estimation calculation, which "learns" about the quantity of streams using each relationship in the system [10]. Merits: • An adaptive routing procedure can improve execution, as observed by the system client. • An adaptive routing technique can help in congestion control. Since an adaptive routing technique will, as a rule, change loads, it can concede the beginning of unprecedented congestion [9]. Demerits: • The routing choice is dynamically strange; in this way, the preparing trouble on arrange center points increments. • In most cases, adaptive frameworks depend upon status data that is amassed at one spot in any case utilized at another. There is a tradeoff here between the possibility of the data and the extent of overhead [10]. • An adaptive technique may respond too rapidly, causing congestion-passing on affecting, or too bit by bit, being immaterial. Uddin et al. [11] proposed an energy-proficient multipath routing show for versatile specially appointed mastermind to use the health work recalls this particular issue of energy consumption for MANET by applying the fitness function system to streamline the energy consumption in Ad hoc On-Demand Multipath Distance Vector (AOMDV) routing show. The proposed show is called Ad hoc On-Demand Multipath Distance Vector with the Fitness Function (FF-AOMDV). The wellbeing work is utilized to locate the ideal way from the source to the objective to lessen the energy consumption in multipath routing. The presentation of the proposed FFAOMDV show was overviewed by utilizing Network Simulator Version 2 (NS-2), where the show was separated and AOMDV and Ad hoc On-Demand Multipath Routing with Life Maximization (AOMR-LM) conventions, the two most prominent conventions proposed here. Merits: • FF-AOMDV figuring has performed unmistakably better than both AOMR-LM and AOMDV in throughput, pack transport degree and starts to finish delay. • Performed well against AOMDV for proportioning more energy and better structure lifetime. Demerits: • More Energy consumption and Less Network lifetime.
Zhang et al. [12] proposed energy-productive communicate in portable systems haphazardness novel energy and information transmission fruitful communicate conspire named the energy-effective communicate plot, which can adjust to lively changing structure topology and channel abnormality [12]. The structure of the communicate plot depends upon a through and through assessment of the favorable circumstances and insufficiencies of the by and to a great extent utilized scourge communicate plans. Merits: • An energy-productive communicate contrive is proposed, affected by the appraisal of the information dispersing process utilizing the SIR plot; • Analytical results are appeared on the bit of focus focuses that get the information communicate by a fearless focus point in a system utilizing the proposed communicate plot. Demerits: • Right when the structure is familiar with empowering multi-hop interchanges among center points, or in multi-skip networks with compelled establishment support. Lee et al. [13] proposed an assembled TDMA space and power orchestrating plans which develop energy effectiveness (EE) considering quality-of-service (QoS) utility, and this arrangement redesigns the unfaltering quality and survivability of UVS key MANET. The proposed calculation has three stages Dinkelbach strategy, animating the Lagrangian multiplier and the CCCP procedure. To update the EE, the length of a TDMA design is dynamically balanced. The drawback of this show is that as the all out concede stretches out as appeared by the edge round, it cannot ensure the diligent transmission. Merits: • The proposed calculation is certified by numerical outcomes. • Those ensure least QoS and show the really unprecedented energy productivity. Demerits: • Using TDMA progression is that the clients increase some predefined encounters opening. • When moving from one cell site to other, if all the availabilities right presently full the client may be isolated. Jabbar et al. [14] proposed cream multipath energy and QoS—mindful improved association state routing show adaptation 2 (MEQSA-OLSRv2), which is made to adjust to the challenges showed up by constrained energy assets, mobility of focus focuses, and traffic congestion during information transmission in MANET-WSN association conditions of IoT systems. This show utilizes a middle point rank as exhibited by multi-criteria hub rank estimation (MCNR). This MCNR totals various
parameters identified with energy and nature of administration (QoS) into a cautious estimation to fundamentally reduce the multifaceted thought of different obliged contemplations and dodge the control overhead acknowledged via independently communicating different parameters. These estimations are the middle’s lifetime, remaining battery energy, focus’ inert time, focus’ speed, and line length. The MCNR metric is used by another association quality assessment work for different course computations. Merits: • MEQSA-OLSRv2 maintained a strategic distance from the assurance of focuses with high flexibility. Demerits: • Audiences whine about information over-burden, and they can be overpowered and feel that it is annoying. • The quickly changing of progression has upset the gathering’s exercises. Kushwaha et al. [15] proposed a novel response for move server load starting with one server then onto the accompanying server. Energy effectiveness is a basic factor in the activity of specially appointed systems. The issue of sorting out routing show and overpowering nature of impromptu headway may decrease the life of the middle point like the life of the system. Merits: • MANETs over networks with a fixed topology join flexibility (an impromptu system can be made any place with versatile devices). • Scalability (you can without a lot of a stretch add more focuses to the system) and lower organization costs (no persuading inspiration to gather a framework first). Demerits: • Mobile focus focuses allow the land to pass on and plan a brief system. • The significant issue with the impromptu community focuses is resource goals.
3 Proposed Work 3.1 Dynamic Congestion Control Routing Algorithm DCCR is a unicast routing protocol for mobile ad hoc networks. It decreases network congestion by reducing the unnecessary flooding of packets and finding a congestion-free route between the source and the destination. In this section, we present the overall design and a thorough evaluation of the DCCR protocol. When a source has to transmit a data packet to a destination,
the DCCR protocol first builds a congestion-free set (CFS) relating both one-hop and two-hop neighbours. The source then begins the route discovery procedure using the CFS to identify a congestion-free path to the destination. If the DCCR protocol cannot build a CFS because the network is already congested, it cannot begin the route discovery process. However, once a new route has been established, the transmission of data packets continues. The essential objective of DCCR is thus to find a congestion-free route between the source and the destination, and in doing so it reduces the overhead and the flooding of packets. The DCCR protocol contains the following parts:
1. Dynamic congestion detection technique,
2. Construction of the CFS,
3. Congestion-free routing,
4. Congestion-free path discovery.
The proposed algorithm controls network congestion by reducing the futile flooding of packets and finding a congestion-free path between the source and the destination. The proposed framework first detects congestion, then forms a congestion-free set (CFS) relating both one-hop and two-hop neighbours, and the source begins the route discovery procedure using the CFS to identify a congestion-free route to the destination. The proposed algorithm comprises three parts to detect and control congestion at the MAC layer in MANET: dynamic congestion detection, CFS construction, and congestion-free route discovery.
3.2 Dynamic Congestion Detection Congestion detection is based on the estimation of the link stability (LS), residual bandwidth (RB) and residual battery power (RP). Link Stability The link stability (LSD) is used to define a link's connection strength. In MANET, LSD is essential for improving QoS and is defined as:

LSD = Mobility factor / Energy factor

LSD characterizes the level of link dependability: the higher the value of LSD, the higher the dependability of the link and the greater the duration of its existence. Thus, a route in which every link satisfies LSD > LSD_thr is practicable.
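As a rough illustration of the detection step (not the authors' code), the Python sketch below marks a link as congestion-free only when its link stability, residual bandwidth and residual battery power all exceed thresholds, and then builds a congestion-free set from one-hop and two-hop neighbours; the threshold values and metric fields are assumptions made for the example.

def is_congestion_free(link, lsd_thr=1.0, rb_thr=0.2, rp_thr=0.3):
    """link: dict with 'mobility_factor', 'energy_factor', 'residual_bandwidth'
    (fraction of capacity) and 'residual_power' (fraction of full battery)."""
    lsd = link['mobility_factor'] / link['energy_factor']   # LSD = mobility/energy
    return (lsd > lsd_thr and
            link['residual_bandwidth'] > rb_thr and
            link['residual_power'] > rp_thr)

def build_cfs(node, neighbours_1hop, neighbours_2hop, links):
    """Congestion-free set: one- and two-hop neighbours reachable over
    congestion-free links only."""
    cfs = [n for n in neighbours_1hop if is_congestion_free(links[(node, n)])]
    cfs += [n for n in neighbours_2hop
            if any((m, n) in links and is_congestion_free(links[(m, n)]) for m in cfs)]
    return cfs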
4 Experimental Results Energy Consumption Ratio Figure 2 presents the comparison of the energy consumption ratio. MANETs consist of low-power devices distributed in geographically confined territories, so energy consumption is a significant concern. The energy consumption ratio values for CODA generally lie between 67.2 and 75, the values for CCF lie between 57 and 69, and the values for the proposed DCCRA lie between 83 and 93.6. These outcomes are simulated using the NS2 simulator. This outcome
Fig. 2 Comparison chart of energy consumption ratio
shows a consistent result for the proposed novel procedure; hence, the proposed strategy produces a noteworthy improvement in the energy consumption ratio. Routing Overhead Ratio Figure 3 shows the comparison of the routing overhead ratio. Routing and data packets have to share the same network bandwidth most of the time, and hence routing packets are viewed as an overhead in the network. The existing CODA values generally lie between 39 and 58, the existing CCF values lie between 26.77 and 44.56, and the proposed DCCRA values lie between 66 and 85. These outcomes are simulated using the NS2 simulator and show a consistent result for the proposed procedure; hence, the proposed technique delivers a noteworthy improvement in the routing overhead ratio. Throughput Ratio Figure 4 presents the comparison of the average throughput ratio, defined as the proportion of packets successfully received to the total sent. The existing CODA values generally lie between 0.09 and 0.3, the existing CCF values lie between 0.04 and 0.22, and the proposed DCCRA values lie between 0.13 and 0.45. These outcomes are simulated using the NS2 simulator and show a consistent result for the proposed procedure; hence, the proposed technique delivers a noteworthy improvement in the average throughput ratio.
Fig. 3 Comparison chart of routing overhead ratio
Fig. 4 Comparison chart of throughput ratio
5 Conclusion Congestion-aware adaptive routing can effectively improve network performance owing to its ability to accurately anticipate network congestion and make good routing decisions. In this work, we have examined the concept of congestion-aware adaptive routing, the advantages and disadvantages of congestion-aware adaptive routing, the dynamic congestion control routing algorithm, and scheduling based on the 1-hop interference effect. We consider throughput-optimal scheduling schemes for wireless networks in a distributed manner.
References 1. Arora B, Nipur (2015) An adaptive transmission power aware multipath routing protocol for mobile ad hoc networks © 2015 Published by Elsevier. https://creativecommons.org/licenses/ by-nc-nd/4.0/ 2. Divya M, Subasree S, Sakthivel NK (2015) Performance analysis of efficient energy routing protocols in MANET. pp. 1877–0509 © 2015 The Authors. Published by Elsevier. https://cre ativecommons.org/licenses/by-nc-nd/4.0/ 3. Sandeep J, Satheesh Kumar J (2015) Efficient packet transmission and energy optimization in military operation scenarios of MANET, pp 1877–0509 © 2015 The Authors. Published by Elsevier. https://creativecommons.org/licenses/by-nc-nd/4.0/ 4. Kim D, Kim, J-h, Moo C, Choi J, Yeom I (2015) Efficient content delivery in mobile ad-hoc networks using CCN, pp 1570–8705 © 2015 Elsevier. http://dx.doi.org/https://doi.org/10.1016/ j.adhoc.2015.06.007 5. Anish Pon Yamini K, Suthendranb K, Arivoli T (2019) Enhancement of energy efficiency using a transition state mac protocol for MANET, pp 1389–1286 © 2019 Published by Elsevier. https://doi.org/https://doi.org/10.1016/j.comnet.2019.03.013 6. Taheri S, Hartung S, Hogrefe D (2014) Anonymous group-based routing in MANETs, pp 2214–2126 © 2014 Elsevier Ltd. http://dx.doi.org/https://doi.org/10.1016/j.jisa.2014.09.002
7. Sakthivel M, Palanisamy VG (2015) Enhancement of accuracy metrics for energy levels in MANETs, pp 0045–7906 © 2015 Elsevier Ltd. http://dx.doi.org/https://doi.org/10.1016/j.com peleceng.2015.04.007 8. Ragul Ravi R, Jayanthi V (2015) Energy efficient neighbour coverage protocol for reducing rebroadcast in MANET. © 2015 The Authors. Published by Elsevier. https://creativecommons. org/licenses/by-nc-nd/4.0/ 9. Gawas MA, Gudino LJ, Anupama KR (2016) Cross layer adaptive congestion control for best-effort traffic of IEEE 802.11e in mobile ad hoc networks. In: 2016 10th international symposium on communication systems, networks and digital signal processing (CSNDSP). doi: https://doi.org/10.1109/csndsp.2016.7574042 10. Shafigh AS, Veiga BL, Glisic S (2016) Cross layer scheme for quality of service aware multicast routing in mobile ad hoc networks. Wireless Netw 24(1):329–343. https://doi.org/10.1007/s11 276-016-1349-1 11. Uddin M, Taha A, Alsaqour R, Saba T (2016) Energy efficient multipath routing protocol for mobile ad-hoc network using the fitness function, pp 2169–3536 (c) 2016 IEEE 12. Zhang Z, Mao G, Anderson BDO (2015) Energy efficient broadcast in mobile networks subject to channel randomness, pp 1536–1276 (c) 2015 IEEE 13. Lee JS, Yoo Y-S, Choi HS, Kim T, Choi JK (2019) Energy-efficient TDMA scheduling for UVS tactical MANET, pp 1089–7798 (c) 2019 IEEE 14. Jabbar WA, Saad WK, Ismail M (2018) MEQSA-OLSRv2: a multicriteria-based hybrid multipath protocol for energy-efficient and QoS-aware data routing in MANET-WSN convergence scenarios of IoT, pp 2169–3536 (c) 2018 IEEE 15. Kushwaha A, Doohan NV (2016) M-EALBM: a modified approach energy aware load balancing multipath routing protocol in MANET. 978-1-5090-0669-4/16/$31.00 © 2016 IEEE
Predictable Mobility-Based Routing Protocol in Wireless Sensor Network G. Sophia Reena and M. Punithavalli
Abstract Routing is among the most complex tasks in a mobile wireless sensor network, and it is affected mainly by the mobility behavior of nodes. Successful routing ensures increased network performance by delivering packets without loss. Previous research addressed this by introducing the QoS-oriented distributed routing protocol (QOD), which measures the load level of channels before data transmission so that successful packet transmission is ensured. However, that method does not concentrate on predicting mobility behavior, which can cause path breakage and network failure. This is addressed in the proposed method by presenting the predictable mobility-based routing scheme (PMRS), in which successful data transmission can be guaranteed by avoiding path breakage due to mobility. In this work, node movement is predicted based on node direction and motion angles toward the destination node. By predicting the node mobility in the future, it is determined whether the node is nearest to the destination or not. Thus, a better route path can be established for successful data transmission. Based on node movement, the optimal cluster head is selected, and thus the shortest and most reliable path can be achieved between the source and destination nodes. In this work, cluster head selection is performed using the genetic algorithm, which can ensure reliable transmission without node failure. Finally, data transmission is done through the cluster head node using the time division multiple access (TDMA) method. The proposed scheme is implemented in NS2, and the results show that this technique provides better results than other recent schemes.
G. Sophia Reena (B) Department of Information Technology, PSGR Krishnammal College for Women, Peelamedu, Coimbatore, India e-mail: [email protected] M. Punithavalli Department of Computer Applications, School of Computer Science and Engineering, Bharathiar University, Coimbatore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_3
Keywords Mobility prediction · Mobile wireless sensor network · Cluster head · Reliability · Location prediction
1 Introduction Wireless sensor networks (WSNs) are becoming widespread, and much recent research work tends to focus on specific applications. Real-time communication exchange is one of the research necessities to be confronted, depending on the type of application. In such applications, data packets that exceed the time limit are considered to affect the performance and quality of the system [1]. In order to handle this issue, it is crucial to analyze the appropriateness and relevance of real-time communications in WSN, with attention to the underlying assumptions and the particular assessment standards, so as to classify and deduce the common issues [2]. To achieve the previously discussed goals, real-time communications in WSN are categorized, without loss of generality, into hard, firm and soft real-time categories, such as conversational classification. Hard real-time communication means that a missed time limit affects the function of the system by causing the complete system to fail [3]. As a result, in the worst-case scenario, the end-to-end delay has to be bounded by the time limit [4]. In order to concentrate on the mobility dimension of WSN, it is extremely important to recognize how traditional assumptions regarding dynamically organized WSNs shift with the introduction of mobile units [5]. These kinds of networks have great potential to function better than static WSNs because they tend to extend the network lifespan, minimize the usage of resources, provide additional data and achieve higher processing [6]. Mobility has turned out to be a critical field of study for the WSN community over the last few years. The growing capabilities and the dropping prices of mobile sensors make mobile sensor networks feasible and practical [7]. Since the topology changes so much, pre-structuring of communication distribution networks is not of much help here. Frequent position notifications from a mobile node may result in unnecessary drainage of the battery power of the sensor node and may also increase collisions [8]. Various routing protocols [9–11] have been suggested by many scholars, diverse standards have been endorsed, and design problems addressed [12]. Routing is usually extremely difficult in a mobile network, and in MWSN it is even more complicated because the network nodes are low-power, cost-efficient mobile devices with limited resources [8]. In this method, the path breakage due to mobility can be predicted by introducing the predictable mobility-based routing scheme (PMRS), in which successful data transmission can be guaranteed. This routing protocol groups the sensors into various similar clusters; there is a cluster head (CH) in each cluster that accumulates data from every node in the cluster. Selection of the cluster head is
rendered using genetic algorithm (GA). The CH possibly will regularly accumulate data from the sensors or TDMA scheduling might be performed to collect the data from the sensors [13].
2 Predictable Mobility-Based Routing Protocol In the proposed research method, the predictable mobility-based routing scheme (PMRS) is developed so that effective data transmission can be assured by avoiding path breakage due to mobility. In this work, the upcoming node movement is predicted based on node direction and motion angles toward the destination node. By predicting the future node mobility, it is determined whether the node is nearest to the destination or not; thus, an improved route path can be established for successful data transmission. Based on node movement, the optimal cluster head is selected, so the shortest and most reliable path can be attained between the source and destination nodes. In this work, cluster head selection is performed using the genetic algorithm, which can confirm reliable transmission without node failure. Finally, data transmission is carried out through the cluster head node using the time division multiple access (TDMA) method. The processes involved in the proposed research technique are listed below:
• Node mobility prediction based on node direction and motion angles
• Mobility-based cluster head selection using the genetic algorithm
• Reliable data transmission using the time division multiple access method.
2.1 Nodes Mobility Prediction The update protocol is essential for the dissemination of knowledge about geographical location and services, along with measured resources such as battery power, queuing space, processor speed and transmission range.
1. Type 1 update: A Type 1 update is produced periodically. The time between subsequent Type 1 updates either remains fixed at the specified frequency, or the frequency of the Type 1 update may vary linearly between a maximum (f_max) and a minimum (f_min) threshold defined for node v. The characteristics are shown in Fig. 1.
2. Type 2 update: Type 2 updates are produced if there is a significant shift in the node's speed or direction. The mobile node can assess the approximate location in which it is positioned at a certain time from its current record (specifically from the latest information) (Fig. 2). Subsequently, the anticipated position (x_e, y_e) is given by the following equations:

x_e = x + v · (t_e − t) · cos θ    (1)
Fig. 1 Deviation of update frequency of type 1 update together with velocity of the node
Fig. 2. Check at time t c whether type 2 update must be produced
y_e = y + v · (t_e − t) · sin θ    (2)

2.1.1 Predictions
When connecting to a specific target b, source a must initially determine destination b's geographic position as well as the positions of the intermediate hops at the times when the first packet reaches the individual nodes. This phase therefore involves location prediction in addition to the prediction of propagation delay. It should be observed that location prediction is employed to determine the future geographical position of any node, either an intermediate node or the target, at the time t_p when the packet reaches it. For updates containing node motion direction information, only one preceding update is necessary for the position to be predicted. For a given node, the computation of the projected position is then exactly the same as the periodic computation of the actual position at node b itself.
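A small Python sketch of the position prediction described above (Eqs. (1) and (2)) is given below; the update record format is an assumption made for illustration.

import math

def predict_position(update, t_e):
    """Predict where a node will be at time t_e from its latest update.

    update: dict with 'x', 'y', 'v' (speed), 'theta' (direction, radians)
            and 't' (timestamp of the update).
    """
    dt = t_e - update['t']
    x_e = update['x'] + update['v'] * dt * math.cos(update['theta'])   # Eq. (1)
    y_e = update['y'] + update['v'] * dt * math.sin(update['theta'])   # Eq. (2)
    return x_e, y_e

# Example: node last reported at (10, 5), moving at 2 m/s heading 45 degrees
print(predict_position({'x': 10, 'y': 5, 'v': 2.0, 'theta': math.pi / 4, 't': 0.0}, 3.0))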
2.1.2 Genetic Algorithm
An adaptive genetic algorithm (GA) was introduced by J. Holland for use as a search algorithm. GAs have been applied effectively in several fields and are capable of solving an extensive array of complicated numerical optimization problems. GAs need no gradient information, are comparatively less likely to be trapped in local minima on multi-modal search spaces, and turn out to be reasonably insensitive to the presence of noise. The pseudocode of the GA method is given below:

Pseudocode for Genetic Algorithm
begin GA
  g = 0                          // generation counter
  Initialize population P(g)
  Compute fitness for population P(g)
  while (terminating condition is not reached) do
    g = g + 1
    Crossover P(g)
    Mutate P(g)
    Evaluate P(g)
  end while
end GA
The above problem is encoded by the GA within chromosomes, each of which represents a possible solution.
2.1.3 Local Search

The combinatorial optimization problem is described by a pair (S, g), in which S denotes the set of all possible solutions and g is the objective function that maps every element of S to a real value. The goal is to find a solution s ∈ S that minimizes the objective function g. The problem is expressed as:

min g(s), s ∈ S

where N denotes the neighborhood function of the problem instance (S, g), represented as a mapping from S to its powerset:
N: S → 2^S

N(s) denotes the neighborhood of s and contains every solution that can be reached from s via a single move. A move is an operator that transforms one solution into another with a minor change. A solution x is called a local minimum of g with respect to the neighborhood N if:

g(x) ≤ g(y), ∀ y ∈ N(x)

Local search minimizes the cost function g through consecutive steps, in each of which the current solution x is exchanged for a solution y such that:

g(y) < g(x), y ∈ N(x)

Most local searches start with an arbitrary solution and end when a local minimum is reached. There are multiple ways to conduct a local search, and its computational complexity depends on the size of the neighborhood set and the time required to evaluate a move. In general, the larger the neighborhood, the longer it takes to search it, and the better the local minima that can be determined. Local search makes use of the concepts of state space, neighborhood, and objective function:
i. State space S: the collection of potential states that can be reached at some point in the search.
ii. Neighborhood N(s): the collection of states (neighbors) that can be reached from the state s in one step.
iii. Objective function f(s): a value that signifies the quality of the state s. The best possible value of the function is attained when s is a solution.

Pseudocode for local search is as follows:

Select an initial state s0 ∈ S
while s0 is not a solution do
  Select, by some heuristic, s ∈ N(s0) such that f(s) > f(s0)
  Replace s0 by s
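The pseudocode above can be turned into a small Python routine; the neighbourhood function and objective used in the usage line below are placeholders that a concrete deployment would replace.

import random

def local_search(s0, objective, neighbours, max_steps=1000):
    """Greedy local search: repeatedly move to a better neighbour of the
    current state until no improving move is found (a local minimum)."""
    current = s0
    for _ in range(max_steps):
        better = [n for n in neighbours(current) if objective(n) < objective(current)]
        if not better:
            break                      # local minimum reached
        current = random.choice(better)
    return current

# Toy example: minimise (x - 7)^2 over the integers, moving +/-1 per step.
best = local_search(0, objective=lambda x: (x - 7) ** 2,
                    neighbours=lambda x: [x - 1, x + 1])
print(best)  # 7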
2.1.4 Genetic Algorithm Using Local Search
In the genetic algorithm, four parameters are available: the population size, the crossover probability, the mutation probability, and the weight accuracy of the influence factors. Figure 3 shows the flowchart of the proposed method.
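A compact sketch of how such a GA with an embedded local-search step might be organized is given below; the chromosome encoding (one bit per node marking it as a cluster head), the fitness function and the parameter values are illustrative assumptions, not the exact settings of the paper.

import random

POP_SIZE, CHROM_LEN, GENERATIONS = 30, 20, 50   # illustrative parameters
P_CROSS, P_MUT = 0.8, 0.02

def fitness(chrom):
    # Placeholder fitness: prefer roughly a quarter of the nodes as cluster heads.
    return -abs(sum(chrom) - CHROM_LEN // 4)

def crossover(a, b):
    point = random.randint(1, CHROM_LEN - 1)
    return a[:point] + b[point:]

def mutate(chrom):
    return [1 - g if random.random() < P_MUT else g for g in chrom]

def refine(chrom):
    # Local search step: flip each gene once and keep the flip if it improves fitness.
    best = chrom[:]
    for i in range(CHROM_LEN):
        trial = best[:]
        trial[i] = 1 - trial[i]
        if fitness(trial) > fitness(best):
            best = trial
    return best

population = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]
    children = []
    while len(children) < POP_SIZE:
        a, b = random.sample(parents, 2)
        child = crossover(a, b) if random.random() < P_CROSS else a[:]
        children.append(refine(mutate(child)))
    population = children

print(max(population, key=fitness))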
3 Reliable Data Transmission Using TDMA

Rather than relying on global clock synchronization and node position information, the proposed method synchronizes signal transmission times between levels according to the local time difference between each sensor node and its parent node, and builds an energy-efficient, proactive TDMA schedule among the randomly deployed sensor nodes. According to the proposed algorithm, a sensor with an even-numbered ID may only steal slots with odd slot numbers (Fig. 4).
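As a rough illustration of the slot-stealing rule described above (even-numbered node IDs may only steal odd-numbered slots), the snippet below sketches the check a node could perform before claiming an idle slot; the symmetric case for odd-ID nodes and the surrounding scheduling logic are assumptions, not details given in the paper.

def may_steal_slot(node_id: int, slot_number: int) -> bool:
    """A node with an even ID may only steal odd-numbered slots;
    symmetrically (by assumption), an odd-ID node only even-numbered slots."""
    if node_id % 2 == 0:
        return slot_number % 2 == 1
    return slot_number % 2 == 0

print(may_steal_slot(4, 7))   # True: even ID, odd slot
print(may_steal_slot(4, 6))   # False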
4 Results and Discussion

The performance evaluation parameters considered here are packet delivery ratio, throughput, end-to-end delay, and network lifetime, evaluated for the existing MADAPT algorithm, the previous QoS-aware channel load-based mobility adaptive routing protocol (QoS-CLMARP), and the proposed predictable mobility-based routing scheme (PMRS). The end-to-end delay results are illustrated in Fig. 5. The existing MADAPT and QoS-CLMARP methods exhibit a higher end-to-end delay, whereas the proposed PMRS method improves (reduces) the delay considerably (Fig. 5). The network lifetime results are illustrated in Fig. 6. The network lifetime is lower for the existing MADAPT and QoS-CLMARP methods and is improved significantly by the proposed PMRS method. In Fig. 7, the packet delivery ratio is substantially improved by the PMRS approach in the proposed system. In Fig. 8, the proposed PMRS achieves a lower packet loss ratio compared with the other two methods.
5 Conclusion

In this work, future node movement is predicted based on the node's direction and motion angle toward the destination node. By predicting node mobility in advance, it is determined whether the node is moving closer to the destination or not. Thus, a better
Fig. 3 Flowchart of the cluster head selection process (begin → chromosome coding → initial population → calculation of chromosome fitness combined by weight values → selection, crossover and mutation operators → local search → generation of new population → check optimization criterion; if met, obtain the optimal weight values → end)
Fig. 4 Slot sharing
Fig. 5 End-to-end delay comparison
Fig. 6 Network lifetime comparison
Fig. 7 Packet delivery ratio comparison
Fig. 8 Packet loss ratio comparison
route path can be established for successful data transmission. The optimal cluster head is selected on the basis of node movement, so that the shortest and most reliable path between the source and destination nodes can be achieved. In this work, cluster head selection is performed using the genetic algorithm, which ensures reliable transmission without node failure. Finally, data transmission is carried out through the cluster head node using the time division multiple access (TDMA) method.
References

1. Kim BS, Park H, Kim KH, Godfrey D, Kim KI (2017) A survey on real-time communications in wireless sensor networks. Wireless Commun Mob Comput
2. Oliver R, Fohler G (2010) Timeliness in wireless sensor networks: common misconceptions. In: Proceedings of international workshop on real-time networks, July 2010
3. Collotta M, Costa DG, Falcone F, Kong X (2016) New challenges of real-time wireless sensor networks: theory and applications. Int J Distrib Sens Netw 12(9)
4. Zhan A, Xu T, Chen G, Ye B, Lu S (2008) A survey on realtime routing protocols for wireless sensor networks. In: Proceedings of China wireless sensor network conference, 2008
5. Amundson I, Koutsoukos XD (2009) A survey on localization for mobile wireless sensor networks. In: Mobile entity localization and tracking in GPS-less environments, pp 235–254. Springer, Berlin
6. Rezazadeh J, Moradi M, Ismail AS (2012) Mobile wireless sensor networks overview. Int J Comput Commun Netw 2(1):17–22
7. Ekici E, Gu Y, Bozdag D (2006) Mobility-based communication in wireless sensor networks. Commun Mag IEEE 44(7):56–62
8. Sara GS, Sridharan D (2014) Routing in mobile wireless sensor network: a survey. Telecommun Syst 57(1):51–79
9. Asad M, Nianmin Y, Aslam M (2018) Spiral mobility based on optimized clustering for optimal data extraction in WSNs. Technologies 6(1):35
10. Poulose Jacob K, Paul V, Santhosh Kumar G (2008) Mobility metric based LEACH-mobile protocol
11. Khandnor P, Aseri T (2017) Threshold distance-based cluster routing protocols for static and mobile wireless sensor networks. Turkish J Electr Eng Comput Sci 25(2):1448–1459
12. Chen C, Ma J, Yu K (2006) Designing energy efficient wireless sensor networks with mobile sinks. In: Proceedings of WSW'06 at SenSys'06, Colorado, USA, 31 October 2006
13. Jain SR, Thakur NV (2015) Overview of cluster based routing protocols in static and mobile wireless sensor networks. In: Information systems design and intelligent applications, pp 619–626. Springer, New Delhi
Novel Exponential Particle Swarm Optimization Technique for Economic Load Dispatch

Nayan Bansal, Surendrabikram Thapa, Surabhi Adhikari, Avinash Kumar Jha, Anubhav Gaba, and Aayush Jha
Abstract Due to vicious competition in the electrical power industry, growing environmental issues, and an ever-increasing demand for electric energy, optimization of the economic load dispatch problem has become a compulsion. This paper emphasizes a novel modified version of PSO to obtain an optimized solution of the economic load dispatch problem. In the paper, exponential particle swarm optimization (EPSO) is introduced, and a comparison is performed on the basis of speed of convergence and stability. The proposed method of exponential PSO shows better performance in the speed of convergence and its stability.

Keywords Exponential particle swarm optimization (EPSO) · Soft computing · Economic load dispatch (ELD) · Variants · Convergence
N. Bansal (B) · A. Gaba Department of Electrical Engineering, Delhi Technological University, New Delhi, India e-mail: [email protected] A. Gaba e-mail: [email protected] S. Thapa · S. Adhikari Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India e-mail: [email protected] S. Adhikari e-mail: [email protected] A. K. Jha · A. Jha Department of Civil Engineering, Delhi Technological University, New Delhi, India e-mail: [email protected] A. Jha e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_4
1 Introduction

The present-day power system comprises a stack of interconnected electrical networks, and with the inflation in the prices of fuel used in thermal power plants, it is imperative to minimize the cost of the generating units. The main motive of the contemporary power system is to deliver high-grade electrical power to the buyer at the cheapest rate, considering the different constraints of the power system and of the generating units. This creates the basis for the ELD problem, which concentrates on finding the actual power generation of each interconnected power plant while minimizing the total fuel cost. The main difficulty arises from the varied inequality and equality constraints, which turn it into a complicated problem [1]. Regular methods like the Newton method, the gradient method, and the lambda iteration method can handle a monotonically and linearly increasing function. But the fuel cost curve in the ELD problem becomes sharp and extremely nonlinear because of factors like ramp rate limits, curve inequality constraints, the valve point effect, generator efficiency, and prohibited operating constraints, making ELD a non-convex and complex issue that is cumbersome to solve with conventional methods. AI, stochastic algorithms, and evolutionary algorithms like PSO can resolve such highly complicated problems in the vicinity of the global optimal solution. PSO is an evolutionary and mathematical process that was inspired by colonies of birds and fish. Fewer parameters and high convergence speed make PSO a highly favorable option [2]. In this paper, the application of PSO is studied by varying its variants and applying it to the ELD problem, comparing each variant on the basis of convergence stability and speed [3].
2 Particle Swarm Optimization

PSO is an evolutionary and intelligent process inspired by fish colonies and congregations of birds. The symbiotic cooperation between the inhabitants of a society is the principle behind this computational technique. PSO requires fewer parameters for evaluation and has a higher convergence speed than many arithmetic techniques. Over the years, there has been massive analysis of the PSO technique, and the derived algorithms are based on improving population diversity and on parameter adjustment. The first class of approaches is used for finding the equilibrium between global and local searching [4]; it includes algorithms such as CPSO, DPSO, and LWPSO. The second class of approaches is employed to avoid premature convergence, and for significant performance improvement, techniques like natural selection are used [5]. In this paper, the primary focus is on the first class, owing to its lower computational cost, relatively lower complexity, and the efficiency of its parameter strategies.
In an early modification of PSO, an inertia weight coefficient was introduced into the velocity update equation, with the inertia weight decreased in a constant linear manner (LWPSO) [6]. This technique helped increase the convergence speed and obtain a balance between local and global search exploitation, but local exploration was compromised because of the strictly linear reduction of the inertia weight. A further modification therefore decreased the inertia weight by a damping factor rather than in a linear manner [7]. This increased the convergence speed but affected the balance between local and global probing for the global optimum value. Further research introduced a constriction factor, removing the inertia weight that earlier papers had brought into the velocity update equation; the constriction factor value 0.729 was found to render the best optimum solutions [8]. This demonstrated that dynamically updating the velocity equation can improve the local search for an optimal solution and the convergence speed without adding any complexity to the PSO technique. Deeply inspired by these improvements, a novel method of exponential particle swarm optimization (EPSO) is introduced in this paper. In this method, the inertia weight eliminated by CPSO is reintroduced, and it depends on MaxIt, the maximum number of iterations [9]. This gives a large decay step in the early stage of the algorithm, which enhances the speed of convergence, while in the later stage the decay step decreases considerably, allowing local exploration and balancing local and global exploration. In this paper, the ELD problem is solved with LWPSO, DPSO, CPSO, and EPSO, and the solutions obtained by each algorithm, their convergence speed, and their convergence stability are compared.
3 Exponential Particle Swarm Optimization

A population swarm of n particles is considered. Each particle i is assigned a velocity vector v_i and a position vector x_i, both P-dimensional, described as v_i = (v_i1, v_i2, …, v_iP) and x_i = (x_i1, x_i2, …, x_iP). The convergence speed is affected by the velocity vector v_i, and a possible solution is represented by the position vector x_i. The speed of convergence and the exploration of global and local optimum values are influenced by the velocity equation, which hence affects convergence stability. Each particle acquires a personal best position P_b = (P_b1, P_b2, …, P_bP) during the PSO search operation. The personal best positions of the swarm members are compared, and the global best position P_g = (P_g1, P_g2, …, P_gP) is selected by the algorithm. The global and personal best positions are utilized to update the particle velocity. The velocity equation can be defined as:

v_i(n + 1) = v_i(n) + c_1 r_1 (P_b(n) − x_i(n)) + c_2 r_2 (P_g(n) − x_i(n))    (1)

The position equation is:
x_i(n + 1) = x_i(n) + v_i(n)    (2)
where c_1 and c_2 are positive coefficients, and r_1(.) and r_2(.) are random variable functions. The inertia weight of LWPSO was introduced in earlier research papers, where the velocity equation with the inertia constant was given by

v_i(n + 1) = w v_i(n) + c_1 r_1 (P_b(n) − x_i(n)) + c_2 r_2 (P_g(n) − x_i(n))    (3)
and the inertia weight is linearly decreased as follows:

w = w_max − ((w_max − w_min) / MaxIt) · It    (4)
where 'MaxIt' denotes the maximum number of iterations, 'It' is the current iteration, and w_min and w_max are constants with values 0.4 and 0.9, respectively. This technique established a balance between local and global searching and also improved the convergence speed. Further research in this field deduced that a damping factor applied to the inertia weight in the velocity update equation produces a better speed of convergence:

w = w · w_damp    (5)
where w_damp is chosen to be 0.99 and w is initialized to 0.9. A further improvement of the PSO algorithm led to a velocity update equation in which the inertia weight was eliminated and a constriction factor introduced. The velocity update equation is then defined as:

v_i(n + 1) = χ v_i(n) + c_1 r_1 (P_b(n) − x_i(n)) + c_2 r_2 (P_g(n) − x_i(n))    (6)
and

χ = 2 / |2 − φ − √(φ² − 4φ)|    (7)
Several experiments were conducted to determine the value of φ. The value was found to be 4.1, which results in χ = 0.729; in this case, the algorithm gives the best performance in finding the optimal solution. Deeply inspired by this, a new method, exponential PSO (EPSO), is introduced in this paper. In this method, the inertia weight in (3) is modified. The new form of the inertia weight is:

w = (1 − 1/MaxIt)^MaxIt    (8)
Now, since the maximum number of iterations is large, expression (8) can be written as

w = e^(MaxIt · (−1/MaxIt)) = e^(−1)    (9)
Thus, with the help of this algorithm, we can take a large step in the initial stage of the computation and a smaller one toward the end of the computation. In this way, the equilibrium is maintained between local searching and global searching for the problem, without introducing any complications into the algorithm, which is essential for an evolutionary algorithm. With this algorithm, a convergence speed better than that of damped particle swarm optimization (DPSO), constriction particle swarm optimization (CPSO), and linear weight particle swarm optimization (LWPSO) was obtained, and the stability of convergence was found to be the best when the ELD problem was exposed to the algorithms. The numerical results for these algorithms are discussed in a later section.
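A minimal sketch of the EPSO update of Eqs. (1), (2) and (8) is shown below; the swarm size, coefficient values and benchmark objective are illustrative choices rather than the exact experimental settings of the paper.

import random

def epso_minimise(objective, dim=6, swarm=30, max_it=200, c1=2.0, c2=2.0,
                  lo=-10.0, hi=10.0):
    w = (1.0 - 1.0 / max_it) ** max_it          # Eq. (8): approximately e**-1
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm)]
    v = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in x]
    gbest = min(pbest, key=objective)[:]
    for _ in range(max_it):
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]                  # position update, Eq. (2)
            if objective(x[i]) < objective(pbest[i]):
                pbest[i] = x[i][:]
        gbest = min(pbest, key=objective)[:]
    return gbest

# Toy usage: minimise the sphere function.
print(epso_minimise(lambda p: sum(t * t for t in p)))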
4 Economic Load Dispatch Problem

Power generation in a thermal power plant takes place by the rotation of the prime mover of the turbine under the action of steam. The working fluid in the thermal power plant is water. Water is fed to the boiler and super-heater, which convert it to steam. The steam, which carries thermal energy, is allowed to expand in the turbine, rotating the rotor shaft of the generators. The steam loses energy, is condensed, and is then pumped back to the boiler to be heated up again. The factors which affect the operating cost include the transmission losses, fuel costs, and efficiency of the generators in action. Usually, labor, maintenance, and operation costs are fixed. A typical fuel cost curve of a generating unit is depicted in Fig. 1.

Fig. 1 Cost curve (fuel) locus of a generating unit
The minimum power which can be extracted from a generating unit, below which it is not feasible to operate the plant, is P_i^min [10]. The maximum power which can be obtained from a generating unit is P_i^max. The main motive of ELD is to reduce the total generation cost. The problem can be formulated as follows:

Minimise X_T = Σ_{i=1}^{n} F_i(P_i)    (10)
where X_T is the total generation cost, F_i is the cost function of the ith generating unit, n is the number of generating units, and P_i is the real power of the ith generator. The above function can be approximately expressed as a quadratic function of the real power outputs of the generating units [11]:

F_i(P_i) = α_i + β_i P_i + γ_i P_i²    (11)
where α_i, β_i and γ_i represent the fuel cost coefficients of the ith generating unit. This problem has inequality and equality constraints [12].
4.1 Equality Constraints

The cumulative real power generated by the generating units under study should be equal to the sum of the transmission losses and the system demand power, which gives the equality constraint:

Σ_{i=1}^{n} P_i = P_D + P_L    (12)
where P_D is the demand power (MW) and P_L is the transmission loss (MW).
4.2 Inequality Constraints

P_i^min ≤ P_i ≤ P_i^max    (13)
Here, for the ith unit, P_i^max is the maximum possible real power and P_i^min is the minimum possible real power [10].
4.3 Transmission Losses

The following equation describes the transmission losses:

P_L = P^T B P + B_0^T P + B_00    (14)
Here, P is a vector of length N representing the power output of every generator, B is the square matrix of loss coefficients, B_0 is another vector of length N, and B_00 is a constant.
4.4 Ramp Rate Limit Constraints

The power P_i generated by a unit may not exceed the real power generated in the preceding interval, P_i^0, by more than the up ramp rate limit UR_i, and may not fall below it by more than the down ramp rate limit DR_i [13]. The following constraints therefore arise:

max(P_i^min, P_i^0 − DR_i) ≤ P_i ≤ min(P_i^max, P_i^0 + UR_i)    (15)
4.5 Valve Point Effect

The incremental fuel cost curve of the generating units in ELD is presumed to be a monotonically increasing linear function, so the input–output characteristic is quadratic in nature. However, because of the valve point effect, the input–output curve displays nonlinearity and discontinuities of high order [9]. Thus, the original function is modified to consider these constraints. A rectified periodic sinusoidal term models the valve point effect, giving:

F_i(P_i) = α_i + β_i P_i + γ_i P_i² + |e_i × sin(f_i × (P_i^min − P_i))|    (16)
where e_i and f_i represent the fuel cost coefficients of the ith generating unit corresponding to the valve point effect.
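As a small worked example of Eq. (16), the function below evaluates the fuel cost of one unit including the valve-point term; the coefficient values in the usage line are taken from unit 1 of Tables 1 and 2.

import math

def fuel_cost(p, alpha, beta, gamma, e, f, p_min):
    """Fuel cost of one generating unit with the valve-point effect, Eq. (16)."""
    return (alpha + beta * p + gamma * p ** 2
            + abs(e * math.sin(f * (p_min - p))))

# Unit 1: alpha=230, beta=7.1, gamma=0.0075, e=220, f=0.03, Pmin=120 MW
print(fuel_cost(300.0, 230, 7.1, 0.0075, 220, 0.03, 120))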
4.6 Prohibited Operating Zones

The presence of a steam valve inside the thermal power plant generates vibrations in the shaft bearing, which results in the creation of zones of the fuel cost function in which operation is restricted. Non-segregated auxiliary operating equipment such as boilers and feed pumps is another reason. The shape of the fuel cost curve cannot be predicted in the prohibited zones, so the optimal solution must preclude operation of the units in these regions. The cost curve with prohibited zones is shown in Fig. 2. This can be mathematically represented as follows:

P_i^min ≤ P_i ≤ P_{i,1}^lower    (17)

P_{i,k−1}^upper ≤ P_i ≤ P_{i,k}^lower,  k = 2, 3, …, n_i    (18)

P_{i,n_i}^upper ≤ P_i ≤ P_i^max    (19)
where the lower real power limit of the kth prohibited zone of the ith unit is denoted by P_{i,k}^lower, the upper limit of the (k − 1)th prohibited zone of the ith unit is denoted by P_{i,k−1}^upper, and n_i is the number of prohibited zones of the ith generating unit [14]. These are the constraints taken into consideration in the ELD problem, and solutions have been acquired using different versions of PSO.
Fig. 2 Prohibited operating zones are shown in cost curve locus
5 Numerical Results and Simulation

A power system with six generating units is considered to demonstrate the application of the various modified PSO methods, and results were obtained for it. Table 1 lists the fuel cost coefficients of the generating units, Table 2 lists the characteristics of the generating units, and Table 3 shows the prohibited zones. The B-coefficients used to compute the transmission losses in the given power system are given below.

Table 1 Fuel cost coefficients
Unit   αi     βi     γi       ei     fi
1      230     7.1   0.0075   220    0.03
2      190    10.5   0.009    145    0.045
3      225     8.2   0.0095   165    0.035
4      200    11.9   0.008    110    0.047
5      210    10.7   0.0085   185    0.032
6      180    12.2   0.0075   125    0.028
Table 2 Generating units characteristics
Unit   Pimin   Pimax   Pio   URi   DRi
1      120     500     450   80    120
2       75     220     170   50     90
3      100     275     200   65    100
4       60     150     150   50     90
5       70     200     190   50     90
6       60     120     110   50     90
Table 3 Generating units prohibited zones
Unit   Prohibited zone 1                Prohibited zone 2
       Pilower (MW)   Piupper (MW)      Pilower (MW)   Piupper (MW)
1      215            245               345            375
2       95            115               140            160
3      155            175               210            240
4       85             95               110            120
5       95            115               140            150
6       75             85               100            105
B =
[  0.00085   0.0006    0.000035  −0.00005  −0.00025  −0.0001  ]
[  0.0006    0.0007    0.00045    0.00005  −0.0003   −0.00005 ]
[  0.000035  0.00045   0.00155    0.0000   −0.00005  −0.0003  ]
[ −0.0005    0.00005   0.0000     0.0012   −0.0003   −0.0004  ]
[ −0.00025  −0.0003   −0.0005    −0.0003   −0.00645  −0.0001  ]
[ −0.0001   −0.00005  −0.0003    −0.0004   −0.0001    0.0075  ]
B_0 = 1e−3 × [−0.390  −0.127  0.704  0.059  0.216  −0.663]
B_00 = [0.065]

The different modified PSO techniques are deployed to calculate the total power generation cost and the power generated by each unit, and a comparison is drawn among them. Two indices govern the assessment of the different optimization methods: convergence speed and convergence stability. A better stochastic algorithm is one having better convergence stability and speed. The computation time of the various algorithms is also compared in this paper.
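A short numerical check of Eq. (14) with the loss coefficients above might look as follows; the B matrix is entered row-wise as reconstructed from the text, and the power vector is an arbitrary illustrative operating point.

import numpy as np

B = np.array([
    [ 0.00085,  0.0006,   0.000035, -0.00005, -0.00025, -0.0001 ],
    [ 0.0006,   0.0007,   0.00045,   0.00005, -0.0003,  -0.00005],
    [ 0.000035, 0.00045,  0.00155,   0.0000,  -0.00005, -0.0003 ],
    [-0.0005,   0.00005,  0.0000,    0.0012,  -0.0003,  -0.0004 ],
    [-0.00025, -0.0003,  -0.0005,   -0.0003,  -0.00645, -0.0001 ],
    [-0.0001,  -0.00005, -0.0003,   -0.0004,  -0.0001,   0.0075 ],
])
B0 = 1e-3 * np.array([-0.390, -0.127, 0.704, 0.059, 0.216, -0.663])
B00 = 0.065

def transmission_loss(p):
    """P_L = P^T B P + B0^T P + B00, Eq. (14); units of P as assumed by the study."""
    p = np.asarray(p, dtype=float)
    return float(p @ B @ p + B0 @ p + B00)

print(transmission_loss([434.7, 140.3, 180.8, 126.8, 198.2, 65.5]))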
5.1 Power Generation and Total Cost

By considering all the inequality and equality constraints, the ELD problem is solved with the different variants of PSO; Table 4 shows the results obtained. A demand power of 1200 MW is taken for the system. Each PSO algorithm is run for 200 iterations with a population size of 300. The total power generated is 1146.37 MW, of which 1100 MW meets the demand and 46.37 MW is lost in transmission. The mean total cost is nearly the same for all the modified versions of PSO, as shown in Table 4.

Table 4 Power generated by various units and total costs
Unit                        LWPSO (MW)   DPSO (MW)   CPSO (MW)   EPSO (MW)
1                           434.72       434.73      434.73      434.73
2                           140.34       140.32      140.29      140.28
3                           180.76       180.78      180.76      180.76
4                           126.84       126.81      126.85      126.84
5                           198.17       198.19      198.15      198.15
6                            65.54        65.54       65.53       65.33
Total mean cost (Rs./hr)   1431.21      1433.28     1432.39     1432.03
Fig. 3 Convergence curve of different modified PSOs
5.2 Convergence Speed

A convergent algorithm is one that reaches an optimal region after a finite number of iterations; an algorithm is called divergent when the optimal region is not reached. The slope of the convergence curve determines the convergence speed [15]. The convergence curves of all the versions of PSO are shown in Fig. 3. In the convergence curve, the vertical axis represents the total cost, whereas the horizontal axis denotes the number of iterations of the modified algorithms. Thus, it can be concluded that EPSO performs better than DPSO, CPSO, and LWPSO when each of these algorithms is run for 200 iterations.
5.3 Convergence Stability

Convergence stability is defined by the dispersion of the global optimum value around the mean value after the algorithm has been run for a certain number of iterations. The concentration of the global optimum values is an indication of convergence stability [16]: the better the concentration, the higher the convergence stability [17]. Each modified algorithm is run 40 times, and a statistical analysis of the global best solutions is derived for the different modified PSO techniques by calculating the mean cost and standard deviation. A smaller standard deviation reflects less divergence and better stability. Hence, from Table 5, it can be seen that EPSO has the lowest mean and standard deviation among LWPSO, DPSO, CPSO, and EPSO.
Table 5 Digital analysis of various methods of PSO
Criteria                      LWPSO     DPSO      CPSO      EPSO
Mean (Rs./hr)                 1433.21   1433.28   1432.39   1432.03
Standard deviation (Rs./hr)     63.48     62.45     60.89     58.79
6 Conclusion

In this paper, PSO and its modified algorithms have been successfully implemented to solve the ELD problem. PSO is a nature-inspired stochastic algorithm, and its smaller number of variants gives it a lead over other nature-inspired evolutionary techniques. A newer version of PSO has been successfully implemented, and a comparison has been drawn with the existing versions of PSO on the basis of convergence stability, convergence speed, and total mean cost. A statistical analysis of the convergence stability of the different versions has been performed. The new method (EPSO) has better convergence stability and convergence speed than the pre-existing models (DPSO, LWPSO, and CPSO), while its total mean cost is almost equal to that of the existing models. The iteratively weighted term in the velocity equation gives a dynamic step in the modification of the velocity of a particle; at a later stage, the step gets smaller, thereby ensuring local exploration. Thus, an equilibrium has been established between local and global exploration. The novel PSO technique can easily be employed in various applications of power system optimization.
References

1. Alam MN (2018) State-of-the-art economic load dispatch of power systems using particle swarm optimization. arXiv preprint arXiv:1812.11610
2. Shi Y (2001) Particle swarm optimization: developments, applications and resources. In: Proceedings of the 2001 congress on evolutionary computation (IEEE Cat. No. 01TH8546), pp 81–86. IEEE
3. Sharma J, Mahor A (2013) Particle swarm optimization approach for economic load dispatch: a review. Int J Eng Res Appl 3:013–022
4. Kalayci CB, Gupta SM (2013) A particle swarm optimization algorithm with neighborhood-based mutation for sequence-dependent disassembly line balancing problem. Int J Adv Manuf Technol 69:197–209
5. Shen Y, Wang G, Tao C (2011) Particle swarm optimization with novel processing strategy and its application. Int J Comput Intell Syst 4:100–111
6. Abdullah SLS, Hussin NM, Harun H, Abd Khalid NE (2012) Comparative study of random-PSO and Linear-PSO algorithms. In: 2012 international conference on computer & information science (ICCIS), pp 409–413. IEEE
7. He M, Liu M, Jiang X, Wang R, Zhou H (2017) A damping factor based particle swarm optimization approach. In: 2017 9th international conference on modelling, identification and control (ICMIC), pp 13–18. IEEE
8. Eberhart RC, Shi Y (2000) Comparing inertia weights and constriction factors in particle swarm optimization. In: Proceedings of the 2000 congress on evolutionary computation. CEC00 (Cat. No. 00TH8512), pp 84–88. IEEE
9. Pranava G, Prasad P (2013) Constriction coefficient particle swarm optimization for economic load dispatch with valve point loading effects. In: 2013 international conference on power, energy and control (ICPEC), pp 350–354. IEEE
10. Mondal A, Maity D, Banerjee S, Chanda CK (2016) Solving of economic load dispatch problem with generator constraints using ITLBO technique. In: 2016 IEEE students' conference on electrical, electronics and computer science (SCEECS), pp 1–6. IEEE
11. Arce A, Ohishi T, Soares S (2002) Optimal dispatch of generating units of the Itaipú hydroelectric plant. IEEE Trans Power Syst 17:154–158
12. Dihem A, Salhi A, Naimi D, Bensalem A (2017) Solving smooth and non-smooth economic dispatch using water cycle algorithm. In: 2017 5th international conference on electrical engineering—Boumerdes (ICEE-B), pp 1–6. IEEE
13. Dasgupta K, Banerjee S, Chanda CK (2016) Economic load dispatch with prohibited zone and ramp-rate limit constraints—a comparative study. In: 2016 IEEE first international conference on control, measurement and instrumentation (CMI), pp 26–30. IEEE
14. Hota PK, Sahu NC (2015) Non-convex economic dispatch with prohibited operating zones through gravitational search algorithm. Int J Electr Comput Eng 5
15. Li X (2004) Better spread and convergence: particle swarm multiobjective optimization using the maximin fitness function. In: Genetic and evolutionary computation conference, pp 117–128. Springer, Berlin
16. Ding W, Lin C-T, Prasad M, Cao Z, Wang J (2017) A layered-coevolution-based attribute-boosted reduction using adaptive quantum-behavior PSO and its consistent segmentation for neonates brain tissue. IEEE Trans Fuzzy Syst 26:1177–1191
17. Clerc M, Kennedy J (2002) The particle swarm—explosion, stability, and convergence in a multidimensional complex space. IEEE Trans Evol Comput 6:58–73
Risk Index-Based Ventilator Prediction System for COVID-19 Infection

Amit Bhati
Abstract The current epidemic of the coronavirus disease 2019 (COVID-19) constitutes a public health emergency of worldwide concern. Ongoing research shows that factors such as immunity, environmental effects, age, heart disease and diabetes are significant contributors to the severity of this infection. In this paper, a combined machine learning model and rule-based framework is proposed to offer medical decision support. The proposed system consists of a robust machine learning model utilizing the gradient boosted tree technique to calculate a CRI index for patients suffering from COVID-19 disease. This index is a measurement of the mortality risk of a COVID-19 patient. Based on the CRI index, the system predicts the required number of ventilators in the forthcoming days. The suggested model is trained and evaluated using a real-time dataset of 5440 COVID-19-positive patients obtained from Johns Hopkins University, the World Health Organization, and a dataset of Indian COVID-19 patients obtained from the open government data (OGD) platform of India.

Keywords Ventilators · COVID-19 · CRI index · Gradient boosted tree (GBT) · Machine learning (ML)
1 Introduction

The current pandemic of COVID-19 is the result of a respiratory disease syndrome commonly known as SARS-CoV-2. More than 118,000 individuals around the world have died from COVID-19 infection [1]. Many infected patients presented mild influenza-like symptoms and recovered rapidly [2]. As the COVID-19 emergency accelerates, equipment specialists and enthusiasts around the globe are hearing the call of duty. Within the clinical infrastructure there are basic technologies that are commonly accessible; however, these technologies do not exist in sufficiently high quantities to deal with the large number of patients associated with
53
54
A. Bhati
pandemics [3]. Similarly, current scenario also prevails where critical COVID-19 infected patients struggling are all around the globe because of the absence of access to a few of these technologies [4]. Ventilators are one of the cases in this context that are as of now in basic short supply [5, 6]. Ventilators are integrated for the treatment of both flu and COVID-19 patients in serious intense respiratory failure [7, 8]. Earlier investigations have indicated that intensive care units (ICUs) won’t have adequate assets for providing a better treatment to all patients, who have a need of ventilator support in the pandemic period [9, 10]. A compelling report of Imperial College London assesses that 30% of hospitalized COVID-19 patients are probably going to require ventilator support [11]. As a result, shortage of ventilator stays unavoidable in several places of the world. Andrew Cuomo’s, Governor of New York, requested for 30,000 units of ventilator for treatment of COVID-19 patients [12]. Even in India, government has also suggested the automobile companies to produce low-cost ventilators rather than producing vehicles in such pandemic situation. Regarding their functionality, ventilators are incredibly reliable machines comprising of sophisticated pumps which control the oxygen and flow of air from the patient’s lungs, supporting them while they cannot accomplish their work. As per World Health Organization (WHO), COVID-19 can overpower our clinical facilities at the territorial level by causing rapid growth in death rates [13, 14].
2 Background of Study 2.1 COVID-19 and Mortality Risk Since the seriousness of COVID-19 infection is firmly related to the prediction, the fundamental and basic techniques to improve results are the early identification of high-risk and critically sick patients. Zhou et al. [15] reported discoveries from 191 COVID-19 patients during the primary days of the spread in Wuhan and follow the patient conditions till their discharge. Their discoveries reported about critical suffered patients having age more than 56 years, a high rate of men (62%), and almost half of patients with at least one disease (48%) [15]. In another report identified with Wuhan City of China, mortality rate was 62% among critically sick patients suffered from COVID-19, 81% of those required ventilators [16].
2.2 Machine Learning in COVID-19 Machine learning (ML) is a possibly incredible technology in the battle against the COVID-19 pandemic. ML can be valuable to diagnose, predict, help to treat COVID-19 infections, and help to manage financial effects. Since the outbreak of
Risk Index-Based Ventilator Prediction System …
55
the pandemic, there has been a scramble to utilize and investigate ML, and other scientific analytic methods, for these reasons. ML methods can precisely anticipate how COVID-19 will affect the resource needs like ventilators, ICU beds, and so forth at the individual patient level and the clinic level. In this manner giving a solid image of future resource utilization and empowering medicinal services experts to make well-informed decisions about how these rare resources can be utilized to accomplish the greatest advantage. In this paper, we used gradient boosted machine learning technique to identify optimally required number of ventilators for COVID19 patients.
2.3 Research Gap and Objective Advancement in machine learning shows a massive effect in the field of clinical science. A wide scope of research contemplates is in progress in clinical diagnosis and prediction utilizing machine learning approach. But only very limited works are available in the COVID-19 mortality risk identification and its uses on critical life support resources planning. Forecast of ventilators required utilizing COVID-19 risk identification index is the multi-class characterization problem; consequently, it forces a need to think about variables for numerous classifications. These issues force a need for a model to investigate and analyze a few parameters, to predict the event and to settle on optimal decision. In this manner, advancing proficient integration of behavioural data with patient health information offers better comprehension for prediction.
3 Materials and Methods 3.1 Dataset The propsoed research work has examined the dataset containing clinical records of 5440 COVID-19 patients collected from confirmed source, for example, John Hopkins University, WHO and open government data (OGD), and Government of India sites. The sites have announced the details of COVID-19 cases. In our experimentation, we have considered cases enlisted during month of February, March, and first week of April 2020. The patients include both women and men with age ranges from 21 to 91 years. The dataset comprises of 9 features reports about age, sexual orientation, and clinical history of patients experiencing COVID-19. In Table 1, except date of admission, age and sexual orientation, all features are of binary nature, such as high blood pressure, cardiac disease, diabetes, nervous system illness, respiratory disease, pregnancy childbirth, cancer disease, tuberculosis ailment. Table 2 displays the demise pace of humanity specifically highlight class
56 Table 1 Patient attributes with co-related coefficients to calculate CRI ındex
Table 2 Death Rate of COVID-19 disease affected by patient attributes
A. Bhati Attributes
Coefficient
Age
0.649
Heart disease
0.071
Respiratory disease
0.069
Pregnancy/childbirth
0.054
Neuro disease
0.046
Cancer
0.033
High blood pressure
0.028
Tuerculosis
0.025
Gender
0.017
Attributes
Death rate (%)
Age 80+ years old
21.90
70–79 years old
8.00
60–69 years old
3.60
50–59 years old
1.30
40–49 years old
0.40
30–39 years old
0.20
20–29 years old
0.20
10–19 years old
0.20
Sex Male
4.70
Female
2.80
Existing disease Cardiovascular
13.20
Diabetes
9.20
Chronic respiration
8.00
Hypertension
8.40
Cancer
7.60
No-pre condition
0.90
The dataset of 18,134 COVID-19 patients is split into subsets of 70% and 30% for training and testing of the models, respectively. For verification of the results, the trained model is tested on the 5440 medical records of COVID-19-positive patients in the test dataset.
3.2 COVID-19 Risk Identification Index

In the data preparation step, the proposed framework fills the missing values in the input dataset with mean values, especially for numerical data. For each feature, we have calculated a correlation coefficient, as shown in Table 1:

F = {F_0, F_1, F_2, …, F_n}    (1)

A = {A_0, A_1, A_2, …, A_n}    (2)
Here F_0, F_1, …, F_n represent the features and A_0, A_1, …, A_n represent the coefficients of the respective features selected for training the machine learning model. The COVID-19 risk identification (CRI) index can then be calculated as:

CRI_i = Σ_{i=0}^{10} F_i × A_i    (3)
The CRI index obtained from Eq. (3) for the ith patient is not normalized, and using these raw CRI values can degrade the performance of the entire learning model. Hence, the data ought to be improved in quality before training begins. Normalization of the CRI resolves this issue, so in the next step the CRI is normalized as:

CRI(N)_i = (CRI_i − Min(CRI_{0−n})) / (Max(CRI_{0−n}) − Min(CRI_{0−n}))    (4)
The normalized CRI index value obtained from Eq. (4) is calculated for every patient record, and the processed dataset is then ready for training. For training and validation of the trained model, the dataset is divided into two parts: 70% of the records are used for training and the remaining 30% are used for validating the prediction of the CRI index.
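To make Eqs. (3) and (4) concrete, the sketch below computes a raw CRI score as the weighted sum of a patient's feature values using the coefficients of Table 1, and then min–max normalizes the scores across a cohort; the example feature vectors (and the rescaling of age to [0, 1]) are invented for illustration only.

# Coefficients A_i from Table 1 (one weight per feature F_i).
WEIGHTS = {
    "age": 0.649, "heart_disease": 0.071, "respiratory_disease": 0.069,
    "pregnancy": 0.054, "neuro_disease": 0.046, "cancer": 0.033,
    "high_bp": 0.028, "tuberculosis": 0.025, "gender": 0.017,
}

def raw_cri(patient):
    """Eq. (3): weighted sum of feature values for one patient record."""
    return sum(WEIGHTS[k] * patient.get(k, 0) for k in WEIGHTS)

def normalise(scores):
    """Eq. (4): min-max normalisation of the raw CRI values."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

# Illustrative cohort: age rescaled to [0, 1], binary comorbidity flags.
cohort = [
    {"age": 0.85, "heart_disease": 1, "high_bp": 1},
    {"age": 0.30, "respiratory_disease": 1},
    {"age": 0.55, "gender": 1},
]
print(normalise([raw_cri(p) for p in cohort]))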
3.3 Gradient Boosted Tree

In order to train a prediction model, training is performed using the random forest, deep learning, gradient boosted trees, decision tree, and support vector machine approaches. The gradient boosting technique is an ML procedure for regression and classification problems which delivers a prediction model as a collection of weak prediction models, generally decision trees; that is, gradient boosting is commonly utilized with decision trees [17]. Like other boosting techniques, gradient boosting joins weak "learners" into a single strong learner in an iterative fashion. It is
easiest to explain in a least-squares setting, where the objective is to "teach" a model G to predict values of the form b̂ = G(a) by minimizing the mean squared error (1/n) Σ_i (b̂_i − b_i)², where i indexes some training set of size n of actual values of the output variable b_i, b̂_i is the value predicted by G(a), b_i is the actual value, and n is the number of samples in b. Now, consider a gradient boosting calculation with R stages. At each stage r of gradient boosting (where 1 ≤ r ≤ R), suppose there is some imperfect model G_r (for low r, this model may simply return b̂_i = b̄, the mean of b). In order to improve G_r, the calculation should add some new estimator E_r(a). Hence,

G_{r+1}(a) = G_r(a) + E_r(a) = b    (5)

At each step, G_{r+1} attempts to correct the errors of its parent G_r.
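The paper does not name a specific implementation; as one possible realization, the snippet below fits scikit-learn's gradient boosted trees to predict a normalized CRI index from the patient features, with synthetic data standing in for the real training records.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 9))                              # 9 patient features, synthetic stand-in
y = X @ np.array([0.649, 0.071, 0.069, 0.054, 0.046,
                  0.033, 0.028, 0.025, 0.017])        # CRI-like target (Table 1 weights)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05, max_depth=3)
model.fit(X_tr, y_tr)
print("R^2 on held-out 30%:", model.score(X_te, y_te))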
3.4 Ventilator Requirement Prediction

The output of the GBT model is fed to the ventilator prediction process, which takes an adaptive threshold value based on the mortality rate in the region and forecasts the expected number of ventilators required in the near future from the statistics of the last 10 days together with the predicted CRI index of the patients. The adaptive threshold is computed automatically from the mortality rate in the specific region, as the requirement for ventilators also depends on the immunity of the people living in a particular region; for example, the immunity of people living in India may differ from that of people living in other countries. Using an adaptive threshold therefore provides a better estimation.
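The exact thresholding rule is not spelled out in the paper; the following sketch shows one plausible reading, in which the adaptive threshold is lowered as the regional mortality rate rises and the ventilator demand is the count of patients whose predicted CRI exceeds it.

def adaptive_threshold(base_threshold, regional_mortality_rate, sensitivity=2.0):
    """Assumed rule: a higher regional mortality rate lowers the CRI cut-off,
    flagging more patients as likely to need ventilator support."""
    return max(0.0, base_threshold - sensitivity * regional_mortality_rate)

def predicted_ventilators(cri_scores, regional_mortality_rate, base_threshold=0.6):
    threshold = adaptive_threshold(base_threshold, regional_mortality_rate)
    return sum(1 for s in cri_scores if s >= threshold)

# Illustrative 10-day window of predicted CRI values for newly registered patients.
scores = [0.82, 0.41, 0.77, 0.55, 0.68, 0.30, 0.91, 0.62, 0.49, 0.73]
print(predicted_ventilators(scores, regional_mortality_rate=0.03))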
4 Mathematical Support for Proof of Concept

To check the acceptability of the proposed system, we use the T-test statistical method. In our case, the T-test allows us to compare the mean of the number of required ventilators obtained from our prediction model with the actual number of ventilators used for curing COVID-19 patients. The T-test for a single mean can be given as:

t = (X̄ − µ) / (S / √n)    (6)
where X̄ and µ are the sample mean (calculated from the predicted output) and the population mean (actual output), respectively, which can be calculated using Table 3. S represents the standard deviation of the predicted output and n is the total number of samples used.
Table 3 Predicted number of ventilators required versus actual ventilators from the testing dataset
Date          No. of patients registered   Actual ventilators used during treatment   Predicted required no. of ventilators   % Accuracy
20-Mar-2020        248                        38                                        32                                    84.21
04-Apr-2020      3,299                       331                                       296                                    89.42
12-Apr-2020      7,790                       372                                       331                                    88.97
19-Apr-2020     13,888                       446                                       417                                    93.49
26-Apr-2020     20,483                       538                                       510                                    94.79
03-May-2020     29,549                       612                                       579                                    94.60
10-May-2020     43,989                       752                                       703                                    93.48
17-May-2020     55,875                       881                                       799                                    90.69
24-May-2020     76,809                       971                                       901                                    92.79
01-Jun-2020     97,008                      1024                                       956                                    93.35
09-Jun-2020    133,579                      2241                                      2159                                    96.34
17-Jun-2020    160,517                      2839                                      2607                                    91.82
25-Jun-2020    190,156                      3512                                      3374                                    96.07
03-Jul-2020    236,832                      4587                                      4302                                    93.78
The degree of freedom is (n − 1). The simplified form of Eq. (6) is given by Eq. (7):

t = |36.1 − 33.4| / (16.07 / √10)    (7)
Degree of freedom = 9 and t_cal = 0.91. Using the one-tailed T-table value at α = 0.01, t_{9,0.01} = 2.821. Because t_cal (0.91) is less than the tabulated critical value (2.821), the difference between the predicted and the actual number of ventilators is not statistically significant, which supports the acceptability of the proposed system.

This token then replaces the named entity in the article and the headline. The named entities of the article that occur in the headline are highlighted in italic. Tokenization: The final preprocessing step is to create tensor objects out of the pairs of articles and headlines. A start-of-sentence token and an end-of-sentence token are added at the beginning and the end of each article and headline, respectively. All articles and headlines are truncated to a length of 80 and 25 characters, respectively (punctuation included). An additional padding token is added to the articles and headlines until they meet the desired sizes of 80 and 25, respectively. When an unknown word appears during testing, the out-of-vocabulary (OOV) token is used, but the preprocessing with spaCy mitigates the need for an OOV token. Every word and every token is assigned a number in the vocabulary, which is then used to create the tensors that can be taken by the embedding layer.
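As a rough illustration of the preprocessing described above, the snippet below uses spaCy to replace named entities with category placeholder tokens and to record the entity dictionary used later to restore them; the placeholder spelling and the example text are assumptions where the extracted description is ambiguous.

import spacy

nlp = spacy.load("en_core_web_sm")   # any spaCy English pipeline with NER

def mask_entities(text):
    """Replace each named entity with a <LABEL> placeholder and keep a
    label -> [entities] dictionary so the headline can be restored later."""
    doc = nlp(text)
    entity_dict, masked = {}, text
    for ent in doc.ents:
        entity_dict.setdefault(ent.label_, []).append(ent.text)
        masked = masked.replace(ent.text, f"<{ent.label_}>")
    return masked, entity_dict

article = ("The total number of coronavirus cases in India has risen to 12,759 "
           "after over 5000 cases were reported in last five days.")
masked, ents = mask_entities(article)
print(masked)
print(ents)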
3.2 Architecture The mechanism of the three approaches: (1) Seq2Seq with attention model, (2) The transformer, (3) Pointer generator model has been described in this section. Sequence to sequence with Attention: It is a classic example of an encoder–decoder model [2] with the encoder responsible for creating context vector representations from the news articles that are provided and the decoder for generating a headline. The headline is generated word by word, as the model calculates the attention given to the encoder representations at every instant. The Encoder: It consists of a trainable embedding layer for which the Glove [10] 300-dimensional word embedding is used. The recurrent layer is chosen to be of
Table 2 Shows an article and its headline before and after the preprocessing steps and the corresponding named entity dictionary Article
Headline
Original text
(Article) The total number of coronavirus cases in India has risen to 12,759 after over 5000 cases were reported in last five days. Meanwhile, the coronavirus death toll in India has risen to 420, while 1515 Covid-19 patients have been cured, discharged or migrated. Earlier today, the Health Ministry revealed that 941 new cases and 37 deaths were reported on Wednesday
(Headline) India reports more than 5000 coronavirus cases in 5 days, total cases rise to 12,759
Recognition and classification of named entities
GPE—[India] CARDINAL—[12,759, 5000, 420, 1515, 37] DATE—[5 days, Wednesday] TIME—[Earlier today] ORG—[the Health Ministry]
GPE—[India] CARDINAL—[5000, 12,759] DATE—[5 days]
After replacing the tokens
(Article) The total number of coronavirus cases in <GPE> has risen to <CARDINAL> after over <CARDINAL> cases were reported in last <DATE>. Meanwhile, the coronavirus death toll in <GPE> has risen to <CARDINAL>, while <CARDINAL> Covid-19 patients have been cured, discharged or migrated. <TIME>, <ORG> revealed that <CARDINAL> new cases and <CARDINAL> deaths were reported on <DATE>
(Headline) <GPE> reports <CARDINAL> coronavirus cases in <DATE>, total cases rise to <CARDINAL>
bi-directional long short-term memory networks (LSTM) in order to better preserve long-term dependencies. The article is given to the encoder word by word x1 … x j . The hidden representations at every instant of the encoder are used to calculate the attention later in the decoder part as shown in Fig. 2. The Decoder: Every word of the headline (y1 … y2 ) is generated one-by-one in the decoder as the attention mechanism calculates the attention over the encoder inputs [3]. At every instant of the decoder for generating the new word, the significance e jt (the importance of the jth word of the headline on the tth time-step of the decoder) is calculated using the hidden states of the encoder h t and the previous state of the decoder st−1 .
Retaining Named Entities for Headline Generation
229
Fig. 2 Encoder–decoder LSTM model with attention [3]
Equation 1 [3] gives the actual representation of the significance. The matrices Uatt and Watt are used to get the vectors st−1 and h j , respectively, to the same dimension. T is a matrix that leaves us with a scalar e jt . Softmax is applied to the values The Vatt of e jt which gives us alphas. The output vectors at every instance are multiplied by their corresponding scaling factor and are added to form one vector. The final dense layer is of the size of the vocabulary, and cross entropy loss is the loss function. During training, there is a 50% chance of the predicted word being sent in as the next input to implement teacher-forcing. This would result in the model being not completely reliant on the proper input while training and would function better in the testing scenarios T tanh Uatt st−1 + Watt h j e jt = Vatt
(1)
The Transformer: The model (Fig. 3) chosen is the exact rendition of the one from the paper. Attention is all you need [5]. The article is positionally encoded after every word is assigned its 300-dimensional embedding. This is important because the transformer does not rely on recurrence. Hence, an idea of order is required for the model to understand sequence data. Each side of the transformer consists of six encoders and decoders, having multiheaded attention of eight heads for better focus over the article. For calculating
230
B. Singh et al.
Fig. 3 Transformer model [5]
self-attention, a set of three matrices are multiplied with every input word producing three vectors, i.e., query, key and value. The query and key vectors of every input word are used to calculate constants that scale the value vector. This determines the impact of every other word on the current word. This represents a single head of attention, and n such sets of matrices are used to find n-headed attention. The transformed vectors are added to the original ones and are followed by normalization. This occurs in every encoder for 6 times, which leads to generating a contextrich vector, and it is fed to every decoder from the decoder stack. The output of the final decoder from the stack is fed to a dense layer of the size of the vocabulary to predict the next word using the categorical cross entropy loss. Pointer Generator: The paper [4] presents a new architecture for abstractive text summarization that augments the standard sequence-to-sequence attentional model. In this method, a hybrid pointer generator network is used that can not only copy words from the source text via pointing, which aids accurate reproduction of information, but also produces novel words through the generator. Further, the generation probability pgen ∈ [0, 1] for time-step t is calculated from the context vector h ∗t , the decoder state st and the decoder input xt using Eq. 2 [4]. pgen = σ whT∗ h ∗t + wsT st + wxT xt + bptr
(2)
where vectors wh ∗ , ws , wx and scalar bptr are learnable parameters and σ is the sigmoid function. Now, this value of pgen is used to determine whether the words should be picked from the article directly or from the original vocabulary distribution. One of the main advantages of the pointer generator model is its ability to produce out of vocabulary words, by contrast, other text summarization models are restricted to their pre-set vocabulary.
Retaining Named Entities for Headline Generation
231
4 Result This paper is primarily focusing on addressing the issues in existing architectures of text summarization by extending them to improve the accuracy of the generated headlines. The emphasis on identifying the named entities provides better accuracy in news article summarization. In this section, a walk-through of the system demonstrating the experimental result obtained from the system are presented first. The section ends with an evaluation study using the ROGUE score to assess the efficiency of these models. Walk-through of the system: During preprocessing, every article is assigned a dictionary in which the keys are the named entity tokens and the values are lists of named entities of that category. This step requires the spaCy library and is carried out for every article that is preprocessed. Now, when the model predicts any of the named entity tokens as the next word, it can be replaced with the named entities of that article of the same category. If there exists more than one named entity of the same category, then it simply finds the most suitable permutation with a sentence similarity between the generated headline and the original article. Table 3 provides an overall process of predicting the headline for a particular article. Some more examples of the predicted headlines are shown in Tables 4 and 5. Experimental Evaluation: In this section, a comparison between the various models is carried out and analyzed using the ROGUE metric. It is an abbreviation for recall oriented understudy for gisting evaluation. It is a mechanism for analyzing automated summarization of text and also machine translation. Basically, it compares the generated headlines to the original headlines. In Table 6, the average of the F1 scores for ROGUE-1, ROGUE-2, ROGUE-L is shown, which measures the word-overlap Table 3 Obtaining usable results Article
Congress leader Rahul Gandhi has said that the recent video of two young dalit men “being brutally tortured in Nagpur, Rajasthan is horrific and sickening” and urged immediate action. Meanwhile, Rajasthan CM Ashok Gehlot said, “Seven accused have been arrested … we will ensure that the victims get justice”. The two men were allegedly beaten up on the suspicion of stealing money
Predicted headline
Brutal torture of dalits in horrific, sickening:
After replacing with named entities of the same category
Brutal torture of dalits in [Rajasthan, Nagpur] horrific, sickening: [Rahul Gandhi, Ashok Gehlot]
Best permutation according to sentence similarly
brutal torture of dalits in Rajasthan horrific, sickening: Rahul Gandhi
232
B. Singh et al.
Table 4 Example 1
Original article: Delhi-based diabetes management app BeatO has raised over 11 crore in a pre-Series A funding round led by Orios Venture Partners. The funding round also saw participation from existing investors Blume Ventures and Leo Capital. Founded in 2015 by Gautam Chopra, Yash Sehgal and Abhishek Kumar, BeatO offers diabetes management programmes to users via a smartphone app
Original headline: Diabetes management app BeatO raises 11 crore led by Orios
Transformer headline: Diabetes management app BeatO raises 11 crore led by Orios
Pointer generator headline: Diabetes management app BeatO raises 11 crore in series by
Seq2Seq with attention headline: Diabetes management app
Table 5 Example 2
Original article: The TMC is leading in Kharagpur Sadar and Karimpur seats in the West Bengal Assembly by poll. Meanwhile, BJP is leading in the Kaliaganj seat. The Kaliaganj by poll was necessitated following the death of sitting Congress MLA Pramatha Nath Roy, while Kharagpur Sadar and Karimpur seats had fallen vacant after the sitting MLAs were elected as MPs in the LS polls
Original headline: TMC leading in 2 of 3 seats in West Bengal by poll
Transformer headline: TMC leading in 2 seats in West Bengal by poll
Pointer generator headline: TMC leading in 2 TMC seats in West Bengal assembly
Seq2Seq with attention headline: TMC TMC company seats in polls
Table 6 Comparison of results on the basis of ROUGE metrics
Architectures            ROUGE-1   ROUGE-2   ROUGE-L
Transformer              0.335     0.162     0.521
Pointer generator        0.369     0.157     0.493
Seq2Seq with attention   0.216     0.091     0.225
These metrics measure the word overlap (unigram, bigram and longest common subsequence, respectively) between the generated and original headlines, averaged over the entire test set. From Table 6, it can be inferred that the pointer generator performs better than the transformer and Seq2Seq models on the ROUGE-1 metric, as it has a special mechanism for pointing at single words directly from the article. On the other metrics, however, the transformer model outperforms the rest in preserving the semantics and the named entities over the entire headline. The basic Seq2Seq model not only lacks a mechanism to point important words directly to the output, but also has no extensive self-attention architecture like the transformer; hence, its ROUGE scores are low for both short- and long-range dependencies.
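For reference, a from-scratch illustration of the ROUGE-N F1 computation is given below. It is not the evaluation script used by the authors (ROUGE-L, stemming and stop-word handling are omitted), only a sketch of the n-gram overlap idea.

```python
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(reference, candidate, n=1):
    """ROUGE-N F1: clipped n-gram overlap between a reference and a generated headline."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())
    if not ref or not cand or overlap == 0:
        return 0.0
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    return 2 * precision * recall / (precision + recall)

# Example: compare a generated headline against the original one.
print(rouge_n_f1("TMC leading in 2 of 3 seats in West Bengal by poll",
                 "TMC leading in 2 seats in West Bengal by poll", n=1))
```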
5 Conclusion and Future Work This paper has presented an approach for adapting existing text summarization models to generate crisp headlines from news articles. It is observed that the Seq2Seq-with-attention and pointer models suffer from repetitions during headline generation; nevertheless, the pointer model performs well under most circumstances, and the transformer model gives the best results of the three. A new technique for retaining important named entities has been presented here, and it produces more natural and meaningful headlines. The proposed system would be a stepping stone toward automating foolproof headline generation for the latest AI-based news platforms such as Inshorts. The same system can also be modified and trained for related use cases such as legal document analysis, stock market prediction based on news, or summarization of customer feedback on products, where retaining named entities is essential.
References 1. Inshorts.com (2020) Breaking news headlines: Read All news updates in English—Inshorts. Available at: https://inshorts.com/en/read. Accessed 4 August 2020 2. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks 3. Luong M-T, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation 4. See A, Liu PJ, Manning CD (2017) Get to the point: summarization with pointer-generator networks 5. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need 6. ETtech.com (2020) Inshorts debuts ai-based news summarization on its app— Ettech. Available at https://tech.economictimes.indiatimes.com/news/startups/inshorts-debutsaibased-news-summarization-on-its-app/64531038. Accessed 4 Aug 2020 7. Masum KM, Abujar S, Tusher RTH, Faisal F, Hossai SA (2019) Sentence similarity measurement for Bengali abstractive text summarization. In: 2019 10th international conference on computing, communication and networking technologies (ICCCNT), Kanpur, India, 2019, pp 1–5. https://doi.org/10.1109/ICCCNT45670.2019.8944571 8. Hanunggul PM, Suyanto S (2019) The impact of local attention in LSTM for abstractive text summarization. In: 2019 international seminar on research of information technology and intelligent systems (ISRITI), Yogyakarta, Indonesia, 2019, pp 54–57. https://doi.org/10.1109/ ISRITI48646.2019.9034616 9. Mohammad Masum K, Abujar S, Islam Talukder MA, Azad Rabby AKMS, Hossain SA (2019) Abstractive method of text summarization with sequence to sequence RNNs. In: 2019 10th international conference on computing, communication and networking technologies (ICCCNT), Kanpur, India, 2019, pp 1–5. https://doi.org/10.1109/ICCCNT45670.2019.8944620 10. Pennington J, Socher R, Manning C (2014) Glove: global vectors for word representation. EMNLP 14:1532–1543. https://doi.org/10.3115/v1/D141162 11. Spacy.io (2020) Industrial-strength natural language processing. Available at: https://spacy.io/. Accessed 4 August 2020 12. Partalidou E, Spyromitros-Xioufis E, Doropoulos S, Vologiannidis S, Diamantaras KI (2019) Design and implementation of an open source Greek POS Tagger and Entity Recognizer
using spaCy. In: 2019 IEEE/WIC/ACM international conference on web intelligence (WI), Thessaloniki, Greece, 2019, pp 337–341 13. Li J, Sun A, Han J, Li C (2018) A survey on deep learning for named entity recognition. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2020.2981314 14. Janjanam P, Reddy CP (2019) Text summarization: an essential study. In: 2019 international conference on computational intelligence in data science (ICCIDS), Chennai, India, 2019, pp 1–6. https://doi.org/10.1109/ICCIDS.2019.8862030 15. Partalidou E, Spyromitros-Xioufis E, Doropoulos S, Vologiannidis S, Diamantaras KI (2019) Design and implementation of an open source Greek POS Tagger and entity recognizer using spaCy. In: 2019 IEEE/WIC/ACM international conference on web intelligence (WI), Thessaloniki, Greece, pp 337–341 16. Modi S, Oza R (2018) Review on abstractive text summarization techniques (ATST) for single and multi-documents. In: 2018 international conference on computing, power and communication technologies (GUCON), Greater Noida, Uttar Pradesh, India, pp 1173–1176. https://doi.org/10.1109/GUCON.2018.8674894
Information Hiding Using Quantum Image Processing State of Art Review S. Thenmozhi , K. BalaSubramanya, S. Shrinivas, Shashank Karthik D. Joshi, and B. Vikas
Abstract The bottlenecks of the digital image processing field come down to memory consumption and processing speed, both of which can be addressed by performing image processing in the quantum domain. In this Internet era, all information is exchanged or transferred over the web, which necessitates maintaining the security of the transmitted data. A variety of techniques are available for secret communication. A quantum steganography scheme conceals a quantum secret message or image inside a quantum cover image, and many algorithms, such as LSB qubits and QUALPI, are available for embedding the secret data into the quantum cover. This paper discusses secret data transmission using quantum image steganography. Keywords Quantum image steganography · Quantum secure communication · Quantum log-polar image · Quantum image expansion · LSB qubits
1 Introduction Quantum computation uses the properties of quantum mechanics such as entanglement state and superposition to store, process and retrieve the data. Considering an electron as a basic element, it has two states namely spin up (Bit1) and spin down (Bit0). According to quantum mechanics, the total angular momentum of an electron can be represented as a superposition of both spins up and spin down. The representation S. Thenmozhi (B) · K. BalaSubramanya · S. Shrinivas · S. K. D. Joshi · B. Vikas ECE, Dayananda Sagar College of Engineering, Bangalore, India e-mail: [email protected] K. BalaSubramanya e-mail: [email protected] S. K. D. Joshi e-mail: [email protected] B. Vikas e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_18
of quantum bits is shown in Fig. 1. The concept of quantum computation was first proposed by Richard Feynman in 1982. Shor's quantum integer factoring methodology, proposed in 1994, and the search algorithm proposed by L. K. Grover in 1996 (named after him) opened a new and practical way of computation. By the late 1990s, quantum computing had become a hot topic in information science and technology. Quantum information hiding, a part of quantum computing, is divided into two parts, namely quantum watermarking and quantum image steganography. Due to advancements in information technology, with large volumes of data being transferred through the Internet, it is essential to have secure communication between end users. This has led to the rapid development of quantum multimedia technology and quantum image steganography, which hides information inside an image in such a way that the information is completely masked and an eavesdropper will never know about its existence. Many quantum image steganography models have been developed since. In [1], the authors proposed a qubit lattice model in 2003; in 2010, this model was improved with an entangled representation to store statistical information [2]. In [3], the authors proposed a real-ket model of quantum steganography in 2005. A new Flexible Representation for Quantum Images (FRQI) model was presented in 2011 [4]; for the first time, a method that took both position and intensity into consideration was proposed. In 2014, Yi Zhang et al. proposed the novel enhanced quantum representation (NEQR) [5]. This method is similar to FRQI, except that FRQI encodes the grayscale value in a single qubit amplitude, whereas NEQR uses a superposition of qubit sequences. To enhance the performance of quantum steganography, the quantum log-polar image representation (QUALPI) was proposed by Yi Zhang et al. in 2013 [6]. The advantages of quantum steganography over conventional methods are briefly described in Table 1. Fig. 1 Representation of quantum bits
(Figure: Bit 0, Bit 1 and the superposition of Bit 0 and 1)
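For reference, the superposition sketched in Fig. 1 can be written in the standard form below; this equation is not given explicitly in the paper and is added only as a reminder of the notation.

```latex
% A single qubit as a superposition of the basis states |0> (spin down) and |1> (spin up);
% |alpha|^2 and |beta|^2 are the probabilities of measuring 0 or 1.
\[
  |\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle,
  \qquad |\alpha|^2 + |\beta|^2 = 1 .
\]
```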
Table 1 Comparison of classical and quantum steganography
Conventional steganography: The basic memory unit is expressed in bits. Quantum steganography: The basic memory unit is expressed in qubits.
Conventional steganography: A bit can take on a value of '0' or '1' at a particular time. Quantum steganography: A qubit can be in a superposition of '0' and '1' simultaneously.
Conventional steganography: Only one of the 2^n states can be represented using 'n' bits. Quantum steganography: With 'n' qubits, all 2^n states can be represented simultaneously.
Conventional steganography: Classical steganography has many vulnerabilities; for example, a secret key used during the embedding process can be compromised. Quantum steganography: Due to the no-cloning theorem, an exact copy of the secret key cannot be obtained.
Conventional steganography: Computational speed is linear; for a database search, it takes a time complexity of O(n) to identify the required data. Quantum steganography: A database search on a quantum computer has a time complexity of the order O(√n), by making use of Grover's search algorithm.
2 Literature Review Quantum image steganography has been implemented using a variety of techniques, which can broadly be divided based on (1) the embedding technique used and (2) the type of cover chosen. In this paper, the cover of interest is an image, so the literature review concentrates on image covers and the various embedding techniques.
2.1 LSQbits-Based Methods The paper [7] here proposed a new matrix-coded quantum steganography algorithm which made use of the quantum color image. Here, a covert and quantum secure communication is established by taking into account the good invisibility and higher efficiency in terms of embedding of the matrix coding. Two embedding methods were applied in this paper. The first being single pixel-embedded (SPE) coding where three least significant qubits (LSQbs) of single quantum carrier image pixel were embedded with two qubits of secret message. The second method used was multiple pixels-embedded (MPsE) coding where three least significant qubits (LSQbs) of different pixels of the carrier quantum image were embedded with two qubits of secret message. The PSNR values of the embedding methods used here were found to be higher than other methods referenced in the paper. Combining PSNR and histogram analysis, it is shown that this protocol achieves very good imperceptibility. The protocol is also shown to have good security against noises in the quantum channel
and various attacks. The embedding efficiency and capacity of single pixel-embedded coding are shown to be 2.67 and 2^{2n+1}, respectively, and those of MPsE are 2.67 and 2^{2n+1}/3. In [8], the authors proposed three quantum color image steganography algorithms involving the least significant bit (LSB) technique. Algorithm one utilized a generic LSB technique, in which information bits of the secret data were substituted in place of the LSB values of the pixel intensities; it used a single image channel to hide the secret information. Algorithm two made use of an LSB XORing technique and also utilized a single image channel to cover the secret data. Algorithm three made use of two channels of the cover image to hide a secret color image; as the number of channels increased, the capacity of the third algorithm also increased. An image key was used in all three algorithms for embedding and extracting the secret data. The evaluation parameters considered were invisibility, robustness and capacity. The PSNR values observed were around 56 dB for the first algorithm, around 59 dB for the second and around 52 dB for the third; the quality of the stego image obtained with the second algorithm was better than that of the other two. As the third algorithm used two channels of the cover image, its capacity was enhanced to 2 bits/pixel, whereas the other two algorithms achieved 1 bit/pixel. In [9], a quantum carrier image was first prepared using the NEQR model, employing two qubit sequences to accommodate the grayscale intensity as well as the position of each pixel. In EMD embedding, a group of N pixels is formed, and every secret digit of the hidden message, expressed in (2N + 1)-ary notation, is embedded into that group. During this embedding, at most a single pixel of the cover image is modified, or the cover pixels remain unchanged; if a cover pixel value is modified, it is either incremented or decremented by one. This implies that, for N cover pixels, (2N + 1) different transformations are needed to obtain the (2N + 1) values of a secret digit. The advantage of EMD embedding is that it provides good image quality, with a PSNR exceeding 52 dB, and the algorithm achieves high embedding efficiency, security and imperceptibility of the secret information; however, as N becomes larger, the embedding rate reduces. In [10], a (2^n × 2^n)-sized cover image and a (2^{n−1} × 2^{n−1})-sized watermark image were initially modeled by the Novel Quantum Representation of Colour Digital Images Model (NCQI). The watermark was scrambled into an unordered form through an image preprocessing technique that simultaneously changes the positions of the pixels and the color information, based on the Arnold transformation. The (2^{n−1} × 2^{n−1})-sized scrambled watermark image with a gray intensity range of 24 qubits was then expanded into a (2^n × 2^n)-sized image with a gray intensity range of 6 qubits using the nearest-neighbour interpolation method. This watermark image was embedded onto the carrier by an LSB steganography scheme, by substituting the least significant bits of the pixels of the three channels of the cover image, i.e., red, green and blue.
In the meantime, a (2^n × 2^n)-sized key image carrying 3 qubits of information was also created to retrieve the actual watermark image. The extraction process is simply the inverse of the embedding process. The PSNR value for the algorithm exceeds
54 dB, which indicates that the imperceptibility of the cover image is not affected by the embedding of a watermark. The proposed scheme, thus, provides good visual quality, robustness, steganography capacity and lower computational complexity.
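As a point of comparison for the LSQb schemes above, the sketch below shows the classical-bit analogue of LSB embedding and extraction. It is not a quantum implementation and not taken from the surveyed papers; the toy cover image and secret bits are illustrative only.

```python
import numpy as np

def lsb_embed(cover, secret_bits):
    """Classical analogue of LSB embedding: write secret bits into the
    least significant bit of each 8-bit cover pixel (flattened order)."""
    stego = cover.flatten().copy()
    for i, bit in enumerate(secret_bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the secret bit
    return stego.reshape(cover.shape)

def lsb_extract(stego, n_bits):
    return [int(p) & 1 for p in stego.flatten()[:n_bits]]

cover = np.random.randint(0, 256, (4, 4), dtype=np.uint8)   # toy 4x4 grayscale cover
secret = [1, 0, 1, 1, 0, 0, 1, 0]
stego = lsb_embed(cover, secret)
assert lsb_extract(stego, len(secret)) == secret
```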
2.2 FRQI-Based Methods This work [11] proposes three strategies which involve the design of new geometric transformation performed on quantum images. The proposed design focused on affected regions in an image, separability and smooth transformations by representing an image in a quantum computer. The first strategy considered transformations that considered parts of a quantum image. More controls were added to show information about parts present in a quantum image. The second method took the separability present in classical operation to transformation in the quantum state. By making use of the flexible representation for quantum image (FRQI) model, it was feasible to examine and define separable and geometric transformations. Third method aimed at the transformations taking place smoothly. Multi-level controls which were used by the cyclic shift transformations were the primary technique in obtaining smooth transformation. The methods proposed in the paper provided top-level tools for expanding the number of transformations required for building practical applications dealing with image processing in a quantum computer. It is also shown that the design of a quantum circuit with a lesser complexity for inconsistent geometric transformation is feasible [6]. In this paper, the author proposed FRQI, a method in which images are mapped onto its quantum form, in a normalized state which captures information about colors and positions. The quantum image compression algorithm starts with the color group. From the same color group, Boolean min-terms are factored. By combining all the min-terms, a min-term expression is created. In the next step, the minimization of min-terms is done. At the final step, minimized Boolean expression of output is obtained. The paper has three various parameters evaluated depending upon the unitary transformation on FRQI dealing only with color images, color images with its current positions and a union of both color and pixel points. Considering the application of QIC algorithm on a unit-digit binary image, the compression ratios vary from 68.75 to 90.63%, and considering a gray image, the value varies from 6.67 to 31.62% [12] This paper focuses on estimating the similarity between quantum images based on probabilistic measurements. The similarities between the two images were determined by the possibility of amplitude distribution from the quantum measurement process. The methodology utilized in this paper for representing the quantum state is FRQI. The obtained quantum image was then passed on through a Hadamard gate to recombine both the states, and then, it is followed by quantum measurement operation. The result of the measurement was dependent on the differences in the two quantum images present in the strip. The probability of getting a 0 or 1 was dependent on the pixel differences among the two quantum images in the strip, and this was determined through quantum measurements. Comparing a 256 × 256 grayscale original image with the same size watermarked image, the
similarity value was found to be 0.990. When the same-sized darkened image was compared with the original image, the similarity was found to be 0.850; hence, two images are more similar when the similarity value is nearer to one [4]. The protocol used in this paper enhances the existing FRQI model by using a qubit sequence to store the grayscale information. The method starts by converting the intensity values of all the pixels into a ket vector. Then, a tensor product of each position and its intensity is taken to form a single qubit sequence, thereby converting a traditional image into a quantum image. At the receiver, the inverse operation, a quantum measurement, is performed to retrieve the classical image. The computational time of NEQR is found to be very low, and its compression ratio was also found to be better than that of FRQI.
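A toy classical simulation of this kind of basis-state encoding is sketched below. It is not taken from the surveyed papers: it simply builds the normalized state vector that an NEQR-style encoding of a 2^n × 2^n image with 2^q gray levels would describe, with the image size and bit depth chosen arbitrarily.

```python
import numpy as np

def neqr_like_state(img, q=8):
    """Classical simulation of an NEQR-style encoding: each basis state
    |intensity> (x) |position> of the image gets amplitude 1/sqrt(#positions)."""
    n_pos = img.size                      # number of pixel positions (2^(2n))
    state = np.zeros(n_pos * 2**q)        # full state vector over |intensity, position>
    amp = 1.0 / np.sqrt(n_pos)            # equal superposition over positions
    for pos, intensity in enumerate(img.flatten()):
        state[int(intensity) * n_pos + pos] = amp
    return state

img = np.array([[0, 128], [255, 64]], dtype=np.uint8)    # toy 2x2 grayscale image
psi = neqr_like_state(img)
assert np.isclose(np.linalg.norm(psi), 1.0)               # a valid normalized quantum state
```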
2.3 NEQR-Based Methods This paper deals with a methodology for hiding a grayscale image into a cover image [13]. An (n/2 × n/2)-sized secret grayscale image with a gray intensity value of 8bits was expanded into a (n × n)-sized image with a gray value of 2-bits. This secret gray image and a (n × n)-sized cover image were represented using NEQR model which stores the color information and position of every pixel in the image. The obtained secret image, in quantum form, was scrambled using “Arnold Cat map” before starting the process of embedding. Later, the quantum secret image, which underwent scrambling, was embedded onto a cover image in quantum form using two “Least Significant Qubits (LSQb)”. The process of extracting requires the steganographic image alone to extract the secret image embedded. This scheme achieves high capacity, i.e., 2-bits per pixel which is significantly higher compared to other schemes in the field of quantum steganography. The security of this scheme is enhanced since this method involves scrambling the image before embedding. The PSNR achieved by this scheme accounts to a value around 43 dB which is higher when compared with Moiré pattern-based quantum image steganography, but less when compared with other LSB techniques. In this proposed paper, initially, the image to be encrypted was mapped onto a NEQR model which stores pixel values and pixel positions in an entangled qubit sequence [5]. A chaotic map called logistic map was used to generate chaotic random sequences. The process of encryption of the carrier image includes three stages, namely, intra bit permutation and inter bit permutation and chaotic diffusion. The intra bit permutations and inter bit permutations were operated on the bit planes. The intra bit permutation was accomplished by sorting a chaotic random sequence, which modified the position of the bits, while the pixel weight remained the same. As the percentage of bit 0 and bit 1 was roughly the same in each and every bit plane, all the bits were uniformly distributed due to the permutation operations. The inter bit permutation was operated between different bit planes, which, simultaneously, modified the grayscale information as well as the information of the pixel. This was achieved by choosing two-bit planes and performing Qubit XOR operations on them. Finally, a chaotic diffusion procedure was put forth to retrieve
the encrypted text image, which was facilitated using an XORing of the quantum image. The chaotic random sequence generated from a logistic map determined the controlled-NOT gates, which was significant in realizing the XOR operations. The parameters to the logistic map were found to be sensitive enough to make the keyspace value large enough. Larger the keyspace value, the more difficult it is to perform the brute-force attack. This methodology not just altered the grayscale intensities and the positions of the pixels, yet, in addition, the bit distribution was observed to be more uniform progressively. According to the simulation output, the proposed technique was found to be more proficient than its classical equivalent. The security accomplished is confirmed by the measurable examination, the sensitivity of the keys and keyspace investigation. When compared with the classical image cipher techniques, mathematical entanglement of the proposed approach was found to be lesser. The PSNR value for a grayscale image of size 256 × 256 was found to be 8.3956 dB as opposed to an image cipher algorithm implemented using no linear chaotic maps and transformation whose value was found to be 8.7988 dB [14]. This paper introduces a novel, keyless and secure steganography method for quantum images dealing with Moiré pattern. Here, the proposed methodology consists of two steps. Initially, they carried out the embedding operation where a secret image was embedded onto a preliminary Moiré grating of the original cover image which resulted in Moiré pattern. Here, the preliminary Moiré grating was modified in accordance with the secret image to result in a final Moiré pattern. The workflow of the embedding operation consisted of three steps. First, a preliminary Moiré grating was under consideration, and the user had the flexibility in choosing the same. Second, a deformation operation was performed to generate a Moiré pattern by making use of the preliminary grating and the image which was needed to be hidden. Finally, denoising was performed which transformed the obtained Moiré pattern to a steganographic image. The second phase of the methodology dealt with the extraction of the secret image by making use of the preliminary grating and an obtained Moiré pattern. Evaluation parameters considered here were visual effects and robustness. PSNR was performed in displaying the steganography scheme’s accuracy. Even though the PSNR value was observed to be around 30 dB, not much noticeable change was found between the cover image and stego image. For the sake of understanding robustness of the proposed scheme, the addition of salt & pepper noise with various densities was done to stego image. The secret image extracted was easily identifiable and robust against the addition of the salt and pepper noises. The stego image was under the influence of cropping attack, and the extracted secret image from the cropped stego image consisted of a few non-adjacent parallel black lines attached. Even though they had observed the appearance of parallel black lines, the meaning and content of a hidden image were observed conveniently.
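The Arnold (cat map) scrambling used as a preprocessing step in the schemes above has a simple classical-pixel form, sketched below. This is an illustration only: quantum versions implement the same permutation with unitary circuits, and the image size and iteration count here are arbitrary.

```python
import numpy as np

def arnold_cat_map(img, iterations=1):
    """Classical Arnold cat map scrambling of an N x N image:
    (x, y) -> (x + y mod N, x + 2y mod N). Repeated application eventually
    returns the original image, which is why it is popular for scrambling."""
    n = img.shape[0]
    out = img.copy()
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                scrambled[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = scrambled
    return out

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
print(arnold_cat_map(img, iterations=3))
```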
2.4 QUALPI-Based Methods The paper [15] proposes a new quantum image steganography method built on a quantum image representation called QUALPI, which uses log-polar images to prepare the quantum image model. This is followed by quantum image expansion, in which an atlas consisting of several superposed quantum image copies is produced. The secret information is then embedded into the expanded quantum image by choosing one particular image copy out of the atlas and embedding the secret information onto that copy. At the receiver, Grover's search algorithm, which reduces the time complexity of searching for a record in an unsorted database, is used to retrieve the secret information. This work considered three performance parameters, namely imperceptibility, capacity and security. Because the secret information is embedded in only one of the many image copies, and a smaller expansion angle yields a more complex atlas, better imperceptibility and thus greater security against eavesdroppers are obtained. In [16], a novel representation for quantum images named the quantum log-polar image (QUALPI) was introduced, which processes and stores an image sampled in log-polar coordinates. The QUALPI preparation procedure is as follows. First, the classical image is converted into an image sampled in log-polar coordinates. For an image of size 2^m × 2^n with 2^q grayscale values, a register of (m + n + q) qubits is defined to store the image information as a qubit sequence, also referred to as a ket. An empty ket is initialized with all grayscale intensities set to zero, and then all pixels are set to their appropriate intensities, giving the final quantum image representation named QUALPI. The time complexity involved in storing a 2^m × 2^n log-polar image with 2^q gray levels is O(q(m + n) · 2^{m+n}). Common geometric transformations, such as rotations and symmetric transformations, are performed more conveniently with QUALPI than with other representations such as NEQR and FRQI.
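A rough classical illustration of the log-polar resampling that QUALPI stores is given below. The bin counts and nearest-neighbour interpolation are arbitrary choices for illustration; a quantum QUALPI preparation would encode these (ρ, θ) samples into the (m + n + q)-qubit register described above rather than into an array.

```python
import numpy as np

def to_log_polar(img, rho_bins=64, theta_bins=64):
    """Resample a square grayscale image onto a log-polar (rho, theta) grid."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = np.hypot(cy, cx)
    out = np.zeros((rho_bins, theta_bins), dtype=img.dtype)
    for i in range(rho_bins):
        r = max_r ** (i / (rho_bins - 1))          # logarithmically spaced radius: 1 .. max_r
        for j in range(theta_bins):
            t = 2 * np.pi * j / theta_bins
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < h and 0 <= x < w:
                out[i, j] = img[y, x]               # nearest-neighbour sample
    return out
```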
2.5 Other Methods In this paper, a new technique for constructing substitution boxes [17] dealing with quantum walks nonlinear properties was presented. Quantum walks are universal quantum computational models used for designing quantum algorithms. The performance of this method was evaluated by an evaluation criterion called S-box. Also, a novel method for steganography of images was constructed using the S-boxes. The proposed method consists of a mechanism involving data hiding (traditional) and quantum walks. This technique is shown to be secure for data which is embedded. It is also seen that the secret message of any type can be used in this technique.
During the extraction process, only the preliminary values for the S-boxes generation and steganographic image are found to be required. This method has a greater capacity of embedding and good clarity with greater security [18]. In this work, a new quantum steganography protocol was proposed using “Pixel Value Differencing” (PVD) which satisfactorily adheres to edge effects of image and characteristics of the human optical system. The whole process was divided into three parts namely quantization of two-pixel blocks based on the difference in their grayscale values, data embedding and extraction. Here, the cover image was embedded with the operator’s information and secret image based on pixel value differencing. Based on the pixel value difference level, information about the operator with different qubit numbers was embedded. The difference in pixel values was not a concern while embedding a secret image. Secret image and information about the operator were embedded by swapping the pixel difference values belonging to the two-pixel blocks of the cover image with similar ones where embedded data qubits are included. Secret information traceability is realized by extracting information about the operator. The extraction process is seen to be completely blind. There were two parameters taken into account while checking for the invisibility of the secret image. During histogram analysis, it is seen that the histograms of steganographic images are very similar to the original ones. Considering “Peak Signal-to-Noise Ratio” (PSNR), it is seen that the algorithm proposed obtains good clarity. It is also seen that the scheme allows for good embedding capacity and is found to be highly robust [19]. This paper discusses a new protocol which is based on quantum secure direct communication (QSDC). The protocol is used to build a concealed channel within the classical transmission channel to transmit hidden information. The protocol discussed in this paper uses QSDC as its basis. The technique adopts the entanglement transaction of its bellbasis states to embed concealed messages. This protocol contains six steps which are crucial for the preparation of large numbers, mode selection by a receiver, control mode, information transmission, covert message hiding mode and concealed data retrieving mode. The protocol uses IBF which is the extension and a more secured method over BF coupled with QSDC. It was seen that the protocol can reliably deal with the intercept-resend attack and auxiliary particle attack and also a man-inthe-middle attack. This protocol also shows great imperceptibility. Compared to the previous steganography protocols based on QSS and QKD, this protocol has four times more capacity in hidden channels, thereby increasing the overall capacity of the channel [20]. In this proposed work, a quantum mechanical algorithm was proposed to perform three main operations: Creating a configuration in which amplitude of a system present in anyone of the 2n states is identical. Performing Fourier transformation, rotating √ the selective states by the intended angle. This paper presents a method having O( n) complexity in time for identifying a record present in a database with no prior knowledge of the structure in which the database is organized.
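To make the O(√n) claim concrete, the following is a minimal state-vector simulation of Grover's search (ref. [20]). It is an illustration only, not the quantum-circuit implementation used in the surveyed protocols; the problem size and marked index are arbitrary.

```python
import numpy as np

def grover_search(n_qubits, marked):
    """Tiny state-vector simulation of Grover's search over N = 2**n_qubits items."""
    n = 2 ** n_qubits
    state = np.full(n, 1 / np.sqrt(n))           # uniform superposition
    iterations = int(round(np.pi / 4 * np.sqrt(n)))
    for _ in range(iterations):
        state[marked] *= -1                       # oracle: flip the marked amplitude
        state = 2 * state.mean() - state          # diffusion: inversion about the mean
    return np.argmax(state ** 2), iterations      # most probable outcome, #iterations

print(grover_search(n_qubits=6, marked=42))       # ~O(sqrt(N)) iterations, finds index 42
```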
2.6 Validation Parameters As in classical image steganography, the performance of a quantum steganography method needs to be evaluated. Evaluation can be qualitative or quantitative. Under quantitative analysis, the metrics used are (1) PSNR, (2) NCC and (3) SSIM, among others; under qualitative analysis, the visual quality of the method is checked using histograms, and the histograms of the cover and stego images should not reveal any trace of the embedded data. In addition, key sensitivity and keyspace analysis can be used to measure the performance of the algorithm; for better security, the algorithm should have a very large keyspace.
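The PSNR figures quoted throughout this survey follow the standard definition sketched below (assuming 8-bit images); this is a generic formula, not code from any of the cited papers.

```python
import numpy as np

def psnr(cover, stego, max_val=255.0):
    """Peak signal-to-noise ratio between a cover image and its stego version (in dB)."""
    mse = np.mean((cover.astype(np.float64) - stego.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10 * np.log10(max_val ** 2 / mse)
```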
3 Conclusion Quantum computation possesses amazing properties, namely superposition, entanglement and parallelism, because of which quantum image processing technologies offer performances and capabilities that are in a position of no rivalry by their classical equivalents. Guaranteed security, computing speed and minimal storage requirements are the improvements that could be achieved. There is a need for a faster and more efficient way of storing and processing data. Quantum image processing which inherently possesses parallelism and superposition properties could be used to satisfy this need. This pertains to operations in image processing such as expansion, reconstruction or image recognition, which is an even more difficult problem to handle. Based on the literature survey, have seen that “Phuc Q Le” et al. proposed FRQI in the year 2009. But, this representation is not efficient enough. Hence, Yi Zhang et al. proposed a better mapping model from classical to quantum form called novel enhanced quantum representation (NEQR) in the year 2014. This representation model comprises the quantum image in the form of qubit series to store gray intensity values rather than the random magnitude of a single qubit for representing grayscale values as in FRQI. Therefore, this model has very less computational time, better compression ratio and accurate retrieval of the classical image. “Yi Zhang” et al. later enhanced the above model when they proposed quantum representation for log-polar images (QUALPI). This model inherits the properties of NEQR while representing the quantum images in log-polar/ ρ, θ coordinate system. This offers more security against third party attackers. By comparing with other steganography algorithms, it is seen that by the implementation of expanding the quantum images to make an atlas and retrieval using the well-known Grover search is giving greater PSNR values, better security and large payload capacity than the existing methods. Therefore, by combining any of the methods cited in the literature, a better secure communication protocol can be developed in a quantum state.
References 1. Venegas-Andraca SE, Bose S (2003) Storing, processing, and retrieving an image using quantum mechanics. Proc SPIE 5101:1085–1090 2. Venegas-Andraca SE, Ball JL (2010) Processing images in entangled quantum systems. Quant Inf Process 9(1):1–11 3. Latorre JI (2005) Image compression and entanglement, pp 1–4. Available https://arxiv.org/ abs/quant-ph/0510031 4. Zhang Y, Lu K, Gao Y, Wang M (2013) NEQR: a novel enhanced quantum representation of digital images. Quant. Inf. Process. 12(8):28332860 5. Lıu X, Xıao D (Member, IEEE), Xıang Y (2019) Quantum ımage encryption using ıntra and ınter bit permutation based on logistic map. https://doi.org/10.1109/ACCESS.2018.2889896 6. Le PQ, Dong F, Hirota K (2011) A flexible representation of quantum images for polynomial preparation, image compression, and processing operations. Quant. Inf. Process. 10(1):63/84 7. Qu Z, Cheng Z, Wang X (2019) Matrix coding-based quantum ımage steganography algorithm. IEEE Access 1–1 (2019). https://doi.org/10.1109/access.2019.2894295 8. Heidari S, Pourarian MR, Gheibi R, Naseri M, Houshmand M (2017) Quantum red–green–blue image steganography. Int. J. Quant. Inf. 15(05):1750039. https://doi.org/10.1142/s02197499 17500393 9. Qu Z, Cheng Z, Liu W, Wang X (2018) A novel quantum image steganography algorithm based on exploiting modification direction. Multimedia Tools Appl. https://doi.org/10.1007/s11042018-6476-5 10. Zhou R-G, Hu W, Fan P, Luo G (2018) Quantum color image watermarking based on Arnold transformation and LSB steganography. Int. J. Quant. Inf. 16(03):1850021. https://doi.org/10. 1142/s0219749918500211 11. Le P, Iliyasu A, Dong F, Hirota K (2011) Strategies for designing geometric transformations on quantum images. Theor. Comput. Sci. 412:1406–1418. https://doi.org/10.1016/j.tcs.2010. 11.029 12. Yan F, Le P, Iliyasu A, Sun B, Garcia J, Dong F, Hirota K (2012) Assessing the similarity of quantum ımages based on probability measurements. In: 2012 IEEE world congress on computational ıntelligence 13. Zhang T, Abd-El-Atty B, Amin M, Abd El-Latif A (2017) QISLSQb: a quantum ımage steganography scheme based on least significant qubit. https://doi.org/10.12783/dtcse/mcsse2 016/10934 14. Jiang N, WangL (2015) A novel strategy for quantum ımage steganography based on moire pattern. Int J Theor Phys 54:1021–1032. https://doi.org/10.1007/s10773-014-2294-3 15. Qu Z, Li Z, Xu G, Wu S, Wang X (2019) Quantum image steganography protocol based on quantum image expansion and grover search algorithm. IEEE Access 7:50849–50857. https:// doi.org/10.1109/access.2019.2909906 16. Zhang Y, Lu K, Gao Y, Xu K (2013) A novel quantum representation for log-polar images. Quant Inf Process 12(9):31033126 17. EL-Latif AA, Abd-El-Atty B, Venegas-Andraca SE (2019) A novel image steganography technique based on quantum substitution boxes. Opt Laser Technol 116:92–102. https://doi.org/10. 1016/j.optlastec.2019.03.005 18. Luo J, Zhou R-G, Luo G, Li Y, Liu G (2019) Traceable quantum steganography scheme based on pixel value differencing. Sci Rep 9(1). https://doi.org/10.1038/s41598-019-51598-8 19. Qu Z-G, Chen X-B, Zhou X-J, Niu X-X, Yang Y-X (2010) Novel quantum steganography with large payload. Opt Commun 283(23):4782–4786. https://doi.org/10.1016/j.optcom.2010. 06.083 20. Grover L (1996) A fast quantum mechanical algorithm for database search. In: Proceedings of the 28th annual acm symposium on the theory of computing, pp 212–219 (1996)
Smart On-board Vehicle-to-Vehicle Interaction Using Visible Light Communication for Enhancing Safety Driving S. Satheesh Kumar, S. Karthik, J. S. Sujin, N. Lingaraj, and M. D. Saranya
Abstract Li-Fi technology has emerged as one of the sound standards of communication where light sources such as LED and photodiodes are used as a data source. This technology is predominantly used in various modes for felicitating any type of data communication. In the field of automobile, the role of Li-Fi technology marks highly essential for achieving vehicle-to-vehicle interaction in a smart environment. This smart communication finds its user end application even at a traffic light control system. Both the transmitter and receiver section take advantage of using LED as a light source due to its fast switching nature which makes the entire systems to be realized at low cost and greater efficiency. In this paper, an intelligent transport system is proposed using a Li-Fi technique which is generally a visible light communication for felicitating a secured vehicle-to-vehicle interaction in a dynamic situation. The receiver design is robust and dynamic which interprets the light waves into data with the help of solar panels and amplifiers which is being transmitted from the other vehicle. The overall data throughput is good and found an appropriate replacement for typical RF communication systems in automobiles. Keywords Li-Fi · Vehicle-to-vehicle communication · Visible light communication · Automobiles S. Satheesh Kumar (B) · M. D. Saranya Department of Electronics and Communication Engineering, KPR Institute of Engineering and Technology, Coimbatore, India e-mail: [email protected] M. D. Saranya e-mail: [email protected] S. Karthik · J. S. Sujin Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, India e-mail: [email protected] J. S. Sujin e-mail: [email protected] N. Lingaraj Department of Mechanical Engineering, Rajalakshmi Institute of Technology, Chennai, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_19
1 Introduction Nowadays, vehicle-to-vehicle (V2V) communication has gained great importance in the automobile industry for enhancing safe driving and providing reliable road transport. The communication is mostly wireless, with two vehicles communicating with each other over an ad hoc mesh network. Such safe-driving measures can prevent the large share of accidents that happen merely because of unknown facts about a neighbouring vehicle's status. By incorporating V2V communication in on-board vehicles, details such as speed, emergency warnings and location can be communicated to each vehicle, enabling the driver to take appropriate decisions for safe driving. Various automobile technologies such as ACC, embedded lane departure systems, blind-spot detection and smart parking have been used to improve the smartness of a vehicle rather than to ensure safety. Li-Fi-based smart transportation, originally a safe communication measure initially funded by the US Department of Transportation and NHTSA, allows road vehicles to get connected using roadside communicating elements such as traffic lights and RF towers. In future, driverless cars will be realizable not only through artificial intelligence but also through light-based V2V communication. Today, almost all automobile manufacturers are trying to incorporate the wide-ranging advantages of V2V communication in their future models.
2 Review of Related Works In 2004, Komine and Nakagawa [1] proposed a visible light communication technique for including an adaptable mechanism in vehicles for light dimming, and it is mainly due to fast modulation of optical sources such as LED and visual light communication standard (IEEE 802.15.7) for effective wireless communication for short range. Noof Al Abdulsalam et al. (2015) identified a novel approach for designing Li-Fi module using normal LEDs for vehicular automation. Turan et al. [2] proposed a novel modulation coding scheme to achieve minimum Bit Error Rate (BER) for low latency secured communication using VLC. In 2017, Cailean and Dimian [3] addressed challenges for implementing Li-Fi-based vehicle communication for analysed distance measures and visible light positioning. Poorna Pushkala et al. [4] projected a solution for radiofrequency congestion and reliable audio, and image data was communicated using Li-Fi communication without involving microcontrollers and other peripheral devices. In 2018, Jamali et al. [5] proposed a methodology to avoid road accidents due to vehicle collision-based Li-Fi technology. Satheesh Kumar et al. [6] reviewed various advancements in recent automobiles to enhance a vehicular communication for human-centred interactions and also deal with emotions pertained through driver’s action and gestures. Gerardo Hernandez-Oregon et al. (2019) analysed the performance of V2V and V2I communication by modelling the road infrastructure by Markov process which benefits accurate calculation of
data throughput. In 2020, Subha et al. [7] explained OFDM-based Li-Fi architecture for 5G and beyond wireless communication for enhancing the natto cells for wireless channel capacity. In 2017, Satheesh Kumar et al. [9] and in 2015, Christable Pravin et al. [8] explained about automation techniques which can be incorporated for continuous speech recognition systems. In 2019, Ganeshprabhu et al. [11] discussed about solar powered robotic vehicle and in 2019, Sujin et al. [10] explained about the impact of public e-health monitoring systems. As a matter of ensuring the safety of the individual, in 2018 Nagaraj et al. [12] proposed an alcohol impaired vehicle tracking system using wearable smart helmet. In 2015, Allin Christe et al. [12, 13] implemented a novel 2D wavelet transform approach for image retrival and segmentation which paved a way for effective motion capture while driving the vehicle. With the same intention, in 2020, Mazher Iqbal et al. [15] implemented a MWT algorithm for effective analysis of medical images. All the way, in 2014, Satheesh Kumar et al. [14] proposed rapid expulsion of acoustic soft noise using RAT algorithm which was found to be an effective algorithm for removing soft noises in the images too and to make it more convenient, region based scheduling is practiced with certain sensor networks and this is focussed by Karthik et al. [16] in the 2019.
3 Smart Vehicle-to-Vehicle Interaction for Safety Driving As discussed in the previous sections of this paper, smart vehicle-to-vehicle interaction is an effective solution for on-board automobile communication, reducing the frequency of road accidents caused by driver indecision when the vehicle is out of control. Vehicle-to-vehicle communication is a radical approach that produces phenomenal outcomes when it comes to automation. People still dwell on the scope of AI and data analytics but fail to consider visible light communication. The proposed system uses a powerful and simple visible light communication approach that avoids heavy protocol-based Li-Fi stacks, thereby saving implementation cost and energy.
3.1 Transmitter Section The basic functionality of the transmitter section is discussed here. The sensor module integrated with the transmitter section acquires data from the vehicle being sensed. Due to the dynamic nature of vehicle movement, the output of the sensing element is generally a fluctuating (AC) voltage. This is converted into a DC voltage level by the sensor module so that it can be read by the microcontroller unit. The microcontroller is a processing unit that compares the current data with the previous data and provides the output to the LED driver (Fig. 1).
Fig. 1 Transmitter section using Li-Fi technology
Once the output reaches the LED driver circuit, the data is ready for transmission in wireless mode. The photodiode detects the light that has been transmitted and converts it into a current. The LCD displays the output appropriately. This approach should at least reduce road accidents to a smaller extent. The push buttons are used to establish contact between the different modules. The motor is interfaced with the brake shoe and the other primary controlling units of the automobile. The use of LEDs yields the simplest possible transmitter module for facilitating the speed control of a vehicle for smart transport.
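The framing of the transmitted data can be as simple as on-off keying, where each '1' bit turns the LED on and each '0' turns it off for one symbol period. The sketch below is a host-side illustration of such framing under the assumption that the microcontroller toggles the LED per bit at a rate well above the flicker threshold; the preamble and message content are not from the paper.

```python
def ook_frame(payload: bytes, preamble="10101010"):
    """Encode bytes as an on-off keying bit frame for an LED: 1 = LED on, 0 = LED off."""
    bits = preamble + "".join(f"{byte:08b}" for byte in payload)
    return [int(b) for b in bits]

def ook_decode(bits, preamble_len=8):
    data = bits[preamble_len:]
    return bytes(int("".join(map(str, data[i:i + 8])), 2) for i in range(0, len(data), 8))

frame = ook_frame(b"BRAKE")          # e.g., a brake/speed status message
assert ook_decode(frame) == b"BRAKE"
```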
3.2 Receiver Section The receiver section has been implemented with the same set-up where the LED blinking can be detected at the frequency above 1 kHz. The ultrasonic sensor detects the distance between the two different vehicles. If the distance goes below the threshold range, which is generally a safe distance level, an appropriate alert is being transmitted by the Arduino module. The collected data will be processed by a PIC microcontroller unit for reliable control actions. The ultrasonic sensor unit is completely tuned for safe distance and violating that distance results in a chance of getting hit with other vehicles. So, Arduino pro-mini module takes care of other peripheral sensors for further processing of sensed information (Fig. 2). All these separate processes help in achieving sudden response during an unfair situation. The auxiliary systems help the receiver unit to take appropriate actions and reduce the computation and processing burden of PIC microcontroller. The Bluetooth systems help in conveying the state of information to the driver and help him to handle the situation manually to some extent. But if the action fails to happen within a span of time then, automatic actions will be triggered by the microcontroller which helps to avoid accident situation.
Fig. 2 Receiver section using Li-Fi technology
4 Experimental Results 4.1 Major Task The entire system is connected in a coherent fashion to take care of whatever major task the V2V system adopts while performing control actions. The user decides and can choose the option most suitable for the situation to access the vehicle.
4.2 Bluetooth Connectivity The entire set-up is well connected, and served Bluetooth allows the information to be transferred to a short-range preferably inside the vehicle to get connected with all devices. LCD module helps in indicating the information which is really needed to display.
4.3 Switching Control Manual control is essentially required to control the vehicle actions based on the level of comforts. Special buttons can be provided to perform switching actions.
4.4 Visible Light Communication for Li-Fi Through VLC and the information, video and sound are then moved through light devotion. Clear light interchanges (VLC) operate by adjusting the current over and over to the LEDs at an incredible rate, to rate to see the naked eye in any way, so that there is no gleaming. While Li-Fi LEDs should be held on to submit information, they could be darkened to below people’s vision while still radiating enough light for information to be transmitted. In addition, the invention is important, relies on the obvious spectrum, because it is limited to the improvement and is not modified in line with portable correspondence. Advances that allow different Li-Fi cells to wander through otherwise called hinders that make it possible for the Li-Fi to change consistently. The light waves can not be separated into a lot shorter range, but are increasingly free from hacking than Wi-Fi. For Li-Fi, direct views do not matter; 70 Mbit/s can be obtained from light that reflects the dividers (Fig. 3). Li-Fi is an invention of ORs, using light emanating from light diodes (LEDs) as an organized, flexible and quick link to Wi-Fi via these boards. Illumination from light-emanating diodes (LEDs). The Li-Fi ad was expected to rise at an annual rate of 82% from 2013 to 2018 and to amount to more than 6 billion dollars per year by 2018. However, the market has not created a speciality showcase in this capacity, and Li-Fi remained primarily for creative assessment (Fig. 4). These sorts of V2V frameworks are required in light of the fact that human can commit errors while driving which can make mishaps and they are valuable all together use the excursion viably and in a made sure about way (Figs. 5, 6 and 7).
Fig. 3 Process flow of Li-Fi-based VLC
Fig. 4 LED intensity versus transmission range
Fig. 5 Safety units implemented in vehicle interaction
For instance fluorescent and flashing lights, powered lights are up to 80% more efficient than traditional lighting. 95% of life in LEDs is converted into light, and just 5% is lost as energy. This is in contrast to glaring lights which make 95% of vitality warm and 5% cold. LEDs are amazingly vitality proficient and expend up to 90% less force than brilliant bulbs. Since LEDs utilize just a small amount of the vitality of a glowing light, there is a sensational decline in power costs. Shading rendering index is an estimation of a light’s capacity to uncover the real shade of items when contrasted with a perfect light source (characteristic light). High CRI is commonly an alluring trademark (despite the fact that obviously, it relies upon the necessary application). LEDs by and large have high (great) appraisals with regards to CRI. Maybe the most ideal approach to acknowledge CRI is to take a gander at an immediate correlation between LED lighting (with a high CRI) and a customary lighting arrangement like sodium fume lights (which by and large have helpless CRI evaluations and are at times practically monochromatic). See the accompanying picture to thoroughly analyse the two occasions (Fig. 8).
Fig. 6 Signal propagated with various frequency using LED source
5 Conclusion The task targets planning a model for move of data from one vehicle in the front to the back. This can be controlled remotely by means of an application that gives the highlights of switch mode. An application is run on Android gadget. The framework can be utilized in a wide scope of zones. The framework incorporated with various highlights can be applied in the accompanying fields. • The vehicles will be securely rerouted to elective streets and courses, which will eliminate traffic clog essentially • Because of frameworks, for example, the early accident alert, which will let vehicles impart speed, heading and area with one another, there will be less occasions of accidents. • Because of lower blockage and less time spent in rush hour gridlock, the contamination brought about by the vehicle will be lower also. • The innovation is incorporated with the vehicle during its unique creation and frequently gives both sound and visual alerts about possible issues with the vehicle or the environmental factors.
Fig. 7 Signal received for various frequency using an LED source
Fig. 8 Hardware set-up
• The innovation is included after the first get together; reseller’s exchange gadgets are ordinarily not as completely coordinated as those applied during creation; V2V secondary selling gadgets can be introduced by vendors or approved vendors. Other reseller’s exchange gadgets could be independent and versatile gadgets that can be conveyed by the traveller or driver. • These gadgets are founded on street foundation things, for example, street signs and traffic signals. The vehicles would have the option to get data from foundation gadgets, which will help forestall mishaps and give natural advantages; this correspondence procedure is called V2V, for short. This sort of correspondence could give an admonition when a vehicle disregards a red light or a stop sign, has unreasonable speed, enters a diminished speed zone, enters a spot with unexpected climate changes and comparable. As of now, the application is made for Android smartphone; different OS stage does not bolster our application. Taking a gander at the current circumstance, crossstage framework that can be created on different stages like iOS, Windows can be manufactured.
References 1. Komine T, Nakagawa M (2004) Fundamental analysis for visible-light communication system using LED lights. IEEE Trans Consum Electron 50:100–107 2. Turan B, Narmanlioglu O, Ergen SC, Uysal M (2016) Physical layer ımplementation of standard compliant vehicular VLC. In: IEEE vehicular technology conference. https://doi.org/10.1109/ VTCFall.2016.7881165 3. Cailean A-M, Dimian M (2017) Curent challenges for visible light communication usage in vehicle applications: a survey. IEEE Commun Surveys Tutor. https://doi.org/10.1109/COMST. 2017.2706940 4. Poorna Pushkala S, Renuka M, Muthuraman V, Venkata Abhijith M, Satheesh Kumar S (2017) Li Fi based high data rate visible light communication for data and audio transmission. Int J Electron Commun 10, 83–97 (2017) 5. Jamali AA, Rathi MK, Memon AH, Das B, Ghanshamdas, Shabeena (2018) Collision avoidance between vehicles through Li-Fi based communication system. Int J Comput Sci Netw Secur 18, 72–81 (2018) 6. Satheesh Kumar S, Mazher Iqbal JL, Sujin JS, Sowmya R, Selvakumar D (2019) Recent advancements in automation to enhance vehicle technology for human centered interactions. J Comput Theor Nanosci 16 7. Subha TD, Subash TD, Elezabeth Rani N, Janani P (2020) Li-Fi: a revolution in wireless networking. Elsevier Mater Today Proc 24:2403–2413 8. Christabel Pravin S, Satheesh Kumar S (2017) Connected speech recognition for authentication. Int J Latest Trends Eng Technol 8:303–310 9. Satheesh Kumar S, Vanathi PT (2015) Continuous speech recognition systems using reservoir based acoustic neural model. Int J Appl Eng Res 10:22400–22406 10. Sujin JS, Gandhiraj N, Selvakumar D, Satheesh Kumar S (2019) Public e-health network system using arduino controller. J Comput Theor Nanosci 16:1–6 11. Ganesh Prabhu S, Karthik S, Satheesh Kumar S, Thirrunavukkarasu RR, Logeshkumar S (2019) Solar powered robotic vehicle for optimal battery charging using PIC microcontroller. Int Res J Multidisc Technovat 4:21–27
12. Nagaraj J, Poongodi P, Ramane R, Rixon Raj R, Satheesh Kumar S (2018) Alcohol impaired vehicle tracking system using wearable smart helmet with emergency alert. Int J Pure Appl Math 118:1314–3395 13. Allin Christe S, Balaji M, Satheesh Kumar S (2015) FPGA ımplementation of 2-D wavelet transform of ımage using Xilinx system generator. Int J Appl Eng Res 10:22436–22466 14. Satheesh Kumar S, Prithiv JG, Vanathi PT (2014) Rapid expulsion of acoustic soft noise for noise free headphones using RAT. Int J Eng Res Technol 3:2278–0181 15. Mazher Iqbal JL, Narayan G, Satheesh Kumar S (2020) Implementation of MWT algorithm for image compression of medical images on FPGA using block memory. Test Eng Manag 83:12678–12685 16. Karthik V, Karthik S, Satheesh Kumar S, Selvakumar D, Visvesvaran C, Mohammed Arif A (2019) Region based scheduling algorithm for Pedestrian monitoring at large area buildings during evacuation. In: International conference on communication and signal processing (ICCSP). https://doi.org/10.1109/ICCSP.2019.8697968
A Novel Machine Learning Based Analytical Technique for Detection and Diagnosis of Cancer from Medical Data Vasundhara and Suraiya Parveen
Abstract Cancer is one of the most dreadful diseases affecting the human race. It occurs in many forms and claims thousands of lives every year; the most common cancers are breast cancer in females and lung cancer in males. Globally, breast cancer has grown into a widespread disease that particularly affects middle-aged women (30–40 years). According to the National Cancer Registry Programme of the Indian Council of Medical Research (ICMR), cancer claims millions of lives, with more than 13,000 individuals losing their lives every day, and breast cancer accounts for a large share of female deaths, with about 60% of the affected women dying of the disease. The main aim of this paper is to develop more precise and more accurate techniques for the diagnosis and detection of cancer. Machine learning algorithms are considered for the improvement and advancement of the medical field; classifiers such as support vector machine, naïve Bayes, KNN and decision tree are used for classification. Keywords Breast cancer · Machine learning · Dreadful disease · Women
1 Introduction
Machine learning is an application of artificial intelligence that has the potential to enhance a system and improve on earlier experience without adding complexity [1]. It mainly focuses on improving with the dataset, making the model more capable than its earlier version. Machine learning can be defined as the process of creating Vasundhara · S. Parveen (B) Department of Computer Science, School of Engineering Science and Technology, Jamia Hamdard, New Delhi, India e-mail: [email protected] Vasundhara e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_20
Fig. 1 Comparison of mammogram images of normal and abnormal breast
models that can perform a certain task without a human explicitly programming them to do it [2]. For example, if a person listens to a song on YouTube and hits the like button, the platform can later recommend similar music that the person is likely to enjoy; this fast and smooth experience is provided by machine learning (Fig. 1). Machine learning has changed the way breast cancer is treated and diagnosed through the use of its various algorithms [3]. It has proven to be an essential tool for early detection and for classifying breast cancer based on the severity of risk factors. A number of machine learning algorithms, such as neural networks, Bayesian networks, decision trees and support vector machines, are used on a large scale and have proven to be an asset, as they allow doctors to detect cancer and classify it according to the severity of its different stages.
2 Related Work
Breast cancer is one of the most dangerous cancers; it has taken lakhs of lives and has affected the female population the most. In [4], different types of algorithms, such as support vector machine, ART, naïve Bayes and K-nearest neighbor, are compared. The algorithms are applied to the Wisconsin breast cancer database, where the KNN technique gives the most accurate results, while the other algorithms also perform well. SVM is a strong technique and, when used with a Gaussian kernel, is well suited to predicting the occurrence or non-occurrence of breast cancer; the SVM used is applicable only when the class variable is binary [4]. In [5], the authors propose an adaptive ensemble voting method for diagnosing breast cancer. The aim is to carry out a comparative study and to show how ANN and the logistic algorithm, when combined with ensemble machine learning algorithms, produce more accurate results. The dataset used is again the Wisconsin breast cancer database. The accuracy obtained
with ANN is 98.05%, higher than that of the other algorithms, and the study concludes that diagnosing breast cancer at an early stage is more beneficial and can save many lives. The work in [6] attempts a comparison between different machine learning algorithms, such as support vector machine, random forest and naïve Bayes, for more accurate detection of breast cancer. These algorithms are applied to the same database, and the results give a clear idea of which machine learning algorithm to use for more accurate outcomes. Machine learning techniques are being used on a large scale as a useful diagnostic tool alongside other medical equipment. Based on the reported results, each algorithm is best in its own way, but the one that outshines the others is the support vector machine, with higher accuracy and precision, while random forest classifies tumors better than the support vector machine. In [7], it is noted that many women are fighting this dreadful disease and losing their lives, and that a lack of awareness about diagnosis is making the problem worse. The practical analysis shows that the support vector machine, with 97.8% accuracy, is the most appropriate algorithm; the paper describes various algorithms for predicting breast cancer and argues that applying this technology in the medical field has become an essential asset, since it adds efficiency to medical knowledge. In [8], breast cancer is described as the most common cancer occurring in women, and machine learning techniques are used for classification and hence for early diagnosis and detection of cancer; the authors compare how effectively artificial neural networks and SVM can be used as classification algorithms and evaluate their accuracy. The paper [9] emphasizes the implementation of IoT technology in medical health care for enhancing the quality of care and minimizing cost through automation and optimization of resources. The use of IoT technology in medical imaging ensures correct and more accurate information about a particular symptom related to a particular disease. Digitization and the use of modern technology in the field of medicine have paid off.
3 Methodology
In this research, various machine learning techniques have been surveyed and analyzed for diagnosing medical data, and the following techniques were found to be particularly beneficial.
3.1 Support Vector Machine
The support vector machine is an efficient and accurate machine learning algorithm [1]. Its main role is to minimize the upper bound on the generalization error by maximizing the margin between the separating hyperplane and the data points; it can perform both linear and nonlinear classification [10] and helps in the early detection of breast cancer. Since its invention, SVM has played an essential role and has proven to work efficiently, with a reported efficiency of 98.5% on medical data; due to its high efficiency it is capable of classifying up to 90% of amino acids of different compounds and has the potential to detect various cancer stages at an initial stage [11]. When such machine learning algorithms are applied in the medical field, they consistently deliver high accuracy and help detect cancer stages as early as possible, saving thousands of lives across the globe.
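As a rough illustration of this idea, the sketch below trains an SVM with a Gaussian (RBF) kernel on the Wisconsin breast cancer data using scikit-learn; the kernel choice, the scaling step and the train/test split are assumptions made for the example, not the exact settings of the studies surveyed here.

```python
# Minimal SVM sketch on the Wisconsin breast cancer data (illustrative only).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)          # 569 samples, 30 features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Scaling matters for margin-based classifiers; the RBF ("Gaussian") kernel
# mirrors the kernel mentioned in the related work.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

Feature scaling is included because the margin depends on the ranges of the input attributes; without it the reported accuracy would typically drop.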
3.2 Naïve Bayes
The naïve Bayes classifier is another machine learning algorithm and is one of the simplest and most effective probabilistic classifiers; it works on the principle of Bayes' theorem with strong independence assumptions [12]. Despite its simplicity, naïve Bayes has proven to be a precise method in medical data mining, and its requirements are modest: only a small amount of data is needed for the detection and diagnosis of breast cancer [2]. An important property is that naïve Bayes bases its decision on the available data; to make the most of its capabilities, it takes all the necessary and important evidence into account and presents its results transparently [13].
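A minimal sketch of such a probabilistic classifier, using scikit-learn's Gaussian naïve Bayes on the same data, is given below; the Gaussian likelihood and the particular split are illustrative choices for the example.

```python
# Gaussian naive Bayes sketch on the breast cancer data (illustrative only).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

nb = GaussianNB().fit(X_train, y_train)
# The classifier returns class probabilities (ordered as in nb.classes_),
# which reflects the "decision based on available data" behaviour described above.
print("Class probabilities for the first test sample:", nb.predict_proba(X_test[:1]))
print("Test accuracy:", nb.score(X_test, y_test))
```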
3.3 PCA
PCA refers to principal component analysis. It is a technique used when data must be projected from a high-dimensional space into a lower-dimensional space. It mainly focuses on retaining the directions of the data with the most variation and discarding non-essential components that carry little variance [14]. It can be applied in the following areas:
Data visualization: When the data contain many non-essential inputs in a high-dimensional space, PCA proves to be an asset, as it converts the high-dimensional data into a low-dimensional representation [3].
Speeding up machine learning algorithms: PCA helps to reduce the training and evaluation time of a machine learning algorithm.
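The sketch below shows one way such a PCA step could be combined with a classifier on the 30-attribute breast cancer data; keeping five components and following PCA with a decision tree are assumptions made for illustration only.

```python
# PCA sketch: project the 30 original features onto a few principal components
# before classification (illustrative; the component count is an assumption).
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

pipeline = make_pipeline(
    StandardScaler(),                    # PCA is scale-sensitive
    PCA(n_components=5),                 # keep the directions with the most variance
    DecisionTreeClassifier(random_state=0))

scores = cross_val_score(pipeline, X, y, cv=5)
print("PCA + decision tree, 5-fold CV accuracy: %.3f" % scores.mean())
```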
4 Case Study for the Design and Implementation of the Proposed Methodology
Cancer is one of the most dreaded diseases, claiming millions of lives with each passing year. Cancerous cells are cells that have lost the property of contact inhibition, i.e., they continue to divide mitotically even when in contact with neighboring cells, giving rise to tumor cells. A cancer-causing gene is known as an oncogene, and its cells are called oncocells. Cancer cells often form lumps of unregulated growth that become dangerous for the whole body. Tumors are mainly of two types: (i) benign tumors and (ii) malignant tumors (Fig. 2). Among the male population, lung cancer is the most common, whereas in the female population breast cancer is the most dreaded, taking millions of women's lives all over the globe. Breast cancer is a cancer of the breast in which the alveolar lobes, 15–20 in number, become cancerous; in extreme cases the entire affected breast has to be removed so that the cancer does not spread further. It comprises four stages. Diagnosis is done by biopsy, in which a piece of tissue is cut out and cultured in a culture medium to detect cancerous cells. Alpha interferon is available and, along with chemotherapy and radiation, is used in treatment. Nowadays, many drugs are being developed using the principles of genetic engineering to cure this disease, but a finalized substitute for painful chemotherapy has not yet been discovered.
Fig. 2 Malignant breast cyst
To deal with this, machine learning algorithms are being used to cope with the uncertainties that arise during the treatment of cancer. Chemotherapy is a treatment procedure in which a certain drug is injected to destroy the cancerous cells.
4.1 Stages of Cancer
With the use of machine learning algorithms, it is possible to distinguish between the risk factors related to the various stages of breast cancer [15]. These techniques make it much easier to describe the four stages of breast cancer according to tumor size and severity. The four stages are shown in Fig. 3.
4.1.1 Stage Zero
This stage is identified from the tumor size. The survival rate is 100% in this situation; it is the earliest indication of cancer developing in the body and does not require much intervention to be cured. It does not involve any invasive carcinoma.
4.1.2 Stage One
This is also known as the invasive stage. The tumor is still small, but the cancerous cells have broken into the fatty (adipose) tissue of the breast. It comprises two sub-stages:
Fig. 3 Survival rate in different stages of breast cancer
Stage 1A The tumor size is 2 cm or smaller.
Stage 1B The tumor is 2 cm or smaller; the cancer is not visible in the breast lobules but is found in a lymph node, with a size of about 2 mm.
4.1.3 Stage Two
In this stage, the tumor grows and starts spreading to associated tissue. The tumor is still fairly small but keeps growing, reaching roughly the size and shape of a walnut. It comprises two sub-stages:
4.1.4 Stage 2A
The tumor is 2 cm or smaller, but cancer is found in 1–3 lymph nodes under the arm.
4.1.5 Stage 2B
The tumor is larger than 2 cm but smaller than 5 cm, and the cancer cells have reached the internal breast tissue (the mammary gland) and the axillary lymph nodes.
5 Stage Three
In this stage, the tumor is quite prominent; it has not spread to distant organs or bone but has started to spread to around 9–10 lymph nodes. This stage is very hard for a patient undergoing treatment to fight. It comprises two sub-stages:
5.1 Stage 3A
The tumor is larger than 5 cm, and the cancer has reached 4–9 axillary lymph nodes.
Fig. 4 Different stages of breast cancer
5.2 Stage 3B The tumor size is larger than 5 cm and is approximately 9 cm.
5.3 Stage Four
The tumor is 10–20 µm. This is the last stage, and the survival rate is very low because the tumor cells have spread to organs other than the breast; hence, this stage is known as the metastatic stage (Fig. 4).
6 Proposed Methodology
In this review, the dataset used is compiled from different sources of data so as to retrieve information such as the radius of the tumor, the stage of the cancer and the diagnosis techniques, such as mammogram and PET-SCAN, used to detect cancer at the earliest possible stage, together with the outcomes obtained from these techniques.
Mammogram: This technique is the most crucial step in breast cancer detection, since it reveals the early signs and symptoms of breast cancer; it is a type of X-ray of the breast. In this technique, the breast is placed on a plate that is scanned to reveal cancer symptoms or lumps formed in the breast [16] (Fig. 5).
Fig. 5 Mammogram imprint
PET-SCAN: This stands for positron emission tomography. It is an advancement in the medical field that gives clarity about cancerous cells that are growing at a faster rate, indicates the correct radius of the tumor and shows which part of the body is adversely affected by the metastatic spread of the cancerous cells [17] (Fig. 6).
This section deals with the detailed study of classification and extraction of data from the dataset using the various machine learning algorithms discussed above. The steps involved in the classification and extraction of data are shown in Fig. 7:
Data collection: This is the process of fetching data from different sources, accumulating it in the form of a .csv file and preconditioning it as input to the review model.
Fig. 6 PET-SCAN report
Fig. 7 Proposed model (data extraction/mining → data processing → data classification → performance evaluation → results)
Data processing: This is the step in which the data are cleaned; it mainly involves handling missing values, reducing noise and picking the relevant data.
Data classification: In this step, the data are classified using different machine learning algorithms: support vector machine, naïve Bayes and decision tree.
Performance evaluation: In this step, the processed data are evaluated; the machine learning algorithms are applied and the most accurate and efficient one is selected for further medical diagnosis [18]. Evaluation covers four measures, i.e., accuracy, precision, recall and F1-measure:
Accuracy = (TP + TN)/(TP + FP + TN + FN)
Precision = TP/(TP + FP)
Recall = TP/(TP + FN)
F1-measure = 2 × Precision × Recall/(Precision + Recall)
Here TP denotes true positives, TN true negatives, FP false positives and FN false negatives [19].
Results: This step provides the output that can be applied to the analysis of the whole dataset.
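A small sketch of these four measures, computed directly from confusion-matrix counts, is given below; the TP/FP/TN/FN values in the example call are placeholders rather than results from this study.

```python
# Computing the four evaluation measures from confusion-matrix counts.
# The counts passed in the example are placeholders, not results from the paper.
def evaluate(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

print(evaluate(tp=50, fp=2, tn=60, fn=3))
```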
Table 1 Experimental results

Machine learning algorithm    Accuracy   Precision   Recall
Support vector machine        97.78      98.13       96.29
Naïve Bayes                   95.98      94.23       93.21
Decision tree                 96.45      94.69       95.37
PCA with decision tree        98.03      97.74       97.89
Dataset: The dataset considered in this research is a breast cancer dataset extracted from a machine learning repository. It contains 569 instances, categorized as benign or malignant, and 30 attributes are used.
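For reference, the sketch below shows how a dataset of this shape (569 instances, 30 attributes) can be loaded and the surveyed classifiers compared with cross-validation; the scikit-learn copy of the Wisconsin data, the five PCA components and the 5-fold split are assumptions, so the numbers it prints need not match Table 1.

```python
# Sketch of a comparison over the 569-instance, 30-attribute dataset.
# Accuracies depend on splits and parameters and need not match Table 1.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)   # 569 x 30, benign/malignant labels

models = {
    "Support vector machine": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "PCA with decision tree": make_pipeline(
        StandardScaler(), PCA(n_components=5),
        DecisionTreeClassifier(random_state=0)),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```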
7 Results
This research applies various machine learning algorithms to the data in order to obtain the required output, i.e., a more accurate and precise diagnosis and detection of breast cancer at an early stage (Table 1).
8 Conclusion
This paper attempts to identify the most appropriate methodology for the diagnosis and detection of breast cancer with the support of machine learning algorithms such as support vector machine, naïve Bayes, decision tree and PCA. The main focus is the prediction of the early stages of cancer using the most efficient and precise algorithms. It is concluded that the PCA-based approach outperforms the other machine learning algorithms, with an accuracy of 98.03%, recall of 97.89% and precision of 97.74%. Areas of improvement have also been identified, and the PCA-based methodology could still gain a further 1–2% with refinements.
References 1. Ray S (2019) A quick review of machine learning algorithms. In: 2019 International conference on machine learning, big data, cloud and parallel computing (COMITCon), Faridabad, India, pp 35–39 2. Mello F, Rodrıgo P, Antonelli M (2018) Machine learning: a practical approach on the statistical learning theory
3. Zvarevashe K, Olugbara OO (2018) A framework for sentiment analysis with opinion mining of hotel reviews. In: 2018 Conference on ınformation communications technology and society (ICTAS), Durban, pp 1–4 4. Bharat A, Pooja N, Reddy RA (2018) Using machine learning algorithm breast cancer risk prediction and diagnosis. In: Third international conference on circuits, controls, communication and computing (I4C), pp1–4 5. Khuriwal N, Mishra N (2018) Breast cancer diagnosis using ANN esemble learning algorithm. In: 2018 IEEEMA, engineer infinite conference (eTechNxT), pp 1–5 6. Bazazeh D, Shubair R (2016) Comparative studyof machine learning algorithm for breast cancer and detection. In: 2016, fifth international conference on electronics devices ,systems and application (ICEDSA), pp1–4 7. Khourdifi Y, Bahaj M (2018) Applying best machine learning algorithms for breast cam=ncer prediction and classification. In: 2018, International conference on electronics,control,optimaization and computer science (ICECOCS), pp1–5 8. Bayrak EA, Kırcı P, Ensari T (2019) Comparison of machine learning methods for breast cancer diagnosis. In: 2019 Scientific meeting on electrical-electronics & biomedical engineering and computer science (EBBT), Istanbul, Turkey, pp 1-3 9. Chandy A (2019) A review on iot based medical imaging technology for healthcare applications. J Innov Image Process (JIIP) 1(01):51–60 10. Potdar K, Kinnerkar R (2016) A comparative study of machine algorithms applied to predictive breast cancer data. Int. J. Sci. Res. 5(9):1550–1553 11. Huang C-J, Liu M-C, Chu S-S, Cheng C-L (2004) Application of machine learning techniques to web-based intelligent learning diagnosis system. In: Fourth ınternational conference on hybrid ıntelligent systems (HIS’04), Kitakyushu, Japan, pp 242–247. https://doi.org/10.1109/ ICHIS.2004.25 12. Ray S (2019) A quick review of machine learning algorithms. In: 2019 International conference on machine learning, big data, cloud and parallel computing (COMITCon), Faridabad, India, pp 35–39. https://doi.org/10.1109/COMITCon.2019.8862451 13. Seref B, Bostanci E (2018) Sentiment analysis using naive bayes and complement naive bayes classifier algorithms on hadoop framework. In: 2018 2nd ınternational symposium on multidisciplinary studies and ınnovative technologies (ISMSIT), Ankara, pp 1–7 14. Li N, Zhao L, Chen A-X, Meng Q-W, Zhang G-F (2009) A new heuristic of the decision tree induction. In: 2009 International conference on machine learning and cybernetics, Hebei, pp 1659–166 15. Kurniawan R, Yanti N, Ahmad Nazri MZ, Zulvandri (2014) Expert systems for self-diagnosing of eye diseases using Naïve Bayes. In: 2014 International conference of advanced ınformatics: concept, theory and application (ICAICTA), Bandung, pp 113–116 16. Pandian AP (2019) Identification and classification of cancer cells using capsulenetwork with pathological images. J Artif Intell 1(01):37–44 17. Vijayakumar T (2019) Neural network analysis for tumor investigation and cancerprediction. J Electron 1(02):89–98 18. Rathor S, Jadon RS (2018) domain classification of textual conversation using machine learning approach. In: 2018 9th ınternational conference on computing, communication and networking technologies (ICCCNT), Bangalore, pp 1–7 19. Douangnoulack P, Boonjing V (2018) Building minimal classification rules for breast cancer diagnosis. In: 2018 10th ınternational conference on knowledge and smart technology (KST), Chiang Mai, pp 278–281
Instrument Cluster Design for an Electric Vehicle Based on CAN Communication L. Manickavasagam, N. Krishanth, B. Atul Shrinath, G. Subash, S. R. Mohanrajan, and R. Ranjith
Abstract Electric vehicles are the need of the hour owing to prevailing global conditions such as global warming and rising pollution levels. For a driver, controlling an EV is the same as controlling a conventional IC engine automobile. Like a conventional vehicle, an EV has an instrument cluster that acts as an interface between the human and the machine, but the latter displays more critical parameters that are essential for controlling the EV. This paper deals with the development of an EV instrument cluster that displays vital parameters by communicating with the different ECUs of the vehicle over the industry-standard CAN bus. Speedometer and odometer details are shown on a touch-screen panel designed with a user-friendly interface; Python-based GUI tools are used to design the interface. Keywords Electric vehicle · Instrument cluster · Motor control · BLDC motor · CAN communication · UI design
1 Introduction Global warming has become a serious threat to the existence of human beings. One of the main reasons for global warming is carbon dioxide (CO2 ) emission through various man-made sources. One such man-made source is the internal combustion (IC) engine that powers a variety of automobiles worldwide. Electric vehicles (EVs) of different types are replacing IC engine vehicles. The different types of EVs are L. Manickavasagam · N. Krishanth (B) · B. Atul Shrinath · G. Subash · S. R. Mohanrajan · R. Ranjith Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore 641112, India e-mail: [email protected] S. R. Mohanrajan e-mail: [email protected] R. Ranjith e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_21
battery electric vehicles (BEV), hybrid electric vehicles (HEV), and plug-in hybrid electric vehicles (PHEV). BEV is the main focus area as it involves core electrical and electronic components. EVs also come in four-wheelers and two-wheelers. Twowheeler EV is the main area of focus. The main components of EVs are the drive motor, battery, and many control units like electronic controller unit (ECU), battery management system (BMS), instrument cluster which work in harmony to make EV a working engineering marvel. Instrument cluster is one of the most important parts of a vehicle as it acts as an interface between the human and the machine. Based on the information given by the instrument cluster, the driver can take the necessary decisions and actions. So, a proper interface between the user and the machine is required. On the motor control front, there are various methods to control the drive motor efficiently. Motor control is the heart of the EV. Motor control involves choosing the appropriate control strategy, implementing the same using microcontroller and integrating motor and inverter ensuring proper basic functionality of an EV. The instrument cluster is the main focus. In the early days, mechanical instrument clusters were used. These clusters used gear arrangements to infer speed, and they displayed the same using needle. The accuracy of these clusters was very less, and they were prone to damage. These faults gave way to electronic clusters which are very accurate and are prone to very less damage. Instrument cluster design involves the calculation of the data to be displayed in the instrument cluster, development of user interface, and programming of the microcontroller for the instrument cluster. There is a need to establish communication between two control units as data is calculated in one unit and displayed in other. There are many communication protocols. CAN bus protocol, which is the industry standard communication, is used to establish communication. Parameters such as distance covered and speed of the vehicle are displayed. The objective of the project can be divided into three areas—motor control, instrument cluster design, and communication between the ECUs.
2 Instrument Clusters for BLDC Motor
The BLDC motor is preferred over the DC motor because it employs electronic commutation, which avoids the wear and tear caused by the mechanical commutation that a DC motor employs. Various control methods exist for reducing the torque ripple of a BLDC motor, such as sinusoidal, trapezoidal and field-oriented control. The back electromotive force (back-emf) waveform of a permanent magnet brushless hub motor is nearly sinusoidal, which is suitable for sinusoidal current control. To minimize the torque ripple, field-oriented control based on Hall-effect sensors is applied. D. Li and D. Gu propose field-oriented control of a BLDC motor for four-wheel-drive vehicles, where the rotor position is estimated through interpolation using the signals of the Hall-effect sensors [1]. S. Wang describes torque ripple reduction using modified sinusoidal PWM. To reduce the torque
ripple, a feed-forward controller is developed to eliminate sudden large torque changes of the motor [2]. The paper by G. Vora and P. Gundewar presents the design advantages of a digital instrument cluster and explains in detail the communication protocols involved in automotive applications. J. Pešić, K. Omerović, I. Nikolić and M. Z. Bjelica explain open-source techniques for developing cluster applications [3–5]. The paper by H. Chen and J. Tian introduces the CAN protocol and how it supports fault identification [6]. An EV may use one or more electric motors for propulsion; its other main components are the power converter for the electric motor, the high-voltage battery pack, the charger and the battery management system (BMS). The paper by L. A. Perişoară, D. L. Săcăleanu and A. Vasile deals with the use of two interfaces for monitoring an electric vehicle. The first interface is based on an Arduino Uno with a 20 × 4 LCD, and the second is a virtual instrument cluster designed in LabVIEW; both interfaces communicate over CAN buses. Instrument clusters can be designed according to the requirements of the user: the hardware cluster is a low-cost solution, while the virtual cluster needs a more expensive hardware interface for the communication bus [7].
3 Proposed Instrumentation System
The overall system is designed to perform the following tasks, as discussed in the objective:
• BLDC motor control
• Display of vehicle parameters on a display
• CAN bus communication between ECUs
3.1 BLDC Motor Control
The battery, inverter and motor (along with the sensors) form the part of the system that is already present in the EV (Fig. 1). The first task is to make the motor run, which is achieved using the motor control unit (MCU). The inputs to the MCU are the Hall-effect sensors of the BLDC motor, and its outputs are the PWM signals for the three-phase inverter; the calculations of speed and distance travelled are also performed here. All these ECUs should communicate in real time to provide a real-time dashboard. There are two ECUs in the design that need to communicate with each other; CAN bus is the standard communication protocol used in vehicles, so CAN bus is chosen. The computed values are communicated to the instrument cluster controller, a Raspberry Pi; a digital display is interfaced with the Pi and acts as the dashboard. The required components and their specifications are listed in Table 1.
Fig. 1 Block diagram of electric vehicle
Fig. 2 Closed-loop control by PWM method
Fig. 3 Basic UI
Fig. 4 Graphical UI
For speed control of BLDC motor, there are three methods namely trapezoidal control, sinusoidal control, and field-oriented control. In trapezoidal control, the stator poles are excited based on a commutation logic which will be described in detail in the later part of this section. The pulses are chopped into smaller pulses, and based on the speed, the pulses width of the smaller pulses is varied. This method is comparatively easier to implement when compared with other methods. This method
Fig. 5 CAN test hardware setup
Fig. 6 CAN frame seen in an oscilloscope
requires the position of the rotor to be known to excite the stator. This method gives ripples in torque. This method is called trapezoidal control as the back-emf has the shape of a trapezium. In sinusoidal control, back-emf is made to resemble the sinusoids. Thus, the motor coils are excited by three sinusoids, each phase shifted by 120°. In this method, torque ripples are reduced. This method requires the position of the rotor to be known at the accuracy of 1°. Therefore, this method is complex in implementation as many estimations has to be done.
Fig. 7 Implementation of motor control
Fig. 8 Communication implementation
The third method is field-oriented control. Its main principle is that the torque of a motor is at a maximum when the stator and rotor magnetic fields are orthogonal to each other, so the method tries to keep the two fields perpendicular. This is done by controlling the direct-axis and quadrature-axis currents of the motor, which are obtained from the line currents using the Park and Clarke transformations. This method gives the best torque results but is very complex to implement. In this work, trapezoidal control is implemented. The electric vehicle is operated using a three-phase inverter whose output is connected to the three input terminals of the motor. In Fig. 2, each motor phase is represented as a series connection of resistance, inductance and back-emf. The gating pulses for the inverter are given by the microcontroller by sensing the position of the rotor with the help of the Hall-effect sensors. The stator phases (A, B, C) are shifted from one another by 120°, and one Hall-effect sensor is embedded in the stator at each phase. A Hall-effect sensor can sense the presence of the rotor if the rotor is within 90° to the left or 90° to the right of the sensor's position, so each sensor conducts over 180°. Suppose the rotor is aligned with phase A: hall B and hall C cannot sense the rotor, since it is not within 90° of their positions, and only hall A senses it, so the Hall-sensor logic is 100. Each sensor is active for 180° in each cycle. As can be seen, the back-emf generated is trapezoidal in shape, and the phase voltage is varied accordingly by using the Hall-effect sensor states. There are six different states. Based on the Hall-effect sensor outputs, the
Fig. 9 System overview
required phase of the BLDC motor is excited to move the motor forward. Thus, the following excitation table is got (Table 2). Model-based development is an embedded software initiative where a model is used to verify control requirements and that the code runs on target electronic hardware. When software and hardware implementation requirements are included, you can automatically generate code for embedded deployment by saving time and avoiding the introduction of manually coded errors. There is no need to write code manually. The controller automatically regenerates code. Model-based development can result in average cost savings of 25–30% and time savings of 35–40% [7]. Instead of using a microcontroller to generate PWM signals for inverter switching operation, here, STM32F4 microcontroller is used (Fig. 2). In closed-loop speed control, the actual speed is fed back and is compared with the reference speed. This error can be reduced by tuning the PI controller accordingly. The output of the PI controller is given to the PWM generator which generates a pulse based on the duty ratio. When this pulse is given to AND with the gating pulses of the inverter, PWM pulses are obtained. By generating PWM signals, the
average voltage applied to the motor can be varied. As the voltage applied to the motor changes, the speed also changes: if the average voltage increases, the speed increases, and vice versa. To generate the PWM signals, the first step is to create a simulation in MATLAB. MATLAB by itself does not support the STM32F4 board, so software that interfaces the STM32F4 with the PC must be installed; Waijung is a blockset that interfaces the STM32F4 Discovery board with MATLAB. After installing the Waijung blocks in MATLAB, the simulation is run and built; the Waijung target setup then automatically generates the code and flashes it onto the board, from which the gating signals are taken. Hence, without writing code manually, MATLAB automatically generates the code using the Waijung blockset, and the STM32 Discovery board was programmed this way to implement the above speed control method. The motor always needed an initial push to start, and several methods were considered to avoid this. The second method addresses the fact that the inverter is triggered based on the rotor position: since the rotor position cannot be obtained to 1° precision, the inverter triggering always carried some uncertainty about whether the correct phases were excited. The idea was therefore to generate the Hall-effect sensor pulses manually, with the same lag and the correct period for the desired speed, so as to excite the motor just enough to make it move initially and then hand over triggering to the actual Hall-effect sensors in the motor. With this method the motor started running at a low speed, just enough for the Hall-effect sensors to give proper output, after which the Hall-effect sensors took care of normal running. The third method addresses the lag between the Hall-effect sensor inputs and the triggering pulses: the sampling time of the board, as set in the Waijung blockset, was reduced further, which drastically improved the response of the whole system. The second and third methods were implemented on the STM32 Discovery board using MATLAB and the Waijung blockset.
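Although the actual controller was generated from Simulink with the Waijung blockset, the commutation logic and the speed loop described above can be sketched in Python as follows; the switch mapping mirrors Table 2, while the PI gains, limits and sample time are assumed example values rather than the tuned ones used on the STM32 target.

```python
# Sketch of six-step (trapezoidal) commutation with a simple PI speed loop.
# The hall-state -> switch mapping follows Table 2; gains, limits and the
# sample time are illustrative values, not those used on the hardware.

# (Sw1 ... Sw6) for each hall state (Hall1, Hall2, Hall3), taken from Table 2
COMMUTATION = {
    (0, 0, 0): (0, 0, 0, 0, 0, 0),
    (0, 0, 1): (1, 0, 0, 1, 0, 0),
    (0, 1, 0): (0, 1, 0, 0, 1, 0),
    (0, 1, 1): (1, 1, 0, 0, 0, 0),
    (1, 0, 0): (0, 0, 1, 0, 0, 1),
    (1, 0, 1): (0, 0, 1, 1, 0, 0),
    (1, 1, 0): (0, 0, 0, 0, 1, 1),
    (1, 1, 1): (0, 0, 0, 0, 0, 0),
}

class PISpeedController:
    def __init__(self, kp=0.02, ki=0.1, dt=0.001):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def duty(self, ref_rpm, actual_rpm):
        """Return a PWM duty cycle in [0, 1] computed from the speed error."""
        error = ref_rpm - actual_rpm
        self.integral += error * self.dt
        return max(0.0, min(1.0, self.kp * error + self.ki * self.integral))

def gate_signals(hall_state, duty):
    """AND the commutation pattern with the PWM duty, as described in the text."""
    return [sw * duty for sw in COMMUTATION[hall_state]]

# Example: hall sensors read (1, 0, 0), reference 300 rpm, measured 250 rpm
pi = PISpeedController()
print(gate_signals((1, 0, 0), pi.duty(300, 250)))
```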
3.2 Instrument Cluster Design To display a parameter in the instrument cluster, the parameters to be displayed need computation upon the data acquired from other ECUs. The important parameters that are calculated—distance and speed. Hall-effect sensors play a very important role in the proposed design as most of the calculations depend on the output of these sensors. Distance calculation is computed by finding the number of revolutions wheel has undergone and then multiplying it with the circumference of the wheel. The above logic has been implemented in the Simulink using a counter block, then a function block for adding the circumference recursively and a memory block are used to facilitate the recursive action.
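A minimal sketch of this odometer logic (revolutions counted from Hall pulses, multiplied by the wheel circumference) is shown below, expressed in Python for illustration even though the actual implementation used Simulink blocks; the wheel diameter and the number of Hall pulses per revolution are assumed example values.

```python
# Odometer sketch: accumulate distance from hall-sensor pulses.
# Wheel diameter and pulses per mechanical revolution are assumed example values.
import math

WHEEL_DIAMETER_M = 0.4
CIRCUMFERENCE_M = math.pi * WHEEL_DIAMETER_M
PULSES_PER_REV = 6          # assumed hall state changes per wheel revolution

def distance_m(pulse_count):
    """Count revolutions from the pulse counter and multiply by the circumference."""
    revolutions = pulse_count / PULSES_PER_REV
    return revolutions * CIRCUMFERENCE_M

print("Distance after 12,000 pulses: %.1f m" % distance_m(12000))
```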
One of the main computations to be done is speed. RPM data acquired from motor control unit is converted to speed by considering the diameter of the wheel. The main approach to get the RPM is based on the frequency of the Hall-effect sensor pulses. The relation between frequency and RPM of the motor is got. Then, the relational constant is used to get the RPM from the frequency of the Hall-effect pulses. Then, the RPM is converted to kilometers per hour (kmph) using a formula. Main objective of the project is to create a useful user interface (UI) for the driver to get the required data from the vehicle. Raspberry Pi is used as the controller for the instrument cluster. For developing the UI, Python language is used. Software tool mainly used to design the interface is the Python 3 IDLE. Tkinter is the Python module used for building the UI. A meter class with a basic structure of gauge is formed. The class has the definitions for creating the structure of a gauge, moving the meter needle as per the value, setting the range of the meter and other display specifications such as height and width. Then, Tkinter is used to create a canvas, and two objects of the class meter are placed for displaying RPM and speed. It is an analog type meter which forms the basic structure for displaying of speed and RPM (Fig. 3). Here, two gauges were developed separately for speed and RPM. Initial values are given in the Python program which is represented by the needle. The initial UI design was very basic. So, to make the UI more graphical and look better, various Python packages were searched. An interactive graphing library called Plotly has many indicators which were similar to the requirement. So, the Plotly library was used and the below UI with two meters for speed and RPM, and a display for distance was formed. This module displays in a Web browser (Fig. 4). Here, two gauges were developed using Plotly module in the Python program. This presents the speed, RPM, and distance in a graphical manner than the previous one.
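The kind of gauge display described above can be sketched with the Plotly graphing library as follows; the gauge ranges and the sample values fed to them are assumptions, and in the real cluster these values would come from the CAN messages discussed in the next section.

```python
# Sketch of a Plotly-based gauge display for speed and RPM (ranges are assumptions).
import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Indicator(
    mode="gauge+number", value=32,
    title={"text": "Speed (km/h)"},
    gauge={"axis": {"range": [0, 60]}},
    domain={"x": [0.0, 0.48], "y": [0, 1]}))
fig.add_trace(go.Indicator(
    mode="gauge+number", value=280,
    title={"text": "Motor speed (RPM)"},
    gauge={"axis": {"range": [0, 400]}},
    domain={"x": [0.52, 1.0], "y": [0, 1]}))
fig.show()   # opens in a web browser, as in the design described above
```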
3.3 CAN Communication
To communicate data between the motor control unit (MCU, the STM32 Discovery board) and the instrument cluster, the controller area network (CAN) is chosen, as it is the current industry standard. The CAN protocol is robust and frame-based; hence, it allows communication between ECUs without any complex wiring in between. CAN uses a differential signal, which makes it more resistant to noise, so that messages are transmitted with fewer errors. The controller area network is a serial communication bus designed to allow control units in vehicles to communicate without a host computer. The CAN bus consists of two wires, CAN high (CAN H) and CAN low (CAN L), with two 120 Ω resistors at the ends for termination. Traffic congestion is avoided since messages are transmitted based on priority, and the entire network meets its timing constraints. Each device can decide whether a message is relevant or needs to be filtered, and additional non-transmitting nodes can be added to the network without any modification to the system. The
messages are of broadcast type, and there is no single master for the bus; hence, it is a multi-master protocol. The bit rate can be varied from 125 kbit/s to 1 Mbit/s. For CAN communication, two additional pieces of hardware are required: a CAN controller and a CAN transceiver. At the transmitting end, the CAN controller converts the data into CAN frames, which are then turned into differential signals by the CAN transceiver. At the receiving end, the CAN transceiver takes the differential signals and converts them back into a CAN frame, which the CAN controller then turns back into data. The STM32 board has an inbuilt CAN controller, so only a CAN transceiver is required; the Raspberry Pi does not, so a controller has to be provided separately along with the transceiver, and SPI is used for communication between this CAN controller and the Raspberry Pi. The STM board and the Raspberry Pi were made to communicate via the CAN bus: a CAN module was attached to the Raspberry Pi, and communication between the Raspberry Pi and the CAN module was established over SPI (Fig. 5). SPI was enabled on the Raspberry Pi with the oscillator frequency of the crystal on the CAN module, and CAN communication was then brought up at the chosen baud rate. A CAN message was sent from the STM board, programmed using Simulink (Fig. 6), and was received on the Raspberry Pi by setting the baud rate to the value used while programming the STM board. It was viewed in the Raspberry Pi terminal and in the Python shell; the CAN library for Python was used to get the CAN message and display it.
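A sketch of this receiving side, using the python-can library over a SocketCAN interface, might look as follows; the channel name, arbitration ID and byte layout of the payload are assumptions for illustration, since the actual frame layout is whatever the MCU firmware packs into the message.

```python
# Sketch of the instrument-cluster receive path with python-can (SocketCAN).
# The channel name, message ID and byte layout below are assumptions; the
# bit rate itself is configured when the can0 interface is brought up.
import can

bus = can.interface.Bus(channel="can0", bustype="socketcan")

def decode(msg):
    """Assume bytes 0-1 carry speed (km/h) and bytes 2-5 carry distance (m)."""
    speed_kmph = int.from_bytes(msg.data[0:2], "big")
    distance_m = int.from_bytes(msg.data[2:6], "big")
    return speed_kmph, distance_m

while True:
    msg = bus.recv(timeout=1.0)            # wait for a frame (or time out)
    if msg is not None and msg.arbitration_id == 0x123:   # assumed MCU message ID
        speed, distance = decode(msg)
        print("speed:", speed, "km/h  distance:", distance, "m")
        # in the real cluster these values would update the gauge UI
```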
4 Results The integration of all three separate parts results at the intended hardware system, the electric vehicle. As discussed in the previous sections, the trapezoidal speed control was implemented using the STM32 discovery board to control the drive motor as discussed in the previous chapter using waijung blockset. The Hall-effect sensor data was received using pins, and triggering pulses were given to inverter switches based on an algorithm. A potentiometer is being used to control the speed by giving the reference speed. The inbuilt ADC is used for giving the potentiometer reading to the controller. Thus speed control is implemented. The parameters to be displayed are also calculated in the motor control unit as discussed earlier (Fig. 7). As discussed in the previous sections, Raspberry Pi is used as the instrument cluster controller unit. Raspberry Pi display is used as the display for the instrument cluster. Python language is used to create the UI for the cluster. There are many packages like Tkinter and wxPython for UI creation. Plotly graphs provided a better design for UI. Plotly graphs have inbuilt gauge meters which are used in the display. As discussed in the previous sections, communication between control units is established using CAN bus. CAN bus is established between the discovery board
(MCU) and Raspberry Pi (instrument cluster). Discovery board has inbuilt CAN controller, so it is connected to the CAN bus using a CAN transceiver module. For Raspberry Pi, external CAN controller and CAN transceiver modules are used. Thus, communication is established (Fig. 8). The calculated values are sent as a message in CAN bus by the motor control unit. CAN package in Python language is used to retrieve the CAN message sent by the motor control unit. The data is separated from the message and then given to the UI to display the same. The data is separated to speed and distance based on the bit position set on the MCU while sending data. Then, the data is converted from hexadecimal to the decimal and then given to the UI as the appropriate variable (Fig. 9).
5 Conclusions
An instrument cluster for an electric vehicle was implemented. The implemented motor control is primitive and has disadvantages such as torque ripple; a better control algorithm could be implemented. The inverter used in the implementation is very bulky and compromises the core principle of a two-wheeler, namely its compactness, so a single-chip inverter is desirable to maintain the compactness of the vehicle. The algorithms used to calculate parameters such as speed and distance work: the results appear correct, as the RPM, the main parameter for calculating speed and distance, agrees with the tachometer reading, but they could be refined using better algorithms with more precision and accuracy. The proposed design shows that high-level programming languages can be used to design user-friendly interfaces instead of sophisticated software tools. The CAN communication established between the ECUs was troublesome because of several issues encountered during implementation, such as garbage values. Important parameters such as the battery SOC, the remaining range before the charge drains and the economy speed can be added as future work.
Table 1 Components and its specifications

• Battery: the electrical energy source for all EV system components. Capacity: 7 Ah
• BLDC motor: the brushless DC drive motor of the vehicle. Voltage: 48 V; current: 5.2 A; power: 250 W; speed: 250–350 rpm
• Three-phase inverter: supplies energy from the battery to the BLDC motor. Voltage: 48 V; current: 18 A
• Motor control unit (MCU): the ECU that controls the basic operation of the vehicle. It takes the accelerator and brake inputs and gives the appropriate triggering pulses to the inverter based on the position of the rotor. The STM32 Discovery board is used as the MCU; it carries an STM32F407VG processor based on the high-performance ARM Cortex-M4 32-bit RISC core
• Instrument cluster: displays the necessary information to the user on the display connected to it. Implemented using a Raspberry Pi, whose high processing capability allows it to act as a standalone minicomputer
Table 2 Excitation table

Hall1  Hall2  Hall3   Sw1  Sw2  Sw3  Sw4  Sw5  Sw6
  0      0      0      0    0    0    0    0    0
  0      0      1      1    0    0    1    0    0
  0      1      0      0    1    0    0    1    0
  0      1      1      1    1    0    0    0    0
  1      0      0      0    0    1    0    0    1
  1      0      1      0    0    1    1    0    0
  1      1      0      0    0    0    0    1    1
  1      1      1      0    0    0    0    0    0
References 1. Lu D, Li J, Gu J (2014) Field oriented control of permanent magnet brushless hub motor in four-wheel drive electric vehicle. In: 2014 8th international conference on future generation communication and networking, Haikou, pp 128–131 2. Singh J, Singh M (2019) Comparison and analysis of different techniques for speed control of brushless DC motor using MATLAB simulink
3. Peši´c J, Omerovi´c K, Nikoli´c I, Bjelica MZ (2016) Automotive cluster graphics: current approaches and possibilities. In: 2016 IEEE 6th international conference on consumer electronics-Berlin (ICCE-Berlin), Berlin, pp 12–14 4. Krivík P (2018) Methods of SoC determination of lead acid battery. J Energy Storage 15:191– 195. ISSN: 2352–152X 5. Choi J, Kwon Y, Jeon J, Kim K, Choi H, Jang B (2018) Conceptual design of driver-adaptive human-machine interface for digital cockpit. In: 2018 international conference on information and communication technology convergence (ICTC), Jeju, pp 1005–1007 6. Chen H, Tian J (2009) Research on the controller area network. In: 2009 international conferenon networking and digital society, Guiyang, Guizhou, pp 251–254 7. Peri¸soar˘a LA, S˘ac˘aleanu DL, Vasile A (2017)Instrument clusters for monitoring electric vehicles. 2017 IEEE 23rd international symposium for design and technology in electronic packaging (SIITME), Constanta, pp 379–382
Ant Colony Optimization: A Review of Literature and Application in Feature Selection Nandini Nayar, Shivani Gautam, Poonam Singh, and Gaurav Mehta
Abstract Ant colony optimization (ACO) is a meta-heuristic inspired by real ants, which are capable of finding shortest paths; this has inspired researchers to apply it to numerous optimization problems. Outstanding and widely acknowledged applications have been derived from biologically inspired algorithms such as ACO, which belong to swarm intelligence, in turn motivated by the collective behavior of social insects. ACO is influenced by the natural ant system: the ants' behavior, their planning and organization as a team, their cooperation in seeking and finding optimal solutions, and the way information gathered by each ant is preserved. ACO has emerged as a popular meta-heuristic technique for solving combinatorial optimization problems and is well suited to finding shortest paths via construction graphs. This paper highlights the behavior of ants and the various ACO algorithms (their variants as well as hybrid approaches) that have been used successfully for feature selection, together with applications of ACO and current trends. The fundamental ideas of ant colony optimization are reviewed, including its biological background and application areas, and the paper shows how the current literature uses the ACO approach for feature selection. From the analysis of the literature, it can be concluded that ACO is a suitable approach for feature selection. Keywords Ant colony optimization · Feature selection · Swarm intelligence
N. Nayar (B) · G. Mehta Department of Computer Science and Engineering, Chitkara University Institute of Engineering and Technology, Chitkara University, Himachal Pradesh, India e-mail: [email protected] G. Mehta e-mail: [email protected] S. Gautam · P. Singh Department of Computer Science and Applications, Chitkara University School of Computer Applications, Chitkara University, Himachal Pradesh, India e-mail: [email protected] P. Singh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_22
1 Introduction In social insects’ colonies, every insect typically performs its tasks autonomously. However, the tasks performed by individual insects are correlated because the whole colony is able to solve even the complex problems by cooperation. Without any kind of central controller (supervisor), these insect colonies can solve numerous survival-related issues, e.g., selection/pick-up of material, exploring and storing the food. Although, such activities require sophisticated planning, still, such issues are resolved by insect colonies devoid of any central controller. Hence, such collective behavior emerging from social insects’ group is termed as “swarm intelligence.” During the last few decades, ever-increasing research on these algorithms suggests that nature is an extraordinary source of inspiration for developing intelligent systems and for providing superior solutions for numerous complicated problems. In this paper, the behavior of real ants are being reviewed. Owing to the fact that these ants are capable of exploring the shortest path between their food source and their nest, many ant-based algorithms are proposed by researchers. Ant colony optimization (ACO) algorithm is a prominent and successful swarm intelligence technique. During the past decades, a considerable amount of research has been conducted to develop the ACO-based algorithms as well as to develop its pragmatic applications to tackle real-world problems. Inspired by real ants’ behavior and their indirect communication, the ant colony optimization algorithm was proposed by Marco Dorigo. Since then, it is gaining tremendous attention to research. In ACO algorithms, ants (simple agents) are involved that collaborate for achieving amalgamated behavior for the system and thereby develop a “robust” system that can find superior-quality solutions for a variety of problems comprising of large search space. The paper reviews the basis of ACO. In Sect. 1, the behavior of real ant colonies is depicted. In Sect. 2, the feature selection concepts are presented. In Sect. 3, numerous existing ACO algorithms are reviewed.
1.1 Biological Background While walking, individual ants deposit a chemical known as “Pheromone” on the ground. Due to the accumulation of this chemical, a trail is created by the ants to mark their path. When an ant discovers its food source, it will create a trail for marking the path from its nest toward its food source and vice versa. The other ants can detect the presence of pheromone, and they prefer the path having more pheromone concentration. As the intensity of pheromone is supposed to be higher in case of shortest paths toward a food source, so the other ants are anticipated to follow the shortest path. Thus, the shorter pathway attracts more ants. Individual ants can find a solution to a problem. However, by cooperating with one another, they can find superior solutions.
1.2 Ant Colony Optimization (ACO) Algorithm
Ants are considered "social" insects, and they live in colonies. Ants deposit pheromone on the ground while traveling, which helps them explore the shortest route: probabilistically, every ant prefers to follow the path with the richest pheromone density. Pheromone decays with time, leading to lower pheromone intensity on less popular paths; consequently, the shortest route is traversed by more and more ants, while other paths fade until all ants follow the same shortest path and the system converges to a single solution. In practice, this pheromone evaporation is needed to avoid premature convergence of the algorithm towards a sub-optimal region. Inspired by the behavior of real ants, artificial ants are designed to solve optimization problems, since they can move through the problem states and make a decision at every step. In ACO, the basic rules are defined as follows (a minimal sketch of the last two rules is given after the list):
• The problem is depicted as a graph, with nodes representing features and edges representing the choice of the subsequent feature.
• η denotes the heuristic information, i.e., the goodness of a path.
• The pheromone updating rule updates the pheromone level on the edges.
• The probabilistic transition rule gives the probability of an ant traversing to a subsequent node (Table 1).
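The sketch below illustrates how these two rules are commonly written in code; the random-proportional form with weights τ^α · η^β and the particular α, β and evaporation-rate values are standard textbook choices assumed here, not parameters prescribed by this paper.

```python
# Sketch of the two ACO rules listed above (parameter values are assumptions).
import random

ALPHA, BETA, RHO = 1.0, 2.0, 0.1   # pheromone weight, heuristic weight, evaporation

def transition_probabilities(current, candidates, tau, eta):
    """Probabilistic transition rule: P(i->j) ~ tau[i][j]^alpha * eta[i][j]^beta."""
    weights = {j: (tau[current][j] ** ALPHA) * (eta[current][j] ** BETA)
               for j in candidates}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def choose_next(current, candidates, tau, eta):
    """Sample the next node according to the transition probabilities."""
    probs = transition_probabilities(current, candidates, tau, eta)
    return random.choices(list(probs), weights=list(probs.values()))[0]

def update_pheromone(tau, solutions):
    """Pheromone updating rule: evaporate, then reinforce edges of good solutions."""
    for i in tau:
        for j in tau[i]:
            tau[i][j] *= (1.0 - RHO)
    for path, quality in solutions:          # quality: higher is better
        for i, j in zip(path, path[1:]):
            tau[i][j] += quality
```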
1.3 Advantages of ACO
• ACO algorithms are robust in nature, i.e., they adapt to changing, dynamic applications.
• They have the advantage of distributed computation.
• They provide positive feedback, which in turn leads to the discovery of optimal solutions that may be further used in dynamic applications.
• They allow dynamic re-routing via shortest-path algorithms if any node is broken.
• When analyzing real-scale networks, ACO algorithms allow network flows to be calculated more quickly than traditional static algorithms [33].
Some more advantages of ACO are summarized in Fig. 1.
Table 1 ACO applications

Application area | Authors
Water resource management | Maier et al. [1]; López-Ibáñez et al. [2]; Zheng et al. [3]; Sidiropoulos and Fotakis [4]; Shahraki et al. [5]
Protein structure prediction | Do Duc et al. [6]; Liang et al. [7]
Tele-communication | Özmen et al. [8]; Di Caro and Dorigo [9]; Khan et al. [10]
Feature selection | Shunmugapriya and Kanmani [11]; Sweetlin et al. [12]; Mehmod and Rais [13]; Wan et al. [14]; Ghosh et al. [15]; Dadaneh et al. [16]; Peng et al. [17]; Nayar et al. [18]; Rashno et al. [19]; Saraswathi and Tamilarasi [20]
Vehicle routing problems | Ding et al. [21]; Yu et al. [22]; Wu et al. [23]; Huang et al. [24]; Xu et al. [25]; Huang et al. [26]; Zhang et al. [27]
Robot path planning | Brand et al. [28]; Chia et al. [29]; Cong and Ponnambalam [30]; Liu et al. [31]; Deng et al. [32]

Fig. 1 Benefits of ant colony optimization algorithm

1.4 Generic Structure of the ACO Algorithm

The generic ACO algorithm is depicted below [34] and comprises four major steps:
1. Initialization: all pheromone values and algorithm parameters are declared and initialized.
2. Formulate Ant Solutions: a group of ants constructs solutions to the problem being solved, making use of the pheromone values and other related information.
3. Local Search (optional): the constructed solutions are improved by an optional local optimization step.
4. Global Pheromone Update: in the last step, the pheromone variables are updated based on the search experience echoed by the ants.

    Begin ACO
        Algorithm initialization;
        while (end criterion is not satisfied) do
            Formulate ant solutions;
            Perform local search;
            Global pheromone update;
        end while
    End of ACO
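The loop above can be sketched in runnable form as follows. This is a minimal illustration with placeholder construction, evaluation, and local-search routines; the evaporation rate rho and the deposit rule are common textbook choices rather than details taken from [34].

```python
def aco(construct_solution, evaluate, local_search, n_ants=10, n_iterations=100, rho=0.1):
    """Generic ACO skeleton: initialization, ant solution construction,
    optional local search, and global pheromone update."""
    pheromone = {}                      # pheromone per solution component, filled lazily
    best, best_cost = None, float("inf")
    for _ in range(n_iterations):
        # 2. Formulate ant solutions using current pheromone information
        solutions = [construct_solution(pheromone) for _ in range(n_ants)]
        # 3. Optional local search on each constructed solution
        solutions = [local_search(s) for s in solutions]
        # Track the best solution found so far (a solution is an iterable of components)
        for s in solutions:
            cost = evaluate(s)
            if cost < best_cost:
                best, best_cost = s, cost
        # 4. Global pheromone update: evaporation followed by deposit on the best solution
        for component in pheromone:
            pheromone[component] *= (1.0 - rho)
        for component in best:
            pheromone[component] = pheromone.get(component, 1.0) + 1.0 / best_cost
    return best, best_cost
```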
2 Feature Selection

As a consequence of technological innovations, a massive amount of data is generated, which leads to an increased number of dimensions in a dataset. For the discovery of knowledge from these massive datasets, feature selection is imperative [35, 36]. As the number of features surpasses a definite limit, there exists a substantial number of features that are redundant or irrelevant, leading to poor classifier performance. The purpose of feature selection is to reduce dimensionality for
performing meaningful data analysis. Feature selection is a significant task in data mining as well as in pattern recognition. The goal of feature selection is to choose an optimal feature subset having the maximum discriminating ability as well as minimum redundancy [14]. It is imperative to develop a classification model suitable for dealing with problems having different sample sizes as well as different dimensionality. The vital tasks include the selection of valuable features and an apt classification method. The idea of feature selection involves choosing a minimal feature subset, i.e., the best feature subset comprising k features that yields the least generalization error. Feature selection techniques are utilized either as a preprocessing step or in combination with a learning model for the task of classification. The set of all original features is given as input to the feature selection method, which subsequently generates "feature subsets." Then, the selected subset is evaluated by a learning algorithm or through consideration of data characteristics.
2.1 Need for Feature Selection

1. If the number of features is extremely large, it becomes a complicated task to work with all available features.
2. Most of the available features are redundant, noisy, or irrelevant to the classification or clustering task.
3. If the number of features exceeds the number of input data points, it becomes a problem.
4. To decrease the computation cost as well as the training time.
5. To avoid the curse of dimensionality.
6. To provide better model interpretability and readability.
2.2 ACO for Feature Selection

Motivated by the numerous benefits possessed by the ACO algorithm, it is widely used for performing the task of feature selection. The advantages of ACO include its powerful search ability and its ability to converge expeditiously, thereby leading to efficient exploration of a minimal feature subset [37]. ACO-based methods for feature selection are fairly prominent as they apply knowledge from previous iterations and thus achieve optimum solutions [38]. As compared to other conventional approaches, ACO is fast and simple. It is considered to be one of the most preferred approaches for resolving various complex problems [39]. In ACO, the problem is represented as a graph, where nodes correspond to features and edges denote the choice of a subsequent feature.
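A minimal sketch of how this graph formulation can drive feature selection is shown below: an ant walks the feature graph, picking distinct features with the transition-rule idea from Sect. 1.2, and the resulting subset is evaluated in wrapper fashion with a 1-nearest-neighbor classifier. The subset size, the parameter values, the random data, and the use of scikit-learn are illustrative assumptions, not the method of any specific paper reviewed here.

```python
import random
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def ant_select_subset(pheromone, heuristic, n_select, alpha=1.0, beta=1.0):
    """One ant walks the feature graph and picks n_select distinct features."""
    remaining = list(range(len(pheromone)))
    subset = []
    for _ in range(n_select):
        weights = [(pheromone[j] ** alpha) * (heuristic[j] ** beta) for j in remaining]
        j = random.choices(remaining, weights=weights, k=1)[0]
        subset.append(j)
        remaining.remove(j)
    return subset

def subset_quality(X, y, subset):
    """Wrapper evaluation: mean cross-validated accuracy of a 1-NN classifier."""
    return cross_val_score(KNeighborsClassifier(n_neighbors=1), X[:, subset], y, cv=3).mean()

# Illustrative run on random data with 20 candidate features
random.seed(0); np.random.seed(0)
X = np.random.rand(60, 20)
y = np.tile([0, 1], 30)                 # two balanced dummy classes
pheromone = np.ones(20)
heuristic = np.ones(20)
subset = ant_select_subset(pheromone, heuristic, n_select=5)
print(subset, subset_quality(X, y, subset))
```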
3 Literature Review: ACO for Feature Selection

Several ACO-based algorithms have been proposed by researchers. In this section, a review of ACO-based algorithms used for feature selection is presented. Subset evaluation based on "memory" is introduced in [15] for keeping the best ants, together with a feature-dimension-dependent pheromone update, for selecting a feature set in a multi-objective manner. The approach was tested on numerous real-life datasets making use of multi-layer perceptron and K-nearest neighbor classifiers. In [40], a hybrid approach comprising ant colony optimization and k-nearest neighbor (KNN) is proposed for the selection of valuable features from a customer review dataset. The performance of this algorithm was evaluated and validated by parametric significance tests. Results prove that this technique is valuable for providing a superior feature set for customer review data. A computer-aided diagnostic system is developed in [12] for detecting "pulmonary hamartomas" from lung CT images. For selecting relevant features, the ACO algorithm is used, which trains SVM and naïve Bayes classifiers to mark the existence or nonexistence of the hamartoma nodule. Results demonstrate that the features selected by the ACO-RDM approach yield superior accuracy (94.36%) with the SVM classifier. Using CT images of the lungs, [41] proposed an ACO-based method for selecting relevant features for enhancing the accuracy of diagnosing pulmonary bronchitis. The approach comprises ACO with cosine similarity and an SVM classifier. Furthermore, the tandem-run recruitment strategy assists in choosing the best features. Results demonstrate that the ACO algorithm with a tandem-run strategy yields an accuracy of 81.66%. In [19], the feature space is reduced by ACO, which decreases the time complexity, a vital concern in Mars on-board applications. The proposed method reduces the feature size to a great extent (up to 76%) and yields high accuracy (91.05%), thereby outperforming the conventional approach, which yields 86.64% accuracy. By combining traits of the ant colony and bee colony, [11] proposed a hybrid AC-ABC algorithm for optimizing feature selection. The approach eliminates the stagnation behavior of ants, as well as the time-consuming search for the initial solutions of bees. The approach is evaluated on 13 datasets and shows promising results in terms of the optimal features selected and classification accuracy. Reference [20] proposed an ant colony optimization-based feature selection algorithm for extracting feature sets from reviews by making use of term frequency-inverse document frequency (TF-IDF). Furthermore, the selected features are classified by an SVM or naïve Bayes classifier. The results obtained demonstrate that the approach is efficient for classifying opinions. Reference [42] presented the SMMB-ACO method by combining Markov blanket learning with stochastic and ensemble features, thereby guiding the stochastic sampling process by including ACO. The experimental results demonstrate that the proposed method is more stable as compared to SMMB. Reference [14] presented an approach for feature selection based on a modified binary-coded ant colony optimization (MBACO) algorithm integrated with a genetic algorithm. There are two models:
the "pheromone density model" and the "visibility density model." The approach is validated on ten datasets obtained from the UCI repository. The results demonstrate that the proposed method is capable of keeping a fair balance between classification accuracy and efficiency, thereby making it apt for feature selection applications. For enhancing the stability of feature selection, [17] proposed FACO, an improved feature selection algorithm based on ACO. It uses a two-stage pheromone updating rule that averts the algorithm from falling into a premature local optimum and is validated on the KDD CUP99 dataset. The outcomes demonstrate that the algorithm has great practical significance, as it enhances the classification accuracy as well as the efficiency of a classifier. From the literature review, some gaps can be identified: it can be concluded that "construction algorithms" are not able to provide superior-quality solutions and may not remain optimal under "minor" changes. Some challenges in ACO include exploring superior pheromone models and upholding an apt balance between "exploration" and "exploitation" of the search space. Moreover, there is a need for adaptive intelligent models having the capability of automatically identifying dynamic alterations in a dataset's characteristics, thereby upgrading the algorithms in a self-directed manner. After carrying out an extensive literature review of ACO in feature selection, the accuracies achieved by various ACO variants are summarized in Table 2.
4 Current Trends

The recent developments of ACO algorithms involve hybridization with other renowned meta-heuristics or with other mathematical programming techniques. Moreover, many capable schemes for parallelization of ACO algorithms are also being proposed. Problems comprising conflicting as well as multiple objectives also need to be addressed by exploring solutions that provide a superior compromise among the various objectives. According to [53], ACO algorithms are effectively used in numerous technology fields. ACO has consequently made an effectual contribution in digital image processing and monitoring of structural damage. Furthermore, ACO has gained much attention in solving issues related to economic dispatch problems. Researchers are also exploring the capability of ACO for data clustering, scheduling, and routing problems. Present ACO applications fall into two classes of combinatorial optimization problems [54]: static and dynamic.
• Static problems can be defined as problems wherein the topology and cost do not change while resolving the problem, e.g., the traveling salesman problem, where the city locations and intercity distances do not change during the run-time of an algorithm.
Table 2 Comparison of classification accuracy for test datasets

Dataset | Algorithm | Accuracy achieved (%)
Sonar | ACO-BSO, PSO [43] | 99.09, 95.53
Statlogheart | ACO-BSO, PSO [43] | 86.56, 84.89
Zali | ACO-BSO, PSO [43] | 88.35, 84.29
Cleveland dataset | ACO-HKNN, SVM, Naïve Bayes [44] | 99.2, 97.74, 96.21
STULONG dataset | ACO, ACO/PSO, ACO/PSO with new Q [45] | 80.75, 78.1, 87.43
Labor | ACO-CE, PSO [46] | 78.94, 70.17
Ionosphere | ACO-CE, PSO [46] | 91.16, 86.32
Sonar | BACO, ABACO [47] | 86.0, 91.0
Ion | BACO, ABACO [47] | 92.1, 93.3
Vehicle | BACO, ABACO [47] | 76.9, 78.7
Soyabean-large-MD | Genetic search, ACO [48] | 98.33, 99.02
Heart | Genetic search, ACO [48] | 84.81, 85.92
SPECT | Genetic search, ACO [48] | 70.03, 75.65
Perturbed-breast dataset | ACO-based search, PSO-based search [49] | 97.14, 97.14
Perturbed-dermatology dataset | ACO-based search, PSO-based search [49] | 88.28, 80.9
NSL-KDD | PSO, ACO, ABC [50] | 96.04, 98.13, 98.9
Reuter's dataset | ACO, ACO-ANN, GA [51] | 79.02, 81.35, 78.27
German traffic signs | ACO + ANN, ACO + SVM, ACO + EFSVM [52] | 82.22, 88.95, 92.39
Hepatitis | ACO-PSO, ABC-DE, AC-ABC [11] | 75.34, 71.26, 79.29
Cancer | ACO-PSO, ABC-DE, AC-ABC [11] | 87.06, 96.01, 99.43
• Dynamic problems are those where the topology and cost change even while the solutions are being built, for example, the routing problem in telecommunication networks, where traffic patterns keep changing everywhere. The ACO-based algorithms for addressing these kinds of problems are the same in general, but they unquestionably vary in implementation details.
Of late, ACO algorithms have attracted a great deal of interest among researchers. Nowadays, there are various successful implementations of ACO algorithms applied to a broad scope of combinatorial optimization problems. Such applications come under two broad application areas:
• NP-hard problems: For these problems, the best-known algorithms have exponential worst-case time complexity. Most ant-based algorithms are equipped with additional abilities, like problem-specific local optimizers, that take the ant solutions to local optima.
• Shortest path problems: In these problems, the properties of the problem's graph representation may vary over time (synchronously) with the optimization method, which needs to adapt to the problem dynamics. In such a scenario, the graph may be available, but its properties (costs of components and connections) may vary over time. In such cases, it can be concluded that the use of ACO algorithms is recommended as the variation rate of the cost increases but the knowledge of the variation process decreases.
5 Conclusion and Future Scope

From the literature studied, it can be inferred that the identification of pertinent and valuable features for training the classifier impacts the performance of the classifier model. ACO has been, and continues to be, a productive paradigm for structuring powerful combinatorial solutions to optimization problems. In this paper, the origin and biological background of the ACO algorithm are presented, together with several application areas of ACO. Finally, a survey of ACO used in the domain of feature selection is presented. The ACO algorithm has become one of the most popular meta-heuristic approaches for resolving various combinatorial problems. The previous ACO
versions were not good enough to compete with other well-known algorithms, but the outcomes were promising enough to open new avenues for exploring this area. Since then, many researchers have explored the basic ACO algorithm and updated it to obtain promising results. This paper focuses on outlining the latest ACO developments in terms of algorithms as well as ACO applications. Applications like multi-objective optimization and feature selection are the main targets of recent ACO developments. For enhancing the performance of ACO algorithms, these algorithms are further combined with existing meta-heuristic methods and integer-programming techniques. A clear improvement in results for different problems has been shown by the hybridization of ACO algorithms. Implementation of parallel versions of ACO algorithms has been seen in the latest trends. Due to the use of multi-core CPU architectures and GPUs, the creation of enhanced parallel versions of ACO algorithms is possible.
References 1. Maier HR, Simpson AR, Zecchin AC, Foong WK, Phang KY, Seah HY, and Tan CL (2003) Ant colony optimization for design of water distribution systems. J Water Resour Plan Manage 129(3):200–209 2. López-IbáñezM, Prasad TD, Paechter B (2008) Ant colony optimization for optimal control of pumps in water distribution networks. J Water Resour Plann Manage 134(4):337–346 3. Zheng F, Zecchin AC, Newman JP, Maier HR, Dandy GC (2017) An adaptive convergencetrajectory controlled ant colony optimization algorithm with application to water distribution system design problems. IEEE Trans Evol Comput 21(5):773–791 4. Sidiropoulos E, Fotakis D (2016) Spatial water resource allocation using a multi-objective ant colony optimization. Eur Water 55:41–51 5. Shahraki J, Sardar SA, Nouri S (2019) Application of met heuristic algorithm of ant Colony optimization in optimal allocation of water resources of Chah-Nime of Sistan under managerial scenarios. IJE 5(4):1 6. Do Duc D, Dinh PT, Anh VTN, Linh-Trung N (2018) An efficient ant colony optimization algorithm for protein structure prediction. In: 2018 12th international symposium on medical information and communication technology (ISMICT), pp 1–6. IEEE 7. Liang Z, Guo r, Sun J, Ming Z, Zhu Z (2017) Orderly roulette selection based ant colony algorithm for hierarchical multilabel protein function prediction. Math Prob Eng 8. Özmen M, Aydo˘gan EK, Delice Y, Duran Toksarı M (2020) Churn prediction in Turkey’s telecommunications sector: a proposed multiobjective–cost-sensitive ant colony optimization. Wiley Interdisc Rev Data Min Knowl Disc 10(1):e1338 9. Di Caro G, Dorigo M (2004) Ant colony optimization and its application to adaptive routing in telecommunication networks. PhD diss., PhD thesis, Faculté des Sciences Appliquées, Université Libre de Bruxelles, Brussels, Belgium 10. Khan I, Huang JZ, Tung NT (2013) Learning time-based rules for prediction of alarms from telecom alarm data using ant colony optimization. Int J Comput Inf Technol 13(1):139–147 11. Shunmugapriya P, Kanmani S (2017) A hybrid algorithm using ant and bee colony optimization for feature selection and classification (AC-ABC Hybrid). Swarm Evol Comput 36:27–36 12. Sweetlin JD, Nehemiah HK, Kannan A (2018) Computer aided diagnosis of pulmonary hamartoma from CT scan images using ant colony optimization based feature selection. Alexandria Eng J 57(3):1557–1567 13. Mehmod T, Md Rais HB (2016) Ant colony optimization and feature selection for intrusion detection. In: Advances in machine learning and signal processing, pp 305–312. Springer, Cham
14. Wan Y, Wang M, Ye Z, Lai X (2016) A feature selection method based on modified binary coded ant colony optimization algorithm. Appl Soft Comput 49:248–258 15. Ghosh M, Guha R, Sarkar R, Abraham A (2019) A wrapper-filter feature selection technique based on ant colony optimization. Neural Comput Appl:1–19 16. Dadaneh BZ, Markid HY, Zakerolhosseini A (2016) Unsupervised probabilistic feature selection using ant colony optimization. Expert Syst Appl 53:27–42 17. Peng H, Ying C, Tan S, Bing Hu, Sun Z (2018) An improved feature selection algorithm based on ant colony optimization. IEEE Access 6:69203–69209 18. Nandini N, Ahuja S, Jain S (2020) Meta-heuristic Swarm Intelligence based algorithm for feature selection and prediction of Arrhythmia. Int J Adv Sci Technol 29(2):61–71 19. Rashno A, Nazari B, Sadri S, Saraee M (2017) Effective pixel classification of mars images based on ant colony optimization feature selection and extreme learning machine. Neurocomputing 226:66–79 20. Saraswathi K, Tamilarasi A (2016) Ant colony optimization based feature selection for opinion mining classification. J Med Imaging Health Inf 6(7):1594–1599 21. Ding Q, Xiangpei Hu, Sun L, Wang Y (2012) An improved ant colony optimization and its application to vehicle routing problem with time windows. Neurocomputing 98:101–107 22. Yu B, Yang Z-Z, Yao B (2009) An improved ant colony optimization for vehicle routing problem. Eur J Oper Res 196(1):171–176 23. Wu L, He Z, Chen Y, Dan Wu, Cui J (2019) Brainstorming-based ant colony optimization for vehicle routing with soft time windows. IEEE Access 7:19643–19652 24. Huang G, Cai Y, Cai H (2018) Multi-agent ant colony optimization for vehicle routing problem with soft time windows and road condition. In: MATEC web of conferences, vol 173, p 02020. EDP Sciences 25. Xu H, Pu P, Duan F (2018) Dynamic vehicle routing problems with enhanced ant colony optimization. Discrete Dyn Nat Soci 2018 26. Huang Y-H, Blazquez CA, Huang S-H, Paredes-Belmar G, Latorre-Nuñez G (2019) Solving the feeder vehicle routing problem using ant colony optimization. Comput Ind Eng 127:520–535 27. Zhang H, Zhang Q, Ma L, Zhang Z, Liu Y (2019) A hybrid ant colony optimization algorithm for a multi-objective vehicle routing problem with flexible time windows. Inf Sci 490:166–190 28. Brand M, Masuda M, Wehner N, Yu X-H (2010) Ant colony optimization algorithm for robot path planning. In: 2010 international conference on computer design and applications, vol 3, pp V3–436. IEEE 29. Chia S-H, Su K-L, Guo J-R, Chung C-Y (2010) Ant colony system based mobile robot path planning. In: 2010 fourth international conference on genetic and evolutionary computing, pp 210–213. IEEE 30. Cong YZ, Ponnambalam SG (2009) Mobile robot path planning using ant colony optimization. In: 2009 IEEE/ASME international conference on advanced intelligent mechatronics, pp 851– 856. IEEE 31. Liu J, Yang J, Liu H, Tian X, Gao M (2017) An improved ant colony algorithm for robot path planning. Soft Comput 21(19):5829–5839 32. Deng G-F, Zhang X-P, Liu Y-P (2009) Ant colony optimization and particle swarm optimization for robot-path planning in obstacle environment. Control Theory Appl 26(8):879–883 33. Deepa O, Senthilkumar A (2016) Swarm intelligence from natural to artificial systems: ant colony optimization. Networks (Graph-Hoc) 8(1):9–17 34. Akhtar A (2019) Evolution of ant colony optimization algorithm—a brief literature review. In: arXiv: 1908.08007 35. 
Nayar N, Ahuja S, Jain S (2019) Swarm intelligence for feature selection: a review of literature and reflection on future challenges. In: Advances in data and information sciences, pp 211–221. Springer, Singapore 36. Manoharan S (2019) Study on Hermitian graph wavelets in feature detection. J Soft Comput Paradigm (JSCP) 1(01):24–32 37. Aghdam MH, Kabiri P (2016) Feature selection for intrusion detection system using ant colony optimization. IJ Netw Secur 18.3:420–432
38. Aghdam MH, Ghasem-Aghaee N, Basiri ME (2009) Text feature selection using ant colony optimization. Expert Syst Appl 36(3):6843–6853 39. Shakya S, Pulchowk LN, A novel bi-velocity particle swarm optimization scheme for multicast routing problem 40. Ahmad SR, Yusop NMM, Bakar AA, Yaakub MR (2017) Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis. In: AIP conference proceedings, vol 1891(1), p 020018. AIP Publishing LLC 41. Sweetlin JD, Nehemiah HK, Kannan A (2017) Feature selection using ant colony optimization with tandem-run recruitment to diagnose bronchitis from CT scan images. Comput Methods Programs in Biomed 145:115–125 42. Sinoquet C, Niel C (2018) Ant colony optimization for markov blanket-based feature selection. Application for precision medicine. In: International conference on machine learning, optimization, and data science, pp 217–230. Springer, Cham 43. Liang H, Wang Z, Liu Yi (2019) A new hybrid ant colony optimization based on brain storm optimization for feature selection. IEICE Trans Inf Syst 102(7):1396–1399 44. Sowmiya C, Sumitra P (2020) A hybrid approach for mortality prediction for heart patients using ACO-HKNN. J Ambient Intell Humanized Comput 45. Mangat V (2010) Swarm intelligence based technique for rule mining in the medical domain. Int J Comput Appl 4(1):19–24 46. Naseer A, Shahzad W, Ellahi A (2018) A hybrid approach for feature subset selection using ant colony optimization and multi-classifier ensemble. Int J Adv Comput Sci Appl IJACSA 9(1):306–313 47. Kashef S, Nezamabadi-pour H (2013) A new feature selection algorithm based on binary ant colony optimization. In: The 5th conference on information and knowledge technology, pp 50–54. IEEE 48. Jameel S, Ur Rehman S (2018) An optimal feature selection method using a modified wrapperbased ant colony optimisation. J Natl Sci Found Sri Lanka 46(2) 49. Selvarajan D, Jabar ASA, Ahmed I (2019) Comparative analysis of PSO and ACO based feature selection techniques for medical data preservation. Int Arab J Inf Technol 16(4):731–736 50. Khorram T, Baykan NA (2018) Feature selection in network intrusion detection using metaheuristic algorithms. Int J Adv Res Ideas Innovations Technol 4(4) 51. Manoj RJ, Praveena MDA, Vijayakumar K (2019) An ACO–ANN based feature selection algorithm for big data. Cluster Comput 22(2):3953–3960 52. Jayaprakash A, KeziSelvaVijila C (2019) Feature selection using ant colony optimization (ACO) and road sign detection and recognition (RSDR) system. Cogn Syst Res 58:123–133 53. Nayyar A, Le DN, Nguyen NG (eds) (2018) Advances in swarm intelligence for optimizing problems in computer science. CRC Press (Oct 3) 54. Dorigo M, Stützle T (2019) Ant colony optimization: overview and recent advances. In: Handbook of metaheuristics, pp 311–351. Springer, Cham
Hand Gesture Recognition Under Multi-view Cameras Using Local Image Descriptors Kiet Tran-Trung and Vinh Truong Hoang
Abstract Hand gesture recognition has had various applications in recent years, such as robotics, e-commerce, human–machine interaction, e-sport, and assisting hearing-impaired people. The latter is the most useful and interesting application in our daily life. Nowadays, cameras can be installed easily and everywhere, so gesture recognition faces its most challenging issue under image acquisition by multiple cameras. This paper introduces an approach for hand gesture recognition under multi-view cameras. The proposed approach is evaluated on the HGM-4 benchmark dataset by using local binary patterns. Keywords Hand gesture recognition · Local image descriptor · Multi-view cameras
K. Tran-Trung (B) · V. T. Hoang, Ho Chi Minh City Open University, Ho Chi Minh City, Vietnam, e-mail: [email protected]; V. T. Hoang, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_23

1 Introduction

The hand gesture is a typical and basic tool of humans for conversation. It is very difficult to train someone to learn and understand all gesture-based sign language in a short time. Many intelligent systems have been proposed to automatically recognize and understand those gestures. Hand gesture recognition has received a lot of attention from many vision scientists in the last decade. It is a core process of smart home, contactless device, and multimedia systems [1, 2]. Various methods proposed in the literature are based on image analysis. Dinh et al. [3] proposed a method for analyzing hand gesture sequence images by using the hidden Markov model and evaluated it on one- and two-hand gesture databases. Tavakoli et al. [14] recognize hand gestures based on EMG wearable devices and SVM classifiers. Chaudhary et al. [2] introduced a method based on light invariance for hand gesture recognition. They applied a technique for extracting features by orientation histogram on the region of
interest. Chansri et al. [1] presented a method based on the HOG descriptor and a neural network for Thai sign language recognition. Since cameras can be installed at any outdoor or indoor position, modern hand gesture recognition faces a challenging issue due to the multiple views obtained at different acquisition angles. Figure 1 illustrates an example of one hand gesture under four cameras at different positions. Distinct views, and even illusions, can arise from a single hand gesture. The problem of hand gestures under multiple views has been investigated in [11, 12]. The authors introduced a method to fuse features extracted from different cameras with two hand gestures. There exist a few public hand gesture datasets in the literature [10, 13]. All images are usually captured by one camera. Recently, Hoang [4] surveyed different hand gesture databases with multiple views and released a new dataset (HGM-4) under four different cameras for Vietnamese sign language recognition. This paper presents a preliminary result on the HGM-4 dataset based on a local image descriptor. The local binary pattern (LBP) [8, 9] is considered to represent a hand gesture image since it is an efficient and fast approach for characterizing texture images [7]. The remainder of this paper is organized as follows. Section 2 introduces the LBP descriptor, the proposed approach, and the experimental results on the HGM-4 dataset. Finally, the conclusion is given in Sect. 3.
Fig. 1 An example of one-hand gesture captured by four different cameras
2 Proposed Approach

2.1 Local Binary Patterns

Local binary patterns (LBP) are obtained by computing the local neighborhood structure representing the texture around each pixel of the image from a square neighborhood of 3 × 3 pixels. The $LBP_{P,R}(x_c, y_c)$ code of each pixel $(x_c, y_c)$ is calculated by comparing the gray value $g_c$ of the central pixel with the gray values $\{g_i\}_{i=0}^{P-1}$ of its P neighbors, by this formula:

$$LBP_{P,R}(x_c, y_c) = \sum_{i=0}^{P-1} s(g_i - g_c) \times 2^i \quad (1)$$

where $s$ is a threshold function which is computed as:

$$s(g_i - g_c) = \begin{cases} 1 & \text{if } (g_i - g_c) \ge 0 \\ 0 & \text{otherwise} \end{cases} \quad (2)$$
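A small Python/NumPy sketch of Eqs. (1)–(2) for the basic 3 × 3 neighborhood (P = 8, R = 1) is given below; it is an illustrative implementation of the formula, not the exact code used by the authors.

```python
import numpy as np

def lbp_8_1(image):
    """Compute the LBP(8,1) code of every interior pixel of a grayscale image
    by thresholding the 8 neighbors against the central pixel (Eqs. (1)-(2))."""
    image = np.asarray(image, dtype=np.int32)
    # neighbor offsets listed clockwise, starting from the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = image.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = image[1:h-1, 1:w-1]
    for i, (dy, dx) in enumerate(offsets):
        neighbor = image[1+dy:h-1+dy, 1+dx:w-1+dx]
        codes |= ((neighbor >= center).astype(np.uint8) << i)   # s(g_i - g_c) * 2^i
    return codes

# Example on a small random grayscale patch
patch = np.random.randint(0, 256, (5, 5))
print(lbp_8_1(patch))
```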
2.2 Proposed Approach

Several LBP patterns occur more frequently in texture images than others. The authors in [15] proposed to define the "LBP uniform pattern" $LBP^{u2}_{P,R}$, which is a subset of the original LBP [8]. For this, they consider a uniformity measure of a pattern, which analyzes the number of bitwise transitions from 0 to 1 or vice versa when the bit pattern is traversed circularly. An LBP pattern is called uniform if it contains at most two such transitions. For example, the patterns 11111111 (0 transitions), 00011110, and 11100111 are uniform, while the pattern 00110010 is not. The final features obtained from image patches are better and more representative than those extracted from the global image [7, 15]. To extract features from multiple blocks, each original image is divided into sub-blocks. The features extracted from these blocks for each color component are then fused to create a final feature vector, e.g., a vector with 59 × 3 = 177 features for an original image without division. An illustration of the proposed approach is presented in Fig. 2.

HGM-4 [4] is a benchmark dataset for hand gestures under multiple cameras. Table 1 presents the characteristics of this database. Four cameras are installed at different positions to capture hand gesture images, and there are 26 distinct gestures performed by five different persons. Figure 3 illustrates different images of the same gesture under one camera (left camera). Since all images are segmented to have a uniform background, this problem is more challenging in a complex background.
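To make the uniformity test and the block-wise histogram concrete, the following sketch counts circular 0/1 transitions in an 8-bit pattern and concatenates per-block histograms. The 59-bin layout (58 uniform patterns plus one shared bin for all non-uniform patterns) follows the standard LBP^{u2} convention, and the block grid used here is an assumed example, not the authors' code.

```python
import numpy as np

def is_uniform(code, bits=8):
    """A pattern is uniform if its circular bit string has at most 2 transitions."""
    transitions = 0
    for i in range(bits):
        if ((code >> i) & 1) != ((code >> ((i + 1) % bits)) & 1):
            transitions += 1
    return transitions <= 2

# Map each of the 256 codes to one of 59 bins: 58 uniform patterns + 1 "non-uniform" bin
uniform_codes = [c for c in range(256) if is_uniform(c)]
bin_of = {c: i for i, c in enumerate(uniform_codes)}     # 58 uniform bins
NON_UNIFORM_BIN = len(uniform_codes)                     # shared 59th bin

def block_histograms(lbp_codes, blocks=(3, 3)):
    """Split an LBP code map into blocks and concatenate normalized 59-bin histograms."""
    h, w = lbp_codes.shape
    features = []
    for by in range(blocks[0]):
        for bx in range(blocks[1]):
            block = lbp_codes[by*h//blocks[0]:(by+1)*h//blocks[0],
                              bx*w//blocks[1]:(bx+1)*w//blocks[1]]
            hist = np.zeros(59)
            for code in block.ravel():
                hist[bin_of.get(int(code), NON_UNIFORM_BIN)] += 1
            features.append(hist / max(block.size, 1))
    return np.concatenate(features)
```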
Fig. 2 Proposed approach
Table 1 Characteristics of HGM-4 benchmark dataset

Camera | Gestures and number of images per gesture | Number of performing persons | Total
Below | 26 × 8 | 5 | 1040
Front | 26 × 8 | 5 | 1040
Left | 26 × 8 | 5 | 1040
Right | 26 × 8 | 5 | 1040
Fig. 3 A gesture performed by different volunteers under one camera
2.3 Results

A hold-out cross-validation technique is applied on the initial dataset to create training and testing subsets. Different train:test ratios are considered: 50:50, 70:30, 80:20, and 90:10. Seven strategies are applied to divide the whole image into multiple blocks. Table 2 illustrates the classification results obtained by using LBP uniform features extracted from color images. The first column indicates the number of blocks used to split the original image. For the decomposition at 50:50, the best accuracy, 78.43%, is achieved with 7 × 7 blocks. Similarly, better results are always obtained
Table 2 Classification performance by 1-NN classifier and LBP uniform on the HGM-4 dataset

Number of blocks | 50:50 | 70:30 | 80:20 | 90:10   (decomposition Train:Test)
1 × 1 | 57.35 | 61.90 | 63.38 | 62.29
2 × 2 | 72.87 | 77.84 | 77.91 | 79.58
3 × 3 | 75.81 | 79.51 | 81.30 | 81.97
4 × 4 | 77.39 | 80.90 | 83.28 | 83.54
5 × 5 | 77.83 | 82.46 | 83.33 | 85.16
6 × 6 | 78.35 | 83.22 | 84.01 | 85.79
7 × 7 | 78.43 | 83.02 | 85.73 | 86.58
by using this number of blocks. This confirms the block-division approach for extracting LBP uniform features, as in [15].
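The evaluation protocol described above (hold-out splits of 50:50 to 90:10, classified with a 1-NN classifier) can be reproduced in outline as follows. The scikit-learn calls and the random feature matrix are illustrative stand-ins for the actual HGM-4 feature vectors; 177 dimensions corresponds to the 1 × 1 block case mentioned earlier.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

def holdout_accuracy(X, y, test_ratio):
    """Hold-out evaluation with a 1-NN classifier for one train:test split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_ratio,
                                              stratify=y, random_state=0)
    clf = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# Stand-in data: 1040 samples (26 gestures x 40 images) with 177-dim random features
X = np.random.rand(1040, 177)
y = np.repeat(np.arange(26), 40)
for ratio in (0.5, 0.3, 0.2, 0.1):          # 50:50, 70:30, 80:20, 90:10 splits
    print(f"test ratio {ratio}: accuracy {holdout_accuracy(X, y, ratio):.3f}")
```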
3 Conclusion

This paper presented an approach for hand gesture recognition under multi-view cameras. The LBP uniform descriptor is used to perform feature extraction from color images of the HGM-4 benchmark dataset. As future work, this study will first be extended to enhance the recognition rate by fusing several local image descriptors and deep features. Second, a fusion scheme should be proposed for capturing all information from the different cameras.
References 1. Chansri C, Srinonchat J (2016) Hand gesture ecognition for Thai sign language in complex background using fusion of depth and color video. Procedia Comput Sci 86:257–260 2. Chaudhary A (2018) Light invariant hand gesture recognition. In: Robust hand gesture recognition for robotic hand control, pp 39–61. Springer 3. Dinh DL, Kim JT, Kim TS (2014) Hand gesture recognition and interface via a depth imaging sensor for smart home appliances. Energy Procedia 62:576–582 4. Hoang VT (2020) HGM-4: a new multi-cameras dataset for hand gesture recognition. Data Brief 30:105676 5. Just A, Marcel S (2009) A comparative study of two state-of-the-art sequence processing techniques for hand gesture recognition. Comput Vis Image Underst 113(4):532–543 6. Lee AR, Cho Y, Jin S, Kim N (2020) Enhancement of surgical hand gesture recognition using a capsule network for a contactless interface in the operating room. Comput Methods Programs Biomed 190;105385 (Jul 2020) 7. Nhat HTM, Hoang VT (2019) Feature fusion by using LBP, HOG, GIST descriptors and Canonical Correlation Analysis for face recognition. In: 2019 26th international conference on telecommunications (ICT), pp 371–375 (Apr 2019)
8. Ojala T, Pietikäinen M, Harwood D (1996) A comparative study of texture measures with classification based on featured distributions. Pattern Recogn 29(1):51–59 9. Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution grayscale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987 10. Pisharady PK, Saerbeck M (2015) Recent methods and databases in vision-based hand gesture recognition: a review. Comput Vis Image Underst 141:152–165 11. Poon G, Kwan KC, Pang WM (2018) Real-time multi-view bimanual gesture recognition. In: 2018 IEEE 3rd international conference on signal and image processing (ICSIP), pp 19–23. IEEE, Shenzhen (Jul 2018) 12. Poon G, Kwan KC, Pang WM (2019) Occlusion-robust bimanual gesture recognition by fusing multi-views. Multimedia Tools Appl 78(16):23469–23488 13. Ruffieux S, Lalanne D, Mugellini E, Abou Khaled O (2014) A survey of datasets for human gesture recognition. In: International conference on human-computer interaction,pp 337–348. Springer 14. Tavakoli M, Benussi C, Alhais Lopes P, Osorio LB, de Almeida AT (2018) Robust hand gesture recognition with a double channel surface EMG wearable armband and SVM classifier. Biomed Signal Process Control 46:121–130 15. Van TN, Hoang VT (2019) Kinship verification based on local binary pattern features coding in different color space. In: 2019 26th international conference on telecommunications (ICT), pp 376–380 (Apr 2019)
Custom IP Design for Fault-Tolerant Digital Filters for High-Speed Imaging Devices Somashekhar Malipatil, Avinash Gour, and Vikas Maheshwari
Abstract Digital filters are most commonly used in signal processing and communication systems. Fault-tolerant filters are required when the system is unreliable. Many methodologies have been proposed to defend digital filters from errors. In this paper, a fault-tolerant finite impulse response (FIR) filter has been designed using error-correcting codes and Hamming codes, coded efficiently in the hardware description language Verilog. We have designed a custom IP for the fault-tolerant digital filter with reduced power dissipation and high speed. This work concentrates on creating and packaging custom IP. The proposed custom IP fault-tolerant digital filter is synthesized in Xilinx Vivado 2018.3, targeting the Xilinx Zynq-7000 SoC ZC702 evaluation board. Keywords Custom IP · Vivado 2018.3 · Xilinx Zynq-7000 · Fault tolerance · FIR filter · VLSI · Verilog · ECC · Hamming codes
S. Malipatil (B) · A. Gour, Department of Electronics & Communication Engineering, Sri Satya Sai University of Technology & Medical Sciences (SSSUTMS), Sehore, Madhya Pradesh, India, e-mail: [email protected]; V. Maheshwari, Department of Electronics & Communication Engineering, Bharat Institute of Engineering & Technology, Hyderabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_24

1 Introduction

Digital filters are essential devices in digital signal processing systems and are recently used in several applications such as video processing, wireless communications, image processing, and many imaging devices. The use of digital circuits is increasing exponentially in space, automotive, and medical applications, where reliability is critical. In such designs, a designer has to adopt some degree of fault tolerance. This requirement increases further in modern CMOS technologies, which are prone to soft errors and manufacturing variations [1]. The generally used hardware redundancy techniques are double modular redundancy (DMR) and triple modular redundancy (TMR) [2]. These methods are suitable to
identify or detect errors, but they consume more area to implement. Their names themselves indicate double and triple, so they need similar structures in parallel to detect faults. In [3], the authors proposed an FIR filter using reduced-precision replicas, designed to minimize the cost of implementing modular redundancy. Some researchers used different implementation methodologies using only one redundant module to rectify errors [4]. A newer method to protect parallel filters, widely used in modern signal processing, applies ECC to the outputs of the parallel filters to identify and correct errors. A discrete-time filter [5] is implemented by Eq. (1), where $Y[n]$ represents the output, $x[n]$ is the input signal, and $h[i]$ represents the impulse response:

$$Y[n] = \sum_{i=0}^{\infty} x[n-i] \cdot h[i] \quad (1)$$
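Equation (1) can be illustrated with a short direct-form FIR routine; the coefficient values below are arbitrary examples, not the coefficients of the filter implemented in this work.

```python
def fir_filter(x, h):
    """Direct-form FIR filter: y[n] = sum_i h[i] * x[n - i] (Eq. (1)),
    with a finite number of taps len(h)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, coeff in enumerate(h):
            if n - i >= 0:
                acc += coeff * x[n - i]
        y.append(acc)
    return y

# Example: 4-tap moving-average filter applied to a short input sequence
h = [0.25, 0.25, 0.25, 0.25]
x = [1, 2, 3, 4, 5, 6]
print(fir_filter(x, h))
```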
In a system on a chip, custom IP blocks are used to improve productivity. Custom IP consists of pre-designed blocks that are easy to reuse and can be incorporated into bigger designs. Custom IP is divided into two types: hard IP and soft IP. Usually, hard IP comes with a pre-designed layout, whereas soft IP comes as a synthesizable module [6].
2 Proposed Work

In the world of embedded systems, it is the engineer's responsibility to minimize cost and increase performance. To achieve this, our own intellectual properties (IPs) have been designed. In this work, a fault-tolerant FIR filter has been designed, along with a custom IP for the fault-tolerant FIR filter. Firstly, the FIR filter using error-correcting codes and Hamming codes has been designed with efficient coding in Verilog. The block diagram of the fault-tolerant FIR filter is shown in Fig. 2. This design includes five main blocks: code generation, syndrome, memory, bit, and info blocks. The design uses a seven-bit codeword. The block diagram of the information block is shown in Fig. 3; internally, it consists of a multiplexer. The block diagram of the syndrome is shown in Fig. 5. Table 1 shows the properties of the designed custom IP (Figs. 1 and 4).
Table 1 Properties of IP

IP properties (fault_tolerant_v1_0)
Version | 1.0 (Rev. 2)
Description | fault_tolerant_v1_0
Status | Production
License | Included
Change log | View change log
Vendor | Xilinx, Inc
VLNV | xilinx.com:user.fault_tolerant:1.0
Fig. 1 Block diagram of FIR filter
Fig. 2 Block diagram of fault-tolerant FIR filter
Fig. 3 Block diagram of information block
Fig. 4 Internal structure of the information block
Fig. 5 Syndrome
In this paper, a fault-tolerant digital FIR filter have been designed with reduced power and area efficient by using ECC codes and avoiding TMR and DMR methodologies and also designed custom IP for fault-tolerant FIR filter using Xilinx Vivado 2018.30 version and implemented on Xilinx Zynq-7000 SoC ZC702 evaluation board. This proposed custom IP produces similar outcomes as that of the existing module. The total on-chip power consumption is 1.803 W including dynamic and static power consumption. The area has analyzed based on resource utilization.
In this design, error-correcting codes and Hamming codes have been used to design the fault-tolerant FIR filter, and a custom IP has also been designed. The designed IP is reusable and cost efficient. In this design, the check bits are produced by an XOR tree corresponding to the G matrix. The syndrome is generated by an XOR network corresponding to the H matrix. No error is detected if the syndrome is the zero vector.

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix} \quad (2)$$

$$H = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix} \quad (3)$$
The encoding is obtained by Eq. (4), and an error is detected by computing Eq. (5):

$$out = x \cdot G \quad (4)$$

$$s = out \cdot H^{T} \quad (5)$$
The syndrome structure internally consists of XOR gates, multiplexers, and latches. It scans for errors, and if no error is found, the signals memen1 and write are set to logic '1' (Figs. 6 and 7).
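The encoding and checking of Eqs. (2)–(5) can be illustrated in software as follows. The bit ordering follows the G and H matrices given above; the single-bit error-correction step is a standard Hamming-code consequence, added here for illustration rather than as a description of the Verilog implementation.

```python
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 1, 0],
              [0, 0, 1, 0, 1, 0, 1],
              [0, 0, 0, 1, 0, 1, 1]])
H = np.array([[1, 1, 1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0, 1, 0],
              [1, 0, 1, 1, 0, 0, 1]])

def encode(data_bits):
    """Eq. (4): out = x . G (mod 2) turns 4 data bits into a 7-bit codeword."""
    return np.mod(np.dot(data_bits, G), 2)

def syndrome(codeword):
    """Eq. (5): s = out . H^T (mod 2); an all-zero syndrome means no error detected."""
    return np.mod(np.dot(codeword, H.T), 2)

def correct(codeword):
    """Flip the single bit whose H column matches the syndrome (if any)."""
    s = syndrome(codeword)
    if not s.any():
        return codeword
    for col in range(H.shape[1]):
        if np.array_equal(H[:, col], s):
            fixed = codeword.copy()
            fixed[col] ^= 1
            return fixed
    return codeword  # more than one error: not correctable by this code

# Example: encode 4 data bits, inject a single-bit error, detect and correct it
data = np.array([1, 0, 1, 1])
cw = encode(data)
cw_err = cw.copy(); cw_err[2] ^= 1
print(cw, syndrome(cw), correct(cw_err))
```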
Fig. 6 Package IP
Fig. 7 Final designed custom IP for fault-tolerant FIR filter
3 Results and Discussions

3.1 Synthesis Results

See Table 2 and Graph 1.

Table 2 Resource utilization summary

Resource | Estimation | Available | Utilization in %
LUT | 17 | 53,200 | 0.03
FF | 7 | 106,400 | 0.01
BRAM | 0.50 | 140 | 0.36
IO | 36 | 200 | 18.00
BUFG | 1 | 32 | 3.13

Graph 1 Resource utilization summary
Fig. 8 Power analysis
3.2 Power Analysis

The power analysis is shown in Fig. 8. A total on-chip power of 1.803 W is achieved, which is the combination of both dynamic and static power consumption. Dynamic power accounts for 92% of the total; this is the useful component, consumed when the device is operating. Static power consumption has been reduced to about 8%. The following values were obtained for the design:
Total on-chip power: 1.803 W
Junction temperature: 45.8 °C
Thermal margin: 39.2 °C (3.3 W)
Effective θJA: 11.5 °C/W
Power supplied to off-chip devices: 0 W
4 Implementation Results

The implementation has been done on the Xilinx Zynq-7000 SoC ZC702 evaluation board, using the Zynq-7000 device, package clg484, and speed grade −1. Table 3 shows that all nets are fully routed and there are no unrouted nets (Figs. 9, 10 and 11).
Table 3 Implementation summary

Route status | Summary
Conflict nets | 0
Unrouted nets | 0
Partially routed nets | 0
Fully routed nets | 59
Fig. 9 Floor planning
5 Conclusion

In this paper, a fault-tolerant digital FIR filter has been designed with reduced power and improved area efficiency by using ECC codes while avoiding TMR and DMR methodologies; a custom IP for the fault-tolerant FIR filter has also been designed using Xilinx Vivado 2018.3 and implemented on the Xilinx Zynq-7000 SoC ZC702 evaluation board. The proposed custom IP produces outcomes similar to those of the existing module. The total on-chip power consumption is 1.803 W, including dynamic and static power consumption. The area has been analyzed based on resource utilization, and our own intellectual properties (IPs) have been designed. This type of fault-tolerant filter is used in space, automotive, and medical applications, where reliability is critical.
Fig. 10 IO planning
Fig. 11 Simulation results
References 1. Gao Z et al (2014) Fault tolerant parallel filters based on error correction codes. IEEE Trans Very Large Scale Integr (VLSI) Syst 2. Somashekhar, Vikas Maheshwari, Singh RP (2019) A study of fault tolerance in high speed VLSI ciruits. Int J Sci Technol Res 8(08) (Aug) 3. Shim D, Shanbhag N (2006) Energy-efficient soft error-tolerant digital signal processing. IEEE Trans Very Large Scale Integr (VLSI) Syst 14(4):336–348 (Apr) 4. Reviriego P, Bleakley CJ, Maestro JA (2011) Strutural DMR: a technique for implementation of soft-error-tolerant FIR filters. IEEE Trans Circuits Syst Exp Briefs 58(8):512–516 (Aug) 5. Oppenheim AV, Schafer RW (1999) Discrete time signal processing. Prentice-Hall, Upper Saddle River, NJ, USA 6. Software manual Vivado Design Suite Creating and Packaging Custom UG973 (v2018.3) December 14, 2018, [online] Available: www.xilinx.com 7. Vaisakhi VS et al (2017) Fault tolerance in a hardware efficient parallel FIR filter. In: Proceeding of 2018 IEEE ınternational conference on current trends toward converging technologies. 978– 1–5386–3702–9/18/$31.00 © 2017 IEEE 8. Nicolaidis M (2005) Design for soft error mitigation. IEEE Trans Device Mater Rel 5(3):405– 418 (Sept) 9. Kanekawa N, Ibe EH, Suga T, Uematsu Y (2010) Dependabilitu in electronic systems: mitigation of hardware failures, soft errors, and electro-magnetic disturbances. Springer, NewYork, NY, USA 10. Lin S, Costello DJ (2004) Error control coding, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ, USA 11. Cheng C, Parhi KK (2004) Hardware efficient fast parallel FIR filter structures based oniterated short convolution. IEEE Trans Circuits Syst I: Regul Pap 51(8) (Aug) 12. Somashekhar, Vikas Maheshwari, Singh RP (2019) Analysis of micro inversion to improve fault tolerance in high speed VLSI circuits. Int Res J Eng Technol (IRJET) 6.03 (2019):5041–5044 13. Gao Z, Yang W, Chen X, Zhao M, Wang J (2012) Fault missing rate analysis of the arithmetic residue codes based fault-tolerant FIR filter design. ˙In: Proc. IEEE IOLTS, June 2012, pp 130–133 14. Somashekhar, Vikas Maheshwari, Singh RP (2020) FPGA ımplementation of fault tolerant adder using verilog for high speed VLSI architectures. Int J Eng Adv Technol (IJEAT) 9(4)a. ISSN: 2249–8958 (Apr) 15. Hitana T, Deb AK (2004) Bridging concurrent and non-concurrent error detection in FIR filters. ˙In: Proc. Norchip Conf., Nov 2004, pp 75–78. https://doi.org/https://doi.org/10.1109/ NORCHP.2004.1423826 16. Ponta’relli s,Cardarilli GC,Re M, Salsano (2008) Totally fault tolerant RNS based FIR filters. ˙In: Proc.14th IEEE Int On-Line Test Symp (IOLTS), July 2008, pp 192–194 17. Kumar NM (2019) Energy and power efficient system on chip with nanosheet FET. J Electron 1(01):52–59
A Novel Focused Crawler with Anti-spamming Approach & Fast Query Retrieval Ritu Sachdeva and Sachin Gupta
Abstract Web pages are growing at the scale of terabytes or even petabytes day by day. In the case of a small Web, it is an easy task to answer a query, whereas robust mechanisms for storage, searching, and anti-spamming are needed in the case of large volumes of data. This study gives a novel approach for the detection of malicious URLs and fast query retrieval. The proposed focused crawler checks each URL before it enters the search engine database. It discards malicious URLs but allows benign URLs to enter the search engine database. The detection of malicious URLs is done via the proposed URL feature set, which is created by selecting those URL attributes that are susceptible to spammers. Thus, a non-malicious database is created. The searching process is performed through this search engine database by triggering a query. The search time taken by the proposed crawler is less than that of the base crawler. The reason is that the proposed focused crawler uses a trie data structure for storing fetched results in the Web repository instead of the HashSet data structure used by the base crawler. Based on the computed average search time (for ten queries), it is observed that the proposed focused crawler is 12% faster than the base crawler. To check the performance of the proposed focused crawler, the quality parameters precision and recall are computed and found to be 92.3% and 94.73%, respectively. The detection accuracy is found to be 90% with an error rate of 10%. Keywords HashSet · Trie · Focused crawler · Base crawler · Search engine · Malicious URL · Lexical · Content
R. Sachdeva (B) · S. Gupta, Department of Computer Science, MVNU, Palwal, India, e-mail: [email protected]; S. Gupta, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_25

1 Introduction

Technically, information retrieval (IR) [1] is the discipline of searching for specific data within a document, for documents themselves, as well as for metadata (text, image, or sound data, or
databases). A search engine is a type of information retrieval system that lists the relevant documents for specified keywords by the use of a spider robot [2]. The process of crawling is initiated with a list of URLs, called seed URLs. A Web crawler maintains a queue of pages to be downloaded, called the frontier. The seed set initializes the frontier (done manually). A URL from this seed collection is selected and submitted to the downloader to fetch the corresponding Web page. The fetched pages are utilized by the indexer module. This is a continuous process in which URLs extracted from downloaded pages are fed to the URL frontier for further downloading until the frontier becomes empty. Figure 1 illustrates how a Web crawler functions. The main components of the crawler are the URL frontier, DNS resolution, the fetch module, the parsing module, and the URL duplicate eliminator. The URL frontier is the collection of URLs that are to be fetched next in the crawl. A DNS resolution module is used to determine the IP address of the Web server specified by a URL in the URL frontier. A fetch module uses the hypertext transfer protocol (HTTP) to extract the Web page. A parsing module takes the Web page as input, extracting from it the text and the collection of hyperlinks. The URL duplicate eliminator checks the availability of links in the frontier and discards a link if it has already been fetched. The robots template is used to determine whether or not retrieval of the Web page is allowed.

Fig. 1 Process of crawling (in general)
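A minimal sketch of this crawl loop is shown below. It uses only the Python standard library; the seed URL, the page limit, and the simple regular-expression link extraction are illustrative simplifications of a real crawler's fetch, parse, and duplicate-elimination modules.

```python
import re
import urllib.request
import urllib.robotparser
from collections import deque
from urllib.parse import urljoin

def crawl(seed_urls, max_pages=20):
    """Breadth-first crawl: pop a URL from the frontier, check robots.txt,
    fetch the page, extract links, and push unseen links back on the frontier."""
    frontier = deque(seed_urls)          # URL frontier initialized with the seed set
    seen = set(seed_urls)                # URL duplicate eliminator
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url(urljoin(url, "/robots.txt"))
        try:
            rp.read()
            if not rp.can_fetch("*", url):
                continue                 # retrieval disallowed by the robots template
            html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue                     # unreachable host or fetch error
        pages[url] = html
        for link in re.findall(r'href=["\'](.*?)["\']', html):
            absolute = urljoin(url, link)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages

# Example (hypothetical seed URL):
# pages = crawl(["https://example.com/"])
```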
1.1 Focused Crawler

Chakrabarti et al. proposed the focused crawler [3]. It is composed of a hypertext classifier, a distiller, and a crawler. The classifier makes appropriate decisions about the expansion of links on crawled pages, while the distiller calculates a measure of the centrality of crawled pages to determine visit priorities. The search function of
the crawler uses dynamically reconfigurable priority controls and is managed by the distiller and the classifier.
1.2 Spamming in Consonance with Malicious URLs

Today's spammers target the URL to induce spamming in Web pages. Such URLs are called malicious URLs. These URLs are difficult for the end user to detect, and through them user data is illegitimately accessed. Malicious URLs have resulted in cyberattacks and unethical behavior such as breaches of confidential and secure content and the installation of ransomware on user devices, causing massive losses worldwide each year. Benign URLs can be converted into malign URLs by obfuscation, a technique used to mask malicious URLs. It is reported that about 50 million Web users visit malicious Web sites. Blacklisting, heuristic classification, and other traditional filtering techniques based on keyword and URL syntax matching are used to reveal malicious URLs, but these techniques are inefficient at coping with modern Web technologies and access techniques and at detecting newly crafted malicious URLs.
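A simple illustration of lexical URL features of the kind discussed above (and used by the studies in Table 1) is given below; the chosen attributes and keyword list are assumptions for demonstration, not the exact feature set proposed in this paper.

```python
import re
from urllib.parse import urlparse

SUSPICIOUS_WORDS = ("login", "verify", "update", "free", "bank")   # illustrative list

def lexical_features(url):
    """Extract a few lexical attributes commonly used to flag malicious URLs."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}(:\d+)?", host)),
        "num_subdirs": parsed.path.count("/"),
        "suspicious_words": sum(w in url.lower() for w in SUSPICIOUS_WORDS),
    }

print(lexical_features("http://192.168.10.5/secure-login/verify.php?acc=123"))
```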
1.3 Query Processing & Role of Data Structure in Searching

In response to queries, the crawler locates the related Web pages and downloads their content, which is stored on disk for further processing. These results are usually stored in a database repository in the form of an inverted index, hash tables, etc., to help process user queries in the future. However, the Web index generated must be compact; i.e., the memory requirements for index storage should be small. The main challenges are improving query performance by handling queries faster and providing faster results over trillions of Web documents. Kanagavalli [4] discussed several data structures, characterized by their storage, processing, and description, that are used for storing the data. The author noted that hash tables are mostly used as the storage structure. An efficient storage data structure leads to optimization of the search engine, and ultimately the whole process of generating final results is accelerated.
1.4 HashSet Versus Trie

A HashSet is an unordered, special array of elements. It is implemented by means of a hash table. A HashSet contains a set of objects in such a way that the user can quickly and easily decide whether an object is already in the set or not. It does this by managing an array internally and storing each object using an index calculated from the object's hashcode. A HashSet also provides standard set operations such as union,
symmetric difference, and intersection. The methods add, delete, and contains have constant time complexity O(1). A HashSet has an internal hash structure in which objects can be easily searched and identified. It does not preserve the order of elements, and there is no access by indices; either an enumerator or built-in functions can be used to access elements. Built-in functions convert the HashSet to a list and iterate through it. Iterating through a HashSet (or obtaining an object by index) is therefore very slow, particularly in the case of large text queries. Moreover, HashSets are not cache friendly.

A trie is a dynamic ordered tree data structure used to store a set or associative array in which the keys are normally strings. These strings are arranged in lexicographic order. The search complexity for a key of length m is O(m) in the worst case. Updating a trie is quite simple: insertion starts with a search, and when a node with no matching edge to follow is reached, a node is added with the remaining string on the edge to this node. A trie can be better represented in a compressed or compact form. A compressed or compact representation of a trie merges every chain of edges that has no branches (the nodes between these edges are of degree one) into one edge, labeled with the string of characters of the merged edges, i.e., labeling the resulting path. In the particular case of a compact binary trie, the total number of nodes is 2n − 1, as in a full binary tree, where n strings are represented by the trie.
1.5 Advantages of Trie Data Structure Over HashSet Tries are a special and useful data structure organized around string prefixes. They therefore help in searching for a value whose key shares the longest possible prefix with a given key, and they can be used for determining the association of a value with a group of keys that share a common prefix. They are used to signify the "retrieval" of data. Strings are placed in a top-to-bottom manner based on their prefixes in a trie: all prefixes of length 1 are stored at level 1, all prefixes of length 2 at level 2, and so forth. So, a trie is considered a better data structure than a HashSet for faster searching of strings. Tries are typically useful when dealing with a group of strings rather than individual strings. The search, insert, and delete operations have complexity O(L), where L is the length of a key. A trie is faster because of the way it is implemented: there is no need to compute any hash function and there is no collision handling. It prints all words in alphabetical order. Tries are space efficient when storing lots of words that start with a similar pattern, and they may reduce the overall storage cost by storing shared prefixes once. Thus, a trie can quickly answer queries about words with shared prefixes, resulting in efficient prefix queries.
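As an illustration of the operations discussed above, the following is a minimal Python sketch of a trie with insert, exact search, and prefix search; it is only an illustrative example, not the implementation used in this work.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # one outgoing edge per character
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):          # exact lookup, O(L) in the key length L
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):   # prefix query
        return self._walk(prefix) is not None

    def _walk(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

trie = Trie()
for word in ["cab", "cabs", "care", "books"]:
    trie.insert(word)
print(trie.search("cab"), trie.starts_with("ca"))   # True True
```

Unlike a HashSet, the same walk that answers an exact lookup also answers prefix queries, which is the property exploited for faster keyword retrieval in the proposed crawler.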
Table 1 Studies on URL feature-based crawlers

S. no | Author | Study done
1 | Justin [14] | Proposed a real-time framework that gathers lexical & host-based URL features and pairs it to a wide Webmail provider with a real-time stream of labeled URLs
2 | Xu [15] | Put limelight on malicious Web sites & their detection via a novel cross-layer method, handling adaptive attacks as well as statistical characteristics with the help of lexical, host, and content features
3 | Choi [16] | Proposed a technique that spans discriminative features like link structures, textual properties, Web page contents, network traffic, DNS information, etc., considering URL lexical features
4 | Popescu [17] | Focuses on machine learning for detecting malicious URLs. Moreover, FPR & detection rate is also calculated
5 | Mamun [18] | Described an approach using lexical analysis. Blacklist as well as obfuscation methods are also discussed
6 | Vanhoenshoven [19] | Detection of malicious URLs using a machine learning technique. Performance is checked via reliability, TP, TN, FP & FN
7 | Sridevi [20] | Worked on malicious URL at browser level using blacklisting, heuristic, as well as machine learning techniques
8 | Chong [21] | Detection of malicious URLs using lexical and Javascript features of URL via support vector machine
9 | Patil [22] | Proposed a multi-classification technique considering 42 features of spam, malware URLs, and phishing
10 | Naveen [23] | Detection of malicious URLs using keyword matching and URL syntax matching, i.e., syntactic, lexically, as well as semantically
11 | Sahoo [24] | Framed a malicious URL detection system using lexical, context, host, as well as popular features of URL
2 Literature Survey 2.1 Studies Done on URL Feature-Based Crawlers A URL has many features, such as lexical features, host-based features, content-based features, popularity features, and context features, on the basis of which a spammed URL can be detected. Table 1 shows a summary of related work.
2.2 Studies Were Done on the Usage of Data Structures as Storage Unit in Searching Shishibori [5] discussed the implementation of binary tries as a fast access method by converting binary tries into a compact bitstream for reducing memory space.
Moreover, he explained that it takes ample time to search (due to the pre-order bitstream) and update large key sets, and thereby the time and cost of each process rise for large key sets. Kanagavalli [4] discussed various data structures required in information retrieval over trillions of data items. The author explains that data structures can be process-oriented, descriptive, or storage-oriented in this case. Response time as well as the quality of the system is used to define its performance. Heinz [6] proposed a new data structure called the burst trie for string keys, which is faster than a trie but slower than a hash table. Shafiei [7] discussed the Patricia trie in the context of shared-memory systems by implementing insertion, deletion, and replacement operations. This proposed work is also suitable for the storage of unbounded-length strings with flags but is avoided due to its high memory consumption. Thenmozhi [8] analyzed the efficiency of various data usage models for tree- and trie-based implementations under different hardware and software configurations, such as the size of RAM & cache, as well as the speed of the physical storage media. Andersson [9] discussed the string searching problem using a suffix tree compressed at the level, path, and data. It is very effective for large texts because it decreases the number of accesses to slow secondary memory while simultaneously limiting main memory usage. Grossi [10] proposed fast compressed tries through path decompositions with less memory space and latency. Nilsson [11] implemented a dynamic compressed trie, i.e., the LPC trie, with level and path compression. A comparison with a balanced BST showed that the search time is better due to the small average depth, but the memory usage of the balanced BST and the LPC trie is similar. So, the LPC trie is a good choice for an order-preserving data structure where very quick search operations are necessary. Shishibori [12] proposed a strategy for compressing Patricia tries into a compact data structure, i.e., a bitstream. But the compact Patricia trie stores information about eliminated nodes, so large storage is required to implement it. The study also evaluates the space and time efficiency. Nakavisute [1] suggested an approach for optimizing information retrieval (IR) time or database search time using a BST & a doubly linked list. Mangla [13] proposed a method named context-based indexing in an IR system using a BST that solves the large search space problem by indexing, stemming, and removal of stop words in the case of large documents.
3 Proposed Focused Crawler The proposed focused crawler or classifier is based on selected attributes of different URL features, namely lexical features, JavaScript features, content features & popularity-based features. The selected features are susceptible to spammers to a lesser or greater degree. An experimental analysis over different Web sites is done for each chosen feature, and a weight is assigned based on the existence of these attributes. If an attribute exists in most of the malicious URLs, its significance is high, so the weight assigned to that attribute is high, and vice versa; for fewer occurrences of any attribute of a URL, less weight is assigned. Then, an average value has been set for each feature unit (mentioned in Table 2). The total sum of the average weights of
Table 2 Feature representation of proposed focused crawler

Features | Category | Average value | Status assumption
Count of dots in URL | Lexical | 0.1 | >3
Length of primary domain | Lexical | 0.1 | >10
Length of URL | Lexical | 0.1 | >30
Keywords like "Confirm"/"Banking" | Popularity-based | 0.1 | Existing
Escape() | Javascript | 0.4 | Equal to 4
Eval() | Javascript | |
Link() | Javascript | |
Unescape() | Javascript | 0.2 | Equal to 2
Exec() | Javascript | |
Search() | Javascript | |
"Scripting.FileSystemObject" | DHTML | |
"WScript.Shell" | DHTML | |
"Adodb.Stream" | DHTML | |
all the attributes is 1. Then, for the detection of malicious URLs, multiple malicious URLs are analyzed, and the threshold value is determined based on the sum of the weights of the attributes occurring in the provided URLs. It is found to be 0.7. Thus, the system works over a mathematical range from 0 to 1 and differentiates benign and malign URLs statistically. Zero depicts that the URL is benign, while a value greater than 0.7 (the threshold value) shows that the URL is malign.
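A minimal Python sketch of this weighted scoring idea is given below. The exact per-attribute checks, the grouping of the Javascript/DHTML attributes, and the helper names are assumptions made for illustration; only the weights and the 0.7 threshold follow the description above.

```python
from urllib.parse import urlparse

# Hypothetical attribute groups; the grouped 0.4 / 0.2 contributions are an
# illustrative reading of Table 2, not the authors' exact rule.
JS_FUNCS = ("escape(", "eval(", "link(", "unescape(", "exec(", "search(")
DHTML_OBJS = ("scripting.filesystemobject", "wscript.shell", "adodb.stream")
KEYWORDS = ("confirm", "banking")
THRESHOLD = 0.7

def url_score(url, page_source=""):
    score = 0.0
    host = urlparse(url).netloc
    src = page_source.lower()
    if url.count(".") > 3:
        score += 0.1                                  # count of dots in URL
    if len(host) > 10:
        score += 0.1                                  # length of primary domain
    if len(url) > 30:
        score += 0.1                                  # length of URL
    if any(k in url.lower() for k in KEYWORDS):
        score += 0.1                                  # popularity-based keywords
    if sum(f in src for f in JS_FUNCS) >= 4:
        score += 0.4                                  # Javascript feature group
    if sum(o in src for o in DHTML_OBJS) >= 2:
        score += 0.2                                  # DHTML feature group
    return score                                      # always within [0, 1]

def is_malign(url, page_source=""):
    return url_score(url, page_source) > THRESHOLD

print(is_malign("http://example.com/"))  # False: few suspicious attributes
```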
4 Methodology of Proposed Focused Crawler The proposed focused crawler works in two steps. The first step filters benign and malign URLs on the basis of the selected feature set. Malicious URLs are rejected, while benign URLs are added to the search engine database. Then, the query is triggered, and results are displayed.
• Filtration of Benign & Malign URLs: Malign and benign URLs are filtered on the interface of the multifeatured malicious URL detection system. After detection, malicious URLs are blocked, benign URLs pass the filter, and the benign URLs are entered into the database.
• Fast Query Retrieval: Later, searching is performed by triggering a query on the search interface. As it is a focused crawler, it limits the search to a domain. This interface leads to a window that not only gives search results but also gives a comparison of the search time of the base crawler and the proposed focused crawler. The base crawler uses HashSet, and the
Fig. 2 Design of proposed focused crawler
proposed focused crawler uses trie as a storage unit during searching. Moreover, the theme of the categorization of a focused crawler improves the search results. Also, it reduces crawling time as well as saves database space (Fig. 2).
5 Pseudo Code of the Proposed Methodology
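The pseudocode appears as a figure in the original; the following is a minimal Python sketch, under stated assumptions, of the two-step flow described in Sect. 4. The is_malicious stub stands in for the multifeatured detector of Sect. 3, and the nested-dictionary trie is a simplification of the storage structure, so this should be read as an illustration rather than the authors' actual pseudocode.

```python
import time

END = "$"  # marker for the end of a stored key

def trie_insert(root, key):
    node = root
    for ch in key:
        node = node.setdefault(ch, {})
    node[END] = True

def trie_search(root, key):
    node = root
    for ch in key:
        node = node.get(ch)
        if node is None:
            return False
    return END in node

def is_malicious(url):
    # placeholder for the weighted multifeatured detector of Sect. 3
    return "malware" in url

urls = ["http://dublincabs.com/", "http://swiftcabs.com/", "http://malware.example/"]
trie, hashset = {}, set()
for u in urls:
    if is_malicious(u):          # step 1: block malign URLs
        continue
    trie_insert(trie, u)         # step 2: index benign URLs for searching
    hashset.add(u)

query = "http://swiftcabs.com/"
t0 = time.perf_counter(); found_trie = trie_search(trie, query); t1 = time.perf_counter()
t2 = time.perf_counter(); found_set = query in hashset; t3 = time.perf_counter()
print(found_trie, found_set)
print("trie: %.6f ms, HashSet: %.6f ms" % ((t1 - t0) * 1e3, (t3 - t2) * 1e3))
# Timings on a toy index like this are not meaningful; the comparison reported
# in Table 5 is over the crawled dataset.
```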
Fig. 3 Interface for detecting malign and benign URLs
6 Experimental Results and Discussion 6.1 Data Source and Dataset A database containing malign and benign URLs has been downloaded from https://www.kaggle.com/antonyj453/urldataset/data. In the implementation of this classifier, 50 URLs from different domains are tested on the malicious URL detection interface of the crawler (Fig. 3). This interface filters benign and malign URLs. Malign URLs are blocked and are not allowed to enter the database, while benign URLs pass the anti-malicious barrier and are saved in the database. Then, these URLs take part in the searching process. A number of queries or keywords are passed into the search space on the searching interface, which leads to a search engine result page after searching. This window shows a comparison of the base crawler and the proposed focused crawler in terms of search time, based on the storage data structure used during searching by the base crawler, i.e., HashSet, and by the proposed focused crawler, i.e., trie.
6.2 Experimental Results
6.2.1 Detection of Malign and Benign URLs
See Table 3.
Table 3 Record of tested URLs

Sr. no | Domain | Statement of domain | Web address (URL of the Web site) | Detected (M/NM) | Actual (M/NM)
1 | D1: Cab | Different cabs; Cabs outstation | http://dublincabs.com/ | NM | NM
 | | | http://swiftcabs.com/ | NM | NM
 | | | http://chandigarhcabs.com/ | NM | M
 | | | http://www.chennaicabs33.com/ | M | NM
 | | | http://www.muskancabs.com/ | NM | NM
 | | | http://www.bizaad.com/taxitour-package | M | M
 | | | http://www.parultravels.com/ | NM | NM
 | | | http://delhicarbooking.in/ | NM | NM
 | | | http://www.innovarental.in/bangalore | M | M
 | | | http://drivegoplus.com/ | NM | NM
2 | D2: Food | Food outlets in/near Faridabad; Food items | http://motimahal.in/ | NM | NM
 | | | http://www.boloco.com/ | NM | NM
 | | | http://www.suruchirestaurants.com/ | NM | NM
 | | | http://www.amuldairy.com/ | NM | NM
 | | | http://www.foodzonepalwal.com/ | NM | NM
 | | | http://www.gulsproductions.com/ | M | M
 | | | http://www.merrymilkfoods.com | M | M
 | | | http://speyfoods.com/ | M | M
 | | | http://mcdonaldsrestaurantsducanadalte-273.foodpages.ca/ | NM | NM
 | | | http://o.foodpages.ca/ | NM | NM
3 | D3: Books | Best fiction novels; Role of books in our life | http://www.modernlibrary.com/ | NM | NM
 | | | http://www.bbc.com/ | NM | NM
 | | | http://muhich.pw/ | NM | NM
 | | | http://www.modernlibrary.com/ | NM | NM
 | | | http://www.bookspot.com/ | NM | NM
 | | | http://mcxl.se/ | NM | NM
 | | | http://www.klientsolutech.com/ | NM | NM
 | | | http://www.rusevec.com/ | NM | NM
 | | | http://www.mynewsdesk.com/ | NM | NM
 | | | http://lifestyle.iloveindia.com/ | M | NM
4 | D4: Care | Animal care centers; Health insurance | http://www.pfafaridabad.com/ | NM | NM
 | | | http://www.sanjaygandhianimalcarecentre.org/ | NM | NM
 | | | http://smallanimalcarecenter.com/ | NM | NM
 | | | http://abhyastrust.org/ | NM | NM
 | | | http://www.animalandbirdvet.com/ | NM | NM
 | | | http://www.appleton-child-care.com/ | M | M
 | | | http://sunkeyinsurance.com/ | M | M
 | | | http://nycfootdr.com/ | NM | M
 | | | http://insurancecompaniesinnewyork.com/ | M | NM
 | | | http://www.kaiserinsuranceonline.com/ | NM | NM
5 | D5: Sports | Sports arena; Sports famous in India | http://richsportsmgmt.com/ | M | M
 | | | http://2amsports.com/ | M | M
 | | | http://opensportsbookusa.com/ | NM | NM
 | | | http://raresportsfilms.com/ | NM | NM
 | | | http://www.schultesports.com/ | NM | NM
 | | | http://www.walkthroughindia.com/ | NM | NM
 | | | http://www.iloveindia.com/ | NM | NM
 | | | http://www.iaslic1955.org/ | NM | NM
 | | | http://www.indiaonlinepages.com/ | NM | NM
 | | | http://www.iccrindia.net/ | NM | NM
*Acronyms used in the table: M (malicious) & NM (non-malicious)
Table 4 Parameter values of performance factors

True positive (TP) | True negative (TN) | False positive (FP) | False negative (FN)
36 | 9 | 3 | 2
6.2.2 Computed Parameter Values
On the basis of the data tested, the different parameter values of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are obtained, where
TP = Case was positive & predicted positive, i.e., benign URLs
TN = Case was negative & predicted negative, i.e., malign URLs
FP = Case was positive, but predicted negative, i.e., malign URLs
FN = Case was negative, but predicted positive, i.e., benign URLs
The obtained values of TP, TN, FP & FN are given in Table 4.
6.2.3 Computed Search Time of Proposed Focused Crawler & Base Crawler During Searching
See Table 5.

Table 5 Output window of searching

Sr. no | Domain | Statement of domain | Search time using trie (ms) | Search time using HashSet (ms)
1 | D1: Car | Cab | 0.0012 | 0.6083
2 | D2: Food | Food | 0.0016 | 0.0201
3 | D3: Books | Books | 0.0016 | 0.0192
4 | D4: Care | Care | 0.0012 | 0.0201
5 | D5: Sports | Sports | 0.0028 | 0.0192
6 | D6: Education | Code | 0.0049 | 0.0254
7 | D7: Travel | Cab | 0.0024 | 0.0246
8 | D8: Insurance policies | Insurance | 0.002 | 0.0143
9 | D9: Animals | Animals | 0.002 | 0.0197
10 | D10: Animal care center | Animals care center | 0.002 | 0.0201
Total | | | 0.092 ms | 0.7207 ms
7 Analytical Study The proposed focused crawler is a binary-class classifier as it differentiates only two classes, i.e., benign and malign. The common binary evaluation metrics are precision, recall, false positive rate (FPR), false negative rate (FNR), detection accuracy, F-measure, and AUC. This work uses three of these parameters, i.e., accuracy, precision & recall.
7.1 Accuracy This parameter is calculated to observe the overall performance in terms of accuracy and error rate. It is determined by dividing the number of correct predictions by the total number of instances. It can be said that in the absence of any mistake (FP and FN being zero), the measure of accuracy will be 1 (100%).
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Thus, accuracy = (36 + 9)/(36 + 9 + 3 + 2) = 0.9 or 90%.
7.2 Precision (Positive Predictive Value) It is the ratio of correctly classified positive predictions. It is determined by dividing the number of correct positive predictions by the total number of positive predictions.
Precision = TP / (TP + FP)
Thus, precision = 36/(36 + 3) = 0.923 or 92.3%.
7.3 Recall (Sensitivity) It is the ratio of actually positive cases that are also identified as such. It is calculated by dividing the number of correct positive predictions by the total number of actual positives.
Recall = TP / (TP + FN)
Thus, recall = 36/(36 + 2) = 0.9473 or 94.73%.
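The three metrics above can be reproduced directly from the confusion values in Table 4; the short Python check below is only a verification of the arithmetic.

```python
TP, TN, FP, FN = 36, 9, 3, 2   # values from Table 4

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)

print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, recall={recall:.4f}")
# accuracy=0.900, precision=0.923, recall=0.9474 (reported as 94.73%)
```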
Fig. 4 Graphical analysis of search time of HashSet & trie (search time using HashSet (ms) and search time using trie (ms) for tested keywords such as cab, outstation cab, Books, Sports, and Animals)
7.4 Graphical Analysis of HashSet & Trie Several queries are made for comparing the search time using HashSet and trie storage data structures. Figure 4 graphically analyzes search time taken to search the same keyword by using HashSet and trie data structures.
8 Conclusion & Future Research Studies of McGrath and Gupta (2008) and Kolari et al. (2006) suggested that a combination of URL features should be used to develop an efficient classifier. The proposed focused crawler is based on this theory. The classifier is developed by a combination of lexical features, Javascript features, DHTML features, and popularity-based features. It successfully detects malign URLs with an accuracy of 90% and an error rate of 10%. The other metrics, precision and recall, are computed as 92.3% and 94.73%, respectively. Moreover, storing the fetched data in a trie data structure during searching leads to less search time as compared to the base crawler that uses a HashSet data structure. Thus, it speeds up the query retrieval process. This focused crawler can be made more resistant to spamming by adding a more robust URL feature set. A study on short URLs can be done for effective detection and attack-type identification, because short URLs are an increasingly common trend on microblogging sites and online social networks like Facebook, Twitter, Pinterest, etc. Implementing it via a machine-learning approach would make this classifier more dynamic.
References
1. Nakavisute I, Sriwisathiyakun K (2015) Optimizing information retrieval (IR) time with doubly linked list and binary/search tree (BST). Int J Adv Comput Eng Netw 3(12):128–133. ISSN 2320-2106
2. Lewandowski D (2005) Web searching, search engines and information retrieval. Inf Serv Use 25(3):137–147. IOS - 0167-5265
3. Soumen C, Van Den BM, Byron D (1999) Focused crawling: a new approach to topic-specific web resource discovery. Comput Net J 1623–1640
4. Kanagavalli VR, Maheeha G (2016) A study on the usage of data structures in information retrieval. http://www.researchgate.net/publication/301844333
5. Shishibori M et al (1996) An efficient method of compressing binary tries. Int J Sci Res (IJSR) 4(2):2133–2138 (IEEE). 0-7803-3280-6
6. Heinz S, Zobel J, Williams HE (2002) Burst tries: a fast, efficient data structure for string keys. ACM Trans Inf Syst 20(2):192–223
7. Shafiei N (2013) Non-blocking Patricia tries with replace operations. In: Proc. Int. Conf. Distrib. Comput. Syst. IEEE, 1063-6927, pp 216–225
8. Thenmozhi M, Srimathi H (2015) An analysis on the performance of tree and trie based dictionary implementations with different data usage models. Indian J Sci Technol 8(4):364–375. ISSN 0974-5645
9. Andersson S, Nilsson A (1995) Efficient implementation of suffix trees. Softw Pract Exp CCC 25(2):129–141. 0038-0644
10. Grossi R, Ottaviano G (2014) Fast compressed tries through path decompositions. ACM J Exp Algorithms 19(1):1.8.2–1.8.19
11. Nilsson S, Tikkanen M (1998) Implementing a dynamic compressed trie. In: Proc. WAE'98, pp 1–12
12. Shishibori M et al (1997) A key search algorithm using the compact Patricia trie. In: Int. Conf. Intell. Process. Syst. ICIPS '97. IEEE, pp 1581–1584
13. Mangla N, Jain V (2014) Context based indexing in information retrieval system using BST. Int J Sci Res Publ 4(6):4–6. ISSN 2250-3153
14. Justin MA, Saul LK, Savage S, Voelker GM (2011) Learning to detect malicious URLs. ACM Trans Intell Syst Technol 2(3):2157–6904
15. Xu S, Xu L (2014) Detecting and characterizing malicious websites. Dissertation, Univ. Texas San Antonio, ProQuest LLC
16. Choi H, Zhu BB, Lee H (2011) Detecting malicious web links and identifying their attack types. In: Proc. 2nd USENIX Conf. Web Appl. Dev. ACM, pp 1–12
17. Popescu AS, Prelipcean DB, Gavrilut DT (2016) A study on techniques for proactively identifying malicious URLs. In: Proc. 17th Int. Symp. Symb. Numer. Algorithms Sci. Comput. SYNASC 2015, 978-1-5090-4/16, IEEE, pp 204–211
18. Mamun MSI et al (2016) Detecting malicious URLs using lexical analysis, vol 1. Springer Int. Publ. AG, pp 467–482. 978-3-319-46298-1_30. http://www.researchgate.net/publication/308365207
19. Vanhoenshoven F, Gonzalo N, Falcon R, Vanhoof K, Mario K (2016) Detecting malicious URLs using machine learning techniques. http://www.researchgate.net/publication/31158202
20. Sridevi M, Sunitha KVN (2017) Malicious URL detection and prevention at browser level framework. Int J Mech Eng Technol 8(12):536–541. 0976-6359
21. Chong C, Liu D, Lee W (2009) Malicious URL detection, pp 1–4
22. Patil DR, Patil JB (2018) Feature-based malicious URL and attack type detection using multiclass classification. ISC Int J Inf Secur 10(2):141–162. ISSN 2008-2045
23. Naveen INVD, Manamohana K, Verma R (2019) Detection of malicious URLs using machine learning techniques. Int J Innov Technol Explor Eng (IJITEE) 8(4S2):389–393. ISSN 2278-3075
24. Sahoo D et al (2019) Malicious URL detection using machine learning: a survey. Association Comput Mach 1(1):1–37. arXiv:1701.07179v3
A Systematic Review of Log-Based Cloud Forensics Atonu Ghosh, Debashis De, and Koushik Majumder
Abstract Inexpensive devices that leverage cloud computing technology have proliferated the current market. With the increasing popularity and huge user base, the number of cybercrimes has also increased immensely. The forensics of the cloud has now become an important task. But due to the geographically distributed nature and multi-device capability of the cloud computing environment, the forensics of the cloud has become a challenging task. The logs generated by the cloud infrastructure provide the forensics investigator with major hints that can be followed to reconstruct the crime scene chronology. These hints are highly critical for the forensics investigator to investigate the case. But the logs are not easily accessible, or they often fail to provide any critical clues due to poor logging practices. In this paper, initially, the importance of log-based cloud forensics has been discussed. Then, a taxonomy based on the survey of the literature has been furnished. Finally, the issues in the existing log-based cloud forensics schemes have been outlined and open research problems have been identified. Keywords Cloud forensics · Digital forensics · Log forensics · Log-based cloud forensics · Issues in cloud forensics · Cloud forensics taxonomy
1 Introduction The untoward exploitation of the capability and the flexibility of the cloud computing environment has brought in the need for cloud forensics [1]. The cloud computing environment is not only capable of meeting minor general-purpose computing requirements, but the tremendous power of the cloud computing environment can be exploited by the malicious users to procure gigantic computing resources and network bandwidth to launch various attacks on or off devices and applications. Thus, there is a need for forensics investigation in the cloud computing environment. The commonly used digital forensics practices do not apply to cloud forensics due to the A. Ghosh · D. De · K. Majumder (B) Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_26
inherent properties of the cloud computing environment. The multitenant, volatile, geographically distributed nature, and the complex architecture of cloud computing hinder the forensics process in the cloud. Nevertheless, currently, the forensics investigators have extended the use of digital forensics tools to the cloud which makes the whole process of cloud forensics investigation less promising. There is a pressing need for extensive research and development in the field of cloud forensics. In this work, the literature on log-based cloud forensics published between 2011 and 2020 has been reviewed. A taxonomy based on the survey of the literature has been provided in Sect. 2. In Sect. 3, the challenges that log-based cloud forensics faces have been identified. Finally, in Sect. 4, the open research areas have been provided.
1.1 Log-Based Cloud Forensics Logs are records generated by software under execution in some format as specified by the developer. Each application, platform, or infrastructure usage is logged by the CSP for various purposes such as, but not limited to, troubleshooting and malicious activity tracking. In each of the cloud service models, logs are generated for every possible access and execution of applications or any other services provided by the CSP. The so generated logs are maintained by the CSP. These logs have the potential to reveal an enormous amount of information that might be required in various scenarios as mentioned earlier [2]. Thus, these cloud infrastructure generated logs are used by cloud forensics investigators to reconstruct the sequence of activities that have taken place in a cloud crime scene. A use case of logs generated by various systems in a network is depicted in Fig. 1. The logs may be needed to track down unauthorized access to a network by an unauthorized IP. In this scenario, the network logs from a router or a firewall can be of tremendous help to find the details of such an intrusion and possibly prosecute the intruder in the court of law. In the cloud computing environment, unauthorized access and other malicious activities, such as sensitive data theft and causing damage to other businesses over the cloud, have become quite common. In an investigation in the cloud computing environment, the logs generated can give a promising direction to the investigation and may help the prosecution to succeed, as the logs generated in the cloud provide details of the activities that have taken place in a cloud crime scene. The cloud forensics activity which makes use of the logs generated in the cloud is known as log-based cloud forensics. The users rarely have full access to logs. The CSP holds exclusive access to the logs generated in her cloud service infrastructure. But she may or may not be obliged to grant access to the logs to her users [3]. As mentioned in earlier sections, the cloud computing environment is spread all over the globe. It is a mammoth task to track down the exact location where the generated logs sit. In a cloud forensics scenario, the CSP may provide access to the logs on an order by the court of law. Since the cloud computing environment encompasses multi-jurisdiction property, it again, in turn, becomes very tough to acquire the desired logs for the forensics
Fig. 1 Log-based investigation scenario
investigation. This grant of access to the logs by the CSP to the investigators may lead to sensitive data leaks of other cloud service users. This is one of the main reasons why the CSPs do not tend to disclose the logs, as doing so might lead to a breach of the SLA between the CSP and its users. Such a breach, in turn, may defame the CSP and cause it to run out of business, let alone the jurisdictional chaos that the CSP might have to face in case a cloud service user reports a breach of the SLA to the court of law. As per a report from the DPCI, 74% of forensics investigators have raised dependency on the CSP as a concern. Also, non-uniform logging structures lead to difficulty in the identification and segregation of logs even if access is granted.
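As a small illustration of the log-based scenario described above, the snippet below scans firewall-style log lines for connections from an IP of interest. The log format, field names, and addresses are hypothetical; real router or firewall logs vary by vendor.

```python
import re

SUSPECT_IP = "203.0.113.45"   # hypothetical IP of interest
log_lines = [
    "2020-06-01T10:15:32Z DENY src=203.0.113.45 dst=10.0.0.12 dport=22",
    "2020-06-01T10:15:40Z ALLOW src=198.51.100.7 dst=10.0.0.12 dport=443",
    "2020-06-01T10:16:02Z ALLOW src=203.0.113.45 dst=10.0.0.12 dport=80",
]

# Extract timestamp, action, and source IP from each line
pattern = re.compile(r"^(?P<ts>\S+)\s+(?P<action>\S+)\s+src=(?P<src>\S+)")
for line in log_lines:
    match = pattern.match(line)
    if match and match.group("src") == SUSPECT_IP:
        print(match.group("ts"), match.group("action"), "from", SUSPECT_IP)
```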
2 Comparison with Related Work and Taxonomy In this section, this work of review has been compared with similar works by other researchers. Additionally, the proposed taxonomy constructed based on the literature review for log-based cloud forensics has been provided (Table 1).
Table 1 Comparison with related work

Contribution | Solution taxonomy | Coverage of tools | Research gap identification | Scope of literature review
This work | ✓ | ✓ | ✓ | 2011–2020
[4] | ✗ | ✓ | ✓ | 2010–2017
[5] | ✗ | ✗ | ✗ | 2011
[6] | ✗ | ✗ | ✗ | 2011–2014
[7] | ✓ | ✓ | ✗ | 2011–2018
[8] | ✗ | ✗ | ✓ | 2011–2016
2.1 Proposed Taxonomy Log-based cloud forensics research is categorized into continual forensics (discussed in Sect. 2.1.1) and sporadic forensics (discussed in Sect. 2.1.2). This high-level categorization focuses on the practice of log acquisition for forensics investigations. Forensics process models, frameworks, forensics systems, and other security and integrity systems can be the contributions made by researchers. During this literature survey, it was found that trust in third parties and the integrity of logs have been emphasized as some of the major issues in log-based cloud forensics. Logs are generated in almost all systems in the cloud infrastructure, the client devices, and the intermediate communication systems. Thus, it is not surprising that a vast literature focusing on such log-generating systems was encountered and their significance highlighted (Fig. 2).
Fig. 2 Proposed taxonomy
2.1.1 Continual Forensics
Continual forensics puts the forensicability of a system under test. It is the practice of continuous analysis of the logs generated by a forensically sound system. Unlike the post-incident forensics analysis of logs, in continual forensics, the system logs are continuously monitored for ill behavior in the system. This is a result of the development in cloud services and the research in forensics readiness of cloud systems. Due to the ever-broadening of the cybercrime landscape, several contributions have been made toward the "forensic-by-design" attribute of cloud systems. Simou et al. [9] in their work proposed a methodology for forensic readiness of cloud systems. They coined the term "forensicability" to describe a system or a service that can be designed in a forensically sound manner. They have further identified the forensic constraints, which are the concepts to be realized to make a cloud forensic ready. These constraints, when implemented, increase the forensicability of a cloud service. Kebande et al. [10, 11] have proposed a botnet-based solution for gathering logs from live systems in the cloud computing infrastructure. They proposed infecting the virtual machines with non-malicious bots that would collect live logs from the user virtual machines and provide logs for live analysis. Park et al. [12] described work environments incorporating cloud services as smart work environments. They suggested that cloud services for such work environments must implement forensic readiness as a pro-active measure of threat preemption. They further proposed their pro-active forensics model that was based on their investigation of the literature. They identified and analyzed 50 components of digital forensic readiness and designed 7 detailed areas. For validating the model designed by them, they had their model surveyed by digital forensics professionals. Finally, they deduced the areas that can be worked on for existing systems to gain forensic readiness. Datta et al. [13] proposed a machine learning-based system that ranks malicious users in a crime scene. This ranking of the suspect IPs helps eliminate the need to investigate all the IPs in a crime scene (Fig. 3). De Marco et al. [14] stated the fact that breaches in the cloud by the cloud client happen by the violation of the service level agreements. Thus, the pro-active detection of violation of the service level agreements (SLAs) has the potential to lead a forensic investigation to confidence. Based on this, they emphasized the need for automation of the detection of SLA breaches. They further proposed a framework for the development of forensic-ready cloud systems. Their framework considered the technical aspects of the SLA and monitored that the system fulfilled the service obligations. Baror et al. [15] emphasized the need for forensic readiness in the cloud. The authors stated that there must be some form of communication in natural human-understandable language, be it friendly or vindictive. Thus, they proposed a natural language processing-based framework that analyzes the language of the malicious entity and detects an ongoing cyber-attack.
Fig. 3 Continual and sporadic forensics
2.1.2 Sporadic Forensics
Sporadic forensics refers to the forensics process in which activities are carried out as a response to an incident, rather than as a continuous process of forensics preparedness activities carried out in the cloud infrastructure. Here, the forensics data is acquired at a later stage, as opposed to the continuous data acquisition for future incidents in continual forensics. Dykstra and Sherman [16] proposed "Forensic Open-Stack Tools" (FROST) for upright log extraction from Infrastructure-as-a-Service (IaaS) cloud platforms. FROST is capable of extracting logs from virtual disks, APIs, and guest firewalls. The distinct feature of FROST is that it operates in the cloud management plane and does not interact with the guest operating system. It also ensures log data integrity by maintaining hash trees. Marty [17] proposed a log-based cloud forensic framework that focuses solely on the logs generated at different levels of the cloud computing environment. The proposed model is carefully designed keeping in mind when there is a need for logging, what is being logged, and how an event is being logged. The author also emphasizes the non-existence of any standard log format and proposes the must-haves in a log record, such as the timestamp of the log record, a key-value pair format of the log entry, normalized values for making the logs ready for analysis, etc. The author has also focused on log transportation, storing logs in centralized storage, archiving the logs, and retrieving the logs when needed. Anwar and Anwar [18] showed that the system-generated logs of a cloud system can be used in a forensic investigation. They generated their log dataset by launching known attacks on Eucalyptus. Then, they analyzed the data generated by the attacks and built a system that could detect further such attacks on Eucalyptus. Roussev et al. [19] showed that the traditional forensic practices on the client-side are inefficient to be employed in cloud forensics
and it requires a new toolset for efficient forensics in the cloud computing environment. Further, the authors developed and demonstrated tools for forensics analysis of the cloud. They proposed "kumodd" for remote acquisition of cloud drives, "kumodocs" for acquisition and analysis of Google docs, and "kumofs" for remote visualization of cloud drives. Ahsan et al. [20] stated that the existing systems focus on the forensics of the traditional systems rather than the cloud computing environment. The authors then proposed their logging service system called "CLASS: cloud log assuring soundness and secrecy scheme for cloud forensics." In their proposed logging system, discrete users encrypted their logs with the help of their public key so that only the users can decrypt their logs using their private key. To avert unsanctioned alteration of logs, the authors generated a "Proof of Past Log (PPL)" by implementing Rabin's Fingerprint and Bloom Filter. Park et al. [21] affirmed that despite the extensive research and development in the field of cloud forensics, there still exist problems that have not yet been addressed. Maintaining the integrity of the log is one such area. To solve this problem of log integrity, they proposed their blockchain-based logging and integrity management system. Khan and Varma [22] in their work proposed a framework for cloud forensics taking into consideration the service models in the cloud. Their proposed system implemented pattern search and used machine learning for the extraction of features in the cloud logs. This implementation of machine learning in their work enabled the prioritization of shreds of evidence collected from the virtual machines in the cloud. Rane and Dixit [23] emphasized that log-based cloud forensics places a great deal of trust in third parties, such as when acquiring the logs from the cloud service provider (a third party in the cloud forensics investigation). Stakeholders may collude to alter the logs for their benefit. Thus, to solve this problem, the authors proposed their forensic-aware blockchain-based system called "BlockSLaaS: blockchain-assisted secure Logging-as-a-Service" for the secure storage of logs and solving the collusion problem. Further, they claimed that their proposed system provides the preservation of log integrity (Table 2).
2.1.3 CSP Provided Logs
The logs in the cloud computing environment are generated in all the levels of the service model. In the SaaS level, the user gets little or no logs at all. In the PaaS level, the user has access to some of the logs but the degree of detail in such logs is limited. The IaaS level gives the highest access to logs that can be instrumental to the process of forensics investigation in the cloud. It is the cloud service provider (CSP) who determines what logging information is provided to the cloud user. The CSP has exclusive control of the logs in the cloud computing environment. It is because of this reason, in a forensics investigation, the CSP is requested to provide the relevant logs for the investigation to proceed. The CSP is not an entity that can be completely trusted as there is a chance of collusion and alteration of logs. In the case of collusion among the CSP and the adversary, the investigation might not lead to confidence and even the wrong person might get framed [24].
Table 2 Recapitulation of contributions in continual and sporadic forensics

Continual forensics
[9] | Proposed forensic readiness model
[10, 11] | Proposed botnet-based logging system
[12] | Proposed forensics readiness model based on identified components of digital forensics
[13] | Proposed machine learning-based suspect ranking
[14] | Proposed system that alerts breach of service level agreement
[15] | Proposed NLP-based forensics system

Sporadic forensics
[16] | Proposed "FROST" for log extraction in IaaS
[17] | Provided and demonstrated logging guidelines
[18] | Emphasized the usefulness of system logs
[19] | Proposed and demonstrated forensics acquisition tools
[20] | Proposed "CLASS: cloud log assuring soundness and security scheme"
[21] | Proposed blockchain-based system that ensures the integrity of logs
[22] | Proposed machine learning-based system for prioritization of evidence
[23] | Proposed "BlockSLaaS: blockchain-assisted secure LaaS"
VM Logs These logs provide detailed clues to what has been done during an attack by an adversary. The logs might not be available, and a legal notice may need to be sent out; however, multi-jurisdictional issues might lead to the non-availability of such logs. Due to the pivotal role of logs from VMs, several researchers have contributed various methods and suggestions for VM log gathering and analysis. Thorpe et al. [25] proposed a tool called "Virtual Machine Log Auditor (VMLA)." VMLA can be used by a forensics investigator to generate a timeline of events in the virtual machine using the virtual machine logs. Zhang et al. [26] proposed a method for the detection of active virtual machines and the extraction of system logs, process information, user accounts, registry, loaded modules, network information, etc. They have experimented and have been successful with current CPUs and operating systems such as Fedora. Lim et al. [27] have emphasized the role of VMs in forensics investigation and have presented suggestions on how to forensically investigate a virtual machine based on their findings. Wahyudi et al. [28] in their research demonstrated that even when a virtual machine is destroyed, the forensically relevant data can be recovered from the host machine using Autopsy tools and FTK.
Resource Logs Performing forensics in the cloud computing environment not only requires the logs from virtual machines and host machines, but the logs from other resources such as load balancers, routers, and network firewalls are also essential. These resource logs help the forensics investigator in the reconstruction of the crime scene. The acquisition of such logs is a challenge and demands trust in the cloud service provider as the cloud infrastructure after all is owned and maintained by her. While surveying the literature, it was found that there has been a significant amount of research and development in this area too. Mishra et al. [29] emphasized the usefulness of the resource logs along with the logs from the virtual machines and the host system. In their work, they proposed the collection of virtual machine logs along with resource logs and stored the logs in a database. They further demonstrated the implementation of a dashboard for monitoring the logs for the identification of unusual activities in the virtual machines and the resources that are being logged. They performed their experiment in a private cloud implemented using Eucalyptus. Gebhardt and Reiser [30] in their research, outlined the need for network forensics in the cloud computing environment. Additionally, they have emphasized the challenges in network forensics. To solve these problems, they have proposed a generic model for forensics of the network in the cloud computing environment. They validated their proposed model by implementing a prototype with “OpenNebula” and the analysis tool called “Xplico.”
2.1.4 LaaS Provided Logs
Logs are extracted from various levels of the cloud infrastructure as well as from the devices that reside with the Internet service provider. During a forensics investigation, logs are requested from the cloud service provider as well as from the Internet service provider. But the issue of putting trust in a third party persists in such a log acquisition process. To mitigate this issue, Logging-as-a-Service has emerged. This scheme of service gathers logs and provides access to the logs to the forensics investigator through trusted, secure, and privacy-aware mechanisms, thus keeping the dependency on untrusted parties to a minimum. Khan et al. [31] have emphasized the importance of cloud logs and have proposed a "Logging-as-a-Service" scheme for the storage of outwardly gathered logs. Deployment of a logging system being expensive due to the persistence of the logs gathered, they have opted for a cloud-based solution. They deployed the logging service in the cloud, where they implement "reversible watermarking" for securing the logs. This kind of watermarking is very efficient, and any tampering of the logs can be easily detected by virtue of it. The logs are collected using Syslog, and the logs thus collected are stored for a longer stretch of time. Muthurajkumar et al. [32] have accentuated the pivotal role that logs play in forensics and the usefulness of extended storage of logs. In their work, they have implemented a system using Java and Google Drive in a secure and integrity-maintaining manner. The authors have implemented the
“Temporal Secured Cloud Log Management Algorithm” for maintaining log transaction history. The logs that they store are encrypted before storage. Batch storage of logs is implemented by the authors for seamless retrieval of the stored logs. Liu et al. [33] have outlined the importance and vulnerability of logs in the cloud computing environment. Considering the risks that persist for the log databases, the authors have proposed a blockchain-based solution for such log storage. The authors have implemented the logging system where the integrity of the logs to be stored is first verified and then the logs are stored in the log database and the hash of the logs are stored in the blockchain. Users retrieve the hashes from the blockchain and store them in a separate database called the “assistant database.” Then, the users send acceptance of the logs to the cloud service provider. Finally, the cloud service provider stores the acceptance in the log database. Patrascu and Patriciu [34] discuss the problem of logs not being consolidated. They further propose a system for the consolidation of cloud logs to help the forensics process in the cloud computing environment. The framework proposed by the authors consists of five layers. The “management layer” consists of a cloud forensics module and other cloud-related services. The “virtualization layer” consists of all virtual machines, workstations, etc. The third layer consists of the log data storage that is sent from the “virtualization layer.” The raw log data is then analyzed in the fourth layer. Finally, in the fifth layer, the analyzed and processed data are stored. Rane et al. [35] proposed an interplanetary file system (IPFS)-based logging solution. The IPFS system is used to store network and virtual machine log meta-data. The authors claim that their system provides “confidentiality,” “integrity,” and “availability.” The authors maintain an index of hashes from the IPFS system. Any tampering of data will result in new hash which will not be present in the index. Thus, providing integrity of the logs (Table 3).
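Several of the schemes above come down to keeping tamper-evident digests of the logs outside the log store (in a blockchain, an IPFS index, or a watermark). The following is a minimal hash-chain sketch of that integrity idea; it is an illustration only, not the actual protocol of any of the surveyed systems.

```python
import hashlib

def chain_digests(entries):
    """Chain each entry's SHA-256 digest into the next, so altering any stored
    entry changes every later digest."""
    digests, prev = [], b""
    for entry in entries:
        digest = hashlib.sha256(prev + entry.encode("utf-8")).digest()
        digests.append(digest)
        prev = digest
    return digests

def verify(entries, trusted_digests):
    return chain_digests(entries) == trusted_digests

logs = ["user A logged in", "VM 42 started", "object X deleted"]
trusted = chain_digests(logs)        # kept by a party other than the log store
logs[1] = "VM 42 never started"      # simulated tampering in the log store
print(verify(logs, trusted))         # False: the chain no longer matches
```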
2.1.5 Client Device Logs
Cloud services exhibit multi-device capability, i.e., the cloud services can be accessed from different kinds of devices. A quite common example of such a cloud service is cloud storage services such as Google Drive, Dropbox, etc. From the lens of forensics investigation, all such devices accessing the cloud services need to be examined for shreds of evidence. Several developments have been made in this area of research. The logs found in such client devices have been termed as “somatic logs” in the proposed taxonomy. Satrya and Shin [36] in their work proposed a method for forensics investigation of the client devices and the Google Drive application on Android devices. They demonstrated the forensics investigation by following the six steps of a digital forensics investigation. They compared various logs generated in the system such as login and logout data, install and uninstall data. Amirullah et al. [37] performed forensics analysis of client-side applications on Windows 10 devices to find remnant data on the device related to the crime scene. They discovered various kinds of data in the device such as deleted data, install and uninstall data, browser data, and memory data. They claim their success to be 82.63%. But they were unable to analyze the remaining data on the device (Table 4).
Table 3 Recapitulation of contributions in CSP provided logs and LaaS provided logs

CSP provided logs
[25] | Proposed system that generates timeline of events using VM logs
[26] | Proposed system for extraction of logs
[27] | Provided suggestions on how to perform forensics investigation of VMs
[28] | Recovered evidences from deleted VMs using Autopsy tools and FTK
[29] | Proposed system for acquisition and consolidation of logs
[30] | Proposed a generic model for forensics of network using Xplico tool

LaaS provided logs
[31] | Proposed "Logging-as-a-Service" system using "reversible watermarking" in cloud
[32] | Proposed system for secure and integrity preserving persistence of logs using Google Drive
[33] | Proposed blockchain-based solution for log storage and anonymous authentication
[34] | Proposed an extensible system for consolidation of logs for existing clouds
[35] | Proposed IPFS-based log storage system
Table 4 Recapitulation of contributions in client device logs

Client device logs
[36] | Proposed and demonstrated forensics investigation of Google Drive client Android application
[37] | Performed forensics analysis of client-side applications on Windows 10 devices
3 Challenges in Log-Based Cloud Forensics Log-based cloud forensics faces several challenges. In this section, the challenges faced by an investigator in log-based cloud forensics have been discussed.
• Synchronization of Timestamps: Timestamps in the cloud logs enable the forensics investigator to reconstruct the chain of activities that have taken place in a crime scene in the cloud. By design, cloud infrastructure is spread across the globe. This makes the logs from different systems maintain timestamps of their respective time zones. Thus, when logs from different time zones are analyzed, correlating the timestamps becomes a mammoth task (a small normalization sketch follows this list).
• Logs Spread Across Layers: Moving along the order IaaS, PaaS, SaaS, the access to logs decreases, i.e., in SaaS, the CSP provides the user with little or no log data. In PaaS, the user gets access to some extent. The highest level of (comparatively) access to the logs
is given to the user in IaaS. There is no centralized access to the logs in the cloud computing environment. Moreover, the IaaS user is only granted access to logs that the cloud service provider deems suitable. For detailed logs of the network and hardware, the cloud service provider must be requested and trusted.
• Volatile Logs: In the cloud environment, if the user can create and store data, then she can also delete the data. Because of the multi-locational nature of cloud computing, the data present at different data locations are mapped to provide abstraction and the illusion of unity to the user. When data is deleted, its mapping is also erased; this removal of the mapping happens in a matter of seconds, thus making it impossible to get remote access to the deleted data in an investigation scenario which partly relies on deleted data recovery.
• Questionable Integrity of Logs: The cloud service provider is the owner of most of the crucial logs and must be requested for access to the logs in a forensics investigation. But the integrity of the logs provided by the cloud service provider is questionable. There is always a chance of collusion among the parties involved in the forensics investigation. Moreover, the cloud service providers are bound by the service level agreements for the privacy and integrity of their clients. Thus, a cloud service provider will not be obliged to breach the service level agreements, fearing running out of business due to such a breach.
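The timestamp synchronization challenge referenced above can be illustrated with a small Python sketch that normalizes log timestamps recorded in different data-center time zones to UTC before correlating events; the zones and timestamps shown are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

records = [
    ("2020-03-01 10:15:00", "Asia/Kolkata"),
    ("2020-03-01 05:47:30", "Europe/Berlin"),
    ("2020-02-29 23:59:59", "America/New_York"),
]

for ts, zone in records:
    local = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=ZoneInfo(zone))
    print(local.astimezone(timezone.utc).isoformat())  # one common UTC timeline
```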
4 Open Research Areas in Log-Based Forensics Despite the extensive research and development surveyed in this paper, there still exist several unaddressed issues. Some of the open research areas that can be worked on for the maturing of log-based cloud forensics are as follows.
• Forensics-as-a-Service: The volume of data that must be analyzed is huge. Analysis of such a huge volume of data requires high computing power. The typical workstation-based practice of forensics analysis needs to be changed for an improved turnaround time of the cases. The elasticity of the cloud computing environment can be exploited to process such huge volumes of data. Thus, cloud-based Forensics-as-a-Service can play a pivotal role in the reduction of pending cases.
• Innovation of Tools: Cloud forensics is still practiced using traditional digital forensics tools. This makes the process inefficient and at times leads to a complete halt of the cases. Thus, there is an urgent need for specialized tools that suit the cloud computing environment, can handle the huge volumes to be analyzed, and automate the tasks in the process of forensics analysis.
• Integrity of Log Files
As discussed in Sect. 3, the integrity of the logs provided by the cloud service provider is questionable. There is an urgent need to come up with solutions that help preserve the integrity of the log files. This is because, if the logs are modified by any party, the case will not lead to confidence, and there is a chance of a righteous person being punished.
• Prioritization of Evidence to Be Analyzed: One of the major reasons for the high turnaround time of the cases is the excessively high volume of data that needs to be examined. Thus, discarding irrelevant data will speed up the process of examination. Hence, there is a need for smarter and automated tools and techniques that prioritize the data relevant to a case.
5 Conclusion The field of computing is evolving rapidly due to the introduction of cloud computing. Cloud computing has given computing a new dimension which humankind can leverage. Misuse of its tremendous power and flexibility also exists, and the cloud itself can be used to deal with the adversaries. Cloud computing has the potential to give a significant boost to forensic investigations, be they on the cloud or off the cloud. Overall, the potential is phenomenal. In this work of review, it has been explained why traditional digital forensics fails in a cloud computing environment. Based on the survey, a taxonomy has been proposed with continual and sporadic forensics as the two main types of log-based cloud forensics. The sub-types of the proposed taxonomy have been discussed in detail with coverage of tools. The challenges in log-based cloud forensics have been identified and discussed in detail. Finally, the areas of open research and development in the field of log-based cloud forensics have been identified.
References
1. Santra P, Roy A, Majumder K (2018) A comparative analysis of cloud forensic techniques in IaaS. Advances in computer and computational sciences. Springer, Singapore, pp 207–215
2. Santra P et al (2018) Log-based cloud forensic techniques: a comparative study. Networking communication and data knowledge engineering. Springer, Singapore, pp 49–59
3. Datta S, Majumder K, De D (2016) Review on cloud forensics: an open discussion on challenges and capabilities. Int J Comput Appl 145(1):1–8
4. Baldwin J et al (2018) Emerging from the cloud: a bibliometric analysis of cloud forensics studies. Cyber threat intelligence. Springer, Cham, pp 311–331
5. Ruan K et al (2011) Cloud forensics. IFIP international conference on digital forensics. Springer, Berlin
6. Sibiya G, Venter HS, Fogwill T (2015) Digital forensics in the cloud: the state of the art. In: 2015 IST-Africa conference. IEEE
7. Studiawan H, Sohel F, Payne C (2019) A survey on forensic investigation of operating system logs. Dig Invest 29:1–20
8. Khan S et al (2016) Cloud log forensics: foundations, state of the art, and future directions. ACM Comput Surv (CSUR) 49(1):1–42
9. Simou S et al (2019) A framework for designing cloud forensic-enabled services (CFeS). Requirements Eng 24(3):403–430
10. Kebande VR, Venter HS (2015) Obfuscating a cloud-based botnet towards digital forensic readiness. In: ICCWS 2015—the proceedings of the 10th international conference on cyber warfare and security
11. Kebande VR, Venter HS (2018) Novel digital forensic readiness technique in the cloud environment. Austral J Forens Sci 50(5):552–591
12. Park S et al (2018) Research on digital forensic readiness design in a cloud computing-based smart work environment. Sustainability 10(4):1203
13. Datta S et al (2018) An automated malicious host recognition model in cloud forensics. In: Networking communication and data knowledge engineering. Springer, Singapore, pp 61–71
14. De Marco L et al (2014) Formalization of SLAs for cloud forensic readiness. In: Proceedings of ICCSM conference
15. Baror SO, Hein SV, Adeyemi R (2020) A natural human language framework for digital forensic readiness in the public cloud. Austral J Forensic Sci 1–26
16. Dykstra J, Sherman AT (2013) Design and implementation of FROST: digital forensic tools for the OpenStack cloud computing platform. Digital Invest 10:S87–S95
17. Marty R (2011) Cloud application logging for forensics. In: Proceedings of the 2011 ACM symposium on applied computing
18. Anwar F, Anwar Z (2011) Digital forensics for eucalyptus. In: 2011 Frontiers of information technology. IEEE
19. Roussev V et al (2016) Cloud forensics–tool development studies & future outlook. Digital Investigation 18:79–95
20. Ahsan MAM et al (2018) CLASS: cloud log assuring soundness and secrecy scheme for cloud forensics. IEEE Trans Sustain Comput
21. Park JH, Park JY, Huh EN (2017) Block chain based data logging and integrity management system for cloud forensics. Comput Sci Inf Technol 149
22. Khan Y, Varma S (2020) Development and design strategies of evidence collection framework in cloud environment. In: Social networking and computational intelligence. Springer, Singapore
23. Rane S, Dixit A (2019) BlockSLaaS: blockchain assisted secure logging-as-a-service for cloud forensics. In: International conference on security & privacy. Springer, Singapore
24. Alex ME, Kishore R (2017) Forensics framework for cloud computing. Comput Electr Eng 60:193–205
25. Thorpe S et al (2011) The virtual machine log auditor. In: Proceeding of the IEEE 1st international workshop on security and forensics in communication systems
26. Zhang S, Wang L, Han X (2014) A KVM virtual machine memory forensics method based on VMCS. In: 2014 tenth international conference on computational intelligence and security. IEEE
27. Lim S et al (2012) A research on the investigation method of digital forensics for a VMware Workstation's virtual machine. Math Comput Model 55(1–2):151–160
28. Wahyudi E, Riadi I, Prayudi Y (2018) Virtual machine forensic analysis and recovery method for recovery and analysis digital evidence. Int J Comput Sci Inf Secur 16
29. Mishra AK, Pilli ES, Govil MC (2014) A prototype implementation of log acquisition in private cloud environment. In: 2014 3rd international conference on eco-friendly computing and communication systems. IEEE
30. Gebhardt T, Reiser HP (2013) Network forensics for cloud computing. In: IFIP international conference on distributed applications and interoperable systems. Springer, Berlin
31. Khan A et al (2017) Secure logging as a service using reversible watermarking. Procedia Comput Sci 110:336–343
32. Muthurajkumar S et al (2015) Secured temporal log management techniques for cloud. Procedia Comput Sci 46:589–595
A Systematic Review of Log-Based Cloud Forensics
347
33. Liu J-Y et al (2019) An anonymous blockchain-based logging system for cloud computing. In: International conference on blockchain and trustworthy systems. Springer, Singapore 34. Patrascu A, Patriciu V-V (2015) Logging for cloud computing forensic systems. Int J Comput Commun Control 10(2):222–229 35. Rane S et al (2019) Decentralized logging service using IPFS for cloud ınfrastructure.Available at SSRN 3419772 36. Satrya GB, Shin SY (2018) Proposed method for mobile forensics investigation analysis of remnant data on Google Drive client.J Internet Technol 19(6):1741–1751 37. Amirullah A, Riadi I, Luthfi A (2016) Forensics analysis from cloud storage client application on proprietary operating system. Int J Comput Appl 143(1):1–7
Performance Analysis of K-ELM Classifiers with the State-of-Art Classifiers for Human Action Recognition Ratnala Venkata Siva Harish and P. Rajesh Kumar
Abstract Recent advances in computer vision have drawn much attention toward human activity recognition (HAR) for numerous applications such as video games, robotics, content retrieval and video surveillance. Detecting and tracking human actions with the wearable sensor devices (WSD) in common use today is difficult in terms of precision and fast automatic recognition, owing to the frequent changes in body movement. The HAR system first preprocesses the WSD signal, and six sets of features that are viable from a computational viewpoint are then extracted from the wearable sensor accelerometer data. Finally, after the crucial dimensionality reduction step, the selected features are used by the classifier to ensure high human action classification performance. This paper focuses on analyzing the performance of the K-ELM classifier-based deep model on the selected features against state-of-the-art classifiers such as artificial neural network (ANN), k-nearest neighbor (KNN), support vector machines (SVM) and convolutional neural network (CNN). The experimental results, analyzed using metrics such as precision, recall, F-measure, specificity and accuracy, show that K-ELM outperforms most of the abovementioned state-of-the-art classifiers while requiring less time.

Keywords Kernel extreme learning machine (K-ELM) · Human action recognition (HAR) · Wearable sensor devices (WSD) · Classifiers
1 Introduction

The rapid development in the field of computer vision is utilized by various smart applications today; in particular, the concept of human action recognition is
exploited for security as well as for various smart environmental applications [1]. Human action can be recognized only through constant monitoring using the approaches shown in Fig. 1; among these, researchers have recently relied mostly on wearable sensor devices (WSD) [2]. Furthermore, among the various HAR systems developed with internal and external sensors for posture and motion estimation, accelerometers and gyroscopes are the most widely used [3]. Accelerometers, in particular, are the sensors most commonly used in wearable devices, owing to their small size, low cost and power requirements, and their ability to deliver data promptly related to the motion of the wearer [4]. The signal logged by the accelerometer depends on the human activity and the device location, and the growing use of accelerometers for HAR must contend with certain shortcomings such as positioning issues and usability concerns [5]. A reliable accelerometer-based HAR system requires an efficient classifier to speed up the recognition process and improve its accuracy, and the time taken by each classifier is a major constraint [6]. Quick classification of human action is therefore necessary to overcome the drawbacks of conventional classifiers, since the signal is processed as a time series and should remain as continuous as possible [7]. Most recent HAR studies make use of classifiers such as k-nearest neighbor (kNN), support vector machines (SVM), supervised learning Gaussian mixture models (SLGMM), random forest (RF), k-means, Gaussian mixture models (GMM) and hidden Markov models (HMM) [8]. Although advances have been made in recognizing daily living activities such as standing, sitting, sitting on the ground, lying down, walking, stair climbing and standing up through various approaches, automated HAR is still inadequate owing to classification inaccuracies [9]. These issues drew us toward a standardized evaluation of classifiers based on WSD for multiple applications, given the difficulty of characterizing a promising classifier for human action recognition systems [4]. The main contribution of this paper is to evaluate the performance of the K-ELM classifier-based deep model by comparing it with conventional state-of-the-art classifiers on a real-world dataset collected by W. Ugulino's team using wearable accelerometers.
Fig. 1 HAR approaches: vision based and sensor based (wearable, object tagged, dense sensing)
The human action recognition process includes the following steps: (i) accelerometer sensor placement, (ii) preprocessing, (iii) feature extraction, (iv) feature selection and (v) classification. The results obtained by the classifiers are evaluated using metrics such as F-measure, recall, precision and accuracy. This paper is organized as follows: Sect. 2 reviews background work on human action recognition systems; Sect. 3 describes the adopted K-ELM classifier-based HAR system built on the above five steps; Sect. 4 discusses the experimental results for the proposed and state-of-the-art classifiers; and Sect. 5 concludes the paper.
2 Related Work

In recent times, owing to the striking success of various classifiers in computer vision applications, researchers have been keen to apply them to HAR systems. Some of these works are reviewed in this section, and the classification accuracies reported for the state-of-the-art classifiers are summarized in Table 1.

Table 1 Classification accuracy reported by the state-of-the-art works reviewed in Sect. 2

Author | Year | Classifier | Classification accuracy
Sheng et al. [10] | 2020 | Extended region-aware multiple kernel learning (ER-MKL) | About 70.5%
Weiyao et al. [11] | 2016 | Kernel-based extreme learning machine classifier | About 94.5%
Jaouedi et al. [12] | 2019 | Recurrent neural networks to predict human action | About 86%
Xiao et al. [13] | 2019 | Convolutional neural network | About 0.6212, 0.6637, 0.9216 and 0.894 for different datasets
Zhang et al. [14] | 2019 | Deep belief network as classifier | About 74.59%, 89.53%, 87.03% and 90.66% for different features
Zerrouki et al. [15] | 2018 | AdaBoost algorithm | About 96.56%, 93.91%, 96.56% and 93.91% for different datasets
Feng-Ping et al. [16] | 2018 | SVM classifier | About 92.1%, 91.3%, 91.2%, 79.8%, 88.3% and 55.2% for different datasets

Sheng et al. [10] introduced an improved extended region-aware multiple kernel learning (ER-MKL) scheme for HAR by fusing the human and contextual visual cues (multilayer deep features) based on pre-learned CNN classifiers and prior
knowledge. They used the JHMDB and UCF Sports datasets to evaluate the performance of the proposed ER-MKL strategy against other conventional classifiers. Weiyao et al. [11] suggested an effective framework that models a multilevel frame select sampling (MFSS) scheme to sample the input images for recognizing human action. The motion and static maps (MSM) method, a block-based LBP feature extraction approach and Fisher kernel representation are then used to obtain the motion and static history, extract texture and combine the block features, respectively. By analyzing key parameters such as τ and the MSM thresholds, the three-level temporal scheme was shown to be more effective in recognizing human action than the alternatives. The proposed approach was evaluated on three publicly available datasets, and a convolutional neural network together with the NTU dataset was recommended as future work. Jaouedi et al. [12] introduced a HAR strategy using GMM-KF based motion tracking and a recurrent neural network model with gated recurrent units for video sequencing. An important tactic of this approach is to extract features from every frame of the video under analysis to achieve better human action recognition. The experimental outcome demonstrates a high classification rate, and minimizing the video classification time for challenging datasets such as UCF Sports and UCF101 is suggested as future scope. Xiao et al. [13] suggested a new HAR approach that includes spatial decomposition through a three-level spatial pyramid feature extraction scheme and deep representation extraction through a dual-aggregation scheme. By fusing both local and deep features, CXQDA, based on the cosine measure and cross-view quadratic discriminant analysis (XQDA), is used to categorize the human action. The experimental outcome shows more effective performance than the conventional strategies. Zhang et al. [14] suggested a DBN-based electromyography (EMG) signal classifier using time-domain features for 20 human muscular actions. Using the best set of features, a 4-class EMG signal classifier was designed for a user interface system aimed at potential applications. Because the EMG signal has high variance across multiple features, it was difficult to choose the optimal classifier; hence they suggest optimizing the structural parameters of the DBN with dominant features for real-time multi-class EMG signal recognition of human muscular actions. Zerrouki et al. [15] introduced a video camera monitoring and adaptive AdaBoost classifier-based human action recognition strategy. By partitioning the human body into five partitions, six classes of activities, namely walking, standing, bending, lying, squatting and sitting, are analyzed during recognition. The Universidad de Malaga fall detection dataset (URFDD) was used to evaluate the performance, and its effectiveness was compared with conventional classifiers such as a neural network, k-nearest neighbor, support vector machine and naive Bayes. Finally, as a future direction, they suggest using an automatic updating method and infrared or thermal cameras to ease recognition in dark environments.
Feng-Ping et al. [16] developed a deep learning model based on MMN and the Maxout activation function for human action recognition. The suggested approach guarantees stable gradient propagation, avoids slow convergence and improves image recognition performance. High-level space–time features are extracted from the sequences and finally classified with a support vector machine trained on a two-layer neural network structure. The type of human action and multi-class action recognition are achieved through an RBM-NN approach. Multi-class human action recognition was evaluated on three datasets and proved to be quicker and more accurate than conventional multi-class action recognition approaches.
3 Proposed Methodology

The main objective here is to analyze the performance of a K-ELM deep model for human action recognition (HAR) from wearable sensor device motion data, using the selected set of features. The features are extracted with the help of a multilayer extreme learning machine (ML-ELM) and finally classified with a kernel extreme learning machine (K-ELM) classifier, which draws on the advantages of a convolutional neural network (CNN) to overcome the instability of the ELM. The proposed strategy is described in detail in the following sections.
3.1 Extreme Learning Machine (ELM)

ELM is a successful feed-forward regression classifier that suits large-scale video or motion analysis tasks well. Conventional neural networks involve hidden layers and learn the mapping through the back-propagation algorithm and least-squares approaches. In ELM, by contrast, the learning problem is converted into a direct scheme whose weight matrices are evaluated through a generalized inverse operation (the Moore–Penrose pseudo-inverse): only the number of hidden neurons is assigned, while the weights and biases between the input and hidden layers are randomized, and the output matrix is evaluated during execution. Finally, the Moore–Penrose pseudo-inverse, under the least-squares principle, yields the weights between the final hidden layer and the output layer. This direct learning scheme, with a small norm of weights and low error, trains quickly and offers superior classification capability compared with conventional learning strategies. Figure 2 shows the ELM network with n inputs, l hidden neurons and m output neurons. Assume the network receives the input samples

[x, y] = \{x_i, y_i\}, \quad i = 1, 2, \ldots, Q   (1)
Fig. 2 Structure of the extreme learning machine (ELM): input layer x, a single hidden layer with output H = g(ωX + b), and an output layer weighted by β
The input feature of the above sample x and its desired output matrix y are represented as follows:

x = (x_{i1}, x_{i2}, \ldots, x_{iQ})   (2)

x = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1Q} \\ x_{21} & x_{22} & \cdots & x_{2Q} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nQ} \end{bmatrix}   (3)

y = (y_{i1}, y_{i2}, \ldots, y_{iQ})   (4)

y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1Q} \\ y_{21} & y_{22} & \cdots & y_{2Q} \\ \vdots & \vdots & & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mQ} \end{bmatrix}   (5)

In the above equations, n and m denote the input and output matrix dimensions. The randomized weight w_{ij} between the input and hidden layer is expressed as

w_{ij} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & & \vdots \\ w_{l1} & w_{l2} & \cdots & w_{ln} \end{bmatrix}   (6)

Likewise, the weight \beta_{jk} assigned by the ELM between the hidden and output layer is represented as

\beta_{jk} = \begin{bmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1m} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2m} \\ \vdots & \vdots & & \vdots \\ \beta_{l1} & \beta_{l2} & \cdots & \beta_{lm} \end{bmatrix}   (7)

The bias assigned by the ELM to the hidden layer neurons is expressed as B = [b_1\ b_2\ \ldots\ b_n]^T, the network activation function is denoted g(x), and the output matrix is T = [t_1\ t_2\ \ldots\ t_Q]_{m \times Q}, i.e.,
t_j = \begin{bmatrix} t_{1j} \\ t_{2j} \\ \vdots \\ t_{mj} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{l} \beta_{i1}\, g(\omega_i x_j + b_i) \\ \sum_{i=1}^{l} \beta_{i2}\, g(\omega_i x_j + b_i) \\ \vdots \\ \sum_{i=1}^{l} \beta_{im}\, g(\omega_i x_j + b_i) \end{bmatrix}, \quad j = 1, 2, \ldots, Q   (8)
By using the above equations, this can be formulated as

H\beta = T'   (9)

where H represents the hidden layer output and T' is the transpose of T. To evaluate the weight matrix \beta with minimum error, the least-squares method is used:

\beta = H^{+} T'   (10)

To regularize \beta, when the number of hidden neurons is smaller than the number of training samples, \beta is represented with stabilized output results as

\beta = \left( \frac{1}{\lambda} + H^T H \right)^{-1} H^T T'   (11)

Similarly, when the number of hidden neurons is larger than the number of training samples, \beta is represented as

\beta = H^T \left( \frac{1}{\lambda} + H H^T \right)^{-1} T'   (12)
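As an illustration of Eqs. (9)–(12), the following Python/NumPy sketch trains a single-hidden-layer ELM with the regularized least-squares solution; the layer size, regularization constant, random seed and toy data are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def train_elm(X, T, n_hidden=100, lam=1e3, seed=0):
    """Train a single-hidden-layer ELM.
    X: (Q, n) input samples, T: (Q, m) targets.
    Returns (w, b, beta) so that predictions are g(X w^T + b) @ beta."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    w = rng.uniform(-1.0, 1.0, size=(n_hidden, n))   # random input weights, cf. Eq. (6)
    b = rng.uniform(-1.0, 1.0, size=n_hidden)        # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ w.T + b)))         # hidden-layer output with sigmoid g(x)
    # Regularized least-squares output weights, cf. Eq. (11): beta = (1/lam + H^T H)^-1 H^T T
    I = np.eye(n_hidden)
    beta = np.linalg.solve(I / lam + H.T @ H, H.T @ T)
    return w, b, beta

def predict_elm(X, w, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ w.T + b)))
    return H @ beta

# toy usage: 200 samples, 6 features (e.g., the six time-domain features), 5 action classes
X = np.random.randn(200, 6)
T = np.eye(5)[np.random.randint(0, 5, 200)]          # one-hot targets
w, b, beta = train_elm(X, T)
labels = predict_elm(X, w, b, beta).argmax(axis=1)
```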
3.2 Multilayer Extreme Learning Machine (ML-ELM)

The multilayer extreme learning machine (ML-ELM) consists of two or more hidden layers with l neurons each and a single output layer. An activation function g(x) is selected for the network layers, and the bias evaluation and weight updating for all layers between the input and output layers are done using the following equations.
Let us assume that the two-hidden-layer ML-ELM shown in Fig. 3 has training samples (X, T) = \{x_i, t_i\}, i = 1, 2, \ldots, Q, in which x denotes the input and t the corresponding target sample. The hidden layer output can then be evaluated as

H = g(wx + b)   (13)

where w and b denote the randomly initialized weights and biases of the hidden layers. The final layer output matrix is evaluated using the following equation
Fig. 3 Structure of the multilayer extreme learning machine (ML-ELM): input layer x, hidden layers H = g(w_H X) and H_2 = g(w_H H), and an output layer weighted by β_new
\beta = H^{+} T   (14)

where H^{+} denotes the Moore–Penrose inverse matrix of H. Assuming the ML-ELM is designed with three hidden layers, its expected output and weight matrix are evaluated using Eq. (15):

H_2 = T \beta_{new}^{+}   (15)

where \beta_{new}^{+} denotes the inverse of the weight matrix \beta_{new}, and

W_{H1} = [B_1\ W_1]   (16)

H_3 = g(H_1 w_1 + B_1) = g(w_H H)   (17)

where w_{H1} = g^{-1}(H_2) H_1^{+}, W_2 and H_2 are the weight and output between the second and third hidden layers, H_1^{+} is the inverse of H_1 = [1\ H_2]^T, 1 denotes a column vector of size Q, and g^{-1}(H_2) denotes the inverse activation function. Here, to evaluate the performance, the logistic sigmoid function is adopted:

g(x) = \frac{1}{1 + e^{-x}}   (18)

Finally, for the last (second) hidden layer, the output weight matrix with fewer or more neurons than training samples is evaluated using the following equations:

H_4 = g(W_{H1} H_1)   (19)

\beta_{new} = \left( \frac{1}{\lambda} + H_3^T H_3 \right)^{-1} H_3^T T   (20)

\beta_{new} = H_3^T \left( \frac{1}{\lambda} + H_3 H_3^T \right)^{-1} T   (21)

f(x) = H_3 \beta_{new}   (22)

where f(x) is the actual final hidden layer output after parameter optimization through all the inner layers:

f(x) = \begin{cases} h(x) H^T \left( \dfrac{I}{C} + H H^T \right)^{-1} T, & N < l \\ h(x) \left( \dfrac{I}{C} + H^T H \right)^{-1} H^T T, & N \ge l \end{cases}   (23)
The six deep-learned time-domain features, namely mean value, standard deviation, min–max, skewness, kurtosis and correlation, extracted by means of the above equations help us to achieve a better action recognition rate in the subsequent classification process. Although specific features are captured for different aspects of the actions, synthesizing the features before classification gives distinct characteristics.
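A minimal sketch of the stacked-hidden-layer idea follows: each additional hidden layer is again mapped with a random projection and only the final output weights are solved in closed form. This is a simplified variant of the ML-ELM equations above; the layer widths, regularization value and seed are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ml_elm(X, T, hidden=(128, 64), lam=1e3, seed=1):
    """Simplified multilayer ELM: random projections for every hidden layer,
    closed-form (ridge) solution only for the final output weights."""
    rng = np.random.default_rng(seed)
    H, weights = X, []
    for width in hidden:
        w = rng.uniform(-1, 1, size=(H.shape[1], width))
        b = rng.uniform(-1, 1, size=width)
        weights.append((w, b))
        H = sigmoid(H @ w + b)                       # H_k = g(w H_{k-1} + b), cf. Eq. (13)
    beta = np.linalg.solve(np.eye(H.shape[1]) / lam + H.T @ H, H.T @ T)  # cf. Eq. (20)
    return weights, beta

def predict_ml_elm(X, weights, beta):
    H = X
    for w, b in weights:
        H = sigmoid(H @ w + b)
    return H @ beta                                  # f(x) = H beta_new, cf. Eq. (22)
```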
3.3 Kernel Extreme Learning Machine (K-ELM)

The ELM is an efficient method with a faster classification process than conventional back-propagation strategies, owing to its capability of generating the weights and biases randomly. The kernel-based extreme learning machine proposed by Huang et al. [17], which follows Mercer's conditions, is utilized here for the human action classification process. The kernel matrix with the unknown mapping function h(x) is defined as follows:

\Omega_{ELM} = H H^T = \begin{bmatrix} k(x_1, x_1) & \cdots & k(x_1, x_N) \\ \vdots & & \vdots \\ k(x_N, x_1) & \cdots & k(x_N, x_N) \end{bmatrix}   (24)

K(x_i, x_j) = h(x_i) \cdot h(x_j)   (25)

Finally, the K-ELM output function is expressed as

f(x) = h(x)\beta = \left[ k(x, x_1), \ldots, k(x, x_N) \right]^T \left( \frac{I}{C} + \Omega \right)^{-1} T   (26)

Considering Eqs. (24)–(26), the output weight of the K-ELM is evaluated using Eq. (27), in which \Omega is the kernel matrix of the input matrix given to the K-ELM classifier:

\beta = \left( \Omega + \frac{I}{C} \right)^{-1} Y   (27)

Our proposed K-ELM-based HAR strategy combines the benefits of convolutional neural network and multilayer extreme learning machine methods. The time-domain features of the wearable sensor data (WSD) are extracted by means of a two-layer ML-ELM and then correlated before being sent to the K-ELM classifier.
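The following sketch implements the kernel form of Eqs. (24)–(27) with a Gaussian (RBF) kernel; the kernel width gamma and the regularization constant C are illustrative assumptions, since the paper does not report the values it used.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def train_kelm(X, T, C=100.0, gamma=0.1):
    """Kernel ELM output weights: beta = (Omega + I/C)^-1 T, cf. Eqs. (24)-(27)."""
    Omega = rbf_kernel(X, X, gamma)                   # kernel matrix, Eq. (24)
    beta = np.linalg.solve(Omega + np.eye(len(X)) / C, T)
    return beta

def predict_kelm(Xtrain, beta, Xnew, gamma=0.1):
    # f(x) = [k(x, x_1) ... k(x, x_N)] beta, cf. Eq. (26)
    return rbf_kernel(Xnew, Xtrain, gamma) @ beta
```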
3.4 Proposed K-ELM-Based HAR Strategy

The accelerometer data collected by the WSD is given as input to the HAR system; a data sequence that is four-dimensional with respect to time is taken here for action recognition from the real-life dataset collected by W. Ugulino's team using four wearable accelerometers. After data acquisition, preprocessing is carried out, which includes dimensionality reduction and segmentation of the moving parts, i.e., the signal data is divided into subsequences, commonly termed the sliding window process, applied for sequential data partitioning. After partitioning, the input sensor data proceeds to the feature extraction process. Here, time-domain features are extracted for human action recognition by an ML-ELM. The accelerometer signal time integrals are evaluated by means of a heterogeneous metric, the integral of the modulus of accelerations (IMA), expressed as follows:

\mathrm{IMA} = \int_{t=0}^{N} |a_x|\, dt + \int_{t=0}^{N} |a_y|\, dt + \int_{t=0}^{N} |a_z|\, dt   (28)
where a_x, a_y, a_z are the orthogonal acceleration components, t is time and N is the window length. From the extracted features, a set of six time-domain features is then selected, comprising mean value, standard deviation, min–max, skewness, kurtosis and correlation, to distinguish the actions from the original set of samples and to ease the subsequent classification process in less time. Finally, the K-ELM classifier is used to classify human actions based on the selected set of features with low error. The performance of the K-ELM hierarchical classifier is compared with that of standard classifiers such as an artificial neural network (ANN), k-nearest neighbor (KNN), support vector machines (SVM) and convolutional neural network (CNN) on the basis of classification accuracy, as discussed in the section below.
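A sketch of the windowed feature computation described above: for each sliding window of tri-axial accelerometer samples, a discretized version of the IMA of Eq. (28) and the six time-domain statistics are computed. The window length, step size and the exact correlation pairing are illustrative assumptions.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def window_features(win):
    """win: (N, 3) array of ax, ay, az samples for one sliding window."""
    ima = np.sum(np.abs(win))                        # discretized IMA, cf. Eq. (28)
    feats = [
        win.mean(axis=0),                            # mean value
        win.std(axis=0),                             # standard deviation
        win.max(axis=0) - win.min(axis=0),           # min-max range
        skew(win, axis=0),                           # skewness
        kurtosis(win, axis=0),                       # kurtosis
        [np.corrcoef(win[:, i], win[:, j])[0, 1]     # pairwise axis correlation
         for i, j in ((0, 1), (0, 2), (1, 2))],
    ]
    return np.concatenate([np.ravel(f) for f in feats] + [[ima]])

def sliding_windows(signal, length=128, step=64):
    """Yield overlapping windows from a (T, 3) accelerometer signal."""
    for start in range(0, len(signal) - length + 1, step):
        yield signal[start:start + length]
```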
4 Result and Discussion

The performance analysis of the K-ELM classifier for HAR is implemented in MATLAB 2015 and compared here with the conventional standard classifiers on W. Ugulino's wearable HAR dataset. W. Ugulino's team used four triaxial ADXL335 accelerometers, an ATmega 328V microcontroller and the LilyPad Arduino toolkit, and collected the HAR data by placing the accelerometers on the waist, left thigh, right ankle and right arm. The dataset is split into 80% training and 20% testing data, with 128 readings/window for each dimension
of data. The performance evaluation criteria used for the analysis include precision, recall, F-measure, specificity and accuracy:

\mathrm{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n}   (29)

where T_n (true negative) denotes truly classified negative samples, T_p (true positive) truly classified positive samples, F_n (false negative) faultily classified positives and F_p (false positive) faultily classified negatives. The F-measure integrates both recall and precision, and these metrics are expressed as follows:

\mathrm{Precision} = \frac{T_p}{T_p + F_p}   (30)

\mathrm{Recall} = \frac{T_p}{T_p + F_n}   (31)

\mathrm{F\text{-}Score} = \frac{(1 + \beta^2) \cdot \mathrm{recall} \cdot \mathrm{precision}}{\beta^2 \cdot \mathrm{recall} + \mathrm{precision}}   (32)

\mathrm{Specificity} = \frac{T_n}{T_n + F_p}   (33)

where \beta represents the weighting factor.
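For clarity, the metrics of Eqs. (29)–(33) can be computed directly from the confusion counts, as in the sketch below; beta = 1 (the usual F1-score) is an assumption, since the paper does not state the weighting factor it used.

```python
def classification_metrics(tp, tn, fp, fn, beta=1.0):
    accuracy = (tp + tn) / (tp + tn + fp + fn)            # Eq. (29)
    precision = tp / (tp + fp)                            # Eq. (30)
    recall = tp / (tp + fn)                               # Eq. (31)
    f_score = ((1 + beta**2) * recall * precision /       # Eq. (32), as written in the paper
               (beta**2 * recall + precision))
    specificity = tn / (tn + fp)                          # Eq. (33)
    return accuracy, precision, recall, f_score, specificity
```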
Here, the performance of ANN, KNN, SVM and CNN is analyzed against the proposed K-ELM approach for the human action recognition system with wearable sensor data. The six features (mean value, standard deviation, min–max, skewness, kurtosis and correlation) are extracted and used to train the K-ELM classifier for more stable performance than the conventional classifiers. The execution outcomes of these approaches and their comparative differences are discussed through Table 2 and Figs. 4, 5, 6 and 7.

Table 2 Comparison between the different approaches for detecting HAR

Classifier | Time | F-score | Recall | Accuracy | Specificity
ANN | 20 min | 90.23 | 91.21 | 91.83 | 97.07
KNN | 15 min | 94.60 | 94.57 | 94.62 | 99.67
SVM | 11 min | 90.66 | 90.98 | 90.33 | 96.56
CNN | 30 min | 93.21 | 93.45 | 95.72 | 96.57
K-ELM | 20 min | 98.243 | 97.458 | 98.97 | 98.56

As shown in Table 2, every classifier under analysis shows a certain level of accuracy with respect to time for HAR. The SVM and ANN have lower F-scores of about 90.66% and 90.23%, taking about 11 and 20 min, respectively, to classify the human action during simulation.
Fig. 4 Comparative F-score analysis of classifiers
Fig. 5 Comparative recall analysis of classifiers
Fig. 6 Comparative specificity analysis of classifiers
R. V. S. Harish and P. Rajesh Kumar
Performance Analysis of K-ELM Classifiers …
363
Fig. 7 Comparative accuracy analysis of classifiers
Similarly, KNN and CNN had F-scores of about 94.60% and 93.21%, taking about 15 and 30 min, respectively, to classify the human action. Our proposed K-ELM classifier has an F-score of about 98.24% with a classification time of about 20 min; from this analysis, K-ELM attains the highest F-score in comparable or less time than the other classifiers under analysis. The recall, specificity and accuracy of the proposed K-ELM approach compared with the state-of-the-art classifiers are shown in Figs. 5, 6 and 7, respectively. The selected set of features helps to characterize the human actions (sit, walk, upstairs, stand and downstairs) better than the conventional classifiers, for reasons such as the sensor used and the procedural variances in validation. For ANN, KNN, SVM, CNN and K-ELM, recall values of about 91.21%, 94.57%, 90.98%, 93.45% and 97.458% are obtained while recognizing the actions. Similarly, the specificity values obtained are about 97.07%, 99.67%, 96.56%, 96.57% and 98.56%, and the accuracy values about 91.83%, 94.62%, 90.33%, 95.72% and 98.97%, correspondingly, for the classifiers under analysis. It should be noted that the analysis using the selected six features as input to the K-ELM classifier shows better performance than the others in recognizing the human actions.
5 Conclusion

In this paper, the performance analysis of the proposed K-ELM classifier with a selected set of features has been presented against the conventional state-of-the-art classifiers using W. Ugulino's accelerometer dataset. The human action recognition process is described along with the equations used for feature extraction and classification. Finally, a comparative analysis of K-ELM using the selected set of time-domain features (mean value, standard deviation, min–max, skewness, kurtosis and correlation) as input shows more effective results than the other ANN, KNN, SVM and CNN approaches. From the analysis, integrating the above classifiers as a future direction could perform better with accurate
complementary decisions. However, our approach has the drawback of computational complexity, and when implementing it in real-time applications, uninterrupted action recognition is difficult owing to the Wi-Fi signal requirement of the WSD.
References 1. Bayat A, Pomplun M, Tran D (2014) A study on human activity recognition using accelerometer data from smartphones. Procedia Comput Sci 34:450–457 2. Casale P, Oriol P, Petia R (2011) Human activity recognition from accelerometer data using a wearable device. In: Iberian conference on pattern recognition and image analysis. Springer, Berlin, pp 289–296 3. Pantelopoulos A, Bourbakis N (2010) A survey on wearable sensor-based systems for health monitoring and prognosis. IEEE Trans Syst Man Cybern Part C (Appl Rev) 40:1–12 4. Jordao A, Antonio C, Nazare Jr., Sena J, Schwartz WR (2018) Human activity recognition based on wearable sensor data. A standardization of the state-of-the-art. arXiv preprint arXiv: 1806.05226 5. Cleland I, Kikhia B, Nugent C, Boytsov A, Hallberg J, Synnes K, McClean S, Finlay D (2013) Optimal placement of accelerometers for the detection of everyday activities. Sensors 13:9183–9200 6. Poppe R (2010) A survey on vision-based human action recognition. Image Vis Comput 28:976–990 7. Vishwakarma S, Agrawal A (2012) A survey on activity recognition and behavior understanding in video surveillance. Vis Comput 29:983–1009 8. Tharwat A, Mahdi H, Elhoseny M, Hassanien A (2018) Recognizing human activity in mobile crowdsensing environment using optimized k-NN algorithm. Expert Syst Appl 107:32–44 9. Ignatov A (2018) Real-time human activity recognition from accelerometer data using convolutional neural networks. Appl Soft Comput 62:915–922 10. Weiyao X, Muqing W, Min Z, Yifeng L, Bo L, Ting X (2019) Human action recognition using multilevel depth motion maps. IEEE Access 7:41811–41822 11. Jaouedi N, Boujnah N, Bouhlel M (2020) A new hybrid deep learning model for human action recognition. J King Saud Univ Comput Inf Sci 32:447–453 12. Xiao J, Cui X, Li F (2020) Human action recognition based on convolutional neural network and spatial pyramid representation. J Vis Commun Image Rep 71:102722 13. Zhang J, Ling C, Li S (2019) EMG signals based human action recognition via deep belief networks. IFAC-Pap OnLine 52:271–276 14. Zerrouki N, Harrou F, Sun Y, Houacine A (2018) Vision-based human action classification using adaptive boosting algorithm. IEEE Sens J 18:5115–5121 15. An F (2018) Human action recognition algorithm based on adaptive initialization of deep learning model parameters and support vector machine. IEEE Access 6:59405–59421 16. Huang G-B, Zhou H, Ding X, Zhang R (2012) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 42:513–529 17. Niu X, Wang Z, Pan Z (2019) Extreme learning machine-based deep model for human activity recognition with wearable sensors. Comput Sci Eng 21:16–25
Singular Value Decomposition-Based High-Resolution Channel Estimation Scheme for mmWave Massive MIMO with Hybrid Precoding for 5G Applications

V. Baranidharan, N. Praveen Kumar, K. M. Naveen, R. Prathap, and K. P. Nithish Sriman

Abstract Channel estimation is a very challenging task for on-grid massive MIMO over mmWave systems with hybrid precoding. The problem arises because the number of radio frequency chains is comparatively smaller than the number of antennas used in the system. Conventional massive MIMO channel estimation over the mmWave band is based on off-grid learned model parameters and virtual channel estimators, and resolution loss arises from the high computational complexity of channel estimation under its Bayesian learning framework. To improve the accuracy of on-grid channel estimation, this work proposes a singular value decomposition-based iterative reweighted channel estimation scheme for massive MIMO over the mmWave band. A gradient descent method over the proposed objective function is used for the optimization; the objective function optimizes the channel estimate by iteratively moving the estimates of the angles of arrival and departure, and iteratively reweighted parameters are used to optimize the trade-off between errors.

Keywords Massive MIMO · Angle of arrival · mmWave band · Channel estimation · Angle of departure
1 Introduction

In 5G wireless communication, mmWave massive MIMO systems have been recognized as one of the key enabling technologies for modern 5G applications. In such systems, hybrid precoding is recommended to reduce hardware cost and high power consumption, since a large number of antennas is handled by a small number of radio frequency (RF) chains. Hybrid precoding involves an analog and digital co-design; because fewer RF chains are used, the antennas cannot be accessed directly by the digital baseband. Accurate channel state information (CSI) is needed for hybrid precoding [1], so it is very hard to estimate the large MIMO channel matrix. Recently, several novel channel estimation schemes have been proposed for massive MIMO systems over the mmWave band with hybrid precoding. In particular, in channel sounding schemes based on an adaptive codebook, the best beam pair is searched by both transmitter and receiver by adjusting predefined precoders. The codebook size is limited in such channel estimation schemes; because of the limited codebook size, the best angle estimate is determined by comparing the amplitudes of the beam pairs. Alternatively, by exploiting the angular sparsity of the channel, the channel can be estimated with less training overhead using on-grid compressive methods. However, these methods assume that the angles of arrival and departure (AoAs/AoDs) always lie on discrete points in the angle domain, whereas in practice AoAs/AoDs are continuously distributed [2]. The use of on-grid AoAs/AoDs leads to a power leakage problem, which critically reduces the accuracy of channel estimation schemes. To solve the problems caused by on-grid angle estimation, the present SVD-based iterative reweighted channel estimation scheme is proposed to formulate the off-grid/on-grid AoAs and AoDs [3]. First, the proposed work iteratively optimizes the estimates of the angles of arrival and departure to minimize the weighted sum of the sparsity of the gains and the data fitting error. SVD-based preconditioning is then introduced to decrease the computational complexity of the iteratively reweighted procedure and make mmWave channel estimation practical in real time. The paper is organized as follows: Sect. 2 reviews related work on massive MIMO systems; Sect. 3 explains the system model and the proposed SVD-based high-resolution channel estimation scheme; Sect. 4 presents the simulation results and their discussion; and Sect. 5 concludes the paper.
2 Related Works

This section explores existing techniques that address computational complexity and feedback overhead. Pazdanowski proposed a channel estimation scheme through parameter learning [4] for mmWave massive MIMO systems. This work is fully based
on the off-grid channel model, which is widely used for spatial sample mismatch-based characterization with the discrete Fourier transform (DFT) method for mmWave massive MIMO channel estimation. The main limitation of this scheme is that it only estimates off-grid parameters of mmWave massive MIMO channels. Qi et al. proposed an off-grid method to estimate the channel for massive MIMO systems over the mmWave band [5]. The major advantage of this method is the reduced pilot overhead. The system employs an off-grid sparse signal reconstruction scheme, and the accuracy of channel estimation is considerably improved. The separated stages of channel estimation, including AoA and AoD estimation followed by path gain estimation, perform well; however, the accuracy of the algorithm is comparatively lower in grid point construction, and minimizing the objective function is not refined by suppressing the off-grid effects. Wang et al. proposed a multi-panel mmWave scheme with hybrid precoding for massive MIMO [6]. In this method, the channel vector is converted into the angular domain and the CSI is then restored from the formulated angular CSI. Exploiting the structural features of mmWave MIMO channels in the angular domain is always very difficult, and the major disadvantage of this method is that the computational complexity is not decreased. Qibo Qin et al. proposed time-varying channel estimation for millimeter wave massive MIMO systems [7]. In this method, the scattering (i.e., time-varying) nature of mmWave channels is used to estimate the AoAs/AoDs, and an adaptive angle estimation method is used to formulate the AoA/AoD estimation. Even so, the computational complexity of the separated stages of channel estimation is very high. Jianwei Zhao et al. proposed an angle-domain hybrid precoding and channel tracking method for mmWave massive MIMO systems [8]. The angle-domain hybrid precoding and mmWave channel tracking method is used to exploit the structural features of the millimeter wave MIMO channel, and all users can be scheduled based on their directions of arrival [9, 10]. The major limitation of this method is its limited high-SNR error performance, and the effect of DoA tracking is not improved while retraining the system.
3 Proposed SVD-Based High-Resolution Channel Estimation Scheme for mmWave Massive MIMO Systems

3.1 System Model

Consider an mmWave massive MIMO system with efficient hybrid precoding and arbitrary array geometry. Let N_T, N_T^{RF}, N_R and N_R^{RF} be the numbers of transmit antennas, transmitter RF chains, receive antennas and receiver RF chains, respectively [11–15]. In practical 5G systems with hybrid precoding, the number of RF chains is less than the number of antennas, i.e., N_T^{RF} < N_T and N_R^{RF} < N_R. The system model
is given below:

r = Q^H H P s + n   (1)

where Q is the hybrid combining matrix, r is the received signal, H is the matrix of channel coefficients, P is the precoding matrix used before transmission, n is the noise received in the channel and s is the transmitted signal. The channel model is

H = \sum_{l=1}^{L} z_l\, a_R\!\left(\varphi^{azi}_{R,l}, \varphi^{ele}_{R,l}\right) a_T^H\!\left(\varphi^{azi}_{T,l}, \varphi^{ele}_{T,l}\right)   (2)

In this massive MIMO system, it is assumed that the number of propagation paths L \ll \min(N_R, N_T); the azimuth and elevation AoAs and AoDs, together with the steering vectors at both transmitter and receiver, are determined by the array geometry:

a(\varphi^{azi}, \varphi^{ele}) = \left[1, e^{j2\pi d \sin\varphi^{azi}\sin\varphi^{ele}/\lambda}, \ldots, e^{j2\pi (N_1 - 1) d \sin\varphi^{azi}\sin\varphi^{ele}/\lambda}\right]^T \otimes \left[1, e^{j2\pi d \cos\varphi^{ele}/\lambda}, \ldots, e^{j2\pi (N_2 - 1) d \cos\varphi^{ele}/\lambda}\right]^T   (3)

where \otimes denotes the Kronecker product, d is the spacing between the antennas and \lambda is the wavelength. For a uniform linear array,

a(\varphi) = \left[1, e^{j2\pi d \sin\varphi/\lambda}, \ldots, e^{j2\pi (N_1 - 1) d \sin\varphi/\lambda}\right]^T   (4)

The channel matrix H in (2) can then be written as

H = A_R(\theta_R)\, \mathrm{diag}(z)\, A_T^H(\theta_T)   (5)

Here, the pilot signal x_p is transmitted over the transmit antennas, and in the mth time frame the combining matrix W_m is used to obtain the N_R^{RF}-dimensional received pilot measurements:

Y_{p,m} = W_m^H H x_p + n_{p,m}   (6)

From the pilots received over M time slots, the channel output Y is

Y = W^H H X + N   (7)

Estimating the channel matrix H in (7) is equivalent to estimating the number of paths, the normalized AoA and AoD angles and the path gains. Owing to the angle-domain sparsity, the channel matrix H can be obtained from

\min_{z, \theta_R, \theta_T} \|\hat{z}\|_0 \quad \text{s.t.} \quad \left\|Y - W^H \hat{H}\right\|_F \le \varepsilon   (8)
where \|\hat{z}\|_0 is the total number of nonzero elements in \hat{z}, and \varepsilon is the error tolerance parameter.
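To make the channel model of Eqs. (2)–(5) concrete, the sketch below builds a ULA steering vector as in Eq. (4) and assembles H = A_R diag(z) A_T^H. The array sizes, path count, antenna spacing and random path gains are illustrative assumptions.

```python
import numpy as np

def steering_vector(phi, n_ant, d_over_lambda=0.5):
    # a(phi) = [1, e^{j2*pi*d*sin(phi)/lambda}, ..., e^{j2*pi*(N-1)*d*sin(phi)/lambda}]^T, cf. Eq. (4)
    k = np.arange(n_ant)
    return np.exp(1j * 2 * np.pi * d_over_lambda * k * np.sin(phi))

def build_channel(theta_r, theta_t, z, n_r=64, n_t=64):
    # H = A_R(theta_R) diag(z) A_T^H(theta_T), cf. Eq. (5)
    A_r = np.stack([steering_vector(p, n_r) for p in theta_r], axis=1)   # (N_R, L)
    A_t = np.stack([steering_vector(p, n_t) for p in theta_t], axis=1)   # (N_T, L)
    return A_r @ np.diag(z) @ A_t.conj().T

# example: L = 3 paths with random angles and complex Gaussian gains
rng = np.random.default_rng(0)
L = 3
theta_r = rng.uniform(-np.pi / 2, np.pi / 2, L)
theta_t = rng.uniform(-np.pi / 2, np.pi / 2, L)
z = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
H = build_channel(theta_r, theta_t, z)
```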
3.2 Proposed Optimization Formulation

The major disadvantage of the above formulation is that solving for the l_0-norm of the vector is not computationally efficient. It is therefore relaxed to

\min_{z, \theta_R, \theta_T} F(z) = \sum_{l=1}^{L} \log\left(|z_l|^2 + \delta\right), \quad \text{s.t.} \quad \left\|Y - W^H \hat{H}\right\|_F \le \varepsilon   (9)

where \delta > 0 ensures that Eq. (10) below is well defined, and z, \theta_R and \theta_T are the parameters used to determine \hat{H}. Further, this is converted into an unconstrained optimization problem by adding a regularization parameter \lambda > 0:

\min_{z, \theta_R, \theta_T} G(z, \theta_R, \theta_T) = \sum_{l=1}^{L} \log\left(|z_l|^2 + \delta\right) + \lambda \left\|Y - W^H \hat{H}\right\|_F^2   (10)

When the log-sum function is replaced by an iterative surrogate function, minimizing the surrogate function is equivalent to minimizing G(z, \theta_R, \theta_T):

\min_{z, \theta_R, \theta_T} S^{(i)} = \lambda^{-1} z^H D^{(i)} z + \left\|Y - W^H \hat{H}\right\|_F^2   (11)

where D^{(i)} is given by

D^{(i)} = \mathrm{diag}\left(\frac{1}{\left|\hat{z}_1^{(i)}\right|^2 + \delta}, \frac{1}{\left|\hat{z}_2^{(i)}\right|^2 + \delta}, \ldots, \frac{1}{\left|\hat{z}_L^{(i)}\right|^2 + \delta}\right)   (12)

and \hat{z}^{(i)} is the estimate of z at the ith iteration.
3.3 Iterative Reweight-Based Channel Estimation

The constrained optimization problem is solved by the reconstruction method summarized in Fig. 1. The term S^{(i)} is the sum of two parts: z^H D z regulates the sparsity of the estimated result, while \|Y - W^H \hat{H}\|_F represents the data fit. The term \lambda is the regularization parameter that controls the trade-off
Fig. 1 Flowchart of the proposed algorithm to find AOA’s and AOD’s
between the data fitting error and the sparsity. In the cyclic reweighted technique (10), the value of \lambda is not fixed; it is updated in every cycle. If the past cycle is poorly fitted, \lambda is chosen smaller to make the estimate sparser, whereas a larger value is chosen to quicken the search for the best-fitting estimate. In the proposed calculation, \lambda is given as

\lambda = \min\left(d \cdot r^{(i)}, \lambda_{max}\right)   (13)
To make the problem well-conditioned, \lambda_{max} is selected; d denotes a constant scaling factor and r^{(i)} the squared residue value, i.e.,

r^{(i)} = \left\|Y - W^H A_R\!\left(\hat{\theta}_R^{(i)}\right) \mathrm{diag}\!\left(\hat{z}^{(i)}\right) A_T^H\!\left(\hat{\theta}_T^{(i)}\right) X\right\|_F^2   (14)
More details of the update of \lambda are given by (13). The iteration of the proposed algorithm begins at the angle-domain grids. The main aim is to estimate \hat{\theta}_R^{(i+1)} and \hat{\theta}_T^{(i+1)} in the neighborhood of the previous estimates \hat{\theta}_R^{(i)} and \hat{\theta}_T^{(i)} so as to decrease the objective function S^{(i)}. This is done by the gradient descent method (GDM) of optimization:

\hat{\theta}_R^{(i+1)} = \hat{\theta}_R^{(i)} - \eta \cdot \nabla_{\theta_R} S_{opt}^{(i)}\!\left(\hat{\theta}_R^{(i)}, \hat{\theta}_T^{(i)}\right)   (15)

\hat{\theta}_T^{(i+1)} = \hat{\theta}_T^{(i)} - \eta \cdot \nabla_{\theta_T} S_{opt}^{(i)}\!\left(\hat{\theta}_R^{(i)}, \hat{\theta}_T^{(i)}\right)   (16)
Depending on the gradient values, the step length \eta is chosen to make sure that the new estimate of the optimized objective function is less than or equal to the previous one. During the iterative search the estimate becomes more accurate, until the previous estimate is the same as the new estimate. In the proposed scheme, the initial coarse on-grid estimates (\theta_R, \theta_T) are thus moved toward their actual off-grid positions. The flowchart of the proposed algorithm to find the AoAs and AoDs is shown in Fig. 1. It is very important to figure out the value of the unknown sparsity level. In this scheme, the sparsity is initialized to a value greater than the real channel sparsity. Since the sparsity level is set higher than the real channel sparsity, paths whose gain is too small are regarded as noise generated in the channel instead of real paths; the proposed algorithm then prunes these channel paths to make the result sparser than in existing systems. During the iterations, the predicted sparsity level decreases toward the actual number of paths. The computational complexity of each iteration of the proposed algorithm lies in the gradient calculation, which is of the order of N_X N_Y (N_R + N_T) L^2. The total number of starting candidates L^{(0)} is critical; to make the computation affordable, L^{(0)} should be small. The method widely used to select effective initial values of \theta_R^{(0)} and \theta_T^{(0)} before the iteration is discussed in detail in the next section.
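A highly simplified sketch of one reweighted iteration described above: recompute the weighting D from the current gain estimate (Eq. (12)), evaluate the surrogate objective of Eq. (11) and take a gradient step on the angle estimates (Eqs. (15)–(16)). It reuses build_channel from the earlier sketch; the finite-difference gradient, step size, default array sizes and tolerance are assumptions made purely for brevity, not the paper's exact procedure.

```python
import numpy as np

def residual(Y, W, X, theta_r, theta_t, z, n_r=64, n_t=64):
    # r = || Y - W^H A_R diag(z) A_T^H X ||_F^2, cf. Eq. (14)
    H = build_channel(theta_r, theta_t, z, n_r, n_t)
    return np.linalg.norm(Y - W.conj().T @ H @ X) ** 2

def reweighted_step(Y, W, X, theta_r, theta_t, z, lam=1.0, eta=1e-3, delta=1e-6):
    """One iteration: update D, then move the AoA/AoD estimates by gradient descent."""
    D = 1.0 / (np.abs(z) ** 2 + delta)                # diagonal of D^(i), cf. Eq. (12)

    def objective(tr, tt):
        return (np.sum(D * np.abs(z) ** 2) / lam      # sparsity term z^H D z / lambda
                + residual(Y, W, X, tr, tt, z))       # data-fitting term, cf. Eq. (11)

    # numerical gradients w.r.t. each angle (finite differences, for illustration only)
    eps = 1e-5
    g_r = np.array([(objective(theta_r + eps * e, theta_t) - objective(theta_r, theta_t)) / eps
                    for e in np.eye(len(theta_r))])
    g_t = np.array([(objective(theta_r, theta_t + eps * e) - objective(theta_r, theta_t)) / eps
                    for e in np.eye(len(theta_t))])
    return theta_r - eta * g_r, theta_t - eta * g_t   # gradient descent update, Eqs. (15)-(16)
```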
3.4 Preconditioning Using SVD Techniques

To minimize the computational complexity of the iterative reweight-based channel estimation, SVD-based preconditioning is introduced in this scheme. The
372
V. Baranidharan et al.
angle-domain grids that are nearest to the AoDs/AoAs are identified by this scheme. The preconditioning significantly reduces the computation compared with using all N_R and N_T grids as initial candidates. Applying the singular value decomposition to the matrix Y gives Y = U \Sigma V^H, where \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_{\min(N_X, N_Y)}) \in \mathbb{R}^{N_Y \times N_X}, whose diagonal entries \sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_{\min(N_X, N_Y)} \ge 0 are the singular values of Y, and U^H U = I. From the above equations,

Y = \left(W^H A_R(\theta_R)\right) \mathrm{diag}(z) \left(X^H A_T(\theta_T)\right)^H + N   (17)
Since the noise is comparatively small, the L paths are identified by the L dominant singular values and their vectors, i.e., for i = 1, 2, \ldots, L. For a uniform planar array with N_1 \times N_2 receive antennas, the set of grids can be determined by R = \{(i/N_1, j/N_2)\,|\, i = 0, 1, \ldots, N_1 - 1;\ j = 0, 1, \ldots, N_2 - 1\}; the grid set T for the transmitter is determined similarly. The algorithm for SVD-based preconditioning is described in this section. If the initial candidates of Fig. 1 were set to all grid values, i.e., L^{(0)} = N_R N_T, then for large N_T and N_R the computational complexity would be of the order of N_X N_Y (N_R + N_T) N_R^2 N_T^2, which is unaffordable. With the singular value decomposition-based preconditioning, the initial candidates of Fig. 1 are coarse estimates, i.e., L^{(0)} = N_{init} \approx L, so the computational complexity becomes of the order of N_X N_Y (N_R + N_T) L^2. This is the result after the SVD-based preconditioning, and the computational burden is much lower than applying the scheme directly.
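The preconditioning step can be sketched as follows: take the SVD of the received pilot matrix Y and, for each of the L dominant singular vectors, pick the angle-domain grid point whose (combined) steering response is most correlated with it. It reuses steering_vector from the earlier sketch; the grid resolution and the use of a plain 1-D ULA grid (instead of the 2-D azimuth/elevation grid R described above) are simplifying assumptions.

```python
import numpy as np

def svd_precondition(Y, W, X, L, n_r=64, n_t=64, grid_size=64):
    """Return coarse initial AoA/AoD estimates from the L dominant singular vectors of Y."""
    U, s, Vh = np.linalg.svd(Y)                       # Y = U Sigma V^H
    grid = np.arcsin(np.linspace(-1, 1, grid_size, endpoint=False))
    # grid responses as seen through the combiner W and the pilot matrix X
    A_r = np.stack([W.conj().T @ steering_vector(p, n_r) for p in grid], axis=1)
    A_t = np.stack([X.conj().T @ steering_vector(p, n_t) for p in grid], axis=1)
    theta_r0, theta_t0 = [], []
    for i in range(L):
        # match the i-th left/right singular vector against the grid responses
        theta_r0.append(grid[np.argmax(np.abs(A_r.conj().T @ U[:, i]))])
        theta_t0.append(grid[np.argmax(np.abs(A_t.conj().T @ Vh[i, :].conj()))])
    return np.array(theta_r0), np.array(theta_t0)
```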
Fig. 2 Comparison of SNR with normalized mean square error (NMSE)
4 Simulation Results and Discussion

The performance metrics investigated through simulation results obtained in MATLAB are explained in this section. The proposed SVD-based high-resolution channel estimation scheme with hybrid precoding is compared with the existing systems. Initially, the simulation parameters for mmWave massive MIMO with precoding are set as listed in Table 1. The path gain of the lth channel path, \alpha_l, is assumed to be Gaussian, i.e., \alpha \sim \mathcal{CN}(0, \sigma_\alpha^2), with phase uniformly distributed from 0 to 2\pi. The SNR is defined as the average power of the transmitted signal divided by the average power of the RMS value of the noise voltage across the system.
4.1 Comparison of NMSE with Respect to SNR

For this mmWave channel estimation scheme, the SNR becomes \mathrm{SNR} = \sigma_\alpha^2 / \sigma_n^2, where \sigma_n^2 is the noise variance. Figure 2 compares the SNR with the normalized mean square error (NMSE) under both line-of-sight (LOS) and non-line-of-sight (NLOS) channels for the proposed high-resolution channel estimation scheme with hybrid precoding and the existing spatial mismatching-based DFT method. The SNR is varied from −5 to 10 dB. In the figure, the red and blue lines indicate the hybrid precoding method and the DFT spatial mismatching technique, respectively, and the Rician K factor is set to 20. In both cases, the proposed channel estimation scheme outperforms the existing method, with a comparatively lower normalized mean square error. This result is achieved by considering a uniform planar array in the proposed scheme; a 64-antenna uniform planar array with 8 rows and 8 columns is used at both the transmitter and the receiver, and the result is obtained by estimating the azimuth and elevation angles. Table 2 compares the SNR and NMSE values of the proposed and existing channel estimation schemes, showing that the NMSE statistics of the proposed scheme are comparatively lower than those of the existing system.
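The NMSE plotted in Fig. 2 is simply the channel-estimation error energy normalized by the true channel energy; a one-line helper is shown below. The averaging over Monte-Carlo runs and the names in the commented usage (estimate, realizations) are assumptions, not details reported in the paper.

```python
import numpy as np

def nmse(H_true, H_est):
    # NMSE = ||H_hat - H||_F^2 / ||H||_F^2
    return np.linalg.norm(H_est - H_true) ** 2 / np.linalg.norm(H_true) ** 2

# typical usage: average over independent channel realizations and express in dB
# nmse_db = 10 * np.log10(np.mean([nmse(H, estimate(H)) for H in realizations]))
```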
4.2 Comparison of SNR Versus Squared Residue of Samples

The squared residue value is defined as the sum of the squares of the residuals, i.e., the deviations of the predictions from the actual empirical data values.
Table 1 Initial simulation parameters

Simulation parameters | Values
Number of propagation paths (L) | 3
Antenna size (d) | λ/2
Number of transmit antennas | 64
Number of receiver antennas | 64
Number of transmitting RF chains | 4
Number of receiving RF chains | 4
Number of transmitting pilot sequence (N_X) | 32
Number of receiving pilot sequence (N_Y) | 32

Table 2 Comparison of SNR and NMSE of the proposed work with existing systems

Parameters | SNR | NMSE (spatial mismatch DFT) | NMSE (high-resolution hybrid coding)
Min | −5 | 0.01339 | 0.003572
Max | 10 | 0.2136 | 0.2009
Mean | 2.5 | 0.07807 | 00,657
Median | 2.5 | 0.07112 | 0.005486
Standard deviation | 5.4001 | 0.0674 | 0.06611
Range | 15 | 0.2002 | 0.1974
The squared residue error values are measured for 1, 10, 20, 30, 40 and 50 samples. The SNR versus average squared residue error values for different numbers of samples, for the existing (spatial mismatching using DFT) and proposed (high-resolution hybrid precoding) methods, are shown in Figs. 3 and 4, respectively. The ideal channel state information (CSI) of both the existing and proposed systems is used to find the estimation errors of the azimuth and elevation angles of the massive MIMO mmWave channels, and the average residue errors are estimated in obtaining the AoAs and AoDs of the channels. Table 3 shows the mean values for 1, 10, 20, 30, 40 and 50 samples for both the existing and the proposed schemes. From Table 3, as the number of samples increases, the average residual values decrease for the proposed scheme: at 50 samples the existing system shows a mean value of 32.47, whereas the proposed high-resolution channel estimation scheme reaches only 30.04. The average residue error is thus decreased, which leads to lower computational complexity, and the channel CSI obtained is suitable for good communication over mmWave MIMO channels. The proposed estimation scheme can achieve better channel accuracy compared with the existing system.
Fig. 3 Comparison of SNR versus squared residue of samples of DFT-based spatial mismatching channel estimation schemes
Fig. 4 Comparison of SNR versus squared residue of samples of SVD-based high-resolution channel estimation schemes with hybrid coding

Table 3 Comparison of average residue values (error values) of the proposed work with existing schemes

Iteration samples | Spatial mismatching with DFT | High resolution hybrid precoding
Sample 1 | 32.75 | 34.32
Samples 10 | 32.65 | 33.22
Samples 20 | 31.93 | 34.36
Samples 30 | 32.15 | 32.43
Samples 40 | 32.59 | 31.39
Samples 50 | 32.47 | 30.04
5 Conclusion

The SVD-based high-resolution channel estimation scheme for mmWave MIMO systems with hybrid precoding has been analyzed critically against existing systems. First, an efficient optimization problem with a new objective function is proposed, which takes the weighted sum of the channel sparsity and the data fitting error. The proposed channel estimation scheme starts from on-grid points in the angle domain, and these points are moved toward the neighboring actual points iteratively via gradient descent. The increased accuracy of the proposed high-resolution channel estimation scheme is confirmed by the simulation results; better high-resolution estimation of the angles of arrival and departure yields better spectral efficiency. In future work, extending the high-resolution channel estimation scheme to high-mobility multi-cell mmWave MIMO systems is a challenging topic that needs to be investigated.
References 1. Gavrilovska L, Rakovic V, Atanasovski V (2016) Visions towards 5G: technical requirements and potential enablers. Wireless Pers Commun 87(3):731–757. https://doi.org/10.1007/s11277015-2632-7 2. Hu C, Dai L, Mir T, Gao Z, Fang J (2018) Super-resolution channel estimation for mmWave massive MIMO with hybrid precoding. IEEE Trans Veh Technol 67(9):8954–8958. https://doi. org/10.1109/TVT.2018.2842724 3. Mumtaz S, Rodriguez J, Dai L (2016) mmWave massive MIMO: a paradigm for 5G. mmWave massive MIMO: a paradigm for 5G, pp 1–351 4. Pazdanowski M (2014) SVD as a preconditioner in nonlinear optimization. Comput Assist Methods Eng Sci 21(2):141–150 5. Qi B, Wang W, Wang B (2019) Off-grid compressive channel estimation for mm-wave massive MIMO with hybrid precoding. IEEE Commun Lett 23(1):108–111. https://doi.org/10.1109/ LCOMM.2018.2878557 6. Qin Q, Gui L, Cheng P, Gong B (2018) Time-varying channel estimation for millimeter wave multiuser MIMO systems. IEEE Trans Veh Technol 67(10):9435–9448. https://doi.org/10. 1109/TVT.2018.2854735 7. Shao W, Zhang S, Zhang X, Ma J, Zhao N, Leung VCM (2019) Massive MIMO channel estimation over the mmWave systems through parameters learning. IEEE Commun Lett 23(4):672–675. https://doi.org/10.1109/LCOMM.2019.2897995 8. Wang W, Zhang W, Li Y, Lu J (2018) Channel estimation and hybrid precoding for multipanel millimeter wave MIMO. Paper presented at the IEEE international conference on communications. https://doi.org/10.1109/ICC.2018.8422137 9. Zhao J, Gao F, Jia W, Zhang S, Jin S, Lin H (2017) Angle domain hybrid precoding and channel tracking for millimeter wave massive MIMO systems. IEEE Trans Wireless Commun 16(10):6868–6880 10. Hur S, Kim T, Love DJ, Krogmeier JV, Thomas TA, Ghosh A (2013) Millimeter wave beamforming for wireless backhaul and access in small cell networks. IEEE Trans Commun 61(10):4391–4403 11. Alkhateeb A, Ayach OE, Leus G, Heath RW (2014) Channel estimation and hybrid precoding for millimeter wave cellular systems. IEEE J Sel Top Signal Process 8(5):831–846 (2014)
12. Zhu D, Choi J, Heath RW (2017) Auxiliary beam pair enabled AoD and AoA estimation in closed-loop large-scale millimeter-wave MIMO systems. IEEE Trans Wireless Commun 16(7):4770–4785 13. Lee J, Gil GT, Lee YH (2016) Channel estimation via orthogonal matching pursuit for hybrid MIMO systems in millimeter wave communications. IEEE Trans Commun 64(6):2370–2386 14. Marzi Z, Ramasamy D, Madhow U (2016) Compressive channel estimation and tracking for large arrays in mm-wave picocells. IEEE J Sel Top Signal Process 10(3):514–527 15. Fang J, Wang F, Shen Y, Li H, Blum RS (2016) Super-resolution compressed sensing for line spectral estimation: an iterative reweighted approach. IEEE Trans Sig Process 64(18):4649– 4662
Responsible Data Sharing in the Digital Economy: Big Data Governance Adoption in Bancassurance Sunet Eybers
and Naomi Setsabi
Abstract Bancassurance organizations, with their origin in Europe, have become a contemporary phenomenon in developing countries. Bancassurance organizations are typically formed when banks expand their services to selling insurance products and services to their current customer base. The benefits are realized by both the bank and its customers, as banking customers are potential customers of insurance houses, and insurance policyholders may have an interest in banking accounts. To enable this process, data is shared between the bank and the insurance house. Typically, information technology (IT) infrastructure and data resources are interconnected to enable data sharing. This introduces not just infrastructure challenges but also governance considerations dictating what data can be shared and in what format. This case study investigated the big data governance structures currently adopted by bancassurance organizations in a developing country, focusing on three main areas identified in the literature: basic, foundation-level big data governance structures; data quality; and the adoption of guidelines and frameworks with subsequent business value calculations. The results indicated the existence of data governance structures for structured and semi-structured operational data but highlighted the need for governance catering for unstructured big data structures. This also applies to data quality checking procedures. Additional education and training for the various roles responsible for organizational data governance can increase the quality of the interoperability of data among entities.

Keywords Data governance · Data sharing · Big data · Bancassurance · Data · Decisions
S. Eybers (B) · N. Setsabi University of Pretoria, Private Bag X20, Hatfield, Pretoria, South Africa e-mail: [email protected] N. Setsabi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_29
1 Introduction
The data, information, knowledge, understanding and wisdom (DIKW) pyramid is one of the prevalent concepts in the information systems field of study [1]. The concept conveys the idea that information does not exist without data, knowledge is not gained without information, and without learning how to apply that knowledge, no wisdom is created. This symbiotic relationship is depicted as a triangle, referred to as the DIKW pyramid, with data as the foundation [2]. Organizations depend heavily on data to survive because, without data, no informed decisions can be made. Increased demand for rapid business decisions leads to the need to decrease data processing time. Furthermore, an increase in innovations and changes in the digital society leads to the availability of higher data volumes from diverse data sources that can be collected and analyzed. These datasets have unique characteristics such as volume (datasets in large files), velocity (speed of data throughput), variety (data outputs in various file formats), veracity (accuracy of data) and value (social and monetary gain) and are often referred to as big data [3, 4]. Large, diverse datasets from different data sources imply stricter data governance requirements. How the data is managed, consumed and stored has a direct impact on the quality of organizational decisions because decision quality depends on data quality [5]. The availability of vast amounts of data places great responsibility on users of the data in how they engage with information. As a result, data governance has received renewed attention to protect consumer rights. According to [6], big data governance is the management of large volumes of structured (contained in columns and rows according to data model specifications), semi-structured (data that can be structured but does not conform to data model specifications) and unstructured data (no predefined structure). The main objective of this research is to investigate how data governance is enforced in big data implementations, focusing on bancassurance organizations in a developing country such as South Africa. The research question under investigation is, therefore: what elements should be considered when bancassurance organizations in a developing country adopt data governance during big data implementation projects?
2 Bancassurance and Data Governance
In bancassurance, a bank and an insurance company hold a significant number of customers shared between the two entities [7]. In isolation, each organization has only a partial customer view and understanding, placing it at a disadvantage in serving client needs. Typically, each organization holds data on its own information technology (IT) platform; as such, variances in the products and customer data
shared are imminent. Furthermore, consumers can perform many financial transactions using the digital platforms offered on mobile devices and can use social media platforms for escalations or compliments. One of the focus areas in the banking industry in South Africa is a drive to identify and understand the characteristics of customers and their needs, with the main objective of pre-empting customer needs. Due to the fast pace of information flow from consumers, banks have expanded this notion with predictive analysis focusing on customer behavior, demographics, financial status, social standing and health. This analysis is based on large volumes of data collated from customer profiles sourced from various data sources over an extensive period. For example, unstructured customer data is collected from voice recordings, images and identity documents (to name a few). This is combined with structured data, which in this instance refers to data generated by enterprise systems used in operational and executive decision making. The output is usually in the form of reports. Strict rules, regulations, standards and policies are required to govern these big datasets to protect the customer, ensure the streamlining of reports that provide accurate information fostering decision making, and support the conceptualization of innovations [3]. To ensure accountability in decisions over the organization's data assets, [8] refers to data governance as a business program that should be enforced in the entire organization. The research further suggests that effective data governance programs are key to ensuring that processes are correctly followed when managing data assets, similar to financial controls when monitoring and controlling financial transactions. In a bancassurance model, a company uses the existing channels and customer base to offer insurance services. Figure 1 depicts how bancassurance relates to big data, what typical data is shared between the bank and the insurance organization, and the need for governance. Figure 1 starts by highlighting the different types of big data as well as the various decision domains in data governance. Big data is then classified according to its features, the 5 Vs (volume, variety, velocity, veracity and value), while the scope of data governance is defined by the business model, the stakeholders, content and federation. The scope of data governance and the data governance programs are discussed later. The model highlights the different types of structured and unstructured data found in each of the bancassurance entities. A decision is further required to determine which of the different types of big data need to be shared across entities. This includes, but is not limited to, mostly structured data (e.g., customer details like name, age, gender, marital status, race, source of funds, residential address, banking patterns (deposits and withdrawals), credit bureau risk indicator, insurance products sold, banking core product held, customer personal identification number, bank account number, bank balance, customer segment, bank account status and policy activation status) and sometimes unstructured data such as FICA documents, which include a copy of the national identity document, vehicle license document, proof of address and company registration documents. In the context of big data, in terms of volume, as at December 2016 Bank A had 11.8 million active customers. A competitor bank showed that it was getting
[Figure 1 is not reproduced here. It depicts big data (structured, semi-structured and unstructured) characterized by the 5 Vs (volume, velocity, variety, veracity, value), the data governance decision domains (data quality, metadata, data interoperability, data policies and principles, data storage, data privacy and security, data access/use, analytics), the governance scope elements (business model, stakeholders, content, federation), and the insurance and banking data shared in bancassurance decisions.]

Fig. 1 A bancassurance model depicting big data sharing and data governance
on average 120,000 new customers per month in 2016. In terms of velocity, about 300,000 large transactions on average per minute are processed in the evening (7 pm) and about 100,000 transactions per minute are processed at midday (12 pm). Multiple data sources make for variations in data, especially in the insurance entity (referring to the big data characteristic of variety). Data sources are multichannel within customer interactions, including call (voice) recordings from contact centers, digital interactions via online channels (including social media), voice recordings at branch sites, image scans of customer personal data via mobile devices, video recordings of car/building inspections uploaded via mobile devices and textual conversations using USSD technology. Veracity has been addressed by the bank using various tools to determine the data quality of its customer base. The FAIS and Banks Act highlight the importance for insurance and banking organizations of ensuring that they present the right products to the right customer according to the identified need. This process is carried out by business compliance officers, risk operations officers as well as data stewards. Data monetization is evident through data analytics. According to [4], the value characteristic of big data is evident through the utilization of analytics to foster decision making. As a result, using data analytics, decisions can pre-empt the next move organizations should take to obtain or retain a competitive advantage. The more analytics the organization applies to data, the more it fits the definition of big data. In his book, [8] remarks that three components affect the scope of data governance (DG) programs, namely: (1) understanding the business model of your organization, the corporate structure and its operating environment; (2) understanding the content to be controlled, such as data, information, documents, emails, contacts and media, and where it adds the most business value; and lastly, (3) the intent, level and extent to which the content is monitored and controlled. The business model of the bancassurance organization is the partnership of an insurance organization selling its products using the bank's distribution channel. The nature of this business model is that the organization has different business units and as such does not share common information. It is in this instance that [8] suggests that data governance programs be implemented per business unit. The insurance house and the bank hold a significant number of customers shared between the two entities. In isolation, each organization has only a partial customer view and understanding, placing it at a disadvantage in understanding and serving client needs. The two organizations hold data on their respective IT platforms, and differences in the product and customer numbers for shared customers are noted. Content control: Bancassurance, as a role player in the financial services industry, is highly regulated by national as well as international banking industry laws, insurance laws and financial services legislation and regulations such as the Financial Intelligence Centre Act 38 of 2001 (FICA), the Financial Advisory and Intermediary Services Act (FAIS), Anti-Money Laundering, Code of Ethics, Code of Conduct, Protection of Personal Information and National Credit Regulator (NCR) (to name a few). As such, structured data (such as financial transactions and premium payments),
as well as unstructured data (such as emails and insurance policies), may require governance that focuses on identifying the data used and how it should be used. Furthermore, archival and protection of personal information should also be considered. Federation: Federation is a process that involves "defining an entity (the data governance program) that is a distinct blend of governance functions where the various aspects of data governance interact with the organization" [8]. As a result, a clear understanding of the extent to which various governance standards will be applied and implemented across various business units, divisions and departments within an organization is important. The governance processes applied and implemented must be driven by the same organizational motives and involve the same processes.
3 Research Method and Approach
A case study approach was adopted to investigate how data governance was instilled during a big data implementation project. A case study is a heuristic investigation of a phenomenon within a real-life context to understand its intricacies [9]. In this instance, a qualitative research approach was adopted, using data collected from interviews, heuristic observations and secondary documentation to create an in-depth narrative describing how data governance was considered during the big data project.
3.1 Interviews
Due to the nature of bancassurance organizations (the bank and the insurance house), different teams manage big data and, as such, different policies were implemented. As a result, the governance of big data is a joint responsibility, highly dependent on the stakeholders working together toward a common decision-making process and decision-making standards on big data-related matters. There also need to be clearly defined roles and responsibilities in such decisions [10]. To determine which policies and governance standards were implemented in different teams and the logic used, interviews were conducted with various role players with varying data governance responsibilities, as well as interdepartmental teams working with big data such as enterprise information management, business intelligence, customer insights and analytics. A convenience sampling method was adopted as the participants were selected based on stakeholders' availability, geographic location and reachability, as well as the geographical location of the researcher. All the selected participants, as well as the researcher, were based in Johannesburg, South Africa. Johannesburg is part of Gauteng, one of the nine provinces in South Africa and the largest by population with the biggest economic contribution.
Table 1 Summary of interview participants

Job title                               | Job level      | Governance focus | Bancassurance link
CIO for data analytics and AI           | Executive      | Group level      | Bank
Head [divisional] measurement           | Executive      | Localized        | Bank—business unit
Head of customer insights and analytics | Executive      | Group level      | Bank—global
Release train engineer                  | Senior manager | Group level      | Bank
Head of [divisional] data               | Senior manager | Localized        | Insurance
Data engineer                           | Middle manager | Localized        | Insurance
Data scientist                          | Junior         | Localized        | Insurance
Analytics and modeling manager          | Senior manager | Localized        | Insurance
The participants had been working in their respective roles for a minimum of two years (junior role) and five years (senior roles). They were, therefore, all well informed on the topic of data governance. Based on the evidence of current data governance implementation in the organization under study (governance structures implemented), the data governance maturity was classified as in its infancy. A total of eight in-depth semi-structured interviews were conducted: seven interviews in person, which were recorded, and one by email. Table 1 contains a summary of the positions of the various research participants as well as their job level, the level on which they focus with regard to governance-related matters and an indication of their involvement in the bank or insurance-related business units. In-depth, face-to-face interviews were conducted using a predefined interview template. Twenty-five interview questions were grouped into three main focus areas identified during an extensive systematic literature review on the topic of data governance. The focus areas are described as part of the case study section. In instances where the response from participants was not clear, second and third rounds of interviews were conducted. Each interview was scheduled for one hour, and the research participants were invited via a detailed email which provided context and background to the request for their time and explained why they were identified as ideal candidates for this research.
3.2 Heuristic Observations
Regarding observations, [11] mentions that during the planning of an observational research method the researcher should prepare a list of possible outcomes of expected animal behavior based on a list of common behaviors. A similar concept was applied to this study. A prepared sheet with hypothetical questions entailing the hypothetical outcome of each big data governance structure was used, supplemented by a sheet with comments from the observations. Grove and Fisk [12] refer to this observation method as a structured observation, more positivist than interpretivist in philosophy. Due to the limited time frame in which to conclude the study, only a few in-person (direct) live observations were made. Observations were based on the various data governance framework themes identified by the researcher. Organizational interdepartmental teams frequently attend scheduled forums where all issues related to big data, including governance, are discussed. The outputs and decisions taken at these forums are implemented over time and as such override any other decisions taken within the business units. The researcher attended three of these meetings to observe the dynamics in the group as well as to get feedback on data governance-related topics.
3.3 Documents The data obtained from participant interviews and observations were supplemented by various secondary documentation, including but not limited to: documented outputs from the respective discussion forums and divisional meetings focusing on data governance (minutes of the meetings as well as supporting documentation); data architecture documents; data flow diagrams; visualization and information architecture documentation as well as conceptual design documents for data-related concepts (including data storage); consumer and data engineering training plans (for big data); business area enrollment form requesting to become a data consumer; principles and best practices (for data asset management and data scrubbing, data governance structures) as well as metadata management (for data tokenization); progress meetings of data artifact development teams. Thematic analysis was used to evaluate all the documentation, in particular, the transcriptions of the interviews. Atlas.ti was used to substantiate the manual coding process.
4 Case Study
Bank A is currently the second-largest banking financial service provider in South Africa, employing more than 48,000 employees in various branches, with its head office in the economic hub of South Africa (Johannesburg). The bank has an estimated annual revenue of $7.9B (as of December 2019). The bank is currently moving toward a digital economy. As a result, big data interventions have been earmarked and data governance forums established with the mandate to ensure the implementation of the enterprise-wide digitization strategy across all platforms. The strategy is largely influenced by the Protection of Personal Information Act (PoPI), FICA, the Banks Act and FAIS, which encourage financial service providers to provide evidence of their operational ability. The data governance forums in the insurance part of the business highlighted the need for big data governance accountability due to a lack of data-related policies
and standards. As a result, departmental insurance data governance forums were mandated with this task, as they develop, support and maintain their own departmental-level business systems and subsequent data entities. Discussion forums were attended by one exco member and a combination of senior data governance role players. Being in the minority, the exco member had difficulty influencing insurance data requirements discussed during the forums. Additional topics of discussion were data quality and big data analytics. Unfortunately, these forums were discontinued after a while for reasons such as:

Lack of identification and involvement of relevant stakeholders: IT partners were not invited to the discussion forums, although they can assist with the clarification of technological infrastructure and other technology requirements in support of big data information requirements. Key role players such as data stewards and risk and compliance officers were not involved. In some instances, third-party applications were acquired but system owners lacked an understanding of the data elements, including data correctness and validity.

Terminology and requirement clarification: Forum members did not have a clear understanding of the meaning of big data and its attributes or of the business requirements for decision-making purposes, and there was a lack of data lifecycle clarification.

Mandate clarification: The forum was not provided with a clear mandate, and policies and procedures were nonexistent. No key success factors were established, leaving the discussion forums without a clear goal.

Roles and responsibilities: Data governance forum attendees, including data stewards and stakeholders, were not clear on what their tasks, duties, roles and responsibilities were.

Benefit clarification: Forum participants did not have insight into the anticipated monetization of big data implementations, for example, the cost of change and the potential revenue to be gained from just-in-time analytics.

Training: There was no formal training available for forum participants to educate them on the key aspects of data-related topics such as data sharing and data quality checking processes applicable to insurance.

Existing big data governance documentation, outside of the discussion forum, was consulted to shed light on the planned big data intervention as part of the digitization project. Surprisingly, a lot of work had already been done on drafting and implementing big data governance structures, such as data quality checkpoints. This indicates an awareness of big data governance at the group executive level of both the bank and the insurance organization. Unfortunately, this was not transparent enough to raise awareness among forum participants. Interview questions focused on three main focus areas identified in current academic literature, namely: (1) basic, foundation-level big data governance elements that should be implemented to support enterprise-wide data governance (referring to big data policies and standards, ethical prescriptions, data governance chain of command and organizational hierarchies, auditing checks and balances and lastly
storage structures) [6, 10, 13–15]; (2) data governance elements focusing on data quality (which refer to data dictionaries and data libraries, metadata and processes followed to ensure data quality) [5, 10, 14–17]; and (3) the adoption of big data governance guidelines and frameworks (pertaining to the data lifecycle, and safeguarding of data, including during interoperability procedures that will maximize the big data value proposition) [5, 8, 13, 15–19]. These three focus areas were used to describe the findings of the research.
4.1 Focus Area 1: Data Governance Foundational Elements
At the executive level, the majority of participants at both the banking and insurance functional levels indicated that they were aware of big data governance structures that were in place. Linked to the organizational structure, one executive mentioned that he was unsure whether the existing governance policies were specific to the handling of big data or to data in general. At a lower functional level, engineers were not aware of data governance policies, while one out of three senior managers shared a similar view: "yes, having a group-wide forum where having a set data governance standards. A framework have been put together on how our data is governed as an organization. As a business unit, what the rest of the group are doing is being adopted. This was drafted as a document after functional business unit inputs and shared as the organizational standard".

Organizational data governance structures: Executives confirmed that they were aware of the current group data governance structure. This could be attributed to their seniority level in the organization, which makes them privy to the group's structures. Senior managers shared the same view, adding that "there is an executive data office (EDO) which governs the usage of data". Another senior manager added that an enterprise data committee (EDC) was formed: "individual business units have their own data forums and own data committee. This is mandated from the EDC. Whatever have been agreed upon can be taken up to EDC for noting". Importantly, data stewards, acting as business unit request coordinators and business unit representatives, play an integral part in the data committees. Interestingly, junior interview participants were not aware of these structures.

Key data stakeholders: Executive-level interview participants highlighted that a lot of interest has been shown lately in data-related projects and, as a result, many participants volunteered to become part of the big data project. Unfortunately, these volunteers were not necessarily skilled in the area of data science and therefore lacked the skills of data scientists. However, various data-related roles existed within the organization, namely data analysts, information management analysts, domain managers, data stewards, provincial data managers and data designers. Executive participants alluded that "…specific roles are critical in driving this contract [Big data project], e.g., data engineers and data stewards in the information management space". Apart from these resources, business stakeholders were also included in data governance
forums and were of vital importance: "…the owner of the data is the one who signs off on the domain of the data. A business will thus be the owner in that domain of that data. For example, head of the card division will be the domain holder of credit card data". In contrast, one executive claimed that all participants were data practitioners (and not business stakeholders). All senior managers, as well as the middle manager and the junior resource, agreed that stakeholders with specific roles were invited to the data governance forums. Although these roles were involved in the forums, the resources fulfilling them were not sure of what was expected of them. Other roles included in the forums were the "support casts" (as postulated by [20]), such as IT representatives, compliance officers, IT security and business analysts.

Data storage: Participants acknowledged the importance of the ability of current technologies, as part of the IT infrastructure, to cater for organizational data storage and easy retrieval. Importantly, one of the executives mentioned that regulatory and compliance data storage requirements might differ from operational, business unit requirements: "The analytics of financial markets comes in quick and fast and this requires a reliable storage system… From an analytics perspective, you would need as much of historic data as possible, way beyond the five years". The current adequacy of organizational data storage was debated among research participants. One advised that although the current technologies do cater to the storage needs of the organization, they can be improved. Other participants indicated that the storage facilities and infrastructure were adequate as they adhered to regulatory and compliance prescriptions. However, value can still be derived from stored data that is not necessarily required to meet regulatory requirements; this can be meaningful in the analysis of current customer retention. It was also felt that the organization should focus more on the processes prescribed to store data and not necessarily on data storage technology ability. Junior resources indicated their frustration with delays when requesting data, mainly attributed to processes.
4.2 Focus Area 2: Data Quality
All interview participants believed that a data dictionary for each business unit is of vital importance. This was attributed to the nature of the bancassurance organization (the difference in the meaning of data entities between the two entities). Data dictionaries were "living documents" authored by data stewards and referred to as an "information glossary". A data reference tool, currently under development at the insurance group level, is being created to assist the business in updating the glossary. This will, in particular, assist with the need to include data source-to-target mapping requirements. An important feature of Bank A's current big data implementation strategy is the focus on metadata. Metadata existed for structured and (particularly) semi-structured data in different toolsets. Despite its availability, additional meaning was not derived from data entities. For example, referring to a personal identity
number: "knowingly that it's an id number but if they derive age and date of birth … that level of maturity is not there yet. It's on the cards". No metadata existed for unstructured data. The senior managers all concurred that metadata was available for the different types of data content at both group and business unit level. The biggest concern was that metadata was not frequently updated. The junior resource mentioned that they had not been exposed to unstructured metadata and as such believed it did not exist. The resource suggested that this could be due to the large volume of insurance data.

On the topic of big data quality, executives mentioned that a data quality process exists that ensures data quality throughout the data lifecycle. One of the executives added that "there is no one size fits all", referring to the standard data quality process across business functions, but that measures or weightings applied to data elements might differ. Research participants did not concur when asked about data quality. Although raw data was thoroughly checked during the data generation phase (during customer interaction), not enough data checks were performed after the acquisition phase. Junior resources who actively engaged with the data felt that the data quality checks performed during the data generation phase were insufficient, as they had to frequently perform data cleanup tasks. Research participants agreed on the existence and enforcement of data checks to ensure data accuracy and completeness. The process was seen as "fail-proof and well executed". For example, a clear set of criteria was identified against which datasets were checked, including data models. Tools and infrastructure technologies from the IT partners have been employed to assist with third-party data sources. Additional data checks are performed whereby personal data (e.g., national identity numbers) are verified against the central government database and vehicle details are verified against the national vehicle asset register (NaTIS system). Trend analysis is also used to perform data validity checks. For example, if funeral sales were 100 with a general 10% monthly increase, and a sudden monthly increase of 40% occurs with no change in business strategy or business processes, the dataset is flagged.

Communication and data management: To manage changes to datasets, the respondents highlighted that various tools were used to communicate changes and the introduction of new datasets, mainly by data stewards. Only data owners are included in the communication process at various stages of data lifecycle management, on an event management basis. Should there be a need, data stewards will escalate data-related matters to higher levels of governance such as the information governance forum. Not surprisingly, research participants felt that the data management process is far from perfect as it is reactive in nature: only when there is an impact on business or on systems is communication effected.

Big data policies and standards (audit checks): The organization is forced to apply strict auditing checks as prescribed by industry compliance regulations. As a result, respondents indicated that they have employed "…. an information risk partner, compliance and risk and legal. Depends on how frequently the data is used, they'll either audit you every quarter or once a year. There's also internal and external audit initiatives". Another executive, in charge of data analytics, mentioned frequent, "periodic deep sticks" into sample datasets. Furthermore, it was highlighted that Bank A also leverages the data supporting cast, such as IT security, to run risk checks on the data. Apart from infrequent, ad hoc data checks, the compliance criteria were programmed into reporting artifacts such as dashboards. An interesting finding was that most data practitioners were aware of the policies and standards, but business stakeholders lacked knowledge on the topic.
4.3 Focus Area 3: Guidelines, Frameworks and Value Proposition
Focusing on the issue of big data privacy and security, the majority of executives explained that there were adequate measures to guard against unauthorized access to data. The general notion was that the financial services industry is very mature when dealing with data due to strong regulatory prescriptions on data being handled in real time, near real time and in batches. One of the three executives, however, highlighted that although the measures were in place and validated in conjunction with IT partners, he was not sure that the measures were sufficient. One of the senior managers questioned the adequacy of access to internal data as well as the access to bank data available to third parties. According to this senior manager, internal data access is adequate; however, third parties have been lacking in adherence to the principles set out to safeguard the bank's data against unauthorized access. The rest of the senior managers concurred that measures were in place and maintained across the entire organization, but that the competence and capability of those measures are sometimes inadequate. Junior staff members supported this viewpoint and elaborated that measures are prone to human discretion in granting access to data. As a result, predefined quality checkpoints can be ignored. The middle manager felt that, at the localized level (insurance), adequate measures were communicated to external parties who request access to data.

Big data interoperability: Executives indicated that terms of reference have been agreed at the Bank's group architecture level for sharing data. Predefined methods and procedures exist for the sharing of data to ensure data integrity during the movement process. One of the executives did, however, caution that even though data is securely transported between landing areas, the integrity of the data is sometimes compromised between the landing areas. Research participants were confident that data security and integrity measures were successfully employed when sharing data. Training was provided to data users as well as the senders of the data. Software security tools were approved by group information security to ensure that data was not compromised during live streaming. In addition, "failsafe" methods are currently being developed. Apart from this, additional sign-off procedures were employed at each stage of data movement, which ensures integrity and safe transportation. This can also be attributed to the source
to target mapping exercise that is done, which sets a baseline on what to expect from the source data as well as thereafter.

Big data analytics access: Only one of the executives mentioned that their role did not require access to data analytics. The other executives as well as the senior managers indicated that data analytics was important to them and that they therefore drive business unit level data analytics interventions. The junior and middle managers confirmed that they have access to data analytics and employ data mining and analytical tools like SAS. Data analytics was used to predict customer behavior, based on a large historical dataset, and to identify fraudulent activities. Prescriptive analytics was still in its infancy. An example of prescriptive analytics in short-term insurance would be Google's driverless car: knowing that there will be no driver as part of the underwriting questions, what would be the next course of action? Similarly, a number of algorithms provide input to Google's self-driving car to determine the next course of action it needs to take, i.e., take a short left turn, slow down for a sharp curve ahead, pick up speed when going up the mountain, etc. The general look and feel of data visualization artifacts is governed by a data visualization committee. This committee provides guidance and standard practice in the design of dashboards; the visualization tools to be used, as well as who needs to use them, are discussed at the guilds.

Scope of big data: Currently, the scope of big data intervention projects is clearly defined. Senior-level research participants remarked that there is no need for data governance in instances where the business unit can attend to its big data request locally, supported by the data steward and IT. Only in instances where there is "inter-sharing of data including from 3rd parties, then the Governance process will kick in". For enterprise-level big data projects, the scope of the projects was defined at the executive data office level.

Business value: Research participants indicated that the value of (correct) data is measured as an information asset. Subsequently, it is important to understand the objective of data analysis. For example, does the organization want to increase revenue considering existing customer trends, or save costs by predicting fraudulent activities? Quality data drives meaningful decisions. One of the executives mentioned that, looking at the output of the data, a four-dimension matrix comes into play: "(1) I'm expecting an outcome and there's an outcome (2) I'm not expecting an outcome but there's an outcome (3) I'm expecting an outcome but I'm seeing something different (4) I'm not expecting an outcome and there is no outcome. …Measuring the value of data looks at the quantum of the opportunity that sits on the data." Another executive highlighted that measuring the value of data is done through prioritization of initiatives within the business or the organization: "to ensure that it works on the most impactful drivers of the organization is needed".
5 Summary of Findings and Conclusion
An academic literature review on the topic of data governance and big data highlighted three main data governance focus areas that should be considered in the implementation of big data projects. These three focus areas were used in an in-depth case study to identify the data governance elements that should be considered by bancassurance organizations when implementing big data projects.

Focus area one, in general, highlighted the fact that the current data governance structures in the bancassurance organization under study did not cater to big data interventions per se but to data in general. It therefore seems as if some unique elements of the planned big data intervention might be missed. Research [16] has indicated that big data interventions might need special attention and definitional clarification; a lack thereof can have a huge effect on focus area three, the value proposition of big data implementations. Data governance specifications also need to cater for the unique characteristics of big data such as volume, velocity and variety.

Focus area two indicated that formal education and training should be included as a formal structure, or at the very least as part of the communication decision domain, within the big data governance structures. This is because business stakeholders required to attend the data governance structures are either new to the business, new to the role or simply not aware of the importance of having big data governance structures in place. Education and training on big data via "big data master classes" and "insurance data sharing" sessions held by the bancassurance organization under study are a stepping stone toward making every stakeholder working with data aware of their role in decision making about the data asset. The importance of clarifying the different data governance roles and responsibilities, and the required educational background, was highlighted by extensive research by Seiner [20]. The researcher also noted that most big data governance structures have been adopted for structured data but not for unstructured data. Metadata for unstructured data is nonexistent and, as such, the management of unstructured data is pushed over to another guild referred to as "records management". It also proves difficult to apply big data quality processes to unstructured data due to its nature. Thus, a lot more work will need to be put in to ascertain the standardized processes required to govern unstructured data. Governance structures ensuring the data quality of current structured and semi-structured data were well enforced and adequate; the need for quality assurance of unstructured datasets remained.

The researcher noted some limitations in this study, as the content dealt with is highly classified and several governance processes had to be followed to obtain it. At some point, only one contact at the executive level was used to verify the accuracy of the data obtained. The availability of research participants was also limited as they are based in different buildings; as such, non-face-to-face meetings were held with most of them in the interest of time.

Finally, focus area three highlighted adequate data interoperability governance structures. Although research participants took cognizance of the value of big data,
no evidence of such value calculations at group level (both banking and insurance) could be found. An unintentional finding of the research was the set of reasons for the failure of the data governance discussion forums. It would be interesting to investigate this matter in future research work.
References

1. Data Management Association (2009) The DAMA guide to the data management body of knowledge (DAMA-DMBOK Guide). Technics Publications
2. Rowley J (2007) The wisdom hierarchy: representations of the DIKW hierarchy. J Inf Sci 33:163–180
3. Munshi UM (2018) Data science landscape: tracking the ecosystem. In: Data science landscape: towards research standards and protocols. Springer, Singapore
4. Chen M, Mao S, Liu Y (2014) Big data: a survey. Mob Netw Appl 19:171–209. https://doi.org/10.1007/s11036-013-0489-0
5. Ghasemaghaei M, Calic G (2019) Can big data improve firm decision quality? The role of data quality and data diagnosticity. Decis Support Syst 120:38–49
6. Al-Badi A, Tarhini A, Khan AI (2018) Exploring big data governance frameworks. Procedia Comput Sci 141:271–277. https://doi.org/10.1016/j.procs.2018.10.181
7. Elkington W (1993) Bancassurance. Chart Build Soc Inst J
8. Ladley J (2012) Data governance: how to design, deploy, and sustain an effective data governance program. Morgan Kaufmann, Elsevier
9. Yin RK (2014) Case study research design and methods. Sage
10. Kuiler EW (2018) Data governance. In: Schintler LA, McNeely CL (eds) Encyclopedia of big data, pp 1–4. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-319-32001-4_306-1
11. Ice GH (2004) Technological advances in observational data collection: the advantages and limitations of computer-assisted data collection. Field Methods 16:352–375
12. Grove SJ, Fisk RP (1992) Observational data collection methods for service marketing: an overview. J Acad Mark Sci 20:217–224
13. Soares S (2014) Data governance tools: evaluation criteria, big data governance, and alignment with enterprise data management. MC Press Online
14. Mei GX, Ping CJ (2015) Design and implementation of distributed data collection management platform. Presented at the 2015 international conference on computational intelligence and communication networks, Jabalpur, India, 12 Dec 2015
15. Ballard C, Compert C, Jesionowski T, Milman I, Plants B, Rosen B, Smith H (2014) Information governance principles and practices for a big data landscape. IBM Redbooks
16. Al-Badi A, Tarhini A, Khan AI (2018) Exploring big data governance frameworks. In: Procedia computer science, pp 271–277
17. Khatri V, Brown CV (2010) Designing data governance. Commun ACM 53:148–152
18. Almutairi A, Alruwaili A (2012) Security in database systems. Glob J Comput Sci Technol Netw Web Secur 12:9–13
19. Davenport TH, Dyche J (2013) Big data in big companies
20. Seiner R (2014) Non-invasive data governance: the path of least resistance and greatest success. Technics Publications, USA
A Contextual Model for Information Extraction in Resume Analytics Using NLP's Spacy Channabasamma, Yeresime Suresh, and A. Manusha Reddy
Abstract Unstructured documents like resumes come in different file formats (pdf, txt, doc, etc.), and there is also a lot of ambiguity and variability in the language used in a resume. Such heterogeneity makes the extraction of useful information a challenging task and gives rise to the urgent need to understand the context in which words occur. This article proposes a machine learning approach to phrase matching in resumes, focusing on the extraction of special skills using spaCy, an advanced natural language processing (NLP) library. It can analyze and extract detailed information from resumes like a human recruiter. It keeps a count of the matched phrases while parsing in order to categorize persons based on their expertise. The decision-making process can be accelerated through data visualization using matplotlib, and a relative comparison of candidates can be made to filter them. Keywords NLP · spaCy · Phrase matcher · Information extraction
1 Introduction
While the Internet has taken up the most significant part of everyday life, finding jobs or employees on the Internet has become a crucial task for job seekers and employers. It is vastly time-consuming to store millions of candidates' resumes in an unstructured format in relational databases, and doing so requires a considerable amount of human effort. In contrast, a computer which parses candidate resumes has to be constantly trained and must adapt itself to deal with the continuous expressivity of human language.
Channabasamma (B) · A. Manusha Reddy VNRVJIET, Hyderabad 500090, India e-mail: [email protected] A. Manusha Reddy e-mail: [email protected] Y. Suresh BIT&M, Ballari 583104, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_30
Resumes may be represented in different file formats (pdf, txt, doc, etc.) and also with different layouts and contents. Such diversity makes the extraction of useful information a challenging task. The recruitment team puts a lot of time and effort into parsing resumes and pulling out the relevant data. Once the data is extracted, matching of resumes to the job descriptions is carried out appropriately. This work proposes a machine learning approach to phrase matching in resumes, focusing on the extraction of special skills using spaCy [1], an advanced natural language processing (NLP) library. It can analyze and extract detailed information from resumes like a human recruiter. It keeps a count of the matched phrases while parsing in order to categorize persons based on their expertise. It improves recruiters' productivity, reduces the time spent in the overall candidate selection process and improves the quality of selected candidates. The rest of the paper is organized as follows: Section 2 highlights the literature survey, Sect. 3 presents the proposed work of extracting relevant information from unstructured documents like resumes, Sect. 4 discusses the implementation and the obtained output, and Sect. 5 concludes the work with the scope for future research.
2 Literature Survey
This section summarizes the contributions made by various researchers toward the extraction of relevant information for resume parsing. Sunil Kumar introduced a system for automatic information extraction from resumes. Techniques for pattern matching and natural language processing are described to extract relevant information. Experimental results have shown that the system can handle different formats of resumes with a precision of 91% and a recall of 88% [2]. Papiya Das et al. proposed an approach to extract entities to obtain valuable information. The R language with natural language processing (NLP) is used for the extraction of entities. In this paper, the authors briefly discussed the process of text analysis and the extraction of entities using different big data tools [3]. Jing Jiang worked on information extraction and described two fundamental tasks: named entity recognition (NER) and relation extraction. The NER concept is to find names of entities, for instance, people, locations and organizations. A named entity is often context dependent. Relation extraction is aimed at finding semantic relations among entities [4]. Harry Hassan et al. introduced an unsupervised information extraction framework based on mutual reinforcement in graphs. This framework is mainly used for acquiring the extraction patterns for content extraction, relation detection and the characterization task, which is one of the difficult tasks in the process of information extraction due to inconsistencies in the available data and the absence of large amounts of training data. This approach achieved greater performance when compared with supervised techniques with reasonable training data [5].
Sumit Maheshwari et al. identified resumes by analyzing "skills"-related information. Their approach improved the performance of resume processing by extracting the required special skill types and even special skill values. It is observed that the recruiting team can reduce 50–94% of the features while selecting an appropriate resume [6]. Chen Zhang et al. proposed ResumeVis, a visualization system to extract and visualize resume data. A text mining method is presented to extract semantic information, and a set of visualizations is then presented to show the semantic information from various perspectives. This paper focused on accomplishing the following tasks: tracing out individual career development paths, mining underlying social relations among individuals and representing the combined mobility of enormous numbers of resumes. It is observed that the system has been effectively demonstrated on several government officer resumes [7]. Eirini Papagiannopoulou et al. presented keyphrase extraction methods using supervised and unsupervised learning. Keyphrase extraction is the task of textual information processing which deals with the automatic extraction of characteristic and representative phrases from a document that express all the key aspects of its content. According to this paper, among the listed unsupervised methods, graph-based methods work better for short documents, while statistical methods work better for long documents [8, 9]. Khaled Hasan et al. focused on an in-depth approach for automatic keyword extraction from Bangla scientific documents employing a hybrid method, which utilizes both unsupervised and supervised machine learning models. The hybrid approach exhibited 24 and 28% improvements over two existing approaches (a neural-based approach and an approach using part-of-speech tagging along with a named entity recognizer, respectively) for the extraction of five keywords [10]. Xavier et al. evaluated and compared five named entity recognition (NER) software packages (StanfordNLP, OpenNLP, NLTK, SpaCy and Gate) on two different corpora. NER plays a major part in finding and classifying the entities in NLP applications [11]. It has been observed that past research done on resumes generally focused on information extraction from resumes to filter millions of resumes down to a few hundred potential ones. If these filtered resumes are similar to each other, examining each resume to identify the right candidates becomes challenging. For a given set of similar resumes, none of the above approaches tried to extract special skills and to keep a count of them. In our work, an effort has been made to extend the notion of special skills related to a particular domain like data science, machine learning, R language, big data, etc., to improve the recruiter's productivity, to reduce the time spent in the overall candidate selection process and to improve the quality of selected candidates.
3 Methodology
3.1 Natural Language Processing (NLP) Background
NLP is an artificial intelligence (AI) technique which allows software to understand human language, whether spoken or written. The resume parser works on the keywords, formats and pattern matching of the resume. Hence, resume parsing software uses NLP to analyze and extract detailed information from resumes like a human recruiter. The raw data needs to be preprocessed by the NLP algorithm before the subsequent data mining algorithm is used for processing. The NLP algorithmic process involves various sub-tasks such as tokenization of raw text, part-of-speech tagging and named entity recognition.

Tokenization: In this process, the text is first tokenized into small individual tokens such as words and punctuation. This is done by the implementation of rules specific to each language. Based on the specified pattern, the strings are broken into tokens using regular expressions. The pattern used in this work (r'\w+') removes the punctuation during data processing. The .lower() function can be used in a lambda function to convert the text to lowercase.

Stopword Removal: Stopwords are a group of frequently used words in a language. English, for example, has several stop words such as "the", "a", "is", "are", etc. The idea behind removing these kinds of stop words is that removing low-information words from the text allows the focus to fall on the more important words. spaCy has inbuilt stopwords. Depending on the context, the stopword list can be modified; for example, in sentiment analysis the words "not" and "no" are important to the meaning of a text such as "not good".

Stemming and Lemmatizing: Both stemming and lemmatization shorten words in the text to their root form. Stemming is the process of decreasing or removing the inflection in words to reach their root form (for instance, performs/performed to perform). In this case, the "root" might not be a true root word, but simply a canonical form of the original word. It strips affixes from words based on common patterns. In some cases it is helpful, but not always, as the resulting word may lose its actual meaning. Lemmatization is the process of converting a word into its base form, for example, "caring" to "care". spaCy's lemmatizer has been used to obtain the lemma (base) form of the words. Unlike stemming, it returns an appropriate word that can easily be found in the dictionary.

Part-of-speech tags and dependencies: After the process of tokenization, spaCy will parse and provide the tags for a given document. At this point, statistical models are used, which enable spaCy to predict the label or tag that most likely applies in the context. A model consists of binary data; it is shown enough good examples to make predictions that generalize across the language, say, that a word following "the" in English is most of the time a noun.
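As a minimal illustration of these preprocessing steps, the hedged sketch below uses the small English model en_core_web_sm; the sample sentence and variable names are hypothetical and are not taken from the paper's dataset.

```python
import spacy

# Assumes the small English model has been installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

text = "Performed sentiment analysis and built NLP pipelines using spaCy."
doc = nlp(text)

# Tokenization, lemmas, part-of-speech tags and stopword flags per token
for token in doc:
    print(f"{token.text:<12} lemma={token.lemma_:<10} pos={token.pos_:<6} stop={token.is_stop}")

# Lowercased, lemmatized tokens with stopwords and punctuation removed
cleaned = [t.lemma_.lower() for t in doc if not (t.is_stop or t.is_punct)]
print(cleaned)
```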
Named entity recognition (NER): NER is possibly the first step in information extraction; it identifies and classifies the named entities in a document into a set of pre-defined categories such as person names, expressions of time, locations, organizations, monetary values, quantities and percentages [11]. The more accurate the recognition of named entities as a preprocessing step, the more information on relations and events can be extracted. There are two types of NER approaches, a rule-based approach and a statistical approach [4], and a combination of both (hybrid NER) has also been introduced. The hybrid approach provided a better result than relying only on the rule-based method in recognizing names [12, 13]. The rule-based approach defines a set of rules that determines the occurrence of an entity along with its classification. Ontologies are also used to represent clusters of relatively independent categories. These systems are most useful for specialized entities and categories that have a fixed number of members; the quality of the rules determines performance. Statistical models use supervised learning built on very large training sets of data in the classification process. Algorithms use real-world data to apply rules, and the rules can be learnt and modified. The learning process can be accomplished in a fully supervised, unsupervised or semi-supervised manner [14, 15].
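A hedged sketch of statistical NER with spaCy's pretrained pipeline is shown below; the sample sentence is hypothetical, and the exact entity labels returned depend on the model used.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Ravi Kumar worked at Robert Bosch in Bangalore from 2016 to 2019.")

# Each entity carries a text span and a predicted label such as PERSON, ORG, GPE or DATE
for ent in doc.ents:
    print(ent.text, "->", ent.label_)
```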
3.2 The Approach

There is a lot of ambiguity and variability in the language used in a resume. A person's name can also be an organization name (e.g., Robert Bosch) or an IT skill (e.g., Gensim). Such heterogeneity makes the extraction of useful information a challenging task and gives rise to the need to understand the context in which the words occur. Semantics and context play a vital role while analyzing the relationship between objects or entities. Extracting the relevant data from unstructured data is the most difficult task because of its complexity and quality. Hence, semantically and contextually rich information extraction (IE) tools can increase the robustness of IE from unstructured data.

The Problem: In many scenarios, running a CV parser over a person's resume and searching for data analytical skills will help to find candidates with knowledge of data science. The parsing fails if the search is more specific, for example, a Python developer who is good at server-side programming, has good NLP knowledge of a particular development environment, and has built software systems in the healthcare domain. This is because the parsing of job descriptions and resumes does not extract quality data from unstructured information.

The Solution:
• Have a table or dictionary which covers various skill sets, categorized.
• An NLP algorithm to parse the entire document and search for the words mentioned in the table or dictionary.
• Count the occurrences of the words belonging to the different categories.

In the proposed system, spaCy [1], an advanced natural language processing (NLP) library, is used. It can analyze and extract detailed information from resumes much as a human recruiter would.
4 Experimental Results and Discussions

To evaluate the information extraction system, 250 resumes with different formats and templates, downloaded from Kaggle, have been considered in this work. These are parsed with the advanced NLP library spaCy, which has a feature called "Phrase Matcher" and can analyze and extract detailed information from resumes after preprocessing. When raw text is fed as input, spaCy tokenizes it, processes the text, and produces a Doc object. The Doc is then processed in several different steps, known as the processing pipeline. The pipeline depicted in Fig. 1 includes a tagger, a parser, and an entity recognizer (detailed in Sect. 3). The Doc output of each phase of the pipeline is fed as input to the next component. The proposed model is designed as shown in Fig. 2; the steps to implement the module are as follows:

1. A dictionary is created (Table 1) that includes various skill sets categorized from different domains. The list of words under each category is used to perform phrase matching against the resumes.
2. Documents are parsed by the spaCy "Phrase Matcher".
3. It in turn parses the entire document to search for the words listed in the table or dictionary.
4. The matching phrases are found.
5. The frequency of the words of the different categories is then counted.
Fig. 1 NLP pipeline
Fig. 2 Proposed design
6. Data Visualization: Matplotlib is used to represent the above information visually so that it becomes easy to choose the candidate (see Fig. 3). In Fig. 3, it can be seen that two candidate resumes satisfy the job requirements for a data scientist.
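A minimal Python sketch of this matching-and-counting pipeline is given below. The skill dictionary is abbreviated from Table 1, and the resume text is a hypothetical stand-in for a parsed document; the model name en_core_web_sm is an assumption.

import spacy
from spacy.matcher import PhraseMatcher
from collections import Counter
import matplotlib.pyplot as plt

nlp = spacy.load("en_core_web_sm")

# Abbreviated version of the Table 1 dictionary: category -> skill phrases
skills = {
    "Statistics": ["probability", "hypothesis testing", "markov chain"],
    "Machine learning": ["linear regression", "random forest", "svm"],
    "NLP": ["named entity recognition", "word2vec", "spacy"],
}

matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
for category, phrases in skills.items():
    matcher.add(category, [nlp.make_doc(p) for p in phrases])

resume_text = "Built NLP models with spaCy and word2vec; applied random forest and SVM."
doc = nlp(resume_text)

# Count how many matched phrases fall under each skill category
counts = Counter(nlp.vocab.strings[match_id] for match_id, start, end in matcher(doc))
print(counts)

# Bar chart of category frequencies, as in Fig. 3
plt.bar(list(counts.keys()), list(counts.values()))
plt.ylabel("Matched skill count")
plt.show()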
4.1 Merits

1. The code automatically opens the documents and parses the content.
2. While parsing, it keeps a count of phrases for easy categorization of candidates based on their expertise.
3. The decision-making process can be accelerated using data visualization.
4. A relative comparison of candidates can be made to filter the job applicants.
Table 1 Dictionary of the various skills for different categories

Statistics: statistical models, statistical modeling, probability, normal distribution, poisson distribution, survival models, hypothesis testing, bayesian inference, factor analysis, forecasting, markov chain, monte carlo
Machine learning: linear regression, logistic regression, K means, random forest, xgboost, svm, naive bayes, pca, decision trees, svd, ensemble models, boltzman machine
Deep learning: neural network, keras, theano, face detection, convolutional neural network (cnn), recurrent neural network (rnn), object detection, yolo, gpu, cuda, tensorflow, lstm, gan, opencv
R language: R, ggplot, shiny, cran, dplyr, tidyr, lubridate, knitr
Python language: python, flask, django, pandas, numpy, scikitlearn, sklearn, matplotlib, scipy, bokeh, statsmodel
NLP: nlp, natural language processing, topic modeling, lda, named entity recognition, pos tagging, word2vec, word embedding, lsi, spacy, gensim, nltk, nmf, doc2vec, cbow, bag of words, skip gram, bert, sentiment analysis, chat bot
Data engineering: aws, ec2, amazon redshift, s3, docker, kubernetes, scala, teradata, google big query, aws lambda, aws emr, hive, hadoop, sql
Fig. 3 Visualization of shortlisted candidates with the frequency count of special skills
5 Conclusion

Heterogeneity in unstructured data such as resumes makes the extraction of useful information a challenging task. Once the relevant data has been extracted, the process of matching skills from resumes to job descriptions can be carried out easily. This paper explores previous approaches to extracting meaningful information from unstructured documents and provides the background needed for a basic understanding of NLP. Finally, it presents how documents are parsed by an advanced NLP library, spaCy, which has a feature called "Phrase Matcher." spaCy parses the entire document to search for the words listed in the table or dictionary, and the occurrences of words of the various categories are then counted. For data visualization, Matplotlib is used to represent the information visually so that it becomes easy to choose a candidate. With the rapid growth of big data on the Internet, further research on information extraction can deal with noisy and diverse text. The work can be extended further by generating recommendations for the shortlisted candidates to avoid deviation from the predicted metrics.
References 1. https://spacy.io/usage/spacy-101. Last accessed 2020/3/15 2. Sunil Kumar K (2010) Automatic extraction of usable information from unstructured resumes to aid search. 978-1-4244-6789-1110/$26.00©2010 IEEE 3. Das P, Pandey M (2018) Siddharth Swarup Rautaray: A CV parser model using entity extraction process and big data tools. Int J Inf Technol Comput Sci (IJITCS) 10(9):21–31. https://doi.org/ 10.5815/ijitcs.2018.09.03 4. Jiang J (2012) Information extraction from text. In: Aggarwal CC, Zhai CX (eds) Mining text data. https://doi.org/10.1007/978-1-4614-3223-4_2 © Springer Science+Business Media, LLC 2012
5. Hassan H et al (2006) Unsupervised information extraction approach using graph mutual reinforcement. In: Proceedings of the 2006 conference on empirical methods in natural language processing (EMNLP 2006), pp 501–508, Sydney, July 2006. c 2006 Association for Computational Linguistics 6. Maheshwari S et al (2010) An approach to extract special skills to improve the performance of resume selection. In: DNIS 2010, LNCS 5999, pp. 256–273, z2010. _c Springer-Verlag Berlin Heidelberg 7. Zhang C, Wang H (2018) ResumeVis: a visual analytics system to discover semantic information in semi-structured resume data. ACM Trans Intell Syst Technol 10(1), Article 8, 25 p. https://doi.org/10.1145/3230707 8. Papagiannopoulou E, Tsoumakas G (2019) A review of keyphrase extraction. WIREs Data Mining Knowl Discov 2019:e1339. https://doi.org/10.1002/widm.1339 9. Vijayakumar T, Vinothkanna R () Capsule Network on Font Style Classification.J Artif Intell 2(02):64–76 10. Sazzad KH et al (2018) Keyword extraction in Bangla scientific documents: a hybrid approach. Comput Sci Eng Res J 11. ISSN: 1990-4010 11. Schmitt X et al (2019) A replicable comparison study of NER software: StanfordNLP, NLTK, OpenNLP, SpaCy, Gate. 978-1-7281-2946-4/19/$31.00 ©2019 IEEE 12. Balgasem SS, Zakaria LQ (2017) A hybrid method of rule-based approach and statistical measures for recognizing narrators name in Hadith. In: 2017 6th international conference on electrical engineering and informatics (ICEEI), Langkawi, pp 1–5 13. Kumar, Saravana NM (2019) Implementation of artificial intelligence in imparting education and evaluating student performance.J Artif Intell 1(01):1–9 14. Zaghloul T (2017) Developing an innovative entity extraction method for unstructured data. Int J Qual Innov 3:3. https://doi.org/10.1186/s40887-017-0012-y 15. Nedeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguisticae Investig 30(1):3–26 16. https://towardsdatascience.com/a-practitioners-guide-to-natural-language-processing-part-iprocessing-understanding-text-9f4abfd13e72, 2020/3/15
Pediatric Bone Age Detection Using Capsule Network Anant Koppar, Siddharth Kailasam, M. Varun, and Iresh Hiremath
Abstract Convolution neural network (CNN) is a state-of-the-art method that is widely used in the field of image processing. However, one major limitation of CNN is that it does not consider the spatial orientation of the image. Capsule network, proposed by Geoffrey E. Hinton et al., was an attempt to solve this limitation. However, the architecture was designed for discrete data. This paper modifies the architecture appropriately to make it suitable to work on continuous data. It works on the dataset RSNA Pediatric Bone Age Challenge (2017) (RSNA Pediatric Bone Age Challenge in Stanford medicine, Dataset from https://www.kaggle.com/kma der/rsna-bone-age (2017) [1]) to detect the bone age of the patient from his X-ray, whose maximum age is restricted to 228 months. In order to achieve the purpose, mean squared error (MSE) was used for backpropagation. The 20 most significant outputs were taken from the network to address the problem of diminishing gradients. The results were validated to check if it is biased to an age range. This could be a characteristic for running on continuous data using an architecture that supports the classification of only discrete data. Since the validation held true, one could infer that this network could be more suitable for continuous data than capsule network. Keywords Capsule network · Pediatric bone age · Bone age detection · Continuous data · Convolutional neural network · Neural network · Mean squared error
1 Introduction Convolutional neural network (CNN) is a state-of-the-art method that has been relevant for quite a long time in today’s image processing field. It tries to identify an object by identifying its subcomponents, which are further identified by key points. It has been widely used as it is known to yield very good accuracy. A. Koppar · S. Kailasam (B) · M. Varun · I. Hiremath Computer Science Engineering Department, Engineering Department, PES University, 100 Feet Ring Road, Banashankari Stage III, Dwaraka Nagar, Banashankari, Bengaluru, Karnataka 560085, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_31
However, CNN has its own limitations. One major limitation is that it does not consider spatial information when it tries to detect the object. When a CNN identifies a key point somewhere, it is always considered as a match irrespective of where it found the key point and in what direction it was found. This could hence lead to a few misclassifications. Capsule network [2] tries to address this particular issue. It obtains vectors as an output from each neuron instead of a single intensity value. The neuron then communicates with vectors obtained from other neurons, before it decides on how confidently it was a part of a bigger subcomponent. This way, it ensures that the spatial orientations are taken into consideration. Capsule network [2] was originally designed for discrete data. This paper attempts to modify the architecture so that it could be used for continuous data. It tries to predict the bone age of a patient based on an X-ray of the person’s wrist bone. It tests its validity using the dataset provided by the Radiological Society of North America (RSNA) called RSNA Pediatric Bone Age Challenge [1].
2 Literature Survey The capsule network was introduced in the paper “Dynamic Routing Between Capsules” by Hinton et al. [2]. It performed handwritten digit recognition on the MNIST dataset. The input image is fed into a convolution layer with RELU activation function. The output was normalized by using a safe norm that scales the length of the probability vectors between 0 and 1. Then the vector with the highest estimated class probability was taken from the secondary capsule and the digit class is predicted. The loss function consists of margin loss and reconstruction loss added together with a learning rate alpha. The dataset was split into two parts, where 80% of the dataset was used for training and 20% of the dataset was used for validation. The model achieved a final accuracy of 99.4% on the validation dataset after 10 epochs. This paper was hence ideal to classify discrete data, although it may not be suitable for continuous data. The paper “Capsule Networks and Face Recognition” by Chui et al. [3] talks about performing the task of face recognition by using a capsule network. It used the dataset Labelled Faces in the Wild (LFW) [4]. A total of 4324 images was sampled from the dataset where 3459 were used as training images and 865 as testing images. In another attempt to train the network, the dataset comprised 42 unique faces from a collection of 2588 images with at least 25 faces for every unique person. The traintest split is similar to the way mentioned above. The model achieved an accuracy of 93.7% on the whole test dataset. The paper “Pediatric Bone Age Assessment Using Deep Convolutional Neural Networks” by Iglovikov et al. [5] approaches the problem of pediatric bone age assessment by using a deep convolution neural network where the CNN performs the task of identifying the skeletal bone age from a given wrist bone X-ray image. This paper implements a stack of VGG blocks. It used the exponential linear unit
(ELU) as the activation function. The output layer was appended with a softmax layer comprising 240 classes as there are 240 bone ages which result in a vector of probabilities. A dot product of the probabilities with the age was taken. The model used mean squared error (MSE) as the loss function. This way, CNN was used to predict continuous data. The paper “the relationship between dental age, bone age and chronological age in underweight children” by Kumar et al. [6] talks about the relationship between dental age, bone age, and chronological age in underweight children. It was experimentally proven that a normal female has a mean difference of 1.61 years between the chronological age and the bone age and a mean difference of 1.05 years for males. In addition, the study concludes by saying that bone age and chronological age have a positive correlation with each other which are the maturity indicators of growth. Therefore, any delay incurred between the bone age and the chronological age is an attribute significant to a sample of 100 underweight children. The paper “Bone age assessment with various machine learning techniques” by Luiza et al. [7] talks about the traditional approaches to assess the bone age of the individual as one of the topics among many other topics it talks about. Traditional methods include the Fels method which is based on radio-opaque density, bony projection, shape changes, fusion. It also talks about Fischman technique, which is based on the width of the diaphysis, a gap of the epiphysis, fusion of epiphysis and diaphysis between third and fifth finger. But manual methods are usually very time consuming and prone to a lot of errors as humans are involved.
3 Proposed Methodology This paper proposes an approach based on capsule network [2] to detect the bone age. Capsule network is an object recognition tool that is a modification of a convolutional neural network (CNN) [8]. It imparts an additional property of making it robust to spatial orientation. Capsule network follows a hierarchical approach for detection just like a CNN. For example, in facial recognition of a three-layered network, the first layer may detect the types of curves. The second layer may use the detected curves to identify features such as an eye, a nose, or an ear. The third layer may use these detected subparts and identify a face. The difference lies in the fact that a CNN outputs a single value from each neuron which represents the confidence of the particular shape. The confidence value may or may not be on a scale of one. However, a capsule network outputs a vector which not only represents the confidence, but also the direction in which the shape was identified. The neurons in each layer communicate with each other to decide on the magnitude of the vector. The vector is then passed to the next layer, which tries to see the bigger picture of the image based on the output vectors from the previous layer.
An issue with capsule network is that it has been designed to work only on discrete data. This paper modifies it to detect continuous data, which is the bone age of the wrist X-ray of the given patient.

4 About Capsule Network

Capsule network was proposed by Hinton et al. [2]. It consists of the following layers:

1. Primary capsules: This is a set of convolutional layers that are applied to the image. Each neuron represents a particular shape. The output from this layer is a feature map from which n vectors of m dimensions are derived, where m and n are constants depending on the architectural decision of the user. There is usually more than one convolutional layer in the neural network.
2. Squash function: This acts as an activation layer and imparts nonlinearity to the network so that it could effectively learn from state-of-the-art backpropagation algorithms, which depend on nonlinearity. It is given by the formula

v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \cdot \frac{s_j}{\|s_j\|} \qquad (1)
3. Digit capsule: This is the output layer that gives the probability of occurrence of each value. For example, in handwritten digit recognition, there are 10 digit capsules as there are 10 outputs between 0 and 9. Similarly, this paper proposes to use 228 digit capsules, as the age range of the pediatric bone age dataset given by RSNA [1] is between 1 and 228 months.
4. Decoder: This component tries to reconstruct the original image from the digit capsule. This reconstructed image is used to calculate the reconstruction loss, which is the loss of image data after it passes through the network.

One of the most important features in a capsule network is routing by agreement. After obtaining the output from each convolutional layer in the primary capsules, this operation is performed before the output goes to the next convolutional layer. This enables communication across neurons to see if a feature identified by a neuron has an equivalent feature identified by other neurons in the same layer. Let the output of layer 1 be u_1, u_2, u_3, \ldots, u_n; let the outputs of the next layer be the m vectors v_1, v_2, v_3, \ldots, v_m; and let the weights from u to v be W_{1,1}, W_{1,2}, \ldots, W_{n,m}. The following quantities are obtained

u_{1|1} = W_{1,1} u_1, \quad u_{1|2} = W_{1,2} u_2, \quad \ldots, \quad u_{n|m} = W_{n,m} u_n \qquad (2)
The network includes another set of values b_{1,1}, b_{1,2}, \ldots, b_{n,m} whose ultimate goal is to indicate how the vector outputs of the neurons from the previous layer correlate with the inputs of the neurons of the next layer, based on the other vector outputs of the next layer. These are initialized to the same value at the beginning. The weights c_{1,1}, c_{1,2}, \ldots, c_{n,m} are then calculated by applying a softmax function on the values b_{1,1}, b_{1,2}, \ldots, b_{n,m}:

c_{1,1}, c_{1,2}, \ldots, c_{1,m} = \mathrm{softmax}(b_{1,1}, b_{1,2}, \ldots, b_{1,m})
c_{2,1}, c_{2,2}, \ldots, c_{2,m} = \mathrm{softmax}(b_{2,1}, b_{2,2}, \ldots, b_{2,m})
\vdots
c_{n,1}, c_{n,2}, \ldots, c_{n,m} = \mathrm{softmax}(b_{n,1}, b_{n,2}, \ldots, b_{n,m}) \qquad (3)
The values v_1, v_2, v_3, \ldots, v_m are then calculated using the following formula

v_j = \mathrm{squash}\left( \sum_i c_{i,j} \, u_{j|i} \right) \qquad (4)
Following this, b_{i,j} is updated using the following formula

b^{\mathrm{new}}_{i,j} = b_{i,j} + u_{j|i} \cdot v_j \qquad (5)
The term uj|i . vj talks about how much vj has changed with respect to uj|i . The network is then run again with the new bi,j . This is done for a fixed number of iterations, so that the final vj appears like, all the neurons have communicated with each other to decide the final output vector. The routing by agreement algorithm of a capsule network is given in Fig. 1.
Fig. 1 Routing by agreement algorithm in a capsule network
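The squash nonlinearity of Eq. (1) and the routing loop of Eqs. (2)-(5) can be sketched in a few lines of NumPy, as below. The capsule counts, vector dimensions, and the number of routing iterations are illustrative choices, not the paper's configuration.

import numpy as np

def squash(s, axis=-1, eps=1e-9):
    # v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||), Eq. (1)
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def routing_by_agreement(u_hat, iterations=3):
    # u_hat holds the predictions u_{j|i} with shape (n_in, n_out, dim), Eq. (2)
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                                 # all b_{i,j} start at the same value
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)    # softmax over output capsules, Eq. (3)
        s = (c[:, :, None] * u_hat).sum(axis=0)                 # weighted sum over input capsules
        v = squash(s)                                           # Eq. (4)
        b = b + np.einsum("ijd,jd->ij", u_hat, v)               # agreement update, Eq. (5)
    return v

u_hat = np.random.randn(8, 4, 16)          # 8 input capsules, 4 output capsules, 16-dim vectors
print(routing_by_agreement(u_hat).shape)   # (4, 16)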
5 Proposed Preprocessing Before the image is fed to the neural network, it is always important to ensure that it is in its best form and could be easily understood by the neural network. The preprocessing in this paper has 3 main goals • To identify the edges in the image. • To remove the background as much as possible with minimal loss of data. • To highlight the edges of the wrist bone. Before the goals were achieved through various preprocessing techniques, the image was first resized to a standard size of 1000 * 1000 to ensure that the effect of any technique applied on the image was similar for every image. The edges were then identified using adaptive thresholding. This method identifies the areas of interest based on the intensity of the neighboring pixels. In order to strengthen the working of adaptive thresholding, contrast enhancement was performed, so as to widen the intensity difference between the pixels. This was followed by smoothing, using a Gaussian filter to remove noise and hence reduce the chance of salt and pepper noise in the output of adaptive thresholding. Once to have all the edges, the next aim is to ensure that the background edges such as the frame of the X-ray are removed as much as possible from the image, so that the network can focus on the wrist bone features. This was removed by applying a closing filter on the image using kernels with long horizontal line and vertical line. To cope up with the real-world data, random white spots were added to these kernels. These kernels were applied 10 times on the image, each time with white spots at different places. Following this, the image was converted to grayscale using a Gaussian filter. This could get intermediary values depending on the surrounding. Also, color inversion was performed for human convenience for evaluating the quality of output as humans are generally more accustomed to seeing X-rays as white bone on black background. Hence, one can genuinely see if quality has improved. Following this, contrast is enhanced to ensure a maximum difference between pixels. The image was then sharpened two times to make the edges glow. In between, the image was smoothed using an edge preserving filter. Edge preserving filter is the latest smoothing tool that smoothes the pixels by identifying an edge instead of using the surroundings. Hence, it is ideal in this case. Once the image runs through this pipeline, it is ready to be used by the neural network (Fig. 2).
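A compressed sketch of this preprocessing pipeline using OpenCV is given below. The kernel sizes, threshold block size, and input file name are illustrative assumptions rather than the paper's exact parameters.

import cv2
import numpy as np

img = cv2.imread("wrist_xray.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input file
img = cv2.resize(img, (1000, 1000))                        # standard size used in the paper

# Contrast enhancement and Gaussian smoothing before adaptive thresholding
img = cv2.equalizeHist(img)
img = cv2.GaussianBlur(img, (5, 5), 0)
edges = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                              cv2.THRESH_BINARY, 11, 2)

# Closing with long horizontal and vertical kernels to suppress the X-ray frame
horiz = cv2.getStructuringElement(cv2.MORPH_RECT, (101, 1))
vert = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 101))
edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, horiz)
edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, vert)

# Invert for readability, sharpen the edges, then apply an edge-preserving smoothing filter
edges = cv2.bitwise_not(edges)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
edges = cv2.filter2D(edges, -1, sharpen)
edges = cv2.edgePreservingFilter(cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR))

cv2.imwrite("wrist_xray_preprocessed.png", edges)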
6 Neural Network The neural network is based on capsule network architecture. Capsule network [2] is a modification of convolutional neural network [8] that imparts the property of
Fig. 2 Preprocessing architecture
considering the spatial orientation of the image in addition to the properties provided by a CNN [8]. However, capsule network has been designed to classify discrete data. This paper uses it to predict continuous data. It tries to make sure that the accuracy is not biased toward a particular age range, as is usually the case when a network that classifies discrete data is applied to continuous data. The original capsule network architecture has two loss functions: margin loss and reconstruction loss. The margin loss is given by the following formula

L_k = T_k \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - T_k) \cdot \max(0, \|v_k\| - m^-)^2 \qquad (6)

where
L_k = margin loss
T_k = 1 if k is the correct class, 0 if k is incorrect
m^+ = 0.9
m^- = 0.1
λ is a constant.

The reconstruction loss is a function that indicates how well the network has coded the image to represent the original image. It is based on the output obtained from the decoder. The final loss function is given by

Final loss = margin loss + α × reconstruction loss \qquad (7)
where alpha was the learning constant and was taken as 0.0005 in the original capsule network architecture [2]. In the network, the margin loss function tries to ensure that the network gets as close as possible to the original distribution, while the reconstruction loss tries to ensure that the final layer represents as much information of the original image as possible. In discrete data, like handwriting recognition, when an image of digit 3 is given, recognizing the digit as 4 is as wrong as recognizing it as 5. However, in continuous data, if the original age is 15 months, predicting it as 18 months is much better than predicting it as 30 months. It is hence clear that the goal of the network on continuous data is to get as close to the value as possible, while in the case of discrete data, if it cannot reach the exact value, it does not matter what value is predicted.

Let us now examine the margin loss function. Let the correct value be k. Consider 3 points k - alpha, k - (alpha - gamma), and k + alpha, where 0 < gamma < alpha < k. Most backpropagation algorithms update the network based on the known loss function. Hence, for a particular iteration, when all other coefficients are constant, if the loss function varies, then backpropagation takes different steps. However, if the loss function is the same, the neural network propagates by the same step in the same direction.

Case 1: The predicted value for the iteration is (k - alpha). T_k = 0 as the prediction is incorrect. From Eq. 6,

L_k = T_k \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - T_k) \cdot \max(0, \|v_k\| - m^-)^2
    = 0 \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - 0) \cdot \max(0, \|v_k\| - m^-)^2
L_k = \lambda \cdot \max(0, \|v_k\| - 0.1)^2 \qquad (8)

Case 2: The predicted value for the iteration is (k - (alpha - gamma)). T_k = 0 as the prediction is incorrect. From Eq. 6,
L_k = T_k \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - T_k) \cdot \max(0, \|v_k\| - m^-)^2
    = 0 \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - 0) \cdot \max(0, \|v_k\| - m^-)^2
L_k = \lambda \cdot \max(0, \|v_k\| - 0.1)^2 \qquad (9)
Case 3: The predicted value for the iteration is (k + alpha). T_k = 0 as the prediction is incorrect. From Eq. 6,

L_k = T_k \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - T_k) \cdot \max(0, \|v_k\| - m^-)^2
    = 0 \cdot \max(0, m^+ - \|v_k\|)^2 + \lambda (1 - 0) \cdot \max(0, \|v_k\| - m^-)^2
L_k = \lambda \cdot \max(0, \|v_k\| - 0.1)^2 \qquad (10)
From Eqs. (8), (9), and (10), it is evident that the step is taken in the same direction and with the same magnitude irrespective of how far, or on which side, the predicted value lies. Hence, the convergence to any minima depends on the order in which the data is fed to the network. The problem here is evident: although the margin loss function indicates whether the value is correct or incorrect, it does not indicate how close it is to the actual value. Hence, the capsule network as proposed by Hinton et al. [2] is not suitable for continuous data and needs modifications for usage on continuous data. For this purpose, this paper uses mean squared error as the loss function, scaled down to 3 using the following formulae

y_norm_k = (y_k / 228) × 3
y_pred_norm_k = (y_pred_k / 228) × 3
L_k = (y_norm_k - y_pred_norm_k)^2 \qquad (11)
where yk is the actual age y_pred_normk is the predicted age 228 is the highest expected age This is then added to reconstruction loss using Eq. (7). There are 228 output layers called digit capsules in the network, with each layer representing the confidence value of the respective output from 1 to 228 months. These were made into probabilities by passing them to a softmax layer. From here, 20 highest probabilities were taken and scaled such that they add to one. This was done by dividing each value by the sum of 20 probabilities, which could be denoted as Pi =
\frac{P_i}{\sum_{j=1}^{20} \mathrm{sort\_desc}(P)_j} \qquad (12)
where P i is the updated probability at age i Pi is the initial probability at age i i is a subset of all j values These probability values were then multiplied to the age they represent and were added. This paper proposes to take the top 20 outputs instead of all the probabilities in order to address the problem of vanishing gradients during the initial phase of training that eventually leads to network collapse. At the beginning of the training, due to a large number of output neurons with Gaussian initialization, the probabilities are almost equal to each other. Hence, it outputs the same value for every image. Later, when the neural network begins to alter weights during backpropagation, it still continues to output the same values as there are too many weights to alter. In the end, because multiple alterations do not affect the result, the network collapses and stops learning. In order to make sure that these values were not subjected to excessive variance, each batch of size 3 was given 3 consecutive tries, which tried to make the image get as close as possible to the actual distribution. The value 3 was obtained using parameter tuning. When top 20 probabilities are taken, it is made sure that each time different digit capsules are taken, thus resulting in different values based on the image. The top 20 probabilities represent 20 most significant outputs of the distribution of the neural network X and should effectively represent most of X . It is expected that the neural network learns such that the top one or two values are much higher than the rest. Hence, this setup is expected to work well in the long run too. Another modification made to the network was to change ReLU [9] activations in the convolution layers to Leaky ReLU [10]. This helped to solve the problem of “dying ReLU”, where if a neuron reaches the value of zero, ReLU [9] never provides a way that the weight could be altered again, which implies the neuron has effectively died. On using four layers in the network, as proposed by Hinton et al.[2], there are too many weights in the neural network. This introduces the possibility of exploding gradients. Hence, this paper proposes to use only two layers in order to address this problem. To summarize, the following modifications are proposed to the capsule network to make sure it handles continuous data 1. The backpropagation was done using mean squared error (MSE) scaled to three in place of margin loss. This makes the model try to minimize the distance between the predicted value and the actual value instead of trying to focus on getting the exact value. The reconstruction loss was still added to this network. 2. The values from the 228 digit capsules were passed through the softmax layer and the probabilities were obtained. Following this, top 20 values were taken and were scaled such that these 20 probabilities add up to 1. The top 20 probabilities were multiplied with their respective values.
Fig. 3 Size of filters in neural network
3. In order to make sure that these values were not subjected to excessive variance, each batch of size 3 was given 3 consecutive tries, for the neural network to get as close as possible to the actual distribution. 4. The ReLU [9] was changed to Leaky ReLU [10] to address the problem of “dying ReLU” 5. To address the problem of exploding gradients, only two layers were used. The specifications of the filter size are given in Fig. 3.
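The scaled-MSE target of Eq. (11) and the top-20 renormalization of Eq. (12) can be sketched as follows. This is an illustrative NumPy version of the idea, not the authors' training code; the random probability vector stands in for the softmax output of the 228 digit capsules.

import numpy as np

MAX_AGE = 228  # highest expected bone age in months

def predict_age(capsule_probs, k=20):
    # Keep the k most significant digit-capsule probabilities and rescale them to sum to 1 (Eq. 12)
    top = np.argsort(capsule_probs)[-k:]
    weights = capsule_probs[top] / capsule_probs[top].sum()
    ages = top + 1                      # capsule index 0 corresponds to 1 month
    return float(np.sum(weights * ages))

def scaled_mse(y_true, y_pred):
    # Both ages are mapped onto [0, 3] before the squared error is taken (Eq. 11)
    y_norm = y_true / MAX_AGE * 3.0
    y_pred_norm = y_pred / MAX_AGE * 3.0
    return (y_norm - y_pred_norm) ** 2

probs = np.random.dirichlet(np.ones(MAX_AGE))   # stand-in for the softmax output
pred = predict_age(probs)
print(pred, scaled_mse(150.0, pred))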
7 Convergence

Here are two prerequisite theorems to prove the convergence.

Theorem 1 A function is convergent to a minima if it is convex. In other words, a function f is convergent if

f(x + \Delta x) \ge f(x) + f'(x)\,\Delta x \qquad (13)
Theorem 2 A composite function f (g(x)) is convex if f (x) and g(x) is convex and the range of g(x) is within the domain of f (x).
Let the loss function be L(x) and the optimizer be O(x) The capsule network uses Adam optimizer [11] to optimize the loss function. It has been proven by Diederik P. Kingma and Jimmy Lei Ba that Adam optimizer is a convex function. Hence, O(X) is a convex function. Let the correct value be x. Consider 2 points x + y and x + z, where 0 z
(17)
Hence, it is proven that L(x, x ) is convex on (x − x ), where x is a sample from the neural network distribution corresponding to the sample x from the original dataset. Both L(x, x ) and O(x) work on real numbers. L(x, x ) and O(x), is a known function of multiplication, addition, and subtraction on x and x . Since real numbers have closure property with respect to addition, subtraction, and multiplication, it could be inferred that range of L(x, x ) is within the domain of O(x). Since L(x, x ) is convex on (x − x ), O(x) is convex on x and the range of L(x, x ) belongs to the domain of O(x), it could be inferred that O(L(x, x )) is convex on (x–x ) using Theorem 2. Since O(L(x, x )) is convex on (x−x ), it could be inferred that (x − x ) is convergent to a minima using Theorem 1. Hence, the backpropagation tries to pull the distribution of the neural network X to X + mu where X is the distribution of the originally provided dataset and mu is the convergence point of the function. Now let us see how the network works when top k outputs are considered. Let the distribution of the original dataset be X and the distribution of the neural network be X . Let the loss function be expected to converge at X + mu. Consider 2 cases 1. When X has reached close to X + mu 2. When X is far away from X + mu Case 1 (When X has reached close to X+ mu) The significance of the neuron output decreases as the output probability decreases. Hence, for an optimal value of kSignificance of top k neuron significance of other neurons.
Hence, when the top k values are taken, have been effectively sampled X . Hence, X is still close to X + mu and changes are made to the distribution X by backpropagation is not major, as it should rightfully be. Case 2 (When X is far away from X+ mu) There could be two sub cases here 1. When the top k neurons taken do not change in the next iteration, the probability of the appropriate neurons is still reduced as X is far away from X + mu. 2. When the top k neurons taken a change in the next iteration change • If the top k significant neurons of X obtained is such that it is closer to X + mu, it is converging closer. • If the next highest probability is farther to X + mu, then the neuron is propagated with a huge loss function in the next iteration. Hence, the most significant outputs of X are propagated with a bigger loss function, in the next trial or when a similar sample is taken again.
8 Dataset Used The dataset used was RSNA Pediatric Bone Age Challenge (2017) [1]. It was developed by Stanford University and the University of Colorado, which was annotated by multiple expert observers. The dataset contains images of wrist bone X-ray in multiple orientations using multiple X-ray scanners, each resulting with a different texture of the X-ray.
9 Results The experiments were conducted on Google Cloud on a TensorFlow VM. The system was configured to have two virtual CPUs, 96 GB RAM on tesla P100 GPU, to support the comprehensive computations. The training set was split into 9000 images for training and 3000 images for validation. The results of the model were analyzed on the validation dataset. Figure 4 was plotted on the results obtained when all the 228 output layers were taken into consideration instead of the top 20 probabilities. From Fig. 4, one can observe that the algorithm outputs a constant value within a very narrow range of 113–114 months for random samples. This happens because of vanishing gradients, as a large number of weights are learned. Hence, it is justified why top 20 probabilities are taken and scaled to 1 instead of taking all 228 probabilities. Figure 5 is a depiction of parameter tuning to identify the optimal number of trials to be given for the network to come close to the actual distribution with a batch of three images. It could be found here that the lowest point in the graph is obtained at three trials and is hence best while training. In the same graph, the “not as number
Fig. 4 Scatterplot of a random sample predicted bone age samples for first 100 iterations when all 228 probabilities were taken
Fig. 5 Average MSE for number of trials with a batch size of three images
(NaN)” obtained corresponding to 1 trial and 2 trials also show us why the network is needed to give a few trials for it to get close to the actual distribution. Figures 6 and 7 and the images following it are the results plotted on the validation set after training with the proposed methodology. One can observe in Figs. 6 and 7 that the deviation is unbiased to any particular age range in general. In other words, it could be observed that the ratio of the number of patients with age deviation >15 and the number of patients with age deviation β, and α + β = 1) are assigned to BTij and CTij . Now the basic trust is computed using the relation represented by SEm (i, j) as given Eq. 2. N tk BT(t)itkj =
\frac{\sum_{m=1}^{N_{tk}} SE_m(i, j)}{N_{tk}} \qquad (2)
The current trust (CT) value estimated in this model is the trust value of the node in the time interval between t and t + 1. This proposed trust model from this research work is to compute the node’s entire trust value based on the fuzzy expert system approach. In this article, the term current trust (CT) represents the node’s current trust value as given in Eq. 3. Another factor that evolved using threshold value based on trust is creditability. The node which is going to be a part of the transmission path is based on the creditability value. The value of the creditability is high when it is higher than the specified threshold value; otherwise, it will be considered as medium or low. The creditability value will change dynamically based on the factors used for evaluating the creditability. CT(t)ir j = CC(t)ir × BT(t)r j t1 ≤ t ≤ NOW
(3)
In this work, current trust (CT) is computed using the mathematical representation given in Eq. 4. CC(t)ir = BT(t)i1 × BT(t)12 × BT(t)23 × · · · × BT(r −1)r t1 ≤ t ≤ NOW
(4)
If n nodes are present in the communication, having current trust values: Using these n values, CT(t) is computed using the form shown in Eq. 5. CT(t)r P1 j , CT(t)i P2 j , . . . , CT(t)i Pn j CT(t)i j =
\sum_{k=1}^{n} W_{P_k} \times CT(t)_{i P_k j} \qquad (5)
This model which estimates the trust value of the node i on node j in time interval t + 1 is represented as (T ij (t + 1)) which is derived with the help of both basic trust of i on j at time t (BTij (t)) and current trust on j to i by few other nodes at the time of t as (CTij (t)) as shown in Eq. 6 as follows T i (t + 1) = α × BTij (t) + (1 − α) × CTij (t), 0 ≤ α ≤ 1, t1 ≤ t ≤ NOW
(6)
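A small Python sketch of the trust computation in Eqs. (2)-(6) is given below; the interaction records, recommendation values, and weights are hypothetical numbers used only to show the arithmetic.

def basic_trust(satisfaction_events):
    # Eq. (2): average of the observed satisfaction values SE_m(i, j) over N_tk interactions
    return sum(satisfaction_events) / len(satisfaction_events)

def current_trust(recommendations, weights):
    # Eq. (5): weighted sum of the current trust values reported along n recommendation paths
    return sum(w * ct for w, ct in zip(weights, recommendations))

def overall_trust(bt, ct, alpha=0.6):
    # Eq. (6): T_ij(t+1) = alpha * BT_ij(t) + (1 - alpha) * CT_ij(t), with alpha > 1 - alpha
    return alpha * bt + (1.0 - alpha) * ct

bt = basic_trust([0.9, 0.8, 1.0, 0.7])        # direct observations of node j by node i
ct = current_trust([0.85, 0.6], [0.7, 0.3])   # recommendations from two intermediate paths
print(overall_trust(bt, ct))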
Table 1 Range of fuzzy values for each input trust parameter basic trust (BT), current trust (CT) and path trust (PT)

Linguistic variables | Fuzzy values | Symbols
Low | 0.0 ≤ z ≤ 0.4 | LOW
Low medium | 0.3 ≤ z ≤ 0.6 | LM
Medium | 0.5 ≤ z ≤ 0.8 | MED
High | 0.7 ≤ z ≤ 1.0 | HGH
Fig. 1 Fuzzy membership function representation of the node’s basic trust (BT)
In this research work, the proposed model incorporates Gaussian fuzzifiers for estimating membership values of the number of packets transmitted by each node using Eq. 7.
\mu_{\text{Trust-value}}(x) = e^{-\frac{(x - c)^2}{2\sigma^2}} \qquad (7)
Based on the knowledge of domain experts, input parameters (low, low medium, medium, and high), as well as output parameters (low, low medium, medium, and high), are selected. The range of fuzzy value for each linguistic variable of the trust-based parameter is shown in Table 1. The fuzzification process begins with the transubstantiation of the given node-based trust parameters using the functions that are represented in Eq. 7. Both basic and current trust of node’s related fuzzy membership representation is shown in Figs. 1 and 2, respectively.
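A sketch of the Gaussian fuzzification of Eq. (7) and the linguistic mapping of Table 1 is shown below. The centre and width of each membership function are illustrative assumptions, since the paper does not list their values.

import math

def gaussian_mu(x, c, sigma):
    # Eq. (7): Gaussian membership value of a trust score x for a fuzzy set centred at c
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

# Illustrative centres and widths for the four linguistic terms of Table 1
TERMS = {"LOW": (0.2, 0.15), "LM": (0.45, 0.1), "MED": (0.65, 0.1), "HGH": (0.85, 0.1)}

def fuzzify(trust_value):
    # Return the membership degree for each linguistic term and the best-matching term
    degrees = {t: gaussian_mu(trust_value, c, s) for t, (c, s) in TERMS.items()}
    return degrees, max(degrees, key=degrees.get)

degrees, label = fuzzify(0.82)
print(label, degrees)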
4 Results and Discussions The proposed model combines both global as well as local-based trust optimization and provides an acceptable and accurate prediction of malicious nodes as well as path recommendation. The environment for this experiment is created using NS2.3.5. The simulation environment considered 25 nodes in an area of 500 x 500 m2 . Nodes are static and each node having equal energy 1 J at the initial stage. The membership
Fig. 2 Fuzzy membership function representation of the node’s current trust (CT)
values were determined from these values using the Gaussian fuzzy membership function discussed in Sect. 3. In the crisp set approach, the minimum threshold value is assumed to be 0.4. If the trust value is greater than the threshold, the node is represented as 0, i.e., a trusted node; if it is less than the threshold, it is represented as 1, i.e., an untrusted (malicious) node in the crisp set (Table 2). Even though the crisp set value is precise, it does not convey anything about the range of the trust value. To capture this gradation of the trust value, a fuzzy expert system is used, with the classes low, low medium, medium, and high. The fuzzy expert system values are therefore more informative than the crisp set values, which say nothing about the range of the trust value. With the help of the fuzzy expert system, a trusted path is established for transferring data from source to destination. Hence, the fuzzy expert system-based trust evaluation model gives a better, more accurate, and more reliable result than existing approaches, as depicted in Fig. 3.
5 Conclusions and Future Work

In this research article, a new extension of the fuzzy expert system-based trust model through a heuristic approach is proposed to measure the trustworthiness of nodes. The proposed model provides a versatile and feasible way to select a better node based on the trust constraints while also limiting energy consumption. The trust value combines both the basic and the current trust values, which enables the system to efficiently identify the actual trust value before packet transmission starts. Future work may consider alternative inputs to the trust model for enhancing the accuracy of trust evaluation.
BT
0.98
0.984
0.966
0.795
0.961
0.911
0.754
0.372
0.977
0.925
0.333
0.819
0.923
0.937
0.437
0.652
0.963
0.652
0.628
0.935
0.872
Node
N1
N2
N3
N4
N5
N6
N7
N8
N9
N 10
N 11
N 12
N 13
N 14
N15
N 16
N 17
N18
N19
N 20
N 21
0.926678
0.797287
0.779817
0.836519
0.72683
0.836519
0.294878
0.792451
0.825533
0.98847
0.12746
0.820922
0.689871
0.17909
0.990311
0.852302
0.732033
0.999009
0.718986
0.671108
0.681849
BT_MF
HGH
MED
MED
HGH
MED
MED
LOW
MED
HGH
HGH
LOW
HGH
MED
LOW
HGH
HGH
MED
HGH
MED
MED
MED
BT_LTM
0.919
0.839
0.702
0.634
0.885
0.497
0.361
0.715
0.846
0.73
0.277
0.852
0.773
0.433
0.975
0.852
0.873
0.907
0.891
0.84
0.804
CT
0.680443
0.879071
0.99215
0.907792
0.771475
0.562255
0.235038
0.99786
0.864295
0.999992
0.112574
0.851118
0.979677
0.391793
0.524318
0.851118
0.80169
0.713284
0.755925
0.877001
0.941919
CT_MF
Table 2 FES-based trust estimation of nodes and its trust classes
MED
HGH
HGH
HGH
MED
LMD
LOW
HGH
HGH
HGH
LOW
HGH
HGH
LMD
LMD
HGH
MED
MED
MED
HGH
HGH
CT_LTM
0.8955
0.887
0.665
0.643
0.924
0.5745
0.399
0.826
0.8845
0.7745
0.305
0.8885
0.875
0.4025
0.8645
0.8815
0.917
0.851
0.9285
0.912
0.892
TV
0.81113
0.831603
0.911407
0.867305
0.737544
0.694499
0.246115
0.949458
0.83747
0.996703
0.107043
0.828048
0.859074
0.252918
0.881575
0.844413
0.756218
0.908152
0.725373
0.76934
0.819655
TV _MF
MED
MED
MED
HGH
MED
LMD
LOW
MED
HGH
HGH
LOW
HGH
MED
LOW
MED
HGH
MED
MED
MED
MED
MED
TRUST_CLS
0
0
0
0
0
0
1
0
0
0
1
0
0
0
0
0
0
0
0
0
0
CRS_CL
(continued)
12
15
15
16
11
7
1
15
16
16
1
16
15
5
8
16
11
12
11
15
15
RUL_NO
BT
0.337
0.939
1
0.575
Node
N 22
N23
N 24
N 25
Table 2 (continued)
0.640932
0.627781
0.78758
0.132171
BT_MF
LMD
LMD
MED
LOW
BT_LTM
0.528
0.967
0.885
0.244
CT
0.649018
0.546446
0.771475
0.080896
CT_MF
MED
LMD
MED
LOW
CT_LTM
0.5515
0.9835
0.912
0.2905
TV
0.629911
0.571087
0.76934
0.092537
TV _MF
LMD
LMD
MED
LOW
TRUST_CLS
0
0
0
1
CRS_CL
10
6
11
1
RUL_NO
Fig. 3 Successive process of summation of all outputs of node-based trust estimation
References 1. Forghani A, Rahmani AM (2008) Multi state fault tolerant topology control algorithm for wireless sensor networks. future generation communication and networking. In: FGCN ‘08. Second ınternational conference, pp 433–436 2. Munir SA, Wen Bin Y, Biao R, Man M (2007) Fuzzy logic based congestion estimation for QoS in wireless sensor network. In: Wireless communications and networking conference, WCNC.IEEE, pp 4336–4346. 3. Akkaya K, Younis M (2003) An Energy-Aware QoS Routing Protocol for Wireless Sensor Networks. Distributed Computing Systems Workshops, Proceedings. 23rd International Conference. 710–715 4. Sun YL, Han Z, Liu KJR (2008) Defense of trust management vulnerabilities in distributed networks. Commun Mag 46(4):112–119 5. Sathiyavathi V, Reshma R, Parvin SS, SaiRamesh L, Ayyasamy A (2019) Dynamic trust based secure multipath routing for mobile Ad-Hoc networks. In: Intelligent communication technologies and virtual mobile networks. Springer, Cham, pp 618–625 6. Selvakumar K, Ramesh LS, Kannan A (2016) Fuzzy Based node trust estimation in wireless sensor networks. Asian J Inf Technol 15(5):951–954 7. Thangaramya K, Logambigai R, SaiRamesh L, Kulothungan K, Ganapathy AKS (2017) An energy efficient clustering approach using spectral graph theory in wireless sensor networks. In: 2017 Second ınternational conference on recent trends and challenges in computational models (ICRTCCM). IEEE, pp 126–129 8. Poolsappasit N, Madria S (2011) A secure data aggregation based trust management approach for dealing with untrustworthy motes in sensor networks. In: Proceedings of the 40th ınternational conference on parallel processing (ICPP ’11), pp 138–147 9. Feng RJ, Che SY, Wang X (2012) A credible cluster-head election algorithm based on fuzzy logic in wireless sensor networks. J Comput Inf Syst 8(15):6241–6248 10. Selvakumar K, Karuppiah M, SaiRamesh L, Islam SH, Hassan MM, Fortino G, Choo KKR (2019) Intelligent temporal classification and fuzzy rough set-based feature selection algorithm for intrusion detection system in WSNs. Inf Sci 497:77–90 11. Raj JS (2019) QoS optimization of energy efficient routing in IoT wireless sensor networks. J ISMAC 1(01):12–23 12. Claycomb WR, Shin D (2011) A novel node level security policy framework for wireless sensor networks. J Netw Comput Appl 34(1):418–428
13. Selvakumar K, Sairamesh L, Kannan A (2017) An intelligent energy aware secured algorithm for routing in wireless sensor networks. Wireless Pers Commun 96(3):4781–4798 14. Feng R, Xu X, Zhou X, Wan J (2011) A trust evaluation algorithm for wireless sensor networks based on node behaviors and D-S evidence theory. Sensors 11(2):1345–1360 15. Ganeriwal S, Balzano LK, Srivastava MB (2008) Reputation-based framework for high integrity sensor networks. ACM Trans Sens Netw 4(3):1–37 16. Kamalanathan S, Lakshmanan SR, Arputharaj K (2017) Fuzzy-clustering-based intelligent and secured energy-aware routing. In: Handbook of research on fuzzy and rough set theory in organizational decision making. IGI Global, pp 24–37 17. Shaikh RA, Jameel H, d’Auriol BJ, Lee H, Lee S, Song YJ (2009) Group-based trust management scheme for clustered wireless sensor networks. IEEE Trans Parallel Distrib Syst 20(11):1698–1712 18. Selvakumar K, Sairamesh L, Kannan A (2019) Wise intrusion detection system using fuzzy rough set-based feature extraction and classification algorithms. Int J Oper Res 35(1):87–107 19. Chang EJ, Hussain FK, Dillon TS (2005) Fuzzy nature of trust and dynamic trust modeling in service-oriented environments. In: Proceedings of workshop on secure web services, pp 75–83 20. Guo S, Yang O (2007) Energy-aware multicasting in wireless ad hoc networks: a survey and discussion. Comput Commun 30(9):2129–2148 21. Beth T, Borcherding M, Klein B (1994) Valuation of trust in an open network. In: Proceedings of ESORICS, pp 3–18 22. Josang A (2001) A logic for uncertain probabilities. Int J Uncertainty Fuzziness Knowl Based Syst 9(3):279–311 23. Darney PE, Jacob IJ (2019) Performance enhancements of cognitive radio networks using the improved fuzzy logic. J Soft Comput Paradigm (JSCP) 1(02):57–68
Artificial Neural Network-Based ECG Signal Classification and the Cardiac Arrhythmia Identification M. Ramkumar, C. Ganesh Babu, G. S. Priyanka, B. Maruthi Shankar, S. Gokul Kumar, and R. Sarath Kumar
Abstract Electrocardiogram is an essential tool to determine the clinical condition of cardiac muscle. An immediate and the precise detection of cardiac arrhythmia is highly preferred for aiding good and healthy life, and it leads to healthy survival for the humans. In this study, utilizing MATLAB tools the feature extraction is made by various statistical parameters from both normal and the abnormal categorization of ECG signals. These features are inclusive of variance, arithmetic mean, kurtosis, standard deviation, and skewness. The feature vector values reveal the informational data with respect to a cardiac clinical health state. The focus on this study is made by utilizing the classifier of artificial neural network in order to identify the ECG abnormalities. Levenberg–Marquardt backpropagation neural network (LM-BPNN) technique is being utilized for the cardiac arrhythmia classification. The ECG data are extracted from the MIT-BIH cardiac arrhythmia database, and the data are tested which is utilized for further classification of ECG arrhythmia. The comparison for the results of classification is made in terms of accuracy, positive predictivity, sensitivity, and specificity. The results of experimentation have been validated based on its accuracy of classification through tenfold cross-validation technique. It has resulted M. Ramkumar (B) · G. S. Priyanka · B. Maruthi Shankar · R. Sarath Kumar Department of Electronics and Communication Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India e-mail: [email protected] G. S. Priyanka e-mail: [email protected] B. Maruthi Shankar e-mail: [email protected] R. Sarath Kumar e-mail: [email protected] C. Ganesh Babu Bannari Amman Institute of Technology, Sathyamangalamn, India e-mail: [email protected] S. Gokul Kumar Department of Technical Supply Chain, Ros Tech (A & D), Bengaluru, Karnataka, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_34
with an average accuracy of 99.5% in predicting the cardiac arrhythmias of a different class. Keywords Artificial neural networks · Electrocardiogram · Feature extraction · Classification · MIT-BIH arrhythmia database · Levenberg–Marquardt (LM) algorithm
1 Introduction The heart enables the triggering of minute electrical impulses at the sinoatrial node, and it enables its spread through the heart’s conduction system in order to make the rhythmic contraction. The recording of these impulses is done by the ECG instrument in terms of sticking the surface electrodes over the layer of skin in various parts of the chest surrounding the cardiac muscle. The electrical tracings of the heart’s activity are represented as the ECG waveform, and the spikes and dips will determine the conditions of the cardiac muscle. The generation of normal ECG waveform is shown in Fig. 1. An ECG waveform is represented as the series of positive waves and the negative waves which are resulted due to various deflections in each section of the cardiac beat. The tracing of typical ECG signal is consisting of the P wave, QRS complex, and T wave for each cycle of a cardiac beat. The ECG makes the detection over the ion transfer via the myocardium which gets varied in each heartbeat. The isoelectric line is denoted as the ECG signal’s baseline voltage wherein which it gets traced following the sequence of T wave and the preceding of the successive P wave. The
Fig. 1 Generation of normal ECG Wave
Table 1 Normal ECG signal amplitudes and time duration

S. No. | Parameters of ECG | Typical amplitude (mV) and time duration (s)
1 | P wave | 0.25 mV
2 | R wave | 1.60 mV
3 | Q wave | 25% of R wave
4 | T wave | 0.1–0.5 mV
5 | P-R interval | 0.12–0.20 s
6 | Q-T interval | 0.35–0.44 s
7 | S-T segment | 0.05–0.15 s
8 | P wave interval | 0.11 s
9 | QRS interval | 0.09 s
heart’s upper chamber initiates the P wave. The P wave is declared as the initial wave to be generated because of the contraction of heart’s upper chamber followed by the flat straight line caused because of the electrical impulse and travels to the lower chambers. As it is being discussed, the ventricle contraction determines the QRS complex and the last production of T wave for resting the ventricles. The periodic cycle of the heart’s electrical activity is denoted by the sequence of P-QRS-T. The normal values of various ECG waveforms are represented in the following Table 1. Different data mining and machine learning methods have been formulated for improving the accuracy of ECG arrhythmia detection. Due to the non-stationary and the nonlinear nature of the ECG signal, the nonlinear methods of extraction are denoted as the best candidates for the information extraction in the ECG signal [1]. Because of the ANN is denoted as the pattern matching method on the basis of mapping the nonlinear input–output data, it can be efficiently utilized for making the detection of morphological variations in the nonlinear signals like the ECG signal component [2]. This study proposes the usage of neural network using backpropagation algorithm and the technique of Levenberg–Marquardt (LM) technique for ECG signal classification with which the data has been acquired from MIT-BIH arrhythmia database.
2 Review of Literature The presentation over a few studies has been made over the neural network system performance when it is being utilized for detecting and recognizing the abnormal ECG signals [3]. The utilization of neural network systems for analyzing the ECG signal produces few advantages over several conventional techniques. The required transformations and the clustering operations could be performed by the neural network simultaneously and automatically. The neural network is also capable of
recognizing nonlinear and complex groups in hyperspace [4]. The ability to produce distinct classification results in various conventional applications places neural-network-based computational intelligence systems in a strong position. However, little work has been devoted to deriving better parameters for reducing the network size while maintaining good accuracy in the classification process. An artificial neural network model has been used to predict coronary cardiac disease on the basis of risk factors comprising T-wave amplitude variation and the ST segment [5]. A two-stage neural network has been adopted to classify the acquired input ECG waveform into four different beat types, which helps improve diagnostic accuracy [6]. The support vector machine (SVM) is a machine learning algorithm used for classification that performs pattern recognition based on statistical learning theory [7]. The KNN method is an instance-based learning process widely used as a data mining technique for pattern recognition and classification problems [8]. Mode, median, standard deviation, and mean are first-order probabilistic features; variance, skewness, and kurtosis are higher-order probabilistic features [9]. The standard deviation provides a calculative measure quantifying the amount of dispersion or variation in a set of data values. Kurtosis measures whether the data are flat or peaked relative to the normal distribution; data with a high kurtosis value tend to have a distinct peak near the mean, decline rapidly, and possess heavy tails [10]. Skewness indicates the deviation and asymmetry of a distribution relative to the normal distribution.
(a) Mean: When a set of values has a sufficiently strong central tendency, its moments relate the set of numbers to the additive combination of integer powers of the values. The arithmetic mean of the set of values x_1, …, x_N is given by the following equation.

$\bar{x} = \frac{1}{N}\sum_{j=1}^{N} x_j \quad (1)$
(b) Variance: While the mean describes the location of a distribution, the variance captures the degree or scale to which the distribution is spread out. The unit of the variance is the square of the unit of the original variable. The positive square root of the variance is termed the standard deviation.

$\mathrm{Var}(x_1, \ldots, x_N) = \frac{1}{N-1}\sum_{j=1}^{N}\left(x_j - \bar{x}\right)^2 \quad (2)$
(c) Standard Deviation: The standard deviation of a set of values is a measure of the dispersion of those values in the probabilistic sense. It is usually denoted by the symbol σ and is defined as the square root of the variance.

$\sigma(x_1, \ldots, x_N) = \sqrt{\mathrm{Var}(x_1, \ldots, x_N)} \quad (3)$
where $\sigma(x_1, \ldots, x_N)$ denotes the standard deviation of the distribution.
(d) Skewness: For the statistical distribution of a real-valued random variable, skewness is a measure of its asymmetry. A positive skewness value indicates a distribution with an asymmetric tail extending toward positive x, while a negative value indicates a distribution whose tail extends toward more negative x [11]. However, any set of N measured values will readily yield a nonzero value even if the underlying distribution is symmetric (has zero skewness). For the statistic to be meaningful, its standard deviation is needed as an estimator of the skewness of the underlying distribution.

$\mathrm{Skew}(x_1, \ldots, x_N) = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{x_j - \bar{x}}{\sigma}\right)^3 \quad (4)$
(e) Kurtosis: Kurtosis is defined as the fourth-order cumulant divided by the square of the second-order cumulant, which equals the fourth moment about the mean divided by the square of the variance of the statistical distribution, minus 3; this quantity is termed the excess kurtosis. The conventional expression for kurtosis is given below.

$\mathrm{Kurt}(x_1, \ldots, x_N) = \left\{\frac{1}{N}\sum_{j=1}^{N}\left(\frac{x_j - \bar{x}}{\sigma}\right)^4\right\} - 3 \quad (5)$
where the term (−3) makes the value zero for a normal distribution. If the 3rd moment (skewness) and the 4th moment (kurtosis) are used, they must be used with caution in this regard. Kurtosis is a non-dimensional quantity that measures the relative flatness or peakedness of a distribution. A distribution with positive kurtosis is termed leptokurtic, a distribution with negative kurtosis is termed platykurtic, and the intermediate case is termed mesokurtic [12].
Fig. 2 Distributions whose 3rd and 4th moments differ significantly from the Gaussian (normal) distribution: a 3rd moment (skewness), b 4th moment (kurtosis)
Figure 2 depicts how the skewness and kurtosis of a distribution deviate from the Gaussian case. Skewness describes how asymmetric a distribution is: a distribution with an asymmetric tail extending to the right is positively skewed, and one with an asymmetric tail extending to the left is negatively skewed. In this study it is mainly used to measure and verify the symmetry of the data, which indicates the distribution of the statistical variable. Kurtosis determines the degree to which a distribution is flattened or tapered compared with the normal pattern; when the kurtosis is higher, the resulting values exceed the average more strongly. In this classification study, these observations guide the selection of adequate features to process within the ANN classifier.
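As an illustration only (not part of the original study), the following sketch computes the five statistical features described above for one beat segment; the synthetic `beat` array and the sampling details are assumptions made for demonstration.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def beat_features(beat: np.ndarray) -> np.ndarray:
    """Return [mean, variance, std, skewness, excess kurtosis] for one ECG segment.

    Follows Eqs. (1)-(5): the sample variance uses the 1/(N-1) form and
    kurtosis is reported as excess kurtosis (zero for a normal distribution).
    """
    mean = np.mean(beat)                 # Eq. (1)
    var = np.var(beat, ddof=1)           # Eq. (2)
    std = np.sqrt(var)                   # Eq. (3)
    skw = skew(beat)                     # Eq. (4)
    kur = kurtosis(beat, fisher=True)    # Eq. (5), with the -3 already applied
    return np.array([mean, var, std, skw, kur])

if __name__ == "__main__":
    t = np.linspace(0, 1, 360)                     # one second at 360 Hz (MIT-BIH rate)
    beat = np.exp(-((t - 0.5) ** 2) / 0.001)       # crude R-wave-like bump, for illustration
    print(beat_features(beat))
```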
3 Materials and Methods
The proposed approach for classifying ECG cardiac arrhythmias involves ECG signal preprocessing, extraction of the selected statistical and non-statistical features, and finally classification of the cardiac arrhythmias with an artificial neural network, specifically a Levenberg–Marquardt backpropagation neural network (LM-BPNN). The schematic diagram of the ECG arrhythmia classification using the ANN is shown in Fig. 3.
Fig. 3 Flowchart for the ANN classification
Figure 3 shows the general flowchart of the ANN used to classify and detect heartbeats and cardiac arrhythmias, indicating the input, hidden, and output layers. Twelve categories of ECG beats are obtained from the ANN classification system, from which arrhythmia detection is performed. Figure 4 shows the functional block diagram of the neural network system for diagnosing cardiac arrhythmia. The ANN consists of a single input layer, a single hidden layer, and a single output layer. The input layer has 5 neurons, corresponding to the mean, standard deviation, variance, kurtosis, and skewness features, with a tan-sigmoid transfer function. The hidden layer has 4 neurons with a log-sigmoid transfer function, and the output layer has 12 neurons indicating the ECG beat arrhythmia classes. The processing sequence is therefore: the raw ECG from the MIT-BIH arrhythmia database is acquired, denoising is carried out as preprocessing, the statistical features are extracted, and the selected features are classified by the ANN.
Fig. 4 Block diagram of the neural network system for diagnosing cardiac arrhythmia (raw ECG signal from the MIT-BIH arrhythmia database → preprocessing in terms of denoising → ECG feature extraction → ANN classification of ECG → output categorization of cardiac beats and detection of arrhythmias)
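For illustration only, the following sketch mirrors the 5–4–12 architecture just described, with tan-sigmoid and log-sigmoid activations; the weights are random placeholders, since the paper trains them with LM backpropagation in MATLAB rather than with this code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes taken from the text: 5 input features, 4 hidden neurons, 12 output classes.
W1, b1 = rng.normal(size=(4, 5)), np.zeros(4)     # input -> hidden
W2, b2 = rng.normal(size=(12, 4)), np.zeros(12)   # hidden -> output

def tansig(x):
    return np.tanh(x)                  # tan-sigmoid transfer function

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))    # log-sigmoid transfer function

def forward(features: np.ndarray) -> np.ndarray:
    """Map a 5-element feature vector to 12 class scores."""
    hidden = tansig(W1 @ features + b1)
    return logsig(W2 @ hidden + b2)

# Hypothetical feature vector: [mean, std, variance, kurtosis, skewness]
scores = forward(np.array([0.1, 0.2, 0.04, 2.5, 0.3]))
print("predicted class index:", int(np.argmax(scores)))
```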
(a) Levenberg–Marquardt (LM) algorithm
The Levenberg–Marquardt (LM) algorithm is an iterative method that locates the minimum of a multivariate function expressed as the sum of squares of nonlinear real-valued functions [13, 14]. The LM algorithm can be considered a combination of the Gauss–Newton method and the steepest-descent algorithm. It is one of the most robust methods compared with the GN algorithm and, most importantly, it identifies a solution even when it starts far from the final minimum. At each iteration, the new weight configuration at step k + 1 is calculated as follows.

$W(k+1) = W(k) - \left(J^{T}J + \lambda I\right)^{-1} J^{T}\varepsilon(k) \quad (6)$
where J denotes the Jacobian matrix, λ the adjustable parameter, and ε the error vector. The λ parameter is modified according to the development of the error function E: if a step reduces E, it is accepted; otherwise the value of λ is varied, the original value is reset, and W(k + 1) is recalculated.
(b) Preprocessing of data
Data preprocessing is the primary initial step in developing any model. Columns consisting entirely of zeros are deleted, as are columns with missing values or mostly zero values. This yields 182 columns, of which 12 are categorical and the remaining 170 are numerical. Next, 32 rows containing missing values are deleted, and the remaining 37,500 samples are considered for the analysis. The datasets are completely randomized after the unwanted records are deleted, so no outliers remain in the processed data. The dataset is partitioned into three parts: 68% training set, 16% validation set, and 16% testing set (a minimal sketch of this preprocessing and split is given after the class list below).
(c) Classification of arrhythmia
In this study, the following ECG beat arrhythmias are considered for classification, grouped into 12 classes.
1. Normal beat
2. Left Bundle Branch Block (LBBB) beat
3. Right Bundle Branch Block (RBBB) beat
4. Atrial Escape (AE) beat
5. Nodal (Junctional) Escape (NE) beat
6. Atrial Premature (AP) beat
7. Premature Ventricular Contraction (PVC) beat
8. Fusion of Ventricular and Normal (FVN) beat
9. Ventricular Escape (VE) beat
10. Paced beat
11. Supra-ventricular Premature (SP) beat
12. Nodal (Junctional) Premature (NP) beat
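A minimal sketch of the preprocessing and 68/16/16 split described in Sect. 3(b); the file name and column layout are assumptions for illustration, not the actual export used by the authors.

```python
import pandas as pd

# Hypothetical CSV export of the feature table (file name assumed for illustration).
df = pd.read_csv("mitbih_features.csv")

# Drop all-zero columns and rows with missing values, as described in Sect. 3(b).
df = df.loc[:, (df != 0).any(axis=0)]
df = df.dropna()

# Shuffle the records, then split 68% / 16% / 16%.
df = df.sample(frac=1.0, random_state=42).reset_index(drop=True)
n = len(df)
n_train = int(0.68 * n)
n_val = int(0.16 * n)

train = df.iloc[:n_train]
val = df.iloc[n_train:n_train + n_val]
test = df.iloc[n_train + n_val:]
print(len(train), len(val), len(test))
```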
Using the ANN, the ECG cardiac arrhythmias are classified with the mean, standard deviation, variance, skewness, and kurtosis as input variables, acquired from the heart-rate signals. Suitable values for the various cardiac arrhythmias are chosen as provided in Table 1 [15, 16].
(d) Method of Performance Evaluation
The classification performance of the ANN is evaluated using four performance metrics: sensitivity, specificity, positive predictivity, and accuracy. These metrics are determined from the True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts [17, 18].
1. True Positive: an instance in which the detected cardiac arrhythmia coincides with the physician's diagnosis.
2. True Negative: an instance in which both the physician and the classifier output indicate the absence of arrhythmia.
3. False Positive: an instance in which the classification system wrongly classifies a healthy ECG as arrhythmia.
4. False Negative: an instance in which the classification system reports a healthy result instead of arrhythmia.
5. Classification Accuracy: the ratio of the number of correctly classified signals to the total number of signals, given by the following equation.

$\text{Accuracy} = \frac{TP + TN}{N} \quad (7)$
where N denotes the total number of inputs.
6. Sensitivity: the rate of correctly classified positive samples, also called the True Positive Rate. For a good system, the sensitivity should be high.

$\text{Sensitivity} = \frac{TP}{TP + FN} \quad (8)$
7. Specificity: the rate at which negative samples are detected correctly, also referred to as the True Negative Rate. For a good system, the specificity should be as high as possible.

$\text{Specificity} = \frac{TN}{TN + FP} \quad (9)$
8. Positive Predictivity: the ratio of the number of correctly detected events (TP) to the total number of events detected by the analyzer.

$\text{Positive Predictivity} = \frac{TP}{TP + FP} \quad (10)$
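A small sketch of how Eqs. (7)–(10) can be computed from per-class counts; the counts used in the example are placeholders, not values from the paper.

```python
def beat_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the four performance metrics of Eqs. (7)-(10), in percent."""
    n = tp + tn + fp + fn
    return {
        "accuracy": 100.0 * (tp + tn) / n,                 # Eq. (7)
        "sensitivity": 100.0 * tp / (tp + fn),             # Eq. (8)
        "specificity": 100.0 * tn / (tn + fp),             # Eq. (9)
        "positive_predictivity": 100.0 * tp / (tp + fp),   # Eq. (10)
    }

# Placeholder counts for one beat class (illustrative only)
print(beat_metrics(tp=480, tn=36800, fp=110, fn=110))
```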
4 Results and Discussions
The neural network is trained with the backpropagation algorithm, i.e., variable-learning-rate Levenberg–Marquardt backpropagation. The neural network training window is shown in Fig. 6, and the neural network fitting function is shown in Fig. 5. The network processes 37,500 samples for training and testing: 68% of the samples are used to train the network, 16% to test it, and the remaining 16% to validate it. The results are compared over repeated iterations through the adaptive mechanism and by shuffling the sample values during training. The error histogram is a plot of the error value against the number of instances in which that error occurred. The 20-bin error histogram, with instances on the y-axis and the error value (target − output) on the x-axis, is shown in Fig. 7; the error is minimal at the middle of the histogram and increases away from the centre. Figure 8 shows the neural network training regression plot, which relates the target to the output. During training, classifying the cardiac arrhythmias takes 50 iterations to complete the cycle, and the regression window is shown at the 17th epoch. The neural network training state, shown in Fig. 9, relates the gradient to the epochs at the 14th iteration. The best validation performance is obtained at the 17th epoch with a value of 0.0016349, and its window is shown in Fig. 10. The output response resulting from classification is also analysed, with the individual element characterizing the target, output, and error with respect to time; its window is shown in Fig. 11. The output and the target are related by the regression value: the regression plot indicates how closely the output matches the target values. The network output has a strong linear relationship with the desired targets if the regression coefficient is close to unity, whereas a regression coefficient approaching zero means no relationship between output and target can be established. The performance plot shows the mean square error against the number of epochs. The mean square error is the average squared difference between the output data and the target data.
Fig. 5 Fitting function of neural network
Fig. 6 Neural network training window
Fig. 7 20-bins error histogram plot
(Fig. 7 plot: Error Histogram with 20 Bins; Instances versus Errors = Targets − Outputs; series: Training, Validation, Test, Zero Error)
Fig. 8 Regression plot relating the output and the target
An MSE of zero indicates that there is no error. As training proceeds, the error is reduced; when the mean square error reaches its minimum value, training stops and the network is validated with the validation samples. If the behaviour of the network is identified properly in the validation phase, training ends and the network is ready for the testing process. The LM method shows better performance than other methods when compared on the basis of the MSE. Table 2 lists the classification of the 12 types of arrhythmias along with their accuracy, sensitivity, positive predictivity, and specificity values. The error plot is obtained from the error values recorded during the training, validation, and testing stages using the data of each individual cycle. The mechanism of data sharing, together with the statistical feature selection technique, attains high performance with minimum error. Hence, in the 20-bin histogram plot, the peak lies in the middle region at 0.03687, where the error is minimal; the error increases as the histogram moves away from the middle region. This shows that the ANN system classifies with high accuracy.
Fig. 9 Neural network training state at 17th epoch
The regression plot relating the output and the target is shown in Fig. 8. With regression values R of 0.99387, 0.99431, and 0.9951 for training, validation, and testing, respectively, the output is almost linear and the coefficient is nearly unity. This linearity gives the best realization of the relationship among the training, validation, and testing values; for the processed data as a whole, the regression value is 0.99412. The neural network training state at the 17th epoch is shown in Fig. 9, which indicates that the training process is highly efficient and results in a gradient value of 0.00052892; this confirms the accuracy of the ANN system when its performance is evaluated. The relationship between the MSE and the number of iterations is also examined, resulting in a best validation performance of 0.0014894 at the 17th iteration. The best realization is attained from the training, validation, and testing data in formulating the error analysis.
Fig. 10 Best validation performance between MSE and 17 epochs
The response characteristics of the output element are shown in Fig. 11, realizing the response of the error, target, and output. From the above classification results it can be inferred that the accuracy is highest, at 98.8%, for classifying normal beats; the sensitivity is highest, at 97.64%, for the fusion of ventricular and normal (FVN) beat; the specificity is highest, at 96.68%, for the left bundle branch block (LBBB) beat; and the positive predictivity is highest, at 97.63%, for the fusion of ventricular and normal (FVN) beat.
5 Conclusion
The performance of the classification algorithm based on the Levenberg–Marquardt backpropagation neural network (LM-BPNN) technique is evaluated on data acquired from the MIT-BIH arrhythmia database in terms of accuracy, sensitivity, positive predictivity, and specificity. These performance metrics are defined using the True Negative (TN), True Positive (TP), False Positive (FP), and False Negative (FN) counts.
(Fig. 11 plot: Response of Output Element 1 for Time-Series 1; Output and Target versus Time with Training/Validation/Test Targets and Outputs and an Errors Response trace; lower panel: Error = Targets − Outputs versus Time)
Fig. 11 Response characteristics of output element
The experimental results show that the classification accuracy ranges from 91.18 to 98.8% across the 12 classes of ECG arrhythmias.
Table 2 Probabilistic results of classifying the ECG signal by the LVQ NN showing the performance metrics of the 12 classes of arrhythmias

S. No. | ECG arrhythmia beats | Accuracy (%) | Sensitivity (%) | Specificity (%) | Positive predictivity (%)
1 | Normal beat | 98.8 | 96.48 | 55.48 | 96.55
2 | Left bundle branch block (LBBB) beat | 94.48 | 93.34 | 96.68 | 93.47
3 | Right bundle branch block (RBBB) beat | 92.68 | 90.84 | 92.24 | 90.89
4 | Atrial escape (AE) beat | 91.25 | 90.97 | 91.18 | 90.92
5 | Nodal (junctional) escape (NE) beat | 95.62 | 91.68 | 39.14 | 91.67
6 | Atrial premature (AP) beat | 96.24 | 91.43 | 84.49 | 91.45
7 | Premature ventricular contraction (PVC) beat | 94.64 | 95.45 | 85.68 | 95.41
8 | Fusion of ventricular and normal (FVN) beat | 95.54 | 97.64 | 92.46 | 97.63
9 | Ventricular escape (VE) beat | 91.18 | 92.21 | 95.57 | 92.23
10 | Paced beat | 97.68 | 97.04 | 91.28 | 97.09
11 | Supra-ventricular premature (SP) beat | 94.14 | 96.67 | 85.66 | 96.68
12 | Nodal (junctional) premature (NP) beat | 96.66 | 90.01 | 78.24 | 90.09
References
1. Turakhia MP, Hoang DD, Zimetbaum P et al (2013) Diagnostic utility of a novel leadless arrhythmia monitoring device. Am J Cardiol 112(4):520–524
2. Perez de Isla L, Lennie V, Quezada M et al (2011) New generation dynamic, wireless and remote cardiac monitorization platform: a feasibility study. Int J Cardiol 153(1):83–85
3. Olmos C, Franco E, Suárez-Barrientos A et al (2014) Wearable wireless remote monitoring system: an alternative for prolonged electrocardiographic monitoring. Int J Cardiol 1(172):e43–e44
4. Huang C, Ye S, Chen H et al (2011) A novel method for detection of the transition between atrial fibrillation and sinus rhythm. IEEE Trans Biomed Eng 58(4):1113–1119
5. Niranjana Murthy H, Meenakshi M (2013) ANN model to predict coronary heart disease based on risk factors. Bonfiring Int J Man Mach Interface 3(2):13–18
6. Ceylan R, Özbay Y (2007) Comparison of FCM, PCA and WT techniques for classification ECG arrhythmias using artificial neural network. Expert Syst Appl 33(2):286–295
7. Dubey V, Richariya V (2013) A neural network approach for ECG classification. Int J Emerg Technol Adv Eng 3
8. Zadeh AE, Khazaee A, Ranaee V (2010) Classification of the electrocardiogram signals using supervised classifiers and efficient features. Comput Methods Prog Biomed 99(2):179–194
9. Jadhav SM, Nalbalwar SL, Ghatol AA (2010) ECG arrhythmia classification using modular neural network model. In: IEEE EMBS conference on biomedical engineering and sciences
10. Sreedevi G, Anuradha B (2017) ECG feature extraction and parameter evaluation for detection of heart arrhythmias. I-Manager's J Dig Signal Process 5(1):29–38
11. Acharya UR, Subbanna Bhat P, Iyengar SS, Rao A, Dua S (2003) Classification of heart rate using artificial neural network and fuzzy equivalence relation. Pattern Recognit 36:61–68
12. Kannathal N, Puthusserypady SK, Choo Min L, Acharya UR, Laxminarayan S (2005) Cardiac state diagnosis using adaptive neuro-fuzzy technique. In: Proceedings of the IEEE engineering in medicine and biology, 27th annual conference, Shanghai, China, 1–4 Sept 2005
13. Acarya R, Kumar A, Bhat PS, Lim CM, Iyengar SS, Kannathal N, Krishnan SM (2004) Classification of cardiac abnormalities using heart rate signals. Med Biol Eng Comput 42:288–293
14. Shah Atman P, Rubin SA (2007) Errors in the computerized electrocardiogram interpretation of cardiac rhythm. J Electrocardiol 40(5):385–390
15. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
16. Turakhia MP, Hoang DD, Zimetbaum P, Miller JD, Froelicher VF, Kumar UN, Xu X, Yang F, Heidenreich PA (2013) Diagnostic utility of a novel leadless arrhythmia monitoring device. Am J Cardiol 112(4):520–524
17. Xiong W, Droppo J, Huang X, Seide F, Seltzer M, Stolcke A, Yu D, Zweig G (2016) Achieving human parity in conversational speech recognition. arXiv preprint arXiv:1610.05256
18. Melo SL, Caloba LP, Nadal J (2000) Arrhythmia analysis using artificial neural network and decimated electrocardiographic data. In: Computers in cardiology 2000, pp 73–76. IEEE
CDS-Based Routing in MANET Using Q Learning with Extended Episodic Length D. S. John Deva Prasanna, D. John Aravindhar, and P. Sivasankar
Abstract The broadcast storm created during message exchanges in a MANET is a serious issue in routing, as MANET nodes have limited processing resources and residual energy. Research in the area of connected dominating sets (CDS) in MANETs mostly focuses on centralized approaches, which require strong and stable topological information about the MANET. Adapting centralized algorithms is impractical due to the dynamic nature of the MANET and would demand extensive control-message exchange. Hence, an algorithm is required to compute the CDS using partial observations of the MANET. In this paper, a Q learning-based CDS formation algorithm with an extended-episodic-length approach is proposed for efficient MANET routing. In the proposed algorithm, the CDS nodes are chosen not only on the basis of their own Q value but also on the Q value of their neighbours; in this way, the episodic length of the Q learning algorithm is extended from one hop to two hops. Residual energy and signal stability are used to estimate the Q values of the individual nodes. Simulation of the algorithm gives promising results when compared to conventional CDS establishment algorithms.
Keywords Connected dominating set · Reinforcement learning · Q learning · Decay factor · Learning rate · Episode
D. S. John Deva Prasanna (B) · D. John Aravindhar
CSE, HITS, Chennai, India
e-mail: [email protected]
D. John Aravindhar
e-mail: [email protected]
P. Sivasankar
NITTTR, Chennai, India
e-mail: [email protected]

1 Introduction
Routing in MANETs is challenging due to the dynamic nature of the network. The routing information in a MANET needs to be updated at regular intervals due to the mobility of the nodes.
Fig. 1 Sample MANET used for illustration
Exchange of routing information is done by broadcasting control messages across the network. These message exchanges often create a broadcast storm due to multiple retransmissions of the same messages, causing nodes to receive multiple copies of the same control messages from multiple neighbours. To avoid this problem, a virtual backbone of nodes is formed so that all transactions are carried out only through those backbone nodes. Figure 1 shows the sample network architecture considered for explaining the proposed extended Q CDS-based MANET routing.
1.1 Advantages of Connected Dominating Sets in MANET
A connected dominating set (CDS) is a resilient technique used for forming a backbone in the MANET. The CDS is a graph-theoretic concept in which every node of the graph is either in the dominating set or a one-hop neighbour of a dominating node. The CDS concept is used in MANET routing because the CDS acts as a backbone for all communications. MANET routing is usually done by broadcasting, so if a node transmits data, all neighbouring nodes receive that message; by routing all communications through the CDS, this can be avoided. Most CDS construction techniques follow a centralized approach, which needs node and edge-weight information for all nodes in the graph. Centralized CDS formation approaches are hard to implement in a MANET scenario, as they need information about the entire MANET. The CDS is constructed using nodes with better network metrics such as link stability and residual energy. Although these network parameters are considered, CDS construction in MANETs is done by greedy approaches; greedy approaches are easy to implement but may result in a less efficient CDS with an increased number of CDS nodes (a small sketch illustrating the two CDS conditions is given below).
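To make the definition concrete, the following sketch (not from the paper) checks the two CDS conditions, domination and connectivity, on a small invented topology represented as plain adjacency sets.

```python
# Toy undirected adjacency map -- the topology is invented for illustration.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2, 5}, 5: {3, 4}}

def is_cds(cand: set, adj: dict) -> bool:
    """True if every node is in `cand` or adjacent to it (domination)
    and the subgraph induced by `cand` is connected."""
    dominating = all(v in cand or adj[v] & cand for v in adj)
    # connectivity check: depth-first search restricted to `cand`
    seen, stack = set(), [next(iter(cand))]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend((adj[u] & cand) - seen)
    return dominating and seen == cand

print(is_cds({2, 3}, adj))   # True: {2, 3} is connected and dominates 1, 4, 5
print(is_cds({1, 4}, adj))   # False: 1 and 4 dominate everything but are not connected
```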
1.2 Partial Observability of MANETs and Significance of Q Learning
In a MANET scenario, it is rarely possible to obtain the entire topographical information about the nodes and the link metrics, due to the mobility of the nodes. This limitation renders centralized CDS algorithms unsuitable for deducing a CDS in a MANET. Any node in the MANET can obtain information only about its immediate next-hop neighbours, and no pair of nodes can have a full view of all the intermediate nodes between them or of the quality of the links between those nodes. Although proactive routing protocols obtain routes prior to data transmission, the routes need to be recomputed often due to node mobility. This character of the MANET is referred to as partial observability. Q learning is a reinforcement learning (RL) algorithm in which the system learns by interacting with the environment and reinforcing the learning process using the feedback obtained from previous interactions with the environment. This feature of the RL algorithm makes it suitable for environments that are only partially observable, like a MANET. In the Q learning algorithm, the learning entities are called agents. Agents interact with the environment by taking actions, and the feedback of an action is either a reward or a penalty: the agent is rewarded for an action if the result is a success and penalized if the result is a failure. The entire sequence of the agent's interaction with the environment and obtaining feedback is called an episode. Hence, this paper proposes a novel extended Q learning-based CDS formation algorithm that combines the advantages of conventional CDS and Q learning algorithms to obtain an efficient route between the source and the destination. In this paper, the nodes of the MANET are considered as agents, and the nodes learn about their neighbours by transmitting data to them. A node receives a reward if the data transmission is successful and a penalty if the transmission fails; Q values are then estimated using the reward values of the nodes. In this manner, every node possesses a Q value that is a cumulative representation of its previous transmission behaviour. Nodes with higher Q values naturally have a higher success rate in transmission, and these nodes are included to form the CDS in the MANET, which acts as a backbone.
2 Literature Survey
A MANET routing algorithm using the concept of reinforcement learning was proposed in [1]. In that work, the nodes learn about their neighbours continuously using a reward-and-penalty model, and the routes with the higher cumulative reward are used for data transmission. The work establishes decay constants for identifying old routes. Although this work solves routing using RL, the concept is not used for CDS construction.
D. S. John Deva Prasanna et al.
A Q learning algorithm for MANET routing was proposed in [2], which uses residual energy, node mobility and link quality for finding the efficient intermediate nodes in a data transfer. These parameters are used in calculating the Q values of the nodes. The nodes in the MANET are bootstrapped with these parameters before the learning begins. This algorithm suffers a setback when the size of the MANET increases. The algorithm requires the nodes to know about the hop count between any pair of nodes and hence obtaining the topological information from all the nodes in the MANET, which is challenging in large networks. Performance evaluation of routing algorithm for MANET based on the machine learning techniques was proposed in [3]. A Patially Observable Markov Decision Process (POMDP) was modelled for the entire MANET Scenario in [4]. Here the MANET is considered as an observable entity since nodes cannot obtain topological information for the entire network. The nodes are marked as agents. The packet delivery actions like unicasting, multicasting and broadcasting the packets and considered as actions. The nodes can interact with the environment by sending packets in the network. Status of the nodes during the packet transfer like packet sent, packet received and packet dropped is considered as the state of the nodes. If a node successfully transfers a packet to its neighbours, then it received a reward value, and if the transmission fails, the node will receive a penalty. The reward/penalty model accumulates the value of a node. A MANET path discovery was done using on policy first-visit Monte Carlo method in [5]. This algorithm combines with ticket-based path discovery method with reinforcement learning. The nodes send probing tickets to the destination node and learn about the intermediate nodes and their available resources. The nodes will maintain a table of routes to other nodes in the MANET. Though the algorithm minimizes resource utilization, it is hard for the energy-constrained to probe the route in a dynamic environment. The algorithm also suffers a set back when the size of the MANET grows. In [6], an algorithm was built to construct connected dominating set using the nodes, which have a high packet delivery ratio and residual energy. This algorithm is called RDBR and follows conventional beacon method exchange of network parameter information like packet delivery ratio and residual energy. Usage of machine learning algorithms which can learn on partial observable conditions like MANET is not used in this paper. Several algorithms with centralized CDS formation approach and optimizations were proposed in [7] and [8] by forming a maximal independent set (MIS) and then optimizing the MIS nodes to be connected to each other in order to form a CDS. Though this algorithm computes the minimum CDS, it follows a centralized approach and will only be suitable for stable networks. The algorithms have higher time complexity and may not be suitable for dynamic networks, where observing the topological changes of the entire network is not possible within the stipulated time. In [9], a CDS reconstruction algorithm was proposed to reconstruct the CDS locally with minimum changes by exchanging tiny control message packets. The algorithm reduced the control over considerably, and hence, the performance of the MANET is increased. This works much focuses on the reconstruction of the CDS
and contributes less to its initial establishment; adopting CDS construction strategies based on network performance metrics would further improve its performance [9]. A CDS construction using the Q values of the nodes estimated through Q learning was proposed in [10]. In that algorithm, the CDS construction is done in a greedy fashion using the Q values of the nodes in the MANET. The algorithm suffers from this greedy approach, because some CDS nodes may have only neighbours with low Q values, in which case there is no option except to add one of the low-Q-valued nodes as a CDS node. From the literature survey it can be inferred that many MANET routing algorithms are formulated using reinforcement learning, each with its own pros and cons, and that CDS construction using reinforcement learning is done with a greedy approach. Therefore, a Q learning-based CDS construction algorithm for MANETs is proposed, and to avoid the greedy nature of the algorithm, the length of the reinforcement learning episode is extended from one hop to two hops.
3 Proposed Extended Q-CDS Construction Using Q Learning with Extended Episode Length
This paper aims to achieve an efficient route between the source and destination by proposing an algorithm that constructs the CDS using Q learning with extended episode length. Nodes in the MANET interact with their neighbours by sending them messages: successful transactions earn a node a reward, and every failed transaction incurs a penalty. Every node in the MANET builds cumulative Q values for its neighbours by sending messages at various points in time; the scenario of a node sending a message and calculating the reward/penalty is called an episode. In a MANET, nodes can assess network parameters such as link quality and residual energy only for their one-hop neighbours, and in conventional routing techniques, collecting parameter values beyond one-hop neighbours requires extra control-message exchanges. In the proposed algorithm, the Q value of a node is estimated using its signal stability, its residual energy, and the Q value of its neighbour. Hence, the Q value of any node now reflects its own quality as well as the quality of its best neighbour. When this Q value is used for CDS construction, only nodes with both higher Q values and high-quality neighbours are selected as CDS members. Through this, the visibility of a node is increased from one hop to two hops, and the obtained CDS is more efficient and stable. The conceptual diagram and workflow of the proposed extended Q CDS algorithm are shown in Fig. 2.
Fig. 2 Process flow of the proposed extended Q CDS
3.1 Issues in Greedy CDS Exploration
During the CDS exploration process, the algorithm always considers the immediate next node with the maximum Q value to be added as the next CDS node. This technique is greedy and sometimes results in a longer, inefficient CDS. Figure 3 illustrates a scenario in which the greedy approach constructs a sub-optimal solution. For illustrative purposes, the initial Q values are assumed according to the residual energy and signal-stability ratio of the individual nodes in the network; the Q values are learned and estimated during the exploration phase and updated during the exploitation phase based on signal stability, residual energy, and the assigned reward/penalty values. Here node n2 chooses node n5 as its next CDS node, as it has the highest Q value. After including node n5 in the CDS, the only possible next CDS node is n3, which has a very low Q value.
Fig. 3 MANETs with CDS comprising of the nodes n2, n5, n3, n7, n11 and n10
Constructing the CDS through this path results only in a sub-optimal CDS, which may require frequent re-computation. Moreover, the number of nodes in the CDS is also increased by this technique.
3.2 Mitigating the Greedy Issue by Extending the Episode Length
To solve the above issue, it is prudent to increase the episode length in the RL learning phase. The nodes learn about their one-hop neighbouring nodes through interaction; in this technique, the nodes also learn about the neighbours of their one-hop neighbours. When a node sends a message to a neighbour, the receiving node acknowledges the message and appends the highest Q value among all of its own one-hop neighbours. In the example scenario shown in Fig. 4, node n2 sends a message to node n7, and node n7 acknowledges it along with the highest Q value among n7's one-hop neighbours, which in this case is n11. Node n2 incorporates the Q value of node n11 into the Q value of node n7. In this way, nodes that have a high Q value and high-quality neighbour nodes are selected to form the CDS. The obtained CDS is found to be optimal in terms of CDS size and versatile in terms of CDS lifetime.
Fig. 4 MANETs with CDS comprising of the nodes n2, n7, n11 and n10
The proposed extended Q CDS construction using Q learning is elaborated in the following sections by describing the Q value estimation, the decay factor, the learning rate, the exploration and exploitation of the CDS, and the extension of the episodic length to obtain efficient CDS nodes for forming the backbone.
3.3 Q Value Estimation
Q values of the nodes are calculated by sending beacon packets to the neighbours and verifying the status of the packet delivery. The node sending the beacon packet is called the assessor node, and the node receiving the packet is called the assessed node. The assessor node sets a reward of one for all assessed nodes that respond to the initial beacon with an acknowledgement. This reward value is further characterized by the learning-rate parameter calculated from the residual energy and the signal stability obtained from the received acknowledgement. The Q values calculated for every packet transmission are added to form a single cumulative Q value for the particular node. The Q value of a node is estimated using the generic Q learning formula, personalized by the learning rate and the decay constant.

$Q(S, a_t) = Q(S, a_t) + \rho \left( R_{t+1} + \max_{S \in \{D, F\}} \big( \beta \cdot Q(S, a_{t+1}) \big) \right) \quad (1)$
In Eq. (1), S refers to the set of states a node can be in at any point of time, and $a_t$ refers to the action taken at time t, which can change the state of the node from one to another. S = {D, F}, where D and F refer to the states 'Delivered' and 'Failed', respectively. $a_t$ = {T, B}, where T and B refer to the actions 'Transmit' and 'Buffer', respectively. When the node transmits the data, the action is referred to as 'Transmit'; if the node stalls the transmission or the transmission fails, the data remain in the buffer and the action is referred to as 'Buffer'. If a node successfully delivers the packet to its neighbour, its state is Delivered, and if the packet is not delivered, the node is in the Failed state. $Q(S, a_t)$ refers to the existing Q value of the node. $\max_{S \in \{D, F\}} (\beta \cdot Q(S, a_{t+1}))$ refers to the policy of selecting the action that yields the maximum reward; in our case, D is the desired state, which can be obtained by taking the action of data transmission, and if the data transmission fails, the node attracts a negative reward. $R_{t+1}$ refers to the reward allotted for taking an action. The learning rate ρ and the decay factor β are elaborated in the following sections.
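A minimal sketch of the per-transmission update of Eq. (1), assuming a small two-state/two-action table as described above; the reward and the ρ and β figures are placeholders, not values from the paper.

```python
STATES = ("D", "F")     # Delivered, Failed
ACTIONS = ("T", "B")    # Transmit, Buffer

def update_q(q, state, action, reward, rho, beta):
    """One Q-learning step following Eq. (1).

    q is a dict {(state, action): value}; rho is the learning rate and
    beta the decay factor estimated from residual energy and link stability.
    """
    future = max(beta * q[(s, action)] for s in STATES)   # max over possible next states
    q[(state, action)] += rho * (reward + future)
    return q

# Placeholder table and one successful transmission (reward = 1).
q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
q = update_q(q, state="D", action="T", reward=1.0, rho=0.5, beta=0.9)
print(q[("D", "T")])
```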
3.4 Estimation of the Decay Parameter
The decay parameter is used to estimate the staleness of the learned Q value in Q learning. The Q value is reduced as time passes, because an estimation once made may change over a period of time. In the proposed algorithm, the residual energy of the nodes is considered as the decay parameter: the energy level of a node reduces as it takes part in data transfers, and the more a node is involved in data transfer, the more it drains its residual energy. The decay factor calculated using the residual energy adds a standard deduction to the node's Q value whenever the node transmits data. When a CDS node's Q value goes below the threshold, the CDS re-computation is triggered. The decay factor is represented by the symbol β.

$\beta = r_0 e^{-0.001 t_r} \quad (2)$

where $t_r$ refers to the number of packets sent/received by a particular node and $r_0$ refers to the initial energy of the node.
3.5 Estimation of the Learning Rate
The learning rate is a parameter that controls the algorithm's speed of learning. When a node receives acknowledgements from more than one neighbour through beaconing, not all nodes can be assigned the same Q value; Q learning has to ensure that the best neighbour receives the highest Q value. The proposed algorithm identifies the best neighbour node by using the residual energy of the node and the signal stability between the nodes as learning-rate parameters. Through the learning rate, the node with better residual energy and link stability receives a higher Q value than the other nodes.
Estimation of the link stability learning rate $\rho_{ss}$: the signal stability between two nodes is estimated from the bit error rate.

$\rho_{ss} = LS_{ij} \propto \frac{1}{BER_{ij}} \quad (3)$

$LS_{ij} = k \cdot \frac{1}{BER_{ij}} \quad (4)$

where k is the proportionality constant, which can be assigned an integer value, and $LS_{ij}$ is the link stability between nodes i and j.
Estimation of the residual energy learning rate $\rho_{re}$: the residual energy of the node is evaluated from the initial energy of the node and the energy consumed.
$\rho_{re} = 1 - E_c \quad (5)$

Here $E_c$ is the consumed energy of the node, which is represented by

$E_c = E_t + E_r + E_p \quad (6)$

where
$E_c$ → energy consumed,
$E_t$ → energy spent to transmit a packet,
$E_r$ → energy spent to receive a packet,
$E_p$ → energy spent to process a packet.

$\rho_{re} = E_{AR} + E_{PR} \quad (7)$

Hence, the learning rate is calculated by

$\rho = \rho_{re} + \rho_{ss} \quad (8)$
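A sketch of how the decay factor and learning rate of Eqs. (2)-(6) and (8) might be computed per neighbour; the constant k, the per-packet energy bookkeeping, and the example numbers are assumptions for illustration (Eq. (7) is omitted here because $E_{AR}$ and $E_{PR}$ are not defined further in the text).

```python
import math

def decay_factor(r0: float, packets: int) -> float:
    """Eq. (2): beta = r0 * exp(-0.001 * t_r)."""
    return r0 * math.exp(-0.001 * packets)

def link_stability_rate(ber: float, k: float = 1.0) -> float:
    """Eqs. (3)-(4): rho_ss = LS_ij = k / BER_ij."""
    return k / ber

def residual_energy_rate(e_tx: float, e_rx: float, e_proc: float) -> float:
    """Eqs. (5)-(6): rho_re = 1 - E_c with E_c = E_t + E_r + E_p (normalized energies)."""
    return 1.0 - (e_tx + e_rx + e_proc)

def learning_rate(ber, e_tx, e_rx, e_proc, k=1.0):
    """Eq. (8): rho = rho_re + rho_ss."""
    return residual_energy_rate(e_tx, e_rx, e_proc) + link_stability_rate(ber, k)

# Example with placeholder per-packet energies (fractions of initial energy) and BER.
print(decay_factor(r0=1.0, packets=250))
print(learning_rate(ber=1e-3, e_tx=0.02, e_rx=0.01, e_proc=0.005, k=1e-4))
```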
3.6 Extended Q Value Estimation
In the extended Q learning algorithm, the Q value of a node is computed from its own Q value as well as the Q value of its best neighbour. Hence,

$Q_x(S, a_t) = w_1 Q_i(S, a_t) + w_2 \max_{j=1 \ldots n} Q_j(S, a_t) \quad (9)$

where $Q_x(S, a_t)$ refers to the extended Q value of node i incorporating the Q value of its neighbour, and $Q_i(S, a_t)$ is the Q value of node i estimated through Eq. (1). The term $\max_{j=1 \ldots n} Q_j(S, a_t)$ refers to the highest Q value found among the neighbouring nodes of node i. The direct Q value of the node and the maximum Q value of the neighbour nodes are given the weightages $w_1$ and $w_2$, which are 0.6 and 0.4, respectively, with $w_1 > w_2$. The Q value of any node therefore reflects its own quality as well as that of its one-hop neighbours.
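A one-line realization of Eq. (9), with the weights taken from the text (0.6 and 0.4); the neighbour Q values in the example are placeholders.

```python
def extended_q(own_q: float, neighbour_qs: list[float],
               w1: float = 0.6, w2: float = 0.4) -> float:
    """Eq. (9): blend a node's own Q value with its best neighbour's Q value."""
    return w1 * own_q + w2 * max(neighbour_qs)

# Node n7's own Q value blended with its best neighbour (e.g. n11) -- values invented.
print(extended_q(own_q=0.72, neighbour_qs=[0.35, 0.81, 0.40]))
```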
3.7 Exploration and Exploitation of the CDS
The process of deducing the CDS using the estimated Q values of the nodes is called the exploration process. CDS exploration happens only during the initial phase of CDS establishment and when the Q value of any one of the CDS nodes goes below the threshold value. During the exploration process, the node that initiates the CDS construction selects its neighbour node with the highest extended Q value as the next CDS node, and all its other one-hop neighbour nodes are declared as covered nodes. This incremental addition of the nodes with the highest extended Q values to the CDS continues until all nodes in the MANET are covered by the CDS. Once the CDS is established, all communications go through the backbone; this process is called exploitation. During the exploitation process, the Q value is calculated on every transaction, and if any CDS node's Q value goes below the threshold, the CDS exploration process is triggered again.
3.7.1 Algorithm for Exploration of Q-CDS
Step 1: Initialize the MANET by placing nodes randomly with equal energy and specified terrain dimension.
Step 2: Bootstrap the nodes with distance, BER and residual energy.
Step 3: Estimate signal stability learning rate using BER and distance.
Step 4: Estimate residual energy learning rate.
Step 5: Estimate the overall learning rate using signal stability learning rate and residual energy learning rate.
Step 6: Assign reward and penalty values for nodes based on packet transitions.
Step 7: Calculate Q value of the neighbouring nodes and incorporate the Q value of the two hop nodes obtained from neighbouring nodes.
Step 8: Explore the next best neighbour based on highest Q value and include it in the CDS. All the immediate neighbours will act as covered nodes.
Step 9: Repeat step 8 to form CDS until all nodes in the network are covered. Each and every node will update its Q value table about their neighbours.
Step 10: If the Q value of any one of the nodes decays below the threshold then reinitiate exploration again.
Figure 5 illustrates the flowchart of the extended Q CDS.
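For illustration, a compact sketch of the exploration loop of Steps 7–9, combining the extended Q value of Eq. (9) with the coverage rule; the topology, Q values, seed choice, and threshold handling are invented and simplified (static Q values, connected graph assumed).

```python
def explore_cds(adj, q, w1=0.6, w2=0.4):
    """Greedy exploration driven by extended Q values (Steps 7-9, simplified)."""
    def ext_q(node):
        neighbour_best = max(q[n] for n in adj[node])
        return w1 * q[node] + w2 * neighbour_best          # Eq. (9)

    seed = max(adj, key=ext_q)                              # start from the best node
    cds, covered = [seed], {seed} | set(adj[seed])
    while covered != set(adj):
        frontier = {v for u in cds for v in adj[u] if v not in cds}
        nxt = max(frontier, key=ext_q)                      # Step 8
        cds.append(nxt)
        covered |= {nxt} | set(adj[nxt])                    # neighbours become covered nodes
    return cds

# Toy topology and Q values loosely following Fig. 4 (all numbers invented).
adj = {"n2": {"n5", "n7"}, "n5": {"n2", "n3"}, "n3": {"n5", "n7"},
       "n7": {"n2", "n3", "n11"}, "n11": {"n7", "n10"}, "n10": {"n11"}}
q = {"n2": 0.8, "n5": 0.7, "n3": 0.2, "n7": 0.6, "n11": 0.9, "n10": 0.5}
print(explore_cds(adj, q))
```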
Fig. 5 Flowchart of the extended Q CDS
4 Results and Discussion
The extended Q CDS is implemented using NS2, and the simulation parameters are provided in Table 1. Figures 6 and 7 show screenshots of the NS2 simulation of the algorithm. The experiments have been carried out for different seed values, and the average is used for the results and analysis. The algorithm is evaluated by varying the number of nodes, and metrics such as packet delivery ratio, end-to-end delay, residual energy, and size of the CDS were measured.
Table 1 Simulation parameters in NS2
Parameter | Values
Number of nodes | 100
Speed | Up to 20 m/s
Mobility model | Random waypoint
Node placement | Random
Initial energy of the node | 400 J
Simulation area | 1000 m × 1000 m
Simulation time | 10 min
Fig. 6 Screenshot of NS2 implementation of extended Q CDS
Fig. 7 Screenshot of NS2 implementation of extended Q CDS trace file
The extended Q CDS algorithm is compared with reliable CDS (RCDS) [4], cognitive CDS (CDSCR) [2], and Q CDS [10], and it performs considerably better. Figure 8 illustrates that extended QCDS outperforms the other algorithms with respect to packet delivery ratio. The QCDS and extended QCDS algorithms construct almost the same CDS when the number of nodes is small, but as the number of nodes increases, the extended algorithm shows improvement in the CDS construction, which is reflected in the packet delivery ratio.
Fig. 8 Number of nodes versus packet delivery ratio (Packet Delivery Ratio (%) versus No. of Nodes; series: RDBR, CDSCR, QCDS, Ex QCDS)
Figure 9 shows that the extended Q CDS algorithm computes a very optimal and stable CDS, so its end-to-end delay is much lower than that of the other algorithms. Initially, all algorithms perform equally, but the performance degrades as the MANET size increases. RDBR and CDSCR are centralized CDS algorithms and QCDS follows the greedy approach; because of this, their performance degrades as more time is spent on route computation before data transfer. Inspection of Fig. 10 shows that the extended Q CDS computes the CDS with a minimum number of nodes. Although the size of the CDS increases linearly with the MANET size, the extended Q CDS still computes a CDS with fewer nodes, while its CDS size remains close to that of the other algorithms. Since RDBR and CDSCR follow a centralized approach, it is easy to optimize the redundancy in the CDS, and these algorithms construct a CDS with fewer nodes. Figure 11 shows that the control overhead increases linearly for all algorithms with respect to the number of nodes in the MANET.
Fig. 9 Number of nodes versus delay (end-to-end delay in ms; series: RDBR, CDSCR, QCDS, Ex Q CDS)
Fig. 10 Number of nodes versus number of nodes in CDS (series: RDBR, CDSCR, QCDS, Ex Q CDS)
Fig. 11 Number of nodes versus control overhead (series: RDBR, CDSCR, QCDS, Ex Q CDS)
RDBR and CDSCR demand control-message exchange whenever there is a change in the MANET topography, owing to their centralized approach. The extended Q CDS consumes less overhead, as the Q values are calculated over normal data transactions. Thus, whenever there is a failure in the CDS, the Q values are readily available at the node, and the CDS re-computation begins with a small number of control-overhead message exchanges.
5 Conclusion
In this paper, a technique for the construction of connected dominating sets using Q learning with extended episodic length is proposed. The results were found to be better when compared with existing heuristic CDS construction algorithms. In this work, the decay factor is measured from the neighbour nodes, and the energy level of the nodes is bootstrapped into the system before the algorithm begins the learning phase. In heterogeneous environments, the nodes will have varying energy levels, and hence the decay factor needs to be estimated dynamically with respect to the neighbour nodes' energy levels. In future, the episodic length can be further increased by adopting optimization algorithms such as ant colony optimization and tabu search.
References
1. Dowling S, Curran E, Cunningham R, Cahill V (2005) Using feedback in collaborative reinforcement learning to adaptively optimize MANET routing. IEEE Trans Syst Man Cybern Part A Syst Hum 35(3):360–372
2. Tilwari V, Dimyati M, Hindia A, Fattouh, Amiri I (2019) Mobility, residual energy, and link quality aware multipath routing in MANETs with Q-learning algorithm. Appl Sci 9(8):1582
3. Duraipandian M (2019) Performance evaluation of routing algorithm for Manet based on the machine learning techniques. J Trends Comput Sci Smart Technol (TCSST) 1(01):25–38
4. Nurmi P (2007) Reinforcement learning for routing in ad hoc networks. In: 2007 5th international symposium on modeling and optimization in mobile, ad hoc and wireless networks and workshops
5. Usaha W, Barria J (2004) A reinforcement learning ticket-based probing path discovery scheme for MANETs. Ad Hoc Netw
6. Preetha K, Unnikrishnan (2017) Enhanced domination set based routing in mobile ad hoc networks with reliable nodes. Comput Electr Eng 64:595–604
7. Tran TN, Nguyen T-V, An B (2019) An efficient connected dominating set clustering based routing protocol with dynamic channel selection in cognitive mobile ad hoc networks. Comput Electr Eng
8. Hedar AR, Ismail R, El-Sayed GA, Khayyat KMJ (2018) Two meta-heuristics designed to solve the minimum connected dominating set problem for wireless networks design and management. J Netw Syst Manage 27(3):647–687
9. Smys S, Bala GJ, Raj JS (2010) Self-organizing hierarchical structure for wireless networks. In: 2010 international conference on advances in computer engineering. https://doi.org/10.1109/ace
10. John Deva Prasanna DS, John Aravindhar D, Sivasankar P (2019) Reinforcement learning based virtual backbone construction in Manet using connected dominating sets. J Crit Rev
A Graphical User Interface Based Heart Rate Monitoring Process and Detection of PQRST Peaks from ECG Signal M. Ramkumar, C. Ganesh Babu, A. Manjunathan, S. Udhayanan, M. Mathankumar, and R. Sarath Kumar
Abstract An electrocardiogram (EKG/ECG) is a recording of the electrical impulses of the cardiac muscle and is utilized in investigating and detecting cardiac disease or arrhythmia. This electrical activity of the heart's cardiac muscle is translated into line tracings on paper; the dips and spikes of these tracings form a series of waves. This wave series consists of waveforms with distinct, discernible characteristics, differentiated as the P peak, Q peak, R peak, S peak, T peak, and sometimes a U peak. One of the most established methodologies for analyzing the ECG signal to detect the PQRST waveform is based on digital signal processing techniques such as the fast Fourier transform (FFT), the discrete wavelet transform (DWT), and artificial feed-forward neural networks.
M. Ramkumar (B) · R. Sarath Kumar
Department of Electronics and Communication Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, India
e-mail: [email protected]
R. Sarath Kumar
e-mail: [email protected]
C. Ganesh Babu
Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India
e-mail: [email protected]
A. Manjunathan
Department of Electronics and Communication Engineering, K. Ramakrishnan College of Technology, Trichy, India
e-mail: [email protected]
S. Udhayanan
Department of Electronics and Communication Engineering, Sri Bharathi Engineering College for Women, Pudukkottai, India
e-mail: [email protected]
M. Mathankumar
Department of Electrical and Electronics Engineering, Kumaraguru College of Technology, Coimbatore, India
e-mail: [email protected]
However, the present work proposes a very dependable and simple method for detecting the P, Q, R, S, and T peak values of an ECG waveform. The technique is based on determining the mathematical relationship between the ECG signal's peak values and the time sequence. The methodology is focused on designing a graphical user interface (GUI), using the MATLAB tool, that performs the PQRST detection and plots these peak values on the ECG waveform at their respective time instants. Such ECG signal processing techniques are intended to aid scientific research rather than medical diagnosis.
Keywords Graphical user interface · MATLAB · Analysis of QRS complex · Heart rate detection · ECG feature extraction · Leads of ECG
1 Introduction
The ECG or electrocardiogram is one of the simplest and most accurate tests utilized for estimating the heart's clinical condition. ECG analysis is the most widely utilized technique, since it is capable of screening for many different abnormalities of the heart. ECG tools and devices, which measure the heart rate and plot the waveform based on the heartbeat frequency, are commonly available in medical centres and are inexpensive and risk-free with respect to treatment. From the tracing of the ECG waveform, the following information can be identified [1].
1. Heart rhythm
2. Heart rate
3. Thickening of the cardiac muscle
4. Any physical symptoms indicating the possibility of a heart attack
5. Detection and estimation of coronary artery disease
6. Detection of conduction abnormalities (an abnormal state in which the spreading of the electrical impulse across the cardiac muscle is affected)
The above-mentioned traits are among the most significant degradation factors affecting cardiac function. The results of the ECG signal tracing allow doctors to determine the exact clinical condition of the heart, and research in this field works on enhancing non-invasive detection techniques for estimating cardiac abnormalities. Additional ECG testing is normally carried out to identify a serious problem and thereby choose a better course of treatment; in hospitals, testing related to the ECG most commonly includes the stress test and cardiac catheterization. Subjects who present with severe chest pain, increased blood pressure, variation in the velocity of blood flow through the arteries and veins, or unbalanced blood cholesterol levels are advised to undergo the ECG diagnostic test to check whether there is any thickening of the valves of the cardiac muscle causing an arrhythmia or abnormal condition.
electrical signal pattern of the ECG, which indicate an abnormal condition of the heart. Hence, over every cardiac cycle, the ECG represents the graphical pattern of the bioelectrical signal generated within the human body [2]. Relevant and useful information about the heart's function can be obtained from the ECG chart or graphical waveform, whose waves and baseline denote the variation of the cardiac muscle voltage at every instant of time [3]. The ECG therefore carries adequate amplitude information at every instant of time [4] and aids in the diagnosis of certain clinical conditions, namely:
1. Heart attack (myocardial infarction)
2. Electrolytic transformations
3. Ventricular and auricular hypertrophy
4. Pericarditis
5. Cardiac arrhythmias
6. Effects of medication on the cardiac muscle, specifically quinidine and digitalis
7. Abnormal blood pressure and inadequate velocity of blood flow.
This study proposes a prototype software package developed mainly for scientific and technological research supporting medical diagnosis in the clinical setting. MATLAB is used to develop the software package and to design the graphical user interface for analyzing the P, Q, R, S, and T peak values of an ECG signal. All peak-value parameters are computed from recordings of the ECG signal, which can be supplied as input data in binary, text, or simple Excel files. The remainder of this paper is organized as follows. The second section reviews the literature on ECG feature extraction techniques. The third section summarizes the nature of continuous ECG signal patterns and the placement of leads. The fourth section describes the developed software technique that analyzes the ECG signal to obtain the P, Q, R, S, and T peak amplitudes. The fifth section presents the results and discussion obtained from the simulations, and the sixth section concludes the paper and outlines the scope of future work.
2 Review of Related Literature
The recording of the continuous ECG signal plays a vital role in the initial stage of diagnosing cardiac disease; the signal is then processed and analyzed with the assistance of a signal conditioning device. Although ECG analysis is based on the determination of heartbeats and heart rates, abnormalities in heart function can also result from physical changes in the position and size of the chambers or from the continuous consumption of drugs prescribed by physicians. Acquiring the ECG signals from
the electrocardiogram device helps in diagnosing cardiac abnormalities through suitable predefinition of the ECG signal parameters configured in the device; with the development of optimization and prediction techniques, the abnormal conditions of the heart can be analyzed more accurately using computational algorithms [5]. Many computational and artificial intelligence algorithms have been developed to classify the various abnormalities associated with cardiac muscle function, including fuzzy logic techniques, artificial neural networks, digital signal processing analysis, support vector machines, genetic algorithms, self-organizing maps, hidden Markov models, Bayesian approaches, the slope vector waveform (SVW), the difference operation method (DOM), the discrete wavelet transform (DWT), and the fast Fourier transform (FFT) [6–11]. All of these methods analyze the P wave, QRS complex, and T wave peak amplitudes over different cycles of the ECG signal. For instance, the slope vector waveform (SVW) method proposes an algorithm that detects the QRS complex of the ECG signal and evaluates the R-R interval. The SVW technique offers better prediction of the R-R interval and the QRS peak with adequate accuracy, which supports good classification results through better feature extraction from the ECG wave components [12]. The discrete wavelet transform (DWT) method, in contrast, extracts features from the ECG input signal and then classifies them into different sets. The DWT-based approach consists of a series of processing modules: preprocessing to eliminate noise and produce a noise-free signal, feature extraction, feature selection, and finally classification. In feature extraction, the wavelet transform is designed to address the non-stationary nature of the ECG signal; it is derived from a single generating function, called the mother wavelet, through translation and dilation. Applying this technique yields a variable window size, narrow at very high frequencies and broad at low frequencies, which leads to optimal time-frequency resolution over the full frequency range [13]. The difference operation method (DOM) is used to detect the QRS complex of an ECG signal and comprises two main processes: the difference operation process (DOP) and the wave detection process. Its outline consists of two main steps: first, the R peak point is found by applying the difference operation to the ECG signal; second, the Q and S peak points are located relative to the R peak point to determine the QRS complex [14]. Another system determines ECG waveform features using neural networks; it is an integrated system that uses cepstrum coefficients to extract features from artificial neural network models and long-term ECG signals for classification.
With these methods, features of an ECG signal can be recognized, classification can be performed, and arrhythmias can be detected [15]. Another novel technique uses neural networks and the wavelet transform
to classify ECG images on the basis of their extracted features. Features are extracted by wavelet decomposition of the ECG image intensities and then processed further by artificial neural networks; the essential features are the median, mean, maxima, minima, standard deviation, mean absolute deviation, and variance [16]. Yet another technique detects the PQRST waveform using the derivative of the ECG wave component, searching for its maximum and minimum. The R peak, which is the highest peak, must lie within the zero crossing between the minimum and the maximum of the derivative. The Q peak must occur before the zero crossing preceding the maximum, and the S peak relies on the zero crossing after the minimum. The P and T peaks are found similarly by examining local maxima in the original signal and then using the derivative to identify the peak and end points [17]. The present study presents a dependable and simple methodology for detecting the P peak, QRS peaks, and T peak values of an ECG signal. The method is based on determining the mathematical relationship between the maximum peak and valley values of the ECG signal with respect to time. A GUI is designed in MATLAB to detect the PQRST waveform by applying a simple mathematical algorithm and to plot these values over the ECG signal with respect to time. In addition, denoising is applied to extract a noise-free signal.
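The derivative-based search described in [17] can be sketched in a few lines of MATLAB. The function below is only an illustrative outline, not the implementation used in this paper or in [17]; the signal ecg and sampling rate fs are assumed inputs, and the slope threshold and search window are assumed heuristics that ignore refinements such as refractory periods.

function rIdx = detect_r_by_derivative(ecg, fs)
% Illustrative derivative/zero-crossing R-peak search (not the authors' code).
d   = diff(ecg(:));                          % first derivative of the ECG samples
thr = 0.5 * max(abs(d));                     % crude slope threshold (assumed heuristic)
zc  = find(d(1:end-1) > 0 & d(2:end) <= 0);  % zero crossings from positive to negative slope
rIdx = [];
for k = 1:numel(zc)
    i  = zc(k) + 1;                          % candidate peak sample
    lo = max(1, i - round(0.05*fs));         % look back ~50 ms for a steep upslope
    if max(d(lo:i-1)) > thr
        rIdx(end+1) = i;                     %#ok<AGROW> keep candidates preceded by a steep rise
    end
end
end

Called, for example, as rIdx = detect_r_by_derivative(ecg, 360), the returned indices can then feed the heart rate calculation described in Sect. 4.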
3 Description of the ECG Signal
An electrocardiogram measures the electrical impulse activity of the cardiac muscle, which can be acquired from the skin's surface at various angles as shown in Fig. 1. When contraction of the cardiac muscle is initiated and blood is pumped to the various parts of the body, action potentials are produced within the cardiac muscle, and these give rise to the measured electrical impulse activity [18].
3.1 Waveform of the ECG
An ECG signal is the waveform plot, printed as a paper trace, that records the electrical impulse activity of the human cardiac muscle. The normal ECG signal comprises a series of negative and positive waveform cycles: the P wave, the QRS complex, and the T wave. The
Fig. 1 Anatomy of cardiac muscle along with the signals acquired from different regions of the heart
P wave amplitude and the QRS complex exhibit linear relationships that help distinguish different irregularities of the cardiac muscle. The typical ECG waveform is depicted in Fig. 2, in which the peak amplitude of the P wave denotes the state of atrial depolarization and forms the initial upward deflection. The QRS complex comprises three peaks, namely the Q, R, and S peaks, and denotes the state of ventricular depolarization, while the T peak corresponds to ventricular repolarization and marks the end of ventricular systole [18]. In the typical ECG waveform of Fig. 2, the horizontal axis of the plot denotes time, whereas the vertical axis denotes the depth and height of the wave, with amplitude measured in terms of voltage. The first timing interval along the horizontal axis is the P-R interval, which denotes the time from the onset of the P peak to the beginning of the QRS complex. This interval covers the time between
Fig. 2 Typical ECG waveform [18]
Table 1 Range of normal amplitudes of typical ECG waveform [18]

| Signal | Lead 1 | Lead 2 | Lead 3 | Lead aVR | Lead aVL | Lead aVF |
| P | 0.015–0.12 | 0.00–0.19 | −0.073–0.13 | −0.179–0.01 | −0.085–0.140 | −0.06–0.16 |
| Q | 0.00–0.16 | 0.00–0.18 | 0.00–0.28 | 0.00–0.90 | 0.00–0.22 | 0.00–0.19 |
| R | 0.02–0.13 | 0.18–1.68 | 0.03–1.31 | 0.00–0.33 | 0.00–0.75 | 0.02–0.15 |
| S | 0.00–0.36 | 0.00–0.49 | 0.00–0.55 | 0.00–0.15 | 0.00–0.90 | 0.00–0.71 |
| T | 0.06–0.42 | 0.06–0.55 | 0.06–0.3 | 0.00–0.00 | −0.16–0.27 | 0.04–0.46 |

| Signal | Lead V1 | Lead V2 | Lead V3 | Lead V4 | Lead V5 | Lead V6 |
| P | −0.08–0.18 | 0.15–0.16 | 0.00–0.18 | 0.01–0.23 | 0.00–0.24 | 0.00–0.19 |
| Q | – | – | 0.00–0.05 | 0.00–0.16 | 0.00–0.21 | 0.00–0.27 |
| R | 0.00–0.49 | 0.04–1.52 | 0.06–2.24 | 0.18–3.20 | 0.42–2.42 | 0.25–2.60 |
| S | 0.08–2.13 | 0.19–2.74 | 0.09–2.22 | 0.02–2.09 | 0.00–0.97 | 0.00–0.84 |
| T | 0.03–1.22 | −0.14–1.44 | 0.00–1.60 | 0.05–1.31 | 0.0–0.96 | 0.0–0.67 |
the onset of atrial depolarization and the onset of ventricular depolarization. The QRS complex is followed by the S-T segment, which spans from the terminal point of the S peak (the J point) to the beginning of the T peak and represents the time between ventricular depolarization and repolarization. The Q-T interval is the time from the onset of the Q peak to the end of the T peak within the electrical impulse cycle of the cardiac muscle; it denotes the entire duration of ventricular depolarization and repolarization. Table 1 lists the normal amplitude ranges for an ECG signal.
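Once the peak and boundary sample indices are available, the intervals described above follow from simple arithmetic with the sampling rate. The MATLAB fragment below is only a hedged illustration; the sampling rate and the index variables (pOnset, qrsOnset, qOnset, jPoint, tOnset, tEnd) are hypothetical values that an upstream peak- and boundary-detection stage would have to supply.

fs = 360;                                 % assumed sampling rate in Hz
% Hypothetical sample indices from an upstream detector
pOnset = 100; qrsOnset = 148; qOnset = 148; jPoint = 200; tOnset = 240; tEnd = 300;

pr_interval = (qrsOnset - pOnset) / fs;   % P-R interval in seconds
st_segment  = (tOnset  - jPoint) / fs;    % S-T segment in seconds
qt_interval = (tEnd    - qOnset) / fs;    % Q-T interval in seconds
fprintf('PR = %.3f s, ST = %.3f s, QT = %.3f s\n', pr_interval, st_segment, qt_interval);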
3.2 Leads in ECG [19–23]
Any contraction of a muscle produces an electrical variation in the form of depolarization, and this variation can be detected by a pair of electrodes placed on the body surface. The ECG leads constitute imaginary axial lines between pairs of electrodes; altogether there are 12 leads, and each one captures the electrical impulse activity of the cardiac muscle from a different orientation. The result is 12 different electrical plots with various shapes and voltage levels, depending on the electrode placement on the body surface, which together provide a multi-dimensional projection of the cardiac
muscle and its different functional states. The standard 12 ECG leads are partitioned into two groups. The first group, the limb leads, comprises the three bipolar limb leads (1, 2, and 3). Lead 1 is acquired between a negative electrode on the right forearm and a positive electrode on the left forearm. Lead 2 is acquired between a negative electrode on the right forearm and a positive electrode on the left foot. Lead 3 is acquired between a negative electrode on the left forearm and a positive electrode on the left foot. The second group comprises the augmented leads, denoted aVR, aVL, and aVF, and the chest leads, also called the V leads (V1, V2, V3, V4, V5, and V6) or precordial leads. The 12 ECG leads are thus described, and the schematic representation of the electrode positions is depicted in Fig. 3.
4 Definition of Heart Rate
Heart rate is the speed of the heartbeat, measured as the total count of heartbeats in a specific interval of time and normally expressed in bpm (beats per minute). The normal heart rate of a healthy human ranges from 60 to 100 beats per minute, and its value varies with sex, age, and other relevant factors. When the heart rate is lower than 60 beats per minute the condition is termed bradycardia, whereas when it is higher than 100 beats per minute the condition is termed tachycardia [24, 25]. Several techniques determine the heart rate from the ECG signal using the R-R interval, as follows. The first counts the total number of R peaks in a 6-s rhythm strip and multiplies this value by 10. The second counts the small boxes that make up a typical R-R interval and divides 1500 by this count to obtain the heart rate. The third counts the large boxes between successive R peaks to obtain a typical R-R interval and divides 300 by this count to obtain the heart rate [22].
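As a hedged illustration of the R-R-based calculation (not code from the paper), the following MATLAB fragment converts detected R-peak sample indices into a heart rate and classifies it against the 60–100 bpm range; rIdx and fs are assumed to come from an earlier detection stage.

fs   = 360;                        % assumed sampling rate in Hz
rIdx = [120 420 715 1012 1308];    % hypothetical R-peak sample indices
rr   = diff(rIdx) / fs;            % R-R intervals in seconds
hr   = 60 / mean(rr);              % average heart rate in beats per minute
if hr < 60
    fprintf('Heart rate %.0f bpm: bradycardia\n', hr);
elseif hr > 100
    fprintf('Heart rate %.0f bpm: tachycardia\n', hr);
else
    fprintf('Heart rate %.0f bpm: normal\n', hr);
end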
5 Results and Discussion
In this study, the results are simulated in MATLAB, where the recordings of the electrocardiogram signal are analyzed to acquire the P peak,
[Flowchart: Start → Input ECG data → Selection of lead (1, 2, 3) → Selection of file type (.mat, .xlsx, .txt) → Read the ECG signal → Removal of low-frequency components → Windowing filter and thresholding → Adjust the filter coefficients, plot the PQRST wave, and detect the heart rate → End]
Fig. 3 Precordial chest electrodes usually located on the left side of the chest
Q peak, R peak, S peak, and T peak, and finally detect the heart rate. The graphical user interface (GUI) makes it very simple to obtain the PQRST peak values and plot them from the ECG analysis. Anyone who tests this source code must select the total sample count for a single ECG cardiac cycle used to detect the PQRST peaks, and this count must equal 400 samples. The software tool provides the following essential features for processing and analyzing the ECG signal:
1. Preliminary recordings of the ECG signal can be loaded from any source in the form of Excel, binary, or text files.
2. The loaded ECG recordings are plotted for every lead.
3. The PQRST peaks are detected as unique values and made to appear on the plot.
4. The graph can be exported as bmp, png, or fig files.
5. The data can be saved as mat, txt, or xlsx files.
6. The plot can be extracted for any ECG lead as a response.
7. Finally, R peak detection and heart rate measurement are established with the sequence depicted in Fig. 4.
Figure 4 depicts the sequential steps for plotting the ECG PQRST waveform and detecting the heart rate. Initially, the ECG data are read after selecting the lead and the file type from any of the input source files. Once the acquisition is made, the low-frequency components are removed, a window filtering technique is applied, and thresholding is performed. After thresholding, the filter coefficients are adjusted and the R peak is detected to estimate the heart rate, while the PQRST peaks are detected with simple mathematical operations coded in MATLAB; a hedged sketch of this pipeline is given below.
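The following MATLAB fragment is a minimal sketch of the pipeline just described, assuming a single-lead recording in the vector ecg and a sampling rate fs; the moving-average baseline removal, the 50 ms smoothing window, and the 60% threshold are illustrative choices, not the parameters used by the authors' GUI.

fs  = 360;                                   % assumed sampling rate in Hz
ecg = ecg(:);                                % column vector of raw samples
% 1. Remove low-frequency baseline with a 1-s moving average (assumption)
base = conv(ecg, ones(fs,1)/fs, 'same');
x    = ecg - base;
% 2. Window filtering: smooth with a short moving-average window (assumption)
w  = round(0.05*fs);
xs = conv(x, ones(w,1)/w, 'same');
% 3. Thresholding: keep samples above 60% of the maximum (assumed heuristic)
thr  = 0.6 * max(xs);
mask = xs > thr;
% 4. Take one R peak per above-threshold region and estimate the heart rate
edges = find(diff([0; mask]) == 1);          % starts of threshold crossings
ends  = find(diff([mask; 0]) == -1);         % ends of threshold crossings
rIdx  = zeros(size(edges));
for k = 1:numel(edges)
    [~, m]  = max(xs(edges(k):ends(k)));
    rIdx(k) = edges(k) + m - 1;
end
hr = 60 / mean(diff(rIdx)/fs);               % heart rate in beats per minute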
Fig. 4 Sequential steps for plotting ECG PQRST waveform and the detection of heart rate
Fig. 5 Graphical user interface design for acquiring the ECG signal
Figure 5 depicts the design of the graphical user interface (GUI) for plotting the ECG wave and the PQRST peaks. The following sequence of steps is used to frame the algorithm with the MATLAB tool.
1. Determine the sampling rate of the ECG waveform and estimate the heart rate calculation.
2. Using the window filtering technique, detect the heart rate.
3. Obtain the plot and save it as a (.png), (.bmp), or (.fig) file.
4. Save the data in txt, mat, or xlsx format.
5. Analyze the acquired ECG data and estimate the values after detecting the PQRST peaks.
6. After the PQRST plot is created, extract the plot with the marked peak values.
7. Save the marked plot in any one of the formats (.png), (.bmp), or (.fig).
8. Select, as required, whether to print all samples or only specific samples.
9. Save the graph, proceed with the program for heart rate detection, and acquire the ECG plot again in txt, mat, or xlsx format.
10. Based on the selected lead, view the corresponding ECG plot and read the heart rate from the waveform.
11. On entering the sampling data range for analysis, import the ECG and acquire the PQRST peaks.
Figure 6 illustrates the plot of the raw ECG signal, and the filtered signal is shown in Fig. 7. The detection of the R peak points used to estimate the heart rate of the ECG signal is depicted in Fig. 8; it demonstrates how the P, Q, R, S, and T peak values are acquired for a data range of approximately 400 samples and how the heart rate detection is established.
Fig. 6 Plotting of raw ECG signal
Fig. 7 Plotting of filtered ECG signal
Fig. 8 Detection of peak points in the ECG signal
Figure 9 depicts the acquisition of the QRS-filtered signal along with the identified pulse train formed by adaptive threshold detection. From the above-mentioned steps, the graphical user interface is designed to acquire the ECG signal for further processing. The main benefit of designing a GUI is that the ECG can be acquired directly from a database such as the MIT-BIH arrhythmia database to detect cardiac abnormalities. In this proposed study, the acquired ECG signal is processed only after the PQRST peaks have been detected, and the mapping can be directly interpreted to measure the heart rate and monitor it continuously. Figure 5 shows the GUI for acquiring the ECG; a virtual key acquires the input, which is stored in text format. Once the ECG
[Fig. 9 panels: QRS on filtered signal; QRS on MVI signal with noise level (black), signal level (red), and adaptive threshold (green); pulse train of the found QRS on the ECG signal]
Fig. 9 Sequence of acquisition of the filtered QRS along with the identified pulse train
signal acquisition is completed, the abnormality state of the cardiac muscle can be diagnosed from the PQRST peak values and the durations of the R-R, R-T, Q-T, and S-T intervals by processing them with computational intelligence techniques. For detecting cardiac arrhythmias from ECG signals acquired from the MIT-BIH arrhythmia database, machine learning algorithms such as artificial neural networks, genetic algorithms, and fuzzy logic techniques can be developed; these serve as non-invasive techniques for detecting abnormal states of the cardiac muscle that can otherwise lead to sudden death. Preprocessing of the raw acquired ECG signal includes denoising, dimensionality reduction and baseline wander removal, feature extraction, feature selection, and classification of the ECG into the normal or abnormal category. The essential point in designing the GUI is that, once a clear plot has been produced with the predefined peaks and the intervals between them, computational intelligence techniques for classifying cardiac arrhythmias can be further developed, helping doctors choose the right course of treatment. Testing is carried out with data acquired from the MIT-BIH PhysioNet database. A normal ECG signal is acquired from the PhysioNet database, and plots of the raw and filtered ECG components are produced. The plot of the raw ECG signal from the simulated GUI is depicted in Fig. 6. Immediately after the ECG signal acquisition, denoising is applied to obtain the noise-free ECG signal, whose plot is depicted in Fig. 7. For the noise-free ECG signal
Table 2 Comparison of PQRST peaks and the heart rate of the normal acquired ECG signal

| Parameters of ECG | Standard PQRST values | Detected PQRST values |
| P | 0.25 mV | 0.054 mV |
| Q | 25% of R wave | −0.435 mV |
| R | 1.60 mV | 1.495 mV |
| T | 0.1–0.5 mV | 0.114 mV |
| Heart rate | 60–100 bpm | 78 bpm |
component, the R peak detection is represented as a plot and is depicted in Fig. 8. The peak values of the acquired normal ECG signal are compared with the standard PQRST values, along with the heart rate, in Table 2. Table 2 compares the PQRST peak values and the heart rate between the standardized values and the obtained values: the detected values of the P, Q, R, and T peaks are 0.054 mV, −0.435 mV, 1.495 mV, and 0.114 mV, respectively, and the determined heart rate is 78 beats per minute. This provides a reference for detecting the PQRST peaks and the heart rate from the GUI design, which in turn supports the development of machine learning algorithms.
6 Conclusion and Scope of Future Work
This study has proposed a technique for monitoring the heart rate and detecting the PQRST peaks from the acquired ECG signal by designing a GUI in MATLAB. The detection process can be used by clinical analysts as well as researchers who diagnose abnormalities from the ECG signal. The earlier techniques for analyzing the ECG signal and estimating the PQRST peaks, based on digital signal processing and artificial neural networks, can be enhanced in accuracy using this MATLAB-based approach. The proposed GUI-based extraction method also enables prediction of the optimal heart rate value and prediction of the cardiac diseases designated as cardiac arrhythmias. As future work, classification algorithms based on computational intelligence techniques could be implemented to diagnose arrhythmias in a non-invasive manner. Similar to processing the MIT-BIH arrhythmia database, the ECG signal could be processed through real-time acquisition, for which the graphical user interface could be developed by integrating machine learning algorithms for diagnosing abnormal cardiac conditions.
References 1. Bronzino JD (2000) The biomedical engineering handbook, vol. 1, 2nded. CRC Press LLC 2. Goldshlager N (1989) Principles of clinical electrocardiography, Appleton & Lange, 13th edn. Connecticut, USA 3. Singh N, Mishra R (2012) Microcontroller based wireless transmission on biomedical signal and simulation in Matlab. IOSR J Eng 2(12) 4. Acharya RU, Kumar A, Bhat PS Lim CM, Iyengar SS, Kannathal N, Krishnan SM (2004) Classification of cardiac abnormalities using heart rate signals. Med Biol Eng Comput 42:172– 182 5. Babak M, Setarehdan SK (2006) Neural network based arrhythmia classification using heart rate variability signal. In: Signal Processing Issue: EUSIPCO-2006, Sept 2006 6. Beniteza D, Gaydeckia PA, Zaidib A, Fitzpatrickb AP (2001) The use of the Hilbert transform in ECG signal analysis. Comput Biol Med 31:399–406 7. De Chazal P, O’Dwyer M, Reilly RB (2004) Automatic classification of heartbeats using ECG morphology and heartbeat interval features. IEEE Trans Biomed Eng 51(7):1196–1206 8. Dewangan NK, Shukla SP (2015) A survey on ECG signal feature extraction and analysis techniques. Int J Innov Res Electr Electron Instrum Control Eng 3(6):12–19 9. Dima SM, Panagiotou C, Mazomenos EB, Rosengarten JA, Maharatna K, Gialelis JV, Curzen N, Morgan J (2013) On the detection of myocardial scar-based on ECG/VCG Analysis. IEEE Trans Biomed Eng 60(12):3399–3409 10. Ebrahimi A, Addeh J (2015) Classification of ECG arrhythmias using adaptive neuro-fuzzy inference system and Cuckoo optimization algorithm. CRPASE 01(04):134–140. ISSN 24234591 11. Burhan E (2013) Comparison of wavelet types and thresholding methods on waveletbased denoising of heart sounds. J Signal Inf Process JSIP-2013 4:164–167 12. Ingole MD, Alaspure SV, Ingole DT (2014) Electrocardiogram (ECG) signals feature extraction and classification using various signal analysis techniques. Int J Eng Sci Res Technol 3(1):39–44 13. Jeba J (2015) Classification of arrhythmias using support vector machine. In: National conference on research advances in communication, computation, electrical science and structures-2015, pp 1–4 14. Kar A, Das L (2011) A technical review on statistical feature extraction of ECG signal. In: IJCA special issue on 2nd national conference computing, communication, and sensor network, CCSN, 2011, pp 35–40 15. Kelwade JP, Salankar SS (2015) Prediction of cardiac arrhythmia using artificial neural network. Int J Comput Appl 115(20):30–35. ISSN 0975-8887. 16. Kohler B, Hennig C, Orglmeister R, (2002) The principles of software QRS detection reviewing and comparing algorithms for detecting this important ECG waveform. IEEE Eng Med Biol 42–57 17. Kutlu Y Kuntalp D (2012) Feature extraction for ECG heartbeats using higher order statistics of WPD coefficients. Comput Methods Progr Biomed 105(3):257–267 18. Li Q, Rajagopalan C, Clifford GD (2014) Ventricular fibrillation and tachycardia classification using a machine learning approach. IEEE Trans Biomed Eng 61(6):1607–1613 19. Luz EJDS, Nunes TM, Albuquerque VHCD, Papa JP, Menotti D (2013) ECG arrhythmia classification based on optimum-path forest. Expert Syst Appl 40(9):3561–3573 20. Malviya N, Rao TVKH (2013) De-noising ECG signals using adaptive filtering algorithms. Int J Technol Res Eng 1(1):75–79. ISSN 2347-4718 21. Markowska-Kaczmar U, Kordas B (2005) Mining of an electrocardiogram. In: Conference proceedings, pp169–175 22. Masethe HD, Masethe MA (2014) Prediction of heart disease using classification algorithms. 
In: Proceedings of the world congress on engineering and computer science WCECS 2014, vol 2, pp 22–24
23. Moavenian M, Khorrami H (2010) A qualitative comparison of artificial neural networks and support vector machines in ECG arrhythmias classification. Expert Syst Appl 37(4):3088–3093. https://doi.org/10.1016/j.eswa.2009.09.021 24. Muthuchudar A, Baboo SS (2013) A study of the processes involved in ECG signal analysis. Int J Sci Res Publ 3(3):1–5 25. Narayana KVL, Rao AB (2011) Wavelet-based QRS detection in ECG using MATLAB. Innov Syst Des Eng 2(7):60–70
Performance Analysis of Self Adaptive Equalizers Using Nature Inspired Algorithm N. Shwetha and Manoj Priyatham
Abstract Through a communication channel, a sender transmits a message to a receiver, but noise in the channel means the received message is not identical to the transmitted one. Likewise, in a digital communication channel the transmitted signal may become dispersed, so the transmitted and received information differ. Inter-symbol interference (ISI) and additive noise cause this dispersion of the signal. If the channel were known exactly, the ISI could be reduced; in practice, however, preliminary information about the channel attributes is rarely available, and inaccuracies also arise in physical implementations of the filters. Equalization is used to counteract the resulting residual distortion. This article considers the realization of an adaptive equalizer for data transfer through a channel that introduces ISI. One way to reduce the impact of this problem is to use a channel equalizer at the receiver, whose role is to reconstruct a version of the transmitted signal that is as close to it as possible. The equalizer is used to decrease the bit error rate (BER), the proportion of received bits in error to the overall transferred bits. In this article, a hybrid approach combining the least mean square (LMS) and evolutionary programming LMS (EPLMS) algorithms is used to obtain the minimum mean square error (MSE) and an optimum convergence rate, which improves the efficiency of the communication system.
Keywords Inter symbol interference · Least mean square · Evolutionary programming LMS · Bit error rate (BER) · Adaptive equalizer · Communication channel
N. Shwetha (B) Department of ECE, Dr. Ambedkar Institute of Technology, Bangalore, Karnataka 560056, India e-mail: [email protected] M. Priyatham Department of ECE, APS College of Engineering, Bangalore, Karnataka 560082, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_37
1 Introduction
With the arrival of digital technology, digital signal communication has become essential in a wide range of applications, which have driven the development of several modulation schemes and subsequent refinements [1]. These schemes, however, are highly affected by noise. Two fundamental problems occur in traditional digital transmission: inter-symbol interference (ISI) and noise have a strong impact on these methods and their refinements. The errors are caused by the characteristics of the channel linking the receiver and transmitter and by the spreading of the transmitted pulse. The impact of noise on the communication is determined by the channel features and may be diminished by appropriate channel selection [2–5]. Even if the channel remains noisy, the received signal can be less affected if the SNR is maintained at the transmitter by increasing the transmitted signal power [6]. Because of ISI, the energy of one symbol spreads into adjacent symbol durations, which impairs communication and smears the symbols. An efficient way of reducing this effect is an adaptively equalized channel. In digital communication systems, adaptive equalization is critical for diminishing the impact of ISI; an adaptive algorithm such as LMS adjusts the coefficients of the equalizer. When everything is ideal at the receiver, there is no interaction among consecutive symbols: each symbol arrives and is decoded independently of all others [7, 8]. However, once the symbols interact with each other, the waveform of one symbol corrupts the value of an adjacent symbol, and the received signal becomes distorted. It then becomes hard to distinguish the message, because the received pulses corresponding to different symbols are no longer distinguishable. This impairment is known as inter-symbol interference (ISI). Its impact can be reduced by using a channel equalizer at the receiver. Two intensively developing fields of digital transmission, cellular communications and digital subscriber lines, are heavily reliant on the implementation of a trusted channel equalizer (Fig. 1).
Fig. 1 Fundamental structure of channel equalization
where z^(-d) denotes a delay of d samples.
2 Concept of Inter-Symbol Interference
In a digital communication system, if everything is right at the receiver side there is no interaction among successive symbols: each arriving symbol is decoded independently of the others. When symbols do interact, however, one waveform corrupts the values of the neighboring symbols, the received signal is distorted, and it becomes difficult to recover the message from it. This impairment is known as inter-symbol interference (ISI). The purpose of an equalizer is to reduce the ISI so that the transmitted signal can be reconstructed at the receiver; in doing so it also reduces the bit error rate of the transmitted signal. The assumption of an ideal all-pass AWGN channel is impractical: because the frequency spectrum is limited, the signal is filtered to restrict its bandwidth so that frequency-division structuring can be obtained. Many practical channels are band-pass, and their response varies across frequency components, so the simple AWGN model does not represent practical channels accurately. A commonly used refinement is the dispersive channel model shown in Fig. 2:

y(t) = x(t) * h_c(t) + n(t)    (1)
In Eq. (1), x(t) is the transmitted signal, h_c(t) is the impulse response of the channel, and n(t) is AWGN with power spectral density N_0/2. The dispersive characteristic of the channel is modeled by the linear filter h_c(t), and this dispersive channel model behaves as a low-pass filter. The low-pass filtering smears the transmitted signal in time, causing symbols to overlap in practical transmission; because of this, ISI degrades the error performance of the communication system. Two main methods are used to eradicate the ISI degradation. In the first method, the
Fig. 2 Inter-symbol interference: the transmitted signal x(t) passes through the channel, noise is added, and the received signal is y(t)
band-limited transmission pulses are used to minimize the ISI; such ISI-free pulses are known as Nyquist pulses. In the second method, the received signal is filtered to cancel the ISI introduced by the channel impulse response. This is known as equalization.
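As a hedged illustration of the dispersive model in Eq. (1) (not code from the paper), the MATLAB fragment below passes a random bipolar symbol stream through an assumed channel impulse response and adds white Gaussian noise; the channel taps and noise level are arbitrary placeholders.

N  = 500;                             % number of symbols (assumption)
x  = sign(rand(N,1) - 0.5);           % bipolar (+1/-1) transmitted sequence
hc = [0.3 0.9 0.3];                   % assumed dispersive channel impulse response
y0 = conv(x, hc(:));                  % channel output before noise
y  = y0(1:N) + 0.05*randn(N,1);       % received signal, discrete form of Eq. (1)

Running the equalizers developed later in this paper on y, and comparing their outputs with a delayed copy of x, reproduces the kind of experiment reported in Sect. 9.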
3 Channel Equalization
Equalization is used in many communication applications in which interference and noise occur and cannot be eliminated. Equalization is the procedure of matching the frequency components of the transmitted pulse to the received pulse in order to decrease the interference and noise produced during transmission. The device that performs this matching between the received pulse and the receiver is referred to as an equalizer. In this article, the focus is on equalizing the transmitter to the channel [9–11].
3.1 Channel Equalization
Figure 1 shows the structure of channel equalization. Equalization is the procedure of adapting the equalizer coefficients to the channel. The transmitter filter is matched to the channel over which information is transferred, and the channel can be equalized by many algorithms. The equalizing receiver filter reduces the impact of ISI introduced by the channel. At present, adaptive algorithms attract the attention of most investigators [12]. The channel can also be equalized by deterministic algorithms; the most powerful deterministic technique used to eliminate ISI before adaptive methods was maximum likelihood sequence estimation, in which the channel impulse response is assessed through the maximum likelihood function. The coefficients of the receive filter are equalized, i.e., adjusted to correspond to the channel [13–16]. ISI and noise are reduced by modifying the coefficients at the output, and an error signal drives the adaptation of the equalizer. Contemporary data transmission methods exploit progressively more physical phenomena to increase the flow of data, and in recent years a huge amount of effort has gone into advancing transmission. To fit within the limitations expressed in global radio standards, many accelerations and improvements have been applied to earlier forms of data modulation and encryption. As the volume of transmitted data grows, the harmful transmission effects become increasingly significant with the increase of data intensity in the channel [17]. To counteract the distortion, modern radio communication equipment uses sophisticated measures that include digital signal processing and channel analysis. Equalizers can recreate the transmitted pulse from its distorted version, although an equalization procedure that amplifies the noise will not achieve improved performance [18]. This article concentrates on methods of adaptive channel equalization, with the intent to replicate real communication conditions and
Fig. 3 Performance of channel equalizer: the transmitted signal passes through the channel and the equalizer to produce the estimated signal, with channel estimation ĥ(n)
execute a powerful error correction algorithm on dedicated computing hardware. Equalizers are essential for the effective functioning of electronic systems such as analogue television transmission, where an equalizer maintains the pulse content and cancels impairments in radio transmission such as group and phase delay. Figure 3 illustrates the operation of a channel equalizer. Channel equalization is the process of adjusting the equalizer coefficients to the channel coefficients to reduce ISI: if the channel is considered a filter, then the equalizer is its inverse filter. An equalizer compensates not only for the effect of the channel but also for all the unwanted effects the transmitted signal has undergone, i.e., pulse shaping and the transmitter and receiver filters, in order to recover the original signal. The block diagram of channel equalization is shown in Fig. 3. When the channel is known, an error signal can be calculated by sending a known signal through the channel; the error signal is the difference between the desired signal and the received signal and is the driving force of the equalizer. The equalizer aims to minimize this error signal, and optimization techniques and algorithms are used to achieve this. Many algorithms are used for equalization. The most effective algorithm used before adaptive algorithms is maximum likelihood sequence estimation (MLSE), which depends on the maximum likelihood function of the measured channel impulse response. The equalizer coefficients are adjusted, or equalized, to nullify the effect of the channel; this adjustment reduces the ISI and noise at the output. Hence, the original version of the transmitted signal can be reconstructed by the equalizer from its distorted version. Once the equalizer weights are set, they do not change, and the required information can then be sent through the channel.
3.2 Adaptive Channel Equalization
Figure 4 illustrates an adaptive filter structure, whose operation is specified in the following four stages:
1. The received signal is processed by the filter.
2. The filter response characterizes the relation between the received and the produced pulse.
Fig. 4 Adaptive filter
3. The filter parameters that can be adjusted.
4. An adaptive algorithm that specifies how the filter parameters are modified from one time instant to the next.
1. The filter handles the received pulse in two different modes:
(a) Training mode: the filter response, i.e., the equalization, is obtained with a known input pulse; the parameter values are adapted using this known signal as the input to the filter.
(b) Testing mode: the channel equalizer is examined with a pulse whose content is unknown. This mode is used once the equalizer has been trained. The equalizer is tested with the real-time signal, and a small error value is expected because the equalizer has already been trained; if the error does not converge and remains considerable, the equalizer is trained again.
2. This stage describes the filter's impulse response, which provides the relationship between the equalized signal and the received noisy signal. The filter forms typically used are the transversal (FIR) filter and the IIR filter.
3. This stage describes the filter parameters that can be modified to equalize the channel. These parameters are the weight values of the filter response, which are altered adaptively to obtain a good match between the desired and the obtained signal.
4. This stage describes the adaptive algorithm used to adapt the parameters within the filter framework, defining the input-output relationship at each iteration it performs.
Because the medium is dynamic, the filtering method has to adjust to the specific environment to decrease the undesirable signal distortion [19–23]. The concept behind an adaptive filter is essentially a continuous modification of the filter's processing kernel according to a specific criterion. Ordinarily, the key requirement is to adjust the output of the system x(n) to match the reference signal d(n). The adaptation starts from the error value, which is obtained as
the difference between the reference signal and the output. In Fig. 4, the channel works in the ordinary manner: an input pulse is processed by the channel and sent to the output. Figure 4 illustrates a simplified model of an adaptive filter [24, 25].

Error signal = (desired signal) − (obtained signal)    (2)

Considering this aspect, the adaptive subsystem becomes an attempt to enhance the channel, forming a kind of feedback loop between input and output by means of the adaptation mechanism.
4 Problem Formulation
If the step size is large, the convergence rate of the LMS algorithm is fast, but the steady-state MSE (mean square error) increases. Conversely, if the step size is small, the steady-state MSE is small, but the convergence rate is slow. The step size therefore trades off the convergence rate against the steady-state MSE of the LMS algorithm. One way to increase the efficiency of the LMS algorithm is to make the step size variable rather than fixed, which leads to variable step-size LMS (VSSLMS) algorithms. With this approach, both a fast convergence rate and a small steady-state MSE can be achieved. The step size should satisfy the condition:

0 < step size < 1/(maximum eigenvalue of the input autocorrelation matrix)

For fast convergence, the step size is set close to its maximum allowed value.
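A hedged MATLAB illustration of this stability bound is given below; the autocorrelation matrix is estimated from the input samples for an assumed equalizer length, so the computed limit is only an estimate, not the analytical value.

L = 11;                                  % assumed equalizer length (taps)
x = sign(rand(500,1) - 0.5);             % example input sequence
r = zeros(L,1);
for k = 0:L-1                            % autocorrelation estimates r(k)
    r(k+1) = mean(x(1:end-k) .* x(1+k:end));
end
R     = toeplitz(r);                     % estimated input autocorrelation matrix
muMax = 1 / max(eig(R));                 % stability bound on the step size
mu    = 0.5 * muMax;                     % a conservative choice for fast convergence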
5 Performance Criteria
The performance of the LMS adaptive filter is assessed by three criteria: the adequacy of the FIR filter, the speed of convergence of the system, and the misadjustment in steady state.
5.1 Speed of Convergence
The rate at which the coefficients approach their ideal values is known as the speed of convergence. The speed of convergence improves as the step size is increased, up to step sizes close to one-half of the maximum value allowed for stable operation of the system. This result can be obtained from a careful examination of various kinds of input signals and correlation statistics. For typical signal situations, it is observed that the speed of convergence
of the excess MSE diminishes for sufficiently large step-size values. The speed of convergence also declines as the length of the filter is increased. The maximum possible speed of convergence is limited by the largest step size that can be selected for stability; for strongly correlated input signals this is less than one-half of the maximum value that applies when the input signal is only mildly correlated.
6 Formulation of the LMS Algorithm
The least mean squares (LMS) algorithm is one of the most popular algorithms in adaptive signal processing. Because of its robustness and simplicity it has been the focus of much research, leading to its implementation in numerous applications. The LMS algorithm is a linear adaptive filtering algorithm that fundamentally comprises two procedures: a filtering process, which computes the output of a transversal filter from a set of tap inputs and generates an estimation error by comparing this output with a desired response, and an adaptive process, which automatically adjusts the tap weights of the filter according to the estimation error. The LMS algorithm is also used for updating the channel coefficients. The benefits of the LMS algorithm are its low computational complexity, good statistical reliability, straightforward structure, and ease of hardware implementation. The LMS algorithm does, however, face problems regarding the step size; to overcome these, evolutionary programming (EP) is used. Figure 5 shows the block diagram of a typical adaptive filter, where
x(n) is the input signal to the linear filter
y(n) is the corresponding output signal
d(n) is an additional (desired) input signal to the adaptive filter
e(n) is the error signal that denotes the difference between d(n) and y(n)
Fig. 5 Typical adaptive filter
The linear filter can be of different types, namely FIR or IIR. The adaptive algorithm adjusts the coefficients of the linear filter at each iteration to minimize the power of e(n); algorithms that adjust the coefficients of an FIR filter include the recursive least squares algorithm. The LMS algorithm performs the following operations to estimate the coefficients of an adaptive FIR filter:
1. Calculate the output signal y(n) of the FIR filter:
y(n) = u^T(n) · w(n)
where u(n) is the filter input vector, u(n) = [x(n) x(n−1) . . . x(n−N+1)]^T, and w(n) is the filter coefficient vector, w(n) = [w_0(n) w_1(n) . . . w_{N−1}(n)]^T.
2. Calculate the error signal e(n) using the following equation:
e(n) = d(n) − y(n)
3. Update the filter coefficients using the following equation:
w(n + 1) = (1 − µc) · w(n) + µ · e(n) · u(n)
where µ is the step size of the adaptive filter, c is a leakage constant, w(n) is the filter coefficient vector, and u(n) is the filter input vector.
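The three steps above map directly onto a few lines of MATLAB. The function below is a hedged sketch of a standard (non-leaky, c = 0) LMS equalizer rather than the authors' implementation; the training signal d is assumed to be a delayed copy of the transmitted symbols.

function [w, e] = lms_equalizer(x, d, N, mu)
% Hedged sketch of an LMS transversal equalizer (not the authors' code).
% x  - received (channel-distorted) signal, column vector
% d  - desired/training signal, same length as x
% N  - number of equalizer taps
% mu - step size
w = zeros(N,1);                       % tap weights w(n)
e = zeros(size(x));                   % error signal e(n)
u = zeros(N,1);                       % input vector u(n) = [x(n) ... x(n-N+1)]'
for n = 1:numel(x)
    u = [x(n); u(1:end-1)];           % shift in the newest sample
    y = u.' * w;                      % step 1: filter output y(n)
    e(n) = d(n) - y;                  % step 2: error e(n) = d(n) - y(n)
    w = w + mu * e(n) * u;            % step 3: weight update (leakage c = 0)
end
end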
7 Evolutionary Programming
Evolutionary algorithms are stochastic search methods rather than deterministic ones. In 1960, Lawrence J. Fogel in the US used evolutionary programming, employing simulated evolution as a learning procedure aimed at creating artificial intelligence. Earlier methods such as linear programming and calculus-based (e.g., Newton-type) methods have difficulty delivering the global solution; they tend to get stuck in local solutions. To overcome this problem, nature-inspired computation can be applied. In this approach, characteristics observed in nature are taken as a reference to develop a mathematical model, and this model is used to discover the solution to the problem. In this paper, the natural characteristic taken as a reference is evolution. This is one of the most successful mechanisms in nature: things evolve (change) over time to adapt to the environment, improving their fitness value and hence their chances of survival, as in the evolution from apes to humans. A mathematical model based on evolution is referred to as evolutionary computation.
8 EPLMS Algorithm
Based upon natural evolution, a mathematical model called evolutionary computation has been created. In nature, things change from one time to another to increase their fitness so that the chances of survival improve, as in human evolution. The algorithm steps are as follows:
1. At the beginning, a population of random step sizes is defined.
2. For each step size, LMS is applied and the corresponding error value (fitness) is obtained:

e(n) = d(n) − x^T(n) · W(n)    (3)

where e(n) is the deviation error, d(n) is the expected output value, x(n) is the input vector at sampling time n, and W(n) is the coefficient vector.
3. The step size having the minimum error at the current sample point is selected.
4. With the selected step size, LMS is applied to obtain the coefficient values:

W(n + 1) = W(n) + µ e(n) x(n)    (4)

5. As each new input sample appears, EP creates a new population of step sizes from the previous generation and the procedure is repeated.
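A minimal MATLAB sketch of these five steps is shown below, assuming a simple EP variant in which each generation is formed by mutating the best step size with Gaussian perturbations; the population size and mutation spread are illustrative assumptions, not values from the paper.

function [w, e, muHist] = eplms_equalizer(x, d, N, muMax)
% Hedged sketch of an EP-driven variable step-size LMS equalizer.
P   = 10;                              % population size (assumption)
pop = muMax * rand(P,1);               % step 1: random step-size population
w   = zeros(N,1); u = zeros(N,1);
e   = zeros(size(x)); muHist = zeros(size(x));
for n = 1:numel(x)
    u = [x(n); u(1:end-1)];
    % step 2: fitness of each candidate = squared error it would produce
    fit = zeros(P,1);
    for p = 1:P
        wTry   = w + pop(p) * (d(n) - u.'*w) * u;
        fit(p) = (d(n) - u.'*wTry)^2;
    end
    [~, best] = min(fit);              % step 3: select the minimum-error step size
    mu = pop(best); muHist(n) = mu;
    e(n) = d(n) - u.'*w;               % step 4: LMS update with the selected step size
    w = w + mu * e(n) * u;
    % step 5: new generation by mutating the best step size (assumed EP scheme)
    pop = abs(mu + 0.1*muMax*randn(P,1));
    pop = min(pop, muMax);             % keep step sizes inside the stability bound
end
end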
9 Simulated Results
MATLAB 2014b was used to implement the modelling, and the subsequent results are shown in this section. The ability of the recommended structures was evaluated in terms of their convergence behavior, as described in the figures. The GUI model showing the steps involved in implementing the EPLMS algorithm is described in Fig. 6. To evaluate the performance of the evolutionary programming LMS (EPLMS) algorithm for an arbitrary channel, 11 taps are selected for the equalizer. The input signal consists of 500 samples whose values are generated randomly from a uniform distribution, as shown in Fig. 7. Gaussian noise with zero mean and 0.01 standard deviation is added to the signal at the channel output, and the channel characteristics are given by the vector:
[0.05 −0.063 0.088 −0.126 −0.25 0.9047 0.25 0 0.126 0.038 0.088]
The randomly generated input signal of 500 samples is transmitted in bipolar form (+1, −1). To make the system more challenging, the information is generated randomly between +1 and −1, which makes it unpredictable at the receiver side.
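The experiment just described can be reproduced approximately with the sketch below, which reuses the hypothetical lms_equalizer and eplms_equalizer functions introduced earlier; the training delay and step-size values are assumptions made only for illustration.

N  = 500;
x  = sign(rand(N,1) - 0.5);                        % bipolar input, 500 samples
h  = [0.05 -0.063 0.088 -0.126 -0.25 0.9047 0.25 0 0.126 0.038 0.088];
r0 = conv(x, h(:)); r = r0(1:N) + 0.01*randn(N,1); % channel output plus noise (std 0.01)

taps  = 11;
delay = 7;                                         % assumed training delay
d = [zeros(delay,1); x(1:N-delay)];                % desired signal: delayed input

[~, eLMS]   = lms_equalizer(r, d, taps, 0.045);    % fixed step size (one of the tested values)
[~, eEPLMS] = eplms_equalizer(r, d, taps, 0.11);   % EP-selected variable step size
plot(10*log10(eLMS.^2)); hold on; plot(10*log10(eEPLMS.^2));
legend('LMS', 'EPLMS'); xlabel('sample'); ylabel('squared error (dB)');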
Fig. 6 Overall GUI module
Fig. 7 Generated input signal and signal with noise from channel
Figure 8 shows the MSE error plot using LMS with fixed step-size values of 0.11, 0.045, and 0.0088. It is clear that the performance differs for the different step-size values, i.e., they exhibit different convergence characteristics and different convergence quality, and it is very difficult to identify the optimum step-size value. Figure 9 shows the comparative MSE error plot using LMS and EPLMS, from which it can be observed that the error with the EPLMS algorithm is reduced compared to the LMS algorithm. Figure 10 shows the generated input signal, the signal from the channel after the addition of noise, the signal from the equalizer, and the signal recovered after the decision, respectively.
Fig. 8 Fixed step size performance of LMS with step size equal to 0.11, 0.045 and 0.0088
Fig. 9 Comparative error plot using LMS and EPLMS algorithms
Fig. 10 Original signal, signal from channel with noise, signal from equalizer (EPLMS), the recovered signal after a decision
These observations show that the EPLMS algorithm is very efficient in providing noise-free information with a reduced bit error rate and minimum mean square error, and it proves to be an excellent algorithm for adaptive signal processing.
10 Conclusion
Bandwidth-efficient data transfer through radio and telephone channels has been achieved by using adaptive equalization to counteract the time dispersion introduced by the channel. Stimulated by practical applications, sustained research effort over the past two decades has produced a rich body of literature on adaptive equalization and the associated, more general disciplines of system identification, adaptive filtering, and digital signal processing. This article provides
a summary of adaptive equalization. In our design, since evolutionary programming is used, it decides the value of the step size for a particular application so that the mean square error (MSE) is minimized and convergence is optimal; faster convergence is also obtained. With this design of the receiving filters, the impact of inter-symbol interference can be reduced, and consequently the effectiveness of a communication system can be enhanced.
Obstacle-Aware Radio Propagation and Environmental Model for Hybrid Vehicular Ad hoc Network S. Shalini and Annapurna P. Patil
Abstract The presence of physical obstacles between communicating vehicles significantly impacts the packet transmission performance of a vehicular ad hoc network (VANET), because the line-of-sight (LOS) link between transmitter and receiver is frequently obstructed. Very limited work has addressed the impact of multiple obstacles in the LOS of communicating vehicles under different environment conditions such as urban, rural, and highway. Likewise, little work has incorporated the obstructing effect into the design of medium access control (MAC) for multichannel VANET communication, which results in packet collisions and degrades system throughput. Thus, an efficient MAC design that maximizes throughput with minimal collisions is required. To overcome these research challenges, an obstacle-aware radio propagation (OARP) model for different environments such as urban, rural, and highway is presented, along with a distributed MAC (DMAC) that maximizes system throughput. Experimental results show that the proposed OARP-DMAC model attains significant performance gains over existing radio propagation and MAC models in terms of throughput and packet collisions.
Keywords Environment model · Hybrid network · Medium access control · Radio propagation · VANET · Vehicle obstructing effect
1 Introduction
With the significant growth of and requirement for smart transportation systems, present-day vehicles are embedded with various hardware and smart devices such as sensors and cameras. Building a smart intelligent transport system aids in providing seamless mobility, a safe journey, and a more enjoyable and improved user
S. Shalini (B) · A. P. Patil RIT Bangalore, Bangalore, India e-mail: [email protected] A. P. Patil e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_38
experience on the go [1]. Moreover, vehicles are expected to be connected everywhere with the prototyping of the fifth-generation (5G) communication network. Communication of vehicles in a 5G network is described as vehicle-to-everything (V2X) [2]. V2X communication allows a vehicle to communicate with pedestrians (vehicle-to-pedestrian, V2P), with other vehicles (vehicle-to-vehicle, V2V), and with roadside units (RSUs) (vehicle-to-infrastructure), thus making the VANET safe, smart, and efficient [3].

The IEEE 802.11-based cellular network is one of the widely used communication protocols for a vehicle to communicate with different entities such as RSUs, other vehicles, and pedestrians, and it allows a VANET device to access the Internet through LTE. Nonetheless, these communication networks cannot cope with the ever-increasing vehicle density and packet load. Further, the data size is expected to grow exponentially, making data transmission even more challenging [4]. Recently, various alternative communication protocols such as dedicated short-range communication (DSRC) and Wi-Fi have been used in VANET communication. The heterogeneous nature of VANET brings new research problems and challenges, so various researchers are exploring hybrid frameworks for supporting efficient and reliable vehicular ad hoc network communication. References [5, 6] provided an outline for building a hybrid vehicular ad hoc network combining cellular and DSRC networks, and [7] evaluated the performance achieved by heterogeneous vehicular communication comprising LTE, DSRC, and Wi-Fi networks. However, the major drawback is that only one channel can be accessed in a given session instance, which leads to improper utilization of system resources. To utilize resources more efficiently, [8] presented a software-defined network (SDN)-based vehicular ad hoc network framework. In [9], the benefit of using SDN for providing flexible communication was shown, and different features and services were introduced into the vehicular ad hoc network. Nonetheless, neither [8] nor [9] addressed the issues in modeling a realistic and practical radio propagation model. This paper focuses on addressing the radio propagation issues in the VANET environment.

In general, two major features of VANET communication are projected to cater to the future smart vehicular communication environment [10–12]. On one hand, it can provide collision avoidance and vehicle localization for enhancing maneuvering safety by sharing dynamic channel-characteristic information in real time. On the other hand, it offers communication between devices for efficiently transmitting packets. Both services depend on the quality of the radio signal and the channel conditions, which differ across radio propagation environments [10]. Further, in a VANET, vehicles move at high speed with dynamic mobility patterns and therefore require an efficient and reliable wireless link for ensuring the precision and stability of real-time information [13]. Additionally, the wireless communication link depends strongly on the radio channel characteristics, which are influenced by the kind of radio propagation environment [10]. Consequently, understanding VANET radio propagation channel characteristics is significant, particularly under real-time traffic conditions.
The state-of-the-art radio propagation models are divided into two classes. A few approaches focus on addressing the delay constraint by increasing the propagation speed, while the rest focus on establishing a reliable propagation route in the vehicular ad hoc network. However, the majority of these approaches assume that vehicles within association range can always communicate with each other and that vehicles outside this range cannot. Further, the presence of a bigger vehicle in the line of sight (LOS) between transmitter and receiver significantly reduces the effective coverage for transmitting information, because the receiver experiences decreased received signal power. As a result, the receiver cannot decode the information successfully [14] when the vehicle obstructing effect (VOE) is not taken into consideration. Thus, state-of-the-art protocols suffer from the broadcasting hole (BH) problem: some vehicles within association range cannot receive the broadcasted message from the nearest device (i.e., both source and hop device) with enough signal power. A communicating device inside a BH zone fails to decode information, has no knowledge of the current dynamic traffic condition, and consequently becomes a potential victim of vehicle collisions.

To address the above problems, this work defines the transmission efficiency (TE) (i.e., the additional attenuation power required) for estimating the influence of the vehicle obstructing effect on different channel and environment conditions. The TE can be established in a heuristic manner as the ratio of the vehicle density that obtains information without error to the overall vehicle density within the communication range of the source vehicle, considering a certain speed and density. This paper presents an obstacle-aware radio propagation model that considers VOE for different environmental conditions. Further, a distributed MAC design is presented that maximizes vehicular ad hoc network throughput with minimal collisions under a multichannel environment. The highlights of the work are as follows:
• An obstacle-aware radio propagation model is presented for different environment conditions such as urban, rural, and highway.
• A distributed MAC is modeled that maximizes the system throughput of the vehicular ad hoc network.
• Experiment outcomes show that the proposed distributed MAC design achieves better throughput and fewer collisions when compared with the existing MAC design.
The organization of the research work is as follows. In Sect. 2, the literature survey of various existing radio propagation and environmental models is presented. In Sect. 3, the obstacle-aware radio propagation and environmental model is proposed. The results are discussed in Sect. 4. Lastly, the research is concluded and future directions are discussed.
2 Literature Survey
This section discusses the various existing radio propagation models presented for improving the communication performance of vehicular ad hoc networks under different environment and network conditions. In [15], the authors highlighted that physical obstacles, in particular other vehicles, have a large impact on the propagation of safety information in VANETs through the continuous obstruction of links between two communicating devices. The obstructing effect has various impacts on road safety and diminishes the effective coverage of safety-related information, yet it has so far not been addressed in an efficient manner. Broadcast efficiency is first defined as a metric, the problem of mitigating its loss for safety-related information is investigated, and a graph-theoretic optimization technique is adopted to overcome the issue. A maximum broadcast-efficient relaying (MBER) algorithm is developed for distributed optimization in VANETs; MBER helps to maximize the effective coverage of safety-related information while meeting requirements such as reliability constraints by incorporating propagation distance and broadcast efficiency into the relay contention. Evaluation shows that MBER promotes the effective coverage of safety-related information under varying vehicular distributions in vehicular ad hoc networks.

In [16], the authors studied the properties of V2V radio channels in ramp scenarios with different construction structures. The first structure is a bridge ramp with soundproof walls in an urban area; the second is a general ramp without soundproof walls in a sub-urban region. The whole propagation process of the radio signal is divided into different propagation zones while considering the line of sight (LOS), and propagation characteristics including shadow fading, propagation path loss, RMS delay spread, average fade duration, level crossing rate, fading distribution, and fading depth are estimated. Comparing the ramp scenarios with respect to these characteristics leads to two observations: (1) in the urban bridge-ramp condition, abrupt fluctuations indicate the significance of soundproof walls in the V2V radio channel, and (2) frequent changes in received signal strength and in the various fading parameters across propagation environments are observed in the sub-urban ramp scenario. Moreover, statistical features are optimized and fixed through a generalization error parameter; hence, propagation path loss is characterized by demonstrating the differences in path loss parameters in a given operating environment.

In [17], the authors observed that, to achieve reliable communication, the features of the wireless channel need to be analyzed properly. They focused on V2V radio channel characteristics at 5.9 GHz in overtaking scenarios, analyzed through empirical results under four environment and network conditions. The primary concern is the difference in channel characteristics between non-overtaking and overtaking scenarios; hence, the overtaking and non-overtaking points are separated based
on the small-scale fading distribution, and it is observed that the average fade duration and root-mean-square delay spread are significantly higher than in non-overtaking scenarios, whereas the level crossing rate and root-mean-square Doppler spread are lower. Moreover, [18] considered variation in the velocity of the communicating vehicles; a generic model was designed considering parameters such as path powers, path delays, arrival angle, departure angle, and Doppler frequencies, which were analyzed and simplified through a Taylor series. The aim was a modeling mechanism that can be applied to real-time vehicle-to-vehicle communication and that explicitly reveals the impact of velocity variation on the channel. In [19], a three-dimensional irregular-shaped geometry-based stochastic model was designed for vehicle-to-vehicle communication scenarios with multiple-input multiple-output transmission. A time-variant geometric path length was developed for capturing the non-stationarity caused by the transmitting and receiving devices, and the impact of relative moving time and direction on the channel state information was investigated. Similarly, [20] observed that the multipath components of dynamic clusters are not modeled well in existing models; hence, multipath-component cluster distributions in both the horizontal and vertical dimensions are considered. An expectation-maximization algorithm is introduced for extracting multipath components, and identification and tracking are carried out through newly developed clustering and tracking methodologies. The MPC clusters are divided into two categories, scatter clusters and global clusters, and the cluster distribution is further characterized through various inter- and intra-cluster parameters; it is observed that the elevation spread and azimuth spread both follow a lognormal distribution.

From the survey, it is seen that a number of radio propagation models have been presented for different scenarios considering the presence of obstacles and environmental conditions. The 3D geometric models are very efficient in modeling VOE but induce higher computation overhead. Further, a number of two-way and three-way knife-edge models have been presented addressing large-scale fading issues, but they do not address small-scale fading under varied environment scenarios. In addition, very limited work has designed a distributed MAC employing VOE. To overcome these research issues in modeling VOE under different environmental conditions, the next section presents the radio propagation and distributed MAC model for different environment conditions such as urban, rural, and highway.
3 Obstacle-Aware Radio Propagation and Environmental Model for Hybrid Vehicular Ad hoc Network
This section presents the obstacle-aware radio propagation (OARP) model for dynamic environment conditions such as urban, rural, and highway. Consider a set of vehicles of different sizes moving in different regions, as shown in Fig. 1. Assume that each vehicle has a homogeneous communication radius, denoted S_y, and that these vehicles can communicate with one RSU or vehicle at a given instant of time. Each vehicle transmits H packets of size N and passes through a radio propagation environment with a set of vehicles A = {1, . . . , A}. Let M describe the average vehicle arrival rate within the coverage area of the radio propagation environment under a Poisson distribution. The vehicle speed and density are described by u and l, respectively. The vehicle speed for a certain vehicle density l can be obtained using the following equation

u = u_k (1 − l / l↑),    (1)
where u_k depicts the speed of vehicles under the Poisson distribution and l↑ is the maximum feasible vehicle density in the radio propagation environment. Therefore, M can be estimated using the following equation

M = l u.    (2)
The maximum number of vehicles P that can be admitted by a certain vehicle or RSU y can be obtained using the floor function as

P↑,y = ⌊2 S_y l↑⌋, ∀y ∈ A.    (3)

Fig. 1 Obstacle-aware radio propagation model
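The sketch below illustrates Eqs. (1)-(3) numerically; the free-flow speed, densities, and coverage radius are assumed example values, not taken from the paper.

```python
# Illustrative computation of Eqs. (1)-(3); all numeric values are assumed examples.
import math

u_k   = 20.0     # vehicle speed under Poisson arrivals (m/s), assumed
l     = 0.05     # current vehicle density l (vehicles per metre), assumed
l_max = 0.125    # maximum feasible density l↑ (vehicles per metre), assumed
S_y   = 300.0    # communication radius of RSU/vehicle y (metres), assumed

u = u_k * (1.0 - l / l_max)            # Eq. (1): speed falls linearly with density
M = l * u                              # Eq. (2): average arrival rate in the coverage area
P_max_y = math.floor(2 * S_y * l_max)  # Eq. (3): most vehicles RSU/vehicle y can admit

print(f"speed u = {u:.1f} m/s, arrival rate M = {M:.2f}, P_max = {P_max_y}")
```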
For improving packet transmission under dynamically changing environmental conditions, a distributed medium access control (DMAC) scheduling algorithm is presented. In the DMAC scheduling algorithm, time is segmented into slots of equal size δn. The overall number of slots during which a vehicle is within the coverage area of the yth RSU or vehicle is

N_y = 2 S_y / (u δn).    (4)
The global index of the Nth slot during which the vehicle is in the communication region of the yth neighboring device can be obtained using the following equation

V(y, N) = Σ_{x=0}^{y−1} N_x + N,  ∀N ∈ {1, . . . , N_y},    (5)
where N_0 = 0. The timeline of time slots at the yth device is then described by

N_y = {V(y, 1), . . . , V(y, N_y)}.    (6)
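A short sketch of this slot bookkeeping (Eqs. (4)-(6)) follows; the radii, speed, and slot duration are assumed example values, and the slot count is rounded down as one would do in an implementation.

```python
# Sketch of the DMAC slot bookkeeping in Eqs. (4)-(6); numeric values are assumed.
import math

delta_n = 0.5                   # slot duration δn (s), assumed
u = 12.0                        # vehicle speed (m/s), assumed
radii = [300.0, 250.0, 400.0]   # coverage radius S_y of each RSU/vehicle y, assumed

# Eq. (4): number of slots a vehicle spends inside coverage of device y (rounded down)
N = [math.floor(2 * S_y / (u * delta_n)) for S_y in radii]

def slot_index(y, n):
    """Eq. (5): global index V(y, n) of local slot n at device y (n is 1-based)."""
    return sum(N[:y]) + n

# Eq. (6): the timeline of device y is the list of its global slot indices
timeline_y1 = [slot_index(1, n) for n in range(1, N[1] + 1)]
print(N, timeline_y1[:5])
```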
Further, for maximizing the resource utilization of the VANET, the slots are selected so as to maximize the system throughput. Let the slot assignment decision be e_XN and let the throughput attainable by each vehicle X in the vehicular ad hoc network be S_X. Here, e_XN is set to 1 if slot N is assigned to vehicle X, and to 0 otherwise. Therefore, the throughput maximization problem is

max_E Σ_{X=1}^{R} S_X,    (7)
where R depicts the overall number of vehicles in the VANET. The slot assignment constraint is

Σ_{X=1}^{R} e_XN = 1, ∀N.    (8)
Thus, this paper computes the attainable throughput of vehicle X under a slot assignment as follows. Let V_X describe the slots allocated to vehicle X and let l_XN describe the probability that slot N is reachable by vehicle X. For simplicity, this paper assumes that the l_XN are mutually independent. As a result, S_X is estimated using the following equation

S_X = 1 − Π_{N∈V_X} l̄_XN = 1 − Π_{N=1}^{T} (l̄_XN)^{e_XN},    (9)
where 1 − Π_{N∈V_X} l̄_XN depicts the probability that at least one slot is usable by vehicle X. The parameter l̄_XN, the probability that slot N is not reachable for vehicle X, is computed as

l̄_XN = 1 − l_XN.    (10)
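A heuristic sketch of this throughput model and a greedy slot assignment satisfying Eqs. (7)-(10) is shown below; the reachability probabilities and sizes are assumed example values, and the greedy rule is only one possible way to approximate the maximization in Eq. (7).

```python
# Greedy sketch of the throughput-oriented slot assignment of Eqs. (7)-(10); values assumed.
import numpy as np

rng = np.random.default_rng(1)
R, T = 4, 8                               # vehicles and available slots (assumed)
l_reach = rng.uniform(0.3, 0.9, (R, T))   # l_XN: prob. slot N is reachable by vehicle X

def throughput(assigned, X):
    """Eq. (9): S_X = 1 - product over assigned slots of (1 - l_XN)."""
    return 1.0 - float(np.prod([1.0 - l_reach[X, n] for n in assigned[X]]))

# Each slot goes to the vehicle whose throughput gains most, so every slot is
# held by exactly one vehicle as required by the constraint in Eq. (8).
assigned = {X: [] for X in range(R)}
for n in range(T):
    gains = []
    for X in range(R):
        before = throughput(assigned, X)
        assigned[X].append(n)
        gains.append(throughput(assigned, X) - before)
        assigned[X].pop()
    assigned[int(np.argmax(gains))].append(n)

print({X: round(throughput(assigned, X), 3) for X in range(R)})  # terms of Eq. (7)
```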
Because every vehicle uses at least one assigned slot, its maximum achievable throughput is 1 under the different radio propagation environments for a given data rate. Different environments have different shading, path loss, and shadowing components, so it is important to model these effects for improving packet transmission performance. This work addresses the impact of the path loss component on channel attenuation. For a given slot time n, the achievable rate can be estimated using the following equation

c_n = C log2(G / (P_0 C r_n^α) + 1),    (11)
where C depicts the bandwidth of the vehicular ad hoc network, G the communication power, P_0 the power spectral density of the zero-mean Gaussian noise, r_n the distance between the communicating vehicles at time slot n, and α the path loss component. For evaluating α in Eq. (11), the log-normal shadowing and path loss model described in [21] gives the signal-to-noise ratio (SNR) α(r)_dB at a receiver located a distance r from the sender as

α(r)_dB = P_t − PL(r_0) − 10 n log10(r / r_0) − X_σ − P_n,    (12)
where P_t depicts the transmitted power, PL(r_0) the path loss at a reference distance r_0, X_σ a zero-mean Gaussian random variable with standard deviation σ, and P_n the noise level in decibel-watts. Further, this work considers VOE for channel modeling [22, 23] in order to improve the log-normal shadowing model, treating neighboring devices as the obstructing vehicles of the obstacle-aware radio propagation model. In the OARP model, first, the vehicles that could affect the link between transmitting vehicle x and receiving vehicle y through VOE are identified as obtProbAff(x, y); a vehicle lying between vehicle x and vehicle y within the transmitter-receiver separation is marked as probably obstructing. Second, the vehicles that actually obstruct the link between vehicle x and vehicle y are chosen from the probable candidates established in the previous step, denoted obtLOSaff([ProbableAff]). The transmitted signal may be obstructed because of the obstructing effect on the Fresnel zone ellipsoid
(FZE) by vehicles, which is assessed using the following equation:

z = (z_y − z_x)(r_aff / r) + z_x − 0.6 s_k + z_t,    (13)
where r_aff depicts the distance between the obstructing vehicle and the transmitting vehicle, z_x and z_y the heights of transmitter vehicle x and receiver vehicle y, z_t the vehicle antenna height, r the distance between transmitter and receiver vehicles, and s_k the radius of the main FZE, obtained using the following equation:

s_k = √(W r_aff (r − r_aff) / r),    (14)
where W depicts the wavelength. Determining the heights of all possible obstructing vehicles before carrying out communication therefore plays a significant part in packet transmission: a vehicle obstructs the link between vehicles x and y if its height exceeds z. Thus, the probability that the LOS between vehicle x and vehicle y is free of VOE is estimated using the following equation:

L(LOS | z_x, z_y) = 1 − Q((z − ϕ_z)/ω_z),    (15)
where L depicts the LOS probability between the transmitter and receiver vehicles in the presence of obstructing vehicles, ϕ_z and ω_z depict the mean and standard deviation of the heights of the obstructing vehicles, and Q(·) depicts the Q-function. Finally, the additional attenuation of the received signal power caused by the obstructing vehicles identified in the previous step is estimated, denoted obtAttenuation([AffDevices]). This work uses the multiple knife-edge (MKE) model: using MKE, the candidate obstructing vehicles are obtained and, based on their distances and heights, the attenuation is computed. The OARP computation of the additional attenuation between transmitter vehicle x and receiving device y in the presence of multiple obstacles caused by neighboring vehicles is described in the flow diagram in Fig. 2. The proposed obstacle-aware radio propagation and distributed MAC model attains significant performance gains over the existing models under different environment conditions, which is experimentally demonstrated in the next section.
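The following sketch illustrates the Fresnel-zone obstruction test of Eqs. (13)-(15) as reconstructed above; the geometry values and obstacle-height statistics are assumed examples, and Q(x) is computed as 1 − Φ(x) with the standard normal CDF.

```python
# Sketch of the Fresnel-zone obstruction test of Eqs. (13)-(15); all values assumed.
import math
from statistics import NormalDist

wavelength = 3e8 / 5.9e9        # W: DSRC carrier at 5.9 GHz -> about 0.05 m
r, r_aff = 120.0, 45.0          # Tx-Rx distance and Tx-obstacle distance (m)
z_x, z_y, z_t = 1.5, 1.5, 0.1   # Tx/Rx vehicle heights and antenna height (m)
phi_z, omega_z = 1.8, 0.6       # mean / std of obstructing-vehicle heights (m)

s_k = math.sqrt(wavelength * r_aff * (r - r_aff) / r)        # Eq. (14): first Fresnel radius
z = (z_y - z_x) * (r_aff / r) + z_x - 0.6 * s_k + z_t        # Eq. (13): clearance height

# Eq. (15): 1 - Q((z - phi)/omega) = Phi((z - phi)/omega), i.e. the probability that
# the obstructing vehicle is shorter than the clearance height z.
p_los = NormalDist(mu=0.0, sigma=1.0).cdf((z - phi_z) / omega_z)
print(f"clearance z = {z:.2f} m, P(LOS) = {p_los:.2f}")
```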
4 Results and Analysis
This section presents a performance evaluation of the proposed obstacle-aware radio propagation model under different environmental conditions such as urban, rural, and highway. The SIMITS simulator [24], which is an extension of [25], is used for evaluating the performance of the proposed obstacle-aware radio propagation model under these conditions.
Fig. 2 Flow diagram of obstacle-aware radio propagation modeling
Further, the performance of the obstacle-aware distributed MAC is evaluated against the existing MAC models [24, 26] considering packet collisions and network throughput.
1. Performance evaluation of the obstacle-aware radio propagation model under varied environmental conditions (urban, rural, and highway): This section evaluates the collision and throughput performance achieved by the obstacle-aware radio propagation model under different environmental conditions. For the experimental analysis, the vehicles move at a speed of 3 m/s and a total of 40 vehicles is considered. Figure 3 shows the throughput performance achieved by the obstacle-aware radio propagation model for the urban, rural, and highway environments. From the results, the highest throughput is achieved in the urban environment, followed by the highway environment, while the rural environment achieves the lowest throughput. Figure 4 shows the collision performance achieved by the obstacle-aware radio propagation model for the urban, rural, and highway environments. From the results, the fewest collisions occur in the urban environment, followed by the highway environment, while the rural environment performs worst. No prior work such as [24, 26] has considered such a performance evaluation across different environmental conditions. Thus, the proposed obstacle-aware radio propagation environment can be used to simulate more realistic environmental conditions, as described in [16] (Table 1).
[Fig. 3 plots the throughput per channel (Mbps) against simulation time (s) for the urban, highway, and rural environments]
Fig. 3 Throughput performance attained under varied environmental condition
[Fig. 4 plots the number of packets collided against simulation time (s) for the urban, highway, and rural environments]
Fig. 4 Collision performance attained under varied environmental conditions
Table 1 Simulation parameter used for experiment analysis
Simulation parameter used | Configured value
Vehicular ad hoc network size | 50 m × 50 m
Number of vehicles | 20–60
Modulation scheme | Quadrature amplitude modulation (QAM-64)
Mobility of devices | 3 m/s
Coding rate | 0.75
Bandwidth | 27 Mb/s
Number of channels | 7
Time slot duration | 8 µs
Message information size | 20 bytes
Medium access control type used | TECA, DMAC
[Fig. 5 plots the throughput achieved per channel (Mbps) against the number of vehicles (20, 40, 80) for the proposed and existing models]
Fig. 5 Throughput performance attained by proposed distributed MAC under denser environmental condition considering the varied size of vehicles
2. Performance evaluation of the proposed distributed MAC over the existing MAC under denser environment conditions with varied numbers of vehicles: This section evaluates the collision and throughput performance achieved by the proposed distributed MAC model over the existing MAC model [24, 26] under the dynamic obstacle-aware radio propagation and environmental conditions (urban, rural, and highway). Here, the vehicles move at a constant speed of 3 m/s through an urban, then a rural, and then a highway segment, and the throughput and collisions are recorded for vehicle counts of 20, 40, and 80. Figure 5 shows the throughput achieved by the proposed DMAC and the existing MAC. The results show that as the number of vehicles increases, the throughput improves for both models. When the vehicle count is 20, 40, and 80, the throughput obtained by the existing model is 5.2075 Mbps, 10.34 Mbps, and 15.877 Mbps, respectively, whereas the throughput obtained by the proposed DMAC model is 8.598 Mbps, 13.6 Mbps, and 19.49 Mbps, respectively. Thus, the proposed DMAC model improves throughput by 39.43%, 23.96%, and 18.53% over the existing model for 20, 40, and 80 vehicles, respectively, giving an average throughput improvement of 27.31% under dynamic radio propagation and environment conditions; the DMAC model achieves a much better throughput outcome than the existing MAC irrespective of the number of vehicles. Figure 6 shows the collisions incurred by the proposed DMAC and the existing MAC. The results show that as the number of vehicles increases, the collisions increase for both models. When the vehicle count is 20, 40, and 80, the collisions incurred by the existing model are 11, 45, and 141 packets, respectively, whereas the collisions incurred by the proposed DMAC model are 6, 33, and 125 packets, respectively. Thus, the proposed DMAC model reduces collisions by 45.45%, 26.66%, and 11.35% over the existing model for 20, 40, and 80 vehicles, respectively. Therefore, an average collision reduction of 27.82% is achieved by DMAC over the existing MAC model under dynamic radio propagation and environment conditions. Further, the DMAC model achieves a much better collision outcome than the existing MAC irrespective of the number of vehicles.
[Fig. 6 plots the number of packets collided against the number of vehicles (20, 40, 80) for the proposed and existing models]
Fig. 6 Collision performance attained by proposed distributed MAC under denser environmental condition considering the varied size of vehicles
The significant throughput improvement and collision reduction achieved by the proposed DMAC model under dynamic environmental conditions arise because slots are assigned to the vehicle that maximizes throughput using Eq. (7) and the bandwidth is optimized in Eq. (11) based on the signal-to-noise ratio, considering the obstructing effect among communicating devices. In the existing model, on the other hand, a slot is assigned to a vehicle based only on resource availability, so a vehicle cannot maximize the system throughput. Further, the existing models consider a simple attenuation model without accounting for multiple obstructing devices in the LOS between statically associated devices, whereas the obstructing effect in real time varies significantly and requires a dynamic measurement model; as the results show, this leads to high packet loss. From the results achieved, the proposed DMAC can be concluded to be robust under varied vehicle densities and radio propagation environments, as it offers a good tradeoff between reducing collisions and improving throughput.
3. Results and discussion: This section discusses the results and the significance of the proposed radio propagation, environmental, and distributed MAC models relative to the existing models [24, 26]. Table 2 compares the proposed approach with the state-of-the-art models. In [15], MBER was presented considering the presence of multiple obstacles in the LOS of communicating vehicles; however, the performance evaluation considered only the highway environment. Further, a number of radio propagation models have been presented for addressing the obstacle effect in the LOS between communicating devices [17–20]. However, these models aim at reducing propagation delay by adopting 3D geometry and, as a result, induce high computational overhead; moreover, they are not simulated under varied environment conditions and are therefore less realistic. In contrast, this paper presents an efficient radio propagation model that considers the obstructing effect between communicating vehicles. Further, the experiments conducted in [24] show high packet collisions under a multichannel environment; to address this, [26] presented a throughput-efficient channel access model. However, these models still induce slightly higher collisions and fail to maximize the system throughput. To address these limitations, this paper presents a distributed MAC design that maximizes throughput with minimal collision overhead (Table 2).
Table 2 Performance comparison of the proposed approach with state-of-the-art techniques

Criterion | OARP-DMAC | [15] | [24] | [26]
Type of environment used | Urban, rural, and highway | Highway | Cognitive environment | Cognitive VANET environment
Type of MAC used | DMAC | MERA | ENCCMA | TECA
Radio propagation model used | Custom (OARP) | ITU-R [27] | Log normal | –
Simulator used | SIMITS (802.11p) | Omnet++ | SIMITS | –
MAC layer | 802.11p | 802.11p | 802.11p | 802.11p and 802.11ad
Communication support | V2I and V2V | V2V | V2V | V2I
Multichannel support | Yes | No | Yes | Yes
Performance metric considered | Throughput and collision | Broadcast efficiency and relay success ratio | Throughput and collision | Throughput
From the overall results attained, the proposed model achieves much superior throughput with fewer collisions when compared with the existing models. Thus, the proposed MAC model offers a good tradeoff between maximizing throughput and minimizing collisions.
5 Conclusion
This work first analyzed various recent studies addressing VOE between communicating vehicles. Several radio propagation methodologies that consider obstacles in the line of sight of communicating vehicles have been presented with good results; however, these models are not suitable for simulation under practical or real-time environments, and the adoption of 3D geometric statistical models induces high computational complexity under dynamically changing vehicular environments. To address these research problems, this paper first presented an obstacle-aware radio propagation model, and the impact of the obstructing effect on communication was tested under different environment conditions such as urban, rural, and highway; no prior work has considered such an evaluation. Further, this paper presented a distributed MAC model that maximizes throughput with minimal contention overhead. The OARP model is incorporated into DMAC, which can therefore optimize the slot time dynamically, aiding system performance. Experiments were conducted by varying the number of vehicles. An
average collision reduction of 27.82% is achieved by OARP-DMAC over OARP-ENCCMA, and an average throughput enhancement of 27.31% is achieved by OARP-DMAC over OARP-ENCCMA. From the results attained, it can be stated that the proposed OARP-DMAC achieves much better throughput and collision reduction than the existing radio propagation and MAC models. Thus, OARP-DMAC is robust irrespective of vehicle density under dynamic environments such as urban, rural, and highway. Future work will further improve the MAC considering a heterogeneous/hybrid network design (i.e., combining SDN and cloud computing environments). The MAC will be designed for multiuser, multichannel operation so as to maximize system throughput with minimal resource allocation overhead.
References
1. Fangchun Y, Shangguang W, Jinglin L, Zhihan L, Qibo S (2014) An overview of Internet of Vehicles. China Commun 11(10):1–15
2. Study on LTE-based V2X services (Release 14), technical specification group services and system aspects (TSG SA), document 3GPP TR 36.885, 2016
3. Kaiwartya O et al (2016) Internet of Vehicles: motivation, layered architecture, network model, challenges, and future aspects. IEEE Access 4:5356–5373
4. Cisco visual networking index: forecast and methodology, 2016–2021. White Paper, Cisco, San Jose, CA, USA, 2016
5. Naik G, Choudhury B, Park J (2019) IEEE 802.11bd & 5G NR V2X: evolution of radio access technologies for V2X communications. IEEE Access 7:70169–70184
6. Abboud K, Omar HA, Zhuang W (2016) Interworking of DSRC and cellular network technologies for V2X communications: a survey. IEEE Trans Veh Technol 65(12):9457–9470
7. Dey KC, Rayamajhi A, Chowdhury M, Bhavsar P, Martin J (2016) Vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication in a heterogeneous wireless network: performance evaluation. Transp Res C Emerg Technol 68:168–184
8. Liu Y-C, Chen C, Chakraborty S (2015) A software defined network architecture for geobroadcast in VANETs. In: Proceedings of IEEE international conference on communications (ICC), June 2015, pp 6559–6564
9. Alexander P, Haley D, Grant A (2011) Cooperative intelligent transport systems: 5.9 GHz field trials. Proc IEEE 99(7):1213–1235
10. Sepulcre M, Gozalvez J, Carmen Lucas-Estañ M (2019) Power and packet rate control for vehicular networks in multi-application scenarios. IEEE Trans Veh Technol. https://doi.org/10.1109/TVT.2019.2922539
11. Akhtar N, Ergen SC, Ozkasap O (2015) Vehicle mobility and communication channel models for realistic and efficient highway VANET simulation. IEEE Trans Veh Technol 64(1):248–262
12. Huang R, Wu J, Long C, Zhu Y, Lin Y (2018) Mitigate the obstructing effect of vehicles on the propagation of VANETs safety-related information. In: 2017 IEEE intelligent vehicles symposium (IV), Los Angeles, CA, pp 1893–1898
13. Li C et al (2018) V2V radio channel performance based on measurements in ramp scenarios at 5.9 GHz. IEEE Access 6:7503–7514
14. Chang F, Chen W, Yu J, Li C, Li F, Yang K (2019) Vehicle-to-vehicle propagation channel performance for overtaking cases based on measurements. IEEE Access 7:150327–150338
15. Li W, Chen X, Zhu Q, Zhong W, Xu D, Bai F (2019) A novel segment-based model for non-stationary vehicle-to-vehicle channels with velocity variations. IEEE Access 7:133442–133451
16. Jiang H, Zhang Z, Wu L, Dang J (2018) Novel 3-D irregular-shaped geometry-based channel modeling for semi-ellipsoid vehicle-to-vehicle scattering environments. IEEE Wirel Commun Lett 7(5):836–839
17. Yang M et al (2019) A cluster-based three-dimensional channel model for vehicle-to-vehicle communications. IEEE Trans Veh Technol 68(6):5208–5220
18. Manzano M, Espinosa F, Lu N, Shen X, Mark JW, Liu F (2015) Cognitive self-scheduled mechanism for access control in noisy vehicular ad hoc networks. Math Probl Eng 2015, Article ID 354292
19. Hrizi F, Filali F (2010) simITS: an integrated and realistic simulation platform for vehicular networks. In: 6th international wireless communications and mobile computing conference, Caen, France, pp 32–36. https://doi.org/10.1145/1815396.1815404
20. Han Y, Ekici E, Kremo H, Altintas O (2017) Throughput-efficient channel allocation algorithms in multi-channel cognitive vehicular networks. IEEE Trans Wireless Commun 16(2):757–770
21. Huang R, Wu J, Long C, Zhu Y, Lin Y (2018) Mitigate the obstructing effect of vehicles on the propagation of VANETs safety-related information. In: 2017 IEEE intelligent vehicles symposium (IV), Los Angeles, CA, pp 1893–1898
22. Li C et al (2018) V2V radio channel performance based on measurements in ramp scenarios at 5.9 GHz. IEEE Access 6:7503–7514
23. Chang F, Chen W, Yu J, Li C, Li F, Yang K (2019) Vehicle-to-vehicle propagation channel performance for overtaking cases based on measurements. IEEE Access 7:150327–150338
24. Li W, Chen X, Zhu Q, Zhong W, Xu D, Bai F (2019) A novel segment-based model for non-stationary vehicle-to-vehicle channels with velocity variations. IEEE Access 7:133442–133451
25. Jiang H, Zhang Z, Wu L, Dang J (2018) Novel 3-D irregular-shaped geometry-based channel modeling for semi-ellipsoid vehicle-to-vehicle scattering environments. IEEE Wirel Commun Lett 7(5):836–839
26. Yang M et al (2019) A cluster-based three-dimensional channel model for vehicle-to-vehicle communications. IEEE Trans Veh Technol 68(6):5208–5220
27. ITU-R (2019) Propagation by diffraction. International Telecommunication Union Radiocommunication Sector
Decision Making Among Online Product in E-Commerce Websites E. Rajesh Kumar, A. Aravind, E. Jotheeswar Raghava, and K. Abhinay
Abstract In the present era, customers are mainly engrossed in product-based systems, and to make their effort easier most people trust Internet marketing. Catching this public interest, product-based businesses engage in enormous activity, which may be legal or illegal; because of this, decision making among products on e-commerce websites is highly ambiguous. Considering this perspective, this paper provides an analysis of how to evaluate customer reviews and how to manage the customer experience in marketing. It presents a way to analyze online product reviews, aiming to distill large volumes of qualitative data into quantitative insights on product features so that designers can make more informed decisions, and it sets out to identify customers' likes and dislikes found in reviews to guide product development.
Keywords Online products · Customer reviews · Naïve Bayes · Visualization · Classification · Support vector machine (SVM) · Decision making · E-commerce
1 Introduction
Data come from different sources such as surveys and interviews. The customer and their needs play the key role in designing a product, and the product must satisfy those needs. Nowadays, customers can review all aspects of products on e-commerce websites. Big data is needed for product designers
E. Rajesh Kumar (B) · A. Aravind · E. Jotheeswar Raghava · K. Abhinay Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India e-mail: [email protected] A. Aravind e-mail: [email protected] E. Jotheeswar Raghava e-mail: [email protected] K. Abhinay e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_39
to exploit product capabilities. A large amount of information is available on the Internet, and product reviews taken from e-commerce websites provide valuable information both for customers buying a new product and for manufacturers developing their products. A summarization model builds on multiple aspects of user preferences such as product affordances, emotions, and usage conditions; it was introduced to overcome limitations such as conveying user requirements to designers, ambiguity in the meaning of summarized data, and the complexity of linguistic patterns [1]. A sentiment analysis design framework was built as a set of categorized customer opinions, characterized by the integration of natural language processing techniques and machine learning algorithms, where the machine models are compared with a general analysis [2]. For the problem of classifying documents not by topic but by overall sentiment, for example determining whether a review is positive or negative, machine learning techniques clearly outperform human-produced baselines [3].

There are many e-commerce websites, and each website collects reviews in the form of ratings or comments that are specific to that website. Generally, if a customer wants to buy a product online, the customer has to read all the raw reviews, which follow no common specification or pattern across individual websites. The customer checks the ratings and comments of each website individually and cannot compare a product's ratings or comments across websites. In this paper, the dataset is based on reviews given by end-users on products bought from different websites, together with the specifications required for a product; the specifications are captured as ratings. With this approach, customers can easily analyze or select products based on the specifications in end-user reviews and can also compare products across websites to choose the best website.
2 Initial Consideration of the Project
Figure 1 shows a person's perception analysis of an online product; it depicts the process by which a customer examines an online product. Generally, a customer lists the relevant product websites and then considers the few interesting websites they trust. The user enters the URL of a specific online website and inspects the interface; if a login is required, the user completes it and proceeds, which also tells the user whether the website recognizes users individually. After logging in, the user looks over the items displayed on the first screen and searches for the key product in the search engine. The user then selects one product and examines its specifications and features. If all features look good, the user decides whether or not to buy the product; if no item looks good, the user browses to the next website. In this way, the user moves from one online website to another.
Fig. 1 General assumption of customers on online product websites
2.1 Naive Bayes
Naive Bayes is a probabilistic classification technique based on Bayes' theorem. In machine learning, it has proven to be not only simple but also fast, accurate, and reliable, and it is successfully used for many purposes. It assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. An object can be classified based on its features, that is, the state and behavior of the object; for example, for a desk, the material being wood is part of its state, while being used as a work surface is part of its behavior. Even if these features depend on each other or on other features, a naive Bayes classifier treats all of them as contributing independently to the probability that the object is a desk. The input variables are categorical, but the algorithm also accepts continuous variables; converting continuous variables into categorical ones is referred to as discretization [4]. Naive Bayes is probabilistic, meaning that it calculates the probability of each class and outputs the class with the highest probability, obtained by applying Bayes' theorem. It does surprisingly well and is widely used because it often outperforms more sophisticated classification methods [5]. It can be implemented easily, executes efficiently on very large datasets without prior knowledge of the data, gives quick responses to user requests in real-world applications, and is one of the most popular algorithms for classifying text documents. It is used in spam filtering to distinguish spam email from legitimate email, and it can also be used in fraud detection, for example, to assess an insurance claim based on attributes such as
vehicle age, vehicle price, and police report status, where naive Bayes can provide a probability-based classification of whether the claim is genuine [4]. Bayes' theorem provides a way to calculate the posterior probability P(a|b) from P(a), P(b), and P(b|a).

Probability rule: the conditional probability of event a occurring, given that event b has already occurred, is written P(a|b):

P(a|b) = P(b|a) P(a) / P(b)    (1)
• P(a|b) is the posterior probability
• P(b|a) is the probability of the predictor given the class
• P(a) is the prior probability of the class
• P(b) is the prior probability of the predictor
The naïve Bayes classifier finds the prior probability and the likelihood of each feature for the given class labels and obtains the posterior probability from the above formula. The class label with the highest posterior probability is the result of the prediction [5]. Naive Bayes includes all predictors, using Bayes' rule and the independence assumptions between predictors [6].
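A minimal sketch of this step is shown below, assuming a hypothetical ratings dataset with one row per review; the file name and column names are illustrative and not taken from the paper, and a Gaussian naive Bayes variant is used because the feature ratings are numeric.

```python
# Naive Bayes sketch on a hypothetical review-ratings dataset; names are assumed.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

df = pd.read_csv("reviews.csv")                      # assumed file of survey responses
X = df[["battery", "camera", "price", "delivery"]]   # per-feature ratings (assumed columns)
y = df["buys_product"]                               # 1 = customer buys, 0 = does not

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = GaussianNB().fit(X_train, y_train)           # learns priors and per-feature likelihoods
pred = model.predict(X_test)                         # class with the highest posterior wins

print(confusion_matrix(y_test, pred))                # 2 x 2 table of correct/incorrect predictions
print("accuracy:", accuracy_score(y_test, pred))
```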
2.2 Support Vector Machine (SVM)
A support vector machine is a supervised machine learning algorithm that can be used for both classification and regression problems [7]. In this algorithm, each data item is plotted as a point in a multidimensional space according to its features [8], and the algorithm separates the classes with a hyperplane determined by those features. Support vector machines are effective at segregating data into different classes by means of a hyperplane or line, and they are based on the concept of decision planes that define boundaries [9–12]. A hyperplane is a plane that separates sets of objects with different class memberships. A schematic example is shown below: the items belong to either the circle or the square class and can be differentiated based on their features. The separating hyperplane defines that the items on its left side are squares and the items on its right side are circles; any new object classified as a square falls on the left, and any object classified as a circle falls on the right. Figure 2 is an example of a linear classifier, i.e., a classifier that separates the objects into two sets with a line [8].
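The following self-contained sketch illustrates a linear-kernel SVM on synthetic two-class data; the synthetic arrays stand in for the review ratings and are not the paper's dataset.

```python
# Linear SVM sketch on synthetic data illustrating the separating-hyperplane idea.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(3.5, 0.8, (100, 4)),       # ratings of reviewers who buy (assumed)
               rng.normal(2.0, 0.8, (100, 4))])      # ratings of reviewers who do not (assumed)
y = np.array([1] * 100 + [0] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)
clf = SVC(kernel="linear", C=1.0).fit(X_tr, y_tr)    # fits the maximum-margin hyperplane
print("SVM accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```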
Fig. 2 Accurate hyperplane of the dataset
3 Visualization
The customer can analyze how often each product website is preferred by end-users. From Fig. 3, customers can see that the most preferred website for this product is Amazon, followed by Flipkart, and so on (Fig. 4).
Fig. 3 Representation of customer observation on online product companies
Fig. 4 Construction model
4 Implementation
1. Dataset collection: the data are collected through a Google Form in which the different attributes of a product are rated with different values; the form captures the analysis of customers who have already bought the product from some online website.
2. Data validation: the dataset is split with reference to the customer-specified product, all variables unrelated to that product are removed, and the records are ordered by website.
3. Planning: the dataset is divided into two data frames, a training dataset and a testing dataset; the algorithm is trained on the training dataset and evaluated on the testing dataset.
4. Modeling: the training dataset is fed to the classification algorithm with the target variable of the dataset as the label.
5. Prediction: the model is applied to the testing dataset, using what was learned from the training dataset.
6. Confusion matrix: a table of the correctly and incorrectly predicted values for the modeled and testing datasets is displayed. The confusion matrix is a 2 × 2 matrix with labels 0 and 1, where the [0,0] and [1,1] entries show correctly predicted values and the [0,1] and [1,0] entries show incorrect predictions.
7. Deployment: if a new customer wants to buy the product, the result analysis shows the recommendation. A sketch of these steps is given below.
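The sketch below walks through steps 2-7 for one product, assuming a hypothetical dataframe with a "website" column and per-feature rating columns; all names are illustrative and the file is a stand-in for the Google Form export.

```python
# Per-website naive Bayes pipeline sketch; file and column names are assumed.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, accuracy_score

df = pd.read_csv("survey_responses.csv")              # Google Form responses (assumed file)
features = ["battery", "camera", "price", "delivery"] # assumed rating columns

scores = {}
for site, grp in df.groupby("website"):               # data validation: one frame per website
    X_tr, X_te, y_tr, y_te = train_test_split(
        grp[features], grp["buys_product"], test_size=0.3, random_state=1)
    model = GaussianNB().fit(X_tr, y_tr)              # modeling on the training frame
    pred = model.predict(X_te)                        # prediction on the testing frame
    print(site, "\n", confusion_matrix(y_te, pred))   # 2 x 2 correct/incorrect table
    scores[site] = accuracy_score(y_te, pred)

best = max(scores, key=scores.get)                    # deployment: recommend this website
print("recommended website:", best, scores)
```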
Fig. 5 Naive Bayes result
Fig. 6 Support vector machine result
4.1 Result Analysis
Figure 5 shows the result for a specific product using the naive Bayes algorithm, and Fig. 6 shows the result for the same product using the SVM algorithm. Applying the naive Bayes and SVM algorithms to the dataset yields an accuracy related to the bought products; this accuracy is calculated from the confusion matrix as the number of correctly predicted rows of the testing dataset divided by the total number of rows. The buy decision depends on all the variables present in the dataset. The result indicates the best online product website based on the accuracy obtained for the product on a specific company's site. Of the two algorithms, naïve Bayes gives the more accurate result: Fig. 5 shows Flipkart with the highest review rating of 38%, and Fig. 6 shows Club Factory with the highest review rating of 35%. A new consumer can therefore use this analysis when buying products on an e-commerce website. The process applies the algorithms to independent product-specification reviews given individually by customers, and the main application of the paper is to predict the online product website best reviewed by end-users.
5 Conclusion
This paper reviewed current methods for summarizing online product reviews, including learning how clients use the items they buy and their emotional state when using them. The paper used the naive Bayes and SVM models to perform the target prediction. The SVM algorithm finds the best separating hyperplane in the given dataset but does not account for the frequencies of occurrences in the dataset; to overcome this, the naive Bayes algorithm is used, which works from the probabilities of occurrences over the dataset. Hence, the naïve Bayes algorithm can easily produce better accuracy results than SVM in this setting.
References
1. Bhongade S, Golhani S (2016) HIF detection using wavelet transform, travelling wave and support vector machine. In: 2016 International conference on electrical power and energy systems (ICEPES)
2. Jin J, Ji P, Liu Y (2014) Prioritising engineering characteristics based on customer online reviews for quality function deployment. J Eng Des 25(7–9):303–324
3. Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP), pp 79–86
4. EMC Education Services (2015) Discovering, analyzing, visualizing and presenting data. In: Data science & big data analytics
5. Bisht D, Joshi S (2019) Adding improvements to multinomial naive Bayes for increasing the accuracy of aggressive tweets classification
6. https://saedsayad.com/naive_bayesian.htm
7. Wang WM, Wang JW, Li Z, Tian ZG, Tsui E (2019) Multiple affective attribute classification of online customer product reviews: a heuristic deep learning method for supporting Kansei engineering
8. https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
9. Ireland R, Liu A (2018) Application of data analytics for product design: sentiment analysis of online product reviews. CIRP J Manuf Sci Technol
10. Sentiment analysis using feature based support vector machine: a proposed method. Int J Recent Technol Eng (2019)
11. https://www.statsoft.com/textbook/support-vector-machines
12. Mantyla MV, Graziotin D, Kuutila M (2018) The evolution of sentiment analysis
A Descriptive Analysis of Data Preservation Concern and Objections in IoT-Enabled E-Health Applications Anuj Kumar
Abstract The Internet of things (IoT) is an expanding field whose participation in different areas such as e-health, retail, and smart transportation increases day by day; devices communicate with each other and with people to provide different facilities for users and for the community as a whole. Modern wireless communication technologies are used for this purpose, and communication between devices and humans is made possible through the sensors and wireless sensor networks that IoT provides. Along with these capabilities, IoT faces various challenges. This paper gives an overview of IoT and its application scenarios, discusses the IoT contribution to the health sector and the IoT e-healthcare architecture, and points out the various security concerns and objections in IoT-enabled e-health.
Keywords Internet of things · IoT application · E-health care · IoT security · Privacy
A. Kumar, Department of Computer Engineering and Applications, GLA University, Mathura, India (e-mail: [email protected])
1 Introduction IoT is now a wide research area that attracts many research scholars, and it has changed the way humans live. In this paradigm, different types of devices and gadgets are connected so that they can communicate and transfer information with each other, with the Internet as the medium for this interaction. The research and innovation community describes IoT as a network infrastructure, spread out worldwide, with self-organizing qualities based on standard rules for exchanging and using information in a large heterogeneous network, in which both physical and virtual objects have identities, physical attributes, and virtual characters, use smart interfaces, and are logically united into the information network. The ability to exchange and use information in such a large heterogeneous network is a special feature of IoT and has driven its rapid growth in popularity.
Table 1 Applications of IoT [2, 3]

Fields | Some example applications
E-health | Patient observations, ensuring availability of critical hardware, tracking sufferers, employees, and records, increased medicine management
Retail and logistics | Automated checkout, personalized discounts, beacons, smart shelves, robot employees, optimizing supply chain control
Smart transportation | Real-time vehicle tracking, personalized travel information, smart parking
Smart environment | Pleasant accommodation, smart workplace, effective apparatus for industry, smart athletic club, modish shops
Energy conservation | Adoption of green energy, energy management systems, smart grid
Smart home | Lighting control, gardening, safety and security, air quality, water-quality monitoring, voice assistants, switches, locks
Green agriculture | Precision farming, climate conditions, agricultural drones, soil humidity care
Futuristic | Automatic taxi cab, town information prototype, increased play room
IoT can collect data from connected smart devices or objects and share it with other devices and systems. Through the analysis and processing of these data, little or no human interaction is needed while devices perform their actions. The Internet of things has changed connectivity from "anytime, anywhere" for "anyone" into "anytime, anywhere" for "anything" [1]. Forming smart cities and smart homes, enabling environmental monitoring, giving new direction to smart healthcare systems, and adding new features to transportation are among the objectives of IoT. IoT Applications Some IoT applications are given in Table 1. IoT is applied in many areas and creates dynamic changes in our lives. The next subsection discusses the smart health concept enabled by IoT.
IoT in e-health: Before the Internet of things, the traditional healthcare system had some limitations. 1. Patients could interact with doctors only by visiting hospitals or sending text messages. 2. Doctors had no way to monitor a patient's health around the clock and provide treatment accordingly. IoT solves these problems by providing IoT-enabled medical equipment that makes remote monitoring of patients possible and makes meetings with doctors much more systematic. IoT is now changing the medical field by reshaping the role of devices, and many IoT-enabled health applications benefit patients, families, doctors, and medical institutions (Fig. 1). 1.1 IoT for patients: IoT provides wearable devices such as fitness bands, continuous glucose monitors (CGM), sensors, and coagulation testing, giving a patient the feeling that a doctor is attending
Fig. 1 Sample healthcare scenario
to his or her case personally. IoT brings a big change to the lives of elderly people, since these devices track their health status continuously and can send alert signals to relatives and to the medical practitioners following up on people who live alone. 1.2 IoT for physicians: Physicians use wearables and other embedded IoT devices to monitor and track their patients' health and any medical attention the patients need. IoT creates a strong, tightly bound relationship between physicians and their patients. The data produced by these devices help doctors identify diseases and provide the best treatment. 1.3 IoT for hospitals: In hospitals, IoT is used in equipment such as defibrillators and tracking devices, and IoT-enabled devices also protect patients against infection. IoT devices further act as managers for information such as pharmacy inventory control and environmental monitoring, including humidity and temperature control. The IoT architecture consists of four steps, where the output of each step is the input of the next and the combined result is used according to the user's needs in different application areas. Step 1 In the initial step, interconnected devices embedded with sensors, actuators, monitors, detectors, etc., are deployed, and this equipment is used for data collection. Step 2 In this step, the sensors provide data in analog form, so the data must be collected and converted from analog to digital form for further processing.
Step 3 The digitized data from step 2 is aggregated and, in this step, stored in a data center or the cloud. Step 4 In the final step, advanced analytics are applied to the data to manage and structure it so that users can make the right decisions from it. Healthcare with IoT involves various issues and challenges in terms of data security: IoT-enabled connected devices gather a lot of data, much of it highly sensitive, so data security must be a major concern in this field, and several security and privacy issues have been observed [1]. A minimal sketch of this four-step flow is given below.
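The following is a minimal, hypothetical Python sketch of the four-step flow described above (sense, digitize, aggregate/store, analyze). The sensor readings, the in-memory "cloud store," and the alert threshold are illustrative assumptions, not part of the original paper.

```python
# Illustrative sketch of the four-step IoT data flow: sense -> digitize -> store -> analyze.
# All values and the alert threshold are hypothetical.
import random
import statistics

def step1_sense(num_devices: int):
    """Step 1: interconnected sensor devices produce raw (analog) readings."""
    return [random.uniform(35.0, 40.0) for _ in range(num_devices)]  # e.g. body temperatures

def step2_digitize(analog_readings):
    """Step 2: convert analog readings to a digital representation (scaled integers)."""
    return [round(value * 100) for value in analog_readings]

def step3_store(digital_readings, data_center):
    """Step 3: aggregate the digitized data and store it in a data center / cloud store."""
    data_center.extend(digital_readings)

def step4_analyze(data_center, alert_threshold=38.0):
    """Step 4: run analytics on the stored data to support a decision."""
    mean_temp = statistics.mean(data_center) / 100
    return "alert physician" if mean_temp > alert_threshold else "no action needed"

cloud_store = []
step3_store(step2_digitize(step1_sense(num_devices=5)), cloud_store)
print(step4_analyze(cloud_store))
```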
2 Literature Survey Tao et al. [4] discussed a healthcare data acquisition technique and studied its security and privacy concerns. The authors proposed a technique for securing collected data in an IoT-based healthcare system. The secure data scheme consisted of four layers, of which the authors contributed to the first three. For the initial phase, a hardware-based cipher was implemented on FPGA using the KATAN algorithm: the FPGA served as the hardware platform, and the secret cipher was used to achieve privacy and protection of patients' data. A distributed database technique was applied at the cloud computing layer to preserve the privacy of patients' data, and FPGA simulations were used to measure the performance of the secure data scheme across different algorithm parameters. Fan et al. [5] proposed a scheme to address medical security and privacy using RFID, since it supports data exchange and collection, with a back-end server used for execution. Ciphertext was used in the information exchange process, making it more secure. Tang et al. [6] proposed a secure health data collection scheme in which data was collected from various sources; signature techniques guaranteed a fair incentive for contributing patients, and a combination of two cryptosystems provided data obliviousness, security, and fault tolerance. Key aspects of the scheme, such as resistance to attacks and tolerance of healthcare center failures, were also discussed. Puthal [7] proposed a static lattice model for data privacy, used to restrict the flow of information over very large data streams. Two types of lattices were used: a sensor lattice for wearable sensors and a user lattice for users. The static lattices aim to make the model execute as fast as possible, and the results show that it can handle huge amounts of streaming data with minimal latency and storage requirements. Deebak et al. [8] proposed a secure communication scheme for healthcare applications in which a biometric-based user authentication scheme gives better results than existing techniques in terms of packet delivery ratio, end-to-end delay, throughput, and routing overhead when implemented in the NS3 simulator, making the smart healthcare application system more secure.
Minoli et al. [9] proposed a novel IoT protocol architecture and inspected security tools and techniques that could be employed as part of IoT deployments; the authors note that these techniques are most important in e-health and special-care facilities such as nursing homes. Tamizharasi et al. [10] discussed various architectural models and access control algorithms for IoT-enabled e-health systems, gave a comparative analysis of different architecture segments and security measures, and finally recommended the most appropriate techniques for IoT-enabled e-medical care systems. Koutli et al. [11] first surveyed the field of e-health IoT and identified the security requirements and challenges of IoT applications, then proposed an architecture based on the VICINITY framework with General Data Protection Regulation (GDPR) compliance, intended to provide secure e-medical facilities to old and middle-aged people; finally, they highlighted the design of this architecture and the security and privacy needs of the system. Rauscher and Bauer [12] proposed a safety and security analysis approach consisting of a standardized meta-model and an IoT safety and security framework embracing a customized analysis language. Boussada et al. [13] proposed a new privacy-preserving e-health solution over NDN, named AND_aNA. All privacy and security requirements were achieved by this architecture; a security analysis was carried out to prove the robustness of the proposal, and the performance evaluation shows its effectiveness. Simulation results disclose that the technique has an acceptable transmission delay and negligible overhead. Almulhim and Zaman [14] proposed a secure, group-based lightweight authentication scheme for credentials in IoT-based e-health applications; the proposed model provides features such as mutual authentication, energy efficiency, and low computation for healthcare IoT applications, using the elliptic curve cryptography (ECC) concept. Savola et al. [15] proposed a new set of rules for security objective decomposition aimed at defining security metrics; systematically defined and managed security metrics provide more effective security controls and permit informed, risk-driven security decision-making. Islam et al. [16] proposed an intelligent collaborative security model to minimize security risk and discussed how new technologies such as big data, ambient intelligence, and wearables are used in a healthcare context; relations between IoT and e-health policies and their regulation across the world are addressed, and new directions for future research on IoT-based healthcare are provided based on a set of open issues and challenges. Suo et al. [17] discussed data security in each layer of the IoT architecture (perception, network, support, and application layers) and identified gaps in power and storage as well as issues such as DDoS attacks, authentication, confidentiality, and privacy protection; all these issues and challenges are elaborated briefly. Qiang et al.
[18] focused on issues such as the security of information transmitted over the network, wireless communication and information security, RFID tag information security, and privacy protection, and found challenges such as RFID identification, the communication channel, and RFID reader security. Chenthara et al. [19] discussed a security model that works for electronic
health record (EHR) systems. They focused on points identified from research on EHR approaches published over the previous two decades and further explained techniques that can maintain the integrity and other basic data security measures of any patient's EHR. Chiuchisan et al. [20] discussed major data security concerns such as confidentiality and integrity, surveyed information protection in healthcare systems in terms of the measures used in security and communication techniques, and further explained security issues that arise when monitoring and other services are performed for patients with specific diseases such as Parkinson's. Abbas and Khan [21] focused on cloud facilities for health information records, such as data storage centers; the authors described the state of the art in cloud support for health records, explained the classification and taxonomy derived from a survey of different privacy-preserving techniques, discussed the strengths and weaknesses of these techniques, and presented new challenges for research scholars. Ma et al. [22] proposed a new compression technique for e-health applications that combines the adaptive Fourier decomposition algorithm (AFD) and symbol substitution (SS): in the first step AFD compresses the data lossily, after which SS performs lossless compression. The hybridization of both techniques was very effective in terms of CR and PRD and gave more valuable results. Idoga et al. [23] highlighted the factors affecting healthcare consumers; the collected data was applied to various measures to derive a structural path model and to study the development of cloud-based healthcare centers, and it was analyzed with instruments such as social science methods and LISREL. Pazos et al. [24] proposed a new programmable framework that addresses the fragmentation process, with the overall flow handled by communicating agents using different sets of rules for communication between devices; in this framework, the communication agent is developed according to a given specification and is both secure and extensible. Maurin et al. [25] discussed the objects that exchange information and communicate over the Internet, focused on their security features and on threats such as cyber risk and vulnerabilities that break their security shield, and gave an overview of solutions to these problems along with the requirements that must be adopted from business and market perspectives. Karmakar et al. [26] proposed an SDN-based architecture for the Internet of things based on an authentication scheme that allows only authenticated devices to access the network; a lightweight protocol was used for authentication, and secure flows in the network were also supported. The combination of both concepts makes the system more secure against malicious attacks and threats. Chung et al. [27] described IoT issues and explored some challenges, proposing a new on-demand security configuration system in which security features can be added to old systems without regenerating them.
The old features of the system are exchanged for newly arriving features without any type of system renewal. Borgia [28] explored security techniques in M2M communication, routing, end-to-end reliability, device management, and data
management, and security of IoT, which make the overall process more secure. The author also identified challenges that arise when IoT-enabled devices and objects communicate, with privacy and security issues at the time of data transmission. Xiaohui [1] explained the concepts of IoT and then the security and privacy issues and challenges faced in the IoT field; during transmission, the author found two types of security issues, wireless sensor network security problems and information transmission and processing security, and highlighted other threats such as counterfeit attacks and malicious code attacks. Keoh et al. [29] described the four nodes on which IoT devices are based, described the standard security rules, and focused mainly on communication security for IoT; challenges such as interoperable security and datagram transport layer security are also explained.
3 Issues and Challenges in E-health and the Internet of Things See Table 2.
4 Motivation Health is one of the greatest concerns of human beings, and e-health with IoT is a notable area of the future Internet with a vast effect on community life and trade. IoT applications and services exist in the health sector as well as in other sectors, and the applications in this field have security issues, which are elaborated in Table 2. To secure the IoT environment against those issues, a new architecture or mechanism is needed for these application areas; with its help, the small holes that arise in security terms such as authentication, confidentiality, and data integrity in IoT-embedded fields can be closed. The main motivation behind this survey is to provide a detailed study of the e-health system with IoT and related IoT applications and to identify the security issues and challenges in the field of IoT-enabled e-health.
5 Conclusion IoT has brought big changes to the use of the Internet and opens new real-world opportunities for research scholars. Although a lot of research exists on IoT, its application areas are still open. This paper has examined data security issues in IoT-enabled e-health systems. Many researchers have already provided data security
Table 2 Issues and challenges in the Internet of things

Author | Description | Issues and challenges
Tao et al. [4] | Discussed a secure data scheme for three layers: (1) IoT network sensors/devices; (2) fog layer; (3) cloud computing layer | Collusion attacks, eavesdropping, impersonation, patients' data leakage and destruction
Fan et al. [5] | Discussed an RFID-based health system; RFID system architecture with tags, fixed and mobile users | Tag anonymity, replay attack resistance, forward secrecy, mutual authentication, anti-DoS attack
Tang et al. [6] | Explained secure data aggregation: (1) system setup; (2) aggregation request allocation; (3) data collection; (4) data aggregation | Healthcare center fault tolerance, healthcare center and cloud server obliviousness security, differential privacy
Puthal [7] | Focused on integrating the information flow control model into a stream manager | Information flow control problem
Deebak et al. [8] | Secure and anonymous biometric-based user authentication scheme | Privacy preservation issues in the IoM
Minoli et al. [9] | Explained secure end-to-end data transmission; unauthorized users cannot access data | Data availability, eavesdropping, denial-of-service attack
Tamizharasi et al. [10] | Reviewed traditional access control techniques for IoT-based e-health systems in terms of security, privacy, fine-grained access control, scalability | RBAC, ABE, CP-ABE; novel approach required
Koutli et al. [11] | Discussed two e-health applications, (1) ambient-assisted living (AAL) and (2) mHealth, and explained the VICINITY architecture | Integrity, availability, data minimization, anonymization
Rauscher and Bauer [12] | Presented the IoT-S2A2F approach for IoT-MD architecture security and safety optimization | Health-endangering vulnerabilities, safe and secure architectures, identifying architectural weak points
Boussada et al. [13] | Discussed named data networking (NDN) node exchanges, identity-based cryptography (IBC), e-health solutions | Privacy issues over NDN, comparison with IP solutions, simulation conduction
Almulhim and Zaman [14] | Lightweight authentication scheme, ECC principles, comparable level of security, group-based authentication scheme/model | Man-in-the-middle attack, unknown key-sharing attacks, increasing number of users' access points, security issues
Savola et al. [15] | Explored security risk; discussed heuristics for security objective decomposition; systematically defined and managed security metrics | Hierarchy of security metrics, more detailed security objectives for the target system
Boussada et al. [30] | A novel cryptographic scheme, PKE-IBE, based on identity-based cryptography (IBC); tackles the key escrow issue and ensures blind partial private key generation | Contextual privacy requirements, sensitivity of exchanged data, secure session key transmission
Islam et al. [16] | Surveys advances in IoT-based healthcare technologies; analyzes distinct IoT security and privacy features; discusses security requirements, threat models, and attack taxonomies | Standardization, IoT healthcare platforms, cost analysis, the app development process, data protection, network type, scalability
Suo et al. [17] | Explained the security issues that arise in all four layers of the IoT architecture | Storage issues, attacks such as DDoS, basic security needs such as authentication, confidentiality, access control
Qiang et al. [18] | Discussed RFID tag information security, wireless communication and information security, network transmission of information security, privacy protection | RFID identification, communication channel, RFID reader security issues, radio signal attacks, Internet information security, private information security
Chenthara et al. [19] | Discussed EHR security and privacy, security and privacy requirements of e-health data in the cloud, EHR cloud architecture, diverse EHR cryptographic and non-cryptographic approaches | Integrity, confidentiality, availability, privacy
Chiuchisan et al. [20] | Explored data security, communication techniques, strategic management, rehabilitation and monitoring of a specific disease | Security issues in communication techniques
Abbas and Khan [21] | Discussed cloud facilities for health information records, classification, and taxonomy; reviewed further research work | Secure transfer of data, attacks such as DoS, authentication issues
Ma et al. [22] | Explained a combination of two techniques, the adaptive Fourier decomposition algorithm (AFD) and symbol substitution (SS), evaluated in terms of CR and PRD | Physical alteration is possible; no access control in the transmission of data
Wang [31] | Worked on securing outsourced data and users' data in data sharing to ensure the privacy of the data owner | The unique working features of IoT-enabled devices create data security issues; mobility, scalability, multiplicity of devices
Idoga et al. [23] | Identification, structural path model; data statistics such as effort expectancy, performance expectancy, information sharing | Integration of different techniques creates a security challenge; secure transfer of data
Pazos et al. [24] | Discussed a program-enabled framework for fragmentation, flexible communication agents, security and scalability aspects | Fragmentation in terms of communication protocols and data formats
Maurin et al. [25] | Discussed IoT objects, threats such as cyber risk and vulnerabilities, and solutions in terms of business and market perspectives | Communication between IoT objects/machines; compromise of basic security aspects of data; device tampering, information disclosure, privacy breach
Karmakar et al. [26] | Explained an SDN-based architecture using an authentication scheme and a lightweight protocol; security challenges | Malicious attacks and threats
Chung et al. [27] | Discussed an on-demand security configuration system; worked on previously unexperienced security challenges | No proper pre-preparation for handling security threats, no techniques for authentication, compromised data security and privacy
Borgia [28] | Explored security in terms of IoT device management and the security of data, networks, and applications | Authentication, privacy, data security
Xiaohui [1] | Discussed wireless sensor network security problems and information transmission and processing security | Counterfeit attacks, malicious code attacks
Keoh et al. [29] | Discussed standardization and communication security | Security issues when devices exchange and use information; transport layer security
solutions for IoT, but there is still a need for more security solutions in IoT application fields such as the smart home, e-health, and retail. As an outcome of this survey, many issues and challenges were found that leave small holes in data security in IoT-enabled e-health, such as denial of service, man-in-the-middle attacks, identity and data theft, social engineering, advanced persistent threats, ransomware, and remote recording. Many researchers have offered solutions, but they are not sufficient; new issues and challenges confront researchers day by day, so more research should be done in this field.
References 1. Xiaohui X (2013) Study on security problems and key technologies of the Internet of Things. In: International conference on computational and information sciences, 2013, pp 407–410 2. Mathuru GS, Upadhyay P, Chaudhary L (2014) The Internet of Things: challenges & security issues. In: IEEE international conference on emerging technologies (ICET), 2014, pp 54–59 3. Atzori L, Iera A, Morabito G (2010) The Internet of Things: a survey. Elsevier Comput Netw 2787–2805 4. Tao H, Bhuiyan MZA, Abdalla AN, Hassan MM, Zain JM, Hayajneh T (2019) Secured data collection with hardware-based ciphers for IoT-based healthcare. IEEE Internet of Things J 6(1):410–420. https://doi.org/10.1109/JIOT.2018.2854714
5. Fan K, Jiang W, Li H, Yang Y (2018) Lightweight RFID protocol for medical privacy protection in IoT. IEEE Trans Ind Inf 14(4):1656–1665. https://doi.org/10.1109/TII.2018.2794996 6. Tang W, Ren J, Deng K, Zhang Y (2019) Secure data aggregation of lightweight E-healthcare IoT devices with fair incentives. IEEE Internet of Things J 6(5):8714–8726. https://doi.org/10. 1109/JIOT.2019.2923261 7. Puthal D (2019) Lattice-modeled information flow control of big sensing data streams for smart health application. IEEE Internet of Things J 6(2):1312–1320. https://doi.org/10.1109/JIOT. 2018.2805896 8. Deebak BD, Al-Turjman F, Aloqaily M, Alfandi O (2019) An authentic-based privacy preservation protocol for smart e-healthcare systems in IoT. IEEE Access 7:135632–135649. https:// doi.org/10.1109/ACCESS.2019.2941575 9. Minoli D, Sohraby K, Occhiogrosso B IoT Security (IoTSec) Mechanisms for e-Health and Ambient Assisted Living Applications. In: 2017 IEEE/ACM international conference on connected health: applications, systems and engineering technologies (CHASE), Philadelphia, PA, pp 13–18. https://doi.org/10.1109/CHASE.2017.53 10. Tamizharasi GS, Sultanah HP, Balamurugan B (2017) IoT-based E-health system security: a vision archictecture elements and future directions. In: 2017 International conference of electronics, communication and aerospace technology (ICECA), Coimbatore, 2017, pp 655– 661. https://doi.org/10.1109/ICECA.2017.8212747 11. Koutli M et al (2019) Secure IoT e-health applications using VICINITY framework and GDPR guidelines. In: 2019 15th International conference on distributed computing in sensor systems (DCOSS), Santorini Island, Greece, 2019, pp 263–270. https://doi.org/10.1109/DCOSS.2019. 00064 12. Rauscher J, Bauer B (2018) Safety and security architecture analyses framework for the Internet of Things of medical devices. In: 2018 IEEE 20th international conference on ehealth networking, applications and services (Healthcom), Ostrava, 2018, pp 1–3. https://doi. org/10.1109/HealthCom.2018.853112 13. Boussada R, Hamdaney B, Elhdhili ME, Argoubi S, Saidane LA (2018) A secure and privacypreserving solution for IoT over NDN applied to e-health. In: 2018 14th International wireless communications & mobile computing conference (IWCMC), Limassol, 2018, pp 817–822. https://doi.org/10.1109/IWCMC.2018.8450374 14. Almulhim M, Zaman N (2018) Proposing secure and lightweight authentication scheme for IoT based E-health applications. In: 2018 20th International conference on advanced communication technology (ICACT), Chuncheon-si Gangwon-do, Korea (South), 2018, pp 481–487. https://doi.org/10.23919/ICACT.2018.8323802 15. Savola RM, Savolainen P, Evesti A, Abie H, Sihvonen M (2015) Risk-driven security metrics development for an e-health IoT application. In: 2015 Information security for South Africa (ISSA) Johannesburg, 2015, pp 1–6 https://doi.org/10.1109/ISSA.2015.7335061 16. Islam SMR, Kwak D, Kabir MH, Hossain M, Kwak K (2015) The Internet of Things for health care: a comprehensive survey. IEEE Access 3:678–708. https://doi.org/10.1109/ACC ESS.2015.2437951 17. Suoa H, Wana J, Zoua C, Liua J (2012) Security in the Internet of Things: a review. In: International conference on computer science and electronics engineering, 2012, pp649–651 18. Qiang C, Quan G, Yu B, Yang L (2013) Research on security issues on the Internet of Things. Int J Future Gener Commun Netw 1–9 19. 
Chenthara S, Ahmed K, Wang H, Whittaker F (2019) Security and privacy-preserving challenges of e-health solutions in cloud computing. IEEE Access 7:74361–74382 20. Chiuchisan D, Balan O, Geman IC, Gordin I (2017) A security approach for health care information systems. In: 2017 E-health and bioengineering conference (EHB), Sinaia, 2017, pp 721–724 21. Abbas, Khan SU (2014) A review on the state-of-the-art privacy-preserving approaches in the e-health clouds. IEEE J Biomed Health Inf 18(4):1431–1441 22. Ma J, Zhang T, Dong M (2015) A novel ECG data compression method using adaptive Fourier decomposition with security guarantee in e-health applications. IEEE J Biomed Health Inf 19(3):986–994
23. Idoga PE, Toycan M, Nadiri H, Çelebi E (2018) Factors Affecting the successful adoption of e-health cloud based health system from healthcare consumers’ perspective. IEEE Access 6:71216–71228 24. Pazos N, Müller M, Aeberli M, Ouerhani N (2015) ConnectOpen—Automatic integration of IoT devices. In: 2015 IEEE 2nd world forum on Internet of Things (WF-IoT), Milan, 2015,pp 640–644 25. Maurin T, Ducreux L, Caraiman G, Sissoko P (2018) IoT security assessment through the interfaces P-SCAN test bench platform. In: 2018 Design, automation & test in Europe conference & exhibition (DATE), Dresden, 2018, pp 1007–1008 26. Karmakar KK, Varadharajan V, Nepal S, Tupakula U (2019) SDN enabled secure IoT architecture. In: 2019 IFIP/IEEE symposium on integrated network and service management (IM), Arlington, VA, USA, 2019, pp 581–585 27. Chung B, Kim J, Jeon Y (2016) On-demand security configuration for IoT devices. In: 2016 International conference on information and communication technology convergence (ICTC), Jeju, 2016, pp 1082–1084 28. Borgia E (2014) The Internet of Things vision: key features, applications and open issues. Elsevier Comput Commun 1–31 29. Keoh SL, Kumar SS, Tschofenig H (2014) Securing the Internet of Things: a standardization perspective. IEEE Internet of Things J 265–275 30. Boussada R, Elhdhili ME, Saidane LA ()2018 A lightweight privacy-preserving solution for IoT: the case of e-health. In: 2018 IEEE 20th international conference on high performance computing and communications; IEEE 16th international conference on smart city; IEEE 4th international conference on data science and systems (HPCC/SmartCity/DSS), Exeter, United Kingdom, 2018, pp 555–562. https://doi.org/10.1109/HPCC/SmartCity/DSS.2018.00104 31. Wang H (2018) Anonymous data sharing scheme in public cloud and its application in e-health record. IEEE Access 6:27818–27826 32. Sudarto F, Kristiadi DP, Warnars HLHS, Ricky MY, Hashimoto K (2018) Developing of Indonesian intelligent e-health model. In: 2018 Indonesian association for pattern recognition international conference (INAPR), Jakarta, Indonesia, 2018, pp 307–314. https://doi.org/10.1109/ INAPR.2018.8627038 33. Abomhara M, Koien GM (2014) Security and privacy in the internet of things: current status and open issues. In: IEEE International conference on privacy and security in mobile systems (PRISMS), 2014, pp1–8. 34. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) “Internet of Things (IoT)” a vision, architectural elements, and future directions. Elsevier Future Gener Comput Syst 1645–1660 35. Al-Fuqaha A, Guizani MM, Aledhari M, Ayyash M (2015) Internet of Things: a survey on enabling technologies, protocols and applications. IEEE Commun Surv Tutor 17(4):2347–2376 36. Said O, Masud M (2013) Towards Internet of Things: survey and future vision. Int J Comput Netw (IJCN) 1(1):1–17 37. Matharu GS Upadhyay P, Chaudhary L (2014) The Internet of Things: challenges & security issues. In: IEEE, international conference on emerging technologies (ICET), 2014,pp 54–59 38. Granjal J, Monteiro E, Sa Silva J (2015) Security for the Internet of Things: a survey of existing protocols and open research issues. IEEE Commun Surveys Tutor 17(3):1294–1312 39. Atamli AW, Martin A (2014) Threat-based security analysis for the Internet of Things. In: International workshop on secure Internet of Things, 2014, pp 35–43 40. Mahmoud R, Yousuf T, Aloul F, Zualkernan I (2015) Internet of Things (IoT) security: current status, challenges and prospective measures. 
In: International conference for internet technology and secured transactions (ICITST), 2015, pp336–341 41. Vasilomanolakis E, Daubert J, Luthra M, Gazis V, Wiesmaier A, Kikiras P (2015) On the security and privacy of Internet of Things architectures and systems. In: International workshop on secure internet of things, 2015, pp 49–57 42. Zhang Z-K, Cho MCY, Wang C-W, Hsu C-W, Chen C-K, Shieh S (2014) IoT security: ongoing challenges and research opportunities. In: IEEE international conference on service-oriented computing and applications, 2014,pp 230–234
43. Jiang DU, Shi Wei CHAO (2010) A study of information security for M2M of IoT. In: IEEE international conference on advanced computer theory and engineering (ICACTE), 2010, pp 576–579 44. Basu SS, Tripathy S, Chowdhury AR (2015) Design challenges and security issues in the Internet of Things. In: IEEE region 10 symposium, 2015, pp 90–93 45. Miorandi D, Sicari S, De Pellegrini F, Chlamtac I (2012) Internet of Things: vision, applications and research challenges. Elsevier Ad Hoc Netw 1497–1516 46. Asghar MH, Mohammadzadeh N, Negi A (2015) Principle application and vision in Internet of Things (IoT). In: International conference on computing, communication and automation, 2015, pp 427–431 47. Chen X-Y, Zhi-Gang Jin (2012) Research on key technology and applications for Internet of Things. In: Elsevier international conference on medical physics and biomedical engineering, 2012, pp 561–566 48. Vermesan O, Friess P (eds) (2014) Internet of Things - from research and innovations to market deployment. River Publishers Series in Communication
Applying Deep Learning Approach for Wheat Rust Disease Detection Using MosNet Classification Technique Mosisa Dessalegn Olana, R. Rajesh Sharma, Akey Sungheetha, and Yun Koo Chung
Abstract Nowadays, technology gives humankind the ability to produce enough food for billions of people all over the world. Even though the world produces a huge amount of crops to ensure food security, many factors threaten this process: crop threats can come from climate change, pollinators, and plant diseases. Plant diseases are not only threats to global food security, they also have devastating consequences for smallholding families in Ethiopia, where one family is often responsible for supporting many people. Crop loss is a major problem the world currently faces, and solving it with artificial-intelligence-based detection methods is a major challenge for experts because the efficiency of the algorithms depends on the nature of the diseases to be identified on the crops. Convolutional neural networks are showing promising results, especially in computer vision. This paper elaborates an implementation of deep learning (convolutional neural networks) that uses the RGB value of the color of the disease found on the crop, which increases the efficiency of the model. Keywords Wheat rust · Convolutional neural networks · RGB value segmentation · Computer vision · MosNet
M. D. Olana · R. Rajesh Sharma · A. Sungheetha · Y. K. Chung, Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia (e-mail: [email protected])
1 Introduction The economy of Ethiopia depends mainly on agriculture, which accounts for 40% of GDP, 80% of exports, and 75% of the country's overall workforce [1]. But the outbreak of different diseases on different crops is the most difficult challenge the country
is facing, and it brings different socioeconomic problems such as food insecurity, market inflation, and hard currency shortages caused by the need to cover the gap with imports from foreign countries. In Ethiopia, wheat is the second-largest crop by yield [2] and covers approximately 17% of all farmland according to the country's statistics. The crop is critical not only to smallholder incomes but also as the food and nutrition provider for 10 million Ethiopians. Ethiopia expected production of 4.6 million tons in 2019/20, an increase of only 0.1 million tons, which is still not enough to cover the country's wheat consumption [9]. Even though yellow rust is the most common wheat rust disease, stem and leaf rust [1, 2] have also been identified in some places. The biggest challenge for Ethiopia in dealing with this problem is that it cannot easily be addressed with technology at an early stage, before the disease spreads through all the fields, because early detection is currently done by deploying a large number of experts who inspect the fields manually. Many studies have worked on different kinds of plant disease detection, but for all their performance they also have drawbacks, as no research has a perfect ending. This study tried to address some of the drawbacks of previous research. The basic problems tackled in this study are: • Accuracy degrades when models trained on laboratory images are tested in a real field or cultivation area. • Models are constrained to single-leaf images with a homogeneous background. • Manually cropped images focusing only on the diseased area decrease efficiency in real-world tests. • Dependence on previously trained models brings future insecurity. These drawbacks are discussed in detail, with references, in the Related Works section.
2 Literature Review Wheat is one of the most common cereal grains worldwide; it comes from a grass (Triticum) and is grown in many varieties all over the world. Wheat contains gluten, which can trigger a harmful immune response in sensitive individuals, yet people all over the world consume it as a staple food because it is rich in antioxidants, minerals, vitamins, and fiber. Wheat is a prominent food security crop of Ethiopia and brings the country millions of dollars [3]. Despite the importance of the crop in Ethiopia, wheat is the grain most commonly infected by rust, basically with the three most common rusts: leaf rust, yellow (stripe) rust, and stem rust (Table 1). This paper works in depth on how to detect the three types of wheat rust disease using the RGB values of 2113 images, combining RGB value segmentation and convolutional neural network approaches.
Table 1 Wheat import and local growth rate in Ethiopia

Year | Import amount (MT) | Growth rate from the previous year (%)
2010 | 700 | −34.58
2011 | 1300 | 85.71
2012 | 1100 | −15.38
2013 | 1000 | −9.09
2014 | 1075 | 7.50
2015 | 2600 | 141.86
2016 | 1100 | −57.69
2017 | 1500 | 36.36
2018 | 1500 | 0.00
2019 | 1700 | 13.33
Three types of wheat rust disease severely harm wheat crops in Ethiopia.
2.1 Leaf Rust Leaf rust (Puccinia triticina) is one of the common fungal diseases of plants [4, 5]. It is also called brown rust and occurs mainly in wheat, barley, and other cultivated crops. By attacking the foliage, leaf rust turns the leaf surface dusty and reddish orange to brown.
2.2 Yellow Rust Yellow rust (Puccinia striiformis) [4] is also known as wheat stripe rust. It is a rust disease of wheat grown in cool environments, mostly in northern latitudes with temperatures between 2 and 15 °C. Even though leaf rust and yellow rust belong to the same rust family, they are different races; they are sometimes difficult to distinguish by normal visual inspection and need to be tested in a laboratory.
2.3 Stem Rust Stem rust is caused by a fungus (Puccinia graminis), like the other rust types, and is a significant disease affecting bread wheat, barley, durum wheat, and triticale.
Deep learning is a subfield of machine learning [6] that studies statistical models called deep neural networks, which can learn hierarchical representations from raw data. The type of neural network best suited to the detection and recognition of images is the convolutional neural network (ConvNet) [7, 8]. A ConvNet is widely used to extract features from an input image through its processing layers.
2.4 Related Works and Limitations The first study [9] used 15,000 RGB images manually cropped to single leaves so that only the infected area of the crop is visible. These images were used to classify three types of cassava leaf disease, with different train/validate/test splits: 10% was used for validating the model, and the remainder was split between training and testing as 10/80, 20/70, 40/50, and 50/40 percent, respectively. Using Google's InceptionV3, they achieved 98% accuracy, but the model does not perform well on random images captured under uncontrolled conditions, which prevents it from being applied in real-world settings. In the second study [10], GoogLeNet and AlexNet were trained on 54,306 images from the PlantVillage Web site; GoogLeNet performed better and more consistently, with a training accuracy of 99.35%, but the accuracy degrades to 31.4% on images from outside the dataset. Three train-test splits of 75/25, 60/40, and 70/30 percent were used with three image types: RGB color images, grayscale images, and segmented images. The third work [11] used automated pattern recognition with CNNs to detect three types of plants and their diseases from simple leaf images, using five basic pre-trained CNN models. The study used 70,300 images for training and 17,458 images for testing, at a standard size of 256 × 256 pixels. The models fine-tuned [10, 12, 13] in these studies were AlexNet, AlexNetOWTBn, GoogLeNet, OverFeat, and VGG, with the highest accuracy achieved by the VGG model: 100% on the training set and 99.48% on the test set. Table 2 summarizes the existing research, its significantly improved accuracy, and the limitations of each work compared with the present study; these four major techniques [9–12] are taken for comparison.
3 Proposed MosNet Method In this work, a model named 'MosNet' (after the first author) is proposed and built from scratch, without any transfer learning. The model classifies images into two categories: infected and not infected.
Table 2 Comparison of existing work and limitations

Authors | Techniques | Accuracy (%) | Limitations
S. P. Mohanty et al. | Applied fine-tuning using pre-trained deep learning models | 99.35 | The study is constrained to single-leaf images with a homogeneous background
K. P. Ferentinos et al. | Applied different pre-trained networks trained on laboratory images | 99.48 | Accuracy degrades when the model is tested on images from real cultivation fields
A. Picon et al. | Fine-tuned ResNet50 model | 87 | Images are segmented manually by expert technicians
A. Ramcharan et al. | Transfer learning using InceptionV3 | 96 | Images are manually cropped to a single leaflet
This binary classification is used because hundreds of thousands or millions of images would be needed to recognize which specific type of wheat rust has infected a crop. The first step before training the model is image acquisition, using a digital camera and smartphones in different rural parts of Ethiopia. These images cover wheat infected with the three types of wheat rust as well as healthy wheat, in different positions and under different humidity and weather conditions, to give the model varied images and help achieve better accuracy. The algorithm described below details the proposed image acquisition and RGB value segmentation.
3.1 MosNet Model MosNet is a convolutional neural network [14–17] architecture named after the first author and implemented from scratch, using a total of 2113 images of the infected and healthy classes with three different training/test splits. These train-test distributions are 50%–50%, 75%–25%, and 70%–30% for training and testing, respectively. A sigmoid activation function is used for the output layer with binary cross-entropy as the loss function, and the Adam gradient descent algorithm is used; the weights are learned at learning rates of 0.001 and 0.0001. • The first layer is a Conv2D convolution layer with 32 feature maps of size 3 × 3 and a rectifier activation function. This is the input layer, which expects the input image shape and is fed 150 × 150 images. • The next layer is a max-pooling layer, MaxPooling2D, configured with a pool size of 2 × 2.
• The third layer is a convolution layer with 32 feature maps of size 3 × 3 and a ReLU rectifier activation function, followed by a pooling layer with a size of 2 × 2. • The next layer is a convolutional layer with 64 feature maps of size 3 × 3 and a rectifier activation function, followed by a max-pooling layer with a pool size of 2 × 2. • In the next layer, the 2D feature maps are flattened into a vector so that the output can be processed by the fully connected layers and their activation functions. • The next layer is a regularization layer that uses dropout; to reduce overfitting, it is configured to randomly drop 30% of the neurons. • The next layer is a fully connected layer with 64 neurons and a rectifier activation function. • The output layer has 2 neurons for the 2 classes and a sigmoid activation function, producing probability-like predictions for each class. The model is trained with binary cross-entropy as the loss function and the Adam gradient descent algorithm with different learning rates (Fig. 1); a hedged Keras sketch of this layer stack is given after this list.
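The following is a minimal sketch of the layer stack described above, written with the Keras API. It is an assumption of how MosNet could be expressed in code, not the authors' released implementation; the 150 × 150 input size, 30% dropout, two-neuron sigmoid output, and learning rate are taken from the text.

```python
# Hedged sketch of the MosNet layer stack as described in Sect. 3.1 (not the authors' code).
from tensorflow.keras import layers, models, optimizers

def build_mosnet(input_shape=(150, 150, 3), learning_rate=0.001):
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),  # conv layer 1
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu"),                           # conv layer 2
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),                           # conv layer 3
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),                                                       # 2D maps -> vector
        layers.Dropout(0.3),                                                    # drop 30% of neurons
        layers.Dense(64, activation="relu"),                                    # fully connected layer
        layers.Dense(2, activation="sigmoid"),                                  # 2-class sigmoid output
    ])
    model.compile(optimizer=optimizers.Adam(learning_rate=learning_rate),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_mosnet()
model.summary()
```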
3.2 Algorithm Step 1 The first step is image collection from different parts of Ethiopia, especially areas known for their large wheat production, using a digital camera and smartphones. Step 2 The second step is increasing the number of images using a data augmentation technique, because there is no established practice of storing image data in Ethiopia and the country is not yet widely aware of modern machine learning approaches. In this step, the 192 original images taken from real cultivation fields were expanded to 2113 total images used to train the model, using ten augmentation features: • rotation, height shift, width shift, rescaling, shear, zoom, horizontal flip, fill mode, data format, and brightness (a hedged sketch of this augmentation step is given below). Step 3 The third step is to resize the images to a common standard size so that the model reads images uniformly; images of different sizes are resized to 150 × 150 (height × width). Step 4 The final step is to segment the images using the RGB color values of the infected regions and feed them to the model. Each of the three types of wheat rust disease has its own color, and every color has a unique RGB value representation. Note: R = red, G = green, B = blue.
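As a hedged illustration of Step 2, the sketch below uses the Keras ImageDataGenerator with the ten augmentation features listed above; the specific parameter values and the wheat_images/ directory name are assumptions for demonstration only.

```python
# Sketch only: data augmentation with the ten features listed in Step 2.
# Parameter values and the wheat_images/ directory are illustrative assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=40,            # rotation
    height_shift_range=0.2,       # height shift
    width_shift_range=0.2,        # width shift
    rescale=1.0 / 255,            # rescaling
    shear_range=0.2,              # shear
    zoom_range=0.2,               # zoom
    horizontal_flip=True,         # horizontal flip
    fill_mode="nearest",          # fill mode
    data_format="channels_last",  # data format
    brightness_range=(0.7, 1.3),  # brightness
)

train_batches = augmenter.flow_from_directory(
    "wheat_images/",              # hypothetical folder with infected/ and healthy/ subfolders
    target_size=(150, 150),       # Step 3: resize to 150 x 150
    batch_size=32,
    class_mode="categorical",     # two classes: infected, healthy
)
```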
Fig. 1 MosNet model architecture: input image → Conv2D + ReLU + MaxPooling2D (conv layer 1) → Conv2D + ReLU + MaxPooling2D (conv layer 2) → Conv2D + ReLU + MaxPooling2D (conv layer 3) → Flatten → Dense layers with sigmoid activation (output layer)
These representations are: yellow rust has a blue channel value in the range 0 < B < 45, stem rust has a red channel value in the range 130 < R < 200, and leaf rust has a green channel value in the range 50 < G < 150.
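A minimal NumPy sketch of this RGB value segmentation is shown below. How the original MosNet pipeline combines the three channel masks is not spelled out in the text, so the per-rust masks and the dark output for healthy regions are illustrative assumptions.

```python
# Sketch only: RGB value segmentation using the channel ranges given above.
# The masking strategy is an assumption; the paper does not detail the exact operation.
import numpy as np

def segment_rust(image_rgb: np.ndarray) -> np.ndarray:
    """Keep pixels whose channel values fall in the rust ranges; everything else goes dark."""
    r = image_rgb[..., 0].astype(int)
    g = image_rgb[..., 1].astype(int)
    b = image_rgb[..., 2].astype(int)

    yellow_rust = (b > 0) & (b < 45)      # yellow rust: 0 < B < 45
    stem_rust = (r > 130) & (r < 200)     # stem rust: 130 < R < 200
    leaf_rust = (g > 50) & (g < 150)      # leaf rust: 50 < G < 150

    mask = yellow_rust | stem_rust | leaf_rust
    segmented = np.zeros_like(image_rgb)
    segmented[mask] = image_rgb[mask]     # healthy regions remain a solid dark image
    return segmented
```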
Fig. 2 Original and RGB segmented images
Segmenting the images using these color values enables clean classification by marking an identified zone only on the infected areas of the crop, while a healthy image produces a solid dark output and is therefore easy to distinguish from infected crops. As shown in Fig. 2, there are four different crop images: healthy wheat (a), wheat infected with yellow rust (b), wheat infected with stem rust (c), and wheat infected with leaf rust (d). This segmentation makes the model classify far better than it does on unsegmented images.
4 Experimental Results All experiments are performed in the Python programming language on the Anaconda distribution platform using the JupyterLab editor. The MosNet model has been evaluated on three different datasets (a grayscale image dataset, an RGB image dataset, and an RGB value segmented image dataset) using the parameters that can affect the efficiency of any model: learning rate, dropout ratio, and train-test split ratio. These parameters have been used in different combinations, because each parameter has an effect every time its value is changed. The parameter values selected as most effective, and used to show the impact of each parameter, are listed below.
Epochs The number of epochs is the number of training iterations the model runs over the given dataset; three values were used: 100, 200, and 300 iterations. Learning Rate Adam with learning rates of 0.001, 0.0001, and 0.00001 was used. Test Ratio The test ratio is the share of data used to test the accuracy of the model after training is finished; test data is data the model has not seen before. Test ratios of 25, 20, and 30% of the total dataset were used. Dropout Dropout is explained in the dropout section (2.5); two dropout rates, 50 and 30%, were used. Each parameter affects the model depending on the values used in combination with the other parameters, and the resulting effects are discussed below. In this study, more than two hundred experiments were conducted by varying the combinations of these parameters (see the sketch below). MosNet is evaluated on three kinds of image dataset: grayscale, RGB, and RGB value segmented. Grayscale images have only one color channel and therefore cannot hold enough information to be extracted from the image data. Evaluating the same model on the same dataset in Table 3 while changing the parameters that affect its efficiency shows the effect of the learning rate in results 1 and 2 of the table: the only parameter changed between these two results is the learning rate, 0.001 for the first result and 0.00001 for the second.
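The parameter sweep described above can be expressed as a simple grid over the listed values. The sketch below is only an illustration of that idea, not the authors' experiment script; the training call is left as a comment because it is not specified in the paper.

```python
# Sketch only: enumerate the parameter combinations described in this section.
from itertools import product

epochs_options = [100, 200, 300]
learning_rates = [0.001, 0.0001, 0.00001]
test_ratios = [0.25, 0.20, 0.30]
dropout_rates = [0.50, 0.30]
datasets = ["grayscale", "rgb", "rgb_segmented"]

grid = list(product(datasets, epochs_options, learning_rates, test_ratios, dropout_rates))
print(f"{len(grid)} parameter combinations")   # 3 * 3 * 3 * 3 * 2 = 162 combinations

for dataset, epochs, lr, test_ratio, dropout in grid:
    # In the real experiments, each combination would train MosNet and record
    # training time, accuracy, and error, as summarized in Table 3.
    pass
```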
Table 3 Cumulative summaries for MosNet model

No. | Dataset type | Epochs | Learning rate | Test ratio (%) | Dropout (%) | Training time | Accuracy (%) | Error (%)
1 | Grayscale | 100 | Adam (0.001) | 25 | 50 | 60.49 | 85.89 | 14.11
1 | Grayscale | 100 | Adam (0.00001) | 25 | 50 | 60.95 | 81.63 | 18.37
1 | Grayscale | 200 | Adam (0.001) | 25 | 30 | 120.77 | 80.78 | 19.22
1 | Grayscale | 200 | Adam (0.001) | 25 | 50 | 120.43 | 86.62 | 13.38
1 | Grayscale | 300 | Adam (0.001) | 25 | 50 | 179.25 | 79.08 | 20.92
2 | RGB | 200 | Adam (0.001) | 25 | 50 | 110.34 | 98.78 | 1.22
2 | RGB | 200 | Adam (0.001) | 20 | 50 | 115.65 | 98.48 | 1.52
2 | RGB | 300 | Adam (0.001) | 25 | 30 | 170.90 | 99.51 | 0.49
2 | RGB | 300 | Adam (0.001) | 25 | 50 | 196.24 | 99.27 | 0.73
2 | RGB | 300 | Adam (0.0001) | 25 | 50 | 176.19 | 97.57 | 2.43
2 | RGB | 300 | Adam (0.00001) | 25 | 50 | 169.17 | 96.84 | 3.16
3 | RGB segmented | 300 | Adam (0.001) | 25 | 30 | 165.61 | 99.76 | 0.24
3 | RGB segmented | 300 | Adam (0.001) | 25 | 50 | 168.14 | 98.05 | 1.95
Table 4 Improved accuracy of the proposed model

No. | Techniques | Accuracy (%) | Error (%)
1 | Proposed MosNet | 99.76 | 0.24
2 | Applied fine-tuning using pre-trained deep learning models | 99.35 | 0.65
3 | Applied different pre-trained networks to train on laboratory images | 99.48 | 0.52
4 | Fine-tuned ResNet50 model | 87.00 | 13.00
5 | Transfer learning using InceptionV3 | 96.00 | 4.00
These two results give accuracies of 85.89% and 81.63%, respectively, comparing result 1 (learning rate 0.001) with result 2 (learning rate 0.00001). Table 4 shows the improved accuracy of the proposed MosNet model. The comparison suggests that as the learning rate decreases, the accuracy of the model decreases as well, because a lower learning rate slows the model's ability to learn: with the same number of epochs, dropout rate, and test ratio, the model takes too long to learn features from the data. The table also shows that the grayscale model starts to degrade by the 300th epoch; with only one channel there is nothing more to extract and learn, so features are lost and the efficiency of the model degrades after the 200th epoch. This motivates the use of RGB images, which have three channels and can represent more than 16 million colors. RGB images let the model extract and learn from the color characteristics of the images and prevent the loss of color information, which matters here because wheat rusts are identified by their color differences from healthy wheat. The graphical results of each evaluation can be seen in Figs. 3 and 4.
Fig. 3 Confusion matrix of validation data for Table 3, row 1
Fig. 4 Confusion matrix of validation data for Table 4, row 1
As shown in Figs. 5 and 6, there is a big difference between testing the model on the training data and on the test data. The training data has already been learned by the model during training, so the model does not take long to generalize those samples into their classes. As understood from the above confusion matrix, loss graph, and accuracy graph, this is not yet a detection model good enough to work properly in real-time conditions, since it still produces many errors even though it detected more than 80% of the testing data; the quality of the data and the amount of information extracted from it therefore need to be improved as much as possible. The next step is to train the model using RGB images without segmentation, which brought much better accuracy than the model trained with the grayscale image dataset, and the results of the two datasets can be compared under the same parameters. Comparing the grayscale result in row 5 of Table 3 with the RGB result in row 4 of Table 3 shows a big difference in their results.
Fig. 5 Loss graph of train versus validation data for Table 3, row 1
Fig. 6 Accuracy graph of train versus validation data for Table 3, row 1
Fig. 7 Accuracy graph for grayscale dataset
These diagrams are the results of evaluating the model on two different datasets: the grayscale dataset shown in Fig. 7 and the RGB dataset shown in Fig. 8. Looking at the accuracy graphs of both evaluations, the grayscale dataset gives the minimum value on the validation (test) data, scoring only 79.08% accuracy with 20.92% error, which makes the model unsuitable for real-time applications. It also shows that, as the number of epochs increases to the maximum on the grayscale dataset, the accuracy starts to degrade, because when the model extracts features from grayscale images repeatedly, it generalizes samples into false classes and starts to overfit. The next graph (Fig. 8) is the evaluation result on the RGB image dataset with the same parameters used in the previous one. The result is far better than with the grayscale image dataset: this evaluation scored 99.27% accuracy with 0.73% error, which is a good result, and in the best configuration the model even reaches an accuracy of 99.51%. However, a further extraction technique was still needed, because the model classifies some wheat images containing rain droplets, soil, or fire-burned leaves as infected crop, which is an error.
Fig. 8 Accuracy graph for RGB dataset
Fig. 9 Classification report of RGB segmented dataset value with the highest accuracy
Therefore, there is a need for another way to fix this problem: segmenting the images using their unique RGB value. This brought a big improvement, with an accuracy of 99.76% (Table 3, row 1 of the RGB segmented dataset part), and also fixed the problems encountered in the previous evaluations on the grayscale and RGB image datasets. This result also gave a precision and recall value of 1, which indicates a perfect evaluation of the model (Fig. 9). As shown in Fig. 10, the result achieved by segmenting images using their RGB value gives an excellent validation accuracy of 99.76% and the lowest error rate of 0.24%, which is a strong achievement in this study. The loss of the model starts from around 0.6 and gradually tends to almost zero, and the accuracy starts from around 88% and ends at 99.76% at the 300th epoch.
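The RGB-value segmentation described above can be pictured with the following minimal sketch; the color bounds are hypothetical placeholders, since the exact rust-color values used for MosNet are not given here.

```python
# Rough sketch of color-value segmentation: pixels whose RGB values fall inside
# an (assumed) rust-color range are kept, everything else is zeroed, so droplets,
# soil, and burnt leaves contribute less to the learned features.
import numpy as np

def segment_by_rgb(image, lower=(120, 60, 0), upper=(255, 180, 90)):
    """image: HxWx3 uint8 array; lower/upper: hypothetical rust-color bounds."""
    lower = np.array(lower, dtype=np.uint8)
    upper = np.array(upper, dtype=np.uint8)
    mask = np.all((image >= lower) & (image <= upper), axis=-1)
    segmented = np.zeros_like(image)
    segmented[mask] = image[mask]      # keep only rust-colored pixels
    return segmented

demo = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
print(segment_by_rgb(demo).shape)
```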
5 Conclusion and Future Work
This study discussed different CNN models by applying different important factors that can affect the model designed in the study. Three dataset types are used in the study to conduct the experiments for the MosNet model, and these dataset types
Fig. 10 Loss and accuracy graph of RGB segmented dataset with the best result
are: the grayscale image dataset, which contains only one-channel images; the RGB dataset, which contains three-channel images; and the RGB color-segmented dataset, which is the RGB dataset segmented with the disease color code. On the grayscale dataset, the MosNet model achieved an accuracy of 86.62% with 200 epochs, a 0.001 learning rate, and a 50% dropout rate. This result improved when the model was trained on the RGB image dataset, climbing to an accuracy of 99.51%. Finally, after segmenting the images using the color of the infected images, the model extracted better information than the previous models and achieved an accuracy of 99.76% with 300 training epochs, a learning rate of 0.001, and a dropout rate of 30%. This study delivered a CNN model that can effectively monitor wheat crop health, which is helpful for early protection of a wheat farm before the disease spreads and causes total damage to the crop. Early prevention of total crop loss is valuable, but it is not enough for statistical purposes; the detection should also identify which type of disease occurred and to what extent it occurred on the farm. Future work therefore includes collecting a sufficiently large and well-defined dataset from different agricultural lands, which would help this study progress toward a wider variety of crops and their disease types and apply CNN models to the real world in Ethiopia within a short period of time, since the current lack of data limits the study from progressing beyond the results found so far.
A Decision Support Tool to Select Candidate Business Processes in Robotic Process Automation (RPA): An Empirical Study
K. V. Jeeva Padmini, G. I. U. S. Perera, H. M. N. Dilum Bandara, and R. K. Omega H. Silva
Abstract Robotic process automation (RPA) is the automation of business processes (BPs) using software robots. RPA robots automate repetitive, non-value-adding human work. The extent to which a BP can be transformed into a software robot and its utility depend on several factors such as task type, complexity, repeated use, and regulatory compliance. The estimated RPA project failure rates are relatively high, and transforming the wrong BP is attributed as one of the critical reasons for this. Therefore, given a candidate set of BPs, it is imperative to identify only the suitable ones for RPA transformation. In this paper, a decision support tool (DST) is presented to select candidate BPs for RPA. First, 25 factors are identified from the literature that capture the characteristics of a BP. The list is then reduced to 16 factors based on a set of interviews with RPA subject matter experts (SMEs). Then an online survey with snowball sampling was conducted to measure the relevance of those factors in predicting the outcome of RPA transformation of a candidate BP. Finally, the two-class decision forest classification model was used to develop the DST. The utility of the proposed DST was validated by applying it to three RPA projects of a global IT consulting services company.
Keywords Business process automation · Decision forest algorithm · Decision support tool · Robotic process automation
K. V. Jeeva Padmini (B) · G. I. U. S. Perera · H. M. N. Dilum Bandara · R. K. O. H. Silva Department of Computer Science and Engineering, University of Moratuwa, Moratuwa, Sri Lanka e-mail: [email protected] G. I. U. S. Perera e-mail: [email protected] H. M. N. Dilum Bandara e-mail: [email protected] R. K. O. H. Silva e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_42
1 Introduction
Businesses continually strive to improve customer satisfaction while increasing the quality of products or services, raising productivity, and reducing cost [1]. Robotic process automation (RPA) plays a pivotal role in this attempt by automating various fractions of business processes, especially the ones that involve repetitive, non-value-adding human tasks and procedures [2–5]. For example, a customer service representative could spend more time interacting with customers as opposed to entering data from forms filed by customers. RPA could improve accuracy, quality, operational speed, efficiency, customer satisfaction, and employee productivity, and decrease operational costs due to reduced human intervention and human errors [6, 7]. Thus, the use of RPA in the digitalization and automation of business processes has increased dramatically in recent years [4, 8]. It is estimated that 100 million knowledge workers will be replaced by software robots by 2026 and that RPA could cut costs by up to 75% in financial services firms [5]. In the journey of business process automation (BPA), RPA is the process of using software robots to reduce the burden on human workers [1]. RPA mimics users performing tasks on existing user interfaces and tools, rather than the traditional BPA approach of interacting with the backend of applications using application programming interfaces (APIs) or specific scripting languages. However, software robots are far more than a screen-scraping technology and could encompass cognitive and artificial intelligence features. Given the benefits and seeming potential to automate almost anything, there is great interest in transforming any business process to RPA. However, not every business process is a suitable candidate for RPA transformation. For example, a study by Lamberton [9] showed that 30–50% of RPA projects fail in the initial attempt. Transforming the wrong business process is attributed as one of the key reasons for such failure [8–10]. Therefore, deciding on suitable candidate business processes for bot implementations is critical for the success of RPA transformation. However, this is not trivial, as RPA is an emerging technology and only a few process analysts have the required expertise in RPA [4, 9]. In this paper, a decision support tool (DST) is proposed to assist process analysts in selecting candidate business processes for RPA. The first step in developing a DST is to identify the factors/parameters that affect the decision. Therefore, the following research question is first formulated: What are the factors that affect candidate business process selection for RPA transformation?
A literature survey was conducted to answer the research question, and 25 factors were identified that capture the type, complexity, and workload of a business process. A set of interviews with RPA subject matter experts (SMEs) in the industry was then conducted to identify the more relevant factors. Based on the interview results, the list of factors was reduced to 16. An online survey with snowball sampling was further conducted to measure the relevance of those 16 factors in predicting the success and failure of the RPA transformation of a candidate business process. Finally, using the survey data, a two-class decision forest with fourfold classification was trained as
the predictive model of the proposed DST. To test the utility of the proposed DST, feedback was obtained from five RPA SMEs of a global IT consulting services company, who applied the tool to three different projects of a multinational financial services company's RPA transformation. Out of the three projects considered, one project was a failure, and the proposed DST predicted all three scenarios correctly. The rest of the paper is organized as follows. Section 2 presents the literature review. Section 3 describes the research methodology. The proposed DST for analyzing business processes and identifying candidate business processes for RPA is presented in Sect. 4. Section 5 presents the survey results and the empirical study conducted to validate the utility of the proposed DST, and Sect. 6 concludes the paper.
2 Literature Review
RPA is a novel technology in BPA [1], which has the potential to automate business processes quickly and efficiently without altering existing infrastructure and systems. RPA software robots (aka virtual bots or bots) work 24 × 7 and are programmed to follow a specific set of rules to complete tasks with higher accuracy and speed than is achievable by a typical human worker [12]. A bot can integrate with almost any software without accessing third-party tools or APIs [1] and behaves like a proxy human worker to operate a business application [7]. It relieves workers from mundane tasks and enables them to focus on work requiring more human intervention, cognitive skills, and creativity [11]. RPA could replace human work meeting criteria such as high-volume transactions, the need to access multiple systems, a stable environment, low cognitive requirements, easy decomposition into unambiguous rules, proneness to human error, limited need for exception handling, and a clear understanding of the current manual costs [12]. Artificial intelligence (AI) is applied to non-repetitive tasks, whereas RPA is typically applied to repetitive tasks; together, RPA and AI help to generate compelling applications and decision making in business processes [5]. As an emerging technology, RPA has its own set of challenges. Based on the experience of implementing RPA in 20 companies, Lamberton [9] identified that 30–50% of the initial projects failed. It has been identified that applying traditional software development methodologies and transforming the wrong business process are critical contributors to such failure [8–10]. While both factors require considerable attention from academics and practitioners, this work focuses only on the problem of selecting the right business process for RPA transformation. Failing to select a suitable process early in the business analysis leads to user and customer frustration, in addition to direct losses such as time and money. In [13], Lacity and Willcocks presented a widely used set of indicators such as rule-driven, repetitive in nature, data-intensive, high compliance, and validations to select suitable candidate business processes for RPA. Further, in [14], a set of recommendations to assess and analyze each business process against human factor, complexity, and stability is presented. In [7, 15, 16], it has been identified that selecting candidate business
processes with a high volume of transactions, process maturity, and business rules leads to better success. Moreover, in [16], it is identified that business processes with high workload and low complexity are better candidates for RPA bot implementation. While these related works give useful insights on a few factors to consider in determining the RPA suitability of a candidate business process, no systematic mechanism or process exists to do so. Therefore, it is imperative to identify a suitable process to determine the candidacy early in the business analysis.
3 Research Methodology
To decide on the RPA suitability of a candidate business process, it is first necessary to understand the characteristics of that business process and the relative significance of each characteristic in determining a successful outcome. Therefore, a two-step process is adopted to develop the DST: a set of factors/parameters that characterize a business process is first identified, and then a predictive model is developed to derive a yes/no decision on its RPA suitability. Figure 1 shows the adopted research methodology. First, a literature review was conducted to identify factors that characterize a given business process and determine its RPA suitability. While formal literature on the topic was limited, several whitepapers and articles were identified that discuss the difficulties in selecting a business process for RPA transformation and a set of best practices. Based on these resources, 25 factors are identified that could impact the selection of a candidate business process for RPA transformation. They are categorized into three groups: process factors, external factors, and financial impact factors. As it would be difficult to derive a meaningful RPA candidacy decision with 25 factors, it was decided to identify the more significant ones among them. For this, the opinion of RPA SMEs was consulted using a set of face-to-face interviews. Five RPA SMEs in the industry were identified using snowball sampling. Three of these SMEs had one to three years of experience in cognitive RPA transformation and the other two had one to three years of experience in static RPA transformation. In addition
Fig. 1 Research methodology
to determining the relevance of the 25 pre-identified factors, identification of any missing factors was also attempted. Based on the interview findings, the 16 factors listed in Table 1 were finalized as significant in determining the RPA suitability of a candidate business process.
Table 1 Finalized factors impacting candidate business process selection in RPA
Factor type | Factor name | Description | Measurement criteria
Process factor | Volume of transactions (VOT) | No. of processing requests that the computing system will receive and respond to within a specified period | High, medium, or low
 | Business process complexity (BPC) | Levels of ability to analyze, variety, non-routines, difficulty, uncertainty, and inter-dependence with other business processes | High, medium, or low
External factor | Rate of change (ROC) | No. of changes made to the business process within a month | ≤2, 3–5, 6–8, ≥8
 | Rule-based business process (RBBP) | Logic-based business process | Yes with few exceptions, Yes, or No
 | Workload variation (WV) | Irregularity of workload where additional staff support is required during peak times | Yes, sometimes, no
 | Regulatory compliance (RC) | Need to satisfy organizational, government, or any other regulations, e.g., ISO or PCI-DSS | Yes, No
 | Cognitive features (CF) | Task requires creativity, subjective judgment, or complex interpretation skills (e.g., natural language processing and speech recognition) | Yes, no
 | No. of systems to access (NOSA) | Automation task involves accessing multiple systems | ≤2, 3–5, 6–8, ≥8
 | Multi-modal inputs (MMI) | Images, audio clips, or videos are given as inputs | Yes, No
 | Data-driven (DD) | Decisions during an activity are made based on the analysis of data | Yes, No
 | Stability of environment (SOE) | Availability of UAT or any other near-production environment of the target application for automation | Yes, No
Financial impact factor | Operational cost (OC) | Cost of day-to-day maintenance and administration | High, medium, or low
 | Impact of failure (IOF) | Financial impact to client if the bot goes wrong | High, medium, or low
 | Human error (HE) | Task is error-prone due to human worker negligence rather than inability to code it as a computer program | High, medium, or low
 | End of life (EOL) | How soon the bot will become obsolete, e.g., a new system is on the road map of the client | ≤2 years, ≤4 years, ≥5 years
 | Service-level agreement (SLA) | Time taken to complete end-to-end functionalities of a task | Seconds, minutes, hours, days
Five factors were combined with other factors, as they were determined to be alternative terminologies/definitions of the same set of factors. Four factors were removed because they were hard to measure, e.g., the client's expected outcome of the process delivery. A requirement of irregular labor was identified from the interview findings as a new factor that was not in the initial list of 25. Based on the interviews, it was further realized that not all 16 factors are of equal significance. However, an attempt to rank the relative importance of the factors with the feedback from RPA SMEs was not straightforward. Hence, it was decided to conduct an industry survey to measure the 16 factors of RPA projects/bots developed by the surveyed SMEs and their outcomes, and consequently a questionnaire was developed based on the 16 factors. It was first shared with ten RPA SMEs of the global IT consulting services company as a pilot survey. Profiles of the SMEs included project managers, architects, developers, and testers. In the questionnaire, the survey participants were also asked to prioritize the factors. However, while analyzing the pilot survey results, it was understood that the prioritization of factors depended on the role the survey participants played within the RPA project. Hence, the factor prioritization option was removed and the questionnaire was updated based on the feedback received.
The online questionnaire was then shared with industry experts across the world. These experts were identified using LinkedIn, public forums, webinars, etc., and 56 responses were collected. Three of them were discarded after verifying their validity because those respondents commented that they could not assess the process complexity. Even after attempting to collect data for more than three months, it was difficult to collect many responses, as it was hard to find professionals who had completed at least one RPA project. Moreover, some declined to participate, citing confidentiality agreements with their clients or conflict of interest. Among the 53 responses, only 22 of the survey participants were involved in a bot implementation that was successful, and another eight were involved in projects with failed bots. The other 23 participants had ongoing projects which were at different phases of the project lifecycle. As the dataset collected from the online survey contained both categorical and ordinal data, the Spearman correlation was calculated for the 30 (22 + 8) completed responses. It was identified that factors such as workload variance (WV), no. of systems to access (NOSA), and service-level agreement (SLA) had a positive correlation with the RPA outcome of the business process (represented as the Status of Candidate Business Process (SCBP)), whereas volume of transactions (VOT), regulatory compliance (RC), and cognitive features (CF) had a negative correlation with SCBP. However, because of the limited number of responses, and because the feedback from SMEs during the interviews indicated that the other ten factors are still useful in capturing the characteristics of a business process, it was decided to develop the prediction model using all 16 factors. This was also motivated by the fact that the dataset was small. To derive the RPA suitability decision, the two-class decision forest classification model is used. The overall accuracy was verified using the fourfold cross-validation process, and the prediction model was evaluated to have an overall accuracy of 90%. The choice of Spearman correlation, the two-class decision forest classification model, and the resulting data analysis are presented in Sect. 4. Finally, the predictive model was further validated with the help of five RPA SMEs who applied it to three different projects of a global IT consulting services company that develops RPA bots. The three projects were chosen from three business processes of a consumer division of a multinational financial services company. At the time of evaluation, the three projects were already implemented at the customer site and had been in operation for a sufficient time to determine their success/failure in achieving the business outcomes. The proposed DST determined two of the projects as suitable and the other as unsuitable for RPA transformation. Indeed, the project that was predicted as unsuitable for RPA transformation had failed due to the wrong selection of the business process.
4 Predictive Model
While the correlation analysis indicated that all 16 factors are relevant in determining the RPA suitability of a candidate business process, it is difficult to determine the
Fig. 2 Process of data analysis and predictive model development
suitability of a candidate business process based on the 16 values assigned to those factors. An initial attempt to derive a linear model with a threshold score or a set of rules to determine the RPA suitability showed that it was nontrivial to identify a suitable set of weights, thresholds, and rules, especially with a mix of categorical and ordinal data. Hence, a predictive data analysis method based on the two-class decision forest algorithm is adopted. A two-class classification model was sufficient, as the DST is intended to provide a yes or no answer on the RPA suitability of a candidate business process. Figure 2 shows the process followed in developing the decision model to determine the RPA suitability of a business process. SPSS is used for the survey data analysis, which consisted of data preprocessing and factor selection steps. First, the collected data were validated, and redundant and incomplete responses were removed. After that, the Spearman correlation was calculated to verify the interdependency of the factors and the correlation of the factors with the dependent variable (i.e., SCBP). Spearman correlation is used because it helps to identify the significant factors in the dataset, showing the impact of each factor in determining the suitability of a candidate business process. Then the predictive model was developed using Microsoft Azure Machine Learning Studio due to its ease of use. For the predictive model, a two-class decision forest with fourfold classification is trained using the decision forest algorithm [17]. Figure 3 shows the overall model development and testing workflow. While the above tools were used for convenience, the proposed model can be implemented in any tool that supports a two-class decision forest and fourfold classification.
4.1 Data Preprocessing
56 responses were collected from the online survey shared across the industry. Three responses were discarded during data cleaning to correct data inconsistency and to remove noise. Two of those respondents had commented that they could not assess the process complexity of the selected business process.
Fig. 3 Model development and testing workflow
4.2 Factor Selection
Spearman's correlation is used to identify the significant factors from the survey responses. Pearson's correlation is applicable when the data contain intervals or ratios, whereas some of the factors here are binary or categorical. Because the dataset is categorical and ordinal, Spearman's correlation was used to measure the strength of the monotonic relationship between variables. Table 2 lists the resulting correlation values. SCBP is the dependent variable and the 16 factors are the independent variables. Low Spearman's correlation among the factors indicates that there is no inter-relationship among them. Therefore, the chosen 16 factors capture different properties of the candidate business process.
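As an illustration of this factor-selection step, the following sketch computes Spearman's correlation of a few factors against SCBP using pandas; the study itself used SPSS, and the values below are dummy data, not the survey responses.

```python
# Spearman's rank correlation on an ordinal/categorical dataset. Column names
# follow the factor abbreviations in Table 1; the numbers are dummy values.
import pandas as pd

df = pd.DataFrame({
    "SCBP": [1, 1, 0, 1, 0, 1],      # 1 = successful bot, 0 = failed bot
    "VOT":  [3, 3, 1, 2, 1, 3],      # high/medium/low encoded as 3/2/1
    "WV":   [1, 1, 0, 1, 0, 1],
    "SLA":  [2, 3, 1, 2, 1, 3],
})

# Spearman handles ordinal data; Pearson would assume interval/ratio scales.
corr_with_scbp = df.corr(method="spearman")["SCBP"].drop("SCBP")
print(corr_with_scbp)
```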
Table 2 Spearman's correlation among 16 factors and the status of the business process (the values shown are each factor's correlation with the dependent variable SCBP)
Factor | Spearman's correlation with SCBP
VOT | −0.625
BPC | −0.159
WV | 0.413
ROC | 0.00
RBBP | −0.007
RC | −0.317
CF | −0.396
NOSA | 0.307
MMI | −0.191
DD | −0.062
OC | 0.175
IOF | −0.018
HE | 0.147
EOL | −0.018
SOE | 0.051
SLA | 0.40
From Table 2, it can be seen that the volume of transactions (VOT), regulatory compliance (RC), and cognitive features (CF) variables have a negative correlation with the dependent variable SCBP. The workload variance (WV), no. of systems to access (NOSA), and service-level agreement (SLA) variables have a positive correlation with SCBP, whereas business process complexity (BPC), rate of change (ROC), rule-based business process (RBBP), multi-modal inputs (MMI), data-driven (DD), operational cost (OC), impact of failure (IOF), human error (HE), end of life (EOL), and stability of environment (SOE) have an insignificant correlation with SCBP. VOT (−0.625), WV (0.413), and SLA (0.40) had a moderate correlation with SCBP and were statistically significant, with 2-tailed values of 0.00, 0.002, and 0.003, respectively, whereas RC (−0.317), CF (−0.396), and NOSA (0.307) had a weak correlation with SCBP and were statistically significant with 2-tailed values of 0.021, 0.003, and 0.025, respectively. However, BPC (−0.159), ROC (0.00), RBBP (−0.007), MMI (−0.191), DD (−0.062), OC (0.175), IOF (−0.018), HE (0.147), EOL (−0.018), and SOE (0.051) had insignificant correlations with SCBP; the corresponding 2-tailed values were 0.256, 0.998, 0.961, 0.170, 0.660, 0.210, 0.898, 0.292, 0.898, and 0.716, respectively. This result may have been due to the low number of samples, and the interviews with RPA SMEs revealed that the ten factors with insignificant correlations are still useful in capturing the characteristics of a business process. Therefore, it was decided to build the model with all 16 variables, including the ten variables with insignificant correlation.
4.3 Classification Model Development
A two-class classification model is selected instead of a multiclass classification model to keep the model simple, as the goal is to derive a yes/no decision on the suitability of a candidate business process. The decision forests algorithm [17] is used because it is an ensemble learning method that can be used for classification tasks, and such models provide better coverage and accuracy compared to single decision trees. The decision forests framework in Microsoft Azure Machine Learning Studio is used, which extends several forest-based algorithms and unifies classification, regression, density estimation, manifold learning, semi-supervised learning, and active learning under the same decision forest framework.
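A rough equivalent of this step can be sketched with scikit-learn's random forest classifier and fourfold cross-validation in place of the Azure Machine Learning Studio decision forest actually used; the feature matrix below is dummy data standing in for the 16 encoded factors.

```python
# Two-class forest classifier with fourfold cross-validation on dummy data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(30, 16))   # 30 responses x 16 factors (ordinal codes)
y = rng.integers(0, 2, size=30)         # SCBP: 1 = suitable, 0 = not suitable

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=4, scoring="accuracy")
print(scores.mean())
```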
4.4 Model Evaluation
Next, the accuracy of the two-class classification model trained with the decision forests algorithm is evaluated. Accuracy, precision, recall, and F score are some of the commonly used metrics to evaluate a classification model. Accuracy measures the quality of the classification model as the proportion of the
Table 3 Accuracy of the predictive model
Accuracy | Precision | Recall | F score
0.900 | 1.00 | 0.870 | 0.930
true results to all the cases within the model. Precision describes the proportion of true results over all positive results. Recall is the fraction of all correct results returned by the model. The F score is calculated as the weighted average of precision and recall, and lies between 0 and 1 (the higher, the better). Table 3 presents the results based on fourfold cross-validation. It can be seen that the overall accuracy of the model is 90%, and the model has good precision and recall too. The high F1 score further indicates good accuracy of the test. The proposed DST primarily includes the trained two-class classification model and the values of the 16 factors fed into it. The prediction model is published as a Web service on the Microsoft Azure platform such that the SMEs could access it to validate the model (see Fig. 3). Finally, the proposed DST is verified by applying it to evaluate the RPA transformation of three business processes of a multinational financial services company.
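For reference, the metric definitions quoted above correspond to the usual confusion-matrix formulas (with TP, TN, FP, and FN denoting true/false positives/negatives, and F1 taken here as the harmonic mean of precision and recall):

```latex
\mathrm{Accuracy}  = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall}    = \frac{TP}{TP + FN}, \qquad
F_{1} = \frac{2\,\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```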
5 Results and Discussion
The proposed DST was validated with the help of five RPA SMEs who applied it to three different projects of the global IT consulting services company. Figure 4 shows the output of the DST for the failed RPA transformation project, which illustrates the result as 0 for No (1 for Yes) along with the status of the list of factors; the second sentence of the output explains the decision. The DST predicted the other two projects as suitable for RPA transformation, and they were indeed successful in actual operation at the client site.
or
σ > σmax/2 + √( 2(ξN + εN)/κ + (wmax² + amax² + cmax² + σmax²)/4 )
Therefore, convergence is guaranteed, and the system of error dynamics [Eqs. (17)–(18)] is stable in the sense that the size of the state vector is bounded.
6 Prototype and Mathematical Model of Surface Vessel
The CAD model (Fig. 1) of the surface vessel was designed in Catia V6 by following the standards of naval architecture described especially for SV designs in [24–26] for autonomous applications. The SV design consists of two similar hulls (in both shape and mass) as the main floating bodies of the SV. Two electric propellers are connected to the back end of the hulls as shown. To maintain vertical stability, a submerged aerofoil-shaped Gertler body is attached to a thin vertical strut that is fixed symmetrically to the SV structure. Physical dimensions of the SV are given in Table 1. The mathematical model of the SV was developed from first principles using standard notations rather than physical parameters, which are substituted later at the numerical simulation stage. The frame definitions are as follows. EF: the origin coincides with the center of gravity (CG) of the vessel at the initial position; the XE axis is directed toward the North, the YE axis toward the East, and the ZE axis points downward.
Table 1 Design parameters of the prototype surface vessel
SV part | Physical dimensions | CAD measurement
Hulls (2 Nos.) | Length | 2.540 m
 | Beam | 0.152 m
 | Draft | 0.127 m
 | Platform area | 0.774 m2
Vertical strut | Length | 1.580 m
 | Beam | 0.025 m
 | Chord | 0.075 m
 | Projected area (projected through surge) | 0.035 m2
 | Projected area (projected through sway) | 0.119 m2
Gertler body | Length | 1.870 m
 | Greatest diameter | 0.443 m
Propellers (2 Nos.) | Blade diameter | 0.100 m
 | Distance from the CG (surge direction) | 0.850 m
Surface vessel | Total mass (m) | 505.12 kg
 | Yaw inertia (Iz) | 559.23 kg m2
BF: the origin is fixed at the CG of the vessel; XB is directed toward the sway and YB toward the surge. The vertical (heave), pitch, and roll motions of the SV are neglected, as the loss in accuracy under typical and slightly severe maneuvers is very small. Therefore, the SV mathematical representation is limited to a three degrees of freedom (3DoF) system. The configuration vector of the SV in BF with respect to (w.r.t.) EF can be written as
η(t) = [x y φ]ᵀ; t ≥ 0 (19)
where x = x(t) and y = y(t) represent the linear displacements in the surge (XE) and sway (YE) directions, and φ = φ(t) is the yaw about ZE. By defining the SV velocities u = u(t), v = v(t), and ω = ω(t) in the XB and YB directions and the rotation about ZB, respectively, the velocity vector of the SV is given by (20).
V(t) = [u v ω]ᵀ; t ≥ 0 (20)
The rotation matrix (from BF to EF) can be obtained as,
J(η) = ⎡ cos(φ)  −sin(φ)  0 ⎤
       ⎢ sin(φ)   cos(φ)  0 ⎥
       ⎣   0        0     1 ⎦    (21)
In (21), the angle of trim and the angle of roll are considered negligible and have a minimal effect on the dynamics of the SV. The relationship between (19) and (20) can be further derived using (21) to describe the kinematics of the SV as,
η̇(t) = J(η)V(t) (22)
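A minimal numerical sketch of the kinematic relation (21)–(22) is given below: body-frame velocities are mapped through J(η) to Earth-frame rates and integrated with a simple Euler step; the velocity values and the step size are arbitrary.

```python
# Kinematic update eta_dot = J(eta) V, integrated with a forward Euler step.
import numpy as np

def J(phi):
    """Rotation matrix from body frame to Earth frame, Eq. (21)."""
    return np.array([[np.cos(phi), -np.sin(phi), 0.0],
                     [np.sin(phi),  np.cos(phi), 0.0],
                     [0.0,          0.0,         1.0]])

eta = np.array([0.0, 0.0, 0.0])     # [x, y, phi]
V = np.array([1.0, 0.1, 0.05])      # [u, v, omega], arbitrary values
dt = 0.1
for _ in range(10):
    eta = eta + dt * J(eta[2]) @ V  # Eq. (22)
print(eta)
```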
By applying the Newton–Euler equations to the motion of the SV (Fig. 2), the general equations of motion along the directions of BF that describe the dynamics of the SV are obtained as,
X = m(u̇ − vω − yG ω̇ − xG ω²)
Fig. 2 3D design of the prototype SV, its main components, and the coordinate frames defined for the analysis of kinematics and dynamics (<1>: two hulls, <2>: vertical strut, <3>: Gertler body, <4>: two propellers, CG: center of gravity, EF: Earth-fixed frame, B: body frame)
Y = m(v̇ − uω + xG ω̇ − yG ω²)
N = IZ ω̇ + m[xG(v̇ + uω) − yG(u̇ − vω)] (23)
where X, Y, and N are the external forces and moment acting on the vehicle, and xG and yG are the distances to the CG of the SV from the origin of the BF. Here, by placing the origin of the BF at the CG, xG → 0 and yG → 0. Hence, the above set of relations (23) is further simplified and concisely written as,
TR(t) = M V̇(t) + [C(V) + D(V)]V(t) + g0[η(t)]; t ≥ 0 (24)
where M is the positive definite mass-inertia matrix, C[V(t)] ∈ R3×3 is the total matrix of Coriolis and centripetal terms, D[V(t)] ∈ R3×3 is the damping force matrix, g0[η(t)] ∈ R3 represents the vector of gravitational forces and moments, and finally, TR(t) ∈ R3 gives the input vector that represents the external forces and moments on the SV. The detailed version of the mathematical model in (24) is developed by considering the SV parameters and the marine hydrodynamics theories in [24, 26, 27]. Time differentiating (22) and substituting into (24) yields,
M η̈(t) = f[η, V] + τ[η, U] (25)
where f[η, V] = −J(η)[C(V) + D(V)]V(t) − J̇(η)V(t) and τ[η, U] = J(η(t)) · g(U). The controlled terms g(U) given by (26) are determined by the control method described in Sects. 2 and 4 under numerical simulations. Furthermore, the propeller thrust (T) and angle (δ) provide the actual output to move the SV. Once the entries of g(U) are known, (26) is then solved for the control vector U.
g(U) = [T cos(δ) T sin(δ) T sin(δ)]ᵀ (26)
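The inversion of (26) for the control inputs can be sketched as follows; only the surge and sway entries are used to recover T and δ, and the numbers are arbitrary illustrative values.

```python
# Recovering thrust T and propeller angle delta from the first two entries of g(U).
import numpy as np

g = np.array([40.0, 10.0, 10.0])   # required generalized-force entries of g(U)
T = np.hypot(g[0], g[1])           # thrust magnitude from T*cos(delta), T*sin(delta)
delta = np.arctan2(g[1], g[0])     # propeller angle
print(T, np.degrees(delta))
```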
One may refer to [18] for a detailed kinematics and dynamics analysis of the SV with all the physical properties and the marine hydrodynamic modeling.
7 Final Controller Design and Numerical Simulations
Referring back to the error dynamics described by the state space model in (16) and (17), the matrix A is selected so that the first part of (16) contributes to the PD control, while the nonlinear dynamics are handled by the second part through the SA-RBF-NN control signal. Taking Kp and Kd as the proportional and derivative gains, respectively, the PD controller gain matrix A is given as follows.
A = ⎡  0     1  ⎤
    ⎣ −Kp   −Kd ⎦
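As a quick numerical check (a sketch, not part of the original derivation), the eigenvalues of A can be inspected for the gains later used in the simulations, Kp = 1 and Kd = 7; both have negative real parts, consistent with the stated stability of the linear part.

```python
# Eigenvalues of the error-dynamics matrix A for the gains used in the simulations.
import numpy as np

Kp, Kd = 1.0, 7.0
A = np.array([[0.0, 1.0],
              [-Kp, -Kd]])
print(np.linalg.eigvals(A))   # expected: two eigenvalues with negative real parts
```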
Further, defining B = [0 1]ᵀ, C = [1 1]ᵀ, and E = [e ė]ᵀ completes the definition of the state space model of the SV control system. It can also be proved that the transfer function T(s) of the state space model of the SV is stable for KP ≥ 0 and Kd > 1 by converting the transfer function into controller canonical form [20] and using the pole placement approach. This proof can be found elsewhere [17] for the weight tuning law, and one can follow the same procedure to obtain the proofs for all other tuning laws. The control method presented requires full state feedback and acceleration measurements in both surge and sway directions. The yaw rate can be measured using a gyroscope placed near the CG, and the accelerations are measured by accelerometers. Nowadays, inertial sensor-based low-cost hardware–software systems are available with all the above measuring capabilities [28]. Further, a Kalman filter-based algorithm is used to estimate the velocities of the SV [29]. With the availability of a geographical positioning system (GPS) signal corrected to localize dead reckoning, absolute position coordinates are obtained. Furthermore, as the SA-RBF-NN controller developed here is compact, an embedded computer is capable enough to process the data and calculate the control signal to deliver the required thrust in real time. The completed mathematical model and the controller are converted to MATLAB [30] code and simulated for the eight-shape trajectory defined by (27) as the desired path (with positions xd, yd). The application proposed here is the loading and unloading of ship cargo temporarily anchored near the harbor.
xd = 2R sin(αt/2)
yd = R sin(αt) (27)
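A short sketch generating the eight-shape reference path of (27) is given below; R and α are design parameters of the desired trajectory, and the values used here are assumptions since the ones used in the simulations are not restated in this section.

```python
# Eight-shape reference trajectory of Eq. (27).
import numpy as np

R, alpha = 20.0, 0.05                      # assumed scale and rate parameters
t = np.linspace(0.0, 4 * np.pi / alpha, 500)
xd = 2 * R * np.sin(alpha * t / 2)         # x_d = 2R sin(alpha t / 2)
yd = R * np.sin(alpha * t)                 # y_d = R sin(alpha t)
print(xd[:3], yd[:3])
```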
Two SA-RBF-NN subnets are derived based on the controller described above to handle the surge and sway dynamics. The design parameters of the controller were tuned, starting from randomly selected values, to achieve the best performance in terms of stability and tracking accuracy. The initial simulation results shown in Figs. 3, 4, and 5 indicate that the SA-RBF-NN has higher position accuracy than the conventional PD controller for the controller gains Kp = 1 and Kd = 7. Further, these results show only the online training, where the dynamics change based on the desired path and desired speeds governed by the desired trajectory and hence the dynamics of the SV.
8 Conclusion and Future Works
An SA-RBF-NN is designed and applied to a real-time path tracking application in combination with the classical PD control found in control engineering. In the reported literature, RBF-NNs are found to be very efficient and compact, along with their fast learning and accurate approximation properties. Conventional RBF-NN has been
Fig. 3 Tracking of eight-shape trajectory by PD controller only and with SA-RBF-NN controller
Fig. 4 Surge directional tracking error of the eight-shape trajectory by PD controller only and with SA-RBF-NN controller
Fig. 5 Sway directional tracking error of the eight-shape trajectory by PD controller only and with SA-RBF-NN controller
presented for various applications with weight updates; here it is modified by introducing a shape change that integrates the RBF center states, standard deviations, and altitudes, so that the activation function itself is updated and adapts to the situation. Starting from the initial RBF function in (1), the authors' previous controller design approach is extended by modifying the activation function as given by (7). The proposed tuning laws are optimized and developed by considering the overall feedforward transfer function of the state space model consisting of the error dynamics, so that the control signal is convergent, and the controller design parameters are selected accordingly. A short tracking path of the eight-shape curve is selected, and numerical simulations are carried out for the dynamic model of the prototype SV designed in 3D with all necessary components. The two propellers and the propeller angle (both propellers together) are controlled by the control signal delivered by two neural subnets developed to handle the longitudinal (surge) and lateral (sway) dynamics. The results of the numerical simulations show that the desired trajectory is accurately tracked by the newly developed SA-RBF-NN controller combined with the PD controller, compared to the PD controller alone, both with full state feedback sensors. Therefore, the SA-RBF-NN controller can be proposed to control such nonlinear systems, especially when run on low-cost embedded hardware, as the controller is compact and needs less computing power compared to most other NN-based controllers today. In an actual situation, this type of SV comes across many other challenges such as obstacle avoidance, navigation and mapping, and various application-based problems. Some of them can easily be solved by integrating LiDAR and vision-based sensing systems; however, additional computing power and energy are then necessitated. Future
works of the research include the deployment of the SA-RBF-NN + PD controller developed here to low-cost hardware such as the Jetson Nano by NVIDIA [31], which can handle image processing with sufficient speed for such applications and can run the robot operating system (ROS) [32], in the laboratory environment. Furthermore, controlling autonomous ground vehicles will also be a good candidate for this approach, especially where pitching and rolling are minimized, so that the two subnets can optimally handle the desired path tracking.
References 1. Buhmann M (2009) Radial basis functions: theory and implementations. Cambridge University Press: Cambridge, pp 1–4 2. Graupe D (2007) Principles of artificial neural networks. World Scientific, Singapore 3. Park J, Sandberg I (1991) Universal approximation using radial-basis function networks. J. Neural Comput 3(2):246–257 4. Moody J, Darken C (1989) Fast learning in networks of locally tuned processing units. J. Neural Computing. 1(2):281–294 5. Wang L, Fu X (2005) A simple rule extraction method using a compact RBF neural network. In: Advances in neural network. Springer, Heidelberg pp 682–687 6. Baughman D, Liu Y (1995) Classification: fault diagnosis and feature categorization. In: Neural networks in bioprocessing and chemical engineering. Academic Press Ltd., California. pp 110–171 7. David VK, Rajasekaran S (2009) Pattern recognition using neural networks and functional networks. Springer, Heidelberg 8. Wu J (2012) Prediction of rainfall time series using modular RBF neural network model coupled with SSA and PLS. In: Asian conference on intelligent information and database systems. Kaohsiung, Taiwan (2012) 9. Saastamoninen A, Lehtokangas M, Varri A, Saarinen J (2001) Biomedical applications of radial basis function networks. In: radial basis function networks. vol 67. Springer, pp 215–268 10. Halali M, Azari M, Arabloo M, Mohammadi A, Bahadori A (2016) Application of a radial basis function neural network to estimate pressure gradient in water–oil pipelines. J. Taiwan Inst Chem Eng 58:189–202 (Elsevier2016) 11. Wang P (2017) The application of radial basis function neural network for mechanical fault diagnosis of gear box. In: IOP conference series: materials science and engineering. Tianjin, China 12. Liu J (2010) Adaptive RBF neural network control of robot with actuator nonlinearities. J. Control Theory Appl 8(2):249–256 13. Chaudhuri A (2012) Forecasting financial time series using multiple regression, multilayer perception, radial basis function and adaptive neuro fuzzy inference system models: a comparative analysis. J Comput Inf Sci 5:13–24 14. Sisil K, Tsu-Tian L (2006) Neuroadaptive combined lateral and longitudinal control of highway vehicles using RBF networks. IEEE Trans Intell Transp Syst 17(4):500–512 15. Marino R (1997) Adaptive control of nonlinear systems: basic results and application. J Annu Rev Control 21:55–66 16. Howlet R, Jain L (2010) Radial basis function networks 1: recent developments in theory and applications. Springer, Heidelberg 17. Kumara KJC, Sisil K Intelligent control of vehicles for “ITS for the sea” applications. In: IEEE third international conference on information and automation for sustainability. IEEE Press, Melbourne, pp 141–145
18. Kumara KJC (2007) Modelling and controlling of a surface vessel for “ITS for the Sea” applications. Master thesis. University of Moratuwa 19. Fadali A, Visioli A (2013) Elements of nonlinear digital control systems. In: Digital control engineering. Academic Press, Amsterdam, pp 439–489 20. Ogata K (2010) Modern control engineering. Prentice Hall, Boston, pp 649–651 21. Giesl P (2007) Construction of global lyapunov functions using radial basis functions. Springer, Heidelberg, pp 109–110 22. Zhang J, Xu S, Rachid A (2001) Automatic path tracking control of vehicle based on Lyapunov approach. In: IEEE international conference on intelligent transportation systems. IEEE Press, Oakland 23. Vidyasagar M (1993) Nonlinear systems analysis. Prentice-Hall, Englewood Cliffs 24. Bishop B (2004) Design and control of platoons of cooperating autonomous surface vessels. In: 7th Annual maritime transportation system research and technology coordination conference 25. Caccia M (2006) Autonomous surface craft: prototypes and basic research issue. In: 14th Mediterranean conference on control and automation 26. Vanzweieten T (2003) Dynamic simulation and control of an autonomous vessel. Master thesis. Florida Atlantic University, Florida 27. Newman J (1977) Marine hydrodynamics. MIT Press, London 28. Sukkarieh S (2000) Low cost, high integrity, aided inertial navigation system. Ph.D. thesis. University of Sydney, Sydney 29. An intro to Kalman filters for autonomous vehicle. https://towardsdatascience.com/an-introto-kalman-filters-for-autonomous-vehicles 30. MATLAB. https://www.mathworks.com/products/matlab.html 31. NVIDIA Jetson nano developer kit. https://developer.nvidia.com/embedded/jetson-nano-dev eloper-kit 32. ROS: robot operating system. www.ros.org
Electricity Load Forecasting Using Optimized Artificial Neural Network M. H. M. R. Shyamali Dilhani, N. M. Wagarachchi, and Kudabadu J. C. Kumara
Abstract Electric load forecasting has become one of the most critical factors for the economic operation of power systems due to the rapid increase in daily energy demand in the world. Electricity usage is higher than that of the other energy sources in Sri Lanka according to the Generation Expansion Plan—2016 of the Ceylon Electricity Board, Sri Lanka. Moreover, forecasting is a hard challenge due to the complex nature of consumption. In this research, long-term electric load forecasting based on optimized artificial neural networks (OANNs) is implemented using particle swarm optimization (PSO), and the results are compared with a regression model. Results are validated using data collected from Central Bank annual reports for thirteen years, from 2004 to 2016. The choice of the inputs for the ANN, OANN, and regression models depends on the values obtained through the correlation matrix. The training data sets used in the proposed work are scaled between 0 and 1, obtained by dividing the entire data set by its largest value. The experimental results show that the OANN has better forecasting accuracy compared to the ANN and the regression model. The forecasting accuracy of each model is measured by the mean absolute percentage error (MAPE).
Keywords Back propagation · Electricity load forecasting · Neural network · Particle swarm optimization · System-type architecture
M. H. M. R. Shyamali Dilhani (B) · N. M. Wagarachchi Department of Interdisciplinary Studies, University of Ruhuna, Hapugala, Galle, Sri Lanka e-mail: [email protected] N. M. Wagarachchi e-mail: [email protected] K. J. C. Kumara Department of Mechanical and Manufacturing Engineering, Faculty of Engineering, University of Ruhuna, Hapugala, Galle, Sri Lanka e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_48
1 Introduction
Predicting future electricity consumption is vital for utility planning. The process is difficult due to the complex load patterns. Electricity forecasting is the basic planning process followed in the electric power systems industry for a particular area over different time horizons [1, 2]. Accurate electricity forecasting leads to reduced operation and maintenance costs. It increases the reliability of the power supply and delivery system, which helps to obtain reliable decisions for future development. At present, utilities have a growing interest in smart grid implementation. Therefore, electricity forecasting has a great impact on storage maintenance, demand-side management, scheduling, and renewable energy integration. Forecasting also helps the user to understand the relationship between consumption and its price variations in detail [3]. Forecasting is broadly classified into long-term, medium-term, and short-term forecasting. Long-term forecasting is used to plan plant capacity, medium-term forecasting is used for plant maintenance scheduling, and short-term forecasting is used for daily operations. The proposed research work focuses on long-term forecasting. Long-term forecasting is common in the planning and operation of electric utilities [4].
The main purpose of this research is to investigate how an optimized neural network influences the performance of long-term electricity forecasting in Sri Lanka in comparison with the regression and BP ANN models. Thirteen years of historical electricity data, taken from the Central Bank annual reports of Sri Lanka, are considered for this experiment. Typical monthly and annual load demand patterns from 2004 to 2016, depicted in Fig. 1, clearly show that the demand has increased continuously over the past 12-year period. The rest of the paper is organized as follows. The next section describes numerous electricity forecasting models reported in the literature. The PSO-optimized ANN model proposed by this paper is discussed in Sect. 3. Section 4 presents the forecasting performance measures, followed by the results and discussion and the concluding remarks of the research in Sects. 5 and 6, respectively.
2 Electricity Forecasting Models

2.1 Artificial Neural Networks

The neural network model is developed based on the structure of the human brain, where the nerves are considered as neurons that process the given input. Figure 2 depicts an illustrative representation of a neural network and its neurons. Generally, it has three layers, namely the input layer, the hidden layer, and the output layer. These layers are interconnected through weights. The hidden layer connects the input and output layers, and the weights are adjusted to reduce the error between the layers. The weights are modified in the neural network architecture based on the learning rules. Typically, the initial weights are chosen randomly and are then adjusted by comparing the output error. The mathematical representation of the input of the neuron is X (where X = [x_1, x_2, ..., x_n]) and the output is y. W = [w_1, w_2, ..., w_n] represents the synaptic weights and b is the bias. The neuron output is given by (1).

y = \sum_{i=1}^{n} w_i x_i + b    (1)
The estimation of these parameters with the minimum error criterion is called the training of the network. Back propagation [16] is a widely used training model for neural networks, and various research works and applications have evolved using this back propagation algorithm [17–19]. It updates the weights and biases until a zero training error or a predetermined number of epochs is reached. The weights are repeatedly changed based on the error function obtained between the actual output and the desired output values. The weight correction at iteration k of the algorithm is given as

w_i^{k+1} = w_i^k + \alpha_i^k g_i^k    (2)
Fig. 1 Monthly load demands in Sri Lanka
Fig. 2 Neuron structure of artificial neural networks
where w_i^k is the current set of weights and biases, g_i^k is the current gradient, which is based on the given error, and \alpha_i^k is the learning rate.
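As a rough illustration of Eqs. (1) and (2), the update for a single linear neuron can be sketched in a few lines of Python/NumPy. The function name, the squared-error gradient, and the treatment of the bias as an extra weight are assumptions made for this example only, not the implementation used in the study.

```python
import numpy as np

def bp_weight_update(w, x, target, alpha=0.01):
    """One back propagation step for a single linear neuron (Eqs. 1 and 2).

    w      : current weight vector; the last element is treated as the bias b
    x      : input vector x1..xn (so len(w) == len(x) + 1)
    target : desired output for this sample
    alpha  : learning rate
    """
    x_b = np.append(x, 1.0)            # append 1 so the bias is learned like a weight
    y = np.dot(w, x_b)                 # neuron output, Eq. (1)
    error = y - target                 # output error
    gradient = error * x_b             # gradient of 0.5 * (y - target)^2 w.r.t. w
    # descent step; the sign matches Eq. (2) when g is taken as the descent direction
    return w - alpha * gradient
```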
2.2 Particle Swarm Optimization (PSO)

Parameter optimization is a major problem when ANNs are used in electricity demand forecasting. Various methods have evolved for parameter optimization, among which particle swarm optimization [20, 21] is the most familiar. The first algorithm was developed based on observations of fish swarms and bird flocks. PSO searches a multi-dimensional space with a swarm of particles and, based on the particle movements, provides a globally optimum solution. PSO combines local and global best information to handle optimization issues and is straightforward to implement. PSO has also been applied successfully to least squares estimation [22], multiple regression models [15], neural network training [14, 23–26], and support vector machines (SVM) [27]. The generic steps of PSO are mathematically interpreted as follows. Each particle is initialized in the search space with a random position and velocity. The position and the velocity of each particle at generation i are given by the vectors x_i = (x_{i1}, x_{i2}, ..., x_{id}) and v_i = (v_{i1}, v_{i2}, ..., v_{id}), respectively, where d is the dimension of the search space. Based on the fitness function of the particles, two kinds of memories are maintained in the particle swarm optimization process. The fitness value is the mean squared error (MSE) between the target and actual data series. After calculating the fitness value of all the particles, PSO updates the personal best (pbest) and global best (gbest), where pbest is the best personal value found so far by each particle and gbest is the best global value for the entire swarm. Using pbest and gbest, the velocities and positions of the population are updated according to (3) and (4).

V_k(i + 1) = w \cdot V_k(i) + n_1 r_1 (pbest - x_k(i)) + n_2 r_2 (gbest - x_k(i))    (3)

x_k(i + 1) = x_k(i) + V_k(i + 1)    (4)
where V_k(i) and x_k(i) are the velocity and position of particle k at the ith iteration and w is the inertia weight. r_1 and r_2 are random values between 0 and 1, and the predetermined learning factors are given as n_1 and n_2.
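Equations (3) and (4) translate almost directly into code. The sketch below is a generic PSO step in Python/NumPy; the default values w = 0.5 and n1 = n2 = 2.0 are common choices and are assumptions here, not necessarily those used by the authors.

```python
import numpy as np

def pso_step(positions, velocities, pbest, gbest, w=0.5, n1=2.0, n2=2.0):
    """One PSO iteration implementing Eqs. (3) and (4).

    positions, velocities : arrays of shape (num_particles, dim)
    pbest                 : personal best positions, same shape as positions
    gbest                 : global best position, shape (dim,)
    """
    r1 = np.random.rand(*positions.shape)        # random factors in [0, 1]
    r2 = np.random.rand(*positions.shape)
    velocities = (w * velocities
                  + n1 * r1 * (pbest - positions)   # cognitive (personal-best) term
                  + n2 * r2 * (gbest - positions))  # social (global-best) term
    positions = positions + velocities              # Eq. (4)
    return positions, velocities
```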
3 Proposed Techniques

In this proposed work, three models are employed to solve the problem of long-term electricity demand forecasting:

1. Forecasting model using back propagation ANN.
2. Forecasting model using ANN optimized by PSO.
3. Forecasting model using linear regression.

The first model uses BP to train the weights of the ANN, while the second model uses PSO to optimize the weights of the ANN. The third model uses a statistical model, linear regression, to forecast long-term electricity demand. The results are obtained using real historical monthly data from Central Bank reports, Sri Lanka. These methods are explained in detail in the following subsections.
3.1 Forecasting Model Using Back Propagation ANN

In this model, forecasts are prepared for each month of the year 2016. Five inputs are used: population ('000), GDP (per capita in US$), energy sales (GWh), exchange rate (US$), and historical loads (MW). The monthly GDP data are collected for thirteen years, from 2004 to 2016. The correlation matrix is used to justify the choice of inputs. Figure 3 depicts that the selected five factors are highly correlated with each other, and Table 1 summarizes the results from the correlation matrix. For example, the correlation coefficients between the historical annual load and the population, GDP, energy sales, and exchange rate are 0.917, 0.972, 0.999, and 0.953, respectively. All the training data in this process are scaled to be between 0 and 1; to achieve this, each data set is divided by its largest singular value. The proposed model is described by the following equation, which relates the inputs and output of the ANN.

F(m) = a_1 L(m - 1) + a_2 Pop(m - 1) + a_3 GDP(m - 1) + a_4 ES(m - 1) + a_5 ER(m - 1)    (5)

Fig. 3 Correlation matrix of all the input factors
Table 1 Correlation coefficients between the input factors

                            Annual load     Midyear       GDP/per     Energy    Exchange
                            values (MW)     population    capita      sales     rate (avg.)
Annual load values (MW)     1
Population ('000)           0.917           1
GDP/per capita (US$)        0.972           0.900         1
Energy sales (GWh)          0.999           0.916         0.969       1
Exchange rate (avg. US$)    0.953           0.816         0.928       0.956     1
The load of the forecasting month, F(m), is calculated using the load of the previous month, L(m - 1), the population in the previous month, Pop(m - 1), the per capita GDP in the previous month, GDP(m - 1), the energy sales in the previous month, ES(m - 1), and the exchange rate in the previous month, ER(m - 1). In this model, the ANNs are trained with 12 years of historical data (from 2004 to 2015) and designed to predict the total load for each month of the year 2016. In this process, the target set is loaded with data from 2004 to 2015, which has 144 values, and the input set is loaded with 144 × 5 elements: the historical load values, the monthly population, the per capita GDP, the historical energy sales, and the exchange rate in Sri Lanka for the specified period. The monthly population of Sri Lanka is obtained by dividing the annual population increase by 12, assuming uniform population growth. A prior analysis and simulations were carried out for different ANN structures with different training functions by varying the number of hidden neurons. By varying the biases and weight functions in each series, various results were obtained for the same structure. The conjugate gradient BP training function (traincgb) performs better compared to the other training functions. Also, the minimum forecasting error was obtained from the three-layer topology with five hidden neurons in the first and second layers and a single neuron in the last layer. Therefore, the later experiment is performed with the structure identified through this secondary experiment. The BP training algorithm is used with 1000 epochs to find the optimum weight values of the network.
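A minimal sketch of this data preparation step is given below, assuming the five monthly series are available as NumPy arrays. The function name and the simple division-by-maximum scaling follow the description above; dropping the first month to form the lagged pairs of Eq. (5) is a simplification for the example (the paper keeps 144 target values).

```python
import numpy as np

def prepare_training_data(load, pop, gdp, sales, rate):
    """Build the lagged input matrix of Eq. (5) and scale every series to [0, 1].

    Each argument is a 1-D array of 144 monthly values (2004-2015).
    Returns X with shape (143, 5) and the target vector y with shape (143,).
    """
    series = [load, pop, gdp, sales, rate]
    # scale each series by its largest value so all inputs lie between 0 and 1
    scaled = [np.asarray(s, dtype=float) / np.max(s) for s in series]
    X = np.column_stack([s[:-1] for s in scaled])   # factor values of month m-1
    y = scaled[0][1:]                               # load of month m (the target)
    return X, y
```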
3.2 Forecasting Model Using ANN Optimized by PSO

In particle swarm optimization, each particle is represented by a set of weights that defines the relationships between the neurons. The mean square error between the outputs and the targets of the given series is the fitness function of every particle. The error function can be reduced by updating the weights repeatedly. Once the fitness values are calculated, the pbest and gbest values are updated in the process, which identifies the most effective set of weights in the entire swarm. The process of the ANN optimized by PSO is summarized as follows.

Step 1 Sample data are scaled to be between 0 and 1.

Step 2 All the variables are randomly initialized, and the velocity and the position of each particle are updated. In the process, random values between 0 and 1 are assigned to r_1 and r_2, and the learning factor and the inertia weight are fixed at 0.5 and 2, respectively. The maximum number of iterations is 100, and the fitness value is calculated using the MSE. This step also places the weights and biases of each particle. The total number of weights and biases for the proposed model is 36: 30 weights and 6 biases.

Step 3 Calculate the MSE using the following equation

MSE = \frac{1}{n} \sum_{m=1}^{n} (L_m - F_m)^2    (6)
where the number of training samples is n, L_m is the actual load value at the mth sample, and F_m is the output load at the mth sample. The fitness of each particle is defined by (6). If the new position is better than pbest, pbest is replaced by the new position; otherwise, it does not change. The same rule updates gbest.

Step 4 According to Eqs. (3) and (4), the position and velocity of each particle are updated.

Step 5 If the stopping conditions are met, the algorithm terminates. Otherwise, the process repeats from Step 3.

Step 6 Take the optimum set of parameters from PSO and put it into the ANN to forecast the monthly electricity demand.
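The fitness evaluation of Step 3 can be sketched as follows, assuming the 36 particle values are decoded into a 5-5-1 network (a 5 × 5 weight matrix and 5 biases for the hidden layer, plus 5 weights and 1 bias for the output neuron). The sigmoid hidden layer and linear output neuron are assumptions made for the example; the paper does not state the activation functions explicitly.

```python
import numpy as np

def fitness(particle, X, y):
    """MSE fitness (Eq. 6) of one particle encoding a 5-5-1 feed-forward network.

    particle : vector of 36 values = 30 weights + 6 biases
               (W1: 5x5, b1: 5, W2: 5x1, b2: 1)
    X        : scaled input matrix, shape (samples, 5)
    y        : scaled target loads, shape (samples,)
    """
    W1 = particle[:25].reshape(5, 5)
    b1 = particle[25:30]
    W2 = particle[30:35].reshape(5, 1)
    b2 = particle[35]
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))   # sigmoid hidden layer (assumed)
    output = (hidden @ W2).ravel() + b2             # linear output neuron (assumed)
    return np.mean((y - output) ** 2)               # Eq. (6)
```

In a full run, this fitness would be evaluated for every particle at every PSO iteration, and the particle achieving the lowest MSE would supply the final network weights (Step 6).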
3.3 Forecasting Model Using Linear Regression

The linear regression model is also considered for forecasting the monthly load of the year 2016. The regression model is a statistical technique, and most researchers use it because of the ease of model implementation. The same factors used in the ANN model are used here: the forecasted load is the dependent variable, and the other factors are the independent variables. The mathematical representation of this model can be summarized as follows.

F(m) = C_0 + C_1 L(m - 1) + C_2 Pop(m - 1) + C_3 GDP(m - 1) + C_4 ES(m - 1) + C_5 ER(m - 1)    (7)
where F(m) is the forecasted load of month m and C_i, i = 0, ..., 5, are the estimated coefficients of the load forecasting model. This model is applied to the data sets to obtain the monthly load values.
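For completeness, the coefficients of Eq. (7) can be estimated by ordinary least squares, for example as in the following sketch (function names are illustrative; any statistics package would give the same result).

```python
import numpy as np

def fit_regression(X, y):
    """Estimate C0..C5 of Eq. (7) by ordinary least squares.

    X : matrix of lagged factor values, shape (samples, 5)
    y : observed loads, shape (samples,)
    """
    X1 = np.column_stack([np.ones(len(X)), X])       # prepend a column of ones for C0
    coeffs, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coeffs                                     # [C0, C1, ..., C5]

def predict_regression(coeffs, x_prev):
    """Forecast F(m) from the previous month's factor values x_prev (length 5)."""
    return coeffs[0] + np.dot(coeffs[1:], x_prev)
```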
4 Forecasting Performance

The absolute percentage error (APE) and mean absolute percentage error (MAPE) are used to calculate the accuracy of the forecasting models and are given as

APE = \frac{|L_m - F_m|}{L_m} \times 100    (8)

MAPE = \frac{1}{n} \sum_{m=1}^{n} APE_m    (9)
where n is the total number of months, L_m represents the actual load demand at month m, and F_m represents the forecasted load demand at month m.
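These two error measures are straightforward to compute; a small sketch is shown below (function names are illustrative).

```python
import numpy as np

def ape(actual, forecast):
    """Absolute percentage error per month, Eq. (8)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.abs(actual - forecast) / actual * 100.0

def mape(actual, forecast):
    """Mean absolute percentage error over all months, Eq. (9)."""
    return ape(actual, forecast).mean()
```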
5 Experimental Results and Discussion

In this section, the MAPE results of the ANN, OANN, and regression model simulations are summarized. According to the MAPEs in Table 2, it is observed that the OANN attains the best performance in annual electricity load demand forecasting compared to the ANN with back propagation and the linear regression model. All the forecasting models have five input variables, including the historical load demand, to forecast the monthly load of the year 2016. The optimized neural network, whose weights are optimized using particle swarm optimization, performs better than the other two models; its average monthly forecasting error is 1.836. The second lowest average forecasting error is given by the neural network, with an average monthly forecasting error of 2.502, while the regression model shows an average forecasting error of 2.949. All three models have their highest forecasting error in April (3.611, 4.665, and 4.976, respectively), whereas the minimum forecasting errors are obtained in December (0.556, 1.452, and 1.753, respectively). Moreover, the optimized neural network shows over 2 percent forecasting error only in February (2.613), May (2.304), August (2.240), and November (2.142), whereas the ANN model and the regression model have more than 2 percent forecasting error for all months except December. Figure 4 shows that the best forecasting results are given by the OANN model. Moreover, a paired t-test is carried out to check the model accuracy. Tables 3 and 4 show the correlation between the actual load demand and the load demand forecasted
Table 2 Monthly and annual APE and MAPE values for the year 2016

Forecast month    APE
                  OANN     ANN      Regression
January           1.007    2.395    3.227
February          2.613    2.924    3.807
March             1.354    2.005    2.108
April             3.611    4.665    4.976
May               2.304    2.559    2.678
June              1.366    2.287    2.641
July              1.929    2.074    2.322
August            2.24     2.323    2.562
September         1.704    2.31     2.809
October           1.212    2.568    3.725
November          2.142    2.463    2.785
December          0.556    1.452    1.753
MAPE              1.836    2.502    2.949

The monthly absolute percentage error (APE) for the three models, the optimized artificial neural network, the artificial neural network, and the regression model, appears in each column of the table. The last row of the table shows the annual mean absolute percentage error (MAPE) for the three models
Fig. 4 Actual and forecasted loads for time series model

Table 3 Paired samples correlations

                               N     Correlation    Sig.
Pair 1    Actual and OANN      12    0.982          0.000
Pair 2    Actual and ANN       12    0.986          0.000
Pair 3    Actual and LR        12    0.983          0.000
Table 4 Paired samples test

                           Paired differences
                           Mean     Std.        Std. error    95% confidence interval       t        Sig.
                                    deviation   mean          of the difference                      (2-tailed)
                                                              Lower         Upper
Pair 1    Actual - OANN    21.67    9.34        2.69          15.74         27.61           8.03     0.000
Pair 2    Actual - ANN     30.05    8.65        2.49          24.55         35.55           12.03    0.000
Pair 3    Actual - LR      34.84    9.68        2.79          28.68         40.99           12.46    0.000
by the OANN, ANN, and regression models. The actual and forecasted loads are highly correlated, and the pairs are significant with a probability value of 0.000.
6 Conclusion

A technique based on PSO and ANN was proposed in this research to forecast the monthly load demand of the Sri Lankan network. The results of the numerical simulations show that the OANN model, together with the five input factors, reduces the forecasting error significantly. The correlation matrix is used to justify the choice of inputs, and the selected factors have high correlations with each other. In the data preparation process, all the training data are uniformly scaled to be between 0 and 1. The weights and biases of the ANN model are optimized using the PSO and BP training algorithms, and the regression model is considered to check the model adequacy. Although the ANN and regression models provide relatively good results, they are still not as accurate as the OANN model. The OANN performs well because of its ability to deal with the nonlinearity of the model; as such, it overcomes many drawbacks of time series models in the case presented and tested here. According to the paired t-test results, it can also be concluded that all the techniques are quite promising and relevant for long-term forecasting.
References

1. Beaty HW (2001) Handbook of electric power calculations. McGraw-Hill
2. Singh AK, Ibraheem SK, Muazzam M, Chaturvedi DK (2013) An overview of electricity demand forecasting techniques. In: National conference on emerging trends in electrical, instrumentation & communication engineering 2013, Network and complex systems, pp 38–48
3. Chakhchoukh Y, Panciatici P, Mili L (2011) Electric load forecasting based on statistical robust methods. IEEE Trans Power Syst 26(3):982–991
4. Chow JH, Wu FF, Momoh JA (2005) Applied mathematics for restructured electric power systems. In: Applied mathematics for restructured electric power systems, Springer, pp 1–9
5. Starke M, Alkadi N, Ma O (2013) Assessment of industrial load for demand response across US regions of the western interconnect. Oak Ridge National Lab. (ORNL), Oak Ridge, TN, US
6. Ghanbari A et al (2009) Artificial neural networks and regression approaches comparison for forecasting Iran's annual electricity load. In: International conference on power engineering, energy and electrical drives, 2009, POWERENG'09. IEEE
7. Abd-El-Monem H (2008) Artificial intelligence applications for load forecasting
8. Zhang F, Cao J, Xu Z (2013) An improved particle swarm optimization particle filtering algorithm. In: 2013 International conference on communications, circuits and systems (ICCCAS). IEEE
9. Jiang Y et al (2007) An improved particle swarm optimization algorithm. Appl Math Comput 193(1):231–239
10. Samuel GG, Rajan CCA (2015) Hybrid: particle swarm optimization genetic algorithm and particle swarm optimization shuffled frog leaping algorithm for long-term generator maintenance scheduling. Int J Electr Power Energy Syst 65:432–442
11. Chunxia F, Youhong W (2008) An adaptive simple particle swarm optimization algorithm. In: Control and decision conference, 2008. CCDC 2008. Chinese. IEEE
12. Subbaraj P, Rajasekaran V (2008) Evolutionary techniques based combined artificial neural networks for peak load forecasting. World Acad Sci Eng Technol 45:680–686
13. Daş GLS (2017) Forecasting the energy demand of Turkey with a NN based on an improved particle swarm optimization. Neural Comput Appl 28(1):539–549
14. Jeenanunta C, Abeyrathn KD (2017) Combine particle swarm optimization with artificial neural networks for short-term load forecasting. ISJET 8:25
15. Hafez AA, Elsherbiny MK (2016) Particle swarm optimization for long-term demand forecasting. In: Power systems conference (MEPCON), 2016 eighteenth international middle east. IEEE
16. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533
17. Mazzoni P, Andersen RA, Jordan MI (1991) A more biologically plausible learning rule for neural networks. Proc Natl Acad Sci 88(10):4433–4437
18. Dilhani MS, Jeenanunta C (2017) Effect of neural network structure for daily electricity load forecasting. In: Engineering research conference (MERCon), 2017 Moratuwa. IEEE
19. Samarasinghe S (2016) Neural networks for applied sciences and engineering: from fundamentals to complex pattern recognition. Auerbach Publications
20. Kennedy J, Eberhart RC (1997) A discrete binary version of the particle swarm algorithm. In: 1997 IEEE international conference on systems, man, and cybernetics. Computational cybernetics and simulation. IEEE
21. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of the sixth international symposium on micro machine and human science, 1995. MHS'95. IEEE
22. AlRashidi M, El-Naggar K (2010) Long term electric load forecasting based on particle swarm optimization. Appl Energy 87(1):320–326
23. Meissner M, Schmuker M, Schneider G (2006) Optimized particle swarm optimization (OPSO) and its application to artificial neural network training. BMC Bioinf 7(1):125
24. Freitag S, Muhanna RL, Graf W (2012) A particle swarm optimization approach for training artificial neural networks with uncertain data. In: Proceedings of the 5th international conference on reliable engineering computing, Litera, Brno
25. Tripathy AK et al (2011) Weather forecasting using ANN and PSO. Int J Sci Eng Res 2:1–5
26. Shayeghi H, Shayanfar H, Azimi G (2009) STLF based on optimized neural network using PSO. Int J Electr Comput Eng 4(10):1190–1199
27. Sarhani M, El Afia A (2015) Electric load forecasting using hybrid machine learning approach incorporating feature selection. In: BDCA
Object Detection in Surveillance Using Deep Learning Methods: A Comparative Analysis

Dharmender Saini, Narina Thakur, Rachna Jain, Preeti Nagrath, Hemanth Jude, and Nitika Sharma
Abstract Unmanned aerial vehicle (UAV) technology has revolutionized the field globally in today's scenario. UAV technologies enable activities to be efficiently monitored, identified, and analyzed. The principal constraints of the present surveillance system, along with closed-circuit television (CCTV) cameras, are the limited surveillance coverage area and high latency in object detection. Deep learning embedded with UAVs has been found to be effective in the tracking and monitoring of objects, thus overcoming the constraints mentioned above. Dynamic surveillance systems in the current scenario seek high-speed streaming, and object detection in real-time visual data within a reasonable time delay has become a challenge. The paper draws a comprehensive analysis of object detection deep learning architectures by classifying the research based on architecture, techniques, applications, and datasets. It has been found that RetinaNet is highly accurate while YOLOv3 is fast.

Keywords Object detection · Convolutional neural network · Surveillance
1 Introduction

A smart city is fragmented without the incorporation of an effective surveillance system (such as UAVs) based on the Internet of things (IoT) that provides an impeccable analysis of captured videos and images [1]. A strenuous examination is required to deploy smart autonomous surveillance systems, which becomes challenging with long-established object detection methods built on slender trainable architectures. Thus, IoT-based monitoring can be accomplished through object detection.
D. Saini (B) · N. Thakur · R. Jain · P. Nagrath · N. Sharma Department of Computer Science Engineering, Bharati Vidyapeeth’s College of Engineering, New Delhi, India e-mail: [email protected] H. Jude Department of ECE, Karunya University, Coimbatore, Tamil Nadu, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_49
For the detection and monitoring of objects in real time, high-level, large-scale data is retrieved, localized, and classified. Object detection [2], hence, supplies important facts about the acknowledged visual data for logical understanding. The detection and tracking of objects are ubiquitous and find their place in many prevalent applications that include surveillance through human behavior analysis [3], driverless cars, medical diagnosis [4], pose estimation, handwriting estimation, visual object detection, large-scale object detection, and traffic surveillance [5–7]. The field of object detection holds huge potential for research into the most efficient learning algorithms. The learning models are trained and tested on labeled datasets. The applied algorithm must perform proficiently and efficiently in real-time situations, particularly in safety-critical fields. The issues experienced in the identification of objects, such as diverse lighting conditions, occlusion [8], varying viewpoints, and poses, have opened a fresh research window for building systems that can effectively perform object identification and localization tasks. Therefore, the task for state-of-the-art research is not just restricted to detection and tracking but also to meeting the abovementioned challenges.

The full paper is organized in the following manner. Section 2 discusses the related research work on object detection. Section 3 elaborates on the existing architectures of object detection and draws a performance comparison. Section 4 discusses challenges to object detection. Section 5 concludes the work along with the future scope of the research.
2 A Brief Overview of Deep Learning for Object Detection

The most descriptive deep learning model is the CNN [9], also termed ConvNet. It is a non-recurrent, feed-forward artificial neural network technique employed to recognize visual data. Some of the traditional CNN architectures include LeNet-5, ZFNet [10], VGG16 [11], and AlexNet [12], while modern approaches include Inception, ResNet, and DenseNet. Convolution is a mathematical operation referring to a sum of products, and successive convolutional layers are subsets of previous layers interleaved with average pooling or maximum pooling layers, as depicted in Fig. 1. Each 3D matrix in the neural network is called a feature map. Filtering and pooling transformations are applied to these feature maps to extract robust features. LeNet-5 [13] was developed for postal services to recognize the handwritten digits of zip codes. AlexNet [12] is another architecture that uses large kernel-sized filters, 11 × 11 in the first convolutional layer and 5 × 5 in the second layer; the architecture is trained on the ImageNet dataset. These large non-uniform filters are replaced in the VGG16 architecture, which has 3 × 3 uniform filters through 16 convolutional layers [11]. The VGG16 architecture has also been trained on ImageNet, and the model performance is very high, with an accuracy value of 92.7%. A modern CNN-based technique, the Inception architecture developed by Google, also referred to as GoogLeNet, contains Inception
Fig. 1 Convolutional neural network architecture
Table 1 Description of traditional convolutional neural network architectures

Paper                      Architectures            Description
LeCun (1998) [13]          LeNet                    Deep learning-based CNN architecture is presented for handwriting recognition
Krizhevsky (2012) [12]     AlexNet                  Large kernel 11 × 11 sized filter convolution layers are used
Zeiler (2013) [10]         ZFNet                    Filter size is reduced to 7 × 7 to observe features in the pixel domain and reduce error rates
Simonyan (2014) [11]       VGG16                    16 uniform convolution layers used with 3 × 3 uniform filters
Kaiming (2015) [14]        ResNet                   Vanishing gradient problem addressed through a layer-skipping method
Szegedy (2015)             GoogLeNet (Inception)    Convolution of feature maps is performed on three different scales
cells that perform convolutions in series at different scales. It uses global pooling at the end of the network. Deep neural networks face the concern of vanishing gradients due to the stacking of several layers and backpropagation through the previous layers. This concern is addressed and resolved in the Residual Network (ResNet) by skipping one or more layers, thereby creating shortcut connections in the convolutional network, while DenseNet presents another approach by densely connecting each layer to the preceding layers through shortcut connections. Table 1 draws a comparison among the discussed architectures.
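To make the "sum of products" and pooling operations described above concrete, a didactic sketch of a valid 2-D convolution and non-overlapping max pooling is given below. Real CNN frameworks implement these layers very differently (vectorized, on GPUs), so this is purely illustrative.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: a sum of products over each kernel-sized window."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling that shrinks each spatial dimension by `size`."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size               # trim so the map divides evenly
    return (feature_map[:h, :w]
            .reshape(h // size, size, w // size, size)
            .max(axis=(1, 3)))
```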
3 Object Detection

Object detection follows a series of procedures, starting from region selection, where the object is located by creating bounding boxes around the detected objects. This is also referred to as region-based object detection. Following that, certain visual features of the selected objects, recognized by the SIFT [15] or HOG [16] algorithms, are extracted
Fig. 2 Types of object detection
Object detecƟon
Region based Object detecƟon
Regression based Object detecƟon
and classified to make the data hierarchical, followed by predicting the data to extract logical information using SVM [17] or K-means classifiers. The object detection architectures have been segregated based on localization techniques and classification techniques. However, deep learning techniques are based on simultaneous detection and classification referred to as regression techniques. Therefore, object detection is broadly categorized as region-based and regression-based object detection shown in Fig. 2.
3.1 Region-Based Object Detection

This section discusses various region-based object detection methods.
3.1.1 Region-Based CNN (RCNN)

RCNN uses selective search to generate exactly 2000 region proposals, followed by classification via a CNN. These region proposals are refined using regression techniques, but processing 2000 regions makes the computation slow. The architecture is shown in Fig. 3.
Fig. 3 Region-based CNN architecture [18]
3.1.2 Fast RCNN
Fast RCNN [19] is open source and implemented in Python and C++ [20]. It is more accurate, with a higher mean average precision (mAP), and nine times faster than RCNN since it uses single-stage training rather than the three-stage training used in RCNN. The training in fast RCNN updates all network layers concurrently.
3.1.3 Faster RCNN

This architecture employs a combination of a region proposal network (RPN) and the fast RCNN detector [21]. The RPN is a fully convolutional network that predicts the bounding regions and uses these proposals to detect objects and predict scores. The RPN is based on the 'attention' mechanism and shares it with fast RCNN to locate the objects. The fast RCNN detector uses the proposed regions for classification. Both the accuracy and the quality of the region proposals are improved in this method. Figure 4 depicts the faster RCNN architecture. The comparison of region-based object detection methods is given in Table 2.
Fig. 4 Faster RCNN architecture [21]

Table 2 Comparison of region-based object detection methods

Parameters               RCNN    Fast RCNN    Faster RCNN
Test time/image (sec)    50      2            0.2
Speedup                  1x      25x          250x
mAP (VOC 2007)           66.0    66.9         66.9
Fig. 5 YOLO architecture [22]
3.2 Regression-Based Object Detection

This section elaborates on regression-based object detection methods.
3.2.1 YOLO

You Only Look Once (YOLO), shown in Fig. 5, is a faster object detection method. This regression approach predicts the region proposals and class probabilities in a single evaluation by one neural network. The basic YOLO processes images in real time at 45 frames per second (fps) and streams video with a latency of less than 25 ms. The mAP obtained by this method is more than twice that of other real-time detectors [22]. It performs better than prior detection methods but lacks in the detection of small objects.
3.2.2 Single Shot Multibox Detector (SSD)

SSD is also based on the VGG-16 architecture [23], as shown in Fig. 6. It is a regression-based technique that generates the object score at the prediction stage. It is faster than YOLO, and its accuracy is comparable to faster RCNN. It has a fixed number of default bounding boxes for different aspect ratios and object categories, followed by the prediction of shape offsets and confidence scores.
Fig. 6 Single shot multibox detector (SSD) [23]
3.2.3 YOLOv2

YOLOv2 is an improved architecture over the previous YOLO version, built on the DarkNet-19 CNN model. It was further improved into an additional version, YOLO9000, proposed to detect over 9000 categories of objects. YOLOv2 achieves 76.8% mAP on PASCAL VOC 2007 at 67 fps and 78.6% mAP at 40 fps [24].
3.2.4 RetinaNet

The extreme foreground-background class imbalance found in one-stage detectors [25] is regarded as one of the major causes of performance degradation. It is resolved by RetinaNet, developed by Facebook AI Research (FAIR). Prediction is improved by using a focal loss, which lowers the loss supplied by "easy" negative samples so that training focuses on the "hard" samples. RetinaNet is proficient in feature extraction, formed with the combination of ResNet and FPN, and outperforms faster RCNN. In this architecture, feature maps are stacked pyramidically, as shown in Fig. 7.
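The focal loss mentioned above has a simple closed form; the sketch below shows its standard binary version, with the commonly used defaults gamma = 2 and alpha = 0.25 from the RetinaNet paper [25] (stated here as assumed defaults for the example).

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one-stage detectors, as introduced for RetinaNet [25].

    p     : predicted probabilities of the positive class
    y     : ground-truth labels (0 or 1)
    gamma : focusing parameter; larger values down-weight easy examples more
    alpha : class-balance weight for the positive class
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)                   # avoid log(0)
    p_t = np.where(y == 1, p, 1 - p)                 # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)
```

Because the (1 - p_t)^gamma factor is close to zero for well-classified (easy) examples, their contribution to the total loss is suppressed, which is exactly the behavior described above.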
Fig. 7 RetinaNet architecture with pyramid stacked feature map layers [25]
Fig. 8 YOLOv3 architecture [26]
3.2.5 YOLOv3

YOLOv3 [26] is built on the DarkNet-53 CNN model. This method uses logistic regression for each bounding box to predict an object score. The value is 1 if the bounding box overlaps the ground-truth object boundary more than any other bounding box does. The prediction is ignored if a bounding box that is not the best overlaps the object boundary by more than a threshold value. YOLOv3 uses independent logistic classifiers for multi-label classification. This method is three times faster than SSD and equally accurate. Unlike previous versions of YOLO, this method is also capable of detecting small objects. The YOLOv3 architecture is illustrated in Fig. 8.

Table 3 embodies the models capable of region proposal and classification of detected objects, and it also highlights the constraint of high computation time. From the table, a clear observation is drawn that the methods combining the selection of object regions and simultaneous classification are faster. Table 4 is a detailed analysis of the object detection methods, describing the backbone CNN architecture and the MS COCO, PASCAL VOC 2007, and PASCAL VOC 2012 training datasets. The mean average precision (mAP) values are also compared; the maximum is found for YOLOv3, about 57.9% when trained on the MS COCO dataset, which is also faster but less accurate than RetinaNet [26].
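The overlap used when assigning and ignoring object scores is normally measured as intersection over union (IoU); a minimal sketch of the computation for axis-aligned boxes is shown below (the (x1, y1, x2, y2) coordinate convention is an assumption of the example).

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)     # overlapping area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```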
Table 3 Object detection method with constraint review

Object detection techniques    High computation time    Region proposal detection    Regression/classification
CNN                            ✓                        ✓                            ✗
RCNN                           ✓                        ✓                            ✗
Fast RCNN                      ✓                        ✓                            ✗
Faster RCNN                    ✗                        ✓                            ✗
SSD                            ✗                        ✓                            ✓
YOLO                           ✗                        ✓                            ✓
YOLOv2                         ✗                        ✓                            ✓
RetinaNet                      ✗                        ✓                            ✓
YOLOv3                         ✗                        ✓                            ✓
4 Challenges in Object Detection

Several difficulties emerge in identifying the objects that exist in an image. This section discusses the impediments experienced while dealing with the standard datasets, which limit the achievable performance measures.

Occlusion: The major problem in object detection is occlusion [8, 27, 28]. Occlusion is the effect of one object blocking another object from view in 3D images. The CNN framework is not capable of handling occlusions. To deal with the complex occlusions in pedestrian images, the deep learning framework DeepParts is proposed [29].

Viewpoint variation: Severe distortions occur due to degree variations in the viewpoints of the image. The classification of objects becomes difficult at varied angles, and this has a direct impact on the accuracy of predicting the object [6].

Variation in poses: Variations in facial expressions and poses make it difficult for an algorithm to detect faces. To address occlusions and pose variations, a novel framework based on deep learning is proposed in [28], which collects the responses from local facial features and predicts faces.

Lighting conditions: Another big challenge in detecting objects is the lighting conditions, which may vary throughout the day. Different approaches are followed by researchers to tackle varying lighting conditions in daytime and nighttime traffic conditions [7].
Table 4 Dataset and model-based object detection method review with mean precision value

Region-based methods:

RCNN. Model: VGG16. Training datasets (mAP %): PASCAL VOC 2007 (66.0). Description: The main drawback of the CNN [18] method is the exhaustive running time involved in identifying the number of regions, which is resolved in the RCNN method by restricting the number of region proposals to 2000.

Fast RCNN. Model: VGG16. Training datasets (mAP %): PASCAL VOC 2007 + PASCAL VOC 2012 (70.0); MS COCO (20.5). Description: The fast RCNN [19] method reduces the running time further by performing convolution once, to generate a feature map.

Faster RCNN. Model: VGG16. Training datasets (mAP %): PASCAL VOC 2007 (73.2); MS COCO (18.9). Description: A region proposal network (RPN) is used by the faster RCNN [21] method to reduce the running time.

Regression-based methods:

SSD. Model: VGG16, ResNet-101. Training datasets (mAP %): PASCAL VOC 2007 + PASCAL VOC 2012 + MS COCO (81.6). Description: The single shot multibox detector (SSD) [23] is relatively simple and encapsulates object localization and classification in a single network.

YOLO. Model: Network inspired by GoogLeNet. Training datasets (mAP %): PASCAL VOC 2007 + PASCAL VOC 2012 (63.4). Description: The YOLO [22] method takes advantage of predicting class probabilities along with region proposals. The main limitation of YOLO is in detecting small objects.

YOLOv2. Model: DarkNet-19. Training datasets (mAP %): PASCAL VOC 2007 + PASCAL VOC 2012 (78.6); MS COCO (21.6). Description: YOLOv2 [24] is an improved version of YOLO in which the performance is enhanced by incorporating batch normalization and a high-resolution classifier. YOLO9000 is an extended version of YOLOv2 capable of classifying 9000 classes.

RetinaNet. Model: ResNet + FPN. Training datasets (mAP %): MS COCO (37.8). Description: RetinaNet [25] uses a feature pyramid architecture in a one-stage detector.

YOLOv3. Model: DarkNet-53. Training datasets (mAP %): MS COCO (57.9). Description: YOLOv3 [26] includes residual skip connections and upsampling features, which are missing in YOLOv2. It is capable of detecting small objects.
5 Conclusion and Future Scope

Various object detection architectures are compared based on the training datasets, and their performance measures are analyzed in this research. The comparison focused on recognizing the most appropriate methods that could be used for surveillance, which requires real-time data extraction with the least latency and maximum accuracy. The analysis shows that object detection methods performing region proposal detection and classification simultaneously reduce the computation time and are therefore faster than traditional methods. The study highlights the fact that there is always a trade-off between speed and accuracy. SSD provides maximum precision; however, with minimal latency, YOLOv3 outperforms all other object detection techniques. Using an unmanned aerial vehicle (UAV), YOLOv3 can be used to produce highly responsive smart systems for the live streaming of videos and images in a surveillance system over the Internet.

Acknowledgements This work is supported by the grant from the Department of Science and Technology, Government of India, against the CFP launched under the Interdisciplinary Cyber-Physical Systems (ICPS) Programme, DST/ICPS/CPS-Individual/2018/181(G).
References

1. Minoli D, Sohraby K, Occhiogrosso B (2017) IoT considerations, requirements, and architectures for smart buildings - energy optimization and next generation building management systems. IEEE Internet Things J 4(1):1–1
2. Schmid C, Jurie F, Fevrier L, Ferrari V (2008) Groups of adjacent contour segments for object detection. IEEE Trans Pattern Anal Mach Intell 30(1):36–51
3. Singh A, Patil D, Omkar SN (2018) Eye in the sky: real-time drone surveillance system (DSS) for violent individuals identification using scatternet hybrid deep learning network. IEEE Comput Soc Conf Comput Vis Pattern Recognit Work 2018:1710–1718
4. Jain R, Jain N, Aggarwal A, Hemanth DJ (2019) Convolutional neural network based Alzheimer's disease classification from magnetic resonance brain images. Cogn Syst Res 57:147–159
5. Hu Q, Paisitkriangkrai S, Shen C, van den Hengel A, Porikli F (2016) Fast detection of multiple objects in traffic scenes with a common detection framework. IEEE Trans Intell Transp Syst 17(4):1002–1014
6. Hayat S, Yanmaz E, Muzaffar R (2016) Survey on unmanned aerial vehicle networks for civil applications: a communications viewpoint. IEEE Commun Surv Tutorials 18(4):2624–2661
7. Tian B, Li Y, Li B, Wen D (2014) Rear-view vehicle detection and tracking by combining multiple parts for complex urban surveillance. IEEE Trans Intell Transp Syst 15(2):597–606
8. Zhang S, Wen L, Bian X, Lei Z, Li SZ (2018) Occlusion-aware R-CNN: detecting pedestrians in a crowd. In: Lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp 657–674
9. Westlake N, Cai H, Hall P (2016) Detecting people in artwork with CNNs. In: Lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol 9913 LNCS, pp 825–841
10. Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: Lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol 8689 LNCS, no PART 1, pp 818–833
11. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition, pp 1–14
12. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th international conference on neural information processing systems, vol 1, pp 1097–1105
13. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE
14. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016:770–778
15. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
16. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proceedings of the 2005 IEEE Computer Society conference on computer vision and pattern recognition, CVPR 2005
17. Kyrkou C, Theocharides T (2009) SCoPE: towards a systolic array for SVM object detection. IEEE Embed Syst Lett 1(2):46–49
18. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conf Comput Vis Pattern Recogn 2014:580–587
19. Girshick R (2015) Fast R-CNN. In: 2015 IEEE international conference on computer vision (ICCV), pp 1440–1448
20. Jia Y et al (2014) Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 2014 ACM conference on multimedia, MM 2014, pp 675–678
21. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
22. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016:779–788
23. Liu W et al (2016) SSD: single shot multibox detector. In: Lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol 9905 LNCS, pp 21–37
24. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), vol 2017, pp 6517–6525
25. Lin TY, Goyal P, Girshick R, He K, Dollar P (2017) Focal loss for dense object detection. Proc IEEE Int Conf Comput Vis 2017:2999–3007
26. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement, Apr 2018
27. Yao L, Wang B (2019) Pedestrian detection framework based on magnetic regional regression. IET Image Process
28. Yang S, Luo P, Loy CC, Tang X (2015) From facial parts responses to face detection: a deep learning approach. Proc IEEE Int Conf Comput Vis, vol 2015 Inter, no 3, pp 3676–3684
29. Mathias M, Benenson R, Timofte R, Van Gool L (2013) Handling occlusions with franken-classifiers. Proc IEEE Int Conf Comput Vis, pp 1505–1512
MaSMT4: The AGR Organizational Model-Based Multi-Agent System Development Framework for Machine Translation

Budditha Hettige, Asoka S. Karunananda, and George Rzevski
Abstract A framework is an essential tool for agent-based programming that saves a programmer's time and provides development standards. There are few multi-agent frameworks and models available for agent-based programming. The AGR organizational model is a successful agent development model that builds artificial societies through agents, and MaDKit is one of the successful frameworks that use the AGR organizational model. However, the English to Sinhala agent-based machine translation system needs a lightweight framework with very fast message-passing capabilities, and these features are not available in the existing frameworks at a sufficient level. Thus, a Java-based multi-agent system development framework has been developed using the AGR organizational model with the required modifications. This paper presents the multi-agent system development framework MaSMT, which is specially designed to handle English to Sinhala agent-based machine translation. MaSMT has been developed using the AGR organizational model and provides an infrastructure for the agents, communication methods for agents, agent status control, and a tool for agent monitoring. Different types of multi-agent systems have already been developed through the MaSMT framework, including Octopus, AgriCom, and RiceMart. The framework is freely available and can be downloaded from SourceForge.

Keywords MaSMT · Multi-agent systems · AGR organizational model
B. Hettige (B) · A. S. Karunananda · G. Rzevski Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Moratuwa, Sri Lanka e-mail: [email protected] A. S. Karunananda e-mail: [email protected] G. Rzevski e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_50
1 Introduction

Multi-agent systems (MASs) are computerized systems composed of multiple interacting intelligent agents [1]. A multi-agent system consists of two or more agents capable of communicating with each other in a shared environment. A MAS consists of four major components, namely agents, environment, ontology, and the virtual world [2]. An agent in a multi-agent system may be a computer application or an independently running process (a thread) capable of performing actions. Theoretically, agents are capable of acting independently (autonomously) and of controlling their internal states according to the requirements. Agents communicate with other agents through messages; these messages carry the information for agent activities (agents do their tasks according to the messages they have taken from others). The environment is the "outside world" with which the agent interacts; in most cases, environments are implemented within a computer. The ontology can be defined as an explicit specification of a conceptualization, and ontologies capture the structure of the domain. Thus, the capabilities of the agents in a multi-agent system are based on the ontology.

According to their activities, behavior, and existing features, agents can be categorized into different types, including the simple reflex agent, model-based reflex agent, goal-based agent, utility-based agent, and learning agent [3]. However, based on agents' behaviors, various other classifications are also available, including the collaborative agent, interface agent, mobile agent, information or Internet agent, reactive agent, hybrid agent, and smart agent. These agents show different capabilities and behaviors when working together, and that makes up the artificial society.

Note that communication among agents is necessary to allow collaboration, negotiation, and cooperation between independent entities [4]; thus, it requires well-defined, agreed, and commonly understood semantics. Agents can communicate with each other through the peer-to-peer, broadcast, or noticeboard method. In peer-to-peer agent communication, agents send messages only to known peer(s), which means that there is an agreement between the sender and the receiver. The broadcasting method sends the message(s) to all; according to this method, everyone in the group (sometimes all agents in the system) receives the message. The noticeboard method works differently: agents send messages to a shared location (the noticeboard), and if an agent requires some information, it takes that information from the noticeboard.

Multi-agent systems offer some advantages [5]. A multi-agent system consists of interconnected agents, which distribute computational resources rather than relying on central resources. A MAS provides interconnection and interoperation of multiple systems and should be capable of coordinating globally and taking distributed information from sources efficiently. However, there are some limitations: a MAS needs some time to reach solutions and is required to pass messages among agents. The parallelism of the multi-agent
system and the avoidance of unnecessary message passing can help to deal with the above limitations. Note that multi-agent systems are also used to handle the complexity of a software system and to provide intelligent solutions through the power of agent communication. Thus, the development of multi-agent systems is itself a rather complicated process, and because of this complexity, selecting a suitable framework is far more sensible than ad hoc development. In general, a multi-agent framework provides the agent infrastructure, communication, and monitoring methods for agents. In addition, common standards are available for agent development, especially for communication among agents, including FIPA-ACL [6, 7] and KQML [8]. Among others, FIPA is one of the most widely used standards for agent development.

The development of a multi-agent system from scratch is not an easy task, as it requires modeling agents, communication, and control according to some standards. Therefore, most multi-agent system developers turn to an existing framework to build their MAS solutions easily. A number of well-developed multi-agent frameworks are available for multi-agent system development, including JADE [9], MaDKit [10], PADE [11], SPADE [12], AgentBuilder [13], and ADK [14]. Further, most of the existing multi-agent development frameworks are designed to develop general-purpose, fully distributed multi-agent applications. However, these existing frameworks do not directly support the distinct requirements of agent-based English to Sinhala machine translation. The English to Sinhala machine translation system requires several natural language processing activities, including morphological processing, syntax processing, and semantic processing. A new multi-agent development framework has therefore been designed and developed incorporating the following required features: the capability to work with a large number of agents efficiently, the ability to send and receive several messages quickly, and the ability to customize agents easily for local language (Sinhala) processing requirements. The multi-agent system for machine translation (MaSMT) was released in 2016 to provide the capability for AGR-based agent development. This paper reports the latest version, MaSMT 4.0, which consists of new features including email-based message passing and the ability to work with customized behaviors.

The rest of the paper is organized as follows. Section 2 presents a summary of existing multi-agent frameworks. Section 3 comments on the AGR (Agent/Group/Role) model and the infrastructure of the agents. Section 4 presents the design of MaSMT, including the agents, communications, and monitoring features of the framework. Section 5 gives some details of the multi-agent systems developed through MaSMT. Finally, Sect. 6 concludes the paper along with the future scope.
2 Related Works

There are several multi-agent system development frameworks available for different requirements. This section briefly describes some existing multi-agent system development frameworks and their features.

The Java Agent Development Framework (JADE) [9] is a Java-based open-source software framework for MAS development. JADE provides middleware support with GUI tools for debugging and deployment. Further, JADE provides a task execution and composition model for agent modeling, and peer-to-peer agent communication is done with asynchronous message passing. In addition to the above, JADE has the following key features: it provides a FIPA-compliant distributed agent platform, it supports multiple directory facilitators (agents can change activity at run time), and messages are transferred encoded as Java objects.

MaDKit [10] is a Java-based generic multi-agent platform based on the Aalaadin conceptual model [16, 17]. This organizational model consists of groups and roles for agents to manage different agent activities. MaDKit also provides a lightweight Java library for MAS design. The architecture of MaDKit is based on three design principles: micro-kernel architecture, agentification of services, and the graphic component model. Also, MaDKit provides asynchronous message passing. Further, it can be used for designing any multi-agent application, from distributed applications to multi-agent simulations.

PADE is a free, entirely Python-based multi-agent development framework to develop, execute, and manage multi-agent systems in distributed computing environments [11]. PADE uses libraries from the Twisted project to allow communication among the network nodes. This framework supports multiple platforms, including embedded hardware that runs Linux. Besides, PADE provides some essential functionalities: PADE agents and their behaviors are built using object-orientation concepts, and PADE is capable of handling messages in the FIPA-ACL standard and supports cyclic and timed behaviors.

The smart Python multi-agent development environment (SPADE) is another framework for multi-agent system development [12]. This framework provides a new platform aimed at solving the drawbacks of the communication models of other platforms. SPADE includes several features: the SPADE agent platform is based on XMPP support, its agent model is based on behaviors, it supports FIPA metadata using XMPP data forms, and it provides a web-based interface for agent control.

In addition to the above popular frameworks, AgentBuilder [13], the Agent Development Kit (ADK) [14], Jason [15], and the Shell for Simulated Agent Systems (SeSAm) [18] are other widely used multi-agent development frameworks. Table 1 gives a brief description of the selected multi-agent system development frameworks and their main features. With this theoretical and application base, MaSMT was developed through the AGR organizational model. The next section briefly reports the AGR model and its architecture for multi-agent system development.
Table 1 Summary of the existing frameworks

System          Type           Platform              Features
JADE            Open source    Java                  Asynchronous message passing
MaDKit          Open source    Java                  Asynchronous message passing
PADE            Free           Python                Supports FIPA
SPADE           Free           Python                Supports FIPA metadata using XMPP data
AgentBuilder    Open source    Java                  Quickly and easily build intelligent agent-based applications with MAS knowledge
Jason           Open source    Java                  Speech-act-based inter-agent communication
SeSAm           Open source    Programming shell     GUI-based agent modeling
ADK             Open source    Mobile-based          Large-scale distributed solutions
3 AGR Organizational Model and MaSMT Architecture

The AGR organizational model was initially designed under the Aalaadin model, which consists of agents, groups, and roles. Figure 1 shows the UML-based Aalaadin model [15] for multi-agent system development. According to the model, each agent is a member of one or more groups, and a group contains one or more roles. The agent should be capable of handling those roles according to the agent's requirements. This model is used by the MaDKit system [16], which allows agents to overlap freely among groups. The MaSMT model is almost the same as the above model but removes the freely overlapping feature of group and role at the same time. This means that an agent belongs to one or more groups as well as one or more roles; however, there is only one active group and role, and the agent acts according to this active group and role. Note that agents are the only active communicating entities capable of playing roles within groups. Therefore, MaDKit gives agent designers the freedom to design appropriate internal models for their agents. With this idea, the MaSMT agent is designed considering a three-level architecture that consists of a root agent, controlling agents, and ordinary agents. The root represents the top level in the hierarchy and

Fig. 1 UML-based Aalaadin model for multi-agent system development
696
B. Hettige et al.
Fig. 2 Agents’ architecture on MaSMT
contains several controlling agents (managers). Each controlling agent consists of any number of ordinary agents. In addition to that, agents can be clustered according to their group and role. This layered model allows building agents’ swarms quickly. Figure 2 shows the agent diagram of the three-level architecture on MaSMT. With this model, an ordinary agent can communicate with its swarm, as well as its controller agent. The controller agent should capable of communicating and fully controlling its ordinary agents. Controllers can communicate with other controllers, and the root can handle all agents in the system. With this model, MaSMT allows for passing messages through the noticeboard method, peer-to-peer communication, or broadcast method.
4 MaSMT

MaSMT is an open-source multi-agent development framework developed in Java, which enables cross-platform capabilities. The framework provides agents and their communication facilities so that multi-agent systems can be developed easily. This section briefly describes the architecture of MaSMT.
4.1 Agent's Model

This model comprises three types of agents, namely the ordinary agent, the controller agent (previously called the manager), and the root agent. The MaSMT ordinary agents perform the actual work, while the other agent types are used to control them. A controller agent consists of several MaSMT agents. Hierarchically, the root is capable of handling a set of controller
agents. Using this three-layer agent model, agents can easily be clustered and modeled to build a swarm of agents.
4.2 Abstract Model of the MaSMT Agent

The AbstractAgent model is used to identify agents through their group-role-id. The agent identifier for a particular agent can be generated using the group, the role, and the relevant id. Also, the form role(dot)id@group can be used to locate agents quickly. For example, in read_words.101@ensimas.com, read_words is the role, 101 is the id, and ensimas.com is the group. This AbstractAgent model is used by MaSMT to handle all the agent-based activities that are available in the MaSMT agents, MaSMT controllers, and the MaSMT root agent.
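As a minimal illustration of this naming convention only (MaSMT itself is a Java framework; the helper names below are hypothetical and not part of its API), an identifier of the form role.id@group can be composed and split as follows:

```cpp
#include <iostream>
#include <string>

// Hypothetical helper: builds an identifier of the form role.id@group,
// following the read_words.101@ensimas.com example in the text.
std::string makeAgentId(const std::string& role, int id, const std::string& group) {
    return role + "." + std::to_string(id) + "@" + group;
}

// Hypothetical helper: splits an identifier back into role, id, and group.
bool parseAgentId(const std::string& ident, std::string& role, int& id, std::string& group) {
    const auto at = ident.rfind('@');
    if (at == std::string::npos) return false;
    const auto dot = ident.rfind('.', at);   // last '.' before the '@'
    if (dot == std::string::npos) return false;
    role  = ident.substr(0, dot);
    id    = std::stoi(ident.substr(dot + 1, at - dot - 1));
    group = ident.substr(at + 1);
    return true;
}

int main() {
    const std::string ident = makeAgentId("read_words", 101, "ensimas.com");
    std::string role, group; int id = 0;
    if (parseAgentId(ident, role, id, group))
        std::cout << ident << " -> role=" << role << " id=" << id << " group=" << group << "\n";
}
```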
4.3 MaSMT Agent

MaSMT agents are the active agents in the framework and provide the agents' infrastructure for agent development. The modular architecture of the MaSMT agent consists of several built-in features, including a noticeboard reader and an environment controller, and is shown in Fig. 3.

Fig. 3 A modular view of the MaSMT agent

The noticeboard reader provides access to the noticeboard, and the environment controller can directly access the environment. Also, through the status monitor, the agent controller can see the status of the MaSMT agent.

Fig. 4 Life cycle of the MaSMT agents
4.4 MaSMT Agent's Life Cycle

Agents in the MaSMT system follow a life cycle (the status of the agent) with the states active, live, and end. When a new agent is initiated, it directly starts the "active" section (which usually works as a one-step behavior for the agent). Then, the agent moves to its live section (the live section is the working section of the agent, which usually works as a cyclic behavior). The MaSMT agent leaves the live section when the live property of the agent is false. According to the requirements, an agent may wait until a specific time or until a new message arrives in the in-queue. Figure 4 shows the life cycle of the MaSMT agent.
4.5 MaSMT Controller Agent

The MaSMT controller agent is the middle-level controller agent of the MaSMT framework, capable of controlling its clients as required. Figure 5 shows the life cycle of the MaSMT controller agent. The MaSMT controller agent also provides all the features available in the MaSMT agents. In addition to that, the MaSMT controller should be capable of providing message passing, network access, noticeboard access, and environment handling capabilities.

Fig. 5 Life cycle of the MaSMT controller agents
4.6 MaSMT Root Agent The root agent is the top-level controller agent (MaSMT Manager), which is capable of handling other MaSMT controller agents. According to its architecture, there is
only one root for the system. Further, the MaSMT root agent is also capable of communicating with other root agents through the “Net access agent”.
4.7 MaSMT Messages

The MaSMT framework uses messages named MaSMT Messages to provide agent communication. These MaSMT Messages have been designed using the FIPA-ACL message standards. MaSMT Messages can be used to communicate between MaSMT agents as well as with other agents that support the FIPA-ACL message standards. Table 2 gives the structure of the MaSMT Message, including its data fields and types. More information on MaSMT Messages is provided in the MaSMT development guide [19].
4.8 MaSMT Message Passing

MaSMT supports the noticeboard concept as well as peer-to-peer and broadcast methods for message passing. In addition to that, it allows email-based direct communication among root agents. Further, MaSMT Messages are classified and forwarded according to their message headers. Table 3 shows the description of each message header.
Table 2 Fields, type, and information on MaSMT Messages

Data Field     Type                   Description
Sender         MaSMT AbstractAgent    The sender of the message
ReplyTo        MaSMT AbstractAgent    The agent to which the conversation is to be directed
Receiver       MaSMT AbstractAgent    The receiver of the message
Message        String                 The subject of the message
Content        String                 The original content of the message
Ontology       String                 Relevant information
Type           String                 Type of the message
Data           String                 Information
Header         String                 Header (used to redirect)
Language       String                 Language of the message
Conversation   int                    Conversation ID
Table 3 Message directives (headers for messages)

Message Header   Description
Agents           Sends a message to a particular group and role
AgentGroup       Sends a message to a particular group
AgentRole        Sends a message to a particular role
Local            The agent or controller sends a message to the members of its swarm that have the given role and group
LocalRole        The agent or controller sends a message to the members of its swarm that have the given role
RoleOrGroup      The agent or controller sends a message to the members of its swarm that have the given role or group
Broadcast        The agent or controller sends a message to its whole swarm
Root             The controller can send a message to the Root agent
Controller       The agent can send a message to its controller
NoticeBoard      The agent or controller sends a message to the noticeboard
MailMessage      The agent or controller sends a message as an email message
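For illustration only, the fields of Table 2 and the directives of Table 3 can be mirrored in a plain data structure as sketched below; this is a C++ sketch following the tables, not the MaSMT Java classes, and all identifiers and values are hypothetical.

```cpp
#include <iostream>
#include <string>

// Hypothetical mirror of the MaSMT Message fields listed in Table 2.
struct AbstractAgentRef { std::string role; int id; std::string group; };

struct MasmtMessage {
    AbstractAgentRef sender, replyTo, receiver;   // MaSMT AbstractAgent fields
    std::string message;        // subject of the message
    std::string content;        // original content
    std::string ontology;       // relevant information
    std::string type;           // type of the message
    std::string data;           // information
    std::string header;         // directive used to redirect (Table 3)
    std::string language;       // language of the message
    int conversation = 0;       // conversation ID
};

int main() {
    MasmtMessage msg;
    msg.sender   = {"read_words", 101, "ensimas.com"};
    msg.receiver = {"write_words", 102, "ensimas.com"};
    msg.header   = "AgentGroup";          // Table 3: send to a particular group
    msg.message  = "translate";
    msg.content  = "hello world";
    msg.conversation = 1;
    std::cout << "sending '" << msg.message << "' with directive " << msg.header << "\n";
}
```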
5 Usage of the MaSMT

MaSMT has been developed as an open-source product and has been available on SourceForge since March 2016. Figure 6 shows the download statistics of the MaSMT framework.

Fig. 6 Usage of the MaSMT framework

Using the MaSMT framework, a number of multi-agent systems have been successfully developed. Among them, the English to Sinhala agent-based machine translation system (EnSiMaS) [20] has been completely developed through MaSMT [21]. EnSiMaS uses a hybrid approach to its translation [22]. All the natural language processing activities in EnSiMaS are done through agents. Thus, the system consists of eight natural language processing swarms to translate English sentence(s) into Sinhala with the agents' support. Furthermore, various kinds of multi-agent systems have already been developed, including Octopus [23], AgriCom [24], and RiceMart [25]. Octopus provides a multi-agent-based solution for Sinhala chatting [23], and AgriCom and RiceMart provide communication platforms for the agricultural domain. Besides, a multi-agent-based file-sharing application has already been developed for the distributed environment [26]. MaSMT is developed as a Java application; thus, web-based application development capabilities have already been tested through a web-based event planning system [27].
6 Conclusion and Future Work

A multi-agent system development framework can be used as a tool for multi-agent system development. The MaSMT framework has been designed considering the AGR organizational model introduced in the Aalaadin project. According to MaSMT's AGR model, each agent has one group and one role at a time. However, agents can change their role and group at run time. Further, MaSMT also uses a three-layer agent model (root, controller, and agent) to build agents' swarms quickly. Also, MaSMT is capable of communicating with agents using the noticeboard method and peer-to-peer or broadcast message passing. In addition, the MaSMT framework provides email-based message-passing capabilities. A number of multi-agent applications have been successfully developed with MaSMT, including EnSiMaS, Octopus, AgriCom, and RiceMart. The framework is freely available and can be downloaded from SourceForge. Implementing MaSMT for other languages such as Python is one of the further directions of this research.
References 1. Wooldridge M (2009) An introduction to multi agent systems, 2nd edn Wiley 2. Multi Agent Systems—an overview, ScienceDirect Topics https://www.sciencedirect.com/top ics/chemical-engineering/multi-agent-systems 3. Coelho H (2014) Autonomous agents and multi-agent systems 4. Rzevski R, Skobelev P (2014) Managing complexity. Wit Pr/computational mechanics, Southampton, Boston 5. Bradshaw JM (1997) An introduction to software agents 6. Labrou Y, Finin T, Peng Y (1999) Agent communication languages: the current landscape 7. FIPA Agent Communication Language Specifications. https://www.fipa.org/repository/acl specs.html 8. Finin T, Fritzson R, McKay DP, McEntire R (1993) KQML-a language and protocol for knowledge and information exchange 9. Java Agent DEvelopment Framework. https://jade.tilab.com 10. MaDKit, https://www.madkit.net/madkit 11. Python Agent DEvelopment framework Pade 1.0 documentation. https://pade.readthedocs.io/ en/latest 12. Palanca P Spade: Smart Python agent development environment 13. AgentBuilder https://www.agentbuilder.com 14. Mitrovic D, Ivanovic M, Bordini RH, Badica C (2016) Jason Interpreter, Enterprise Edition, Informatica (Slovenia), vol 40 15. Xu H, Shatz SM (2003) ADK: an agent development kit based on a formal design model for multi-agent systems. Autom Softw Eng 10(4):337–365 16. Ferber J, Gutknecht O (1998) A meta-model for the analysis and design of organizations in multi-agent systems. In: Proceedings international conference on multi agent systems (Cat. No.98EX160) 17. Gutknecht O, Ferber J (1997) MadKit and organizing heterogeneity with groups in a platform for multiple multi-agent systems 18. Klügl K, Puppe F (1998) The multi-agent simulation environment SeSAm, in University Paderborn 19. MaSMT 3.0 Development Guide, ResearchGate. https://www.researchgate.net/publication/ 319101813_MaSMT_30_Development_Guide 20. Hettige B, Karunananda AS, Rzevski G (2016) A multi-agent solution for managing complexity in English to Sinhala machine translation. Int J Des Nat Ecodyn 11(2):88–96 21. Hettige B, Karunananda AS, Rzevski G (2017) Phrase-level English to Sinhala machine translation with multi-agent approach. In 2017 IEEE international conference on industrial and information systems (ICIIS), pp 1–6 22. Hettige B, Karunananda AS, Rzevski G (218) Thinking like Humans: a new approach to machine translation, in Artificial Intelligence, pp 256–268 23. Hettige B, Karunananda AS (2015) Octopus: a multi agent Chatbot 24. Goonatilleke MAST, Jayampath MWG, Hettige B (2019) Rice express: a communication platform for rice production industry. In Artificial Intelligence, pp 269–277 25. Jayarathna H, Hettige B (2013) AgriCom: a communication platform for agriculture sector. In: 2013 IEEE 8th international Conference on industrial and information systems, pp 439–444 26. Weerasinghe L, Hettige B, Kathriarachchi RPS, Karunananda AS (2017) Resource sharing in distributed environment using multi-agent technology. Int J Comput Appl 167(5):28–32 27. Samaranayake TD, Pemarathane WPJ, Hettige B (2017) Solution for event-planning using multi-agent technology. In: 2017 seventeenth international Conference on advances in ICT for emerging regions (ICTer), pp 1–6
Comparative Study of Optimized and Robust Fuzzy Controllers for Real Time Process Control Ajay B. Patil and R. H. Chile
Abstract A comparative analysis is carried out to measure the performance of μ-synthesis D-K iteration based controllers, and a μ-synthesis based fuzzy controller is implemented in this research. It is difficult to design the weighting functions used in μ-synthesis based robust control methods; the weighting functions are mostly chosen on a trial and error basis. The responses of the T-S fuzzy and μ-synthesis D-K iteration based controllers are compared. A real-time process control platform is used to examine the performance. Stability analysis is conducted with Nyquist and Bode plots. Frequency analysis shows the importance of the weighting function.

Keywords Robust stability and robust performance · μ synthesis · Weighting function · T-S fuzzy · D-K iteration
1 Introduction

Robustness is of great importance in control system design because real engineering systems are exposed to external disturbances and noise. Generally, a control engineer is required to design a controller that stabilizes the plant, if the plant is not stable to begin with, and meets certain levels of performance in the presence of disturbance, noise, and plant parameter variations. Robust control problems of this kind are commonly addressed by the H-∞ approach and the MU (μ) approach [1, 2]. It is possible to achieve nominal performance and robust stability against unstructured perturbations with the H-∞ optimal approach, but the issue of robust performance requirements is neglected. In a real-time implementation, a higher-order controller may not be viable because of computational and hardware limitations. Design methods based on the structured singular value MU (μ) can be used to achieve Robust Stability and Robust Performance (RSRP). One of the strong robust design approaches is the MU (μ) synthesis problem [11, 12, 14]. The stabilizing controller can
be designed by the μ synthesis approach since the system at equilibrium can be represented by a non-minimum phase system [2]. The performance of H-2 and H-∞ control is not satisfactory in the presence of parametric uncertainty, whereas the μ synthesis approach considers structured/parametric uncertainties and therefore provides better performance than the H-∞ approach [3, 15, 16]. Time-domain specifications are used in the μ approach, which reduces tedious and usually inaccurate design iterations [6]. Maximization of system performance is possible with structured uncertainty [7]. The nonlinear model can be successfully reduced to a linear model with uncertainties based on its geometric structure. Hence, the first step is to derive a suitable linear converter model, where parameter variations are described in terms of a Linear Fractional Transformation (LFT). μ synthesis is applied to this reduced model, which reduces the complexity of the design procedure [3–5, 17]. D-K iteration is a commonly used method in μ synthesis, which has been successful in many practical applications [8–10]. Weighting functions are considered for the structured uncertainties, and they can be optimized with optimization techniques such as particle swarm optimization (PSO), the genetic algorithm (GA), and JAYA [13]. T-S fuzzy models present a stabilization approach for time-delay systems via state feedback and a fuzzy observer controller [18]. The flow of the paper is as follows: Sect. 2 gives the details about the μ analysis and synthesis used for RSRP, and Sect. 3 presents the concept of the weighting function and the μ synthesis controller with D-K iteration. Section 4 contains the μ-synthesis controller with Takagi–Sugeno (T-S) fuzzy. Section 5 gives a detailed study of the real-time process control model under study, which contains a continuously stirred tank reactor (CSTR) system. Section 6 includes simulation outcomes and Sect. 7 provides hardware outcomes. Finally, Sect. 8 covers the conclusion and the future scope of the research.
2 RSRP Based on µ-Synthesis

The μ-synthesis controller is in state-space form and can be defined with the following models by incorporating suitable state variables and easy manipulations:

$\dot{x} = Ax + Bu, \qquad y = Cx + Du$

Here, x is the state vector, u and y are the input and output vectors, respectively, and A, B, C, and D are the state-space parameters. The structured uncertainties are taken into consideration when the μ-synthesis problem is stated. The structured uncertainties are arranged in a particular manner as follows [1]:

$\Delta = \{\mathrm{diag}[\delta_1 I_{r_1}, \ldots, \delta_S I_{r_S}, \Delta_1, \ldots, \Delta_F] : \delta_i \in C, \; \Delta_j \in C^{m_j \times m_j}\} \quad (1)$
where

$\sum_{i=1}^{S} r_i + \sum_{j=1}^{F} m_j = n \quad (2)$

and n is the dimension of the block $\Delta$. From Eq. (1), there are two types of uncertainty blocks: S repeated scalar blocks and F full blocks, where $\delta_i$, the repeated scalar block parameter, can only be a real number. The value of μ is given as

$\frac{1}{\mu_{\Delta}(M)} := \min\{\bar{\sigma}(\Delta) : \Delta \in \Delta, \; \det(I - M\Delta) = 0\} \quad (3)$

If no $\Delta \in \Delta$ makes $\det(I - M\Delta) = 0$, then $\mu_{\Delta}(M) := 0$. Consider Fig. 1, which shows the standard M–Δ configuration, where M is enclosed with the uncertainty block Δ. Here M(s) is the interconnected transfer function matrix, and v, d are vector signals; v is the input to the uncertainty Δ and d is the output of the uncertainty block Δ.

$\mu_{\Delta}(M(s)) := \sup_{\omega \in R} \mu_{\Delta}(M(j\omega)) \quad (4)$

The normalized set of structured uncertainty $B\Delta$ is given by

$B\Delta := \{\Delta : \bar{\sigma}(\Delta) \le 1, \; \Delta \in \Delta\} \quad (5)$

Fig. 1 Standard M–Δ configuration and configuration for RP analysis

Equation (4) shows the structured singular value of the interconnected transfer function matrix. If M(s) is stable and $\mu_{\Delta}(M(s)) < 1$ (or $\|M\|_{\mu} < 1$), then and only then is the standard M–Δ configuration in Fig. 1 robustly stable. In the figure, 'w' is the input, generally including disturbances, noises, and command signals, and 'z' is the error output, which
normally consists of tracking errors, regulator outputs, and filtered actuator signals. Let M(s) be partitioned appropriately as below:

$M(s) = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \quad (6)$

It can easily be obtained that

$z = \left[ M_{22} + M_{21}\,\Delta\,(I - M_{11}\Delta)^{-1} M_{12} \right] w \quad (7)$

$\;\;\; = F_u(M, \Delta)\, w \quad (8)$

From Eq. (8), $F_u(M, \Delta)$ is the upper linear fractional transformation (ULFT); the upper loop of the interconnected transfer function matrix is enclosed by the structured uncertainty, hence it is called the ULFT. If the condition $\|F_u(M, \Delta)\|_{\infty} < 1$ is satisfied, then, with respect to Δ, $F_u(M, \Delta)$ becomes robustly stable. Now the fictitious uncertainty block $\Delta_P$ is added in such a way that it does not affect the robust stability of the system, as shown in Fig. 2.

Fig. 2 Standard M–Δ configuration with ΔP analysis and configuration with K

Reliable (robust) performance is determined only if the condition $\Delta \in B\Delta$ is satisfied. Figure 1 describes the robust stability problem; replacing Δ by $\tilde{\Delta}$, the problem below is thus a robust stability problem with respect to $\tilde{\Delta}$:

$\tilde{\Delta} := \{\mathrm{diag}\{\Delta, \Delta_P\} : \Delta \in B\Delta, \; \|\Delta_P\|_{\infty} \le 1\} \quad (9)$

If the infinity norm of $M_{22}$ is less than one, then the performance of the plant is nominal, and if M(s) is internally stable, then the system has nominal stability. The general method used to solve the μ synthesis problem is the D-K iteration method. This method is shown in Fig. 2, considering the controller K; the feedback signals y and u are shown, and M is created by P and K. The relation of M
and P can be given by

$M(P, K) = F_l(P, K) \quad (10)$

$P(s) = \begin{bmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{bmatrix} \quad (11)$

Now Eq. (10) can be rewritten as Eq. (12):

$M(P, K) = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} + \begin{bmatrix} P_{13} \\ P_{23} \end{bmatrix} K (I - P_{33} K)^{-1} \begin{bmatrix} P_{31} & P_{32} \end{bmatrix} \quad (12)$

$\min_{K(s)} \; \sup_{\omega \in R} \; \inf_{D \in \mathbf{D}} \; \bar{\sigma}\!\left[ D\, M(P, K)\, D^{-1} (j\omega) \right] \quad (13)$

where D is the scaling matrix set, given by

$\mathbf{D} = \{ D = \mathrm{diag}[D_1, \ldots, D_S, d_1 I_{m_1}, \ldots, d_F I_{m_F}] : D_i \in C^{r_i \times r_i}, \; D_i^{*} = D_i > 0, \; d_j > 0 \} \quad (14)$

$\sup_{\omega \in R} \; \inf_{D \in \mathbf{D}} \; \bar{\sigma}\!\left[ D\, M(P, K)\, D^{-1} (j\omega) \right] < 1 \quad (15)$

The technique of D-K iteration is to minimize Eq. (13).
3 Concept of the Weighting Function (WF)

Weighting functions are used to suppress the noise and disturbances that occur in the system. The controller's performance and the characteristics of the system are assessed through the transient response. The weighting functions are selected under the following assumptions: the selection is made in the frequency domain, stable and diagonal weights are chosen, and the diagonal elements are restricted to minimum-phase, real-rational functions. The closed-loop system block diagram, taking the weighting functions into account, is shown in Fig. 3. Wu and Wp are the performance weights linked to the control law and tracking, respectively, K is the controller, G is the plant's transfer function, d is an external disturbance, and e is the error.

Fig. 3 Block diagram of the closed-loop system

Table 1 shows the ISE calculation for different weighting functions. These weighting functions are chosen by the trial and error method.

Table 1 Different weighting function results

Sr. No   wp(s)                                        wu(s)    ISE
1        (s + 1)/(s + 0.001)                          1        7.323
2        (s + 1)/(s + 0.0005)                         1        6.738
3        (s + 2)/(s + 0.01)                           1        6.654
4        (s + 2)/(s + 0.0005)                         1        6.239
5        0.95 (s^2 + 1.8s + 10)/(s^2 + 8s + 0.01)     10^-2    4.156
6        0.95 (s^2 + 1.8s + 11)/(s^2 + 8s + 0.01)     10^-2    4.064

Fig. 4 Comparative results of different weighting functions

As seen from Table 1 and Fig. 4, the sixth weighting function gives better results, but this is a very time-consuming process. To overcome this problem, in this
research the nature-inspired algorithm known as the Particle Swarm Optimization (PSO) algorithm is proposed to optimize the weighting function.
3.1 μ-Synthesis Controller Using Particle Swarm Optimization (PSO)

After performing the random analysis in Table 1, the second-order weighting function is used for optimization purposes, in which x(1) and x(2) have to be optimized using the PSO algorithm. The value of μ depends on the weight values, but the weight values depend on the values of x(1) and x(2). Hence, the PSO algorithm is used to optimize this fitness function. The weight function selected for optimization is

$W_p = 0.95\,\frac{s^2 + 1.8s + X(1)}{s^2 + 8s + X(2)}, \qquad W_u = 10^{-2}, \qquad \mu = X(3)$

where the ranges of the optimization parameters are x(1) = [10, 11], x(2) = [0, 1], and x(3) = [0, 2]. The parameters required to optimize the fitness function in the PSO algorithm are set in Table 2. After PSO optimization, the optimized fitness function obtained is given in Eq. (16):

$W_p = 0.95\,\frac{s^2 + 1.8s + 10.14}{s^2 + 8s + 0.011} \;\text{ and }\; \mu = 0.35 \quad (16)$
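The C++ sketch below shows only the generic PSO loop with the parameter values of Table 2 (25 particles, 50 iterations, inertia decreasing from w_initial = 0.5 to w_final = 0.4, c1 = 1.2, c2 = 1.5) and the search ranges stated above. The fitness evaluation, which in the paper is the peak μ returned by a D-K iteration run for the candidate weights, is replaced here by an arbitrary stub so the sketch is runnable; the update scheme is the standard PSO, not necessarily the exact variant used by the authors.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <iostream>
#include <random>

constexpr int kDim = 3;                 // x(1), x(2), x(3) from the text
constexpr int kParticles = 25;          // Table 2
constexpr int kIterations = 50;         // Table 2
constexpr double kWInitial = 0.5, kWFinal = 0.4, kC1 = 1.2, kC2 = 1.5;  // Table 2
const std::array<double, kDim> kLo = {10.0, 0.0, 0.0};   // ranges stated in the text
const std::array<double, kDim> kHi = {11.0, 1.0, 2.0};

// Placeholder objective: in the paper this would be the peak mu obtained by
// running mu-synthesis with Wp = 0.95(s^2+1.8s+x1)/(s^2+8s+x2). The dummy
// expression below exists only so that the sketch runs.
double fitness(const std::array<double, kDim>& x) {
    return std::abs(x[0] - 10.5) + x[1] + 0.1 * x[2];
}

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u01(0.0, 1.0);

    std::array<std::array<double, kDim>, kParticles> pos, vel, pbest;
    std::array<double, kParticles> pbestVal;
    std::array<double, kDim> gbest{};
    double gbestVal = 1e300;

    for (int p = 0; p < kParticles; ++p) {                 // random initialization within range
        for (int d = 0; d < kDim; ++d) {
            pos[p][d] = kLo[d] + u01(rng) * (kHi[d] - kLo[d]);
            vel[p][d] = 0.0;
        }
        pbest[p] = pos[p];
        pbestVal[p] = fitness(pos[p]);
        if (pbestVal[p] < gbestVal) { gbestVal = pbestVal[p]; gbest = pos[p]; }
    }

    for (int it = 0; it < kIterations; ++it) {
        // linearly decreasing inertia weight from w_initial to w_final
        double w = kWInitial - (kWInitial - kWFinal) * it / (kIterations - 1.0);
        for (int p = 0; p < kParticles; ++p) {
            for (int d = 0; d < kDim; ++d) {
                vel[p][d] = w * vel[p][d]
                          + kC1 * u01(rng) * (pbest[p][d] - pos[p][d])
                          + kC2 * u01(rng) * (gbest[d] - pos[p][d]);
                pos[p][d] = std::min(kHi[d], std::max(kLo[d], pos[p][d] + vel[p][d]));
            }
            double f = fitness(pos[p]);
            if (f < pbestVal[p]) { pbestVal[p] = f; pbest[p] = pos[p]; }
            if (f < gbestVal)    { gbestVal = f; gbest = pos[p]; }
        }
    }
    std::cout << "best x = (" << gbest[0] << ", " << gbest[1] << ", " << gbest[2]
              << "), fitness = " << gbestVal << "\n";
}
```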
The μ-synthesis is executed by applying the optimized weighting function in the Robust Control Toolbox. It is seen from Table 3 that, at iteration 3, the value of γ is reduced to 0.961 and the value of μ is 0.951, which implies that RP has been attained. The controller obtained is given in Eq. (17) and its state-space form in Eq. (18).

$T.F. = \frac{105.2 s^3 + 7824 s^2 + 6162 s + 1977}{s^4 + 27.96 s^3 + 341.9 s^2 + 1281 s + 1.607} \quad (17)$
Table 2 PSO algorithm parameters

Parameters of PSO        Values     Parameters of PSO   Values
Number of particles      25         wfinal              0.4
Number of iterations     50         c1                  1.2
winitial                 0.5        c2                  1.5

Table 3 Iteration summary of the μ-synthesis controller using PSO

Iteration no           1       2       3
Order of controller    4       12      16
Total D-scale order    0       8       12
γ achieved             1.693   0.988   0.961
Peak value of μ        1.461   0.950   0.951

The state-space model of the system is

$A = \begin{bmatrix} -0.0013 & 0.0292 & 0.0058 & -0.0209 \\ -0.0292 & -6.6647 & 8.7775 & -10.9527 \\ -0.0058 & -6.7775 & -0.3426 & 2.0498 \\ 0.0209 & 10.9527 & 2.0498 & -20.9505 \end{bmatrix}, \qquad B = \begin{bmatrix} -1.2492 \\ -14.6961 \\ -2.8073 \\ 10.2185 \end{bmatrix}$

$C = \begin{bmatrix} -1.2492 & -14.6961 & 2.8073 & -10.2185 \end{bmatrix}, \qquad D = 0 \quad (18)$
4 μ-Synthesis Controller With T-S Fuzzy

Fuzzy logic control is inherently robust in the sense of imprecise parameter information and variation within some bound. Hence, fuzzy controllers are used for systems where the data are complex and subject to variations. In this paper, the fuzzy controller is developed using the Takagi–Sugeno based compensation technique. The fuzzy model suggested by Takagi and Sugeno is described by fuzzy IF–THEN rules that represent a nonlinear system's local input–output relationships. The main feature of a Takagi–Sugeno fuzzy model is that a linear system model expresses the local dynamics of each fuzzy implication (rule). The fuzzy dynamic model or T–S fuzzy model consists of a family of local linear dynamic models smoothly connected through fuzzy membership functions. The fuzzy rules of the fuzzy dynamic model have the form

$R^l:\; \text{IF } z_1 \text{ is } F_1^l \text{ and } \ldots \text{ and } z_v \text{ is } F_v^l \text{ THEN } x(t+1) = A_l x(t) + B_l u(t) + a_l,\;\; y(t) = C_l x(t), \quad l \in \{1, 2, \ldots, m\}$
where $R^l$ denotes the lth fuzzy inference rule, m the number of inference rules, $F_j^l$ (j = 1, 2, ..., v) the fuzzy sets, $x(t) \in R^n$ the state vector, $u(t) \in R^g$ the input vector, $y(t) \in R^p$ the output vector, $(A_l, B_l, a_l, C_l)$ the matrices of the lth local model, and $z(t) = [z_1, z_2, \ldots, z_v]$ the premise variables, which are some measurable variables of the system, for example, the state variables or the output variables. Fuzzy rules are designed based on the local state-space model of the dynamic system. The control gains are designed using the linear quadratic regulation technique. The sample rules are given in Eq. (19), with the different equilibrium points obtained with the phase plane method. For the fuzzy design, triangular membership functions are used, as shown in Fig. 5.

Fig. 5 Membership functions of CSTR fuzzy model

Fuzzy rules:

Rule 1: IF $x_2(t)$ is Low (i.e., $x_2(t)$ is about 0.8862) THEN
$\delta\dot{x}(t) = A_{11}\delta x(t) + A_{12}\delta x(t-\tau) + B_1\delta u(t), \quad \delta u(t) = -F_1\delta x(t)$

Rule 2: IF $x_2(t)$ is Middle (i.e., $x_2(t)$ is about 2.7520) THEN
$\delta\dot{x}(t) = A_{21}\delta x(t) + A_{22}\delta x(t-\tau) + B_2\delta u(t), \quad \delta u(t) = -F_2\delta x(t)$

Rule 3: IF $x_2(t)$ is High (i.e., $x_2(t)$ is about 4.7052) THEN
$\delta\dot{x}(t) = A_{31}\delta x(t) + A_{32}\delta x(t-\tau) + B_3\delta u(t), \quad \delta u(t) = -F_3\delta x(t)$

where $\delta x(t) = x(t) - x_d$, $\delta x(t-\tau) = x(t-\tau) - x_d$, $\delta u(t) = u(t) - u_d$, and $F_1, F_2, F_3 \quad (19)$
are to be designed. In the proposed method, the μ-synthesis is combined with T-S fuzzy: the reduced transfer function obtained by the μ-synthesis method is converted into the form of a control input and an error signal. This is used in the T-S fuzzy design and for creating the membership functions. The error is taken as the input and the control input is taken as the output of the FIS file.
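As a minimal numerical sketch of how the three rules above are blended (triangular memberships centered at the stated operating points 0.8862, 2.7520, and 4.7052, followed by a weighted sum of the local state-feedback laws), the following C++ fragment can be used. The gain vectors F1 to F3, the membership half-width, and the state values are placeholders, not the gains designed in the paper.

```cpp
#include <array>
#include <cmath>
#include <iostream>

// Triangular membership centered at c with half-width hw (illustrative shape only).
double triangular(double x, double c, double hw) {
    double d = 1.0 - std::abs(x - c) / hw;
    return d > 0.0 ? d : 0.0;
}

int main() {
    const std::array<double, 3> centers = {0.8862, 2.7520, 4.7052};   // rule premises from Eq. (19)
    const double halfWidth = 1.866;                                   // placeholder spacing between centers

    // Placeholder local feedback gains F1..F3 (two-state example); the real
    // gains are designed with the linear quadratic regulation technique.
    const std::array<std::array<double, 2>, 3> F = {{{1.0, 0.5}, {1.4, 0.7}, {1.8, 0.9}}};

    std::array<double, 2> dx = {0.3, -0.1};   // deviation from the equilibrium, delta x(t)
    double x2 = 2.1;                          // premise variable x2(t)

    // Normalized rule activations h_l(x2)
    std::array<double, 3> h{};
    double sum = 0.0;
    for (int l = 0; l < 3; ++l) { h[l] = triangular(x2, centers[l], halfWidth); sum += h[l]; }
    for (double& v : h) v /= sum;

    // Blended control law: delta u = -sum_l h_l * F_l * delta x
    double du = 0.0;
    for (int l = 0; l < 3; ++l)
        du -= h[l] * (F[l][0] * dx[0] + F[l][1] * dx[1]);

    std::cout << "blended control increment delta u = " << du << "\n";
}
```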
5 Continuous Stirred Tank Reactor (CSTR)

The CSTR plays an important role in chemical processes, where an exothermic reaction takes place and the heat of reaction needs to be removed by a coolant. The control objective of the CSTR is to maintain the temperature inside the tank at the desired value. The system has a cooling process, as shown in Fig. 6, and the block diagram is shown in Fig. 7. The controller is used to minimize the error signal. The output of the controller is given to a DAC, since the output of the controller is digital and the CSTR system only understands a physical (analog) quantity; it is then given to the E/P converter (electro-pneumatic converter), which converts a current input signal into a linearly proportional pneumatic output. To initiate the work of the E/P converter, a compressor is used, and the output of the E/P converter is given to the control valve. The control valve action takes place to minimize the error. The procedure is repeated until the desired output is obtained. Using the process reaction curve method, the second-order transfer function obtained for the CSTR is shown in Eq. (20). This system equation is imprecise and contains different nonlinearities. The proposed methods are applied to study the responses of the CSTR with the transfer function given in Eq. (20).
$G(s) = \frac{-0.12s + 12}{3s^2 + 4s + 1} \quad (20)$
Fig. 6 CSTR hardware experimental set up
Fig. 7 Block diagram of continuously stirred tank reactor
6 Simulation Results

Experimental results are taken on the CSTR system explained in Sect. 5. Table 4 shows the iteration summary of the D-K iteration method, which proves that the system performance is robust since the value of μ is less than one.

Table 4 Summary of iterations for the D-K method

Iteration no           1       2       3
Order of controller    4       12      16
Peak value of μ        1.461   0.950   0.946

The time responses with the D-K iteration controller and the μ T-S fuzzy controller are given in Figs. 8 and 9, respectively. Figure 10 gives the comparative time response of the PID controller (with gains KP = 12.24, KI = 2.0980, KD = 14.2), the μ-controller designed with the D-K iteration method, and the μ-synthesis controller with T-S fuzzy.

Fig. 8 Time response of D-K iteration controller

Fig. 9 Time response of μ-controller with T-S fuzzy

Fig. 10 Time response of PID, μ-controller D-K iteration, μ-controller with T-S fuzzy

It shows that the μ-synthesis controller with the T-S fuzzy controller has less overshoot (zero) than the other two controllers and is more robust than the other two controllers. Table 5 gives the analysis of the comparative time response of the PID, the μ-controller with the D-K iteration method, and the μ-synthesis controller with T-S fuzzy.

Table 5 Comparative results of response parameters

Parameter            PID        D-K iteration controller   μ-synthesis controller with T-S fuzzy
Rise time (s)        2          5.295                      10.044
Settling time (s)    110        85                         85
Overshoot            14.368%    33.088%                    0%
Undershoot           19.886%    11.193%                    1.977%
ISE                  6.197      4.064                      3.046

The integral square error (ISE) is 3.046 in the case of the fuzzy controller. The settling time and overshoot are improved with T-S fuzzy control as compared to the conventional methods. The analysis of the responses shows better performance for the optimized μ-synthesis controller with T-S fuzzy as compared to the others. Figures 11 and 12 give the RSRP for the D-K iteration controller and the presented μ-PSO fuzzy based controller, which show the relationship between the frequency and the upper bound of μ, which is less than one. The reduced 4th-order controller (transfer function) obtained using the proposed topology is shown in Eq. (17). This shows the robust stability of the proposed methods.
Fig. 11 RS for D-K iteration controller
Fig. 12 RS for presented μ-PSO controller
7 Real-Time Process Control Results

The result of the μ-synthesis controller using D-K iteration is shown in Fig. 13, which shows that the setpoint is correctly tracked by the controller. Initially, the temperature of the CSTR is set at 60° and the setpoint is 55°, so the controller gradually decreases the temperature from 60 to 55° as shown in Fig. 13.

Fig. 13 Hardware result of Mu synthesis controller using D-K iteration method

The result of the T-S fuzzy controller is shown in Fig. 14, where the initial temperature is 45° and the setpoint is 40°.

Fig. 14 Hardware result of Mu synthesis controller using T-S fuzzy

In both cases, the controller tracks the setpoint, but with T-S fuzzy control the required time is less compared to the D-K iteration method.
8 Conclusion and Future Scope

The Mu-synthesis controller using the D-K iteration method and the Mu-synthesis controller using T-S fuzzy are proposed in this research. In order to improve the precision and stability of the real-time system, fuzzy control is used. The μ value obtained is less than one, which confirms the stability of the proposed method. In the proposed method, the overshoot problem is nullified and the settling time is also improved as compared to the PID controller. Similarly, the ISE value for the proposed method is less than those of the PID controller and the D-K iteration method. Simulation results clearly show that the μ-synthesis controller with T-S fuzzy is more robust than the D-K iteration method. The PSO-based tuning used to optimize the weighting function added to the system gives more accurate results. Also, from the hardware study, it is clearly visible that the CSTR system works properly and gives accurate results when it uses the μ-synthesis controller with a T-S fuzzy controller. In future work, the system can be modified further by using optimization algorithms such as GA, TLBO, and JAYA, by which more accurate results can be achieved while satisfying the RSRP criteria.
References 1. Zhou K, Doyle JC (1998) Essentials of robust control. Vol. 104. Upper Saddle River, Prentice hall, NJ 2. Pannu S, Kazerooni H, Becker G, Packard A (1996) μ-synthesis control for a walking robot. IEEE Control Syst Mag 16(1):20–25 3. Bendotti P, Beck CL (1999) On the role of LFT model reduction methods in robust controller synthesis for a pressurized water reactor. IEEE Trans Control Syst Technol 7(2):248–257 4. Buso S (1999) Design of a robust voltage controller for a buck-boost converter using /spl mu/-synthesis. IEEE Trans Control Syst Technol 7(2):222–229 5. Stein G, Doyle JC (1991) Beyond singular values and loop shapes. J Guidance Control Dyn 14(1) 6. Tchernychev A, Sideris A (1998) /spl mu//k/sub m/-design with time-domain constraints. IEEE Trans Autom Control 43(11):1622–1627 7. Wallis GF, Tymerski R (2000) Generalized approach for /spl mu/ synthesis of robust switching regulators. IEEE Trans Aerosp Electron Syst 36(2):422–431 8. Tsai KY, Hindi HA (2004) DQIT: /spl mu/-synthesis without D-scale fitting. IEEE Trans Auto Control 49(11):2028–2032 9. Lee TS, Tzeng KS, Chong MS (2004) Robust controller design for a single-phase UPS inverter using /spl mu/-synthesis. IEE Proc Electric Power Appl 151(3):334–340 10. Lanzon A, Tsiotras P (2005) A combined application of H/sub /spl infin// loop shaping and /spl mu/-synthesis to control high-speed flywheels. IEEE Trans Control Syst Technol 13(5):766– 777 11. Qian X, Wang Y, Ni ML (2005) Robust position control of linear brushless DC motor drive system based on /spl mu/-synthesis. IEE Proc Electric Power Appl 152(2):341–351 12. Shahroudi KE (2006) Robust servo control of a high friction industrial turbine gas valve by indirectly using the standard$mu$-synthesis tools. IEEE Trans Control Syst Technol 14(6):1097–1104 13. Franken N, Engelbrecht AP (2005) Particle swarm optimization approaches to coevolve strategies for the iterated prisoner’s dilemma. IEEE Trans Evol Comput 9(6):562–579 14. Kahrobaeian A, Mohamed YAI (2013) Direct single-loop /spl mu/-synthesis voltage control for suppression of multiple resonances in microgrids with power-factor correction capacitors. IEEE Transactions on Smart Grid 4(2):1151–1161 15. Bevrani H, Feizi MR, Ataee S (2016) Robust frequency control in an islanded microgrid: ${H} _{\infty }$ and $\mu $ -synthesis approaches. IEEE Trans Smart Grid 7(2):706–717 16. Cai R, Zheng R, Liu M, Li M (2018) Robust control of PMSM using geometric model reduction and $\mu$-synthesis. IEEE Trans Indus Electron 65(1):498–509 17. Gu DW, Petkov P, Konstantinov MM (2005) Robust control design with MATLAB®. Springer Science & Business Media 18. Cao YY, Frank PM (2000) Analysis and synthesis of nonlinear time-delay systems via fuzzy control approach. IEEE Trans Fuzzy Syst 8(2):200–211
Ant Colony Optimization-Based Solution for Finding Trustworthy Nodes in a Mobile Ad Hoc Network G. M. Jinarajadasa and S. R. Liyanage
Abstract Mobile ad hoc networks (MANETs) are among the most popular wireless networks and have dynamic topologies due to their self-organizing nature. They are less infrastructure-oriented networks because the nodes are mobile, and hence routing becomes an important issue in these networks. With the ubiquitous growth of mobile and Internet of things (IoT) technologies, mobile ad hoc networks act as a vital component in the process of creating social interactions. However, they face plenty of problems and challenges, including security, power management, location management, and passing multimedia over the network, due to the routing issue. MANETs consist of many dynamic connections between nodes, so finding a trustworthy route for communication is a challenge. Therefore, based on swarm intelligence methodologies, an ant colony optimization (ACO) algorithm for finding the most trusted path is proposed here via the use of a probabilistic transition rule and pheromone trails.

Keywords Mobile Ad hoc Networks (MANETs) · Swarm intelligence · Ant colony optimization (ACO) · Probabilistic transition rule · Pheromone trails
1 Introduction

With the invention of IoT and social networking concepts, the use of wireless devices and mobile technologies has increased over the recent decades. Hence, wireless networks, including mobile ad hoc networks, provide a major contribution to establishing the interactions among the network nodes in service-oriented networks. But the major issue in MANETs is the lack of security because of the easily changing nature of the network topology, since the organization of the network nodes happens in a decentralized manner without a fixed infrastructure. Therefore, there is a problem with the
reliability of the transferred data within a mobile ad hoc network. As an example, in an application of a mobile ad hoc network for a service-oriented network, the service provider should guarantee the trustworthiness of the requested service or the transferred data. When transferring data, the message or packet must be transferred through several nodes. Those intermediate nodes can be either trustworthy nodes or compromised nodes. Hence, to guarantee the safety and reliability of the data passed, the transferred data must avoid traversing the compromised nodes. There can be a set of possible paths available for data transfer, and to ensure the trustworthiness of the transferred data, finding the optimal path for sending the data is the solution. Hence, finding the shortest possible path that contains only trustworthy nodes is needed. The trust value of the network nodes is computed considering the node properties and the recommendations of neighbor nodes, and with the help of the probabilistic transition rule and the heuristic values obtained by applying the ACO algorithm, an optimal trust path can be identified [1, 2].
1.1 Swarm Intelligence

Swarm intelligence is a branch of computational intelligence that describes the collective behavior of decentralized and self-organized systems, whether natural or artificial. Colonies of ants and termites, schools of fish, colonies of honeybees, flocks of birds, and herds of other animals are examples of the systems studied by swarm intelligence [3].
1.2 Artificial Ants and the Shortest Path Problem

Artificial ants live in a discrete world and deposit pheromone in a problem-dependent way. They have additional capabilities such as local search, look-ahead, and backtracking. By exploiting an internal memory and depositing an amount of pheromone that is a function of the solution quality, they can utilize local heuristics [4]. As shown in Fig. 1, ants are given a memory of visited nodes, and ants build solutions probabilistically without updating pheromone trails. Ants then deterministically backtrack the forward path to update the pheromone, and they deposit an amount of pheromone that depends on the quality of the solution they created (Fig. 2).

Fig. 1 Finding the shortest way to arrive at the goal

Fig. 2 Ant utilizing pheromone, heuristics, and memory to pick the next hub
1.3 Ant's Probabilistic Transition Rule

At each node, the ant decides where to move next, depending on the pheromone trail. In light of the pheromone deposited on each candidate edge, the ant takes this decision by applying the probabilistic rule, which can be characterized as follows:

$P_{ij}^{k} = \frac{[\tau_{ij}]^{\alpha}}{\sum_{n \in N_i^k} [\tau_{in}]^{\alpha}} \quad (1)$

where $\tau_{ij}$ is the amount of pheromone trail on the edge (i, j), $N_i^k$ is the set of feasible neighbor nodes to which ant k positioned on node i can move, and α is the relative influence of the pheromone function [5, 6].
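A direct numerical reading of Eq. (1), selecting the next node with a probability proportional to the pheromone term over the feasible neighbors, can be sketched as follows in C++ (the node indices and pheromone values are illustrative only):

```cpp
#include <cmath>
#include <iostream>
#include <random>
#include <vector>

// Chooses the next node according to Eq. (1): P_ij is proportional to
// (tau_ij)^alpha over the feasible neighbor set N_i^k.
// Returns an index into the `tau` vector of candidate edges.
int chooseNext(const std::vector<double>& tau, double alpha, std::mt19937& rng) {
    std::vector<double> weights(tau.size());
    for (std::size_t j = 0; j < tau.size(); ++j) weights[j] = std::pow(tau[j], alpha);
    std::discrete_distribution<int> pick(weights.begin(), weights.end());
    return pick(rng);
}

int main() {
    std::mt19937 rng(7);
    std::vector<double> tau = {1.0, 0.9, 0.8};   // pheromone on edges to three feasible neighbors
    int next = chooseNext(tau, /*alpha=*/1.0, rng);
    std::cout << "ant moves to neighbor index " << next << "\n";
}
```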
1.4 Ant Colony Optimization Metaheuristics

This is a population-based strategy in which artificial ants iteratively construct candidate solutions. In each cycle, every ant builds one candidate solution using a constructive search technique. The construction of the solutions is probabilistically influenced by pheromone trail information, heuristic information, and the partial solutions of each ant. Pheromone trails are adjusted during the search procedure to reflect the collective experience [7, 8].
2 Related Work

Swarm intelligence in the discipline of mobile ad hoc networks has been researched in many aspects, including shortest path search, routing protocol optimization, energy balancing, and improving the quality of service. Among the swarm intelligence mechanisms, there are plenty of applications of ant colony optimization for improving different aspects of MANET environments. A hybrid routing algorithm for MANETs based on ant colony optimization called HOPNET is proposed by Wang et al., where the algorithm has been compared under the random waypoint model and the random drunken model with the ZRP and DSR routing protocols [9]. A routing protocol called the ant colony-based routing algorithm (ARA) is proposed with the main goal of reducing the routing overhead; the presented routing protocol is highly adaptive, efficient, and scalable [10]. AntNet and AntHocNet are applications of swarm intelligence in MANETs that utilize the notion of ant colony optimization (ACO) by finding near-optimal solutions to graph optimization problems [11]. AntNet and AntHocNet find close-to-optimal routes in a drawn diagram of interactions without global information. The disadvantage of this approach is that it creates additional communication overhead through the regular transfer of both "forward ants" and "backward ants" [12, 13]. Schoonderwoerd et al. have addressed the above-mentioned issue by proposing a solution called ant-based control (ABC), which is very similar to AntNet but makes the communication overhead relatively smaller by using only the forward ants [14]. A trust calculation for online social networks based on ant colony optimization is suggested by Sanadhya et al., where a trust cycle is created for trust pathfinding to achieve trustworthiness and satisfaction of the service provided to the service requester [8]. An improved ACO-based secure routing protocol for wireless sensor networks is proposed by Luo et al. for optimal path finding, combining a probability value with fuzzy logic for trust value calculation. The proposed secure routing protocol shows that the algorithm can ensure the discovery of the forwarding path with low cost while guaranteeing security [15]. When it comes to mobile ad hoc networks, the biggest challenge is to discover a path between the communication endpoints that satisfies the client's quality of service (QoS) requirements. Several approaches have been suggested to improve the quality of service in MANETs while finding multiple stable paths between the source and the destination [16, 17, 18].
A modified ant colony algorithm is proposed by Asghari et al. to locate an authentic path while improving load balancing among the target users, where the updated pheromone has an inverse effect on the paths chosen by the ants [19].
3 Proposed Work

The proposed work can be divided into three major parts: creating the artificial intimacy pheromone for the defined mobile ad hoc network, applying the trust concept with the ant's probabilistic transition rule, and the algorithmic calculation of the trust value.
3.1 Network Intimacy Pheromone (τs)

When considering a MANET environment, a node can connect to another network node with a certain network communication intimacy. The network intimacy value can be defined as follows: if node "A" can connect to its neighbor node "B", which is within one-hop distance, the network intimacy pheromone is equal to one (τs = 1); if node "A" cannot connect to other nodes directly without the help of a one-hop neighbor node, then the network intimacy pheromone is less than one (τs < 1). As shown in Fig. 3, when node A connects to nodes B, C, and D, since node A can directly connect with them, the network pheromone value τs is equal to 1. But if node A requires connecting to nodes E, F, and G, node A should connect indirectly by going through several hops. Then, the network pheromone value τs becomes less than 1, where the pheromone value is decremented by 0.1 for each hop. Based on Table 1 below, an aggregated weighted heuristic function can be generated to calculate the heuristic value for the data transferred through the created network.

Fig. 3 Pheromone value variation with the neighbor nodes
Table 1 Filtered network parameters and weights

Parameter for trust calculation            Weight
Control packets received for forwarding    1
Control packets received forwarded         1
Routing cost                               2
No. of packet collisions                   1
Data packets received for forwarding       1
Data packets received forwarded            1
Packets dropped                            2
No. of packets transmitted                 1
No. of packets received                    1
Packet signal strength                     0.1
Available energy                           0.1
$\tilde{\eta}_r = \frac{\text{sum of weights between two adjacent nodes } (W_s)}{\text{upper bound of the weighted sum } (W_s)} \quad (2)$
3.2 Trust Decision by Ant's Probabilistic Transition Rule ($P_{ij}^k$)

By aggregating the network heuristic values and the network intimacy pheromone values, the ant's probabilistic transition rule can be modified as follows to make a trust-defining rule for ants to decide the next trustworthy node among the neighbor nodes:

$P_{ij}^{k} = \frac{[\tau_s(i,j)]^{\alpha}\,[\tilde{\eta}_r(i,j)]^{\beta}}{\sum_{n=1}^{N} [\tau_s]^{\alpha}\,[\tilde{\eta}_r]^{\beta}} \quad (3)$

The trust decision value by the ant's probabilistic transition rule can be declared as above, where $\tau_s(i,j)$ and $\tilde{\eta}_r(i,j)$ are, respectively, the pheromone value and the heuristic value related to nodes i and j. Further, α and β are positive real parameters whose values decide the overall significance of the pheromone versus the heuristic data.
3.3 Trust Calculation ACO Algorithm

Based on the network parameters filtered from the network simulation, and with the help of the probabilistic and heuristic values in ACO, the following algorithm is proposed to calculate trust in the created MANET environment.

1   Get filtered network data
2   For (1 to Iteration) Begin   [where Iteration = total no. of iterations]
3     Calculate the upper bound of the weighted sum of network parameters
4     Determine the node actions matrix (d) from the filtered data and calculate the heuristic value η̃r
5     Initialize data (d, α, β, a, N, E, φ)   [where α = 1, β = 2, a = total no. of ants, N = total no. of nodes, E = requesting node, φ = pheromone evaporation]
6     Loop
7       Randomly position a ants on node E
8       For (1 to N)   [construct a solution (trust path) from source to destination]
9       Begin
10        For (1 to a)
11        Begin
12          For (1 to N−1 neighbor nodes)
13          Begin
14            Pkij = [τs(i,j)]^α [η̃r(i,j)]^β / Σ(n=1..N) [τs]^α [η̃r]^β
15            Update the network intimacy pheromone: τs(i,j) ← (1 − φ) τs(i,j)
16          End
17        End
18      End
19    End

Algorithm 1: ACO algorithm for trust calculation.
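A condensed, runnable reading of Algorithm 1 is sketched below in C++: the network is a small trust-weighted graph, the heuristic of Eq. (2) is assumed to be precomputed per edge, ants pick the next hop with the rule of Eq. (3) (α = 1, β = 2 as in the initialization step), and the intimacy pheromone of a used edge is evaporated with factor (1 − φ) as in step 15. The graph size, the heuristic values, and the reinforcement (deposit) rule are illustrative placeholders, not the values used in the simulation of Sect. 4.

```cpp
#include <cmath>
#include <iostream>
#include <random>
#include <vector>

int main() {
    const int N = 5, ants = 10, iterations = 100;       // ant count and iterations follow Sect. 4
    const double alpha = 1.0, beta = 2.0, phi = 0.1;    // alpha, beta from step 5; phi is a placeholder
    const int source = 0, destination = 4;

    // Illustrative per-edge heuristic eta_r in (0,1] (Eq. (2): weighted parameter
    // sum divided by its upper bound); here filled with placeholder values.
    std::vector<std::vector<double>> eta(N, std::vector<double>(N, 0.5));
    eta[0][1] = 0.9; eta[1][2] = 0.8; eta[2][4] = 0.9;  // a more trustworthy chain of nodes
    std::vector<std::vector<double>> tau(N, std::vector<double>(N, 1.0));   // intimacy pheromone

    std::mt19937 rng(1);
    std::vector<int> bestPath;
    double bestScore = 0.0;

    for (int it = 0; it < iterations; ++it) {
        for (int a = 0; a < ants; ++a) {
            std::vector<int> path{source};
            std::vector<bool> visited(N, false);
            visited[source] = true;
            double score = 1.0;
            int cur = source;
            while (cur != destination && (int)path.size() < N) {
                // Eq. (3): probability proportional to tau^alpha * eta^beta over unvisited nodes
                std::vector<double> w(N, 0.0);
                for (int j = 0; j < N; ++j)
                    if (!visited[j]) w[j] = std::pow(tau[cur][j], alpha) * std::pow(eta[cur][j], beta);
                std::discrete_distribution<int> pick(w.begin(), w.end());
                int nxt = pick(rng);
                tau[cur][nxt] *= (1.0 - phi);            // step 15: evaporate the used edge
                score *= eta[cur][nxt];
                path.push_back(nxt);
                visited[nxt] = true;
                cur = nxt;
            }
            if (cur == destination && score > bestScore) { bestScore = score; bestPath = path; }
        }
        // Illustrative reinforcement of the best path found so far (deposit rule is a placeholder).
        for (int i = 0; i + 1 < (int)bestPath.size(); ++i) tau[bestPath[i]][bestPath[i + 1]] += 0.05;
    }

    std::cout << "most trusted path: ";
    for (int v : bestPath) std::cout << v << ' ';
    std::cout << "(trust score " << bestScore << ")\n";
}
```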
4 Experimental Results

Figure 4 shows the created MANET environment with 9 nodes, whose routing behavior uses the ad hoc on-demand distance vector (AODV) routing protocol, and Fig. 5 shows the training of the ants for minimum trust pathfinding by pheromone updating, where 10 ants were used and the number of iterations was 100.
Fig. 4 MANET simulation with 9 nodes and AODV protocol
Fig. 5 Minimum trust cycle with the length of 99
5 Conclusion

In this research, the transfer of data between the nodes of a MANET by utilizing a reliable path in the given network is achieved. The integrity of the communication between the nodes is calculated by the ant's probabilistic transition rule and the heuristic values obtained from the modified probabilistic trust value calculation equation. In the simulation, the network to which the trust calculation algorithm is applied finds the shortest and optimal trust path consisting of trustworthy nodes.
References 1. Papadimitratos P, Haas Z (2002) Secure routing for mobile ad hoc networks. In Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS 2002) (No. SCS, CONF) 2. Marti S, Giuli TJ, Lai K, Baker M (2000) Mitigating routing misbehavior in mobile ad hoc networks. In Proceedings of the 6th annual international conference on Mobile computing and networking (pp 255–265), ACM 3. Bonabeau E, Marco DDRDF, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems (No. 1). Oxford university press 4. Hsiao YT, Chuang CL, Chien CC (2004) Ant colony optimization for best path planning. In IEEE International Symposium on Communications and Information Technology, ISCIT 2004. (vol 1, pp 109–113). IEEE 5. Kanan HR, Faez K (2008) An improved feature selection method based on ant colony optimization (ACO) evaluated on face recognition system. Appl Math Comput 205(2):716–725 6. Asif M, Baig R (2009) Solving NP-complete problem using ACO algorithm. In 2009 International conference on emerging technologies (pp 13–16). IEEE 7. Dorigo M, Stützle T (2003) The ant colony optimization metaheuristic: algorithms, applications, and advances. In Handbook of metaheuristics (pp 250–285), Springer, Boston, MA. 8. Sanadhya S, Singh S (2015) Trust calculation with ant colony optimization in online social networks. Procedia Comput Sci 54:186–195 9. Wang J, Osagie E, Thulasiraman P, Thulasiram RK (2009) HOPNET: A hybrid ant colony optimization routing algorithm for mobile ad hoc network. Ad Hoc Netw 7(4):690–705 10. Gunes M, Sorges U, Bouazizi I ARA-the ant-colony based routing algorithm for MANETs. In Proceedings international conference on parallel processing workshop (pp. 79–85). IEEE 11. Dorigo M, Birattari M, Blum C, Clerc M, Stützle T, Winfield A (eds) (2008) Ant colony optimization and swarm intelligence. In Proceedings 6th international conference, ANTS 2008, Brussels, Belgium, 22–24 Sept 2008 (vol 5217) Springer 12. Di Caro G, Dorigo M (1998) AntNet: distributed stigmergetic control for communications networks. J Artif Intell Res 9:317–365 13. Di Caro G, Ducatelle F, Gambardella LM (2005) AntHocNet: an adaptive nature-inspired algorithm for routing in mobile ad hoc networks. European Trans Telecomm 16(5):443–455 14. Schoonderwoerd R, Holland OE, Bruten JL, Rothkrantz LJ (1997) Ant-based load balancing in telecommunications networks. Adaptive Behavior 5(2):169–207 15. Luo Z, Wan R, Si X (2012) An improved ACO-based security routing protocol for wireless sensor networks. In 2013 International Conference on Computer Sciences and Applications (pp 90–93). IEEE 16. Roy B, Banik S, Dey P, Sanyal S, Chaki N (2012) Ant colony based routing for mobile ad-hoc networks towards improved quality of services. J Emerg Trends Comput Inf Sci 3(1):10–14
17. Deepalakshmi P, Radhakrishnan DS (2009) Ant colony based QoS routing algorithm for mobile ad hoc networks. Int J Rec Trends Eng 1(1) 18. Asokan R, Natarajan AM, Venkatesh C (2008) Ant based dynamic source routing protocol to support multiple quality of service (QoS) metrics in mobile ad hoc networks. International Journal of Computer Science and Security 2(3):48–56 19. Asghari S, Azadi K (2017) A reliable path between target users and clients in social networks using an inverted ant colony optimization algorithm. Karbala Int J Modern Sci 3(3):143–152
Software Development for the Prototype of the Electrical Impedance Tomography Module in C++ A. A. Katsupeev, G. K. Aleksanyan, N. I. Gorbatenko, R. K. Litvyak, and E. O. Kombarova
Abstract The basic principles and features of the implementation of the electrical impedance tomography (EIT) method in the C++ language are proposed in this research. This software will significantly reduce the hardware time for performing of computational operations and will expand the capabilities of the technical implementation of the EIT method in real technical systems of medical imaging. An algorithm for the operation of the EIT module prototype software in C++ has been developed. The principles of building the software for the EIT module prototype have been developed, which provides the possibility of embedding into other medical equipment. The software interface of the EIT module prototype has been developed. Keywords Electrical impedance tomography · Medical software · Reconstruction · Conduction field · Software principles
1 Introduction

Improving the quality and reliability of medical diagnostic information is one of the key tasks of the modern healthcare system worldwide. In this regard, there is a practical need for the development and creation of tools for obtaining and processing medical imaging information as one of the main components of a significant
examination of the patient. In this regard, it is promising to develop new medical and technical systems that can increase the efficiency of existing devices and provide medical personnel with additional information about the patient's condition in order to systematize and formulate the correct diagnosis and develop adequate and correct treatment tactics. In this regard, the creation of technical means based on obtaining and analyzing the electrical properties of an object is an urgent area of modern medical instrumentation. One of the promising directions in this area is electrical impedance tomography [1–3], which finds new aspects of use in clinical practice every year. The method is understandable and relatively easy to implement in technical devices, but it has several disadvantages for use in real (quasi-real) time. This is due to a number of factors, one of which is the multilevel, resource-intensive calculations and operations for computing, reconstructing, and visualizing the conduction field, which are difficult to implement in hardware at the microprocessor level. Existing application packages that offer various algorithms for implementing the EIT method, such as EIDORS [4] and PyEIT [5], cannot be directly used in informational and measuring EIT devices due to the peculiarities of their implementation of the algorithms. In this regard, this work proposes a new implementation of well-known algorithms in C++ [6] that will significantly reduce the hardware time for performing computational operations and expand the capabilities of the technical implementation of the EIT method in real technical systems of medical imaging.
2 Existing Technologies of Implementation

2.1 PyEIT Framework

PyEIT [5] is a Python-based framework for modeling and visualizing the conduction field using the EIT method. The capabilities of PyEIT include finite element modeling, 2D and 3D visualization, solving the forward and inverse EIT problems, meshing, and image formation for external applications. The mesh module can split the region into triangles (2D) and tetrahedra (3D). PyEIT implements state-of-the-art EIT algorithms that support both static and dynamic reconstruction. PyEIT can use the following algorithms for dynamic reconstruction: the back-projection method, GREIT [7], and NOSER [8]. PyEIT includes demo examples of the operation of the reconstruction algorithms [5]. To visualize the results of EIT (including 3D), PyEIT uses the Matplotlib charting library [5].
2.2 EIDORS
The specialized application package Electrical Impedance and Diffuse Optical Reconstruction Software (EIDORS) [4] is intended for the reconstruction of images from EIT results [1]. The EIDORS software system is built on the MATLAB language and development environment [9]. EIDORS features include finite element modeling, solving the forward EIT problem, static and dynamic reconstruction using various methods, and 2D and 3D visualization [10]. EIDORS includes many reconstruction algorithms, such as the Gauss–Newton algorithm, the back-projection method, the conjugate gradient method, the interior point method, GREIT, and others.
2.3 Justification of the Need for Implementation in the C++ Language
The EIT reconstruction algorithm requires high performance for two reasons. The first is building a finite element mesh and calculating the reconstruction matrix used to compute the conductivity field of the object under study from the measurement data. The second is displaying the change in the conductivity field in quasi-real time during the measurement process. The first factor is not critical when implementing the EIT algorithm in medical devices, since the reconstruction matrix can be generated in advance and reused for different patients. The second factor, however, is important, because the change in the conductivity field of the object under study must be displayed with sufficient speed. Since C++ is a compiled programming language, it meets the stated performance requirements and, as a result, can be used to develop the software for the EIT module. A solution for processing EIT results in the form of a Web portal has also been developed [11], but it is not intended for use in medical equipment. This is because the EIT channel is often used in clinical practice not on its own but integrated into existing medical devices, for example, lung ventilators, whose software is implemented in C++. The general scheme of the developed software is shown in Fig. 1. The software is divided into two modules: information processing and the software interface. Information processing, in turn, is divided into the measurement process and statistical processing of measurement data.
Fig. 1 EIT module prototype software diagram (the EIT module software is divided into information processing, comprising the measurement process and statistical processing, and the software interface)
3 Principles of Software for EIT Module Prototype
3.1 Software Algorithm
The developed software algorithm is shown in Fig. 2. In general, the program algorithm is divided into three directions: carrying out the measurement process, statistical processing of the results, and the generation of a new model of the object under study for calculating the conductivity field. When the program runs in the measurement mode, after setting the measurement parameters and connecting to the device, the potential differences at the measuring electrodes are obtained from the measuring device, presented as the sets Ψ = {φ1, …, φn} and Ψ' = {φ1', …, φn'}, where φi, i = 1, …, n, are the potential differences across the measuring electrodes during the current measurement; φi', i = 1, …, n, are the potential differences across the measuring electrodes during the reference measurement; and n is the number of measuring electrode pairs. The conductivity field of the object under study within the tomographic section is calculated by the formula Ω = H·(Ψ − Ψ'), where H is the pre-generated reconstruction matrix. The reconstruction matrix consists of pre-generated coefficients that are used to calculate the conductivity field values from the measurement vector. Thus, the number of matrix rows is equal to the number of finite elements, and the number of columns is equal to the size of the measurement vector.
Fig. 2 Software algorithm of EIT module prototype (flowchart covering the measurement process, statistical analysis, and new model generation branches)
The vector Ω is the set of conductivity field values at the finite elements of the reconstructed object model: Ω = {σ1, …, σm}, where m is the number of finite elements in the model. Based on the measurement data, ventilation, perfusion, and the ventilation–perfusion ratio are calculated, and the measurement procedure is archived. Another direction of the program's work is the processing and formation of statistical results based on the measurement data; within this direction, a measurement protocol is formed according to the DICOM standard. The third direction is the generation of a new model of the object under study for tomographic measurements. It is used if the standard chest cavity model is not suitable and the model must be adapted to a specific patient.
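As an illustration only (not part of the published module), the reconstruction step Ω = H·(Ψ − Ψ') described above reduces to a dense matrix–vector product over the difference between the current and reference measurement vectors. A minimal C++ sketch, with hypothetical names and with the pre-generated matrix H assumed to be loaded elsewhere, might look as follows:

```cpp
#include <vector>
#include <cstddef>

// Minimal sketch of the difference-imaging reconstruction step.
// H is the pre-generated reconstruction matrix (m rows = finite elements,
// n columns = size of the measurement vector); psi and psiRef hold the
// current and reference potential differences at the measuring electrodes.
std::vector<double> reconstructConductivity(
    const std::vector<std::vector<double>>& H,
    const std::vector<double>& psi,
    const std::vector<double>& psiRef)
{
    const std::size_t m = H.size();      // number of finite elements
    const std::size_t n = psi.size();    // number of electrode pairs
    std::vector<double> omega(m, 0.0);   // conductivity field values

    for (std::size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            acc += H[i][j] * (psi[j] - psiRef[j]);   // H * (Psi - Psi')
        }
        omega[i] = acc;
    }
    return omega;
}
```

Because H can be generated in advance, only this product has to run during the measurement cycle, which is what makes a compiled C++ implementation attractive for quasi-real-time display of the conduction field.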
3.2 The Principles of the EIT Module Prototype Software
The software is divided into the modules shown in Fig. 3. The modules are conventionally divided into three groups: the measurement process, information processing, and the user interface. The "measurement process" group interacts with the measuring device, the "information processing" group processes the measurement data, and the "user interface" group interacts with the user. The principles for implementing the solution in C++ are the following: (1) the main criteria are an uninterrupted measurement process and the speed of processing the information received from the measuring device; (2) based on an analysis of state-of-the-art graphics visualization [12], a combination of GTK and OpenGL [13, 14] is used as the graphics library for displaying the conductivity field, to ensure high-speed output of information (for compatibility with the software used in lung ventilators, it is planned to replace the GTK graphics library with MFC [15]); (3) to minimize the computation time in the measurement data processing module, a compiled programming language is required, which is why the developed solution is implemented in C++. A screenshot of the software is shown in Fig. 4. The main blocks of information displayed on the screen are the reconstructed conductivity field, ventilation graphs, control buttons, and the measurement parameter settings block.
Fig. 3 Principles of the EIT module prototype software (measurement process: EIT measurement process control, archiving and recording of the measurement procedure; information processing: measurement data processing, calculation of ventilation, perfusion, and the ventilation–perfusion ratio by the EIT method, storage of reconstruction and visualization results, generation of reports, DICOM protocol generation; user interface: visualization of indicators characterizing the function and condition of the lungs, signaling, generation and maintenance of patient records)
Fig. 4 Screenshot of the software interface of the EIT module prototype
4 Conclusions
An implementation of the electrical impedance tomography algorithm in C++ is proposed. The developed software will significantly reduce the hardware time required for performing computational operations and will expand the capabilities of the technical implementation of the EIT method in real medical imaging systems. An algorithm for the operation of the EIT module prototype software in C++ has been developed. The principles of building the software for the EIT module prototype have been formulated, providing the possibility of embedding it into other medical equipment, and the software interface of the EIT module prototype has been developed.
Acknowledgements The study was carried out as part of the federal target program "Research and Development in Priority Directions for the Development of the Russian Science and Technology Complex for 2014–2020", with financial support from the Ministry of Science and Higher Education (agreement No. 05.607.21.0305). Unique agreement identifier RFMEFI60719X0305.
References
1. Adler A, Boyle A (2019) Electrical impedance tomography, pp 1–16. https://doi.org/10.1002/047134608x.w1431.pub2
2. Pekker JS, Brazovskii KS, Usov VN (2004) Electrical impedance tomography. NTL, Tomsk, p 192
3. Aleksanyan GK, Denisov PA, Gorbatenko NI, Shcherbakov ID, Al Balushi ISD (2018) Principles and methods of biological objects internal structures identification in multifrequency electrical impedance tomography based on natural-model approach. J Eng Appl Sci 13(23):10028–10036
4. Adler A, Lionheart W (2006) Uses and abuses of EIDORS: an extensible software base for EIT. Physiol Meas 27(5):S25–S42. https://doi.org/10.1088/0967-3334/27/5/S03
5. Liu B, Yang B, Xu C, Xia J, Dai M, Ji Z, You F, Dong X, Shi X, Fu F (2018) pyEIT: a python based framework for electrical impedance tomography. SoftwareX 7:304–308
6. Stroustrup B (1997) The C++ programming language, 3rd edn. ISBN 0-201-88954-4, OCLC 59193992
7. Adler A, Arnold J, Bayford R, Borsic A, Brown B, Dixon P, Faes T, Frerichs I, Gagnon H, Garber Y, Grychtol B, Hahn G, Lionheart W, Malik A, Stocks J, Tizzard A, Weiler N, Wolf G (2008) GREIT: towards a consensus EIT algorithm for lung images. Manchester Institute for Mathematical Sciences, School of Mathematics, The University of Manchester
8. Cheney MD, Isaacson D, Newell JC (2001) Electrical impedance tomography. IEEE Sign Process Mag 18(6)
9. MATLAB Documentation (2013) MathWorks. Retrieved 14 Aug 2013
10. Lionheart WRB, Arridge SR, Schweiger M, Vauhkonen M, Kaipio JP (1999) Electrical impedance and diffuse optical tomography reconstruction software. In: Proceedings of the 1st world congress on industrial process tomography, Buxton, Derbyshire, pp 474–477
11. Aleksanyan G, Katsupeev A, Sulyz A, Pyatnitsin S, Peregorodiev D (2019) Development of the web portal for research support in the area of electrical impedance tomography. Eastern-European J Enterprise Technol 6(2):6–15
12. Chotisarn N, Merino L, Zheng X (2020) A systematic literature review of modern software visualization. J Vis 23:539–558
13. The GTK Project. https://www.gtk.org/
14. OpenGL: The Industry Standard for high performance graphics. https://www.opengl.org/
15. MFC Applications for desktop. https://docs.microsoft.com/ru-ru/cpp/mfc/mfc-desktop-applications?view=vs-2019
Information Communication Enabled Technology for the Welfare of Agriculture and Farmer's Livelihoods Ecosystem in Keonjhar District of Odisha as a Review
Bibhu Santosh Behera, Rahul Dev Behera, Anama Charan Behera, Rudra Ashish Behera, K. S. S. Rakesh, and Prarthana Mohanty
Abstract Odisha is an agrarian state, and extension education plays a vital role in agricultural growth and the promotion of farmers' livelihoods. For technology dissemination and knowledge generation, ICT plays a vital role in agrarian society in empowering the farming fraternity. During 2013–2017, a pilot study was carried out by a group of researchers with the help of OUAT, KVK, and the agriculture department, leading to this research review. An ex-post facto design and multistage sampling were adopted, along with a structured schedule for data collection. Around 170 samples were taken for a comprehensive study in Keonjhar District of Odisha. The objective of this research is to examine the role of ICT in supporting agriculture and farmers' livelihoods in the study area.
Keywords ICT · Agriculture · Livelihood
B. S. Behera (B) · R. D. Behera · P. Mohanty
OUAT, Bhubaneswar, Odisha, India
e-mail: [email protected]
B. S. Behera
International Researcher, LIUTEBM University, Lusaka, Zambia
A. C. Behera · R. A. Behera
Faculty Green College, Odisha, India
K. S. S. Rakesh
LIUTEBM University, Lusaka, Zambia
1 Introduction
1.1 Brief Introduction
Information and communication technology (ICT) is a system through which solutions can be found for a wide range of problems and challenges. ICT is also a useful instrument that can be applied in every development sector to achieve maximum effectiveness in any assigned task. Agriculture is the fuel of the country: it plays a very dynamic role in society and forms the backbone of the country's economy. As per research findings, approximately 70% of the population in India makes its livelihood from agriculture (Kurukshetra, June 2015).
1.2 Theoretical Orientation
Agriculture needs a constant diffusion of technology to meet worldwide food security, environmental sustainability, and poverty reduction goals. In India, as per the SECC (2011) data released in 2015, out of 24.39 crore total households, 17.91 crore are rural, many of them with uncertain and unsecured earnings. Because of globalization, urbanization, and the demand for higher-value items, the worldwide situation in cultivation continues to change, and agriculture is quite hard for individuals living below the poverty line. To moderate worldwide requirements and reform agricultural production, the green revolution was positioned in the mid-1960s. Its main goals were to develop the area under agriculture and to use chemical fertilizers, new technologies, and pesticides to enhance output. In India, actions under the eminent researcher M. S. Swaminathan and his panel of researchers of the Indian Council for Agricultural Research (ICAR) during the green revolution enhanced the productivity of cereals (mostly wheat and rice, along with other primary cereals such as maize) to an assured amount, mainly in western Uttar Pradesh, Punjab, and Haryana. In the opening phase, the green revolution was started through the spread of new technology to better irrigated and endowed regions. Afterward, numerous other revolutions came about in India to improve other categories of production, such as the white revolution (milk), silver fiber revolution (cotton), round revolution (potato), red revolution (meat and buffalo), blue revolution (fish), evergreen revolution (productivity without loss), yellow revolution (oilseeds), golden revolution (overall horticulture), silver revolution (poultry), and so on. In spite of these revolutions, numerous farmers in India still follow conventional methods of farming. Every year many farmers face enormous losses in farming, and agricultural production has constantly declined over the last several years. Thus, agricultural advancement requires new information and knowledge to be delivered to the farmers' doorstep.
Advanced training includes different methodologies of training to encourage awareness; consequently, the agriculture sector requires advanced training to renew farm output. Information and communication technology (ICT) helps to give information to farmers at their doorstep. It offers knowledge associated with fertilizer consumption, market prices, pest management, weather/climate information, online land registration, and so on. Government offices at each level are linked into a system to offer knowledge to farmers. Krisak Sathi, development officers, agriculture experts, village agriculture workers (VAW), and stakeholders are educating farmers to adopt new techniques in agriculture. In India, teledensity has quickly improved: according to the government's 2015 report (Kurukshetra, February 2016), rural teledensity has doubled. Rural farmers access agricultural data through voice calls and short message service (SMS) on mobile phones. The central government, working together with the state governments, has initiated several ICT centers equipped with telephones, broadband connections, the Internet, PCs, and development officers (for example, the IFFCO-ISRO GIS project, VISTANET, the Gyandoot project, cyber dhaba, e-choupal, AMARKET, and so on), as well as information-based knowledge offered via mobile-based Web portals, Kisan Call Centres, and the farmers' Web portals mkisan (www.mkisan.gov.in), www.farmer.gov.in, and mksp.gov.in. These portals provide information-based knowledge and advisory services by subject professionals. The Department of Agriculture and Cooperation has created over 80 portals, mobile-based applications, and Web sites in cooperation with the National Informatics Centre; RKVY, DACNET, the National Food Security Mission (NFSM), SEEDNET, the National Horticulture Mission (NHM), Acreage, Productivity and Yields (APY), ATMA, and INTRADAC are the essential portals. The greatest proportion of citizens create their livelihood through farming, so this study gives specific importance to understanding the different ICT projects related to farming growth. Special ICT projects in Odisha, designed by the government along with private organizations, aim to reach rural farmers.
ICT and Agricultural Development
ICTs in farming can enable better access to data and support or drive information distribution. ICTs help the dissemination, retrieval, storage, management, and creation of any related information, knowledge, and data which might already have been adapted and processed (Bachelor 2002; Chapman and Slaymaker 2002; Rao 2007; Heeks 2002). The analysis is therefore concerned with the reach of ICTs and how their application can create growth, especially in the field of cultivation. In India, ICT in farming is a promising area concentrating on rural and agricultural development. ICT can supply the exact information needed by farmers to help enhance agricultural productivity. Thus, public–private partnerships and private and government initiative programs have been created for the growth of agriculture. But in India, ICT is still in a growing phase and developing as an emerging trend, and its gains have yet to reach every farmer. In spite of technological improvement, numerous farmers, particularly sharecroppers and marginal farmers, are not receiving appropriate services and information because of poor financial conditions and social constraints.
Other reasons are language barriers, illiteracy, and refusal to accept new technology. The manner in which ICT projects access, apply, assess, and deliver content can enhance the possibility of ICT utilization by farmers and can therefore become an essential element of a project. To satisfy the information-seeking appetite of farmers, ICT may act as a panacea for them: the problems of a farmer may be analyzed using ICT tools with relevance to their local conditions. Local content is described as content intended for a particular neighborhood audience, as defined by language, culture, or geographic location, or as content which is economically, politically, socially, and culturally relevant to a certain people. The optimum benefit of ICT should reach the doorstep of the farming fraternity and rural artisans.
2 Scope and Significance of the Study
The Government of India emphasizes the "Digital India" program. The Government of India began to pursue information-led development in the 1980s under Prime Minister Rajiv Gandhi. The present work concentrates on the utilization of ICTs to access farming information in the Patna block. The analysis focuses on how development officers and stakeholders use ICTs, with access to and utilization of these tools as the emphasis of the research. It examines how data suppliers use ICTs and distribute information about cultivation to the rural farmers of Patna, and also how Patna's farmers in remote rural areas use this information and the assistance of ICTs in agricultural development. Similarly, in Odisha, OCAC under the Department of IT, Government of Odisha, has introduced an e-literacy/digital literacy program to empower farmers and rural artisans. The Digital Green Project is also supporting Odisha through the Odisha Livelihoods Mission Program for strengthening the livelihoods of farmers in Odisha.
2.1 Review of Literature
Among the several studies carried out on subjects associated with this work, the relevant ones concern ICT-enabled communication and ICT's role in rural development in areas such as agriculture, education, health and sanitation, and the economy. They are summarized below.
Mohanty and Bohra [1] highlight ICT's role across the world through the appearance of different types of equipment. ICT has played a major role that has not merely made access throughout the world easier but has also assisted the combination of ideas, process synergies in working techniques and places, a democratic approach and participation in learning, and the
improving of organizational transparency and alertness; the application of e-governance has opened new vistas of facilitation strategy and the management system. Discussing ways of improving communication in rural sectors, Dasgupta et al. [2] bring out the idea, with reference to the press, of promoting farming growth, explaining communication models for development, technology transfer, and research on how to create and transfer acceptable technologies, and the requirements for recreating agricultural communication models. Communication technologies assist rural development, while information technology supports agricultural growth and influences dissemination and change in agriculture. In their research, Schware and Bhatnagar [3] highlight the successful utilization of information and communication technologies (ICT) in rural growth. Their work starts with an initial section that traces the history of ICT use in rural India, observes some of the issues that have influenced the implementation of rural development programs, and demonstrates how ICT applications might help overcome them in the coming years. Narula [4] explains the dynamics of dysfunction and the growth of advancement, and also discusses how these two elements are impeded or facilitated by development communication models operating in a particular society at a specific point in time, covering development communication difficulties, technological challenges, strategies, and the reach of advancement. Singhal and Rogers [5] state that new information communication changes the speed of human interaction throughout the globe, turning it into a "Global Village" in which the whole world is linked together. India still has quite a distance to go to attain an information society, though vast numbers of employees work in information-related sectors to offer data from the ground level. Information plays a significant role in the development process. New technology and its numerous applications, i.e., TV, telecommunication, radio, computers, cable, and the Internet, are quickly leading India toward an information-based society. "Informatisation" is the process by which communication-based technologies are utilized as a means of promoting socio-economic growth. The Indian communication revolution includes the proliferation of the telephone, software, and the Internet, the development of venture capital and entrepreneurship, helpful government policies, and networking between Indian businesses in Silicon Valley and their India-based counterparts. Hanson and Narula [6] state that the genuine creation of an information society is in itself subject to the common difficulties of international transfer, postal services, and telephony through the computer; many nations are responding to one another because of pressure from the information society. They examine the present situation of infrastructure development policy, social systems, and developing countries, along with models of information technologies and society, and exactly how society accepts technologies in the
social system and the lifestyle requirements of different societies from a global perspective. Behera et al. [7], in a study on how information communication technology endorses retail marketing in the Indian agriculture sector, describe an era of ICT-mediated, market-led agricultural extension; in this information era one cannot manage without information. According to the R.T.I. Act 2005, every individual deserves appropriate knowledge about agriculture; consequently, by giving due regard to knowledge, one must generate an information revolution under the delightful mantra "Soochana se Samadhan." India is the second major producer of merchandise such as vegetables and fruits. Among the primary problems needing investigation is how to decrease post-harvest damage, which is currently quite considerable; this will require a plan for an environment-friendly, cost-effective, and efficient storage system. Additionally, there is a need for value addition in farming to capitalize on the cultivation output. That paper emphasizes the significance of ICT in enhancing the marketing activities of the retail business in farming within the Indian economy, and the great opportunity for applying the same in Indian agricultural business through several success stories and models explaining the significance of ICT in agriculture retail marketing. Behera et al. [7], in a study on e-governance-mediated farming for sustainable life in India, state that the era of ICT includes ICT-mediated farming extension in urban and rural areas to distribute information through the Expert System (ES), Decision Support System (DSS), and Management Information System (MIS), supported by a knowledge management system and user interface. E-agriculture thus denotes an emerging area concentrated on rural and agricultural development via enhanced information and communication processes. The major goal is to give an interface to consumers and farmers and also to facilitate the linking of cultivation, production, and marketing cooperatives. In India, the Gyandoot Project, IT kiosks, the Information Village Project of MSSRF (M S Swaminathan Research Foundation), ITC's e-choupal, EID Parry's agriline, the Bhoomi Project, Kisan Call Centers (KCC), the I-Kisan Project of the Nagarjuna group of companies, Village Knowledge Centers, and so on are the latest developments in e-governance-mediated agriculture, reaching the lives of end-users and farmers in an alternative manner through Common Service Centers at the grassroots level, e-kiosks, and knowledge management portals.
3 Specific Objectives
The specific objectives of the work are to illustrate and analyze the application of ICTs in agriculture in the Patna block, to discover the role of ICTs in agricultural development, and to assess people's awareness of ICT applications in agricultural development.
3.1 Hypothesis Setting
The hypotheses of the work are:
H1—ICTs play an emphatic role in the upliftment of agricultural growth.
H2—In rural areas, ICT applications are still unreachable.
H3—Mobile communication enhances the strength of ICT applications for agricultural growth.
4 Research Methodology (Materials and Methods)
The present work was accomplished by gathering both primary and secondary data.
Secondary Data Collection: The secondary data were gathered with the help of various sources such as portals, Web sites, materials, and other existing records, including Acts and policies of the Odisha Government, national and state government agriculture portals, and different projects and schemes on ICT under the Government of Odisha. Additional related details were collected from different publications, journals, the Internet, research papers, official records, magazines, and news articles, along with further existing sources of information.
Sample Design: To study the role of ICT in a location like the Patna block of Keonjhar District, the sample was designed according to what could be investigated within the fixed time.
Population of the Study: The research population is 29,755, composed of farm laborers and farmers (including private agencies) who are exclusively connected with farming.
Sample Area: In Odisha's agriculture, the Patna block plays an important part in maize production. Approximately 80% of its people are exclusively associated with agriculture and agro-based industries that provide livelihoods to the population of the block. Two gram panchayats were chosen out of 24 panchayats for data gathering according to agricultural activities, one with the highest cultivation activity and one with the lowest. The chosen sample locations in the Patna block include 16 different villages, 8 villages from each of the 2 panchayats, with a sample of both small and large farmers. Thus, an ICT setup is needed where there is a lack of data. The study is descriptive research.
Sample Size: The sample size is 170, consisting of 4 stakeholders (private agents of seed and fertilizer companies), 4 government officials, 160 farmers, and 2 ICT experts.
Sample Selection: Simple random sampling methods were used for the sample. With the help of stratified random sampling techniques, ten farmers were chosen from every village; each panchayat has eight revenue villages.
Primary Data Collection: Primary data were gathered via two techniques, observation and a survey. From the farmers of the selected villages, data were collected through a schedule, with ten farmers taken from every village; the stakeholders were selected from the catalog of stakeholders. The schedule contained both open-ended and closed-ended questions. While gathering primary data, a non-participatory observation technique was also used.
Tools and Techniques: A schedule was used as the tool of the survey method.
Data Analysis and Interpretation: Data were examined using both quantitative and qualitative procedures. The data collected from each panchayat were averaged, and a comparative analysis was carried out to understand the variation. SPSS software was used to evaluate the data. Farmers show varied media behavior that is not limited to a particular medium: on average 21% consume only electronic media (radio and TV), 7% folk media, 51% both folk and electronic media, 5.6% print and electronic media, 3.15% print, folk, and electronic media, 4.4% electronic media and the Internet, 0.65% print only, and 6.25% every category. Over 10% of farmers have used the Internet. Therefore, the impact of electronic media (radio and TV) and folk media is greater than that of other media. Media consumption is split roughly evenly, since farmers spend their remaining time on leisure: on average 41.55% spend time primarily on entertainment, 3.15% only on news programs, 37.75% on both entertainment and news, 0.65% on entertainment and other content, around 2% on agriculture- and news-related information, and 15.1% on every category of programs. Farmers who deliberately seek agriculture-based data through news channels and the public information system are data oriented; as a consequence, 15.1 + 1.9 = 17% of farmers notably search for data according to their requirements with the help of media. India's only fully agriculture-based channel is DD Kisan, which broadcasts programs mainly based on agriculture, but on average only 37.79% of individuals know about the channel, while 62.21% of farmers have no clue about its significance or existence. Approximately 55.5% of farmers obtain weather/climate-related information through observation and by asking knowledgeable ICT users or friends; 5% of farmers learn with the help of media, i.e., weather forecasts or news on the radio, in newspapers, or on TV; 1.25% with the help of ICT applications alone; 27.55% through different sources such as relatives, friends, and family members who are ICT users, tools, and observation of media; and 11.25% access it via ICT applications together with media. Thus, on average 1.25 + 11.25 = 12.50% of farmers use ICT applications for weather/climate information.
Information is transferred from one person to many via interpersonal communication. Usually, 20% of farmers obtain knowledge about programs, announcements, distributions, government policy, and so on from friends; 5.65% from agriculture officials (village-level workers, Krisak Sathi, and other government officials); 0.65% only from media; 2.55% directly via ICT applications; 53.8% with the assistance of agriculture officials and friends; 3.8% from media and agriculture officials; 5% via media, agriculture officials, and friends; 5.65% with the help of media and friends; and a further 3.15% directly via ICT applications. Consequently, the share of farmers using ICT applications for data about government policies, programs, and services is 2.55 + 3.15 = 5.7%. The analysis also examined how farmers learn the latest farming strategies for enhanced production, fertilizer usage, pest management, treatment of different crop diseases, and market price results. Typically, 7.5% learn such information from their own knowledge and observation of friends, 2.5% with the assistance of agriculture officials, 3.15% with the assistance of ICT applications, 6.3% from stakeholders, and 80.65% via different sources such as observation, friends, stakeholders, and government officials. Consequently, the total contribution of ICT applications in helping farmers acquire the latest information and innovative methods is 3.15 + 6.3 = 9.45%. Of the mobile phone users, approximately 36% use the phone only for communication with relatives and friends, 23.29% use it for the double purpose of entertainment (playing games, watching videos, listening to songs, etc.) and communication, approximately 15% for communication and also to learn agriculture-based information (agriculture extension purposes), and 25.85% for all of these categories. Therefore, 25.85 + 15 = 40.85, or approximately 41%, obtain knowledge about farming with the help of mobile communication. Agricultural extension officers offer data directly to farmers about new farming methods and techniques: in the first phase, extension officers obtain information with the help of ICT applications, after which the data spread to the whole agricultural community. As an outcome, 21.3% of farmers enhanced their profits with the assistance of extension workers and ICT applications. Figure 1 depicts the relation between information sources and the adoption and non-adoption rate of each source. Most of the respondents belong to the farming community, so their exposure to various sources corresponds to their literacy and knowledge levels. In addition, most of the respondents were women, which is why, owing to their shyness, they prefer puppet shows and folk media as trustworthy sources of information; owing to illiteracy, wall painting is also one of the best sources, and NGOs are also among the trusted sources as per the findings. Figure 2 depicts the preference of information consumers toward the various sources of information along with the adoption rate.
Local traditional media is the most highly adopted by people, while Internet media has the lowest adoption. The trendlines reflect the adoption and non-adoption levels across both the information consumers and the information media.
Fig. 1 Information sources versus adoption and non-adoption rate (bar chart with linear trendlines; source: own data)
Fig. 2 Adoption rate of information consumers with various information sources (categories include print media, electronic media, Internet media, local and traditional media, rural artisans, youth, women, and farmers; source: own data)
5 Result and Conclusion
Media consumption is very high among the farmers. Almost 99% of them use some medium, whether regular or electronic media, folk media, or new media, and most farmers engage with several media types. Around 11% of the farmers use the Internet. On average, 17% of farmers use media for agriculture-based information, while the rest use it for other purposes such as news, entertainment, and other kinds of content. Among the media users, about 38% are familiar with the DD Kisan channel; of those familiar with it, 63.63% watch the channel frequently, and 97.5% declared that all its information is appropriate for farming extension. Weather information plays a critical role in farming. In Patna, an average of 12.50% of farmers use ICT applications to find out about weather/climate information; others receive such data via media, ICTs, and their own observation. In Patna, farmers essentially learn agricultural strategies from their ancestors and friends, and ICTs assist them to a certain degree in understanding cultivation procedures. Information mostly moves from agriculture experts or extension offices to the farmers; in Patna this share is 53.8%, while 5.7% obtain it via ICT applications. The latest trends, such as the use of modern gadgets, improved and advanced agricultural techniques, hi-tech concepts, pesticide management, and other methods, normally diffuse from one person to a few and then onward to many more. In Patna, 9.45% of farmers use every kind of data with the assistance of ICT applications together with every other source. In the starting phase, stakeholders, experts, and extension officers pick up new trends via ICT tools; in the next phase, they instruct farmers through field visits, workshops, and demonstrations. The mobile phone functions as an ICT resource in the Patna block: an average of 87.87% of farmers use a cell phone, of whom 22.3% use a smartphone and 77.7% own a regular phone, while tablet users are zero. Cell phones are mostly used for communication among relatives or friends (36%), while 23.29% use them for entertainment and communication (listening to music, watching videos, playing games), and approximately 41% use them to collect data about farming. Of the cell phone users, 65% hardly ever read the SMS received in their inbox, while 35% read the SMS seriously. On average, 38% of farmers receive SMS from several portals, registered Web sites, or government offices concerning agricultural requirements. In Patna, the number of Internet users is rising gradually, although the absence of proper broadband connectivity and the weak strength of the mobile network create obstacles to Internet use. In spite of this, 58.35% of farmers use the Internet, of whom 22.3% are smartphone users. Among the 58% of Internet users, 62.50% of farmers browse farming-related information on the Internet; however, just 3% of them frequently visit and know about farmer-related portals and similar Web sites.
Basically, 40% of farmers know about Kisan Call Centres (KCC), while the others have no clue about them. Of that 40%, 23.49% maintain communication with the KCC; of these, just 65.8% have registered their mobile numbers with the KCC, and among those registered, 75% receive messages from the KCC regularly. Under the Government of Odisha's free mobile distribution program for Kisan Credit cardholders, no one in the Patna block has availed of a mobile phone. PCs and laptops are seldom used in this area: just 1.75% of farmers, or their children, use laptops and PCs, and they tell their elders about agriculture-related queries. In agriculture extension, 8.67% of farmers are aware of the benefits of ICT applications. Through the suggestions of extension workers and data collected with the help of ICT applications, 21.3% of farmers have enhanced their production. Mandi facilities are very poor: only paddy is bought through the neighborhood mandi, and fresh foods are delivered to the neighboring mandis of other states. Approximately 8.7% of the farmers obtain gains with the assistance of mobile phones or ICT applications. The mobile phone essentially enables farmers to collect information about selling prices; approximately 19% of farmers learn selling price results via the mobile phone. ICT applications such as cell phones help farmers transform old perceptions. With the help of the mobile phone, farmers can communicate with Kisan Call Centres, raise market-related queries with extension officers, share information with friends, and also browse the Web. Above all, this helps them to alter the traditional pattern of farming.
Acknowledgement The authors acknowledge the help and support of OUAT, Green College, and LIUTEBM University in publishing this paper successfully, and are also thankful to Dr. B. P. Mohapatra and Dr. K. S. S. Rakesh for their encouragement.
References
1. Mohanty L, Bohra N (2006) ICT strategies for schools: a guide for school administrators. Sage Publications, New Delhi
2. Dasgupta D, Choudhary S, Mukhopadhyay SD (2007) Development communication in rural sector. Abhijeet Publications, Delhi
3. Bhatnagar S, Schware R (eds) (2006) Information and communication technology in development. Sage Publications, New Delhi
4. Narula U (2011) Development communication: theory and practice. Har-Anand Publications, New Delhi
5. Singhal A, Rogers EM (2011) India's communication revolution: from bullock carts to cyber marts. Sage Publications, New Delhi
6. Hanson J, Narula U (2012) New communication technologies in developing countries. Routledge, New York
7. Behera BS et al (2015) Procedia Computer Science. Elsevier
CHAIN: A Naive Approach of Data Analysis to Enhance Market Power
Priya Matta, Sparsh Ahuja, Vishisth Basnet, and Bhasker Pant
Abstract Data analytics is one of the most important fields in the computing world. Every emerging paradigm ultimately results in the generation of data. The rapidly growing attention from different industries, markets, and even academia creates the need for a deep study of big data analytics. The proposed work deals with unstructured and raw data and converts it into structured, consistent data by applying various modern data warehouse methods. These methods include logistics and supply chain management (LSCM), the customer relationship model (CRM), and business intelligence for market analysis, helping an organization make fruitful decisions. The data analysis processes performed in data warehouses follow the ETL process, which describes the steps of gathering and transforming the data and finally placing it at the destination. The proposed CHAIN method is the core of this research work, a naive approach that assists in improving market power. In this research work, a market analysis of an IT hardware sector is performed, dealing with the sales of peripherals and telecommunication devices in the market. This is achieved through continuous communication between clients and retailers to generate meaningful and relevant data, which can then be analyzed to produce various required reports.
Keywords Data analytics · Extraction · Transformation · Logistics and supply chain management (LSCM) · Customer relationship model (CRM)
P. Matta (B) · S. Ahuja · V. Basnet · B. Pant
Computer Science and Engineering, Graphic Era University, Dehradun, India
e-mail: [email protected]
1 Introduction
Big data creates big opportunities with large data sets and helps organizations realize their benefits. Targeted solutions for data analysis provide new approaches to achieve impressive results. Marketers collect large amounts of data daily from a variety of customers to paint a complete picture of each customer's behavior. In traditional CRM, analytics comprises all programming that analyzes customer data to streamline business decisions and monetization. Similarly, this research work introduces another analysis technique which, by examining present results, will lead to better decision making in businesses and also benefit customers in the coming years. Marketers can feed these new, real-time insights back into the organization to influence product development. This research work provides a new methodology for data analysis using the CHAIN process. It begins by defining the ETL process and describing its key characteristics, and provides an overview of how the analysis process is to be carried out. An examination of past and present market results shows decreasing interaction between customers and shopkeepers; to enhance the relationship between them, the implications for the future are also discussed. Key application areas for the process are introduced, and sample applications from each area are described. Two case studies demonstrate past results and how they can now yield more powerful outcomes. The paper concludes with a summary of the current state and of how businesses can become more powerful in the future. The paper is composed of nine sections. After the introduction, Sect. 2 defines data analytics and its importance. Section 3 describes the extraction, transformation, and load (ETL) cycle. Section 4 discusses the motivation behind our research. The proposed technology is elaborated in Sect. 5, covering the definition of CHAIN, why CHAIN is needed, how CHAIN can be achieved, and the proposed model of the methodology. In Sect. 6, market analysis through a case study is discussed. The result analysis is presented in Sect. 7, a comparison of the proposed approach with existing methods is provided in Sect. 8, and the conclusion of the work is presented in Sect. 9.
2 Data Analytics and Its Importance
According to Gandomi [1], "Big data are worthless in a vacuum. Its potential value is unlocked only when leveraged to drive decision making." Decision making can be accomplished by different institutes and organizations by turning vast amounts of data into precise and meaningful information. According to him, different industries and organizations define and express big data in different ways. Data analysis is the examination and extraction of a data set, either primary or secondary, to organize and mold it into helpful information for sound decision making. It also helps to reduce the complexity of managerial decisions and to enhance effectiveness, marketing policies, and end-user serviceability so as to boost business
performance. From input to the generation of output, the process can be understood with the help of Fig. 1.
Fig. 1 Data analytics process (input/concept → transformation/analysis → output/computation)
Importance of Data Analytics
According to Kempler [2], "to capitalize on these opportunities, it is essential to develop a data analytics framework which defines the scope of the scientific, technical, and methodological components that contribute to advancing science research." Data analytics also helps in reducing banking risk by identifying fraudulent customers from historic data, and assists in presenting appropriate advertisements based on historical selling and purchasing data. It is also exploited by various security agencies to enhance security policies by gathering data from the different sensors employed, and it assists in eliminating replicated information from a data set.
Limitations of Data Analytics
The limitations of data analytics include the fact that, in surveys, respondents may not provide accurate information. Missing values and the lack of a substantial portion of the data can also limit its usability, and data may vary in quality and format when collected from different sources.
3 Extraction, Transform, and Load (ETL)
(A) Extraction
According to Akshay S., "In the extract process, data is extracted from the source system and is made accessible for further processing. The main objective of the extract step is to extract the required data from the source systems utilizing the least possible resources" [3]. Extraction is the act of pulling records from a range of homogeneous and heterogeneous sources so that they can be transferred to the data warehouse. This process is carried out in such a way that it does not affect the performance or the response time of the source system.
Fig. 2 ETL process
(B) Transformation According to Akshay [3], “The most complex part of the ETL process is the transformation phase. At this point, all the required data is exported from the possible sources but there is a great chance that data might still look different from the destination schema of the data warehouse” [3]. Transformation is an act of transforming the extracted data into fruitful information which is not exactly similar to the structure of the data in the warehouse. In this process, the naïve values are sorted, merged, and even derived by the application of various validation rules. (C) Loading Once the data is extracted and transformed, it is now finally ready for the last stage that is the load process in which the data is collected from one or more sources into their final system. However, there are many facts like the process of data loading and its influence on the storage of data in the data warehouse. The way the data is being loaded may have its impact on the server’s speed of processing as well as the analysis process. The other major consideration during the data loading is to prevent the database from getting debilitate. According to Talib and Ramazan, “This step makes sure data is converted into the targeted data structure of data warehouse rather than source data structures. Moreover, various schema and instance joining and aggregation functions are performed at this phase” [4]. The process of ETL can easily be explained with the help of Fig. 2.
4 Motivation Behind
In a local survey of micro and medium enterprises, the owners of these enterprises said that they have been unable to establish proper communication with their customers since various platforms came into existence. The variety of platforms fragments customer choice, so the customer does not consistently purchase goods from a single platform. Sellers said that if customers help them by giving valuable and suitable suggestions regarding the products, they will provide the best services for them. Thus, based on this survey, the CHAIN
process is introduced, which helps the consumer and the seller interact and enhances the relationship between them.
5 Proposed Methodology
The analysis process can be carried out after the data is gathered and entered on the basis of dynamic requirements generated by various end-users or customers. "During early requirement analysis, the requirement engineer identifies the domain stakeholders and models them as social actors, who depend on one another for goals to be fulfilled, tasks to be performed, and resources to be furnished" [5]. After the requirements have been considered, the work proceeds to the next process, data collection. Big data analytics in logistics and supply chain management (LSCM) has received increasing attention because of its complexity and the prominent role of LSCM in improving overall business performance. Thus, while acquiring data from multiple sets and performing analyses, it was found that the CHAIN method can be a rising factor in mining techniques for the customer relationship model (CRM). "CRM requires the firm to know and understand its markets and customers. This involves detailed customer intelligence in order to select the most profitable customers and identify those no longer worth targeting" [6]. "In the emerging markets of Asia, dynamic capability played a crucial role in gaining competitive CRM performance across all three industries" [7].
(A) CHAIN
"Another popular approach to customer preference quantification is the discrete choice analysis (DCA) technique, which includes the probit model and logit models (multinomial, mixed, nested, etc.) to name but a few" [8]. The data requirement is the first and foremost part of data processing; the analysis process can be carried out after data is gathered and entered based on the dynamic requirements generated by various end-users or customers. There are many ways to collect data from various inputs, but given the situation of the market, the introduction of the term "CHAIN" can be a rising factor for local market businesses. CHAIN stands for "Customer's Help, Advice and Information Networks." It is a process of communication between customers and sellers in which the customer helps business persons gain knowledge by providing suitable suggestions to enhance the business and boost the services between customer and seller. In this process, the shopkeepers raise questions with their customers in the form of sentiment surveys or in any other possible way of interacting with them (Fig. 3).
756
P. Matta et al.
Fig. 3 Data analytics process
the local market and are interacting less with local vendors. This is because of changes in lifestyle and a growing attraction toward platforms that are not social. "OLC establishes an organizational culture where the existing mental models regarding the collection, retention, and utilization of customer knowledge are gradually replaced with new ones to better exploit market opportunities which translate customer needs into value-added offerings" [10]. According to the survey, shopkeepers have suffered a lot from e-commerce, not only because of cheaper product prices but also because of the lack of proper interaction between customers and shopkeepers, which creates a distance between them. Through the "CHAIN" process, it becomes possible to challenge and target other platforms. Healthy communication and an information network between customer and seller will also help enhance business services and provide products at an effective cost.

(C) Performance of CHAIN

"Today is the era of loyalty such as customer loyalty, employee loyalty, management loyalty, and loyalty to the principles, ideals, and beliefs. Several studies have shown that satisfaction is not the key to ultimate success and profitability" [11]. With the help of e-services (digital) or even non-digital channels, shopkeepers can easily approach customers by providing a weekly suggestion assessment, which helps the shopkeeper gain fruitful insights, enhance the business, and maintain relations with customers, thereby forming a customer network. Examples of e-services include providing sentiment assessments for customers through their contact numbers, creating an effective Web site for their respective stores and
by providing a chatbot system, offering effective services and a proper review system for each product, and also an ERP system that attracts customers by giving them any kind of information regarding purchased goods. These factors will drive the rise of the CHAIN process. "These findings prove the findings given by Brown and Gulycz (2001) and Chen (2008), who recommended that satisfied customers are more inclined toward retaining a relationship with existing companies and positive repurchase intentions in the future" [12].

(D) Proposed Model of Methodology

Figure 4 depicts the relation between customer and seller. The seller provides a suitable interface containing reviews and sentiments through which customers can communicate; customers then advise sellers about their products and share appropriate reviews, and finally the vendor comes up with the best outcome and shares it with them accordingly. Thus, a CHAIN is formed that enhances both the business management system and customer relationship management. The shopkeeper will have a Web site and application software with a well-designed support system through which they can interact with end-users. Each customer will have to register and create a user id and password to log in and communicate with shopkeepers; customers can choose whether their information is public or private. This feature keeps customer data safe, confidential, and free from duplication. If any comments or information, whether positive or negative, are posted, only the admin or the shopkeeper has the right to ignore or block false information.

Fig. 4 Proposed model of CHAIN process
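As a purely illustrative aid, the sketch below shows how weekly customer feedback gathered through such an interface could be summarized for a seller; the field names and sample records are hypothetical assumptions, since the paper does not prescribe a specific implementation.

# Hypothetical sketch of summarizing weekly CHAIN feedback for a seller.
# Field names and sample records are illustrative only.
import pandas as pd

feedback = pd.DataFrame([
    {"customer_id": 1, "product": "laptop",     "rating": 4, "suggestion": "offer doorstep delivery"},
    {"customer_id": 2, "product": "laptop",     "rating": 2, "suggestion": "price is higher than online"},
    {"customer_id": 3, "product": "smartphone", "rating": 5, "suggestion": "good after-sales support"},
])

# Aggregate the week's responses per product so the seller can act on them.
summary = feedback.groupby("product").agg(
    responses=("customer_id", "count"),
    avg_rating=("rating", "mean"),
)
print(summary)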
6 Market Analysis Through Case Study

According to Sgier [13], "Big data analytics in Logistics and Supply Chain Management (LSCM) has received increasing attention because of its complexity and the prominent role of LSCM in improving the overall business performance." According to Shafiei [14], "Reliable field data and well-documented field trials are the foundation of developing statistical screening tools." According to Marjani [15], "Moreover, big data analytics aims to immediately extract knowledgeable information using data mining techniques that help in making predictions, identifying recent trends, finding hidden information, and making decisions." According to Nastic [16], "detecting patterns in large amounts of historic data requires analytics techniques that depend on cloud storage and processing capabilities." After going through these different viewpoints of researchers and practitioners, the following study was carried out. The study can be performed on any kind of computing device, for example, desktop PCs, laptops, smartphones, and others of the same cadre. The investigation involves a thorough market analysis of IT products of different companies sold in Indian markets in the second half of the year 2019 (July–December). The data was collected from the local market by reaching out to all the IT product exclusive stores about trending sales. The analysis shows a decline in local market sales and an increase in e-commerce sales of 60–70%, and, as a whole, a threefold decrease in overall IT hardware sales in India. The data collected from smartphone sellers shows a huge downfall in sales of mobile phones and their accessories in small local markets. The analysis indicates that reduced interaction between customers and shopkeepers is the major factor in the downfall of IT sales in the local market; the lack of services provided by the shopkeepers also contributes to the loss of their market. A survey was conducted among customers regarding their purchasing preferences. Some of the questions are as follows:

Criteria 1: If you want to buy any digital or IT product, will you purchase it from the local market or the online market (Flipkart, Amazon, or any other online platform)?

Criteria 2: Which of these two platforms do you feel is more secure and safe?

Criteria 3: If the local market gives you the facility not only of item purchase but also of information sharing and accepting customer requests and ideas, then which platform would you choose?

The results of the survey are depicted in graphical form. Analysing Fig. 5, it is observed that half of the customers are attracted to the online platform, which is the basic reason for the downfall of the local market. If shopkeepers start communicating with customers and provide them the best services, the local market will regain customers; this may enhance the economy of the local market, especially for IT products, and there will again be a healthy circulation of money in the market. Analysing Fig. 6, it is observed that a smaller number of customers are attracted to online platforms, as they feel insecure regarding the quality of the product.
Fig. 5 Pie chart on the basis of criteria 1
Fig. 6 Pie chart on the basis of criteria 2
Fig. 7 Pie chart on the basis of criteria 3
Figure 7 shows that 60–65% of customers would come back to the local market if the best services were provided by the sellers; compared with the previous results, this corresponds to an increase in market sales of 10–15%, which would boost the local market economy.
7 Result Analysis

"Based on the theoretical and the reality of what happened, there is still a gap of research on the influence of customer satisfaction on customer trust and loyalty. The key problem in this research is questioning the variable customer satisfaction and
trust influence customer loyalty and the role of customer trust as a mediating variable in the BRI Kendari of Southeast Sulawesi province" [17]. Customers now have a variety of options to purchase products because of rapid globalization and growing competition. They can easily compare products or even switch platforms, which is the reason behind the downfall of retailers. Thus, to collect information from consumers, the "CHAIN" process, which can be digital or non-digital, will help many shopkeepers and customers maintain a long-term relationship, leading to safe, secure, and good product services in local markets. The "CHAIN" process applies not only to the IT sector; the methodology is also a requirement for the remaining sectors. "Through customer collaboration, organizations learn, meet customer requirements better, and improve performance (Prahalad and Ramaswamy 2004). Customers offer a wide base of skills, sophistication, and interests and represent an often-untapped source of knowledge" [18].
8 Comparison with Existing Techniques

The issues discussed above can be resolved through the CHAIN process, which can serve as an operational process to enhance business growth and make businesses more profitable. CHAIN is quite similar to customer service and support (CSS), but CHAIN is a more advanced approach in which the customer's help and business-related information are used to achieve better business outcomes, while quick, convenient, and consistent services are provided to customers who interact with shopkeepers directly through e-services, without any intermediary.
9 Conclusion

Data analysis and mining are among the most important and most promising aspects of information technology for integrating businesses and supporting vigorous decision making. In this paper, the process of data analysis is explained by employing various techniques, with an outline of data analysis and its challenges. Furthermore, an overview of the ETL process is given, which can be applied in both the public and private sectors for forecasting and for making crucial decisions in finance, marketing, sales, etc. The introduction of the CHAIN process can become a highly effective technique in the field of analysis and mining. It will boost the economy of the retail market in the coming years. Many people will interact with retailers digitally, which will also help the growth of the digital sector and make it safer and more secure. People will also learn to be more social.
The outcome of data analytics is one of the major requirements of industry nowadays. All industries, entrepreneurs, and other organizations have recognized the significance of data analytics for improving their throughput, enhancing their profits, and increasing their efficiency. One can easily understand how critical efficient data analytics is for the proper growth of any kind of industry, business, or organization. It also adds speed and accuracy to business decisions and maximizes conversion rates. Data analytics is also a great career opportunity with a promising future in a thriving era.
References

1. Gandomi A, Haider M (2015) Beyond the hype: big data concepts, methods, and analytics. Int J Inf Manage 35(2):137–144
2. Steve K, Mathews T (2017) Earth science data analytics: definitions, techniques and skills. Data Sci J 16
3. Lohiya AS et al (2017) Optimize ETL for banking DDS: data refinement using ETL process for banking detail data store (DDS). Imperial J Interdiscip Res (IJIR) 3:1839
4. Ramzan T et al (2016) A multi-agent framework for data extraction, transformation and loading in data warehouse. Int J Adv Comput Sci Appl 7(11):351–354
5. Giorgini P, Rizzi S, Garzetti M (2008) GRAnD: a goal-oriented approach to requirement analysis in data warehouses. Decis Support Syst 45(1):4–21
6. Rygielski C, Wang J-C, Yen DC (2002) Data mining techniques for customer relationship management. Technol Soc 24(4):483–502
7. Darshan D, Sahu S, Sinha PK (2007) Role of dynamic capability and information technology in customer relationship management: a study of Indian companies. Vikalpa 32(4):45–62
8. Conrad T, Kim H (2011) Predicting emerging product design trend by mining publicly available customer review data. In: DS 68-6: proceedings of the 18th international conference on engineering design (ICED 11), impacting society through engineering design, vol 6, design information and knowledge, Lyngby/Copenhagen, Denmark
9. Kashwan KR, Velu CM (2013) Customer segmentation using clustering and data mining techniques. Int J Comput Theory Eng 5(6):856
10. Ali Ekber A et al (2014) Bridging organizational learning capability and firm performance through customer relationship management. Procedia Soc Behav Sci 150:531–540
11. Ali K et al (2013) Impact of brand identity on customer loyalty and word of mouth communications, considering mediating role of customer satisfaction and brand commitment (case study: customers of Mellat Bank in Kermanshah). Int J Acad Res Econ Manage Sci 2(4)
12. Ishfaq A et al (2010) A mediation of customer satisfaction relationship between service quality and repurchase intentions for the telecom sector in Pakistan: a case study of university students. African J Bus Manage 4(16):3457
13. Sgier L (2017) Discourse analysis. Qual Res Psychol 3(2):77–101
14. Ali S et al (2018) Data analytics techniques for performance prediction of steamflooding in naturally fractured carbonate reservoirs. Energies 11(2):292
15. Mohsen M et al (2017) Big IoT data analytics: architecture, opportunities, and open research challenges. IEEE Access 5:5247–5261
16. Stefan N et al (2017) A serverless real-time data analytics platform for edge computing. IEEE Internet Comput 21(4):64–71
17. Madjid R (2013) Customer trust as relationship mediation between customer satisfaction and loyalty at Bank Rakyat Indonesia (BRI) Southeast Sulawesi. Int J Eng Sci 2(5):48–60
18. Vera B, Lievens A (2008) Managing innovation through customer coproduced knowledge in electronic services: an exploratory study. J Acad Market Sci 36(1):138–151
Behavioural Scoring Based on Social Activity and Financial Analytics

Anmol Gupta, Sanidhya Pandey, Harsh Krishna, Subham Pramanik, and P. Gouthaman
Abstract Credit scoring is probably the oldest application of analytics. Over the past few years, a large number of sophisticated classification techniques have been developed to improve the statistical performance of credit scoring models. Rather than concentrating on credit scoring alone, this project relies on alternative data sources to support and incorporate other factors that identify the characteristics of an individual. This work identifies unique factors from a person's online presence and financial record to provide a unique score that signifies the individual's behaviour. The proposal demonstrates how a person's online activity on social media sites, such as Facebook and Twitter, indicates the character and behaviour of the person. Some factors included for social scoring are the types of posts shared, comments added, posts posted, and pages followed and liked. These data are plotted against time to obtain a social score. A financial scoring model determines the person's financial fitness and the likelihood of engaging in criminal activities due to financial distress. Combining social scoring and financial scoring at specific weights provides a behavioural score. This score classifies the subjects and helps determine good citizens among the rest. Subjects with higher behavioural scores show the promise and practice of a good citizen. This can be used to engage and provide added incentives to good citizens in order to promote good citizenship.
A. Gupta · S. Pandey · H. Krishna · S. Pramanik · P. Gouthaman (B) Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India e-mail: [email protected] A. Gupta e-mail: [email protected] S. Pandey e-mail: [email protected] H. Krishna e-mail: [email protected] S. Pramanik e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_56
Keywords Social activity · Financial analytics · Credit scoring · Behavioural score · Sentiment analysis
1 Introduction

This work aims to develop a scientific and methodological approach to a scoring mechanism that assesses and scores human behaviour based on online social media activity and financial activity. The scores will help reward people with good behaviour and hence promote good citizenship. The proposed work will provide a social score to individuals based on various factors. Personal factors include sincerity, honesty and integrity, which are key determinants of an individual's personality. Financial factors include a person's spending of money and timely payment of loans and EMIs. The financial aspects are mainly based on the banking details of the person's account statements and loan history. The main innovation of this work is that each citizen is given a score measuring their sincerity, honesty and integrity, and this score will then be a major determinant in the life of an individual, for example, whether they are able to get credit, rent a flat, or buy a ticket, or whether they have preferential access to hospitals, various government services and universities. The social credit score will consider a variety of factors that go beyond just online activity, behaviour, financial decisions and spending; considering these factors together will reveal much more. The base of this work comes from the credit scoring model, where companies are rated and scored on the basis of their creditworthiness. Credit scoring evolved over many stages, from individual people having their own credit score, determined by the financial condition of the place where they live relative to global finance values, and it evolved further as the Chinese credit scoring system came to the forefront. In the 1960s, the emphasis was laid on the growth of an 'information infrastructure in finance', with constant progressive growth towards the use of statistical methods to avoid risk and to calculate creditworthiness. Moving towards the end of the twentieth century, scoring grew into considering individual factors like a person's honesty, integrity and sincerity and giving them individual scores. Before this, blacklisting and whitelisting were used, mainly as a means to punish people; the primary aim of this work, in contrast, is to reward and not to punish. Advancing into the twenty-first century, the proposition of a 'scored society' came to the fore. This work has been inspired by a similar approach, where a scoring system scores individuals on their social media behaviour and their financial propositions. Some incidents related to this occurred in 2018 and 2019. In 2018, some people were denied rail and airline tickets in China as a result of fraud and poor behaviour. In 2019, restrictions were imposed on people, leading to a change in people's behaviour. Technical integration is needed to implement this on a global level. Systems already exist to give either a financial score or a social score; this work has evolved from the thought of merging both and giving people a suitable model that will promote traditional moral values. This work also identifies unique factors through a person's online
presence and financial record to provide a score that signifies the behaviour of an individual. Furthermore, the work demonstrates how a person's online activity on social media sites such as Facebook and Twitter indicates the nature and behaviour of the person. Some factors that are included for the social scoring are the types of posts shared, comments added, posts posted, and pages followed and liked. These data are plotted against time to obtain a social score. There is also a financial scoring model that determines the person's financial fitness and the likelihood of engaging in criminal activities due to financial distress. Combining both social scoring and financial scoring at a specific weight provides the behavioural score. This score classifies the subjects and helps determine good citizens among the rest. This can be used to engage and provide added incentives to good citizens in order to promote good citizenship and behaviour. The problem statement is to focus on the development of a scientific and methodological approach to determine a scoring mechanism to assess and score human behaviour based on online social media activity and financial activity. Credit scoring provides banks with a basis to determine whether a borrower can be trusted to be granted a loan; it helps filter out trusted from non-trusted sources. Big data sources are utilized to enhance statistical and economic models so as to improve their performance. Using this data provides more information on the consumer, which helps companies make better decisions. There has always been a system to punish people for their mistakes, and it does not necessarily set the right example for others. Instead, this work builds a system that rewards people for their good behaviour and financial expenditure. It accomplishes two things: firstly, it inspires people to be well behaved on social media, and secondly, it helps people spend their money wisely. After the required data is available, the system cleanses and analyses it on the basis of various attributes, and then sentiment analysis is performed to obtain the sentiment of the data. Once this is done for all the data attributes, a normalized behavioural score is generated. The system then moves on to financial scoring, and the user is required to provide data such as monthly income, debt ratio, number of times defaulted, number of dependents, number of open lines of credit, number of secured and unsecured lines of income, age and so on.
2 Literature Survey

The first area of study concerns the financial capacity of the state and statistical approaches to assess it. [1] illustrates the development of specialized approaches to determine the amount and use of economic scope and to find current patterns of variation for development. This research is necessary because of the demand to ascertain the best national security level and because of the ambiguity of methods to identify its modules, which make it necessary to select the direction for the development of the state. A single specialized method to assess the state's financial ability is incomplete, making it really hard to prepare steps for improving
the management of its components. This approach calls for comparing the values of creation and use of the statistical data provided by the authorities, such as 'the ratio of the deficit/surplus of the state's budget to Gross Domestic Product (GDP)', etc. [2]. This work proposed a scientific method for an inclusive analysis of the financial potential of the state. An individual's online behaviour can be precisely explained by personality traits [3]. Therefore, understanding the personality of an individual helps in forecasting behaviours and choices. Several studies have shown that a user's answers to a personality examination can help in predicting their behaviour. This can be helpful in many different spheres of life, from punctuality and job performance to drug use [4]. The study was mainly focused on the Facebook profiles of users, as many people use Facebook to share their life events through their profile features. For the purpose of the study, Facebook data of 32,500 US citizens who had been active on Facebook for the past 24 months or more were acquired. The data included the user's friends list, posted events, regular status updates, images and groups. For analysis, the features were arranged according to the 'number of months' since the user had joined Facebook, and a simple linear regression method was used for this purpose. To correlate the personality of the user with the Facebook profile features, 'Spearman's rank correlation' was used to calculate the correlations of the features. All these were tested using t-distribution tests, significant at the < 0.01 level. Openness: people who are eager for experience and are free-spirited tend to like more Facebook pages, have a higher number of status updates and are in more chat groups. Conscientious users either join few groups or no groups at all, and their use of likes is less frequent; the research reveals that the average number of likes made by the most conscientious users is higher than 40% of the likes of the most spontaneous people. Extroversion means the individual will interact more with others on Facebook; these people like to share which events they are attending and what is happening in their life, and like to come in contact with more and more users. Neuroticism and Facebook likes have a positive correlation, showing that emotional users use the 'like' feature more: it has been found that 75% of normal users like fewer than 150 pages or posts, but emotional people use the like feature 250 times or more [5]. The study is limited to a small or moderate population who volunteered to report their behaviours, and the research paper is limited to US-based Facebook users only. Another issue is that Facebook users can be very selective in liking pages, which can differ from their personality. The credit scoring system has been the main support system for bankers, lenders, NBFCs, etc., to perform statistical analysis of the creditworthiness of the borrower [6]. Based on this credit score, lenders decide whether to grant credit to the borrower or not [7]. The article proposes a unique improvement in the traditional credit scoring model by the inclusion of mobile phone data and social network analysis results. Also, the article shows that adding call data records (CDR) adds value to the model in terms of profit.
Thus, a unique dataset is made by combining call data records with account information on the credit and debit activity of the borrower to build the credit scorecard [8]. Call data networks, which are made from
call data records, are used to differentiate the positive credit influencers from the defaulters. Advanced social network analytics techniques are used to achieve the desired output. The results of the research show that the inclusion of call data records increases the efficiency of the traditional credit scoring model. The positive aspect is that traditional credit scoring models can be improved by adding call logs, and it becomes easier to forecast which features are more essential for the prediction, both in terms of profit and statistical performance. On the other side, the biggest limitation is the dataset: it generalizes the whole scorecard, and the lender cannot differentiate whether the credit is for micro-loans or mortgages. The credit scoring model used in the research also does not include the behaviour feature, which is an important parameter for analysing creditworthiness; behavioural data can be obtained from telecom companies or social media platforms. Subjective material well-being (S-MWB) covers a wide range of concepts like financial satisfaction, stress due to financial insecurity, how the government is handling the economy, etc. It can be self-reported well-being gathered by a questionnaire [9]. It may consist of a range of economic issues such as how the government is handling the economy of the country, the prices of essentials, pay and fringe benefits from the job, and livelihood. It can also focus on particular dimensions of material life such as contentment with the family's financial situation as well as one's own financial situation, whether the income is adequate or not, whether the standard of living is up to expectation or not, etc. The advantage here is that the study reveals several interesting facts that may have policy significance: understanding the precursors of subjective material well-being can help policy-makers design schemes to improve people's sense of material well-being as well as their quality of life. The drawback is that people compare themselves with people of their own country and not with those of other countries, and the subjective well-being of a person is very much dependent on the economy of the country. Credit rating is a process to estimate the ability of an individual or organization to fulfil their financial commitments, based on previous dealings [10]. Through such a process, financial institutions classify borrowers for lending decisions by evaluating their financial and/or non-financial performance. This paper focuses on using social media data to determine an organization's credibility. This is done because many times a case may arise where the financial and non-financial assessments done on the organization are not accurate or cannot be trusted. Moreover, there may be a case where credit analysers provide false data so that a loan can be obtained easily. In these cases, using social media data can be very fruitful. It also considers financial measures, which have been implemented in our project as well. Therefore, using multiple criteria for the credit rating approach can be very useful for accurate information. A multiple criteria approach helps identify the loan taker's credibility to a large extent, since there are various factors to distinguish. This approach not only tracks their financial and non-financial assets but also their social media data, which can reveal important things about the company's mindset and behaviour. The benefit is the integration of social media data into credit rating.
Analysts are provided with more interpretation opportunities. The negative side is that the credit ratings
tend to decrease when social media data is considered, and gathering this much data might require a lot of permissions to be granted. The next work develops models to identify the scores of consumers with or without social network data and how the scores change as a result [11]. A consumer might be able to change their social media behaviour to attain a higher score, and this article deals with how much a consumer can move away from their normal score by doing so. A credit score is required by consumers so that they can apply for a loan and gain the confidence of the lender, who can extend credit to consumers based on the score. Therefore, a high credit score is a must for all consumers. Once consumers realize that the score is based on social network data, people will try to be better citizens. Given that consumers use credit for a variety of undertakings that affect social and financial mobility, like purchasing a house, starting a business, or obtaining an education, credit scores have a substantial impact on access to opportunities and hence on social inequality among citizens. Until recently, credit scores were based on debt level, credit history, and on-time payments as summarized by FICO. But now, a large number of firms depend on network-based data to assess consumer creditworthiness. The positive aspect is the development of models to assess the change in credit score and the usage of network-based measures rather than the consumer's own data, whereas the negative part is that developing these models might be a tedious task, and there is a possibility of the network-provided data being fabricated. A credit scoring system is presented in another article for ethical banking, micro-finance institutions, or certain credit cooperatives [12]. It deals with the evaluation of the social and financial aspects of the borrower. Social aspects include employment, health, community impact and education. The financial aspects are mainly based on the banking details of the company, its account statements, loan history and repayment of debts on time. Based on these financial as well as social aspects, this paper helps to figure out which companies are actually trustworthy for borrowing or lending money, and hence is fruitful for the bank with which the people are associated. On the one hand, it provides a credit scoring system for ethical banking systems and identifies responsible leaders; on the other side, there are security issues, as consumers' banking details are sensitive information. Another paper tries to show that compulsive buying and ill-being perception act as controls on credit and debit card use and debts [13]. They both act as a repulsive force for each other: a person with compulsive buying tendencies will spend more and more money, which can lead to debt, whereas a person with a perception of ill-being may think of future financial problems, knows that unnecessary buying may lead to debt, and will tend to spend money judiciously. Several studies prove the theory that compulsive buying encourages debt, while ill-being perception discourages debt [14]. But today, materialistic happiness is a dominant factor for people; hence, people tend to spend on materialistic goods.
There are multiple hypotheses which the research paper assumed: the urge to attain materialistic goals positively impacts compulsive buying, individuals with compulsive buying tendencies will overuse their credit cards, compulsive buying leads to debt, responsible use of credit cards has a negative effect on debt, ill-being perception leads to responsible use of credit cards, and ill-being perception discourages credit
card debt. The benefit is that the article is not gender biased, and it clearly shows how a materialistic individual can easily fall into financial debt irrespective of gender. The drawback is that the research is unable to analyse an individual's behaviour at different time periods, and the credit analysis models lack robustness.
3 Proposed Methodology

The work focuses on developing a logical and methodological approach for a scoring mechanism that assesses and scores human behaviour based on online social media activity and financial activity. In short, its principal development, once completely executed, is that every user will be given a score estimating their truthfulness, genuineness and integrity, and this score will then be a significant determinant for their lives, for example, whether they are able to get credit, lease a house, or purchase a ticket, or whether they are given preferred access to medical clinics, colleges and government services. This score takes a wide scope of individual elements into account. It also resembles, yet goes farther than, a range of frameworks that are proposed to increase the prominence of reputation in transactions, on online platforms and in the 'sharing economy'. Consequently, the focus here is on rating frameworks concerning distinct people. The social aspects attempt to evaluate the loan's effect on Millennium Development Goals, for instance, work, training, environment, well-being or community impact. The social credit rating model combines the bank's expertise and should also be coherent with its significance. Scoring based on financial aspects alone may put the institution at risk of letting a socially bad person get loans and other financial benefits; a socially bad person may tend to be a defaulter or use financial benefits for unethical purposes. Therefore, keeping this view in mind, a methodology has been proposed which scores a person based on both social and financial aspects. The proposed system comprises four major components, namely the user, third-party companies/government, the social media data pool and financial/bank data. The user is required to register with our system and connect a social media account of their choice. Once the user's social media account is connected with the system, the user is required to provide an 'access token' which the system utilizes to access the required data. After the required data is available, the system cleans it, analyses it based on various attributes, and then performs sentiment analysis to develop the sentiment of the data. Once this is done for all the data attributes, a normalized behavioural score is generated. The system then moves on to financial scoring, and the user is required to provide data like monthly income, debt ratio, number of times defaulted, number of dependents, number of open lines of credit, number of secured and unsecured lines of income, age, etc. The system makes use of machine learning models for both behavioural and financial scoring. To perform behavioural scoring, the system requests an external API, and it uses a local model for financial scoring. To serve this ML model as a REST service, Flask is adopted. Furthermore, the system uses
a WSGI-compliant application server along with the NGINX Web server. The trained model is deployed with Flask; the model can be saved and loaded to make new predictions on data provided by clients/users. For behavioural scoring, sentiment analysis is executed on the user's social data. Sentiment analysis classifies the data based on its sentiment and provides a positive, negative, or neutral label and a confidence value, which is used to generate a score. For financial analysis, the above-mentioned attributes are taken into consideration, and a local model is built using a random forest classifier algorithm to generate the score accordingly. The input dataset consists of social data obtained from the user's Facebook and other social media accounts using various external APIs like the Facebook Graph API, which is the leading method for applications to analyse and compare user data. Financial data is provided by the user at the time the score is generated. Once the user data is available, important information is retrieved to score the user, and unnecessary information is removed. The chosen parameters and features depend on the availability of data and are subject to reliability checks. The MonkeyLearn classifier is utilized to perform sentiment analysis and obtain the sentiment of the user. Score aggregation and normalization involve combining the behavioural score and the financial score; various techniques like the weight of evidence (WOE) and information value (IV) are also applied. This makes the system more reliable and efficient. Additional ranking based on time dependency is also performed. The final scores are intimated to the user along with the scoring benchmark
and reference. The behavioural and the financial scores together determine the final score allotted to the user.

Fig. 1 Architecture of the proposed system
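To make the deployment step concrete, the following is a minimal sketch of how the saved model could be exposed as a Flask REST endpoint; the route name, input field names and model file name are assumptions for illustration, not the authors' actual code, and in production the app would sit behind a WSGI server and NGINX as described above.

# Minimal Flask sketch for serving the saved scoring model. The endpoint name,
# input field names and model file name are illustrative assumptions.
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)
model = joblib.load("financial_model.pkl")   # model previously dumped with joblib

FEATURES = ["age", "monthly_income", "debt_ratio",
            "num_dependents", "num_open_credit_lines"]   # assumed input fields

@app.route("/score", methods=["POST"])
def score():
    payload = request.get_json()
    row = [[payload[name] for name in FEATURES]]     # order must match training
    p_default = model.predict_proba(row)[0][1]       # probability of the default class
    return jsonify({"default_probability": float(p_default),
                    "financial_score": 1.0 - float(p_default)})

if __name__ == "__main__":
    # In production this would run behind a WSGI server (e.g. gunicorn) and NGINX.
    app.run()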
4 Empirical Analysis

4.1 Behavioural Scoring

The objective of this module is to classify and provide a score to a person based on their social media activity. Behavioural scoring involves the collection of user data from various social accounts, the analysis of the user data and the final generation of the score. Data is obtained from Facebook and Instagram using the Graph API. This is done by generating access tokens with the required permissions. Parameters such as posts, quotes, likes and feed data are inspected. The behavioural scoring module uses the user's social data to perform sentiment analysis. The analysis is performed on all the above parameters. The system uses an external API, MonkeyLearn, to compute the sentiment of each parameter; this provides the system with a sentiment value and a confidence. The system computes the weight of evidence (WOE) and allots a rank weighted by time for parameters such as posts and likes, because these are influenced by time and dynamic appearance. The Graph API provides various functionalities for applications to read and write the Facebook social graph. The API's structure is made up of nodes, edges and fields. Nodes are singular objects like a user, a picture or a group. Edges are the connections between the nodes; in simple words, the link between a group of objects and a single object is an edge. Fields provide information regarding an object, like general information about a person. So nodes are used to fetch metadata about a particular object, which in our system are individual users; the connections between nodes (edges) are used to fetch groups of entities on an individual object; and fields are used to retrieve an individual user's general information, which is then used as scoring parameters to generate a score for the individual user. The Graph API is HTTP-based, which makes it compatible with any language that has an HTTP library. This allows the Graph API to be used directly with the proposed system once the access tokens are provided by the user. Also, field parameters can be included for the nodes (individual users) and describe which categories or areas should be sent back with the response. For instance, checking the node reference shows that one of the fields that can be fetched when accessing an admin's entity is the name field, which is the name of the admin. Nodes are singular objects, each with a distinct ID, so information about a node is obtained by directly querying its ID. Regarding the MonkeyLearn API for sentiment analysis, an external API is applied to perform the sentiment analysis. It assists in categorizing and finding useful metadata from raw texts like e-mails, online chats, and other media resources like Web pages, online documents, tweets and more. The content can also be categorized into formal groups or bins like emotion or subject, and it is possible to extract any specific
data such as entities or keywords. MonkeyLearn requires requests to be validated by sending an API key with each request in order to grant access to the API. Parameters refer to the attributes that the model takes in order to obtain efficient and concise results. A lot of data about a person can be obtained just by looking at their social feed, such as personal information including name, date of birth, places lived, interests, etc., but a lot of this cannot practically be used to generate a score. Four parameters are taken into consideration, namely posts, feed, likes and quotes, which are used for scoring. Posts refer to the updates posted by the user on their wall; they can be textual or pictorial. The feed is the continually refreshing list of stories on a user's viewing page. It consists of status updates, photos, videos, links, app activity and likes from people, pages and groups that they follow on Facebook. Likes are the list of content liked by the user; this may include posts, the activity of friends, and people and groups they follow. Quotes are posted by people and can be found on a Facebook profile page when scrolled down to the very bottom. Here, users are able to add their favourite quotes for all viewers, and these are visible (by default) on their public profile. The model chooses the parameters with the most information and ignores the rest. Quotes posted by the person are not very time-dependent, and the feed is dynamic; hence, sentiment analysis is directly applied to them. Likes and posts show the current mood of the person, so both time and senti-value (obtained after sentiment analysis) are used for the likes and posts parameters. The Graph API acquires data from the user. The data includes the pre-decided parameters: quotes, posts, likes and feed. This is done using access tokens. An access token is an opaque string that identifies a user, application, or page and can be utilized by the application to make API calls. When an individual connects to an application using Facebook login and confirms the request for permissions, the application gets an access token that gives temporary, secure access to Facebook APIs. Access tokens can be obtained by means of a few techniques. The token incorporates information about when it will expire and which application produced it. Because of privacy checks, most API calls to Facebook need to include an access token. There are various kinds of access tokens to support different use cases, such as the user access token, client access token, app access token and page access token. The user access token is utilized in this proposed work. This kind of access token is required whenever the application calls an API to read, modify or write a particular individual's Facebook data on their behalf. User access tokens are generally acquired by means of a login dialog and require an individual to permit the application to obtain one. The users are responsible for generating the access token. This is very simple and can easily be done using the Graph API Explorer. The users are able to connect their Facebook account with the Graph API, approve the necessary permissions to access data and generate the user access token (Fig. 2).

Fig. 2 Generation of user access token

Now the user data is available, and sentiment analysis is performed on it. But first, the data is cleaned: possible noise is removed and columns are checked for NaN values, which are removed.
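As an illustration of this data-collection step, the sketch below pulls a user's recent posts with a Graph API user access token and sends them to a MonkeyLearn sentiment classifier; the Graph API version, access token, API key and model ID are placeholders, and the exact response handling may differ from the authors' implementation.

# Illustrative sketch: fetch a user's posts via the Graph API and classify their
# sentiment with the MonkeyLearn classify endpoint. The API version, access
# token, API key and model ID below are placeholders, not values from the paper.
import requests

ACCESS_TOKEN = "<user-access-token>"        # generated via the Graph API Explorer
ML_API_KEY = "<monkeylearn-api-key>"
ML_MODEL_ID = "<sentiment-model-id>"

# 1. Read the user's posts (the 'posts' edge of the 'me' node).
graph_resp = requests.get(
    "https://graph.facebook.com/v12.0/me/posts",
    params={"fields": "message,created_time", "access_token": ACCESS_TOKEN},
)
posts = [p["message"] for p in graph_resp.json().get("data", []) if "message" in p]

# 2. Send the texts to MonkeyLearn; the endpoint expects a JSON body with a
#    'data' property holding the list of texts to classify.
ml_resp = requests.post(
    f"https://api.monkeylearn.com/v3/classifiers/{ML_MODEL_ID}/classify/",
    headers={"Authorization": f"Token {ML_API_KEY}"},
    json={"data": posts},
)

# 3. Each result carries a tag (Positive/Negative/Neutral) and a confidence.
for text, result in zip(posts, ml_resp.json()):
    tag = result["classifications"][0]
    print(text[:40], tag["tag_name"], tag["confidence"])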
Once this is done, the data is ready for sentiment analysis, which is also known as opinion mining and emotion AI. Sentiment analysis uses natural language processing (NLP), text analysis and biometrics to systematically
identify, extract, quantify and study affective states and subjective information. An essential task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level, that is, whether the opinion expressed in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Next, the API key and the model ID are used to call the MonkeyLearn API. Each record is iterated over, and the data is classified as negative or positive; an attribute called 'confidence' is also obtained. These two attributes ('senti-value', 'confidence'), together with 'ID', are added to the CSV file obtained in the second step as new attribute columns with their values. These steps are involved in calling the MonkeyLearn API, which is done through a POST request. The endpoint expects a JSON body: an object with a data property holding the list of data (texts) that need to be classified. If the API call is successful, the response consists of a list of all the data items with their result, that is, negative, positive, or neutral, together with a confidence value. Next is identifying the weight of evidence and assigning it to the parameters. The weight of evidence (WoE) provides the functionality to automatically bin the values of continuous and categorical predictor variables into individual bins and to assign each individual bin a distinct weight of evidence value. As different users will have different parameters, it is used to allocate the score. WoE provides a weight based on the priority and usefulness of the parameters; this weight is assigned to the parameters, and the parameters with the highest weights are chosen for sentiment analysis. WoE can be used to compute
the 'strength' of a grouping in order to separate positive (good) and negative (bad, defaulting) cases. It can be written as the ratio of the distribution of positives to the distribution of negatives, where the distribution refers to the proportion of positives or negatives in a distinct bin out of the total number of positives and negatives. Mathematically, the 'weight of evidence (WOE)' value for a set of observations can be computed as:
WOE = [ln (Distr Goods / Distr Bads)] ∗ 100
(1)
The value of WoE will be zero if the ratio of the distribution of positives to the distribution of negatives is equal to one. If the distribution of negatives (bads) in a bin is larger than the distribution of positives (goods), the ratio will be less than one and the WoE will be a negative number; if the number of positives is greater than the number of negatives in a bin, the WoE value will be a positive (>0) number. From all the above-extracted features, the features carrying the most information are selected as scoring parameters.
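To make Eq. (1) concrete, the following is a small pandas sketch that bins one parameter and computes a WoE value per bin; the column names, number of bins and synthetic data are assumptions used only for demonstration.

# Illustrative WoE computation per bin, following Eq. (1).
# 'target' = 1 for a "good" observation, 0 for a "bad" one; bins are assumed.
import numpy as np
import pandas as pd

def weight_of_evidence(df, feature, target, bins=5):
    df = df.copy()
    df["bin"] = pd.qcut(df[feature], q=bins, duplicates="drop")
    grouped = df.groupby("bin")[target].agg(goods="sum", total="count")
    grouped["bads"] = grouped["total"] - grouped["goods"]
    dist_goods = grouped["goods"] / grouped["goods"].sum()
    dist_bads = grouped["bads"] / grouped["bads"].sum()
    grouped["woe"] = np.log(dist_goods / dist_bads) * 100   # Eq. (1)
    return grouped[["goods", "bads", "woe"]]

# Example with synthetic data:
rng = np.random.default_rng(0)
demo = pd.DataFrame({"likes_per_week": rng.poisson(20, 1000),
                     "target": rng.integers(0, 2, 1000)})
print(weight_of_evidence(demo, "likes_per_week", "target"))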
4.2 Financial Scoring

The objective of this module is to run an analysis on a person's financial history and generate a score based on their financial activity. To do this, a model is created, and financial data from various online sources is used to train it. Various attributes and values are taken into consideration, like the user's income, number of loans, debt ratio, number of family members, etc. The financial details provided by the users are the values passed into the model to generate a score. A random forest classifier is utilized to develop the model. Credit plays a very important role in any economy and is always required by companies and individuals so that markets can work effectively. Hence, it is very important to devise sophisticated methods to understand whether an individual can be provided credit by forecasting the probability of default. A financial score, or a credit score, popularly known in India as the CIBIL score, is already used by many companies and financial institutions to decide whether a loan should be accorded or not. Different institutions take different attributes and factors into consideration to generate a score. The proposed model focuses on empirical data provided by the user, mainly on factors that assist third-party actors and the government in understanding whether a person is likely to experience financial problems in the coming two years. In addition, a dataset from Kaggle is used to train and test the model. This dataset contains information on around 300,000 users. This allows the parties to make an informed decision regarding the reliability of a user and hence make sound financial decisions. A random forest classifier is used to classify the given data into bins or classes by using a large number of decision trees. It uses bagging and feature randomness when building every tree in order to create an uncorrelated forest of trees whose prediction by committee is more precise than that of any individual tree. To make correct decisions using random forest, features are necessary that can be used to
get at least some insights and forecasting power. It also becomes really important that the decision trees, as well as the forecasts made by them, are uncorrelated or at least have a very low degree of similarity. While the algorithm itself, via feature randomness, tries to keep the correlations low for us, the features selected and the final parameters decided will ultimately impact the correlations as well. There are two main reasons for utilizing a random forest. The predict_proba function of the random forest classifier can be used directly on the output to get a probability value in the range from zero to one. Another reason is that it is extremely easy to turn the output of a random forest classifier into a simpler binary classification problem, which further eases computation. Regarding the dataset description, the attributes which are taken into consideration are as follows:
• Age: the age in years of the borrower.
• Debt ratio: the total costs incurred by the borrower in a month, such as living costs, payment of monthly EMIs or any other debt payments, divided by their gross monthly income.
• Monthly income: the gross monthly income of the borrower.
• Number of dependents: the total number of dependents in the family, including parents, children, spouse, etc.
• Total number of unsecured lines of income: this may include personal loans, credit borrowed from friends or family, credit card balances, etc.
• Total number of secured lines of income: secured lines of income refer to real estate income, business or service income.
• Defaulted in the first 30 days: the number of times the debtor failed to pay in the first thirty days.
• Defaulted between 30 and 59 days: the number of times the debtor failed to pay between 30 and 59 days.
• Defaulted between 60 and 89 days: the number of times the debtor failed to pay between 60 and 89 days.
• Defaulted after 90 days: the number of times the debtor failed to pay after 90 days.
• Total number of loans: the total number of loans taken by the borrower.
For model development, the first step is detecting outliers. Outliers in statistics can be thought of as data points that differ greatly from the remaining part of the data. Outliers are abnormalities in the data, but it is imperative to understand their nature: it is advisable to drop them only if it is clear that they were incorrectly entered or not properly measured, so that removing them would not change the result. To detect the outliers, the interquartile range (IQR) method is applied. This is a set of mathematical formulae applied to retrieve the set of outlier data. The interquartile range (IQR) is the middle half, a measure of statistical dispersion equal to the difference between the seventy-fifth and twenty-fifth percentiles:
IQR = Q3 − Q1
(2)
In other words, the IQR is the result of subtracting the lower quartile from the third quartile, which is also known as the upper quartile. These quartiles can be observed by plotting them on a box plot. It is a measure of dispersion, like the standard deviation or variance, yet it is considerably more robust against outliers. The indexes of the outliers are collected, and the corresponding entries are removed from the dataset. Dataset cleansing is the process of removing data which is unfit for the training process; this may include NaN values present in the dataset. A series of Python functions are utilized to perform this, the essential ones being functions such as qcut and get_dummies. Qcut is defined in the pandas documentation as a 'quantile-based discretization function'. This means that this function divides up the original data entities into similar-sized bins; the function defines the bins using percentiles based on the dispersion of the available data rather than the true numeric boundaries of the bins. Values greater than six are grouped together, as the standard deviation of the chosen data is extremely high. All the NaN values present in the chosen dataset are replaced with the median of the column. Get_dummies is a pandas function which is used to convert categorical variables into dummy/indicator variables. When this function is applied to a column of categories where there is one category per observation, it produces a new column for each unique categorical value, and the value one is placed in the column corresponding to the categorical value present for that observation. As the number of usable values increases, the accuracy and efficiency of the model also improve when the random forest is used in further processing. After this, a final check is performed on the dataset for any remaining NaN values. For model creation, the dataset is divided into testing and training data, and the target value is separated from the training features. Moving on to model fitting and accuracy, a random forest classifier is used for creating and fitting the model, and a confusion matrix is used to derive the accuracy of the model. The confusion matrix is used to understand the behaviour of a classification model on test data for which the label values are known; it allows visualizing the working of an algorithm. The specifics of the accuracy of the model can be determined using the confusion matrix, such as overall accuracy, sensitivity, specificity and so on. These measures assist in determining whether to accept the model or not, and taking into account the cost of the errors is an imperative part of the decision whether to accept or reject the model. After this, the accuracy can be calculated. The accuracy of the proposed model came out to be 80.78%.
Accuracy = Elements classified correctly / Total Elements
(3)
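The steps above (IQR-based outlier removal, simple cleaning, random forest fitting and confusion-matrix accuracy from Eqs. (2) and (3)) can be condensed into the following sketch; the file name and column names are assumptions modelled on a Kaggle-style credit dataset, not the authors' exact code.

# Condensed, illustrative sketch of the financial-scoring pipeline described
# above. The file name and column names are assumptions modelled on a
# Kaggle-style credit dataset.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

df = pd.read_csv("credit_data.csv")                      # hypothetical file name

# Fill missing values with the column median, then drop income outliers
# using the IQR rule from Eq. (2).
df = df.fillna(df.median(numeric_only=True))
q1, q3 = df["MonthlyIncome"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["MonthlyIncome"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

X = df.drop(columns=["SeriousDlqin2yrs"])                # target: default within two years
y = df["SeriousDlqin2yrs"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Accuracy from the confusion matrix, as in Eq. (3).
cm = confusion_matrix(y_test, clf.predict(X_test))
print(cm, cm.trace() / cm.sum())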
The next step is the generation of scores. The model is loaded, and the data values provided by the user are passed into the model to generate scores. But first, the trained model is dumped into a file. This is done using the Joblib library, which provides a fast and robust persistence mechanism, especially for large amounts of data such as NumPy arrays. After loading the model, the predict_proba function is used to generate the scores. This is an extremely significant function. The
Fig. 3 Weighted graph
The predict_proba function returns the probabilities for the target in array form. The number of probabilities returned for every row equals the number of classes, and the probabilities are ordered according to the classes of the model, so each value estimates the likelihood of the corresponding class for that observation. Mathematical logic is then applied to normalize and combine the scores. There are two aspects of the system, namely financial and behavioural, and equal weights are given to both; another approach is to use a weighted average (Fig. 3).
To normalize the score accurately, certain cases need to be handled, for instance the unavailability of financial or behavioural data, categorization, or ranking based on the score. In such cases, the scoring parameters may need to be converted. To address these issues, below are a set of problems and solutions:
• Unavailability of behavioural data, hence no generation of the behavioural score.
• Similarly, in case of unavailability of financial data, the financial score cannot be generated. So there can be a requirement to perform scoring on financial or behavioural data alone.
• As the financial model generates the default probability, is it possible to transform it into a financial scoring metric?
• How to categorize users based on the score and justify the lack of data, if any of the above cases are encountered?
To handle these cases and maintain the effectiveness of the system, the following possibilities exist:
• Since the default probability of a user is calculated, the probability that the user does not default can be computed simply as 1 − probability (user defaults), which is therefore the probability of good behaviour.
• There may not be enough usable data for behavioural scoring, and yet it is still necessary to score the user. To understand this clearly, suppose hypothetically that neither behavioural nor financial data is available; then both the behavioural and financial scores will be zero, and the final score will be calculated as
• Score = (1 − 0 + 0) / 2 = 0.5
• The user will be put into a neutral category, neither good nor bad. Suppose instead that only the behavioural data is unavailable, so the user has a behavioural score of 0 and a financial default probability of 0.5 (which corresponds to a bad borrower); the score will then be calculated as
• Score = (1 − 0.5 + 0) / 2 = 0.25
• This is a bad score at the lower end of the spectrum, considerably short of an ideal position to be in. When both financial and behavioural data are present and both the scores are 0.5 each, the score is given as
• Score = (1 − 0.5 + 0.5) / 2 = 0.5
• This again comes under the good category and is hence acceptable, as the user has a good behavioural score but a bad financial score. In an extremely bad case, with −0.5 as the behavioural score and a financial default probability of 1, which means the user will certainly default, the score will be calculated as
• Score = (1 − 1 + (−0.5)) / 2 = −0.25
• Now, this is extremely bad and hence extremely concerning.
On this basis, it is decided to categorize the scores as follows: 0.75–1.0: Excellent, 0.5–0.75: Good, 0–0.5: Okay and −1.0 to 0: Concerning. Looking at all these cases, it is understood that with the data obtained and the different behavioural or financial scores computed, the final score can be justified by this logic; hence, the score stays consistent and can be justified in every case. One interesting thing to note here is that the financial score ranges from 0 to 1 (never negative), while the behavioural score ranges from −1 to 1. Therefore, if an equal-weighted average of both is taken, the combined score ranges from −0.5 to 1.0, and it can be concluded with ease that a negative aggregate score is an extremely concerning one.
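The scoring logic above can be summarized in a short sketch: the fitted classifier is persisted and reloaded with Joblib, predict_proba supplies the default probability, and the financial and behavioural components are averaged and mapped to the categories listed above. The function and variable names are illustrative assumptions, not the authors' exact code.

```python
# Sketch of model persistence with Joblib and of the score combination
# described above; the behavioural_score input is assumed to lie in [-1, 1].
import joblib

# joblib.dump(clf, "credit_model.joblib")      # persist the fitted model
# model = joblib.load("credit_model.joblib")   # reload it for scoring

def combined_score(model, features, behavioural_score: float) -> float:
    """Equal-weighted average of financial 'goodness' and behavioural score."""
    # predict_proba returns one probability per class, ordered by model.classes_;
    # here we assume class 1 means "defaults".
    p_default = model.predict_proba([features])[0][1]
    p_good = 1.0 - p_default                     # probability of good behaviour
    return (p_good + behavioural_score) / 2.0

def category(score: float) -> str:
    if score >= 0.75:
        return "Excellent"
    if score >= 0.5:
        return "Good"
    if score >= 0.0:
        return "Okay"
    return "Concerning"
```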
5 Results and Conclusion The work demonstrates how behavioural scoring can be used to promote good behaviour and identify good citizenship among the actors. This can be used to engage
and provide added incentives to good citizens to encourage good citizenship. The research work portrays how a person's activity on online media sites like Facebook and Twitter reflects the nature and behaviour of that person. Some factors included in the social scoring are the types of posts shared, comments added, posts published, and pages followed and liked. These data are plotted on a graph against time to obtain a social score. There is also a financial scoring model that determines the person's financial fitness and the likelihood of engaging in criminal activity due to financial distress. Combining social scoring and financial scoring with specific weights provides a behavioural score. This score classifies the subjects and helps identify good citizens among the rest, who can then be engaged and given added incentives to enhance good citizenship. Many compelling avenues are open for future enhancement and exploration. Only certain specific features have been used to predict the personality of the user, while a wide variety of features were not explored, such as the specific types of groups a user is a member of; a user can be selective in liking any page, group, or public figure, so more sophisticated approaches could be used to overcome this drawback. The analysis is also based only on the online behaviour of the user, and a user can behave differently in the virtual environment and the real environment; hence, future work can address this limitation. Another scope for improvement is the study of privacy-safeguard mechanisms to further protect and secure online data.
An Optimized Method for Segmentation and Classification of Apple Leaf Diseases Based on Machine Learning Shaurya Singh Slathia, Akshat Chhajer, and P. Gouthaman
Abstract Agriculture is a significant portion of the world economy, as it provides food security. However, it has been noticed that plants are widely infected by various diseases, which causes tremendous monetary losses in agribusiness throughout the world. The manual assessment of fruit diseases is a laborious procedure that can be reduced by utilizing automated strategies to identify plant diseases at an early stage. In this examination, a new strategy is implemented for apple disease identification and recognition. The pipeline follows three stages: preprocessing, lesion segmentation with feature extraction, and classification. In the first step, the plant leaf lesion is enhanced by a hybrid technique that combines three-dimensional box filtering, decorrelation, a three-dimensional Gaussian filter, and a three-dimensional median filter. After this, the lesion regions are segmented by a strong correlation-based process. Finally, the colour, colour histogram, and texture features are extracted and combined. The extracted features are optimized by a genetic algorithm and classified by a KNN classifier. The research is carried out on the Plant Village dataset, and the planned system is tested on four types of disease classes. The classification results show the improvement of our strategy on the chosen diseases; moreover, the preprocessing method consistently produced distinctive features which later achieved significant classification accuracy. Keywords Plant disease · Machine learning · Image processing · Picture segmentation · Feature extraction
S. S. Slathia · A. Chhajer · P. Gouthaman (B) Department of Information Technology, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India e-mail: [email protected] S. S. Slathia e-mail: [email protected] A. Chhajer e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_57
1 Introduction The goal of this work is leaf disease recognition and characterization. Leaf disease identification and classification are significant capabilities required by agrarian businesses. As cultivation is a significant part of global wealth, it ensures food security. It has been noticed that plants are widely contaminated by various ailments, which causes tremendous monetary losses in the horticulture industry around the globe, and efficient identification and recognition of fruit leaf diseases is a vast and current challenge in machine vision because of its significant application in farming. In agribusiness, different sorts of fruit diseases exist which influence the production and quality of fruits. The vast majority of these diseases are judged by a specialist based on their symptoms, which may be costly because of the inaccessibility of specialists and the greater expense involved. In this respect, computing researchers in a joint effort with agribusiness specialists have proposed numerous algorithms for automated identification of diseases in plants and fruits. The manual examination of fruit diseases is a troublesome procedure which can be reduced by utilizing computerized strategies for recognition of plant diseases at an earlier stage. Hence, it is fundamental to build an automated electronic framework for recognition and classification of leaf symptoms at the beginning period. To satisfy the above necessity, a machine-based image processing method is suitable for recognizing and grouping leaf illnesses. Earlier frameworks describe automated leaf disease defect detection from images: an automated vision-based framework comprising an image capturing system and an analysis strategy is utilized for identifying and grouping leaf illnesses. The proposed analysis strategy treats disease-like unusual areas and other effects that are related to the classification of leaf diseases. In the defect recognition process, several image processing algorithms are utilized to extract image features and find the defect's position in leaf images. The existing framework proposed a colour-based segmentation image processing algorithm for leaf disease identification. The drawbacks of the existing system are that those strategies are neither fast nor adaptive, the leaf ribs may cause undesirable mistakes in the characterization of leaf diseases, superior preprocessing and segmentation are absent, the yielded result is lower when contrasted with our proposed system, and the accuracy of the system is low.
2 Literature Survey The article portrays computerized leaf sickness imperfection recognition from the pictures. In this paper, a robotized vision-based framework includes a picture
capturing system and an investigation strategy for distinguishing and characterizing the leaf illnesses. The proposed examination strategy treats infection like anomalous locales and different effects that are identified with the characterization of leaf maladies. In the deformity discovery process, a few pictures handling calculations are utilized to remove pictures includes and find imperfection situations on leaf pictures [1]. This technique proposed a shading-based division picture handling calculation for leaf malady recognizable proof. In regards to apple leaf disease identification with a genetic algorithm, this article depicts computerized leaf infection imperfection discovery from the pictures. The proposed examination technique treats illnesses like strange districts and different effects that are identified with the grouping of leaf sickness. This enquires about the proposed relationship-based highlights determination picture handling calculation for leaf ailment distinguishing proof [2]. For the detection and classification of citrus diseases, a proposed investigation technique treats illness like anomalous areas and different effects that are identified with the grouping of leaf maladies [3]. In the imperfection recognition process, a few picture handling calculations are utilized to separate pictures includes and find deformity’s situations on leaf images. The technique that was proposed was an upgraded weighted division for leaf infection recognizable proof. To understand a comparative study of leaf disease diagnosis system, this article elucidates computerized leaf illness imperfection recognition from the pictures. In this paper, a computerized vision-based framework which comprises of a picture snatching system and an investigation strategy for recognizing and arranging deserts on the outside of cowhide texture [4]. This framework proposed the choice of a surface-based feature for leaf illness ID. Toward the development of an automated system for crop recognition and classification, this paper provides a computerized vision-based framework which comprises of a picture snatching system and an investigation technique for distinguishing and ordering the leaf infections [5]. In the deformity recognition process, a few pictures preparing calculations are utilized to separate pictures include and find imperfection’s situations on leaf images. The investigated technique proposed a shape-based element determination for leaf infection recognizable proof. In regard to recommending, it tentatively assesses a product solution for the planned location and grouping of plant leaf infections. Investigations of plant attribute/illness allude to the investigations of outwardly discernible examples of a specific plant [6]. These days crops face numerous qualities/illnesses. Harmfulness of the creepy-crawly is one of the significant characteristics/ailments. Bug sprays are not generally demonstrated productive because bug sprays might be poisonous to feathered creatures. It additionally harms normal creature natural ways of life. The going with two phases is incorporated continuously after the division stage. In the underlying advances, the recognition is based on the green-toned pixels. Next, these pixels are hidden subject as far as possible regards that are enrolled using Otsu’s technique, by then those large green pixels are hidden. 
The other additional development is that the pixels with zeros red, green, and blue characteristics and the pixels on the restrictions of the debased pack (object) were completely emptied. The exploratory
results show that the proposed technique is a generous procedure for the disclosure of plant leaves diseases. The features differentiate in the kind of nonlinear post-taking care of which is applied to the local force run. The features are operated with the Gabor imperativeness, complex minutes, and crushing cell executive features [7]. The capacity of the relating directors to make specific part vector packs for different surfaces is contemplated using two strategies. The Fisher standard is for gathering result connection. The two methods give consistent results. The pounding cell director gives the best detachment and division results. The surface distinguishing proof capacities of the executives and their power to non-surface features are moreover taken a gander at. The crushing cell chairman is the one specifically that explicitly responds just to the surface and limits a false response to non-surface features, for instance, object structures. Concerning leaf disease detection, the discussion is made on the exploration of customized leaf disease areas which is an essential research subject as it would exhibit benefits in checking immense fields of yields and along these lines perceive symptoms of ailment when it appears on plant leaves [8]. There are the guideline adventures for contamination acknowledgment of image acquisition, image preprocessing, image segmentation, feature extraction, and statistical analysis. This proposed work is to represent the first step in filtering using the center channel and convert the RGB picture to CIELAB concealing part, then second step is isolation using the k-medoid method, and finally, the resulting stage is to veil green pixels and remove the disguised green pixels, later in the following stage, it focuses on texture features statistics, and this features spread in the neural framework. The neural network depicts the performance well and could viably perceive and arrange the attempted disease. Toward plant disease detection, genuine organic fiascos cause incredible harm to edit creation. The plant infection information of various could be dissected, and later an estimating report was produced. This exploration was made by normal for the plant ailment which presents an estimated framework’s structure, and key improvements acknowledgment dependent on information mining’s plant malady framework [9]. The information mining capacity may isolate into two sorts; they are description and conjecture. The information mining is a sort of profound level information investigation, it can extricate the mass information from the certain standard information, and the profound level’s advancement may additionally improve the data asset. The restricted information within the information of the executive’s module information input stockroom is available as indicated by the portrayal. The information unearthing endures the accomplishment of the conjecture of the impact. The framework combines information of the executives to report the structure to produce and gauge in windows framework. In regards to detection, pictures structure noteworthy data and information in common sciences. Plant sicknesses have changed into an issue as it can cause a basic decline in both quality and measure of provincial things [10]. Modified acknowledgment of plant illnesses is a principal inquiry about a point as it would exhibit benefits in watching colossal fields of yields, and as such subsequently recognize the signs of infections when it appears on plant leaves. 
The proposed structure is an item answer
for modified acknowledgment and computation of surface bits of knowledge for plant leaf ailments. The advanced planning system contains four rule steps, starting a concealing change structure for the data RGB picture is made, by then the green pixels are a disguise and ousted using express cutoff regard, by then the image is distributed and the significant parts are removed, finally the surface bits of knowledge is prepared. From the surface estimations, the closeness of contaminations on the plant leaf is surveyed.
3 Proposed Methodology The exploration implements a new strategy for disease identification and recognition. Three fundamental stages are used: preprocessing, lesion segmentation with feature extraction, and classification. In the initial step, the lesion regions of the specimen are enhanced by a hybrid strategy that combines three-dimensional box filtering, decorrelation, a three-dimensional Gaussian filter, and a three-dimensional median filter (a sketch of this enhancement step is given below). After this, the lesions are segmented by a strong correlation-based strategy. Next, the colour, colour histogram and texture features are extracted and combined. The characteristics are selected by a gene-based algorithm and classified by a KNN classifier. The experiment is undertaken on the Plant Village dataset, and the research also tests for deficiencies such as sulphur, phosphorus, etc. The benefits of the proposed system are as follows: various colour spaces are examined to determine the most suitable one; different feature extraction strategies are used, making it essential to recognize suitable contrasts between diseased and healthy leaves; various filters are applied to increase the visibility of affected portions of the leaf; and genetic-algorithm-based feature selection is used to identify the features that best discriminate between leaf diseases. Efficient identification and recognition of fruit leaf diseases is a present challenge in computer vision (CV) because of its significant applications in agribusiness and the agro-economy. In farming, different sorts of fruit diseases exist which influence the production and quality of fruits. The vast majority of these diseases are judged by a specialist based on their symptoms; however, this is costly because of the inaccessibility of specialists and the greater expense. In this respect, computing researchers in a joint effort with horticulture specialists have proposed numerous algorithms for automated recognition of diseases in plants and fruits. Leaf symptoms are a significant source of data to recognize the diseases in several distinct kinds of fruit plants. Apple is a significant fruit plant widely renowned for its nutrient value, but its production and quality are harmed by the attack of various diseases like black rot, rust, and blight. Therefore, it is fundamental to build an automated electronic framework for the detection and classification of leaf symptoms at an early stage.
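A minimal sketch of the hybrid enhancement step, assuming OpenCV and small 3 × 3 kernels, is given below; the decorrelation-stretch component of the pipeline is omitted here because it has no single standard OpenCV call, and the kernel sizes are assumptions for illustration.

```python
# Rough sketch of the enhancement step using OpenCV filters.
# Kernel sizes are assumptions; the decorrelation stretch is omitted.
import cv2

def enhance(bgr_image):
    box = cv2.blur(bgr_image, (3, 3))                 # box filtering
    gauss = cv2.GaussianBlur(box, (3, 3), sigmaX=0)   # Gaussian filtering
    median = cv2.medianBlur(gauss, 3)                 # median filtering
    return median

# enhanced = enhance(cv2.imread("apple_leaf.jpg"))    # hypothetical file name
```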
The early identification of these side effects is useful to progress the value and creation of natural products. On the grounds of machine visualization, it is a functioning examination region to discover the injury spot and concerning a few techniques that are presented for organic products sicknesses identification through picture handling and AI calculations. A great deal of division, highlights extraction, and grouping methods are proposed in writing for organic products infections division and acknowledgment, for example, a mix of highlights based indications distinguishing proof, shading-based division, connection-based highlights determination, improved weighted division, surface highlights, shape highlights, support vector machine (SVM), and so on. The covering model is an extra substance hiding model in which RGB lights are associated with different propensities to replicate a broad demonstration of shades. The name of the model beginnings from the initials of the three included substance key shades namely R: Red, B: Blue, and G: Green. The rule reason behind the redblue-green covering model is for the perceiving, delineation, and show of images in electrical frameworks, for example, television screens and personal computers. In any case, it possesses the way of utilizing standard photography. Before the electronic age, the red–green–blue covering model concerning a strong theory behind it, orchestrated in a human point of view on tints (Fig. 2). Fig. 1 Flow diagram
Fig. 2 RGB
With consideration to grayscale, a grayscale digital image is an image, common in photography and computing, in which the value of every pixel is only an intensity sample. Images of this kind are often called black and white and consist of shades of grey, extending from black at the lowest intensity to white at the highest. Grayscale images are not the same as one-bit black-and-white bi-tonal images, which contain only two colours; grayscale data has many shades of grey between them (Fig. 3). A grayscale image is often obtained from a single band of the electromagnetic spectrum measuring the intensity of the light at every pixel and, if only one given frequency is captured, it is monochromatic. It can, however, also be combined back into a full colour picture. The principal operations related to objects are the customary set operations of union, intersection, and complement, in addition to translation. The translation A + x of a set A by a vector x is defined as
A + x = {α + x | α ∈ A} (1)
Note that since the advanced picture is comprised of pixels at an essential arrange area (Z2), this suggests limitations on the reasonable interpretation vectors x. The fundamental tasks of Minkowski are included and taken away, and they are presently quantifiable. Morphological separating procedures apply to pictures at the dim level. The segments are organized with certain constraints to a limited number of pixels, and the curves are rearranged. Be that as it may, the organizing viewpoint presently has dark qualities related to each area of the directions as the picture produces. The points of interest can be found in Dougherty and Giardina. The consequential result is that the most extreme channel and the base channel are dark level widening and dim level disintegration for the particular organizing component given by the state of the channel window with the dim worth “0” inside the window. Morphological smoothing calculation is the perception that a dark level opening flattens a dim worth picture over the outside of splendor known as the capacity and the smooth dim intensity shutting from beneath. The morphological
Fig. 3 Grayscale
gradient is where the inclination channel gives a vector representation. The version given here makes an approximation of the scale of gradients. Border corrected mask is where a channel is a cover. The covering guideline is otherwise called spatial filtration. Veiling is called sifting as well. In this definition, there is a process of managing the separating activity that is performed legitimately on the picture. A portion, convolution network, or veil is a little grid in picture handling that is valuable for obscuring, honing, decorating, edge-discovery, and that is only the tip of the iceberg. This is accomplished by methods for a part picture convolution. To locate the specific tasks in a picture, the veil is produced. The issues or highlights can be discovered in a picture. The outskirt amended cover is a veil wherein all the issues of a picture are shut to the edges. In machine visualization, surrounding is the strategy for apportioning a digital image data into various fragments. The division objective is to disentangle or potentially change a picture’s portrayal into something that is progressively important and simpler to examine. The division of pictures is commonly used to search for bends and blemishes in the images. More exactly, image dissection is the procedure by which every pixel in a picture is doled out a name, so pixels with an akin to coordinate have the same attributes. The result of image dissection is a lot of sections which covers the complete image all in all, or an assortment of forms got from the picture (see Edge recognition). Every one of the pixels in an area is comparable. Nearby areas differ significantly for the similar principles of the data. Utilizing interjection calculations like leading solid shapes, the form produced after data division can be used to make a three-dimensional simulation when used on a pile of data, which is the basis of clinical imaging. CCA is a notable picture handling method which checks a picture and gatherings pixels into marked segments dependent on pixel network. An eight-point CCA stage
is performed to find all articles produced from the former stage inside the double picture [11]. The yield of this stage is a variety of N antiques that gives a case of that stage’s info and yield. The proposed framework’s fundamental applications basically point to mechanical applications such as supporting early location, finding, and fitting treatment, and segmentation of pictures assumes a significant job in numerous applications for the picture preparing. Finally, to lower SNR conditions and various things, the available issues are managed by computerization for the effective and exact division.
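An eight-connected component analysis of a binary mask, as referred to above, could be sketched as follows; the Otsu threshold and the OpenCV routine used here are illustrative choices, not necessarily the authors' exact implementation.

```python
# Minimal sketch of eight-connected component analysis on a binary mask,
# returning the N detected regions mentioned above (threshold choice assumed).
import cv2

def connected_components(gray_image):
    _, binary = cv2.threshold(gray_image, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary,
                                                                   connectivity=8)
    # stats holds one row per component: x, y, width, height, area
    return n - 1, labels, stats[1:], centroids[1:]   # drop background label 0
```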
4 Empirical Analysis 4.1 Preprocessing Operations at the lowest level of abstraction, whose input and output are both images, are termed preprocessing. Preprocessing is used to improve the data; it removes distortions and undesirable aspects of the data and also enhances important features that are required for further processing. Most image processing methods exploit the redundancy in images: neighbouring pixels that correspond to the same object have similar luminosity values, which implies that a distorted pixel can often be restored from the values of its neighbouring pixels. The input data therefore requires preprocessing techniques so that the correct analysis of the data can take place. This stage involves changing the size of the input data and converting it to a grayscale picture using different filters. Data cleaning is the process of finding, removing, and replacing bad or missing data. Searching for local extremes and abrupt changes in the data can help in identifying notable trends, and grouping methods are used to capture the relationships between the various data points. The preprocessing applies certain methodologies wherein all the input images are resized to the same dimensions; the output image is altered in case the input data does not have the specified aspect ratio. Image filtering is a process used to enhance the input data. For example, an image can be filtered to highlight some aspects or erase others. Next, if a single coloured pixel needs to be stored then 24 bits are required, whereas a grayscale pixel only requires 8 bits of storage; this significant drop in the memory requirement (by almost 67%) is extremely useful. Grayscale also reduces ambiguity by collapsing the value of a 3D pixel (RGB) to a 1D value, and many operations (e.g., edge detection, morphological operations, etc.) gain nothing from 3D pixels.
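A hedged sketch of such a preprocessing pipeline is shown below; the target size and the use of a median filter are assumptions made for illustration.

```python
# Illustrative preprocessing pipeline (target size and filter choice assumed).
import cv2

def preprocess(path, size=(256, 256)):
    img = cv2.imread(path)                        # read BGR image
    img = cv2.resize(img, size)                   # resize to a common dimension
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # 24-bit colour -> 8-bit grayscale
    gray = cv2.medianBlur(gray, 3)                # suppress speckle-type distortions
    return img, gray
```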
4.2 Segmentation Image segmentation is a common technique in digital image treatment and analysis, many times depending on pixels of the data, to segregate it into different sectors. In
computer vision, the segmentation of the image is a method of subdividing the digital picture into several segments. Segmentation is a method for grouping pixels with similar characteristics: separating an image into independent regions so that the objects in the image are clearly defined, where each region is homogeneous and no two adjacent regions are similar to each other; this process is defined as image segmentation. The accuracy of segmentation determines the potential success or failure of the analysis process. Segmentation is carried out on the basis of a shared property. The proposed solution exploits this similarity property by implementing the k-means clustering algorithm, as sketched below; this iterative segmentation method zeroes in on the centre of each cluster and re-assigns the inputs to the centre that is nearest to them. This method helps to extract important picture features, which allow information to be easily perceived. Next, a suitable test image is chosen by using various colour space conversions, including RGB to HSV and RGB to YCbCr. The labels are then calibrated to produce realistic results using a shape detection method based on comprehensive regional context knowledge. From the different colour spaces, the one best suited to leaf classification is selected. Firstly, in colour space conversion, a translation of the colour representation from one basis to another is performed; this generally means translating an image into a different colour space while rendering it as similar as possible to the original. Secondly, in converting the colour format, the colour information is not needed by many image processing applications; if there is no necessity to differentiate between colours, one option is to change the RGB input to black-and-white or grayscale output formats. Finally, in morphological operations, the morphological processing of images is a collection of non-linear operations related to structuring elements applied to the data. Morphology offers a huge array of data processing methods that process images based on shapes. A structuring element is applied to an input image in a morphological operation to construct an output image of the same size (Fig. 4).
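The k-means segmentation with a prior colour space conversion could be sketched as follows; the choice of HSV, the number of clusters, and the termination criteria are assumptions for illustration only.

```python
# Hedged sketch of k-means colour segmentation with OpenCV.
# The HSV colour space, k = 3, and the stopping criteria are assumptions.
import cv2
import numpy as np

def kmeans_segment(bgr_image, k=3):
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)      # colour space conversion
    pixels = hsv.reshape(-1, 3).astype(np.float32)        # one row per pixel
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(pixels, k, None, criteria, 5,
                                    cv2.KMEANS_RANDOM_CENTERS)
    return labels.reshape(hsv.shape[:2])                  # per-pixel cluster index
```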
4.3 Feature Extraction In the field of automotive training, recognizing the sequence of processing the input image, culling of the features is initiated by putting together the processed information and constructs properties that are optimized for information and non-redundancy. The extraction of features is connected to a reduction in dimensionality. When the input for an algorithm is extensive for processing and is considered to be repetitive, it can be translated into a minimized set of properties. The function selection is called to determine a subset of the initial features [12]. In order that the desired task may not be complete, the opted characters will have the required information
Fig. 4 Black and white image
from the input data, using this minimized depiction. Shape features, color features, geometrical features, and texture features. To begin with, the shape features comprise of shape characteristics like round objects or any other shape where perimeter boundaries of the objects along with the diameter of the order, are defined as shape features. Next, color features where the color and texture histograms and the whole picture color structure form part of the global apps. Color, texture, and shape features provide local characteristics for sub-images, regions, and points of interest. These image extracts are then used to match and retrieve images. Then, geometrical features in which the geometric characteristics of objects consisting of a sequence of geometric elements such as points, lines, curves, or surfaces are essential. Such characteristics may be corner elements, edge characteristics, circles, rids, a prominent point of picture texture, etc. Finally, texture features of an image fabric is a cluster of the processed parameters in the processing of an image that defines the quantum of a perceived arrangement of an image texture that gives information about the contiguous pattern of spectrum or sharpness of the data or a specific area the data (Fig. 5). Here, there is the utilization of feature extraction methods like gray-level cooccurrence matrix (GLCM), local binary pattern (LBP), region segmentation, and genetic algorithm. GLCM gives the structure features of the input data like clarity, correlation, energy, etc. Then, the LBP gives the various shape features of the input image. After that genetic algorithm is applied to choose the best attribute to distinguish the different diseases that may occur in the leaf. The region properties segmentation is utilized to get the mathematical features of the input image such as density, area, and so on. From all the above, the extracted features serve as the best features that are identified which are related to differentiating the various leaf diseases.
Fig. 5 Small object removed image
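A rough sketch of the feature extraction described above is given below, assuming scikit-image version 0.19 or later (where the GLCM helpers are named graycomatrix and graycoprops; older releases spell them greycomatrix and greycoprops); the distances, angles, and LBP parameters are illustrative assumptions.

```python
# Hedged sketch of GLCM, LBP, and region-property feature extraction.
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
from skimage.measure import label, regionprops

def extract_features(gray, mask):
    # GLCM texture features: contrast, correlation, energy, homogeneity
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    texture = [graycoprops(glcm, p)[0, 0]
               for p in ("contrast", "correlation", "energy", "homogeneity")]
    # LBP histogram as a local texture/shape descriptor
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    # Geometric features of the segmented lesion regions
    geometry = [(r.area, r.perimeter, r.eccentricity)
                for r in regionprops(label(mask))]
    return texture, lbp_hist, geometry
```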
4.4 Classification The methodology of taking out data from a host of image indexes is called image classification. To create thematic charts, the resulting raster from image classification can be used. A toolbar for image classification is the preferred way for classification and multivariate analysis.
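A minimal sketch of the classification stage with a KNN classifier follows; the value of k and the train/test split are assumptions rather than the authors' reported settings.

```python
# Minimal sketch of KNN classification of the extracted feature vectors.
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

def classify(feature_matrix, labels, k=5):
    X_train, X_test, y_train, y_test = train_test_split(
        feature_matrix, labels, test_size=0.25, random_state=0)
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    return accuracy_score(y_test, knn.predict(X_test))
```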
5 Conclusion and Future Work In this research, an enhanced robotized PC based strategy is planned and approved for acknowledgment of disease. The sore blemish differentiates extending, sore division, and conspicuous highlights determination and acknowledgment steps. The differentiation of the contaminated spot is upgraded, and division is performed by the projected technique. The performance of the projected technique is additionally upgraded by region segmentation. At that point, numerous highlights are removed and melded by utilizing an equal strategy. A genetic calculation is used to choose the best highlights, and later they are used by KNN for grouping. In the future, the proposed methodology can be grouped as many others that are yet to be developed methods like texture analysis and classification. This helps in determining the stages of the disease. It will be of great help as the system is not dependent on the disease. The proposed system can also be greatly enhanced to identify diseases that do not originate at the leaves but rather at different parts of the plant. Sudden death syndrome (SDS) can also be integrated into our module, but due to the lack of proper dataset at present, it could
not be incorporated into this present work. Another advancement of the work could be to add and identify the different ways in which pests affecting the plants as each pest has a different way of attacking the plants. Finally, one major upgrade could also be used to identify what kind of nutrient deficiency the plant is facing due to which it is having those diseases and proper care can be taken care of the plant.
References 1. Rozario LJ, Rahman T, Uddin MS (2016) Segmentation of the region of defects in fruits and vegetables. Int J Comput Sci Inf Secur 14(5) 2. Chuanlei Z, Shanwen Z, Jucheng Y, Yancui S, Jia C (2017) Apple leaf disease identification using genetic algorithm and correlation based feature selection method. Int J Agric Biol Eng 10(2):74–83 3. Sharif MY, Khan MA, Iqbal Z, Azam MF, Lali MI, Javed MY (2018) Detection and classification of citrus diseases in agriculture based on optimized weighted segmentation and feature selection. Comput Electron Agric 150:220–234 4. Sapkal AT, Kulkarni UV (2018) Comparative study of leaf disease diagnosis system using texture features and deep learning features. Int J Appl Eng Res 13(19):14334–14340 5. AlShahrani AM, Al-Abadi MA, Al-Malki AS, Ashour AS, Dey N (2018) Automated system for crops recognition and classification. Comput Vis Concepts Method Tools 1208–1223 6. Gavhale KR, Gawande U (2014) An overview of the research on plant leaves disease detection using image processing techniques. J Comput Eng 16(1):10–16 7. Camargo A, Smith JS (2009) An image-processing based algorithm to automatically identify plant disease visual symptoms. Biosyst Eng 102(1):9–21 8. Zhang S, Wu X, You Z, Zhang L (2017) Leaf image based cucumber disease recognition using sparse representation classification. Comput Electron Agric 134:135–141 9. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318 10. Shuaibu M, Lee WS, Hong YK, Kim S (2017) Detection of apple marssonina blotch disease using particle swarm optimization. Trans ASABE 60(2):303–312 11. Kamilaris A, Prenafeta-Boldu FX (2018) Deep leuarning in agriculture: a survey. Comput Electron Agric 147:70–90 12. Gu Y, Cheng S, Jin R (2018) Feature selection for high-dimensional classification using a competitive swarm optimizer. Soft Comput 811–822
A Thorough Analysis of Machine Learning and Deep Learning Methods for Crime Data Analysis J. Jeyaboopathiraja and G. Maria Priscilla
Abstract The analysts belonging to the police forces are obliged for exposing the complexities found in data, to help the operational staff in nabbing the criminals and guiding strategies of crime prevention. But, this task is made extremely complicated due to the innumerous crimes, which take place and the knowledge levels of recent day offenders. Crime is one of the omnipresent and worrying aspects concerning society, and preventing it is an important task. Examination of crime is a systematic means of detection as well as an examination of crime patterns and trends. The data work involving includes two important aspects, analysis of crime and prediction of perpetrator identity. Analysis of crime has a significant role to play in these two steps. Analysis of the crime data can be of massive help in the prediction and resolution of crimes from a futuristic perspective. To avert this issue in the police field, the crime rate must be predicted with the help of AI (machine learning) approaches and deep learning techniques. The objective of this review is to examine the AI approaches and deep learning methods for prediction of crime rate that yield superior accuracy, and this review article also explores the suitability of data approaches in the attempts made toward crime prediction with specific predominance to the dataset. This review evaluates the advantages and drawbacks faced by crime data analysis. The article provides extensive guidance to the evaluation of model parameters to performance in terms of prediction of crime rate by carrying out comparisons ranging from deep learning to machine learning algorithms. Index Terms Big data analytics (BDA) · Support vector machine (SVM) · Artificial neural networks (ANNs) · K-means algorithm · Naïve Bayes
J. Jeyaboopathiraja (B) Research Scholar, Department of Computer Science, Sri Ramakrishna College of Arts and Science, Coimbatore, India e-mail: [email protected] G. Maria Priscilla Professor and Head, Department of Computer Science, Sri Ramakrishna College of Arts and Science, Coimbatore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_58
1 Introduction Recently, big data analytics (BDA) has evolved to be a prominent technique used for the data analysis and extraction and their relevance in an extensive array of application fields. Big data involves the accessibility to an extreme amount of data that tends to become hard to be stored, processed, and mined with the help of a classical database fundamentally because the data available is massive, complicated, unorganized, and quickly varying. This one is the possible and critical basis behind the big data’s idea, which is being initially encouraged by online companies such as Google, eBay, Facebook, LinkedIn, etc. The term “big data” indicates a digital repository of information having an enormous volume, velocity, and diversity. Analytics in big data refers to the software developing process to unravel the trends, patterns, associations, or other meaningful perspectives in that enormous amounts of information [1]. Big data has an inevitable part to play in different domains, like agriculture, banking, data mining, education, chemistry, finance, cloud computing, marketing, and healthcare stocks [2]. Owing to the consistent urbanization and rising population, society has become city-centric. But, an increasing number of savage crimes and accidents have also accompanied these developments. To deal with these problems, sociologists, analyst, and protection organizations have dedicated many endeavors toward the mining of important patterns and factors. The development of ‘big data,’ necessitating new techniques toward the well-organized and precise analysis of the rising volumes of data that are very much criminal, has remained a critical problem for every law enforcement and foundations of intelligence collection. The crime has increased multi-fold over the passage of time, and the criminals have begun using the latest trend in technology not just for committing the offenses, along with the runoff acquittal. Crime is not anymore confined to the boulevards and back rear entryways (back alleys) in the neighborhood places. Also, the Internet which acts as a connecting bridge for the whole world thrives as a field for the more crooked minded criminals recently. Acting upon the barbaric acts like the 9/11 terrorist assaults and technology exploitation for hacking the most protected databases used for defense, novel and efficient techniques of crime prevention has emerged to gain rising significance [3]. Data mining is called as a potential tool with remarkable capability to aid the illegal examiners in highlighting the majority of vital information concealed in the crime ‘big data.’ Mining the data in the form of a tool for crime investigation is identified as a relatively novel and popular research domain. In addition to the improving usage of the systems that are computerized for crime tracking, analysts in computer data have got into assisting by the officers of law enforcement and detectives not just to accelerate the process of crimes’ resolution [4], but also for advance prediction of crimes. The improving accessibility of big data has also influenced the importance of command by the applications involving several data mining approaches, and its simplicity to be used by people with no skills in data analysis and knowledge on statistics.
The capability of analyzing this extremely high amount of data along with its intrinsic drawbacks with no computational assistance imposes manual pressure [5]. This review investigates the modern approaches of data mining techniques, which are utilized for the prediction of crime and criminality. Diverse classification algorithms like Naïve Bayesian, decision tree (DT), back propagation (BP), support vector machine (SVM), and deep learning techniques have been utilized for the prediction “crime category” for differentiation. This technical work provides the details on different data mining classifiers employed for the crime data prediction. This technical work also studies the advantages and drawbacks of different data mining approaches in crime data.
2 Literature Review Mining the data techniques are utilized for the detection as well as prevention of crime. Classical classification approaches concentrate on both organized and unorganized data for pattern detection. The evolution of big crime data has made many of the available systems employ an ensemble of data mining approaches to get exact and accurate predictions. Crime analysis can range over an extensive array of activities in crime starting from simple mistreatment of public duties to crimes that are prearranged at an international level [1]. This section reviews the classical data mining techniques and deep learning techniques in crime data analysis.
2.1 Review of Data Mining Methods for Crime Data McClendon and Meghanathan [5] used WEKA, a publicly available data mining software, to carry out a comparative analysis between the patterns of violent crime extracted from the Communities and Crime Unnormalized Dataset provided by the University of California-Irvine (UCI) repository and the actual crime statistics for the state of Mississippi provided by the neighborhoodscout.com Web site. Linear regression, additive regression, and decision stump algorithms are applied to the same finite set of features on the Communities and Crime Dataset. On the whole, the linear regression algorithm exhibited the best performance among the three algorithms chosen. The aim is to show the effectiveness as well as the accuracy of the machine learning employed in data mining analysis for the prediction of violent crime activities. Tyagi and Sharma [6] studied the data mining mechanisms employed in criminal investigation, focusing on the techniques utilized in crime data analytics. Algorithms including C4.5, CART, the K-nearest neighbor algorithm (KNN), support vector machine (SVM), and artificial neural networks (ANNs) help detect particular patterns of criminals in massive datasets, classify criminal activities into various groups and predict the hotspots of
crime. This research work describes the problems that come up during the analysis, which need to be eliminated to obtain the required result. Pramanik et al. [7] studied a framework on big data analytics, which investigates four parameters, they are criminal networks, such as network extraction, subgroup location, association design revelation, and central member identification and with success. Big data sources, change, stages, and devices are integrated to render four significant functions, which exhibit a potential correlation with the two dimensions of SNA. Also, social network analysis (SNA) is a well-known and proactive technique to unravel the earlier mysterious structural patterns from criminal networks. SNA was identified to be a resourceful and effective mechanism for analyzing criminal organizations and firms. With the constant introduction of modern platforms of big data analytics, tools, and approaches, the years to come will witness the broad deployment and usage of big data across defense and law enforcement companies. Jha et al. [8] presented on data analytical techniques based on big data can help prevent the crimes. Also, various approaches of data collection have been studied, and it comprises of volunteered geographic information (VGI) along with geographic information system (GIS) and Web 2.0. The final stage determination includes the forecasting that depends on the gathering of data and investigation. Big data is regarded as a suitable framework for the crime data analysis since it renders better throughput, fault resilience helps in the analysis of massive datasets, processes on hardware goods, and produces trustworthy outcomes, while the Naïve Bayes algorithm of machine learning can predict better utilizing the available dataset. Nadathur et al. [9] provided a comprehensive overview of crime incidences and their relevance in literature through the combination of techniques. This review work, as the first step, analyzes and detects the features of crime occurrences introducing a schema on the combinatorial incident description. The newly introduced schema tries to find a method for a systematic merging of various elements or crime features, and the system provides a database with much better throughput and lesser maintenance expenditure applying Hadoop tools with HDFS and map-reduce programs. Besides, an elaborate list comprising of crime-associated violations is presented. This facilitates a perfect interpretation of the repetitive and underlying criminal actions. This review work tries to help experts and law enforcement officers in finding the patterns and trends in rendering the forecasts, discovering the association, and probable explanations. ToppiReddy et al. [10] studied different visualizing approaches and AI algorithms, which are followed for the distribution of crime prediction over a region. First, process the untreated datasets that are processed and then envisage as per the requirement. KNN is a technique employed for arrangement purposes. The object classification is performed with a mainstream vote from its neighbor, and the presumed object belongs to the class that is famous among its k-nearest neighbors. Naïve Bayes depends on Bayes theorem that defines the likelihood of an occurrence by the earlier acquaintance of constraints having relevance to the event. 
Then, AI was utilized for the information extraction from these massive datasets and finds the concealed associations amid the data which in turn is further utilized for reporting as well as finding the patterns in the crime that was helpful for crime analysts in the analysis
of these crime networks employing different interactive visualization techniques for forecasting the crime and therefore is of great assistance in preventing the crimes. Yerpude and Gudur [11] designed data mining approaches, which are used on crime data for forecasting the features, which influence the higher crime rate. Supervised learning makes use of datasets for training, testing, and achieving the necessary results while unsupervised learning partitions a discontinuous, unorganized data into classes or groups. Data mining’s supervised learning approaches are decision trees, Naïve Bayes, and regression, and AI on earlier gathered data and hence utilized for the prediction of features which influences the crime incidences in an area or neighborhood. Depending on the positioning attained by the features, the Crime Records Bureau and Police Department can embark on required measures to reduce the chance of crime occurrence. Pradhan et al. [12] demonstrated big data analytics using the San Francisco crime dataset, as gathered by the San Francisco Police Department and accessible through the initiative of Open Data. Algorithms such as Naïve Bayes, decision tree, random forest, K-nearest neighbor (KNN), as well as multinomial logistic regression are employed. It is primarily focused on carrying out a comprehensive analysis of the important kinds of crimes, which happened in the municipality, screen the pattern throughout the years, and decide how different attributes have a role to play in particular crimes. Besides, the results attained of the exploratory data analysis is leveraged to inform the data preprocessing process, before training different AI models for forecasting the type of crimes. Especially, the model helps in predicting the kind of crime, which will happen in every district of the city. The dataset given is hugely imbalanced, and therefore, the metrics utilized in earlier research focus primarily on the majority class, irrespective of the performance of the classifier in minority classes, and a technique is proposed for resolving this problem. Yu et al. [13] studied building datasets from real crime records. These datasets have a collective tally of crime and crime-associated events classified by the police forces. The information comprises of place as well as a time of these events. More number of spatial and temporal features are extracted from the unprocessed dataset. Second, a group of data mining classification approaches are used for carrying the crime prediction. Different classification techniques are analyzed to decide the one that is best for the prediction of crime “hotspots.” The classification of rising or egress is investigated. Finally, the best prediction technique is proposed for achieving the most consistent results. A model has resulted from the research work, which exploits the inherent and external spatial and temporal data to get robust crime predictions. Jangra and Kalsi [14] presented a prediction analysis process, where future development and results are forecasted based on presumption. AI and regression methods come under the two techniques, which have been used for carrying out predictive analytics. During the process of predictive analytics, AI approaches are extensively used as well as emerged to become commonly used since their massive scale datasets managed by it are quite efficient and yield much better performance. It renders the results of having standard features and noisy data. 
KNN is a well-known approach for predictive analysis. To boost the accuracy of crime prediction,
the Naïve Bayes method is used. It is found that Naïve Bayes yields much better accuracy than KNN for crime prediction.

Deepika and Smitha Vinod [15] designed a technique for crime detection in India that employs data mining approaches. The mechanism comprises steps such as data preprocessing, clustering, classification, and visualization. The field of criminology studies different crime features. Clustering through K-means helps in the detection of crime, and the groups are created depending on the resemblance found in the crime characteristics. The random forest algorithm and neural networks are used for data classification. Visualization is performed employing Google marker clustering, and the crime hotspots are plotted on the map of India. The WEKA tool helps in validating the accuracy.

Dhaktode et al. [16] presented a data mining approach that is employed to analyze, examine, and verify the patterns in crimes. A clustering technique is enforced for the analysis of crime data, and the stored data is clustered employing the K-means algorithm. Once the classification and clustering are performed, a crime can be predicted from its past data. This newly introduced system can point out areas having a greater probability of crime and regions that already have a greater crime rate.

Jain et al. [17] designed a systematic approach for crime analysis and prevention to spot and investigate the patterns and trends found in crime. This system can help to predict the areas having a greater probability of crime incidence and can help to visualize the crime-vulnerable hotspots. The growing usage of computerized systems is of much aid to crime data analysts in helping law enforcement officials to solve crimes faster. The K-means algorithm divides the data into groups according to their means. Further, this algorithm includes a modification known as the expectation–maximization algorithm, where the data is partitioned based on their parameters. This data mining framework is easy to implement, operates jointly with the geospatial plot of wrongdoing, and increases the efficiency of detectives and other law enforcement officials.

Sukanya et al. [18] worked on the analysis of criminals' data, in which grouping and classification approaches are utilized. These data are accumulated in the criminals' repository. Spatial clustering algorithms and structured crime classification are utilized for the classification of the crimes. These algorithms are useful in identifying the spots of crime occurrences. The identification of the criminals is done based on the witnesses or clues present at the location of the crime. Identifying the hotspots of criminal occurrences is valuable to the police forces for improving the security of a specific region, and this will reduce crimes to a much better extent in the future. After applying this concept to every area, criminal activities can be reduced to the maximum extent possible, although crimes cannot be controlled entirely.

Ladeira et al. [19] presented data preprocessing, transformation, and mining approaches to find the crime details hidden in the dataset by associating similar records. Subsequently, the criminal records are categorized into three groups considering the complexity of the criminal action: A (low sophistication), B (medium sophistication), or C (high sophistication).
To assess the effect of applying or omitting the preprocessing approaches and to determine the data mining approaches that attain
the best outcomes, two experiments were carried out, and their mean accuracies were compared. The application of preprocessing together with the random forest algorithm produced superior results and also showed the potential to handle high-dimensional and dynamic data. As a result, an ensemble of these approaches can yield better information for the police department. The inference of data mining methods for crime data is shown in Table 1.
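Several of the surveyed studies (e.g., [12, 14]) essentially fit standard classifiers such as Naïve Bayes and KNN to tabular crime records and compare their accuracy. The following is a minimal, hedged sketch of such a comparison using scikit-learn; the synthetic feature columns and labels are illustrative assumptions, not from any of the cited datasets.

```python
# Minimal sketch (not from the surveyed papers): comparing Naive Bayes and KNN
# on hypothetical tabular crime records, as in the studies reviewed above.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Hypothetical records: hour of day, day of week, district id -> crime class.
X = np.column_stack([rng.integers(0, 24, 1000),
                     rng.integers(0, 7, 1000),
                     rng.integers(0, 10, 1000)])
y = (X[:, 0] > 18).astype(int)            # toy label: "night-time" incidents

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
for name, model in [("Naive Bayes", GaussianNB()),
                    ("KNN", KNeighborsClassifier(n_neighbors=5))]:
    model.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, model.predict(X_test)))
```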
2.2 Review of Deep Learning Methods for Crime Data

Keyvanpour et al. [20] designed data mining approaches supported by a multi-use framework for investigating crimes intelligently. The framework used a systematic technique employing a self-organizing map (SOM) and multilayer perceptron (MLP) neural networks for the grouping and classification of crime data. Design aspects and problems in employing hierarchical/partitional grouping approaches for clustering the crime data are discussed.

Lin et al. [21] studied the idea of the criminal environment in a grid-based crime forecasting model and characterized a set of spatio-temporal features that rely upon 84 types of place data obtained via the Google Places API, applied to theft data for Taoyuan City, Taiwan. The deep neural network was the best model, and it performed better than the well-known random decision forest, support vector machine, and K-nearest neighbor algorithms. Experiments show the significance of the geographic feature design in increasing performance and descriptive capability. Also, testing for crime displacement reveals that this feature design outperforms the baseline format.

Feng et al. [22] suggested data analysis for criminal data in San Francisco, Chicago, and Philadelphia. First, the time series of the data is explored, and crime trends for the coming years are forecasted. After this, with the crime category to be predicted and the time and location given, compound classes are combined into bigger classes to get over the problem of class imbalance, and feature selection is carried out for accuracy improvement. Multiple state-of-the-art data mining approaches, which are specially applied for forecasting crime, are presented. The experimental results reveal that the tree-based classification models outperform KNN and Naïve Bayes techniques on this classification task. Holt-Winters combined with multiplicative seasonality yields superior results in forecasting crime trends. The potential results will be advantageous for police forces and law enforcement in solving crimes faster and in rendering cues that can help them in tackling crimes, forecasting the probability of incidents, efficiently exploiting their assets, and formulating quicker decisions.

Chauhan and Aluvalu [23] studied that, in this emerging technological field, cyber-crimes are increasing at an exponential rate and pose quite a challenge to the skills of investigators. Also, the data on crime is growing enormously, and it is generally in digital format. So the generated data cannot be managed efficiently employing classical analysis approaches. Rather than applying conventional data analysis mechanisms, it would be advantageous to employ big data analytics for this
Table 1 Inference of data mining techniques for crime data

1. McClendon [5]. Technique: linear regression, additive regression, and decision stump algorithms. Benefits: shows how efficient and precise the machine learning algorithms utilized in data mining analysis can be at predicting violent crime patterns. Drawbacks: data mining remains a long and difficult process for law enforcement officers, who need to go through massive volumes of data.
2. Tyagi [6]. Technique: data mining technologies. Benefits: the algorithm can manage a massive amount of data and render superior accuracy. Drawbacks: intelligence agencies sift through the database manually, which is a difficult and time-consuming task.
3. Pramanik [7]. Technique: network extraction, subgroup detection, interaction pattern discovery, and central member identification. Benefits: functions as a modern big data analytics platform for security and law enforcement agencies. Drawbacks: convergence is slow, performance is reduced by an increasing number of classes, and over-fitting occurs often.
4. Jha [8]. Technique: machine learning algorithm. Benefits: the system is utilized for the analysis of crime data and renders higher throughput. Drawbacks: machine learning for predicting and averting future crime.
5. ToppiReddy [10]. Technique: visualizing techniques and machine learning algorithms. Benefits: the system is utilized for the prediction of crimes and aids the law agencies; the prediction accuracy is increased. Drawbacks: consumes more time for classification.
6. Yerpude [11]. Technique: decision trees, Naïve Bayes, and regression. Benefits: improves security and crime protection with a desktop; safety measures can be undertaken in accordance with the relevant features. Drawbacks: influences the greater crime rate.
7. Yu [13]. Technique: data mining classification techniques. Benefits: best prediction technique to attain the most consistent outcomes. Drawbacks: does not provide good support for real-time applications; future work has to incorporate motor vehicle theft-based crime.
8. Jangra [14]. Technique: Naïve Bayes. Benefits: enhances the accuracy of the crime prediction approach. Drawbacks: computation time is excessive for a few classifiers; concurrent techniques are required for reducing the classification time.
9. Deepika [15]. Technique: K-means clustering, random forest algorithm, and neural networks. Benefits: the technique will be advantageous for the crime department of India in the analysis of criminal activities with superior forecasting. Drawbacks: one more problem is that they cannot predict the time of occurrence of the crime.
10. Jain [17]. Technique: K-means clustering. Benefits: helps to increase the efficiency of the officer and other law enforcement officials. Drawbacks: (1) hard to predict the K-value; (2) performance is not good with the global cluster.
11. Sukanya [18]. Technique: clustering and classification techniques. Benefits: the hotspots of criminal activity and the criminals are identified employing clustering and classification algorithms. Drawbacks: real-time prediction is slow, hard to implement, and complicated.
12. Ladeira [19]. Technique: data preprocessing, transformation, and mining techniques. Benefits: shows which preprocessing techniques and data mining approaches yield superior results. Drawbacks: the prediction process employing random forests consumes more time compared to other algorithms.
massive amount of information. Essentially, the collected data will be disseminated over multiple geographic places, and based on that, the clusters will be generated. Secondly, the analysis of the created clusters is done employing big data analytics. At last, these analyzed clusters are provided to an artificial neural network, which leads to the generation of prediction patterns.

Pramanik et al. [24] explored the strengths of big data analytics for achieving security intelligence within a criminal analytics structure. Five important technologies, including link analysis, intelligent agents, text mining, artificial neural networks (ANNs), and machine learning (ML), have been identified and have found extensive application across different fields for evolving the technical basis of automated security and criminal investigation systems. A few popular data sources, analytics techniques, and applications associated with two significant aspects of social network analysis, namely structural and positional analysis, which form the basis of criminal analytics, are examined. The advantages and drawbacks of big data analytics as applied to the field of criminal analytics are also discussed.

Stalidis et al. [25] presented a comprehensive analysis of crime classification and prediction employing deep learning architectures. The efficiency of deep learning algorithms in this domain is analyzed, and the work yields recommendations for the design and training of deep learning systems for the prediction of crime spots, employing public data acquired from police reports. The experiments carried out with five openly available datasets show that the deep learning-based techniques consistently outperform the existing best-performing techniques. Also, the efficiency of various parameters in the deep learning frameworks is evaluated, and various perspectives are provided for their configuration to achieve increased performance in crime classification and, ultimately, crime prediction.

Kang and Kang [26] suggested a feature-level data fusion technique with environmental context based on a deep neural network (DNN). The dataset comprises data gathered from different online repositories of crime statistics, geographic, and meteorological data and images for Chicago, Illinois. Before generating the training data, crime-associated data is selected by carrying out statistical analyses. At last, the DNN training is carried out, and the network comprises four types of layers: spatial, temporal, environmental context, and joint feature representation layers. Integrated with critical facts obtained from different domains, the fusion DNN is considered to be the product of an effective decision-making process, which helps in the statistical analysis of data redundancy. The experimental results demonstrate that the DNN model exhibits more accuracy in predicting crime incidence compared to other prediction models.

Shermila et al. [27] designed a model that identifies the patterns in crime from observations gathered from the crime location and then describes the offender who is probably the crime suspect through prediction. This technical work covers two important aspects, which include the analysis of crime and forecasting the offender's identity. The crime analysis step classifies various crimes that are unsolved, as well as evaluates the impact of different factors such as year, month, and weapon used in uncertain crimes.
The system predictively describes the offender employing algorithms such as multilinear regression, K-neighbors classifier, and neural networks.
Lin et al. [28] applied a deep learning algorithm, a class of methods that has found extensive application in fields such as image recognition and natural language processing. The deep learning algorithm yields superior prediction results for probable crime locations compared to other methodologies like random forest and Naïve Bayes. Also, the model performance is improved by collecting data with diverse time scales. For validating the experimental results, the probable crime spots are visualized on a map, and it is inferred whether the models can find the real hotspots.

Krishnan et al. [29] formulated an artificial neural network model, which replaces the traditional data mining approaches in a better manner. In this analysis, crime prediction is done with the help of recurrent long short-term memory (LSTM) networks. An available organized dataset is helpful in the prediction of the crimes. The data is divided into training and testing sets, and the model goes through the training process. The resulting training and testing outputs are then compared with the real crime count, and their visualization is done.

Gosavi and Kavathekar [30] examined data mining approaches that are used in the detection and prediction of crimes, employing association rule mining, k-means clustering, decision trees, Naïve Bayes, and machine learning approaches like deep neural networks and artificial neural networks. The inferences from this survey were that, if the dataset instances contain a large number of missing values, then preprocessing is an important task, and that crimes do not happen consistently across urban locations but are concentrated in particular regions. Therefore, the prediction of crime hotspots is an essential task, and the usage of post-processing will be of massive help in reducing the crime occurrence rate.

Wang et al. [31] designed a benchmarked deep learning spatio-temporal predictor, ST-ResNet, for aggregated prediction of the distribution of crime. The model consists of two steps. The first one performs the preprocessing of the raw crime data; this comprises regularization in both space and time to enhance the predictable signals. Secondly, hierarchical architectures of residual convolutional units are adapted for training multi-factor crime prediction models.

Mowafy et al. [32] showed that criminology is a critical field in which text mining approaches have a significant part to play for law enforcement officers and crime analysts, helping in the investigation and speeding up the resolution of crimes. A general crime mining architecture is proposed, which combines text extraction with the investigation of criminal procedure for forecasting the type of crime by applying text classification to the unstructured data in police incident reports, regarded as a segment of criminal behavior analysis.

Ivan et al. [33] recommended an approach called business intelligence, which depends on supervised learning (classification) approaches, provided that labeled training data is available. A comparison of four varied classification algorithms, including decision tree (J48), Naïve Bayes, multilayer perceptron, and support vector machine, was carried out to find the most efficient algorithm for forecasting crimes. The study employed classification models created with the help of the Waikato Environment for Knowledge Analysis (WEKA).
The decision tree (J48) outperformed the Naïve Bayes, multilayer perceptron, and support vector machine (SVM) algorithms and exhibited much better performance both in terms of execution time
and accuracy. Inference of Deep Learning Methods for Crime Data is shown in Table 2.
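Among the deep learning approaches reviewed above, Krishnan et al. [29] predict crime counts with recurrent LSTM networks. The sketch below illustrates that general idea with a tiny Keras LSTM on a synthetic weekly count series; the window length, layer sizes, and data are assumptions and do not reproduce the authors' model.

```python
# Minimal sketch (assumption: weekly crime counts in a 1-D array); illustrates the
# LSTM-based count prediction described in the review, not any author's implementation.
import numpy as np
import tensorflow as tf

counts = np.random.poisson(lam=20, size=200).astype("float32")  # placeholder series

window = 8                                   # use the last 8 weeks to predict the next one
X = np.array([counts[i:i + window] for i in range(len(counts) - window)])
y = counts[window:]
X = X[..., np.newaxis]                       # shape: (samples, timesteps, features)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(window, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=16, verbose=0)
print("next-week forecast:", float(model.predict(X[-1:], verbose=0)[0, 0]))
```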
3 Issues from Existing Methods

In the criminology literature, the association between crime and different features has been rigorously analyzed, where common instances are historical crime records, the unemployment rate, and spatial similarity. The literature review above conceptualizes predictive policing together with its anticipated advantages and disadvantages. The research reveals a gap between the substantial focus on the potential benefits and disadvantages of predictive policing in the literature and the available empirical proof. The empirical evidence yields very limited support for the claimed advantages of predictive policing. While a few empirical studies show that predictive policing mechanisms result in a reduction in crime, others show no influence. Concurrently, no empirical proof exists at all for the stated disadvantages. With the rising advent of computerized systems, crime data analysts can prove to be of massive help to law enforcement executives in accelerating the process of solving crimes. Employing data extraction and statistical approaches, novel algorithms and systems have been designed alongside new sorts of information. The application of AI and statistical methodologies to crime data, or to other big data applications such as traffic collisions or time series data, will facilitate the investigation, extraction, and interpretation of significant patterns and trends, subsequently helping in the prevention and control of criminal activities. In comparison with deep learning algorithms, machine learning algorithms are bogged down by a few challenges. Despite all its potential and ubiquity, machine learning is just not exact. The factors below are its limitations:
3.1 Data Acquisition

Machine learning needs huge datasets for training, and these must be comprehensive, impartial, and of good quality. The process can also encounter circumstances where it has to wait for the generation of new information.
3.2 Time and Resources

ML requires sufficient time for the algorithms to be trained and to mature enough to satisfy their objective with a reasonable amount of accuracy and consistency. Also, enormous resources are required for its functioning, which can imply that more computing resources are needed.
Table 2 Inference of deep learning techniques for crime data

1. Keyvanpour [20]. Technique: SOM and MLP neural networks. Benefits: SOM clustering technique within the scope of crime analysis with better results. Drawbacks: MLP is now considered inadequate for recently advanced computer vision processes and does not consider spatial information.
2. Lin [21]. Technique: deep neural networks. Benefits: feature design for increased performance and descriptive capability. Drawbacks: needs a massive amount of data for carrying out the classification or crime analysis.
3. Feng [22]. Technique: data mining techniques. Benefits: data mining approaches are particularly employed for crime prediction; Holt-Winters combined with multiplicative seasonality yields superior results. Drawbacks: privacy, security, and information misuse are huge challenges if they are not dealt with and resolved properly.
4. Stalidis [25]. Technique: state-of-the-art methods and deep learning. Benefits: the efficiency of various parameters in the deep learning frameworks provides perspective for their configuration to attain better performance. Drawbacks: needs a massive amount of data to perform better than other approaches.
5. Kang [26]. Technique: deep neural network (DNN). Benefits: the DNN results from an effective decision-making process, which helps in the statistical analysis of data redundancy; the DNN model offers more accuracy in the prediction of crime incidence compared to other prediction models. Drawbacks: needs the kind and time of crime incidences and other types of data for their prediction.
6. Shermila [27]. Technique: multilinear regression, K-neighbors classifier, and neural networks. Benefits: the system predictively describes the offender through multilinear regression, the K-neighbors classifier, and neural networks. Drawbacks: the KNN algorithm is not an active learner; it does not use the training data for learning anything and uses it only for classification.
7. Lin [28]. Technique: deep learning algorithm. Benefits: machine learning technique developed to yield increased prediction of future crime hotspot locations, with results verified by real crime data. Drawbacks: crime incidence prediction amounts to finding highly nonlinear correlations, redundancies, and dependencies between numerous datasets.
8. Krishnan [29]. Technique: neural network. Benefits: crime prediction is done with recurrent LSTM networks. Drawbacks: undefined behavior of the network, hardship in presenting the problem to the network, and unpredictable network training duration.
9. Shruti S. Gosavi [30]. Technique: association rule mining, k-means clustering, decision trees, Naïve Bayes, and machine learning techniques. Benefits: prediction of crime hotspots is a highly significant task, and the usage of post-processing will aid in reducing the rate of crimes. Drawbacks: crime does not happen consistently across urban regions but is concentrated in particular regions.
10. Wang [31]. Technique: CNNs and ST-ResNet. Benefits: in the ST-ResNet framework, past dependencies have to be fixed in an explicit manner, and much longer explicit dependencies make the network highly complicated and hard to train. Drawbacks: every training step takes a much longer time.
3.3 Interpretation of Results

One more important challenge is the ability to accurately interpret the results that the algorithm generates. The algorithms must also be chosen carefully.
3.4 High Error-Susceptibility

Machine learning is autonomous but exceedingly vulnerable to mistakes. Suppose an algorithm is trained with a dataset small enough that it cannot be impartial; the resulting forecasts are then influenced by the biased training set, for example, irrelevant advertisements being shown to end users. In ML, such errors can propagate along an error chain, which can go unnoticed for long durations of time. Also, once they are found out, a considerable amount of time is spent identifying the source of the problem, and even longer getting it fixed.
4 Solution

Using deep learning and neural networks, novel representations have been designed for the prediction of crime incidence [34]. Since deep learning [35] and artificial intelligence [36] have gained many victories in computer vision, they have found application in big data analytics (BDA) for trend prediction and categorization. Big data analytics offers the capability to transform how law enforcement departments and intelligence security agencies extract essential information (e.g., criminal networks) from various data sources in real time to corroborate their investigations. Also, deep learning can be viewed as a cascade of layers: every succeeding layer takes the output signal from the earlier layer as its input. This feature, along with several others, yields a few benefits when used for resolving different problems.
4.1 High-Level Performance

Presently, in several domains such as computer vision, speech recognition, and natural language processing, neural networks relying on deep learning technologies perform exceedingly well compared to the techniques employed in conventional machine learning. The levels of accuracy are maximized while the error count is simultaneously reduced.
4.2 Capability to Design New Functions

Traditional machine learning assumes that humans design the features, and this strategy consumes quite a lot of time. Deep learning is capable of creating new features from the limited number present in the learning dataset, which implies that deep learning algorithms can generate novel features to attain the current goals.
4.3 Advanced Analytical Capabilities

For an AI algorithm to function as intended, labeled data has to be prepared. A system based on deep learning algorithms is capable of becoming "smarter" by itself during the problem-solving process and can deal with unlabeled information.
4.4 Adaptability and Scalability

Deep learning methods are quite simple to adapt to dissimilar fields and, compared to conventional ML algorithms, they benefit from the potential of transfer learning, in which a complete pre-trained model is reused, in many cases helping to attain much greater efficiency in a shorter span. Scalability is one more significant strength: neural networks can cope with growth in data better than conventional machine learning algorithms.
5 Conclusion and Future Work

A comprehensive study on crime analysis employing deep learning and machine learning algorithms has been presented. Investigations based on deep learning and machine learning techniques in crime analysis will probably grow in the coming years. This technical work examines the efficiency of deep learning algorithms in the crime data analysis domain and yields recommendations for the design and training of deep learning systems for the prediction of crime hotspots, employing open data from police reports. The benchmarked techniques are compared against deep learning frameworks, and the advantages and drawbacks of the two families of techniques are explained clearly in the review section. At last, the deep learning-based techniques show a consistent performance that is much better than the available best-performing techniques. As future work, the efficiency of various parameters in
the deep learning frameworks is to be evaluated, and insights can be provided for their configuration to attain superior performance in crime classification and, ultimately, crime prediction.
References

1. Hassani H, Huang X, Silva ES, Ghodsi M (2016) A review of data mining applications in crime. Statist Anal Data Mining ASA Data Sci J 9(3):139–154
2. Memon MA, Soomro S, Jumani AK, Kartio MA (2017) Big data analytics and its applications. Ann Emerg Technol Comput (AETiC) 1(1):46–54
3. Dhyani B, Barthwal A (2014) Big data analytics using Hadoop. Int J Comput Appl 108(12)
4. Hassani H, Saporta G, Silva ES (2014) Data mining and official statistics: the past, the present and the future. Big Data 2(1):34–43
5. McClendon L, Meghanathan N (2015) Using machine learning algorithms to analyze crime data. Mach Learn Appl Int J (MLAIJ) 2(1):1–12
6. Tyagi D, Sharma S (2018) An approach to crime data analysis: a systematic review. Communication, Integrated Networks Signal Processing-CINSP 5(2):67–74
7. Pramanik MI, Zhang W, Lau RY, Li C (2016) A framework for criminal network analysis using big data. In: IEEE 13th international conference on e-business engineering (ICEBE), pp 17–23
8. Jha P, Jha R, Sharma A (2019) Behavior analysis and crime prediction using big data and machine learning. Int J Recent Technol Eng (IJRTE) 8(1)
9. Nadathur AS, Narayanan G, Ravichandran I, Srividhya S, Kayalvizhi J (2018) Crime analysis and prediction using big data. Int J Pure Appl Math 119(12):207–211
10. ToppiReddy HKR, Saini B, Mahajan G (2018) Crime prediction and monitoring framework based on spatial analysis. Proc Comput Sci 132:696–705
11. Yerpude P, Gudur V (2017) Predictive modelling of crime dataset using data mining. Int J Data Mining Knowl Manage Process 7(4)
12. Pradhan I, Potika K, Eirinaki M, Potikas P (2019) Exploratory data analysis and crime prediction for smart cities. In: Proceedings of the 23rd international database applications and engineering symposium
13. Yu CH, Ward MW, Morabito M, Ding W (2011) Crime forecasting using data mining techniques. In: IEEE 11th international conference on data mining workshops, pp 779–786
14. Jangra M, Kalsi S (2019) Naïve Bayes approach for the crime prediction in data mining. Int J Comput Appl 178(4):33–37
15. Deepika KK, Smitha Vinod (2018) Crime analysis in India using data mining techniques. Int J Eng Technol 7:253–258
16. Dhaktode S, Doshi M, Vernekar N, Vyas D (2019) Crime rate prediction using K-means. IOSR J Eng (IOSR JEN) 25–29
17. Jain V, Sharma Y, Bhatia A, Arora V (2017) Crime prediction using K-means algorithm. Global Res Dev J Eng 2(5):206–209
18. Sukanya M, Kalaikumaran T, Karthik S (2012) Criminals and crime hotspot detection using data mining algorithms: clustering and classification. Int J Adv Res Comput Eng Technol 1(10):225–227
19. Ladeira LZ, Sanches MF, Viana C, Botega LC (2018) Assessing the impact of mining techniques on criminal data quality. Anais do II Workshop de Computação Urbana (COURB) 2(1)
20. Keyvanpour MR, Javideh M, Ebrahimi MR (2011) Detecting and investigating crime by means of data mining: a general crime matching framework. Proced Comput Sci 872–880
21. Lin YL, Yen MF, Yu LC (2018) Grid-based crime prediction using geographical features. ISPRS Int J Geo-Inf 7(8)
22. Feng M, Zheng J, Han Y, Ren J, Liu Q (2018) Big data analytics and mining for crime data analysis, visualization and prediction. In: International conference on brain inspired cognitive systems, pp 605–614
23. Chauhan T, Aluvalu R (2016) Using big data analytics for developing crime predictive model. In: RK University's first international conference on research and entrepreneurship, pp 1–6
24. Pramanik MI, Lau RY, Yue WT, Ye Y, Li C (2017) Big data analytics for security and criminal investigations. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 7(4)
25. Stalidis P, Semertzidis T, Daras P (2018) Examining deep learning architectures for crime classification and prediction. arXiv preprint arXiv:1812.00602
26. Kang HW, Kang HB (2017) Prediction of crime occurrence from multi-modal data using deep learning. PLoS ONE 12(4)
27. Shermila AM, Bellarmine AB, Santiago N (2018) Crime data analysis and prediction of perpetrator identity using machine learning approach. In: 2nd international conference on trends in electronics and informatics (ICOEI), pp 107–114
28. Lin YL, Chen TL, Yu LC (2017) Using machine learning to assist crime prevention. In: 6th IIAI international congress on advanced applied informatics (IIAI-AAI), pp 1029–1030
29. Krishnan A, Sarguru A, Sheela AS (2018) Predictive analysis of crime data using deep learning. Int J Pure Appl Math 118(20):4023–4031
30. Gosavi SS, Kavathekar SS (2018) A survey on crime occurrence detection and prediction techniques. Int J Manage Technol Eng 8(XII):1405–1409
31. Wang B, Yin P, Bertozzi AL, Brantingham PJ, Osher SJ, Xin J (2017) Deep learning for real-time crime forecasting and its ternarization. In: International symposium on nonlinear theory and its applications, pp 330–333
32. Mowafy M, Rezk A, El-bakry HM (2018) General crime mining framework for unstructured crime data prediction. Int J Comput Appl 4(8):08–17
33. Ivan N, Ahishakiye E, Omulo EO, Wario R (2017) A performance analysis of business intelligence techniques on crime prediction. Int J Comput Inf Technol 06(02):84–90
34. Wang M, Zhang F, Guan H, Li X, Chen G, Li T, Xi X (2016) Hybrid neural network mixed with random forests and Perlin noise. In: 2nd IEEE international conference on computer and communications, pp 1937–1941
35. Wang Z, Ren J, Zhang D, Sun M, Jiang J (2018) A deep-learning based feature hybrid framework for spatiotemporal saliency detection inside videos. Neurocomputing 287:68–83
36. Yan Y, Ren J, Sun G, Zhao H, Han J, Li X, Marshall S, Zhan J (2018) Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement. Pattern Recogn 79:65–78
Improved Density-Based Learning to Cluster for User Web Log in Data Mining

N. V. Kousik, M. Sivaram, N. Yuvaraj, and R. Mahaveerakannan
Abstract The improvements in tuning a website and in retaining its visitors are achieved by deploying an efficient weblog mining and navigational pattern prediction model. This crucial application initially performs data cleaning and initialization procedures until the hidden knowledge is extracted as output. To obtain good results, the quality of the input data has to be good, and hence, more focus should be given to pre-processing and data cleaning operations. Other than this, the major challenge faced is the poor scalability during navigational pattern prediction. In this paper, the scalability of weblog mining is improved by using suitable pre-processing and data cleaning operations. The method uses a tree-based clustering algorithm to mine the relevant items from the datasets and to predict the navigational behavior of the users. The algorithm focuses mainly on density-based learning to cluster and predict future requests. The proposed method is evaluated over BUS log data, which is of great significance since it contains the log data of all the students in the university. The conducted experiments prove the effectiveness and applicability of weblog mining using the proposed algorithm.

Keywords Weblog mining · Navigational tree learning · Clustering · Density-based learning
N. V. Kousik (B), Galgotias University, Greater Noida, Uttarpradesh 203201, India, e-mail: [email protected]
M. Sivaram, Research Center, Lebanese French University, Erbil 44001, Iraq, e-mail: [email protected]
N. Yuvaraj, ICT Academy, Chennai, Tamilnadu 600096, India, e-mail: [email protected]
R. Mahaveerakannan, Hindusthan College of Engineering and Technology, Coimbatore 110070, India, e-mail: [email protected]
1 Introduction

In the present scenario, the entire world relies on websites to interact with the other end. Institutions, organizations, and industries retain their clients by using many ways to make their websites more efficient and reliable. This is achieved using an auditing operation, which can be performed in two ways. The first way is to evaluate the browsing history of a specific user; the collected information, together with feedback contents received from the user, is used for enhancing the website structure and improving the website experience. The second way is to record the navigational history of the client, and this is used to improve the experience of the user. The second option is used widely since it does not rely on voluntary inputs of the client and it also automates the analysis of user navigational history. This is referred to as web usage mining (WUM) or weblog mining (WLM). WLM finds its application in many fields, including web content personalization, recommender systems [15], and prefetching and caching [14]. The benefits of weblog mining are most useful in e-commerce applications, where clients are targeted with relevant advertisements and products while browsing. Such a web access file is created automatically by the web server, and each view of an object, image, or HTML document made by the user is logged. Each view produces a single line of text in the weblog file of a website, and two kinds of log files exist, namely the common log file and the extended log file. The data in the file contains the navigation patterns of single or multiple users across single or multiple websites, covering the browsing behavior of the entire web traffic. Apart from its source of collection, the general characteristics of a weblog file are that it is a text file with an identical format, each entry contains only a single HTTP request, and it carries supporting information like IP address, file name, HTTP response status and size, request date and time, URL, and browser data. Weblog mining consists of three processes: pre-processing or data cleaning, mining the pre-processed data to extract hidden knowledge, and then analyzing the results obtained after extraction. Weblog mining deals mostly with huge datasets, and hence, issues occur due to the availability of space and run time. Apart from such issues, other challenges arise due to the nature of the log file [1]. The web server logs track the user navigational history poorly, since they provide entire control over server capacity and bandwidth. Here, a problem arises for the data mining algorithm, because the web access log accumulated from the server, obtained from the user navigational history, is what is used to extract the user's navigational pattern. In this paper, the main aim is to analyze the WLM process and the pattern prediction of the online navigational history of the user. The present work considers the access log file to process and extract the hidden knowledge. Such usage data can also be collected from the user through browser cookies; however, this is not of prime concern here, since it raises privacy concerns associated with the user. The major contribution of the paper includes pre-processing and cleaning operations performed in three stages. The second is the usage of a tree-based clustering algorithm for mining the user navigational pattern. The final contribution is the use of an effective way to predict the navigational behavior of online users and to test the effectiveness of the methods.
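Since the common log format mentioned above records the IP address, request time, URL, status, and size on a single line per hit, a short parsing sketch may help to make the field layout concrete. The regular expression and the sample line below are illustrative assumptions, not taken from the BUS logs.

```python
# Minimal sketch: parsing one line of a web server access log in the common log format.
import re
from datetime import datetime

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

line = '10.0.0.5 - student42 [12/Mar/2020:09:15:02 +0530] "GET /index.html HTTP/1.1" 200 5321'
m = LOG_PATTERN.match(line)
if m:
    entry = m.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    print(entry["ip"], entry["url"], entry["status"], entry["time"])
```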
The outline of the paper is organized as follows. Section 2 presents the related works. Section 3 provides the proposed framework by integrating the contributions. Section 4 evaluates the proposed methods with various experiments. Section 5 concludes the paper.
2 Related Works

There are many similar approaches for improving WLM, which enhance pattern sequence identification in the data stream. In [1], task-oriented weblog mining is used for identifying online browsing behavior on PC and mobile platforms. This method uses footstep graph visualization to navigate the patterns using sequential rule mining in clickstream data. The purchase decision is predicted using sequence rules in exploration-oriented browsing behavior. In [2], pre-processing, knowledge discovery, and pattern analysis are used to extract weblog data. The extraction process is carried out using a neuro-fuzzy hybrid model. This method uncovers the hidden patterns from WUM on a college website. In the case of [3], the extraction process is carried out using both supervised and unsupervised descriptive knowledge mining. Clustering using association rules and subgroup knowledge discovery is carried out on an extra-virgin olive oil commercial website. Taxonomy is used as a constraint to WLM in [4], in which the transaction data or user information is extracted using an intelligent weblog mining algorithm. This method helps to enable third parties to directly access website functionalities. A rational, action-based recommendation method is used for WLM in [5]. This method uses lexical patterns for itemset generation and better recovery of hidden knowledge.

WLM is carried out using a tool [6], which evaluates the pedagogical process to identify instructors' and students' attitudes in a web-based system. This web-based tool provides support to measure various parameters at both the micro- and macro-level, i.e., for instructors and policymakers, respectively [6]. A study evaluating the self-care behavior of elderly participants is carried out using a self-care service system. This system provides service and analysis of elderly people on a daily basis using WLM activity. Here, various self-care services are analyzed statistically. Then, an interest-based representation constructs the groups of the elders using the ART2-enhanced K-means algorithm, which clusters the patterns. Finally, a sequence-based representation with Markov models and the ART2 K-means clustering scheme is used for mining cluster patterns [7].

From the webpages, web users' ocular movement data is captured using an eye-tracking tool in [8]. This eye-tracking technology is used for the classification of key objects on those websites, eliminating conventional surveying techniques. In this technique, a web user's eye position data on a monitor screen is identified and combined with the total page visits of the weblog sequence. It also extracts significant behavioral insights on user activities. A temporal property is used in [9] for obtaining connection knowledge, where the temporal property is attained in this
technique using a fuzzy association. A GA with 2-tuple linguistic representation is used for resolving fuzzy sets in association rule mining. Fuzzy set intersection boundaries and discovery rules are used for extracting knowledge. To fit real-world log data, a graph representation with an enhanced fitness function is used in the genetic algorithm. Graph theory is combined with sequential pattern mining in [21] for weblog mining. The relationship between data is defined using the request dependency graph in [22]. In [10], EPLogCleaner is used to discover knowledge by filtering out irrelevant items from common prefix URLs. This method is tested under a real network traffic trace from one enterprise proxy. In [11], taxonomy-based intentional browsing data is used for improving WLM. This method clarifies associations with other browsing data. Besides, an online data collection method is used to make intentional browsing data available for weblog data. In [12], an effective relationship between global and local data items is studied by extracting information from user web sessions. The data is segmented using similarity distance measures, and the associations between the data are met using the registry federation mechanism. Further, for performing sequential mining, a distributed algorithm is used inside the registry federation. In [13], a navigational pattern tree is used for modeling weblog data behavior, and a navigational pattern mining algorithm is used for finding sequential patterns. This method scans the navigational pattern tree's sub-trees for generating candidate recommendations. In [14], the content mining technique is combined with a weblog for generating user navigation profiles for link prediction and profile enrichment using diversified semantic information and user interest profiles based on language and global users. The semantic information is obtained using destination marketing organizations, which match and provide a user-dependent profile for prospective web designs. This method is tested over the bidasoaturismo website using this non-invasive web mining with minimum information from the web server.

In the case of [15], an automated WLM and recommendation system using existing user behavior on a clickstream is developed. Here, really simple syndication (RSS) provides relevant information, and K-nearest neighbor classification identifies the user clickstream data. It further matches the user group so that browsing meets user needs. This is attained using extraction, cleansing, formatting, and grouping of sessions from the RSS address file of the users, and then a data mart is developed. In [16], a unified intention weblog mining algorithm is used for processing datasets with multiple data types, and it transforms the browsing data into linguistic items using the concept of fuzzy sets. In [17], a fuzzy object-oriented web mining algorithm with two phases is used for extracting knowledge from weblogs, classes, or instances. It has intra- and inter-page mining phases: in the former, linguistic itemsets with the same classes and different attributes are derived, while in the latter, a large sequence representing the webpage relationships is derived. In [18], a website structure optimization problem is resolved using an enhanced tabu search algorithm with advanced search features coming from its multiple neighborhoods, dynamic tabu tenure, adaptive tabu lists, and multi-level criteria.
In the case of [19], recommendation-dependent WUM enhances the quality of current recommender systems using product taxonomy. This method tracks the customers' purchasing behaviors using a rating database and considerably enhances the quality of
recommendations. Product taxonomy improves the nearest neighbor search using dimensionality reduction. The remaining recommender systems use a Kohonen neural network or self-organizing map (SOM) [20] for improving search patterns both online and offline. However, this method suffers from poor scalability problems due to the rapid usage of the system to mine weblog data. The solutions for improving mining are shown in the proposed system.
3 Proposed System

The weblog data accumulates successful hits from the Internet. Hits are defined as requests made by the user for viewing a document or an image in HTML format. Such weblog data is created automatically and stored either on a client-side server or on a proxy server of the organization database. Weblog data has details like the IP address of the computer making the query request, the request time, the user ID, a status field defining whether the request was successful or not, the transferred file size, the URL, and the browser name and version. The data cleaning and pre-processing steps involve page view creation and session generation. The identification of session operations is based entirely on time-dependent heuristics. This kind of time-dependent approach decides the session time-out using a time duration threshold. This helps to attain better quality output using the proposed approach.
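The session identification described above can be sketched with a simple time-out heuristic: a new session starts whenever the gap between two consecutive requests of the same user exceeds a duration threshold. The 30-minute threshold and the column names below are assumptions for illustration only.

```python
# Minimal sketch of time-out based session identification (assumed column names).
import pandas as pd

THRESHOLD = pd.Timedelta(minutes=30)          # assumed session time-out

log = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2"],
    "timestamp": pd.to_datetime(
        ["2020-03-12 09:00", "2020-03-12 09:10", "2020-03-12 11:00", "2020-03-12 09:05"]),
})

log = log.sort_values(["user_id", "timestamp"])
gap = log.groupby("user_id")["timestamp"].diff()
# A new session starts whenever the gap to the previous request exceeds the threshold.
log["session_id"] = (gap.isna() | (gap > THRESHOLD)).groupby(log["user_id"]).cumsum()
print(log)
```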
3.1 Pre-processing

This is the initial step to clean the weblog content, which converts log data in an unformatted version into input accepted by the cluster mining process. The cleaning and pre-processing operation mainly involves three major steps: cleaning of data, user identification, and session identification. Data cleaning and user identification involve data integration, data anonymization, data cleaning, feature selection, scenario building, feature generation, and target data extraction. The cleaning operation helps to remove the unwanted entries, which is quite an important step with regard to analysis or mining. The weblog data pre-processing has nine steps, which are shown in Fig. 1.

Data Integration. The weblog source is acquired from a weblog server over a duration of 6 to 12 months. The dataset has various fields that are integrated from multiple sources and used in the evaluation for proving the proposed method's effectiveness. Here, the BUS dataset with students' records and multiple fields is considered, as shown in Table 1.
Fig. 1 Steps in pre-processing
Table 1 Depiction of basic characteristics of BUS dataset

User_id: unique ID
Student_username: student number
Login_time: student's login time
Logout_time: student's logout time
Credit: charge of Internet credit
Conn_duration: connection duration
Sum_updown: download/upload sum
Conn_state: connection state
Ras_description: hotspot name
Ras_ip: IP of network connection
Kill_reason: disconnection warning
Reason: reason for network disconnection
Remote_IP: remote IP of network connection
Station_IP: system's information code
Station_IP_Value: system's information value
Data Anonymization. It consists of three steps: generalization, suppression, and randomization. The first step replaces an attribute value with a general value, the second step prevents the release of an attribute's real value and displays its occurrence with some notation, and the final step replaces the real value with a random value. The proposed system uses a suppression strategy, since the privacy of data is a major concern in the proposed system. Private information in the BUS dataset is derived from student_username, and the students' privacy is preserved by replacing the original values with pseudo values. This prevents an anonymous party from discovering a student's identity, while the proposed model remains capable of handling the data.
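A minimal sketch of the suppression strategy just described: each real student_username is replaced with a stable pseudo value, so the same student always maps to the same code. The code format (STUxxxxx) and the sample values are assumptions.

```python
# Minimal sketch of suppression-style pseudonymization of student_username.
import pandas as pd

df = pd.DataFrame({"student_username": ["951123456", "961234567", "951123456"]})

# Stable pseudonym per student: identical usernames map to the same pseudo code.
codes = {name: f"STU{i:05d}" for i, name in enumerate(df["student_username"].unique())}
df["student_username"] = df["student_username"].map(codes)
print(df)
```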
Table 2 Feature generation

Student_username → Code_Field (studying field), User_Type (user type), Entrance_Year (entrance year), Code_Level (studying level)
Login_time → Login_date (student's login date), Login_hour (student's login hour), Login_min (student's login minutes)
Logout_time → Logout_date (student's logout date), Logout_hour (student's logout hour), Logout_min (student's logout minutes)
Data Cleaning. The cleaning operation involves three steps: detection of missing as well as noisy data, automated filling with a global constant value if a value is missing, and removal of duplicated records. The presence of negative values in the dataset degrades the model's performance. Hence, negative values in the credit and duration features are replaced with suitable positive values using binning and smoothening operations.

Feature Selection. This process eliminates redundant and irrelevant features using Spearman correlation analysis, which identifies features that are correlated with one another. In the present dataset, features like duration and kill_reason are eliminated, since the duration feature can also be obtained from login_time and logout_time, and the feature reason has an elevated correlation with kill_reason. Another feature, static_ip, is eliminated, as it is found to be irrelevant to the current target.

Scenario Building. The proposed system is defined with two scenarios to analyze user behavior based on university regulations. It includes the identification of students connected to the network and learning students' behavior from any hotspot within the college campus on a holiday.

Feature Generation. According to the considered scenarios, the required feature sets are generated in this step. Accordingly, student_username is divided into four features, which are shown in Table 2.
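The cleaning, Spearman-based feature selection, and feature generation steps above could look roughly like the following pandas sketch; the inline sample data, the simple clipping of negative values (the paper uses binning and smoothening), and the 0.95 correlation cut-off are assumptions rather than values fixed by the paper.

```python
# Minimal sketch of cleaning, Spearman feature screening, and feature generation.
import pandas as pd

df = pd.DataFrame({
    "login_time":  pd.to_datetime(["2020-03-12 09:00", "2020-03-12 14:30"]),
    "logout_time": pd.to_datetime(["2020-03-12 09:45", "2020-03-12 15:10"]),
    "credit":        [120.0, -5.0],     # a negative value to be cleaned
    "conn_duration": [2700.0, -1.0],
    "sum_updown":    [350.0, 80.0],
})

# Data cleaning: negative credit/duration values are replaced (here simply clipped to 0).
for col in ["credit", "conn_duration"]:
    df[col] = df[col].clip(lower=0)

# Feature selection: flag numeric features that are highly rank-correlated (Spearman).
corr = df.select_dtypes("number").corr(method="spearman").abs()
redundant = {c for c in corr.columns
             for other in corr.columns
             if c != other and corr.loc[c, other] > 0.95}   # keep one of each pair in practice

# Feature generation: split login_time into date/hour/minute as in Table 2.
df["login_date"] = df["login_time"].dt.date
df["login_hour"] = df["login_time"].dt.hour
df["login_min"]  = df["login_time"].dt.minute
print(redundant)
print(df)
```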
Table 3 Intended characteristics for data extraction. Target features: Validity, Reason, Logout-min, Logout-data, Logout-hour, Login-min, Login-data, Login-hour, Ras-IP, Remote-IP, Successfully-state, Ras-decription, Sum-in–out-mb, YearOfInterance, Duration, LevelCode, TypeOfuser, FieldCode
A new validity feature with two values is created as per scenario one: it indicates whether the login or logout time of the student is set or missing. As per scenario two, a day feature with seven values is created, where every day of the week is allocated a value.

Target Data Extraction. The target data is extracted from the above pre-processing operations, and the schema is shown in Table 3. Depending upon scenario two, the number of network connections from Ras_description is computed and stored. When the network connection count is more than the threshold for a student, the activity is considered unusual behavior. If a medical student connects to the engineering faculty hotspot, it is also regarded as unusual behavior.
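A hedged sketch of the scenario-two rule: per student and day, the number of distinct hotspots (Ras_description values) is counted and compared against a threshold. The threshold of 3 and the sample records are assumptions.

```python
# Minimal sketch of the unusual-behavior rule from scenario two (assumed threshold).
import pandas as pd

THRESHOLD = 3  # assumed maximum number of distinct hotspots per student per day

df = pd.DataFrame({
    "student_username": ["STU00001"] * 4 + ["STU00002"],
    "login_date": ["2020-03-14"] * 5,
    "ras_description": ["eng-hotspot", "med-hotspot", "lib-hotspot", "dorm-hotspot", "eng-hotspot"],
})

conn_counts = (df.groupby(["student_username", "login_date"])["ras_description"]
                 .nunique().rename("n_hotspots").reset_index())
conn_counts["unusual"] = conn_counts["n_hotspots"] > THRESHOLD
print(conn_counts)
```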
3.2 Potential User Identification

This step is used for separating potential users from the BUS dataset, and interested users are identified using the C4.5 decision tree classification algorithm. Decision rules are set to extract potential users from the dataset, and the algorithm avoids the entries updated by the network manager. The network manager normally collects and updates information by crawling around webpages. Such crawling produces huge log files and creates a negative impact while extracting knowledge from user navigational patterns. The proposed method resolves the issue by identifying the entries made by the network manager prior to segmentation of the potential users. The weblog entries updated by the network managers are identified using their IP addresses; however, with this knowledge alone it is difficult to discover search engines and agents. Alternatively, the root directory of the website is studied, since the network managers read the root files prior to website access. The weblog files containing the website access details are given to each manager before crawling, so that it knows its rights. However, the access of network managers cannot be relied on, since the exclusion standard of the network manager is considered voluntary; the method therefore tries to detect and eliminate all the entries of network managers that have accessed the weblog file, and it also detects and eliminates all network manager accesses made around midnight. This leads to the elimination of the network manager entries in head mode; the method computes the browsing speed and excludes network managers whose speed is less than a threshold value T, and also those whose total visited pages exceed a threshold value.
Browsing speed is estimated based on the total count of pages browsed and the total session time. For handling the total count of entries by network managers, a set of decision rules is applied. This helps to group the users into potential users and non-potential users. Using valid weblog attributes, the classification algorithm classifies users based on training data. Attribute selection is carried out within 30 s, and the session time for referring the total pages is 30 min. Further, the decision rule for identifying a potential user is a session time of less than 30 min, with the total page accesses predetermined to be less than 5. The access code post is used for classifying users, and it reduces the weblog file size, which helps to improve clustering prediction and accuracy.
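The potential-user filtering can be sketched as below: a browsing speed is computed from page count and session time, crawler-like sessions are discarded, and the remaining sessions are labelled with the 30-minute / 5-page rule stated above. Treating unusually fast sessions as crawler-like, and the numeric speed limit, are assumptions about how the speed threshold is applied.

```python
# Minimal sketch of potential-user filtering (assumed thresholds and columns).
import pandas as pd

SPEED_LIMIT = 1.0   # assumed: more than 1 page per second suggests a crawler

sessions = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "pages": [4, 300, 3],
    "duration_min": [12.0, 1.0, 45.0],
})

sessions["speed"] = sessions["pages"] / (sessions["duration_min"] * 60)  # pages per second
humans = sessions[sessions["speed"] <= SPEED_LIMIT]                      # drop crawler-like sessions

# Decision rule from the text: session shorter than 30 minutes and fewer than 5 page accesses.
humans = humans.assign(
    potential_user=(humans["duration_min"] < 30) & (humans["pages"] < 5)
)
print(humans)
```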
3.3 Clustering Process

The proposed method uses an evolving tree-based clustering algorithm, which groups the potential users by their navigational patterns. The evolving tree graph sets the connectivity between the webpages, where the graph edges are assigned weights based on session time, connectivity time (a measure of the total visits to two webpages in a particular session), and frequency.
C_{x,y} = \frac{\sum_{i=1}^{N} T_i \, f_x(k) \, T_{xy} \, f_y(k)}{\sum_{i=1}^{N} T_i \, T_{xy}}    (1)
where T_i is the time duration of session i in both the x and y webpages, and T_{xy} is the difference between the request times of the x and y webpages in a specific session. If the webpage appears at the kth position, it is denoted as f(k) = k, and the frequency measure between the x and y webpages is given as

F_{x,y} = \frac{N_{xy}}{\max(N_x, N_y)}    (2)
where N_{xy} is the total number of sessions containing both the x and y webpages, N_x is the session count of page x, and N_y is the session count of page y. The values C_{x,y} and F_{x,y} normalize the time and frequency values between 0 and 1. Hence, the degree of connectivity between the two webpages is calculated using
w_{x,y} = \frac{2 \, C_{x,y} \, F_{x,y}}{C_{x,y} + F_{x,y}}    (3)
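A small sketch of how Eqs. (1)-(3), as reconstructed above, could be evaluated for one pair of pages and stored in the adjacency matrix described next; all numeric inputs and the page indices are made-up examples.

```python
# Minimal sketch of Eqs. (1)-(3) for a single pair of webpages (x, y); the
# per-session inputs and counts are assumed to come from the pre-processed sessions.
import numpy as np

T = np.array([300.0, 120.0, 600.0])      # T_i: duration of each session containing x and y
T_xy = np.array([40.0, 15.0, 90.0])      # T_xy: time between the requests for x and y
f_x = np.array([2.0, 1.0, 3.0])          # f_x(k): position-based frequency of x per session
f_y = np.array([3.0, 2.0, 4.0])          # f_y(k): position-based frequency of y per session

# Eq. (1): session-weighted connectivity between x and y.
C_xy = np.sum(T * f_x * T_xy * f_y) / np.sum(T * T_xy)

# Eq. (2): frequency measure from session counts.
N_xy, N_x, N_y = 3, 7, 5                 # sessions with both pages / with x / with y
F_xy = N_xy / max(N_x, N_y)

# Eq. (3): harmonic combination used as the edge weight.
w_xy = 2 * C_xy * F_xy / (C_xy + F_xy)

# The weight is then stored in an adjacency matrix over the pages.
n_pages = 4
adjacency = np.zeros((n_pages, n_pages))
adjacency[0, 1] = adjacency[1, 0] = w_xy  # pages x=0, y=1 (illustrative indices)
print(round(C_xy, 3), round(F_xy, 3), round(w_xy, 3))
```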
The weights are stored in an adjacency matrix m, and each entry of m holds a w_{x,y} value, as per Eq. (3). The excessive number of edges in the graph is reduced by discarding the edges below a correlation threshold, i.e., those with minimum frequency contribution.

Evolving Tree Fundamentals. Figure 2 shows the structure of the tree with N_node nodes in the network, where each node is N_{l,j}, with l the node identity and j its parent node, l = 1, 2, ..., N_node, j = 0, 1, ..., and l is distinct from j. As an example, the node N_{2,1} has the parent node N_{1,0}. On the other hand, the weight vector of each node is given as w_l = {w_{l,1}, w_{l,2}, ..., w_{l,n}}, with n the number of features and b_l a hit counter. The hit counter holds the total count of times a node becomes the best matching pair (match) between the webpages. The size and depth of the tree are determined by the N_node nodes and the maximum number of layers; e.g., the size of the tree in Fig. 2 is 9 and its depth is 4. The evolving tree has three types of nodes, namely the root node (the first node, N_{1,0}), trunk nodes (shown as blue circles other than N_{1,0}), and leaf nodes (green circles, N_{lf}, where lf ∈ l). The root node is the first layer of the evolving tree and does not have a parent
Fig. 2 An example of the ETree structure (layers 1–4; root node N1,0 with trunk and leaf nodes N2,1, N3,1, N4,2, N5,2, N6,3, N7,3, N8,5 and N9,5)

Fig. 3 Pre-processing and cleaning operation to test the transactions (number of transactions for users vs. potential users on Day 1, Day 2 and Day 3)
A trunk node lies between the root node and the leaf nodes; it is a static node and acts as an interconnection node between the leaf nodes. A leaf node does not have any child nodes, and the minimum number of trunk nodes is used to determine the distance between two leaf nodes. For example, the total number of trunk nodes between N7,3 and N4,2 is 3, and hence the tree distance between N7,3 and N4,2 is also 3, i.e., dT(N7,3, N4,2) = 3.

Evolving Tree-Based Clustering Algorithm

The evolving tree has two parameters, namely the splitting threshold θs and the number of child nodes created during the split process, θc. Consider training sample data with n features, where X(t) = [x1, x2, …, xn]; the algorithm learns the objects from the weblog data, which is the training sample data. Initially, the training data is fetched, and if a training model is available, the dataset is loaded into it; otherwise, a new training model is created. This operation takes place at the root node, and then the process moves towards the leaf nodes. The best matching pairs in the training dataset are found. The distance between the best matching pair is computed using the Euclidean similarity value, and the child node is matched at layer 2 using E{X(t)}. Finally, the shortest distance between the child and the best matching pair is found using the minimum distance equation, Eq. (4). If the leaf node and the match2 value are the same, the Nmatch value is calculated from X(t); when the leaf node does not match the match2 value, the child node is estimated instead. The overall process is repeated until the entire set of leaf nodes (Nlf) is found. If the value of Nlf is greater than one and the score(d(X(t), Wl)) values are the same, then the matching pair is chosen randomly; on the other hand, if the Nlf value is less than or equal to one and the score(d(X(t), Wl)) values are dissimilar, then the process goes to step 13. Once the matching pair is chosen using the weighted values, the weight vector of the best matching pair is found and updated, and the process is repeated until the best-weighted pair is updated. After updating the matching pair (wx,y or wmatch), the weight vector is updated using the Kohonen learning rule given in Eq. (5). The neighborhood function of the tree is calculated using Eq. (6) to obtain the expansion of the tree structure. Finally, the updated Nmatch value is chosen as a parent node, and the weights of the parent node are used to initialize its child nodes. Once all the values of the tree and child nodes are known, the tree is updated, and hence the training model is updated further; this is used as a learning model for new training data.

Algorithm 1: Evolving tree-based clustering algorithm
1: Fetch a training data, X(t) = [x1, x2, …, xn]
2: If trained model is available
3:     Load the trained model (θs, θc)
4: Else
5:     Create a new model (θs, θc)
6: End
7: While process moves from root to leaf node
8:     Find Nmatch for X(t)
9:     Find Euclidean similarity measure between the pairs X(t) in the webpage = E{X(t)}
10:    Match the child node at layer 2 with E{X(t)}, then
11:    Calculate the minimum distance between the child node and best matching pair at layer 2,
12:        d(X(t), W_l) = \| X(t) − w_l(t) \|    (4)
13:    If match2 = leaf node, Then    //best matching pair at layer 2
14:        Calculate Nmatch for X(t)
15:    Else
16:        Child node related to match at layer 2 is matched in a similar fashion
17:    End
18:    Repeat the process until the Nlf are found
19:    If Nlf > 1 && score(d(X(t), W_l)) = same
20:        rand(match)    //matching pair is chosen in random manner
21:    Else
22:        Go to step 13
23:    End
24:    Update the w_{x,y} or w_match    //weight vector of best matching pair is updated
25:    b_match = b_match + 1
26:    Weight vector is updated using Kohonen learning rule
           w_lf(t + 1) = w_lf(t) + h_{Nmatch\_lf}(t) \, [X(t) − w_lf(t)]    (5)
27:    Calculate h_{Nmatch\_lf}    //neighborhood function
           h_{Nmatch\_lf}(t) = \alpha(t) \exp\!\left(−\frac{dT(N_{match}, N_{lf})^2}{2\sigma^2(t)}\right)    (6)
       where dT(N_{match}, N_{lf}) – distance between N_match and N_lf;
       α(t) – learning rate; σ(t) – Gaussian kernel width, which is monotonically reduced with t
28:    The tree is considered growing
29:    If b_match = θs
30:        Update Nmatch as a parent node with θc
31:        Initialize w(θc), such that weight of parent node is same as child nodes
32:    Else
33:        GoTo step 26
34:    End
35: Training model is updated
36: Learn new training data
37: End
3.4 Prediction Engine

The prediction engine classifies the user navigation patterns, and the future request of the user is predicted based on this classifier. The longest common subsequence algorithm is utilized for the prediction process; it finds the common longest subsequence over the entire sequence, and the algorithm relies on two properties:
• When two sequences x and y in a webpage end with the same element, the common longest subsequence (cls) is found by eliminating the end element and then processing the shortened sequences.
• When two sequences x and y in a webpage do not end with the same element, the longer of the two subsequences is taken, i.e., cls(xn, ym−1) for x and cls(xn−1, ym) for y.
The cls is thus calculated using Eq. (7), which is given by,
cls(x_i, y_j) = \begin{cases} 0 & \text{if } i \text{ or } j = 0 \\ (cls(x_{i-1}, y_{j-1}), x_i) & \text{if } \ddot{x}_i = \ddot{y}_j \\ \mathrm{long}(cls(x_{i-1}, y_j), cls(x_i, y_{j-1})) & \text{if } \ddot{x}_i \neq \ddot{y}_j \end{cases}    (7)
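As a concrete illustration of Eq. (7), the sketch below (one possible realization of the recurrence, not the authors' implementation) computes the common longest subsequence of two page-visit sequences with standard dynamic programming.

```python
def cls(x, y):
    """Common longest subsequence of two page sequences, following Eq. (7)."""
    n, m = len(x), len(y)
    # table[i][j] holds the cls of the first i elements of x and the first j elements of y
    table = [[[] for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:                 # sequences end with the same element
                table[i][j] = table[i - 1][j - 1] + [x[i - 1]]
            else:                                    # keep the longer of the two shortened cases
                a, b = table[i - 1][j], table[i][j - 1]
                table[i][j] = a if len(a) >= len(b) else b
    return table[n][m]

print(cls(["home", "news", "sports", "contact"], ["home", "sports", "contact"]))
# ['home', 'sports', 'contact']
```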
The cls common to xi and yj is found by comparing their elements ẍi and ÿj. The sequence cls(xi−1, yj−1) can only be extended by ẍi when xi and yj are equal. If xi and yj are not equal, the longer of the sequences cls(xi−1, yj) and cls(xi, yj−1) is taken; if cls(xi−1, yj) and cls(xi, yj−1) are of the same length, both values are retained, provided they are not identical. Algorithm 2 consists of the following steps: the webpages are assigned URLs, and for each webpage pair the weight is calculated; the edge weight is computed over all nodes in the graph, and the edges with minimum frequency are removed; the remaining high-frequency nodes are then used to form clusters using a depth-first search; finally, clusters below the minimum size are removed from the graph.

Algorithm 2: Prediction Engine Algorithm
1: URL is assigned over the list of webpages, L[p] = P
2: For each pair (Pi, Pj) ∈ L[p] do    //webpage pair
3:     Mij = w(Pi, Pj); Edgeij = Mij    //weight is computed using Eq. (3)
4: End for
5: For Edgeu,v ∈ GraphE,V do    //edge weight with minimum frequency is removed
6:     If Edgeu,v < min(frequency)
7:         Remove Edgeu,v
8:     Else
9:         Keep Edgeu,v
10:    Endif
11: End for
12: For all vertices u do
13:    C[i] = DFS(u);    //perform depth-first search to form the cluster
14:    If C[i] < min(size(C))
15:        Remove C[i]    //remove cluster with length lesser than minimum size of cluster
16:    Else
17:    End
18:    i = i + 1
19: End for
20: Return the Cluster
4 Evaluation and Discussions

This section evaluates the proposed method through a series of experiments; the BUS dataset (collected from Bharathiyar University, India) is used as the testing environment. The dataset consists of 1,893,725 entries and 533,156 webpages. The algorithms are implemented in Java. The results related to pattern discovery are obtained using the proposed evolving tree-based clustering, and the paper also reports the performance gains in accurately predicting user navigational patterns and in run time. Initially, the process starts by discarding or filtering the noisy weblog data, implicit requests, error entries and network manager entries. Then the clustering process is carried out to group the potential users, and the longest common subsequence algorithm is used to give the best predictive response for future requests.
4.1 Session Identification

The improvement proposed here is based on the observation that returning users without repeated requests do not help in identifying knowledge related to the navigational pattern. Once the sessions are detected using a timing threshold, say 30 min, a check is made to detect whether the pattern is shared by the same user or not. When a shared user navigation pattern exists, the identified sessions are approved; otherwise, the sequences being split into sessions are skipped. The investigations are carried out to find the effects on the quantity and quality of the sessions identified.
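A minimal sessionization sketch under the 30-minute threshold described above is shown below; the log layout is an illustrative assumption, not the authors' code.

```python
from datetime import datetime, timedelta

THRESHOLD = timedelta(minutes=30)

def split_sessions(requests):
    """requests: list of (timestamp, url) tuples for one user, sorted by time."""
    sessions, current = [], []
    for ts, url in requests:
        if current and ts - current[-1][0] > THRESHOLD:   # a long gap starts a new session
            sessions.append(current)
            current = []
        current.append((ts, url))
    if current:
        sessions.append(current)
    return sessions

log = [(datetime(2020, 1, 1, 9, 0), "/home"), (datetime(2020, 1, 1, 9, 10), "/news"),
       (datetime(2020, 1, 1, 10, 5), "/home")]
print(len(split_sessions(log)))   # 2 sessions
```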
The present investigation uses two time thresholds, 10 and 30 min, with an equal set of experiments. Each timing threshold is tested with three different minimum lengths of patterns (lsp), 1 to 3. The tests are run with a sharing variable that ranges from 10 to 100%; as the value of this variable increases, the sessions associated with the patterns are shared more widely. Using the values of Table 3, the ratio for the different time thresholds and lsp values is evaluated.
The inference from Table 4 can be obtained as follows: as the threshold time reduces, the ratio of correctly classified instances is reduced; this is because minimum frequency values are mostly eliminated at the lower threshold level. Hence, the lowest ratio value corresponds to very few correctly classified instances from the weblog data. Also, with reducing length of the patterns, the ratio of classified instances increases greatly, being roughly tenfold greater for lsp 1 than for lsp 2 and 3. The same holds for the threshold time of 30 min; however, the classification ratio improves to a greater extent than for the threshold time of 10 min.

4.2 Pre-processing

The proposed algorithm is tested over the dataset to find the benefits of the proposed navigational pattern approach. The raw data is sent through the process of pre-processing, cleaning and session identification prior to clustering. Table 3 provides the results of total transactions and memory used for storing the cleaned weblog data. Figure 3 shows the evaluated results of identified potential users after the pre-processing operation; it is found that many irrelevant items are removed, resulting in high-quality potential users being identified (Table 5).

Table 4 Results of various sessions for user navigational patterns
Threshold | Sessions identified correctly | False positives rate | Total sessions identified | Ratio = False positives / Total
10 min | 16,173 | 7129 | 23,302 | 30.59
30 min | 15,345 | 5793 | 21,138 | 27.41
Table 5 Results of sessions for varying length and threshold time

R in % | Ratio, time = 10 min, lsp ≥ 3 | Ratio, time = 10 min, lsp ≥ 2 | Ratio, time = 10 min, lsp ≥ 1 | Ratio, time = 30 min, lsp ≥ 3 | Ratio, time = 30 min, lsp ≥ 2 | Ratio, time = 30 min, lsp ≥ 1
100 | 1.6 | 1.7 | 19.8 | 4.2 | 4.5 | 21.1
90 | 1.6 | 1.7 | 20.8 | 4.2 | 4.5 | 21.8
80 | 1.6 | 1.9 | 22.4 | 4.2 | 5 | 23.2
70 | 1.6 | 2.4 | 23.5 | 4.2 | 5.5 | 24
60 | 1.8 | 3.6 | 25.3 | 4.4 | 7 | 25.1
50 | 1.9 | 5.1 | 26.5 | 4.6 | 8.8 | 25.7
40 | 2.1 | 6.5 | 26.9 | 5 | 10.4 | 26
30 | 2.6 | 8 | 27.3 | 4.9 | 12.1 | 26.2
20 | 4.2 | 10.4 | 27.4 | 7.3 | 14.6 | 26.3
10 | 5.6 | 12.1 | 27.4 | 9.5 | 15.5 | 26.3
4.3 Results of Clustering

The clustering result is used to find the different forms of information extracted from the weblog data: the total number of visits made to the website, the traffic of each webpage, the frequency with which each page is viewed, and the behavior of the user navigational patterns. The weblog data is considered with 100 unique webpages, where clarity is improved by assigning codes to them. The total visits made over 24 h on these 100 pages are used to test the proposed system's performance. The minimum frequency, or threshold value, is used to remove correlated edges with lower values, and the minimum cluster size is set to one. It is clear from the results that the threshold value of 0.5 shows the optimal results on the associated dataset. The test is repeated against different weblog data sizes and the results are recorded. Thus, the clustering threshold of 0.5 is used for the prediction process.
4.4 Prediction Results

The prediction algorithm performance is evaluated using three performance parameters, as shown in Eqs. (8)–(10). The navigational patterns obtained from the previous step are divided into two sets, one for generating predictions and one for evaluating them. The parameters are defined as,

accuracy = \frac{|P(a_n, T) \cap Eval_n|}{|P(a_n, T)|}    (8)

coverage = \frac{|P(a_n, T) \cap Eval_n|}{|Eval_n|}    (9)

F1\text{-}measure = \frac{2 \times accuracy(P(a_n, T)) \times coverage(P(a_n, T))}{accuracy(P(a_n, T)) + coverage(P(a_n, T))}    (10)
where a_n is the navigation pattern for the active session, T is the threshold value, P(a_n, T) is the prediction set and Eval_n is the evaluation set. The prediction rate increases, since the accuracy of the prediction increases with the threshold values, and the best accuracy obtained is 92%.
5 Conclusions and Future Work

An effective weblog mining framework is proposed to predict online navigational behavior. The proposed heuristic method is tested under two experimental sets, and the results show that the patterns obtained have only a small number of false positive instances. With an increasing number of instances, the false positives become smaller still, and the output quality remains good owing to the proper cleaning and pre-processing applied before the mining algorithm. The mining algorithm uses a clustering process to select the relevant patterns from the input; this clustering helps to detect all patterns relevant to user navigation rather than focusing only on the most frequent ones. The framework for improving online user navigational behavior prediction fits well with the prediction of online patterns, reduces the running time and produces online patterns within a shorter time.
Spatiotemporal Particle Swarm Optimization with Incremental Deep Learning-Based Salient Multiple Object Detection

M. Indirani and S. Shankar
Abstract Recent developments in computer vision applications detect the salient objects in videos, which plays a vital role in our day-to-day lives. The difficulty of integrating spatial cues with motion cues makes salient object detection harder. A spatiotemporal constrained optimization model (SCOM) was provided in the previous system. Although it exhibits good performance in detecting a single salient object, this method does not consider the variation of salient features between different persons, and only some objects meet a more general agreement about their significance. To solve this problem, the proposed system designs a spatiotemporal particle swarm optimization with incremental deep learning-based salient multiple object detection. In this work, an incremental deep convolutional neural network (IDCNN) classifier is introduced for a suitable measurement of success in a relative object saliency landscape. The spatiotemporal particle swarm optimization model (SPSOM) is used for performing the ranking method and the detection of multiple salient objects. To achieve global saliency optimization, a local constraint as well as temporal and spatial cues are exploited. The prior video frame saliency map and the motion history of change detection are handled using SPSOM, and moving salient objects are distinguished from diverse changing background regions. When compared with existing methods, the proposed method exhibits better performance, as shown in the experimental results concerning recall, precision, average run time, accuracy and mean absolute error (MAE).

Keywords Salient object · Spatiotemporal particle swarm optimization model (SPSOM) · Incremental deep convolutional neural network (IDCNN) classifier · Global saliency optimization

M. Indirani (B) Assistant Professor, Department of IT, Hindusthan College of Engineering and Technology, Coimbatore 641032, India e-mail: [email protected]
S. Shankar Professor, Department of CSE, Hindusthan College of Engineering and Technology, Coimbatore 641032, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_60
1 Introduction

In recent days, video salient object detection (VSOD) has gained increasing interest. In general, VSOD is essential for understanding the underlying mechanism of the human visual system (HVS) during free viewing, and it is also used in various real-time applications such as weakly supervised attention, robotic interaction, autonomous driving, video compression, video captioning and video segmentation [1–5]. Challenges in video data, such as large object deformations, blur, occlusions and diverse motion patterns, together with the inherent complexity of human visual attention behavior, such as attention shift and selective attention allocation, make VSOD very difficult in addition to its practical and academic significance. Hence, research interest has increased noticeably in the past few years.

Salient object detection is a task based on the mechanism of visual attention, where the algorithm aims to explore the objects or regions that draw more attention than the surrounding areas of the scene or image. Foreground objects are separated from the background when detecting a salient object in video and images. The assumption that objects are distinctive in pattern, motion or texture when compared with the background forms the basis of this technique [6]. The output is a saliency map for each frame, where every value represents the probability of a pixel belonging to a salient object, and pixels with high probability identify the potential objects. The difficulty of integrating spatial cues with motion cues makes salient object detection harder, and it is also difficult to deal with static adjacent frames and the unavailability of motion features. Complicating factors such as cluttered backgrounds, large background motion, shadowing due to illumination changes and intensity variation influence the acquired video quality. Various methods have been proposed for spatiotemporal salient object detection in video [7, 8].

The ability of deep convolutional neural networks (CNNs) to represent high-level semantic features makes them well suited to this task, and various CNN-based methods have been proposed to detect salient objects with good results [9–11]. However, CNN outputs are coarse with non-sharp boundaries due to the presence of pooling layers and convolutional layers with large receptive fields [12]. An effective DCNN with an incremental growing and training method allows new classes to be learned while sharing part of the base network [12]. Accordingly, an IDCNN classifier is proposed based on a hierarchical representation of relative saliency.

The paper is organized as follows: different salient object detection methods are discussed in Sect. 2, a model for multiple salient object detection is proposed in Sect. 3, experimentation and analysis are presented in Sect. 4, and Sect. 5 concludes the research work.
2 Related Works

Le and Sugimoto (2018) present a technique for detecting salient objects in videos in which temporal information is fully considered in addition to spatial information. The system introduces a new set of spatiotemporal deep (STD) features that exploit local and global contexts over frames. Furthermore, a spatiotemporal conditional random field (STCRF) is proposed to compute saliency from the STD features. STCRF is an extension of CRF to the temporal domain and describes the relationships among neighboring regions both within a frame and across frames. STCRF leads to temporally consistent saliency maps over frames, contributing to the accurate detection of salient object boundaries and to noise reduction during detection. The method first segments an input video into multiple scales and then computes a saliency map at each scale level using the STD features with STCRF; the final saliency map is computed by fusing the saliency maps at the different scale levels [13].

Chen et al. (2018) present a model for video salient object detection called the spatiotemporal constrained optimization model (SCOM). It exploits spatial and temporal cues, as well as a local constraint, to achieve a global saliency optimization. For a robust motion computation of salient objects, a scheme is presented to model the motion cues from the optical flow field, the saliency map of the prior video frame and the motion history of change detection, which is able to differentiate the moving salient objects from diverse changing background regions. Moreover, an effective objectness measure is designed with an intuitive geometrical interpretation to extract some reliable object and background regions, which provide the basis for defining the foreground potential, the background potential and the constraint that supports saliency propagation. These potentials and the constraint are formulated into the SCOM framework to generate an optimal saliency map for each frame in a video [14].

Qi et al. (2019) designed a fast video salient object detection method running at 0.5 s per frame (including an average of 0.32 s for optical flow computation). It mainly consists of two modules, an initial spatiotemporal saliency module and a correlation filter-based salient temporal propagation module. The former integrates spatial saliency, obtained through a robust minimum barrier distance and a boundary contrast cue, with temporal saliency information from the motion field. The latter uses correlation filters to keep the saliency consistent between neighboring frames. The two modules are finally fused in an adaptive manner [15].

Wu et al. (2018) designed a spatiotemporal salient object detection method that integrates saliency and objectness, for videos with complicated motion and complex scenes. The initial salient object detection result is first built upon both the saliency map and the objectness map. Afterwards, the region size of a salient object is adjusted to obtain the frame-wise salient object detection result by iteratively updating the object probability map, which is the combination of the saliency map and the objectness map.
Finally, to improve the temporal coherence, a sequence-level refinement is performed to produce the final salient object detection result. Experimental results on public benchmark datasets demonstrate that this technique consistently outperforms state-of-the-art salient object detection methods [16].

Dakhia et al. (2019) focus on accurately capturing the fine details of salient objects by proposing a hybrid-backward refinement network (HBRNet), which combines the high-level and low-level features extracted from two different CNNs. Taking advantage of access to the visible cues and semantic information of CNNs, this hybrid deep network helps in modeling the object's context and preserving its boundaries as well. In particular, the framework integrates powerful hybrid refinement modules by combining feature maps of two consecutive layers from the two deep networks. In addition, the designed refinement model uses a residual convolutional unit to provide effective end-to-end training. Furthermore, a feature fusion strategy is applied to enable full exploitation of multi-scale features and to progressively recover the resolution of the coarse prediction map. The experimental results show that the designed framework achieves state-of-the-art performance; in the hybrid refinement module, the residual convolutional unit as well as the fusion strategy improve the quality of the prediction maps [17].
3 Proposed Methodology

A spatiotemporal particle swarm optimization model (SPSOM) with incremental deep learning (IDL)-based salient multiple object detection is proposed. Here, foreground and background image subtraction is performed using the incremental deep learning (IDL) algorithm. A new model and ranking method, termed the spatiotemporal particle swarm optimization model (SPSOM), is proposed for detecting multiple salient objects in video, where local constraints and temporal as well as spatial cues are exploited for the global saliency optimization of multiple objects. Figure 1 shows the proposed system's flow diagram.
3.1 Input Video

For a given video sequence, the major goal is the visual and temporal detection of salient objects in every frame Ft of the video, where t denotes the frame index. The proposed saliency model uses the assumption that, for a given video sequence, some reliable regions belonging to the background or to salient objects can be found by analyzing spatial and temporal cues, and from these detected reliable regions, saliency seeds can be derived for achieving the global optimization of salient object detection.
Fig. 1 Flow diagram of the proposed methodology (input video → frames → incremental deep learning → for each object: motion distribution, motion edge, motion history image, motion energy → object-like regions → spatiotemporal particle swarm optimization model → saliency maps for multiple objects → salient multiple object)
In a video sequence, superpixels are generated for every frame to model saliency, using SLIC segmentation, and each superpixel contains approximately 300 pixels. The salient object detection task corresponds to a superpixel labeling problem: in a frame, every superpixel ri (i = 1, 2, …, N) is assigned a saliency landscape value si ∈ [0, 1]. The superpixel labeling is formulated as the minimization of a constrained energy function E(S), where S = {s1, s2, …, sN} is the configuration of saliency labels. Initially, reliable labels are assigned to some superpixels, and the energy has three potentials, namely the foreground potential Φ, the background potential Ω and the smoothness potential Ψ:

\min E(S) = \sum_{i=1}^{N} \Phi(s_i) + \sum_{i=1}^{N} \Omega(s_i) + \sum_{(i,j) \in N} \Psi(s_i, s_j)    (1)

\text{s.t. } \Gamma(S) = k

where, in a frame Ft, N is the neighborhood set of spatially connected superpixel pairs, and k is the constraint vector holding a few values of convincing saliency landscape. A superpixel may be classified as background or salient object using the background potential Ω and the foreground potential Φ. Overall saliency smoothing is promoted using the smoothness potential Ψ, which penalizes neighboring superpixels that are assigned different labels.
3.2 Incremental Deep Convolutional Neural Network (IDCNN)

Four classes (C1–C4) are used for training the base network, and after training, the training data of those classes is discarded. Then, two classes (C5, C6) arrive as new input sample data, and these data have to be accommodated in the network while maintaining the knowledge of the initial four classes [18]. This requires an increase in the capacity of the network, and the network is retrained only with the new data (of C5 and C6) in an effective way, so that the classes of all tasks (C1–C6) can be classified using the updated network. The classification process is termed task-specific classification if the tasks are classified separately, and combined classification if they are classified together. An overview of the incremental learning model is shown in Fig. 2.

Design Approach

In a DCNN, there exist both a classifier and a feature extractor with several layers, which is what gives DCNNs their strength. In the proposed training method, the fixed feature extractor corresponds to the shared convolutional layers, and the classifier
Fig. 2 Incremental learning model: the network needs to grow its capacity with the arrival of data of new classes
corresponds to the fully connected layers, which are not shared. The process of reusing learned network parameters for learning a new set of classes is termed sharing. In every case, only the newly available data is used for learning the new classes, under the assumption that the old and new classes have similar features. In the designed system, a single dataset is split into several sets, which are used as the old and new task data with multiple classes during the network update process. Figure 3 shows that, in the convolutional layers, around ∼60% of the learning parameters can be shared along with the respective ReLU and batch normalization layers, and an accuracy within ∼1% of the baseline is achieved.
Fig. 3 Updated incremental network architecture for training methodology
Fig. 4 Overview of the DCNN incremental training methodology with partial network sharing
Classification accuracy is drastically degraded if significantly more than around 60% of the network parameters are shared. For maximum benefit, the incremental training method in the designed system is developed according to this observation. Incremental training with optimum sharing of the network is proposed in this system, as shown in Fig. 3. The available set of classes is initially split into two sets. The base network is trained using a core (larger) set, and cloned branch networks having various sharing configurations are trained using a small set called the demo set. From the training results, a sharing versus accuracy curve is generated, which is used for selecting the network architecture and the optimum sharing configuration for an application. This curve indicates how far the initial layers of the base network can be shared without degrading the accuracy of the new task; an optimum sharing configuration that satisfies the quality requirements is selected, and the new set of classes can then be learned using this optimum configuration. Computing the optimum sharing configuration directly from the accuracy–sharing trade-off curve is difficult, but the curve provides a tuning knob for trading accuracy against energy benefits. The entire search space does not need to be explored: heuristics based on the network architecture, the number of training samples, the dataset complexity and the number of trainable parameters allow the optimum sharing configuration to be computed within a few retraining iterations of the cloned network. The procedure for computing the optimum sharing point is described in the following passage, and the overall training methodology is shown in Fig. 4 (the flow diagram of the spatiotemporal particle swarm optimization model, SPSOM, is shown in Fig. 5). The available classes are initially separated into two sets, the demo and core sets, for training the base network. The network is first trained using the core set, and a separate network is trained with the demo set; the accuracy of this separately trained network is used as a reference for computing the optimum sharing configuration. A branch network is then created, which shares a few initial layers of the base network, and it is trained using the demo set; this branch network corresponds to a cloned version of the trained base network. The branch network is trained in this system and its performance is compared with the reference accuracy. Sharing is increased if the reference value is close to the new
Fig. 5 Spatiotemporal particle swarm optimization model (SPSOM): initialize the number of superpixels and the position and velocity of each superpixel; compute the distance for each superpixel; update pbest, gbest, position and velocity; repeat until the stopping condition is met; then output the detected multiple objects
accuracy, and the branch is trained again for comparison. Sharing is decreased if the reference value is greater than the new accuracy, and the branch is again trained for comparison. Based on the required quality values, the optimum sharing configuration is finalized after a few iterations. The optimum sharing point indicates the fraction of sharing beyond which the accuracy degradation caused by increased sharing exceeds the quality threshold. With minimum loss of quality, maximum benefits can be achieved using this method. Finally, since both the core and demo sets are available, the base network is retrained with them to enhance the base network features.

(A) Foreground potential

Using spatial–temporal visual analysis, a few reliable object regions O can be obtained; the major assumption used for defining the foreground potential is that these regions are part of the salient object. In a frame Ft, for every superpixel ri, the foreground potential is defined in the system as,
\Phi(s_i) = F(r_i)(1 - s_i)^2    (2)

where the foreground term F(ri) evaluates the foreground probability of superpixel ri. A superpixel with a high foreground term has a high chance of being salient and has a high saliency landscape value si, normalized to the range [0, 1]. In the proposed model, minimization of the energy is promoted by multiplying the foreground term with (1 − si)^2. To model the foreground term F(ri), the average appearance similarity A(ri) between superpixel ri and every superpixel ro in the reliable object regions O is computed, where K denotes the number of masked object regions O and B denotes the masked background regions; clustering is used for computing the regions B and O. The movement of the object regions is estimated by the proposed motion energy term M(ri). The foreground term F(ri) is expressed as,

F(r_i) = A(r_i) \, M(r_i)    (3)

where

A(r_i) = \frac{1}{K} \sum_{o=1}^{K} \exp\!\left(-\frac{dist_g^2(r_i, r_o)}{2\sigma^2}\right)    (4)

and dist_g(r_i, r_o) is the geodesic distance from superpixel ri to superpixel ro ∈ O, computed by accumulating edge weights along the shortest path from ri to ro in an undirected graph.

Motion Energy Term

The motion energy term M is modeled by exploiting the optical flow field; Md, Me, Mh and St−1 are used to form the motion energy term M. The system uses a Sobel edge detector on the optical flow field to generate the motion edge Me, from which the contours of moving objects are extracted. As indicated by the color spatial distribution of the optical flow field, backgrounds show a uniform color distribution over the entire frame, whereas moving objects are distinctive and very compact. The motion distribution measure Md defined in this system is,

M_d(r_i) = \sum_{j=1}^{N} v_{ij} \, \| p_t(r_j) - \mu_i \|^2    (5)

where p_t(r_j) is the normalized centroid of superpixel rj, and μi is the color-similarity-weighted centroid of superpixel ri, expressed as,

\mu_i = \frac{\sum_{j=1}^{N} v_{ij} \, p_t(r_j)}{\sum_{j=1}^{N} v_{ij}}    (6)

The color similarity between superpixels ri and rj is expressed as,

v_{ij} = \exp\!\left(-\frac{dist_c^2(r_i, r_j)}{2\sigma^2}\right)    (7)

The motion distribution Md measures the color discriminativeness and spatial distance between a superpixel and the other pixels of the color optical flow field. For frame Ft, the motion edge Me and the motion history image (MHI) Mh are defined using the generated motion distribution map Md. In Ft, the motion energy term M(ri) for a superpixel is defined by integrating the prior frame's saliency St−1 as,

M(r_i) = (1 - \gamma) S_{t-1}(r_i) + \frac{\gamma}{3}\left(M_h(r_i) + M_e(r_i) + M_d(r_i)\right)    (8)

where γ is a balance parameter, set as 0.5 in our experiment.

(B) Background potential

The background potential is defined in the system as the opposite of the foreground potential, measuring the likelihood of every superpixel being background. The background potential Ω(si) of every superpixel ri is defined as,

\Omega(s_i) = \omega_b(r_i) \, s_i^2    (9)

where the background term ωb(ri) measures the background probability of superpixel ri. Clearly, a superpixel with a small value of ωb(ri) is visually less salient. Energy minimization of Eq. (1) is promoted by multiplying the background term with si^2. The background term is defined by computing the appearance similarity between superpixel ri and the superpixels in the reliable background regions B as,

\omega_b(r_i) = \frac{1}{|B|} \sum_{r_b \in B} \exp\!\left(-\frac{dist_g^2(r_i, r_b)}{2\sigma^2}\right)    (10)

where |B| is the number of superpixels in the reliable background region set B, and dist_g(r_i, r_b) is the shortest average appearance distance between superpixel ri and the superpixels of B.
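The sketch below illustrates how the foreground and motion terms of Eqs. (3)–(8) could be computed; it is not the authors' code, the inputs are assumed to be per-superpixel NumPy arrays, and sigma and gamma follow the text (gamma = 0.5).

```python
import numpy as np

def appearance_term(geo_dist_to_objects, sigma=0.1):
    """geo_dist_to_objects: (N, K) geodesic distances from each superpixel to the K object regions (Eq. 4)."""
    return np.exp(-geo_dist_to_objects**2 / (2 * sigma**2)).mean(axis=1)

def motion_energy(prev_saliency, m_h, m_e, m_d, gamma=0.5):
    """Blend previous-frame saliency with motion history, motion edge and motion distribution (Eq. 8)."""
    return (1 - gamma) * prev_saliency + (gamma / 3.0) * (m_h + m_e + m_d)

def foreground_term(geo_dist_to_objects, prev_saliency, m_h, m_e, m_d):
    # Eq. (3): appearance similarity to reliable object regions times motion energy
    return appearance_term(geo_dist_to_objects) * motion_energy(prev_saliency, m_h, m_e, m_d)
```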
(C) Smoothness potential

Overall smoothing of the saliency labeling is achieved using the smoothness potential, which penalizes neighbouring superpixels that are assigned different saliency labels. It is expressed as,

\Psi(s_i, s_j) = \omega_{ij}(r_i, r_j)(s_i - s_j)^2    (11)

where

\omega_{ij}(r_i, r_j) = \exp\!\left(-\frac{dist_c^2(r_i, r_j)}{2\sigma^2}\right), \quad (i, j) \in N    (12)

Here, N is the neighborhood set containing every pair of spatially adjacent superpixels within a frame, and dist_c(r_i, r_j) is the Euclidean distance between the color features of superpixels ri and rj in the CIE-Lab color space; thus ω_{ij}(r_i, r_j) measures the appearance similarity between superpixels ri and rj.

(D) Reliable regions O and B

The foreground and background potentials are defined according to the reliable object regions O and the reliable background regions B, whose computation is described here; the salient object detection performance largely depends on these regions. In this system, the superpixels in the object-like regions K are clustered, and superpixels near the center of a cluster are more likely to belong to objects. The cluster intensity of ri is defined as,

I(r_i) = \sum_{r_j \in K} \delta\!\left(\| V(r_i) - V(r_j) \| - d_c\right)    (13)

where d_c is a cutoff value; in the proposed method it lies between 0.05 and 0.5 and the method is not sensitive to it. The delta function is expressed as,

\delta(x) = \begin{cases} 1 & \text{if } x < 0 \\ 0 & \text{otherwise} \end{cases}    (14)

The cluster intensity I(ri) indicates how many neighbors rj enclose superpixel ri within the cutoff distance dc; a cluster center enclosed by its neighbors has a high object probability. The cluster intensity I(ri) is used for selecting the reliable object regions O and the background regions B: ri is selected as an object superpixel if I(ri) is greater than the threshold h_o, and as a background superpixel if I(ri) is lower than the threshold h_b. The thresholds h_o and h_b are defined as,

h_o = t_o \cdot \max(I(r_i)), \quad r_i \in K    (15)

h_b = t_b \cdot \min(I(r_i)), \quad r_i \in K    (16)

where t_o denotes the spanning extent of the cluster intensities for the object regions and t_b denotes the spanning extent of the cluster intensities for the background regions.
3.3 Salient Multiple Objects Detection

The relative salience of the detected object regions is considered in the proposed work for predicting the total number of salient objects. For saliency propagation based on the reliable object regions, an affinity matrix W_oi ∈ R^{K×N} from the K superpixels r_o ∈ O to every one of the N superpixels r_i ∈ S is defined, so that,

W_{oi} = [\ldots, \omega_{oi}(r_o, r_i), \ldots, \omega_{KN}(r_K, r_N)]    (17)

where

\omega_{oi}(r_o, r_i) = \exp\!\left(-\frac{dist_c^2(r_o, r_i)}{2\sigma^2}\right), \quad (r_o, r_i) \in N    (18)

Likewise, an affinity matrix W_bi ∈ R^{M×N} from the M superpixels r_b ∈ B, based on the reliable background regions, to every superpixel r_i ∈ S is defined. The degree matrix of W_oi is defined as D_oi with diagonal elements d_{oo} = \sum_i W_{oi}, (o = 1, 2, 3, …, K), and the degree matrix of W_bi is defined as D_bi with diagonal elements d_{bb} = \sum_i W_{bi}, (b = 1, 2, 3, …, M). In matrix form, the constraint function Γ(S) = k in expression (1) is defined using the ranking technique as,

\begin{bmatrix} \ldots, D_{oi} - \alpha W_{oi}, \ldots \\ \ldots, D_{bi} - \beta W_{bi}, \ldots \end{bmatrix}_{(K+M) \times N} [s_1, \ldots, s_i, \ldots, s_N]^T_{N \times 1}    (19)

= b \odot [I(r_o); I(r_b)]_{(K+M) \times 1}    (20)

where ⊙ denotes element-wise multiplication, [s_1; s_2; s_3; …; s_N]^T is the solution vector in which every element is a saliency label to be predicted, [I(r_o); I(r_b)] is the cluster intensity vector, and b is a (K + M)-dimensional weighting vector in which, based on the threshold values defined above, every element is set to 0 for background
and to 1 for an object. The balance parameters α and β are both set to 0.99 in this experiment. Since the affinity matrices W_oi and W_bi are not square matrices, the constraint Γ(S) = k cannot be transformed into an additional potential appended to E(S). For multiple video salient object detection, the model presented in this work is the spatiotemporal particle swarm optimization model (SPSOM), a ranking method in which the local constraint and the temporal and spatial cues are exploited to achieve global saliency optimization over multiple objects.

PSO is an evolutionary computation method motivated by the social behaviors of fish schooling and bird flocking. The basic principle of PSO is that every solution is represented by a particle in a swarm; in this proposed work, particles correspond to superpixels. In the search space, every particle has its own position, represented by the vector x_i = (x_{i1}, x_{i2}, …, x_{iD}), where D is the dimension of the search space. To search for the optimal salient object in the search space, the superpixels move with a velocity, indicated as v_i = (v_{i1}, v_{i2}, …, v_{iD}). Every particle updates its velocity and position based on its own experience and the experience of its neighboring particles. In the proposed work, the objective corresponds to a distance between superpixels. The best position a particle has reached so far is recorded and represented as pbest, and gbest corresponds to the best position achieved by the population so far. PSO searches for the optimum solution using gbest and pbest, where every particle's position and velocity are updated based on the following expressions:

x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}    (21)

v_{id}^{t+1} = \omega \, v_{id}^{t} + c_1 r_1 (p_{id} - x_{id}^{t}) + c_2 r_2 (p_{gd} - x_{id}^{t})    (22)

where t denotes the tth iteration of the evolutionary process, d ∈ D is the dth dimension of the search space, ω is the inertia weight that controls the impact of the previous velocity on the current velocity, c_1 and c_2 are acceleration constants, r_1 and r_2 are uniformly distributed random values between 0 and 1, and p_{id} and p_{gd} are the pbest and gbest values of the elements in the dth dimension. A predefined maximum velocity v_max limits the velocity, with v_{id}^{t+1} ∈ [−v_max, v_max]. The number of iterations or a good fitness value is used as the criterion for the iteration limit of the algorithm. Multiple salient objects are produced as the algorithm output; an illustrative sketch of the update loop is given after Algorithm 1.

Algorithm 1: Spatiotemporal particle swarm optimization model (SPSOM)
Input: Number of superpixels in the reliable regions.
Output: Salient objects detection.
1: Initialize superpixels count (i = 1, …, N) as particles
2: Initialize random position and velocity
3: While achieving stop condition do
4:     For each superpixel i
5:         Evaluate distance (objective function)
6:         If best fitness value (pBest) of history is worse than fitness value
7:             New pBest is set as current value
8:         end if
9:         If gBest is worse than pBest
10:            Set gBest = pBest
11:        end if
12:        Update position and velocity using Eqs. (21) and (22)
13:    end for
14: end do
15: Return multiple object detection
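A minimal PSO sketch of the update rules in Eqs. (21) and (22) is shown below; it is not the authors' implementation, the objective is a placeholder standing in for the superpixel-distance objective, and w, c1, c2 and the iteration count are assumed values.

```python
import numpy as np

def pso(objective, n_particles=30, dim=2, iters=100, w=0.7, c1=1.5, c2=1.5, v_max=1.0):
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, (n_particles, dim))          # particle positions (superpixels)
    v = np.zeros((n_particles, dim))
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[pbest_val.argmin()]

    for _ in range(iters):
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Eq. (22)
        v = np.clip(v, -v_max, v_max)                               # velocity limit
        x = x + v                                                   # Eq. (21)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()]
    return gbest

best = pso(lambda p: np.sum(p**2))   # placeholder objective for demonstration
```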
4 Experimental Results

This section presents the dataset used for evaluation and the parameters used for evaluating salient object detection performance. Three benchmark datasets are used in the experimentation, including the commonly used Freiburg-Berkeley Motion Segmentation (FBMS) dataset, collected from https://lmb.informatik.uni-freiburg.de/resources/datasets/moseg.en.html. In FBMS, drastic camera movement is involved in various videos, and these movements introduce large motion noise when extracting motion features. The testing and training sets are formed by splitting the FBMS dataset randomly. Figure 6 shows the input images. The mean absolute error (MAE) performance of the proposed SPSOM with IDCNN method is compared with the existing DSS and SCOM approaches, as shown in Fig. 7.
4.1 Evaluation Metrics

To evaluate the detection of multiple salient objects, the system uses standard performance metrics: precision–recall (PR) curves, accuracy, average run time and mean absolute error (MAE). The deeply supervised salient object detection (DSS), SCOM and SPSOM with IDCNN approaches are compared, and the performance of the proposed and existing methods is reported in Table 1.

Mean Absolute Error (MAE)

The mean absolute error is the average of the absolute errors |e_i| = |y_i − x_i|, where y_i is the prediction and x_i the true value. The mean absolute error is given
Fig. 6 Input images
by

MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - x_i|    (23)
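A tiny sketch of Eq. (23), shown only for illustration, is given below.

```python
def mae(predictions, targets):
    # Mean absolute error as in Eq. (23)
    return sum(abs(y - x) for y, x in zip(predictions, targets)) / len(predictions)

print(mae([0.9, 0.2, 0.4], [1.0, 0.0, 0.5]))   # 0.1333...
```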
The performance of the proposed SPSOM with IDCNN method is compared with existing DSS and SCOM approaches in terms of mean absolute error(MAE). In x-axis, methods are represented and MAE is represented in the y-axis. In proposed work, incremental deep convolutional neural network (IDCNN) classifier is proposed
Fig. 7 Mean absolute error (MAE) comparison
Table 1 Performance comparison

Methods | MAE (%) | Accuracy (%) | Average Run Time (s)
DSS | 0.07 | 80 | 0.432
SCOM | 0.069 | 87 | 37.5
SPSOM with IDCNN | 0.04 | 91 | 0.28
for measuring success in a relative object saliency landscape, and it reduces the mean absolute error. From the experimental results, it is concluded that the proposed SPSOM with IDCNN approach achieves 0.04%, while the other methods, DSS and SCOM, attain 0.07% and 0.069%, respectively. Figure 8 shows the accuracy of the proposed SPSOM with IDCNN approach and the existing DSS and SCOM approaches; methods are represented on the x-axis and accuracy on the y-axis. In the proposed work, the spatiotemporal particle swarm optimization model (SPSOM) is introduced for achieving global saliency optimization over multiple objects, with the distance between superpixels considered as the objective function. Due to this optimization, the accuracy of the proposed system is improved. From the graph, it can be concluded that the proposed system achieves 91% accuracy, while the other methods, DSS and SCOM, attain 80% and 87%, respectively. Figure 9 shows the PR curves of the proposed SPSOM with IDCNN approach and the existing DSS and SCOM approaches, with recall on the x-axis and precision on the y-axis; the results show that the proposed SPSOM with IDCNN approach performs better than the existing approaches. The average run time of the proposed SPSOM with IDCNN approach is compared with the existing DSS and SCOM approaches, with methods on the x-axis and average run time on the y-axis. From Fig. 10, it is concluded that
Fig. 8 Accuracy comparison
Fig. 9 PR curves
the proposed SPSOM with IDCNN method achieves 0.28 s, while the other methods, DSS and SCOM, attain 0.432 s and 37.5 s, respectively. Overall, the proposed SPSOM with IDCNN approach performs better than the existing approaches.
Fig. 10 Average run time (seconds per frame)
5 Conclusion

The proposed system designs a spatiotemporal particle swarm optimization model (SPSOM) with an incremental deep convolutional neural network (IDCNN) classifier to detect multiple salient objects in a video. In this work, a deep learning algorithm is designed to split the foreground and background regions. Some reliable regions of the salient objects are detected using an objectness measure, which is used for extracting the constraint and for modeling the energy potentials and the background that support saliency propagation. The proposed spatiotemporal particle swarm optimization model framework is then introduced for generating an optimal saliency map for every frame of the video. Experimental results show that the proposed system produces better performance compared with the existing DSS and SCOM approaches in terms of average run time, MAE, accuracy and PR. In future work, the motions of salient objects against both changing and static backgrounds in a frame can be extracted by introducing the objectness measure.
References 1. Wang W, Shen J, Porikli F (2015) Saliencyaware geodesic video object segmentation. In: IEEE CVPR, pp 3395–3402 2. Xu N, Yang L, Fan Y, Yang J, Yue D, Liang Y, Price B, Cohen S, Huang T (2018) Youtube-vos: Sequence-to-sequence video object segmentation. In: ECCV, pp 585–601 3. Pan Y, Yao T, Li H, Mei T (2017) Video captioning with transferred semantic attributes. In: CVPR, pp 6504–6512 4. Guo C, Zhang L (2010) A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE TIP 19(1):185–198 5. Zhang Z, Fidler S, Urtasun R (2016) Instancelevel segmentation for autonomous driving with deep densely connected mrfs. In IEEE CVPR, pp 669–677
6. Srivatsa RS, Babu RV (2015) Salient object detection via objectness measure. In: 2015 IEEE international conference on image processing (ICIP), pp 4481–4485, IEEE 7. Wang W, Shen J, Porikli F (2015) Saliency-aware geodesic video object segmentation. In: Proceedings of the conference on computer vision and pattern recognition, pp 3395–3402 8. Yang J, Zhao G, Yuan J, Shen X, Lin Z, Price B, Brandt J (2016) Discovering primary objects in videos by saliency fusion and iterative appearance estimation. IEEE Trans Cir Syst Video Technol 26(6):1070–1083 9. Chen T, Lin L, Liu L, Luo X, Li X (2016) DISC: deep image saliency computing via progressive representation learning. IEEE TNNLS 10. Li X, Zhao L, Wei L, Yang MH, Wu F, Zhuang Y, Ling H, Wang J (20165) DeepSaliency: multi-task deep neural network model for salient object detection. arXiv preprint arXiv:1510. 05484 11. Wang L, Lu H, Ruan X, Yang MH (2015) Deep networks for saliency detection via local estimation and global search. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3183–3192 12. Zheng S, Jayasumana S, Romera Paredes B, Vineet V, Su Z, Du D, Huang C, Torr P (2015) Conditional random fields as recurrent neural networks. In: ICCV 13. Le TN, Sugimoto A (2018) Video salient object detection using spatiotemporal deep features. IEEE Trans Image Process 27(10):5002–5015 14. Chen Y, Zou W, Tang Y, Li X, Xu C, Komodakis N (2018) SCOM: Spatiotemporal constrained optimization for salient object detection. IEEE Trans Image Process 27(7):3345–3357 15. Qi Q, Zhao S, Zhao W, Lei Z, Shen J, Zhang L, Pang Y (2019) High-speed video salient object detection with temporal propagation using correlation filter. Neurocomputing 356:107–118 16. Wu T, Liu Z, Zhou X, Li K (2018) Spatiotemporal salient object detection by integrating with objectness. Multimedia Tools Appl 77(15):19481–19498 17. Dakhia A, Wang T, Lu H (2019) A hybrid-backward refinement model for salient object detection. Neurocomputing 358:72–80 18. Wang W, Shen J, Shao L (2017) Video salient object detection via fully convolutional networks. IEEE Trans Image Process 27(1):38–49
Election Tweets Prediction Using Enhanced Cart and Random Forest

Ambati Jahnavi, B. Dushyanth Reddy, Madhuri Kommineni, Anandakumar Haldorai, and Bhavani Vasantha
Abstract In this digital era, the framework and working process of elections and other such political activities are becoming increasingly complex due to various factors such as the number of parties, their policies, and most notably the mixed public opinion. The advent of social media has provided the ability to converse and discuss with a wide audience across the globe, and the sheer amount of attention a single tweet or post can gain is remarkable. Recent advances in the area of deep learning have contributed to many different verticals; techniques such as long short-term memory (LSTM) perform sentiment analysis of the posts. This can be used to determine the overall mixed opinion of the population towards a political party or person. Several experiments have shown how to loosely forecast public sentiment in national elections by examining consumer behaviour on blogging sites and online social networks. This paper proposes a machine learning model to predict the chances of winning the upcoming election based on the views of common people and supporters on social media, where a supporter or user shares opinions or suggestions about the party or opposing party of their choice. Text posts about the election and political campaigns are collected, and machine learning models are then developed to predict the outcome.

Keywords Sentiment analysis · Decision tree · Random forest and logistic regression
A. Jahnavi · B. Dushyanth Reddy · M. Kommineni · B. Vasantha Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India e-mail: [email protected] A. Haldorai (B) Department of Computer Science and Engineering, Sri Eshwar College of Engineering, Coimbatore, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_61
1 Introduction The online platform has become an enormous channel for individuals to communicate their preferences. Using various assessment techniques, the underlying intent of people can be found, for example, by analysing whether the content of a post is positive, negative, or neutral. Opinion assessment is valuable to an organization that wants to hear its customers' views on its products, to predict the outcome of elections, or to draw conclusions from movie reviews. The data obtained from opinion evaluation is helpful for predicting future choices. Rather than considering individual terms, the relation between sets of words is also considered. When deciding the overall sentiment, each word's polarity is determined and then combined. A bag-of-words model also ignores word order, which means phrases containing negation may be classified incorrectly. In the past decade, there has been a massive growth in the use of micro-blogging platforms such as Twitter. Spurred by that growth, companies and media organizations are increasingly searching for ways to analyse what people think about their products and services on social platforms like Twitter [1]. Companies such as Twitratr, tweetfeel, and Social Mention are just a few of those that advertise tweet sentiment analysis as one of their services [2]. Although a significant proportion of work has been done on how emotions are expressed in forms such as academic studies and news reports, significantly less study has been done on tweets [3]. Features such as automatic part-of-speech tags and resources such as sentiment lexicons have proved useful for sentiment analysis in other domains, but will they also prove useful for sentiment evaluation on Twitter? This paper begins to analyse this question [4].
2 Literature Survey Notwithstanding the character limit on tweets, determining the sentiment of Twitter messages is essentially similar to sentence-level sentiment evaluation; however, the informal and specialized language used in tweets, as well as the nature of the micro-blogging domain itself, allows Twitter sentiment evaluation to extend beyond expectation [5]. It is an open question how well the features and procedures used on more well-formed data will transfer to the micro-blogging domain [6]. Ref. [7] involves steps such as data collection, pre-processing of documents, sensitivity identification, classification of emotions, and training and testing of models. This research area has grown over the last decade, with the accuracy of models reaching approximately 85–90% [8]. Ref. [9] first presented a method of sentiment analysis to identify highly unstructured data on Twitter, and then discussed various
techniques in detail for carrying out an examination of sentiments on Twitter data [10]. Ref. [11] suggested a novel approach, hybrid topic-based sentiment analysis (HTBSA), for the task of predicting elections by using tweets. Ref. [12], using two separate versions of SentiWordNet and evaluating regression and classification models across tasks and datasets, offers a new state-of-the-art method for sentiment analysis while computing the prior polarity of terms. The research investigation is concluded by finding interesting differences in the measured prior polarity scores when considering the word's part of speech and annotator gender [13]. Ref. [14] proposed a novel hybrid classification algorithm that extends the conventional method of predictive sentiment analysis; qualitative analysis is also integrated with data mining techniques to make the sentiment analysis method more descriptive [15]. Ref. [16] chose to use two automated classification learning methods, support vector machines (SVM) and random forest, to form a novel hybrid approach for classifying Amazon's product reviews. Ref. [17] aims to build a hybrid sentiment classification model that explores the basic features of the tweet and uses domain-independent and domain-related lexicons to provide a more domain-oriented approach for analysing and extracting consumer sentiment towards popular smartphone brands in recent years.
3 Methodology The following figure shows the steps followed in the proposed model (Fig. 1).
Decision Tree
As the implementation of machine learning algorithms is mainly intended to solve problems at the industry level, the need for more complex and iterative algorithms is increasing. The decision tree algorithm is one such algorithm, used to solve both regression and classification problems. The decision tree is considered one of the most useful algorithms in machine learning because it can be applied to many challenges. Here are a few reasons why a decision tree should be used:
(1) It is considered the most comprehensible machine learning algorithm and can easily be interpreted.
(2) It can be used for both classification and regression problems.
(3) It deals better with nonlinear data than most machine learning algorithms.
(4) Building a decision tree is a very quick process since it uses only one feature per node to divide the data.
Fig. 1 Flow chart of the proposed work (stages: fetching the raw data, pre-processing the retrieved data, implementing the algorithms, and retaining the accuracies obtained from the applied algorithms)
Recursive partitioning is an important instrument in data mining. It lets us explore the structure of a data set while keeping the decision rules simple to visualize for predicting a categorical (classification tree) or continuous (regression tree) outcome. This section explains the modelling of CART and conditional inference trees (Fig. 2).
Fig. 2 Sample tree that appears after the implementation of the CART algorithm (internal nodes split on word-frequency features such as "freak < 0.5", "hate < 0.5", and "wtf", with yes/no branches leading to True/False leaves)
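To make this tree-based modelling concrete, the following is a minimal sketch of fitting a CART-style decision tree and a random forest on bag-of-words counts with scikit-learn. The tiny corpus, labels, and parameter choices are illustrative assumptions, not the authors' dataset or implementation.

# Minimal sketch: CART decision tree and random forest on word-count features.
# The tweets and labels below are placeholders, not the paper's dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

tweets = ["great manifesto, will vote for them",
          "what a freak show, terrible policies",
          "hate this party, never again",
          "solid development work, impressive record"]
labels = [1, 0, 0, 1]  # 1 = favourable to the party, 0 = unfavourable

X = CountVectorizer().fit_transform(tweets)          # bag-of-words counts
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.5, random_state=0)

cart = DecisionTreeClassifier(criterion="gini").fit(X_train, y_train)   # CART-style tree
forest = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

print("CART accuracy:", accuracy_score(y_test, cart.predict(X_test)))
print("Random forest accuracy:", accuracy_score(y_test, forest.predict(X_test)))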
Program -> DEF SPACE ID ( Parameter_List ) Funcbody ‘‘ Missing Colon

The following is an error production in C to identify missing parentheses following the function name:

Program -> Type SPACE ID { FuncBody } ‘‘ Missing ()
2.1.2
PLY
Once we have the error productions, we generate a PLY [8] program. PLY is a pure-Python implementation of the compiler construction tools LEX and YACC. PLY uses LALR(1) parsing. It provides input validation, error reporting, and diagnostics. The LEX.py module is used to break input text into tokens specified by a collection of regular expression rules. Some of the tokens defined in LEX are: NUMBER, ID, WHILE, FOR, DEF, IN, RANGE, IF. The following assigns one or more digits to the token NUMBER:

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t
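For orientation, a minimal self-contained PLY lexer built from such rules is sketched below; the token subset, regular expressions, and input string are illustrative assumptions rather than the framework's exact rules.

# Minimal PLY lexer sketch (requires the 'ply' package).
import ply.lex as lex

tokens = ('NUMBER', 'ID', 'DEF', 'IF')          # subset of the tokens mentioned above
reserved = {'def': 'DEF', 'if': 'IF'}

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_ID(t):
    r'[A-Za-z_][A-Za-z0-9_]*'
    t.type = reserved.get(t.value, 'ID')        # distinguish keywords from identifiers
    return t

t_ignore = ' \t'

def t_error(t):
    t.lexer.skip(1)

lexer = lex.lex()
lexer.input('def area 42')
for tok in lexer:
    print(tok.type, tok.value)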
YACC.py is a LALR parser used to recognize the syntax of a language that has been specified in the form of a context-free grammar:

expr : expr '+' expr
         { $$ = node('+', $1, $3); }
The input to YACC is a grammar with snippets of code (called "actions") attached to its rules. YACC is complemented by LEX. An external interface is provided by LEX.py in the form of a token() function which returns the next valid token on the input stream. This is repeatedly called by YACC.py to retrieve tokens and invoke grammar rules such as:

def p_1(p):
    "Program : PRINT '(' NUMBER ')' "
The output of YACC.py is used to implement simple one-pass compilers.
2.1.3
PLY Program Generator
We use a novel approach, wherein a Python program is used to automatically generate the PLY program. The Python program reads the error productions provided in a file. Each production rule is then converted into a PLY function:

def p_3(p):
    "Program : DEF SPACE ID '(' Parameter_List ')' ':' Funcbody Program"
    print(action[2])
Here, action[2] corresponds to the customized error message mentioned in the file.
The Python program takes grammar of any programming language and generates the language-specific PLY code. This automatic generation of PLY program makes the tool language independent.
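A simplified sketch of such a generator is given below. The one-production-per-line file format with a ``-separated custom message, the generated function naming, and the file names in the usage comment are assumptions made for illustration, not necessarily the exact format used by the framework.

# Sketch of a PLY-program generator: each error production in the input file
# becomes a grammar function that prints the corresponding custom message.
def generate_ply_rules(productions_file, output_file):
    with open(productions_file) as f, open(output_file, 'w') as out:
        for idx, line in enumerate(f):
            line = line.strip()
            if not line:
                continue
            # e.g. "Program : DEF SPACE ID '(' Parameter_List ')' Funcbody `` Missing Colon"
            production, _, message = line.partition("``")
            out.write(f'def p_{idx}(p):\n')
            out.write(f'    "{production.strip()}"\n')
            out.write(f'    print("Syntax Error: {message.strip()}")\n\n')

# Hypothetical usage:
# generate_ply_rules("python_error_productions.txt", "generated_parser_rules.py")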
2.1.4
User Program
This is the code that is uploaded by the user. The automatically generated PLY program identifies the compile-time errors in this code.
2.1.5
Program Generated PLY Program
The program generated PLY program takes the code that is uploaded by the user as input and provides descriptive error messages for the compile-time errors. This PLY program is automatically generated and no changes are made to it while detecting syntax errors for different languages. Some of the syntax errors which were tackled are:
– Missing or extra parenthesis
– Missing colon after leader
– Indentation errors
– Missing, misspelled, or misused keywords
– Mismatched quotes
– Misuse of the assignment operator (=)
One of the challenges is identifying the various types of errors possible. For example, consider the following code snippet:

print("Hello World")

Some of the different possible syntax errors for the above code snippet are:

# Missing left parenthesis
print "Hello World")

# Missing right parenthesis
print ("Hello World"

# Extra parenthesis on the left
print (("Hello World")

# Misspelled keyword
prnt ("Hello World")

# Mismatched quote
print ('Hello World")
All possibilities have to be handled during syntax error detection.
Fig. 2 User interface
2.2 Web Interface
We have developed a Web interface that has the following features:
1. Write Code In: Allows the user to select the programming language of the code that is to be uploaded.
2. Upload: Allows the user to upload a program in the programming language selected in Write Code In, as shown in Fig. 2a.
3. Compile: Highlights the errors in the user-uploaded program, if there are any. When the user hovers over the highlighted area, the error message is displayed, as shown in Fig. 2b, c.
4. Run: Allows the user to run the corrected code on the standard compiler or interpreter environment, as shown in Fig. 2d.
A sketch of how such a compile endpoint might be served is given after this list.
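The paper does not specify how the Web interface is implemented; purely as an illustration, a compile endpoint that accepts an uploaded program and returns the framework's error messages could look as follows. Flask and the parse_with_error_productions helper are assumptions introduced here, not part of the described system.

# Illustrative sketch of a "Compile" endpoint for the Web interface.
# 'parse_with_error_productions' stands in for the generated PLY parser.
from flask import Flask, request, jsonify

app = Flask(__name__)

def parse_with_error_productions(source_code, language):
    # Placeholder: run the generated PLY parser for 'language' over the code
    # and collect descriptive error messages with their line numbers.
    return [{"line": 2, "message": "Missing colon at the end of line 2"}]

@app.route("/compile", methods=["POST"])
def compile_code():
    language = request.form.get("language", "python")
    source = request.files["program"].read().decode("utf-8")
    errors = parse_with_error_productions(source, language)
    return jsonify(errors)   # the front end highlights these lines and shows messages on hover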
3 Results We conducted a survey with 100 engineering students who had basic knowledge of at least one programming language. As shown in Fig. 3a, students preferred a more elaborate error message (error message 2) emitted by our framework over standard
Fig. 3 Survey
compiler/interpreter error messages. According to the survey, students also preferred working with a language-independent framework over a language-specific compiler/interpreter like Python IDLE (Fig. 3b). Keeping the existing work in mind, we were able to develop a framework that allows programmers to easily detect and rectify errors in their code.
3.1 Descriptive Error Messages Unlike the standard compiler/interpreter environment our framework emits noncryptic and descriptive error messages to novice programmers, as specified by the domain expert, making it easier for them to understand the reason for the error and hence help them correct the same. Table 1 shows the comparison of error messages produced by our framework and the standard error messages produced by Python.
Table 1 Result analysis

Python IDLE                      Framework output
Syntax error: Invalid syntax     Syntax error: Missing colon at the end of line 1
Syntax error: Invalid syntax     Syntax error at ')'. Extra right parenthesis at line 1
Syntax error: Invalid syntax     Invalid keyword at line 4
Syntax error: Invalid syntax     Mismatched quotes at line 12
Syntax error: Invalid syntax     '++' is invalid at line 2
Syntax error: Invalid syntax     Required indentation at line 2
3.2 Syntax Errors Detection
Unlike the existing tools, our framework is able to detect all compile-time syntax errors at once even for an interpreter environment. Consider the following code snippet:

1   # check whether a given string is a palindrome or not
2   def isPalindrome(str)
3       for i in range(0, int(len(str)/2)):
4           if str[i] != str[len(str)-i-1]:
5               return False
6       return True
7   s = "malayalam"
8   ans = isPalindrome(s)
9   else:
10      print("No")

Python IDLE output: Syntax Error: Invalid Syntax
Framework output: Syntax Error: Missing colon at the end of line 2. Syntax Error: No matching 'if' for 'else' block at line 9.
3.3 Novel Approach Grammars are used to describe the syntax of a programming language, and hence any syntactically correct program can be written using its production rules. Our framework identifies the compile-time syntax errors in a user given program using a novel approach. This involves modifying the grammar provided for a language to contain error production rules to detect possible syntax errors. Our approach has not been used before. We use a Python program to generate the PLY program that detects syntax errors.
3.4
Language-Independent Framework
Our framework is language independent and no code changes are required while working with different programming languages. This makes our framework flexible.
4 Conclusion and Future Scope In this paper, we present a language-independent framework that provides a template for customizing error messages. Our framework uses a novel approach, wherein an automatically generated PLY program is used to detect and describe syntax errors. We have tested the tool by using error productions for Python and C. The core of
the framework remains the same irrespective of the programming language chosen. The only requirement is that the error productions provided and the uploaded user program are of the same language. The following describes the improvements and further steps that could be taken with this framework.
4.1 Range of Syntax Errors This framework could be extended to detect multiple syntax errors on a single line. For example, consider the following code snippet:
prnt "Hello World")
The above code snippet has two syntax errors: first, a misspelled keyword, and second, a missing left parenthesis. However, since our framework scans tokens from left to right, only the first error is detected. The second error is detected only after correcting the first error.
4.2 Automatic Generation of Error Productions Presently, the rules to identify the errors have to be provided by the domain expert. This could be auto-generated. Instead of specifying all possible error productions, the different error productions for a given correct production can be generated automatically. For example, consider the following correct production:
Program -> DEF SPACE ID ( Parameter_List ) : Funcbody
Using the correct production, the following error productions could be generated:

Program -> DEF SPACE ID ( Parameter_List ) Funcbody ‘‘ Missing Colon
Program -> ID ( Parameter_List ) : Funcbody ‘‘ Missing keyword 'def'
Program -> DEF SPACE ID (( Parameter_List ) : Funcbody ‘‘ Extra left parenthesis
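As a rough illustration of this idea, a generator could derive error variants from a correct production by deleting or duplicating individual symbols. The transformation rules below are illustrative assumptions, not a specification from the paper.

# Sketch: derive candidate error productions from a correct production string.
def derive_error_productions(correct):
    lhs, _, rhs = correct.partition("->")
    symbols = rhs.split()
    variants = []
    if ":" in symbols:
        no_colon = [s for s in symbols if s != ":"]
        variants.append((f"{lhs.strip()} -> {' '.join(no_colon)}", "Missing Colon"))
    if "DEF" in symbols:
        no_def = [s for s in symbols if s not in ("DEF", "SPACE")]
        variants.append((f"{lhs.strip()} -> {' '.join(no_def)}", "Missing keyword 'def'"))
    if "(" in symbols:
        extra = [("( (" if s == "(" else s) for s in symbols]
        variants.append((f"{lhs.strip()} -> {' '.join(extra)}", "Extra left parenthesis"))
    return variants

for production, message in derive_error_productions(
        "Program -> DEF SPACE ID ( Parameter_List ) : Funcbody"):
    print(production, "``", message)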
References 1. Kummerfeld, Sarah K, Kay J (2003) The neglected battle fields of syntax errors. In: Proceedings of the fifth Australasian conference on Computing education, vol 20 2. Eclipse IDE (2009) Eclipse IDE. www.eclipse.org. Last visited 2009 3. IntelliJ IDEA (2011) The most intelligent Java IDE. JetBrains. Dostupné z: https://www. jetbrains.com/idea/. Cited 23 Feb 2016 4. https://www.tiobe.com/tiobe-index/ 5. Javier Traver V (2010) On compiler error messages: what they say and what they mean. Adv Hum-Comput Interact Article ID 602570:26. https://doi.org/10.1155/2010/602570
6. Becker BA et al (2019) Compiler error messages considered unhelpful: the landscape of textbased programming error message research. In: Proceedings of the working group reports on innovation and technology in computer science education, pp 177–210 7. Becker BA et al (2018) Fix the first, ignore the rest: Dealing with multiple compiler error messages. In: Proceedings of the 49th ACM technical symposium on computer science education 8. Brown P (1983) Error messages: the neglected area of the man/machine interface. Commun ACM 26(4):246–249 9. Marceau G, Fisler K, Krishnamurthi S (2011) Mind your language: on novices’ interactions with error messages. In: Proceedings of the symposium on new ideas, new paradigms, and reflections on programming and software, pp 3–18 10. Sahil B, Rishabh S (2018) Automated correction for syntax errors in programming assignments using recurrent neural networks 11. Kelley AK (2018) A system for classifying and clarifying python syntax errors for educational purposes. Dissertation. Massachusetts Institute of Technology 12. Beazley D (2001) PLY (Python lex-yacc). See http://www.dabeaz.com/ply
Enhancing Multi-factor User Authentication for Electronic Payments Md Arif Hassan, Zarina Shukur, and Mohammad Kamrul Hasan
Abstract Security is becoming more and more important for electronic transactions, and the need for it is greater than ever before. A variety of authentication techniques have been established to ensure the security of electronic transactions. The usage of electronic payment systems has grown significantly in recent years. To protect confidential user details from attacks, the finance sector has begun to implement multi-factor authentication. Multi-factor authentication is an access management strategy that requires an individual to pass through several authentication stages. In previous work, attempts have been made to secure the electronic payment process using various authentication methods, and despite their advantages, each had a downside. Password-based authentication is one of the most common ways for users to authenticate in numerous online transaction applications. However, an electronic payment authentication mechanism that relies mainly on traditional password-only authentication cannot efficiently resist the latest password-guessing and password-cracking attacks. To handle this problem, this paper proposes an authentication algorithm for electronic payments that adds a multi-factor mechanism to the existing user authentication mechanism. The enhancement concentrates on strengthening user authentication with multiple factors using a biometric technique. The software is developed using an Android emulator, a platform that helps developers evaluate an application without building it on a real device. The proposed system has two phases, namely a registration stage and an authentication stage. The suggested authentication protocol gives users safe access to authorization through multi-factor authentication using their password and
Md Arif Hassan (B) · Z. Shukur · M. Kamrul Hasan Faculty of Information Technology, Center for Cyber Security, National University Malaysia (UKM), 43600 UKM, Bangi, Selangor, Malaysia e-mail: [email protected] Z. Shukur e-mail: [email protected] M. Kamrul Hasan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_63
fingerprint. In order to ensure unauthorized users cannot easily break into the mobile application, the proposed enhancement would provide better security. Keywords Electronic payments · Single-factor · Two-factor · Multi-factor authentication
1 Introduction Multi-factor authentication has been built into Internet services in order to improve the efficiency of user authentication and make it harder for attackers to access and crack systems. It provides information security for companies and prevents them from crashing or losing money. When online transfers take place, consumers still worry about hackers and anti-social activities as they move money from one account to another. It is therefore essential to validate users and, to keep user information safe in the cloud, cryptographic techniques are required to encrypt this sensitive data. The most significant use of multi-factor authentication is to ensure that only authenticated or authorized users are entitled to process their financial transactions in financial services such as online banking and Internet banking. The great growth in electronic transactions has been matched by an equal increase in security attacks against electronic payment systems. Some of these attacks exploit the weaknesses of user authentication systems that are performed online. The authentication process requires users to enter passwords; if the password matches the current user, the user is authenticated, and otherwise the user is not permitted to sign in to the system. Authentication is always the first step for online transactions. Password-based authentication (single-factor authentication) is one of the most common ways for users to authenticate in numerous mobile applications. However, password-based authentication schemes have many issues, and the risk of using passwords for authentication in corporate applications is considerable. One of the major problems with password-based authentication is that the majority of users do not know what a strong password is. Two-factor authentication, the extra security measure which requires individuals to enter a code sent to their email or phone, has usually been effective in keeping usernames and passwords protected from attacks. The use of two-factor authentication has reduced fraud but has not stopped it [1]. The drawbacks of two-factor authentication include the use of too many tokens, token forgery, token costs, and lost tokens [2, 3]. Nevertheless, security industry specialists have confirmed an automated phishing attack which can cut through that additional level of protection (2FA), trapping unsuspecting users into sharing their private credentials [4]. The evolution of authentication techniques from SFA to MFA can be seen in Fig. 1. Authentication strategies that rely on more than one factor are more difficult to compromise than single-factor methods. A multi-factor authentication feature is necessary to render the solution effective and secure in order to improve the protection of online transactions.
Fig. 1 Evolution of authentication techniques from SFA to MFA [5]: single-factor authentication (knowledge factor: PIN, password, security question), two-factor authentication (ownership factor: smartphone, key-card, one-time password), and multi-factor authentication (biometric factor: fingerprint, face, iris, voice, vein, etc.)
This paper intends to design and implement an electronic payment authentication scheme for a secure online payment system; in this system, the payment process requires multiple authentications for a transaction rather than completing it directly for a customer. The emerging trend is now biometric authentication: because of its high protection level it is very user-friendly, and the fingerprint login option is increasingly used. Biometrics is the process by which a person's physiological or behavioural features are authenticated. These characteristics are unique for each person. They can never be stolen or duplicated, they resist different forms of attack, and they allow us to secure our personal records. Our proposed multi-factor authentication consists of several stages of authentication that the person has to go through. The person must first be authenticated through his or her password and fingerprint biometric to proceed with the validation process. Only after all of the steps have been completed does the person become authenticated, and only then is the user able to access their account. The article is divided into four parts. The first section provides a description of the existing system and the formulation of the problem. The implementation of electronic payments and its related analysis, already discussed, is in the following section. The third section explains the overall method. Section 4 outlines the implementation of the model; the conclusion and potential research directions form the final section.
2 Literature Review A variety of authentication strategies have been created to ensure the protection of electronic transactions. Single-factor authentication (SFA) grants device access through a single authentication step. A simple password authentication scheme is the most common example of SFA. The system prompts the user for a name followed by a password. Passwords are saved on the server side in encrypted form or using hash functions, and the username and password are transmitted in encrypted form over a secure connection. Therefore, if an intruder gains access to the system, there is less worry about leakage of information, as it does not expose any information about the real password. Although this appears secure, in practice it is significantly less secure, as an attacker is able to recover the original password of a customer using various attacks after a number of attempts [6]. By disclosing the password, one may compromise the account right away. An unauthorized user may also try to gain access through different kinds of attack, such as brute-force attacks [7–9], rainbow table attacks [10, 11], dictionary attacks [12–14], social engineering [15–17], phishing [18, 19], MITMF [20, 21], password-based attacks [7, 22], session hijacking [23], and malware [19, 24, 25]. Single-factor image-based authentication schemes are an approach described by Bajwa [26]; the drawback of that system is that it takes more time for authentication, and shoulder surfing is possible with this method. A common technique used in electronic payment authentication requires drawing pattern codes on the display screen, an approach taken by Vengatesan et al. [27]. To mitigate the problems of single-factor authentication, two-factor authentication was considered as a solution for securing online transactions and recognizing the authenticated person logging in to a system or application, and many current and new companies are racing to deploy it. Two-factor authentication is a mechanism that implements more than one factor and is considered stronger and more secure than the traditionally implemented single-factor authentication system. Two-factor authentication using hardware devices, such as tokens or cards and OTPs, has been competitive and difficult to hack. There are plenty of benefits of the 2FA technique, but adversaries have been working on breaking this strategy and have discovered a number of ways to hack it and expose the sensitive information of users. Although 2FA systems are powerful, they still suffer from malicious attacks such as lost/stolen smart-card attacks, token costs, token forgery, and lost tokens [3], the capture of a fake fingerprint from the original fingerprint, as well as insider attacks [28]. Due to the increase in online transactions, two-factor authentication is not enough for performing expensive transactions online [2]. There is therefore a need for a stronger and more secure system based on multi-factor authentication to check the validity of users. Principally, MFA involves different elements such as biological and biometric features, a smart mobile device, a token unit, and a smart card. This authentication scheme enhances the security level and provides identification, verification, and then authentication for guaranteeing user authority. In order to mitigate the issues of two-factor authentication,
multi-factor authentication has been viewed as a formula for securing online transactions. Multi-factor authentication is a technique of computer access control in which a person passes successfully through several authentication stages. Here, rather than asking for only an individual piece of information such as a password, users are requested to provide a number of extra pieces of information, at least three valid elementary authentication factors [29], which makes it harder for any intruder to fake the identity of the real user. Multi-factor authentication thus comprises three or more measures of authentication that the person has to go through. This extra information can consist of different aspects such as fingerprints, security tokens, and biometric authentication [6]. Multi-factor authentication may be done in numerous ways; the most widespread of them use login credentials together with some additional information. Until now, many methods have been available for secure user authentication, notably biometric authentication [30]. Biometric authentication is used to eradicate the defects of older techniques. Biometrics means the automated identification of people depending on their specific behavioural and natural attributes such as face, fingerprint, iris, voice, etc. [31]. There are two types of biometrics, namely unimodal biometrics and multimodal biometrics [16]. Biometric systems have many uses [12]. A biometric fingerprint authentication scheme using a stream cipher algorithm was proposed by Alibabaee and Broumandnia for online banking [31]. The stream cipher protocol is based on a one-time password and fingerprints. This method is highly resistant to all kinds of electronic banking attacks, such as phishing and password theft [32–35]. The proposed system is developed based on the RSA cryptosystem. A similar authentication method was approached by Benli et al. [30]; in their method, users first register their biometric credentials on the device. A related method was proposed by Harish et al. [2] for a smart wallet. In this project, we intend to implement and apply multi-factor user authentication for a secure electronic payment system using the user's phone number, a user secret code, a password, and biometric fingerprint authentication.
3 Proposed Design System In previous work, attempts have been made to secure the electronic payment process using various methods, and despite their advantages, each had a downside. Here, several existing deficiencies of schemes dependent on RSA encryption algorithms are addressed and a protected method is suggested. This paper proposes a multi-factor authentication algorithm for electronic payments. The proposed system has two phases, namely a registration stage and an authentication stage. A comprehensive explanation of each phase is provided below. Table 1 presents the notations used in our proposed system. Before making use of the app, the person must register their information during a procedure known as the registration phase. Verification of that information can only be achieved by a procedure known as the authentication phase. Each of the suggested materials and strategies is carried out in the system during both the registration process and the authentication procedure; their process flow is reviewed in this section.

Table 1 Notations used in the proposed system
Notations   Description
Ui          User
IDi         Unique identifier of user
UiEid       User email id
UiP         User phone number
UiA         User address
UiPass      Password of the user
UiBF        Biometric feature of user
DBs         Server database
3.1 Registration Phase
To be able to use the service, the person needs to complete a one-time registration. The registration stage is used to gather all of the user's information; using that information, when the person later wants to log in, the server checks whether the person is legitimate. Here, the person has to register their account together with their details; we explain the registration steps as follows (Fig. 2):
Step-1: Start
Step-2: Input user information Ui = IDi + UiEid + UiP + UiA + UiPass + UiBF
Step-3: Combine all user information and store it in the database DBs = IDi + UiEid + UiP + UiA + UiPass + UiBF
Step-4: Registration complete
End: Welcome notification for registration.
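A minimal sketch of the server-side registration step is given below. The salted password hashing and the representation of the fingerprint as an enrolled template identifier are illustrative assumptions; the actual implementation relies on the Android fingerprint and keystore APIs described in Sect. 4.

# Sketch of the registration phase: collect the user's details, hash the
# password, and store the combined record in the server database (DBs).
import hashlib
import os

user_db = {}   # stand-in for the server database DBs

def register(user_id, email, phone, address, password, fingerprint_template_id):
    salt = os.urandom(16)
    pwd_hash = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000)
    user_db[user_id] = {
        'email': email,                 # UiEid
        'phone': phone,                 # UiP
        'address': address,             # UiA
        'salt': salt,
        'password_hash': pwd_hash,      # UiPass (stored only as a salted hash)
        'fingerprint': fingerprint_template_id,   # UiBF (enrolled template reference)
    }
    return "Registration complete"      # welcome notification

register('u001', 'user@example.com', '0123456789', 'Bangi, Selangor', 'S3cret!', 'fp_template_7')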
3.2 Authentication Phase
When the customer tries to sign in to the system, the authentication server has to authenticate the user; if the supplied values and the stored values are identical, the authentication is successful. In detail, the person must authenticate him/herself through several authentication actions for both the login process and the transaction process. The authentication module comprises two primary processes, namely the login process and the transaction authentication process. In the login procedure, the person has to log in using the registered password and number, fingerprint, and IMEI for authentication. After the user logs in to the system, the person is only able to see the account details. On the other hand, for a transaction, the person needs to authenticate themselves again using fingerprint authentication.
Fig. 2 Registration phase flow: initializing the application registration → input the user's credential information → combine all the user information → if everything matches, registration is successful and a congratulation notification is shown
Only when the person authenticates with the fingerprint details can the transaction be accomplished. The comprehensive process is described in the following steps (Fig. 3):
Step-1: Start
Step-2: Input user password Ui = UiPass
If UiPass = DBs, go to step 3; else, go to step 2
Step-3: Input user biometric fingerprint Ui = UiBF
If UiBF = DBs, so that Ui = UiPass + UiBF = DBs, go to step 4; else, go to step 3
Step-4: Authentication successful
End: Access granted; go to the next module.

Fig. 3 Authentication phase flow: initializing the authentication process → input the user's registered fingerprint → if it matches, input the user password → if it matches, authentication is successful
The registration stage gathers all of the user's information, and that information is used when the person wants to log in to the system. In this stage, once the user enters all of his or her personal identification, the user can obtain access to the system. The registration procedure is shown in Fig. 4. A session is then produced for the user, after which he or she is able to access resources and can alter personal details in the panel. After successful registration and owner approval, the customer will see the profile display of the payment system. In the profile, there will be several service modules, including the wallet balance, top-up of mobile operators, shopping, and adding money. The most common user authentication and account access control of Internet banking systems is based on a single authentication component, namely a username and a password. The protection of password-based authentication, nonetheless, is determined by the strength of the user's selected password.
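To make the two-step check concrete, the following sketch verifies the password against the stored salted hash and then matches the fingerprint identifier, mirroring the authentication steps in Sect. 3.2. This is an illustrative Python sketch paired with the registration sketch above, not the Android implementation.

# Sketch of the authentication phase: both factors must match records in DBs.
import hashlib
import hmac

def authenticate(user_id, password, fingerprint_template_id, user_db):
    record = user_db.get(user_id)
    if record is None:
        return False
    pwd_hash = hashlib.pbkdf2_hmac('sha256', password.encode(),
                                   record['salt'], 100_000)
    password_ok = hmac.compare_digest(pwd_hash, record['password_hash'])   # factor 1: knowledge
    fingerprint_ok = (fingerprint_template_id == record['fingerprint'])    # factor 2: biometric
    return password_ok and fingerprint_ok   # access granted only if both factors match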
4 Implementation and Result The Android software development kit (SDK) was the primary development kit used for this project, based on the scalability of the devices the application can run on and the rich application framework it provides, allowing developers to build innovative applications for mobile devices within a Java language environment. The front end is simple and easy to use: as soon as the application is started, the person registers himself or herself and is then able to log in to the system. Hardware to support the back and front end of the application is essential for all applications to be built. The software used for this development is Android Studio. Google created Android Studio specifically for Android programming. It has a comfortable interface that helps a developer access and check the application. To fix bugs, the built-in SDK was used to operate the device. The system is developed to run in particular on the Android Studio virtual device Nexus/Google Pixel 2, as well as all other types of smartphone devices that use this technology. The system is independent in the sense that it can serve all Android-based smartphone devices. The Android platform also provides a built-in database (SQLite database) and Web services. SQLite databases have been built into the Android SDK. SQLite is a SQL database engine which stores data in .db files [36]. Two types of Android SDK storage are widely available, namely internal and external storage. Files saved in internal storage are containerized by default, so other apps on the device cannot access them. Such files are removed when the user uninstalls
Fig. 4 Screenshots of the registration and authentication process: (a) home screen, (b) registration process, (c) fingerprint authentication, (d) fingerprint completed, (e) password authentication, (f) authentication completed
the program. On the other side, an Android-compatible device allows external shared storage. Storage can be provided either internally or externally (such as on an SD card). Files stored in external storage are world-readable, and once USB mass storage is enabled, the user is able to modify them. The prototype is evaluated based on the registration stage and the authentication stage. The simulation is run on the Web server side on a DELL laptop computer
with an Intel Core i7 CPU at 3.40 GHz and 6 GB of RAM. The operating system is Windows 10 Professional. Android is an open-source operating system built on Linux, and the Android platform makes everyday activities simple and fast through helpful apps for mobile devices. The Android architecture offers server-to-application compatibility with certain APIs. The Android cryptography APIs are built on the Java cryptography architecture (JCA). The JCA is a collection of APIs for digital signatures, message digests, authentication of credentials and certificates, verification, key generation and management, and secure random number generation. These APIs allow developers to easily integrate security into their application code [37]. For our implementation, we used the javax.crypto APIs. An application program interface or graphical user interface (API/GUI) was developed for use of the fingerprint reader. Fingerprint authentication is feasible only on smartphones with a touch sensor for user recognition and a connection to software and program features. The implementation of fingerprint authentication is a multi-step process. Fingerprint authentication is primarily a cryptographic process involving a key, an encryption cipher, and a fingerprint manager for the authentication function. For an Android device, there are several common ways to incorporate fingerprint authentication. Fingerprint authentication uses two system services, the Keyguard Manager and the Fingerprint Manager. A fingerprint manager is necessary for performing the fingerprint analysis, and several of the fingerprint-based methods can be found within the Fingerprint Manager. A fingerprint scanner and an API source code collection were used for developing the API/GUI for this study. A new user signs up to the application by clicking the registration button on the welcome page and then submitting his/her information on the registration page. The user needs to sign up in the application on first use. After registration, the client obtains a username and a password. Our proposed work is device-based, using the keystore. The system stores user data and compares it with the existing database; authentication is successful only if the presented data and the existing database match. The Android Keystore is a facility that makes it easier for developers to create and store cryptographic keys in containers. The Android Keystore is an implementation of the Java KeyStore API, which is a repository for authorization certificates or public key certificates and which is used by Java-based encryption, authentication, and HTTPS-service applications in several situations. The entries are encrypted with a keystore password. The most secure and recommended form of keystore is currently a StrongBox-backed Android Keystore [38]. The signup display is shown below: Fig. 4b shows a screenshot of the signup pages. When the registration is complete, the user proceeds to the authentication procedure. If the customer signs up successfully, a confirmation message "successfully registered" is displayed; Fig. 4c shows the screenshots from the login phase procedure in steps 1 and 2. Various checks are also performed during user registration. The user has to use their fingerprint and password for logging into the application on each use. The authorized user must use the registered fingerprint and password; otherwise, if the user enters a fingerprint
or password that is not registered then the user will get a notification message that “incorrect fingerprint or password.”
5 Conclusion The proposed method is used for mobile application security in an electronic payment system, using a biometric verification feature to validate the fingerprint model registered at the time of registration. The customer can perform the transaction and protection is provided: if the fingerprint matches the samples in the database, authentication will succeed. The scheme gives users access to authorization in a secure way through multi-factor authentication using their password and fingerprint. This approach is intended to guarantee security and trust in the financial sector. Our algorithm provides an extra protection layer that stops hackers from succeeding with phishing and social engineering. The proposed solution strengthens the existing authentication system. It greatly increases the protection of mobile banking networks by offering assurances from three separate areas, namely knowledge, inherence, and possession. It also improves the user experience by making verification simpler for consumers. This process can be used by anyone who has a smart device that supports biometric fingerprint authentication. Acknowledgements The authors would like to thank the anonymous reviewers for their helpful feedback. This research was funded by a research grant code from Ya-Tas Ismail—University Kebangsan Malaysia EP-2018-012.
References 1. Khattri V, Singh DK (2019) Implementation of an additional factor for secure authentication in online transactions. J Organ Comput Electron Commer 29(4):258–273 2. Harish M, Karthick R, Rajan RM, Vetriselvi V (2019) A new approach to securing online transactions—the smart wallet, vol 500. Springer, Singapore 3. Shaju S, Panchami V (2017) BISC authentication algorithm: an efficient new authentication algorithm using three factor authentication for mobile banking. In: Proceedings of 2016 online international conference on green engineering and technologies. IC-GET 2016, pp 1–5 4. Newcomb A (2019) Phishing scams can now hack two-factor authentication | fortune, 2019. Available: https://fortune.com/2019/06/04/phishing-scam-hack-two-factor-authentic ation-2fa/. Accessed: 21 Mar 2020 5. Ometov A, Bezzateev S, Mäkitalo N, Andreev S, Mikkonen T, Koucheryavy Y (2018) Multifactor authentication: a survey. Cryptography 2(1):1 6. Kaur N, Devgan M (2015) A comparative analysis of various multistep login authentication mechanisms. Int J Comput Appl 127(9):20–26 7. Emeka BO, Liu S (2017) Security requirement engineering using structured object-oriented formal language for m-banking applications. In: Proceedings of 2017 IEEE international conference on software quality reliability and security. QRS 2017, pp 176–183
8. Ali MA, Arief B, Emms M, Van Moorsel A (2017) Does the online card payment landscape unwittingly facilitate fraud? IEEE Secur Priv 15(2):78–86 9. ENISA (2016) Security of mobile payments and digital wallets, no. December. European Union Agency for Network and Information Security (ENISA) 10. Sudar C, Arjun SK, Deepthi LR (2017) Time-based one-time password for Wi-Fi authentication and security. In: 2017 International conference on computer communication and informatics, ICACCI 2017, vol 2017, pp 1212–1215 11. Kogan D, Manohar N, Boneh D (2017) T/Key: second-factor authentication from secure hash chains dmitry, pp 983–999 12. Jesús Téllez Isaac SZ (2014) Secure mobile payment systems. J Enterp Inf Manag 22(3):317– 345 13. Dwivedi A, Dwivedi A, Kumar S, Pandey SK, Dabra P (2013) A cryptographic algorithm analysis for security threats of semantic e-commerce web (SECW) for electronic payment transaction system. Adv Comput Inf Technol 367–379 14. Yang W, Li J, Zhang Y, Gu D (2019) Security analysis of third-party in-app payment in mobile applications. J Inf Secur Appl 48:102358 15. Gualdoni J, Kurtz A, Myzyri I, Wheeler M, Rizvi S (2017) Secure online transaction algorithm: securing online transaction using two-factor authentication. Proc Comput Sci 114:93–99 16. Venugopal H, Viswanath N (2016) A robust and secure authentication mechanism in online banking. In: Proceedings of 2016 online international conference on green engineering and technologies—IC-GET 2016, pp 0–2 17. Roy S, Venkateswaran P (2014) Online payment system using steganography and visual cryptography. In: 2014 IEEE students’ conference on electrical engineering and computer sciences—SCEECS 2014, pp 1–5 18. Alsayed AO, Bilgrami AL (2017) E-banking security: internet hacking, analysis and prevention of fraudulent activities. Int J Emerg Technol Adv Eng 7(1):109–115 19. Ataya MAM, Ali MAM (2019) Acceptance of website security on e-banking—a review. In: ICSGRC 2019–2019 IEEE 10th control and system graduate research colloquium, Proceeding, pp 201–206 20. Kaur R, Li Y, Iqbal J, Gonzalez H, Stakhanova N (2018) A security assessment of HCE-NFC enabled E-wallet banking android apps. In: Proceedings of international conference on software and computer applications, vol 2, pp 492–497 21. Chaudhry SA, Farash MS, Naqvi H, Sher M (2016) A secure and efficient authenticated encryption for electronic payment systems using elliptic curve cryptography. Electron Commer Res 16(1):113–139 22. Skraˇci´c K, Pale P, Kostanjˇcar Z (2017) Authentication approach using one-time challenge generation based on user behavior patterns captured in transactional data sets. Comput Secur 67:107–121 23. Ibrahim RM (2018) A review on online-banking security models, successes, and failures. In: International conference on electrical, electronics, computers, communication, mechanical and computing (EECCMC). IEEE EECCMC 24. Elliot M, Talent K (2018) A robust and scalable four factor authentication architecture to enhance security for mobile online transaction. Int J Sci Technol Res 7(3):139–143 25. Shi K, Kanimozhi G (2017) Security aspects of mobile based E wallet. Int J Recent Innov Trends Comput Commun 26. Bajwa G, Dantu R, Aldridge R (2015) Pass-pic: a mobile user authentication. In: 2015 IEEE international conference on intelligence and security informatics: securing the world through an alignment of technology, intelligence, humans Organ. ISI 2015, p 195 27. 
Vengatesan K, Kumar A, Parthibhan M (2020) Advanced access control mechanism for cloud based E-wallet, vol 31, no. August 2016. Springer International Publishing, Berlin 28. Mohammed and Yassin (2019) Efficient and flexible multi-factor authentication protocol based on fuzzy extractor of administrator’s fingerprint and smart mobile device. Cryptography 3(3):24 29. Nwabueze EE, Obioha I, Onuoha O (2017) Enhancing multi-factor authentication in modern computing. Commun Netw 09(03):172–178
30. Benli E, Engin I, Giousouf C, Ulak MA, Bahtiyar S (2017) BioWallet: a biometric digital wallet. In: Twelfth international conference on information systems (Icons 2017), pp 38–41 31. Alibabaee A, Broumandnia A (2018) Biometric authentication of fingerprint for banking users, using stream cipher algorithm. J Adv Comput Res 9(4):1–17 32. Suma V (2019) Security and privacy mechanism using blockchain. J Ubiquitous Comput Commun Technol (UCCT) 1(1):45–54 33. Sivaganesan D (2019) Block chain enabled internet of things. J Inform Technol 1(1):1–8 34. Hassan A, Shukur Z, et al (2020) A review on electronic payments security. Symmetry (Basel) 12(8):24 35. Hassan A, Shukur Z, Hasan MK (2020) An efficient secure electronic payment system for E-commerce. Computers 9(3):13 36. Guide MST (2020) Data storage on android—mobile security testing guide. Available: https://mobile-security.gitbook.io/mobile-security-testing-guide/android-testing-guide/ 0x05d-testing-data-storage#keystore. Accessed: 27 Jul 2020 37. Guide MST (2020) Android cryptographic APIs—mobile security testing guide. Available: https://mobile-security.gitbook.io/mobile-security-testing-guide/android-testing-guide/ 0x05e-testing-cryptography. Accessed: 27 Jul 2020 38. Android D (2020) Android keystore system | android developers. Available: https://developer. android.com/training/articles/keystore. Accessed: 16 Aug 2020 39. Mridha MF, Nur K, Kumar A, Akhtaruzzaman M (2017) A new approach to enhance internet banking security. Int J Comput Appl 160(8):35–39 40. Soare CA (2012) Internet banking two-factor authentication using smartphones. J Mobile, Embed Distrib Syst 4(1):12–18
Comparative Analysis of Machine Learning Algorithms for Phishing Website Detection Dhiman Sarma, Tanni Mittra, Rose Mary Bawm, Tawsif Sarwar, Farzana Firoz Lima, and Sohrab Hossain
Abstract The Internet has become the most effective medium for leveraging social interactions during the COVID-19 pandemic. Users' immense dependence on digital platforms increases the chance of fraud. Phishing attacks are the most common form of attack in the digital world. Any communication method can be used to target an individual and trick them into leaking confidential data in a fake environment, which can later be used to harm the sole victim or even an entire business, depending on the attacker's intent and the type of leaked data. Researchers have developed numerous anti-phishing tools and techniques, such as whitelists, blacklists, and antivirus software, to detect web phishing. Classification is one of the techniques used to detect website phishing. This paper proposes a model for detecting phishing attacks using various machine learning (ML) classifiers. K-nearest neighbors, random forest, support vector machines, and logistic regression are used as the machine learning classifiers to train the proposed model. The dataset in this research was obtained from the public online repository Mendeley, where 48 features are extracted from 5000 phishing websites and 5000 real websites. The model was analyzed using F1 scores, where both precision and recall evaluations are taken into consideration. The proposed work concludes that the random forest classifier achieves the most efficient and highest performance, scoring 98% accuracy. Keywords Machine learning · Phishing · Detection · KNN · K-nearest neighbor · Random forest · Decision tree · Logistic regression · Support vector machine
D. Sarma (B) Department of Computer Science and Engineering, Rangamati Science and Technology University, Rangamati, Bangladesh e-mail: [email protected] T. Mittra Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh R. M. Bawm · T. Sarwar · F. F. Lima · S. Hossain Department of Computer Science and Engineering, East Delta University, Chittagong, Bangladesh © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_64
1 Introduction Today, activities in the digital world are increasingly carried out on a wide range of platforms, from business to health care. Massive online activity opens the door for cyber criminals. Phishing is one of the most successful and dangerous cyber-attacks observed across the globe. Phishing attacks can be avoided by creating awareness and developing the habit of staying alert and continuously being on the lookout when surfing the Internet, and by clicking links only after verifying the trustworthiness of the source. There are also tools, such as browser extensions, that notify users when they have entered their credentials on a fake site and may therefore have had their credentials transferred to a user with malicious intent. Other tools allow networks to lock down everything and only permit access to whitelisted sites to provide extra security, while compromising some convenience on the user side [1–4]. A company can take several measures to protect itself from phishing attacks, but the core problem still relies to some extent on the employees being careful and alert at all times. While reliability can be ensured for machines, human behaviour cannot be fully controlled: a mistake from one employee could be enough to create a vulnerability that an attacker can skillfully exploit and cause damage to an entire company if it is not detected and contained in time. Security is a significant concern for any organization [5–9, 22]. This paper employs the concepts of machine learning to train a model that learns to detect links that could be attempting to execute a phishing attack, allowing the machine to become an expert at detecting such sites and alerting humans without relying on human judgment too much. By using artificial intelligence, this research intends to add another layer of security that tirelessly detects phishing sites, improves its performance over time given more datasets to learn from, and allows humans to share their responsibility for careful Internet surfing with the machines.
2 Related Research This section highlights research works pertinent to phishing attacks and the essential classification techniques practiced to detect web phishing. With the current boom in technology, phishing has become more popular among malicious hackers. The first-ever phishing lawsuit was filed in 2004, when a phisher created a duplicate of the popular website "America Online". With the help of this fake website, he was able to get access to the personal information and bank details of many individuals. Phishers began to focus on websites that carried out online transactions with people and made legions of fake websites that tricked unsuspecting people into thinking they were the real ones. Table 1 shows various types of phishing attacks.
Table 1 Types of phishing attacks
Algorithm-based phishing – Attackers create different algorithms that can detect and steal personal user information from the database of a website
Deceptive phishing – Currently, this is done by using e-mails that link a client to a malicious website where the client unsuspectingly enters their private information
URL phishing – Hackers use hidden links in unsuspecting parts of a website that lead a client to a malicious page [10–15]
Hosts file poisoning – Before making a DNS query, the "hostnames" in the host records are checked. Phishers poison the "host records" and redirect a user to a phishing website
Content injection phishing – Hackers inject malicious sections into a real website that collect the data of a user [16]
Clone phishing – Phishers clone a previously sent e-mail. The cloned e-mail has a malicious link attached to it that is sent to different unsuspecting users [14]
Table 2 Traditional website phishing detection techniques
Blacklist filter – A blacklist is a primary access control mechanism that blocks the elements on a list from passing through. These filters can be applied in different security measures like DNS servers, firewalls, e-mail servers, etc. A blacklist filter maintains a list of elements like IP addresses, domains, and IP netblocks that are commonly used by phishers
Whitelist filter – A whitelist filter contains a list of elements such as URLs, schemes, or domains that are allowed to pass through a system gateway. A whitelist, contrary to a blacklist, maintains a list of all legitimate websites
Pattern matching filter – Pattern matching is a technique that checks whether a specific sequence of data or tokens exists among a list of given data
Table 2 explains traditional website phishing detection techniques like blacklist filter, whitelist filter, and pattern matching filter.
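To make the three traditional filters concrete, the following minimal Python sketch shows how a blacklist, a whitelist, and a simple pattern matching check could be combined; the listed domains and patterns are invented for illustration and are not taken from any real blocklist.

# Illustrative sketch of blacklist, whitelist and pattern matching filters.
# The example domains and token patterns below are assumptions made for
# demonstration purposes only.
from urllib.parse import urlparse

BLACKLIST = {"malicious-example.com", "phish-example.net"}      # known bad domains (assumed)
WHITELIST = {"example.com", "wikipedia.org"}                    # known good domains (assumed)
SUSPICIOUS_PATTERNS = ["login-verify", "account-update", "@"]   # token sequences to match (assumed)

def is_allowed(url: str) -> bool:
    domain = urlparse(url).netloc.lower()
    if domain in BLACKLIST:          # blacklist filter: block listed domains
        return False
    if domain in WHITELIST:          # whitelist filter: pass listed legitimate domains
        return True
    # pattern matching filter: flag URLs containing suspicious token sequences
    return not any(pattern in url.lower() for pattern in SUSPICIOUS_PATTERNS)

print(is_allowed("https://example.com/home"))                   # True
print(is_allowed("http://phish-example.net/login-verify"))      # False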
2.1 Machine Learning-Based Methods
Malicious Domain Detection Malicious domains are one of the leading causes of phishing. Different machine learning methods and models have been created to detect malicious domains with a high success rate [17].
E-mail Spam Filtering Spam filters use probing methods that detect malicious e-mails and block them. Each e-mail is passed through thousands of predefined rules that score the e-mail on
the probability of being spam. Phishers use spam e-mails to direct a client to their malicious webpage and steal data.
3 Methodology
As this paper mainly employs machine learning techniques to train models that can detect phishing websites, the first step is to understand how machine learning works. In a nutshell, all machine learning techniques involve a dataset and programmed code that performs computations, allowing the code to analyze a portion of the data and observe relationships between the features and the classification of the data. The machine's trained knowledge of the relationship is then tested against the rest of the data, and its performance is measured and scored. Based on the performance of the model, the setup of the training procedure and the dataset preprocessing are readjusted in the hope of better results in the next training iteration. If a model fails to provide satisfactory results, other techniques relevant to the dataset are employed. If a model performs better than all other trained models, however, it is stored and used on new, unknown datasets to further verify its performance. It is important to note that different datasets can come in different formats, so new datasets introduced to the model might require preprocessing to maintain optimal performance. Figure 1 illustrates this process.
3.1 Dataset
The dataset in this research was obtained from the public online repository Mendeley. It contains 48 features extracted from 5000 phishing websites and 5000 real websites. An improved feature extraction technique, based on a browser automation framework, was applied to produce this dataset. The class label indicates two outcomes, where 0 is a phishing website and 1 is a real website.
3.2 Data Preprocessing Any collected dataset usually comes with errors, variable formats, different features, incomplete sections, etc. If the dataset is used directly to train a model, it could lead to unexpected behavior, and the results would rarely ever satisfy the expected needs.
Fig. 1 Flowchart of the proposed system
Therefore, it is important to preprocess the data, converting raw data into an understandable format that allows the model to train in the best way possible.
Steps in Data Preprocessing
Step 1. Import the libraries: the pandas, matplotlib, yellowbrick, numpy, and sklearn libraries were imported.
Step 2. Import the dataset: the dataset was imported and named "data".
Step 3. Check for missing values: this step was unnecessary for our dataset, as it did not have any missing values.
Step 4. Check for categorical values: the dataset did not contain any categorical values, so this step was skipped.
Step 5. Split the dataset into training and test sets: the dataset was split into two parts, with 70% used for training and the remaining 30% for testing. The training set provides features and labels together so that the model can learn the relationship between them; the model later tests its knowledge against the test set, where it is given only the features, generates a label for each feature set, and the number of correct predictions is checked.
Step 6. Feature scaling: feature scaling limits the range of variables to allow comparison on common ground. However, it did not need to be implemented for our dataset.
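A minimal sketch of Steps 1–6 with pandas and scikit-learn is given below; the file name and the label column name are assumptions made for illustration, since the paper does not state them.

# Hedged sketch of the preprocessing steps described above. The file name
# "phishing_dataset.csv" and the column name "CLASS_LABEL" are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("phishing_dataset.csv")            # Step 2: import the dataset
print(data.isnull().sum().sum())                      # Step 3: confirm there are no missing values
print(data.select_dtypes(include="object").columns)   # Step 4: confirm there are no categorical columns

X = data.drop(columns=["CLASS_LABEL"])                # the 48 extracted features
y = data["CLASS_LABEL"]                               # per the paper: 1 = real website, 0 = phishing

# Step 5: 70/30 train-test split, as described in the paper
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
# Step 6 (feature scaling) is skipped, as it was not needed for this dataset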
3.3 Classifiers
We picked K-nearest neighbors, random forest, support vector machines, and logistic regression as the machine learning techniques for training our model. Once trained, the models were analyzed using F1 scores, which take into account both the precision and recall of the models. Each model was judged on how much bias it showed when predicting labels for a sample of data, and on how much the fit differed between the test set results and the training set results, a measurement referred to as variance.
K-Nearest Neighbors (KNN) The main idea of KNN, in a nutshell, is that similar things are usually near each other. It is one of the simplest of all machine learning techniques and works by comparing the features of the data to be labeled with other data that is already labeled. It measures the differences between these features and refers to these differences as distances. After measuring the distances, the model selects the k shortest distances and outputs the most frequent label among them as its prediction for the unlabeled sample [15].
Random Forest Random forests are built from decision trees, so it is important to understand decision trees before understanding random forests. A decision tree is built from data by analyzing the features in a dataset and creating a root node from the feature that has the most impact on the label; this impact can be measured using scoring techniques such as the Gini index. Once a root has been decided, the remaining features are scored in the same way, and the most significant of them is added as a child of the root. This procedure is repeated until all of the features have been added to the tree. When a label has to be decided, the tree is traversed from the root, with the value of each node's feature determining the path taken to the next node, until a leaf node is reached; the decision is finalized at the leaf, which provides the label. Although decision trees are good at predicting labels for the dataset used to create them, they are not as good at predicting labels for entirely new feature sets and are considered somewhat inaccurate in their predictive capabilities. This inaccuracy can be minimized by using random forests.
The first step in generating a random forest is to use the dataset to create a bootstrapped dataset. This new dataset contains samples drawn randomly from the original dataset, with the possibility of some samples appearing more than once. The second step is to select a random subset of the features and analyze it, using the chosen scoring technique, to generate the root of a decision tree. To add children to the root, another random subset of the remaining features is selected and analyzed to pick the next child. The second step is repeated several times to generate a wide variety of decision trees, which increases the accuracy of the model compared with a single decision tree. To label an unlabeled sample, every decision tree predicts a label, the labels produced by all of the trees are tallied, and the label predicted by the most trees is selected as the final prediction; this is usually more accurate than what would have been achieved with a single decision tree. Another model, called extremely randomized trees (ERTs), can also be used; it introduces even more randomness into the generation of the trees. The splits in ERTs are not optimal and can therefore lead to more bias, but the extra randomness reduces variance. While random forests and extremely randomized trees perform quite similarly, ERTs are usually slightly less accurate but also understandably faster to compute. ERTs should be avoided if the dataset contains many noisy features, which can reduce their effectiveness even further [18–20].
Support Vector Machines Support vector machines work by analyzing a dataset and trying to place a separator, called a support vector classifier, among the features so that samples can be classified according to which side of the separator they fall on. To separate different kinds of datasets and establish boundaries that mark a region for each label, the data is moved into a higher dimension than its original, relatively low dimension. For example, a one-dimensional dataset can be turned into a two-dimensional curve by plotting the squared values of the features against the original features. If a support vector classifier is unable to separate the features into their labeled regions in the original space, this opens up the possibility of placing the separator in the new, more flexible space. The decision to square the features, or raise them to a higher polynomial degree, is made by the polynomial kernel, which increases the dimensionality by setting new degrees and then uses the relationships between each pair of observations to find a support vector classifier. A radial kernel can also be used; it effectively finds the support vector classifier in infinite dimensions and trains the model so that it behaves like a weighted nearest neighbor technique.
When a new unlabeled sample is provided, it can simply be plotted in the graph and its position compared with that of the support vector classifier to observe which side of the separator it falls on, and the sample is classified accordingly. There are also other variants of support vector classifiers: for example, while a linear SVC only attempts to fit a hyperplane through the data to best separate the different categories, a Nu-SVC uses a parameter to control the number of support vectors [21, 23].
Logistic Regression Logistic regression is based on the concept of linear regression, where a line is fitted to a given dataset so that the squared differences between the line and the plotted points are minimized. The fitted line and the calculated R² value are used to determine whether the features are correlated, and the p-value is calculated to verify that this correlation is statistically significant. Finally, the line is used to map any sample of data to its corresponding label value. Logistic regression uses a similar concept but differs in that it can only classify two labels and no more. Another difference is that it does not use a straight line but rather an S-shaped curve that goes from 0 to 1, which gives the probability that a given sample belongs to one of the two labels. Logistic regression CV applies cross-validation on top of logistic regression to further improve the quality of the model: sections of the dataset are resampled in separate sessions to obtain multiple results, and the mean probability can then be used to label the data more accurately. To reach the right equation of the curve, stochastic gradient descent can be used; it takes the gradient of the loss function at each iteration as an indication of the proper values for the equation's constants, thereby minimizing the loss and deriving the optimal equation for the dataset.
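As a rough illustration of how the classifier families compared in the next section could be trained and scored with scikit-learn, consider the sketch below; the authors' exact hyperparameters are not stated, so library defaults are assumed, and the train/test arrays come from the preprocessing sketch above.

# Hedged sketch: trains the compared classifier families with scikit-learn
# defaults and prints per-class precision, recall and F1 for each of them.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, BaggingClassifier
from sklearn.svm import SVC, NuSVC, LinearSVC
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV, SGDClassifier
from sklearn.metrics import classification_report

models = {
    "SVC": SVC(),
    "Nu-SVC": NuSVC(),
    "Linear SVC": LinearSVC(),
    "KNN": KNeighborsClassifier(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Logistic regression CV": LogisticRegressionCV(max_iter=1000),
    "SGD": SGDClassifier(),
    "Random forest": RandomForestClassifier(),
    "Bagging": BaggingClassifier(),
    "Extra trees": ExtraTreesClassifier(),
}

for name, model in models.items():
    model.fit(X_train, y_train)              # X_train, y_train from the 70/30 split above
    predictions = model.predict(X_test)
    print(name)
    print(classification_report(y_test, predictions, digits=3))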
4 Result
Precision, recall, F1 score, and success rate are widely used to measure the performance of supervised machine learning algorithms [24–27]. The classification reports of our models are described below. In all of the tables, row 1 indicates a real website and row 0 indicates a phishing website.

Table 3 SVC classification report
Class   Precision   Recall   F1
1       0.920       0.898    0.909
0       0.895       0.917    0.906

Table 3 presents the classification report of the support vector machine. The precision and recall for predicting a real website are 0.920 and 0.898, giving an F1 score of 0.909. Similarly, the precision and recall for predicting a phishing website are 0.895 and 0.917, giving an F1 score of 0.906. It is to be noted that the precision for predicting a real website is higher
while the recall for predicting a phishing website is higher. The F1 scores are similar. The F1 score was compared with those of the other algorithms to find the optimal one.

Table 4 Nu-SVC classification report
Class   Precision   Recall   F1
1       0.897       0.851    0.874
0       0.851       0.896    0.873

Table 4 represents the classification report of the Nu support vector classifier (Nu-SVC). The precision and recall for predicting a real website are 0.897 and 0.851, giving an F1 score of 0.874. Similarly, the precision and recall for predicting a phishing website are 0.851 and 0.896, giving an F1 score of 0.873. These scores are significantly lower than those of the support vector machine.

Table 5 Linear SVC classification report
Class   Precision   Recall   F1
1       0.900       0.970    0.933
0       0.965       0.885    0.923

Table 5 presents the classification report of the linear support vector classifier. The precision and recall for predicting a real website are 0.900 and 0.970, giving an F1 score of 0.933. Similarly, the precision and recall for predicting a phishing website are 0.965 and 0.885, giving an F1 score of 0.923. The F1 scores here are significantly higher than those of the support vector classifier.

Table 6 KNN classifier classification report
Class   Precision   Recall   F1
1       0.854       0.905    0.879
0       0.893       0.836    0.864

Table 6 represents the classification report of KNN. The precision and recall for predicting a real website are 0.854 and 0.905, giving an F1 score of 0.879. Similarly, the precision and recall for predicting a phishing website are 0.893 and 0.836, giving an F1 score of 0.864. The F1 scores here are significantly lower than those of the linear support vector classifier.
Table 7 Logistic regression classification report
Class   Precision   Recall   F1
1       0.897       0.898    0.898
0       0.892       0.891    0.892

Table 8 Logistic regression CV classification report
Class   Precision   Recall   F1
1       0.937       0.948    0.942
0       0.944       0.932    0.938
Table 7 presents the classification report of logistic regression. The precision and recall for predicting a real website are 0.897 and 0.898, giving an F1 score of 0.898. Similarly, the precision and recall for predicting a phishing website are 0.892 and 0.891, giving an F1 score of 0.892. The precision, recall, and F1 scores for both 1 and 0 are remarkably close to each other, indicating that this algorithm works well for both precision and recall. However, the F1 scores are still lower than those of the linear SVC, so it cannot be considered the best one. Table 8 presents the classification report of logistic regression CV, where CV stands for cross-validation. The precision and recall for predicting a real website are 0.937 and 0.948, giving an F1 score of 0.942. Similarly, the precision and recall for predicting a phishing website are 0.944 and 0.932, giving an F1 score of 0.938. The precision, recall, and F1 scores for both 1 and 0 are remarkably close to each other, indicating that this algorithm works well for both precision and recall. The F1 scores here are better than those of the linear SVC, so this is the best score so far.

Table 9 SGD classifier classification report
Class   Precision   Recall   F1
1       0.966       0.826    0.891
0       0.841       0.969    0.900

Table 9 presents the classification report of stochastic gradient descent (SGD). The precision and recall for predicting a real website are 0.966 and 0.826, giving an F1 score of 0.891. Similarly, the precision and recall for predicting a phishing website are 0.841 and 0.969, giving an F1 score of 0.900. The F1 scores here are lower than those of logistic regression CV, so it was also rejected. Table 10 represents the classification report of the random forest classifier. The precision and recall for predicting a real website are 0.977 and 0.984, giving an F1 score of 0.980. Similarly, the precision and recall for predicting a phishing website are 0.983 and 0.975.
Table 10 Random forest classifier classification report
Class   Precision   Recall   F1
1       0.977       0.984    0.980
0       0.983       0.975    0.979

Table 11 Bagging classifier classification report
Class   Precision   Recall   F1
1       0.972       0.977    0.974
0       0.975       0.971    0.973

Table 12 Extra trees classifier classification report
Class   Precision   Recall   F1
1       0.984       0.979    0.982
0       0.978       0.984    0.981
Using both scores, the F1 score for predicting a phishing website is 0.979. Here, the precision, recall, and F1 scores are remarkably higher than those of all the other models, hence this is considered the best one yet. Table 11 presents the classification report of the bagging classifier. The precision and recall for predicting a real website are 0.972 and 0.977, giving an F1 score of 0.974. Similarly, the precision and recall for predicting a phishing website are 0.975 and 0.971, giving an F1 score of 0.973. Here, the precision, recall, and F1 scores are remarkably higher than those of all the other models except the random forest classifier. Table 12 represents the classification report of the extra trees classifier. The precision and recall for predicting a real website are 0.984 and 0.979, giving an F1 score of 0.982. Similarly, the precision and recall for predicting a phishing website are 0.978 and 0.984, giving an F1 score of 0.981. Here, the precision, recall, and F1 scores are the highest, and this is the best score of all. It can be noticed from the classification reports above (summarized in Table 13) that all of the tree-based ensemble classifiers (random forest, bagging, and extra trees) did remarkably well at detecting both phishing and real websites.
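The comparison in Table 13 can also be visualized with matplotlib (one of the libraries imported in Step 1); the short sketch below plots the class-1 F1 scores reported above.

# Sketch of a bar chart comparing the class-1 (real website) F1 scores
# taken from Table 13.
import matplotlib.pyplot as plt

f1_scores = {
    "SVC": 0.909, "Nu-SVC": 0.874, "Linear SVC": 0.933, "KNN": 0.879,
    "Logistic regression": 0.898, "Logistic regression CV": 0.942,
    "SGD": 0.891, "Random forest": 0.980, "Bagging": 0.974, "Extra trees": 0.982,
}

plt.figure(figsize=(10, 4))
plt.bar(f1_scores.keys(), f1_scores.values())
plt.ylabel("F1 score (class 1: real website)")
plt.xticks(rotation=45, ha="right")
plt.ylim(0.8, 1.0)
plt.tight_layout()
plt.show()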
5 Conclusion
This study provided a detailed, in-depth explanation of machine learning techniques and their performance when used against a dataset containing website data in order to detect phishing websites.
Table 13 Comparative classification report
Classifier               Class   Precision   Recall   F1
SVC                      1       0.920       0.898    0.909
                         0       0.895       0.917    0.906
Nu-SVC                   1       0.897       0.851    0.874
                         0       0.851       0.896    0.873
Linear SVC               1       0.900       0.970    0.933
                         0       0.965       0.885    0.923
KNN                      1       0.854       0.905    0.879
                         0       0.893       0.836    0.864
Logistic regression      1       0.897       0.898    0.898
                         0       0.892       0.891    0.892
Logistic regression CV   1       0.937       0.948    0.942
                         0       0.944       0.932    0.938
SGD                      1       0.966       0.826    0.891
                         0       0.841       0.969    0.900
Random forest            1       0.977       0.984    0.980
                         0       0.983       0.975    0.979
Bagging                  1       0.972       0.977    0.974
                         0       0.975       0.971    0.973
Extra trees              1       0.984       0.979    0.982
                         0       0.978       0.984    0.981
These techniques are not only described in great detail in this paper; it is also shown how each of the models performs, using plotted charts to demonstrate and compare the individual algorithms. This report aims to provide its readers with a conclusive analysis of these methods and to verify our observations regarding the random forest classifier's optimal performance. The graphs and details added to this paper are intended to help others carry out further experimentation, progressing from where this work concluded. We intend to carry on the proposed research with further modifications to the dataset and to apply other machine learning techniques with modified parameters, hopefully opening more possibilities for improving the global defense against cyber attackers.
References 1. Da Silva JAT, Al-Khatib A, Tsigaris P (2020) Spam e-mails in academia: issues and costs. Scientometrics 122:1171–1188 2. Mironova SM, Simonova SS (2020) Protection of the rights and freedoms of minors in the digital space. Russ J Criminol 14:234–241 3. Sethuraman SC, Vijayakumar V, Walczak S (2020) Cyber attacks on healthcare devices using unmanned aerial vehicles. J Med Syst 44:10 4. Tuan TA, Long HV, Son L, Kumar R, Priyadarshini I, Son NTK (2020) Performance evaluation of Botnet DDoS attack detection using machine learning. Evol Intell 13:283–294 5. Azeez NA, Salaudeen BB, Misra S, Damasevicius R, Maskeliunas R (2020) Identifying phishing attacks in communication networks using URL consistency features. Int J Electron Secur Digit Forensics 12:200–213 6. Iwendi C, Jalil Z, Javed AR, Reddy GT, Kaluri R, Srivastava G, Jo O (2020) KeySplitWatermark: zero watermarking algorithm for software protection against cyber-attacks. IEEE Access 8:72650–72660 7. Liu XW, Fu JM (2020) SPWalk: similar property oriented feature learning for phishing detection. IEEE Access 8:87031–87045 8. Parra GD, Rad P, Choo KKR, Beebe N (2020) Detecting internet of things attacks using distributed deep learning. J Netw Comput Appl 163:13 9. Tan CL, Chiew KL, Yong KSC, Sze SN, Abdullah J, Sebastian Y (2020) A graph-theoretic approach for the detection of phishing webpages. Comput Secur 95:14 10. Anwar S, Al-Obeidat F, Tubaishat A, Din S, Ahmad A, Khan FA, Jeon G, Loo J (2020) Countering malicious URLs in internet of things using a knowledge-based approach and a simulated expert. IEEE Internet Things J 7:4497–4504 11. Ariyadasa S, Fernando S, Fernando S (2020) Detecting phishing attacks using a combined model of LSTM and CNN. Int J Adv Appl Sci 7:56–67 12. Bozkir AS, Aydos M (2020) LogoSENSE: a companion HOG based logo detection scheme for phishing web page and E-mail brand recognition. Comput Secur 95:18 13. Gupta BB, Jain AK (2020) Phishing attack detection using a search engine and heuristics-based technique. J Inf Technol Res 13:94–109 14. Sonowal G, Kuppusamy KS (2020) PhiDMA—a phishing detection model with multi-filter approach. J King Saud Univ Comput Inf Sci 32:99–112 15. Zamir A, Khan HU, Iqbal T, Yousaf N, Aslam F, Anjum A, Hamdani M (2020) Phishing web site detection using diverse machine learning algorithms. Electron Libr 38:65–80 16. Rodriguez GE, Torres JG, Flores P, Benavides DE (2020) Cross-site scripting (XSS) attacks and mitigation: a survey. Comput Netw 166:23 17. Das A, Baki S, El Aassal A, Verma R, Dunbar A (2020) SoK: a comprehensive reexamination of phishing research from the security perspective. IEEE Commun Surv Tutor 22:671–708 18. Adewole KS, Hang T, Wu WQ, Songs HB, Sangaiah AK (2020) Twitter spam account detection based on clustering and classification methods. J Supercomput 76:4802–4837 19. Rao RS, Vaishnavi T, Pais AR (2020) CatchPhish: detection of phishing websites by inspecting URLs. J Ambient Intell Humaniz Comput 11:813–825 20. Shabudin S, Sani NS, Ariffin KAZ, Aliff M (2020) Feature selection for phishing website classification. Int J Adv Comput Sci Appl 11:587–595 21. Raja SE, Ravi R (2020) A performance analysis of software defined network based prevention on phishing attack in cyberspace using a deep machine learning with CANTINA approach (DMLCA). Comput Commun 153:375–381 22. Sarma D (2012) Security of hard disk encryption. Masters Thesis, Royal Institute of Technology, Stockholm, Sweden. Identifiers: urn:nbn:se:kth:diva-98673 (URN) 23. 
Alqahtani H et al (2020) Cyber intrusion detection using machine learning classification techniques. In: Computing science, communication and security, pp 121–31. Springer, Singapore
24. Hossain S, et al (2019) A belief rule based expert system to predict student performance under uncertainty. In: 2019 22nd international conference on computer and information technology (ICCIT), pp 1–6. IEEE 25. Ahmed F et al (2020) A combined belief rule based expert system to predict coronary artery disease. In: 2020 international conference on inventive computation technologies (ICICT), pp 252–257. IEEE 26. Hossain S et al (2020) A rule-based expert system to assess coronary artery disease under uncertainty. In: Computing science, communication and security, Singapore, pp 143–159. Springer, Singapore 27. Hossain S et al (2020) Crime prediction using spatio-temporal data. In: Computing science, communication and security. Springer, Singapore, pp 277–289
Toxic Comment Classification Implementing CNN Combining Word Embedding Technique Monirul Islam Pavel, Razia Razzak, Katha Sengupta, Md. Dilshad Kabir Niloy, Munim Bin Muqith, and Siok Yee Tan
Abstract With the advancement of technology, the virtual world and social media have become an important part of people's everyday lives. Social media allows people to connect, share their emotions and discuss various subjects, yet it has also become a place for cyberbullying, personal attacks, online harassment, verbal abuse and other kinds of toxic comments. Top social media platforms still lack fast and accurate classification to remove this kind of toxic comment automatically. In this paper, an ensemble methodology of convolutional neural networks (CNN) and natural language processing (NLP) is proposed which segments toxic and non-toxic comments in a first phase and then classifies and labels them into six types, based on the dataset of Wikipedia's talk page edits collected from Kaggle. The proposed architecture applies data cleaning, adopts NLP techniques such as tokenization and stemming, and converts words into vectors with word embedding techniques. Combining the preprocessed dataset with the best word embedding method, a CNN model is applied that scores a 98.46% ROC-AUC and 98.05% accuracy for toxic comment classification, which is higher than the compared existing works.
M. I. Pavel (B) · S. Y. Tan Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, The National University of Malaysia, 43600 Bangi, Selangor, Malaysia e-mail: [email protected] S. Y. Tan e-mail: [email protected] R. Razzak · K. Sengupta · Md. D. K. Niloy · M. B. Muqith Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh e-mail: [email protected] K. Sengupta e-mail: [email protected] Md. D. K. Niloy e-mail: [email protected] M. B. Muqith e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_65
Keywords Toxic comment · Classification · Word embedding · Tokenization · Convolution neural networking · fastText · Natural language processing
1 Introduction
Social media today, on account of its attainability and accessibility, has become the arena for discussions, information and queries of all kinds. It has given everyone scope to express themselves more than ever and has enhanced communication and sharing on online platforms. Unfortunately, it is also turning into a platform for hate speech and verbal attacks, even putting at risk of violence people who support diversity in race, ethnicity, gender and sexual orientation. Cyberbullying and harassment have become serious issues that affect a wide range of people, sometimes inflicting severe psychological problems such as depression or even suicide. Abusive online content can fall into more than one toxic category, such as hate, threats, or identity-based insults [1]. According to a 2014 poll [2] by the PEW Research Institute, 73% of people on the Internet have seen someone being harassed online, 45% of Internet users have been harassed themselves and 45% were exposed to substantial harassment. More than 85% of the database is completely non-toxic, and high concentrations of toxicity are not seen in Wikipedia. Compared with 2010, teenagers are 12% [3] more likely to be subjected to cyberbullying, which clearly indicates the negative side of social media. Corporations and social media platforms are trying to track down abusive comments toward users and are also looking for ways to automate this process. This paper utilizes deep learning to examine whether social media comments are abusive or not, and to classify them further into categories such as toxic, severely toxic, obscene, threatening, insulting and hateful toward identities. In this paper, convolutional neural networks (CNN) are applied in combination with natural language processing (NLP) and word embedding, without any syntactic or semantic expertise. We have evaluated our models and used accuracy tests to see how well they performed. The rest of the paper is arranged as follows: Sect. 2 addresses the relevant work in this area; Sect. 3 outlines the proposed methodology; Sect. 4 presents the experimental analysis alongside the implementation process and results; and Sect. 5 concludes the research work and outlines further scopes of development.
2 Related Works
With the massive growth of the Internet and social media, toxic comments, cyberbullying and verbal abuse have become major issues of concern, and several studies have been conducted to address them by adopting classification techniques. Georgakopoulos
et al. [4] used a CNN model to solve the toxic comment classification problem on the same dataset we use in our methodology. In their solution, they worked with a balanced subset of the data without tackling the imbalanced dataset, and they performed only binary classification to identify whether comments are toxic or not, without predicting the toxicity level. To improve on this, Saeed et al. [5] applied deep neural network architectures with good accuracy. One of the best parts of this research was that their classification framework does not need any laborious text preprocessing. They used CNN-1D, CNN-V, CNN-I, BiLSTM and BiGRU models, analyzed them using F1, ROC-AUC, precision and recall scores, and claimed that Bi-GRU showed the highest F1 score while also scoring well in precision and recall. In another work, the authors of [6] demonstrated a capsule network-based toxic comment classifier, implemented as a single-model capsule network with focal loss. Their model scored 98.46% accuracy with an RNN classifier on the TRAC dataset, a combined Hindi-English dataset of toxic and non-toxic comments. Furthermore, Kandasamy et al. [7] adopted natural language processing (NLP) integrated with URL analysis and supervised machine learning techniques on social media data, scoring 94% accuracy. Anand et al. [8] presented different deep learning techniques such as convolutional neural networks (CNN), ANN and long short-term memory cells (LSTM), with and without GloVe word embeddings, where a GloVe pre-trained model is applied for classification. Most of these research works address the binary classification of toxic and non-toxic comments, but labeling the classes of toxic comments after identification is still missing in previous works. To address this issue, we propose a methodology that classifies toxic comments and also predicts the toxicity based on classes.
3 Methodology
The workflow of the proposed methodology is presented in the flowchart of Fig. 1. After the dataset is loaded and split into training and testing sets, it passes through a preprocessing stage that combines data cleaning, tokenization and stemming, and then through word embedding. We compared three popular word embedding techniques and combined the most accurate one with the CNN classifier. The CNN architecture first performs binary classification to determine whether comments are toxic or non-toxic, and finally predicts the toxicity levels of the subclasses if a comment is toxic.
Fig. 1 Proposed workflow
3.1 Data Preprocessing
3.1.1 Data Cleaning
To reduce irregularity in the dataset, data cleaning is needed to achieve better outcomes and faster processing. We adopted various data cleaning processes such as stop word removal [9], punctuation removal, case folding (converting all words to lower case), and removal of duplicate words, URLs, emojis and emoji short codes, numbers, one-character words and symbols. Overall, many words are used only to structure a sentence and carry little meaning of their own; such words need to be removed from the text, for example I, is, are, a, the, of and many more. These pronouns, conjunctions and relational words make up the roughly 500-word stop word list of the English language. The Natural Language Toolkit (NLTK) [10], a Python library supporting many different languages, is used in our model for better classification and more accurate results.
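A minimal sketch of such a cleaning routine is shown below; the exact regular expressions and thresholds used by the authors are not specified, so the details here are assumptions.

# Illustrative cleaning routine (assumed details, not the authors' exact code):
# case folding, URL/number/punctuation removal and NLTK stop word removal.
import re
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOP_WORDS = set(stopwords.words("english"))

def clean_comment(text: str) -> str:
    text = text.lower()                                # case folding
    text = re.sub(r"http\S+|www\.\S+", " ", text)      # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)              # remove numbers, punctuation, symbols
    words = [w for w in text.split()
             if w not in STOP_WORDS and len(w) > 1]    # drop stop words and one-character words
    return " ".join(words)

print(clean_comment("We HATE this!!! Visit http://spam.example.com now, 123"))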
3.1.2 Tokenization
Tokenization is the most common and important part of NLP, in which a sentence is separated or split into individual words [11], each considered a token. Figure 2 shows how a sentence is broken down into segmented form. Here, the fastText model is used for mapping each word to a numeric vector. In the first step, chunks of words are separated from a larger sentence or piece of content, for example ["We hate toxic comment"] becomes ["we", "hate", "toxic", "comment"], and in the
Fig. 2 Process of tokenization
second step, the words are embedded as numbers to represent word vectorization. The technique mainly compares groups of word vectors in the vector space and finds mathematical similarities, such as man to boy and woman to girl.
3.1.3 Stemming
Words in the English language take different inflected forms with the same underlying sense, and the same root can appear across different parts of speech. In order to find the root, also called the lemma, stemming is used to prepare such words by removing or reducing their inflectional forms, for example playing, played and playful, which carry suffixes, prefixes, tense, gender and other grammatical markers. Moreover, when a group of words is compared and a root of a different kind is found, it is treated as a different category of that word, i.e., a different lemma. Lemmatization is used in our model for better output [12].
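The following short sketch shows tokenization followed by lemmatization with NLTK; the paper does not name a specific lemmatizer, so WordNetLemmatizer is assumed here.

# Sketch of tokenization and lemmatization with NLTK, as described above.
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

nltk.download("punkt", quiet=True)
nltk.download("wordnet", quiet=True)

lemmatizer = WordNetLemmatizer()

sentence = "We hate toxic comments"
tokens = word_tokenize(sentence.lower())                     # ['we', 'hate', 'toxic', 'comments']
lemmas = [lemmatizer.lemmatize(token) for token in tokens]   # 'comments' -> 'comment'
print(tokens)
print(lemmas)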
3.2 Word Embedding Word embedding is used to learn the representation of a vector constructed using neural networks. It is used primarily to regulate vector representations of words in a significant alternative. Word embedding works by transforming vector representations based on mapping semantic data to an embedded space for geometric words [13]. Table 1 shows the applied and tested word embedding methods and the details.
Table 1 Comparison of word embedding techniques
Method     Tokens (billion)   Details                                                                         References
fastText   16                 This word embedding dataset is created from Wikipedia 2017, the UMBC            [14, 15]
                              web base corpus and the statmt.org news dataset
Word2Vec   100                It is a pre-trained dataset created from the Google News dataset                [16]
GloVe      6                  This dataset is created from Gigaword 5 and Wikipedia 2014                      [17]
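Below is a hedged sketch of how pre-trained fastText vectors could be loaded into an embedding matrix for the model; the vector file name, the use of the Keras Tokenizer, and the variable holding the cleaned comments are assumptions made for illustration.

# Sketch: build an embedding matrix from pre-trained 300-dimensional fastText
# vectors. The file name and the cleaned_comments list are assumed.
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer

EMBED_DIM = 300
tokenizer = Tokenizer()
tokenizer.fit_on_texts(cleaned_comments)        # list of cleaned comment strings from Sect. 3.1

embeddings = {}
with open("wiki-news-300d-1M.vec", encoding="utf-8") as f:   # assumed fastText vector file
    next(f)                                     # skip the fastText header line
    for line in f:
        parts = line.rstrip().split(" ")
        embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")

vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, EMBED_DIM))
for word, index in tokenizer.word_index.items():
    vector = embeddings.get(word)
    if vector is not None:
        embedding_matrix[index] = vector        # unknown words keep a zero vector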
3.3 CNN Architecture for Classification
Convolutional neural networks (CNN) have been widely applied to image classification problems [18] because of their inherent capacity to exploit two statistical properties, "local stationarity" and "compositional structure". To implement a CNN for toxic comment classification [19–22], each sentence must first be encoded before being fed [23] to the CNN architecture; a vocabulary index is built over the set of texts so that every word is mapped to an integer. Afterward, padding is used to fill the document matrix with zeros up to the maximum length, since the CNN architecture needs constant input dimensionality. The next stage involves translating the encoded documents into matrices in which each row corresponds to a single term. The generated matrices pass through the embedding layer, in which each term (row) is transformed into a low-dimensional dense vector [24]. The operation then proceeds as in the standard CNN approach. The word embedding technique is chosen for the low-dimensional representation of each word at this point; the embedding uses fixed dense word vectors generated with methods such as fastText, Word2Vec and GloVe, which are mentioned in the previous section. Our CNN architecture uses 128 filters with kernel size five over the word embeddings, along with a 50-unit fully connected (dense) layer. Figure 3 shows the setup of the CNN for our toxic comment classification model. The proposed CNN architecture is designed in 10 layers, as shown in Fig. 4. It begins with an input layer into which the dataset is fed, followed by an embedding layer pre-trained with the chosen word embedding technique, a convolution layer that learns feature maps capturing relationships with nearby elements, a max pooling layer that reduces dimensionality by taking the maximum value per segment, two dropout regularization layers that reduce the problem of overfitting, two dense layers (the first learns the weights of its input to identify outputs and the second refines those weights), one flatten (fully connected) layer, and finally one output layer that generates the predicted class. To train our model, we adopt the ADAM optimizer [25] and binary cross-entropy loss, evaluate with binary accuracy in the first phase, and then proceed with multi-class classification for toxic leveling. We train for four epochs,
Fig. 3 Setup visualization of CNN architecture
Fig. 4 Proposed convolutional neural network architecture
splitting the training dataset into mini-batches of 64 examples, where 70% of the data is used for training and 30% for testing.
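A compact Keras sketch of a CNN along these lines is given below. The layer sizes follow the description above, while details the paper does not state (padded sequence length, exact layer ordering, dropout rates) are assumptions; vocab_size and embedding_matrix come from the embedding sketch in Sect. 3.2.

# Hedged Keras sketch of the described CNN: pre-trained embedding layer,
# a convolution block with 128 filters of kernel size 5, max pooling,
# dropout, a 50-unit dense layer and a 6-unit sigmoid output layer.
from tensorflow.keras import initializers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Dropout, Flatten, Dense

MAX_LEN = 200                                         # assumed padded sequence length

model = Sequential([
    Input(shape=(MAX_LEN,)),
    Embedding(vocab_size, EMBED_DIM,
              embeddings_initializer=initializers.Constant(embedding_matrix),
              trainable=False),                       # pre-trained fastText vectors, frozen
    Conv1D(128, 5, activation="relu"),                # feature maps over nearby words
    MaxPooling1D(5),                                  # dimensionality reduction by segments
    Dropout(0.3),                                     # assumed dropout rate
    Flatten(),
    Dense(50, activation="relu"),
    Dropout(0.3),
    Dense(6, activation="sigmoid"),                   # one output per toxicity class
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["binary_accuracy"])
model.summary()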
4 Experimental Analysis
In this section, we first describe the dataset obtained from Kaggle and visualize the categories with their correlations. After that, the performance analysis of the proposed system on this dataset for toxic comment classification is shown. Finally, a demonstration of the proposed methodology on random toxic comments is presented, labeling the toxic categories. For this experimental analysis, we used a computer built with an AMD Ryzen 5 CPU, 16 GB RAM, a 256 GB SSD and an Nvidia GTX 1665 GPU, and coded in Python 3.6 in the Spyder IDE under Anaconda.
4.1 Dataset The dataset we have used in our research is acquired from Kaggle which is very popular publicly available dataset named “Wikipedia Talk Page Comments annotated with toxicity reasons” [26] which content almost 1,60,000 comments with manually labeling. The dataset contains total six classes (toxic, severe_toxic, obscene, insult, threat, identity_hate) which are described down below in Fig. 5. The correlation matrices in Fig. 6 shows that “toxic”, comments are most strongly correlated with “insult” and “obscene” class. Moreover, “toxic” and “thread” have the only weak correlation. Further, there is very weak correlation between “obscene” and “insult” comments are also highly correlated, which makes perfect sense. It also shows the class “threat” has the weakest correlation with all classes.
Fig. 5 Data representation in bar chart (class-wise counts)
Fig. 6 Visual representation of correlation between classes
4.2 Classification Evaluation
After preprocessing with tokenization and stemming and applying word embedding, we used the CNN model with the fastText embedding technique for binary classification in the initial stage. We utilize three separate convolution structures at the same time, with a dense vector dimension of 300 and 128 filters. For each convolutional layer, the filter width is equal to the vector dimension, and the filter heights are 3, 4 and 5. A pooling operation is applied after each convolutional layer; a fully connected layer is attached to the output of the pooling layers, and the final layer uses the softmax function. We train for four epochs: in the first epoch the model loss was 6.08% and the ROC-AUC was 98.46%; the loss then gradually decreased, to 3.63% in the second epoch, then 2.71%, and finally 2.6% in the last epoch, while the validation AUC reached a maximum of 98.63% for toxic comment classification. Figure 7 presents the training and testing loss for each epoch, showing that the training loss decreases from 0.0618 to a roughly constant 0.0368.
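The training and evaluation setup described above could look roughly like the following; the padded sequence arrays and the label matrices (X_train_pad, y_train, X_test_pad, y_test) are assumed to have been prepared from the tokenizer output, and model is the CNN sketched in Sect. 3.3.

# Sketch of the described training run: four epochs with mini-batches of 64,
# followed by a ROC-AUC check on the held-out 30% split.
from sklearn.metrics import roc_auc_score

history = model.fit(X_train_pad, y_train,
                    validation_data=(X_test_pad, y_test),
                    epochs=4, batch_size=64)

probabilities = model.predict(X_test_pad)
print("macro ROC-AUC:", roc_auc_score(y_test, probabilities, average="macro"))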
Fig. 7 Loss function on each epoch for train and test set

Table 2 Toxic comment labeling
Sentence                 Toxic (%)   Severe Toxic (%)   Obscene (%)   Threat (%)   Insult (%)   Identity Hate (%)
"Just go and die"        98          4                  40            40           67           1
"I will kill you"        100         10                 10            96           13           0
"you are a bloody"       100         4                  100           0            99           0
"Good morning"           0           0                  0             0            0            0
"you are ugly"           93          0                  0             0            48           0
"I will break your"      86          0                  0             68           1            0
"you, son of bitch"      100         1                  100           0            99           0
"Stupid girl!!"          97          0                  40            0            71           0
"You are a jackass"      100         0                  97            0            96           0
"We hate you fool"       96          0                  29            0            81           58
"Nigga, get lost"        99          0                  62            0            68           89
Table 3 Accuracy comparison
References        Method                        Accuracy (%)
[2]               CNN and bidirectional GRU     97.93
[27]              SVM and TF-IDF                93.00
[28]              Logistic regression           89.46
[1]               GloVe and CNN and LSTM        96.95
Proposed model    CNN with fastText             98.05
Furthermore, Table 2 demonstrates toxic leveling on some random vulgar and toxic comments, showing the predicted toxicity for each of the six classes. It labels each classified toxic comment into subclasses with a prediction, where some sentences fall into multiple classes while others score high in one specific class, which makes sense. Table 3 compares other proposed works with our methodology: the authors of [2] used a CNN and bidirectional GRU model and achieved 97.93% accuracy, [27] implemented SVM and TF-IDF and obtained 93.00% test accuracy, [28] scored 89.46% applying logistic regression, and [1] achieved 96.95% accuracy with GloVe word embedding combining CNN and LSTM; our proposed methodology with fastText word embedding performs better than these works, showing 98.05% accuracy and a 98.46% ROC-AUC score. We adopted the fastText word embedding technique because it showed the highest accuracy for our model; with GloVe the accuracy was 96.68% and with Word2Vec it was 93.45%.
5 Conclusion
In this paper, we present a toxic comment classification system, addressing a vital issue: with growing social media, it is necessary to prevent cyberbullying and vulgar or toxic comments, yet doing so automatically is still challenging. We achieve higher accuracy than other existing works by implementing a CNN with the fastText word embedding technique, after processing the text with natural language processing steps including data cleaning (e.g., stop word removal), tokenization and stemming. First, the system classifies comments as toxic or non-toxic with 98.05% accuracy and a 98.46% ROC-AUC score. Following that, it labels the comments classified as toxic into five further subclasses. Thus, the proposed work not only detects toxic comments but also clarifies which subclasses they may belong to, which is essential for practical implementation. Although the accuracy of the proposed methodology is high, it can be improved further by increasing the size of the dataset, where there is some imbalance in the class distributions and quantities, and by training on more cases. Further, we are planning to deploy it in online teaching platform chat boxes and on social media, as these two platforms generate a major amount of toxic comments.
References 1. Anand M, Eswari R (2019) Classification of abusive comments in social media using deep learning. In: 2019 3rd international conference on computing methodologies and communication (ICCMC), pp 974–977 2. Ibrahim M, Torki M, El-Makky N (2018) Imbalanced toxic comments classification using data augmentation and deep learning. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA), pp 875–878 3. Van Hee C, Jacobs G, Emmery C, Desmet B, Lefever E, Verhoeven B et al (2018) Automatic detection of cyberbullying in social media text. PLoS ONE 13(10):e0203794 4. Georgakopoulos SV, Tasoulis SK, Vrahatis AG, Plagianakos VP (2018) Convolutional neural networks for toxic comment classification. In: Proceedings of the 10th hellenic conference on artificial intelligence, pp 1–6 5. Saeed HH, Shahzad K, Kamiran F (2018, November) Overlapping toxic sentiment classification using deep neural architectures. In: 2018 IEEE international conference on data mining workshops (ICDMW), pp 1361–1366 6. Srivastava S, Khurana P, Tewari V (2018, August) Identifying aggression and toxicity in comments using capsule network. In: Proceedings of the first workshop on trolling, aggression and cyberbullying (TRAC-2018), pp 98–105 7. Kandasamy K, Koroth P (2014) An integrated approach to spam classification on Twitter using URL analysis, natural language processing and machine learning techniques. In: 2014 IEEE students’ conference on electrical, electronics and computer science, pp 1–5. IEEE 8. Anand M, Eswari R (2019, March) Classification of abusive comments in social media using deep learning. In: 2019 3rd international conference on computing methodologies and communication (ICCMC), pp 974–977 9. Uysal AK, Gunal S (2014) The impact of preprocessing on text classification. Inf Process Manage 50(1):104–112 10. Hardeniya N, Perkins J, Chopra D, Joshi N, Mathur I (2016) Natural language processing: python and NLTK. Packt Publishing Ltd 11. Orbay A, Akarun L (2020) Neural sign language translation by learning tokenization. arXiv preprint arXiv:2002.00479 12. Hidayatullah AF, Ratnasari CI, Wisnugroho S (2016) Analysis of stemming influence on indonesian tweet classification. Telkomnika 14(2):665 13. Yang X, Macdonald C, Ounis I (2018) Using word embeddings in twitter election classification. Inform Retriev J 21(2–3):183–207 14. Santos I, Nedjah N, de Macedo Mourelle L (2017, November) Sentiment analysis using convolutional neural network with fastText embeddings. In: 2017 IEEE Latin American conference on computational intelligence (LA-CCI), pp 1–5 15. Wang Y, Wang J, Lin H, Tang X, Zhang S, Li L (2018) Bidirectional long short-term memory with CRF for detecting biomedical event trigger in FastText semantic space. BMC Bioinform 19(20):507 16. Lilleberg J, Zhu Y, Zhang Y (2015, July) Support vector machines and word2vec for text classification with semantic features. In: 2015 IEEE 14th international conference on cognitive informatics & cognitive computing (ICCI* CC), pp 136–140 17. Chowdhury HA, Imon MAH, Islam MS (2018, December) A comparative analysis of word embedding representations in authorship attribution of bengali literature. In: 2018 21st international conference of computer and information technology (ICCIT), pp 1–6 18. Pavel MI, Akther A, Chowdhury I, Shuhin SA, Tajrin J (2019) Detection and recognition of Bangladeshi fishes using surf and convolutional neural network. Int J Adv Res 7: 888–899 19. Risch J, Krestel R (2020) Toxic comment detection in online discussions. 
In: Deep learning-based approaches for sentiment analysis, pp 85–109 20. Jacovi A, Shalom OS, Goldberg Y (2018) Understanding convolutional neural networks for text classification. arXiv preprint arXiv:1809.08037
21. Wang S, Huang M, Deng Z (2018, July) Densely connected CNN with multi-scale feature attention for text classification. IJCAI 4468–4474 22. Carta S, Corriga A, Mulas R, Recupero DR, Saia R (2019, September) A supervised multiclass multi-label word embeddings approach for toxic comment classification. In: KDIR, pp 105–112 23. Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12:2493–2537 24. Gal Y, Ghahramani Z (2016) A theoretically grounded application of dropout in recurrent neural networks. In: Advances in neural information processing systems, pp 1019–1027 25. Zhang Z (2018, June) Improved adam optimizer for deep neural networks. In: 2018 IEEE/ACM 26th international symposium on quality of service (IWQoS), pp 1–2 26. Toxic Comment Classification Challenge. (n.d.). Retrieved February 9, 2020, from https:// www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data 27. Dias C, Jangid M (2020) Vulgarity classification in comments using SVM and LSTM. In: Smart systems and IoT: Innovations in computing, pp 543–553. Springer, Singapore 28. Kajla H, Hooda J, Saini G (2020, May) Classification of online toxic comments using machine learning algorithms. In: 2020 4th international conference on intelligent computing and control systems (ICICCS), pp 1119–1123
A Comprehensive Investigation About Video Synopsis Methodology and Research Challenges Swati Jagtap and Nilkanth B. Chopade
Abstract With the enormous growth in video surveillance technology, the challenges in terms of data retrieval, monitoring, and browsing have increased. A smarter solution for this is the video synopsis technique, which represents prolonged video in a compact form based on object trajectories rather than the key frame approach. It converts long video footage into a shorter video while preserving all the activities of the original video. The object trajectories are shifted in the time domain as well as in the spatial domain to offer maximum compactness while maintaining the sequence of the original source video. This paper gives a brief overview of the different approaches, evaluation parameters, and datasets used to assess the quality of synopsis video. The main objective is to investigate query-based video synopsis, useful for data retrieval through the activity clustering step in the synopsis framework, which will also help to solve societal problems. Keywords Video synopsis · Video surveillance · Object detection and segmentation · Optimization · Activity clustering
S. Jagtap (B) · N. B. Chopade
Department of Electronics and Telecommunication, Pimpri Chinchwad College of Engineering, Pune, India
e-mail: [email protected]
N. B. Chopade
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_66

1 Introduction
The exponential increase in technological enrichment demands video surveillance in almost all areas. Video surveillance plays an important role in security, mostly for monitoring processes, transport, public safety, the education field, and many more [1]. There are, however, challenges of video surveillance that need to be addressed. The enormous amount of data produced is hard to monitor continuously, and processing this data within a short period of time is a major challenge [2]. As the surveillance camera is continuously tracking events, there is a huge requirement
of memory for storage. Thus, browsing this data for certain activities can take hours or days. Video browsing therefore becomes a tedious and time-consuming task, and as a result most of the videos are never watched. A probable solution is a method that summarizes the video and can convert hours of footage into minutes. These methods are called video condensation, which is further divided into frame-based and object-based methods. Video summarization is a frame-based method, defined as the process of making and representing a detailed abstract view of the whole video within the shortest time period. Video summarization can be categorized into two parts: static video summary (storyboard) and dynamic video summary (video trailer). Static video summarization selects the key frames of a video sequence and is mostly used for indexing, browsing, and retrieval [3]. Dynamic video summary consists of video skimming and video fast forward. The video skimming method selects the smallest dynamic portions, called video skims, of audio and video to generate the video summary [4]. The movie trailer is one of the most popular video skims in practice [5]. Fast-forwarding methods process frames depending on manual control or an automatic speed setting [6]. Video summarization methods condense the video data in the temporal domain only; spatial redundancy is not exploited, which reduces the compression efficiency. Figure 1 illustrates the difference between video summarization and video synopsis. Video summarization extracts key frames based on features such as texture, shape, and motion, so compression is achieved in the temporal domain only. Video synopsis, in contrast, is an object-based compression approach that extracts the activities from the original video and represents them in the form of tubes. A proper rearrangement of the tubes, keeping the same chronological order, yields compression in the temporal as well as the spatial domain, producing more compactness. The paper is organized as follows: Section 2 covers the various video synopsis approaches. The complete synopsis process flow and methodology are explained in
Fig. 1 a Video summarization extracting only the key frame. b Video synopsis displaying the multiple objects from different time interval [7]
Sect. 3. The evaluation parameters and datasets are reviewed in Sect. 4. Section 5 covers the research challenges in the field of video synopsis. Finally, Sect. 6 covers the conclusion and a discussion of future research.
2 Video Synopsis Approaches
Video synopsis approaches can be categorized by optimization method, activity clustering, and input domain. Optimization is the main step of video synopsis, giving a proper arrangement of the tubes to achieve better efficiency. Optimization methods can be categorized into two types: online and offline. In offline synopsis, all the moving objects are extracted and saved in memory, together with the instantaneous background, before the optimization process. Therefore, a huge amount of memory and supporting hardware is needed. In the optimization process, the main focus is energy minimization; however, the large memory requirement makes searching for the optimum solution more time-consuming, and manual control is needed to decide the length of the video synopsis. To address this problem, Feng et al. [8] proposed an online synopsis that overcomes the problems of offline optimization. In this method, moving object detection and tube filling are done in real time, which is applicable to live video streams. There is no need for huge memory to store the objects, so the method also reduces the computation cost. Most approaches use online optimization to reduce the computation cost and memory requirement. Activity clustering is another approach that can be used to increase efficiency. It is an additional step added after extracting the moving objects and representing them as tubes. The tubes can then be categorized into different clusters, which can be used for smooth and quick video retrieval and browsing. The input domain depends on the camera topology and the data domain used. The type of camera topology plays an important part in deciding the computational complexity; it is classified as single camera or multicamera. Most studies use a single camera to find the optimum solution and reduce the computational complexity, whereas a multicamera approach covers all cameras to find the optimal solution but increases the computational complexity. The input data domain can be categorized as the pixel domain or the compressed domain. Some approaches use the compressed domain as the input to reduce the computational complexity of transforming the data into the pixel domain. Table 1 categorizes the approaches used in different studies. The table gives a brief picture of where research contributions are still needed: online optimization, activity clustering, multicamera approaches, and the compressed domain. The online approach gives better performance in reducing the computational cost, and activity clustering helps to improve the compression ratio.
Table 1 Classification of video synopsis approaches

| Studies | Optimization type (online/offline) | Optimization method | Activity clustering | Camera topology (single/multicamera) | Input domain (pixel or compressed) |
|---|---|---|---|---|---|
| Sekh et al. [9] | Online | Energy minimization, YOLO9000 | Yes | Single | Pixel |
| Raut et al. [10] | Online | Dynamic graph coloring approach | No | Single | Pixel |
| Ghatak et al. [11] | Offline | HSATLBO optimization approach | No | Single | Pixel |
| Li et al. [12] | Offline | Group partition greedy optimization algorithm | No | Single | Pixel |
| Ra and Kim [13] | Online | Occupation matrices with FFT | No | Single | Pixel |
| He et al. [14] | Offline | Potential collision graph (PCG) | No | Single | Pixel |
| Yumin et al. [15] | Offline | Genetic algorithm | No | Single | Pixel |
| Ansuman et al. [7] | Offline | Table driven, contradictory binary graph coloring (CBGC) approach, and simulated annealing (SA) | Yes, simple MKL | Multi | Pixel |
| Balasubramanian et al. [16] | Offline | Energy minimization | Yes, SVM classifier | Single | Pixel |
| Zhong et al. [17] | Offline | Graph cut approach | No | Single | Compressed |
| Jianqing et al. [18] | Online | Tube filling algorithm, greedy optimization | No | Single | Pixel |
| Pritch et al. [19] | Offline | Simulated annealing method | Yes, SVM classifier | Single | Pixel |
| Zheng et al. [20] | Online | Simulated annealing method | No | Single | Compressed |
3 Video Synopsis Framework and Methodology

Video synopsis is used to condense the size of the original video, which makes data retrieval easy. Figure 2 describes the steps involved in the video synopsis process. The initial step is to detect and track the moving objects. This is a preprocessing step and is very important for further processing. The next step is activity clustering, in which trajectories of the same object are clustered. The next step is the main part of the synopsis algorithm, called optimization. It involves the optimal tube rearrangement for collision avoidance and for obtaining the compressed video. The tube rearrangement can also be driven by a user's query, which helps to target the synopsis video to that query. After the optimum tube rearrangement, the background is generated from the surveillance video, and the rearranged tubes are stitched with the background to get a compact view of the original video.
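To make the flow concrete, the sketch below wires these stages together as plain Python stubs. All function names and return values are illustrative assumptions, not the authors' implementation; each stub stands for one of the concrete algorithms surveyed in the following subsections.

```python
# Minimal pipeline skeleton mirroring Fig. 2 (all names are illustrative stubs).

def detect_and_track(frames):        # Sect. 3.1: returns activity "tubes" (per-object frame spans + boxes)
    return []

def cluster_activities(tubes):       # Sect. 3.2: optional grouping of similar tubes
    return tubes

def optimize_tube_shifts(tubes):     # Sect. 3.3: choose a temporal shift per tube (energy minimization)
    return {id(t): 0 for t in tubes}

def estimate_background(frames):     # Sect. 3.4: time-lapse background generation
    return frames[0] if frames else None

def stitch(background, tubes, shifts):  # Sect. 3.5: blend shifted tubes onto the background
    return background

def generate_synopsis(frames, query=None):
    tubes = detect_and_track(frames)
    tubes = cluster_activities(tubes)
    shifts = optimize_tube_shifts(tubes)
    bg = estimate_background(frames)
    return stitch(bg, tubes, shifts)

print(generate_synopsis([]))  # trivially runnable demonstration of the stage order
```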
3.1 Object Detection and Tracking

Object detection is the preliminary phase of any video synopsis process. Object detection is followed by segmentation and tracking of the trajectories of the same object, which together form an activity represented as a tube. There are many challenges in tracking
Fig. 2 Video synopsis observed process flow
objects in real time. Occlusions and illumination variations are the primary challenges. Object tracking for synopsis involves appropriate segmentation of the object in each frame. The activities present in the original video should be detected and tracked correctly; otherwise, a blinking effect is produced, that is, the sudden appearance and disappearance of objects. Some of the approaches with the respective studies are listed in Table 2. The evaluation parameters of video synopsis are directly affected by the quality of object detection and tracking. To address this challenge, Pappalardo et al. [23] introduced a toolbox to generate an annotated dataset needed for testing.

Table 2 Object detection and tracking algorithms
| Studies | Object detection and tracking algorithms | Merits | Demerits |
|---|---|---|---|
| Ahmed et al. [9] | Mixture of Gaussians (MoG), Gaussian mixture model (GMM) | Self-adaptive to illumination variation within short temporal segments | Does not give good results in a dense foreground |
| Raun et al. [10] | Sticky tracking, visual background extractor (ViBe) algorithm | Can minimize the blinking effect | May produce some visual artifacts |
| He et al. [14] | Visual background extractor (ViBe) algorithm, linear neighbor object prediction algorithm | Efficient background subtraction and tracking algorithm | May produce some visual artifacts |
| Zhu et al. [18] | Scale invariant local ternary pattern (SILTP), sticky tracking | Better segmentation due to its illumination-invariant design; reduces the blinking effect | Loses objects that stop moving |
| Li et al. [21] | Aggregated channel features (ACF) detection, Kalman filter | Fast feature extraction and powerful representation capacity | External disturbances may affect the tracking |
| Huang et al. [22] | Contrast context histogram (CCH), Gaussian mixture model (GMM) | Robust to geometric and photometric transformations | Ghost shadows and occlusion |
| Mahapatra et al. [7] | Fuzzy inference system, DBSCAN | Fast and efficient | Leaves holes in the detected object |
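As a rough illustration of the detection stage listed in Table 2, the following Python sketch uses OpenCV's MOG2 background subtractor, a Gaussian-mixture method in the spirit of the MoG/GMM entries above. The video path, thresholds, and minimum blob area are placeholder assumptions; linking the per-frame boxes over time (e.g., with a Kalman filter or sticky tracking) would yield the activity tubes.

```python
import cv2

# Illustrative foreground-object extraction; not the pipeline of any single surveyed paper.
cap = cv2.VideoCapture("surveillance.mp4")  # placeholder path
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=True)

detections = []  # (frame_index, bounding_box) pairs; linking them over time yields a "tube"
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.medianBlur(mask, 5)                               # suppress small noise
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)   # drop shadow pixels (value 127)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:                             # ignore tiny blobs
            detections.append((frame_idx, cv2.boundingRect(c)))
    frame_idx += 1
cap.release()
```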
Table 3 Overview of activity clustering methodology
| Studies | Activity clustering methodology | Findings |
|---|---|---|
| Ahmed et al. [9] | Convolutional neural network (CNN) [24] | To find object labels (three layers) |
| Balasubramanian et al. [16] | SVM classifier | Classification based on features of a face |
| Chou et al. [25] | Hierarchical clustering | To cluster the similar trajectories |
| Mahapatra et al. [7] | Simple MKL | Action-based clustering |
| Lin et al. [26] | Patch-based training method | Abnormal activity classification |
3.2 Activity Clustering

Activity clustering is used to categorize activity trajectories of a similar type depending upon the object type, motion type, and target given in the user query. The quality of video synopsis can be improved by displaying similar activities together, as this also makes the result easier for the user to understand. Clustering is used at the application level rather than as an enhancement of the core methodology, and it gives good accuracy for video browsing applications. Table 3 describes the activity clustering methodologies with their descriptions. For clustering, a convolutional neural network gives an accuracy of around 70–80%; however, advanced deep learning approaches can be used to increase the accuracy.
3.3 Optimization

Optimization is the main process of video synopsis. It is the process of optimally rearranging the tubes to obtain a collision-free and chronologically ordered compact video. The rearrangement of foreground objects is expressed as the minimization of an energy composed of an activity cost, a collision cost, and a temporal consistency cost of the object trajectories. The activity cost ensures that the maximum number of object trajectories appears in the video synopsis. The temporal consistency cost is used to preserve the temporal order of the activities; therefore, breakage of the temporal order is penalized. The collision cost helps to avoid spatial collisions between activities, providing better visual quality. Some of the optimization approaches are listed in Table 4. The optimization methodology helps to rearrange the tubes optimally; some approaches focus on collision avoidance, while others try to improve the compression ratio. Improvement in all the parameters cannot be achieved at the same time.
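The sketch below illustrates, in simplified one-dimensional form, how such an energy can be written and greedily minimized over temporal shifts of tubes. The weights, the interval-overlap proxy for the collision cost, and the greedy search are illustrative assumptions; the surveyed papers differ in their exact cost definitions and in the optimizer used (simulated annealing, MCMC, graph coloring, etc.).

```python
# Simplified synopsis energy over temporal shifts of tubes (illustrative only).
# Each tube is (start_frame, length); a candidate solution assigns an integer shift to each tube.

def collision(t1, s1, t2, s2):
    """Temporal overlap (in frames) of two shifted tubes; a crude 1-D proxy for the spatial collision cost."""
    a0, a1 = t1[0] + s1, t1[0] + s1 + t1[1]
    b0, b1 = t2[0] + s2, t2[0] + s2 + t2[1]
    return max(0, min(a1, b1) - max(a0, b0))

def energy(tubes, shifts, synopsis_len, w_a=1.0, w_c=1.0, w_t=0.5):
    e = 0.0
    for (start, length), s in zip(tubes, shifts):
        e += w_a * max(0, (start + s + length) - synopsis_len)   # activity cost: frames lost outside the synopsis
    for i in range(len(tubes)):
        for j in range(i + 1, len(tubes)):
            e += w_c * collision(tubes[i], shifts[i], tubes[j], shifts[j])
            # temporal consistency: penalize reversing the original chronological order
            orig = tubes[i][0] - tubes[j][0]
            new = (tubes[i][0] + shifts[i]) - (tubes[j][0] + shifts[j])
            if orig * new < 0:
                e += w_t
    return e

# Greedy illustration: place each tube at the shift that currently minimizes the total energy.
tubes = [(0, 40), (100, 30), (250, 50)]
shifts = [0, 0, 0]
for i in range(len(tubes)):
    shifts[i] = min(range(-tubes[i][0], 60),
                    key=lambda s: energy(tubes, shifts[:i] + [s] + shifts[i + 1:], 120))
print(shifts, energy(tubes, shifts, 120))
```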
Table 4 Overview of optimization methodology

| Studies | Optimization methodology | Findings |
|---|---|---|
| Raun et al. [27] | Dynamic graph coloring approach | Tube rearrangement using a graph coloring approach formed from streaming video |
| He et al. [14] | Potential collision graph (PCG) | Tube rearrangement using collision free (CF) and collision potential (CP) |
| Pappalardo et al. [28] | Improved PCG coloring approaches | Based on a graph coloring approach, updating the PCG coloring approaches through graph connected components |
| Zhu et al. [18] | Tube filling algorithm, greedy optimization | Tube rearrangement through finding a tube's optimal location |
| Nie et al. [29] | Markov chain Monte Carlo (MCMC) algorithm | Tube arrangement and energy minimization |
| Li et al. [21] | Simulated annealing approach | Energy minimization for tube arrangement |
| Huang et al. [22] | Markov random field with a simple greedy algorithm | Energy minimization for tube arrangement |
| Ghatak et al. [11] | HSATLBO optimization approach | Hybridization of teaching learning-based optimization (TLBO) and simulated annealing |
| Li et al. [12] | Group partition greedy optimization algorithm | Tube rearrangement |
| Ra et al. [13] | Occupation matrices with FFT | Tube rearrangement using energy minimization |
| Tian et al. [15] | Genetic algorithm | Tube rearrangement using energy minimization |
| Li et al. [30] | Seam carving using dynamic programming and greedy optimization algorithm | Tube rearrangement using seam carving |
3.4 Background Generation

A time-lapse background image is generated after optimizing the locations of the activities. A surveillance video has a largely static background; however, the background image should reflect changes such as day and night and illumination, and should represent the natural background over time, to improve the visual quality. Background generation is not related to the efficiency of video synopsis, but it adds to the visual quality. An inconsistency between a tube and the background may produce visual artifacts in the video synopsis. Many of the approaches [11, 17] use a temporal median filter to estimate the background of the surveillance video.
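A minimal sketch of temporal-median background estimation is shown below, assuming the frames have already been sampled from the surveillance video; the synthetic data only demonstrates that the median suppresses transient foreground objects.

```python
import numpy as np

def median_background(frames):
    """Per-pixel temporal median over a stack of HxWx3 frames."""
    stack = np.stack(frames, axis=0)            # (N, H, W, 3)
    return np.median(stack, axis=0).astype(np.uint8)

# Synthetic example: a static scene with a bright object present in only two frames.
scene = np.full((10, 48, 64, 3), 90, dtype=np.uint8)
scene[3:5, 10:20, 10:20, :] = 255               # transient foreground object
bg = median_background(scene)
print(bg[15, 15])                                # median recovers the background value (90, 90, 90)
```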
3.5 Stitching

This is the final step of the video synopsis flow, where the tubes are stitched onto the generated time-lapse background. The stitching does not affect the efficiency of video synopsis, but it improves the visual quality. Many approaches employ Poisson image editing to stitch a tube into the background by blending in the gradient domain.
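OpenCV's seamlessClone function implements Poisson image editing and can serve as a stand-in for this stitching step, as sketched below with synthetic placeholder images; the real pipeline would blend each tube's per-frame patch at the location chosen by the optimization.

```python
import cv2
import numpy as np

# Illustrative Poisson blending of one object patch onto a generated background.
background = np.full((240, 320, 3), 120, dtype=np.uint8)      # stand-in time-lapse background
patch = np.zeros((60, 60, 3), dtype=np.uint8)
cv2.circle(patch, (30, 30), 25, (0, 0, 255), -1)              # stand-in for a segmented object
mask = np.zeros(patch.shape[:2], dtype=np.uint8)
cv2.circle(mask, (30, 30), 25, 255, -1)                       # object mask

center = (160, 120)                                           # placement chosen by the tube optimization
blended = cv2.seamlessClone(patch, background, mask, center, cv2.NORMAL_CLONE)
cv2.imwrite("stitched_frame.png", blended)
```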
4 Parametric Evaluation and Dataset

Parameters are used to assess the quality of a video synopsis. Some of these parameters are listed below; an illustrative computation follows the list.

1. Frame condensation ratio (FR): the ratio between the number of frames in the original video and in the synopsis video [10].
2. Frame compact ratio (CR): the ratio between the number of object pixels in the original video and the total pixels in the synopsis video. It provides information about the spatial compression and measures how much of the spatial space the objects occupy in the synopsis video [10].
3. Non-overlapping ratio (NOR): the ratio between the number of pixels occupied by objects and the sum of all object mask pixels in the synopsis video. It provides information about the amount of collision between tubes in the synopsis video [10].
4. Chronological disorder (CD): the ratio between the number of tubes in reverse order and the total number of tubes. It measures how well the chronological sequence of the tubes is kept [10].
5. Runtime (s): the time required for the generation of the video synopsis.
6. Memory requirement: memory utilization, measured using peak memory usage and average memory usage.
7. Visual quality: the visual appearance of the synopsis video, which should include all the activities that occurred in the original video.
8. Objective evaluation: some of the approaches [31] also conduct a survey to validate the synopsis result by comparing the visual appearance. The original video, the proposed synopsis video, and synopsis videos of existing methods are shown to a fixed set of participants, and questions about appearance and compactness are asked. Based on the answers, the efficiency of the proposed synopsis is calculated.
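The ratio-style measures in items 1–4 can be computed directly from the corresponding counts, as in the sketch below; the function names are illustrative and follow the definitions paraphrased above from [10].

```python
# Illustrative computation of the ratio-style synopsis measures.

def frame_condensation_ratio(n_original_frames, n_synopsis_frames):
    return n_original_frames / n_synopsis_frames

def frame_compact_ratio(n_object_pixels_original, n_total_pixels_synopsis):
    return n_object_pixels_original / n_total_pixels_synopsis

def non_overlapping_ratio(n_pixels_occupied, n_mask_pixels_sum):
    # 1.0 means no collisions: every mask pixel is occupied by exactly one object.
    return n_pixels_occupied / n_mask_pixels_sum

def chronological_disorder(n_tubes_in_reverse_order, n_tubes_total):
    return n_tubes_in_reverse_order / n_tubes_total

print(frame_condensation_ratio(18000, 900))   # e.g., a 10-minute clip condensed to 30 s at 30 fps
print(chronological_disorder(3, 40))
```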
4.1 Dataset

Datasets are needed to validate the performance of the different methodologies. The presence of proper datasets helps to check the quality of the results produced by the proposed
Table 5 Overview of available datasets

| Studies | Datasets | Description |
|---|---|---|
| Ahmed et al. [9] | 1. VIRAT and Sherbrooke street surveillance video datasets; 2. IIT-1 and IIT-2 datasets | Surveillance video datasets |
| Raun et al. [10] | Online videos from YouTube | Surveillance video |
| Pappalardo et al. [23] | 1. UA-DETRAC dataset | Real video consisting of traffic scenes |
| Ghatak et al. [11] | 1. PETS 2001, MIT surveillance dataset; 2. IIIT Bhubaneswar surveillance dataset; 3. PETS 2006 and UMN dataset | Standard surveillance video datasets |
| Mahapatra et al. [7] | 1. KTH; 2. WEIZMANN; 3. PETS2009; 4. LABV | Evaluating action recognition (first two) and synopsis generation (latter two) |
methodology. The performance of video synopsis can be evaluated using publicly available datasets and outdoor videos. Table 5 lists the available datasets. In some of the studies, the datasets were created by the researchers themselves to check the evaluation parameters; however, such datasets cannot be used to compare results. The assessment of the evaluation parameters is a tough task, as a standard dataset is not available. In some of the studies, the evaluation parameters are even defined as the reverse ratio. Therefore, the comparison of different results is problematic.
5 Future Research Challenges

Video synopsis technology has overcome many challenges in the area of video investigation and summarization, but there are still many open problems within its scope of application. Some of the challenges in the field of video synopsis are given below.

1. Object Interactions. The interaction between object trajectories should be preserved while converting the original video into its compacted form; consider, for example, two people walking side by side in a video. The tubes are tracked separately in the optimization phase, and for collision avoidance these tubes may be rearranged in a way that they never meet in the final synopsis [32]. The rearrangement of the tubes should be implemented with a proper optimization algorithm so that the original interaction can be preserved.
2. Dense Videos.
Another challenge is crowded public places, where the source video is highly dense, with objects occupying every location repeatedly. In this situation, the required object can be kept alone in the synopsis video, but this may affect the chronological order of the video, may create confusion for a user browsing for a particular event or object in the resultant video, and also reduces the visual quality of the video. The selection of a proper segmentation and tracking algorithm will help to overcome this challenge.
3. Camera Topology. The synopsis video quality may be affected by the camera topology used. Object segmentation and tracking are an important phase of video synopsis, and the source videos can be fetched using a still camera or a moving camera. Synopsis generation is difficult with a moving camera, as the viewing direction is constantly varying; the background generation and stitching steps are also difficult because the background appearance changes continuously. The multicamera approach is another challenge in the generation of video synopsis, as object tracking and segmentation become harder when the number of inputs grows, and the changing background across views is tough to predict.
4. Processing Speed. Faster, real-time speed can be achieved by a system using a multi-core CPU implementation. A GPU further reduces the processing time and enhances the speed of the processor, giving a reduced runtime.
5. Activity clustering is an optional step in the video synopsis process flow, but it is an added advantage for quick data retrieval and browsing. It increases the computational complexity but can be used for many applications depending upon the user's query. Depending upon the user query, clusters of similar tubes can be generated, and the synopsis video is then produced based on the clustering.
6 Conclusion

Video synopsis has gained increasing demand with the growth of CCTV and technological enrichment in the video analysis field. It is an emerging technology used to represent a source video in compact form based on its activities and can be used in many applications. There are several approaches to video synopsis, among which the online approach is used for real-time video streaming; multicamera and compressed-domain approaches still need to be explored to enhance the efficiency of the related parameters. The video synopsis process flow starts with object detection and trajectory tracking, followed by activity clustering, tube rearrangement, background generation, and stitching. The accuracy of tracking and segmentation of the object trajectories affects the quality of the synopsis video. The compression ratio can be improved by an optimum arrangement of tubes, and a proper chronological order and less collision between the tubes help to enhance the visual quality.
Numerous challenges can be addressed in future research. Video synopsis is used for efficient data retrieval and browsing, which can be effectively supported by query-based video synopsis through the activity clustering step in the synopsis process flow. Deep learning classifiers can be used for clustering to improve efficiency. Other challenges are the existence of dense crowds, multiple camera views, and the trade-off between the evaluation parameters of video synopsis. Thus, there is a need to design a new framework for efficient optimization through clustering of the tubes to enhance the overall efficiency.
References

1. Markets and Markets (2019) Video surveillance market survey. https://www.marketsandmarkets.com/Market-Reports/video-surveillance-market-645.html
2. Tewari D (2020) Video surveillance market by system type and application (commercial, military & defense, infrastructure, residential, and others): global opportunity analysis and industry forecast, 2017–2025. Video surveillance market outlook 2027
3. Truong BT, Venkatesh S (2007) Video abstraction: a systematic review and classification. ACM Trans Multimedia Comput Commun Appl 3(1):3–es
4. Money AG, Agius H (2008) Video summarisation: a conceptual framework and survey of the state of the art. J Vis Commun Image Represent 19(2):121–143
5. Smith MA (1995) Video skimming for quick browsing based on audio and image characterization
6. Petrovic N, Jojic N, Huang TS (2005) Adaptive video fast forward. Multimedia Tools Appl 26(3):327–344
7. Mahapatra A et al (2016) MVS: a multi-view video synopsis framework. Signal Process: Image Commun 42:31–44
8. Feng S et al (2012) Online content-aware video condensation, pp 2082–2087
9. Ahmed SA et al (2019) Query-based video synopsis for intelligent traffic monitoring applications. IEEE Trans Intell Transport Syst 1–12
10. Ruan T et al (2019) Rearranging online tubes for streaming video synopsis: a dynamic graph coloring approach. IEEE Trans Image Process 28(8):3873–3884
11. Ghatak S et al (2020) An improved surveillance video synopsis framework: a HSATLBO optimization approach. Multimedia Tools Appl 79(7):4429–4461
12. Li X, Wang Z, Lu X (2018) Video synopsis in complex situations. IEEE Trans Image Process 27(8):3798–3812
13. Ra M, Kim W (2018) Parallelized tube rearrangement algorithm for online video synopsis. IEEE Signal Process Lett 25(8):1186–1190
14. He Y et al (2017) Fast online video synopsis based on potential collision graph. IEEE Signal Process Lett 24(1):22–26
15. Tian Y et al (2016) Surveillance video synopsis generation method via keeping important relationship among objects. IET Comput Vis 10:868–872
16. Balasubramanian Y, Sivasankaran K, Krishraj SP (2016) Forensic video solution using facial feature-based synoptic video footage record. IET Comput Vis 10(4):315–320
17. Zhong R et al (2014) Fast synopsis for moving objects using compressed video. IEEE Signal Process Lett 21:1–1
18. Zhu J et al (2015) High-performance video condensation system. IEEE Trans Circuits Syst Video Technol 25(7):1113–1124
19. Pritch Y et al (2009) Clustered synopsis of surveillance video. In: 2009 sixth IEEE international conference on advanced video and signal based surveillance
20. Wang S, Wang Z-Y, Hu R-M (2013) Surveillance video synopsis in the compressed domain for fast video browsing. J Vis Commun Image Represent 24:1431–1442
21. Li X, Wang Z, Lu X (2016) Surveillance video synopsis via scaling down objects. IEEE Trans Image Process 25(2):740–755
22. Huang C-R et al (2014) Maximum a posteriori probability estimation for online surveillance video synopsis. IEEE Trans Circ Syst Video Technol 24:1417–1429
23. Pappalardo G et al (2019) A new framework for studying tubes rearrangement strategies in surveillance video synopsis. In: 2019 IEEE international conference on image processing (ICIP)
24. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6517–6525
25. Chien-Li C et al (2015) Coherent event-based surveillance video synopsis using trajectory clustering. In: 2015 IEEE international conference on multimedia & expo workshops (ICMEW)
26. Lin W et al (2015) Summarizing surveillance videos with local-patch-learning-based abnormality detection, blob sequence optimization, and type-based synopsis. Neurocomputing 155:84–98
27. Rav-Acha A, Pritch Y, Peleg S (2006) Making a long video short: dynamic video synopsis. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06), vol 1, pp 435–441
28. He Y et al (2016) Graph coloring based surveillance video synopsis. Neurocomputing 225
29. Nie Y et al (2019) Collision-free video synopsis incorporating object speed and size changes. IEEE Trans Image Process
30. Li K et al (2016) An effective video synopsis approach with seam carving. IEEE Signal Process Lett 23(1):11–14
31. Fu W et al (2014) Online video synopsis of structured motion. Neurocomputing 135:155–162
32. Namitha K, Narayanan A (2018) Video synopsis: state-of-the-art and research challenges
Effective Multimodal Opinion Mining Framework Using Ensemble Learning Technique for Disease Risk Prediction V. J. Aiswaryadevi, S. Kiruthika, G. Priyanka, N. Nataraj, and M. S. Sruthi
Abstract Multimodal sentiment analysis frameworks are widely applied in the medical and healthcare sectors, for example to identify depression, to support speech recognition for differently abled persons, and to address conditions such as Alzheimer's disease, low blood pressure, heart problems, and similar impairments. Video data is processed for mining feature polarity from the acoustic, linguistic, and visual data extracted from it. The feature data set extracted from YouTube videos contains comments, likes, views, and shares that express the polarity of the information conveyed through the streaming videos. Static information from a video file is extracted in the form of a linguistic representation. Musical data extracted and transformed into linguistic form is used for polarity classification using an ensemble-based random forest algorithm, which yields an error rate of 4.76%. Short feature vectors are expressed in the visualizing musical data and trending YouTube videos data sets for utilizing the transformed and SF vectors from the video and musical data. An accuracy of 91.6% is obtained with the ensemble-based learner, which is hard to achieve with the other machine learning algorithms from the same set. Proper filter wrapping of batch data is used with a 5% split window. When SVM is used together with the ensemble random forest algorithm, the predicted results contain an error rate of 2.64%, which improves the accuracy of the classifier along the soft margin to 96.53%.

Keywords Sentiment analysis · SVM · Ensemble random forest · Multimodal opinion mining

The original version of this chapter was revised. The author name "G. S. Priyanka" has been changed to "G. Priyanka". The correction to this chapter is available at https://doi.org/10.1007/978-981-33-4305-4_72

V. J. Aiswaryadevi (B), Dr NGP Institute of Technology, Coimbatore, India; e-mail: [email protected]
S. Kiruthika · G. Priyanka · M. S. Sruthi, Sri Krishna College of Technology, Coimbatore, India; e-mail: [email protected]; [email protected]; [email protected]
N. Nataraj, Bannari Amman Institute of Technology, Sathyamangalam, India; e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021, corrected publication 2021
S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_67
1 Introduction

Many researchers are working on the construction of multimodal opinion mining frameworks. Clinical decision support systems are nowadays massively automated with machine learning and deep learning algorithms, with little human intervention. Deep learning networks are also used along with ensemble-based extreme learning machines due to the problem of overfitting at several depths, which leads to a sparse density of traversal towards the goal state. A traditional random forest works with simple and quite effective accuracy in object recognition and goal-state prediction. Here, the traditional random forest is trained on input data sampled under goal-based constraints. The seeded sample is fed into the random forest module for the construction of the opinion mining framework. An SVM is also trained on the same set of randomly sampled data used by the random forest classifier. The prediction results and parameters are discussed based on the observations noted. Section 2 reviews related work, Sect. 3 briefs about the data set under analysis, Sect. 4 describes the goal-based ensemble learning algorithm with its set of goal constraints and predicates, Sect. 5 presents the analysis of the performance parameters, and Sect. 6 depicts the results derived by using the ensemble-based opinion mining framework.
2 Related Works

Multimodal sentiment analysis frameworks are constructed using many data mining, machine learning, and deep learning algorithms for health care, ministry and military data, and especially the e-learning industry, and they are widely aiding a diverse variety of industry sectors, researchers, and industry innovations. Polarity classification attained an accuracy of 88.60% using a CNN algorithm with the aid of word2vec on the data set described by Morency et al. (2011) in [1]. Subjectivity detection on linguistic data using an ELM paradigm that combines the features of Bayesian networks and fuzzy recurrent neural networks was done by Chaturvedi et al. [2], showing accuracy results raised to 89% using Bayesian-network-based extreme learning machines. Ensemble methods were adopted for textual data by Tran et al. in [3] for YouTube sentic computing
features. Short-time Fourier transform (STFT)-based Mel-frequency cepstral coefficients [4] were calculated using ensemble-based extreme learning machines for feature-level attributes in YouTube videos, establishing an accuracy of 77% achieved by Hu et al. [5]. Naive Bayes classifier and support vector machine lyric features were generated using a Doc2Vec data set of 100 Telugu songs (audio + lyrics); from the experimental results, the recognition rate is observed to be between 85 and 91.2%, and the lyric sentiment analysis can be improved by using rule-based and linguistic approaches, as shown in [6]. The USC IEMOCAP database [6] was collected to study multimodal expressive dyadic interactions in 2017 by [7]. Another experimental study showed that CNN-SVM produced 79.14% accuracy, whereas an accuracy of only 75.50% was achieved using CNN alone. On the multimodal sentiment analysis and multimodal emotion recognition data sets, the visual module of CRMKL [8] obtained 27% higher accuracy than the state of the art; when all modalities were used, 96.55% accuracy was obtained, outperforming the state of the art by more than 20%. The visual classifier trained on the MOUD, which obtained 93.60% accuracy [9], got 85.30% accuracy on the ICT-MMMO data set [10] using the trained visual sentiment model on the MOUD data set. Many historical works failed to reduce the overfitting caused by deep neurons and decision levels.
3 About the Data Set

Chord bigrams of piano-driven and guitar-driven musical strings are extracted and transformed into linguistic form in the visualizing musical data set. Bigram features such as B-flat, E-flat, and A-flat chords occur frequently in the piano and guitar music. The YouTube trending data set contains the number of comments, likes, and shares for each video in the data set. Using pre-filtering algorithms and goal-specific rules, only the needed information is extracted from the videos and musical strings of the data set. Histogram segmentation is used for the video sequences, and the Mel-frequency spectrum is used for the musical sequences for sparse filtering.
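As a rough illustration of the Mel-frequency processing of the musical sequences, the snippet below extracts STFT-based MFCC feature vectors with librosa; the synthetic tone, sampling rate, and frame parameters are placeholder assumptions rather than the authors' settings.

```python
import numpy as np
import librosa

# Illustrative STFT-based MFCC extraction for a musical clip. A synthetic tone stands in
# for a real guitar/piano recording; all parameter values are assumptions.
sr = 22050
t = np.linspace(0, 2.0, int(2.0 * sr), endpoint=False)
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)            # 2 s, 440 Hz stand-in signal

mfcc = librosa.feature.mfcc(y=y.astype(np.float32), sr=sr, n_mfcc=13, n_fft=2048, hop_length=512)
print(mfcc.shape)   # (13, n_frames): one short feature vector per analysis frame
```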
4 Goal-Based Ensemble Learning

The multimodal opinion mining framework is expressed in terms of five basic steps, namely collection of raw data, pre-processing and filtering, classification of the filtered data, sentiment polarity extraction, and analysis of the performance parameters. Only goal-based features are extracted for analysis. The flowchart in Fig. 1 shows the flow of the opinion mining framework for multimodal sentiment analysis.
Fig. 1 Opinion mining framework construction flowchart
4.1 Random Forest

Goal-based data mining algorithms [11] are used for forming the decision trees [12]. Bootstrapped decision trees are constructed using 150 samples drawn by random sampling and 10 features drawn by feature sampling, and the results are bagged using the majority of the polarity expressed by the bootstrapped data. A simple model for end-stage liver disease risk prediction [13] is implemented using the ensemble-based random forest algorithm with 500 bootstrapped decision trees and achieves an accuracy of 97.22% with Gaussian filter normalization [14], with the random sampling rate of the Gaussian Naïve Bayes classifier [15] specified in Eq. 1. An MDR data set is developed, like the MELD data set [16], using the normalized short feature vectors from the YouTube trending videos [17] and the visualizing musical data [18].
$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^{2}}}\exp\!\left(-\frac{(x_i-\mu_y)^{2}}{2\sigma_y^{2}}\right) \qquad (1)$$
Random sampling is done with a seed of 2019 among the 11,110 data entries using a Gaussian normalization distribution (Fig. 2).
Fig. 2 Bootstrapping the random samples and feature samples
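A minimal sketch of the bootstrapped-forest stage is given below using scikit-learn's RandomForestClassifier with 500 trees and out-of-bag scoring; the synthetic 660-record feature matrix and labels are placeholders for the sampled MDR-style short feature vectors, so the printed error will not match the 4.76% OOB rate reported later.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative ensemble random forest over bootstrapped samples (synthetic placeholder data).
rng = np.random.default_rng(2019)
X = rng.normal(size=(660, 10))                     # 660 sampled records, 10 sampled features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)      # stand-in polarity labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=2019)
forest = RandomForestClassifier(n_estimators=500, bootstrap=True, oob_score=True,
                                max_features="sqrt", random_state=2019)
forest.fit(X_tr, y_tr)
print("OOB error:", 1.0 - forest.oob_score_)
print("Test accuracy:", forest.score(X_te, y_te))
```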
4.2 SVM for Logistic Regression on Random Sampling

SVMs are used for setting the upper-bound (UB) and lower-bound (LB) sample rates by soft-margin random sampling amongst the entire data set handpicked from the multimodal MDR [19] data set. Utterances are neglected for sampling; only the sentiment score and polarity expressed in the data are taken into account. A hypervector parameter [20] is used for effective classification of the random samples with a seeding rate of 200 per vector (Figs. 3 and 4).
Fig. 3 SVM random sampling on short feature vector
Fig. 4 SVM feature sampling and scaling vectors expressed by the data set transformation
Feature sampling is done at 80 samples per vector, and the sampled features are bootstrapped using the decision tree algorithms. The accuracy achieved by the random forest generated from the random-sampled and feature-sampled chords is discussed in the results below.
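A corresponding soft-margin SVM stage can be sketched as follows; the synthetic data, RBF kernel, and C value are assumptions for illustration, not the configuration used in the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Illustrative soft-margin SVM on sampled short feature vectors (synthetic placeholder data).
rng = np.random.default_rng(200)
X = rng.normal(size=(660, 10))
y = (X[:, 0] - X[:, 2] > 0).astype(int)            # stand-in polarity labels

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm.fit(X[:500], y[:500])
print("Held-out accuracy:", svm.score(X[500:], y[500:]))
```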
5 Analysis of Performance Parameters

A SoftMax classifier is used with performance measure indices reflected through the confusion matrix, describing the true positive rate, false positive rate, true negative rate, and false negative rate. A confusion matrix of actual versus predicted classes, comprising TP, FP, TN, and FN, is formed to evaluate the parameters. The significance of the terms is given below:

TP = True Positive (correctly identified)
TN = True Negative (correctly rejected)
FP = False Positive (incorrectly identified)
FN = False Negative (incorrectly rejected)

The performance of the proposed system is measured by the following formulas:

$$\text{Accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$

$$\text{Sensitivity (Sens)} = \frac{TP}{TP + FN} \qquad (3)$$

$$\text{Specificity (Sp)} = \frac{TN}{TN + FP} \qquad (4)$$
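A small helper for Eqs. (2)–(4) is sketched below; the example call assumes the per-class counts reported in Sect. 6 map to TP = 250, FN = 5, FP = 14, and TN = 130.

```python
# Illustrative evaluation from the counts of a 2x2 confusion matrix (Eqs. 2-4).
def evaluate(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

acc, sens, spec = evaluate(tp=250, tn=130, fp=14, fn=5)   # assumed mapping of the Sect. 6 counts
print(round(acc, 4), round(sens, 4), round(spec, 4))
```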
6 Results and Discussions

The results and the performance metric indices derived are expressed as follows. The number of samples present before random sampling and feature sampling is 11,110 records, whereas after random sampling with a seed of 200, sample frames are created with 660 records. The accuracy obtained is demonstrated below with a dot plot and the confusion matrix.

Number of trees: 5700
No. of variables tried at each split: 500
OOB error: 4.76%

Confusion matrix:

| Class | TP | FN | Error (class) |
|---|---|---|---|
| Positive polarity | 250 | 5 | 0.01960784 |
| Negative polarity | 14 | 130 | 0.09722222 |
Fig. 5 Multimodal opinion mining framework for disease risk prediction
In Fig. 5, the disease risk prediction rate of the random forest is expressed together with the sample rate at each bagging node. The accuracy level and the Gini index increase towards maximum classification accuracy.
References

1. Poria S, Cambria E, Gelbukh A (2015) Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 2539–2544
2. Chaturvedi I, Ragusa E, Gastaldo P, Zunino R, Cambria E (2018) Bayesian network based extreme learning machine for subjectivity detection. J Franklin Inst 355(4):1780–1797
3. Tran HN, Cambria E (2018) Ensemble application of ELM and GPU for real-time multimodal sentiment analysis. Memetic Computing 10(1):3–13
4. Poria S, Majumder N, Hazarika D, Cambria E, Gelbukh A, Hussain A (2018) Multimodal sentiment analysis: addressing key issues and setting up the baselines. IEEE Intell Syst 33(6):17–25
5. Hu P, Zhen L, Peng D, Liu P (2019) Scalable deep multimodal learning for cross-modal retrieval. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval (SIGIR'19). Association for Computing Machinery, New York, NY, USA, pp 635–644. https://doi.org/10.1145/3331184.3331213
6. Abburi H, Akkireddy ESA, Gangashetti S, Mamidi R (2016) Multimodal sentiment analysis of Telugu songs. In: SAAIP@IJCAI, pp 48–52
7. Poria S, Peng H, Hussain A, Howard N, Cambria E (2017) Ensemble application of convolutional neural networks and multiple kernel learning for multimodal sentiment analysis. Neurocomputing 261:217–230
8. Busso C, Deng Z, Yildirim S, Bulut M, Lee CM, Kazemzadeh A, Lee S, Neumann U, Narayanan S (2004) Analysis of emotion recognition using facial expressions, speech and multimodal information. In: Proceedings of the 6th international conference on multimodal interfaces. ACM, pp 205–211
9. Poria S, Chaturvedi I, Cambria E, Hussain A (2016) Convolutional MKL based multimodal emotion recognition and sentiment analysis. In: 2016 IEEE 16th international conference on data mining (ICDM). IEEE, pp 439–448
10. Calhoun VD, Sui J (2016) Multimodal fusion of brain imaging data: a key to finding the missing link(s) in complex mental illness. Biological Psychiatry: Cogn Neurosci Neuroimaging 1(3):230–244. https://doi.org/10.1016/j.bpsc.2015.12.005
11. Lin WH, Hauptmann A (2002) News video classification using SVM-based multimodal classifiers and combination strategies. In: Proceedings of the tenth ACM international conference on multimedia. ACM, pp 323–326
12. Falvo A, Comminiello D, Scardapane S, Scarpiniti M, Uncini A (2020) A multimodal deep network for the reconstruction of T2W MR images. In: Smart innovation, systems and technologies. Springer, Singapore, pp 423–431. https://doi.org/10.1007/978-981-15-5093-5_38
13. Kim Y, Jiang X, Giancardo L et al (2020) Multimodal phenotyping of Alzheimer's disease with longitudinal magnetic resonance imaging and cognitive function data. Sci Rep 10:5527. https://doi.org/10.1038/s41598-020-62263-w
14. Rozgić V, Ananthakrishnan S, Saleem S, Kumar R, Prasad R (2012) Ensemble of SVM trees for multimodal emotion recognition. In: Proceedings of the 2012 Asia Pacific signal and information processing association annual summit and conference. IEEE, pp 1–4
15. Xu X, He L, Lu H, Gao L, Ji Y (2019) Deep adversarial metric learning for cross-modal retrieval. World Wide Web 22(2):657–672. https://doi.org/10.1007/s11280-018-0541-x
16. Kahou SE, Bouthillier X, Lamblin P, Gulcehre C, Michalski V, Konda K, Jean S, Froumenty P, Dauphin Y, Boulanger-Lewandowski N, Ferrari RC (2016) EmoNets: multimodal deep learning approaches for emotion recognition in video. J Multimodal User Interfaces 10(2):99–111
17. Jin K, Wang Y, Wu C (2021) Multimodal affective computing based on weighted linear fusion. In: Arai K, Kapoor S, Bhatia R (eds) Intelligent systems and applications. IntelliSys 2020. Advances in intelligent systems and computing, vol 1252. Springer, Cham. https://doi.org/10.1007/978-3-030-55190-2_1
18. Ranganathan H, Chakraborty S, Panchanathan S (2016) Multimodal emotion recognition using deep learning architectures. In: 2016 IEEE winter conference on applications of computer vision (WACV). IEEE, pp 1–9
19. Majumder N, Hazarika D, Gelbukh A, Cambria E, Poria S (2018) Multimodal sentiment analysis using hierarchical fusion with context modeling. Knowl-Based Syst 161:124–133
20. Soleymani M, Garcia D, Jou B, Schuller B, Chang SF, Pantic M (2017) A survey of multimodal sentiment analysis. Image Vis Comput 65:3–14
Vertical Fragmentation of High-Dimensional Data Using Feature Selection Raji Ramachandran, Gopika Ravichandran, and Aswathi Raveendran
Abstract Fragmentation in a distributed database is a design technique that reduces query processing time by keeping the relation size small. When it comes to storing high-dimensional data in a distributed manner, the processing time increases. This is due to the huge attribute size. In this paper, a method is proposed which can reduce the size of high-dimensional data by using feature selection technique. This technique reduces dimensions by removing irrelevant or correlated attributes from the dataset without removing any relevant data. The algorithm used for feature selection and vertical fragmentation is the random forest and Bond Energy Algorithm (BEA), respectively. Experiments show that our method can produce better fragments. Keywords Feature selection · Random forest · Vertical fragmentation · Bond energy algorithm
1 Introduction Owing to the need of today’s business world, many organizations run in distributed manner and hence stores data in distributed databases. Banking systems, consumer supermarkets, manufacturing companies, etc. are some examples. These organizations have branches working in different locations and therefore stores their data in a distributed manner. Fragmentation is a design technique in distributed databases, in which instead of storing a relation entirely in one location, it is fragmented into different units and stored at different locations. Fragmentation provides data to the user from the nearest location as per the user’s requirement. Fragmentation increases R. Ramachandran · G. Ravichandran · A. Raveendran (B) Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India e-mail: [email protected] R. Ramachandran e-mail: [email protected] G. Ravichandran e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_68
efficiency by reducing the size of the table and hence the search time, and also provides security and privacy to the data. The fragmentation process has three categories: horizontal, vertical, and hybrid fragmentation; a diagrammatic representation is shown in Fig. 1. Fragmentation is the partitioning of a relation F into fragments F1, F2, ..., Fi containing enough information to reconstruct the original relation F. In horizontal fragmentation, data is fragmented tuple-wise based on minterm predicates. This helps all related data to fall into a particular fragment, so that most user queries need to search only a minimum number of fragments [1]. In vertical fragmentation, data is fragmented attribute-wise, with the assumption that users access certain related attributes together; hence, if they are kept in one fragment, user queries can be executed faster. Our system considers vertical fragmentation. The benefit of vertical fragmentation is that only a few related attributes are stored at each site compared to the original attribute set, and attributes are stored according to their access frequency at different sites. All these factors reduce the query processing time in distributed databases. In the case of hybrid fragmentation, data is fragmented vertically as well as horizontally, creating fragments with minimal information both attribute-wise and tuple-wise [2].
Fig. 1 Fragmentation type
Today's world mainly deals with a large volume of data called big data. Big data is a collection of data of very large and still-growing size. It contains high dimensions and needs a large amount of space for its storage. When it comes to storing big data in a distributed manner, each fragment will still be large even after fragmentation, and as the size of fragments increases, the time for query execution also increases [3]. So, if the fragment size can be reduced as much as possible, that will speed up the query execution process. When high-dimensional data is considered, it can be seen that not all dimensions are important, or they may be interrelated, so that redundancy occurs in the set of attributes. Removing these irrelevant or repeated dimensions reduces the attribute size of the dataset and hence that of the fragments produced in vertical fragmentation. This paper proposes a novel approach for vertical fragmentation of high-dimensional data using feature selection.

Dimensionality reduction techniques are divided into two categories: feature selection and feature extraction. Feature selection is used here to reduce the attribute size before vertical fragmentation. Feature selection is the technique that allows us to select the most relevant features, according to the relative importance of each feature for the prediction, and it eventually increases the accuracy of the model by removing irrelevant features. Even though different types of feature selection methods exist, the (supervised) random forest algorithm is focused on because its efficiency is better compared to other feature selection methods [4]. The rest of the paper is organized as follows. Section 2 discusses the major work already done in vertical fragmentation as well as in feature selection. Our proposed method of vertical fragmentation based on feature selection is explained in Sect. 3. The experimentation conducted on various datasets and the result analysis are presented in Sect. 4, and the paper concludes in Sect. 5.
2 Literature Review

Fragmentation in distributed databases is undergoing rapid evolution as new methods are developed with an ever-expanding availability of data. This section explains the major work done in this area. Feature selection techniques became popular with the emergence of big data, and this section also explains the main techniques and applications of different feature selection methods.

The performance of a distributed database can be increased only if a proper distribution design is included. Fragmentation is one such technique, and fragmentation of relational databases has been studied since the 1980s [5]. A distributed relational database with simple attributes and simple methods was fragmented vertically by Thendral and Madu Viswanathan [6]. Here, the input is a set of user queries entered at different sites, and partitioning is done using the bond energy algorithm by considering the frequency of the queries entered by the users. The advantage of this method is that fragmentation, allocation, and update
are implemented successfully for a stand-alone system. However, only update queries are handled in that paper; extensions for delete and alter queries are still needed. Similar work has been done by Rahimi et al. [7], where fragmentation is performed in a hierarchical manner using the bond energy algorithm with a modified affinity measure; the cost of allocating fragments to each site is then calculated, and fragments are allocated to the correct site. The hierarchical method groups more related attributes, which enables better fragments; however, the cost function considered for fragment allocation is not an optimized one. Dorel Savulea and Nicolae Constantinescu in their paper [8] use a combination of a conventional database containing facts and a knowledge base containing rules for vertical fragmentation. The paper also presents the implementation of different algorithms related to fragmentation and allocation, namely RCA rules for clustering, OVF for computing overlapping vertical fragmentation, and CCA for allocating rules and the corresponding fragments [9]. Here, attribute clustering in vertical fragmentation is not determined by the attribute affinity matrix, as usual, but is done using rule-to-attribute dependency matrices. The algorithm is efficient, but only a small number of operations can be performed [10]. A case study of vertical fragmentation is done by Iacob (Ciobanu) Nicoleta-Magdalena [11]. The paper briefly explains distributed databases and the importance of fragmentation and its strategies, and a comparison between different types of fragmentation is also done. The case study is performed by implementing an e-learning platform for the academic environment using vertical fragmentation [12]; it explains how vertical fragmentation increases concurrency and thereby increases the throughput of query processing.

Feature selection helps to reduce overfitting and reduces the data size by removing irrelevant features. There are mainly three types of feature selection: the wrapper method, the filter method, and the embedded method [13]. In the wrapper method, subsets of features are generated and features are then deleted from or added to the subset. In the filter method, feature selection is done based on the scores of a statistical test. The embedded method combines the features of the wrapper method and the filter method. The random forest classifier comes under the wrapper method [4]. Jehad Ali, Rehanullah Khan, Nasir Ahmad, and Imran Maqsood, in their paper on random forest and decision tree, made a comparative study of the classification results of random forest and decision tree using 20 datasets available in the UCI repository. Their comparison is based on correctly classified instances for both the decision tree and the random forest, taking into account the number of instances and the number of attributes [14]. The paper concluded that the percentage of correctly classified instances is higher for the random forest and the percentage of incorrectly classified instances is lower than that of a decision tree. The comparison is also done on recall, precision, and F-measure, where the random forest again shows increased classification performance and accurate results [15]. The study of the random forest was done by Leo Breiman in his paper named "Random Forests". The paper gives deep theoretical knowledge of random forests and includes their history, and the complete steps of the random forest are explained by computation.
The random forest for regression is formed in addition to
classification [16]. It is concluded that random features and random inputs produce better results in classification than in regression. However, only two types of randomness are used there, namely bagging and random features; other injected randomness might give better results. The application of the random forest algorithm to machinery fault diagnosis is given by Yang and Soo-Jong Lee. The paper describes a technique that helps to diagnose rotating machinery faults, in which a novel ensemble classifier constructs a significant number of decision trees. Even though many fault diagnosis techniques exist, the random forest methodology is considered better because of its execution speed. Here, randomness in the form of bagging (bootstrap aggregating), a meta-algorithm that enhances classification, is used [17]. However, a minor change in the training set of a randomized procedure can trigger a major difference between the component classifier and the classifier trained on the whole dataset. Another proposal is made by Ramon Casanova, Santiago Saldana, Emily Y. Chew, Ronald P. Danis, Craig M. Greven, and Walter T. Ambrosiu in their paper on the implementation of random forest methods for diabetic retinopathy analysis. Early detection of diabetic retinopathy can reduce the chances of becoming blind. The approach, applied to 3443 participants of the ACCORD-Eye analysis, uses random forest and logistic regression classifiers on graded fundus photography and systematic results. They concluded that RF-based models provided higher classification accuracy than logistic regression [18]. The result suggests that the random forest method can be one of the better tools to diagnose diabetic retinopathy and to evaluate its progression; however, different degrees of retinopathy were not evaluated. Even though there exist many applications of feature selection in the big data area, to the best of our knowledge it has not yet been used in distributed databases for vertical fragmentation.
3 Proposed Method

As stated earlier, our proposed method consists of vertical fragmentation of high-dimensional data after removing irrelevant or correlated attributes using a feature selection method.
3.1 Architectural Diagram

The proposed system architecture is shown in Fig. 2. As shown in the architectural diagram, our proposed method of vertical fragmentation is done in two phases:

Phase 1: Feature selection of high-dimensional data.
Phase 2: Vertical fragmentation.

The processing of each phase is explained below.
Fig. 2 System architecture
Phase 1: Feature selection phase. This phase converts the high-dimensional dataset into a low-dimensional dataset by removing irrelevant information using the feature selection method, namely the random forest algorithm. The steps of the random forest are given below.

Step 1: Select random samples from the dataset.
Step 2: Using the selected samples, create decision trees and obtain a prediction from every decision tree.
Step 3: Voting is carried out over the predicted outcomes.
Step 4: Finally, pick the most voted result as the final prediction outcome.

Phase 2: Vertical fragmentation phase. Vertical fragmentation of the reduced-dimensional dataset is done in this phase using the Bond Energy Algorithm (BEA) [19]. The steps of BEA are given below.

Step 1: Create the usage matrix for each class from the user queries. Given a set of queries Q = {q1, q2, ..., qq} that will run on the relation R[A1, A2, ..., An],

$$\mathrm{use}(q_i, A_j) = \begin{cases} 1, & \text{if } A_j \text{ is used by } q_i \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
Step 2: Create the access frequency matrix of the queries for each class at each site.
Step 3: Using the access frequencies and the usage matrix, determine the attribute affinity matrix.
Step 4: Build the clustered matrix from the affinity matrix.
Step 5: Use the partitioning algorithm to obtain the partitions. The partition point is the point that divides the attributes into separate classes so that they can be allocated to multiple sites. Two-way partitioning is done, i.e., the attributes are divided into two classes assigned to two locations. Attributes to the left
of the partition point belong to one site while attributes to the right belong to another site [20]. The fragments produced using the BEA can be allocated to various nodes of the distributed database using the allocation algorithm.
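A compact sketch of both phases is given below: attributes are first ranked by random-forest importance (scikit-learn), and an attribute affinity matrix is then built from an invented usage matrix and query frequencies following the usual affinity definition (summed frequency of queries touching both attributes); BEA would subsequently reorder and split this matrix. The data, query set, and retained-attribute count are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Phase 1 (illustrative): rank attributes by random-forest importance and keep the top-k.
X, y = make_classification(n_samples=400, n_features=20, n_informative=6, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
keep = np.argsort(forest.feature_importances_)[::-1][:6]      # indices of the retained attributes
print("retained attributes:", sorted(keep.tolist()))

# Phase 2 (illustrative): build the attribute usage and affinity matrices for the kept attributes.
# use[q][a] = 1 if query q touches attribute a; freq[q] = total access frequency of q over all sites.
use = np.array([[1, 1, 0, 0, 0, 0],
                [0, 0, 1, 1, 0, 0],
                [1, 0, 0, 0, 1, 1],
                [0, 0, 1, 0, 0, 1]])
freq = np.array([25, 50, 10, 15])

n_attr = use.shape[1]
aff = np.zeros((n_attr, n_attr), dtype=int)
for i in range(n_attr):
    for j in range(n_attr):
        # affinity = summed frequency of queries that access both attributes i and j
        aff[i, j] = int(np.sum(freq * use[:, i] * use[:, j]))
print(aff)
# BEA would reorder rows/columns of `aff` to maximize neighbouring bond energy and
# split the clustered matrix at the best partition point into two vertical fragments.
```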
4 Experimentation and Result Analysis

Experimentation with our proposed method is done using various parameters such as time, the number of fragments, and the average number of dimensions in each fragment. For the experimentation, five datasets have been taken from the UCI repository; details of the datasets are given in Table 1. It is seen that the complexity and space consumption are reduced by using a feature selection method. The time taken for fragmenting the datasets with and without feature selection is shown in Fig. 3. As seen from the graph, when the high-dimensional data is reduced to low-dimensional data using feature selection, the fragmentation time is also reduced. As the dimensionality of big data increases, a considerable reduction in fragmentation time will be obtained if irrelevant or dependent features are removed before fragmentation.

Table 1 Datasets used
| Dataset | Number of attributes | Number of tuples |
|---|---|---|
| Madelon train | 100 | 4400 |
| Arrhythmia | 200 | 452 |
| Anonymous Microsoft Web | 300 | 3371 |
| Rock art features | 400 | 1105 |
| Madelon | 500 | 4400 |

Fig. 3 Comparison based on fragmentation time
Fig. 4 Comparison based on number of fragments formed
Also, the number of fragments produced after fragmentation plays an important role in determining the efficiency of the distributed system. As the fragment count increases, more fragments need to be searched for processing a single query, which in turn increases the query processing time. A comparison based on the number of fragments produced for the different datasets is shown in Fig. 4. When feature selection is used as a preprocessing step before fragmentation, only relevant features are considered for vertical fragmentation, which ensures that the number of fragments produced is sufficient to answer user queries. When a query requires only a single fragment for its processing, the number of dimensions in that fragment plays a critical role in performance. If related attributes can be kept together in a fragment, query processing time can be reduced drastically. When fragmentation is done after feature selection, it is ensured that each fragment formed contains only a limited number of relevant attributes. Table 2 shows the average number of dimensions in each fragment after feature selection. It is evident from the table that feature selection results in fragments with limited dimensions. In general, the experiments show that our method can produce better fragments with respect to various other methods.

Table 2 Number of dimensions in each fragment
| Dataset | Number of fragments | Average no. of dimensions |
|---|---|---|
| D1 | 10 | 5 |
| D2 | 18 | 6 |
| D3 | 22 | 8 |
| D4 | 28 | 9 |
| D5 | 32 | 10 |
5 Conclusion

This paper proposes a new method for vertical fragmentation of high-dimensional data using feature selection. Removing irrelevant or correlated attributes before fragmentation can reduce the dimension size and produce better fragments. A random forest is chosen for the feature selection and the bond energy algorithm for the vertical fragmentation. When high-dimensional data is reduced to a lower dimension and then fragmented, the query execution time is also considerably reduced. Allocation of fragments to the various nodes of the distributed database is kept as a future enhancement.
References

1. Ramachandran R, Nair DP, Jasmi J (2016) A horizontal fragmentation method based on data semantics. In: 2016 IEEE international conference on computational intelligence and computing research (ICCIC), Chennai, India
2. Ramachandran R, Harikumar S (2015) Hybridized fragmentation of very large databases using clustering. In: 2015 IEEE international conference on signal processing, informatics, communication and energy systems, Chennai, India
3. Jacob JS, Preetha KG (2001) Vertical fragmentation of location information to enable location privacy in pervasive computing. In: IFIP/ACM international conference on distributed systems platforms and open distributed processing, India
4. Kursa MB, Rudnicki WR (2018) The all relevant feature selection using random forest. IEEE, Bhopal, India
5. Vertical fragmentation in relational database systems (update queries included). IEEE Communications Surveys
6. Matteo G, Maio D, Rizzi S (1999) Vertical fragmentation of views in relational data warehouses. In: SEBD, pp 19–33
7. Rahimi H, Parand F, Riahilarly D (2018) Hierarchical simultaneous vertical fragmentation and allocation using modified bond energy algorithm in distributed databases, India
8. Savulea D, Constantinescu N (2011) Vertical fragmentation security study in distributed deductive databases
9. Chakravarthy S, Muthuraj J, Varadarajan R, Navathe SB (1993) An objective function for vertically partitioning relations in distributed databases and its analysis. Distrib Parallel Databases 2(1):183–207
10. Bellatreche L, Simonet A, Simonet M (1996) Vertical fragmentation in distributed object database systems with complex attributes and methods. IEEE
11. Rogers J, Gunn S (2016) Identifying feature relevance using a random forest, India
12. Cornell D, Yu PS (1987) A vertical partitioning algorithm for relational databases. In: Proceedings of the third international conference on data engineering. IEEE
13. Miaoa J, Niu L (2016) A survey on feature selection. In: Information technology and quantitative management, India
14. Mahsereci Karabulut E, Ayşe Özel S, Turgay İ (2012) A comparative study on the effect of feature selection on classification accuracy. Turkey
15. Ani R, Augustine A, Akhil NC, Deepa Gopakumar OS (2016) Random forest ensemble classifier to predict the coronary heart disease using risk factors. In: Proceedings of the international conference on soft computing systems
16. Reif DM, Motsinger AA, McKinney BA, Crowe JE Jr, Moore JH (2015) Feature selection using a random forests classifier for the integrated analysis of multiple data types
944
R. Ramachandran et al.
17. Draminski M, Rada-Iglesias A, Enroth S, Wadelius, C, Koronacki J, Komorowski J (2008) Monte Carlo “feature selection for supervised classification”. Bioinformatics 24(1):110–117 18. Reif DM, Motsinger AA, McKinney BA (2006) Feature selection using random forest classifier on integrated data type 19. Mehta S, Agarwal P, Shrivastava P, Barlawala J (2018) Differential bond energy algorithm for optimal vertical fragmentation of distributed databases 20. Puyalnithi T, Viswanatham M (2015) Vertical fragmentation, allocation and re-fragmentation in distributed object relational database systems with update queries
Extrapolation of Futuristic Application of Robotics: A Review D. V. S. Pavan Karthik and S. Pranavanand
Abstract Life has been constantly evolving. From the bicycle to the fastest car, the future is shaped by the latest breakthroughs in science, medicine, space research, marine exploration, and many other fields. One such breakthrough is robotics. People are familiar with robots mostly from television and computers, and much less often from real life. Robotics is reshaping the very purpose of humans and their needs. Based on the panorama obtained, this paper reviews research and applications in the field of robotics that put forward machine dominance in industry. Keywords Robotics · Data analytics · Space exploration · Medicine
1 Introduction Robots can leave constrained industrial environments and reach out to unexplored and unstructured areas, enabling extensive real-world applications with substantial utility. Throughout history, there has always been a forecast that robotics would thrive and be able to manage tasks and mimic human behavior. Today, as technological advances continue, researching, designing, and building new robots serves various practical purposes in fields like medicine, space research, marine exploration, manufacturing and assembly, data analytics, armory, and so on. In fields like manufacturing, assembly, and medical and surgical applications, robotics essentially minimizes human error and increases accuracy. On the other hand, in fields like space and marine exploration, robotics makes it possible to reach unbelievable heights in areas that are practically impossible for humans to reach. With existing technologies, various applications have already been realized; however, the future of robotics holds a lot more.
D. V. S. Pavan Karthik (B) · S. Pranavanand Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering and Technology, Secunderabad, Telangana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_69
1.1 Robotics in the Field of Medicine The use of robotics in the medical sector has been constantly upgraded to meet the accuracy demanded by surgery. A 16-segment biomechanical model [1] of the human body has been built and realized as a 3D model in SolidWorks so that an arm or any other limb can move in a task-dependent way similar to a real one. For this, the geometric and mass-inertial characteristics of the body segments must be known; to obtain an overview of these properties, a mathematical model that predicts the inertial properties of the human body in a fixed posture (sitting) was developed and used to drive the design. The model is also used to study human behavior in space, ergonomics, criminology, and other areas. The Brain–Machine Interface (BMI) [2] is interactive software that supports communication with the robot and the environment and can be used by a wide range of patients. Information is taken from the user's electroencephalographic (EEG) signals (Fig. 1), and the system adapts to the user's daily requirements by providing inputs almost the same as those of a real limb. In the European Commission (EC), the Directorate-General for Information Society and Media is promoting the use of technology that has proven useful in the healthcare sector [3]. A brain tumor is a deadly chronic disease, whether in a child or an adult. The most efficient way of locating a tumor is with Magnetic Resonance Imaging (MRI), and MRI in coordination with robotics has a better chance of success [4]. For example, a tumor may be missed and spread to different parts of the body in ways that are difficult for the naked eye and current equipment to detect; with the help of continuum robots, this risk is greatly reduced. Microrobots [5] are deployed into the affected area, provide a first-person view, and can take decisions and perform the required activity where human skill and access are not available. The robot uses
Fig. 1 Actual set up of BMI software
the process of electrocautery (cautery using a needle or other instrument that is electrically heated) to remove the tumor. This mesoscale approach helps the surgeon locate both the surgical robot and the tumor so that the tumor can be completely eradicated. Surgical knives, on the other hand, require considerable experience to use flawlessly. The introduction of water-jet surgery [6] increases precision and maintains the right pressure and abrasion for every tissue or organ in the human body: water is sprayed at high pressure, and another probe absorbs the water at the other end to keep the view of the endoscopic optics clear. Taking care of elderly people demands both time and patience, and there is a shortage of caregivers; people with Mild Cognitive Impairment (MCI) also have trouble controlling their actions and movements. With the help of robotics [7], such patients can be supported with an instrument that gives them directions and helps control their posture (something like a smart walking stick that supports the elderly in physical activity based on their motion). Pain is one of the most natural and common sensations humans endure, but for people affected by Complex Regional Pain Syndrome (CRPS), the pain and symptoms last for months. A robotic exoskeleton integrated with a Virtual Reality (VR) [8] dashboard has been designed to assess proprioception at rest and during active movement in people suffering from CRPS, as these people tend to exaggerate their movements and hurt themselves. The system gives them information on how their body can move properly without putting stress on the muscles. People who undergo mental therapy feel the need to talk to someone, yet they may get offended or feel very uncomfortable. For this purpose, robots can be introduced; they can also be used for recreation, helping people cope with mental illness by having a good chat with them [9]. Robotics can also help animals with disabilities. For example, rats suffering from neurological disorders or spinal cord injuries tend to have hindered movement, and a robotic exoskeleton [10] consisting of Soft Pneumatic Actuators (SPAs) has been designed to help them move. According to genetics, all flora and fauna are born with the genes of their parents, and these features are stored in Deoxyribonucleic Acid (DNA), which determines the traits inherited by the offspring. The DNA has a double-helix structure and consists of tiny thread-like structures that carry genetic information. To treat some conditions, mostly coronary disease, micro-reeling of microfibers [11] is performed, replicating the vascular microstructure using an Electromagnetic Needle (EMN). To achieve smooth reeling, the trajectory of the EMN tip is predetermined. The EMN reels a microfiber containing magnetic nanoparticles around a micropillar; to keep the microfiber from being attracted to the EMN tip, a dual-ring structure is used. The main advantage of the robotic system is its high accuracy and stability in fabricating these microstructures. Performing surgeries from great distances has become possible thanks to advanced robots, the telerobots [12]. This kind of robot was conceived to help those in space whom people could not reach across such
distances. Yet, it is used just as effectively on earth, though there remains a slight communication gap between the surgeon and the patient. A very recent invention in the field of robotics is the Xenobot [based on https://en.m.wikipedia.org/wiki/Xenobots], a living, self-healing robot made from the stem cells of a frog (Xenopus laevis). It is a completely different kind of robot, small enough to travel through the human body, and can go without food or water for about a month or two. Tiny robots made of iron and plastic are generally harmful once they decay in the body; Xenobots, by contrast, are largely degradable. They can be used for targeted drug delivery or for eliminating disease at a microscopic level.
1.2 Robotics in Space Research Space is another milestone that mankind has reached in the previous decade. But space can be as challenging as a game of shogi: without a proper strategy and plan, it is very complicated to cruise through. In places where a human landing might be risky, robots can be used as testbeds [13] to test the landing site before any human interaction. Humanoid robots [14] can be used for unmanned and deep space missions and to retrieve information. The robot's main objective is to be able to perform the following actions. 1. Communication dish: adjust the pitch and yaw alignment of a communication dish. 2. Solar array: retrieve, deploy, and connect a new solar panel to the existing solar array. 3. Air leak: climb the stairs to a habitat entrance, enter the habitat, find an air leak using a leak detector tool, and repair the air leak using a patch. DRC-Hubo [14] is a humanoid robot that can recognize the color and position of an LED displayed on a console panel, press the button that opens the door, and walk through the doorway. Not all astronauts are blessed to be customers of the robots on the International Space Station (ISS). People on the ISS face many problems, such as repetitive work and difficulty performing their experiments in microgravity. To conquer this problem, NASA built a robot named Astrobee (Fig. 2) [15], which can perform repetitive tasks, carry loads, and provide a platform for the spacemen to conduct their research. It has a hinge that can cling to the rails of the ISS, thereby increasing standby time and reducing fuel consumption. Having a robot co-worker or co-pilot by your side is more than mere speculation. Many people have watched films in which a robot steers the space shuttle and later retrieves information from places where human existence is impossible. Among the many efforts, ESA's METERON SUPVIS Justin is an experiment in which astronauts on board the International Space Station (ISS) commanded the humanoid robot Rollin' Justin [16] in a simulated Martian environment on earth. This type of operation is highly recommended for spacemen, as they have trouble with motor control in
Fig. 2 Structure of Astrobee
microgravity, apart from the mental load caused by any unexpected problems on board. Another advantage is that the robot can be controlled freely through a variety of User Interfaces (UI), such as a tablet. Ever feared that one day debris or a spaceship might come crashing into the atmosphere? There is no need to panic, because a countermeasure exists: a machine fitted with a kinematically redundant robot [17] whose main principle is based on target motion prediction. It moves along a reference trajectory provided by ground control, constantly corrects its trajectory with the help of a tracking controller, and finally takes the grasp. The grasp timing is selected so that the initial contact forces pass through the center of mass of the chaser (the robot that grabs the target) to minimize the after-effects caused by a change in attitude. The machine either delivers the object down to earth or sets it back on its trajectory, thereby reducing ballistic impact. This method can also be used to remove space debris from the earth's orbit.
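To make the trajectory-tracking step concrete, here is a minimal, hypothetical sketch of a discrete-time proportional-derivative (PD) tracking loop; it is not the controller from [17], and the gains, time step and unit-mass assumption are illustrative only.

```python
# Hypothetical PD tracking loop (illustrative only, not the controller from [17]):
# follow a reference trajectory from ground control, correcting the error at every
# step before the grasp is attempted.
import numpy as np

def pd_track(reference, x0, v0, dt=0.1, kp=2.0, kd=1.5):
    """reference: (N, 3) array of commanded positions; returns the followed path."""
    x, v = np.asarray(x0, float), np.asarray(v0, float)
    followed = []
    for x_ref in reference:
        error = x_ref - x                # position error w.r.t. the reference point
        accel_cmd = kp * error - kd * v  # PD correction (unit mass assumed)
        v = v + accel_cmd * dt           # integrate the commanded acceleration
        x = x + v * dt
        followed.append(x.copy())
    return np.array(followed)
```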
1.3 Robotics in Marine Exploration Space is better known than our own waters. Half of our oceans remain unexplored, and there might be a cure for every disease that has struck mankind, lost knowledge, and much more right under our noses. The things that come to mind when we think of water are vessels, ships, and boats, and of all the ways to lose a vessel, the worst is to let
it sink in the deep waters. The hull is the main part of a ship, and its design is what allows the vessel to float and resist friction. Because of humidity and constant cruising in the water, rust sets in, corrodes the metal, and leads to leakages in the vessel and eventually to sinking. To counter this problem, researchers have come up with the idea of a swarm of deep-water robots [18] that detect breaches in the hull and notify the crew on board in emergencies. The robots form a cluster and rearrange themselves around the area of water infiltration; they are the physical representation of the quote "United we stand, divided we fall." They have a higher resistance to sensor noise, and the probability of the robotic population going haywire is close to zero, thereby reducing casualties and economic loss in the marine industry. Deep marine exploration has been possible only through the use of robots and their ability to share information among themselves and take the necessary action. Crabster (CR200) [19] is a six-legged crab-like robot (Fig. 3) made for deep-sea exploration that can withstand turbidity and underwater waste; it has so far been tested in a water tank simulating wild sea currents. OceanRINGS [20] is a platform of technologies that can be coupled with almost any Remotely Operated Vehicle (ROV), independent of its size. Tests were conducted with different support vessels off the North, South, and West coasts of Ireland, in Donegal, Bantry Bay, Cork Harbor, Galway Bay, the Shannon Estuary, and La Spezia, Italy. It has also provided the ground for a prototype real-time communication system and is based on the principle of remote presence technology. Marine energy, from oil and gas to renewables, is as important to our lives as oxygen. Sometimes the oil or gas lies offshore, and for such purposes OceanRINGS has put forward the idea of building robotic systems for the inspection of offshore sub-sea oil and gas installations and marine renewable energy devices [21]. They are capable of resisting extremely harsh weather conditions and send information about the
Fig. 3 Crabster equipped with different devices
location and amount of reserves to the Virtual Control Cabin (VCC) on the ground, making renewable energy available to the population. This smart technology could lead to significant savings in time, maintenance, and operational costs.
1.4 Robotics in Manufacturing and Assembling In the manufacturing industry, human hours are not very efficient. Humans prefer to work in a safe and flexible environment, but that luxury cannot always be provided [22]. Replacing them with robots therefore increases efficiency and production compared to human labor and reduces the cost of manufacturing. Robots can carry out work for up to 12 hours straight and can avoid many human errors and mistakes in quality control, thereby proving themselves fitter for the job than humans. For example, if a person needs to lift an object of about 25 kg, he/she will experience back pain. People also tend to forget where they put their things. To improve the stacking and retrieving of objects, a robot has been developed that stows [23] the required object at a particular place and time, makes a note of it, and later uses this information to retrieve the object from where it was placed. The Statue of Liberty was gifted by France to the United States on account of their independence, but it was shipped in parts and would have taken about four months just to assemble. Imagine if it were to be gifted in this era, where space and labor are constrained. Keeping these limitations in mind, a new approach has been put forward in which the fixtures and the tooling are all taken care of by coordinated mobile robots [24]. The mobility of the robots allows the assembly space to be re-evaluated and reduces the cost and effort of labor. Sometimes robots need to undergo complex coordination to get the parts and tools to the designated place and time to obtain the desired assembly. The assembly process is cut down to these four basic points: (1) mobile manipulator hardware design, (2) fixture-free positioning, (3) multi-robot coordination, and (4) real-time dynamic scheduling. Additive manufacturing [based on https://en.m.wikipedia.org/wiki/3D_printing], commonly known as 3D printing, has greatly influenced the mass production market. It is used to manufacture complex parts that are difficult to produce by other means (Fig. 4). There is a bundle of advantages when it comes to 3D printing, which includes: 1. Rapid prototyping: as the name suggests, it aids faster production; it takes just hours to produce a part, unlike other typical methods which may take days. 2. A quick analyzing technique: an experimental product can be manufactured to check its properties and functions, giving awareness of the pros and cons of the product before going for large-scale production.
Fig. 4 A finished product using the 3D printer. Image courtesy: www.zdnet.com
3. Waste reduction: only the material required by the product is used, and the remaining material can be used later. 4. Customization: every product designed can be customized in size, shape, color, and structure. 5. Precision: in some fields of work a millimeter plays an important role in a machine's efficiency. For example, the springs in watches are very small and require great time and precision to craft by other means; here, it is done with pinpoint accuracy and in a short time.
1.5 Robotics in Data Analytics One of the most budding and enriching technologies, on par with both AI and robotics, is big data together with cloud, dew, and fog computing [25]. Robots have advanced to a state where all information and data are stored, verified, and then sent to the user. To store and execute such large data and algorithms, robots need much larger storage than hard drives alone, and this is where cloud and fog computing come into the picture. With their immense storage and computation, functions are executed at higher speeds to meet the demands of the growing population and the corporate world. C2RO (Collaborative Cloud Robotics) [26] is a cloud platform that uses stream processing technology to connect the city to mobile devices and sensors. This technology boosts the intelligence of robots to a larger scale, enabling them to perform complicated tasks such as simultaneous
localization and mapping (SLAM), speech recognition, and 3D grasp planning as they retrieve the information from the cloud or fog.
1.6 Robotics in SOS Missions During a natural calamity or disaster, loss of human life is most of the time inevitable. In such situations, drones can carry out search and rescue missions. The safest way to get in or out of a forest is to follow the existing trail generally made by hikers and mountaineers, so the robot needs to look for the trail and then make an effort to stay on it. A machine learning approach to the visual perception of forest trails is taken by training a neural network on various real-world datasets; tested on a single image, the system outputs the main direction of the trail relative to the viewing direction. The probable direction is determined using a Deep Neural Network (DNN) as an image classifier [27], which operates directly on the image's pixels. It is used to determine actions and avoid obstacles in the wilderness and is mainly made to navigate in places where humans cannot reach with their existing approach. Finding people who have lost their way in dense forests or rugged terrain might not be completely impossible for humans, but for a robot the size of an arm it is a cakewalk.
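To illustrate the classifier idea (this is a minimal, hypothetical sketch, not the architecture used in [27]), a small three-class CNN can map a trail image to a direction: turn left, go straight, or turn right.

```python
# Minimal illustrative CNN for trail-direction classification (hypothetical,
# not the network from [27]): input RGB frame, output one of three classes.
import torch
import torch.nn as nn

class TrailDirectionNet(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # collapse spatial dimensions
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):                        # x: (batch, 3, H, W)
        h = self.features(x).flatten(1)
        return self.classifier(h)                # raw logits per direction

# Example: predict the trail direction for one camera frame.
model = TrailDirectionNet()
frame = torch.rand(1, 3, 101, 101)
direction = model(frame).argmax(dim=1)           # 0 = left, 1 = straight, 2 = right
```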
1.7 Robotics in Armory War might not be a pleasant subject to dwell on, but it presents an opportunity to introduce unmanned vehicles controlled by robots to reduce casualties on a large scale. Unmanned Aerial Systems (UAS) [28], simply called drones, have proven their worth in unmanned missions. They can substitute for missiles and torpedoes, reducing sacrificial and suicidal missions. Another alternative is the use of robotic soldiers, which have higher endurance and strength and are capable of battle for longer durations, i.e., robotic ammunition, etc.
1.8 Robotics in Teaching Teaching today's kids, tomorrow's future, is more than important. Robotics can turn ideas into reality, and inculcating it in the curriculum [29], both for schools and universities, will steer future generations toward this budding field. If the teacher is a robot, students will make an effort to listen; by grabbing their attention, robots can help them toward a proper academic and social career. Robots can train athletes and sportsmen toward glory. They can instruct pupils with the help of speech, the most effective form of communication. This feat
is achieved by using an Artificial Neural Network (ANN) [30]. Such robots can follow simple instructions as of now, thereby making robotics useful in all walks of life.
2 Conclusion The above literature review gives a basic picture of robotics in our daily lives and its use in the long run. It also gives an overview of the existing and upcoming technology in the vast field of robotics. Apart from the cited fields of usage, there may be many other fields in which robotics is the fundamental building block. Reaching the human level of intelligence and exposure is currently an issue, as robots can perform only the tasks they are programmed for. Yet, research toward achieving maximally human-like characteristics is still under way.
References 1. Nikolova G, Kotev V, Dantchev D (2017) CAD modelling of human body for robotics applications. In: 2017 international conference on control, artificial intelligence, robotics & optimization (ICCAIRO), Prague, pp 45–50. https://doi.org/10.1109/ICCAIRO.2017.18 2. Schiatti L, Tessadori J, Barresi G, Mattos LS, Ajoudani A (2017) Soft brain-machine interfaces for assistive robotics: a novel control approach. In: 2017 International conference on rehabilitation robotics (ICORR), London, pp 863–869. https://doi.org/10.1109/ICORR.2017. 8009357 3. Gelderblom GJ, De Wilt M, Cremers G, Rensma A (2009) Rehabilitation robotics in robotics for healthcare; a roadmap study for the European Commission. In: 2009 IEEE international conference on rehabilitation robotics, Kyoto, 2009, pp 834–838. https://doi.org/10.1109/ICORR. 2009.5209498 4. Kim Y, Cheng SS, Diakite M, Gullapalli RP, Simard JM, Desai JP (2017) Toward the development of a flexible mesoscale MRI-compatible neurosurgical continuum robot. IEEE Trans. Rob. 33(6):1386–1397. https://doi.org/10.1109/TRO.2017.2719035 5. Ongaro F, Pane S, Scheggi S, Misra S (2019) Design of an electromagnetic setup for independent three-dimensional control of pairs of identical and nonidentical microrobots. IEEE Trans Rob 35(1):174–183. https://doi.org/10.1109/TRO.2018.2875393 6. Schlenk C, Schwier A, Heiss M, Bahls T, Albu-Schäffer A (2019) Design of a robotic instrument for minimally invasive waterjet surgery. In: 2019 International symposium on medical robotics (ISMR), Atlanta, GA, USA, pp 1–7. https://doi.org/10.1109/ISMR.2019.8710186 7. Stogl D, Armbruster O, Mende M, Hein B, Wang X, Meyer P (2019) Robot-based training for people with mild cognitive impairment. IEEE Robot Autom Lett 4(2):1916–1923. https://doi. org/10.1109/LRA.2019.2898470 8. Brun C, Giorgi N, Gagné M, Mercier C, McCabe CS (2017) Combining robotics and virtual reality to assess proprioception in individuals with chronic pain. In: 2017 International conference on virtual rehabilitation (ICVR), Montreal, QC, pp 1–2. https://doi.org/10.1109/ICVR. 2017.8007491 9. Meghdari A, Alemi M, Khamooshi M, Amoozandeh A, Shariati A, Mozafari B (2016) Conceptual design of a social robot for pediatric hospitals. In: 2016 4th international conference on robotics and mechatronics (ICROM), Tehran, pp 566–571. https://doi.org/10.1109/ICRoM. 2016.7886804
10. Florez JM et al (2017) Rehabilitative soft exoskeleton for rodents. IEEE Trans Neural Syst Rehabil Eng 25(2):107–118. https://doi.org/10.1109/TNSRE.2016.2535352 11. Sun T et al (2017) Robotics-based micro-reeling of magnetic microfibers to fabricate helical structure for smooth muscle cells culture. In: 2017 IEEE international conference on robotics and automation (ICRA), Singapore, 2017, pp 5983–5988. https://doi.org/10.1109/ICRA.2017. 7989706 12. Takács Á, Jordán S, Nagy DÁ, Tar JK, Rudas IJ, Haidegger T (2015) Surgical robotics— born in space. In: 2015 IEEE 10th Jubilee international symposium on applied computational intelligence and informatics, Timisoara, pp 547–551. https://doi.org/10.1109/SACI.2015.720 8264 13. Backes P et al (2018) The intelligent robotics system architecture applied to robotics testbeds and research platforms. In: 2018 IEEE aerospace conference, Big Sky, MT, 2018, pp 1–8. https://doi.org/10.1109/AERO.2018.8396770 14. Tanaka Y, Lee H, Wallace D, Jun Y, Oh P, Inaba M (2017) Toward deep space humanoid robotics inspired by the NASA space robotics challenge. In: 2017 14th international conference on ubiquitous robots and ambient intelligence (URAI), Jeju, pp 14–19. https://doi.org/10.1109/ URAI.2017.7992877 15. Yoo J, Park I, To V, Lum JQH, Smith T (2015) Avionics and perching systems of free-flying robots for the International Space Station. In: 2015 IEEE international symposium on systems engineering (ISSE), Rome, pp 198–201. https://doi.org/10.1109/SysEng.2015.7302756 16. Schmaus P et al (2020) Knowledge driven orbit-to-ground teleoperation of a Robot coworker. IEEE Robot Autom Lett 5(1):143–150. https://doi.org/10.1109/LRA.2019.2948128 17. Lampariello R, Mishra H, Oumer N, Schmidt P, De Stefano M, Albu-Schäffer A (2018) Tracking control for the grasping of a tumbling satellite with a free-floating robot. IEEE Robot. Autom. Lett. 3(4):3638–3645. https://doi.org/10.1109/LRA.2018.2855799 18. Haire M, Xu X, Alboul L, Penders J, Zhang H (2019) Ship hull inspection using a swarm of autonomous underwater robots: a search algorithm. In: 2019 IEEE international symposium on safety, security, and rescue robotics (SSRR), Würzburg, Germany, 2019, pp 114–115. https:// doi.org/10.1109/SSRR.2019.8848963 19. Yoo S et al (2015) Preliminary water tank test of a multi-legged underwater robot for seabed explorations. In: OCEANS 2015—MTS/IEEE Washington, Washington, DC, 2015, pp 1–6. https://doi.org/10.23919/OCEANS.2015.7404409 20. Omerdic E, Toal D, Dooly G (2015) Remote presence: powerful tool for promotion, education and research in marine robotics. In: OCEANS 2015—Genova, Genoa, 2015, pp 1–7. https:// doi.org/10.1109/OCEANS-Genova.2015.7271467 21. Omerdic E, Toal D, Dooly G, Kaknjo A (2014) Remote presence: long endurance robotic systems for routine inspection of offshore subsea oil & gas installations and marine renewable energy devices. In: 2014 oceans—St. John’s, NL, 2014, pp 1–9. https://doi.org/10.1109/OCE ANS.2014.7003054 22. Hirukawa H (2015) Robotics for innovation. In: 2015 symposium on VLSI circuits (VLSI circuits), Kyoto, 2015, pp T2–T5. https://doi.org/10.1109/VLSIC.2015.7231379 23. Chong Z et al (2018) An innovative robotics stowing strategy for inventory replenishment in automated storage and retrieval system. In: 2018 15th international conference on control, automation, robotics and vision (ICARCV), Singapore, pp 305–310. https://doi.org/10.1109/ ICARCV.2018.8581338 24. Bourne et al D (2015) Mobile manufacturing of large structures. 
In: 2015 IEEE international conference on robotics and automation (ICRA), Seattle, WA, pp 1565–1572. https://doi.org/ 10.1109/ICRA.2015.7139397 25. Botta A, Gallo L, Ventre G (2019) Cloud, fog, and dew robotics: architectures for next generation applications. In: 2019 7th IEEE international conference on mobile cloud computing, services, and engineering (MobileCloud), Newark, CA, USA, pp 16–23. https://doi.org/10. 1109/MobileCloud.2019.00010 26. Beigi NK, Partov B, Farokhi S (2017) Real-time cloud robotics in practical smart city applications. In: 2017 IEEE 28th annual international symposium on personal, indoor, and mobile
radio communications (PIMRC), Montreal, QC, 2017, pp 1–5. https://doi.org/10.1109/PIMRC.2017.8292655
27. Giusti A et al (2016) A machine learning approach to visual perception of forest trails for mobile robots. IEEE Robot Autom Lett 1(2):661–667. https://doi.org/10.1109/LRA.2015.2509024
28. Sanchez-Lopez JL et al (2016) AEROSTACK: an architecture and open-source software framework for aerial robotics. In: 2016 international conference on unmanned aircraft systems (ICUAS), Arlington, VA, pp 332–341. https://doi.org/10.1109/ICUAS.2016.7502591
29. Niehaus F, Kotze B, Marais A (2019) Facilitation by using robotics teaching and learning. In: 2019 Southern African Universities power engineering conference/robotics and mechatronics/pattern recognition association of South Africa (SAUPEC/RobMech/PRASA), Bloemfontein, South Africa, pp 86–90. https://doi.org/10.1109/RoboMech.2019.8704848
30. Joshi N, Kumar A, Chakraborty P, Kala R (2015) Speech controlled robotics using Artificial Neural Network. In: 2015 Third international conference on image information processing (ICIIP), Waknaghat, pp 526–530. https://doi.org/10.1109/ICIIP.2015.7414829
AI-Based Digital Marketing Strategies—A Review B. R. Arun Kumar
Abstract Artificial Intelligence (AI) techniques can be applied to customer data, which can then be analyzed to anticipate customer behaviour. AI, big data and advanced analytics techniques can handle both structured and unstructured data efficiently, with greater speed and precision than regular computing technology, and this drives Digital Marketing (DM). AI techniques make it possible to construe emotions and connect like a human, which has led prospective AI-based DM firms to regard AI as a 'business advantage'. Marketers being data rich but insight poor is no longer inevitable, thanks to AI tools that optimize marketing operations and effectiveness. This paper highlights the significance of applying AI strategies, using machine learning techniques, to reach the customer effectively in terms of understanding their behaviour and finding their expectations about product features, operations, maintenance, delivery, etc. It highlights that such strategies move digital marketing towards customer need-based business. Keywords Artificial intelligence · Machine learning · Digital humans · Chatbots · Digital marketing strategies
1 Introduction Digital Marketing (DM) involves promotional efforts that use an electronic device or the Internet, utilizing digital channels such as search engines, social media, email and websites. DM, which uses electronic devices and the Internet to connect to current and prospective customers, can also be denoted as 'online marketing', 'Internet marketing' or 'web marketing'. Online marketing strategies implemented using the Internet and its related communicating hardware/software devices/technologies can be referred to as digital marketing. B. R. Arun Kumar (B) Department of Master of Computer Applications, BMS Institute of Technology and Management (Affiliated to Vivesvaraya Technological University, Belagavi), Doddaballapura Main Road, Avalahalli, Yelahanka, Bengaluru, Karnataka 560064, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_70
DM implementation broadly involves the following steps as presented in Fig. 1 [1]: Fig. 1 Digital marketing methodology [1]
Fig. 2 Planning a digital marketing strategy (Source Ref. [1])
Redefining the strategy is essential to broaden the reach of the brand whenever a new product/service is introduced. Goal definitions may be re-established to build brand awareness and goodwill among customers using digital tools, and changes in goals/strategies demand changes in the action plan for practical implementation on digital platforms (Fig. 2). Reaching customers using the Internet, electronic gadgets such as smartphones, social media and search engines, and understanding customer behaviour and preferences by applying analytic tools and analyzing their results, make up a comprehensive, emerging and dynamic domain; Digital Marketing (DM) is quite different from traditional marketing. Several studies have projected that 85% of customer-business relationships will be maintained using AI tools [2], and the AI market is estimated to be worth $9.88 billion by 2022. Coviello, Milley and Marcolin define e-Marketing as 'Using the Internet and other interactive technologies to create and mediate dialogue between the firm and identified customers'. DM is the broad term that makes use of different marketing strategies/tools, namely website, email, Internet, content, video, smartphone, PPC advertising and SMS messaging. Along with these digital strategies/tools, the following basic guidelines, which are the core of DM, are worth recalling as essential guidelines for starting with DM.
1.1 Essential Guidelines for Starting DM It is difficult and challenging to get a particular business website top-ranked on the Search Engine Result Page (SERP) among nearly 14 billion searches per month across the globe, so all DM strategies should be optimized, including social media
marketing, PPC and other DM tasks (https://www.educba.com/seo-in-digital-marketing/). To be successful in DM, the following tips, depicted in Fig. 3, are essential for beginners. The Search Engine Optimization (SEO) technique, if adopted, improves the search engine ranking of the website/business. SEO strategy is the key that positions the website during critical business activities such as the buying/selling process, while keeping the user experience in consideration. SEO is the process of finding and driving the customer base towards the business and the company among the countless e-businesses working with a DM strategy. Search engine advertising is a form of the Pay Per Click (PPC) model of Internet marketing which promotes a growing customer base and generates leads at optimized cost. Conversion Rate Optimization (CRO) increases the percentage of customers who visit the website and complete a specific action, and improves customer satisfaction. Higher conversion rates are better, as they increase ROI, user base, user experience and lead generation and reduce acquisition cost. Web analytics tools also enable us to understand user behaviour and obtain valuable marketing intelligence. To ensure that a particular brand is well ranked, the following points should be pondered apart from the initial tips for beginners: other recently preferred considerations are domain naming and optimizing results for the experience on desktop, mobile and tablet. SEO strategies and social media marketing need to go hand in hand. The ultimate
Fig. 3 Essential tips for beginners of DM
aim of content marketing strategies is to obtain profitable customer action. It should be noted that 86% of businesses today use content marketing, and content plays a significant role in virtually all marketing. It is necessary to do content analytics to know whether the content is useful, needs changes or needs optimization from the user and business perspectives. Mobile Marketing (MM) strategies are comparatively effective, even though email marketing and digests are still required. The new MM strategies are interactive, such as Location-Based Services (LBS), augmented reality mobile campaigns, 2D barcodes and GPS messaging, while text messaging and display-based campaign techniques continue to be adopted. Digital marketing professionals need to plan and use DM tools. Any DM planning depends on 'the marketing environment such as demography, geography, psychographic and behavioural analysis. Digital marketing is based upon Internet macro and microenvironments'. One of the major innovations in DM is to apply Machine Learning (ML) [3] and deep learning [4] techniques to make strategies/tools more effective, and they have to be adopted as per the business needs. The next section analyses the role of Artificial Intelligence (AI) and ML in DM.
1.2 Role of AI, ML/DL in DM Despite enhanced digital marketing [5] strategies being in place, their efficiency can be improved using contemporary technologies such as AI to understand emotions and behaviour and to respond to human customers' queries. AI computing could optimize DM strategies at all cognitive levels. Machine learning, teaching the machine to learn, is a subset of AI that can offer customized inputs for marketing specialists. DL is a subclass of ML comprising enormously large neural networks and an immense pool of algorithms that can replicate human intelligence. The direct answer yielded by Google is driven by ML, and the 'people also ask' section it returns is powered by DL. Google is continuously learning and reflecting human intelligence without the need for humans to feed all the answers into its enormous database.
1.2.1 Basic Definitions
In a nutshell, the following widely cited definitions are illuminating. AI, also referred to as machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence exhibited by humans and animals. AI [6] is quite often used to describe the ability developed in a machine to clone the behaviour of the human mind through learning and problem solving [7]. Definition of ML: 'A computer programme is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E' [4].
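As a concrete reading of this definition in a DM setting (the data and feature names below are hypothetical, not taken from the cited sources): the experience E is a table of past campaign interactions, the task T is predicting whether a customer will click the next offer, and the performance measure P is accuracy on held-out records.

```python
# Hypothetical illustration of the (T, P, E) definition for a DM task:
# E = past campaign records, T = predict whether a customer clicks, P = accuracy.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy experience E: [age, past_purchases, emails_opened] -> clicked (1) or not (0)
X = [[25, 1, 3], [40, 5, 10], [31, 0, 1], [52, 8, 12], [23, 2, 0], [45, 6, 7]]
y = [0, 1, 0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0, stratify=y)
model = LogisticRegression().fit(X_train, y_train)          # learning from E
print("P (accuracy on task T):", accuracy_score(y_test, model.predict(X_test)))
```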
Fig. 4 Performance comparison of ML with DL (Source Ref. [4])
Definition of DL: 'Deep learning is a particular kind of machine learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones' [4]. DL algorithms perform better and are more suitable when the data is large and high-end machines are available, whereas ML algorithms can work on small data with low-end machines, as shown in Fig. 4. DL algorithms involve massive matrix multiplications that need hardware accelerators [8]. Better ML performance requires accurate identification and extraction of features, whereas DL algorithms can learn high-level features from the data, which makes DL unique and reduces the analysis and development of a feature extractor for every problem. When ML algorithms are used, the given problem is broken into parts and solved, whereas DL adopts an end-to-end approach. The DL method takes longer to learn than the ML method. Both ML and DL are applied in various fields, including DM and medical diagnosis, and both the application of existing techniques and research trends have exploded in industry as well as academia. The adoption of ML and DL analytical tools offers a competitive edge in DM [9].
1.3 Research Methodology The paper analyses the DM and AI-ML/DL role in stimulating the business by identifying and responding to the customer’s taste. The paper highlights the role of artificial intelligence, ML and DL tools in digital marketing. This paper is narrative in nature; information and examples denoted are based on the references available
at secondary sources. The study motivates business enterprises to adopt AI-ML/DL techniques to optimize their digital marketing strategies.
1.4 Research Objective This research is carried out with the primary objective of exploring AI-based DM and the significance of contemporary technologies such as AI, big data, data analytics and deep learning for marketing products and services.
2 Impact of AI and ML on DM DM strategies/tools based on AI-ML can streamline the market, optimizing both business profit and the satisfaction of the user experience. The future of DM depends on the ability of DM professionals to apply AI-ML techniques to implement DM strategies effectively. AI and ML are separate yet complementary to each other. As mentioned in [10]: 'AI aims to harness certain aspects of the "thinking" mind, Machine Learning (ML) is helping humans solve problems in a more efficient way. As a subset of AI, ML uses data to teach itself how to complete a process with the help of AI [11] capabilities'. AI-ML tools [12] can bring out hidden business intelligence from the given consumer data, which streamlines complex DM problems. Although it is difficult to draw valid conclusions on all the implications of ML techniques, it is known that ML has started creating an impact on DM [3]. This is because ML tools can analyze extremely large datasets and present visualizations as required by the DM team for taking decisions to streamline strategies. By applying ML tools, analytics outcomes enable them to understand their customers in depth. It may be noted that 75% of DM strategy development has by now adopted AI functionality, and 85% of customer interactions can be effectively managed without human intervention [10]. This implies that ML tools can streamline DM strategies and that the business can align with future AI-ML [10] trends. Several research works and articles have upheld artificial intelligence, ML and DL-based approaches for digital marketing, including [13, 2]. It is found that 90% of sales professionals expect a substantial impact of AI on sales and marketing [14].
Table 1 LinkedIn table for content marketing
Sl. no.   Particulars                                       % recommended
1         Audience relevance                                58
2         Engaging and compelling storytelling              57
3         The ability to trigger an action or response      54
2.1 ML Tools Enable DM AI in general, and particularly the relevant ML/DL techniques applied in DM [15], involves utilizing data, content and online channels, which assures increased productivity and a better understanding of targeted customers. It is worth noting how exactly ML tools enable DM to be streamlined. ML tools can improve the relevance of content marketing: all types of businesses create content in the form of blogs, testimonial videos and recorded webinars, but content becomes truly effective only if it follows the criteria in the LinkedIn table shown in Table 1. ML tools can analyze your content and report against these requirements. ML tools can boost PPC campaigns by providing metrics to drive the business, and SEO by giving insights into the content rather than just specific keywords. An ML chatbot [16] is a virtual robot capable of conversing with humans through text, voice commands, or sometimes both. Many big-brand organizations have already adopted ML chatbots, for example Apple's Siri feature and Facebook. A chatbot can speak to the targeted customer at a personal level and collect personal information, behaving like a virtual ambassador. ML techniques can process large datasets and instantly create personalized content drips for users. The investigation of complex DM problems can be much faster than ever before by applying ML tools and chatbots, leading to a meaningful personalized relationship with targeted customer involvement. AI-ML has created disruptions and is transforming DM into a different technological landscape. AI-ML-based marketing models can utilize the relevant buying patterns and conduct of targeted customers, leading promotion teams to harness the power of AI in their businesses. Figure 5 indicates models based on ML.
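As an illustration of this kind of ML-based marketing model (the features, data and threshold below are hypothetical, not drawn from the cited sources), a simple purchase-propensity scorer can rank customers for a targeted campaign:

```python
# Hypothetical purchase-propensity model for targeted campaigns (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy customer features: [visits_last_30d, emails_opened, cart_value]
X = np.array([[3, 1, 20.0], [12, 6, 150.0], [1, 0, 0.0],
              [8, 4, 90.0], [15, 9, 210.0], [2, 1, 10.0]])
y = np.array([0, 1, 0, 1, 1, 0])                  # 1 = purchased after last campaign

model = LogisticRegression().fit(X, y)

# Score a fresh audience and keep only high-propensity customers for the campaign.
audience = np.array([[10, 5, 120.0], [2, 0, 5.0]])
scores = model.predict_proba(audience)[:, 1]
target_list = audience[scores > 0.5]
print(scores, target_list)
```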
2.2 Digital Humans in DM AI-based digital humans are already communicating successfully, giving appropriate responses to customer queries. According to Jody Boshoff of FaceMe, 'non-human interactions between the customer and businesses are going to be 85% by 2025'. This is because reports show that at present 70% of customers transact using digital services, which cut across different sectors
Fig. 5 Some ML-based marketing models. (Source Ref. [3])
from telecommunications to banking [17, 18]. Since customers enjoy intermingling with digital humans, AI-based DHs can impact digital marketing: they work efficiently, keep learning from experience and reduce costs as well. When digital services, especially AI-powered digital humans, are developed to meet customer expectations, customers prefer them too. A 'Digital Human' is the embodiment of an underlying AI-based chatbot or digital assistant with additional capabilities such as emotional intelligence. Like a natural human, it can connect to individual natural humans, understand tone, expression and body language, and respond with relevance, giving appropriate responses. For example, patients can take assistance from digital humans to understand their medical problems and how to follow a prescription and diet, with individual empathy [19]. A visually realistic AI machine in a human avatar can blink, wink, move its lips, smile and treat people with empathy; an intelligent corporate digital human is highly convincing because of its modes of persuasion in handling customer-centric services. Compared with chatbots, DHs can convince with logos, ethos and pathos. Digital assistants work 24/7 and never get bored or tired. DHs are a combination of multiple advanced technologies that can understand the user's emotion, mood and personality [19] (Fig. 6).
Fig. 6 AI digital humans in customer-centric service. (Source Ref. [19])
2.3 Understanding the Importance/Advantages of AI-ML in DM DM and storytelling features are strengthened by AI in achieving market specialization and reaching customers. AI innovations have enabled customers to interact with technology, data, brands, products and services. Giant companies like Google are integrating AI [20] into their existing products using speech recognition and language understanding (Table 2). Table 2 AI-ML advantages
Sl. no.   Importance/advantages
1         Leveraging big data to get a better value
2         Robust customer relationships
3         Precise market predictions and sales forecasts
4         Data-driven, optimized marketing campaigns
5         Enhanced Marketing Qualified Leads (MQL)
6         Further Sales Qualified Leads (SQL)
7         Superior insights to improve positioning
8         360-degree view of customer needs
9         Exploiting more business openings
10        Fine-tuned propensity models to create focused marketing strategies
11        Buyer satisfaction with improved user experience
12        Reducing marketing costs for better ROI
Fig. 7 AI-ML-based future marketing driving forces. (Source Ref. [3])
Table 3 Mistakes to be avoided during DM streamlining [3]
Sl. no.   Mistakes to be avoided
1         Affecting generic and broad customer characters
2         Working with inadequate customer data
3         Neglecting the performance of previous marketing campaigns
4         Not addressing regular and returning customers
5         Generating and disseminating irrelevant content
6         Too much dependence on gut feeling
The global machine learning market is expected to grow from $1.41 billion in 2017 to $8.81 billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% [3]. The forthcoming DM includes AI-ML-based smart automation solutions covering the details given in Fig. 7. AI-ML-based marketing strategies must avoid the mistakes shown in Table 3. It can be concluded that ML/DL, based on extensive data processing, offers the information essential for the decision-making process of marketing specialists. The application of ML-driven tools in digital marketing [21, 22] introduces various new challenges and opportunities. The implementation of ML-based market analytical tools has no obvious disadvantages [9].
3 Conclusion DM strategies continuously need to be innovated in line with AI-ML techniques to keep up with the market and obtain a high ROI. AI has several tools that can
boost DM: AI-assisted professional website development, audience selection, content crafting services, creating and customizing content, chatbots, customer service, email marketing, predictive analysis and marketing, and AI recommendations for engaging targeted customers. Future developments in AI, coupled with ML and DL tools, will address the concerns or limiting factors, if any, of the current tools.
References
1. https://www.deasra.in/msme-checklist/digital-marketing-checklist/?gclid=EAIaIQobChMIrLf4mJzM6wIV3sEWBR1f0AXnEAAYASAAEgIENPD_BwE
2. https://www.toprankblog.com/2018/03/artificial-intelligence-marketing-tools/
3. https://www.grazitti.com/blog/the-impact-of-ai-ml-on-marketing/
4. https://www.analyticsvidhya.com/blog/2017/04/comparison-between-deep-learning-machine-learning/
5. https://quanticmind.com/blog/predictive-advertising-future-digital-marketing/
6. Artificial Intelligence in Action: Digital Humans, Monica Collier, Scott Manion, Richard de Boyett, May 2019. https://aiforum.org.nz/wp-content/uploads/2019/10/FaceMe-Case-Study.pdf
7. Artificial intelligence. https://en.wikipedia.org/wiki/Artificial_intelligence
8. Talib MA, Majzoub S, Nasir Q et al (2020) A systematic literature review on hardware implementation of artificial intelligence algorithms. J Supercomput. https://doi.org/10.1007/s11227-020-03325-8
9. Miklosik A, Kuchta M, Evans N, Zak S (2019) Towards the adoption of machine learning-based analytical tools in digital marketing. https://doi.org/10.1109/ACCESS.2019.2924425
10. https://digitalmarketinginstitute.com/blog/how-to-apply-machine-learning-to-your-digitalmarketing-strategy
11. https://www.smartinsights.com/managing-digital-marketing/how-ai-is-transforming-the-future-of-digital-marketing/
12. https://www.superaitools.com/post/ai-tools-for-digital-marketing
13. https://www.researchgate.net/publication/330661483_Trends_in_Digital_Marketing_2019/link/5c4d3d6f458515a4c743467e/download
14. Top Sales & Marketing Priorities for 2019: AI and Big Data, Revealed by Survey of 600+ Sales Professionals. Business Wire. https://www.businesswire.com/news/home/20190129005560/en/Top-Sales-Marketing-Priorities-2019-AI-Big
15. https://www.educba.com/seo-in-digital-marketing/
16. https://www.prnewswire.com/news-releases/machine-learning-market-worth-881-billionusd-by-2022-644444253.html
17. Digital Humans; the rise of non-human interactions, Jody shares. https://www.marketing.org.nz/Digital-Humans-DDO18
18. Customers' lives are digital-but is your customer care still analog? Jorge Amar and Hyo Yeon, June 2017. https://www.mckinsey.com/business-functions/operations/our-insights/customers-lives-are-digital-but-is-your-customer-care-still-analog
19. In 5 years, a very large population of digital humans will have hundreds of millions of conversations every day, by Cyril Fiévet. https://bonus.usbeketrica.com/article/in-5-years-a-very-large-population-of-digital-humans-will-have-hundreds-of-millions-of-conversations-every-day
20. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8746184
21. https://cio.economictimes.indiatimes.com/news/strategy-and-management/ai-digital-marketing-key-skills-to-boost-growth/71682736
22. https://www.singlegrain.com/seo/future-of-seo-how-ai-and-machine-learning-will-impactcontent/
NoRegINT—A Tool for Performing OSINT and Analysis from Social Media S. Karthika, N. Bhalaji, S. Chithra, N. Sri Harikarthick, and Debadyuti Bhattacharya
Abstract A variety of incidents occur, and the Open-Source Intelligence (OSINT) tools on the market are capable of collecting only specific target data, and even that only to a limited extent. Our tool, NoRegINT, has been specifically developed to collect a theoretically infinite amount of data based on keyword terms and to draw a variety of inferences from it. The tool is used to gather information in a structured format about the Pulwama attacks and to draw inferences such as the volume of data, the general sentiment of people about the incident and the impact of a particular hashtag. Keywords Open-source intelligence · Application programming interface · Spiderfoot · Maltego · Social media · Sentiment analysis
1 Introduction Open-Source Intelligence (OSINT) is the method of obtaining data and other relevant information about a specific target, usually but not limited to a person, E-mail ID, phone numbers, IP addresses, location, etc. It makes use of openly available information generally without the direct involvement of said target. OSINT is generally S. Karthika (B) · N. Bhalaji · S. Chithra · N. Sri Harikarthick · D. Bhattacharya Department of Information Technology, SSN College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India e-mail: [email protected] N. Bhalaji e-mail: [email protected] S. Chithra e-mail: [email protected] N. Sri Harikarthick e-mail: [email protected] D. Bhattacharya e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_71
achieved through many tools which automate certain processes, although a preliminary analysis can be performed manually. In this paper, a new tool is proposed to collect data from various sources such as Twitter, Reddit and Tumblr and draw inferences from it [1]. It is a well-known fact that social media is now an integral part of everybody's life, and more often than not people post a variety of things on their social media accounts, such as details about their personal lives, their opinions about a particular entity or incident and pictures of themselves. This makes data collection from such sources an extremely important task that essentially forms the core of fields such as Open-Source Intelligence (OSINT) [2, 3]. The motivation behind the proposed work is that, although a large number of tools have been developed in the field of OSINT and have been very useful, they have various shortcomings. A large number of them are simple wrappers built around an Application Programming Interface (API) provided by the particular social media website [4, 5]. There is always an upper limit on the number of requests and therefore, by extension, on the amount of data that can be collected. Other web scraping tools also limit the amount of data collected because of the infinite scrolling used in these web pages. Furthermore, the data is often unstructured and not followed by any sort of analysis, so these tools leave a lot of work to the end user and need to be followed by extensive cleaning and analysis [6, 7]. In this paper, a tool is proposed that is built upon web scraping to collect publicly available data from social media websites without difficulties such as a cap on the amount of data or dependencies on any sort of API. The tool not only overcomes the problem set by infinite scrolling web pages, but also provides, after data collection, temporal analysis and sentiment analysis that can be used to study the social media response to incidents such as 9/11 or the Pulwama attack. The data can be of any type: textual, pictorial, etc. The targeted social media websites in this paper are Twitter, Reddit and Tumblr [8–10]. In the remainder of the paper, Sect. 2 elaborates on related work in this area, and Sect. 3 discusses the methodology. Section 4 analyses the results and compares them with other available APIs, and Sect. 5 draws conclusions and discusses the future scope of the proposed tool.
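To illustrate the kind of post-collection analysis referred to here, the following is a minimal sketch; it is an assumption, not the authors' exact implementation, and the JSON field names ("text", "timestamp") and the choice of the VADER sentiment library are hypothetical.

```python
# Illustrative post-collection analysis: VADER sentiment plus per-day post counts.
import json
from collections import Counter
from datetime import datetime
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def analyse(json_path):
    with open(json_path) as f:
        posts = json.load(f)                       # list of scraped posts
    analyzer = SentimentIntensityAnalyzer()
    sentiments = [analyzer.polarity_scores(p["text"])["compound"] for p in posts]
    per_day = Counter(datetime.fromtimestamp(p["timestamp"]).date() for p in posts)
    avg = sum(sentiments) / len(sentiments) if sentiments else 0.0
    return avg, per_day                            # overall mood and temporal volume

# Example: avg_sentiment, daily_volume = analyse("pulwama_tweets.json")
```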
2 Related Works This section discusses the various APIs and wrappers that exist in the research area of OSINT.
2.1 Twitter API The Twitter API is an interface provided by the company itself to support the integration of its service into other applications. It is not a data collection tool but rather an alternative way to access one's account and perform actions from there. Although one can post content and perform various actions such as following and unfollowing users, it is not an effective data collection tool because the number of requests is limited, and its usage requires registration as well as proficient coding knowledge on the end user's part [2, 11].
2.2 Reddit API Wrapper (PRAW) This is an openly available tool built upon the Reddit developers' API, which can be used to perform various activities such as retrieving posts and their metadata, posting content, upvoting posts and following other users [12]. However, using this wrapper involves the hassle of registering as a developer with Reddit and requires the end user to be familiar with programming concepts and the usage of OAuth. This prevents the wrapper from being a simple plug-and-play OSINT tool [13].
2.3 Spiderfoot Spiderfoot, although marketed as an OSINT tool, is commonly used for scanning targets based on IP addresses and is only capable of providing the user with raw data about a specified target IP or domain. Although it uses a large pool of resources to collect the data, it is not capable of drawing any inferences from the data collected.
2.4 Maltego Maltego is a commonly used data mining tool, mainly used to collect data about specific entities, which may be people, companies or websites. The data is visually represented as a set of connected graphs. Although it is one of the most effective OSINT tools in the market, it is still not the best choice for term-wise data collection or data collection about incidents. It can correlate and connect data, but it cannot draw conclusive inferences from the representations [14].
3 Framework of NoRegINT Tool The proposed NoRegINT tool is designed to overcome existing API-based restrictions such as the limited look-back period (number of days), the amount of fetched content and the number of requests. Figure 1 presents the five major modules, namely the CLI module, Twitter scraper, Reddit scraper, Tumblr image collector and inference module, involved in the framework of the proposed tool.
Fig. 1 A framework of NoRegINT tool
3.1 CLI Module This is the interface used by the end user. It provides a high level of abstraction and does not require any significant programming knowledge on the end user's part. It offers two main functionalities: the collection of data based upon a specified search term, and the inferences that can be drawn from the collected data.
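To make the interaction concrete, the following is a minimal sketch of such a command-line front end using Python's argparse. The sub-command and option names here are illustrative assumptions, not the tool's actual interface.

```python
# Minimal CLI sketch (illustrative names only): one sub-command to collect data
# for a keyword and one to run the inferences on previously collected data.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        prog="noregint",
        description="Keyword-based OSINT collection and inference")
    sub = parser.add_subparsers(dest="command", required=True)

    collect = sub.add_parser("collect", help="Scrape posts for a search term")
    collect.add_argument("keyword", help="Search term, e.g. 'Pulwama'")
    collect.add_argument("--scrolls", type=int, default=3,
                         help="Scroll depth (the fast search uses three levels)")

    infer = sub.add_parser("infer", help="Run volume and sentiment inferences")
    infer.add_argument("keyword", help="Search term whose results should be analysed")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)  # dispatch to the scraper or inference modules would happen here
```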
3.2 Twitter Scraper The Twitter scraper is built upon the popular Python scraping package Beautiful Soup. However, using it with plain HTTP requests limits the amount of data that can be collected. To overcome this limitation, a browser instance is created in the background and JavaScript code is executed to simulate scrolling, which theoretically provides an infinite amount of data. Beautiful Soup then traverses the HTML DOM to access the entities present in the page, and this is used to collect the tweet metadata. The result is stored in JSON format and is then accessed by the inference module.
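A minimal sketch of the scroll-and-scrape idea is shown below, assuming Selenium drives the background browser and Beautiful Soup parses the resulting page. The Firefox driver, the search URL and the CSS selector are placeholder assumptions; the real markup of the site changes frequently.

```python
# Sketch only: simulate scrolling in a background browser, then parse the page.
import json
import time
from urllib.parse import quote

from bs4 import BeautifulSoup
from selenium import webdriver

def scrape_tweets(keyword, scrolls=3):
    driver = webdriver.Firefox()  # background browser instance (assumed driver)
    driver.get("https://twitter.com/search?q=" + quote(keyword))
    for _ in range(scrolls):
        # execute JavaScript to simulate a scroll so more posts are loaded
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)  # give the page time to load new results

    soup = BeautifulSoup(driver.page_source, "html.parser")
    driver.quit()

    tweets = []
    for node in soup.select("article"):  # placeholder selector for a tweet container
        tweets.append({"text": node.get_text(separator=" ", strip=True)})
    return tweets

if __name__ == "__main__":
    results = scrape_tweets("Pulwama")
    with open("twitter_results.json", "w", encoding="utf-8") as fh:
        json.dump(results, fh, ensure_ascii=False, indent=2)  # JSON for the inference module
```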
3.3 Reddit Scraper Similar to the Twitter scraper, this module extracts the metadata of Reddit posts matching a particular keyword and stores it in the same JSON format as the Twitter results. This is then accessed by the inference module.
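The paper does not specify the exact JSON schema; the helper below is one plausible way both scrapers could write their records in a common format, with field names chosen purely for illustration.

```python
# Sketch of a shared JSON record layout for the Twitter and Reddit scrapers.
import json
from datetime import datetime, timezone

def save_posts(posts, source, keyword, path):
    """posts: list of dicts holding scraped text; source: 'twitter' or 'reddit'."""
    records = [
        {
            "source": source,
            "keyword": keyword,
            "text": post.get("text", ""),
            "collected_at": datetime.now(timezone.utc).isoformat(),
        }
        for post in posts
    ]
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)

save_posts([{"text": "example post"}], "reddit", "Pulwama", "reddit_results.json")
```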
3.4 Tumblr Image Collector The Tumblr image collector sends requests to download the images from URLs collected from the web page using Beautiful Soup. These images are then indexed and stored locally for the user to access. This can be useful for accumulating an image dataset for a given problem.
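The download-and-index step might look like the sketch below, assuming the image URLs have already been pulled out of the page with Beautiful Soup; the output directory and file naming are assumptions made here for illustration.

```python
# Sketch: fetch each collected image URL and store it locally under an index.
import os
import requests

def download_images(image_urls, out_dir="tumblr_images"):
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(image_urls):
        response = requests.get(url, timeout=10)
        if response.status_code != 200:
            continue  # skip broken or removed images
        path = os.path.join(out_dir, f"{index:04d}.jpg")  # simplistic: assumes JPEG
        with open(path, "wb") as fh:
            fh.write(response.content)  # raw image bytes
        saved.append(path)
    return saved
```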
3.5 Inference Module
1. Volume of data
(a) Gives insight on the popularity of a topic on different social media
(b) Gives information about the number of Tweets, Reddit posts and images scraped by the system (a minimal sketch of this inference follows the list)
2. Sentiment analysis
(a) Gives the average sentiment value of all the Tweets and Reddit posts
(b) Gives information on the impact of a term on social media
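As a rough illustration of the volume-of-data inference, the snippet below counts the records produced by each scraper. The file names are assumptions for this sketch, not the tool's actual output paths.

```python
# Sketch: report how many items each collector gathered.
import json

def count_items(path):
    try:
        with open(path, encoding="utf-8") as fh:
            return len(json.load(fh))
    except FileNotFoundError:
        return 0  # nothing collected for this source yet

volume = {
    "tweets": count_items("twitter_results.json"),
    "reddit_posts": count_items("reddit_results.json"),
    "tumblr_images": count_items("tumblr_index.json"),
}
print(volume)  # e.g. {'tweets': 99, 'reddit_posts': 100, 'tumblr_images': 2}
```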
4 Results and Discussion The proposed tool NoRegINT was tested with the keyword 'Pulwama'. Figure 2 presents the CLI module used to obtain the keyword input from the user. After obtaining the keywords, the scraping process begins with a new Selenium browser instance opened on the built URL. Figure 3 illustrates the automatic loading of posts during scraping, and Figs. 4 and 5 show the JSON files generated by the Reddit and Twitter scrapers, respectively. Figure 6 summarises the results achieved using the tool. About 99 tweets, 100 Reddit posts and two Tumblr photos were collected with the fast search method (three levels of scrolling). The tool uses the VADER sentiment analysis package, in which a compound score in the interval −0.05 to +0.05 is considered neutral, an interval of +0.05 to +1 represents positive sentiment, and an interval of −1 to −0.05 represents negative sentiment. The tool obtained an average sentiment of −0.28 on the Twitter data and −0.16 on the Reddit data. This sentiment analysis of the data collected by the NoRegINT tool shows that recent posts about the term and its hashtag have been negative on average. Figure 7 details the percentage of sentimental tweets in the repository built by the NoRegINT tool. A comparison based on standard features was performed to determine the performance of the APIs and the OSINT tools. Posted content cannot be obtained in Maltego and Spiderfoot, whereas it is scraped and stored by NoRegINT in JSON format.
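For reference, the sentiment thresholds described above can be applied with the standalone vaderSentiment package as in the sketch below; the input list is a stand-in for the scraped posts, not real data from the experiment.

```python
# Sketch: average VADER compound score and the neutral/positive/negative cut-offs
# stated in the paper (-0.05 to +0.05 is treated as neutral).
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def average_compound(texts):
    analyzer = SentimentIntensityAnalyzer()
    scores = [analyzer.polarity_scores(text)["compound"] for text in texts]
    return sum(scores) / len(scores) if scores else 0.0

def label(compound):
    if compound > 0.05:
        return "positive"
    if compound < -0.05:
        return "negative"
    return "neutral"

posts = ["example scraped tweet", "another post about the keyword"]
avg = average_compound(posts)
print(avg, label(avg))  # e.g. an average of -0.28 would be labelled 'negative'
```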
Fig. 2 CLI module
Fig. 3 Reddit scraper
Fig. 4 Reddit scraper results
Fig. 5 Twitter scraper results
Fig. 6 Results from inference module
The Reddit wrapper, the Twitter API and Maltego restrict the amount of data scraped (roughly 3200 tweets, for example), while NoRegINT does not restrict the amount of content retrieved from social media. The APIs also have a 7-day range limit, whereas NoRegINT can gather arbitrarily old information from social media posts. None of the tools/APIs in question performs sentiment analysis on the gathered data, while NoRegINT does, reporting the average compound value and a graph depicting the percentage of sentimental tweets. None of the tools has built-in inferencing methods, while NoRegINT reports the number of tweets, posts and photos scraped from Twitter, Reddit and Tumblr, respectively (Fig. 8).
Fig. 7 Sentiment analysis results
Fig. 8 Comparison of NoRegINT tool with the existing APIs
5 Conclusion The authors of this paper have addressed problems such as the restriction on the volume of data, the 7-day limit for content fetching and the lack of built-in inferencing in existing APIs and OSINT tools. The proposed tool, NoRegINT, overcomes these problems through features such as automatic scrolling and sentiment analysis. The scrolling facilitates unlimited fetching of data, through which the authors were able to build a complete, restriction-free repository. The system is versatile in its keyword input and can automatically generate a sentiment summary for the given keyword. The tool can be further developed to offer more functionality, such as analysing streams of posts and photos, and can also be extended to other popular or growing social media websites.
References
1. Lee S, Shon T (2016) Open source intelligence base cyber threat inspection framework for critical infrastructures. In: 2016 Future technologies conference (FTC). IEEE, pp 1030–1033
2. Best C (2012) OSINT, the Internet and privacy. In: EISIC, p 4
3. Noubours S, Pritzkau A, Schade U (2013) NLP as an essential ingredient of effective OSINT frameworks. In: 2013 Military communications and information systems conference. IEEE, pp 1–7
4. Sir David Omand JB (2012) Introducing social media intelligence. Intell Natl Secur 801–823
5. Steele RD (2010) Human intelligence: all humans, all minds, all the time
6. Bacastow TS, Bellafiore D (2009) Redefining geospatial intelligence. Am Intell J 27(1):38–40; Best C (n.d.) Open source intelligence; T.R.I.O. (2017) Background/OSINT. Retrieved 23 Apr 2018, from www.trioinvestigations.ca: https://www.trioinvestigations.ca/backgroundosint
7. Christopher Andrew RJ (2009) Secret intelligence: a reader. Routledge Taylor & Francis Group, London
8. Garzia F, Cusani R, Borghini F, Saltini B, Lombardi M, Ramalingam S (2018) Perceived risk assessment through open-source intelligent techniques for opinion mining and sentiment analysis: the case study of the Papal Basilica and sacred convent of Saint Francis in Assisi, Italy. In: 2018 International Carnahan conference on security technology (ICCST). IEEE, pp 1–5
9. Michael Glassman MJ (2012) Intelligence in the internet age: the emergence and evolution of Open Source Intelligence (OSINT). Comput Hum Behav 28(2):673–682
10. Stottlemyre SA (2015) HUMINT, OSINT, or something new? Defining crowdsourced intelligence. Int J Intell Counter Intell 578–589
11. Gasper Hribar IP (2014) OSINT: a "Grey Zone"? Int J Intell Counter Intell 529–549
12. Intelligence Community Directive Number 301 (2006) National Open Source Enterprise, 11 July 2006
13. Neri F, Geraci P (2009) Mining textual data to boost information access in OSINT. In: 2009 13th International conference on information visualisation. IEEE, pp 427–432
14. Pietro GD, Aliprandi C, De Luca AE, Raffaelli M, Soru T (2014) Semantic crawling: an approach based on named entity recognition. In: 2014 IEEE/ACM International conference on advances in social networks analysis and mining (ASONAM 2014). IEEE, pp 695–699
Correction to: Effective Multimodal Opinion Mining Framework Using Ensemble Learning Technique for Disease Risk Prediction V. J. Aiswaryadevi, S. Kiruthika, G. Priyanka, N. Nataraj, and M. S. Sruthi
Correction to: Chapter “Effective Multimodal Opinion Mining Framework Using Ensemble Learning Technique for Disease Risk Prediction” in: S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_67 In the original version of chapter 67, the following belated correction has been incorporated: The author name “G. S. Priyanka” has been changed to “G. Priyanka”. The chapter and book have been updated with this change.
The updated version of this chapter can be found at https://doi.org/10.1007/978-981-33-4305-4_67
Retraction Note to: Intrusion Detection Using Deep Learning Sanjay Patidar and Inderpreet Singh Bains
Retraction Note to: Chapter “Intrusion Detection Using Deep Learning” in: S. Smys et al. (eds.), Inventive Computation and Information Technologies, Lecture Notes in Networks and Systems 173, https://doi.org/10.1007/978-981-33-4305-4_10 The Series Editor and the Publisher have retracted this chapter. An investigation by the Publisher found a number of chapters, including this one, with various concerns, including but not limited to compromised editorial handling, incoherent text or tortured phrases, inappropriate or non-relevant references, or a mismatch with the scope of the series and/or book volume. Based on the findings of the investigation, the Series Editor therefore no longer has confidence in the results and conclusions of this chapter. The authors have not responded to correspondence regarding this retraction.
The retracted version of this chapter can be found at https://doi.org/10.1007/978-981-33-4305-4_10
Author Index
A Abhinay, K., 529 Adhikari, Surabhi, 39 Agrawal, Jitendra, 157 Ahuja, Sparsh, 751 Aiswaryadevi, V. J., 925 Aleksanyan, G. K., 729 Aravind, A., 529 Arif Hassan, Md, 869 Arun Kumar, B. R., 957 Atul Shrinath, B., 271 Ayyasamy, A., 127
B Bains, Inderpreet Singh, 113 BalaSubramanya, K., 235 Bansal, Nayan, 39 Baranidharan, V., 365 Basnet, Vishisth, 751 Bawm, Rose Mary, 883 Behera, Anama Charan, 739 Behera, Bibhu Santosh, 739 Behera, Rahul Dev, 739 Behera, Rudra Ashish, 739 Bhalaji, N., 971 Bhati, Amit, 53 Bhattacharya, Debadyuti, 971
C Channabasamma, 395 Chhajer, Akshat, 781 Chile, R. H., 703 Chithra, S., 971 Chopade, Nilkanth B., 911 Chung, Yun Koo, 551
D Dalin, G., 15 De, Debashis, 333 Dilhani, M. H. M. R. S., 647 Dilum Bandara, H. M. N., 567 Dushyanth Reddy, B., 851
E Eybers, Sunet, 379
G Gaba, Anubhav, 39 Ganesh Babu, C., 445, 481 Gautam, Shivani, 285 Ghosh, Atonu, 333 Gokul Kumar, S., 445 Gorbatenko, N. I., 729 Gour, Avinash, 305 Gouthaman, P., 763, 781 Graceline Jasmine, S., 189 Gupta, Akshay Ramesh Bhai, 157 Gupta, Anil, 203 Gupta, Anmol, 763 Gupta, Sachin, 315 Guttikonda, Geeta, 103
H Haldorai, Anandakumar, 851 Harish, Ratnala Venkata Siva, 349 Hettige, Budditha, 691 Hettikankanama, H. K. S. K., 601 Hiremath, Iresh, 405 Hoang, Vinh Truong, 299 Hossain, Sohrab, 883
I Indirani, M., 831 Ishi, Manoj S., 143
J Jagtap, Swati, 911 Jahnavi, Ambati, 851 Jain, Rachna, 677 Jani Anbarasi, L., 189 Jawahar, Malathy, 189 Jeeva Padmini, K. V., 567 Jeyaboopathiraja, J., 795 Jha, Aayush, 39 Jha, Avinash Kumar, 39 Jinarajadasa, G. M., 719 John Aravindhar, D., 463 John Deva Prasanna, D. S., 463 Joshi, Abhijit R., 221 Joshi, Shashank Karthik D., 235 Jotheeswar Raghava, E., 529 Jude, Hemanth, 677
K Kailasam, Siddharth, 405 Kamrul Hasan, Mohammad, 869 Karthika, S., 971 Karthikeyan, M. M., 15 Karthik, S., 247 Karthik, V., 189 Karunananda, Asoka S., 583, 691 Katamaneni, Madhavi, 103 Katsupeev, A. A., 729 Kikkuri, Vamsi Krishna, 173 Kiruthika, S., 925 Kombarova, E. O., 729 Kommineni, Madhuri, 851 Koppar, Anant, 405 Kousik, N. V., 813 Krishanth, N., 271 Krishna, Harsh, 763 Kumara, Kudabadu J. C., 647, 665 Kumar, Anuj, 537 Kumar, N. S., 859
L Li, Hengjian, 633 Lima, Farzana Firoz, 883 Lingaraj, N., 247 Litvyak, R. K., 729 Liyanage, S. R., 719
M Mahaveerakannan, R., 1, 813 Maheshwari, Vikas, 305 Majumder, Koushik, 333 Malipatil, Somashekhar, 305 Manickavasagam, L., 271 Manjunathan, A., 481 Manusha Reddy, A., 395 Marathe, Amit, 221 Maria Priscilla, G., 795 Maruthi Shankar, B., 445 Mathankumar, M., 481 Matta, Priya, 751 Mehta, Gaurav, 285 Mishra, Bharat, 421 Mittra, Tanni, 883 Mohanrajan, S. R., 271 Mohanty, Prarthana, 739 Muqith, Munim Bin, 897
N Nagalakshmi, Malathy, 859 Nagrath, Preeti, 677 Nair, Jayashree, 173 Narendra, Modigari, 189 Nataraj, N., 925 Naveen, K. M., 365 Nayar, Nandini, 285 Niloy, Md. Dilshad Kabir, 897 Nithish Sriman, K. P., 365
O Olana, Mosisa Dessalegn, 551
P Pandala, Madhavi Latha, 103 Pandey, Sanidhya, 763 Pant, Bhasker, 751 Parveen, Suraiya, 259 Patidar, Sanjay, 113 Patil, Ajay B., 703 Patil, Annapurna P., 513 Patil, J. B., 143 Pavan Karthik, D. V. S., 945 Pavel, Monirul Islam, 897 Perera, G. I. U. S., 567 Pramanik, Subham, 763 Pranavanand, S., 945 Prathap, R., 365 Praveen Kumar, N., 365 Premjith, B., 81 Priyanka, G., 925
Priyanka, G. S., 445 Priyatham, Manoj, 497 R Rajesh Kumar, E., 529 Rajesh Kumar, P., 349 Rajesh Sharma, R., 551 Rakesh, K. S. S., 739 Ramachandran, Raji, 935 Ramkumar, M., 445, 481 Ram, Shrawan, 203 Ranasinghe, D. D. M., 615 Ranjith, R., 271 Rathnayake, Kapila T., 601 Raveendran, Aswathi, 935 Ravichandran, Gopika, 935 Ravi Kiran Varma, P., 93 Razzak, Razia, 897 Rizvi, Ali Abbas, 221 Ruthala, Suresh, 93 Rzevski, George, 691 S Sabena, S., 127 Sachdeva, Ritu, 315 Sai Aparna, T., 81 Saini, Dharmender, 677 Sai Ramesh, L., 127, 435 Saranya, M. D., 247 Sarath Kumar, R., 445, 481 Sarma, Dhiman, 883 Sarwar, Tawsif, 883 Satheesh Kumar, S., 247 Selvakumar, K., 435 Sengupta, Katha, 897 Setsabi, Naomi, 379 Shalini, S., 513 Shankar, S., 831 Shanthini, M., 63 Sharma, Nitika, 677 Sharma, Tanya, 859 Shrinivas, S., 235 Shukur, Zarina, 869 Shwetha, N., 497 Shyamali Dilhani, M. H. M. R., 665 Silva, R. K. Omega H., 567 Silva, Thushari, 583 Simran, K., 81 Singh, Ashutosh Kumar, 421 Singh, Bhavesh, 221 Singh, Poonam, 285 Sivaram, M., 813
Sivasankar, P., 463 Slathia, Shaurya Singh, 781 Sophia Reena, G., 27 Sri Harikarthick, N., 971 Sruthi, M. S., 925 Subash, G., 271 Subba Raju, K. V., 93 Sujin, J. S., 247 Sungheetha, Akey, 551 Sureshkumar, C., 127 Suresh, Yeresime, 395 T Talagani, Srikar, 173 Tan, Siok Yee, 897 Thakur, Narina, 677 Thapa, Surendrabikram, 39 Thenmozhi, S., 235 Thota, Yashwanth, 173 Tran-Trung, Kiet, 299 Tripathi, Satyendra, 421 U Udhayanan, S., 481 Uma, J., 1 V Varun, M., 405 Vasantha, Bhavani, 851 Vasanthapriyan, Shanmuganathan, 601 Vasundhara, 259 Vemuri, Pavan, 173 Venba, R., 189 Vidanagama, Dushyanthi, 583 Vidya, G., 63 Vikas, B., 235 Vivekanandan, P., 1 W Wagarachchi, N. M., 665 Wang, Xiyu, 633 Y Yashodhara, P. H. A. H. K., 615 Yuvaraj, N., 813 Z Zhao, Baohua, 633