English Pages 1072 [1033] Year 2022
Lecture Notes on Data Engineering and Communications Technologies 96
Jennifer S. Raj Khaled Kamel Pavel Lafata Editors
Innovative Data Communication Technologies and Application Proceedings of ICIDCA 2021
Lecture Notes on Data Engineering and Communications Technologies Volume 96
Series Editor Fatos Xhafa, Technical University of Catalonia, Barcelona, Spain
The aim of the book series is to present cutting edge engineering approaches to data technologies and communications. It will publish latest advances on the engineering task of building and deploying distributed, scalable and reliable data infrastructures and communication systems. The series will have a prominent applied focus on data technologies and communications with aim to promote the bridging from fundamental research on data science and networking to data engineering and communications that lead to industry products, business knowledge and standardisation. Indexed by SCOPUS, INSPEC, EI Compendex. All books published in the series are submitted for consideration in Web of Science.
More information about this series at https://link.springer.com/bookseries/15362
Editors Jennifer S. Raj Department of Electronics and Communication Engineering Gnanamani College of Technology Namakkal, India
Khaled Kamel Department of Computer Science Texas Southern University Houston, TX, USA
Pavel Lafata Department of Telecommunication Engineering Czech Technical University in Prague Prague, Czech Republic
ISSN 2367-4512 ISSN 2367-4520 (electronic) Lecture Notes on Data Engineering and Communications Technologies ISBN 978-981-16-7166-1 ISBN 978-981-16-7167-8 (eBook) https://doi.org/10.1007/978-981-16-7167-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
We are honored to dedicate the proceedings of ICIDCA 2021 to all the participants and editors of ICIDCA 2021.
Preface
It is with deep satisfaction that I write this Foreword to the Proceedings of ICIDCA 2021 held in RVS College of Engineering and Technology, Coimbatore, Tamil Nadu, India, on August 20 and 21, 2021. This conference brought together researchers, academics and professionals from all over the world, experts in Innovative Data Communication Technologies. This conference particularly encouraged the interaction of research students and developing academics with the more established academic community in an informal setting to present and to discuss new and current work. The papers in the conference proceedings contributed the most recent scientific knowledge in the fields of distributed operating systems, middleware, databases, sensor, mesh, and ad hoc networks, quantum and optics-based distributed algorithms, Internet applications, social networks, and recommendation systems. Their contributions helped to make the conference as outstanding as it has been. The local organizing committee members and their helpers had put much effort into ensuring the success of the day-to-day operation of the meeting. We hope that this program will further stimulate research in theory, design, analysis, implementation, and application of distributed systems and networks. We feel honored and privileged to serve the best recent developments to you through this exciting program. We thank all the authors and participants for their contributions.

Jennifer S. Raj, Namakkal, India
Khaled Kamel, Houston, USA
Pavel Lafata, Prague, Czech Republic
Acknowledgements
We would like to thank the organization’s many volunteers for their contributions. On a local, regional, and international scale, members volunteer their time, energy, and knowledge. On behalf of the editors, organizers, authors, and readers of this conference, we wish to thank the keynote speakers and the reviewers for their time, hard work, and dedication to this conference. The organizers wish to acknowledge Dr. V. Gunaraj for the discussion, suggestion, and cooperation to organize the keynote speakers of this conference. The organizers also wish to acknowledge the speakers and participants who attended this conference. Many thanks are given to all persons who helped and supported this conference. ICIDCA would like to acknowledge the contribution made to the organization by its many volunteers. Members contribute their time, energy, and knowledge at a local, regional, and international level. We also thank all the chair persons and conference committee members for their support.
Contents

Mobile Application for Tour Planning Using Augmented Reality
P. Babar, A. Chaudhari, S. Deshmukh, and A. Mhaisgawali

Wrapper-Naive Bayes Approach to Perform Efficient Customer Behavior Prediction
R. Sıva Subramanıan, D. Prabha, B. Maheswari, and J. Aswini

ECG Acquisition Analysis on Smartphone Through Bluetooth and Wireless Communication
Renuka Vijay Kapse and Alka S. Barhatte

Enhanced Security of User Authentication on Doctor E-Appointment System
Md Arif Hassan, Monirul Islam Pavel, Dewan Ahmed Muhtasim, S. M. Kamal Hussain Shahi, and Farzana Iasmin Rumpa

Enhanced Shadow Removal for Surveillance Systems
P. Jishnu and B. Rajathilagam

Vehicle Speed Estimation and Tracking Using Deep Learning and Computer Vision
B. Sathyabama, Ashutosh Devpura, Mayank Maroti, and Rishabh Singh Rajput

Transparent Blockchain-Based Electronic Voting System: A Smart Voting Using Ethereum
Md. Tarequl Islam, Md. Sabbir Hasan, Abu Sayed Sikder, Md. Selim Hossain, and Mir Mohammad Azad

Usage of Machine Learning Algorithms to Detect Intrusion
S. Sandeep Kumar, Vasantham Vijay Kumar, N. Raghavendra Sai, and M. Jogendra Kumar
Speech Emotion Analyzer
Siddhant Samyak, Apoorve Gupta, Tushar Raj, Amruth Karnam, and H. R. Mamatha

A Review of Security Concerns in Smart Grid
Jagdish Chandra Pandey and Mala Kalra

An Efficient Email Spam Detection Utilizing Machine Learning Approaches
G. Ravi Kumar, P. Murthuja, G. Anjan Babu, and K. Nagamani

Malware Prediction Analysis Using AI Techniques with the Effective Preprocessing and Dimensionality Reduction
S. Harini, Aswathy Ravikumar, and Nailesh Keshwani

Statistical Test to Analyze Gene Microarray
M. C. S. Sreejitha, P. Sai Priyanka, S. Meghana, and Nalini Sampath

An Analysis on Classification Models to Predict Possibility for Type 2 Diabetes of a Patient
Ch. V. Raghavendran, G. Naga Satish, N. S. L. Kumar Kurumeti, and Shaik Mahaboob Basha

Preserve Privacy on Streaming Data During the Process of Mining Using User Defined Delta Value
Paresh Solanki, Sanjay Garg, and Hitesh Chhikaniwala

A Neural Network based Social Distance Detection
Sirisha Alamanda, Malthumkar Sanjana, and Gopasi Sravani

An Automated Attendance System Through Multiple Face Detection and Recognition Methods
K. Meena, J. N. Swaminathan, T. Rajendiran, S. Sureshkumar, and N. Mohamed Imtiaz

Building a Robust Distributed Voting System Using Consortium Blockchain
Rohit Rajesh Chougule, Swapnil Shrikant Kesur, Atharwa Ajay Adawadkar, and Nandinee L. Mudegol

Analysis on the Effectiveness of Transfer Learned Features for X-ray Image Retrieval
Gokul Krishnan and O. K. Sikha

Blockchain-Centered E-voting System
Akshat Jain, Sidharth Bhatnagar, and Amrita Jyoti

Project Topic Recommendation by Analyzing User’s Interest Using Intelligent Conversational System
Pratik Rathi, Palak Keni, and Jignesh Sisodia
Abnormal Behaviour Detection in Smart Home Environments
P. V. Bala Suresh and K. Nalinadevi

Mapping UML Activity Diagram into Z Notation
Animesh Halder and Rahul Karmakar

Exploring Sleep Deprivation Reason Prediction
Dhiraj Kumar Azad, Kshitiz Shreyansh, Mihir Adarsh, Amita Kumari, M. B. Nirmala, and A. S. Poornima

E-Irrigation Solutions for Forecasting Soil Moisture and Real-Time Automation of Plants Watering
Md Mijanur Rahman, Sonda Majher, and Tanzin Ara Jannat

Implementing OpenCV and Dlib Open-Source Library for Detection of Driver’s Fatigue
R. Kavitha, P. Subha, R. Srinivasan, and M. Kavitha

Comparative Analysis of Contemporary Network Simulators
Agampreet Kaur Walia, Amit Chhabra, and Dakshraj Sharma

Using Deep Neural Networks for Predicting Diseased Cotton Plants and Leafs
Dhatrika Bhagyalaxmi and B. Sekhar Babu

Document Cluster Analysis Based on Parameter Tuning of Spectral Graphs
Remya R. K. Menon, Astha Ashok, and S. Arya

Distributed Denial of Service Attack Detection and Prevention in Local Area Network
Somnath Sinha and N. Mahadev Prasad

Secure Credential Derivation for Paperless Travel
Tarun Tanmay Bhatta and P. N. Kumar

Performance Analysis of Spectrum Sensing in CRN Using an Energy Detector
Ch. Sateesh, K. Janardhana Gupta, G. Jayanth, Ch. Sireesha, K. Dinesh, and K. Ramesh Chandra

Forecasting Crime Event Rate with a CNN-LSTM Model
M. Muthamizharasan and R. Ponnusamy

LegalLedger–Blockchain in Judicial System
Soumya Haridas, Shalu Saroj, Sairam Tushar Maddala, and M. Kiruthika

Arduino-Based Smart Walker Support for the Elderly
Pavithra Namboodiri, R. Rahul Adhithya, B. Siddharth Bhat, R. Thilipkumar, and M. E. Harikumar
Enhancing the Framework for E-Healthcare Privacy and Security: the Case of Addis Ababa
Selamu Shirtawi and Sanjiv Rao Godla

Socket Programming-Based RMI Application for Amazon Web Services in Distributed Cloud Computing
Sanjiv Rao Godla, Getahun Fikadu, and Abinet Adema

Design and Verification of H.264 Advanced Video Decoder
Gowri Revanoor and P. Prabhavathi

Analyzing Student Reviews on Teacher Performance Using Long Short-Term Memory
Shiva Shankar Reddy, Mahesh Gadiraju, and V. V. R. Maheswara Rao

A Novel Ensemble Method for Underwater Mines Classification
G. Divyabarathi, S. Shailesh, M. V. Judy, and R. Krishnakumar

Design of Blockchain Technology for Banking Applications
H. M. Anitha, K. Rajeshwari, and S. Preetha

A Brief Survey of Cloud Data Auditing Mechanism
Yash Anand, Bhargavi Sirmour, Sk Mahafuz Zaman, Brijesh kumar Markandey, Anurag Kumar, and Soumyadev Maity

Detecting Sybil Node in Intelligent Transport System
K. Akshaya and T. V. Sarath

Applying Lowe’s “Small-System” Result to Prove the Security of UPI Protocols
Sreekanth Malladi

Spark-Based Scalable Algorithm for Link Prediction
K. Saketh, N. Raja Rajeswari, M. Krishna Keerthana, and Fathimabi Shaik

Obstacle Detection by Power Transmission Line Inspection Robot
Ravipati Jhansi, P. A. Ashwin Kumar, Sai Keerthana, Sai Pavan, Revant, and Subhasri Duttagupta

Performance Analysis of Logistic Regression, KNN, SVM, Naïve Bayes Classifier for Healthcare Application During COVID-19
Mausumi Goswami and Nikhil John Sebastian

Recognition and Detection of Human Judgment About Influential Pairs Using Machine Learning Techniques
G. Charles Babu, P. Gopala Krishna, B. Sankara Babu, and Rokesh Kumar Yarava
Design and Implementation of High-Speed Energy-Efficient Carry Select Adder for Image Processing Applications
K. N. VijeyaKumar, M. Lakshmanan, K. Sakthisudhan, N. Saravanakumar, R. Mythili, and V. KamatchiKannan

Mitigating Poisoning Attacks in Federated Learning
Romit Ganjoo, Mehak Ganjoo, and Madhura Patil

An Audit Trail Framework for Saas Service Providers and Outcomes Comparison in Cloud Environment
M. N. V. Kiranbabu, A. Francis Saviour Devaraj, Yugandhar Garapati, and Kotte Sandeep

Predicting Entrepreneurship Skills of Tertiary-Level Students Using Machine Learning Algorithms
Abdullah Al Amin, Shakawat Hossen, Md. Mehedi Hasan Refat, Proma Ghosh, and Ahmed Al Marouf

Smart Pet Insights System Based on IoT and ML
G. Harshika, Umme Haani, P. Bhuvaneshwari, and Krishna Rao Venkatesh

Design and Implementation of a Monitoring System for COVID-19-Free Working Environment
Attar Tarannum, Pathan Safrulla, Lalith Kishore, and S. Kalaivani

Statistical Measurement of Software Reliability Using Meta-Heuristic Algorithms for Parameter Estimation
Rajani, Naresh Kumar, and Kuldeep Singh Kaswan

Development of Optimized Linguistic Technique Using Similarity Score on BERT Model in Summarizing Hindi Text Documents
S. B. Rajeshwari and Jagadish S. Kallimani

Faulty Node Detection and Correction of Route in Network-On-Chip (NoC)
E. G. Satish and A. C. Ramachandra

Paddy Crop Monitoring System Using IoT and Deep Learning
Chalumuru Suresh, M. Ravikanth, G. Nikhil Reddy, K. Balaji Sri Ranga, A. Anurag Rao, and K. Maheshwari

Diabetic Retinopathy Disease Classification Using EfficientNet-B3
M. Naveenkumar, S. Srithar, T. Maheswaran, K. Sivapriya, and B. M. Brinda
Design Flaws and Suggested Improvement of Secure Medical Data Sharing Scheme Based on Blockchain
Samiulla Itoo, Akber Ali Khan, Vinod Kumar, Srinivas Jangirala, and Musheer Ahmad

Comparative Analysis of Machine Learning Algorithms for Rainfall Prediction
Rudragoud Patil and Gayatri Bedekar

Electronic Invoicing Using Image Processing and NLP
Samarth Srivastava, Oshi Varma, and M. Gayathri

BEVDS: A Blockchain Model for Multiparty Authentication of COVID-19 Vaccine Beneficiary
Tejaswi Khanna, Parma Nand, and Vikram Bali

Reinforcement Learning for Security of a LDPC Coded Cognitive Radio
Puneet Lalwani and Rajagopal Anantharaman

Secure Forensic Data Using Blockchain and Encryption Technique
B. S. Renuka and S. Kusuma

Denoising of Surface Electromyography Signal Using Parametric Wavelet Shrinkage Method for Hand Prosthesis
S. H. Bhagwat, P. A. Mukherji, and S. Paranjape

An Efficient Machine Learning Approach to Recognize Dynamic Context and Action Recommendations for Attacks in Enterprise Network
K. B. Swetha and G. C. Banu Prakash

Convolutional Neural Network Based on Self-Driving Autonomous Vehicle (CNN)
G. Babu Naik, Prerit Ameta, N. Baba Shayeer, B. Rakesh, and S. Kavya Dravida

Performance Analysis of Machine Learning Algorithms in Detecting and Mitigating Black and Gray Hole Attacks
Mahesh Kurtkoti, B. S. Premananda, and K. Vishwavardhan Reddy

Effect of Non-linear Co-efficient of a Hexagonal PCF Depending on Effective Area
Md. Rawshan Habib, Abhishek Vadher, Ahmed Yousuf Suhan, Md Shahnewaz Tanvir, Tahsina Tashrif Shawmee, Md. Rashedul Arefin, Al-Amin Hossain, and Anamul Haque Sunny
Analysis of Student Attention in Classroom Using Instance Segmentation
K. Meenakshi, Abirami Vina, A. Shobanadevi, S. Sidhdharth, R. Sai Sasmith Pabbisetty, and K. Geya Chitra

A Simplex Method-Based Bacterial Colony Optimization for Data Clustering
S. Suresh Babu and K. Jayasudha

Artificial Intelligent Former: A Chatbot-Based Smart Agriculture System
S. Gopikrishnan, Cheemakurthi Srujan, V. N. Siva Praneeth, and Sagar Mousam Parida

Image Caption Generation Using Attention Model
Eliganti Ramalakshmi, Moksh Sailesh Jain, and Mohammed Ameer Uddin

Pearson Correlation Based Outlier Detection in Spatial-Temporal Data of IoT Networks
M. Veera Brahmam, S. Gopikrishnan, K. Raja Sravan Kumar, and M. Seshu Bhavani

Energy Harvesting in Cooperative SHIPT NOMA for Multi-user Network
K. Raja Sravan Kumar, S. Gopikrishnan, M. Veera Brahmam, and M. Gargi

An Efficient 1DCNN–LSTM Deep Learning Model for Assessment and Classification of fMRI-Based Autism Spectrum Disorder
Abdul Qayyum, M. K. A. Ahamed Khan, Abdesslam Benzinou, Moona Mazher, Manickam Ramasamy, Kalaiselvi Aramugam, C. Deisy, S. Sridevi, and M. Suresh

Correction to: An Efficient 1DCNN–LSTM Deep Learning Model for Assessment and Classification of fMRI-Based Autism Spectrum Disorder
Abdul Qayyum, M. K. A. Ahamed Khan, Abdesslam Benzinou, Moona Mazher, Manickam Ramasamy, Kalaiselvi Aramugam, C. Deisy, S. Sridevi, and M. Suresh

Author Index
About the Editors
Dr. Jennifer S. Raj received her Ph.D. degree from Anna University and her master’s degree in Communication Systems from SRM University, India. Currently, she is working in the Department of ECE, Gnanamani College of Technology, Namakkal, India. She is a life member of ISTE, India. She has been serving as an organizing chair and a program chair of several international conferences and in the program committees of several international conferences. She is a book reviewer for Tata McGraw Hill and has published more than fifty research articles in journals and IEEE conferences. Her interests are in wireless healthcare informatics and body area sensor networks.

Dr. Khaled Kamel is currently a professor of Computer Science at TSU. He worked as a full-time faculty member and administrator for 22 years at the University of Louisville Engineering School. He was a professor and the chair of the Computer Engineering and Computer Science Department from August 1987 to January 2001. He also was the founding dean of the College of IT at the United Arab Emirates University and the College of CS and IT at the Abu Dhabi University. Dr. Kamel received a B.S. in Electrical Engineering from Cairo University, a B.S. in mathematics from Ain Shams University, an M.S. in CS from Waterloo University, and a Ph.D. in ECE from the University of Cincinnati. Dr. Kamel worked as a principal investigator on several government and industry grants. He also supervised over 100 graduate research master’s and doctoral students in the past 25 years. His current research interest is more interdisciplinary in nature but focuses on the use of IT in industry and systems. Dr. Kamel’s area of expertise is computer control, sensory fusion, and distributed computing.

Ing. Pavel Lafata obtained his M.Sc. from the Faculty of Electrical Engineering of the CTU in Prague in 2007 and his Ph.D. from the Faculty of Electrical Engineering of the CTU in Prague in 2011. Since 2011, he has been an assistant professor at the Department of Telecommunication Engineering, CTU in Prague. Since 2011, he has been a supervisor of more than 20 bachelor and diploma theses and student projects; since 2014, he has been a supervisor of 3 Ph.D. students. Since 2011, he has been a teacher and lecturer of various courses, especially in the field of telecommunication networks and systems, optical systems, digital systems, fixed access networks, etc. He is an author or co-author of numerous scientific papers published in reviewed international journals or conferences, a member of several scientific committees of international conferences and editorial boards of international journals, and a fellow reviewer for numerous impact journals. He recently published the book Programmable Logic Controllers: Industrial Control with McGraw Hill Professional in August 2013.
Mobile Application for Tour Planning Using Augmented Reality

P. Babar, A. Chaudhari, S. Deshmukh, and A. Mhaisgawali
Abstract Over the past years, there have been many advancements in the digital world, and augmented reality (AR) is one of the fastest emerging fields. Augmented reality is a part of mixed reality that superimposes virtual content on the real world. Tourism, a tertiary-sector industry, can make use of AR to its own benefit. Our paper proposes an application for tourists that makes their tour experience more interactive. ART (augmented reality in tourism) is intended to scan a food menu and display 3D food models with descriptions, so that tourists get a better idea of food items and ingredients they are not familiar with. The application also proposes a digital brochure for hotels, which superimposes a 3D view of the hotel rooms and the amenities provided by the hotel on the available brochure, making the search for hotels more user-friendly. Further, the application allows the AR camera to scan the map of a particular area, and the user can navigate through the map to seek information about locations via text or video. The application thus focuses on three areas that are considered major concerns for tourists travelling to unfamiliar places.

Keywords Augmented reality · Unity · Blender · Vuforia · 3D food menu · Brochures · Interactive maps · AR camera · GameObject · Image target
P. Babar (✉) · A. Chaudhari · S. Deshmukh · A. Mhaisgawali
Department of Computer Science and Technology, UMIT, SNDT Women’s University, Mumbai, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_1

1 Introduction

Tourism is the activity of people visiting and staying in new cities and countries for leisure, business or other purposes. This also includes trying different food items from different places. In today’s world of booming technology, augmented reality is a rising technology that enhances the user experience through digital visual elements. Augmented reality lets people superimpose digital content such as images, sound and text over real-life objects. An AR system provides three features: a combination of the real and virtual worlds, real-time interaction and precise 3D portrayal of virtual content on real objects. AR includes graphics, sounds and touch feedback, which are added to the real world [1].

Augmented reality is often mistaken for virtual reality but is in fact its opposite. Virtual reality immerses people in an environment and lets them interact with it, whereas augmented reality adds elements to the existing surroundings instead of creating an entire environment. Augmented reality makes communication possible between the user and the virtual functions and elements in the simulated environment [8]. In augmented reality, the position and orientation of an object are determined using sensors and algorithms. The technology renders 3D graphics as they would look from the camera’s viewpoint and superimposes these digital figures over the user’s view of the physical world. AR is an emerging technology for the tourism industry, as it creates an illusion of a virtual world while providing useful information to tourists.
2 Review of Literature (1)
(2)
(3)
(4)
(5)
(6)
Development and evaluation of i-brochure: This provides interactive information unlike a typical brochure in advocating higher studies institutions [2]. Augmented reality-based mobile tour guide system: This system presents a mobile tour guide system with augmented reality, called Tour Guide System. Using this system, visitors can have an interactive and user-specific augmented experience by tracking the contents of an offline tour booklet [3]. Using this system, visitors can have an interactive and user-specific augmented experience by tracking the contents of an offline tour booklet. Exploresia: It is an app made to provide information about tourism in Indonesia. It shows data in the form of text, 2D and 3D images, videos, etc. which engages users in a complete virtual understanding of that place. Users can even have 360-degree virtual tours, all on their mobile screens [4]. Augmented reality tourist catalogue using mobile technology: This paper informs about a catalogue which will help tourists both domestic and foreign get visual aid for viewing objects on a simple map. This article shows the extent this new technology has been explored and how much has been implemented [5]. Augmented reality and its effect on our life: This paper informs about the start of augmented reality, various types of AR, its applications, its advantages and disadvantages. It also gives us knowledge regarding those major threats that AR will face in the near future and about its current and future applications [1]. Visualization of virtual house model on property brochure using augmented technology: It aims to show customers a house form that is offered in three dimensions. Further, it can increase the interest of customers for the houses that are sold and reduce the cost of making miniature models [6].
Mobile Application for Tour Planning Using Augmented Reality
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
3
Image processing with augmented reality (AR): The aim of this paper is to discuss the intent to create an image-based android application. This study is based on real-time image detection and processing. The basis of this study is on real-time image recognition and processing. Augmented reality allows the user to manipulate the data and can add enhanced features to the image taken [10]. Augmented reality: Technology merging computer vision and image processing by experimental techniques. The objective of this paper is to understand the ongoing research in AR. This paper discusses all the technical details of the AR technology, the algorithms used to achieve various functionalities and the future scope of the technology [11]. Image processing techniques are used to analyse the application and scope of augmented reality in marketing. The primary goal of this research paper is to comprehend the concept and applications of augmented reality (AR), as well as the process of creating an AR image and the hardware/software requirements for it. The paper also focuses on how firms and advertisers use technology to provide customers with a better user experience [12]. An augmented reality method with image composition and image special effects: Augmented reality for natural images on mobile devices is currently a research focus. In this method, 3D tracking of a camera is obtained first, and then a template image is augmented by applying image composition, image special effects and 3D virtual model projection independently or jointly [13]. Augmented reality uses in marketing: This article explores which applications of augmented reality (AR) have evolved in the marketing business thus far, and provides classification schemas for them based on the degree of augmentation, diverse consuming situations and marketing roles [14]. 
AR using Vuforia for marketing: Augmented reality is a technology that allows a human and a computer to interact in real time to construct two-dimensional or three-dimensional objects. In its application, augmented reality can provide the necessary functionality and information. This research explains how to use the Vuforia (QCAR) software to deploy augmented reality in mobile applications for real estate marketing. Vuforia makes it convenient to capture 3D objects on the Android mobile platform [15]. AR in brand building (valves industry): Digital media and the Internet now offer limitless opportunities for brand building and marketing. When a product is advertised, the advertisement must provide value in terms of product and brand awareness. The purpose of this article is to compare the effectiveness of augmented reality and traditional advertising (digital and print) in brand building and marketing [16]. Context-aware augmentational marketing: This study uses augmented reality (AR) technology in the development of a start-up platform for marketing in the context of shopping malls. It is effective in weeding out options from large datasets and letting users integrate information in real-time scenarios, and the use of AR results in more effective
P. Babar et al.
user decisions. The paper also discusses how technological evolution has begun to bridge the gap and merge the boundaries between virtual space and reality space to create virtual reality (VR) and augmented reality (AR) depending on whether a real or virtual object is placed [17]. Augmented reality in food production: This paper aims to provide an update on current knowledge and scientific advancement of mobile augmented reality (MAR) in food manufacturing and packaging systems, as well as to highlight future needs for bringing AR to the food market [18].
3 Our Proposed System
The main goal of this project is to help boost tourism by introducing augmented reality to it. AR can be divided into two broad categories: object recognition and marker recognition. In object recognition, also referred to as object detection, the AR camera identifies the shape and physical form of different objects and their orientation in space. For object recognition, the object should have the following properties: it should be convex and uniform, with no protrusions, and it should be colourful, with small details and enough contrast points. At most two objects can be recognized simultaneously. Marker recognition, on the other hand, detects a flat image, referred to as a marker, and then superimposes augmentable 3D models, videos or texts over it. The required properties of a marker are as follows: markers should be matte to avoid flares; because markers are treated as binary images, they should have distinguishing features such as edges, curves and slopes; striped or fully striped elements will not be detected; and a marker that is covered by another object will not be recognized.
The application in discussion uses the marker recognition technique. In its simplest form, it works as follows: when the application scans an image, the corresponding information is displayed in the form of text, images, 3D models or videos. Hospitality and navigation are the main areas that this application addresses. The use of embedded sensors in mobile phones has changed the way in which people react to and interact with their surroundings [9]. In restaurants, the user can scan the restaurant menu and see an exact, detailed, real-sized 3D model of each dish along with its 3D text description. This gives customers a clearer idea of the proportions, ingredients and spice level of the dish and enhances their dining experience.
The customers can use this 3D menu even when offline. Another area in hospitality that this application focuses upon is the hotel industry. Generally, the hotel marketing is done using brochures. But these brochures do not always paint a clear picture of all the amenities that the hotel plans to provide. Using this application, one can scan these two-dimensional brochures and watch and explore 3D models and videos of their rooms, gym, swimming pools, cafes and many more amenities. This idea can create a great impact in the field of marketing.
Fig. 1 System prototype
The final area of focus is navigation. This application makes map navigation more interactive and easier to understand using images, videos, texts and virtual buttons. Figure 1 shows the application system as a simple flowchart. It includes target scanning (i.e. marker recognition), virtual button recognition and an augmented reality engine. When the camera scans the target, the corresponding image, 3D model, video or text appears on the mobile screen, depending on how the target has been set up. Each marker is unique and has different objects assigned to it. Thus, our application, ART, enhances the user experience and benefits tourism.
3.1 Working
1. Setting up Unity environment
• AR camera needs to be selected while navigating through the dropdown menu of GameObject.
• For adding an image target, select Vuforia image from the GameObject dropdown [7].
2. Converting image target to a dataset
• Images to be augmented should have stark colour contrasts. A visually complex image aids in tracking the target.
3. Uploading image targets to Vuforia database
• Image targets should be maintained in Vuforia databases. Vuforia offers device, cloud and VuMark databases [7].
Fig. 2 Features for the food menu image target
• A device database is used in the application so that it can be accessed without Internet connectivity, which is the primary requirement of the application.
• Vuforia rates the image targets based on feature tracking. The yellow markers shown in Fig. 2 represent the regions used to evaluate the quality of the targets. Vuforia assigns every target a rating that indicates how augmentable it is. All targets are handled as binary images, which means that colour is not a distinguishing factor but edges, curves and slopes are.
• Vuforia detects these yellow points in order to identify the given image target.
4. Adding overlay on images
• The overlay is the multimedia content that the application superimposes on the image target once it is triggered [7].
• Overlays can be audio, videos, 3D models and images.
• Blender is used to create the 3D models used as overlays for the food menu and brochure.
• The 3D model is a child of the marker and is projected over the marker by the device via the application (Figs. 3 and 4).
Fig. 3 Basic 3D food model created in unity
Fig. 4 Final snippet of 3D food model created in unity
4 Software Framework
4.1 Software
AR relies on devices such as phones and tablets that contain sensors and suitable software, which allow computer-generated objects to be projected onto the real world. The software used for building the application in discussion is as follows.
Blender is an open-source program for 3D modelling and animation. It offers features such as UV unwrapping, rigging, texturing, skinning and particle simulation. Alternatives to Blender include Autodesk Maya, Cinema 4D and other 3D computer graphics software. Blender is used here because it is free of cost, well suited to small applications and compatible with the Unity engine.
Vuforia is a Software Development Kit (SDK) used for creating augmented reality-based mobile applications. Vuforia uses computer vision technology to track image markers and 3D models in real time.
Unity is a multi-platform game engine for creating 3D, 2D, virtual reality and augmented reality games. Unity3D is widely used and free of charge for development, and hence has been chosen for this project. The engine is used in the architecture, film, games and engineering industries (Fig. 5).
Fig. 5 Workflow of the software
4.2 Vuforia SDK Working
Vuforia is an augmented reality Software Development Kit for mobile devices that allows developers to create augmented reality apps. It is a standalone package that enables applications to recognize arbitrary items in the environment, such as photos and text. Because both the reference image and the camera image are byte arrays, finding identical items in the two by direct comparison would be time consuming. Vuforia therefore analyses both images by looking for feature points, which capture the unique elements of each image. Feature points are usually high-contrast spots, curves or edges that do not change much when viewed from different perspectives. Vuforia processes a reference image only once when searching for these feature points. As a result, if the image lacks enough feature points, it will be difficult to detect. A good reference image has a large number of feature points that serve as anchors for the recognition software. Vuforia remembers the relative positions of all feature points and effectively combines them into a shape that resembles a constellation. Every camera frame is then passed through a feature extraction algorithm, which attempts to match the two sets of feature points. If a camera frame contains the majority of the reference feature points, the image is recognized as the marker. Vuforia determines the marker orientation in the physical environment by comparing the relative positions of the reference "constellation" with those of the recognized "constellation."
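The matching idea described above can be illustrated with a small Python sketch. It is not Vuforia's implementation or API, whose internals are not described in this paper: string identifiers stand in for real feature descriptors, the 0.5 threshold is simply one reading of "the majority of reference feature points", and the names `REFERENCE`, `FRAME`, `match_fraction`, `is_marker` and `rotation_degrees` are hypothetical.

```python
import math

# Each feature point is stored as {descriptor_id: (x, y)}. Using a plain string
# as the descriptor is a simplification made for this sketch.
REFERENCE = {"corner_a": (0, 0), "corner_b": (10, 0), "edge_c": (10, 5), "curve_d": (0, 5)}

# A camera frame showing the marker rotated by 90 degrees, with one point hidden.
FRAME = {"corner_a": (0, 0), "corner_b": (0, 10), "edge_c": (-5, 10)}


def match_fraction(reference, frame):
    """Share of reference feature points that are also found in the frame."""
    matched = [d for d in reference if d in frame]
    return len(matched) / len(reference)


def is_marker(reference, frame, threshold=0.5):
    """Recognize the marker when more than `threshold` of its points are matched."""
    return match_fraction(reference, frame) > threshold


def rotation_degrees(reference, frame):
    """Estimate in-plane rotation from one pair of points seen in both constellations."""
    common = sorted(d for d in reference if d in frame)
    if len(common) < 2:
        return None
    first, second = common[0], common[1]

    def direction(points):
        (x1, y1), (x2, y2) = points[first], points[second]
        return math.degrees(math.atan2(y2 - y1, x2 - x1))

    return (direction(frame) - direction(reference)) % 360
```

With the sample data, three of the four reference points are found, so the frame counts as the marker, and comparing the direction of one point pair in both constellations gives the rotation. A real tracker would estimate a full 3D pose from many points rather than a single pair.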
Fig. 6 Architecture of the platform used for building the application
4.3 Architecture
The main components used for this application are Unity 2020.1.15f1, Vuforia SDK 9.8.5, Android SDK 7.1 Nougat (API level 25) and Java JDK 1.8.0_152 (Fig. 6).
4.4 Prototype of Application
See Fig. 7.
4.5 Implementation
Our application can be run on Windows, Android or macOS. A few key points to be mindful of are:
Fig. 7 Design of application
1. For ART to run on a computer, the Unity software must be installed on the system [6].
2. The image target should be held straight in the view of the camera. This helps in better detection of its location [6].
3. The camera being used must be responsive to movement in order to recognize the inputs [6].
Now let us take a look at the working of this application.
3D food menu: Fig. 8 shows the image target that is to be scanned along with the augmented model of the food item (Fig. 9).
Hotel brochure: A hotel offers a number of amenities, such as different kinds of rooms, pools, a cafe, fitness centres, a gaming room, restaurants and many more. This application lets you take a 360-degree view of all those amenities, along with the hotel premises, before booking a hotel.
Fig. 8 Ramen marker image and its 3D model
Fig. 9 Implementation of a few more food items
Figure 10 shows a hotel brochure used as a marker; scanning it produces the outputs shown in Fig. 11.
AR map recognition: This is a visually interactive map on which you can tap a visible location and get a detailed description and photos of that location. It can be useful at heritage sites that provide only a map, with no details about the minor locations within the site. The virtual buttons concept is used, and the scripting for the virtual buttons is done in C# for this application. The Vuforia and UnityEngine libraries are used, and a canvas consisting of different panels that display details and images is created in the Unity3D environment (Fig. 12).
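The virtual-button behaviour can be pictured as a lookup from a (marker, button) pair to the panel that should appear. The Python sketch below is only an illustration of that mapping and is not the C# script used in the application, nor any Unity or Vuforia API; the marker and panel names are taken from the test cases in Table 1, while `PANELS` and `on_button_pressed` are hypothetical names.

```python
# Hypothetical lookup table: (scanned marker, pressed virtual button) -> panel text.
# The entries mirror the map test cases listed in Table 1.
PANELS = {
    ("Mumbai map", "button1"): "south mumbai",
    ("Mumbai map", "button2"): "Juhu",
    ("Juhu map", "button1"): "Juhu Beach",
    ("Juhu map", "button2"): "Lion's Park",
}


def on_button_pressed(marker, button):
    """Return the panel to display, or None when the pair is not configured."""
    return PANELS.get((marker, button))
```

Keeping the mapping in one table makes it straightforward to add a new map or button without touching the handling logic.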
Fig. 10 Hotel brochure as a marker image
Fig. 11 3D model of rooms and gym
Fig. 12 Panels showing details of places after making contact with the map
5 Testing and Validation
After the implementation phase, static testing was done. The interface test outputs are shown in Table 1.
Evaluation setting: Table 2 shows the evaluation setting with summary statistics. A quiz was curated and used to evaluate the usability, clarity and application control aspects from the perspective of a user. Users were made familiar with the application and asked questions about the usability of the features provided, and we received a positive response for all the features, as shown in Fig. 13.

Table 1 Testing outputs of functionalities in the application
| Testing target | Testing condition | Expected output | Actual output | Status (pass/fail) |
| Scanning menu marker | AR camera | 3D food model of the scanned menu item | 3D food model of the scanned menu item | Pass |
| Scanning brochure marker | AR camera | 3D model of the scanned brochure | 3D model of the scanned brochure | Pass |
| Scanning Mumbai map | AR camera and virtual button1 | Should display the “south mumbai” text panel | Displays the “south mumbai” text panel | Pass |
| Scanning Mumbai map | AR camera and virtual button2 | Should display the “Juhu” text panel | Displays the “Juhu” text panel | Pass |
| Scanning Juhu map | AR camera and virtual button1 | Should display the “Juhu Beach” text panel with image | Displays the “Juhu Beach” text panel with image | Pass |
| Scanning Juhu map | AR camera and virtual button2 | Should display the “Lion’s Park” text panel with image | Displays the “Lion’s Park” text panel with image | Pass |

Table 2 Evaluation setting
| | Family | Other |
| Date | May 2021 | June 2021 |
| Sample size | 8 | 15 |
| Male/female | 3/5 | 7/8 |
| Average age | 42.6 | 21 |
Fig. 13 Evaluation of application usability
Fig. 14 Evaluation of application clarity
The application displays food models, hotel room models and a guide-like feature for tourists visiting new places. Figure 14 shows the evaluation results based on the clarity of the models and the screens. The application involves virtual buttons and a Vuforia-based AR camera, so it was important to study whether users find these easy to work with. Figure 15 shows the evaluation results of application control, which showed that virtual buttons were a bit tricky to use, but the rest of the components were user-friendly.
Fig. 15 Evaluation of application control
6 Conclusion
In this work, AR is applied to improve a tourist's experience. The application can be useful for local tourists as well as international travellers, since it uses augmented reality to provide 3D views on a basic smartphone. It shows 3D interactive models of food items, which can encourage tourists to try new local dishes. It also gives a 360-degree virtual tour of hotels and their amenities. The application aims to enhance the tourist experience and to help the tourism industry.
References
1. R. Aggarwal, A. Singhal, Augmented reality and its effect on our life, in 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 2019, pp. 510–515. https://doi.org/10.1109/CONFLUENCE.2019.8776989
2. A.N. Zulkifli, A. Alnagrat, R. Che Mat, Development and evaluation of i-Brochure: a mobile augmented reality application. J. Telecommun. Electron. Comput. Eng. 8, 145–150 (2016)
3. A. Sayyad, S. Shinde, Augmented Reality Based Mobile Tour Guide System [online]. Irjet.net. Available at:

R. Sıva Subramanıan et al.

R(A) then A ← A ;
8. R(A ) ← R(A)
9. until R(A ) ≤ R(A)|| |A | == d)
end
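The closing steps above stop the search once the evaluation score R no longer improves or the subset reaches size d. A minimal Python sketch of a sequential forward selection (SFS) loop with that stopping rule is given below. It is a generic illustration and not the authors' code: the `score` callback stands in for "accuracy of Naive Bayes on the subset", and the toy weights in `toy_score` are invented purely so the example runs.

```python
def sequential_forward_selection(features, score, max_size):
    """Greedily add the feature that raises score() most; stop when nothing helps."""
    selected = []
    best_score = score(selected)
    while len(selected) < max_size:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Evaluate every one-feature extension of the current subset.
        top = max(candidates, key=lambda f: score(selected + [f]))
        top_score = score(selected + [top])
        if top_score <= best_score:
            break  # no improvement: keep the current subset
        selected.append(top)
        best_score = top_score
    return selected, best_score


# Hypothetical scoring function: three features help, every other one hurts.
TOY_WEIGHTS = {"A8": 0.5, "A2": 0.3, "A9": 0.2}


def toy_score(subset):
    return sum(TOY_WEIGHTS.get(f, -0.1) for f in subset)
```

Calling `sequential_forward_selection(["A1", "A2", "A8", "A9", "A14"], toy_score, max_size=5)` adds A8, A2 and A9 in that order and then stops, because every remaining feature would lower the score. In a wrapper setting, `score` would instead train and validate the classifier on each candidate subset, which is why wrappers are computationally expensive.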
4.2 Genetic Algorithm (GA)
Second, the customer dataset is processed using a genetic algorithm to find the best variable subset, and the variable subset obtained is then modeled using Naive Bayes to improve the customer analysis.
Algorithm 2 GA
GA is a metaheuristic algorithm inspired by the process of natural selection and belongs to the class of evolutionary algorithms. GA helps to achieve better solutions in search and optimization problems [20].
1. INITIALISE population
2. WHILE (stopping criteria is not reached)
3. SELECTION
4. REPRODUCTION
5. REPLACEMENT
6. EVALUATE
7. END WHILE
In GA, the selection of the optimal variable subset is based on four steps: population creation, parent selection, crossover, and mutation. The first step selects chromosomes from the initial population to act as parents for crossover; the chromosomes with high fitness scores are selected as parents. The next step carries out the crossover process between two parents to create new children. The mutation process then changes the genes of the newly generated children: the genes of a chromosome are flipped from 1 to 0 and from 0 to 1 according to the mutation rate [21]. Finally, the optimal variable subset is obtained. The genetic algorithm is represented in Fig. 2.
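The steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: each chromosome is a list of 14 bits (one per input variable), the `fitness` function is a hypothetical stand-in for the Naive Bayes accuracy on the selected subset, and the population size, number of generations, mutation rate, truncation selection and elitism are all choices made here for the example.

```python
import random

N_FEATURES = 14  # one bit per input variable

# Hypothetical stand-in for "accuracy of Naive Bayes on the selected subset":
# features in USEFUL raise the score, every other selected feature lowers it.
USEFUL = {1, 4, 7, 8, 9, 13}


def fitness(chromosome):
    selected = {i for i, bit in enumerate(chromosome) if bit}
    return len(selected & USEFUL) - 0.5 * len(selected - USEFUL)


def crossover(a, b, rng):
    point = rng.randrange(1, len(a))  # single-point crossover
    return a[:point] + b[point:]


def mutate(chromosome, rate, rng):
    # Flip each gene (1 -> 0 or 0 -> 1) with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in chromosome]


def genetic_search(pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)  # evaluate
        parents = ranked[: pop_size // 2]   # selection: keep the fitter half
        children = [ranked[0]]              # elitism: the best survives unchanged
        while len(children) < pop_size:     # reproduction
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        population = children               # replacement
    return max(population, key=fitness)
```

Because the best chromosome of each generation is copied unchanged into the next, the best fitness can never decrease from one generation to the next. In a real wrapper, `fitness` would train and validate the classifier for each chromosome, so the number of fitness evaluations dominates the running time.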
5 Experimental Design
The experimental procedure is carried out for the proposed methodology, and the results obtained are compared with NBTREE, standard Naive Bayes, AODE [22], CFWNB [23], and DTNB [24].
Wrapper-Naive Bayes Approach to Perform Efficient Customer …
Fig. 2 Genetic algorithm approach (flowchart blocks: Input, Create Initial Population, Evaluate Fitness value for each chromosome, Selection, Crossover, Mutation, Max generation, Optimal Solution)
5.1 Dataset
In this research, the experimental procedure is carried out on the Australian credit (AC) dataset [25]. The AC dataset is a real-world dataset obtained from the UCI repository, and the outcome of the dataset indicates whether a consumer is a bad case to reject or a good case to accept. The dataset holds 690 instances with 15 variables (14 input and one target class); of the 690 instances, 307 are good cases to accept and 383 are bad cases to reject. Of the 14 input variables, 6 are numerical and 8 are categorical. The information about the input variables is given below.
5.2 Variable Description and Validity Scores
Table 2 represents the attributes present in the customer dataset. The experimental results obtained are projected and compared using different validity scores, namely accuracy, sensitivity, specificity, and precision [26].

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

$$\text{Recall (sensitivity)} = \frac{TP}{TP + FN} \tag{10}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{11}$$
Table 2 Variable description of the customer dataset applied
| S. No. | Feature description | Feature type |
| 1 | A1 | Categorical 0, 1 |
| 2 | A2 | Continuous |
| 3 | A3 | Continuous |
| 4 | A4 | Categorical 1, 2, 3 |
| 5 | A5 | Categorical 1–9 |
| 6 | A6 | Categorical 1–9 |
| 7 | A7 | Continuous |
| 8 | A8 | Categorical 1, 0 |
| 9 | A9 | Categorical 1, 0 |
| 10 | A10 | Continuous |
| 11 | A11 | Categorical 1, 0 |
| 12 | A12 | Categorical 1, 2, 3 |
| 13 | A13 | Continuous |
| 14 | A14 | Continuous |
| 15 | A15 | Target class 1, 2 |
$$\text{Precision} = \frac{TP}{TP + FP} \tag{12}$$
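As a worked check of Eqs. (9) to (12), the Python sketch below computes the four validity scores from confusion-matrix counts. The counts in `EXAMPLE` are a reconstruction made for this example and are not reported by the authors: they were chosen so that, with 307 positive and 383 negative instances, the resulting rates agree with the GA and NB row of Table 3 to within rounding. The function name `validity_scores` is likewise hypothetical.

```python
def validity_scores(tp, tn, fp, fn):
    """Return accuracy, recall (sensitivity), specificity and precision."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Eq. (9)
        "recall": tp / (tp + fn),                     # Eq. (10)
        "specificity": tn / (tn + fp),                # Eq. (11)
        "precision": tp / (tp + fp),                  # Eq. (12)
    }


# Reconstructed counts (an assumption): 262 + 45 = 307 positive cases,
# 337 + 46 = 383 negative cases, 690 instances in total.
EXAMPLE = validity_scores(tp=262, tn=337, fp=46, fn=45)
```

With these counts the accuracy is 599/690, which is about 86.81%, the recall is about 85.3%, the specificity is just under 88.0% and the precision is about 85.1%.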
5.3 Experimental Procedure
1. The customer dataset is first processed using the wrapper approaches (genetic algorithm and SFS), and the feature subset captured is evaluated using Naive Bayes. The experimental results obtained are projected using the validity scores.
2. Second, the same customer dataset is applied to different methodologies proposed to improve Naive Bayes, namely AODE, DTNB, CFWNB and NBTREE, and to standard NB. The results obtained are projected using the validity scores.
3. Further, the experimental results obtained from the GA-NB and SFS-NB methodologies and from the existing approaches are compared using the accuracy validity score.
5.4 Results of Proposed Methodology and Existing Approaches Standard NB, AODE, DTNB, CFWNB, and NBTREE
Table 3 represents the experimental outcome of the wrapper-NB methodology. In wrapper feature selection, two different approaches are applied: the first is the genetic algorithm, and the second is the SFS approach. The variable subset obtained from each approach is applied separately with the Naive Bayes algorithm, and the performance of NB is computed using the accuracy, sensitivity, specificity, and precision parameters. Of the two approaches, GA and NB achieves a higher accuracy of 86.8116 compared to SFS and NB (Fig. 3). Table 4 represents the results of standard NB and of the approaches proposed to improve NB: AODE, DTNB, CFWNB and NBTREE. Here, the CFWNB and NBTREE approaches achieve higher prediction accuracy than the other approaches, and standard NB achieves the lowest. CFWNB and NBTREE achieve a higher accuracy of 85.5072 compared to standard NB, AODE and DTNB (Fig. 4).

Table 3 Experimental results of wrapper (GA and SFS)-NB methodology
| S. No. | Methodology | Accuracy | Sensitivity | Specificity | Precision |
| 1 | GA and NB | 86.8116 | 85.3 | 87.98 | 85.1 |
| 2 | SFS and NB | 86.087 | 84 | 87.72 | 84.6 |
Fig. 3 Experimental results of wrapper (GA and SFS)-NB methodology using accuracy, sensitivity, specificity, and precision parameters
Table 4 Experimental results of standard NB, AODE, DTNB, CFWNB, and NBTREE
| S. No. | Methodology | Accuracy | Sensitivity | Specificity | Precision |
| 1 | Standard NB | 84.7826 | 81.1 | 87.72 | 84.1 |
| 2 | AODE | 84.9275 | 80.5 | 88.51 | 84.9 |
| 3 | DTNB | 85.3623 | 91.5 | 80.41 | 78.9 |
| 4 | CFWNB | 85.5072 | 91.5 | 81 | 79.6 |
| 5 | NBTREE | 85.5072 | 84.8 | 83.78 | 84.8 |
Fig. 4 Experimental results of standard NB, AODE, DTNB, NBTREE, and CFWNB methodology using accuracy, sensitivity, specificity, and precision parameters
Fig. 5 Accuracy comparison of wrapper (GA and SFS)-NB methodology with standard NB, AODE, DTNB, NBTREE, and CFWNB
Figure 5 represents the accuracy comparison of SFS-NB and GA-NB with standard NB, AODE, CFWNB, DTNB, and NBTREE. The results show that the suggested methodology achieves better customer prediction accuracy than standard NB and the existing approaches AODE, CFWNB, DTNB, and NBTREE.
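To put the accuracy differences in concrete terms, the short Python sketch below converts the reported accuracy percentages into the number of correctly classified instances out of the 690 in the dataset. The percentages are those reported in Tables 3 and 4; the conversion assumes each accuracy was measured over all 690 instances, which the reported values are consistent with but which is an inference made here. The names `REPORTED`, `implied_correct` and `COUNTS` are hypothetical.

```python
N_INSTANCES = 690  # size of the Australian credit dataset

# Accuracy (%) as reported in Tables 3 and 4.
REPORTED = {
    "GA-NB": 86.8116,
    "SFS-NB": 86.087,
    "CFWNB": 85.5072,
    "NBTREE": 85.5072,
    "DTNB": 85.3623,
    "AODE": 84.9275,
    "Standard NB": 84.7826,
}


def implied_correct(accuracy_percent, n=N_INSTANCES):
    """Number of correctly classified instances implied by an accuracy value."""
    return round(accuracy_percent / 100 * n)


COUNTS = {name: implied_correct(acc) for name, acc in REPORTED.items()}
```

Under that assumption, GA-NB classifies 599 instances correctly and standard NB 585, so the gain from the best wrapper over the baseline corresponds to 14 additional correct predictions.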
5.5 Result and Discussion
The experimental outcome is represented in Tables 3 and 4. Table 3 displays the results of the wrapper-NB approach. In the wrapper approach, two different methodologies are applied to select the best feature subset to model with the NB algorithm. The first methodology uses a genetic algorithm to select the best feature subset, and the variable subset captured is then modeled with NB. This GA-NB methodology achieves an accuracy of 86.8116, which is higher than the accuracy obtained from standard NB and the existing approaches represented in Table 4. The second methodology uses the SFS algorithm to choose the best feature subset, and the variable subset obtained is then modeled with NB. This SFS-NB methodology achieves an accuracy of 86.087, which is also higher than the accuracy obtained from standard NB and the existing approaches represented in Table 4. Table 4 represents the results of NBTREE, CFWNB, DTNB, AODE, and NB using the accuracy, sensitivity, specificity, and precision parameters. From the comparison, we can see that wrapper-NB achieves higher prediction accuracy than standard NB and the existing approaches. From the experimental results, we can conclude that the proposed wrapper-NB performs better customer analysis, which helps to gain a better understanding of customer behavior patterns compared to the existing approaches.
5.6 Research Findings
1. The proposed methodology chooses the best feature subset and enhances the customer analysis efficiently.
2. However, the wrapper methodology requires more time and computation to choose the feature subset.
3. Of the two wrapper approaches, the genetic algorithm chooses a better variable subset than sequential forward selection.
4. In wrapper approaches, different induction algorithms can be applied to check how the selected variable subset varies with the induction algorithm.
5. Compared to other variable selection approaches, the wrapper approach performs best in choosing the optimal variable subset.
6 Conclusion
Analyzing consumer behavior patterns efficiently helps enterprises to interpret customer satisfaction and to develop effective decision planning that enhances the consumer base and customer retention. To understand customer behavior patterns in depth, the NB approach is used in this work. However, due to the presence of uncertain variables in the customer dataset, the performance of NB is degraded, since this violates the NB assumption about the variables in the dataset. To solve this issue, attribute selection is applied to select the most appropriate attribute subset to model with the NB algorithm. In this research, two different wrapper approaches are applied, and the variable subsets obtained are modeled with the NB algorithm. To validate the efficiency of the suggested methodology, the experimental procedure is carried out using the Australian credit (AC) dataset. The experimental results obtained are projected using validity scores and compared with existing approaches, namely AODE, DTNB, CFWNB, NBTREE, and standard Naive Bayes. The result analysis shows that the proposed methodology achieves accuracies of 86.8116 and 86.087, which are higher than those of the existing approaches. We can therefore conclude that the proposed wrapper-NB performs better customer prediction than the existing approaches. As future work, different wrapper and filter approaches can be evaluated on different customer datasets.
References
1. R. Sıva Subramanıan, D. Prabha, A survey on customer relationship management, in 4th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, 2017, pp. 1–5. Electronic ISBN: 978-1-5090-4559-4. https://doi.org/10.1109/ICACCS.2017.8014601
2. D. Chiguvi, P.T. Guruwo, Impact of customer satisfaction on customer loyalty in the banking sector. Int. J. Sci. Eng. Res. (IJSER), 55–63 (2017)
3. U.N. Dulhare, Prediction system for heart disease using Naive Bayes and particle swarm optimization. Biomed. Res. 29(12) (2018)
4. L.N. Sanchez-Pinto, L.R. Venable, J. Fahrenbach, M.M. Churpek, Comparison of variable selection methods for clinical predictive modeling. Int. J. Med. Inform. 116, 10–17 (2018). https://doi.org/10.1016/j.ijmedinf.2018.05.006
5. M. Granik, V. Mesyura, Fake news detection using Naive Bayes classifier, in 2017 IEEE First Ukraine Conference on Electrical and Computer Engineering (UKRCON), Kyiv, Ukraine, 2017, pp. 900–903
6. N.S. Harzevili, S.H. Alizadeh, Mixture of latent multinomial naive Bayes classifier. Appl. Soft Comput. 69, 516–527 (2018)
7. Y. Long, L. Wang, M. Sun, Structure extension of tree-augmented Naive Bayes. Entropy 21(8), 721 (2019)
8. L. Yu, L. Jiang, D. Wang, L. Zhang, Attribute value weighted average of one-dependence estimators. Entropy 19(9) (2017)
9. R.A.I. Alhayali, M.A. Ahmed, Y.M. Mohialden, A.H. Ali, Efficient method for breast cancer classification based on ensemble hoffeding tree and Naïve Bayes. Indones. J. Electr. Eng. Comput. Sci. 18(2), 1074–1080 (2020)
10. A. Gupta, L. Kumar, R. Jain, P. Nagrath, Heart disease prediction using classification (Naive Bayes), in Proceedings of First International Conference on Computing, Communications, and Cyber-Security (IC4S 2019). Lecture Notes in Networks and Systems, vol. 121, ed. by P. Singh, W. Pawłowski, S. Tanwar, N. Kumar, J. Rodrigues, M. Obaidat (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-15-3369-3_42
11. K.J. Dsouza, Z.A. Ansari, Big data science in building medical data classifier using Naïve Bayes model, in 2018 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), 2018, pp. 76–80. https://doi.org/10.1109/ccem.2018.00020
12. J.C. Cortizo, I. Giraldez, M.C. Gaya, Wrapping the Naive Bayes classifier to relax the effect of dependences, in Intelligent Data Engineering and Automated Learning - IDEAL 2007. Lecture Notes in Computer Science, vol. 4881, ed. by H. Yin, P. Tino, E. Corchado, W. Byrne, X. Yao (Springer, Berlin, Heidelberg, 2007)
13. R. Panthong, A. Srivihok, Wrapper feature subset selection for dimension reduction based on ensemble learning algorithm. Procedia Comput. Sci. 72, 162–169 (2015). https://doi.org/10.1016/j.procs.2015.12.117
14. M.-L. Zhang, J.M. Peña, V. Robles, Feature selection for multi-label naive Bayes classification. Inf. Sci. 179(19), 3218–3229 (2009). https://doi.org/10.1016/j.ins.2009.06.010
15. C.B. Christalin Latha, S.C. Jeeva, Improving the accuracy of prediction of heart disease risk based on ensemble classification techniques. Inform. Med. Unlocked, 100203 (2019). https://doi.org/10.1016/j.imu.2019.100203
16. R. Sıva Subramanıan, D. Prabha, Prediction of customer behaviour analysis using classification algorithms. AIP Conf. Proc. 1952, 020098 (2018). https://doi.org/10.1063/1.5032060. ISBN: 978-0-7354-1647-5
17. D. Prabha, R. Sıva Subramanıan, S. Balakrishnan, M. Karpagam, Performance evaluation of Naive Bayes classifier with and without filter based feature selection. Int. J. Innov. Technol. Expl. Eng. (IJITEE) 8(10), 2154–2158 (2019). ISSN: 2278-3075. https://doi.org/10.35940/ijitee.J9376.0881019
18. R. Sıva Subramanıan, D. Prabha, J. Aswini, B. Maheswari, M. Anita, Alleviating NB conditional independence using multi-stage variable selection (MSVS): banking customer dataset application. J. Phys.: Conf. Ser. 1767, 012002 (2021). https://doi.org/10.1088/1742-6596/1767/1/012002
19. M.A. Fahmiin, T.H. Lim, Evaluating the effectiveness of Wrapper feature selection methods with artificial neural network classifier for diabetes prediction, in Testbeds and Research Infrastructures for the Development of Networks and Communications. TridentCom 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol. 309, ed. by H. Gao, K. Li, X. Yang, Y. Yin (Springer, Cham, 2020). https://doi.org/10.1007/978-3-030-43215-7_1
20. R. Siva Subramanian, D. Prabha, Optimizing Naive Bayes probability estimation in customer analysis using hybrid variable selection, in Computer Networks and Inventive Communication Technologies. Lecture Notes on Data Engineering and Communications Technologies, vol. 58, ed. by S. Smys, R. Palanisamy, Á. Rocha, G.N. Beligiannis (Springer, Singapore, 2021)
21. N. Metawa, M.K. Hassan, M. Elhoseny, Genetic algorithm based model for optimizing bank lending decisions. Expert Syst. Appl. 80, 75–82 (2017). https://doi.org/10.1016/j.eswa.2017.03.021
22. G. Webb, J. Boughton, Z. Wang, Not so Naive Bayes: aggregating one-dependence estimators. Mach. Learn. 58(1), 5–24 (2005)
23. L. Jiang, L. Zhang, C. Li, J. Wu, A correlation-based feature weighting filter for Naive Bayes. IEEE Trans. Knowl. Data Eng. (2018). https://doi.org/10.1109/TKDE.2018.2836440
24. M. Hall, E. Frank, Combining Naive Bayes and decision tables, in Proceedings of the 21st Florida Artificial Intelligence Society Conference (FLAIRS), 2008, pp. 318–319
25. Y. Deng, Y. Wei, Y. Li, Credit risk evaluation based on data mining and integrated feature selection, in 2020 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Macau, China, 2020, pp. 1–4. https://doi.org/10.1109/ICSPCC50002.2020.9259483
26. R. Siva Subramanian, D. Prabha, Customer behavior analysis using Naive Bayes with bagging homogeneous feature selection approach. J. Ambient Intell. Hum. Comput. 12, 5105–5116 (2021)
ECG Acquisition Analysis on Smartphone Through Bluetooth and Wireless Communication Renuka Vijay Kapse and Alka S. Barhatte
Abstract Health monitoring and its related technologies are an appealing area of research. The electrocardiogram (ECG) has long been the mainstream measurement method for evaluating and analysing cardiovascular diseases. Heart health is important for everyone: the heart needs to be monitored regularly, and early warning can prevent permanent heart damage. Moreover, heart diseases are the leading cause of death worldwide. Hence, this work presents the design of a mini wearable ECG system interfaced with an Android application. The application displays and analyses the ECG signal obtained from the wearable ECG system. The ECG signals are sent to the Android application via a Bluetooth device and Wi-Fi, and the system automatically alerts the user through a notification.
Keywords ECG · Bluetooth communication · Wi-Fi · Android application
R. V. Kapse (B) · A. S. Barhatte MIT WPU, Kothrud, Pune, India A. S. Barhatte e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_3
1 Introduction
Heart health is one of the most significant parameters of the human body. An ECG system is an ideal instrument for patient monitoring and management, and daily ECG monitoring enables the detection of cardiovascular diseases. Advances in mobile communication technology make it possible to build compact and easily accessible smartphone-based ECG devices. The ECG signal can be used to analyse the condition of the heart. The electrodes detect the small electrical changes that result from depolarization of the heart muscle followed by repolarization during every heartbeat. Abnormal ECG patterns occur in various cardiovascular abnormalities and heart rhythm disturbances. The proposed system has great extensibility and can easily incorporate other physiological signals to suit different types of telehealth scenarios. The device sends the ECG signals to both the user and the doctor. Java-based software running on the smartphone performs tasks such as raw ECG data compression, decompression, and encryption. The system can automatically alert the user through a notification. Accordingly, we propose to integrate wireless ECG transmission and heart rate monitoring in the developed Android application. The objective was to develop a smartphone-based ECG acquisition system and to analyse the ECG signal for normal and abnormal ECG.
Fig. 1 Block diagram of proposed system (blocks: power supply, NodeMCU, Wi-Fi, ECG module, Bluetooth HC-05, smartphone device)
1.1 Electrocardiogram
An ECG is a digital recording of the electrical signals in the heart. During every heartbeat, a healthy heart has an orderly progression of depolarization that begins with pacemaker cells in the sinoatrial node, spreads throughout the atrium, and passes through the atrioventricular node down into the bundle of His and into the Purkinje fibres, spreading down and to the left throughout the ventricles. The waveform given below shows the normal electrical activity of the heart (Fig. 2).
2 Literature Review
According to a publication of the International Telecommunication Union in January 2011, the number of mobile phone users worldwide has exceeded 5 billion [1]. Looking at recent trends in biomedical applications, significant progress can be noted in health monitoring devices [2]. Advances in mobile communication technology make it possible to build compact and easily accessible smartphone-based devices that provide continuous monitoring of human physiological parameters. Two companies, Polar and Zephyr, have developed Android-compatible Bluetooth heart rate belts. The ECG signal and the calculated heart rate are transmitted to the Android phone via Bluetooth [1]. Heart rate variability (HRV) is known to be one of the representative ECG-derived features that are useful for diverse pervasive healthcare applications. Sinabro is an unobtrusive mobile ECG monitoring system that monitors a user's ECG opportunistically during daily smartphone use [3]. To sense ECG signals unobtrusively, its authors designed an ECG sensor that is embedded in a smartphone case.
Fig. 2 Normal ECG waveform. Source Google
The Sinabro system also includes middleware for supporting smartphone healthcare applications based on ECG-derived features [3]. Healthcare monitoring using wearable sensor systems is a rapidly expanding area of research that promises an increase in the availability of health data. Wearable health monitoring systems should be comfortable and able to achieve high data quality. Smart textiles, with electronic components integrated directly into fabrics, offer a way to embed sensors into clothing seamlessly to serve these purposes. Many microcontrollers are used in ECG monitors, from 8-bit to 32-bit microcontrollers, as well as DSPs. One of the limiting factors on the functionality of a telemetry system is the power supply, because of the limited battery capacity that can be used. The power consumption of the patient identification circuit was minimized in the hardware design by selecting all electronic components on the basis of their supply current rating. The supply current of the amplifier used was 20 mA for the cardiac circuit [2].
3 System Description
The system comprises the user's ECG acquisition device and an Android application on a smartphone. The ECG data collected by the sensor is sent to the phone over robust wireless channels. The Android application on the phone then analyses the data further, and the results are quickly made available in the application. The signal is then sent to a physician by sharing the user's credentials. The doctor has access for one week in order to make a clinical decision about the user. The system gives a doctor immediate, 24-hour access to review the transmitted data and make clinical decisions regarding the patient, so that the doctor can help the patient in case of an emergency.
3.1 Selected Technology
The proposed system uses a Bluetooth module and the Internet as the transmission channels between the acquisition module and the smartphone. This paper presents a portable system in which the heart's electrical activity is transmitted via Bluetooth and the Internet to a mobile device. This type of system is desirable because it does not depend on a clear line of sight between the devices. The HC-05 is a very convenient module that adds full-duplex wireless functionality to a project. It can be used to communicate between two microcontrollers or to connect to any device with Bluetooth functionality. The Bluetooth module communicates over USART at a baud rate of 9600, so it is easy to interface with any microcontroller that supports USART. Power it with +5 V, and connect the RX pin of the Bluetooth module to the TX pin of the NodeMCU and vice versa.
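As a rough illustration of what a 9600-baud serial link implies, the sketch below estimates how many ECG samples per second the channel can carry and splits a received byte stream into samples. The paper does not specify the wire format, so the 8N1 framing (10 bits on the wire per byte), the one-ASCII-integer-per-line format and both function names are assumptions made only for this example.

```python
from typing import List, Tuple


def max_samples_per_second(baud: int, bytes_per_sample: int,
                           bits_per_byte: int = 10) -> float:
    """Upper bound on the sample rate a UART link can carry.

    bits_per_byte=10 assumes 8N1 framing (1 start + 8 data + 1 stop bit).
    """
    return baud / bits_per_byte / bytes_per_sample


def parse_samples(buffer: bytes) -> Tuple[List[int], bytes]:
    """Split received bytes into complete samples and an unfinished tail.

    Assumed (hypothetical) format: one ASCII integer per line.
    """
    *complete, tail = buffer.split(b"\n")
    samples = []
    for line in complete:
        text = line.strip()
        if text.isdigit():  # ignore empty or corrupted lines
            samples.append(int(text))
    return samples, tail
```

Under these assumptions, a five-byte sample such as `1023\n` limits the link to 192 samples per second, which bounds the ECG sampling rate that such a format could sustain. The tail returned by `parse_samples` would be prepended to the next chunk read from the serial port.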
3.2 Android Application
We used MIT App Inventor to create an Android application for ECG monitoring. It is a Web-based integrated development environment originally provided by Google. It allows beginners to create application software for two operating systems (OS): Android and iOS. The Android application can be regarded as high-level programming for the smartphone. The application consists of a Bluetooth module and data processing. In its original version, the blocks editor ran in a separate Java process to provide the visual block programming language. The user may use an on-computer emulator available for Windows, macOS, and Linux. The application works as follows:
• The mobile device's Bluetooth adapter connects to the acquisition device's Bluetooth module.
• The acquisition device then sends the data to the smartphone via Bluetooth.
• The received data is then processed, filtered, and plotted as an ECG waveform (Figs. 3, 4 and 5).
For the Android application, Blynk is a well-suited platform. Because Blynk works over the Internet, the hardware chosen must be able to connect to the Internet. Some boards, such as the Arduino Uno, need an Ethernet or Wi-Fi shield to communicate; others are already Internet-enabled, such as the ESP8266, the Raspberry Pi with a Wi-Fi dongle, or the SparkFun Blynk Board. Even without a shield, a board can be connected over USB to a laptop or desktop. The hardware can be connected to the cloud using Wi-Fi, BLE, Ethernet, USB, etc. Blynk can send e-mails, tweets, and push notifications. In this experiment, we used a Samsung Galaxy J6 smartphone with 3 GB RAM and 32 GB storage, running Android 8.0 Oreo.
Fig. 3 Block diagram of designing ECG heart rate monitoring app with the help of MIT app inventor
Fig. 4 View of application for Bluetooth communication
Fig. 5 View of application in Blynk
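The processing step listed in Sect. 3.2 (received samples are filtered before being plotted) is built with App Inventor blocks in the paper and is not shown in detail. The following is only a minimal sketch of one plausible pipeline: a trailing moving average for smoothing, a threshold-based peak picker, and a heart rate estimate from the peak spacing. The function names, the window length, the threshold and the refractory period are all illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional


def moving_average(samples: List[float], window: int = 5) -> List[float]:
    """Trailing moving average; uses a shorter window for the first samples."""
    out = []
    running = 0.0
    for i, x in enumerate(samples):
        running += x
        if i >= window:
            running -= samples[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


def detect_peaks(samples: List[float], threshold: float,
                 refractory: int) -> List[int]:
    """Indices of local maxima above `threshold`, at least `refractory` apart."""
    peaks: List[int] = []
    for i in range(1, len(samples) - 1):
        is_peak = (samples[i] > threshold
                   and samples[i] >= samples[i - 1]
                   and samples[i] > samples[i + 1])
        if is_peak and (not peaks or i - peaks[-1] >= refractory):
            peaks.append(i)
    return peaks


def heart_rate_bpm(peaks: List[int], fs: float) -> Optional[float]:
    """Mean heart rate from peak indices sampled at `fs` Hz (None if < 2 peaks)."""
    if len(peaks) < 2:
        return None
    beats = len(peaks) - 1
    return 60.0 * fs * beats / (peaks[-1] - peaks[0])
```

A real ECG would need a more robust detector, since baseline wander and motion artefacts defeat a fixed threshold. The sketch only shows how smoothing, peak picking and rate estimation fit together.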
4 Hardware Description
The proposed system uses three-lead electrodes for continuous heart monitoring. The device uses three electrodes, and the acquired signal represents the first Einthoven bipolar lead. A circuit board was built for ECG signal processing and wireless data transmission via the Bluetooth module. The circuit has a class 1 Bluetooth module with a high-power transmitter (100 m). The actual range may be limited to about 100 ft owing to the internal receiving antenna or the type of user device connected to it. Once the device is connected over Bluetooth, it is ready to receive data. Some special character commands are defined to operate the unit. Commands take effect immediately; however, they do not survive a power cycle and must be re-issued if the unit is shut down or rebooted remotely. By convention, lead I has the positive electrode on the left arm and the negative electrode on the right arm, and therefore measures the potential difference between the two arms. In this and the other two limb leads, an electrode on the right leg serves as a reference electrode for recording purposes. An Android smartphone with wireless communication capability was used to receive the real-time ECG data from the acquisition module, display the ECG plot on the screen, and send it to a physician by sharing the user's credentials (Fig. 6).
We used the AD8232 ECG sensor to extract, amplify, and filter biopotential signals under noisy conditions. The AD8232 has an operating voltage of 3.3 V, a CMRR of 80 dB, and a three-lead configuration; its fast restore feature improves filter settling. The AD8232 ECG module can be easily interfaced with any microcontroller unit. It requires one analog pin for reading the sensor output and three digital pins for control-related tasks. There is also an LED indicator that pulses to the rhythm of the heartbeat. It is recommended to attach the sensor pads to the leads before placing them on the body. The closer the pads are to the heart, the better the measurement (Fig. 7).
We used the NodeMCU as the microcontroller in the proposed system. It has 4 MB of flash memory, 64 KB of SRAM, and a clock speed of 80 MHz. Related Espressif modules include the ESP8266, ESP-12E and ESP32, and alternative platforms to the NodeMCU include Arduino, Raspberry Pi, AVR development boards, Intel Edison, etc. The NodeMCU can be powered through the micro-USB jack or the VIN pin. It supports UART, SPI, and I2C interfaces. Its applications range from prototyping IoT devices to network projects. It has analog (A0) and digital (D0–D8) pins (Fig. 8).
Fig. 6 Proper placement of electrodes. Source Google
Fig. 7 AD8232 ECG sensor with electrodes. Source Google
Fig. 8 NodeMCU
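To make the analog and digital pin roles described above more concrete, the sketch below converts a raw reading from the analog pin into a voltage and interprets two of the AD8232 digital outputs as a leads-off check. It assumes a 10-bit converter with a 3.3 V full scale on the NodeMCU A0 pin and assumes that the leads-off outputs (commonly labelled LO+ and LO-) read high when an electrode is detached; the constants and function names are hypothetical and should be checked against the actual board and datasheet.

```python
ADC_MAX = 1023   # assumed 10-bit ADC on the NodeMCU A0 pin
V_REF = 3.3      # assumed full-scale input voltage in volts


def adc_to_volts(count: int) -> float:
    """Convert a raw ADC count (0..ADC_MAX) to volts."""
    if not 0 <= count <= ADC_MAX:
        raise ValueError(f"ADC count out of range: {count}")
    return count * V_REF / ADC_MAX


def electrodes_attached(lo_plus: int, lo_minus: int) -> bool:
    """True when neither leads-off output is high (assumed active-high)."""
    return lo_plus == 0 and lo_minus == 0
```

A sample would typically be discarded, or flagged in the plot, whenever `electrodes_attached` returns False, so that a detached pad is not mistaken for an abnormal ECG.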
The NodeMCU can also be programmed with the Arduino IDE, which is considerably easier for Arduino developers than learning a new language. In the Arduino IDE, we write and compile the code; the ESP8266 toolchain in the background produces a binary firmware file from the code we wrote. When we upload it to the NodeMCU, it overwrites the existing NodeMCU firmware with the newly generated binary. In effect, it writes the complete firmware.
5 Result
We developed an Android application for ECG monitoring. The aim of the project was to develop an ECG monitoring device and to analyse the ECG for normal and abnormal signals, and this was achieved. The system is based on an ECG sensor, a microcontroller, an ECG module, and Android technology, which has extensive use in the field of medicine. The system has the benefits of low cost and low power consumption. The Android application displays the heart rate, the ECG waveform and the recognized heartbeat type, and reports it through notifications such as “low heartrate” or “high heartrate”; the notification is sent repeatedly at a specific time interval. The ECG waveform is displayed on screen and processed in 5 s intervals. The user can share his credentials with multiple users at the same time, including physicians. To send the ECG signals to the smartphone via the Bluetooth module, a Bluetooth connection interface was created. After tapping the button on the smartphone, three function keys appear on the screen. Touching the “SCAN” key makes the smartphone search for nearby Bluetooth devices. When a device is selected, the smartphone connects to it over Bluetooth. After the Bluetooth connection succeeds, the LED glows, and the received ECG waveforms are shown on the smartphone's screen (Table 1; Figs. 9, 10, 11 and 12).
Flowchart of the proposed system is as follows:
1. Start
2. Circuit power ON
3. Bluetooth searching for other device (No: repeat the search)
4. Pairing devices
5. Turn ON hotspot
6. Share data of ECG to smartphone
7. End
Table 1 Normal ECG parameters
Parameter       Duration (s)
P wave          0.08
PR interval     0.12–0.2
PR segment      0.05–0.12
QRS complex     0.08–0.1
ST segment      0.08–0.12
QT interval     0.36–0.46
T wave          0.16 < 0.5
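The normal ranges in Table 1 lend themselves to a simple lookup when separating normal from abnormal beats. The sketch below encodes the rows that are given as ranges and flags any measured interval that falls outside them. How the intervals are measured from the waveform is not covered here, and the dictionary and function names are illustrative assumptions rather than part of the described application.

```python
from typing import Dict, List, Tuple

# Normal ranges in seconds, taken from the range-valued rows of Table 1.
NORMAL_RANGES_S: Dict[str, Tuple[float, float]] = {
    "PR interval": (0.12, 0.20),
    "PR segment": (0.05, 0.12),
    "QRS complex": (0.08, 0.10),
    "ST segment": (0.08, 0.12),
    "QT interval": (0.36, 0.46),
}


def out_of_range(measured: Dict[str, float]) -> List[str]:
    """Names of measured intervals that fall outside the Table 1 ranges.

    Parameters without a known range are skipped rather than flagged.
    """
    flagged = []
    for name, value in measured.items():
        if name not in NORMAL_RANGES_S:
            continue
        low, high = NORMAL_RANGES_S[name]
        if not low <= value <= high:
            flagged.append(name)
    return flagged
```

For example, a beat with a PR interval of 0.16 s and a QRS complex of 0.14 s would be flagged only for the QRS complex.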
Fig. 9 Select the proper device to share the ECG data
Fig. 10 ECG signals via Bluetooth communication
Fig. 11 a, b ECG data on Blynk application
Fig. 12 Notification on Blynk application
6 Conclusion
The main aim of the project was to develop a smartphone-based ECG acquisition device and to analyse the ECG signal for normal and abnormal ECG. The analysis of ECG signals using the ECG acquisition device was carried out successfully over Wi-Fi and Bluetooth communication. A smartphone connected to suitable sensing devices can be used successfully for acquiring, processing and visualizing biosignals, adding flexibility and portability. The system has advantages such as low cost and low power consumption. It can be used in both rural and urban areas, and it is comfortable to use during daily activities. It detects irregularities in the heart rhythm and reports them through a notification.
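The notification behaviour described in the results (messages such as “low heartrate” or “high heartrate”, repeated at a specific interval) can be sketched as a threshold check combined with a simple rate limiter. The paper does not state the thresholds or the repeat interval, so the 60 and 100 beats-per-minute limits and the `Notifier` class below are assumptions chosen only for illustration.

```python
from typing import Optional

LOW_BPM = 60.0    # assumed lower limit; not stated in the paper
HIGH_BPM = 100.0  # assumed upper limit; not stated in the paper


def classify(bpm: float) -> Optional[str]:
    """Return the notification text for an abnormal rate, or None if normal."""
    if bpm < LOW_BPM:
        return "low heartrate"
    if bpm > HIGH_BPM:
        return "high heartrate"
    return None


class Notifier:
    """Emit at most one notification per `min_interval_s` seconds."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self._last_sent: Optional[float] = None

    def update(self, bpm: float, now_s: float) -> Optional[str]:
        """Return a message to send now, or None if nothing should be sent."""
        message = classify(bpm)
        if message is None:
            return None
        if (self._last_sent is not None
                and now_s - self._last_sent < self.min_interval_s):
            return None  # still inside the quiet period
        self._last_sent = now_s
        return message
```

With a 30 s interval, a persistently low rate produces one message every 30 s rather than one per 5 s analysis window, which matches the repeated-at-an-interval behaviour described above.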
References
1. T.-H. Yen, C.-Y. Chang, S.-N. Yu, A portable real-time ECG recognition system based on smartphone, in Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Montréal, Québec, Canada, 2013
2. N. Belgacem, Bluetooth portable device and MATLAB-based GUI for ECG signal acquisition and analysis
3. S. Kwon, S. Kang, Y. Lee, C. Yoo, K. Park, Unobtrusive monitoring of ECG-derived features during daily smartphone use, in 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2014
4. S. Hadiyoso, K. Usman, A. Rizal, Arrhythmia Detection Based on ECG Signal Using Android Mobile for Athlete and Patient (Telkom University Bandung, Indonesia)
5. H. Gao, X. Duan, X. Guo, A. Huang, B. Jiao, Design and tests of a smartphones-based multilead ECG monitoring system, in 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2013
6. P.N. Gawale, A.N. Cheeran, N.G. Sharma, Android application for ambulant ECG monitoring. Int. J. Adv. Res. Comput. Commun. Eng. 3(5), 6465–6458 (2014)
7. R. Gamasu, ECG based integrated mobile tele medicine system for emergency health tribulation. Int. J. Bio-Sci. Bio-Technol. 6(1), 83–94 (2014). https://doi.org/10.14257/ijbsbt.2014.6.1.09
8. https://en.wikipedia.org/wiki/Electrocardiography
9. J.I.Z. Chen, L.T. Yeh, Data forwarding in wireless body area networks. J. Electron. 2(02), 80–87 (2020)
10. S. Shakya, L. Nepal, Computational enhancements of wearable healthcare devices on pervasive computing system. J. Ubiquitous Comput. Commun. Technol. (UCCT) 2(02), 98–108 (2020)
11. V. Suma, Wearable IoT based distributed framework for ubiquitous computing. J. Ubiquitous Comput. Commun. Technol. (UCCT) 3(01), 23–32 (2021)
12. J. Hariharakrishnan, N. Bhalaji, Adaptability analysis of 6LoWPAN and RPL for healthcare applications of Internet-of-Things. J. ISMAC 3(02), 69–81 (2021)
13. D.A. Bashar, Review on sustainable green Internet of Things and its application. J. Sustain. Wirel. Syst. 1(4), 256–264 (2020)
14. https://docs.blynk.cc
15. M. Aljuaid, Q. Marashly, J. Aldanaf, Smartphone ECG monitoring system helps lower emergency room and clinic visits in post-atrial fibrillation ablation patients. Clin. Med. Insights Cardiol. 14 (2020)
16. M. Shabaan, K. Arshad, M. Yaqub, F. Jinchao, Smartphone based assessment of cardiovascular diseases using ECG and PPG analysis. BMC Med. Inform. Decis. Mak. 20, 177 (2020)
17. M. Li, W. Ziong, Y. Li, Wearable measurement of ECG signals based on smart clothing. Int. J. Telemed. Appl. (2020)
18. R. Munoz, R. Olivares, M.A.G. Santos, Online Heart Monitoring System on Internet of Health Things Environments (Elsevier, 2020)
Enhanced Security of User Authentication on Doctor E-Appointment System Md Arif Hassan, Monirul Islam Pavel, Dewan Ahmed Muhtasim, S. M. Kamal Hussain Shahi, and Farzana Iasmin Rumpa
Abstract The introduction of e-services has shown that an e-commerce site can offer individuals an effective way to make bookings and reservations. Internet booking systems can be created for bus stations, airports, hotels, cinemas and clinics, all of which take reservations. In this paper, we propose an online appointment process for a healthcare environment. An Internet scheduling process enables people to book their appointments online securely and conveniently. Compared with traditional queuing, a Web-based appointment system has the potential to significantly improve patient satisfaction with registration while also reducing total waiting time. Existing research is based on single-layer authentication, which has certain drawbacks, and existing systems also suffer from poorly executed and inadequate time schedules between patients and staff. The proposed technique addresses the limitations of existing methods and improves user authentication within a healthcare system. Our proposed method provides an effective and secure outpatient appointment queuing model for appropriate appointment scheduling, reducing patient wait times, doctors' idle time, and overtime, while also increasing outpatient satisfaction. We used Hypertext Markup Language, Cascading Style Sheets, Bootstrap, and JavaScript for the front end, and PHP with a MySQL database for the back end. The developed system was successfully tested on a computer system with an Intel Core i7 CPU at 3.40 GHz and 6 GB RAM. The implemented work demonstrated safer and improved outcomes compared with other existing works.
Keywords Online appointment · Authentication · Scheduling · Health care · Payment method
M. A. Hassan (B) · F. I. Rumpa Faculty of Information Science and Technology, Centre for Cyber Security, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
M. I. Pavel · D. A. Muhtasim Faculty of Information Science and Technology, Centre for Artificial Intelligence Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
S. M. Kamal Hussain Shahi Faculty of Computer Science and Engineering, Leading University, 3100 Sylhet, Bangladesh
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_4
1 Introduction
The healthcare sector is a fundamental and central component of human life, and medical treatment is a basic need of every human being. In numerous clinics, patients wait for a very long time in the healthcare centre before a doctor attends to them [1, 2]. Nowadays, everyone wants a facility that reduces their effort, takes minimal time and offers a way to conduct their business more easily. Any mistake committed in medical services may result in loss or impairment of life [3]. Recently, information and communication technology has been used extensively to enhance the various services and operations of the healthcare system. Patient appointment with the doctor is among the medical services that have been automated. Medical providers are driven to reduce operating costs while enhancing the quality of service. An online appointment scheduling scheme, shown in Fig. 1, is a process whereby an individual can access the physician's website and make appointments through the Internet application [4].
Fig. 1 Various types of patient in appointment system [5] (labels: pre-scheduled, scheduled, open access, walk-in, emergency and regular patients)
Scheduling appointments online has a number of advantages. These include time savings, as staff members spend substantially less time with clients than they would with paper-based visits, during which patients are required to fill in a variety of forms. When a Web application is used for arranging patient appointments, no time is wasted in waiting. Additionally, the automatic appointment reminder tool built into the Internet appointment scheduling system saves time by removing the need for hospital employees to call and SMS patients to remind them of their appointments. In comparison with physically setting appointments at hospitals, which can be done only during business hours, the Internet-based appointment system allows individuals to schedule appointments conveniently around the clock. When an individual schedules an appointment via a Web application, he or she can request an appointment, discuss with the centre whether the appointment is urgent and select a preferred time window from all available time slots. Appointments scheduled through a paper-based system require those present at the clinic to complete registration forms and return them to the registration desk, after which they are assigned to their preferred doctor. Cards are distributed in a first come, first served (FCFS) fashion: the person who arrives first is served first, and the one who arrives last waits in line. This kind of appointment scheduling has a number of limitations, such as people having to fill in appointment forms on arrival at the clinic. There is no way to register from home or any other location; as a result, patients spend a large amount of time waiting in queues and must keep to the appointment dates assigned by the registration desk [6]. User authentication is a very important issue in the online environment [7, 8]. In addition, there is no mechanism for notifying patients when visits are postponed. Furthermore, a paper-based hospital appointment system is difficult to manage, hence the demand for an alternative method. The key motivation of this project is the design and implementation of a medical centre's website that offers an effective and inexpensive approach to appointments and support for the work involved: clinic and physician schedule management, queue management, and export of patient appointments. It is necessary that patients have quick and simple access to the services of the clinic and that all appointments are managed. In the meantime, doctors also need to be able to view all upcoming appointments conveniently and to produce results for patients. In addition, the patient can provide more information to the doctor, making the doctor aware of the issue and giving the doctor the opportunity to prepare the essential information before the patient's arrival. To this end, the objective is to develop a two-layer authentication system so that users' data do not leak, to enhance the security of users' sensitive data, and to create a fully functional system that resolves the limitations of existing studies.
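The introduction does not say what the two authentication layers are, so the following is only a hedged sketch of one common arrangement: a salted password hash as the first layer and a counter-based one-time code as the second. The one-time code follows the HOTP construction of RFC 4226, which is swapped in here purely as an illustration, and the sketch is written in Python rather than the PHP used by the authors. All function names and parameters are assumptions.

```python
import hashlib
import hmac
import struct


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """First layer: derive a salted password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Second layer: counter-based one-time code (RFC 4226 construction)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = int.from_bytes(digest[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def authenticate(password: str, salt: bytes, stored_hash: bytes,
                 secret: bytes, counter: int, submitted_code: str) -> bool:
    """Accept the login only if both layers succeed (constant-time compares)."""
    first = hmac.compare_digest(hash_password(password, salt), stored_hash)
    expected = hotp(secret, counter).encode("utf-8")
    second = hmac.compare_digest(expected, submitted_code.encode("utf-8"))
    return first and second
```

Both checks are evaluated before the result is returned, so a failed login does not reveal which layer was wrong. In a deployed system the counter would be stored per user and advanced after each successful login.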
2 Related Works
In today's era, there are two sorts of appointment systems in the world: paper-based and Internet-based. In the paper-based system, which covers the registration process and appointment scheduling, patient files and general health data are kept and moved to a physical storage facility by nurses or a doctor's office administrator. The risk of misplacing records is evident, and the process is time-consuming and inefficient. Registration requires people to fill in forms and present them at the registration desk, or to hand in their identification card or appointment card. When patient files cannot be found, or patients are unregistered, many patient appointments are postponed, sometimes more than once. This problem is a challenge for healthcare organisations all over the world. To overcome it, many researchers have pursued different types of online appointment systems. Online appointment scheduling is one of the advantages that online clinical activities and services bring to many healthcare establishments. An online scheduling system enables outpatients to register their details and to book and cancel their appointments online. This approach lowers patients' waiting time and improves the efficiency of doctors. Internet-based doctor appointment systems serve a number of purposes. In this section, we cover various recent works and their characteristics.
An Android-based healthcare application is proposed in [1]. The proposed methodology aims to make contact between a doctor and patients more efficient and quick, regardless of distance or location, using software that connects to a hospital-operated server and uses GPS and GSM communication. Comparable Android applications for smartphones have been used to improve the healthcare system in [2, 5, 9, 10]. Android is a widely used open-source operating system, based mainly on Linux, on which many useful mobile applications are built to simplify and speed up everyday tasks. The Android platform also includes an integrated database (SQLite) and Web services [2].
The Android platform allows connectivity between the application and the server through APIs, which makes it easy to use the platform's features and libraries for a doctor appointment application that is connected to a site on the server. An object-oriented approach to analysing, designing, and building an Android application, using Android Studio, a Google-backed integrated development environment (IDE), with Java as the programming language together with PHP and MySQL, is adopted in [16]. In the field of medicine, this research has played an important part in enabling patients to make doctor appointments in real time, and it also allows patients to go online to talk with other patients and doctors. An interactive mobile healthcare application architecture that allows physicians and patients to communicate is proposed in [17]. For better healthcare services and delivery, the proposed mobile application can provide optimal communication between different healthcare stakeholders such as patients, doctors, and pharmacists. An object-oriented analysis and design methodology (OOADM) was used in the software development process. The front end of the mobile application is built using HTML, CSS, and JS, while the back end is driven by AngularJS, PHP, and MySQL.
A second popular option is a Web-based system that is accessed through an Internet browser and adapts to any device on which it is viewed. Such systems are not tied to a particular platform and do not need to be downloaded or installed. Because of their responsiveness, they look and work like a mobile app. Web-based medical appointment services fall into two main categories: medical scheduling software as a service (SaaS) and proprietary Web-based scheduling systems. Medical scheduling SaaS has received significant attention in recent times [18]. A Web-based appointment system is presented in [3, 11]. This approach aims to increase the efficiency and quality of service by offering an online appointment system that minimises waiting time, using AngularJS for the front end, an Ajax framework for handling client–server requests, and SQLite3 and MySQL for the back end. An online appointment portal is proposed in [19]. Practitioners need to register in the system and are then able to see appointments made by users, including patients. The system is built using the prototype model. A MySQL database was used to build the system, with PHP and JavaScript as programming languages. This approach reduces the number of appointment calls and avoids the morning rush caused by urgent appointments. It also eliminates the need for extra receptionists, resulting in significant labour savings, and it saves users time by removing the need to schedule an appointment with the receptionist. This technology has the potential to transform the current daunting appointment procedure into one that is more efficient, productive, and cost-effective. A method for making the delivery of a Web-based appointment system more efficient and of better quality, in order to decrease waiting times, is proposed in [20]. That study uses AngularJS for the front end, an Ajax framework for managing client–server requests, and SQLite3 and MySQL for the back end to create a patient appointment and scheduling system. The “InstaSked” Web-based scheduling system was developed in [21] to decrease patient waiting time. It allows patients to make appointments, and patient list administrators, physicians, and managers to manage them. The system integrates the define, measure, analyse, design, and verify (DMADV) Six Sigma approach with business process management (BPM).
PHP, Apache, MySQL, and XAMPP are used for this system. A low-cost, convenient clinic appointment system that includes a Web application for clinic professionals and a hybrid mobile app for patients is proposed in [22]. Realtime queue monitoring and queue forecasting are linked using time series approximation to help patients avoid peak hours and enhance healthcare staff management in managing different patient flow peaks. Patients may reschedule appointments online and get appointment reminders through a mobile app using the Web-based appointment approach. A prototype is constructed and tested utilising numerous interviews to demonstrate the feasibility and effectiveness of the proposed method. The prototype is a rudimentary working model that demonstrates the feasibility of the proposed solutions for reducing wait times in other Malaysian clinics. MySQL WAMP server and PHP are used to construct the system. In [23], a basic application for booking an appointment using NodeJS as the back end and angular as the front end is described in terms of design, input, process, and output. The objectives of this article were to determine the advantages and obstacles to adopting Web-based medical scheduling that have been addressed in the literature, as well as unmet requirements in the present healthcare system. The suggested system in [24] is a smart appointment booking technique that allows patients to choose a doctor’s appointment online with ease. This system improves the
amount of profit by increasing patient confidence and satisfaction. User authentication is available in this system. This Web application is expected to solve the issue of scheduling and managing appointments based on the preferences of the user. A Web-based appointment booking system that can be accessed through online or mobile devices is proposed in [25]; it allows students and lecturers to be aware of their appointment time regardless of where they are. By connecting to the Internet, students and lecturers can easily access the system. Bookazor, a Web-based application for arranging appointments with parlours, hospitals, and architects within a specified geographic region, is proposed in [26]. The app is built on Ionic, a free SDK for creating hybrid mobile apps. CSS, HTML, and JavaScript are among the technologies used. Firebase is used to obtain the data for appointment scheduling; it offers features such as analytics, database, messaging, and crash reporting, all of which support application development. The back end of the proposed system is implemented with NodeJS. Another prominent approach in medical systems is cloud computing: the provision of on-demand computing services, from storage space to processing power, generally over the Internet and on a pay-as-you-go basis. For many applications, cloud computing is becoming the default choice: software suppliers increasingly sell their applications as online services rather than as standalone products as they move to a subscription model. A cloud computing-based online healthcare system is proposed in [12]. The proposed method provides facilities such as online booking of health checkups at discounted rates, information about preventive measures, access to different pathology laboratories, and a record of the user's overall health checkups.
A paradigm for a flexible cloud-based e-health management system is introduced in [27]. According to the authors, cloud flexibility enables an unlimited number of users to use the cloud simultaneously. There are two interfaces in the healthcare system: the patient interface and the doctor interface. The patient interface allows users to create, manage, control, and share their health information with doctors. Overall, the proposed system is reported to be effective: it raises profits, saves time, stores patient profiles, secures patient information, and schedules appointments and online consultations with the respective doctor. The existing techniques and their features are summarised in Table 1.
Table 1 Summary of related works

| Author and year | Methods and models | Technical feature | Performance metrics | Drawbacks | Evaluation approach |
|---|---|---|---|---|---|
| Kyambille et al. [13] | Mobile-based application scheduling system (MASS) | Username and password | Display of available specialists and slots, cancellation and postponement of slots, remote monitoring of patient's performance | Single-layer authentication system | MySQL with WAMP server and PHP |
| Sonwane et al. [10] | Android chat application | Username and password | Patient can chat with the doctor at any time and can share documents using the application | Limited usability and single-layer authentication | Android Studio and Firebase; the programming language is Java |
| Jagtap et al. [12] | Cloud-based Android application | Username and password | Online health checkups, tracking of the user's health checkup records, broadcasting the blood requirement directly on … | Limited usability and single-layer authentication | XAMPP, Android Studio 1.5.1 |
| Peter et al. [11] | Web-based appointment system | Username and password | Finding a hospital, cabin booking, choosing a suitable hospital, finding a doctor, emergency service calling, first aid information, alarm system for medication, etc. | Single-layer authentication system | HTML, PHP, XAMPP |
| Malik et al. [2] | Android application Mr. Doc | Username and password | Appointment, online application, Android, hospital, scheduling, tracking, health care | Patient cannot communicate with the doctor directly, and the doctor cannot register on the system; single-layer authentication system | Android Studio 2.1.1 and SDK plug-in, Android 6.0, HTML and PHP |
Table 1 (continued)

| Author and year | Methods and models | Technical feature | Performance metrics | Drawbacks | Evaluation approach |
|---|---|---|---|---|---|
| Ajayi et al. [16] | Android application | Username and password | Appointments, real time, online call | Single-layer authentication system | Java, PHP, and MySQL |
| Nwabueze et al. [17] | Android application | Username and password | Optimal communication | Single-layer authentication system | HTML, CSS, JS, AngularJS, PHP, and MySQL |
| Zhao et al. [19] | Web-based appointment system | Username and password | Significant labour savings | Single-layer authentication system | MySQL, PHP, and JavaScript |
| Akinode et al. [20] | Web-based appointment system | Username and password | Decrease waiting times | Single-layer authentication system | AngularJS, SQLite3, and MySQL |
| Mendoza et al. [21] | Web-based appointment system | Username and password | Patients make appointments; patient-list administrators, physicians, and managers manage appointments | Single-layer authentication system | PHP, Apache, MySQL, and XAMPP |
| Bee et al. [22] | Web-based appointment system | Username and password | Real-time queue monitoring and queue forecasting | Single-layer authentication system | MySQL WAMP server and PHP |
| Eysenbach [23] | Web-based appointment system | Username and password | Booking an appointment | Single-layer authentication system | NodeJS, Angular |
| Akshay et al. [26] | Cloud-based Web appointment system | Username and password | Specified geographic region | Single-layer authentication system | CSS, HTML, JavaScript, Firebase, NodeJS |
| Shaikh et al. [27] | Cloud-based Web appointment system | Username and password | Raises profits, decreases time, patient profile storage, patient information security, appointment scheduling | Single-layer authentication system | Java, Firebase Cloud Messaging |
[Figure 2 diagram. Step 1: user wants an appointment. Step 2: two appointment options. Step 3: appointment made through online portal registration. Step 4: payment selection. Step 5: appointment done. Diagram elements: Walk-in Booking, Internet Booking, Web Application Portal, IT Infrastructure, Online Booking Payment Option.]
Fig. 2 Proposed system architecture adopted from [14]
3 Methodology
3.1 Proposed System
Previous attempts to establish an online appointment process used different approaches and, despite their merits, each had flaws. We propose three-layer user authentication for the e-doctor appointment system. The proposed security solution is a three-layer authentication system involving a username, a password, and pattern recognition. In the proposed system, a patient can send a request to a doctor, who accepts it and saves the patient's details in the database. The doctor can select the patient, and the patient can select the doctor for treatment. The patient can chat with the doctor at any time and can share documents through the application. The suggested system comprises two phases: registration and login. Before using the service, users need to register their details in the registration phase. That information is then verified in the login phase. The materials and techniques used in the registration and authentication processes, and their process flow, are described in this section. The proposed system architecture is shown in Fig. 2.
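The registration and login phases with the three layers (username, password, pattern) can be illustrated with a minimal sketch. It is not the authors' PHP implementation: the class name `ThreeLayerAuth`, the in-memory dictionary, the representation of the pattern as a sequence of colour names, and the choice of salted PBKDF2 hashing are all assumptions made for illustration.

```python
import hashlib
import hmac
import os


def _derive(secret: str, salt: bytes) -> bytes:
    """Derive a salted hash so that no secret is stored in plain text."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 100_000)


class ThreeLayerAuth:
    """Hypothetical in-memory store; the prototype itself uses PHP and MySQL."""

    def __init__(self):
        self._users = {}

    def register(self, username, password, pattern):
        """Registration phase: keep salted hashes of the password and pattern."""
        if not username or username in self._users:
            return False
        salt = os.urandom(16)
        self._users[username] = {
            "salt": salt,
            "password": _derive(password, salt),
            "pattern": _derive("-".join(pattern), salt),
        }
        return True

    def login(self, username, password, pattern):
        """Login phase: username, password, and pattern must all match."""
        record = self._users.get(username)
        if record is None:  # layer 1: the username is unknown
            return False
        salt = record["salt"]
        # Layers 2 and 3 are compared in constant time.
        password_ok = hmac.compare_digest(
            record["password"], _derive(password, salt)
        )
        pattern_ok = hmac.compare_digest(
            record["pattern"], _derive("-".join(pattern), salt)
        )
        return password_ok and pattern_ok
```

In this sketch a login succeeds only when all three layers agree, so a correct password with a wrong pattern is rejected in the same way as a wrong password.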
3.2 System Architecture
Making medical appointments is a long-standing practice in health care. Appointment management is an administrative responsibility that ensures that a system, in particular the health system, works smoothly. The approach consists of two important components. One
for the patient and one for the administrator. The administrative component is used to create and update information about cabins and medical facilities. In addition, several other functions give patients a number of useful features that help them receive efficient health care. The patient component requires a patient to register his or her details; household members can register from the same number. Afterwards, the patient can log in to the patient component. A list of physicians with their areas of expertise is provided, and the patient can choose a particular physician. If the patient is far from home, he or she can send a photo of an injury, X-rays, and records to the healthcare professional, and the physician can then provide an emergency medical prescription. Many patients are afraid to talk to a medical professional about conditions such as HIV; with the chat facility, they can discuss their underlying problem. In the doctor component, the physician's information is provided from the Web server side, and every physician of the health service receives a username and password. The physician can log in to the doctor control panel from any device, where all patient requests are displayed. An alert is then sent to the doctor, and the conversation with the doctor begins if the request is approved. In a traditional medical system, the doctor writes a prescription with pen and paper, whereas in this system the doctor issues the prescription online. The proposed system flowchart is shown in Fig. 3. Database management allows the database manager to add, delete, modify, query, back up, and restore data. Add, delete, modify, and query are the standard operations of database administration.
They can effectively maintain the consistency of the database to meet actual needs. Data backup and restoration improve the protection of the system: even if data loss happens, the system can be restored quickly. The proposed design shown in Fig. 4 has three tiers (front end, middle tier, and back end). In the first tier, users access appointment information with a Web browser over the Internet. The second tier connects with the first tier and exchanges information using Web services. The second tier uses a Web server that handles HTTP requests exclusively for static content, such as static HTML pages, images, and documents, and responds to the user's request over HTTP, for example by returning an HTML page. If the HTTP request relates to the patient appointment scheduling services, the Web server delegates the dynamic response to a server-side application on the application server, which processes the request [15]. All data reside on the system's Web server, which can serve several patients at the same time by running multiple processes concurrently. The results returned by the application server are converted to HTML by the Web server and displayed as a standard HTML page. The portal server, which is also in the second tier, processes patient login and registration requests. The application server is the component that manages the full end-to-end appointment tracking and scheduling services. Comprehensive details about each arranged appointment
[Figure 3 flowchart. Nodes: Home, Hospital List, Doctors List, Registration and Login, Enquiry, Contact, Select Panel, Patient Login, Doctor Login, Patient Details, Search Doctor, View Appointment, Add Description, Dashboard, Change Password, My Appointment, Feedback, My Patient, Logout.]
Fig. 3 Proposed system flowchart
[Figure 4 diagram. Elements: Patient, Doctor, Web Browser, Web Services, Web Server, Portal Server, Application Server, Database.]
Fig. 4 Proposed architectural framework for online appointment system
port, such as patient login and call information, are also stored in the second-tier database.
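The division of work among the servers of the middle tier can be sketched as a simple dispatch rule. The URL paths, the file suffixes, and the function name `route_request` below are hypothetical and are used only to illustrate the description above; the prototype performs this routing inside Apache and PHP.

```python
# Hypothetical suffixes treated as static content served by the Web server.
STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".jpg", ".pdf")
# Hypothetical paths handled by the portal server (login and registration).
PORTAL_PATHS = ("/login", "/register")


def route_request(path: str) -> str:
    """Return the name of the middle-tier component that handles a request."""
    if path in PORTAL_PATHS:
        return "portal server"
    if path.lower().endswith(STATIC_SUFFIXES):
        return "web server"
    # Everything else is treated as a dynamic appointment-scheduling request.
    return "application server"
```

For example, under these assumptions a request for `/images/logo.png` stays on the Web server, whereas `/appointments/book` is delegated to the application server.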
3.3 Implementation of Prototype
This section describes the implementation of a prototype that demonstrates the main principles of the service. The prototype consists of a PC (user side) and PHP and MySQL running on an Apache XAMPP Web server (server side). No modifications or plug-ins are installed in the Web browser on the patient's computer.
3.3.1 User Side
The front-end design is simple and easy to use. Once the service is started, the user registers and can then log in to the system right away. By choosing the desired doctor, day, and time, the patient can make an appointment. The patient manages appointments via the website. The patient also registers a physician. Patients can also view doctors, documents, and comments. The front end, built with HTML, CSS, and JavaScript, runs in the browser and interacts with the patient in real time, while the back-end code communicates with the Web server to return results prepared for the user. Whatever is presented on the site is primarily the result of a query executed on the Web server, whose output is returned to the front end. The system is developed to run on the Apache Web server, which is available on all common operating systems. PHP is the server-side scripting language used in the back end; it connects to a database (MySQL) to look up, save, or change data and returns it to the patient in the form of front-end code. The system is browser-independent and works in all Web browsers.
3.3.2 Server Side
The registration form was created using the Bootstrap modal feature, with form handling in JavaScript to make sure that patient information is recorded properly. The back end comprises the data access layer, the Web server, and the other computational logic of the system. The back end of the suggested system was developed with PHP for scripting. A MySQL database is used as the system's database, and the Apache Web server (XAMPP) was used on the server side. The database stores all the essential details concerning the administrators and patients of the system. The application additionally holds several patient attributes such as blood group, age, and so on,
and the patient can also change this information. All the information concerning the patient is stored in the patient table. The patient can view the appointment day, time, and cost and confirm the booking. All available appointments and their details are kept in the client panel table. In this way, the user can add the day, time, physician, banners, and so on to the system. Booked appointments are stored in the reservation table, which makes it possible to indicate when a slot is already reserved. The back end is hidden from regular website visitors because it is the brain of the site, built with the server-side language (PHP) and the database. Front-end elements: the front end interacts with the patient in real time, while the back-end code communicates with the Web server to return results prepared for the patient. Whatever is presented on the website is generally the result of a query carried out on the server, whose output is returned to the front end. Hypertext Preprocessor (PHP): a server-side scripting language used in the back end that connects to a database (MySQL) to look up, save, or change data and returns it to the client in the form of front-end code. The framework is designed so that the developer can conveniently set up and manage the application files from a central point.
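The role of the reservation table in detecting an already reserved slot can be sketched as follows. This is an assumption-laden illustration rather than the prototype's schema: SQLite stands in for MySQL so that the example is self-contained, and the table name `reservation`, its columns, and the helper functions are hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical reservation table with one row per booked slot."""
    conn.execute(
        """
        CREATE TABLE reservation (
            id INTEGER PRIMARY KEY,
            patient TEXT NOT NULL,
            doctor TEXT NOT NULL,
            visit_day TEXT NOT NULL,
            slot_time TEXT NOT NULL,
            UNIQUE (doctor, visit_day, slot_time)
        )
        """
    )


def book(conn, patient, doctor, visit_day, slot_time):
    """Try to reserve a slot; return False if the slot is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO reservation (patient, doctor, visit_day, slot_time) "
                "VALUES (?, ?, ?, ?)",
                (patient, doctor, visit_day, slot_time),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a second booking of the same slot.
        return False


def is_reserved(conn, doctor, visit_day, slot_time):
    """Report whether a doctor's slot on a given day is already booked."""
    rows = conn.execute(
        "SELECT 1 FROM reservation "
        "WHERE doctor = ? AND visit_day = ? AND slot_time = ?",
        (doctor, visit_day, slot_time),
    ).fetchall()
    return len(rows) > 0
```

Letting the database enforce uniqueness of (doctor, day, time) is one possible design choice; it avoids a race between checking a slot and inserting the booking.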
4 Testing and Analysis
4.1 Testing Prototype
The prototype is analysed in terms of the registration and login phases. The test is performed on a Dell laptop with an Intel Core i7 CPU at 3.40 GHz and 6 GB RAM, which acts as the Web server. The operating system is Windows 10 Professional. XAMPP is a free, open-source Web server solution package developed by Apache Friends. It is a software distribution that bundles the Apache Web server, MySQL (in fact MariaDB), PHP, and Perl (as command-line executables and Apache modules) in one package. It is easy to use on Windows, macOS, and Linux platforms. No configuration is needed to integrate PHP with MySQL. On the user side, the service is accessed with Google Chrome Version 81.0.4044.138 (Official Build) (64-bit) on the same laptop.
4.2 Testing Scenario
A new patient registers with the Web server by clicking the registration button on the welcome page and then submits his or her information on the registration page. The patient is required to register with the application on first use and receives a username and password after registration. To register, the patient must fill in the specified fields, which include the
username, email, password, and pattern, and then click the login button; all patient information is then stored in the database of the Web server. When registration is complete, the patient's login phase begins. When the user registers successfully, the notification message "Registered successfully" appears. Figure 5 shows screenshots of the login phase in steps 1 and 2. Various checks are also applied during the user's registration. The patient must use this username and password to log in to the application each time. If a registered user does not enter the registered username and password, the user receives the notification message "Signing Failed Checks Your Contact Uses or Contact Support." If the two passwords entered do not match, the user is informed that the passwords did not match. If the email is not valid, the user cannot register, and the message "email is not valid" is shown. Figure 5d and e shows the patient panel, where patients can enter their details as required. They can change details such as address, blood group, and email in the patient panel. Patients can also search for different kinds of doctor according to their needs. Once this phase is completed, patients can book. Figure 5f shows the doctor panel and its features. In this panel, the patient's information can also be looked up using his or her username ID. The doctor can also contact patients directly and provide information, such as video files or a photograph, in
Fig. 5 a Information details including email and password, b username and password authentication, c pattern authentication, d, e patient portal for searching for a doctor and confirming an appointment, f patient details in the doctor's portal, g hashed passwords in the database
this panel. Figure 5g further shows that all patient details are stored securely in the database using hashed password storage.
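The registration checks described in this scenario (required fields, a valid email, and matching passwords) can be sketched as a single validation function. The function name, the deliberately simple email pattern, and all message texts other than "email is not valid" are assumptions for illustration; the prototype performs these checks in JavaScript and PHP.

```python
import re

# A deliberately simple pattern: one "@" and a dot in the domain part.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_registration(username, email, password, confirm_password, pattern):
    """Return a list of error messages; an empty list means the form is valid."""
    errors = []
    if not username.strip():
        errors.append("username is required")  # hypothetical message
    if not EMAIL_RE.match(email):
        errors.append("email is not valid")  # message quoted in the text
    if not password:
        errors.append("password is required")  # hypothetical message
    elif password != confirm_password:
        errors.append("passwords did not match")  # hypothetical wording
    if not pattern:
        errors.append("pattern is required")  # hypothetical message
    return errors
```

A caller would show the "Registered successfully" notification only when the returned list is empty.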
4.3 Performance Metrics
Recently, the healthcare sector has gradually moved towards Internet platforms, yet long waiting times in queues still occur. Because many outpatients face long waiting periods, a Web-based healthcare appointment booking system is needed to help patients reserve an appointment and access their health records online. Such a system seeks to minimise the number of missed appointments, unnecessary outpatient queues, and long waiting periods at clinics. It will not only improve the effectiveness of information sharing between healthcare professionals nationally but may also lower the operational costs of the healthcare industry in general, as redundancies are avoided. Despite extensive earlier research on improving operating standards in health care, healthcare systems still face challenges, whether they use conventional paper-based technology or paperless options. In older techniques, typical authentication steps such as username and password are already in place. Here, we use the username, the password, and a colour pattern, which may be an effective way to provide stronger protection for the authentication process. Table 2 compares the existing authentication systems with the proposed authentication system and its features.
5 Conclusion
An online appointment process supplies patients with a regular appointment number. At the indicated appointment time, patients arrive at the clinic and register with their appointment number. That such a system makes it easier for individuals to make an appointment with their doctor, clinic, or hospital is now widely recognised as one of the major benefits of online appointment programs. This study proposes an appointment scheduling method for an online healthcare environment. It presents the online service design and provides an appropriate paradigm for developing an integrated health information system to facilitate interaction across diverse, autonomous, and distributed healthcare information systems. The prototype system demonstrates the viability of the design. Existing research has presented some of the problems facing patients and staff when planning tools are poorly executed and ineffective. The proposed methodology addresses the limitations of the current approach and improves user authentication in a healthcare system. The comparison in Table 2 indicates that the proposed method improves on the existing ones. In future, we will be able to extend the proposed online patient appointment system to
Table 2 Comparison between existing versus proposed method

| Parameters | Kyambille et al. [13] | Sonwane et al. [10] | Jagtap et al. [12] | Peter et al. [11] | Malik et al. [2] | Lekan et al. [3] | Proposed method* |
|---|---|---|---|---|---|---|---|
| 2-Layer authentication | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | ✓ |
| Online booking | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Doctor selection | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Hospital selection | ✕ | ✕ | ✕ | ✓ | ✓ | ✕ | ✓ |
| User-friendly | ✓ | ✕ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Payment method | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | ✓ |
| Scheduling | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Monitoring patient | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Multi-platform | ✕ | ✕ | ✕ | ✕ | ✕ | ✓ | ✓ |
incorporate multi-factor authentication and real-time transaction techniques.
References
1. P.D.V. Chandran, S. Adarkar, A. Joshi, P. Kajbaje, Digital Medicine: an android based application for health care system. Int. Res. J. Eng. Technol. 4(4) (2017)
2. S. Malik, N. Bibi, S. Khan, R. Sultana, S.A. Rauf, Mr. Doc: A Doctor Appointment Application System (2016), p. 9
3. A.J. Lekan, O.S. Abiodun, Design and implementation of a patient appointment and scheduling system. Int. Adv. Res. J. Sci. Eng. Technol. ISO 4(12), 16–23 (2017)
4. S. Nasia, E. Sarda, Online appointment scheduling system for hospitals: an analytical study. Int. J. Innov. Eng. Technol. 4(1), 21–27 (2014)
5. P.W.M. Hussein, M. Salim, B.I. Ahmed, A prototype mobile application for clinic appointment reminder and scheduling system in Erbil city. Int. J. Adv. Sci. Technol. 28(1), 17–24 (2019)
6. D.D. Patel, P.S. Joshi, K.P. Jaiswal, N.P. Sarode, S.P. Chhajed, Advanced facilitation for hospital appointment system. Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. 2(2), 1089–1092 (2017)
7. N.A. Karim, Z. Shukur, Review of user authentication methods in online examination. Asian J. Inf. Technol. 14(5), 166–175 (2015)
8. M.A. Hassan, Z. Shukur, Review of digital wallet requirements, in 2019 International Conference on Cybersecurity, ICoCSec 2019 (2019), pp. 43–48
9. A. Imteaj, M.K. Hossain, A smartphone based application to improve the health care system of Bangladesh, in 1st International Conference on Medical Engineering, Health Informatics and Technology, MediTec 2016, 2017
10. S. Sonwane, S. Takalkar, S. Kalyankar, K. Wanare, S. Baviskar, Doctor patient data sharing using Android chat application. Int. J. Recent Trends Eng. Res. 3(4), 170–174 (2017)
11. A. Peter Idowu, O. Olusegun Adeosun, K. Oladipo Williams, Dependable online appointment booking system for NHIS outpatient in Nigerian teaching hospitals. Int. J. Comput. Sci. Inf. Technol. 6(4), 59–73 (2014)
12. P. Jagtap, P. Jagdale, S. Gawade, P.B. Javalkar, Online healthcare system using the concept of cloud computing. Int. J. Sci. Res. Sci. Eng. Technol. 2(2), 943–946 (2016)
13. G.G. Kyambille, K. Kalegele, Enhancing patient appointments scheduling that uses mobile technology. Int. J. Comput. Sci. Inf. Secur. 13(11), 21–27 (2015)
14. G.I. Raphael Akinyede, T. Balogun, Design and implementation of an online booking system for a cinema house. J. Inf. Comput. Sci. 10(6), 331–351 (2017)
15. SlideShare, 'Online doctor appointment' on SlideShare, 2020 [Online]. Available at: https://www.slideshare.net/search/slideshow?searchfrom=header&q=online+doctor+appointment. Accessed 27 May 2020
16. O.O. Ajayi, O.S. Akinrujomu, O.S. Daso, A.O. Paulina, A mobile based medical appointment and consultation (MMAC) system. Int. J. Comput. Sci. Mob. Comput. 8(5), 208–218 (2019)
17. E.E. Nwabueze, O. Oju, Using mobile application to improve doctor–patient interaction in healthcare delivery system. E-Health Telecommun. Syst. Netw. 8(03), 23 (2019)
18. P. Zhao, I. Yoo, J. Lavoie, B.J. Lavoie, E. Simoes, Web-based medical appointment systems: a systematic review. J. Med. Internet Res. 19(4), e134 (2017)
19. N.S. Ismail, S. Kasim, Y.Y. Jusoh, R. Hassan, A. Alyani, Medical appointment application. Acta Electron. Malaysia 1(2), 5–9 (2017)
20. J.L. Akinode, S.A. Oloruntoba, Design and Implementation of a Patient Appointment and Scheduling System (Department of Computer Science, Federal Polytechnic, Ilaro Nigeria, 2017)
21. S. Mendoza, R.C. Padpad, A.J. Vael, C. Alcazar, R. Pula, A web-based InstaSked appointment scheduling system at perpetual help medical center outpatient department, in World Congress on Engineering and Technology; Innovation and its Sustainability (Springer, Cham, 2018), pp. 3–14
22. K.B. Bee, A. Sukumar, L. Letchmunan, A.W. Marashdih, Improving clinic queues in Malaysia using time series extrapolation forecast and web-based appointment. Compusoft 8(9), 3388–3394 (2019)
23. G. Eysenbach, Web-based medical appointment systems. J. Med. Internet Res. (2017)
24. S. Pooja, H. Nilima, D. Prajakta, H. Nisha, J. Vinayak, Smart appointment generation for patient. Int. J. Adv. Eng. Res. Dev. Technophilia (2018)
25. B.O. Bello, Student-Teacher Online Booking Appointment System (Academic Institutions, 2016)
26. V. Akshay, A. Kumar, R.M. Alagappan, S. Gnanavel, BOOKAZOR: an online appointment booking system, in International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN) (IEEE, 2019), pp. 1–6
27. Z. Shaikh, D.P. Doshi, D.N. Gandhi, D.M. Thakkar, E-healthcare android application based on cloud computing. Int. J. Recent Innov. Trends Comput. Commun. 6(4), 307–310 (2018)
Enhanced Shadow Removal for Surveillance Systems P. Jishnu and B. Rajathilagam
Abstract Shadow removal has proved very helpful in higher-level computer vision applications that involve object detection and object tracking as part of the process. Removing shadows has always been a challenge, especially when higher-quality images must be ensured after the shadow removal process. To unveil the information occluded by a shadow, it is essential to remove the shadow. This is a two-step process involving shadow detection and shadow removal. In this paper, a shadow-less image is generated using a modified conditional GAN (cGAN) model, with the shadow image and the original image as inputs. The proposed method uses a discriminator that judges local patches of the images. The model not only uses a residual generator to produce high-quality images but also uses a combined loss, the weighted sum of the reconstruction loss and the GAN loss, for training stability. The proposed model was evaluated on the benchmark ISTD dataset and achieved significant improvements in the shadow removal task compared with state-of-the-art models. The structural similarity index (SSIM) metric is also used to evaluate the performance of the proposed model from the perspective of the human visual system.
Keywords Shadow removal · Image processing · Generative adversarial networks · cGAN · Combined loss · ISTD · PSNR · SSIM
Supported by Amrita Vishwa Vidyapeetham. P. Jishnu (B) · B. Rajathilagam Department of Computer Science Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India e-mail: [email protected] B. Rajathilagam e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_5
1 Introduction
Removing shadows from images is considered a challenging task in the field of computer vision. Opaque objects in the path of sunlight cast shadows, whose formation depends on factors such as the altitude of the sun and the location of the object. For example, consider a bike and a bus in traffic, with the bike standing to the left of the bus. The shadow of the bus may cover the bike if the sunlight comes from the right side of the bus. Shadows of different shapes can merge two different objects into a single object. This is called occlusion, and it makes it difficult to detect the individual objects efficiently. In this example, it is difficult to distinguish between the bike and the bus: the bus and its shadow will probably merge and form a shape that is very different from the shape of a bus. Figure 1 illustrates the expected outcome of the shadow removal process. In traditional approaches, a common way to remove shadows is to detect them and use the detected shadow masks as a cue for removal. Shadow detection predicts the location of the shadowed region in the image and separates the shadowed and non-shadowed regions of the original image at pixel level. Classifying shadows in an image is challenging because shadows have various properties. Depending on the degree of occlusion by the object, the brightness of a shadow varies between umbra and penumbra, as shown in Fig. 2. The dark part of the shadow is called the umbra, and the slightly lighter part is called the penumbra. Since the introduction of generative adversarial networks (GAN) in 2014 [1], the computer vision domain has taken a leap forward in various tasks.
Shadow removal is an important task which can be considered as an invaluable preprocessing step for higher-level computer vision tasks in surveillance systems like road accident identification and severity determination from CCTV surveillance [2] and plant leaf recognition using machine learning techniques [3].
Fig. 1 Expected outcome of shadow removal
Fig. 2 Penumbra and umbra regions
The challenge is to ensure high quality in the shadow-less image by using an efficient evaluation metric and an enhanced architecture. Shadow removal is also difficult because the shadows must be removed and the information in that region restored according to the degree of occlusion.
2 Related Work
Shadow removal is considered more complex than shadow detection because of the difficulty of reconstructing the pixels in the detected shadow region. Wang et al. [4] introduced the ISTD dataset in their work titled “Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal,” and it is regarded as one of the benchmark datasets for shadow removal. They proposed an architecture in which two conditional GANs (cGAN) are stacked together to perform shadow detection and removal simultaneously. This model achieved an RMSE of 7.47 on the ISTD dataset. Its drawback is that it does not consider context information in the shadow removal phase. This line of work suggests that research on shadow detection is almost saturated and that the focus has moved to enhancing the shadow removal phase.
To take more details of the shadowed image into account, such as illumination information, Zhang et al. [5] proposed a GAN-based model with four generators and three discriminators in the work entitled “RIS-GAN: Explore Residual and Illumination with Generative Adversarial Networks for Shadow Removal.” This model achieved an RMSE of 6.97, which is better than that of the model proposed by Wang et al. [4]. The model was trained on two different benchmark datasets for shadow
P. Jishnu and B. Rajathilagam
removal, namely the SRD and ISTD datasets. An RMSE of 6.78 was achieved on the SRD dataset. Context information was still missing in these models.
The model proposed by Qu et al. [6] uses multi-context information to remove the shadow and reconstruct the shadowed region. Their work, titled “DeshadowNet: A Multi-context Embedding Deep Network for Shadow Removal,” uses only deep convolutional neural networks (DCNN) for shadow detection and shadow removal. The multi-context information is acquired by training three different networks named G-Net (Global Net), A-Net (Appearance Net), and S-Net (Semantic Net), which capture the global context, appearance information, and semantic context of the shadowed region, respectively. Together, these three networks are called DeshadowNet. The RMSE obtained on the SRD dataset is 6.64. A blurring effect was present in the shadowed region of the output images.
GAN-based models produce higher-quality output images than traditional models. Sidorov [7] proposed the AngularGAN architecture to improve color constancy in images. The work, entitled “Conditional GANs for Multi-Illuminant Color Constancy: Revolution or Yet Another Approach?”, used a GAN-based model along with angular information about the light incident on the object. The model was trained on a synthetic dataset called the GTAV dataset. Peak signal-to-noise ratio (PSNR) was used to evaluate the performance of the model, and the PSNR on the GTAV dataset was 21 dB.
“Shadow Detection and Removal for Illumination Consistency on the Road” by Wang et al. [8] showed promising results for shadow removal using a nonlinear SVM classifier. The proposed model was designed exclusively for traffic images; it performs better on small shadows and does not handle large shadows in the scene well. Accuracy was used as the performance measure, and the model achieved an accuracy of 83% on the UCF dataset.
Wang et al. [8] improved on the accuracy of their previous work of the same title, “Shadow detection and removal for illumination consistency on the road” [9], by introducing an adaptive variable scale regional compensation operator to remove the shadows.
The introduction of generative adversarial networks [1] brought drastic changes to computer vision research. Existing models were migrated to GAN architectures as extensions of earlier work in order to improve their results. Yun et al. [10] introduced a GAN for shadow removal as an extension of their previous work, entitled “Shadow Detection and Removal From Photo-Realistic Synthetic Urban Image Using Deep Learning,” and improved the performance. However, traditional model-based methods have limited ability to remove shadows when irregular illumination or objects with various colors are present.
To address color inconsistencies in the shadowed region, Cun et al. [11] proposed a novel network structure called the dual hierarchical aggregation network (DHAN), which uses a series of convolutions without any down-sampling as the backbone and hierarchically aggregates multi-context features for attention and prediction. The model proposed in the work entitled “Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation
Network and Shadow Matting GAN” achieves RMSE = 5.76 and PSNR = 34.48 dB on the ISTD dataset and RMSE = 5.46 and PSNR = 33.72 dB on the SRD dataset. The shadow removal phase is open to enhancements and not yet saturated. GAN-based models show significant improvements in generating shadow-less images. Conditional generative adversarial networks (cGAN) [12] are very helpful for narrowing down the image space of the generator, thereby reducing the time needed to train the model. The GAN architecture can be modified as required; the behavior of the GAN changes significantly with the inputs given to the generator and the discriminator, and suitable choices produce good results.
3 Methodology
The proposed architecture for the shadow removal task is illustrated in Fig. 3. The shadowed image is given to the generator module, which produces the shadow-less (generated) version of the input image. The discriminator module takes a paired image containing the generated shadow-less image and the real shadow-less image, and its duty is to check whether the paired image is real or fake. The generator module is trained to minimize the loss between the expected target image and the generated image and to fool the discriminator module.
Fig. 3 Block diagram of proposed approach
Fig. 4 Sample image of ISTD dataset
3.1 Dataset Description
Since we focus on the shadow removal task, the ISTD [4] shadow removal benchmark dataset is used (Fig. 4). The dataset contains shadowed images, shadow masks, and shadow-less images. Training data: 1330 images (640 × 480 pixels). Test data: 540 images (640 × 480 pixels). During the data preprocessing phase, the images are loaded, rescaled to 256 × 256 pixels for processing convenience, and converted to NumPy arrays.
3.2 Architecture of Generator
The overall architecture of the generator is shown in Fig. 5. The generator is an encoder–decoder model using a U-Net architecture [13]. The encoder and decoder of the generator contain convolutional, batch normalization, dropout, and activation layers.
Fig. 5 Architecture of generator
The encoder encodes the information in the given input image, and the context information generated in the bottleneck is used for reconstructing the image using the decoder block. Skip connections are used between corresponding encoder and decoder layers for improving the quality of the image generated by the generator.
3.3 Architecture of Discriminator
The architecture of the local patch discriminator is illustrated in Fig. 6. The discriminator is a deep convolutional neural network that performs image classification. Both the source image (generated image) and the target image (shadow-less image) are given as input to the discriminator, which estimates the likelihood that the shadow-less image is real rather than a translated version produced by the generator. The discriminator model is trained in the same way as in a traditional GAN, using adversarial training. The generator trains much more slowly than the discriminator, which can lead to GAN issues such as vanishing gradients, mode collapse, and non-convergence. The obvious solution is to balance generator and discriminator training to avoid over-fitting. A combined loss (1) is introduced to balance the training of the generator and the discriminator. Giving more weight to the reconstruction loss (2) than to the adversarial loss (3) during generator training reduces the effect of the fast-training discriminator on the generator.

$$L_{combined} = \lambda_1 \, L_{reconstruction} + \lambda_2 \, L_{adversarial} \tag{1}$$

where $\lambda_1 = 100$ and $\lambda_2 = 1$. The loss between the generated fake image and the shadow-less image is called the reconstruction loss:

$$L_{reconstruction} = \sum_{i=1}^{n} \left| Y_{true,i} - Y_{pred,i} \right| \tag{2}$$

where $Y_{pred,i}$ is the predicted value for the ith pixel, $Y_{true,i}$ is the observed (actual) value for the ith pixel, and $n$ is the total number of pixels. The adversarial loss is the binary cross-entropy loss:

$$L_{adversarial}(G, D) = E_{x,y}[\log D(x, y)] + E_{x,y}[\log(1 - D(x, G(x, y)))] \tag{3}$$

where G is the generator, D is the discriminator, x is the shadow-less image, y is the shadowed image, D(x, y) is the discriminator’s estimate of the probability that the real data instance x is real with respect to y, E is the expected value, and G(x, y) is the generator’s output.
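As a rough illustration of Eqs. (1) and (2), the following Python sketch computes a combined generator loss for flattened lists of pixel values. The function names, the `eps` clamp, and the choice of the generator-side adversarial term (binary cross-entropy of the discriminator's scores for generated images against the label "real") are assumptions made for illustration; the paper does not give these implementation details.

```python
import math

LAMBDA_RECON = 100.0  # lambda_1 in Eq. (1)
LAMBDA_ADV = 1.0      # lambda_2 in Eq. (1)


def reconstruction_loss(y_true, y_pred):
    """Sum of absolute pixel differences, following Eq. (2)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred))


def generator_adversarial_loss(d_fake, eps=1e-7):
    """Binary cross-entropy of discriminator scores against the label 'real'.

    d_fake holds the probabilities (0..1) that the discriminator assigns to
    generated images. This generator-side form is an assumption.
    """
    return -sum(math.log(max(p, eps)) for p in d_fake) / len(d_fake)


def combined_loss(y_true, y_pred, d_fake):
    """Weighted sum of both terms, following Eq. (1)."""
    return (LAMBDA_RECON * reconstruction_loss(y_true, y_pred)
            + LAMBDA_ADV * generator_adversarial_loss(d_fake))
```

With the weights 100 and 1, the reconstruction term dominates unless the generated image is already very close to the target, which matches the stated intent of limiting the discriminator's influence on generator training.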
4 Results and Discussion
In this project, the ISTD dataset is used to train our model, and the RMSE and PSNR metrics are calculated. The state-of-the-art models RIS-GAN [5] and stacked conditional GAN (ST-GAN) [4] are used as the baselines for comparing the performance of our model (Table 1). In addition to the metrics used in the base papers, we use the structural similarity index (SSIM) to evaluate the performance of our model.
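For reference, RMSE and PSNR can be computed as in the minimal sketch below. It uses the standard definitions, with PSNR = 20 log10(MAX / RMSE); the peak value `max_val` and the pixel scale are assumptions, since the paper does not state the scale on which its metrics were computed.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over two equally long pixel sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def psnr(y_true, y_pred, max_val=255.0):
    """Peak signal-to-noise ratio in dB; max_val is the assumed peak pixel value."""
    error = rmse(y_true, y_pred)
    if error == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(max_val / error)
```

Lower RMSE and higher PSNR both indicate that the generated image is closer to the ground-truth shadow-less image.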
Fig. 6 Architecture of discriminator
Fig. 7 Generator and discriminator loss
An SSIM of 0.722 indicates that the images generated by our model are fairly closely aligned with the human visual system. Our model shows a drastic improvement in RMSE as well as PSNR compared to the standard algorithms. The training graph of the generator loss and discriminator loss is shown in Fig. 7. The graph shows that generator and discriminator training resembles a min-max game: the generator tries to fool the discriminator, while the discriminator tries not to be fooled by the generator’s fake images. For a good GAN model [1], the losses of the generator and the discriminator never become equal. Sample outputs of our model for indoor and outdoor images are given in Fig. 8. Our model performs well on both. The first and third columns correspond to outdoor images, for which RMSE = 0.045 and PSNR = 78.76 dB. The second column corresponds to an indoor image of a blackboard in a classroom, from which the shadow is successfully removed; RMSE = 0.08 and PSNR = 74.58 dB for this sample. Sample outputs of the proposed model on images that are entirely different from the training dataset are given in Fig. 9. The first row shows an indoor shadowed image and the corresponding output of our model. The second row shows an outdoor shadowed image from a real-life scene and the corresponding shadow-less image produced by our model. These shadow-less outputs show that the proposed model performs well on shadowed images outside the training dataset.
Table 1 Comparison with the existing models
S. No.  Model        RMSE   PSNR (dB)  SSIM
1       ST-GAN [4]   7.47   30.66      ....
2       RIS-GAN [5]  6.67   31.64      ....
3       Our model    1.997  53.38      0.722
Fig. 8 Sample output. Source, generated, and expected images are arranged row wise
5 Conclusion
In this project, we proposed a GAN-based shadow removal model for generating enhanced shadow-less images. First, we applied a basic conditional GAN model to the ISTD dataset and analyzed the areas for improvement. Second, we modified the architecture, introduced a combined loss for training the model, and tuned the parameters through repeated experiments to identify an appropriate set of parameters for the model. The experiments showed that the proposed model performs better than the existing models. The model also showed promising results on real-world shadowed images (outdoor and indoor) that we collected. For some images, however, our model could not ensure the required quality in the shadow-free output, and it does not guarantee that it will always produce high-quality shadow-less images.
Fig. 9 Output of the proposed model on real-world sample images
As a future enhancement, the shadow-less images generated by the proposed model can be improved by using image super-resolution techniques. Parameter tuning is a time-consuming task that also requires considerable domain knowledge. Neuroevolution techniques [14] can therefore be used to tune the hyperparameters, identify the best set of parameters for the model, and thereby improve its performance. The enhanced model can be used to improve the efficiency of object detection and tracking applications, especially applications such as wild animal detection and recognition from aerial videos using computer vision techniques [15], in which the presence of shadows is unavoidable.
References
1. I.J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial networks, in International Conference on Neural Information Processing Systems, 2014, https://arxiv.org/abs/1406.2661
2. S. Veni, R. Anand, B. Santosh, Road accident detection and severity determination from CCTV surveillance, in Advances in Distributed Computing and Machine Learning, Singapore, 2021
3. R. Sujee, S.K. Thangavel, Plant leaf recognition using machine learning techniques, in New Trends in Computational Vision and Bio-inspired Computing: Selected Works Presented at the ICCVBIC 2018, Coimbatore, India (Springer, Cham, 2020), pp. 1433–1444
4. J. Wang, X. Li, J. Yang, Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal. CVPR 2018. https://doi.org/10.1109/CVPR.2018.00192
5. L. Zhang, C. Long, X. Zhang, C. Xiao, RIS-GAN: Explore Residual and Illumination with Generative Adversarial Networks for Shadow Removal. AAAI 2020. https://doi.org/10.1609/aaai.v34i07.6979
6. L. Qu, J. Tian, S. He, Y. Tang, R.W.H. Lau, DeshadowNet: a multi-context embedding deep network for shadow removal, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. https://ieeexplore.ieee.org/document/8099731
7. O. Sidorov, Conditional GANs for multi-illuminant color constancy: revolution or yet another approach? in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. https://arxiv.org/abs/1811.06604
8. C. Wang, H. Xu, Z. Zhou, L. Deng, M. Yang, Shadow detection and removal for illumination consistency on the road. IEEE Trans. Intell. Veh. (2020). https://ieeexplore.ieee.org/document/9068460
9. C. Wang, L. Deng, Z. Zhou, M. Yang, B. Wang, Shadow detection and removal for illumination consistency on the road. https://ieeexplore.ieee.org/document/8304275
10. H. Yun, K. Kang Jik, J.-C. Chun, Shadow detection and removal from photo-realistic synthetic urban image using deep learning. Comput. Mater. Continua (2019). https://www.techscience.com/cmc/v62n1/38123
11. X. Cun, C.-M. Pun, C. Shi, Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN. AAAI 2020. https://arxiv.org/abs/1911.08718
12. M. Mirza, S. Osindero, Conditional Generative Adversarial Nets. arXiv 2014. http://arxiv.org/abs/1411.1784
13. O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI 2015. https://arxiv.org/abs/1505.04597
14. K. Sree, G. Jeyakumar, An evolutionary computing approach to solve object identification problem for fall detection in computer vision-based video surveillance applications. J. Comput. Theor. Nanosci. 17(1), 1–18 (2020). https://doi.org/10.1166/jctn.2020.8687
15. K. Mondal, S. Padmavathi, Wild animal detection and recognition from aerial videos using computer vision technique. Int. J. Emerg. Trends Eng. Res. 7(5), 21–24 (2019)
Vehicle Speed Estimation and Tracking Using Deep Learning and Computer Vision
B. Sathyabama, Ashutosh Devpura, Mayank Maroti, and Rishabh Singh Rajput
Abstract This paper proposes an intelligent traffic surveillance system that detects vehicles and determines their speed, colour, direction, and type using computer vision and deep learning. This information can be used to find traffic violators through automatic number plate recognition. Research shows that over-speeding accounts for 60% of total accidents in India, which raises a serious concern. The proposed approach uses the TensorFlow Object Detection API for vehicle detection and cumulative vehicle counting, a colour histogram integrated with the KNN machine learning algorithm for real-time vehicle colour detection, and a robust deep learning and computer vision method for speed estimation and direction detection. The system can effectively monitor traffic usage and help officials track and detect speeding and wrong-side driving vehicles and plan measures to stop them before they cause accidents. The paper thus proposes an efficient and robust approach for detecting moving vehicles along with their speed and other attributes. The proposed approach can be integrated with a pre-installed traffic monitoring camera system without significant adjustments.
Keywords Vehicle speed estimation · Object detection · Computer vision · Deep learning · KNN clustering algorithm
B. Sathyabama · A. Devpura (corresponding author) · M. Maroti · R. S. Rajput
SRM Institute of Science and Technology, Chennai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_6

1 Introduction
There has been a significant increase in the number of vehicles in India due to an ever-increasing population, which gives rise to serious problems such as road accidents caused by over-speeding and wrong-side driving, leading to severe injuries and even death [1]. According to data provided by the National Crime Records Bureau (NCRB), approximately 60% of the road accidents in India were due to over-speeding, which caused around 80,000 deaths and left around 300,000 people injured. Hence, over-speeding accounts for the majority of road injuries and fatalities in India. According to the Transport Research Wing (TRW), wrong-side driving is the second biggest cause of road deaths in India after over-speeding, accounting for about 36% of road fatalities. Because of this rise in accidents, India urgently needs a sophisticated system that helps the authorities monitor and control over-speeding and wrong-side driving. The continuous increase in the number of vehicles also puts considerable pressure on road infrastructure, leading to problems such as congestion and collisions. When traffic police single out vehicles manually, many over-speeding vehicles escape detection. The proposed approach achieves about 97% accuracy in estimating the speed of a vehicle. The output is saved as a video and a CSV file. The video can be reviewed later as proof against the violator. The CSV file contains data on traffic flow, speed violators, the colour and type of each vehicle, and its direction of travel, which can be analysed and used to solve various traffic and road transportation issues. The log of the number of vehicles and the distribution of traffic can be used to plan appropriate road infrastructure. Statistical tools can be applied to parameters such as the average number of vehicles on the road and the types of vehicles present during a particular time interval, which can help road authorities administer the roads [2].
2 Related Work
B. N. Krishna Sai and T. Sasikala showed how object detection and object counting in images can be implemented using the TensorFlow Object Detection API and the Faster R-CNN algorithm [3]. This paper instead uses MobileNet with SSD together with the TensorFlow Object Detection API, because SSD with MobileNet can outperform Faster R-CNN for large objects. Various approaches to vehicle tracking and speed estimation have been proposed over the years. Hua et al. (2018) showed that combining modern deep learning models with classic computer vision approaches is an efficient way to predict vehicle speed and that their approach can also be used for traffic flow prediction [2]. D. Song et al. (2018) proposed a novel multi-vehicle tracking algorithm that considers motion dependence across vehicles by integrating the car-following model (CFM) into the tracking method with on-road constraints [4]. Zhang et al. (2017) proposed using deep reinforcement learning for visual object tracking in videos, training the model with a reinforcement learning (RL) algorithm to maximize tracking performance in the long run; the disadvantage of this algorithm is that it requires a lot of computation and data [5]. Kumar and Kushwaha (2016) used OpenCV and JavaCV for detection and tracking and achieved an average detection accuracy of about 87.7% [6]. With recent advancement in the field of deep learning and computer vision, a higher average detection accuracy has
been achieved by the proposed approach. For colour detection, an article by Thakkar (2018) demonstrates how it is possible to detect dominant colours in an image using k-means clustering [7].
3 Methodology
The source video from the traffic surveillance cameras is read frame by frame with OpenCV, and each frame is processed by the “SSD with MobileNet” model [8] through the TensorFlow Object Detection API. The process is iterated until the end of the video feed. The approach uses a pre-trained model from the model zoo, trained on the COCO dataset [9, 10]. Figure 1 shows the system architecture of the proposed approach. The single-shot multi-box detector offers remarkable performance and precision for object detection tasks; the original SSD work reports about 74% mean average precision (mAP) at 59 frames per second on the PASCAL VOC2007 test set. Figure 2 shows the architecture of the single-shot multi-box detector.
Fig. 1 System architecture
Fig. 2 Architecture of single shot multi-box detector
3.1 Object Counting
Objects are counted using their pixel locations. The counter is set to zero and is updated as vehicles pass through the region of interest. When the input video is passed in, the first step is to define the input and output tensors of the detection graph. The next step is to compute the detection boxes and detection scores. Each box represents a part of the image where a particular object was detected, and each score represents the level of confidence for that object. The score is shown on the resulting image together with the class label. Each frame extracted from the input video is expanded by one dimension, since the model expects images of shape [1, None, None, 3]. Finally, detection and object counting take place [11].
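The counting step can be pictured with the following minimal sketch. It assumes that each vehicle already carries a tracking identifier and that its pixel coordinate along the direction of travel is known for every frame; the function name, the `trajectories` structure and the single ROI line are hypothetical and are not taken from the paper.

```python
def count_roi_crossings(trajectories, roi_line):
    """Count tracked objects whose position crosses the ROI line.

    trajectories maps a (hypothetical) object id to the list of its pixel
    coordinates along the direction of travel, one entry per frame.
    """
    counter = 0  # the counter starts at zero
    for positions in trajectories.values():
        for prev, curr in zip(positions, positions[1:]):
            # A crossing occurs when two consecutive positions straddle the line.
            if min(prev, curr) < roi_line <= max(prev, curr):
                counter += 1
                break  # count each object at most once
    return counter


tracks = {0: [10, 40, 70], 1: [90, 60, 30], 2: [10, 20, 30]}
print(count_roi_crossings(tracks, roi_line=50))
```

Counting on a line crossing rather than on mere presence inside the region avoids counting the same vehicle in several consecutive frames.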
3.2 Colour Recognition
The approach uses the K-Nearest Neighbour (KNN) algorithm, trained on R, G, B colour histogram features, for vehicle colour recognition [12]. It can classify violet, yellow, orange, blue, green, red, black and white. More colours can be classified by extending the training data or by using other colour features such as colour moments or colour correlograms. The training dataset includes ten images of each of the colours mentioned above. A colour histogram represents the distribution of colours in an image. For digital images, a colour histogram gives the number of pixels whose colours fall in each of a fixed list of colour ranges that span the image’s colour space, the set of all possible colours. To extract features, the approach takes, for each of the R, G and B histograms, the bin number with the peak pixel count; these dominant R, G and B values form the feature vector used for training. For example, the dominant R, G and B values of a red image are [254, 0, 2]. After the dominant R, G and B values have been obtained from the colour histogram of each training image, the feature vector is labelled, because the KNN classifier is a supervised learner, and stored in a CSV file, thus creating the training feature vector dataset. The primary functions of the KNN classifier are to fetch the training data, fetch the test image features, calculate the Euclidean distance, find the K nearest neighbours and predict the colour [13]. The data are split into 60% training data, 20% testing data and 20% validation data. Test images taken under poor conditions, such as bad lighting or shadows, can produce false positives, so a filtering algorithm is applied before the test images are sent to the KNN classifier. The prediction accuracy of this model is 97% for the colours mentioned above.
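The feature extraction and classification described above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: pixels are given as (R, G, B) tuples with values 0 to 255, each histogram has 256 bins, and the value `k=3` as well as the function names `dominant_rgb` and `knn_predict` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def dominant_rgb(pixels):
    """Return the peak histogram bin of each channel as a 3-element feature."""
    features = []
    for channel in range(3):
        hist = [0] * 256
        for px in pixels:
            hist[px[channel]] += 1
        # Index of the bin with the highest pixel count (first one on ties).
        features.append(max(range(256), key=hist.__getitem__))
    return features


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to the query.

    train is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train = [([254, 0, 2], "red"), ([250, 5, 5], "red"),
         ([0, 0, 254], "blue"), ([5, 5, 250], "blue"),
         ([0, 254, 0], "green")]
pixels = [(254, 0, 2)] * 5 + [(10, 10, 10)] * 2
print(knn_predict(train, dominant_rgb(pixels)))
```

Because only the peak bin of each channel is kept, the feature is compact but sensitive to lighting, which is consistent with the filtering step applied before classification.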
3.3 Speed and Direction Recognition
The proposed method uses the vehicle object tracker, the centroid tracker and the speed formula to determine the speed and direction of the vehicle.
Vehicle Object Tracker: An object tracker is used because running detection on every frame would put too much load on the computational device [14]. Object detection is therefore performed only every N frames, and the tracker follows the objects in the frames in between. A confidence value of 0.4 is used as the probability threshold for identifying and locating objects with MobileNet SSD. Four speed estimation points A, B, C and D are marked as columns in the frame, positioned according to the width of the frame. The field of view of the camera is measured for later analysis. The length of the road is measured from one side of the frame to the other in metres, since the calculations depend on distance in metres. Videos of vehicles passing through the field of view are logged and uploaded to the cloud. The camera is positioned upright, perpendicular to the field of view. As shown in Fig. 3, the timestamps of a vehicle are recorded at waypoints A, B, C, D if the direction is upward; as shown in Fig. 4, they are recorded at waypoints D, C, B, A if the direction is downward. The average speed is computed from the three speeds between the four waypoints and is then converted to km/h.
Centroid Tracker: Vehicles are tracked by centroid identification and location tracking, which relies on the Euclidean distance between the centroids of the existing objects and the centroids of the new objects in succeeding video frames [15, 16].
1. The centroid of each vehicle is computed from the bounding box coordinates supplied by the object detector.
2. For the objects present in the following input frame, the Euclidean distance is computed between the centroids of the new objects and those of the existing objects.
Fig. 3 Direction of vehicle from A to D
Fig. 4 Direction of vehicle from D to A
3. The positions of the existing objects are updated.
4. New objects that were not matched with an existing object are registered by providing an identification number.
5. Old objects are deregistered.

Speed Estimation: The speed is estimated as the average of the three speeds measured between the four waypoints.
1. Speed formula:

$$\text{Speed} = \frac{\text{distance}}{\text{time}} \tag{1}$$

2. Metres per pixel is the quotient of the distance constant and the frame width,

$$mpp = \frac{\text{distance constant}}{\text{frame width}} \tag{2}$$

where mpp is metres per pixel, the distance constant is the length of the road in metres, and the frame width is the width of the frame in pixels.
3. The distance in pixels is calculated as the difference between the centroid positions of the object as it passes the columns,

$$p_{ab} = |col_b - col_a| \tag{3}$$

$$p_{bc} = |col_c - col_b| \tag{4}$$

$$p_{cd} = |col_d - col_c| \tag{5}$$

where $p_{ab}$, $p_{bc}$, $p_{cd}$ are distances in pixels and $col_a$, $col_b$, $col_c$, $col_d$ are the centroid positions of the object at columns A, B, C and D.
4. The distance in metres is calculated, since the computation takes place in metres,

$$d_{ab} = p_{ab} \cdot mpp \tag{6}$$

$$d_{bc} = p_{bc} \cdot mpp \tag{7}$$

$$d_{cd} = p_{cd} \cdot mpp \tag{8}$$

where $d_{ab}$, $d_{bc}$, $d_{cd}$ are distances in metres.
5. Four timestamps $t_a$, $t_b$, $t_c$, $t_d$ are acquired as the vehicle moves through the frame.
6. From the four timestamps, the three time differences $t_{ab}$, $t_{bc}$, $t_{cd}$ are calculated,

$$t_{ab} = |t_a - t_b| \tag{9}$$

$$t_{bc} = |t_b - t_c| \tag{10}$$

$$t_{cd} = |t_c - t_d| \tag{11}$$

7. The final speed is the average of the three segment speeds, each obtained from Eq. (1),

$$Speed_{avg} = \frac{1}{3}\left(\frac{d_{ab}}{t_{ab}} + \frac{d_{bc}}{t_{bc}} + \frac{d_{cd}}{t_{cd}}\right) \tag{12}$$
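The speed estimation steps can be condensed into the short Python sketch below. It is only an illustration of Eqs. (1) to (12): the function name and argument names are hypothetical, timestamps are assumed to be in seconds, and the final factor 3.6 converts m/s to km/h.

```python
def average_speed_kmh(columns_px, timestamps_s, road_length_m, frame_width_px):
    """Average speed over the three segments between four waypoints.

    columns_px:   centroid positions (pixels) at the four waypoints
    timestamps_s: timestamps (seconds) recorded at the same waypoints
    """
    mpp = road_length_m / frame_width_px  # metres per pixel, Eq. (2)
    speeds = []
    for i in range(3):
        distance_m = abs(columns_px[i + 1] - columns_px[i]) * mpp  # Eqs. (3)-(8)
        elapsed_s = abs(timestamps_s[i + 1] - timestamps_s[i])     # Eqs. (9)-(11)
        if elapsed_s == 0:
            raise ValueError("two waypoints share the same timestamp")
        speeds.append(distance_m / elapsed_s)                      # Eq. (1)
    return sum(speeds) / 3 * 3.6  # Eq. (12), converted from m/s to km/h


# Example: a 20 m stretch of road spans a 400-pixel-wide frame.
print(average_speed_kmh([100, 200, 300, 400], [0.0, 0.5, 1.0, 1.5], 20.0, 400))
```

Because absolute differences are used, the same function serves both travel directions; only the order in which the waypoints are visited changes.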
4 Implementation The first step is to initialize the pre-trained Mobile SSD, load object detector and initialize the video stream. This approach can be used on both CPU and GPU. Timestamp can be out of sync because of the lag in frame capture leading to inaccuracy in measuring the speed. GPU assures that the FPS is adequate for accurate speed calculations as it ensures that there is little to no lag between frame captures [17]. The centroid tracker is initialized, and the count of the total frames is administered with the help of the total frames counter. The next step is to initialize the list of various points used to calculate the vehicle’s average speed. The four points will be used to calculate the vehicle’s average speed by estimating the three estimated speeds that are acquired from the four points. After initialization, the approach is to iterate over the video frames, then collect the next frame from the video stream and cache the current timestamp and the new date. If the video frames are no longer available, the process will terminate and break out of the iteration. Multi-object tracking is achieved using dlib’s correlation tracker, and hence, the video frame is converted into RGB format [18]. Frame dimensions are then set to calculate metres per pixel that facilitate computation of the three estimated speeds of the vehicle. The next step is to initialize the list of bounding boxes passed by either the object detector or the dlib’s correlation trackers and then extract the confidence score or probability associated with the prediction. Only the “car”, “motorbike”, “bus” classes are viewed using the pre-trained MobileNet SSD. Hence, if the class label is not a car, motorbike or bus, ignore it. 
The next step is to compute the x and y coordinates of each bounding box, construct a dlib rectangle object from them and start a dlib correlation tracker, so that tracking begins in the region of interest found by the object detector. Because an object detector carries a higher computational load, an object tracker is used instead of the detector to achieve a higher frame-processing rate. The centroid tracker then associates the centroids of the existing objects with the newly computed centroids. Each trackable object is assigned an object identifier. For the current object identifier, the system checks whether a trackable object already exists. If no trackable object exists and object attributes such as speed or direction have not been estimated, they are estimated. If the vehicle's direction is determined to be positive, the vehicle is moving upwards; if it is negative, the vehicle is moving downwards. The direction is determined by comparing the centroid's x coordinate with the corresponding points. For the upward direction, the points are checked in the order A, B, C and D: at each point, the timestamp is set to the current timestamp and the position to the centroid's x coordinate, and the last point, D, is flagged as True. For the downward direction, the points are checked in the order D, C, B and A in the same way, and the last point, A, is flagged as True. Finally, the distance is calculated in pixels and the elapsed time in hours. The distance is converted to kilometres, the calculated speed is appended to the record, and the average speed is calculated.
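The final calculation can be illustrated with a short sketch. It assumes that the metres-per-pixel ratio is a measured real-world distance divided by the frame width, and that one speed estimate is taken between each pair of consecutive points; the function names and the sample numbers are hypothetical.

```python
def metres_per_pixel(distance_constant_m, frame_width_px):
    """Assumed calibration: real-world distance spanned by the frame width."""
    return distance_constant_m / frame_width_px


def estimate_average_speed(positions_px, timestamps_s, mpp):
    """Return (average km/h, per-segment km/h) for consecutive checkpoints."""
    if len(positions_px) != len(timestamps_s) or len(positions_px) < 2:
        raise ValueError("need matching positions and timestamps for 2+ points")
    speeds = []
    for i in range(1, len(positions_px)):
        distance_px = abs(positions_px[i] - positions_px[i - 1])
        distance_km = distance_px * mpp / 1000.0  # pixels -> metres -> kilometres
        elapsed_h = (timestamps_s[i] - timestamps_s[i - 1]) / 3600.0  # seconds -> hours
        if elapsed_h <= 0:
            raise ValueError("timestamps must be strictly increasing")
        speeds.append(distance_km / elapsed_h)
    return sum(speeds) / len(speeds), speeds


# Hypothetical centroid x positions at points A, B, C, D and their timestamps.
mpp = metres_per_pixel(distance_constant_m=16.0, frame_width_px=320)
average_kph, segment_kph = estimate_average_speed(
    positions_px=[100, 200, 300, 400],
    timestamps_s=[0.0, 0.5, 1.0, 1.5],
    mpp=mpp,
)
print(round(average_kph, 2), [round(s, 2) for s in segment_kph])
```

Four points give three segment speeds, which matches the three estimates mentioned in the text; averaging them smooths out timing jitter on any single segment.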
5 Output and Findings
In experiments with the proposed approach, the speed analysis sometimes produced a significant error. The inaccuracy was traced to an inaccurately measured distance constant and is resolved by calibrating that constant: if the reported speed is slightly high, the distance constant is decreased; conversely, if the reported speed is slightly low, the distance constant is increased. The proposed approach is implemented using OpenCV and TensorFlow. Figure 5 illustrates the input video frame, and Fig. 6 illustrates the output video frame, which contains salient details like vehicle count, direction, speed, colour and type of vehicle. Vehicle detection, tracking and colour recognition are demonstrated while the vehicle is in the field of view, and speed estimation, direction and type of vehicle are demonstrated when the vehicle passes through the point of interest.

Fig. 5 Input video frame
Fig. 6 Output video frame

For testing and verification of the proposed approach, real-time test drives were conducted, and the speed readings of the speedometer and of the proposed system were recorded as shown in Table 1. Figure 7 illustrates the comparison between the speedometer and computer vision readings, which shows that the proposed approach achieved approximately 97% accuracy on average (the eight accuracies in Table 1 have a mean of 96.5%). For the purpose of data visualization, graphs have been generated using Tableau. Furthermore, four different videos were processed using the proposed approach for different environmental conditions. Figure 8 illustrates that the detection and tracking accuracy fluctuates with the environmental conditions. Tracking accuracy is highest in the afternoon, with an average of 98.9%. The lowest tracking accuracy is seen at night, with an average of 94.5%, because of the lack of illumination. The average tracking accuracy and average detection accuracy obtained by the proposed approach across all four environmental conditions are approximately 96%, as shown in Table 2.

Fig. 7 Variation in accuracy of speed by comparing speedometer and computer vision speed readings

Table 1 Analysis of proposed approach by comparing speedometer and computer vision speed readings
Vehicle count | Direction of vehicle | Speedometer reading (kph) | CV reading (kph) | Speed accuracy (%)
1 | Downwards | 30 | 31.74 | 94.20
2 | Upwards | 30 | 28.56 | 95.20
3 | Downwards | 35 | 33.74 | 96.40
4 | Upwards | 35 | 35.74 | 97.89
5 | Downwards | 40 | 41.97 | 95.08
6 | Upwards | 40 | 38.78 | 96.95
7 | Downwards | 45 | 44.48 | 98.85
8 | Upwards | 45 | 46.12 | 97.52

Fig. 8 Variation in accuracy of detection and tracking in different environmental condition

Table 2 Analysis of proposed approach by comparing average detection and tracking accuracy in different environmental conditions
Video count | Environmental conditions | Avg. detection accuracy (%) | Avg. tracking accuracy (%)
1 | Morning | 97.7 | 96.2
2 | Afternoon | 98.2 | 98.9
3 | Evening | 95.9 | 94.8
4 | Night | 92.8 | 94.5
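The accuracy figures and the calibration rule can be illustrated as follows. The sketch assumes that accuracy is 100 minus the relative error against the speedometer, which agrees with the rows of Table 1 to within rounding, and that the estimated speed scales linearly with the distance constant; the paper only states that the constant is decreased or increased, so the proportional update and both function names are assumptions.

```python
def speed_accuracy(reference_kph, measured_kph):
    """Accuracy in percent, assumed to be 100 minus the relative error."""
    return 100.0 * (1.0 - abs(measured_kph - reference_kph) / reference_kph)


def calibrated_distance_constant(distance_constant, reference_kph, measured_kph):
    """Assumed proportional update: shrink the constant if the reading is high,
    grow it if the reading is low."""
    return distance_constant * reference_kph / measured_kph


# (speedometer kph, computer vision kph) pairs taken from Table 1.
readings = [(30, 31.74), (30, 28.56), (35, 33.74), (35, 35.74),
            (40, 41.97), (40, 38.78), (45, 44.48), (45, 46.12)]
accuracies = [speed_accuracy(ref, cv) for ref, cv in readings]
mean_accuracy = sum(accuracies) / len(accuracies)
print(round(accuracies[0], 2), round(mean_accuracy, 1))

# A reading that is too high (31.74 against 30) lowers a hypothetical 16 m constant.
new_constant = calibrated_distance_constant(16.0, 30, 31.74)
print(round(new_constant, 2))
```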
6 Conclusion
This paper proposes an approach that obtains the cumulative vehicle count and vehicle colour using the TensorFlow API and estimates the speed and direction of vehicles using a centroid tracker and an object tracker. The proposed approach has been tested in real time and with surveillance cameras under different environmental conditions. For speed estimation, an inaccurately measured distance constant sometimes leads to an inaccurate speed reading; this is solved by calibrating the distance constant. The direction of the vehicle can help to detect wrong-side driving on single-lane roads. The output is saved as a CSV file and as a video. The CSV file can be used to address traffic management problems with various statistical and analytical tools, and the video output can be used as evidence against a violator.
References
1. S. Singh, Road traffic accidents in India: issues and challenges. Transp. Res. Procedia 25, 4712–4723 (2017). https://doi.org/10.1016/j.trpro.2017.05.484
2. S. Hua, M. Kapoor, D.C. Anastasiu, Vehicle tracking and speed estimation from traffic videos, in IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 153–1537. https://doi.org/10.1109/CVPRW.2018.00028
3. B.N.K. Sai, T. Sasikala, Object detection and count of objects in image using tensor flow object detection API, in International Conference on Smart Systems and Inventive Technology (ICSSIT), 2019, pp. 542–546. https://doi.org/10.1109/ICSSIT46314.2019.8987942
4. D. Song, R. Tharmarasa, T. Kirubarajan, X.N. Fernando, Multi-vehicle tracking with road maps and car-following models. IEEE Trans. Intell. Transp. Syst. 19(5), 1375–1386 (2018). https://doi.org/10.1109/TITS.2017.2723575
5. D. Zhang, H. Maei, X. Wang, Y.-F. Wang, Deep reinforcement learning for visual object tracking in videos (2017)
6. T. Kumar, D. Kushwaha, An efficient approach for detection and speed estimation of moving vehicles. Procedia Comput. Sci. 89, 726–731 (2016). https://doi.org/10.1016/j.procs.2016.06.045
7. S.K. Thakkar, Dominant colors in an image using k-means clustering. Medium (2018). https://buzzrobot.com/dominant-colors-in-an-image-using-k-means-clustering-3c7af4622036
8. N.A. Othman, I. Aydin, A new deep learning application based on Movidius NCS for embedded object detection and recognition, in 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), 2018, pp. 1–5. https://doi.org/10.1109/ISMSIT.2018.8567306
9. T.Y. Lin et al., Microsoft COCO: common objects in context, in Computer Vision—ECCV 2014. ECCV 2014, ed. by D. Fleet, T. Pajdla, B. Schiele, T. Tuytelaars. Lecture Notes in Computer Science, vol. 8693 (Springer, Cham, 2014). https://doi.org/10.1007/978-3-319-10602-1_48
10. O. Elisha, Real-time Object Detection using SSD MobileNet V2 on Video Streams. Medium (2020). https://heartbeat.fritz.ai/real-time-object-detection-using-ssd-mobilenet-v2-on-video-streams-3bfc1577399c
11. C.R. del-Blanco, F. Jaureguizar, N. Garcia, An efficient multiple object detection and tracking framework for automatic counting and video surveillance applications. IEEE Trans. Consum. Electron. 58(3), 857–862 (2012). https://doi.org/10.1109/TCE.2012.6311328
12. P. Saini, K. Bidhan, S. Malhotra, A detection system for stolen vehicles using vehicle attributes with deep learning, in 2019 5th International Conference on Signal Processing, Computing and Control (ISPCC), 2019
13. N. Ali, D. Neagu, P. Trundle, Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets. SN Appl. Sci. 1, 1559 (2019). https://doi.org/10.1007/s42452-019-1356-9
14. M.R. Sunitha, H.S. Jayanna, Ramegowda, Tracking multiple moving object based on combined color and centroid feature in video sequence, in IEEE International Conference on Computational Intelligence and Computing Research, 2014, pp. 1–5. https://doi.org/10.1109/ICCIC.2014.7238412
15. A. Heredia, G. Barros-Gavilanes, Video processing inside embedded devices using SSD-Mobilenet to count mobility actors, in 2019 IEEE Colombian Conference on Applications in Computational Intelligence (ColCACI), 2019
16. K.C. Lai, Y.P. Chang, K.H. Cheong, S.W. Khor, Detection and classification of object movement—an application for video surveillance system, in 2010 2nd International Conference on Computer Engineering and Technology, 2010, pp. V3-17–V3-21. https://doi.org/10.1109/ICCET.2010.5485748
17. B. Armstrong, GPU Benchmarking on the Edge with Computer Vision Object Detection. MobiledgeX (2021). https://mobiledgex.com/blog/2021/01/07/gpu-benchmarking-on-the-edge-with-computer-vision-object-detection
18. S. Murray, Real-Time Multiple Object Tracking (2017). https://kth.diva-portal.org/smash/get/diva2:1146388/FULLTEXT01.pdf
Transparent Blockchain-Based Electronic Voting System: A Smart Voting Using Ethereum Md. Tarequl Islam, Md. Sabbir Hasan, Abu Sayed Sikder, Md. Selim Hossain, and Mir Mohammad Azad
Abstract This research work examines an e-voting concept built on the ethereum blockchain platform. Ethereum is a free, open-source distributed computing platform with smart contract functionality. Using this platform, it is feasible to build applications of scientific interest that give insight into the actual interactions occurring on the blockchain. E-voting is widely accepted worldwide because it is a tool that upholds the democratic character of an election at every stage. Consequently, most countries continue to experiment with and develop the e-voting process. Blockchain technology provides a decentralized design that distributes updated data simultaneously across the P2P network without a central database. Finally, the experiment addresses the weaknesses of the existing e-voting method and successfully applies blockchain technology to resolve them.
Keywords E-voting · Blockchain · Ethereum · Smart contract · Authorization · Decentralization · Security and privacy
Md. T. Islam (B) · Md. S. Hasan · M. M. Azad, Department of Computer Science and Engineering, Khwaja Yunus Ali University, Enayetpur, Sirajganj 6751, Bangladesh
A. S. Sikder, School of Science and Engineering, Southeast University, Dhaka, Bangladesh
Md. S. Hossain, Department of Computing and Information System, Daffodil International University (DIU), Dhaka, Bangladesh
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_7

1 Introduction
A blockchain is a permanent, continuously growing ledger [1–3] that keeps an indestructible record of all the transactions that have taken place in a secure, chronological, and immutable [4] way. A blockchain comprises a chain of blocks that hold information. Every block records all of the current transactions and, once completed, goes into the blockchain as a permanent database. Blockchain technology can be applied in many areas. The fundamental use of blockchain is as a distributed [5] ledger [6] for cryptocurrencies. It shows great promise across a wide range of business applications such as entertainment, finance, health care, insurance, media, government, banking, and retail. Blockchain technology has become more widely accepted because of its immutable transactions, reliability, security, collaboration, time savings, and decentralization. In 1991, Stuart Haber and W. Scott Stornetta described this technology [7]. The electronic voting machine (EVM) has been one of the most seriously disputed topics among political parties over the last few years [4]. In the conventional scheme [8, 9], the election commission has to print a separate ballot paper for each voter [10]. A voter uses "seal and ink" to vote for the selected candidate [11]. Sometimes many votes become invalid because the "seal" is placed in an unintended area. Furthermore, fraud in voting [3, 7] and a lack of transparency in counting [1, 3] are the major drawbacks of the traditional system [7]. In general, the voting system should allow all eligible voters to vote, with the same convenience of access and location. Voters should be able to verify that their vote has been counted, and the outcome should be consistent throughout the voting process [4]. The election should have a reasonable cost. No one but the voter should be able to disclose their vote. All participants should be able to validate the voting process. By meeting legislators' legal requirements, blockchain-based electronic voting systems improve voting integrity, optimize the voting process, and produce consistent voting results. They also help to mitigate current challenges in the long term through the distributed ledger process.
Blockchain technology is highly valuable for overcoming the constraints of e-voting. A blockchain behaves as an irrefutable, distributed ledger [4, 9]. The central database is removed. Every node in the P2P network holds an identical copy of the blockchain, so there is no single point of failure [8]. Whenever new data, a so-called block, is created, it references the previous block, which creates an immutable chain that protects the data from tampering [6, 7]. Tampering would require "control over half of the nodes (51%) in the net" [7], which makes the system extremely secure. It is unlikely that a DDoS attack could be launched against multiple nodes in the network at the same time [8]. Moreover, ethereum adds further extensions while retaining the blockchain functionalities: "Give authorization to the developer to program and customize blockchain (i.e., smart contracts)" [12] and "Least CPU possessing the cost in terms of performance" [13]. Furthermore, with blockchain technology and its consensus algorithm, the decentralized architecture provides a higher level of security than the centralized (client–server) architecture.
2 Related Work
In this section, we present the state of the art in relevant e-voting schemes that use blockchain as a service. "Agora" [14] designed an end-to-end verifiable [15] voting solution that is "blockchain-based for governments and institutions" [16]. For an election, governments and institutions purchase a token for each eligible voter, and Agora uses these tokens on the blockchain [14]. The current voting system has numerous flaws, including political power abuse, high costs, and a lengthy procedure, among others. To address these issues, we propose a blockchain-based smart voting system. In the UK, digital voting based on blockchain technology allows voters to vote from their home district or from a web browser at home [15, 16]. An online biometric voting system is a web-based voting system that delivers rapid, accurate, and secure election results to validate the electoral process. In this voting system, there are two separate user roles, an administrator and a system user, each with its own rights to create data [12, 13]. The administrator generates logs, enters candidate information, creates voting data and party information, and stops the web application, whereas the system user only makes logs, creates voter information, and closes the web application. The system takes the voter's fingerprint to check the voter's registration; when the fingerprint matches the database, the system issues the voter's PIN, and otherwise the system scans the finger again to identify the corresponding fingerprint. Once the PIN has been received, the voter can vote [5].
3 Proposed Concept Along with Contribution
The main contributions of the paper are listed below:
(1) A message authentication and broadcasting mechanism is proposed that allows authorization to be checked while preserving secrecy. The mechanism can be used in various scenarios including vote, authority, result, candidates, and so on.
(2) The biometric information and the checking method are decoupled into three steps, administered respectively by the authority, the voters, and the candidates' smart contract.
(3) The complete mechanism is implemented in simulation, including reliable computation techniques based on ethereum.
4 Working Procedure

Algorithm 1: Algorithm for Authentication
authorization := initialization
If authorization := (voter_id, Biometric_Info)
Add := Node{(node_id) & (authentication)}
Authority = certification{user credentials, certify(credentials, node_id, users_Info)}

Algorithm 2: Voting Algorithm
Vote (V) := vote(voter_id, candidate)
Block := add(V, chain)
BC Info := Update(voting machines)
The authority changes the voter's linked field to "voted": vote(voter_id, user-list, true).

Algorithm 3: Counting Vote
Candidates are fetched from the authority: candidates = get candidates(candidate_list).
Calculation := votes are counted and the winner of the constituency is determined: results = count(chain, candidates).
End

We now describe our project according to these algorithms. The ethereum-based voting system provides us with a database in which every transaction is rechecked. A group of transactions processed together in this way is called a block. Transaction validation is checked against every set of rules. Solidity programming is used to control the permission to access the network when it is a private ethereum-based blockchain. Although ethereum remains an open and permissionless blockchain framework, we attempt to make it permissioned. At every step, the system checks the voter's identity before granting the voter permission to vote (Fig. 1). Since the blockchain (BC) is a distributed ledger system, the data is managed in a decentralized manner. The BC is used to convert ordinary data into blockchain format. Each block is linked to the previous block. There are three blocks in the proposed system, and each block is connected. When a person (voter) comes to vote, his/her details are first checked by the authority using biometric information; then she/he can access the voting system. After verification, the electronic ballot paper is shown on the display and the voter chooses a candidate; this choice is verified again by the authority. Finally, when everything has been verified, the vote is counted for the chosen candidate. In addition, a NID number is held against each biometric fingerprint record. The NID number is matched against the fingerprint, which makes it easy to retrieve the data from the database system.
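The three algorithms can be sketched as a small in-memory ledger. This is a simplified Python illustration under stated assumptions, not the authors' Solidity contract: there is no network, consensus or real biometric matching, the class `VotingLedger` and its method names are hypothetical, and the biometric check is reduced to comparing a stored string.

```python
import hashlib
import json
from collections import Counter


def block_hash(block):
    """SHA3-512 digest of a block's JSON form (hash choice is an assumption)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha3_512(payload).hexdigest()


class VotingLedger:
    """Toy hash-linked vote chain following Algorithms 1 to 3."""

    def __init__(self, registered, candidates):
        self.registered = dict(registered)  # voter_id -> enrolled biometric token
        self.candidates = list(candidates)
        self.has_voted = set()
        self.chain = [{"index": 0, "vote": None, "prev_hash": "0" * 128}]

    def authenticate(self, voter_id, biometric_info):
        # Algorithm 1: the authority certifies the voter's credentials.
        return self.registered.get(voter_id) == biometric_info

    def vote(self, voter_id, biometric_info, candidate):
        # Algorithm 2: add the vote as a new block and mark the voter as voted.
        if not self.authenticate(voter_id, biometric_info):
            raise PermissionError("authentication failed")
        if voter_id in self.has_voted:
            raise PermissionError("voter has already voted")
        if candidate not in self.candidates:
            raise ValueError("unknown candidate")
        block = {"index": len(self.chain), "vote": candidate,
                 "prev_hash": block_hash(self.chain[-1])}
        self.chain.append(block)
        self.has_voted.add(voter_id)

    def is_chain_valid(self):
        # Each block must reference the hash of the block before it.
        return all(self.chain[i]["prev_hash"] == block_hash(self.chain[i - 1])
                   for i in range(1, len(self.chain)))

    def count(self):
        # Algorithm 3: tally the votes recorded on the chain.
        tally = Counter(block["vote"] for block in self.chain[1:])
        return {candidate: tally.get(candidate, 0) for candidate in self.candidates}


ledger = VotingLedger({"v1": "fp1", "v2": "fp2", "v3": "fp3"}, ["A", "B"])
ledger.vote("v1", "fp1", "A")
ledger.vote("v2", "fp2", "B")
ledger.vote("v3", "fp3", "A")
results = ledger.count()
print(results, ledger.is_chain_valid())
```

Because every block stores the hash of its predecessor, changing an earlier vote makes `is_chain_valid()` return False, which is the tamper evidence the text attributes to the linked blocks.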
4.1 Transaction Table
The voter's data is kept secure using the SHA3-512 hashing algorithm, and to keep voters' voting preferences undisclosed, only the voter ID and election ID are stored in the database, which does not reveal for whom they have voted [17, 18]. Other transaction outputs are given below for each transaction ID (Tables 1, 2 and 3).
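As a minimal illustration of the hashing step, the sketch below derives a SHA3-512 digest from a voter ID and an election ID using Python's standard `hashlib`. The function name, the `|` separator and the optional salt are assumptions made for the example; the paper does not specify how the fields are combined.

```python
import hashlib


def hash_voter_record(voter_id, election_id, salt=b""):
    """Return the SHA3-512 hex digest of a (voter ID, election ID) pair."""
    data = salt + voter_id.encode("utf-8") + b"|" + election_id.encode("utf-8")
    return hashlib.sha3_512(data).hexdigest()


# Hypothetical identifiers, not taken from the transaction tables.
digest = hash_voter_record("voter-001", "election-1")
print(len(digest))  # 128 hex characters, i.e. 512 bits
```

The digest contains no candidate field, so a stored record of this kind does not by itself show which candidate was chosen.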
Fig. 1 Proposed block diagram of e-voting system using blockchain

Table 1 Transaction details
No | Election_ID | Election_name | Number of candidates
1 | 0xCA35b7d915458EF540aDe6068dFe2F44E8fa733c | Select candidate | 3
Table 2 Voting transaction table
No:
1
2
3
4
5
6
7
Voter ID:
0x583031d1113ad414f02576bd6afabfb302140225
0xdd870fa1b7c4700f2bd7f44238821c26f7392148
0x63fdf702493a35c653e9a1a851da1cc6aff16c33
0x560565799dc6a92c6fc494c58835cd01fa2b7b81
0x2e2ae5091cb3971389bb92b5dd1428ad81491baf
0x0a4cdb536739782cff6bd6f3c7a0ebf6cae8efa3
0xc27a944e35f9b9b06f4c83e785353ce1a460d919
Candidate ID:
0x4b0897b0513fdc7c541b6d9d7e929c4e5364d2db
0xca35b7d915458ef540ade6068dfe2f44e8fa733c
0x4b0897b0513fdc7c541b6d9d7e929c4e5364d2db
0xca35b7d915458ef540ade6068dfe2f44e8fa733c
0x4b0897b0513fdc7c541b6d9d7e929c4e5364d2db
0x14723a09acff6d2a60dcdf7aa4aff308fddc160c
0xca35b7d915458ef540ade6068dfe2f44e8fa733c
Status:
Vote successful
Vote successful
Vote successful
Vote successful
Vote successful
Vote successful
Vote successful
Time:
7-12-2020 19:10:55
7-12-2020 19:04:09
7-12-2020 19:00:05
7-12-2020 18:58:05
7-12-2020 18:56:15
7-12-2020 18:54:07
7-12-2020 18:54:05
Gas cost:
61,435
50,139
69,665
50,139
59,564
62,560
50,139
Table 3 Output of a transaction
Status | 0x1 transaction mined and execution succeed
Transaction hash | 0x8593e5277dd25ffd6261d1db16f709f8e96e9231decb44b55049d7e6cb7d773e
From | 0xca35b7d915458ef540ade6068dfe2f44e8fa733c
To | 0x692a70d2e424a56d2c6c27aa97d1a86395877b3a
Gas | 3,000,000 gas
Transaction cost | 50,139 gas
Execution cost | 27,523 gas
Hash | 0x8593e5277dd25ffd6261d1db16f709f8e96e9231decb44b55049d7e6cb7d773e
Input | 0x462…00,000
Decoded input | 0x0DCd2F752394c41875e259e00bb44fd505297caF
Decoded output | []
Logs | [][]
Value | 0 wei
The system counts the total votes received by each candidate in each election and stores them in the database. The final result is shown on the administrator site. The following figures show the number of candidates in all elections and the votes received by each candidate in each election.
5 Result Analysis
We describe the delay time between two voters: when one voter votes, the next voter has to wait for that transaction to complete. In a network setup, the program does not transmit instantly; a short delay accumulates with each transmission. For example, consider ten voters voting in a network. If every voter takes 10–15 ms to complete his/her transaction, the accumulated delay becomes high (Figs. 2 and 3; Table 4).

Fig. 2 Chart of delay time (x-axis: Candidates; y-axis: Time)
Fig. 3 Curve of continuous delay time (x-axis: Candidates; y-axis: Time)

Table 4 Delay of time for voters
Voters | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th
D. time | 10 | 22 | 28 | 40 | 50 | 60.50 | 73 | 85 | 90 | 103.45
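The way the delay builds up can be shown with a running sum. The per-voter times below are hypothetical values in the 10–15 ms range mentioned in the text and are not the measurements behind Table 4; `cumulative_delays` is a name chosen for this sketch.

```python
from itertools import accumulate


def cumulative_delays(per_voter_ms):
    """Waiting time seen after each voter when transactions run one after another."""
    return list(accumulate(per_voter_ms))


# Hypothetical per-transaction times in milliseconds for four voters.
per_voter_ms = [10, 12, 15, 11]
delays = cumulative_delays(per_voter_ms)
print(delays)
```

Each voter's delay is the sum of all earlier transaction times plus their own, so the total grows roughly linearly with the number of voters.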
6 Security Issue
First, we consider the security of smart contracts and e-voting on the blockchain, which is the foremost concern that must be addressed, because if citizens are not assured of their security, they will not take part in the process [3]. Our proposed approach satisfies certain security requirements, such as anonymity, voters' privacy, confidentiality, resistance to ballot manipulation, and transparency.
7 Limitations of the Proposed System
In ethereum-based blockchain (EBB) technology, transactions take place in a cryptographic way in which logs are not open and cannot be modified. It is not possible to obtain the log data of an EBB transaction. Smart contracts come with the same cost of immutability as the blockchain: even a small error in the code can prove costly and time-consuming to correct once the smart contract has been set to execute. Although the elimination of third parties is a premise of blockchain and smart contracts, there is no real way to dispense with them entirely.
8 Conclusion and Future Work
From the earlier discussion, e-voting is a developing idea, or solution, for carrying out voting operations with accuracy and authenticity. In this research work, we proposed a voting system based on the ethereum blockchain, which provides a decentralized platform. The main contribution of this work is the use of biometric input with the ethereum network in the e-voting system. However, the whole work was done in simulation software and the ethereum online IDE. The Paillier cryptosystem is implemented here as a library in Solidity; with this cryptography, the library could greatly improve ballot verifiability. BC systems are also applicable to NFT marketplaces, real-time IoT monitoring, personal identity security, supply chains, the banking sector, etc. In the future, we will try to implement a blockchain-based e-voting system in real life.
References
1. S. Agathiyan, S. Latha, J. Menaka, A new approach of solar powered electronic voting machine with authentication system and for blind people. Adv. Innov. Res., 36 (2019)
2. K.M. Khan, J. Arshad, M.M. Khan, Investigating performance constraints for blockchain based secure e-voting system. Futur. Gener. Comput. Syst. 105, 13–26 (2020)
3. E. Yavuz, A.K. Koç, U.C. Çabuk, G. Dalkılıç, Towards secure e-voting using ethereum blockchain, in 2018 6th International Symposium on Digital Forensic and Security (ISDFS), pp. 1–7 (2018)
4. M.S. Hossain, M.T. Islam, An extension of vigenere technique to enhance the security of communication, in 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET 2018), no. October, pp. 79–85 (2018). https://doi.org/10.1109/ICISET.2018.8745638
5. F.Þ. Hjálmarsson, G.K. Hreiðarsson, M. Hamdaqa, G. Hjálmtýsson, Blockchain-based e-voting system, in 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), pp. 983–986 (2018)
6. A. Rahman, M.S. Hossain, Z. Rahman, S.K.A. Shezan, Performance enhancement of the internet of things with the integrated blockchain technology using RSK sidechain. Int. J. Adv. Technol. Eng. Explor. 6(61), 257–266 (2019)
7. A. Zahan, M.S. Hossain, Z. Rahman, S.K.A. Shezan, Smart home IoT use case with elliptic curve based digital signature: an evaluation on security and performance analysis. Int. J. Adv. Technol. Eng. Explor. 7(62), 11–19 (2020)
8. M.S.I. Sarker, Z. Rahman, S.K.A. Shezan, M.S. Hossain, M. Mahabub, Security assumptions for ubiquitous secure smart grid infrastructure using 2 way peg blockchain and fuzzy specifications. Blue Eyes Intell. Eng. Sci. Publ.: Int. J. Innov. Technol. Explor. Eng. (IJITEE) 9(4), 2278–3075 (2020)
9. M.M. Alhammad, A.M. Moreno, Gamification in software engineering education: a systematic mapping. J. Syst. Softw. 141, 131–150 (2018)
10. X. Wang et al., Survey on blockchain for internet of things. Comput. Commun. 136, 10–29 (2019)
11. Y. Kurt Peker, X. Rodriguez, J. Ericsson, S.J. Lee, A.J. Perez, A cost analysis of internet of things sensor data storage on blockchain via smart contracts. Electronics 9(2), 244 (2020)
12. S. Hossain, S. Waheed, Z. Rahman, S.K.A. Shezan, M. Hossain, Blockchain for the security of internet of things: a smart home use case using ethereum. Int. J. Recent Technol. Eng. 8(5), 4601–4608 (2020). https://doi.org/10.35940/ijrte.e6861.018520
13. K. Christidis, M. Devetsikiotis, Blockchains and smart contracts for the internet of things. IEEE Access 4, 2292–2303 (2016). https://doi.org/10.1109/ACCESS.2016.2566339
14. Y. Wang et al., Formal verification of workflow policies for smart contracts in azure blockchain, in Working Conference on Verified Software: Theories, Tools, and Experiments, pp. 87–106 (2019)
15. T. Chen, X. Li, X. Luo, X. Zhang, Under-optimized smart contracts devour your money, in 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 442–446 (2017)
16. K. Patidar, S. Jain, Decentralized e-voting portal using blockchain, in 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1–4 (2019)
17. L. Vo-Cao-Thuy, K. Cao-Minh, C. Dang-Le-Bao, T.A. Nguyen, Votereum: an ethereum-based e-voting system, in 2019 IEEE-RIVF International Conference on Computing and Communication Technologies (RIVF), pp. 1–6 (2019)
18. M.T. Islam, M.K. Nasir, M.M. Hasan, M.G.G. Faruque, M.S. Hossain, M.M. Azad, Blockchain-based decentralized digital self-sovereign identity wallet for secure transaction. Adv. Sci. Technol. Eng. Syst. J. 6(2), 977–983 (2021)
Usage of Machine Learning Algorithms to Detect Intrusion S. Sandeep Kumar, Vasantham Vijay Kumar, N. Raghavendra Sai, and M. Jogendra Kumar
Abstract An intrusion detection system (IDS) is one of the solutions implemented against harmful attacks. Moreover, attackers continually change their tools and techniques. However, implementing an accepted IDS is also a challenging task. In this paper, several experiments have been performed and evaluated to assess different machine learning classifiers based on the KDD intrusion dataset. Several performance metrics were computed to evaluate the selected classifiers. The focus was on the false negative and false positive performance metrics in order to enhance the detection rate of the intrusion detection system. An intrusion detection system identifies malicious behaviour and abnormal activities that can harm the security and trust of a computer system. An IDS works at the host or network level using anomaly or misuse detection. The main problem is to accurately detect intruder attacks against the computer network. The key to successful intrusion detection is the selection of the right features. To address these IDS issues, this research work proposes an improved technique for detecting intrusions using machine learning algorithms. In this article, we use the KDDCUP 99 dataset to analyse the effectiveness of intrusion detection with various machine learning algorithms such as J48, J48 Graft, Naïve Bayes, Bayes, and Random Forest. For network-based intrusion detection on the KDDCUP 99 dataset, the experimental results show that the three algorithms J48, J48 Graft, and Random Forest give significantly better results than the other machine learning algorithms. We use WEKA to verify the accuracy on the classified dataset using our proposed approach. We have considered all the parameters for evaluating the results, for instance, precision, recall, F-measure, and ROC.
Keywords Machine learning · Deep learning · Intrusion detection system · Network security · Precision · Recall · IDS · KDDCUP 99
S. Sandeep Kumar · V. V. Kumar · N. Raghavendra Sai (B) · M. Jogendra Kumar, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_8
1 Introduction
With the rapid development of information technology over the past twenty years, computer networks have come to be widely used in industry, business, and other areas of human life. Consequently, building reliable networks is a vital task for IT administrators. On the other hand, the rapid development of information technology has created several challenges that make building reliable networks a difficult task. Many types of attacks threaten the availability, integrity, and confidentiality of computer networks. The denial-of-service (DoS) attack is considered one of the most common harmful attacks. For example, the well-known open-source intrusion detection system Snort [1], running in an organization with a few hundred machines, generates a very large number of alerts every day, a large share of which are false alarms. Typically, an IDS produces a large volume of output that arrives at the organization's security operations center (SOC), which results in considerable effort and long working hours. To overcome this problem, analysts usually write careful IDS rules (flags) that match only very specific patterns and so reduce the overall false positive (FP) rate. However, this leads to a failure to detect variants of an attack or different kinds of threats, owing to the polymorphic nature of attacks, which is a consequence of the human reasoning behind them. In addition, to prevent false negatives (FN), that is, missed detections, IDS engineers resort to combining the previous strategy with a less restrictive configuration, so that traffic with even a remote possibility of being an attack triggers an alert.

(A) IDS
Automated attacks on computers and networks have become common events in which the security, reliability, quality, and availability of the targeted computer systems are compromised. Therefore, a system should be put in place that can detect and block attacks on a host or computer network. Accordingly, various tools and technologies have appeared to automate this process. The main IDS terms are the following:
• Intrusion: An attempt to compromise the confidentiality, integrity, or availability (CIA) of a computer system or network.
• Intrusion detection: The process of monitoring events occurring in a computer system or network and analysing them for signs of intrusion.
• IDS: A software or hardware component that automates the intrusion detection process.
• Intrusion prevention system (IPS): A system that has all of the capabilities of an IDS but can also actively block possible incidents.
IDSs can be classified as network based or host based and are characterized by their signatures, features, and details. These classes are described in more detail below:
Usage of Machine Learning Algorithms to Detect Intrusion
• Host-based IDS (HIDS): A system that runs as an agent on the local host and monitors the machine's behavior, for example by examining its logs.
• Network-based IDS (NIDS): A system that monitors network traffic, usually made up of sensors distributed throughout the network and a management unit. The sensors capture network packets, such as TCP/IP packets, and the system tries to identify malicious packets or anomalous activity in the network [2].
1.1 Detection Approaches
Intrusion detection techniques are generally classified as follows:
• Misuse-based or signature-based detection: A signature is a pattern that corresponds to a known attack or threat; misuse detection compares these patterns with observed events in order to recognize possible intrusions. It works much like most antivirus programs [3]: the network is checked for behavior that has been predefined as harmful. Such systems are effective and fast, since they simply match what they observe against a predefined pattern. A signature-based IDS will not recognize novel threats. When a new attack appears, the signature database must be updated before the network becomes vulnerable.
• Anomaly-based IDS: Anomaly-based detection depends on a characterization of the network's behavior. Behavior that matches the predefined profile is accepted; otherwise, the event is flagged as an anomaly [2]. The accepted behavior of the network is either specified or learned from the specifications of the network's administrators. The critical step in characterizing network behavior is an IDS engine capable of parsing the various protocols at all levels. Although this protocol analysis is computationally expensive, it pays off, for instance by extending the rule set so that fewer false positive alerts are raised. The main drawback of anomaly-based detection is the definition of its rule set. The effectiveness of the approach depends on implementing and testing the rules against every available protocol.
• Specification-based IDS: In this approach, manually developed specifications are used to describe the legitimate behavior of a program. The approach relies on formal specifications written by the vendor that describe the permitted operations, which allows the states of the system to be tracked.
In general, the network protocols in a specification-based IDS follow standards from international standards bodies that define acceptable behavior. The advantage of this approach is that it does not produce false alarms when legitimate but previously unseen user behavior occurs. It can also detect previously unknown attacks, because it recognizes attacks that deviate from the predefined legitimate behavior. However, developing such specifications requires considerable effort, which affects the practicality of the approach. Moreover, its effectiveness in reducing false positives remains unproven.
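The contrast between signature-based and anomaly-based detection can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two signature strings, the single numeric feature, and the three-standard-deviation threshold are hypothetical and are not taken from any real IDS.

```python
from statistics import mean, stdev

# Hypothetical known-bad patterns; a real signature database is far richer.
SIGNATURES = {"/etc/passwd", "' OR 1=1"}


def signature_alert(payload: str) -> bool:
    """Misuse detection: alert only when a known pattern occurs in the payload."""
    return any(sig in payload for sig in SIGNATURES)


def anomaly_alert(value: float, baseline: list, k: float = 3.0) -> bool:
    """Anomaly detection: alert when a measurement deviates from the
    learned baseline by more than k standard deviations."""
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(value - mu) > k * sigma


# Baseline of "normal" requests per minute (made-up numbers).
normal_rates = [100, 110, 90, 105, 95]
print(signature_alert("GET /etc/passwd"))   # matches a stored signature
print(anomaly_alert(500, normal_rates))     # far outside the baseline
```

The signature check cannot flag a payload it has never seen, while the anomaly check flags any large deviation, including harmless ones, which mirrors the false negative versus false positive trade-off discussed above.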
Hybrid IDS: Hybrid IDSs combine several of the techniques used to detect attacks and to deploy the IDS in the network. An IDS can perform anomaly detection or misuse detection and can be deployed as a network-based or a host-based system. This results in four general groups: network misuse, host misuse, host anomaly, and network anomaly [4]. Some IDSs combine the characteristics of these classes (mainly by detecting both misuse and anomalies) and are regarded as hybrid systems. It is important to distinguish between anomaly detection and misuse detection approaches [5]. The key problem IDSs face is the large number of false positive alerts, in which normal traffic is mistakenly classified as a security breach. An ideal IDS raises no false or irrelevant alerts. In practice, signature-based IDSs tend to produce more false alarms than expected. This is due to overly general signatures and the lack of a built-in verification mechanism to confirm whether an attack succeeded. The huge number of false positives in the alert log makes taking corrective action on true positives, such as successful attacks, slow and labor-intensive.
2 Literature Survey
In recent years, many commercial and non-commercial intrusion detection approaches have been developed. The latest techniques are used to speed up the development of this kind of system. Data mining techniques can cope with very large datasets and enable the automation of IDS. Anomaly detection models have been developed to detect intrusions with a high degree of accuracy. As the reviewed research shows, two kinds of profiling are performed. Some IDSs maintain a knowledge base of known intrusive activity patterns and raise an alert when such activity is observed. These systems produce fewer false alarms because they focus on known usage patterns; however, intrusive behavior that follows new patterns is likely to be missed. Another class of IDS maintains a normal operational profile built by a learning process. Anything outside this behavioral profile is flagged as a potential intrusion. These systems have a higher false alarm rate; however, they are more likely to detect a previously unseen intrusion. Xiao et al. [6, 7] presented an intrusion detection approach that applies a genetic algorithm (GA) to detect intrusions in networks through feature selection. Their approach uses information theory to extract relevant features and reduce the complexity of the problem. They then build a linear structure rule from the selected features to classify network behavior into normal and anomalous activity. However, their approach considers only discrete features. Caulkins et al. [5] described a dynamic data mining technique for detecting anomalies in networks using decision trees. The boosting method
improves ensemble performance by using an adaptive window and a decision tree as the base learner. The basic idea of boosting is to combine simple rules into an ensemble so that the performance of each single ensemble member is improved. The boosting algorithm starts by giving all training tuples the same weight w0. After a classifier is built, the weight of each tuple is changed according to the classification given by that classifier. Then, a second classifier is built on the reweighted training tuples. The final intrusion detection classification is a weighted average of the individual classifications of all the classifiers. Pan et al. [8] described a scheme for misuse detection using a neural network ensemble and the C4.5 algorithm. Gaddam et al. [9] described a scheme for detecting anomalies by cascading K-means clustering and ID3 decision tree learning algorithms. In most research work, SVM has been used to classify network traffic patterns. The drawback of this method is that training the system takes a long time, so it is important to address this difficulty using clustering, fuzzy logic, genetic algorithms, and neural networks. Platt [10] described a fast training method for SVM using sequential minimal optimization. Khan et al. [11] introduced a method to reduce SVM training time, particularly for large datasets, using hierarchical clustering analysis. In their work, a dynamically growing self-organizing tree algorithm is used for clustering, since it has been shown to overcome the problems of existing hierarchical clustering algorithms. The clustering analysis helps to find boundary points, which are the data points best suited to training the SVM, between the two classes, anomalous and normal. Their algorithm has contributed significantly to improving the SVM training process while retaining high generalization accuracy. Guo et al. [12] proposed a novel algorithm for multiclass SVM. The tree constructed by the algorithm consists of a series of two-class SVMs. Mulay et al. [13] proposed an SVM-based IDS using a decision tree. Chen et al. applied the support vector machine to multiclass classification problems and addressed multiclass classification errors. They assessed the limitations of existing methods and then designed a decision-tree-based SVM (DTSVM) that uses a genetic algorithm (GA) to maintain maximum generalization ability. Yi et al. [14] proposed a modified radial basis kernel function (U-RBF), in which the means and mean square differences of the feature attributes are embedded in the radial basis function (RBF). They suggested an improved incremental SVM kernel function, U-RBF, which is based on the Gaussian kernel function. This method reduces noise oscillation among features; consequently, the detection rate of U-RBF is higher than that of RBF. U-RBF plays an important part in saving training and testing time. However, the method is ineffective at detecting user-to-root (U2R) and remote-to-local (R2L) attacks.
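The boosting procedure summarized earlier in this section (equal initial weights, reweighting after each classifier, weighted combination of the classifiers) can be sketched as follows. This is a minimal sketch that assumes the AdaBoost update rule; the cited work may use a different rule, and the function names `reweight` and `weighted_vote` are illustrative.

```python
import math


def reweight(weights, correct):
    """One boosting round.

    weights: current tuple weights; correct: True where the classifier
    just built labelled the tuple correctly.
    Returns the normalized new weights and the classifier's vote weight.
    """
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)        # avoid log(0) and division by zero
    alpha = 0.5 * math.log((1 - err) / err)      # AdaBoost vote weight (assumed rule)
    new = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new], alpha


def weighted_vote(predictions, alphas):
    """Combine per-classifier labels in {-1, +1} by their vote weights."""
    score = sum(a * p for a, p in zip(alphas, predictions))
    return 1 if score >= 0 else -1


# Four tuples start with the same weight w0 = 0.25; the last one is misclassified.
w, alpha = reweight([0.25] * 4, [True, True, True, False])
print(w, alpha)
```

After the round, the misclassified tuple carries half of the total weight, so the next classifier is pushed to get it right.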
3 ML Algorithms
We evaluated the proposed method with several machine learning algorithms and compared their classification results. The machine learning algorithms described below form the basis of the experiments performed with our proposed methodology.
BayesNet: BayesNet learns Bayesian networks under two assumptions: attributes are nominal, and there are no missing values. Learning has two distinct parts, structure search and estimation of the network's conditional probability tables. In our study, we run BayesNet with the simple estimator and the K2 search algorithm, without using an AD tree. The K2 algorithm works as follows. We assume that the total ordering of the nodes is known. Initially, each node has no parents. The algorithm then incrementally adds the parent whose addition most increases the score of the resulting structure. When no addition of a single parent improves the score, it stops adding parents to the node. Because the ordering of the nodes is known in advance, the search space under this constraint is much smaller than the entire space.
Decision Trees J48: The J48 decision tree is a predictive machine learning model that decides the target value of a new sample based on various attribute values of the available data. The internal nodes of a decision tree denote the different attributes, the branches between the nodes give the possible values that these attributes can take in the observed samples, and the terminal nodes give the final value (classification) of the dependent variable. To classify a new item, J48 first builds a decision tree based on the attribute values of the available training data. Thus, whenever it encounters a set of items (the training set), it identifies the attribute that discriminates the various instances most clearly. The feature that separates the data most cleanly is said to have the highest information gain. Among the possible values of this feature, if there is a value for which there is no ambiguity, that is, for which the data instances falling within its category all have the same value for the target variable, that branch is terminated and assigned that target value.
J48 Graft: J48 Graft generates a grafted decision tree from a J48 tree. The grafting technique adds nodes to an existing decision tree with the aim of reducing prediction errors. These algorithms identify regions of the instance space that are either not occupied by training examples or occupied only by misclassified training examples, and consider alternative classifications for those regions. The aim is to increase the probability of correctly classifying instances that fall outside the regions covered by the training data. Grafting is a post-process that can be applied to decision trees. Its goal is to reduce prediction error by reclassifying regions of the instance space where there is no training data or where there is only misclassified data. It does so by finding the best-suited cuts of existing leaf regions and branching out to create new leaves with alternative
classifications to those below. Although the tree becomes more complex, only branching that does not introduce classification errors on data already classified correctly is considered. The new tree therefore reduces errors rather than introducing them.
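J48's choice of the attribute with the highest information gain can be made concrete with a short sketch. It assumes nominal attributes stored in dictionaries; the attribute names `proto` and `flag` and the four records are made up for illustration. Note that C4.5, which J48 implements, normally ranks attributes by gain ratio, a normalized variant of the quantity computed here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    n = len(rows)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical connection records and their classes.
rows = [
    {"proto": "tcp", "flag": "S0"},
    {"proto": "tcp", "flag": "SF"},
    {"proto": "udp", "flag": "SF"},
    {"proto": "udp", "flag": "S0"},
]
labels = ["attack", "normal", "normal", "attack"]

best = max(["proto", "flag"], key=lambda a: information_gain(rows, labels, a))
print(best)
```

In this toy data, `flag` separates the two classes completely, so each of its branches would become a leaf, whereas `proto` tells us nothing about the class.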
4 Proposed Work
Algorithm I was used in the proposed study. The KDDCUP 99 dataset contains a large number of individual attacks such as back, xterm, apache2, etc. First, the dataset is preprocessed to remove inconsistencies. Once the inconsistencies have been removed, the individual attacks are replaced by their categories as shown in Algorithm I.
Algorithm I:
Data: KDDCUP 99 dataset
Output: WEKA-readable ARFF file in which all attacks are grouped accordingly
Step 1: Remove inconsistencies from the dataset
Step 2: If attack_read == 'apache2', then find and replace all 'apache2' attacks with attack category Cat1 in the KDD dataset
Step 3: Else if attack_read == 'back', then find and replace all 'back' attacks with attack category Cat2 in the KDD dataset
(repeat the replacement step for each remaining individual attack in the KDDCUP 99 dataset)
Step n: Else if attack_read == 'xterm', then find and replace all 'xterm' attacks with attack category Catn in the KDD dataset
Step n + 1: Otherwise, mark as normal in the KDDCUP dataset
Step n + 2: Save the file as individual_attacks.arff
After obtaining individual_attacks.arff, we run it in the WEKA tool to evaluate the classification ability of our proposed method. For this purpose, we used several classifiers.
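The relabeling loop of Algorithm I can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `relabel`, the example records, and the mapping are illustrative, and the removal of inconsistencies and the ARFF export are omitted. It assumes the attack name is the last field of each record and may carry a trailing period, as in the publicly distributed KDD files.

```python
def relabel(records, attack_to_category):
    """Replace the attack name (last field) of each record by its category.

    Names missing from the mapping are marked 'normal', which mirrors the
    'otherwise' branch of Algorithm I.
    """
    out = []
    for rec in records:
        *features, attack = rec
        name = attack.strip().rstrip(".").lower()
        out.append((*features, attack_to_category.get(name, "normal")))
    return out


# Hypothetical mapping for three of the attacks named in the text.
mapping = {"apache2": "Cat1", "back": "Cat2", "xterm": "CatN"}

records = [
    (0, "tcp", "apache2."),
    (0, "udp", "normal."),
    (1, "tcp", "Back"),
]
print(relabel(records, mapping))
```

Using a dictionary lookup instead of a chain of if/else tests keeps the code the same length however many attack names are mapped.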
5 Analysis of Result
Our method was evaluated using the KDDCUP 99 intrusion dataset. It is derived from the DARPA 98 intrusion detection evaluation conducted by MIT's Lincoln Laboratory. It is regarded as a standard benchmark dataset that includes both a training set and a test set. In this study, we use the KDD test dataset (corrected.zip) (311,029 records). Table 2 shows the distribution of the samples. The experiment then extracts 10% of the
data. Of this new set, 66% was used as the training set and 34% as the test set. In addition, the KDDCUP dataset contains 37 attack types in the test set, but only 23 of them appear in the training set. The test set can therefore be used to check the detection rate for new or unknown attacks (Table 1).
Detection rate and precision of the proposed technique. The measures used to evaluate the performance of the proposed technique are precision, detection rate (recall), and F-measure, where TP, FP, and FN denote the numbers of true positives, false positives, and false negatives:

\[ \text{Accuracy (Precision)} = \frac{TP}{TP + FP} \tag{1} \]

\[ \text{Detection Rate (Recall)} = \frac{TP}{TP + FN} \tag{2} \]

\[ \text{F Measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3} \]
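The three measures can be computed directly from the confusion counts. The sketch below uses made-up counts and takes the detection rate to be recall in its standard form, with false negatives in the denominator, which is the quantity the F-measure in Eq. (3) combines with precision. The function names are illustrative.

```python
def precision(tp, fp):
    """Fraction of raised alerts that are real attacks."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Detection rate: fraction of real attacks that are detected."""
    return tp / (tp + fn) if tp + fn else 0.0


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical counts for one attack class.
tp, fp, fn = 90, 10, 30
p, r = precision(tp, fp), recall(tp, fn)
print(p, r, f_measure(p, r))
```

With these counts, precision is 0.9 and recall is 0.75: the classifier rarely raises a false alarm for this class but misses a quarter of its instances, and the F-measure lands between the two values.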
Receiver operating characteristic (ROC): The ROC curve captures the trade-off between sensitivity and specificity. It plots the TP rate against the FP rate at different threshold cut-offs.

Table 1 KDDCUP 99 dataset attack categories
| Attack category | Attacks in KDDCUP dataset |
| --- | --- |
| DOS | Apache2, back, land, mailbomb, Neptune, pod, processtable, smurf, teardrop, udpstorm |
| U2R | Buffer_overflow, httptunnel, loadmodule, perl, ps, rootkit, sqlattack, xterm |
| R2L | Ftp_write, guesspassword, imap, multihop, named, phf, sendmail, snmpgetattack, snmpguess, spy, warezclient, warezmaster, worm, xlock, xsnoop |
| PROBING | Ipsweep, Mscan, Nmap, portsweep, saint, satan |
| NORMAL | NORMAL |
Table 2 KDDCUP 99 test dataset: number of samples
| Category of attack | No. of samples |
| --- | --- |
| Normal | 60,589 |
| DoS | 229,853 |
| R2L | 16,179 |
| U2R | 228 |
| Probe | 4165 |
| Total | 311,014 |
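The ROC construction described earlier in this section, TP rate against FP rate at successive thresholds, can be sketched in a few lines. The scores and labels below are made up; the function names are illustrative, and the code assumes that both classes are present in the labels.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) pairs, one per distinct score threshold.

    labels: 1 = attack, 0 = normal. Both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical classifier scores (higher = more attack-like).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
pts = roc_points(scores, labels)
print(pts, auc(pts))
```

An area of 1 means that every attack is scored above every normal record, and 0.5 corresponds to random ranking; the ROC values reported for the classifiers are areas of this kind.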
The precision results obtained with the various classifiers are shown in Table 3. As is evident, the Naïve Bayes classifier performs worse than average. Similarly, Tables 4, 5, and 6 show the recall, F-measure, and ROC performance of BayesNet, Naïve Bayes, J48, J48 Graft, and random forest for the detection of individual attacks. Three classifiers, J48, J48 Graft, and random forest, effectively detect the individual attack classes in the KDDCUP 99 dataset. These algorithms can therefore be suitably deployed in any machine-learning-based IDS to detect the attack types shown in Table 1 (Fig. 1).

Table 3 Precision of all classifiers for individual attacks on the KDDCUP 99 dataset
| Individual attack class | BayesNet | Naïve Bayes | J48 | J48 Graft | Random Forest |
| --- | --- | --- | --- | --- | --- |
| Apache2 | 1 | 0.861 | 0.982 | 0.982 | 0.996 |
| Back | 0.995 | 0.992 | 0.997 | 1 | 1 |
| Buffer_overflow | 0.97 | 0.014 | 0.417 | 1 | 0.5 |
| Guess-password | 0.999 | 0.876 | 0.996 | 0.998 | 0.999 |
| Http tunnel | 0.644 | 0.487 | 0.963 | 0.981 | 1 |
| Ipsweep | 0.514 | 0.196 | 1 | 1 | 0.989 |
| Land | 0.031 | 0 | 1 | 1 | 1 |
| Mailbomb | 1 | 0.943 | 0.998 | 0.996 | 1 |
| Mscan | 0.948 | 0.756 | 0.989 | 0.984 | 1 |
| Neptune | 1 | 1 | 1 | 1 | 1 |
| Nmap | 0.813 | 1 | 1 | 1 | 1 |
| Normal | 0.998 | 0.999 | 0.953 | 0.953 | 0.954 |
| Pod | 0.567 | 0.071 | 0.868 | 0.892 | 0.971 |
| Portsweep | 0.818 | 0.79 | 0.991 | 0.991 | 1 |
| Processtable | 1 | 0.992 | 1 | 1 | 1 |
| Ps | 0.211 | 0.005 | 0.333 | 1 | 1 |
| Saint | 0.788 | 0.026 | 0.95 | 0.95 | 0.93 |
| Satan | 0.884 | 0.71 | 0.964 | 0.966 | 0.978 |
| Smurf | 1 | 0.996 | 1 | 1 | 1 |
| Snmpgetattack | 0.411 | 0.383 | 0.618 | 0.618 | 0.618 |
| Snmpguess | 0.928 | 0.195 | 0.999 | 0.999 | 1 |
| Warezmaster | 0.804 | 0.748 | 0.989 | 0.988 | 0.991 |
| Weighted Avg | 0.981 | 0.965 | 0.981 | 0.981 | 0.981 |
Table 4 Recall of all classifiers for individual attacks on the KDDCUP 99 dataset
| Individual attack class | BayesNet | Naïve Bayes | J48 | J48 Graft | Random Forest |
| --- | --- | --- | --- | --- | --- |
| Apache2 | 0.996 | 0.96 | 0.996 | 0.996 | 0.996 |
| Back | 1 | 1 | 0.995 | 0.995 | 1 |
| Buffer_overflow | 1 | 0.714 | 0.714 | 0.571 | 0.714 |
| Guess-password | 0.993 | 0.844 | 0.994 | 0.994 | 0.983 |
| Http tunnel | 0.983 | 0.949 | 0.881 | 0.881 | 0.979 |
| Ipsweep | 0.979 | 0.968 | 0.979 | 0.979 | 1 |
| Land | 1 | 0 | 1 | 1 | 1 |
| Mailbomb | 1 | 0.976 | 1 | 1 | 1 |
| Mscan | 0.995 | 0.78 | 0.986 | 0.986 | 1 |
| Neptune | 0.996 | 0.995 | 1 | 1 | 1 |
| Nmap | 1 | 1 | 1 | 1 | 1 |
| Normal | 0.797 | 0.636 | 0.947 | 0.947 | 0.948 |
| Pod | 0.971 | 0.914 | 0.943 | 0.943 | 0.943 |
| Portsweep | 0.968 | 0.904 | 0.942 | 0.928 | 0.976 |
| Processtable | 0.996 | 0.98 | 0.996 | 0.992 | 1 |
| Ps | 0.444 | 0.778 | 0.111 | 0.111 | 0.333 |
| Saint | 0.746 | 0.02 | 0.93 | 0.93 | 0.922 |
| Satan | 0.927 | 0.973 | 0.976 | 0.974 | 0.958 |
| Smurf | 1 | 0.999 | 1 | 1 | 1 |
| Snmpgetattack | 0.992 | 0.606 | 0.654 | 0.654 | 0.65 |
| Snmpguess | 0.996 | 0.971 | 0.996 | 0.996 | 0.996 |
| Warezmaster | 0.998 | 0.442 | 0.988 | 0.989 | 0.993 |
| Weighted Avg | 0.968 | 0.909 | 0.98 | 0.98 | 0.981 |
6 Conclusion
To date, many techniques, methods, and tools have been used to detect intrusions in computer networks, and research continues to make them better at detecting intrusions. Meanwhile, however, new attacks have emerged that are hard to handle because they keep changing their behavior. In this research work, we have proposed "an improved method to detect intrusion using machine learning algorithms" with the KDDCUP 99 dataset, simulated in the WEKA tool. The proposed method detects the individual attacks present in the KDDCUP 99 dataset quickly and efficiently. The detection rate for each of the three machine learning algorithms, J48, J48 Graft, and random forest, is greater than 96%. These algorithms can be adapted to the network environment to improve detection
Table 5 F-measure of all classifiers for individual attacks on the KDDCUP 99 dataset
| Individual attack class | BayesNet | Naïve Bayes | J48 | J48 Graft | Random Forest |
| --- | --- | --- | --- | --- | --- |
| Apache2 | 0.998 | 0.924 | 0.989 | 0.993 | 0.996 |
| Back | 0.997 | 0.996 | 0.996 | 0.97 | 1 |
| Buffer_overflow | 0.177 | 0.028 | 0.526 | 0.727 | 0.588 |
| Guess-password | 0.996 | 0.859 | 0.995 | 0.996 | 0.999 |
| Http tunnel | 0.779 | 0.644 | 0.92 | 0.929 | 0.991 |
| Ipsweep | 0.674 | 0.326 | 0.989 | 0.989 | 0.984 |
| Land | 0.06 | 0 | 1 | 1 | 1 |
| Mailbomb | 1 | 0.959 | 0.999 | 0.998 | 1 |
| Mscan | 0.971 | 0.853 | 0.88 | 0.985 | 1 |
| Neptune | 0.998 | 0.998 | 1 | 1 | 1 |
| Nmap | 0.897 | 1 | 1 | 1 | 1 |
| Normal | 0.886 | 0.777 | 0.95 | 0.95 | 0.951 |
| Pod | 0.716 | 0.132 | 0.5 | 0.9917 | 0.957 |
| Portsweep | 0.886 | 0.843 | 0.904 | 0.959 | 0.988 |
| Processtable | 0.998 | 0.986 | 0.954 | 0.996 | 1 |
| Ps | 0.286 | 0.01 | 0.998 | 0.2 | 0.5 |
| Saint | 0.766 | 0.023 | 0.167 | 0.94 | 0.926 |
| Satan | 0.905 | 0.821 | 0.94 | 0.97 | 0.968 |
| Smurf | 1 | 0.997 | 0.97 | 1 | 1 |
| Snmpgetattack | 0.582 | 0.469 | 1 | 0.636 | 0.634 |
| Snmpguess | 0.961 | 0.324 | 0.636 | 0.998 | 0.998 |
| Warezmaster | 0.89 | 0.556 | 0.989 | 0.989 | 0.992 |
| Weighted Avg | 0.964 | 0.926 | 0.98 | 0.98 | 0.981 |
speed and time. In the future, we may tune the default WEKA parameters and reduce the number of features of the KDDCUP 99 dataset. In addition to combining machine learning algorithms and data mining techniques, we can use artificial intelligence and soft computing techniques, such as neural networks, which would help to reduce the false alarm rate in intrusion detection.
Table 6 ROC of all classifiers for individual attacks on the KDDCUP 99 dataset
| Individual attack class | BayesNet | Naïve Bayes | J48 | J48 Graft | Random Forest |
| --- | --- | --- | --- | --- | --- |
| Apache2 | 1 | 0.999 | 0.998 | 0.998 | 1 |
| Back | 1 | 1 | 1 | 0.997 | 1 |
| Buffer_overflow | 1 | 0.975 | 0.928 | 0.856 | 0.929 |
| Guess-password | 1 | 0.994 | 0.997 | 0.997 | 1 |
| Http tunnel | 0.998 | 0.997 | 0.961 | 0.942 | 0.992 |
| Ipsweep | 0.993 | 0.988 | 0.994 | 0.994 | 0.998 |
| Land | 1 | 1 | 1 | 1 | 1 |
| Mailbomb | 1 | 0.999 | 1 | 1 | 1 |
| Mscan | 1 | 1 | 0.997 | 0.999 | 1 |
| Neptune | 1 | 0.999 | 1 | 1 | 1 |
| Nmap | 1 | 1 | 1 | 1 | 1 |
| Normal | 0.997 | 0.98 | 0.998 | 0.998 | 0.998 |
| Pod | 1 | 0.976 | 0.971 | 0.971 | 1 |
| Portsweep | 1 | 0.999 | 0.988 | 0.988 | 1 |
| Processtable | 1 | 0.998 | 0.988 | 0.998 | 1 |
| Ps | 0.997 | 0.992 | 0.886 | 0.83 | 0.944 |
| Saint | 0.999 | 0.969 | 0.992 | 0.988 | 0.988 |
| Satan | 1 | 0.999 | 0.996 | 0.994 | 0.994 |
| Smurf | 1 | 0.997 | 1 | 1 | 1 |
| Snmpgetattack | 0.989 | 0.977 | 0.99 | 0.99 | 0.99 |
| Snmpguess | 1 | 0.993 | 0.999 | 0.999 | 1 |
| Warezmaster | 1 | 0.997 | 0.997 | 0.998 | 1 |
| Weighted Avg | 0.999 | 0.994 | 0.999 | 0.999 | 0.981 |
Fig. 1 Classification of all classifiers accuracy
References
1. Sourcefire, Snort open source network intrusion prevention and detection system (IDS/IPS). http://www.snort.org
2. H.-J. Liao, C.-H. Richard Lin, Y.-C. Lin, K.-Y. Tung, Intrusion detection system: a comprehensive review. J. Network Comput. Appl. 36(1), 16–24 (2013)
3. H. Debar, An introduction to intrusion-detection systems, in Proceedings of Connect 2000 (2000)
4. V. Jyothsna, V.V. Rama Prasad, K. Munivara Prasad, A review of anomaly based intrusion detection systems. Int. J. Comput. Appl. 28(7), 26–35 (2011)
5. A.K. Ghosh, A. Schwartzbard, M. Schatz, Learning program behavior profiles for intrusion detection, in Workshop on Intrusion Detection and Network Monitoring, vol. 51462 (1999)
6. N. Bashah, I.B. Shanmugam, A.M. Ahmed, Hybrid intelligent intrusion detection system. World Acad. Sci. Eng. Technol. 11, 23–26 (2005)
7. T. Xia, G. Qu, S. Hariri, M. Yousi, An efficient network intrusion detection method based on information theory and genetic algorithm, in 24th IEEE International Conference on Performance, Computing, and Communications, 2005 (IPCCC 2005) (IEEE, 2005), pp. 11–17
8. B.D. Caulkins, J. Lee, M. Wang, A dynamic data mining technique for intrusion detection systems, in Proceedings of the 43rd Annual Southeast Regional Conference, vol. 2 (ACM, 2005), pp. 148–153
9. M. Gudadhe, P. Prasad, K. Wankhade, A new data mining based network intrusion detection model, in 2010 International Conference on Computer and Communication Technology (ICCCT) (IEEE, 2010), pp. 731–735
10. S.R. Gaddam, V.V. Phoha, K.S. Balagani, K-means+ID3: a novel method for supervised anomaly detection by cascading k-means clustering and ID3 decision tree learning methods. IEEE Trans. Knowl. Data Eng. 19(3), 345–354 (2007)
11. H. Tang, L.-S. Qu, Fuzzy support vector machine with a new fuzzy membership function for pattern classification, in 2008 International Conference on Machine Learning and Cybernetics, vol. 2 (IEEE, 2008), pp. 768–773
12. D.S. Kim, H.-N. Nguyen, J.S. Park, Genetic algorithm to improve SVM based network intrusion detection system, in 19th International Conference on Advanced Information Networking and Applications, 2005 (AINA 2005), vol. 2 (IEEE, 2005), pp. 155–158
13. N. Raghavendra Sai, Analysis of artificial neural networks based intrusion detection system. Int. J. Adv. Sci. Technol. (2005). ISSN 4238
14. N. Raghavendra Sai, K.M. Jogendra, C.Ch. Smitha, A secured and effective load monitoring and scheduling migration VM in cloud computing. IOP Conf. Ser. Mater. Sci. Eng. 981 (2020). ISSN 1757-899X
15. Z.-S. Pan, S.-C. Chen, G.-B. Hu, D.-Q. Zhang, Hybrid neural network and C4.5 for misuse detection, in 2003 International Conference on Machine Learning and Cybernetics, vol. 4 (IEEE, 2003), pp. 2463–2467
16. Y. Yasami, S.P. Mozaffari, A novel unsupervised classification approach for network anomaly detection by k-Means clustering and ID3 decision tree learning methods. J. Supercomput. 53(1), 231–245 (2010)
17. J.C. Platt, Fast training of support vector machines using sequential minimal optimization, in Advances in Kernel Methods: Support Vector Learning, pp. 185–208 (1998)
18. C.-F. Lin, S.-D. Wang, Fuzzy support vector machines. IEEE Trans. Neural Networks 13(2), 464–471 (2002)
19. L. Khan, M. Awad, B. Thuraisingham, A new intrusion detection system using support vector machines and hierarchical clustering. VLDB J. Int. J. Very Large Data Bases 16(4), 507–521 (2007)
20. N. Raghavendra Sai, K. Satya Rajesh, An efficient loss scheme for network data analysis. J. Adv. Res. Dyn. Control Syst. (JARDCS) 10(9) (2018). ISSN 1943-023X
21. N. Raghavendra Sai, J. Bhargav, M. Aneesh, G. Vinay Sahit, A. Nikhil, Discovering network intrusion using machine learning and data analytics approach, in 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), Tirunelveli, India, pp. 118–123 (2021). https://doi.org/10.1109/ICICV50876.2021.9388552
22. N. Raghavendra Sai, T. Cherukuri, B. Susmita, R. Keerthana, Y. Anjali, Encrypted negative password identification exploitation RSA rule, in 2021 6th International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, pp. 1–4 (2021). https://doi.org/10.1109/ICICT50816.2021.9358713
23. W. Lu, I. Traore, Detecting new forms of network intrusion using genetic programming. Comput. Intell. 20(3), 475–494 (2004)
Speech Emotion Analyzer Siddhant Samyak, Apoorve Gupta, Tushar Raj, Amruth Karnam, and H. R. Mamatha
Abstract In the current age, many virtual assistants interact with the user in their spoken languages. Every virtual assistant such as Google Assistant, Siri and Alexa has to understand the user’s feelings to give their best result. These assistants can widely use emotion speech recognition to capture the user’s emotion and respond accordingly. This has increased the demand for improved emotion detection taking as little time as possible. We try to combine different methods with little changes and also include sentiment analysis to improve our accuracy. Keywords Machine learning · Deep learning · Sentiment analysis · Ensemble model
1 Introduction In the current age, technology is more engrossed in our lives than ever and is an integral part of our lives. Technology is revolutionizing our world and making our lives better, faster and easier. With advancements in technology, there has been an increase in human interaction with computers. Its continual refinement will lead to secure life in every area of our lives. Over the years, more problems are getting solved through computers. Hence, there is an increasing need for computers to understand human needs and respond correspondingly. Virtual assistants are intelligent computer systems that can interact with users in their spoken language. The interaction methods involved are: voice and text images. Voice is a very convenient way for humans to interact, and virtual assistants have to extract the information needed, taking voice as the input. It has to answer the user’s questions and perform the specific task while conversing with the user.
S. Samyak · A. Gupta (B) · T. Raj · A. Karnam · H. R. Mamatha PES University, Bangalore 560085, India H. R. Mamatha e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_9
S. Samyak et al.
There is an increasing need for virtual assistants to interpret the voice input better and react to the information needed. Emotion has always played an important role in how we think. We can easily sense others’ emotions and interact accordingly, but it is not very easy for machines to comprehend and give the most effective response [1, 2]. Accurate determination and effective understanding of the emotion through speech can make the virtual assistants more user interactive. This makes it very important to accurately classify the speech to the correct emotion with the latency being very small. Many virtual assistants take voice as an input, recognizing emotion in real time to effectively improve users’ experience. The recorded speech can be categorized into different emotions, namely angry, sad, neutral, happy, etc. Emotion of the speech can help disambiguate a sentence having different meanings and choose the precise meaning according to the context. We are using different machine learning and deep learning models to classify the input speech into its correct emotion. Along with these models, sentiment analysis can also be carried out on the text extracted to improve the classification. With accurate classification and fast performance, real-time speech recognition can make giant strides in improving its services to the user.
2 Related Work
2.1 CNN and LSTM Architecture for Speech Emotion Recognition (SER) with Data Augmentation
This research work proposed the use of both convolutional neural network (CNN) layers and bidirectional long short-term memory (LSTM) layers for classifying emotions. The model is trained on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset. The IEMOCAP dataset is small and its class distribution is imbalanced. Even with these constraints, the model gives a good result of 64.5% on all the emotions. CNN layers, although most widely used in image classification, are useful for reducing the number of features, which helps prevent overfitting, and the LSTM layers take the context into account during classification, which increases the accuracy of the whole model. The speech is classified into four different emotions, namely angry, sad, neutral and happy [3].
2.2 Speech Emotion Recognition Using Deep Neural Network and Extreme Learning Machine
This paper focuses on neural network techniques for detecting emotions in human speech. The techniques used are deep neural networks (DNN) and the extreme learning machine (ELM). The second technique, ELM [4, 5], is
quite new and has shown promising results in speech detection, which motivates its use for emotion detection. The training time for ELM is also very short, which is another reason for using it in this work. The work can be divided into two parts, feature extraction and classification: the DNNs perform feature extraction, while the ELM classifies the speech based on these features. The dataset used is IEMOCAP. Its audio signals are not equally distributed among the classes they represent, namely frustration, surprise, excitement, happiness and neutral. For preprocessing, the signals were sampled at 16 kHz and divided into frames of 25 ms. Performance was measured with the accuracy metric, and the model achieved 54.3% [5].
2.3 A Research of Speech Emotion Recognition Based on Deep Belief Network (DBN) and Support Vector Machine
This research proposed using a deep belief network (DBN) and a nonlinear SVM for speech emotion recognition. Differences between DBN-based feature extraction and traditional methods were brought out. In this work, the authors experimented with the DBN and the restricted Boltzmann machine (RBM). A DBN is a deep neural network with multiple nonlinear mappings that can perform complex function approximation. Its architecture can be described as a highly complex directed acyclic graph formed by a stack of RBMs. Although the authors tried the new architecture of combining DBN with RBM, they did not achieve a remarkable performance gain: they reached an accuracy of 86.5%, compared with 79.5% for traditional methods [6].
2.4 Emotion Recognition by Speech Signals
In this method, emotion recognition follows the traditional approach of feature selection. First the features are extracted; then they are analyzed, a step called feature selection. This helps optimize the next phase, which classifies the input speech into the corresponding emotion. The classification combines different classifiers to incorporate the best of all the models. The extracted features used for classification include the Mel-frequency cepstral coefficients (MFCC), pitch and loudness. The authors measured performance on two datasets, SUSAS [7] and AIBO [8]. While the model performed
satisfactorily on SUSAS with 70.1% accuracy, it performed very poorly on the AIBO dataset, with an accuracy of a mere 42.3% [9].
3 Data
This section describes the data used to train the speech emotion analyzer.
3.1 Overview
The speech emotion analyzer is a prediction model that tries to accurately predict the emotion of the speech being analyzed. To get the desired results, it is important to train the model on a dataset with approximately the same number of audio files for each emotion. After examining different datasets, we found that the RAVDESS [10] dataset has audio files spread across different emotions with both male and female speakers. For this research, the data are split 50% for training, 35% for validation and 15% for testing.
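The 50/35/15 split described above can be sketched as follows. This is only an illustration: the paper does not say whether the split was stratified by emotion, so the per-label stratification, the function name `stratified_split` and the toy label list are assumptions made here.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.50, 0.35, 0.15), seed=0):
    """Return (train, val, test) index lists, split separately per label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical toy labels: 8 emotions x 20 files each (not the real RAVDESS counts).
emotions = ["calm", "neutral", "happy", "angry", "sad", "surprised", "disgust", "fear"]
labels = [emotion for emotion in emotions for _ in range(20)]
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting per label keeps each emotion represented in the same proportion in all three parts, which matters when one class (here, neutral) has fewer files than the others.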
3.2 RAVDESS
The dataset contains 24 actors, each with around 60 audio files classified into 8 emotions: calm, neutral, happy, angry, sad, surprised, disgust and fear. Figure 1 shows the distribution of audio files among the actors. Figure 2 shows that the audio files are equally distributed across all the emotions except neutral. We tried to fill this gap by duplicating the neutral data so that all the emotions have an equal number of audio files, but this did not considerably improve model performance and hence was not included. We also want our model not to be affected by external factors such as noise and gender. To make sure it does not matter who speaks in the audio file, we checked the distribution of the files across male and female speakers. As the graph in Fig. 3 shows, the dataset is not biased toward any gender and has an equal number of audio files from males and females.
Fig. 1 Number of audio files per actor
Fig. 2 Number of audio files per emotion
3.3 Analyzing Different Audio Signals
The Mel-spectrogram is a useful tool for understanding audio signal properties. It is a special type of spectrogram in which frequencies are converted to the Mel scale. A Mel-spectrogram for a neutral emotion by a male speaker is shown in Fig. 4. Mel-spectrograms are obtained by:
1. Sampling the input with a window size.
2. Computing the fast Fourier transform (FFT) for each window.
3. Generating the Mel-scale.
4. Spectrogram generation.
Fig. 3 Distribution of audio files by gender
Fig. 4 Mel-spectrogram: male neutral
We can see differences between the Mel-spectrograms of different emotions. This fact is used for emotion detection with a two-dimensional CNN: the spectrogram images are given as input to the CNN model, which classifies them. Other features, such as the wave-plot and MFCCs, can also be used to classify the audio signal. The wave-plot and MFCCs for a specific audio signal are shown in Figs. 5 and 6.
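The four steps for obtaining a Mel-spectrogram can be sketched with NumPy alone. This is a minimal illustration, not the authors' pipeline: the window length, hop size, number of Mel bands and the common 2595·log10(1 + f/700) Mel formula are assumptions chosen here, and the input is a synthetic tone rather than a RAVDESS file.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose centres are equally spaced on the Mel scale."""
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    # Step 1: cut the signal into overlapping, windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    # Step 2: FFT of each frame, converted to a power spectrum.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Step 3: project the linear-frequency bins onto the Mel scale.
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Step 4: log-compress to obtain the spectrogram (frames x Mel bands).
    return 10.0 * np.log10(mel_power + 1e-10)


sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)  # synthetic 440 Hz test tone
S = mel_spectrogram(tone, sr)
print(S.shape)
```

Averaging `S` over the time axis gives one value per Mel band, which corresponds to the "log Mel-spectrogram mean values" idea, while the full matrix can be rendered as an image for a two-dimensional CNN.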
Fig. 5 Wave-plot of a specific audio file
Fig. 6 MFCC plot for a specific file
4 Implementation
Figure 7 shows the basic methodology followed to classify the audio signals into their respective emotions. It consists of both machine learning and deep learning models.
4.1 Feature Extraction
• Features such as MFCC, pitch, loudness and frequency are used to classify the input speech into its respective emotion.
• Hence, these features have to be extracted and used to train the model.
• At test time, these features decide which emotion the speech belongs to.
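Two of the features listed above, loudness and pitch, can be approximated with a few lines of NumPy. The sketch below is an assumption-laden illustration: it uses root-mean-square energy as a stand-in for loudness and a plain autocorrelation peak for pitch, which is not necessarily how the authors extracted them; the function names and the 50 to 500 Hz search range are chosen here.

```python
import numpy as np


def rms_loudness(signal):
    """Root-mean-square energy, used here as a simple proxy for loudness."""
    return float(np.sqrt(np.mean(np.square(signal))))


def estimate_pitch(frame, sr, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency of one frame by autocorrelation."""
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the autocorrelation.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)  # shortest period considered
    lag_max = int(sr / fmin)  # longest period considered
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max + 1]))
    return sr / best_lag


sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220.0 * t)  # synthetic 220 Hz tone, amplitude 0.5
features = [rms_loudness(tone), estimate_pitch(tone[:2048], sr)]
print(features)
```

In practice such per-file values would be concatenated with MFCC statistics into one feature vector per audio file before training the SVM or MLP.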
Fig. 7 Ensemble model
4.2 Models
1. CNN-1D: Audio signals are classified using a one-dimensional CNN. This is done by extracting the log Mel-spectrogram mean values for each audio signal. A deep neural network with conv1D layers was trained using categorical cross-entropy as the loss function and the Adam optimizer. Figures 8 and 9 show how accuracy and model loss vary with the number of epochs. Most of the information is learnt in the first 20 epochs, and the graph also shows that the model gives an accuracy of about 51% on the validation set.
Fig. 8 Accuracy versus the number of epochs
Fig. 9 Loss versus the number of epochs
2. CNN-2D: Instead of feeding the mean values of the Mel-spectrogram for each audio signal, the Mel-spectrogram itself can be given as input to the CNN model. VGG16 (a pretrained convolutional neural network) with some additional layers was trained on the dataset and gave an accuracy of around 73% on the validation set. A pretrained model is used in CNN-2D, whereas conv1D layers are used in CNN-1D to classify the audio signal. Figure 10 shows the confusion matrix, which gives a clear picture of how the CNN-2D model performs.
Fig. 10 Confusion matrix for 2-D CNN
3. MLP: A simple multilayer neural network is used to classify the audio signals into their respective emotions. The MLP model produced an accuracy of 67% after tuning the parameters.
4. SVM: A multiclass support vector machine is used to classify emotions, taking the audio files as input after the feature extraction process.
5. Sentiment model: This involves converting audio into text and applying sentiment analysis to the obtained text. This is done because the text sometimes contains words that help classify the audio signal not only better but also faster.
a. The input speech is converted to text, and sentiment analysis is performed on it.
b. This incorporates the emotion conveyed through the words.
c. Since our research is on speech audio signals, we use AWS Transcribe to convert the signals to their respective transcripts.
d. Those transcripts are then fed to the sentiment model, which is based on BERT [11].
e. BERT [11] is an ML model that has been pretrained on a very large corpus of English text.
f. The model has been trained with two objectives: Masked Language Modeling (MLM) and Next Sentence Prediction (NSP).
g. Owing to these objectives and the very large dataset, BERT [11] develops a deep understanding of the English language and can be applied to any classification dataset to give very good results.
6. Ensemble model: An ensemble model combines many weak learners to produce an effective learner. Because the inputs to the weak learners have to be transformed depending on the model (e.g., audio is converted into features for the SVM and MLP, and into log Mel-spectrogram images for the 2D CNN), we found stacking to be the most effective ensemble technique.
Figure 11 shows the flowchart of the stacking principle. All the weak learners are stacked together and trained on one part of the dataset (called the train data), and these models are tested on another portion of the dataset (called the validation data). This validation data is then used to train the meta-learner, for which a neural network is used; the meta-learner is finally tested on the remaining part of the dataset.
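The stacking procedure can be sketched on synthetic data as follows. Everything in this block is a stand-in chosen for illustration: the base learners are nearest-centroid classifiers that each see only part of the features (in place of the SVM, MLP and CNN models), the meta-learner is a single softmax layer trained by gradient descent (a much simpler substitute for the neural network used in the paper), and the three-class data are generated randomly.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(n_per_class):
    """Three synthetic classes; each feature half separates only some of them."""
    means = np.array([[0, 0, 0, 0, 0, 0],
                      [5, 5, 5, 0, 0, 0],
                      [5, 5, 5, 5, 5, 5]], dtype=float)
    X = np.vstack([m + rng.normal(size=(n_per_class, 6)) for m in means])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y


class CentroidLearner:
    """Weak learner: nearest centroid on a subset of the feature columns."""

    def __init__(self, cols):
        self.cols = cols

    def fit(self, X, y):
        self.centroids = np.stack([X[y == k][:, self.cols].mean(axis=0) for k in range(3)])
        return self

    def predict_proba(self, X):
        dist = np.linalg.norm(X[:, self.cols][:, None, :] - self.centroids[None], axis=2)
        scores = np.exp(-dist)
        return scores / scores.sum(axis=1, keepdims=True)


def fit_softmax(Z, y, n_classes=3, lr=0.5, steps=500):
    """Meta-learner: one softmax layer trained with gradient descent."""
    W = np.zeros((Z.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]
    for _ in range(steps):
        logits = Z @ W + b
        P = np.exp(logits - logits.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * Z.T @ (P - Y) / len(y)
        b -= lr * (P - Y).mean(axis=0)
    return W, b


X_train, y_train = make_data(50)  # train data: fits the weak learners
X_val, y_val = make_data(35)      # validation data: fits the meta-learner
X_test, y_test = make_data(15)    # remaining data: final evaluation

learners = [CentroidLearner(slice(0, 3)).fit(X_train, y_train),
            CentroidLearner(slice(3, 6)).fit(X_train, y_train)]


def stack(X):
    # Concatenate the class probabilities of all weak learners.
    return np.hstack([m.predict_proba(X) for m in learners])


W, b = fit_softmax(stack(X_val), y_val)
accuracy = float(((stack(X_test) @ W + b).argmax(axis=1) == y_test).mean())
print(accuracy)
```

Because the meta-learner only consumes each weak learner's output probabilities, the weak learners are free to use different input representations, which is the property that made stacking attractive here.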
Fig. 11 Stacked ensemble model architecture
5 Result Analysis
The accuracy of the ensemble model (74%) was better than that of most of the individual models built for emotion recognition. The accuracy metric is used to check the performance of the models, with the corresponding results given in Table 1. The ensemble model was constructed using the stacking technique. It consists of four machine/deep learning models: SVM, MLP, CNN-1D and CNN-2D. The other metrics in Table 1 also show that the ensemble model outperformed the individual models.
Table 1 Model comparison using different metrics
Model | Accuracy (%) | Precision | Recall | F1-Score
Multilayer perceptron classifier model | 61.4 | 0.62 | 0.61 | 0.61
Support vector classifier model | 51.2 | 0.52 | 0.51 | 0.51
CNN 1-D | 52.0 | 0.53 | 0.52 | 0.52
CNN 2-D | 66.0 | 0.66 | 0.66 | 0.65
Sentiment model (BERT) | 93.5 | 0.84 | 0.89 | 0.86
Ensemble model | 74.0 | 0.78 | 0.75 | 0.76
The sentiment model (BERT), which checks sentiment only, was trained and tested on the Emotion dataset for NLP. The model gave an accuracy of 93.55% during testing. It could not be included in the ensemble model because no appropriate dataset was available to train and test the final ensemble together with the sentiment model. Since the RAVDESS dataset has only two sentences spoken by 24 actors in different tones, incorporating the sentiment model in the ensemble model was not fruitful. To our knowledge, no current dataset for emotion recognition has different sentences spoken by different speakers in different tones. We hope that once such a dataset becomes available, we can incorporate the sentiment model in our final ensemble model and thus improve the accuracy further.
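The four metrics reported in Table 1 can all be derived from a confusion matrix, as the sketch below shows. The paper does not state how precision, recall and F1 were averaged over the emotion classes, so macro-averaging is an assumption here, and the 2 x 2 matrix is a made-up example rather than one of the reported results.

```python
import numpy as np


def classification_metrics(confusion):
    """Accuracy plus macro-averaged precision, recall and F1.

    confusion[i][j] = number of samples of true class i predicted as class j.
    Assumes every class is predicted at least once (no zero column sums).
    """
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # per predicted class
    recall = true_pos / cm.sum(axis=1)     # per true class
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": float(true_pos.sum() / cm.sum()),
        "precision": float(precision.mean()),
        "recall": float(recall.mean()),
        "f1": float(f1.mean()),
    }


# Hypothetical two-class confusion matrix, for illustration only.
metrics = classification_metrics([[8, 2],
                                  [1, 9]])
print(metrics)
```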
6 Conclusion and Future Work
Through this research, we have tried to improve the detection of emotion in speech by trying various deep learning and machine learning models. We tried to improve the accuracy of the models by applying different optimization techniques. The ensemble model achieved a higher accuracy than the individual models included in it. For future work, once a dataset with different sentences spoken by different speakers in different tones becomes available, we can incorporate the sentiment model in our ensemble model. We can also try to improve the latency of the 2-D CNN model and of the sentiment model, which uses Amazon Transcribe for audio-to-text conversion. With improved latency, our model can be incorporated in many voice assistants for real-time speech emotion detection to improve the user's experience.
Acknowledgements We would like to thank PES University for giving us this opportunity.
References
1. M. El Ayadi, M.S. Kamel, F. Karray, Survey on speech emotion recognition: features, classification schemes, and databases. Pattern Recogn. 44(3), 572–587 (2011)
2. B. Schuller, A. Batliner, S. Steidl, D. Seppi, Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge. Speech Commun. 53(9), 1062–1087 (2011)
3. C. Etienne, G. Fidanza, A. Petrovskii, L. Devillers, B. Schmauch, CNN+LSTM architecture for speech emotion recognition with data augmentation, in Workshop on Speech, Music and Mind, pp. 21–25 (2018)
4. G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: theory and applications. Neurocomputing 70(1), 489–501 (2006)
5. D. Yu, L. Deng, Efficient and effective algorithms for training single-hidden-layer neural networks. Pattern Recogn. Lett. 33(5), 554–558 (2012)
6. K. Han, D. Yu, I. Tashev, Speech emotion recognition using deep neural networks and extreme learning machines, in INTERSPEECH, pp. 223–227 (2014)
7. C. Huang, W. Gong, W. Fu, D. Feng, A research of speech emotion recognition based on deep belief network and SVM. Math. Probl. Eng., 1–7 (2014)
8. J.H.L. Hansen, SUSAS LDC99S78. Web Download (Linguistic Data Consortium, Philadelphia, 1999)
9. A. Batliner, S. Steidl, E. Nöth, Releasing a thoroughly annotated and processed spontaneous emotional database: the FAU Aibo Emotion Corpus, in Proceedings of a Satellite Workshop of LREC 2008 on Corpora for Research on Emotion and Affect, ed. by D. Laurence, M. Jean-Claude, C. Roddy, D.-C. Ellen, B. Anton (LREC, Marrakesh, MA, 2008), pp. 28–31
10. O.W. Kwon, K. Chan, J. Hao, T.W. Lee, Emotion recognition by speech signals, in The Eighth European Conference on Speech Communication and Technology (2003)
11. S.R. Livingstone, F.A. Russo, The Ryerson audio-visual database of emotional speech and song (RAVDESS): a dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5), e0196391 (2018). https://doi.org/10.1371/journal.pone.0196391
A Review of Security Concerns in Smart Grid
Jagdish Chandra Pandey and Mala Kalra
Abstract Smart grid is a digital transformation of the electric power grid that uses automatic control and communication techniques for the transmission and distribution of electricity. It enables self-monitoring to quickly detect changing electricity demand and respond accordingly. Smart grids have self-healing capabilities and enable electricity customers to become active participants. Smart meters communicate sensitive information to the smart grid, which is prone to various attacks such as integrity violation, eavesdropping, and replay attacks. If malicious activity cannot be located in time, the control center can be misled into wrong conclusions, such as transmitting a wrong electricity consumption report to customers. Securing data in smart grids requires different security countermeasures to be explored. This study presents a review of various security challenges in smart grids along with some other issues encountered in their adoption. The paper also suggests directions for future research in this area.
Keywords Smart grids · Security issues · Internet of things · Security and privacy · Attacks
J. C. Pandey (B) · M. Kalra
NITTTR, Chandigarh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_10
1 Introduction
The smart grid improves on the traditional grid by using new technologies for effective power distribution and transmission. Unlike the conventional electrical grid, the smart grid is bi-directional and is responsible for fast communication of electricity as well as information between consumers and suppliers. The smart grid also enables assets such as advanced metering infrastructure, which interacts with the network infrastructure to provide effective power control. The smart grid is a modern electrical grid that combines duplex, reliable transmission techniques with mathematical techniques for power production, transmission, distribution, and usage across various substations, to attain a structure that is secure, flexible, effective, and acceptable. Advantages of the smart grid are
• Improving the quality of power
• Improving resilience to avoid disruption
• Enhancing the capability and effectiveness of traditional electrical power systems
• Facilitating the addition of large-scale renewable energy systems
• Self-monitoring and adaptation to varying conditions
• Decreasing the emission of greenhouse gases by supporting electric vehicles and new sources of electricity
• Quicker restoration of electricity after power disturbances
• Enabling customer participation by allowing customers to reduce electricity use in response to indications during peak times.
Compared to the conventional electric system, the smart grid is equipped with automated monitoring and recovery tasks along with widespread management. Smart meters collect and transmit electricity consumption data. They are equipped with a processor, secondary memory, and transmission technologies that form part of the home area network. Power usage data are collected by the smart meter at short intervals, for example every 15 min, and transmitted to the neighborhood gateway. The neighborhood gateway then transmits the usage data to the control center. The control center provides current-period tariff information to consumers, who can respond by reducing electricity usage during peak times in order to reduce their bills. Load balancing and data management for customers and suppliers are important aspects of smart grids. Smart grids are prone to various security attacks and are susceptible to illegal and harmful operations, for example tampering with meter readings. If malicious activity cannot be located in time, the control center can be misled into wrong conclusions, such as transmitting a wrong electricity consumption report to customers. To detect replayed or injected messages, an authentication scheme is strongly required. Securing data in smart grids requires different security countermeasures to be explored.
The existing countermeasures consume a large amount of resources and computation time for data encryption as well as authentication, whereas smart meters have limited resources, for example a computation-constrained microprocessor and small storage space.
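As a rough illustration of how a resource-constrained meter could let a gateway detect altered or replayed messages, the sketch below combines a keyed hash (HMAC-SHA256 from the Python standard library) with a timestamp and a nonce. It is not one of the schemes surveyed in this paper: the message layout, the `Gateway` class, the 300-second freshness window and the pre-shared key are all assumptions made for the example.

```python
import hashlib
import hmac
import json

MAX_SKEW_SECONDS = 300  # assumed freshness window


def sign_reading(key, meter_id, kwh, timestamp, nonce):
    """Serialize a reading and compute its HMAC-SHA256 tag."""
    payload = json.dumps(
        {"meter": meter_id, "kwh": kwh, "ts": timestamp, "nonce": nonce},
        sort_keys=True,
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


class Gateway:
    """Hypothetical neighborhood gateway sharing a secret key with the meter."""

    def __init__(self, key):
        self.key = key
        self.seen = set()  # (meter, nonce) pairs already accepted

    def accept(self, payload, tag, now):
        expected = hmac.new(self.key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            return False  # injected or altered message
        message = json.loads(payload)
        if abs(now - message["ts"]) > MAX_SKEW_SECONDS:
            return False  # too old to be fresh
        token = (message["meter"], message["nonce"])
        if token in self.seen:
            return False  # replayed message
        self.seen.add(token)
        return True


key = b"pre-shared demo key"  # assumption: the key is provisioned out of band
gateway = Gateway(key)
payload, tag = sign_reading(key, "meter-1", 1.5, timestamp=1000, nonce="n1")
print(gateway.accept(payload, tag, now=1010))  # first delivery
print(gateway.accept(payload, tag, now=1020))  # same message sent again
```

A keyed hash is far cheaper than public-key signatures, which is why constructions of this kind are often considered for constrained devices; key distribution and nonce storage limits are left out of the sketch.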
2 Overview of Smart Grid [1]
The key subsystems, or key components, of the smart grid are as follows.
2.1 Advanced Metering Infrastructure
Advanced metering infrastructure (AMI) is an important part of the smart grid. It comprises smart meters, communication technologies, and technologies that control the data flow across the system, and it provides fast and efficient interaction between users and network engineers. For energy suppliers, AMI provides precise usage information at reduced cost. End users get a chance to use power at a lower price by receiving updates about market demand and pricing. The AMI network infrastructure is a hierarchical network architecture comprising a WAN joining the utility center to the headend, a NAN joining the headend to the electronic meters, and a HAN joining household electronic devices to the buyer's smart meter (Fig. 1). AMI allows a duplex connection between customer meters and grid operators so that data analysis can be performed to estimate market pricing and to manage resources. The different components of AMI are as follows:
• Home Area Network: HAN consists of smart devices connected to the smart meter. HAN controls and monitors these devices using communication protocols such as Zigbee. Buyers benefit by being able to control high electricity consumption loads, receiving up-to-date knowledge of power consumption, and observing device performance on a regular basis. The home area network also comprises home automation and building automation techniques, which allow power consumption reports to be shared between different smart appliances within a building. The transmission requirements are therefore low electricity usage, low price, and reliable transmission.
Fig. 1 Overview of smart grid [5]
• Neighborhood Area Network: NAN links the HAN with the WAN and enables monitoring and distribution of power. The neighborhood area network collects a large amount of electricity consumption data from the numerous smart meters placed in HANs. The information is collected at a data concentrator (the gateway of the NAN). The NAN requires a transmission speed of 100 kbps–10 Mbps over a range of about 10 km. Data communication protocols used in the NAN include ZigBee mesh networks, Wi-Fi mesh networks, and power line communication (PLC).
• Wide Area Network: WAN connects the neighborhood area network with the utility network. WAN facilitates security of the electrical network to provide high performance and reliability. As a vast amount of information must be transmitted at high frequency, a transmission speed of 10 Mbps to 1 Gbps is needed over a range of up to 100 km.
• Smart Meter: The smart meter is an important part of advanced metering infrastructure. There is a two-way information flow between the smart meter and the utility center to transfer real-time power usage reports and relevant information such as voltage and frequency. Electronic meters are susceptible to physical interference because they are located at the customer's site. The smart meter performs tasks such as recording real-time consumption, real-time cost estimation, monitoring electricity quality, detection of unauthorized access to meters, identification of power theft, and exchange of information with other electronic appliances within the premises.
• Customer Premise Display: An interactive interface that provides power consumption and pricing details to the customer.
• Energy Management System: An intermediary between the utility center and real-time pricing programs.
• Data Collector: Collects data from smart meters and sends it to the utility center.
• Utility Center: Responsible for tasks related to the computation of bills.
• Advanced Metering Infrastructure Headend: Obtains information from smart meters and remotely gives commands to them. It also sends firmware updates and configuration.
• Meter Data Management System: Saves the information obtained from smart meters and provides it to various grid units. The meter data management system collects, verifies, computes, and allows modification of smart meter data such as power consumption and production-related data. It saves this information for a definite time period and then sends it to a data warehouse to make it accessible to other authorized units.
2.2 Supervisory Control and Data Acquisition (SCADA)
SCADA is a main part of the smart grid (SG) network. Its main task is periodic monitoring and management of electricity distribution. SCADA collects power consumption information and enables other management units to collect different types of analog data and circuit breaker position information from the grid network for assessing different security aspects of the SG network. The parts of SCADA are as follows:
• Control Servers: Execute different controlling functions.
• Human–Machine Interface: Enables monitoring of the system, updating of control settings, and manual execution of control instructions if an unexpected event occurs.
• Remote Terminal Unit: Field device equipped with wireless radio interfaces for collecting information.
• Programmable Logic Controller: Executes logic control tasks.
• Intelligent Electronic Devices: Comprise a smart sensor along with an actuator for collecting information, transmitting it to other appliances, and executing local computation.
The control center consists of the control server, human–machine interface, engineering workstations, and data historian, connected by a LAN. It aggregates data received from the field devices, provides it to the human–machine interface, and takes actions on the basis of the analyzed information.
2.3 Wide Area Measurement System
The wide area measurement system (WAMS) manages the grid system over a wide geographical area. With the help of WAMS, all related information about the power network is made available at the same location and at the same time. WAMS time-synchronizes the phasor measurement units (PMUs) of the grid network with the help of the GPS satellite signal and transmits updated phasor information to the control center. The collected phasor data give real-time information about the electrical units, which helps in taking remedial actions to improve the grid network's performance.
3 Issues in Smart Grid [2]
In this section, various issues in the adoption of the smart grid are discussed.
• Cyber-Attacks: When the grid is connected to the Internet, it is susceptible to attacks from hackers, malware, etc. The main goals of cyber-security are (1) availability, (2) integrity, and (3) confidentiality. Availability means ensuring that the data are available whenever required; integrity refers to protecting the data from illegal alteration; confidentiality means securing data from illegal access.
• Storage Concerns: The smart grid uses renewables to produce a large amount of electricity. As electricity generation from renewables is irregular and variable, the electricity needs to be stored. Batteries used for this purpose have a very short life span. Moreover, they are very expensive and complex, and there is a shortage of the raw materials required to build the cells. A flywheel (a mechanical device that stores energy in the form of rotational kinetic energy) is very fast in storing and releasing electricity, but its stability is quite low.
• Data Management: The smart grid power network consists of a large number of meters, sensors, and controllers. As a smart meter reads data every 15 min, a vast amount of data is collected. This makes accessing, managing, gathering, and sustaining the information difficult, and it delays the tasks of information gathering, evaluation, and storage. Defining standards and protocols is another concern. Cloud computing seems to be a potential solution for managing this big data.
• Communication Issues: Communication technologies for deployment in the smart grid have their own limitations, such as limited bandwidth, restricted operating range, high information loss, and a high failure rate in underground installations. GSM and GPRS have low data transfer rates. 3G needs costlier spectrum. ZigBee has a low coverage range. Wired communication faces the problem of interference. Optical fiber is very expensive (Fig. 2).
Fig. 2 AMI network infrastructure [1]
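To give a feel for the data management concern raised above, the following sketch works out how many readings a 15-min reporting interval produces. Only the 15-min interval comes from the text; the fleet size of one million meters and the 100 bytes per reading are hypothetical figures picked for the arithmetic.

```python
MINUTES_PER_DAY = 24 * 60


def readings_per_day(interval_min=15):
    """Number of readings one meter produces per day."""
    return MINUTES_PER_DAY // interval_min


def daily_volume_bytes(n_meters, bytes_per_reading, interval_min=15):
    """Raw payload volume per day for a whole fleet of meters."""
    return n_meters * readings_per_day(interval_min) * bytes_per_reading


# Hypothetical fleet: one million meters, 100 bytes per reading.
per_meter = readings_per_day()
fleet_bytes = daily_volume_bytes(1_000_000, 100)
print(per_meter, "readings per meter per day")
print(fleet_bytes / 1e9, "GB per day for the fleet")
```

Under these assumed figures each meter produces 96 readings per day, and the fleet produces 9.6 GB of raw payload per day before any protocol overhead, which illustrates why storage and processing at the utility side become a concern.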
4 Security Concerns in Smart Grid [2]
Here, various attacks possible on smart grids are discussed.
(a) Key-Based Attack
In the smart grid, a secret key or a certificate is used for authenticating a user. Key-based attacks can be categorized into known key attacks and unknown key share attacks. In a known key attack, the key is assumed to be known, and the attacker is able to learn some structural feature of the ciphertext. In an unknown key share attack, an attacker deceives two parties P and Q: P thinks that he/she has exchanged key information with Q, whereas Q wrongly believes that he/she has exchanged the key information with R ≠ P. The schemes in [3, 4] can prevent known key attacks, whereas the scheme in [5] can prevent unknown key share attacks.
(b) Data-Based Attack
An attacker can alter the power grid's information, which comprises data regarding power usage. An integrity violation occurs when an attacker is able to alter the data exchanged; it can be avoided with the scheme in [1], which makes use of access control. When a user takes part in a transaction but claims not to have been a part of it, the attack is termed a repudiation attack; it can be avoided by the lightweight message authentication technique in [4], comprising identity and hash-signature verification. A malicious data mining attack is performed when an attacker is able to compromise the aggregator and can make use of malicious meters. It can be avoided by the technique proposed in [3], which uses a privacy-preserving aggregation mechanism. According to Jiang et al. [6], the human-factor-aware differential aggregation (HDA) attack is motivated by human behavior: an attacker can acquire smart meter identity information by evaluating the messages sent by smart meters to the gateway at different time intervals. The scheme in [7] is robust to HDA attacks. In a chosen-plaintext attack [8], the attacker is able to identify a ciphertext if two corresponding plaintexts are given; this can be avoided by the technique proposed in [3], based on homomorphic encryption. In homomorphic encryption, operations are performed on ciphertexts, and the results are the same as if the operations had been applied to the corresponding plaintexts, e.g., RSA encryption, which in its unpadded form is homomorphic with respect to multiplication. In a chosen-ciphertext attack, if the attacker knows ciphertexts and can obtain the corresponding plaintexts, he can find the secret key. This can be handled by the technique proposed in [8], based on a lightweight integrity protection scheme. A data integrity attack [9] involves modification of the data obtained from a number of power meters. Numerous integrity attacks are discussed in [10, 11], e.g., the unobservable attack.
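The homomorphic property mentioned for RSA can be checked with a toy example. The sketch uses textbook (unpadded) RSA with deliberately tiny primes, purely to show that multiplying two ciphertexts yields an encryption of the product of the plaintexts; it is not the aggregation scheme of [3], and the parameters are far too small for real use.

```python
# Textbook (unpadded) RSA with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                # public modulus
phi = (p - 1) * (q - 1)  # Euler's totient of n
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    return pow(m, e, n)


def decrypt(c):
    return pow(c, d, n)


a, b = 7, 3
# Multiply the two ciphertexts without ever decrypting them.
product_cipher = (encrypt(a) * encrypt(b)) % n
print(decrypt(product_cipher), (a * b) % n)
```

Padded RSA as deployed in practice deliberately destroys this property. Privacy-preserving meter aggregation usually needs sums rather than products, so it typically relies on an additively homomorphic cryptosystem instead.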
An unobservable attack is launched by intercepting the communication channel with the help of a sniffer (a tool used to capture network packets). If the information is not encrypted while it is transmitted across the network, the sniffer can easily intercept it. With the help of a sniffer application, an adversary can examine the communication channel and use the data to disrupt the communication. A cache pollution attack
[12] occurs when an attacker exploits the gateways to cache irrelevant information in order to reduce network efficiency and increase link usage. The techniques proposed in [13, 14] can prevent cache pollution attacks by deploying orthogonal schemes such as zero-knowledge proofs.
(c) Impersonation-Based Attack
An attacker can intercept information provided by different electronic meters and use this data for malicious purposes. Theft of information that is being transferred between electronic meters, or between a gateway and electronic meters, in order to alter the information without the knowledge of the communicating parties [14] is termed a man-in-the-middle attack. The electronic meters or gateway provide their public keys (M1, M2, M'1, and M'2) to each other to establish a reliable connection between them. The attacker captures these messages and provides its own public keys (M3, M4, M'3, and M'4) to the communicating parties. The first smart meter or gateway then encrypts its data using the attacker's public key, and this encrypted data is transmitted toward the second smart meter (M5, M'5). The attacker again captures the transmitted messages and decrypts them with its own private key. After that, the attacker encrypts the data with the second smart meter's public key and communicates it to the second smart meter (M6, M'6), as shown in Fig. 3. When an adversary captures data and replays it to the destination server, the attack is known as a replay attack. The timestamp approach can be used to resolve replay attacks. In a V2G (vehicle-to-grid) system, an attacker can redirect a vehicle's data packets to a network other than the original one. A redirection attack is generally combined with a phishing attack [15]. Phishing is an illegal effort to obtain confidential data, for example usernames, passwords, or credit card information, by pretending to be a trusted entity. The attack is carried out through email spoofing: it requests a particular entity to provide confidential data to an unauthorized Web site whose look and feel are similar to those of the legitimate site. Redirection attacks can be prevented by verifying the location of every electric vehicle [15]. An attacker can launch an impersonation attack, in which he pretends to be one of the legal users [4] during the authentication process between electronic meters and gateways. When an attacker is able to compromise the communication channel and intercept all data transferred between electronic meters and gateways so as to invade the secrecy of other authorized users, the attack is known as an eavesdropping attack. The scheme in [16] makes use of the Boneh-Goh-Nissim cryptosystem [17] and differential privacy to prevent eavesdropping attacks. The scheme in [18], which is based on an anonymous ECC-based self-certified key distribution scheme, can detect eavesdropping attacks.
Fig. 3 Man-in-the-middle attack (message labels in the figure: M1–M6 and M'1–M'6)
(d) Physical-Based Attack
In a physical-based attack, the attacker may target the hardware of a smart grid. A differential attack is a technique that exploits knowledge of how differences in the input information affect the resulting differences in the output. Differential attacks are launched to acquire consumers’ information in an electric grid. The scheme in [19] (based on a differential privacy protocol [20]) and the scheme in [16] (based on the use of Laplace noise in the ciphertext to achieve differential privacy) can prevent differential attacks. A malware attack is initiated by injecting malware into an electric grid to capture consumers’ private information. The scheme in [16] (using the Boneh-Goh-Nissim cryptosystem [17] and the concept of differential privacy) can prevent malware attacks. A collusion attack [21] combines different data records of media or other files in such a way that a new copy similar to the original is produced. Examples of such operations include averaging the data or replacing it with a linear combination of the data. Collusion attacks are mostly used to break video fingerprinting technologies or to crack the passwords or keys of the target system. It has been estimated that only 8–10 records are required for a collusion attack. The scheme in [22], which is based on a self-healing mechanism, can prevent collusion attacks. Cloud server collusion attacks are avoided by the scheme in [21], which uses an identity-based encryption scheme and a proximity score calculation algorithm for data encryption. In a proximity search, the attacker looks for documents in which two or more separately matching term occurrences lie within a given distance of each other. If an attacker is able to analyze the characteristics of the DBMS, the attack is known as an inference attack [2]. Inference attacks can be prevented by using separable key-chaining management [22].
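The role of Laplace noise mentioned above can be illustrated with a minimal sketch of the generic Laplace mechanism. This is not the scheme in [16]; the function names, the readings, and the values of `sensitivity` and `epsilon` are hypothetical and chosen only for illustration.

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) by inverse-CDF sampling."""
    u = rng.random() - 0.5                  # uniform on [-0.5, 0.5)
    u = max(u, -0.5 + 1e-12)                # avoid log(0) if u is exactly -0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_total(readings, sensitivity, epsilon, rng):
    """Aggregate the readings and perturb the sum with Laplace(sensitivity / epsilon) noise."""
    return sum(readings) + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)                      # fixed seed so the sketch is reproducible
readings_kwh = [1.2, 0.8, 2.5, 1.9]         # hypothetical per-household readings
released = noisy_total(readings_kwh, sensitivity=3.0, epsilon=0.5, rng=rng)
```

A smaller `epsilon` gives a larger noise scale, so an observer who compares two released totals learns less about any single household, at the cost of a less accurate aggregate.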
5 Review of Existing Cryptographic Techniques
Abdallah et al. [3] proposed a lightweight technique for preserving the privacy of electricity usage information. It exploits a homomorphic cryptosystem that is in turn based on lattice vectors. The appliances of a smart house collect data without any involvement of the electronic meter. The base stations and smart meters can verify the authenticity of messages but cannot decrypt the power usage information. The overall power consumption and communication overhead of such schemes are small and can easily be tolerated by the entities in the group, such as electronic meters, devices, and stations. The computation overhead for smart devices is also reduced, because the cryptosystem is based on simple arithmetic operations. Experimental results indicate that the technique maintains consumer privacy, message integrity, and message authentication with lightweight communication and reduced computational complexity. However, the technique does not protect against differential attacks or against several full-round attacks based on biclique cryptanalysis (a variant of the meet-in-the-middle method of cryptanalysis), and it can suffer from block collisions (two input blocks producing the same hash value) when used with large amounts of data.
Fouda et al. [4] proposed a scheme based on identity and hash-signature verification to avoid repudiation and known-key attacks on the smart grid. Mutual authentication is first established between electronic meters, and a session key is then shared using the Diffie–Hellman algorithm. After that, messages are authenticated in a lightweight way with the shared session key. The analysis shows that the technique not only fulfills the security requirements of the smart grid but is also effective in terms of low latency and signal message exchange. The Diffie–Hellman algorithm is efficient in terms of memory, data rate, and processing overhead; however, its security depends on the key size, and a larger key size increases resource utilization.
Jolfaei et al. [8] proposed a lightweight integrity protection scheme (16-bit Fletcher algorithm, permutation-based encryption, checksum embedding, and an integrity verification algorithm) for maintaining the integrity of grid information. The proposed scheme has a high data rate and is suitable for high-speed data transfer, with little delay in error detection and integrity checking.
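As a rough illustration of the checksum component named above, the following sketch implements the plain 16-bit Fletcher checksum. It covers only the checksum itself, not the permutation-based encryption or checksum embedding of the scheme in [8]; the payloads are hypothetical.

```python
def fletcher16(data: bytes) -> int:
    """Return the 16-bit Fletcher checksum of data (both running sums are taken modulo 255)."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255      # simple running sum of the bytes
        sum2 = (sum2 + sum1) % 255      # sum of the running sums (position sensitive)
    return (sum2 << 8) | sum1

reading = b"meter=42;kwh=3.7"           # hypothetical meter payload
tampered = b"meter=42;kwh=9.7"          # same payload with one altered digit

checksum_ok = fletcher16(reading)
checksum_bad = fletcher16(tampered)
```

Because the sums are reduced modulo 255, a byte of 0xFF contributes the same as a byte of 0x00, so a block of all 1 bits and a block of all 0 bits yield the same checksum. This is the limitation listed for [8] in Table 1.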
Yang et al. [27] proposed a least-effort attack model. In this scheme, the least number of sensors that must be compromised to alter a specified number of states is computed using the reduced row echelon form. An efficient greedy design for optimal PMU placement is then developed to defend against data integrity attacks. The security analysis shows that the scheme has low computational complexity; however, it fails to find the globally optimal solution.
Guan et al. [28] proposed EFFECT, an efficient flexible privacy-preserving aggregation scheme. The scheme provides data source authentication and secure collection of information from different sources. To ensure fault tolerance and information secrecy, a limit is set on the periodic collection of power usage data for a specific location. The results indicate that the scheme provides secure and fast communication within the smart grid with low processing overhead.
Bansod et al. [31] proposed PICO, a lightweight and low-power encryption scheme for the smart grid. PICO uses a small data block (64 bits) and a reduced key (80 or 128 bits) to encrypt information. Because it consumes few resources, it is deployed in resource-constrained devices such as sensors, RFID tags, and smart cards. Its limitation is that the large number of rounds increases the computation time, since the permutation is performed at the bit level.
Sivaganesan et al. [33] proposed a blockchain-based, data-driven trust mechanism to detect internal attacks in IoT-powered sensor nodes. The scheme is decentralized and power saving. The results show that it reduces message overhead and execution time and improves network lifetime.
Manoharan et al. [34] proposed a chaotic-based biometric authentication scheme for the user interaction layer of the cloud. The scheme uses the finger point as the biometric trait and an N-stage Arnold transform to check the legitimacy of the user. The results show improved performance in terms of detections, false detections, and accuracy. Table 1 shows the comparative analysis of the existing techniques.
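The Arnold transform mentioned for [34] can be sketched as follows. This is only an illustration of the standard Arnold cat map on a square grid; it assumes that "N-stage" means applying the map repeatedly, and the function names and the 4 x 4 stand-in template are hypothetical rather than taken from [34].

```python
def arnold(image, rounds=1):
    """Scramble a square 2-D list by applying the Arnold cat map `rounds` times."""
    n = len(image)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # The value at (x, y) moves to ((x + y) mod n, (x + 2y) mod n).
                out[(x + y) % n][(x + 2 * y) % n] = image[x][y]
        image = out
    return image

def arnold_inverse(image, rounds=1):
    """Undo `rounds` applications of the Arnold cat map."""
    n = len(image)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Inverse map: (x, y) moves back to ((2x - y) mod n, (y - x) mod n).
                out[(2 * x - y) % n][(y - x) % n] = image[x][y]
        image = out
    return image

template = [[4 * r + c for c in range(4)] for r in range(4)]   # stand-in for a biometric template
scrambled = arnold(template, rounds=2)
restored = arnold_inverse(scrambled, rounds=2)
```

The map is a permutation of the grid positions, so it is exactly invertible. It is also periodic (period 3 for a 4 x 4 grid), which is why the number of stages has to be chosen with the grid size in mind.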
6 Research Issues and Challenges Observed in Previous Models
• Most existing cryptographic approaches consume a large amount of resources and have high computational overhead [5, 15, 17, 19, 22, 24–26].
• The Diffie–Hellman algorithm is efficient in terms of memory, data rate, and processing overhead; however, its security depends on the key size, and a larger key size increases resource utilization [2].
• The lightweight homomorphic cryptosystem based on lattice vectors reduces computational complexity but does not protect against differential attacks or several full-round attacks based on biclique cryptanalysis (a variant of the meet-in-the-middle method of cryptanalysis); it can also suffer from block collisions (two input blocks producing the same hash value) when used with large amounts of data [3].
• Lightweight message authentication schemes are effective in terms of time delay and data rate [4].
• The integrity protection algorithm (16-bit Fletcher algorithm, permutation-based encryption, checksum embedding, and integrity verification algorithm) has a high data rate. It is also used in high-speed data transfer, with little delay in error detection and integrity checking [8].
• Identity-based encryption detects false data with a low-cost hardware implementation [9].
• The usage pattern-based power theft detector scheme has an improved sampling rate, but customer privacy is affected [29].
• The online sequence extreme learning machine technique provides better speed and accuracy [30].
Table 1 Comparative analysis of existing techniques
S. No. | Reference/citation | Methodology used | Security aspect focused | Contributions | Limitations
1 | Abdallah et al. [3] | Lightweight homomorphic cryptosystem based on lattice vector | Key-based attack | Provides robust security against key-based attack | Fails to protect against differential attack and several full-round attacks using biclique cryptanalysis; can have problems of block collisions if used with large amounts of data
2 | Fouda et al. [4] | Lightweight message authentication scheme using hash signature and access control | Modification attack, repudiation attack, redirection attack | Low computational delay and high data rate; provides security against modification attack, repudiation attack, and redirection attack | Not stated
3 | Jolfaei et al. [8] | Lightweight integrity protection algorithm (16-bit Fletcher algorithm, permutation-based encryption, checksum embedding, integrity verification algorithm) | Man-in-the-middle attack | High data rate; also used in high-speed data transfer with small time delay for error detection and integrity checking | Fails to distinguish between blocks of all 0 bits and blocks of all 1 bits; consumes a significant amount of memory; fails to establish authenticity of user
4 | Yang et al. [27] | Least-effort attack model | Repudiation attack, data integrity attack | Low computation complexity | Fails to find the globally optimal solution
5 | Guan et al. [28] | Flexible privacy-preserving data aggregation scheme (secret sharing algorithm) | Replay, modification, injection, and forgery attack | Low computational complexity and communication overhead | Each share of the secret must be at least as large as the secret itself
6 | Bansod et al. [31] | PICO (lightweight encryption technique) | False data injection, replay attack, eavesdropping attack | Consumes fewer resources | Computation time increases because of the large number of rounds (permutation is done at the bit level)
7 | Sivaganesan et al. [33] | Blockchain-based data-driven trust mechanism | Internal attacks in IoT-powered sensor nodes | Power-saving scheme; reduced message overhead and execution time and improved network lifetime | No security mechanism during data transfer
8 | Manoharan et al. [34] | Chaotic-based biometric authentication scheme | Checking the legitimacy of the user in the user interaction layer of the cloud | Improved performance in terms of detections, false detections, and accuracy | Not so smooth for multimodal systems; the constrained two-time attempt to validate the query is a limiting factor
• PICO consumes fewer resources but has increased computation time because of its large number of rounds, as it performs permutation at the bit level [31].
• A combination of two or more existing cryptographic approaches, i.e., a hybrid lightweight approach for securing sensitive data, may provide improved results in terms of memory space, time, and computational complexity while still providing the same level of security as the best conventional cryptographic techniques [3, 6, 8, 12, 32].
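The session-key idea behind the lightweight message authentication schemes listed above can be sketched as follows: both sides derive the same key with Diffie–Hellman and then authenticate messages with an HMAC. This is a toy illustration, not the protocol of [4]; the prime `P`, the base `G`, the helper names, and the message are assumptions, and the parameters are far too small for real use.

```python
import hashlib
import hmac
import secrets

# Toy public parameters: 2**127 - 1 is prime, but much too small for deployment.
P = 2**127 - 1
G = 3

def keypair():
    """Return (private exponent, public value) for one party."""
    private = secrets.randbelow(P - 3) + 2          # private exponent in [2, P - 2]
    return private, pow(G, private, P)

def session_key(own_private, peer_public):
    """Derive a shared key from the Diffie-Hellman secret g^(ab) mod p."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

def tag(key, message):
    """Compute an HMAC-SHA256 authentication tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

meter_priv, meter_pub = keypair()
gateway_priv, gateway_pub = keypair()
k_meter = session_key(meter_priv, gateway_pub)
k_gateway = session_key(gateway_priv, meter_pub)

message = b"meter-17|kwh=3.7"                        # hypothetical reading
mac = tag(k_meter, message)
accepted = hmac.compare_digest(mac, tag(k_gateway, message))
```

The trade-off noted in the list is visible here: the cost of each `pow` call grows with the size of `P`, so a larger key size buys security at the price of more computation on the meter.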
7 Conclusion and Directions for Future Work
A smart meter communicates sensitive information to the smart grid, which is prone to attacks such as integrity violation, eavesdropping, and replay. To secure the smart meter, security algorithms such as AES and homomorphic encryption have been used, but these techniques consume a large amount of resources and computation time. Lightweight approaches (with a reduced data block of 64 bits and a secret key of 80 or 128 bits) are needed to reduce resource consumption and computation time. Lightweight security techniques are used in resource-constrained devices such as sensors and smart cards. The most preferred lightweight techniques include PRESENT, HIGHT, SIMON, SPECK, and PICO. The drawback of these lightweight approaches is the increase in computation time caused by the large number of rounds (permutation is done at the bit level). A combination of two or more existing cryptographic approaches, i.e., a hybrid lightweight approach for securing sensitive data, may provide improved results in terms of memory space, time, and computational complexity while still providing the same level of security as the best conventional cryptographic techniques. Hence, there is a need to design a lightweight encryption technique for smart grid security that not only consumes fewer resources and has lower computational complexity but also provides secure transmission and storage of electronic meter data.
References
1. S. Tan, D. De, J. Yang, K. Das, Survey of security advances in smart grid: a data driven approach. IEEE Commun. Surv. Tutor. 19(1), 397–422 (2017)
2. Kappagantu, S. Daniel, Challenges and issues of smart grid implementation: a case of Indian scenario. Science Direct 5(3), 453–467 (2018)
3. A. Abdallah, X. Shen, A lightweight lattice-based homomorphic privacy-preserving data aggregation scheme for smart grid. IEEE Trans. Smart Grid 9(1), 396–405 (2018)
4. M. Fouda, M. Zubair, K.N. Fadlullah, R. Lu, X. Shen, A lightweight message authentication scheme for smart grid communications. IEEE Trans. Smart Grid 2(4), 675–685 (2011)
5. P. Zhang, O. Elkeelany, L. McDaniel, An implementation of secured smart grid ethernet communications using AES, in Proceedings of the IEEE Southeast Conference, pp. 394–397 (2010)
6. C. Lai, H. Li, R. Lu, R. Jiang, X. Shen, LGTH: a lightweight group authentication protocol for machine-type communication in LTE networks, in IEEE Global Communications Conference (GLOBECOM), pp. 832–837 (2013)
7. A. Mood, N.M. Dariush, Efficient design and hardware implementation of a secure communication scheme for smart grid. Int. J. Commun. Syst. 31(10) (2018)
8. A. Jolfaei, K. Kant, A lightweight integrity protection scheme for low latency smart grid applications. Comput. Secur. 86(1), 471–483 (2019)
9. S. Tan, M. Stewart, J. Yang, L. Tong, Online data integrity attacks against real-time electrical market in smart grid. IEEE Trans. Smart Grid 9(1), 313–322 (2018)
10. A. Waisi, A.M. Zainab, On the challenges and opportunities of smart meters in smart homes and smart grids, in Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control (ACM, 2018), pp. 16–17
11. K.P. Braeken, A. Martin, Efficient and privacy-preserving data aggregation and dynamic billing in smart grid metering networks. Energies 11(8), 1–20 (2018)
12. M. Conti, P. Gasti, M. Teoli, A lightweight mechanism for detection of cache pollution attacks in named data networking. Comput. Networks 57(16), 3178–3191 (2013)
13. X. Tan, J. Zheng, C. Zou, Y. Niu, Pseudonym-based privacy-preserving scheme for data collection in smart grid. Int. J. Ad Hoc Ubiq. Comput. 22(2), 120–121 (2016)
14. M. Conti, N. Dragoni, V. Lesyk, A survey of man in the middle attacks. IEEE Commun. Surv. Tutor. 18(3), 2027–2051 (2016)
15. Z. Wan, W. Zhu, G. Wang, PRAC: efficient privacy protection for vehicle-to-grid communications in the smart grid. Comput. Secur. 62(1), 246–256 (2016)
16. H. Bao, R. Lu, A new differentially private data aggregation with fault tolerance for smart grid communications. IEEE Internet Things J. 2(3), 248–258 (2015)
17. D. Boneh, E. Goh, K. Nissim, Evaluating 2-DNF formulas on ciphertexts, in Theory of Cryptography Conference (Springer Berlin, Heidelberg, 2005), pp. 325–341
18. A. Mood, N.M. Dariush, An anonymous ECC-based self-certified key distribution scheme for the smart grid. IEEE Trans. Ind. Electron. 65(10), 7996–8004 (2018)
19. L. Chen, R. Lu, Z. Cao, K. AlHarbi, X. Lin, MuDA: multifunctional data aggregation in privacy-preserving smart grid communications. Peer-to-Peer Netw. Appl. 8(1), 777–792 (2014)
20. C. Dwork, Differential privacy: a survey of results, in Theory and Applications of Models of Computation. International Conference on Theory and Applications of Models of Computation (Springer, Berlin Heidelberg, 2008), pp. 1–19
21. X. Liang, K. Zhang, R. Lu, X. Lin, X. Shen, EPS: an efficient and privacy-preserving service searching scheme for smart community. IEEE Sens. J. 13(10), 3702–3710 (2013)
22. R. Jiang, R. Lu, J. Luo, C. Lai, X. Shen, Efficient self-healing group key management with dynamic revocation and collusion resistance for SCADA in smart grid. Secur. Commun. Netw. 8(6), 1026–1039 (2014)
23. S. Desai, R. Alhadad, N. Chilamkurti, A. Mahmood, A survey of privacy preserving schemes in IoE enabled smart grid advanced metering infrastructure. Cluster Comput. 22(1), 43–69 (2018)
24. M. Wen, R. Lu, K. Zhang, J. Lei, X. Liang, X. Shen, PaRQ: a privacy-preserving range query scheme over encrypted metering data for smart grid. IEEE Trans. Emerg. Topics Comput. 1(1), 178–191 (2013)
25. B. Li, R. Lu, W. Wang, K. Choo, DDOA: a Dirichlet-based detection scheme for opportunistic attacks in smart grid cyber-physical system. IEEE Trans. Inf. Forensics Secur. 11(11), 2415–2425 (2016)
26. N. Saputro, K. Akkaya, Performance evaluation of smart grid data aggregation via homomorphic encryption, in IEEE Wireless Communications and Networking Conference (WCNC), pp. 2945–2950 (2012)
27. Q. Yang, D. An, R. Min, W. Yu, X. Yang, W. Zhao, On optimal PMU placement-based defence against data integrity attacks in smart grid. IEEE Trans. Inf. Forensics Secur. 12(7), 1735–1750 (2017)
28. Z. Guan, Y. Zhang, L. Zhu, L. Wu, S. Yu, EFFECT: an efficient flexible privacy-preserving data aggregation scheme with authentication in smart grid. Sci. China Inf. Sci. 62(3), 1–15 (2019)
29. P. Jokar, N. Arianpoo, V. Leung, Electricity theft detection in AMI using customers consumption patterns. IEEE Trans. Smart Grid 7(1), 216–226 (2015)
30. Y. Li, R. Qiu, S. Jing, Intrusion detection system using online sequence extreme learning machine (OS-ELM) in advanced metering infrastructure of smart grid. PloS One 13(2), 1–16 (2018)
31. G. Bansod, N. Pisharoty, A. Patil, PICO: an ultra-lightweight and low power encryption design for ubiquitous computing. Defence Sci. J. 66(3), 259–265 (2016)
32. S. Khasawneh, M. Kadoch, Hybrid cryptography algorithm with precomputation for advanced metering infrastructure networks. Mobile Networks Appl. 23(4), 982–993 (2018)
33. D. Sivaganesan, A data driven trust mechanism based on blockchain in IoT sensor networks for detection and mitigation of attacks. J. Trends Comput. Sci. Smart Technol. 3(1), 59–69 (2021)
34. Manoharan, A novel user layer cloud security model based on chaotic Arnold transformation using fingerpoint biometric traits. J. Innov. Image Process. 3(1), 36–51 (2021)
An Efficient Email Spam Detection Utilizing Machine Learning Approaches
G. Ravi Kumar, P. Murthuja, G. Anjan Babu, and K. Nagamani
Abstract In this paper, email data were classified as ham or spam using supervised learning algorithms. The problems of email spam detection systems are associated with high dimensionality in the feature selection process and low accuracy of spam email classification. Feature extraction with artificial intelligence (AI) techniques is emerging as a global optimization approach that reduces irrelevant and redundant data and produces satisfactory results with high accuracy. Against this background, this paper presents a feature selection algorithm based on particle swarm optimization (PSO) combined with three ML approaches: decision tree, support vector machine (SVM), and Naive Bayes. The proposed PSO-based feature selection algorithm searches the feature space for the best feature subsets. The classifier performance and the length of the selected feature vector used as classifier input are considered in the evaluation, which uses the UCI spambase dataset. Experimental results show that the PSO-based feature selection algorithm produces excellent feature selection results with a minimal set of selected features, which leads to high accuracy of spam email classification with the three ML classifiers. The results show that the SVM-PSO-based feature selection method obtains the highest accuracy, which improves from 93.62 to 95.97%.
Keywords PSO · ML · Spam · SVM · Decision tree · Naive Bayes
G. Ravi Kumar (B) · K. Nagamani
Department of Computer Science, Rayalaseema University, Kurnool, Andhra Pradesh, India
P. Murthuja · G. Anjan Babu
Department of Computer Science, SV University, Tirupati, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_11
1 Introduction
As the number of Web users increases rapidly, email is becoming an essential element of online communication. Users rely heavily on email as one of the most important means of online communication and of transmitting information or messages [1]. Its importance and usage continue to grow despite the development of mobile applications, social networks, and so on. Emails are used at both the personal and professional levels, and they can be regarded as official documents in communication between users. Because of the huge volume of spam and junk emails, it becomes difficult for users to identify the important ones. Spam mail is a significant problem that researchers seek to analyze and reduce. Spam messages are usually sent in bulk; they may contain trojans, viruses, and malware, and they are also used for phishing attacks [2]. Problems arise when many unwanted emails arrive from unknown sites and it becomes difficult for the user to categorize a received email as spam or ham. Spam is perhaps one of the most complex problems in email management, and it has many definitions. Spam messages are unwanted and unsolicited messages that are not intended for a specific recipient and that are sent for marketing purposes, scams, hoaxes, and so on [3]. Spam is essentially an online communication that has been sent to the user without consent. It takes different forms, such as adult content, the sale of products or services, and job offers. Spam has increased tremendously in recent years. Legitimate, wanted, and official messages are known as ham, which is also defined as email that is generally desired.
Therefore, the problem addressed in this paper is how to develop an automated method for separating spam from the legitimate (ham) emails in the inbox. The field of machine learning (ML) offers a robust, ready, and alternative way of tackling this kind of problem [4].
2 Feature Selection
Feature selection is an important strategy for reducing the number of features without lowering classification accuracy in pattern classification [5]. The goal of any feature selection algorithm is (i) to minimize the number of features and (ii) to maximize classification accuracy [6]. Extracting and selecting useful features are important steps in building accurate and efficient email classifiers [7, 8]. Carefully selected features can therefore significantly improve classification accuracy while reducing the amount of data required to reach the desired performance.
2.1 Particle Swarm Optimization
Particle swarm optimization (PSO) was devised by Kennedy and Eberhart in 1995 [9]. PSO is a widely used population-based stochastic optimization technique because it has strong global search ability, high convergence speed, and high robustness, and it is very simple [10]. PSO simulates the social behavior of species such as fish schools and bird flocks, and it belongs to the family of evolutionary computation techniques. The extracted features as a whole are called the population, and the individual features are called particles. Each particle is allowed to fly over the search space with a given velocity to look for the best position, so every particle has a velocity and a current position. The current position in the search space is represented as $x_i = \{x_{i1}, x_{i2}, \ldots, x_{iD}\}$. The current velocity of a particle lets it move in a particular direction and generally drives the particle toward the best solution. The current velocity of the particle is represented as $v_i = \{v_{i1}, v_{i2}, \ldots, v_{iD}\}$. The population in PSO is called a swarm. It consists of particles (solutions), and the algorithm searches for the optimal position by updating the particles in every iteration. In each iteration, every particle is updated by following two best fitness values, which are evaluated with a fitness function appropriate to the problem. The first value is the best position the particle itself has achieved so far; this value is called the personal best (Pbest). The other value is the best position obtained so far by any particle in the whole swarm; this is known as the global best (Gbest). Every particle has a velocity, which determines its direction and moves it to a new position. The basic PSO algorithm consists of the following steps:
i. Initialize every particle i of the swarm with random values for the position ($X_i$) and velocity ($V_i$) in the search space, according to the dimensionality of the problem.
ii. Evaluate the fitness value of each particle using the fitness function.
iii. Compare the fitness value obtained for particle i with the value of Pbest.
iv. If the fitness value is better than the Pbest value, update Pbest to the current particle position.
v. If the Pbest value of any particle is better than Gbest, update Gbest = Pbest.
vi. Update the velocity $V_i$ and the position $X_i$ of the particles using Eqs. (1) and (2), respectively.
vii. If neither the maximum number of iterations nor the stopping criterion has been reached, return to step ii.
$$V_i[t+1] = \omega V_i[t] + c_1 r_1 \left(p_{i,\mathrm{best}}[t] - p_i[t]\right) + c_2 r_2 \left(p_{g,\mathrm{best}}[t] - p_i[t]\right) \tag{1}$$
$$p_i[t+1] = p_i[t] + V_i[t+1] \tag{2}$$
where i = 1, 2, …, N, and N is the size of the swarm population. $V_i[t]$ is the velocity vector in the t-th iteration, and $p_i[t]$ is the current position of the i-th particle. $p_{i,\mathrm{best}}[t]$ is the previous best position of the i-th particle, and $p_{g,\mathrm{best}}[t]$ is the previous best position of the whole swarm. The inertia weight ω is used to balance local and global search. $c_1$ and $c_2$ are positive acceleration coefficients, called the cognitive parameter and the social parameter, respectively. $r_1$ and $r_2$ are random numbers between 0 and 1.
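The steps and the two update equations can be sketched in a few lines of plain Python. This is a minimal illustration that minimizes a toy sphere function rather than the feature-selection objective of this paper; the function names, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count), and the search range are assumptions.

```python
import random

def pso(fitness, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` using the velocity and position updates of Eqs. (1) and (2)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (1): inertia term + cognitive pull + social pull
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Eq. (2): move the particle
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:                   # update Pbest
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:                  # update Gbest
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

best_pos, best_val = pso(sphere, dim=3)
```

For a maximization problem such as classification accuracy, the same loop applies with the comparisons reversed or with the negated accuracy used as the fitness.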
3 Proposed Methodology
The proposed approach consists of the following steps:
1. Select the spambase dataset.
2. Preprocess the data and perform feature selection through PSO.
3. Select the best features using the feature selection technique.
4. Add the selected features to the list Fi.
5. Split the dataset into training and testing sets (70% and 30%, respectively) using 10-fold cross-validation.
6. Train the classifiers (SVM, decision tree, and Naive Bayes) using the training set.
7. Evaluate the trained model using the testing set.
8. Store the classification performance.
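One way to score a candidate feature subset inside such a pipeline is sketched below. It is an assumption-laden illustration, not the authors' fitness function: a nearest-centroid classifier stands in for SVM, decision tree, and Naive Bayes, the penalty weight `alpha` is hypothetical, and the data are toy values.

```python
def nearest_centroid_accuracy(train, test, mask):
    """Accuracy of a nearest-centroid classifier restricted to the features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in sorted({y for _, y in train}):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    def predict(x):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(centroids, key=lambda lab: sum(
            (x[j] - c) ** 2 for j, c in zip(cols, centroids[lab])))

    return sum(predict(x) == y for x, y in test) / len(test)

def fitness(mask, train, test, alpha=0.05):
    """Reward accuracy and lightly penalize the fraction of selected features."""
    acc = nearest_centroid_accuracy(train, test, mask)
    return acc - alpha * sum(mask) / len(mask)

# Toy data: feature 0 separates the classes, feature 1 is noise.
train = [([0.1, 0.9], 0), ([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.8, 0.2], 1)]
test = [([0.15, 0.5], 0), ([0.85, 0.5], 1)]
score_informative = fitness([1, 0], train, test)
score_noise = fitness([0, 1], train, test)
```

A PSO particle would encode the 0/1 mask, and the swarm would search for the mask with the highest score, which reflects both goals stated for feature selection: few features and high accuracy.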
3.1 Data Preprocessing
In the proposed model, the data in the spambase dataset are first preprocessed. This paper uses the well-known and widely used benchmark corpus, the UCI spambase dataset, to classify email as spam or non-spam. The dataset is available in numeric form, and its 57 attributes mostly represent the frequencies of various words and characters in emails [11]. The main tasks in preprocessing are transformation, reduction, cleaning, integration, and normalization. Normalization is an important step because it speeds up computation and convergence and reduces the impact of irregularities in the data supplied to the classifier. Each feature of the spambase dataset is normalized to the range [0, 1] before the PSO algorithm is applied, and the data are normalized to this range in both the training and testing stages.
This paper uses three ML algorithms, namely SVM, decision tree, and Naive Bayes, along with PSO feature optimization.
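The normalization function itself is not reproduced in this section. A common choice that maps each feature to [0, 1] is min-max scaling, sketched below under that assumption; the function names and sample values are hypothetical.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:                                   # constant feature: map everything to 0.0
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def normalize(rows):
    """Apply min-max scaling to every feature (column) of a row-major table."""
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

raw = [[0.0, 10.0], [0.5, 30.0], [2.0, 20.0]]      # hypothetical feature values
scaled = normalize(raw)
```

In a train/test setting, the minimum and maximum would normally be computed on the training set and reused for the test set so that no information leaks from the test data.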
3.2 Support Vector Machine (SVM)
SVM is a machine learning algorithm that can be used for both regression and classification problems, although it is mainly used for binary classification [12]. The basic idea of SVM is to implicitly map the training data set into a high-dimensional feature space [13]. When labeled training data are given as input, the model outputs an optimal hyperplane that classifies the samples. It is straightforward to place a linear hyperplane between two classes. The idea of the SVM method is to construct an optimal hyperplane that separates the data into two classes. The optimal hyperplane is the boundary that splits the data into classes, and its position is determined by the closest patterns. Patterns are the points that describe the data set. To obtain the optimal hyperplane, the maximum margin must be found. The margin is the distance between the hyperplane and the closest pattern of each class, and the support vectors are the patterns closest to the optimal hyperplane [14].
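The notion of margin can be made concrete with a small sketch that compares two separating hyperplanes on toy data. It does not train an SVM; the samples, the two candidate hyperplanes, and the convention of labels in {-1, +1} are assumptions made for illustration.

```python
import math

def geometric_margin(w, b, samples):
    """Smallest signed distance y * (w.x + b) / ||w|| over labeled samples (y in {-1, +1})."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in samples)

samples = [((1.0, 1.0), -1), ((2.0, 1.0), -1), ((4.0, 3.0), +1), ((5.0, 3.0), +1)]

margin_a = geometric_margin((1.0, 0.0), -3.0, samples)   # hyperplane x1 = 3
margin_b = geometric_margin((1.0, 1.0), -5.0, samples)   # hyperplane x1 + x2 = 5
```

Both hyperplanes separate the classes, but the second leaves a wider gap to the closest patterns, here (2, 1) and (4, 3), which play the role of support vectors. An SVM searches for the hyperplane with the largest such margin.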
3.3 Decision Tree
A decision tree is a tree-like graph made up of internal nodes, which represent a test on an attribute, branches, which denote the outcome of the test, and leaf nodes, which carry a class label [11]. The classification rules are formed by the path from the root node to a leaf. The root node is chosen first, as it is the most prominent attribute for splitting the data. The tree is built by identifying attributes and their associated values, which are used to examine the data at each intermediate node of the tree. Once the tree is built, it can predict new data by traversing from the root node to a leaf node, visiting the internal nodes on the way according to the test conditions on the attributes at each node. The main advantage of decision trees over other classification methods is that they yield a rich set of rules that are easy to understand.
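The root-to-leaf traversal can be sketched with a small hand-written tree. The attribute names, thresholds, and labels below are hypothetical and are not rules learned from the Spambase data.

```python
# Hypothetical tree: internal nodes test one attribute against a threshold,
# leaves carry a class label. All names and thresholds are illustrative.
tree = {
    "attribute": "freq_free", "threshold": 0.1,
    "left": {"label": "non-spam"},
    "right": {
        "attribute": "capital_run_avg", "threshold": 3.0,
        "left": {"label": "non-spam"},
        "right": {"label": "spam"},
    },
}


def predict(node, sample):
    """Walk from the root to a leaf, following the outcome of each node's test."""
    while "label" not in node:
        branch = "left" if sample[node["attribute"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


result = predict(tree, {"freq_free": 0.5, "capital_run_avg": 5.0})
```

Each root-to-leaf path reads as one rule, for example "if freq_free > 0.1 and capital_run_avg > 3.0 then spam", which is what makes the model easy to inspect.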
G. RaviKumar et al.
3.4 Naive Bayes
The Naive Bayes algorithm is a simple probabilistic classifier that computes a set of probabilities by counting the frequency and combinations of values in a given dataset [11]. The classifier determines class membership probabilities and is founded on Bayes' theorem. It assumes that the effect of an attribute value on a given class does not depend on the values of the other attributes, an assumption known as class-conditional independence. This assumption rarely holds in real applications, hence the name Naive, yet the algorithm tends to perform well and to learn quickly in a variety of supervised classification problems.
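The counting view of Naive Bayes can be illustrated with a minimal sketch for categorical features. It assumes binary presence/absence features and Laplace smoothing, neither of which is specified in the text; the toy samples are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(label, feature_index)][value] = number of occurrences
    value_counts = defaultdict(Counter)
    for features, label in zip(samples, labels):
        for index, value in enumerate(features):
            value_counts[(label, index)][value] += 1
    return class_counts, value_counts


def predict(model, features, n_values=2):
    """Pick the class with the highest log-posterior under the independence assumption."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for index, value in enumerate(features):
            # Laplace smoothing (assumed) keeps unseen values from zeroing the product.
            likelihood = (value_counts[(label, index)][value] + 1) / (count + n_values)
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical features: (contains the word "free", contains the word "meeting").
samples = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (0, 0)]
labels = ["spam", "spam", "spam", "non-spam", "non-spam", "non-spam"]
model = train_naive_bayes(samples, labels)
```

Because the per-feature likelihoods are simply multiplied (added in log space), training reduces to one counting pass over the data, which is why the method learns quickly.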
4 Experimental Results
The proposed technique was evaluated on the UCI Spambase dataset [15] using Python and the scikit-learn machine learning library, which provides a number of learning algorithms. The application was developed in the Jupyter Notebook environment. The ML models were built, trained, tested, and evaluated with scikit-learn.
4.1 Dataset
The proposed method is tested on the UCI Spambase dataset [15]. The dataset consists of 4,601 instances and 57 attributes. The last column indicates whether the email is considered spam (1) or not (0). There are 1,813 spam instances and 2,788 non-spam instances. Most of the attributes indicate how frequently a particular word or character occurs in a given email, and the remaining attributes measure the length of sequences of consecutive capital letters. We randomly select a training set (70%) and a testing set (30%) from the dataset. To validate the prediction results of the proposed method, k-fold cross-validation is used. K-fold cross-validation is commonly used to reduce the error resulting from random sampling when comparing the accuracies of different prediction models [11]. The entire dataset is randomly divided into k folds with approximately the same number of instances in each fold. Training and testing are performed k times; each time, one fold is held out for testing while the remaining folds are used for training. The present study divided the data into 10 folds, so that in each round of 10-fold cross-validation one fold was used for testing and nine folds were used for training.
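The 70/30 split and the 10-fold partition can be sketched on instance indices alone. This is an illustrative sketch: the seed and the way folds are formed are assumptions, and scikit-learn's `train_test_split` and `KFold` provide equivalent functionality.

```python
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and split them into training and testing parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 4601 is the number of Spambase instances stated in the text.
train_idx, test_idx = train_test_split_indices(4601)
folds = list(k_fold_indices(4601, k=10))
```

With 4,601 instances the 30% test part holds 1,380 instances, which matches the size given in the caption of Table 1.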
4.2 Performance Metrics
This paper examines the effect of PSO-based feature selection on decision tree, SVM, and Naive Bayes classification for the UCI Spambase dataset. Accuracy, precision, and recall are used to evaluate and compare the models [11]. Several metrics exist for evaluating ML algorithms:

Accuracy: Accuracy measures the correctly classified instances out of all instances, defined as the ratio of correct predictions to the total number of predictions [11]. It is appropriate for a dataset with balanced target classes and equal class importance.

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3} \]

Precision: Precision measures how accurate the classifier's positive predictions are, as the number of true positives out of all instances that the classifier labeled positive. Precision is the fraction of positive predictions that are correct.

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{4} \]

Recall: Recall is the proportion of positive samples that are correctly predicted to be positive, that is, the number of correctly predicted positives relative to the total number of actual positives.

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{5} \]

where
• True negative (TN) = number of negative samples correctly predicted.
• True positive (TP) = number of positive samples correctly predicted.
• False negative (FN) = number of positive samples wrongly predicted as negative.
• False positive (FP) = number of negative samples wrongly predicted as positive.
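Equations (3) to (5) translate directly into code. The sketch below applies them to the decision tree confusion matrix with all features from Table 1, treating spam as the positive class (an assumption about which class is positive).

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall as defined in Eqs. (3)-(5)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall


# Decision tree with all features (Table 1), spam taken as the positive class:
# 489 spam correctly flagged, 797 non-spam correctly passed,
# 43 non-spam flagged as spam, 51 spam missed.
accuracy, precision, recall = classification_metrics(tp=489, tn=797, fp=43, fn=51)
```

Precision and recall computed for the spam class alone need not coincide with the figures in Table 2, which may have been averaged over both classes; the text does not say which convention was used.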
4.3 Results and Discussion
We randomly select a training set (70%) and a testing set (30%) from the Spambase dataset. The experiment was organized in two main stages. In the first stage, the decision tree, SVM, and Naive Bayes learning algorithms are trained on the original set of features. In the second stage, we apply the PSO algorithm to obtain an adequate subset of features; here, 26 features were selected for classification. The results obtained for decision tree, SVM, and Naive Bayes without and with feature selection are presented as confusion matrices in Table 1. The values used to measure the performance of the methods (accuracy, precision, and recall) are derived from the confusion matrices and shown in Table 2, and the same values are plotted in Fig. 1.

Table 1 Confusion matrix of email spam dataset (1380)

                               All features             Selected features
                               Predicted class          Predicted class
Classifier      Actual class   Non-spam    Spam         Non-spam    Spam
Decision Tree   Non-spam       797         43           808         32
                Spam           51          489          47          493
SVM             Non-spam       804         36           813         27
                Spam           52          488          31          509
Naive Bayes     Non-spam       805         35           794         46
                Spam           82          458          55          485

Table 2 Performance of algorithms (%)

                All features                      PSO-based selected features
Algorithm       Accuracy   Precision   Recall     Accuracy   Precision   Recall
Decision Tree   93.18      93.2        93.2       94.27      94.3        94.3
SVM             93.62      93.6        93.6       95.97      95.8        95.8
Naive Bayes     91.52      91.6        91.5       92.68      93.5        94.4

Fig. 1 Performance of three ML algorithms with PSO (bar chart of accuracy, precision, and recall for Decision Tree, SVM, and Naïve Bayes with all features and with PSO-based selected features; vertical axis from 89 to 97)

From Fig. 1, the performance of the decision tree without PSO in terms of accuracy, precision, and recall is 93.18%, 93.2%, and 93.2%, while the performance of the decision tree with PSO feature selection is 94.27%, 94.3%, and 94.3%. The accuracy of the decision tree model with all features, before PSO feature selection, is 93.18%, and the accuracy after PSO feature selection is 94.27%, an improvement of about 1.09 percentage points. For SVM, accuracy, precision, and recall without PSO are 93.62%, 93.6%, and 93.6%, respectively, and with PSO feature selection they are 95.97%, 95.8%, and 95.8%, respectively. The accuracy of SVM thus rises from 93.62% to 95.97%, an improvement of about 2.35 percentage points. For Naive Bayes, accuracy, precision, and recall without PSO are 91.52%, 91.6%, and 91.5%, respectively, and with PSO feature selection they are 92.68%, 93.5%, and 93.5%. The accuracy of Naive Bayes rises from 91.52% to 92.68%, an improvement of about 1.16 percentage points. In our experiments, all three algorithms (decision tree, SVM, and Naive Bayes) achieve higher accuracy with PSO than without it. The improvement in accuracy indicates that the proposed model performs well after the relevant features are selected.
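This section does not give the PSO configuration used for feature selection. As a rough illustration only, the sketch below implements a common binary PSO variant with a sigmoid transfer function; the parameters `w`, `c1`, `c2`, the swarm size, and the toy fitness function are all assumptions. In the setting described here, the fitness would instead be a classifier's validation accuracy on the selected columns.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=10, n_iterations=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (binary PSO sketch)."""
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_fit = [fitness(p) for p in positions]
    best = max(range(n_particles), key=lambda i: personal_fit[i])
    global_best, global_fit = personal_best[best][:], personal_fit[best]
    initial_fit = global_fit

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia plus pulls toward the personal and global best.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                # The sigmoid turns the velocity into the probability of selecting feature d.
                prob = 1.0 / (1.0 + math.exp(-velocities[i][d]))
                positions[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(positions[i])
            if fit > personal_fit[i]:
                personal_best[i], personal_fit[i] = positions[i][:], fit
                if fit > global_fit:
                    global_best, global_fit = positions[i][:], fit
    return global_best, global_fit, initial_fit


# Hypothetical fitness: features 0-2 are "useful", and every selected
# feature costs a small penalty so that smaller masks are preferred.
USEFUL = {0, 1, 2}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in USEFUL)
    return hits - 0.1 * sum(mask)


mask, score, start_score = binary_pso(toy_fitness, n_features=8)
```

The returned mask marks the selected features; for Spambase the mask would have 57 entries, and the study reports that 26 features were retained.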
These results provide new insight: combining a classification algorithm with a feature reduction method that keeps the significant features improves the accuracy of the system and helps to identify the features that may contribute to this improvement. When the accuracies of the three algorithms are compared, SVM with PSO obtains the highest accuracy (95.97%), compared with decision tree (94.27%) and Naive Bayes (92.68%) with PSO feature selection.
5 Conclusion
Email spam has become one of the most demanding research topics because of the growth of digital technologies and the number of spammers. In this work, PSO is used as the feature selection method. Experiments were performed on the Spambase dataset with accuracy, precision, and recall as evaluation measures. The results show that the proposed PSO-based feature selection combined with the three ML algorithms outperforms the individual ML algorithms. The SVM with PSO achieved higher accuracy than the decision tree and Naive Bayes classifiers with PSO. In future work, these ML approaches can also be combined with other swarm-based optimization methods such as ant colony optimization, honey bee optimization, and firefly optimization.
References
1. W.A. Awad, S.M. Elseuofi, Machine learning methods for e-mail classification. Int. J. Comput. Appl. 16(1) (2011)
2. E. Blanzieri, A. Bryl, A survey of learning-based techniques of e-mail spam filtering. Artif. Intell. Rev. 29(1), 63–92 (2008)
3. P. Cunningham, N. Nowlan, S.J. Delany, M. Haahr, A case-based approach to spam filtering that can track concept drift, in The ICCBR, vol. 3 (2003)
4. O. Saad, A. Darwish, R. Faraj, A survey of machine learning techniques for Spam filtering. Int. J. Comput. Sci. Network Secur. (IJCSNS) 12(2), 66 (2012)
5. V. Bolón-Canedo, N. Sánchez-Maroño, A. Alonso-Betanzos, Feature Selection for High-Dimensional Data (Springer, Berlin, 2016)
6. J. Yan, B. Zhang, N. Liu, S. Yan, Z. Chen, Effective and efficient dimensionality reduction for large scale and streaming data processing. IEEE Trans. Knowl. Data Eng. 18(3), 320–333 (2006)
7. G. Ravi Kumar, V.S. Kongara, G.A. Ramachandra, An efficient ensemble based classification techniques for medical diagnosis. Int. J. Latest Technol. Eng. Manage. Appl. Sci. II(VIII), 5–9 (2013). ISSN 2278-2540
8. Y. Pang, Y. Yuan, X. Li, Effective feature extraction in high dimensional space. IEEE Trans. Syst. (2008)
9. J. Kennedy, R. Eberhart, Particle swarm optimization, in Proceeding of International Conference on Neural Networks (IEEE, Piscataway, Perth, Australia, 1995), pp. 1942–1948
10. J. Kennedy et al., Swarm Intelligence (Morgan Kaufmann, 2001)
11. J. Han, M. Kamber, Data mining concepts and techniques, in the Morgan Kaufmann series in Data Management Systems, 2nd edn. (Morgan Kaufmann, San Mateo, CA, 2006)
12. C. Cortes, V. Vapnik, Support vector networks. Mach. Learn. 20, 273–297 (1995)
13. Q. Wang, Y. Guan, X. Wang, SVM-based spam filter with active and online learning, in TREC (2006)
14. H. Drucker, D. Wu, V.N. Vapnik, Support vector machines for spam categorization. IEEE Trans. Neural Networks 10(5), 1048–1054 (1999)
15. http://archive.ics.uci.edu/ml/dataset/Spambase/
Malware Prediction Analysis Using AI Techniques with the Effective Preprocessing and Dimensionality Reduction
S. Harini, Aswathy Ravikumar, and Nailesh Keshwani
Abstract Malware attacks, which hackers use to infect computers, are now very common, and they are growing in both volume and complexity. Despite extensive efforts to contain them, they continue to increase at an exponential rate. Malware attacks are common on web servers, which are the main platform for most e-commerce sites and industries. New malware appears every day, and a new class of malware with advanced evasion capabilities, which is difficult to detect, has recently emerged. This paper analyzes machine learning algorithms for malware detection: Naïve Bayes, Decision Tree, K-Nearest Neighbor (KNN), Random Forest, NN, ANN and DNN. The analysis applies each technique to the raw dataset as well as to the dataset after imputation and encoding. The classifiers compare the parameters of the testing dataset with those of the training dataset. Two boosting techniques, AdaBoost and XGBoost, are also applied to improve the algorithms.
Keywords Malware attacks · Machine learning · Supervised learning · Microsoft malware detection · Security · Boosting
S. Harini · A. Ravikumar (B)
School of Computer Science and Engineering, VIT, Chennai 600127, India
e-mail: [email protected]
S. Harini
e-mail: [email protected]
N. Keshwani
School of Civil Engineering, VIT, Chennai 600127, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_12
1 Introduction
Artificial intelligence (AI) is widely applied in many domains for effective analysis and understanding of processes. AI helps to automate a process, in many cases improves on the performance of human analysis, and helps to give a clear understanding of the logic behind the process. AI can be applied to many fields, such as medicine, cyber security, and e-commerce. For example, in medical diagnosis, AI helps to identify the influence of certain parameters on a disease and to make a better diagnosis, acting as a helping hand for doctors [1]. Another important domain where AI is widely applied is cyber security, in applications such as malware detection, spam filtering, and intrusion detection and prevention [2]. AI helps experts to understand and identify malware and makes it possible to discover complicated models. Malware is malicious code or software used to infect a system; it includes ransomware, viruses, and spyware. It is unwanted software that performs operations on the system and is used by hackers to infect, damage, or gain unauthorized access to someone's computer. Malware mainly spreads over the Internet through files such as executables and documents. McAfee reported around 200 million distinct malware samples, growing at a rate of 100,000 per day [3]. This rapid growth has made it necessary to find ways to contain malware and withstand its attacks. Malware analysis is gaining attention as a result; it deals with understanding how malware operates, finding similarities among samples, how to deal with them, and how they adapt to the environment and over time. Malware analysis is a tedious process that requires a lot of effort, and AI has helped considerably to automate it.
It helps to learn the evolution of malware and to predict its future behavior effectively. Cyber defenders and malware analysts could use these AI-enabled malware analysis systems to explore the malware ecosystem, identify new and significant threats, or develop new defenses. According to some reports, malware spreads widely through files or links sent in emails from suspicious accounts. Malware may also enter a computer while the user browses websites that contain malicious code in their scripts. Many sites contain a download button or icon that downloads malicious files to the computer. Nowadays, even document files are not always safe to use, since various methods are used to hide malicious code in documents. Data is one of a person's most important assets. If a malware attack infects or damages the data, it causes a huge loss for the user. Hackers can copy data from the user's computer to their own and use it for wrongful purposes. Malware helps hackers to create a gateway or back door to fetch data from computers or servers. Malware has become one of the greatest threats to the data industry in recent years. According to AV-Test, an independent IT-security organization, the amount of malware increases at an exceptional rate every year despite the use of malware detection techniques. Statistics reported by antivirus companies likewise show that the malware threat has increased over the past few years.
According to antivirus company reports, script-based infectors such as macro malware are making a comeback, and script-based attacks, which can be harmful, are increasing. PHP, JavaScript, and PowerShell are some of the scripting languages commonly used to write malicious code. Every piece of malicious code or software contains a characteristic sequence, which antivirus software uses to detect the malware. However, hackers keep adopting new techniques to spread malware, so it is very important to detect these new techniques. Typically, such code is written using obfuscation, which means hiding the code. Obfuscation is one of the most effective techniques for bypassing antivirus software, because the antivirus is unable to understand the obfuscated code. Every malware file has two stages: the idle stage and the running stage. In the idle stage, the file has been downloaded to the computer but has not been opened or executed. In the running stage, the user opens or executes the file, and the code hidden in the file executes and starts its work without the user's knowledge. To avoid malware, do not download content from untrusted websites, since browsing such websites can bring malware into the system; install good antivirus software; and avoid pirated software, which also exposes computer systems to malware attacks. In this paper, the Naïve Bayes, K-Nearest Neighbor, ANN, DNN, Random Forest, and Decision Tree algorithms are used. After all these algorithms are run, the accuracy of each is compared. To improve the accuracy of the Decision Tree, the Adaptive Boosting (AdaBoost) technique is also applied. The machine learning algorithms are used to analyze features or properties that are treated as occurring independently of one another.
2 Motivation
The current demand for malware analysis and the lack of tools for it have created the need for AI algorithms in malware analysis. Malware analysis takes place in two steps. In the first step, malware is filtered using defined guidelines such as recurrence, and samples are analyzed against previously available data to detect similarities. If a sample is found to be new during this phase, it moves to the second phase, in which in-depth analysis is performed with strict filtering policies [4]. This process fails to identify newly created or modified malware, and in those cases human analysis is needed. But human analysis becomes difficult as malware grows in both velocity and volume. More advanced classification and clustering algorithms are required to analyze malware and identify similarities among samples. Clustering algorithms are widely used for malware analysis, including locality-sensitive hashing [5], hierarchical clustering using classification and prototyping [6], and other approaches with incremental hierarchical logic [7, 8]. To identify new malware, the knowledge acquired from previous data can be used as a starting point for analysis without manual effort. AI can help to build a deeper understanding of malware, including its functionality, its main purpose, and the intentions of its creator. At present all these tasks are performed by human analysts, mainly through reverse engineering, which is time consuming and difficult and requires experts capable of identifying interrelated behaviors. It is important to understand the evolution cycle of malware in order to understand its effects and predict its future actions. It is difficult for a human analyst to compare malware samples manually and derive patterns when they are numerous and complex [9]. Work has been done to automate the learning process and identify links among malware groups [10], and AI provides a breakthrough here by saving time and greatly improving efficiency. Automated analysis with the help of AI helps to find both the evolution of malware and the cross-family dependencies among malware samples. The analysis should reveal the impact of the threat posed by the malware, its mode of attack, the intention of its creator, its evolution cycle, and its interdependencies. Automated analysis takes all these parameters into consideration and aims to provide methods for detecting as many malware variants as possible.
3 Literature Survey
In one study, machine learning algorithms such as decision trees are used to distinguish malware files from clean files [11]. The authors try to reduce the false positive rate. All aspects are compared using the properties of a file, and a set of features is computed for every binary file in the dataset [12]. In another work, machine learning algorithms are used to find unknown malware attacks, and a novel feature is proposed for detecting them. According to that work, signature-based detection techniques fail to detect unknown and zero-day malware attacks [13]. Research shows that, as the number of malware samples increases, signature-based detection becomes difficult. In one study, ML algorithms such as Decision Tree, Random Forest, and KNN are used; feature selection algorithms improve the results, and the Decision Tree gives high accuracy [14]. In another study, an automatic heuristic method is used to detect unknown computer viruses with the Decision Tree and Naïve Bayes algorithms. The Decision Tree achieves a result of 91.4% and Naïve Bayes 77.1%, so the Decision Tree is more effective than Naïve Bayes according to that study [15]. In another work, opcode frequency is used as the feature vector, and both supervised and unsupervised learning are applied for malware classification. Variance threshold outperforms deep auto-encoders; an accuracy of 99.7% is achieved with the Random Forest algorithm, and variance threshold improves performance by 1.26% [16]. In another study, the SNN clustering algorithm is used, and decision making is used for the classification of malware. The data processing modules handle grayscale images, import functions, and opcode n-grams. The reported accuracy is 86.9%, and 86.7% of new malware is detected successfully [17]. ML algorithms primarily learn from the metadata of executable files in the Windows PE32 file format; PE32 header data is successfully used to detect malware through ML algorithms, and the decision tree achieved an F-score of 0.97 [18]. In another work, malware classification is based on string information. Tree-based algorithms, nearest neighbor algorithms, and statistical algorithms are used, and AdaBoost boosting is applied to improve accuracy; as a result, 97% accuracy is achieved [19]. In another study, features such as n-grams of byte code are used to encode each training example. Malicious executables are detected according to their payload functions, such as mass mailing and opening in the background. Boosted J48 produced a detector with an area under the ROC curve of around 0.9 [20]. Another work proposes employing information theory: Windows API calls are used for dynamic analysis, and the TF-IDF weighting scheme is used to characterize the behavior of a malware family [21]. Current antivirus systems try to identify new malicious programs with hand-crafted heuristics, an approach that is expensive and often ineffective. One work presents a data mining framework that detects new, previously unseen malicious executables accurately and automatically. The framework automatically found patterns in the dataset and used these patterns to detect a set of new malicious binaries. Compared with a traditional signature-based method, this method markedly increases the detection rates for new malicious executables. The method is being implemented as a network mail filter, a network-level email filter that uses the algorithms to catch malicious executables before users receive them through their mail [22]. Another work states that there are many types of malicious attacks, such as worms, viruses, and trojans, that act through Application Programming Interfaces (APIs), so machine learning classifiers can detect malicious activities more effectively.
A multi-dimensional Naïve Bayes algorithm is used for this purpose. The data is first preprocessed, and features are then extracted using the mean and standard deviation. The data is classified with the Naïve Bayes technique, and the predicted values are used to calculate the True Positive Rate (TPR) and False Positive Rate (FPR). A model with six ML algorithms (KNN, SVM, Naïve Bayes, SGD, Bagging, and Decision Tree) was implemented to detect malware [23]; 558 Android files were analyzed, and KNN gave the best accuracy, 95%. An SVM-based Android malware detection method [24] used static analysis techniques to obtain information about the applications. A two-layer neural network model with 10 input neurons and 34 output neurons was applied to a dataset of 1700 samples; this work used the Apk tool to obtain the .xml file from the APK file and achieved an accuracy of 65% [25]. A Bayes classifier [26] was used to detect commands, API calls, permissions, encrypted code, etc.; this work used a dataset of 1600 samples and achieved an accuracy of 97.22%. The main goal of this paper is to increase the accuracy of malware detection using machine learning classifiers.
4 Methodology
The proposed method consists mainly of the following steps: data collection, data preprocessing, exploratory data analysis, application of machine learning algorithms, and evaluation and analysis of the algorithms. The methodology is shown in the figure.
4.1 Dataset
In data collection, the data is gathered according to the properties and functions required for training or testing. The dataset is a collection describing the functionality of the files. The dataset for this study is taken from the open data repository Kaggle, where the Microsoft Malware Prediction dataset from Microsoft is available [27]. Each row of the dataset corresponds to a machine, identified by a Machine Identifier. The ground truth, that is, whether malware was detected on the machine, is given by HasDetections. There are two files, train.csv for training and test.csv for testing; after training, the system should correctly predict the value of HasDetections for each machine in test.csv. The dataset was built by sampling under constraints on machine execution time and privacy. The malware dataset is a time series dataset to which new data is added dynamically. It is not simply a collection of Microsoft users; it has been sampled from various sources to make it diverse.
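A first look at such a file usually starts with the label distribution. The sketch below counts the values of the HasDetections column using only the standard library. The three rows are a hypothetical excerpt, and the exact spelling of the identifier column (`MachineIdentifier`) is an assumption; with the real data, the function would be called on an open handle to train.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row excerpt; the values are invented for illustration
# and the identifier column name is an assumed spelling.
sample_csv = """MachineIdentifier,HasDetections
m001,0
m002,1
m003,1
"""


def label_distribution(handle, label_column="HasDetections"):
    """Count how many machines fall under each value of the label column."""
    reader = csv.DictReader(handle)
    return Counter(row[label_column] for row in reader)


counts = label_distribution(io.StringIO(sample_csv))
```

Reading the file row by row keeps memory use low, which matters for a large training file.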
4.2 Data Preprocessing
Some of the major problems faced when processing large data with machine learning or neural network algorithms are listed below.
• Missing data
• Non-numerical data
• Cardinal numerical data that cannot be compared relatively.

Missing data is mainly handled by dropping, imputation, and extended imputation. In the dropping method, the columns with missing values are dropped. In imputation, the mean or median of the existing values is calculated and substituted for the missing values. Imputation is fast and very effective for numeric data, but it fails for categorical values, and the uncertainty in the data is lost. Extended imputation can be done with the KNN algorithm, which uses the similarity among features to predict the missing values: a KD tree is built from the available values, the nearest neighbors are found using the tree, and their average is taken. This method is more accurate than the mean/median method, but it involves a lot of computation. Other methods of extended imputation are linear interpolation, stochastic regression, hot deck, and deep learning. These methods are more accurate and can handle missing data for both categorical and numerical values.

Non-numerical data is processed using one-hot encoding, with attention to data leakage. One-hot encoding is the most widely used method for non-numerical data; it generates one column per category to indicate the occurrence of that item in the original data. Data leakage arises when the model uses data from beyond the training dataset; this lets the model appear to perform better because it learns information that would not be available in practice. Counting encoders substitute each categorical value with the number of its occurrences in the dataset.

Cardinal numerical data that cannot be compared relatively is handled using cross-validation and pipelines. Validation is the process of verifying the results of a hypothesis. In cross-validation, a model is trained on a subset of the data and evaluated on the remaining part. In k-fold cross-validation, the data is divided into k subsets, and the error is averaged over all k subsets, which makes the estimate more reliable. The main advantage is the reduction of bias and variance. A pipeline in machine learning automates the ML workflow: it chains the data processing steps and the model so that the output can be evaluated as a whole. An ML pipeline is iterative, in that the steps are executed repeatedly to improve the performance of the overall algorithm. Pipelines provide control over algorithm execution, improve flexibility, and support performance and scalability. They typically cover data collection, data cleaning, feature extraction, and final model validation and visualization.
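Mean imputation, one-hot encoding, and count encoding are simple enough to sketch without any library. The column names and values below are hypothetical and are not taken from the Microsoft dataset; scikit-learn's `SimpleImputer` and `OneHotEncoder` provide production-grade versions of the first two.

```python
from collections import Counter


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def one_hot(values):
    """Return the sorted categories and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def count_encode(values):
    """Replace each category with the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]


# Hypothetical columns for illustration only.
ram_gb = [4.0, None, 8.0, 12.0]
platform = ["windows10", "windows7", "windows10", "windows8"]

filled = impute_mean(ram_gb)
categories, encoded = one_hot(platform)
frequencies = count_encode(platform)
```

To avoid data leakage, the mean, the category list, and the counts would be computed on the training split only and then applied unchanged to the test split.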
5 Machine Learning

Machine learning [28] is the subfield of artificial intelligence in which a model learns from training data and can then be used for similar use cases after the training process. Machine learning is broadly classified into three categories: supervised, unsupervised and reinforcement learning [29, 30]. In supervised learning algorithms [31, 32], a labeled dataset is used for training, and the model is built using information from the training data. Unsupervised algorithms analyze the hidden patterns in the data and build the model from that analysis. Reinforcement learning is a feedback mechanism based on rewards and penalties received from the environment.

Decision Tree is one of the most popular machine learning techniques. It has a tree-like structure in which each internal node tests an attribute, and the nodes are arranged in levels like the branches of a tree. The objective is to build a model that predicts the value of a target variable from the input variables. Decision Tree works on the principle of recursive partitioning: the dataset is recursively split at each branch to obtain homogeneous groups. The main decision tree algorithm is C4.5, which was put forward by Quinlan [33]. The main advantages of Decision Tree are fast execution and the possibility of generating rules from the tree.

Naïve Bayes is mainly used for text-based analysis and classification with high-dimensional training and testing datasets. The algorithm treats the features as occurring independently of one another, and it uses Bayes' theorem, which is commonly used for probability calculation [34, 35].

KNN is a supervised machine learning algorithm. It assigns a sample to one of two or more categories based on its similarity to the training samples, and it is considered suitable for large and noisy datasets. When a new test case arrives, the algorithm predicts the output from the k samples in the training set that are most similar to the test case. The similarity among data points is measured with metrics such as the Manhattan (city block) distance and the Euclidean distance. The k in the algorithm denotes the number of similar data points that are compared. The value of k is crucial: as k increases, the effect of noise in the dataset is reduced, but the computation time of the algorithm also increases.

Random Forest is constructed from multiple decision trees built in a top-down manner, and the number of decision trees varies. Each node of a decision tree splits the training set according to the class labels, and this helps in the classification process. The term forest denotes the collection of trees, and the main principle used for training is bagging. Bagging is an ensemble approach for training models for better performance.
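Bagging is commonly realized by training each model on a bootstrap sample (drawn with replacement) and combining the models by majority vote. The following minimal Python sketch illustrates this idea under simplifying assumptions: one-level decision stumps stand in for full decision trees, and the function names and toy data are hypothetical rather than taken from this work.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Pick the (feature, threshold, orientation) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left_label in (0, 1):
                preds = [left_label if row[f] <= t else 1 - left_label for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    _, f, t, left_label = best
    return lambda row: left_label if row[f] <= t else 1 - left_label

def bagging_fit(X, y, n_models=15, seed=0):
    """Train each stump on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(m(row) for m in models)
    return votes.most_common(1)[0][0]

X = [[1], [2], [3], [7], [8], [9]]
y = [0, 0, 0, 1, 1, 1]
models = bagging_fit(X, y)
print(bagging_predict(models, [0]), bagging_predict(models, [10]))
```

Because every stump sees a slightly different sample, individual mistakes tend to be outvoted by the rest of the ensemble.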
Random Forests perform well on both classification and regression problems. Randomness is introduced into the model as multiple trees are grown and combined. At each split, the best feature is selected from a random subset of the features rather than the most important feature overall, and this introduces randomness into the model. The randomness can be further increased by using random thresholds. This algorithm reduces overfitting, and it is highly versatile.

The biological neurons in the human brain are the main inspiration behind artificial neural networks (ANN) [36]. Like the human brain, an ANN can analyze and decode hidden complex patterns, and this helps to find effective solutions to classification problems. An ANN consists mainly of input, hidden and output layers. The input layer is the point at which information is fed into the network, and the output layer provides the final output of the model. The layers in between are known as hidden layers, and they extract hidden features from the values obtained from the previous layers. Each layer can be considered a filter that removes unwanted information and keeps only the features most relevant to the model. The neurons are also known as perceptrons, and the entire network with all its layers is known as a multilayer perceptron. Traditional supervised algorithms, which are trained on labeled training data, fail to detect threats that occur at runtime; this can be addressed using the hidden layers of an ANN, which help to identify complex patterns in the data [37]. Activation functions can be used to normalize the entropy calculated from the hidden layers [38]. With the sigmoid activation function, the output lies between 0 and 1, and in most cases it is close to one of the boundaries, 0 or 1 [39]. An ANN can also act as an unsupervised method in which no external support is needed to train the model. Such a network is known as an Auto-Associative ANN (AAANN): the network organizes the data presented to it by itself and finds the patterns, and the feature vectors act as both the input and the output of the model [39].

A Deep Neural Network (DNN) is a neural network with many hidden layers. It has a wide range of applications, such as image processing, cyber security and self-driving cars. The performance of the network can be improved using regularization, optimization and dropout techniques [40], which help to avoid overfitting. For image-related applications, Convolutional Neural Networks (CNN) with convolutional filters, pooling layers and fully connected layers are used. Low-level and high-level features are extracted irrespective of their positions, and CNNs have been found to reduce the error rate in many applications [41]. For sequential data, Recurrent Neural Networks (RNN) are used, which have a loop structure and memory cells for storing previous states. RNNs face the vanishing gradient problem, which was overcome using long short-term memory (LSTM) [42]. RNNs are used in many fields, such as stock market prediction, natural language prediction and speech processing. RNNLM [43] is a model exclusively for language modeling, in which the most probable next word is predicted from the previous words. RNNs can be used for malware analysis [44], with logistic regression and MLP used to classify the feature vectors.

Boosting improves the prediction accuracy of an ML algorithm. It converts weak attributes into strong attributes. Not every attribute is individually strong enough to use for classification.
Boosting is a method in which a collection of weak models forms a strong model. It is an iterative process in which each model is built from the previous one to reduce its errors. AdaBoost (Adaptive Boosting) is one of the first boosting algorithms, and it is used to increase the accuracy of Decision Tree. AdaBoost focuses on the cases where the base learner fails; these base learners are the weak learners. AdaBoost [45] can be used with ML algorithms that accept weights on the training data. It takes weak learners and converts them into a strong learner.

XGBoost is an algorithm developed by the Distributed Machine Learning Community that is based on gradient boosted decision trees, and it helps to improve the performance and reduce the running time of the model [46]. The algorithm exploits both hardware and memory resources to improve performance, and it can perform gradient, stochastic and regularized boosting. It helps to remove the flaws in existing models and is very efficient in handling real-world problems because of its fast execution even on a single platform [47]. XGBoost supports parallel processing, runs on multi-core machines, is highly scalable, is flexible in that the objective function can be set as required, is portable to other platforms and can be used from different languages. XGBoost supports regularization to reduce overfitting. It has built-in cross-validation, and it handles missing values efficiently. The memory requirement can be reduced by using its Save and Reload option. All these features make XGBoost better than the existing models [48].

Light GBM is an algorithm that improves gradient boosting on decision trees for better performance and faster execution. The tree is split leaf-wise in Light GBM, which gives better results faster than the existing algorithms. However, the leaf-wise split may cause overfitting, which needs to be controlled by limiting the depth of the split. The main advantages of Light GBM are its high speed and efficiency, which come from the histogram-based algorithm it uses, its lower memory requirement, its good performance on big datasets and its support for parallel processing.
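The way AdaBoost concentrates on the cases where the base learner fails can be made concrete with one reweighting step of the standard algorithm. The sketch below is illustrative only: labels are assumed to be coded as −1/+1, and the function name and toy values are hypothetical.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost reweighting step for labels in {-1, +1}.

    Returns the vote (alpha) of the current weak learner and the
    normalized sample weights for the next weak learner."""
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)      # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)    # larger when the learner is more accurate
    new_w = [w * math.exp(-alpha * t * p)      # misclassified samples (t * p = -1) grow
             for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]

weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, 1, -1, 1]                         # the last sample is misclassified
alpha, weights = adaboost_round(weights, y_true, y_pred)
print(alpha, weights)
```

After the update, the misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly.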
6 Result Analysis

This section analyzes in detail the performance, measured by accuracy, of the machine learning methods for malware prediction on the Microsoft Malware Prediction dataset. For a better understanding of efficiency, the receiver operating characteristic (ROC) curve of the classifier is analyzed. The ROC curve helps to compare classifiers by the area under the curve (AUC). AUC is the most commonly used metric for selecting classifiers; it is the probability that a randomly chosen positive sample is ranked higher than a randomly chosen negative sample.

The methods are implemented in the Python 3 environment in Kaggle, which has many analytics libraries installed and is defined by the Kaggle/Python docker image: https://github.com/kaggle/docker-python. They are applied to the Microsoft Malware Prediction dataset from Microsoft, available in Kaggle [27].

The performance of the machine learning algorithms is calculated from the confusion matrix generated after the model is trained and tested. The accuracy is the number of correct predictions made during testing divided by the total number of predictions. Other performance measures, such as recall, precision and F-measure, are also available. Classification accuracy, the proportion of correct predictions over the total number of samples, is another metric used for selecting a classifier.

Cross-validation is used to estimate the skill of the machine learning algorithms. It helps to analyze a model and to select the model with lower bias than the other algorithms. When data is limited, it can be used as a resampling method. We have implemented the k-fold cross-validation method, in which the data is divided randomly into k samples and the model is trained k times on the data. The performance is measured each time on the testing data, and the average is taken as the overall performance of the model. In the proposed method, we use fivefold cross-validation, and the accuracy of the ML algorithms is shown in Table 1.
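The metrics named in this section follow directly from the confusion matrix counts and from the ranking interpretation of AUC. A minimal Python sketch is given below; the function names and the toy labels and scores are hypothetical and serve only to illustrate the definitions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F-measure from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as one half), computed over all positive/negative pairs."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

m = classification_metrics(tp=3, fp=1, tn=4, fn=2)
print(m)

y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(auc(y_true, scores))
```

The pairwise form of AUC is quadratic in the number of samples; it is used here because it mirrors the definition, not because it is efficient.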
Table 1 ML accuracy

Algorithm                                      Accuracy (%)
Decision Tree                                  50.34
Random Forest (dropping missing columns)       50.85
Random Forest (imputation)                     50.85
Random Forest (an extension to imputation)     50.85
Decision Tree with one-hot encoding            52.12
Random Forest with one-hot encoding            53.21
Random Forest with pipelines                   54.79
Decision Tree with pipelines                   55.91
Random Forest with cross-validation            51.10
XGBoost                                        52.10
KNN                                            52.48
NN (non-categorical)                           50.02
NN (categorical)                               50.63
ANN                                            52.59
DNN                                            69.42
ANN with counting encoding                     50.52
Naïve Bayes                                    52.52
Decision Tree with AdaBoosting                 64.90
6.1 Excluding Categorical Value

The categorical values are excluded from the dataset, and the Decision Tree and Random Forest are applied to the remaining data for malware detection. The numeric columns of the dataset are identified and stored as the input features x_numerals, and the target y holds the detections. The Random Forest was implemented using three approaches: dropping columns with missing values, imputation and extended imputation. The accuracy and Mean Absolute Error (MAE) are analyzed for each approach and shown in Fig. 1.
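As an illustration of this step, the sketch below keeps only the numeric columns of a few toy records and computes the MAE of a set of predictions. The column names and values are hypothetical and are not taken from the dataset; only the name x_numerals follows the text. For 0/1 labels and 0/1 predictions, the MAE equals the fraction of wrong predictions, that is, 1 − accuracy.

```python
def numeric_columns(records, target):
    """Names of the columns whose values are all numeric, excluding the target."""
    names = [c for c in records[0] if c != target]
    return [c for c in names
            if all(isinstance(r[c], (int, float)) for r in records)]

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy records: one numeric column, one categorical column, one target.
records = [
    {"num_feature": 1.0, "cat_feature": "a", "detections": 1},
    {"num_feature": 2.5, "cat_feature": "b", "detections": 0},
    {"num_feature": 0.5, "cat_feature": "a", "detections": 1},
    {"num_feature": 3.0, "cat_feature": "c", "detections": 1},
]
cols = numeric_columns(records, target="detections")
x_numerals = [[r[c] for c in cols] for r in records]
y = [r["detections"] for r in records]

y_pred = [1, 1, 1, 1]                      # a model that always predicts a detection
mae = mean_absolute_error(y, y_pred)
accuracy = sum(t == p for t, p in zip(y, y_pred)) / len(y)
print(cols, mae, accuracy)
```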
6.2 Including Categorical Value

The categorical values of the dataset are included using one-hot encoding, and the Decision Tree and Random Forest are applied for malware detection. The accuracy and MAE are shown in Fig. 2.
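One-hot encoding can be sketched in a few lines of Python. This is a minimal illustration with a hypothetical function name and toy category values, not the encoder used in the experiments.

```python
def one_hot_encode(values):
    """Return the sorted category list and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

categories, rows = one_hot_encode(["win10", "win7", "win10"])
print(categories)
print(rows)
```

Each original value becomes a row with a single 1 in the column of its category, so the number of new columns equals the number of distinct categories.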
Fig. 1 a Excluding categorical values accuracy. b Excluding categorical values MAE
6.3 Pipeline and Cross-Validation

Cardinal numerical data that cannot be compared relatively are handled using the pipeline and cross-validation techniques, which were implemented with the Decision Tree and Random Forest for malware detection. The accuracy and MAE for the pipeline and cross-validation approaches are shown in Fig. 3 and Fig. 4.
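The combination of a pipeline with k-fold cross-validation can be sketched as follows. The sketch is illustrative only: the pipeline consists of a mean-imputation step followed by a simple threshold classifier that stands in for the Decision Tree and Random Forest, and all names and toy values are hypothetical. The preprocessing statistics are learned on the training folds only, which avoids leaking information from the held-out fold.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the sample indices and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_pipeline(X_train, y_train):
    """Step 1: learn the imputation value. Step 2: learn a threshold halfway
    between the class means. Assumes both classes occur in the training folds
    and that class 1 has the larger values."""
    observed = [x for x in X_train if x is not None]
    fill = sum(observed) / len(observed)
    filled = [fill if x is None else x for x in X_train]
    class0 = [x for x, label in zip(filled, y_train) if label == 0]
    class1 = [x for x, label in zip(filled, y_train) if label == 1]
    threshold = (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2
    return {"fill": fill, "threshold": threshold}

def predict_pipeline(model, x):
    x = model["fill"] if x is None else x
    return 1 if x > model["threshold"] else 0

def cross_validate(X, y, k=5):
    """Average accuracy over k folds; each fold is held out exactly once."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit_pipeline([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(predict_pipeline(model, X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

X = [1.0, 2.0, 3.0, 4.0, 5.0, 11.0, 12.0, 13.0, 14.0, 15.0]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(cross_validate(X, y, k=5))
```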
6.4 Neural Networks

Neural networks were implemented for the categorical and non-categorical data as a neural network (NN), an artificial neural network (ANN) and a deep neural network (DNN), and their accuracy was compared. The neural network has three layers with ReLU and sigmoid activation functions, and the loss is calculated using the binary cross-entropy function for both categorical and non-categorical data. The neural network was trained on 10,578 samples and validated on 2267 samples for 100 epochs. The DNN was implemented with the same cross-entropy loss and the Adam optimizer for 100 epochs, and the accuracy is shown in Fig. 4.
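The forward pass and loss of such a network can be sketched in plain Python. The sketch assumes that the three layers are an input layer, one hidden layer with ReLU and a single sigmoid output unit; the layer sizes and weights are hypothetical, and training (for example with the Adam optimizer) is not shown.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one weight row per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, params):
    """Input -> hidden layer (ReLU) -> output unit (sigmoid)."""
    hidden = dense(x, params["W1"], params["b1"], relu)
    return dense(hidden, params["W2"], params["b2"], sigmoid)[0]

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean of -(y log p + (1 - y) log(1 - p)), with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical weights for two inputs, two hidden neurons and one output.
params = {
    "W1": [[0.5, -0.2], [0.1, 0.4]], "b1": [0.0, 0.1],
    "W2": [[1.0, -1.0]], "b2": [0.0],
}
prob = forward([1.0, 2.0], params)
print(prob, binary_cross_entropy([1], [prob]))
```

The sigmoid output can be read as the predicted probability of a detection, and the binary cross-entropy penalizes confident wrong predictions most heavily.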
Fig. 2 a Including categorical values MAE. b Including categorical values accuracy
6.5 Counting Encoder

The counting encoder was implemented together with ANN and gave 2291 true positives and 2243 false positives, with zero true negatives and zero false negatives, which corresponds to an accuracy of 50.5%. Counting encoders alone gave an accuracy of 100%. From the study, it was found that counting encoders provide the best accuracy for malware analysis.
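A counting encoder and the accuracy implied by the reported confusion matrix can be sketched as follows. The function name and the toy category values are hypothetical; the four counts are the ones reported above. Since both true negatives and false negatives are zero, every sample was predicted as positive, so the accuracy equals the share of positive samples.

```python
from collections import Counter

def count_encode(values):
    """Replace each categorical value with its number of occurrences."""
    counts = Counter(values)
    return [counts[v] for v in values]

encoded = count_encode(["a", "b", "a", "c", "a"])
print(encoded)

# Accuracy from the confusion matrix reported for ANN with counting encoding.
tp, fp, tn, fn = 2291, 2243, 0, 0
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(round(100 * accuracy, 1))
```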
7 Conclusion and Future Scope

This paper shows how malware can be detected using machine learning algorithms. Various properties are given to the algorithms so that they can make predictions on the data. In this paper, Naïve Bayes, K-Nearest Neighbors, Random Forest, Neural Networks and Decision Tree are used. To improve the algorithms, the boosting technique XGBoost and the Light GBM model with data leakage were implemented. The counting encoder technique was found to give the best accuracy. We have identified that the counting encoder technique deals effectively with data irregularities and gives better performance in malware prediction. In the future, we plan to do more real-time malware analysis, since malware is growing at an exponential rate. To prevent damage to data files, malware has to be removed through timely detection using machine learning techniques.

Fig. 3 a Pipeline and cross-validation accuracy. b Pipeline and cross-validation MAE
Fig. 4 NN accuracy
Fig. 5 Accuracy of ML algorithm
References

1. P. Szolovits, R.S. Patil, W.B. Schwartz, Artificial intelligence in medical diagnosis. Ann. Intern. Med. 108(1), 80–87 (1988)
2. E. Tyugu, Artificial intelligence in cyber defense, in 2011 3rd International Conference on Cyber Conflict (ICCC) (IEEE, 2011), pp. 1–11
3. B. Cruz, P. Greve, B. Kay, H. Li, D. McLean, F. Paget, C. Schmugar, R. Simon, D. Sommer, B. Sun, J. Walter, A. Wosotowsky, C. Xu, McAfee Labs threats report: fourth quarter 2013. McAfee Labs
4. J. Jang, D. Brumley, S. Venkataraman, BitShred: feature hashing malware for scalable triage and semantic analysis, in Proceedings of the 18th ACM Conference on Computer and Communications Security, CCS ’11 (ACM, New York, NY, USA, 2011), pp. 309–320. https://doi.org/10.1145/2046707.2046742
5. U. Bayer, P.M. Comparetti, C. Hlauschek, C. Kruegel, E. Kirda, Scalable, behavior-based malware clustering, in NDSS, vol. 9, Citeseer, pp. 8–11 (2009)
6. K. Rieck, P. Trinius, C. Willems, T. Holz, Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19(4), 639–668 (2011)
7. R. Perdisci, D. Ariu, G. Giacinto, Scalable fine-grained behavioral clustering of http-based malware. Comput. Netw. 57(2), 487–500 (2013)
8. N. Sahoo, J. Callan, R. Krishnan, G. Duncan, R. Padman, Incremental hierarchical clustering of text documents, in Proceedings of the 15th ACM International Conference on Information and Knowledge Management (ACM, 2006), pp. 357–366
9. T. Dumitras, I. Neamtiu, Experimental challenges in cyber security: a story of provenance and lineage for malware. Cyber Security Experimentation and Test
10. M.E. Karim, A. Walenstein, A. Lakhotia, L. Parida, Malware phylogeny generation using permutations of code. J. Comput. Virol. 1(1–2), 13–23 (2005)
11. D. Gavrilut, et al., Malware detection using machine learning, in 2009 International Multiconference on Computer Science and Information Technology, pp. 735–741 (2009)
12. Malware Analysis, in 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), pp. 1–5 (2019)
13. K. Sethi, et al., A novel machine learning based malware detection and classification framework, in 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security), pp. 1–4 (2019)
14. J.-H. Wang, et al., Virus detection using data mining techniques, in IEEE 37th Annual 2003 International Carnahan Conference on Security Technology, 2003. Proceedings, pp. 71–76 (2003)
15. H.S. Rathore, et al., Malware Detection Using Machine Learning and Deep Learning. BDA (2018)
16. L. Liu, et al., Automatic malware classification and new malware detection using machine learning. Front. Inf. Technol. Electron. Eng. 18, 1336–1347 (2017)
17. Z. Markel, M. Bilzor, Building a machine learning classifier for malware detection, in 2014 Second Workshop on Anti-malware Testing Research (WATeR), pp. 1–4 (2014)
18. R. Tian, et al., An automated classification system based on the strings of trojan and virus families, in 2009 4th International Conference on Malicious and Unwanted Software (MALWARE), pp. 23–30 (2009)
19. J.Z. Kolter, M.A. Maloof, Learning to detect and classify malicious executables in the wild. J. Mach. Learn. Res. 7, 2721–2744 (2006)
20. J.Y.-C. Cheng, et al., An information retrieval approach for malware classification based on Windows API calls, in 2013 International Conference on Machine Learning and Cybernetics, vol. 04, pp. 1678–1683 (2013)
21. S. Choi, et al., Malware detection using malware image and deep learning, in 2017 International Conference on Information and Communication Technology Convergence (ICTC), pp. 1193–1195 (2017)
22. M.A. Jerlin, K. Marimuthu, A new malware detection system using machine learning techniques for API call sequences (2018)
23. C. Urcuqui, A. Navarro, Machine learning classifiers for android malware analysis, in Proceedings of the 2016 IEEE Colombian Conference on Communications and Computing (COLCOM) (IEEE, 2016), pp. 1–6
24. J. Sahs, L. Khan, A machine learning approach to android malware detection, in Proceedings of the 2012 European Intelligence and Security Informatics Conference (EISIC) (IEEE, 2012), pp. 141–147
25. Scikit-Learn 0.16.1 Documentation. Scikit-learn: Machine learning in Python. Retrieved from the website: http://scikit-learn.org/stable/
26. M. Ghorbanzadeh, Y. Chen, Z. Ma, T.C. Clancy, R. McGwier, A neural network approach to category validation of android applications, in Proceedings of the 2013 International Conference on Computing, Networking and Communications (ICNC) (IEEE, 2013), pp. 740–744
27. https://www.kaggle.com/c/microsoft-malware-prediction
28. P.K. Chan, R. Lippmann, Machine learning for computer security. J. Mach. Learn. Res. 6, 2669–2672 (2014)
29. Machine Learning: What it is and why it matters (2016)
30. N. Idika, P. Aditya Mathur, A Survey of Malware Detection Techniques (Department of Computer Science, West Lafayette, 2015), pp. 3–10
31. B.-Y. Zhang, J.-P. Yin, J.-B. Hao, D.-X. Zhang, S.-L. Wang, Using support vector machine to detect unknown computer viruses. Int. J. Comput. Intell. Res. 2(1) (2014)
32. T. Hastie, R. Tibshirani, J. Friedman, The elements of statistical learning. Data Mining Inference and Prediction (Springer, 2009)
33. J.R. Quinlan, C4.5: Programs for Machine Learning (Morgan Kaufmann, 1993)
34. D. Damopoulos, S.A. Menesidou, G. Kambourakis, M. Papadaki, N. Clarke, S. Gritzalis, Evaluation of anomaly-based IDS for mobile devices using machine learning classifiers
35. S.Y. Yerima, S. Sezer, G. McWilliams, I. Muttik, A new android malware detection approach using Bayesian classification, in Proceedings of the 2013 IEEE 27th International Conference on Advanced Information Networking and Applications (AINA), pp. 121–128 (2013)
36. W.S. Sarle, Neural Network FAQ, part 1 of 7: Introduction, periodic posting to the Usenet newsgroup comp.ai.neuralnets
37. L. Bochereau, P. Bourgine, Extraction of semantics features logical rules from a multi-layer neural network. Proc. Int. Joint Conf. Neural Networks 2, 579–583 (1990)
38. R. Kamimura, S. Nakanishi, Hidden information maximization for feature detection and rule discovery. Netw. Comput. Neural Syst. 6, 577–602 (1995)
39. Z. Boger, H. Guterman, Knowledge extraction from artificial neural networks models, in Proceedings of IEEE International Conference on Systems Man and Cybernetics, pp. 3030–3035 (1997)
40. G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Improving neural networks by preventing co-adaptation of feature detectors, CoRR, vol. abs/1207.0580 (2012). [Online]. Available: http://arxiv.org/abs/1207.0580
41. A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
42. F.A. Gers, J. Schmidhuber, F. Cummins, Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)
43. T. Mikolov, M. Karafiát, L. Burget, J. Cernocky, S. Khudanpur, Recurrent neural network based language model, in 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), vol. 2, pp. 1045–1048 (2010)
44. R. Pascanu, J.W. Stokes, H. Sanossian, M. Marinescu, A. Thomas, Malware classification with recurrent networks, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1916–1920 (2015)
45. M. Stamp, Boost your knowledge of adaboost (2017). https://www.cs.sjsu.edu/stamp/ML/files/ada.pdf
46. J. Brownlee, A gentle introduction to XGBoost for applied machine learning. Machine Learning Mastery
47. I. Reinstein, XGBoost a top machine learning method on Kaggle, Explained. Available online: http://www.kdnuggets.com/2017/10/xgboost-top-machine-learning-method-kaggle-explained.html
48. B. Tripathy, Rough sets on fuzzy approximation spaces and intuitionistic fuzzy approximation spaces, in Rough Set Theory: A True Landmark in Data Analysis (Springer, Berlin/Heidelberg, Germany, 2009), pp. 3–44
Statistical Test to Analyze Gene Microarray

M. C. S. Sreejitha, P. Sai Priyanka, S. Meghana, and Nalini Sampath
Abstract A microarray is a laboratory tool used to measure the expression of thousands of genes at the same time. Microarrays are tiny slides printed with many small spots in specific positions, each spot containing a known DNA gene. In this work, we apply feature selection methods such as information gain and correlation coefficient together with machine learning techniques such as support vector machines (SVM) and Naïve Bayes (NB) to classify carcinoma samples and healthy samples. To improve the accuracy, we develop a spiral method with two paths. In the first path, the whole data set is passed to information gain for feature extraction, and the extracted features are passed to mutual information to select the most significant features. In the second path, the whole data set is again passed to information gain for feature extraction, and the extracted features are passed to the correlation coefficient to select the most relevant features. The selected features are then passed to classification techniques such as support vector machines (SVM) and Naïve Bayes (NB).

Keywords Support vector machine (SVM) · Naive Bayes (NB) · Information gain · Mutual information · Correlation coefficient
M. C. S. Sreejitha (B) · P. Sai Priyanka · S. Meghana · N. Sampath, Department of Computer Science & Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_13

1 Introduction

Cancer is a dangerous disease in which cells grow abnormally, divide uncontrollably and have the ability to infiltrate and destroy body tissues. It can spread throughout the body in a short period of time. Cancer affects the body in many forms, such as breast, lung, liver, small intestine and cervical cancer, lymphoma and leukemia. By extracting the most significant gene expressions, we can predict whether a person is affected by cancer or not. These gene expressions can be extracted using machine learning (ML) algorithms such as Naïve Bayes (NB) and support vector machine (SVM) together with feature selection techniques such as mutual information (MI), correlation coefficient (CC) and information gain (IG). The data sets used are Alonds and Lymph, which have more than 2000 features each and indicate whether a person is suffering from cancer or not. A model is designed in R by implementing the techniques mentioned, which gives a framework for our project.

In this work, we used two cancer-related data sets: the Alonds data set, which consists of 2002 attributes, and the Lymph data set, which consists of 2078 attributes. Compared to a single feature selection method, accuracy increases when two feature selection methods are combined, that is, when the features selected by information gain followed by mutual information, or by information gain followed by correlation coefficient, are passed to the support vector machine (SVM) [1]. For the Alonds data set, the highest accuracy is obtained when the features passed to the SVM are extracted by information gain along with mutual information. For the Lymph data set, the highest accuracy is obtained when the features passed to the SVM [2] are extracted by information gain along with correlation coefficient. When the machine learning algorithms are applied to all genes (2002 attributes for the Alonds data set and 2078 attributes for the Lymph data set), the computational time increases. Using Naïve Bayes, we did not achieve the best results [1, 3].
2 Related Work

2.1 A Review on Feature Selection Techniques for Gene Expression Data (June 11, 2015)

The classification of gene expression data plays an important role in diagnosing and predicting disease. Gene expression data are derived from microarray technology and play a key role in cancer classification. Not all genes in the data contribute to the classification of the samples. Advances in DNA microarray technology, which yield large amounts of data with a large number of gene expression features, encourage the development of efficient and robust algorithms for gene selection in gene expression data classification. Interpreting gene expression data is an important area of bioinformatics research and remains a complex problem because of the high dimensionality and small sample size. Such problems pose a significant challenge to existing classification strategies. To overcome the limitations of existing methods, an effective feature selection algorithm is needed to classify gene expression data. The article presents an overview of existing feature selection strategies, data sets and performance metrics used to measure the performance of those algorithms.
2.2 A Comparative Study of Cancer Classification Methods Using Microarray Gene Expression Profile (Dec 15, 2013)

Microarray-based gene expression profiling has emerged as an effective technique for cancer diagnosis, as well as for prognostic and therapeutic purposes. The main purpose of microarray data classification is to find the best model for determining the category of unknown samples from the given microarray data. Microarray technology has recently become popular in many fields of science. Identifying the informative genes that are likely to cause cancer is of great significance, because it improves early cancer diagnosis and supports effective chemotherapy. The main challenge is to classify gene expression data that have a very large number of genes and a small number of samples, with many noisy or irrelevant genes and missing values in the microarray. Therefore, finding an effective cancer classification model is an important issue in the medical field. In this paper, the authors conducted a comparative study of effective binary classification methods applied to microarray gene expression profiles. The study concludes by identifying the classification method that achieves the highest accuracy with the smallest number of informative genes.
2.3 Feature Selection of Gene Expression Data for Cancer Classification: A Review (December 2015)

DNA microarray technology can measure the expression levels of thousands of genes simultaneously in a single experiment. Analysis of gene expression is essential in many areas of research to retrieve the required information. Over time, diseases in general, and cancer in particular, have become increasingly complex and sophisticated to detect, analyze and cure. Cancer is a deadly disease, and cancer research is one of the major areas of research in the medical field. Precisely predicting different tumor types is a great challenge, and accurate predictions are of great value in providing the best treatment to patients. To accomplish this, data processing algorithms are major tools: they are the most widely used approach for obtaining the key features of gene expression data and play a crucial role in gene classification. One of the fundamental difficulties is how to extract useful information from extremely large data sets. The paper presents recent advances in machine learning-based gene expression data analysis with a variety of feature selection algorithms.
2.4 Analysis of Microarray Experiments of Gene Expression Profiling (Jun 22, 2008)

The study examines gene expression profiling of tissues and cells as a tool for discovery in medicine. Microarray experiments have made it possible to describe changes in DNA and non-coding DNA in health. These changes are expected to lead to improved diagnosis and prognosis of diseases in gynecology and obstetrics. In addition, an unbiased and systematic study of gene expression profiling should allow a new taxonomy of disease to be established for obstetric and gynecologic syndromes. Thus, a new era is emerging in which reproductive processes and disorders may be characterized using molecular tools and fingerprinting. The paper refers to this as the study of gene expression profiling of cells.
3 Proposed Work
The data set is partitioned into two sets: a training data set and a testing data set. First, the training set is supplied to a first-stage feature selection technique, information gain, which measures the relationship between each gene expression and the class label. The feature scores (probabilities) are obtained from this step, and features are extracted by a threshold condition, i.e., the average of the minimum and maximum values obtained with respect to the gene expression. The features extracted by the first stage are then sent to a second-stage feature selection algorithm, either correlation coefficient or mutual information. Figure 1 shows the overall working of the system with its internal blocks. We have taken 80% for training and 20% for testing from both the data sets. The two pipelines are:
Information gain => correlation coefficient => support vector machine and Naïve Bayes.
Information gain => mutual information => support vector machine and Naïve Bayes.
Fig. 1 Workflow diagram
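The two-stage pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, scikit-learn's `mutual_info_classif` stands in for the information gain score, the absolute Pearson correlation with the class label is used for the second stage, and the "average of minimum and maximum" threshold is applied to the scores at both stages. The helper name `midpoint_mask` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def midpoint_mask(scores):
    """Keep features whose score is at least the average of the min and max score."""
    threshold = (scores.min() + scores.max()) / 2.0
    return scores >= threshold


# Synthetic stand-in for a gene expression matrix (assumption: not the paper's data).
X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Stage 1: information-gain-like score (mutual information with the class label).
ig_scores = mutual_info_classif(X_train, y_train, random_state=0)
stage1 = np.flatnonzero(midpoint_mask(ig_scores))

# Stage 2: absolute correlation coefficient between each kept feature and the label.
cc_scores = np.array([abs(np.corrcoef(X_train[:, j], y_train)[0, 1]) for j in stage1])
stage2 = stage1[midpoint_mask(cc_scores)]

# Classifier trained only on the features that survive both stages.
clf = SVC(kernel="linear").fit(X_train[:, stage2], y_train)
accuracy = clf.score(X_test[:, stage2], y_test)
print(len(stage1), len(stage2), accuracy)
```

Fitting the selection scores on the training split only, as done here, avoids leaking information from the test samples into the selected gene list.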
4 Feature Selection Algorithm
In machine learning and statistics, feature selection is also called variable selection, variable subset selection, or attribute selection. It is the process of choosing a subset of relevant features (predictors, variables) for use in model construction.
4.1 Support Vector Machine
The SVM is a supervised machine learning algorithm. It represents the examples as points in space, mapped so that the examples of the different categories are separated by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category depending on which side of the gap they fall. In addition to performing linear classification, SVMs can perform nonlinear classification using the so-called kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. SVMs work by obtaining a linear decision boundary in a high-dimensional feature space, which can represent a nonlinear class boundary in the original space through a nonlinear mapping of the input vectors $x_k$. In the new space, a maximum margin hyper-plane is derived from the training data. This maximum margin hyper-plane provides the maximum separation between the two classes. It is determined by the examples closest to it, with all other examples being irrelevant in defining the decision boundary. For the linearly separable case where there are two classes and the data are represented by three attributes $x_1$, $x_2$, $x_3$, there is no need to map to a higher-dimensional space, and the maximum margin hyper-plane has the following form:

$$y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 \tag{1}$$

where $y$ is the outcome and the $x_i$ are the attribute values. The four weights $w_i$ are obtained from the training data. The maximum margin hyper-plane can also be expressed in terms of the support vectors. SVM is based on the concept of decision planes that define decision boundaries.
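As an illustration of Eq. (1), the following sketch fits a linear SVM on three attributes and reads off the four weights. The data are synthetic and the variable names are illustrative assumptions; scikit-learn's `SVC` exposes the weights of a linear kernel through `coef_` and `intercept_`.

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated synthetic classes described by three attributes x1, x2, x3.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 3)),
               rng.normal(2.0, 0.5, size=(20, 3))])
labels = np.array([0] * 20 + [1] * 20)

svm = SVC(kernel="linear", C=1.0).fit(X, labels)

w0 = svm.intercept_[0]   # bias term w0 of Eq. (1)
w = svm.coef_[0]         # weights w1, w2, w3 of Eq. (1)

# Evaluate y = w0 + w1*x1 + w2*x2 + w3*x3 for a new sample.
x_new = np.array([1.5, 2.0, 1.0])
y_value = w0 + w @ x_new
predicted = int(y_value > 0)   # the sign of y decides the side of the hyper-plane

print(w0, w, y_value, predicted, len(svm.support_vectors_))
```

Only the support vectors (the training points closest to the hyper-plane) influence the fitted weights; the remaining points could be removed without changing the boundary.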
4.2 Naive Bayes
The NB classifier is a probabilistic algorithm based on Bayes' rule and the simplifying assumption that the feature values are conditionally independent given the class. Given a new observation, NB estimates the conditional probabilities of the classes using the joint probabilities of training sample observations and classes. Bayesian classifiers are statistical classifiers that estimate class membership probabilities, i.e., the probability that a given sample belongs to a particular class, by applying Bayes' theorem. The naive Bayes classifier, a special type of Bayesian classifier, makes the naive assumption of conditional independence among all features. Let X be a single sample represented by an n-dimensional vector $\{x_1, x_2, \ldots, x_n\}$ of n feature values. In Bayesian statistics, X is called the "evidence." Now, suppose that H is the hypothesis that X belongs to some class C. In classification, we want to find P(H|X), the probability that X belongs to class C given the observed values $\{x_1, x_2, \ldots, x_n\}$. P(H|X) is known as the posterior probability of H conditioned on X.

$$P(C_i \mid x_1, x_2, \ldots, x_n) = \frac{P(x_1, x_2, \ldots, x_n \mid C_i) \cdot P(C_i)}{P(x_1, x_2, \ldots, x_n)} \quad \text{for } 1 \le i \le k \tag{2}$$
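Equation (2) combined with the conditional independence assumption can be sketched numerically as follows. The priors and likelihood values are made-up numbers chosen for illustration only, and the function name `naive_bayes_posterior` is hypothetical.

```python
import numpy as np


def naive_bayes_posterior(priors, likelihoods):
    """Posterior P(C_i | x_1..x_n) for k classes under the naive independence assumption.

    priors:      shape (k,), the class priors P(C_i)
    likelihoods: shape (k, n), entry [i, j] is P(x_j | C_i) for the observed x_j
    """
    priors = np.asarray(priors, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)
    # Numerator of Eq. (2): P(x_1..x_n | C_i) * P(C_i), with the joint likelihood
    # factorised as a product over features.
    joint = priors * likelihoods.prod(axis=1)
    # Denominator of Eq. (2): the evidence P(x_1..x_n), i.e. the sum over classes.
    return joint / joint.sum()


priors = [0.6, 0.4]                   # illustrative P(C_1), P(C_2)
likelihoods = [[0.2, 0.5],            # illustrative P(x_1 | C_1), P(x_2 | C_1)
               [0.7, 0.9]]            # illustrative P(x_1 | C_2), P(x_2 | C_2)
posterior = naive_bayes_posterior(priors, likelihoods)
print(posterior)
```

With these numbers the unnormalised scores are 0.06 and 0.252, so the second class receives the larger posterior even though its prior is smaller.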
4.3 Classifier (SVM and NB)
SVM and NB are the classifiers used to learn from the selected features of the training set. The trained classifiers then produce predictions, and hence performance measures, when the testing set is passed through them.
4.4 Performance Measures
The performance measures are obtained when the testing set is passed through the trained classifier. These performance measures are accuracy, sensitivity (recall), and specificity:

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{3}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{4}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$

where TP = true positive, TN = true negative, FP = false positive, and FN = false negative.
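Equations (3) to (5) can be checked with a few lines of code. The confusion counts below are an assumption: they are not reported in the source and were chosen only because they reproduce an accuracy of 75%, a sensitivity of 0.8571 and a specificity of 0.6000 on twelve test samples.

```python
def performance_measures(tp, tn, fp, fn):
    """Return (accuracy, specificity, recall) following Eqs. (3)-(5)."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    specificity = tn / (tn + fp)
    recall = tp / (tp + fn)
    return accuracy, specificity, recall


# Hypothetical confusion counts for a 12-sample test set.
accuracy, specificity, recall = performance_measures(tp=6, tn=3, fp=2, fn=1)
print(f"accuracy={accuracy:.4f} specificity={specificity:.4f} recall={recall:.4f}")
```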
5 Results
A. Outcome of support vector machine (SVM) for the entire gene set (Table 1). The complete data set is passed directly to the SVM, which results in 75.00% accuracy.
B. Outcome of the spiral method (information gain (IG), correlation coefficient (CC)) with SVM (Table 1). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to the correlation coefficient, and the most significant genes retained by both stages are passed to the SVM, which gives 90.32% accuracy.
C. Outcome of the spiral method (information gain (IG), mutual information (MI)) with SVM (Table 1). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to mutual information, and the most significant genes retained by both stages are passed to the SVM, which gives 96.77% accuracy, the highest among the methods for this data set.
D. Outcome of Naïve Bayes (NB) for the entire gene set (Table 1). The complete data set is passed directly to NB, which results in 41.67% accuracy.
E. Outcome of the spiral method (IG, CC) with NB (Table 1). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to the correlation coefficient, and the most significant genes retained by both stages are passed to NB, which gives 67.74% accuracy.
F. Outcome of the spiral method (IG, MI) with NB (Table 1). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to mutual information, and the most significant genes retained by both stages are passed to NB, which gives 70.97% accuracy.

Table 1 Performance measures for Alonds data set
| Feature selection method | Classifier | Accuracy (%) | Sensitivity | Specificity |
| --- | --- | --- | --- | --- |
| All features | SVM | 75.00 | 0.8571 | 0.6000 |
| All features | NB | 41.67 | 0.2857 | 0.6000 |
| IG->MI | SVM | 96.77 | 1.000 | 0.9000 |
| IG->MI | NB | 70.97 | 0.5714 | 1.0000 |
| IG->CC | SVM | 90.32 | 0.9524 | 0.8000 |
| IG->CC | NB | 67.74 | 0.5238 | 1.0000 |

G. Outcome of SVM for the entire gene set (Table 2). The complete data set is passed directly to the SVM, which results in 70.27% accuracy.
H. Outcome of the spiral method (IG, CC) with SVM (Table 2). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to the correlation coefficient, and the most significant genes retained by both stages are passed to the SVM, which gives 95.79% accuracy, the highest among the methods for this data set.
I. Outcome of the spiral method (IG, MI) with SVM (Table 2). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to mutual information, and the most significant genes retained by both stages are passed to the SVM, which gives 94.74% accuracy.
J. Outcome of NB for the entire gene set (Table 2). The complete data set is passed directly to NB, which results in 59.46% accuracy.
K. Outcome of the spiral method (IG, CC) with NB (Table 2). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to the correlation coefficient, and the most significant genes retained by both stages are passed to NB, which gives 91.58% accuracy.
L. Outcome of the spiral method (IG, MI) with NB (Table 2). The complete data set is passed to information gain for feature extraction, the features extracted by information gain are passed to mutual information, and the most significant genes retained by both stages are passed to NB, which gives 80% accuracy.

Table 2 Performance measures for Lymph data set
| Feature selection method | Classifier | Accuracy (%) | Sensitivity | Specificity |
| --- | --- | --- | --- | --- |
| All features | SVM | 70.27 | 0.2727 | 0.8846 |
| All features | NB | 59.46 | 0.4545 | 0.6538 |
| IG->MI | SVM | 94.74 | 0.8780 | 0.1000 |
| IG->MI | NB | 80 | 0.6829 | 0.8889 |
| IG->CC | SVM | 95.79 | 0.9024 | 0.1000 |
| IG->CC | NB | 91.58 | 0.9024 | 0.9259 |
6 Conclusions
The extracted features identify the most significant gene expressions related to cancer. These extracted genes give better performance measures than the original data set when the statistical methods are applied. For the Alonds data set, the spiral method with information gain (IG) and mutual information (MI) gives the highest accuracy, 96%, compared to the other feature selection methods. For the Lymph data set, the spiral method with information gain (IG) and correlation coefficient (CC) gives the highest accuracy, 95%. In the future, the computational time should be reduced further, and the same spiral method can be applied to other machine learning classifiers to gain a better understanding of the performance measures.
References
1. B.R. Manju, V. Athira, A. Rajendran, Efficient multi-level lung cancer prediction model using support vector machine classifier. IOP Conf. Ser. Mater. Sci. Eng. 1012(1), 012034
2. I. Huang, M. Zheng, Q.M. Zhou, M.Y. Zhang, W. Jia, J.P. Yun, H.Y. Wang, Identification of a gene expression signature for predicting lymph node metastasis in patients with early stage cervical carcinoma. Cancer 117, 3363–3373 (2011). https://doi.org/10.1002/cncr.27850
3. P.R. Radhika, R.A. Nair, G. Veena, A comparative study of lung cancer detection using machine learning algorithms, in 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT) (IEEE, 2019), pp. 1–4
An Analysis on Classification Models to Predict Possibility for Type 2 Diabetes of a Patient Ch. V. Raghavendran, G. Naga Satish, N. S. L. Kumar Kurumeti, and Shaik Mahaboob Basha
Abstract Machine learning (ML) is a method in which computers learn how to solve problems without being explicitly programmed. Classification algorithms in machine learning can extract useful information from datasets, text files, photographs, audio and video. Several factors affect the choice of a machine learning algorithm, including, but not limited to, data size, consistency and diversity, market specifications, training time, data points, precision and parameters. The aim of this study is to analyze a patient dataset with classification methods in order to predict the likelihood of type 2 diabetes. The Pima Indian Diabetes Dataset (PIDD) is used to evaluate the efficiency of the classification algorithms: Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbor (KNN), Decision Tree, Random Forest, AdaBoost and Naïve Bayes Classification. The metrics used are F1 Score, Precision, Recall, Accuracy Score, ROC and Confusion Matrix.
Keywords Diabetes · Machine learning · Classification · SVM · Logistic Regression · KNN · Decision Tree · Random Forest · AdaBoost · Naïve Bayes · Precision · Confusion Matrix · F1 Score
Ch. V. Raghavendran (B) Aditya College of Engineering and Technology, Surampalem, India G. Naga Satish BVRIT Hyderabad College of Engineering for Women, Hyderabad, TS, India N. S. L. Kumar Kurumeti Aditya Engineering College, Surampalem, India S. M. Basha Department of CSE, Saveetha School of Engineering, Chennai, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_14
1 Introduction
Diabetes is a widespread chronic condition that poses a serious health risk to humans. Diabetes is characterized by blood glucose levels that are higher than normal, caused by defective insulin secretion, impaired biological effects of insulin, or both. Given the rising morbidity in recent years, the number of diabetic patients worldwide is expected to exceed 642 million in 2040, implying that one out of every ten adults will have diabetes. This alarming statistic requires immediate attention. Machine learning techniques allow analysts to extract new facts from massive health-related data repositories, which improves the delivery, management and disease surveillance of healthcare services. Data analysis can be automated by using ML techniques. The algorithms learn from the data and analyze it to find hidden patterns that can be used to improve performance [1]. Machine learning offers a variety of algorithms for constructing predictive models. These algorithms are designed to find hidden patterns in large datasets without being explicitly programmed to do so [2]. The difficult part is deciding which of the available methods to use for analyzing the data [3, 4]. Classification algorithms are used in machine learning to learn from training data and determine the right class for new observations. Supervised learning is a learning style in which the output value (label) is known for all training observations. The aim of this learning style is to derive a function from labeled training data that can be applied to new observations [5]. In this paper, we analyze the Pima Indians Diabetes Dataset (PIDD) using seven classification methods, namely Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbor (KNN), Decision Tree, Random Forest, AdaBoost and Naïve Bayes Classification, to see which one best suits the dataset. The rest of the paper is organized as follows: Sect. 2 discusses the literature study. The classification algorithms are explained in Sect. 3, and data preprocessing is discussed in Sect. 4. The implementation of the classification methods on the dataset is shown in Sect. 5, and the paper is concluded in Sect. 6.
2 Literature Study
Data analytics is a process for identifying hidden trends in large volumes of data in order to draw concrete conclusions. Several ML techniques are frequently used in the healthcare domain to analyze data and build learned models from which predictions are made. In the literature, there is a good number of papers on predicting whether a patient is diabetic or not. Researchers have applied machine learning techniques to the PIMA Indian dataset and reported different accuracy values. Indoria and Rathore [6] focused on ML algorithms to improve the accuracy of the detection and identification of diseases. They assessed two methods, Naïve Bayes and ANN, and attained 89.56% accuracy on training data and 81.49% on test data.
Kavakiotis et al. [7] made a systematic review of the uses of ML algorithms in the area of diabetes with respect to (a) Prediction and Diagnosis, (b) Diabetic Complications, (c) Genetic Background and Environment and (d) Health Care and Management. Zheng et al. [8] developed a semi-automated framework using machine learning to liberalize the filtering criteria in order to increase the recall rate while keeping the false positive rate low. They evaluated the performance of models within their framework, including KNN, Naïve Bayes, Decision Tree, Random Forest, SVM and Logistic Regression. Wei et al. [9] used the Pima Indian dataset with two data preprocessing techniques, linear discriminant analysis and principal component analysis. They used DNN and SVM to recognize whether a patient is diabetic or not and obtained an accuracy of 77.86%. Sonar and JayaMalini [10] proposed an advanced system using data processing that predicts whether a patient is diabetic or not with Decision Tree, ANN, Naive Bayes and SVM, partitioning the data as 75% for training and 25% for testing. Saru and Subashree [11] discussed ML algorithms to forecast diabetes. They implemented Decision Tree, Naïve Bayes and KNN algorithms in the WEKA software and carried out a performance analysis to identify the best algorithm. Kayal Vizhi and Dash [12] used different ML algorithms such as SVM, linear regression, decision tree, XGBoost and Random Forest to analyze the efficiency and correctness of predicting whether a patient has diabetes, with an accuracy of 77%. In [26], Rajput et al. analyzed five different classification techniques on the PIMA Indian dataset for predicting type 2 diabetes, and XGBoost gave better results than the others. In [27], Al-Hameli et al. studied four machine learning techniques and applied them to the PIMA Indian dataset to analyze their effectiveness in terms of the performance measures Accuracy, Precision and Recall.
3 Classification Algorithms
Machine learning supports various classification algorithms. However, determining which one is better than the others is extremely difficult, since an algorithm's performance depends on the application and on the dataset used with it. For example, when the classes are linearly separable, simple linear classifiers may perform better than more sophisticated models. The following is a list of some of the most important classification algorithms.
1. Support Vector Machine (SVM)
2. Logistic Regression
3. K-Nearest Neighbors (KNN)
4. Decision Tree
5. Random Forest
6. AdaBoost
7. Naïve Bayes.
3.1 Basic Terminology
The following terminology is used in classification.
• Classifier: An algorithm that maps input data to a particular class.
• Classification model: A model that attempts to draw conclusions from the input values given in the training dataset. The model predicts the class label of future data.
• Feature: An individual measurable property of the phenomenon being observed.
• Binary classification: Classification with only two possible outcomes, e.g., Smoker (Yes/No).
• Multiclass classification: Classification with more than two classes, where each sample is assigned to exactly one of them. E.g.: A fruit can be a banana or an orange, but not both.
• Multi-label classification: Classification in which a sample may be mapped to one or more target classes. For example, a news item can be about politics, a city and a country at the same time.
Building a classification model goes through the following steps:
• Initialize: Decide which classifier to use.
• Train the classifier: The classifiers in the scikit-learn package use the "fit(X, y)" method to fit the model to the training data X and training labels y.
• Predict target: For unlabeled data X, the "predict(X)" method returns the predicted labels y.
• Evaluate: Finally, evaluate the classifier.
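The four steps listed above (initialize, train, predict, evaluate) can be sketched with scikit-learn as follows. The synthetic dataset and the choice of KNN are assumptions made only to keep the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data with eight features (illustrative only).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5)   # Initialize: choose a classifier
clf.fit(X_train, y_train)                   # Train: fit(X, y) on the training data
y_pred = clf.predict(X_test)                # Predict: labels for unseen samples
accuracy = accuracy_score(y_test, y_pred)   # Evaluate: compare with the true labels
print(accuracy)
```

Any other scikit-learn classifier can be substituted in the initialization line, because all of them share the same `fit` and `predict` interface.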
3.2 Classification Types
Classification is a method of predicting the class of an observation from the given variables of the dataset. Detecting spam in email is an example of a classification problem. In this case, there are two categories of mail, spam and not spam, so this is a binary classification. A classifier learns how the variables in the training data relate to each class. In the email example, spam and not spam are the class labels in the training dataset. If the classifier is trained well, it can assign a new email to one of the two classes. The two types of learners in classification are
• Lazy learners
• Eager learners.

3.2.1 Lazy Learners
A lazy learner stores the training dataset and waits until it is given a new instance. Classification is then performed using the most closely related data in the stored training dataset. Lazy learners take less time for training and more time for prediction, e.g., K-Nearest Neighbor and case-based reasoning.

3.2.2 Eager Learners
Given a set of training data, an eager learner builds a classification model before receiving new data to classify. It takes more time for training and less time for prediction, e.g., Naive Bayes, Decision Tree, ANN.

4 Data Preprocessing
4.1 Description of Diabetes Dataset
The "National Institute of Diabetes and Digestive and Kidney Diseases" [13] provided the dataset for this study. Using it, one can predict whether a patient is diabetic or not based on the diagnostic measurements in the dataset. The selection of these instances from a larger database was subject to a number of constraints. In the dataset, only female patients of Pima Indian heritage who are more than 21 years old are considered. There are eight predictor features in the dataset, including pregnancies count, glucose, BP, skin thickness, BMI and age, together with a Class variable. There are two classes: patient with diabetes (1) and without diabetes (0). Figure 1 shows the information of the Diabetes dataset with independent and dependent features. Statistical information such as the mean, standard deviation, min, max and quantiles of this dataset is shown in Fig. 2.

Fig. 1 Information about the dataset
Fig. 2 Statistics about the dataset

4.2 Exploratory Data Analysis and Visualization
Exploratory Data Analysis (EDA) is the important process of performing preliminary investigations on the data to discover patterns, spot irregularities, test hypotheses and check assumptions using statistics and graphical representations [14–19]. A block diagram representing the workflow of the proposed approach is presented in Fig. 3. In the dataset, we have eight predictor variables and one dependent variable. Figure 4 shows the features with zero values in the dataset. It was observed that for the predictor variables serum insulin, skin thickness and number of pregnancies, some of the rows are filled with zero values. To handle these zeros in the predictor features, we calculated the median of each attribute after grouping the rows by the value of the Class variable and replaced the zeros with the median of the corresponding group. Outlier analysis is another important step in EDA, used to study the volume of outliers in each attribute. Figure 5 is used to study the outliers in the predictor features using histograms. The outliers are replaced with the median values of their respective Class variable group, and the result is shown in Fig. 6.

Fig. 3 Workflow of the proposed approach
Fig. 4 Zero values count of the independent features
Fig. 5 Outlier analysis for predictor variables
Fig. 6 Histograms of the predictor variables after outlier analysis
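The class-wise replacement of zero values described in Sect. 4.2 can be sketched with Pandas as follows. The small data frame and its column names are illustrative assumptions, and the sketch additionally assumes that the group median is computed over the non-zero entries only, which the text does not state.

```python
import pandas as pd

# Toy frame with the same kind of columns as the diabetes data (values are made up).
df = pd.DataFrame({
    "Insulin": [0, 100, 120, 0, 200, 220],
    "SkinThickness": [20, 0, 30, 35, 0, 45],
    "Class": [0, 0, 0, 1, 1, 1],
})

for col in ["Insulin", "SkinThickness"]:
    # Treat zeros as missing values so they do not influence the median.
    non_zero = df[col].where(df[col] != 0)
    # Median of the column within each Class group, broadcast back to every row.
    group_median = non_zero.groupby(df["Class"]).transform("median")
    # Fill the former zeros with the median of their own Class group.
    df[col] = non_zero.fillna(group_median)

print(df)
```

The same grouping pattern could be reused for the outlier step by masking the values flagged as outliers instead of the zeros.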
4.3 Data Standardization
Data scaling is an important preprocessing step applied to real-world data before a machine learning algorithm is run on it. Real-valued input and output variables may be normalized or standardized to achieve data scaling. It is evident from the statistics presented in Fig. 2 that the scale and distribution of values are different for each feature. Standardization scales each feature independently by subtracting its mean and dividing by its standard deviation.
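In symbols, standardization maps each value $x$ of a feature to $z = (x - \mu) / \sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of that feature. A minimal sketch with scikit-learn's `StandardScaler`, using made-up values for two features, is shown below.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (illustrative values only).
X = np.array([[85.0, 22.0],
              [120.0, 35.0],
              [155.0, 48.0]])

scaler = StandardScaler().fit(X)   # learns the per-feature mean and standard deviation
X_std = scaler.transform(X)        # applies z = (x - mean) / std column by column

# The same computation written out by hand (population standard deviation).
manual = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_std)
```

In a train/test setting, the scaler is usually fitted on the training split only and then applied to the test split, so that no information from the test data enters the preprocessing.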
4.4 Performance Evaluation Metrics
The accuracy of a classifier is the probability of correctly predicting the class of an unlabeled instance, and it can be calculated in many ways [20, 28]. Consider our two-class diabetes prediction problem, which discriminates between patients with and without diabetes. The following four values can be used to define binary classification accuracy measures:
• TP or True Positives: no. of patients correctly classified as patients
• TN or True Negatives: no. of non-patients correctly classified as non-patients
• FP or False Positives: no. of non-patients classified as patients
• FN or False Negatives: no. of patients classified as non-patients.

Confusion Matrix: A confusion matrix is a method of summarizing a classification algorithm's results. If you have an unequal number of observations in each class or if your dataset has more than two classes, classification accuracy alone can be misleading.
Accuracy: The proportion of the total number of predictions that are correct.
Precision: A measure of the correctness of positive predictions, i.e., of the observations predicted as positive, how many are actually positive.
Recall or Sensitivity: A measure of the actual positives that are predicted correctly, i.e., how many observations of the positive class are labeled correctly.
Specificity: A measure of how many observations of the negative class are labeled correctly.
F1 score: The harmonic mean of Precision and Recall, with range [0, 1]. It tells you how precise your classifier is as well as how robust it is.
ROC curve: The Receiver Operating Characteristic (ROC) curve summarizes the model's performance by evaluating the tradeoff between the true positive rate (sensitivity) and the false positive rate (1 − specificity). The Area Under the Curve (AUC), also known as the index of accuracy (A), is a summary performance measure for the ROC curve. The greater the area under the curve, the better the model's prediction ability. Figure 7 presents the relationship between TP, TN, FP, FN and the performance metrics.

Fig. 7 Confusion matrix and other performance metrics
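The metrics defined in this section can be computed with scikit-learn as in the sketch below. The ten labels, predictions and scores are made-up values used only to show how the quantities relate to TP, TN, FP and FN.

```python
from sklearn.metrics import (confusion_matrix, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Illustrative ground truth (1 = patient), predicted probabilities and hard predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.6, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]   # threshold the scores at 0.5

# For binary labels, ravel() returns the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = precision_score(y_true, y_pred)   # TP / (TP + FP)
recall = recall_score(y_true, y_pred)         # TP / (TP + FN), i.e. sensitivity
specificity = tn / (tn + fp)                  # TN / (TN + FP)
f1 = f1_score(y_true, y_pred)                 # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_score)          # area under the ROC curve

print(tp, tn, fp, fn, accuracy, precision, recall, specificity, f1, auc)
```

Note that the AUC is computed from the continuous scores rather than from the thresholded predictions, since the ROC curve is traced by varying the decision threshold.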
5 Implementation This section includes implementation of the specified machine learning techniques using Python libraries—Pandas, NumPy, Sk-learn and Seaborn.
5.1 Support Vector Machine
Support Vector Machines (SVMs) are a group of related supervised learning models for both classification and regression. An SVM is a prediction model that aims to improve predictive accuracy while avoiding overfitting the data. An SVM works by constructing a hyperplane that separates two groups of points. The hyperplane chosen is the maximal margin hyperplane, which is the separating hyperplane with the greatest distance from the training observations. The margin is the distance between the hyperplane and the training points closest to it. The points on one side of the hyperplane are labeled −1, while those on the other are labeled +1. Figure 8 shows the classification report and the ROC curve for SVM with the "rbf" kernel.
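A classification report and an AUC value for an SVM with the "rbf" kernel can be produced along the following lines. This is a sketch on synthetic data with default hyperparameters; it is not the configuration behind Fig. 8, which the text does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an eight-feature binary dataset.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Standardize the features, then fit an SVM with the RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

report = classification_report(y_test, model.predict(X_test))
# decision_function gives signed distances to the boundary, usable as ROC scores.
auc = roc_auc_score(y_test, model.decision_function(X_test))

print(report)
print("AUC:", auc)
```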
Fig. 8 Classification report and ROC curve for SVM
Fig. 9 Classification report and ROC curve for Logistic Regression
5.2 Logistic Regression
In machine learning, the Logistic Regression algorithm is used for solving classification problems. It is useful for understanding the impact of several independent variables on a single outcome variable [21, 22]. It works well if the predicted variable is binary, the predictor variables are independent of one another, and the dataset has no missing values. Logistic Regression is a supervised learning method, so the dataset is partitioned into 80% training data and 20% test data. Figure 9 shows the classification report and the ROC curve for Logistic Regression.
5.3 K-Nearest Neighbors
The K-Nearest Neighbors (KNN) classifier is a neighbor-based classifier of the lazy learning type [23]. This classifier does not try to build a general internal model, but simply stores instances of the training data [24]. The class of each point is computed from a simple majority vote of its K nearest neighbors. This algorithm is simple to implement, robust to noisy training data and effective when the training data is large.
Fig. 10 Classification report and ROC curve for KNN
GridSearchCV is an important feature that comes with scikit-learn's model_selection package. It tries all the combinations of the parameter values passed to it and evaluates the model for each combination using cross-validation. As a result, we get the accuracy/loss for every combination of hyperparameters and can choose the one with the best performance. Figure 10 shows the classification report and the ROC curve for the KNN algorithm after fine-tuning the hyperparameters with GridSearchCV.
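A grid search over KNN hyperparameters can be sketched as follows. The parameter grid, the five-fold cross-validation and the synthetic data are assumptions; the source does not list the values that were actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Scaling lives inside the pipeline so that it is refitted on every CV fold.
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())

# Hypothetical grid: 4 values of K times 2 weighting schemes = 8 combinations.
param_grid = {
    "kneighborsclassifier__n_neighbors": [3, 5, 7, 9],
    "kneighborsclassifier__weights": ["uniform", "distance"],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_              # combination with the best CV accuracy
test_accuracy = search.score(X_test, y_test)   # refitted best model on held-out data
print(best_params, search.best_score_, test_accuracy)
```

The same pattern applies to the Decision Tree by swapping the estimator and the grid, for example searching over `max_depth` and `min_samples_leaf`.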
5.4 Decision Tree
A decision tree is a binary branching structure that can be used to classify any input vector X. Each node in the tree contains a simple comparison of one feature against a value. Each comparison yields a true or false result, indicating whether we should move on to the left or right child of the given node. To tune the hyperparameters or find the best model, we must run the model against various combinations of parameters and features and record the score for review. The GridSearchCV feature in scikit-learn can be used to accomplish this. This function uses cross-validation to compute a score, and based on the results, we can determine which hyperparameter values provide the best results. Figure 11 shows the classification report and the ROC curve for Decision Tree classification.
5.5 Random Forest
This is an ensemble learning approach for classification, regression and other tasks that works by building a large number of decision trees during training and then outputting the class that is the mode of the classes predicted by the individual trees (classification) or their mean prediction (regression). Figure 12 shows the classification report and the ROC curve for Random Forest classification.
Fig. 11 Classification report and ROC curve for Decision Tree
Fig. 12 Classification report and ROC curve for Random Forest
5.6 AdaBoost
Adaptive Boosting (AdaBoost) is an ensemble method that uses boosting. It is called adaptive because the instance weights are reassigned at each iteration, with higher weights given to incorrectly classified instances. The AdaBoost method uses decision trees with a depth of one as its base learners. Figure 13 shows the classification report and the ROC curve for AdaBoost classification.
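AdaBoost with depth-one decision trees can be sketched with scikit-learn as follows. The number of estimators and the synthetic data are assumptions made for illustration; the base learner is passed positionally because the keyword name for it differs between scikit-learn versions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Base learner: a decision tree limited to a single split (depth of one).
stump = DecisionTreeClassifier(max_depth=1)

# Each boosting round reweights the training instances so that previously
# misclassified ones count more when the next stump is fitted.
model = AdaBoostClassifier(stump, n_estimators=50, random_state=0)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(len(model.estimators_), test_accuracy)
```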
Fig. 13 Classification report and ROC curve for AdaBoost
An Analysis on Classification Models …
Fig. 14 Classification report and ROC curve for Naïve Bayes
5.7 Naïve Bayes
This algorithm is built on Bayes' theorem with the assumption of independence between every pair of variables [25]. The classifier works well in several real-world scenarios such as document classification and spam mail filtering. It takes training data as input to estimate the necessary parameters, and it is very fast compared to more sophisticated methods. Figure 14 shows the classification report and the ROC curve for Naïve Bayes classification.
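As a reminder of what the independence assumption buys, the decision rule can be written as follows. The symbols $y$ (class), $x_1, \dots, x_n$ (feature values) and $\hat{y}$ (predicted class) are notation introduced here and are not used in the paper.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} \;=\; \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Only the class prior $P(y)$ and the per-feature conditionals $P(x_i \mid y)$ need to be estimated from the training data, which is why training and prediction are fast.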
5.8 Result Analysis
In Sect. 4.4, we discussed the various metrics used to analyze the performance of a classification model. Figure 15 shows these metrics for all seven models under study. From this, it is evident that the AdaBoost, Random Forest and Decision Tree models have the highest values for most of the metrics. The models were first applied to the raw data, and the resulting train and test accuracies are presented in Fig. 16a. Figure 16b presents the train and test accuracies after preprocessing.
Fig. 15 A comparison of all the metrics for the studied classification models
Fig. 16 Train and test accuracy of the models. a Before preprocessing. b After preprocessing the data
6 Conclusion
This paper compares different ML classification algorithms applied to the PIMA Indians Diabetes dataset. We studied seven classification algorithms, namely SVM, Logistic Regression, KNN, Decision Tree, Random Forest, AdaBoost and Naïve Bayes. The dataset used for the analysis has 768 rows, divided into 614 rows (80%) of training data and 154 rows (20%) of test data. From the above observations, AdaBoost performs well compared with the remaining models, with 95% accuracy on the train data and 93% on the test data.
References
1. Big Data Analytics, An Oracle White Paper (2013)
2. SAS. ML: What it is and why it matters (2016). URL: https://www.sas.com/it_it/insights/analytics/machine-learning.html
3. J. Qiu, Q. Wu, G. Ding, A survey of ML for big data processing, in EURASIP JASP (2016)
4. S.K. Vasudevan, M. Rajamani, K. Abarna, Big data analytics: a detailed gaze and a technical review. IJAER 9 (2014)
5. A. Talwalkar, M. Mohri, A. Rostamizadeh, Foundations of ML (The MIT Press, 2012). ISBN: 026201825X, 9780262018258
6. P. Indoria, Y.K. Rathore, A survey: detection and prediction of diabetics using machine learning techniques. IJERT (2018)
7. I. Kavakiotis, O. Tsave, A. Salifoglou, N. Maglaveras, I. Vlahavas, I. Chouvarda, Machine learning and data mining methods in diabetes research. Comput. Struct. Biotechnol. J. 8(15), 104–116 (2017). https://doi.org/10.1016/j.csbj.2016.12.005. PMID: 28138367; PMCID: PMC5257026
8. T. Zheng, W. Xie, L. Xu, X. He, Y. Zhang, M. You, G. Yang, Y. Chen, A machine learning-based framework to identify type 2 diabetes through electronic health records. Int. J. Med. Inform. 97, 120–127 (2017). https://doi.org/10.1016/j.ijmedinf.2016.09.014. Epub 2016 Oct 1. PMID: 27919371; PMCID: PMC5144921
9. S. Wei, X. Zhao, C. Miao, A comprehensive exploration to the machine learning techniques for diabetes identification, in 4th World Forum on Internet of Things (IEEE, 2018)
10. P. Sonar, K.J. Malini, Diabetes prediction using different machine learning approaches, in Proceedings of the Third International Conference on Computing Methodologies and Communication (ICCMC 2019) (IEEE, 2019)
11. S. Saru, S. Subashree, Analysis and prediction of diabetes using machine learning. Int. J. Emerg. Technol. Innov. Eng. 5(4), 167–175 (2019). ISSN 2394-6598
12. Kayal Vizhi, A. Dash, Diabetes prediction using machine learning. Int. J. Adv. Sci. Technol. 29(6), 2842–2852 (2020)
13. https://www.kaggle.com/
14. B. Lakshmi Sucharitha, C.V. Raghavendran, B. Venkataramana, Predicting the cost of preowned cars using classification techniques in machine learning, in Advances in Computational Intelligence and Informatics. ICACII 2019, ed. by R. Chillarige, S. Distefano, S. Rawat. Lecture Notes in Networks and Systems, vol. 119 (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-15-3338-9_30
15. C.V. Raghavendran, C. Pavan Venkata Vamsi, T. Veerraju, R.K. Veluri, Predicting student admissions rate into university using machine learning models, in Machine Intelligence and Soft Computing, ed. by D. Bhattacharyya, N. Thirupathi Rao. Advances in Intelligent Systems and Computing, vol. 1280 (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-15-9516-5_13
16. Ch.V. Raghavendran, G. Naga Satish, T. Rama Reddy, B. Annapurna, Building time series prognostic models to analyze the spread of COVID-19 pandemic. Int. J. Adv. Sci. Technol. 29(3), 13258 (2020). Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/31524
17. K. Helini, K. Prathyusha, K. Sandhya Rani, Ch.V. Raghavendran, Predicting coronary heart disease: a comparison between machine learning models. Int. J. Adv. Sci. Technol. 29(3), 12635–12643 (2020). Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/30385
18. C.V. Raghavendran, G.N. Satish, V. Krishna, S.M. Basha, Predicting rise and spread of COVID-19 epidemic using time series forecasting models in machine learning. Int. J. Emerg. Technol. 11(4), 56–61 (2020)
19. K. Prathyusha, K. Helini, C.V. Raghavendran, N. Kumar Kurumeti, COVID-19 in India: lockdown analysis and future predictions using regression models, in 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), pp. 899–904 (2021). https://doi.org/10.1109/Confluence51648.2021.9377052
20. P. Baldi, S. Brunak, Y. Chauvin, C. Andersen, H. Nielsen, Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics (Oxford, England) 16, 412–424 (2000)
21. N. Sultana, Md. Mohaiminul Islam, Comparative study on ML algorithms for sentiment classification. IJCA 182(21), 1–7 (2018). ISSN: 0975-8887
22. S. Kumbhar, N. Paranjape, R. Bhave, A. Lahoti, Comparative study of ML algorithms for anomaly detection in Cloud infrastructure. IJFRCSCE 4(4), 596–598 (2018). ISSN: 2454-4248
23. R. Ragupathy, M.L. Phaneendra, Comparative analysis of ML algorithms on social media test. IJET 7(2.8), 284–290 (2018)
24. A.K. Srivastava, Comparative analysis of ML algs. for steel plate fault detection. IRJET 06(05), 1231–1234 (2019)
25. R. Agarwal, P. Sagar, A comparative study of supervised ML algorithms for fruit prediction. J. Web Dev. Web Des. 4(1), 14–18 (2019)
26. R. Rajput, R.K. Lenka, S.J. Chacko, K.G. Javed, A. Upadhyay, Overview of Amalgam models for Type-2 diabetes mellitus, in Proceedings of 3rd International Conference on Computing Informatics and Networks, ed. by A. Abraham, O. Castillo, D. Virmani. Lecture Notes in Networks and Systems, vol. 167 (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-15-9712-1_48
27. B.A. Al-Hameli, A.A. Alsewari, M. Alsarem, Prediction of diabetes using hidden Naïve Bayes: comparative study, in Advances on Smart and Soft Computing, ed. by F. Saeed, T. Al-Hadhrami, F. Mohammed, E. Mohammed. Advances in Intelligent Systems and Computing, vol. 1188 (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-15-6048-4_20
28. G. Naga Satish, C.V. Raghavendran, R.S. Murali Nath, Analysis of Covid confirmed and death cases using different ML algorithms, in Intelligent Systems, ed. by S.K. Udgata, S. Sethi, S.N. Srirama. Lecture Notes in Networks and Systems, vol. 185 (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-33-6081-5_7
Preserve Privacy on Streaming Data During the Process of Mining Using User Defined Delta Value
Paresh Solanki, Sanjay Garg, and Hitesh Chhikaniwala
Abstract A huge volume of data is regularly generated by different applications, analyzed by data mining techniques, and summarized so that the end result is significant to users. Published data sets may contain highly sensitive data, so companies are concerned about releasing sensitive information into the public domain. Sharing such data has proven beneficial for the data mining process; however, it could violate privacy laws and eventually be a threat to civil rights. Finding a solution to this problem is the aim of preserving privacy during data mining. This paper proposes preserving privacy on streaming data during the mining process using a user-defined Delta value, which perturbs the sensitive data. Users have the flexibility to change the data attributes that are perturbed based on their security requirements. The outcomes not only protect data privacy but also allow data streams to be mined reliably.
Keywords Data mining · Stream mining · Privacy preservation · Perturbation · Precision · Recall · Sensitive data · Clustering
P. Solanki (B)
Department of Computer Engineering, U. V. Patel College of Engineering, Ganpat University, Mehsana, Gujarat, India
e-mail: [email protected]
S. Garg
School of Computing, DIT University, Dehradun, Uttarakhand, India
H. Chhikaniwala
Adani Institute of Infrastructure Engineering, Ahmedabad, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_15
1 Introduction
Data mining is the process of extracting useful and previously unknown information from large data sets. Various fields, such as health care, which includes medical diagnostics, business, finance organization, sports, education, insurance claims analysis, drug development, gambling, stock market, telecommunication, and defense
data, are areas where data mining is widely applied. The most extensively used data mining techniques are association rule mining, clustering, classification, and regression analysis. Information discovered from datasets is stored in databases, data warehouses, online analytical processing systems, and other repositories. The rapid advancement in communications and Internet technology has led to the rise of data streams. A data stream is a sequence of data that arrives endlessly at a high rate, offline or online, explicitly or implicitly ordered by timestamp, evolving, and undetermined [1, 2]. Because of the properties of data streams (consecutive, rapid, temporal, and unpredictable), the study of data mining procedures has shifted from traditional static data mining to dynamic data stream mining. Data stream mining is concerned with extracting knowledge structures, represented as models and/or patterns, from endless, continuous streams of information. The idea of data stream processing is that training samples can be examined only once, briefly, when they arrive in a high-speed stream, and then they must be discarded to make room for subsequent samples. The algorithm that processes the stream has no control over the order in which the examples are seen, and it must update its model incrementally as each example is examined. Traditional data mining approaches have been used in applications where persistent data is available and where the learning models produced are static in nature. Because the full data is available before it is passed to the machine-learning algorithm, statistical information about the data distribution can be known in advance. When larger data sets are sampled to fit into memory, which is small relative to the size of the data, there is no way to process the data incrementally.
For each sample, the learning model starts processing from scratch because there is no way to obtain an intermediate analysis of the result. A data stream cannot be stored because of its size, and multiple scans are not feasible either. Streams of data can continue to evolve over time. Evolving concepts require data stream processing algorithms to keep models up to date with the latest changes. These changes stem from changes in the environment, such as the industry's financial situation, and from evolving human characteristics, such as individual interest rates and operation complexity. Since data streams are continuous, swift, and large in volume, conventional mining techniques cannot be applied directly to data stream mining. In recent years, such data has been growing rapidly at an unbounded rate. Individual privacy in data collection, processing, and mining has become a major concern throughout the data mining process. As a result, privacy preservation in data mining is a serious problem. Data mining, with its promise of quickly uncovering useful, nonobvious information from vast databases, is thought to be especially vulnerable to misuse. Data mining and privacy appear to be at odds. The aim of data mining is to generalize across populations rather than to reveal knowledge about individuals. The issue is that data mining is based on analyzing individual data, so there could be an invasion of privacy. The real issue, therefore, is not data mining itself but how it is performed. The aim of maintaining privacy in data mining is to reduce the possibility of data misuse while generating results that are equivalent to those obtained without such privacy-preserving techniques [3]. Before data can be used for data mining tasks, its privacy must first be protected. The algorithm necessitates the security of confidential data
without disrupting data flow. It is very difficult to secure a data stream because data streams change naturally (time evolution) and because, owing to memory limitations, the flow of the stream cannot be stored (performance requirement). Traditional data mining techniques are inapplicable to data stream processing, and because of the nature of streams, correlations and autocorrelations may change over time. Given these characteristics, most conventional algorithms are unsuccessful. Among several privacy-preserving methodologies, data perturbation is a popular technique for achieving a compromise between data usability and information privacy. It is well established that the attacker's knowledge of the original data plays a major role in data privacy breaches. The possibility of data privacy being jeopardized by the leakage of a few original data records has been investigated. That work first demonstrates that additive perturbation maintains the angle between data records. Based on this angle-preservation property, even the leakage of a single original data record degrades the privacy of perturbed data in some cases in a general perturbation model. Data privacy and protection can be jeopardized in a number of ways, both inside and outside the data collection organizations. Within data collection organizations, too, various degrees of trustworthiness are assigned to different individuals, typically based on the rights of the computer accounts they use. To protect data privacy and security from being breached intentionally or unintentionally, it is preferable to preprocess data correctly before it is distributed for analysis or made public. An attacker's background information consists of one or more original data records that are precisely known to the attacker. The intruder may use this background information to compromise other records in the original data, given that the perturbed data is publicly accessible.
Motivation: The existing approach computes a tuple value from the normalized values of the attributes (except the sensitive attribute) and compares it with other tuples having the same normalized values [4]. However, such a tuple could have the same tuple value as another. In that scenario, perturbing the sensitive attribute value may compromise the privacy of a certain set of tuples. To overcome this issue, a Delta-based perturbation is introduced. The proposed approach does not claim to improve privacy gain in all cases, but it makes the existing approach more robust.
Objective of our work: The overall objective of our work is to enhance privacy in data stream mining with minimum information loss. The proposed work satisfies the following properties: it maintains the statistical properties of the data set after privacy is applied to the original data set; the perturbed sensitive attribute values hide the relationship with the other attributes of the data set; and Delta-PPDSM achieves privacy on sensitive attributes.
The rest of the paper proceeds as follows: Sect. 2 discusses existing work. Section 3 presents the problem definition, the suggested framework, and the 'Preserve Privacy on Streaming data during the process of Mining using user defined Delta value' method. Section 4 presents the outcomes, observations, and performance analysis. Finally, Sect. 5 concludes.
2 Related Work
PPDM/PPDSM (Privacy-Preserving Data Mining/Privacy-Preserving Data Stream Mining) approaches can be categorized mostly as heuristic, reconstruction, and cryptographic. In the reconstruction-based approach [5, 6], the original data is reconstructed from the randomized data. In the heuristic-based approach, adaptive alteration changes only selected values that minimize utility loss rather than all existing values. In the cryptography-based approach, computation is secure, and users know nothing except their own input and the outcome at the end of the computation. Perturbation is the process of converting an original value into a distorted value via some mathematical function, which achieves privacy and keeps the statistical properties of the data. A significant amount of work has been done on protecting input privacy for static data. However, very little work has been undertaken to protect the privacy of data streams [8–10]. Several data mining techniques incorporating privacy protection mechanisms have been proposed based on different approaches. Recent work in the field of PPDM/PPDSM has focused on finding a balance between privacy and the need for information discovery, which is critical for improving decision-making processes and other human activities [11–13, 17, 29, 30]. PPDM algorithms are designed to minimize the harm to privacy caused by malicious parties during the rule-mining process. Data perturbation includes a wide variety of techniques, namely additive, multiplicative [13, 14], matrix multiplicative, k-anonymization [15–17], micro-aggregation, categorical data perturbation, data swapping, resampling, data shuffling [7, 18], and so on. Multilevel trust is the most important principle in data perturbation. A portion of the attribute-value pairs in a data set is modified to protect sensitive data stored within it [24]. The sensitive data is thus protected, and the data set's statistical properties are also preserved in [14].
Multidimensional k-anonymization is designed for the general purpose of preserving utility. The study in [19] proposed an approach for privacy-preserving classification mining based on a random perturbation matrix, which is applicable to any data distribution. The R-amplifying method and the matrix condition number were used to protect data privacy. The work in [20] used randomized techniques for privacy preservation with unknown distortion parameters. That approach was applied to categorical data only, and its extension to numerical and network data is still a topic of research. The work in [21] shows that a Gaussian distribution, along with noise addition, was used to achieve a better tradeoff between privacy gain and information loss; it was applied to classification mining with numerical data only. The work in [22] presents a data streams preprocessing system (DSPS) that perturbs data streams to preserve data privacy. Sliding window-based data stream k-anonymization was proposed in [23]. It uses the distributive density of data to predict upcoming data, which in turn improves the precision of data and increases usability. Multiplicative data perturbation in data stream mining based on tuple value was proposed in [4]. The author concludes that the method increases privacy with minimum information loss. Privacy-preserving data mining methods have been evaluated by a research community and statistical
disclosure committee mainly on aspects such as applicability, privacy-protection metric, the accuracy of mining results, computation, and so on. Table 1 shows the comparison of various PPDM techniques based on characteristics.
3 Proposed Work
The aim of the suggested work is to provide privacy on the values of a data stream before it is released publicly. The original data stream should be perturbed, and the perturbed data stream should yield the same results as the original. To preserve privacy on the existing data stream, the suggested work adds or multiplies noise induced online. Preprocessing the data stream and clustering the data stream are the two main phases of our work. The main aim of the first phase is to perturb data streams in order to protect data privacy. The primary goal of the second phase, which is carried out by the online data mining method, is to cluster the perturbed data streams in order to assess the output using CMM and precision/recall. The framework of the proposed approach is shown in Fig. 1. Dataset D is given as input to the suggested data perturbation algorithm. The algorithm perturbs only sensitive attribute values; the resulting dataset with modified values is called the perturbed dataset D'. D and D' are provided to standard stream clustering algorithms to obtain results R and R', respectively. The proposed work focuses on obtaining a close approximation between clustering results R and R' to balance the tradeoff between privacy gain and information loss.
3.1 Formal Analysis of Proposed Work
The ultimate aim of the data perturbation method is to optimize the data transformation process by achieving maximum data privacy and usefulness. The proposed heuristic-based privacy preservation in data stream mining using a perturbation approach keeps the original data set's statistical properties for the data mining operation. The suggested approach not only perturbs sensitive data on its own but also perturbs it together with the corresponding tuple values, which means that sensitive data is perturbed across the entire dataset. During the perturbation process, the suggested approach has no bias in terms of sensitive data. An adversary would not be able to recover the original data from the perturbed dataset because the suggested technique is not reversible. The suggested method enhances sensitive data privacy while limiting information loss. Using the Delta-based perturbation method, which is a value distortion-based method, the approach aims to protect data privacy by modifying the values of sensitive attributes. The suggested approach works as follows:
Multi-dimensional
Very low
Low level of uncertainty exists
Simple
Complex
Low level of uncertainty Low level of uncertainty Level of uncertainty exists exists exists
Simple
Having previous knowledge, independent component analysis (ICA)
Association/classification/clustering
Multi-dimensional
Minimum
Simple
Multi-dimensional
Very low
Quasi-identifiers Average
Low level of uncertainty exists
Single dimensional
Minimum
Sensitive attribute Very low
Simple
Data dimension
Sensitive attribute Very low
Uncertainty Level
Single dimensional
Info. loss
Average/high
Sensitive attribute
Static Anonymize data set
Complexity
Minimum
Privacy loss
Static/stream
Geometric characteristic correlation between dimension
Static/stream
Anonymization Categorical
Having previous Re-identification, similarity attack knowledge, independent component Analysis (ICA)
Average/high
Applies to
Values distribution
Static/stream
Numerical
Random projection
Spectral properties
Sensitive attribute
Preserved property
Numerical
Random rotation
Reconstruction/security Spectral properties breaches possible through
Values distribution
Dataset type
Categorical /numerical
Noise multiplication
Classification/clustering
Static/stream
Input dataset
Data mining techniques Association/classification/clustering Association/classification/clustering Classification/clustering
Noise addition
Categorical /numerical
Comparison criteria
Table 1 Privacy-preserving data mining techniques
Fig. 1 Framework of proposed method
1. On the given original data set 'D', calculate the tuple value (TV_i) for each instance using all attributes except the class attribute:

$$TV_i = \operatorname{Avg}_{j=1}^{n-1}\left(\frac{V_j - m}{Stdev}\right) \qquad (1)$$

where V_j = attribute value, m = mean and Stdev = standard deviation.
2. Perturbation is performed using the user-defined Delta value and the tuple value.
(a) Calculate the Delta value (DV) of each TV_i:

$$DV_i = \text{user-defined percentage of } TV_i \qquad (2)$$

(b) Calculate ub and lb (upper bound and lower bound) to define the range used to choose the tuple values that lie within it:

$$ub = TV_i + DV_i \quad \text{and} \quad lb = TV_i - DV_i \qquad (3)$$

(c) Based on Eq. (3), if tuple values are in this range, select the corresponding sensitive values and calculate their average, which gives the final modified value of the sensitive attribute for the current instance:

$$Sv' = \frac{\sum_{i=1}^{w} Sv(i)}{n} \qquad (4)$$

where Sv' = modified value, Sv = original value, and n = total no. of values.
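The steps above can be sketched for a single window as follows. This is a minimal illustration rather than the authors' implementation: the function names and sample data are hypothetical, the population standard deviation is assumed for the z-score, and the bounds are sorted so that the range stays well ordered when a tuple value is negative.

```python
from statistics import mean, pstdev

def tuple_values(rows):
    """Eq. (1): average z-score of the non-sensitive attributes of each row."""
    cols = list(zip(*rows))
    # Assumption: population standard deviation; constant columns fall back to 1.0.
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [mean((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]

def delta_perturb(rows, sensitive, delta_pct):
    """Return the perturbed sensitive values for one window (Eqs. 2 to 4)."""
    tvs = tuple_values(rows)
    perturbed = []
    for tv in tvs:
        dv = tv * delta_pct / 100.0              # Eq. (2): Delta value of this instance
        lb, ub = sorted((tv - dv, tv + dv))      # Eq. (3): range around the tuple value
        in_range = [s for t, s in zip(tvs, sensitive) if lb <= t <= ub]
        perturbed.append(mean(in_range))         # Eq. (4): average of selected values
    return perturbed

# Hypothetical window: two non-sensitive attributes per row, one sensitive value each.
rows = [(1.0, 10.0), (1.0, 10.0), (3.0, 30.0), (3.0, 30.0)]
sensitive = [100, 200, 300, 400]
perturbed = delta_perturb(rows, sensitive, delta_pct=10)
```

Each instance always falls inside its own range, so the average is never taken over an empty selection. In this example, instances with equal tuple values end up sharing the same sensitive value, and the window mean of the sensitive attribute is unchanged.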
3.2 Proposed Work
The proposed algorithm perturbs sensitive attribute values before publishing them for mining. The algorithm first normalizes all attribute values (except the sensitive attribute values) using the widely used z-score normalization method. The tuple value is computed for every instance of the normalized data set by taking the average of the attribute values of that instance. The user then uses Delta to find the 'ub' and 'lb' of the range. The suggested approach assumes that only one sensitive attribute is available in the dataset; it can be extended to perturb a discrete set of attributes.
Algorithm: Preserve Privacy on Streaming data during the process of Mining using user defined Delta value.
Input: Data stream D, Delta (DV), window size (w), sliding window size (W).
Intermediate Result: Perturbed data stream D'.
Output: R and R' (Clustering data stream)
Procedure:
For i = 1 to i = W
    Set SA[i]    // store sensitive attribute values in array
    For each instance of D
        For A = 1 to A = n – 1    // n = number of attributes except sensitive attribute
            If NOT (normalized (A))
                ZA = (value (A) – mean (A)) / stdev (A)
            Else
                ZA = value (A)
            End If
        End for
        TVB = Average (ZA, A = 1 to A = n – 1)    // store tuple values in TVB
    End for
    For i = 0 to w
        s = 0, sv = 0
        c = TVB[i]
        s = (c * DV) / 100    // Delta value (in percentage) of current instance
        ub = c + s    // upper bound
        lb = c – s    // lower bound
        For j = 0 to w
            If (TVB[j] >= lb AND TVB[j] <= ub)
                sv = sv + SA[j]
            End if
        End for
        sv = Average (sv)    // average of sensitive values
        Value (SA) = sv    // set the value of sensitive instance
    End for
    Perform clustering process on D' to obtain R'
End for
4 Performance Evaluation
4.1 Experimental Setup
Experiments and simulations in a data stream clustering environment were conducted to evaluate the efficiency of the proposed privacy-preserving process. We quantified the proposed method using the resulting precision of clustering on the original data set and on the perturbed data set. The experiments were run on three different datasets available from the UCI ML repository and the MOA dataset dictionary [26]. The K-Means clustering algorithm, using the WEKA tool in the MOA framework, was simulated to evaluate the accuracy of the suggested privacy-preserving method. MOA is a software environment for creating algorithms and running experiments for online learning from evolving data streams. MOA supports evaluation of data stream learning algorithms on large streams for both clustering
Table 2 Dataset
Dataset | Input instances | Nominal | Sensitive attribute
CoverType [27] | 65,000 | Ignored | Elevation, aspect, slope
ElectricNorm [25] | 45,000 | Ignored | Nswprice, nswdemand
Bank marketing [28] | 45,000 | Ignored | Balance, duration
and classification. In addition, it supports an interface to WEKA machine-learning algorithms. The data sets and their configuration, used to determine the accuracy of our proposed system, are shown in Table 2. The K-Means clustering algorithm is used to evaluate five clusters for each data set.
4.2 Experimental Results
This section focuses on the performance evaluation of the proposed perturbation approach to preserve privacy in data stream clustering. The key benefit of our work over previous work is that the respective party produces the perturbed data set based on the Delta value. Tests were conducted to assess accuracy and privacy while safeguarding confidential information. We present two different results for investigation.
4.2.1 Performance Analysis: Cluster Membership Matrix (CMM)
In the assessment method, we concentrated on the overall quality of the produced clusters. We examined how closely each perturbed dataset cluster matched its corresponding original dataset cluster. The first step was to compute the matrix of frequencies to find the matching cluster. We refer to this matrix as the Cluster Membership Matrix (CMM), where rows represent the original dataset clusters, columns represent the perturbed dataset clusters, and Freq_{i,j} is the number of points in cluster C_i that fall in cluster C_j' of the perturbed dataset. We compared the results of D and D' with the sliding window size (W, the total number of records handled first in the data stream) set to 3000. When the procedure is completed, the sliding window advances to the next step. We used the K-Means algorithm for clustering and set the value of K to five. In our experiments, we also take into account the user-defined window size (w) and Delta (DV). We set the user-defined window size to 10, 30, and 50. The sensitive attribute value is perturbed according to the Delta value and window size. Figures 2, 3, and 4 show the accuracy obtained through the suggested method of data perturbation. Over three standard datasets with DV ranging from three to ten percent, an average clustering accuracy of 90% has been achieved. From the graphs, we can see that better accuracy is achieved if we minimize the window
Fig. 2 Accuracy gained for cover dataset
Fig. 3 Accuracy gained for bank marketing dataset
Fig. 4 Accuracy gained for ElectricNorm dataset
size. If the window size is larger, accuracy is lower because more instances are involved in modifying the sensitive attribute values. Table 3 shows the K-Means clustering result, that is, how many instances are correctly classified in each cluster and how many instances are misclassified.
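A minimal sketch of the membership matrix and the accuracy derived from it follows. The paper does not spell out how clusters are matched, so this sketch assumes that each original cluster is matched to the perturbed cluster with the largest overlap; the function names and the label lists are hypothetical.

```python
from collections import Counter

def cluster_membership_matrix(orig_labels, pert_labels, k):
    """freq[i][j] = number of points in original cluster i and perturbed cluster j."""
    counts = Counter(zip(orig_labels, pert_labels))
    return [[counts[(i, j)] for j in range(k)] for i in range(k)]

def matched_accuracy(freq):
    """Percentage of points that stay in the best-matching perturbed cluster."""
    total = sum(map(sum, freq))
    matched = sum(max(row) for row in freq)   # assumption: largest-overlap matching
    return 100.0 * matched / total

# Hypothetical cluster assignments for eight points, before and after perturbation.
orig = [0, 0, 0, 1, 1, 2, 2, 2]
pert = [1, 1, 0, 2, 2, 0, 0, 0]
freq = cluster_membership_matrix(orig, pert, k=3)
accuracy = matched_accuracy(freq)

# The accuracy column of Table 3 is consistent with
# classified / (classified + misclassified), e.g. 42,887 / (42,887 + 2113).
table3_balance = round(100.0 * 42887 / (42887 + 2113), 2)
```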
4.2.2 Performance Analysis: Precision and Recall
The Massive Online Analysis (MOA) framework provides two important accuracy measures, namely F1_Precision and F1_Recall. F1_Precision determines the precision of the method by considering the precision of each distinct cluster, and F1_Recall determines the recall of the method. Efficiency and accuracy are determined by precision and recall. The outcomes of the suggested method were calculated using the precision and recall measures provided by MOA. Figures 5, 6, and 7 show the accuracy of the suggested method as line graphs using these two measures. Each graph depicts the results when the original data is processed without the suggested perturbation method and when the data is processed with it. The MOA K-Means algorithm is used. These measures were assessed with the W, w, and DV values of our suggested method. We may infer from the graphs that the precision of the information obtained after applying the perturbation technique is maintained. In both cases, precision and recall have almost equal values before and after data perturbation, demonstrating that the proposed solution offers greater precision with minimal information loss and optimal privacy benefit in data sets.
5 Conclusion
During data stream mining, the sensitive information of individuals or organizations can be put at risk. The suggested work tries to balance data privacy against the mining result. Existing approaches take into account the dataset's features and either alter the values of sensitive attributes or leave them unaltered but anonymous. The proposed approach shows favorable results compared with other suggested methods of data perturbation. It uses the concept of the Delta value, so that an organization can decide how much privacy is needed. We applied K-means clustering to calculate accuracy. Based on the characteristics of the datasets, we gained privacy and accuracy in almost all cases (on average 90% accuracy with 10% information loss after testing three different datasets). The suggested method is flexible and easy to use in the area of PPDSM, like other existing methods. The loss of accuracy caused by data perturbation was measured as the percentage of data stream instances that were misclassified using the CMM. Our experiments are limited to numeric attributes. There are some open research issues related to privacy preservation in data mining: (1) various privacy-preserving techniques have been developed; however, all
Table 3 K-Means cluster result of the perturbed datasets (w = 50, Delta = 10%)
Columns C1 to C5: clusters with classified instances
Dataset         Attribute   C1      C2      C3      C4      C5      Misclassification  Total   Accuracy (%)
Electric norm   NswPrice    6245    8573    6811    10,156  7453    5762               39,238  87.20
Electric norm   NswDemand   5231    8255    5594    8315    5569    12,036             32,964  73.25
Covertype       Slope       13,185  12,589  11,018  9984    10,303  8921               57,079  86.48
Covertype       Elevation   12,883  12,051  11,224  10,222  10,338  9282               56,718  85.94
Covertype       Aspect      12,701  9134    7758    9291    10,420  16,696             49,304  74.70
Bank marketing  Duration    8707    9192    9455    8246    7152    2248               42,752  95.00
Bank marketing  Balance     8731    9163    9596    8281    7116    2113               42,887  95.30
Fig. 5 Accuracy quantity of cover type dataset
Fig. 6 Accuracy quantity of electric Norm dataset
techniques focus on preserving privacy and do not look into the aspect of information loss. (2) Balancing the trade-off between privacy gain and data utility is still a major issue. (3) Because of the features of the data stream model, achieving data privacy in a data stream model is more complex than in a conventional data mining model.
Fig. 7 Accuracy quantity of bank marketing dataset
References
1. S. Wares, J. Isaacs, E. Elyan, Data stream mining: methods and challenges for handling concept drift. SN Appl. Sci. 1, 1412 (2019)
2. G. Krempl, I. Zliobaite, D. Brzezinski, E. Hullermeier, M. Last, V. Lemaire, T. Noack, A. Shaker, S. Sievi, M. Spiliopoulou, J. Stefanowski, Open challenges for data stream mining research, in SIGKDD, vol. 16 (2014)
3. M.B. Malik, M. Asger Ghazi, R. Ali, Privacy preserving data mining techniques: current scenario and future prospects, in IEEE Third International Conference on Computer and Communication Technology (ICCCT), pp. 26–32 (2012)
4. H. Chhinkaniwala, S. Garg, Tuple value based multiplicative data perturbation approach to preserve privacy in data stream mining. Int. J. Data Mining Knowl. Manage. Process (IJDKP) 3(3), 53–61 (2013)
5. L. Zhang, W. Wang, Y. Zhang, Privacy preserving association rule mining: taxonomy, techniques, and metrics. IEEE Access 7, 45032–45047 (2019)
6. U. Yun, J. Kim, A fast perturbation algorithm using tree structure for privacy preserving utility mining. Expert Syst. Appl. 42(3), 1149–1165 (2015)
7. H. Chhinkaniwala, S. Garg, Privacy preserving data mining techniques: challenges & issues, in International Conference on Computer Science & Information Technology (2011)
8. M.A.P. Chamikara, P. Bertok, D. Liu, S. Camtepe, I. Khalil, Efficient data perturbation for privacy preserving and accurate data stream mining. Pervasive Mob. Comput. 48, 1–19 (2018)
9. C.Y. Lin, Y.H. Kao, W.B. Lee, R.C. Chen, An efficient reversible privacy-preserving data mining technology over data streams. Springerplus 5(1), 1407 (2016)
10. D.M. Rodríguez, J. Nin, M. Nuñez-del-Prado, Towards the adaptation of SDC methods to stream mining. Comput. Secur. 70, 702–722 (2017)
11. S.A. Abdelhameed, S.M. Moussa, M.E. Khalifa, Privacy-preserving tabular data publishing: a comprehensive evaluation from web to cloud. Comput. Secur. (2017)
12. N. Abitha, G. Sarada, G. Manikandan, N. Sairam, A cryptographic approach for achieving privacy in data mining, in International Conference on Circuit, Power and Computing Technologies (ICCPCT), pp. 1–5 (2015)
13. S. Upadhyay, C. Sharma, P. Sharma, P. Bharadwaj, K.R. Seeja, Privacy preserving data mining with 3-D rotation transformation. J. King Saud Univ.-Comput. Inf. Sci. (2016)
14. S. Chidambaram, K.G. Srinivasagan, A combined random noise perturbation approach for multi-level privacy preservation in data mining, in International Conference on Recent Trends in Information Technology (IEEE, 2014)
15. J.J.V. Nayahi, V. Kavitha, Privacy and utility preserving data clustering for data anonymization and distribution on Hadoop. Futur. Gener. Comput. Syst. 74, 393–408 (2017)
16. S. Mohana, S.S.A. Mary, Heuristics for privacy preserving data mining: an evaluation, in International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET) (IEEE, 2017), pp. 1–9
17. J. Wang, C. Deng, X. Li, Two privacy-preserving approaches for publishing transactional data streams. IEEE Access 6, 23648–23658 (2018)
18. S.R. Oliveira, O.R. Zaiane, Privacy preserving clustering by data transformation. J. Inf. Data Manage. (2010)
19. Z. Xiaolin, B. Hongjing, Research on privacy preserving classification data mining based on random perturbation, in International Conference on Information, Networking and Automation (ICINA) 1(1), 173–178 (2010)
20. L. Guo, X. Wu, Privacy preserving categorical data analysis with unknown distortion parameters. Trans. Data Privacy (TDP), 185–205 (2009)
21. P. Kamakhi, B.A. Vinayya, Preserving the privacy and sharing the data using classification on perturbed data. Int. J. Comput. Sci. Eng. 2(3), 860–864 (2010)
22. C. Ching-Ming, C. Po-Zung, S. Chu-Hao, Privacy-preserving classification of data streams. Tamkang J. Sci. Eng. 12(3), 321–330 (2009)
23. Z. Junwei, Y. Jing, Z. Jianpei, Y. Yongbin, Kids: K-anonymization data stream base on sliding window, in 2nd International Conference on Future Computer and Communication, vol. 2, no. 2, pp. 311–316 (2010)
24. M. Prakash, G. Singaravel, An approach for prevention of privacy breach and information leakage in sensitive data mining. J. Comput. Electr. Eng., 134–140 (2015)
25. Moa Datasets, http://moa.cs.waikato.ac.nz/Datasets
26. A. Bifet, R. Kirkby, P. Kranen, P. Reutemann, Massive online analysis manual (2011)
27. Covertype Dataset, UCI machine learning repository. https://archive.ics.uci.edu/ml/datasets/covertype
28. Bank Marketing Dataset, UCI machine learning repository. Online: https://archive.ics.uci.edu/ml/datasets/bank+marketing
29. Ms.K. Chitra, V. Prasanna Venkatesan, An antiquity to the contemporary of secret sharing scheme. J. Innov. Image Process. (JIIP) 2(01), 1–13 (2020)
30. D. Sivaganesan, A data driven trust mechanism based on blockchain in IoT sensor networks for detection and mitigation of attacks. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 59–69 (2021)
A Neural Network based Social Distance Detection
Sirisha Alamanda, Malthumkar Sanjana, and Gopasi Sravani
Abstract The global outbreak of rapidly spreading viruses such as COVID-19 has fundamentally changed the way people interact with one another. Avoiding physical contact between people lowers the spread of the virus, so a measure such as social distancing can reduce the reproduction rate (R0) of viruses within communities. This paper proposes a system that automatically estimates the interpersonal distance between people using the MobileNet SSD deep learning algorithm. The proposed model first detects the persons in a video stream and then characterizes them by their distance using different color indications. To make people aware of a social distance violation, the model also provides an alert signal. The proposed system allows quick monitoring of individuals' potentially risky behavior to avoid health risks, especially in the current pandemic situation. It can also be used in surveillance systems to alert the government to harmful situations by analyzing people's movement.
Keywords Social distancing · Pandemic · MobileNet SSD · Person detection · Distance estimation · Surveillance system · Alert signal
S. Alamanda (B) CBIT, Hyderabad, India
M. Sanjana · G. Sravani IT Department, CBIT, Hyderabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_16
1 Introduction
In the fight against the spread of COVID-19, maintaining social distancing [1] has proven to be a very effective and important measure to reduce the spread of the disease. Research has shown that the virus spreads rapidly through physical contact or interaction between infected and non-infected persons. People are therefore asked to maintain social distance and limit their interactions with each other, which in turn reduces the spread of deadly viruses [2] from the people we are in contact with. In the past, artificial intelligence and deep learning applications have shown promising results on a number of daily life problems. In this paper, we explain in detail how computer vision and deep learning models can be used to monitor social distancing at public places and work environments [3]. To ensure social distancing in crowded areas, the proposed detection tool monitors whether people are following a safe distance protocol by analyzing a video stream captured in real time. Social distancing means people physically staying two arm lengths from one another, i.e., at least 6 ft apart, in both indoor and outdoor spaces. Convolutional neural networks and OpenCV help detect objects such as person, car, bird, airplane, fruit, book, background, and many more. After detecting the persons using a pre-trained model, we calculate the distance between two people using the triangle similarity technique and estimate whether the threshold of safe distance is respected. The rest of the paper is organized as follows: Sect. 2 briefly summarizes related work. Section 3 describes the methodology for finding social distance. Section 4 summarizes a series of experimental results, and finally, Sect. 5 presents conclusions.
2 Related Work
To continuously alert people to violations of social distance, an efficient automatic detection system is required. Among the existing systems, there are various types of models that only provide information about social distance violations from a given input image or video. Most of the existing models concentrate on a two-dimensional view of the detected object. Training such models with frontal-view and side-view datasets helps them perform better object detection [4]. Classifying violations into high risk, low risk, and safe is challenging, yet the controller needs this information to take the necessary precautions. Providing this kind of information may decrease the level of contact between people and the crowd density in public areas. Thus, the proposed system focuses on giving an alarm sound whenever people are harmfully close. The proposed model differentiates the violations by using color indications, and a mail notification is sent to the controller when the number of violations exceeds a threshold limit. During a conversation, the geometrical disposition of the people defines the social behavior [5], and the interpersonal distance in this kind of situation depends heavily on the culture. The system proposed in that work concentrates on a methodology that includes scene geometry understanding, people detection, and visual distance characterization. Here, scene geometry understanding means estimating interpersonal distances by identifying the ground plane on which people walk. Such ground plane analysis plays a vital role in the majority of surveillance systems to envision the scene through a bird's eye view [6]. Ground plane detection is used for easy representation and visualization of data statistics. The problem then lies in estimating the homography from given reference elements, which are extracted from the scene or an image.
People detection is effective at estimating people's pose even in complex scenarios. Later, the proposal of VSD characterization helped in scenarios where the threshold of social distance is violated and in many other scenarios, such as counting the number of people gathered or identifying the type of social interaction maintained. This model detects the distance between persons from the given input images and also draws ellipsoids around the detected objects. It failed to provide alerts dynamically for those who violate the threshold distance. Social distancing plays a vital role in reducing the spread of the virus and preventing the pandemic's peak if it is implemented from an early stage [7]. It has been observed how social distancing minimized the burden on healthcare centers and the count of infected people during the COVID situation. Implementing a social distance protocol ensures that the count of infected cases does not exceed the capacity of public healthcare centers and decreases the mortality rate. The work in [8] concentrates on the You Only Look Once (YOLOv3) methodology for monitoring social distance, using the Deep SORT technique to track the detected people, who are represented by bounding boxes. The main aim of this model is multi-scale object detection [4] through adjustment of the network structure. It uses several logistic regression classifiers instead of softmax for object classification. Feature learning is performed with convolutional layers organized in residual blocks, which are made of several skip connections and convolutional layers. This model performs detection at three individual scales. To estimate the social distance between individual people, the Euclidean distance between the centroids of each pair of detected bounding boxes is calculated, and the location parameters of the detected bounding boxes are obtained using non-maximal suppression.
This model is widely used on small-scale objects for the detection of humans, as it improves predictive accuracy. It failed to work on side-view datasets and needs GPU support for more accurate results. Individuals are often unable to estimate a distance of 6 ft (2 m) between one another. An active surveillance system can be useful for estimating the distances between individuals, thereby alerting them and lowering the spread of a highly contagious disease. The work in [9] proposed a real-time social distance detection and alerting system based on four major requirements: (i) wrong data should never be recorded by the system, (ii) the alerting should not target individuals, (iii) no supervisor or controller should exist in the detection loop, and (iv) open-source code should be developed to allow access by everyone. This model was developed to detect objects in real time, based on deep learning and a monocular camera, to measure physical contact between people. The methodology is mostly based on R-CNN, which begins with region proposals and then performs bounding box regression and classification [10]. The Deep SORT technique, together with the YOLOv3 model, was used to detect and track pedestrians for social distance monitoring. The system was tested on various real-time datasets to observe its performance. It provides faster results on trained datasets. It failed to consider weakly detected person objects, as it only focuses on the 2D view of a person from a given input video.
3 Methodology
The main purpose of the proposed system is to provide a solution for detecting and tracking social distance violations. Based on the region of interest (ROI), we preprocess the data and transfer it for learning on the MobileNet SSD model. The MobileNet SSD model is pre-trained on the MS-COCO and Pascal VOC datasets and performs object detection and object tracking [11]. Given the video input, the model draws bounding boxes around the detected objects, taking the x-coordinate as width and the y-coordinate as height. We then compute the distance between two persons using the triangle similarity technique and the Euclidean distance measure and verify whether the given threshold social distance is maintained. If the threshold distance is not maintained, an audio alert signal is generated for the detected social distance violations. In addition, if the number of violations exceeds a given limit, an alert email notification is sent to the controller to control the crowd. The significant modules of the proposed system are shown in Fig. 1. The implementation steps of the proposed system are as follows:
Step 1: The input is given as a video sequence, which is split into training and testing data.
Step 2: The person object is detected and tracked in each frame.
Step 3: The distance between two people is calculated to find the number of social distance violations.
Step 4: A non-intrusive alert is provided whenever violations occur.
Step 5: An email notification is sent to the controller to control the crowd.
The proposed methodology for social distance detection has four main stages.
Fig. 1 Design of the proposed system (flowchart stages: training the model; object detection and object tracking; distance estimation; are there any social distance violations?; non-intrusive alert; is the area crowded?; crowd violations control)
3.1 Person Detection
The proposed system uses the MobileNet SSD architecture to detect persons in the input video. The system was pre-trained for object detection using a fast and efficient deep learning method that combines MobileNet with the SSD framework. Running neural networks that consume considerable power and resources on low-end devices is a practical limitation in real-time applications; to handle it, the SSD framework was integrated with MobileNet using the Caffe framework. Caffe stands for Convolutional Architecture for Fast Feature Embedding; it is a deep learning framework designed with expression, speed, and modularity in mind. When the base network of an SSD is MobileNet, the result is called MobileNet SSD. The SSD is designed to be independent of the base network, so it can run with base networks such as MobileNet, YOLO, and VGG. A MobileNet classifier converted from TensorFlow was trained on the Microsoft COCO dataset and fine-tuned with VOC0712. The resulting convolutional neural network model lets us identify persons based on the class_id. The MobileNet SSD Caffe network can detect 20 types of objects in addition to a background class: airplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, and TV/monitor.
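The selection of persons by class_id can be sketched as below. This is an illustrative sketch, not the authors' code: the function name, the confidence threshold of 0.5, the example rows, and the value 15 for the person class are assumptions (15 is the index of "person" in the label list commonly shipped with the MobileNet SSD Caffe model). With OpenCV's dnn module, rows of this form are typically read from the array returned by the network's forward pass.

```python
PERSON_CLASS_ID = 15  # assumption: index of "person" in the usual MobileNet SSD Caffe label list

def person_boxes(detections, frame_width, frame_height, min_confidence=0.5):
    """Keep person detections and scale their boxes to pixel coordinates.

    Each detection row is assumed to be
    [image_id, class_id, confidence, x_min, y_min, x_max, y_max]
    with the box corners normalized to the range 0..1.
    """
    boxes = []
    for _, class_id, confidence, x0, y0, x1, y1 in detections:
        if int(class_id) != PERSON_CLASS_ID or confidence < min_confidence:
            continue
        boxes.append((
            int(x0 * frame_width), int(y0 * frame_height),
            int(x1 * frame_width), int(y1 * frame_height),
            float(confidence),
        ))
    return boxes

# Hypothetical detector output for one 640 x 480 frame.
rows = [
    [0, 15, 0.91, 0.125, 0.25, 0.375, 0.75],  # person, confident
    [0, 7, 0.88, 0.5, 0.5, 0.875, 0.875],     # another class, ignored
    [0, 15, 0.30, 0.625, 0.25, 0.75, 0.75],   # person, below the threshold
]
boxes = person_boxes(rows, frame_width=640, frame_height=480)
```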
3.2 Person Tracking
As shown in Fig. 2, once the persons are detected in the frames of the video sequence, the model classifies and identifies each object by its class name and draws a bounding box around it. To determine the exact position of a person in the frame, the x- and y-coordinates of the bounding box are used, where x and y represent width and height, respectively. To locate the object and estimate how far it is from the camera, we use the triangle similarity technique, which calculates the distance between camera and object from the focal length of the camera and the coordinates of the detected bounding box. For illustration, assume that a person is at a distance d from the camera (in centimeters) and that the actual height of the person is h (we assume an average human height of 165 cm). Using the object detection module, we take the y-coordinate of the bounding box as the person's pixel height p. From these values, the camera's focal length f can be computed with Eq. 1.

\[ f = \frac{p \times d}{h} \tag{1} \]

To calculate the person's distance d from the camera (i.e., the depth of the person from the camera), we rewrite Eq. 1 as Eq. 2, where h is the person's actual height, p is the person's pixel height, and f is the camera's focal length. This depth d (distance of the person from the camera) is used as the z-coordinate to get a 3D view [12] of the detected object and to avoid false distance estimations.

\[ d = \frac{h \times f}{p} \tag{2} \]

Fig. 2 Workflow diagram of proposed system (flowchart stages: feed an input from webcam/video; detecting persons in the input; localizing the persons in a frame; displaying a bounding box for each detected person; triangle similarity technique is applied; distance between people is calculated; checks if distance between 2 people is less than 2 m; gives alarm on violation; detection continues; checking for the number of social distance violations; if a certain threshold is met, the alert mail is sent to the controller)
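Equations 1 and 2 can be checked with a short worked example. The calibration numbers below (a person 300 cm away appearing 330 px tall) are hypothetical, and the function names are chosen here for illustration only.

```python
AVERAGE_HEIGHT_CM = 165.0  # average human height assumed in the text

def focal_length(pixel_height, known_distance_cm, real_height_cm=AVERAGE_HEIGHT_CM):
    """Eq. 1: f = (p * d) / h, computed from one calibration view."""
    return (pixel_height * known_distance_cm) / real_height_cm

def distance_from_camera(pixel_height, focal, real_height_cm=AVERAGE_HEIGHT_CM):
    """Eq. 2: d = (h * f) / p, the depth of the person from the camera."""
    return (real_height_cm * focal) / pixel_height

# Hypothetical calibration: a person 300 cm away appears 330 px tall.
f = focal_length(pixel_height=330, known_distance_cm=300)
# The same person appearing 165 px tall is then twice as far away.
d = distance_from_camera(pixel_height=165, focal=f)
```

Because p is inversely proportional to d for a fixed h and f, halving the pixel height doubles the estimated depth.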
3.3 Distance Calculation
After estimating the person's depth d from the camera, we have to compute the physical distance between two people. A video may contain any number n of detected people. To consider every detected person in the frame, we take the 3D view of each object and calculate the Euclidean distance between the mid-points of the bounding boxes of all detected people. If the estimated distance between two people is less than 200 cm, a red bounding box is displayed around them, indicating that they are not maintaining social distance. If the distance between two people is in the range of 2–3 m, an orange bounding box is displayed around them, indicating that they are about to violate the safe distance. The distance of the objects from the camera was converted into feet for visualization purposes. The color indications below differentiate the kinds of violations:
Red     Risk
Orange  Unsafe
Green   Safe
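The pairwise check can be sketched as follows. The color bands follow the thresholds stated in the text; the back-projection of pixel mid-points with X = x * d / f is an assumption, since the paper does not spell out how the 3D point is formed, and all names and sample points are hypothetical.

```python
import math
from itertools import combinations

def to_3d(box, depth_cm, focal):
    """Map a bounding box (x0, y0, x1, y1) and its depth to a 3D point in cm.

    Assumption: the pixel mid-point is back-projected with the pinhole
    relation X = x * d / f; the depth itself is the z-coordinate.
    """
    x_mid = (box[0] + box[2]) / 2.0
    y_mid = (box[1] + box[3]) / 2.0
    return (x_mid * depth_cm / focal, y_mid * depth_cm / focal, depth_cm)

def classify(distance_cm):
    """Color bands from the text: below 200 cm red, 200 to 300 cm orange, else green."""
    if distance_cm < 200:
        return "red"
    if distance_cm <= 300:
        return "orange"
    return "green"

def pairwise_status(points):
    """Return {(i, j): (distance_cm, color)} for every pair of people."""
    result = {}
    for (i, a), (j, b) in combinations(enumerate(points), 2):
        dist = math.dist(a, b)  # Euclidean distance between the 3D points
        result[(i, j)] = (dist, classify(dist))
    return result

# Three hypothetical people, coordinates in centimeters.
points = [(0.0, 0.0, 400.0), (150.0, 0.0, 400.0), (400.0, 0.0, 400.0)]
status = pairwise_status(points)
```

Checking every pair costs n(n-1)/2 distance computations per frame, which is small for the crowd sizes visible in a single camera view.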
As described in Fig. 2, the system continuously monitors and checks if the persons are maintaining social distance by checking on each and every frame. If there are no violations, then it moves on to next frame for distance estimation.
3.4 Alert Signaling
The audio alert is provided only when the distance between persons is dangerously close: when social distance is violated, the alarm sounds. If the controller is unable to keep up with monitoring the social distance violations and the number of violations exceeds the limit (a threshold of ≥7), a mail notification is sent to the controller, and a statement in yellow on the output window indicates that the violations are over the limit.
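The alert decision for one frame can be sketched as below. The limit of 7 comes from the text; which violations count toward it is not stated, so counting red and orange pairs together is an assumption, as are the function name and the returned field names. Playing the sound and sending the mail are left to the caller.

```python
VIOLATION_LIMIT = 7  # threshold given in the text (>= 7 violations)

def decide_alerts(pair_colors, limit=VIOLATION_LIMIT):
    """Decide which alerts to raise for one frame.

    pair_colors holds one color label ("red", "orange", "green") per pair
    of detected people. Assumption: red and orange pairs both count
    toward the mail limit, and only red pairs trigger the audio alarm.
    """
    colors = list(pair_colors)
    serious = colors.count("red")
    unsafe = colors.count("orange")
    return {
        "serious": serious,
        "unsafe": unsafe,
        "sound_alarm": serious > 0,
        "send_mail": serious + unsafe >= limit,
    }

quiet_frame = decide_alerts(["green", "green"])
busy_frame = decide_alerts(["red"] * 4 + ["orange"] * 3)
```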
4 Experimental Results
For training the proposed model, we used the COCO dataset [13]. COCO stands for Common Objects in Context. It is a large dataset for image recognition or classification, object segmentation, detection, and captioning. The COCO dataset has 1.5 million object instances, 330K images of which more than 200K are labeled, 80 object categories, 91 stuff categories, 5 captions per image, and 250,000 people with key points. The PASCAL Visual Object Classes (VOC) dataset [14] has a total of 9963 images, each containing a group of objects from 20 different classes, making a collection of 24,640 annotated objects. The 20 object categories include animals, household objects, vehicles, and others: airplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, and TV/monitor. Each image in this dataset has bounding box annotations, pixel-level segmentation annotations, and object class annotations. It is used as a benchmark dataset for tasks such as semantic segmentation, object detection, and classification. The dataset is divided into three subsets: 1464 training images, 1449 validation images, and a private set for testing. After training the model with these existing universal datasets, it was tested on the town center dataset and the pedestrian walking dataset. The Oxford Town Centre [15] dataset is a video with a total of 7500 annotated frames, which is split into a training set of 6500 images and a testing set of 1000 images for pedestrian detection. The data were recorded from a CCTV camera in Oxford for research and development into activity and face recognition.
Fig. 3 Output on visual studio code terminal
After testing the model on these datasets, the system was tested with input taken from a webcam or a camera. The result of the proposed system is a video in which each detected person is highlighted by a bounding box whose color depends on the distance to the others. These boxes are labeled with the depth of the detected person. After executing the code, the video output is displayed showing the differently colored bounding boxes, and sound alarms are given whenever there are social distance violations. As shown in Fig. 3, the results in the Visual Studio Code terminal show the detection percentage of each detected person and the distance in centimeters. The developed model was first tested on universal datasets and then on real-time video data captured with a webcam. Figure 4 shows the results on the social distance violation window when the model is tested on the town center dataset. Figure 5 shows the results when the model is tested on the pedestrian walking dataset. In Figs. 4 and 5, the total number of serious violations is two, which indicates that those two people are at high risk (red bounding box); the total number of unsafe violations is two, which indicates that those two people are at an unsafe distance or at low risk (orange bounding box); and a distance greater than 3 m is considered a safe distance (green bounding box). Real-time testing of the proposed model may include various places such as hospitals, work environments, educational institutions, public areas, pilgrimage centers, shopping malls, transport stations, and many more. Figure 6 shows the results when the proposed model is tested on real-time data in the indoor environment.
Fig. 4 Test results of the proposed system with town center dataset
Fig. 5 Test results of the proposed system on pedestrian walking dataset
In Fig. 6, the proposed system is tested with two people in the home environment. The picture shows a violation of social distance, as they are close to each other. Figure 7 shows the results when the proposed model is tested in the outdoor environment on real-time data. The picture shows the detection of all kinds of social distance violations, with the total count of each kind of violation displayed on the output detection window. The proposed model also sends a mail to the controller when the total number of violations exceeds the given limit. The sent mail contains an alert message, as shown in Fig. 8.
Fig. 6 Test results of the proposed system in indoor environment with real-time data
Fig. 7 Test results of the proposed model in outdoor environment on real-time data
Fig. 8 Email notification sent to the controller as an alert message
The proposed system uses the OpenCV library to detect persons by loading the deep learning detection model. The cv2 rectangle function draws the bounding boxes for the detected persons. The NumPy and math libraries of Python are used to calculate and locate the positions of the persons in the video. The playsound function gives a non-intrusive sound alert on each social distance violation, and the mailer function sends an email notification to the monitoring person when the number of social distance violations crosses the given limit. To save the output onto the local system, we used the videocapture function; as shown in Fig. 3, the number of frames per second is also displayed together with the elapsed time.
5 Conclusion
Social distancing is the proven best practice to stop the spread of the infectious coronavirus disease and other viral diseases. By measuring the crowd density in a region of interest (ROI), we can reduce social distance violations. To help regulate the inflow of people, we used a deep learning model, MobileNet SSD, to automatically detect breaches of social distancing and, without targeting individuals, generate a non-intrusive audio warning signal. Integrating the proposed system into video surveillance systems can contribute to reducing the peak of virus attacks in public areas and health care centers through person detection and tracking. Each person is detected in real time and marked with a bounding box. The distance between people is calculated, and the different kinds of social distance violations are shown using different colors. The system works effectively and efficiently in identifying the social distance between people and generating an alert that can be handled and monitored. The proposed system detects potentially dangerous situations in public environments but cannot recognize and avoid false alarms for groups such as a family with children or relatives, or an elder with their caretakers. In the future, the proposed model can be extended with a bird's eye view for more accurate distance estimation between objects. Various object detection and tracking approaches can be used to detect people for social distancing violations. The work can also be extended to different indoor and outdoor environments.
References
1. M. Greenstone, V. Nigam, Does social distancing matter? University of Chicago, Becker Friedman Institute for Economics. Working Paper, no. 2020-26 (2020)
2. C. Courtemanche, J. Garuccio, A. Le, J. Pinkston, A. Yelowitz, Strong social distancing measures in the United States reduced the covid-19 growth rate: study evaluates the impact of social distancing measures on the growth rate of confirmed covid-19 cases across the United States. Health Affairs 10–1377 (2020)
3. Z. Zou, Z. Shi, Y. Guo, J. Ye, Object detection in 20 years: a survey (2019)
4. P. Dendorfer, H. Rezatofighi, A. Milan, J. Shi, D. Cremers, I. Reid, S. Roth, K. Schindler, L. Leal-Taixe, CVPR19 tracking and detection challenge: how crowded can it get? (2019). http://arxiv.org/abs/1906.04567
5. M. Cristani, A. Del Bue, V. Murino, F. Setti, F. Vinciarelli, The Visual Social Distancing Problem (IEEE, 2020)
6. C.T. Nguyen, Y.M. Saputra, N. Van Huynh, N.-T. Nguyen, T.V. Khoa, B.M. Tuan, D.N. Nguyen, D.T. Hoang, T.X. Vu, E. Dutkiewicz et al., Enabling and emerging technologies for social distancing: a comprehensive survey (2020)
7. I. Ahmed, M. Ahmad, J.J.P.C. Rodrigues, G. Jeon, S. Din, A deep learning-based social distance monitoring framework for COVID-19. Sciencedirect (2020)
8. N.S. Punn, S.K. Sonbhadra, S. Agarwal, Monitoring covid-19 social distancing with person detection and tracking via fine-tuned yolo v3 and deepsort techniques (2020)
9. D. Yang, E. Yurtsever, V. Renganathan, K.A. Redmill, U.O. Un, Vision-based social distancing and critical density detection system for COVID-19. ResearchGate (2020)
10. J. Dai, Y. Li, K. He, J. Sun, R-fcn: object detection via region-based fully convolutional networks, in Advances in Neural Information Processing Systems (2016), pp. 379–387
11. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, A.C. Berg, Ssd: single shot multibox detector, in European Conference on Computer Vision (Springer, Berlin, 2016), pp. 21–37
12. F. Moreno-Noguer, 3D human pose estimation from a single image via distance matrix regression, in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017), pp. 2823–2832
13. MS-COCO, https://cocodataset.org
14. Pascal VOC Dataset, https://paperswithcode.com/dataset/pascal-voc
15. Towncentre Dataset, https://www.kaggle.com/ashayajbani/oxford-town-centre/activity
An Automated Attendance System Through Multiple Face Detection and Recognition Methods

K. Meena, J. N. Swaminathan, T. Rajendiran, S. Sureshkumar, and N. Mohamed Imtiaz
Abstract In this paper, a face detection-based attendance system is proposed, and different recognition algorithms are compared. A Haar cascade is used for face detection, and face recognition is implemented with LBPH, Fisher faces and Eigenfaces. Real-time images are captured with a camera and stored in the dataset. During prediction, the person's image is recognized, the database is updated with that person's id, and the change is reflected in the Excel sheet as well. An SMS is also sent to the number provided by the student. The three methods are compared, and the best one is chosen. This paper also explains why OpenCV is preferred over MATLAB.

Keywords Face recognition · OpenCV · Eigenfaces · Fisher face · Local binary pattern histogram (LBPH) · Haar cascade classifier
K. Meena (B)
Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, India

J. N. Swaminathan · S. Sureshkumar
QIS College of Engineering and Technology, Ongole, India

T. Rajendiran
Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India

N. Mohamed Imtiaz
GRT Institute of Engineering and Technology, Tiruttani, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_17

1 Introduction

Face recognition is a biometric method that identifies a person by analyzing patterns based on the student's facial contours [1]. Among biometric methods, it is preferred here because it is more accurate and less time consuming than the traditional paper-based system. The methods used are Eigenfaces, Fisher faces and LBPH. The comparison between OpenCV and MATLAB is shown in Table 1.
Table 1 Performance comparison of OpenCV versus MATLAB

| Conditions | OpenCV | MATLAB |
|---|---|---|
| Depiction | OpenCV is an open-source library for computer vision, originally developed by Intel. The library runs on multiple platforms and operating environments. It is free to use under the open-source BSD license | MATLAB is a multi-paradigm numerical computing environment developed by MathWorks. MATLAB performs matrix manipulation, algorithm implementation, plotting of data and functions, and user interface creation. MATLAB programs are written in the MATLAB language, which is a matrix-based language |
| Processing speed | Executes fast; processes 30 frames in a second [2] | Slower; examines 3–4 frames in a second [3] |
| Operating system | Works well with various operating systems such as Windows, Linux, macOS and Android | Works well with Windows, Linux and macOS. MATLAB can be called directly from Java, Perl, ActiveX and .NET |
| Lab cost | Free, as it comes under the BSD license | Has to be purchased, except for some trial versions |
1.1 Attendance System

Traditionally, a paper-based attendance system is used, i.e., attendance is noted on a sheet of paper that lists the students' names and ids [4]. Proxy attendance is more likely in this traditional paper-based system. Later, retinal scans and thumb-impression biometrics were therefore used for attendance. Because retinal-scan and thumb-impression biometrics are time consuming and error prone, facial recognition is used instead [5]. With the latest advancements in facial recognition, a student's face image is taken as input through a camera, and attendance is updated automatically [6, 7].
1.2 Literature Survey

There is a lot of ongoing research on face recognition, and it has been implemented successfully [8, 9]. Face recognition models have been examined as feature based, appearance based, knowledge based and template matching. The face recognition methods used here are categorized into holistic techniques and feature-based techniques. Dimensionality reduction of the data is attained through principal component analysis (PCA). The local binary pattern histogram (LBPH) is calculated from the corresponding neighborhood pixels: each pixel is compared with its neighbors, and histograms are then calculated from the resulting values and used for prediction of a face.
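To make the neighborhood comparison concrete, the following is a minimal sketch of the basic 3×3 local binary pattern operator. It is an illustration rather than the authors' code: the function name `lbp_code`, the clockwise bit order starting at the top-left pixel, and the sample patch are all assumptions made for this example.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code for a 3x3 grayscale patch (a list of 3 rows)."""
    center = patch[1][1]
    # Neighbours are visited clockwise, starting at the top-left pixel.
    # The starting point and direction are a convention chosen for this sketch.
    neighbours = [patch[0][0], patch[0][1], patch[0][2],
                  patch[1][2], patch[2][2], patch[2][1],
                  patch[2][0], patch[1][0]]
    code = 0
    for value in neighbours:
        # Each neighbour contributes one bit: 1 if it is >= the centre pixel.
        code = (code << 1) | (1 if value >= center else 0)
    return code


# Hypothetical patch: the centre pixel has intensity 100.
patch = [[90, 120, 60],
         [70, 100, 130],
         [110, 100, 40]]
print(lbp_code(patch))  # bits 01010110
```

Applying this operator to every pixel of a face image yields one code per pixel; counting how often each code occurs gives the histogram that is used for prediction.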
2 Proposed System

This system is divided into three modules.
2.1 Dataset Preparation

Clicking the add student button in the GUI of the attendance system opens a window that captures 300 images of the individual student along with his or her user id. All the images are stored under the dataset folder, and the user id and name of the student are stored in the database. This approach uses the Viola-Jones algorithm for face detection [10, 11].
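One possible way to organize the captured images is sketched below. The file naming scheme `user.<id>.<sample>.jpg` and the helper names are hypothetical; the paper only states that 300 images per student are stored under the dataset folder together with the user id.

```python
from pathlib import Path

SAMPLES_PER_STUDENT = 300  # figure taken from the text


def sample_paths(dataset_dir, user_id, count=SAMPLES_PER_STUDENT):
    """Return the file paths for one student's captured face images."""
    root = Path(dataset_dir)
    # Hypothetical naming scheme: user.<id>.<sample number>.jpg
    return [root / f"user.{user_id}.{n}.jpg" for n in range(1, count + 1)]


def user_id_from_path(path):
    """Recover the user id encoded in a sample file name."""
    return int(Path(path).name.split(".")[1])


paths = sample_paths("dataset", 17)
print(len(paths), paths[0].name, user_id_from_path(paths[-1]))
```

Encoding the user id in the file name lets the training phase rebuild the label of every image from the folder contents alone.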
2.2 Training Phase

Any one of the face recognizers, namely the Eigenface recognizer, the Fisher face recognizer or the LBPH algorithm [12], is used to create a trained model for the database, which is used in further predictions.
2.3 Recognition Phase

Clicking the detection button in the GUI of the attendance system opens a window that takes the image of a student. The trained model predicts the student's name and user id and updates the attendance in the Excel sheet. The LBPH algorithm is more accurate than the other two and is well suited for real-time data processing [13, 14].
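The attendance update that follows a successful prediction can be sketched as below. This is an assumption-laden illustration: it writes CSV text (which spreadsheet programs can open) instead of a native Excel file, and the column names and the rule of one entry per student per day are not taken from the paper.

```python
import csv
import io
from datetime import date

FIELDS = ["user_id", "name", "date", "status"]  # hypothetical column layout


def mark_present(rows, user_id, name, day):
    """Add one attendance row unless the student is already marked for that day."""
    key = (str(user_id), day.isoformat())
    if any((row["user_id"], row["date"]) == key for row in rows):
        return False  # already recorded; avoid duplicate entries
    rows.append({"user_id": str(user_id), "name": name,
                 "date": day.isoformat(), "status": "Present"})
    return True


def to_csv(rows):
    """Serialise the attendance rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


attendance = []
mark_present(attendance, 17, "Student A", date(2021, 1, 4))
mark_present(attendance, 17, "Student A", date(2021, 1, 4))  # ignored duplicate
print(to_csv(attendance))
```

Because a live video stream recognizes the same face in many consecutive frames, some form of duplicate check like the one above is needed so that each student is recorded only once.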
2.4 Methodology

Image acquisition captures images with the camera and stores them in the database. Face detection isolates the face alone and leaves out the surrounding environment; this is done using a Haar cascade. Features of the face, including the eyes, nose and cheeks, are extracted and compared against the rest of the images in the database, and the face is recognized. In the training phase, the images of a student are captured and stored in the database. In the test phase, a live video stream is processed, and the region of interest (RoI) of each face is cropped using the Haar cascade, which plots boxes exactly around the faces. Feature extraction is done using the local binary pattern (LBP). After the features are identified, the faces are classified as identified or unidentified. Identified faces get the name on the box, and unidentified faces are shown as unknown.
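The RoI cropping step can be illustrated without any imaging library. The sketch below assumes that the detector reports each face as a box `(x, y, w, h)` (top-left corner, width, height) and that a grayscale frame is stored as a list of pixel rows; both are assumptions for this example, and `crop_roi` is a hypothetical helper.

```python
def crop_roi(image, box):
    """Crop a face region from a grayscale image stored as a list of rows.

    box is (x, y, w, h): top-left corner plus width and height.
    """
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


# A 4x5 toy frame; each pixel value encodes its position as 10 * row + column.
frame = [[10 * r + c for c in range(5)] for r in range(4)]
face = crop_roi(frame, (1, 1, 3, 2))
print(face)
```

Only the cropped region is passed on to feature extraction, so the background does not influence the recognizer.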
2.5 Implementation of Algorithms

2.5.1 Eigen Face Algorithm

This mechanism works by identifying and extracting discrete features from the face, including the eyes, cheeks and nose, and their uniqueness. Dimensionality reduction based on principal component analysis (PCA) is applied to extract the appropriate features [15].

Algorithm

1. Apply PCA and derive the principal components of the test image.
2. Compare the selected principal components of the image against the training images stored in the database.
3. Use machine learning techniques to identify the best-matching image [16].
4. Retrieve the appropriate student details from the database (Fig. 1).
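The steps above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the implementation used in the paper: it keeps only the first principal component (found by power iteration), treats each "image" as a vector of four pixels, and uses nearest-neighbor matching as the "machine learning technique". All names and sample values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    """Column-wise mean of equally long vectors (the 'mean face')."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def top_component(centered, iterations=100):
    """Approximate the first principal component by power iteration."""
    v = list(centered[0])  # start vector; assumes the first sample is not all zeros
    for _ in range(iterations):
        # Multiply v by the scatter matrix X^T X without building the matrix.
        w = [0.0] * len(v)
        for x in centered:
            weight = dot(x, v)
            w = [wj + weight * xj for wj, xj in zip(w, x)]
        norm = dot(w, w) ** 0.5
        v = [wj / norm for wj in w]
    return v


def train(faces):
    """faces: list of (label, pixel_vector). Returns (mean, component, projections)."""
    vectors = [pixels for _, pixels in faces]
    mean = mean_vector(vectors)
    centered = [[p - m for p, m in zip(pixels, mean)] for pixels in vectors]
    component = top_component(centered)
    projections = [(label, dot(x, component))
                   for (label, _), x in zip(faces, centered)]
    return mean, component, projections


def predict(model, pixels):
    """Project a test face and return (label, distance) of the nearest training face."""
    mean, component, projections = model
    value = dot([p - m for p, m in zip(pixels, mean)], component)
    label, nearest = min(projections, key=lambda item: abs(item[1] - value))
    return label, abs(nearest - value)


# Toy training set: two hypothetical students, two 4-pixel "images" each.
faces = [("A", [200, 190, 60, 50]), ("A", [210, 200, 70, 60]),
         ("B", [60, 50, 200, 190]), ("B", [70, 60, 210, 200])]
model = train(faces)
print(predict(model, [205, 195, 65, 55])[0])
```

A practical recognizer would keep several components and work on full-size images; the returned distance plays the role of the confidence value discussed in the experimental results.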
2.5.2 Fisher Face Algorithm

The Fisher face method seeks a class-specific transformation matrix, which is why it is less sensitive to illumination than the eigenface method. Linear discriminant analysis (LDA) learns the features of the face that discriminate among persons. Fisher faces depend heavily on the given input data. The whole concept rests on a simple idea: in the lower-dimensional representation, samples of the same class should form tight clusters, and different classes should be far away from each other.

Algorithm

1. Apply Euclidean distance over all face images.
2. Calculate the average.
3. Subtract the values of step 2 from step 1.
4. Construct two scatter matrices (between classes and within classes).
5. Compute the matrix W that maximizes the ratio of between-class scatter to within-class scatter.
6. The columns of W represent the eigenvectors.
7. Project the faces into the LDA space [16].
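For intuition, the scatter-matrix steps can be worked through in the simplest possible setting: two classes described by two features each. In that special case the matrix W reduces to a single direction, w = Sw⁻¹(m_A − m_B), where Sw is the within-class scatter matrix and m_A, m_B are the class means. The sketch below is only an illustration under these assumptions; the general multi-class method solves an eigenvalue problem instead, and the data points are invented.

```python
def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def within_scatter(classes):
    """Sum over classes of (x - class_mean)(x - class_mean)^T, as a 2x2 matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for points in classes:
        m = mean(points)
        for x in points:
            d = [x[0] - m[0], x[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Two-class Fisher direction w = Sw^-1 (mean_a - mean_b)."""
    sw = within_scatter([class_a, class_b])
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    ma, mb = mean(class_a), mean(class_b)
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def classify(point, class_a, class_b):
    """Project onto w and pick the class whose projected mean is closer."""
    w = fisher_direction(class_a, class_b)
    project = lambda p: w[0] * p[0] + w[1] * p[1]
    value = project(point)
    pa, pb = project(mean(class_a)), project(mean(class_b))
    return "A" if abs(value - pa) <= abs(value - pb) else "B"


# Invented feature vectors for two hypothetical students.
class_a = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
class_b = [[6.0, 5.0], [7.0, 7.0], [8.0, 6.0]]
print(fisher_direction(class_a, class_b))
print(classify([2.5, 2.5], class_a, class_b))
```

After projection onto w, the samples of each class lie close together while the two class means are far apart, which is the clustering idea described above.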
Fig. 1 Face recognition system
2.5.3 LBPH Algorithm

LBPH is calculated from the corresponding neighborhood pixels. Histograms are calculated from the resulting values and are used for the prediction of faces. These histograms are called LBPH [17].

Algorithm

1. Input the testing image to the recognizer.
2. The recognizer creates a new histogram for every new input image.
3. Compare the histograms generated during training with the histogram produced from the testing image.
4. Finally, the algorithm identifies the closest match and returns the student name associated with that match [18–20].
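The histogram comparison in steps 2 to 4 can be sketched as follows. The chi-square distance is one common choice for comparing LBP histograms and is used here as an assumption, since the paper does not name the distance measure. The student labels, the four-bin histograms and the helper names are invented for brevity; real LBP codes need 256 bins.

```python
def histogram(codes, bins=256):
    """Normalised histogram of LBP codes (integers in range(bins))."""
    counts = [0] * bins
    for code in codes:
        counts[code] += 1
    return [count / len(codes) for count in counts]


def chi_square(h1, h2):
    """Chi-square distance between two histograms; 0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def best_match(trained, test_hist):
    """trained: dict of student label -> histogram. Returns (label, distance)."""
    label = min(trained, key=lambda name: chi_square(trained[name], test_hist))
    return label, chi_square(trained[label], test_hist)


# Hypothetical 4-bin example.
trained = {"student_1": histogram([0, 0, 1, 3], bins=4),
           "student_2": histogram([2, 2, 2, 3], bins=4)}
test_hist = histogram([0, 1, 1, 3], bins=4)
print(best_match(trained, test_hist))
```

A fuller implementation would typically split the face into a grid of cells, compute one histogram per cell and concatenate them, so that the spatial layout of the texture is preserved.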
3 Experimental Results

3.1 Database Update

The database is updated with the names and ids of the students who are present in the class (Fig. 2).
Fig. 2 Database creation from real-time face images
3.2 Graphical User Interface

The graphical user interface is created using Tkinter, Python's standard GUI toolkit, together with the OpenCV library. Labels and buttons are used to trigger actions (Fig. 3).
Fig. 3 GUI creation
3.3 Performance Evaluation of Various Methods for Face Detection

Figures 4, 5 and 6 show the results obtained for the Eigen face, Fisher face and LBPH methods when the component and confidence parameters are varied. Table 2 compares the efficiency and accuracy of the algorithms used.
Fig. 4 Performance of Eigen face algorithm
Fig. 5 Performance of Fisher face algorithm
Fig. 6 Performance of LBPH algorithm
Table 2 Performance evaluation of various classifiers

| Criteria | Eigen face | Fisher face | LBPH |
|---|---|---|---|
| Confidence factor (based on output) | 2000–3000 | 100–400 | 2–5 |
| Threshold | 4000 | 400 | 7 |
| Principle of dataset generation | Component-based | Component-based | Pixel based |
| Basic principle | PCA | LDA | Histogram |
| Background noise | Maximum | Medium | Minimum |
| Efficiency | Low | Higher than Eigenface | Highest |
| Accuracy (%) | 80 | 88 | 98 |
From Table 2, it is observed that LBPH performs better than the other algorithms considered in this experiment. LBPH is a texture-based model that captures the micro details present in the image. Hence, it provides better efficiency and accuracy.
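The thresholds in Table 2 suggest a simple decision rule for labeling a face as identified or unknown. The sketch below assumes that the confidence value behaves as a distance, so that a prediction is accepted when the value does not exceed the threshold; this reading is inferred from the table (all reported confidence ranges lie below their thresholds) and is not stated explicitly in the paper. The function and dictionary names are hypothetical.

```python
# Thresholds copied from Table 2; the score is treated as a distance (lower is better).
THRESHOLDS = {"eigenface": 4000, "fisherface": 400, "lbph": 7}


def label_for(method, predicted_name, confidence):
    """Return the name to draw on the face box, or 'unknown' above the threshold."""
    if confidence <= THRESHOLDS[method]:
        return predicted_name
    return "unknown"


print(label_for("lbph", "student_1", 4.2))
print(label_for("lbph", "student_1", 9.0))
```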
4 Conclusion

This paper concentrates on finding an efficient algorithm for face recognition in student attendance. As shown by the comparison, LBPH is the best algorithm for real-time usage. The resulting system improves on the traditional paper-based attendance system, and OpenCV works more efficiently than MATLAB. LBPH has the lowest confidence value, around 40, which makes it more accurate (the confidence value acts as a distance, so lower is better). LBPH is also less affected by noise than Eigenfaces and Fisher faces.
References

1. H.L. Dhanush Gowda, K. Vishal, B.R. Keertiraj, N.K. Dubey, M.R. Pooja, Face recognition based attendance system. Int. J. Eng. Res. Technol. (IJERT) 9(06) (2020)
2. karanjthakkar.wordpress.com/2012/11/21/what-is-opencv-opencv-vs-matlab/
3. blog.fixational.com/post/19177752599/open%20cv-vs-matlab
4. Md. Shakil, R.N. Nandi, Attendance management system for industrial worker using fingerprint scanner. Glob. J. Comput. Sci. Technol. Graph. Vis. https://www.semanticscholar.org/paper/Attendance-Management-System-for-Industrial-Worker-Shakil-Nandi/036043eb875eb6abf0859f48d6e1c8d42c95454d#paper-header
5. N. Kar, M.K. Debbarma, A. Saha, D.R. Pal, Study of implementing automated attendance system using face recognition technique. Int. J. Comput. Commun. Eng. 1(2) (2012)
6. M. Shirodkar, V. Sinha, U. Jain, B. Nemade, Automated attendance management system using face recognition, in International Conference and Workshop on Emerging Trends in Technology (2018)
7. Y. Kortli, M. Jridi, A. Al Falou, M. Atri, Face recognition system: a survey. Sensors 20(342) (2020)
8. A. Rekha, H. Chethan, Automated attendance system using face recognition through video surveillance. Int. J. Technol. Res. Eng. 1(11) (2014)
9. M.-H. Yang, D. Kriegman, N. Ahuja, Detecting faces in images: a survey. IEEE Trans. PAMI 24, 34–58 (2002)
10. https://www2.units.it/carrato/didatt/EI_web/slides/ti/72_ViolaJones.pdf
11. A. Patil, M. Shukla, Implementation of classroom attendance system based on face recognition in class. Int. J. Adv. Eng. Technol. (2014)
12. N.K. Balcoh, M. Haroon Yousaf, W. Ahmad, M. Iraam Balg, Algorithm for efficient attendance management: face recognition based approach. IJCSI Int. J. Comput. Sci. Issues 9(4) (2012)
13. J. Zeng, X. Qiu, S. Shi, Image processing effects on the deep face recognition system. Math. Biosci. Eng. 18(2), 1187–1200 (2021)
14. P. Tupe, Waghmare, R.R. Joshi, Bibliometric survey on multipurpose face recognition system using deep learning. Libr. Philos. Pract. (e-journal) 1–24 (2021)
15. M. Shirodkar, V. Sinha, U. Jain, B. Nemade, Automated attendance management system using face recognition, in International Conference and Workshop on Emerging Trends in Technology
16. https://www.superdatascience.com/opencvface-recognition/
17. docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html (2019)
18. S. Singh, A. Kaur, Taqdir, A face recognition technique using local binary pattern method. Int. J. Adv. Res. Comput. Commun. Eng. 4(3) (2015)
19. T. Ahonen, A. Hadid, M. Pietikäinen, Face description with local binary patterns: application to face recognition. Draft (2006)
20. J. Joseph, K.P. Zacharia, Automatic attendance management system using face recognition. Int. J. Sci. Res. (IJSR). https://ieeexplore.ieee.org/document/7916753
Building a Robust Distributed Voting System Using Consortium Blockchain

Rohit Rajesh Chougule, Swapnil Shrikant Kesur, Atharwa Ajay Adawadkar, and Nandinee L. Mudegol
Abstract The traditional ballot system and the electronic voting system have been used all over the world for a while now, but they have numerous flaws. This paper puts forth those flaws, viz. voter verification, security, transparency, mobility, flexibility and operational cost. An overview of blockchain follows, unraveling its potential to overcome these flaws. The entire election process, from voter and contestant registration to the calculation of the final result, is depicted both technically and functionally on the basis of a consortium blockchain model. Potential challenges involved in building such a robust and fault-tolerant software system, and their solutions, are discussed. The motive of this paper is to suggest why a decentralized voting system built with a novel approach that leverages blockchain is significantly better than current systems, and how to build one, by elucidating the architecture and pseudocode for the same.

Keywords Election · Consensus · Consortium blockchain · Distributed systems · Byzantine fault tolerance · Cryptographic hashing
R. R. Chougule (B) · S. S. Kesur · A. A. Adawadkar · N. L. Mudegol
Department of Computer Science and Engineering, Walchand College of Engineering, Sangli, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_18

1 Introduction

Elections that are held publicly are the crux of any democracy. India is the largest democracy in the world [1]. Therefore, it is essential for the government body to hold free and fair elections to elect the candidates. There are multiple types of voting systems in various countries around the world; electronic voting and ballot-based voting are some of the common methods. However, these methods have multiple flaws that raise serious concerns. One of the major concerns is voter confidence: the voter never knows whether his or her vote has actually been counted and has not been tampered with. The level of transparency provided by current methods is not reliable, and arguably the government body can tamper with the election process [2]. Electronic Voting Machines (EVMs) were introduced in the twentieth century, and the potential risk of using them has increased marginally since then.

Another limitation of the current voting system is that the citizen has to be present in the district or state where the election is being contested, i.e., in his or her electoral constituency, because the booths are situated in those regions. This is inconvenient for people who currently live elsewhere, for example in another city for education or work. A person also has to vote within a single day. These can be considered some of the factors behind the fluctuating voter turnout. Besides this, there are additional concerns regarding voter verification, security, cost, disadvantages of e-voting, manpower, etc., which are discussed in detail in the “Defining the Problem Statement” section.

In this paper, we propose a system that can solve these problems by leveraging blockchain technology (consortium blockchain) to provide a voting system that is more secure, transparent, economical and convenient than existing ones. The system also elucidates a novel approach to implementing a blockchain that can address flaws of traditional blockchains such as the 51% attack, which are explained in detail in further sections.

Blockchain is thought of as one of the most promising emerging technologies, with a huge potential to revolutionise the way we store data for applications and to transform the way businesses run today [3]. At its core, a blockchain is a collection of records that are timestamped, consolidated in blocks and stored over a distributed system that forms an interconnected network. Unlike traditional databases, which are centralized, a blockchain is completely decentralized, meaning that no single authority, entity or organization has full control over the data in the way a database administrator (DBA) has in a traditional database.
A block consists of the actual records or transactions and the hash values of the current and previous blocks, which is what makes it a chain of multiple blocks. A simple analogy is a “linked list”: a linear collection of objects, each of which holds the address of the next node. A blockchain is secured using cryptographic hash algorithms and digital signatures: each record is signed with a private key and can be verified with the corresponding public key. This makes each record of the block immutable. The ability to verify the entire blockchain is achieved using consensus algorithms such as Proof-of-work (PoW), which is used by Bitcoin, Proof-of-stake (PoS), Byzantine fault tolerance (BFT) and more.

There are various use cases that can leverage blockchain technology: healthcare, supply chain, finance, cryptocurrencies, data management, education, critical infrastructures and asset management, to name a few [4]. But the application of blockchain is not limited to the currently discovered use cases; it can be used to build any system that records valuable data. Voting is also a transaction between the voter and the candidate contesting the election: a voter has something of value, here the vote, which he or she has to transfer to a contestant. Thus, we posit that a voting system is an apt use case for blockchain technology, which can potentially revolutionize the way we vote today. Another major advantage of choosing a technological approach to the problems of the currently offline voting system is that it addresses voter verification: we can use advanced biometric authentication techniques to verify a voter in the booths instead of relying only on voter identification cards. This will also eliminate any potential worst-case scenario of the same person voting more than once, in the same way that Bitcoin solves the problem of double-spending [5].

The rest of the paper is organized as follows: Sect. 2 briefly reviews current methodologies, Sect. 3 defines the problem statement, Sect. 4 delineates the proposed solution, and Sect. 6 is the conclusion.
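The block structure described in the Introduction (records plus the hash values of the current and previous blocks) can be illustrated with a minimal hash chain. This is a sketch under stated assumptions and not the authors' design: the field names, the JSON encoding, the use of SHA-256 and the all-zero hash for the first block are choices made for this example, and signatures and consensus are left out entirely.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, timestamp, records, previous_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps({"index": index, "timestamp": timestamp,
                          "records": records, "previous_hash": previous_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, timestamp, records):
    """Append a block that stores the hash of its predecessor."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {"index": index, "timestamp": timestamp, "records": records,
             "previous_hash": previous_hash,
             "hash": block_hash(index, timestamp, records, previous_hash)}
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i else GENESIS_HASH
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["index"], block["timestamp"],
                                       block["records"], block["previous_hash"]):
            return False
    return True


chain = []
add_block(chain, "2021-05-01T08:00:00", ["vote-001", "vote-002"])
add_block(chain, "2021-05-01T08:05:00", ["vote-003"])
print(is_valid(chain))
```

Changing any record in an earlier block changes that block's hash, which no longer matches the value stored in its successor; this is why tampering is detectable by every node that holds a copy of the chain.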
2 Current Methodologies Briefing

Countries across the globe carry out their election processes in many different ways, each suited to the geo-political scenario of the nation.
2.1 Paper Ballot

This is the oldest voting system, and it has proved effective; it is still used by many first world countries like the United States of America [6]. It is a simple process in which voters visit preallocated booths to mark their vote on a paper ballot, which is then counted manually. Though effective, it is a long and expensive process. Many countries still stick to it because there is less chance of the election process being hijacked than with other methods like EVMs, as it is free from any kind of hardware or software vulnerability.

Drawbacks: This method stores data (votes) physically without any backup. Mishaps may lead to data loss (vote loss), which is not acceptable for a voting system. It depends heavily on manual auditing, and hence human-induced errors may occur while counting votes.
2.2 E-voting

The Electronic Voting Machine (EVM) is a popular way of conducting elections in populous countries, as a paper ballot is exhausting on such a large scale and also increases the chance of manual error. India is the largest democracy to adopt this method. However, the privacy, integrity and transparency of e-voting systems have been such a big concern that some countries, like Germany and the Netherlands, have discontinued their use [7]. The process is mostly similar to paper ballots, but instead of on paper, votes are cast on an electronic device, which also makes it considerably faster to produce results than a paper ballot [8].

Drawback: As the EVM is an electronic hardware device that is shipped around to record votes, it is possible that it malfunctions and leads to vote tampering.
2.3 I-voting

I-voting evolved from e-voting and is also called “remote e-voting”. In this method, voters send in their votes remotely via a Web platform, email, fax and so on, which eliminates booth setup and significantly reduces the human effort required in the election process.

Drawbacks: It is found to be more vulnerable to cyber attacks, including distributed denial of service (DDoS) attacks, fake voters and voter spoofing. The data is usually stored in a centralized manner, and hence any compromise of the system might lead to data loss or data manipulation.

The trend of voting on the Internet was started by the US in 2000, but Estonia was the first country to introduce permanent national Internet voting. Fourteen more countries followed the US until 2013. Out of these 14 countries, only 10 plan to reconsider this system for future elections. The main aim of the pioneers of e-voting was to make sure that all votes are taken into consideration even when individuals cannot be physically present at the voting booth [9].
3 Defining the Problem Statement

Different types of elections were briefly discussed in the previous section. They possess some common and some distinct flaws. Envision a soldier abroad or a sailor on a nuclear submarine. Both are serving their country, yet their ability to cast a vote is limited by the logistics of obtaining an absentee ballot and returning it in time to be counted. This was one of the major flaws faced in earlier versions of voting over the Internet. All the aforementioned attempts at e-voting, specifically voting over the Internet, shared a common issue of security and transparency. Some claim that even Electronic Voting Machines (EVMs) can be hacked although they are not connected to the Internet. Likewise, the centralization of the I-voting framework used in Estonia makes it vulnerable to DDoS attacks, which could make the elections inaccessible to voters. There was also an issue of transparency, as faced by the US in the elections of 2000, where overseas voters living in different parts of the world were not sure whether their vote was really included in the actual count. Collectively, we categorize all the issues into six main points: voter verification, mobility, security, transparency, flexibility and operational cost.
3.1 Voter Verification

Currently, voter verification is entirely manual and involves human intervention at every step. Wherever there is human intervention, fairness is not guaranteed, as it depends on that person. There can also be a potential caveat with the voter: although very unlikely, a voter using counterfeit documents is not impossible. This process of humans manually verifying the voter on the basis of documents provided by the voter has been in place for decades. Meanwhile, there has been enormous advancement in technology, including in cybersecurity. Many state-of-the-art user authentication systems have been developed, including biometrics such as fingerprint, iris, palm and face [10], and multi-factor authentication systems that can combine regular passwords, captchas, tokens and one-time passwords (OTP) with protocols like OAuth (an open standard for access delegation). Such sophisticated authentication mechanisms are used for simple or even nonessential services like social media, but ironically not for voter authentication, which is crucial for a democratic nation.
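As an illustration of one of the factors mentioned above, the following sketch generates and checks a counter-based one-time password following the HOTP construction of RFC 4226 (HMAC-SHA-1 with dynamic truncation). The paper does not specify which OTP scheme would be used, so the choice of HOTP, the six-digit length and the function names are assumptions for this example.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) for a byte-string secret."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low 4 bits of the last byte
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_otp(secret, counter, submitted):
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(hotp(secret, counter), submitted)


# Shared secret used in the RFC 4226 test vectors.
secret = b"12345678901234567890"
print(hotp(secret, 0))
```

In a voting station, such a code would be only one factor among several; the biometric check described in the text would still be needed to bind the code to the person presenting it.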
3.2 Mobility

This was briefly stated in the introduction. Currently, a voter who desires to vote has to be present in person in the region of his or her electoral constituency whenever an election is held. Many people migrate from rural to urban areas for education, employment, business or other reasons, and some of them live so far away that traveling back becomes inconvenient. Many therefore end up choosing not to go and use the election holiday for travel or some other activity instead of voting. This directly or indirectly impacts the voter turnout.
3.3 Security

No current system can give a 100% guarantee that the results are legitimate, and this issue is even more severe in the case of e-voting and i-voting. A plethora of attacks, such as DDoS, virus infection and malicious software, and spoofing attacks [11], have been reported in elections all across the globe, which raises serious questions about the reliability of the final results. The losing party sometimes even refuses to accept the results, claiming that security flaws caused its loss.
3.4 Transparency

For any electoral system to be effective, it is of utmost importance that voters have full confidence in the election results, no matter what methodology is chosen to conduct them. Owing to technological advances over the past few decades, new mechanisms such as the “Voter Verified Paper Audit Trail” (VVPAT) have been introduced to give voters some degree of transparency. However, all such mechanisms depend heavily on electronic systems and are heavily abstracted, so voters tend to be skeptical toward automated systems whose workings they do not fully understand [12].
3.5 Flexibility

Voting takes place on one single day and only within a specific time window of around 8 h. All eligible voters have to vote within this period. This is again one of the limitations of the existing voting system. First of all, a specific eligible voter might not be able to vote in that time window. This issue intersects to some extent with the aforementioned mobility issue: a voter might live at a place from which it takes more than a day to travel to his or her electoral constituency. Another downside of holding an election on a single day is that it causes congestion in multiple places, be it vehicle traffic, crowds at the election booths, or pressure on the police and other security forces responsible for discipline and on the volunteers and other workers who have responsibilities on voting day. It also costs the voter a lot of time to go through the entire process of casting a vote, i.e., traveling locally in traffic and waiting at the booth, which is not encouraging at all.
3.6 Operational Cost

Conducting a nationwide election is an extremely expensive process. The US uses paper ballots for presidential elections, which cost around 14 billion USD in 2020 [13]. Even India, which has adopted e-voting with EVMs and therefore requires less human effort, spent around 7 billion dollars on the 2019 Lok Sabha election [14].
4 Proposed Solution 4.1 Blockchain Infrastructure Setup The entire infrastructure consists of a distributed system connected via a peer-to-peer network where each system is called a node. A node will be a computer system housed in something called “Voting Station”. Voting Stations (VS) will be setup in all districts and there can be multiple such stations in a district with very large geographical areas so that each citizen will have a voting station nearby. Since the system is based on consortium blockchain, nodes will be owned by completely heterogeneous entities. For example, node A1 in region A will be owned by a private research university
Building a Robust Distributed Voting System Using Consortium …
241
and node A2 in the same region A will be owned by a public research school, an IT company or any other government entity like District magistrate. And since there will be a strong consensus algorithm there will not be a need for any government body or even the election commission to have stringent monitoring over the voting system making it truly decentralized. This will make sure that any entity or organization or a specific government body does not have a complete control over majority of the nodes and it would be practically impossible for someone to take control over more than 51% of the nodes in the entire nation as all nodes are managed by mutually exclusive and isolated entities (Fig. 1). Technically speaking, a node is a computer system that will have storage to maintain its own copy of the entire blockchain and do compute involved in every phase. A node itself can locally be a distributed system to support database sharding and replication for its own blockchain local copy and make the compute distributed by using a master–slave paradigm. This will make even a single node fault-tolerant, scalable and robust. A single node typically will consist of three layers—(a) User Interface Layer, (b) Computer Layer, (c) Local Storage Layer. The UI layer will serve the purpose of an interface through which users (here the three actors—Voter, Contestant and Voting Station Maintainer) will interact with the system. Compute layer which consists of multiple servers will perform all the processing and interact with UI layer, local storage layer and global common storage (we used AWS RDS— Amazon Web Services’ Relational Database Service for prototyping). Local storage will be a distributed storage system with some replication factor and sharding which will store the copy of actual blockchain’s with respect to that particular node. And the global storage will store all other information except actual vote (“who voted whom”
Fig. 1 A peer-to-peer network of voting station (VS) nodes
R. R. Chougule et al.
transaction), including registration information, login and authentication data, previous election results, etc. Functionally speaking, a node serves all the other modules, which are explained further below. This blockchain infrastructure, once built, could be reused for any type of election across the nation, thus reducing manpower and the overall cost of elections. Also, all nodes will remain active for an election in any region. For example, suppose there is an election in state X. A voter who is eligible to vote in that election (because he or she belongs to an electoral constituency in state X) might currently be living in another state Y. He or she can visit any voting station in state Y and cast a vote for the election happening in state X without traveling thousands of kilometers and, more importantly, contribute to voter turnout by voting conveniently. Because the infrastructure is technology-driven, the election can be kept open for multiple days rather than for a specific time window on a single day. This again makes voting more convenient and eliminates the problem of important work clashing with a single voting day. Each Voting Station will have an authority called the Voting Station Maintainer (“VSM”), whose roles are discussed further (Fig. 2).
4.2 Voter, Contestant and VSM Registration
This involves the registration of voters, which is a one-time activity. It creates a voter profile that is linked with the voter’s national government identity proof, phone number and fingerprint. This process has to be completed before the election starts, but there is no other deadline. A user has to visit any Voting Station to register. Once a user profile is created, he or she can vote from any Voting Station across the nation in all eligible elections. A candidate who is contesting an election also has to register as a contestant. As every election might have different contests, this is required for every election he or she wishes to contest. An authority called the Voting Station Maintainer (“VSM”) at any organization’s Voting Station can create an election. The creating station can be chosen by a round-robin or random mechanism, since which station creates the election is not critical: once the election is created, it is the same for everyone. The entire system therefore has three major roles: (a) Voter, (b) Contestant, (c) VSM. A voter performs registration, voting and post-election validation. A contestant participates by registering. The VSM creates elections, resolves grievances and is responsible for the proper functioning of the voting station. The registration-related information of all types of users and the election information, such as start time, end time and the mapping of all contestants to the election, are stored in a common global database (depicted in the architecture diagram and explained further). In our case, it is the AWS Relational Database Service.
Building a Robust Distributed Voting System Using Consortium …
Fig. 2 Architecture of a node in one voting station
4.3 Voting Process
The voting process is where voters actually vote by visiting any Voting Station. This process lasts for a specific time window (decided during election creation), which can span multiple days, during which a voter can visit a Voting Station and cast his or her vote for the desired contestant. Voters go through a voter verification process. This is a multi-factor authentication system that includes document verification, fingerprint-based biometric authentication and a one-time password (OTP) before a vote is cast. This prevents potential malpractice by voters. After verification, the voter casts the actual vote and is then marked in the system as having completed voting, which ensures that a voter can vote only once per election. The system is self-service, which saves the hours generally spent on voting and is therefore convenient for voters. Technically, when a voter casts a vote, that record is a single transaction in the entire blockchain ecosystem. Whenever the verification process starts, the voter is redirected to a random node other than the voting station he or she
is currently present at. Once the verification process is completed and the user casts the vote, it is broadcast to all the nodes of the blockchain network. After receiving this data, every node stores it in its current block, since a block contains multiple votes. A sample of transactions (votes) in a single block is shown in Fig. 3. A block also contains a hash value of the vote data that it holds and the hash value of the previous block in the chain. The first block, called the genesis block, has the previous block value “0”. Once a block is filled, there is a block-level consensus among all nodes based on Byzantine fault tolerance, and a consistent block per node is created and stored in each node’s local copy of the blockchain. Reaching consensus in a distributed system is not straightforward, as some of the nodes might behave maliciously. One of the best solutions to date is Byzantine Fault Tolerance (BFT), which is based on the Byzantine Generals Problem [15]. It requires that the majority of nodes perform the same action, i.e., be consistent, to avoid system failures. A BFT system can continue operating even if some of the nodes fail or are malicious. The consensus algorithm used by the proposed system is also based on BFT. Each node forwards the data to all nodes except itself and the node from which it received the vote. All the nodes therefore hold N values for the votes. Based on the values it holds, each node chooses the majority value as its final value. After all nodes decide their final value, the majority of all final values is taken as the globally consistent value by all the nodes. For example, suppose there are 3 nodes N1, N2 and N3 and each holds 3 values. For simplicity, let the value be binary; in the actual system it will be a unique identifier for a contestant.
The values are N1{1, 1, 0}, N2{1, 1, 0} and N3{1, 0, 0}, so the final values chosen by majority at N1, N2 and N3 are, respectively, {1, 1, 0}. Among these final values we again take the majority to decide on one consistent global final value, which is accepted and stored by all nodes; in this case it is 1. To sum up the calculation, where max denotes the majority value (Figs. 4 and 5):
max{max{N1}, max{N2}, max{N3}} => max{max{1, 1, 0}, max{1, 1, 0}, max{1, 0, 0}} => max{1, 1, 0} => 1
A block is considered to be filled if (a) it meets the block threshold, a value that is taken as the maximum capacity of a single block, or (b) a specific time window is completed, for example, a block is created every hour even if the block threshold is not met, or (c) there is a non-zero number of votes in the current block and the current time reaches the election end time (refer to Fig. 6 for pseudocode).
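The two-level majority described above can be sketched in a few lines of Python. This is only an illustration of the worked example with binary values; the function names are hypothetical, and the sketch leaves out message exchange between nodes and any handling of malicious senders.

```python
from collections import Counter

def majority(values):
    """Return the most common value in a non-empty list.

    Ties are resolved in favour of the value seen first, which is an
    assumption of this sketch and not something the paper specifies.
    """
    return Counter(values).most_common(1)[0][0]

def two_level_consensus(node_values):
    """node_values maps a node name to the list of values that node holds."""
    # Level 1: every node reduces its own values to one final value.
    local_final = {node: majority(vals) for node, vals in node_values.items()}
    # Level 2: the majority of the final values becomes the global value.
    global_value = majority(list(local_final.values()))
    return local_final, global_value

# Values taken from the three-node example in the text.
received = {"N1": [1, 1, 0], "N2": [1, 1, 0], "N3": [1, 0, 0]}
local_final, global_value = two_level_consensus(received)
print(local_final)
print(global_value)
```

In a real deployment the values would be contestant identifiers rather than bits, and the same two functions would apply unchanged.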
Fig. 3 A sample of transactions in a block
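A hash-linked block of votes could look like the following sketch. The field names (`voter_id`, `contestant_id`), the JSON serialization, and the choice of SHA-256 over the votes together with the previous hash are assumptions made for illustration; the text only states that a block stores a hash of its vote data and a reference to the previous block, with "0" for the genesis block.

```python
import hashlib
import json

GENESIS_PREVIOUS_HASH = "0"  # value the text gives for the genesis block

def block_hash(votes, previous_hash):
    """Hash the votes together with the previous hash (assumed scheme)."""
    payload = json.dumps({"votes": votes, "previous_hash": previous_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def make_block(votes, previous_hash):
    return {"votes": votes,
            "previous_hash": previous_hash,
            "hash": block_hash(votes, previous_hash)}

def chain_is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    expected_previous = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["votes"], block["previous_hash"]):
            return False
        expected_previous = block["hash"]
    return True

# Build a two-block chain from hypothetical votes.
chain = []
previous = GENESIS_PREVIOUS_HASH
for votes in ([{"voter_id": "v1", "contestant_id": "A"}],
              [{"voter_id": "v2", "contestant_id": "B"}]):
    block = make_block(votes, previous)
    chain.append(block)
    previous = block["hash"]

valid_before = chain_is_valid(chain)
chain[0]["votes"][0]["contestant_id"] = "B"  # tamper with a stored vote
valid_after = chain_is_valid(chain)
print(valid_before, valid_after)
```

Because each block's hash covers the previous hash, changing any stored vote invalidates that block and every link after it.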
Fig. 4 Byzantine fault tolerance example for 3 nodes in action
Fig. 5 Consensus achievement post BFT completion
Fig. 6 Voting process pseudocode
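The three block-closing conditions listed in Sect. 4.3 can be written as a single predicate. The sketch below is an assumption-laden illustration: the threshold of 1000 votes and the parameter names are hypothetical, and the one-hour window is the example value from the text.

```python
from datetime import datetime, timedelta

def block_is_filled(vote_count, block_opened_at, now, election_end,
                    block_threshold=1000, block_window=timedelta(hours=1)):
    """Return True when the current block should be closed."""
    # (a) the block has reached its maximum capacity
    if vote_count >= block_threshold:
        return True
    # (b) the time window for a block has elapsed
    if now - block_opened_at >= block_window:
        return True
    # (c) the election has ended and the block holds at least one vote
    if vote_count > 0 and now >= election_end:
        return True
    return False

opened = datetime(2021, 1, 1, 9, 0)
election_end = datetime(2021, 1, 3, 18, 0)
print(block_is_filled(10, opened, opened + timedelta(minutes=5), election_end))
print(block_is_filled(10, opened, opened + timedelta(minutes=61), election_end))
```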
4.4 Result Calculation
This phase follows the completion of the Voting Process. Each node traverses the entire blockchain and calculates the result from its local copy. The result includes a unique identifier for each contestant and the number of votes received. A node called the Result Calculation Node (“RCN”) is chosen randomly, and all other nodes send it the results computed from their local copies. For example, if there were 4 contestants A, B, C and D who received 100, 150, 175 and 50 votes, respectively, the result is the string value “A100B150C175D50”. This result value is hashed using SHA-256, a cryptographic hashing algorithm, and the hash is sent to the RCN by all other nodes (refer to Fig. 7 for pseudocode). The RCN then selects the hash value sent by a majority, i.e., 51% of the nodes (the RCN’s own result is excluded from the calculation to avoid any potential bias). Once the hash value of the correct result is selected, it is sent to all other nodes so that it can be audited if required. After that, all nodes send the actual result string, i.e., “A100B150C175D50” in our example, to the RCN, because SHA-256 is irreversible, i.e., one-way [16], and the actual result cannot be derived from the hash value. When the RCN receives the actual result, it applies the same hashing algorithm, matches the output against the previously finalized hash value and then shares the result with the other nodes (refer to Fig. 8 for pseudocode).
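The hash-then-reveal exchange with the RCN can be sketched as follows. The function names and node labels are hypothetical, the majority rule is implemented as "strictly more than half of the submitting nodes" (an assumption standing in for the 51% figure), and networking is omitted entirely.

```python
import hashlib
from collections import Counter

def result_string(tally):
    """Serialize a tally such as {"A": 100, "B": 150} to "A100B150"."""
    return "".join(f"{contestant}{votes}" for contestant, votes in sorted(tally.items()))

def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def select_majority_hash(submitted_hashes):
    """Return the hash sent by more than half of the nodes, else None."""
    digest, count = Counter(submitted_hashes.values()).most_common(1)[0]
    return digest if count * 2 > len(submitted_hashes) else None

def verify_result(result, agreed_hash):
    """RCN-side check: the revealed string must hash to the agreed value."""
    return sha256_hex(result) == agreed_hash

honest = result_string({"A": 100, "B": 150, "C": 175, "D": 50})
tampered = result_string({"A": 175, "B": 150, "C": 100, "D": 50})

# Phase 1: nodes (excluding the RCN) send only hashes.
submitted = {"n1": sha256_hex(honest), "n2": sha256_hex(honest),
             "n3": sha256_hex(honest), "n4": sha256_hex(tampered)}
agreed = select_majority_hash(submitted)

# Phase 2: nodes reveal the plain result, which the RCN re-hashes.
print(honest)
print(verify_result(honest, agreed), verify_result(tampered, agreed))
```

Sending hashes first means the RCN commits to a majority value before it can see any plain result, which is the property the text relies on.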
Fig. 7 Pseudocode for local result calculation at each node except RCN
Fig. 8 Pseudocode for global result calculation at RCN
4.5 Public Vote Validation
This phase provides transparency by letting users validate that their vote was correctly counted and credited to the contestant they voted for. To do this, a voter visits a voting station and authenticates in the same way as for actual voting. After
successful authentication, he or she can check which contestant the vote was recorded for. To display this, that particular station fetches the voter’s data from all nodes and displays the candidate value sent by more than 51% of the nodes.
5 Technological Requirements for Large-Scale Deployment
Deploying such a system at a large scale seems challenging because of the heterogeneous nature of the entities in the consortium-based blockchain network, but given the advances in distributed stream and batch processing frameworks and container orchestration in recent years, it is feasible. Implementing the proposed method at a large scale, such as across a nation, raises three major concerns:
1. Building the Network: This would be comparatively the most challenging step, as it includes agreeing upon the entities that would own and be responsible for maintaining a node. However, this is not a technological challenge per se; it is more of a regulatory challenge. Once all the entities are decided, it would be technically straightforward to deploy the process at each node by leveraging today’s state-of-the-art containerization platforms like Docker and standard inter-process communication techniques like remote procedure call (RPC).
2. Data Storage and Processing: The system will definitely have a very large data size, but heavy processing will occur only during a few days, such as during the election or result calculation. For data storage and access, designing a good schema and having appropriate indexing in place would make it optimal for the use case. For compute, proven batch and stream processing frameworks like Apache Spark, which are used by many large tech companies in the industry, would suffice. By considering the alternatives and carrying out proofs of concept for multiple frameworks, the one best suited to the use case can be chosen based on tuning [17] parameters such as parallelism, memory management, heap size, garbage collection, serialization, et cetera.
3. Security: Security is one of the biggest advantages of leveraging blockchain, owing to the tamper-proof transaction paradigm and the use of secure hashing algorithms for block data. There are some hypothetical flaws, such as a hacker gaining access to more than 51% of the nodes in the blockchain, which seems impractical. If such a worst case occurs, simple anomaly detection alerts can be configured to flag a potential breach, in which case the nodes can fork off [18] to a new version of the chain, making the attack worthless.
6 Conclusion
In the “Defining the Problem Statement” section, we categorized the problems into six modules. Here is a brief account of how the proposed system solves them:
Voter verification: The proposed system uses multi-factor authentication, which includes government identification, fingerprint-based biometric authentication and a one-time password mechanism.
Mobility: A voter can register, vote and validate his or her vote from any voting station in the nation; stations will be available in each district and within a specific geographical radius for convenience. Such convenience would encourage voters and eventually increase voter turnout.
Security: The entire system is based on a decentralized architecture, and no single authority can control it. The blockchain is immutable, and even if a voting station tampers with data, the BFT-based consensus algorithm ensures that the tampered data is rejected. Even during result calculation, the cryptographic hash function SHA-256 is used, which guarantees that the node calculating the count does not know the actual values before it decides the hashed final result and broadcasts it to the other nodes.
Transparency: The vote validation functionality lets voters visit a voting station after the election and validate whether their vote was counted toward the right contestant. It also resolves the shortcomings of VVPAT.
Flexibility: With this system, the number of days the election is open for voting can be configured, which solves the issue of having it open for only one day and only for a particular time window. This makes the process more convenient for everyone: voters, contestants and the volunteers or other people involved.
Operational Cost: An initial investment is needed to build this system, but once built, it can be reused for any number of elections. The workforce required for the process will also be marginally lower.
Also, the compute or storage infrastructure of a node at a voting station need not be entirely in-house. Cloud technologies can be used, more specifically infrastructure as a service (IaaS), where additional resources can be requested during the election period, which involves the complex processing of the blockchain, and released whenever they are not required. The in-house infrastructure can be kept minimal, just enough to support basic functions like voter registration and vote validation. This will reduce cost further, as we can leverage the “pay-as-you-use” paradigm. Moving an entire nation from the existing EVM-based or ballot-based process to such a system is quite ambitious and might face other challenges regarding compliance, laws and regulations, but it remains practical to consider and might turn out to be revolutionary in the near future if implemented at scale. Once this kind of system is built, it will overcome the flaws of the current election process, which is critical because that process forms the government: the most crucial pillar of any democratic nation.
References
1. World | South Asia | Country profiles, Country profile: India. BBC News, 7 June 2010. Retrieved 22 Aug 2010
2. Security analysis of India’s electronic voting machines, in Proceedings of 17th ACM Conference on Computer and Communications Security (CCS ’10) (2010)
3. M. Xu, X. Chen, G. Kou, A systematic review of blockchain. Financ. Innov. 5, 27 (2019)
4. J. Clavin, S. Duan, H. Zhang, V.P. Janeja, K.P. Joshi, Y. Yesha, L.C. Erickson, J.D. Li, Blockchains for government: use cases and challenges. Digit. Gov. Res. Pract. 1(3), Article 22, 21 p
5. S. Nakamoto, Bitcoin: a peer-to-peer electronic cash system (2009)
6. S. Everett, M. Byrne, K. Greene, Measuring the usability of paper ballots: efficiency, effectiveness, and satisfaction, in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 50 (2006). http://doi.org/10.1177/154193120605002407
7. The role of electronic voting machine (EVM) in strengthening democracy in India. Int. J. Recent Technol. Eng. (IJRTE) 8(3) (2019). ISSN: 2277-3878
8. An analysis of electronic voting machine for its effectiveness. Int. J. Comput. Exp. (IJCE) 1(1) (2016)
9. The past and future of internet voting. https://www.brookings.edu/wp-content/uploads/2016/07/pointclickandvote_chapter.pdf
10. Z. Rui, Z. Yan, A survey on biometric authentication: toward secure and privacy-preserving identification. IEEE Access 7, 5994–6009 (2019). https://doi.org/10.1109/ACCESS.2018.2889996
11. A. Javaid, Electronic voting system security. SSRN Electron. J. (2014). https://doi.org/10.2139/ssrn.2393158
12. C. Enguehard, Transparency in electronic voting: the great challenge (2008)
13. 2020 U.S. Presidential election to be most expensive in history, expected to cost $14 billion. https://www.thehindu.com/news/international/2020-us-presidential-election-tobe-most-expensive-in-history-expected-to-cost-14-billion/article32969375.ece
14. Why India’s election is among the world’s most expensive? https://economictimes.indiatimes.com/news/elections/lok-sabha/india/why-indias-election-is-among-the-worlds-most-expensive/articleshow/68367262.cms
15. K. Driscoll, B. Hall, H. Sivencrona, P. Zumsteg, Byzantine fault tolerance, from theory to reality. 2788, 235–248 (2003). http://doi.org/10.1007/978-3-540-39878-3_19
16. D. Rachmawati et al., A comparative study of Message Digest 5(MD5) and SHA256 algorithm. J. Phys. Conf. Ser. 978, 012116 (2018)
17. Tuning - Spark 3.1.2 documentation, https://spark.apache.org/docs/3.1.2/tuning.html
18. N.C.K. Yiu, An overview of forks and coordination in blockchain development (2021)
Analysis on the Effectiveness of Transfer Learned Features for X-ray Image Retrieval
Gokul Krishnan and O. K. Sikha
Abstract A computer-assisted system for retrieving medical images of identical content can be used as a data processing method for managing and mining large amounts of medical data, as well as in clinical decision support systems. This paper studies the effectiveness of deep features extracted from state-of-the-art deep learning models for the retrieval of X-ray images. The first part of the study explores the effectiveness of transfer learned features generated from the DenseNet, Inception and Inception-ResNet models for image retrieval. The performance of the transfer learned features was analyzed based on the retrieval accuracy. The second part of the paper analyzes the effect of preprocessing using adaptive histogram equalization on image retrieval. The experiment is carried out on the publicly available musculoskeletal radiographs (MURA) dataset, which consists of 40,561 bone X-ray images of different body parts at varied angles in 7 classes.
Keywords CBIR · Deep learning · Transfer learning · X-ray image retrieval · Adaptive histogram equalization · Image enhancement
1 Introduction
Chest X-rays are the most common radiological tests used in clinical practice to detect various irregular thoracic and cardiopulmonary conditions. The workload of radiologists has increased significantly as a result of developments in medical imaging technology and an increase in the number of radiology exams requested. This leads to a longer radiology turnaround time and may reduce the overall quality of patient care. A computer-assisted automatic system that analyzes and retrieves previously diagnosed X-rays with identical image content can be a useful tool for guiding diagnosis and the radiology report generation process, which in turn improves the overall quality of the health care system.
G. Krishnan · O. K. Sikha (B) Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_19
For the past 20 years, content-based image retrieval (CBIR) has been a hot topic in the field of computer vision. The database images are first represented by a collection of associated features computed directly from the image content. Similar images are then selected by comparing these features with the features extracted from the query image. Traditionally, discriminant handcrafted features such as edge, color and texture were used to retrieve images from large datasets. However, with handcrafted features, reducing the semantic gap remains a challenge. Machine learning-based techniques, in which an intelligent system is trained to generate a characteristic feature space, help to reduce this gap. Given enough data, deep learning networks learn complex features at multiple levels of abstraction without the use of handcrafted features. With the recent development of deep learning-based techniques and the availability of large-scale digital image databases, the computer vision research community has started exploring deep learning-based CBIR frameworks. Sikha and Soman [1] discussed using visual saliency features for CBIR. Maji and Bose [2] used transfer learned deep convolutional neural networks such as Inception [3], Inception-ResNet [4] and VGG [5] for CBIR on natural images. The authors evaluated the efficiency of deep features extracted from the deep CNNs for image retrieval and showed that deep features capture the semantic aspects of images and are hence better suited to CBIR applications than handcrafted features. Although image retrieval systems have been extensively studied for natural image retrieval tasks, applying the retrieval mechanism to medical images, especially radiology images, remains an open research problem. This is mainly because of the complex nature of medical images, such as complex imaging parameters, associations between different conditions, and subtle variations between images with different diagnosis decisions.
Recently, researchers have started investigating the effectiveness of deep features for medical image retrieval. Kowshik et al. [6] used machine learning-based systems to detect false positives for breast cancer. Subbiah et al. [7] developed a model based on ResNet [8] for the classification of cancerous lymph nodes. Owais et al. [9] used a transfer learned ResNet [8] model for medical image retrieval. They tested the proposed model on a large database containing 12 datasets with 50 classes of different medical images. Kiruthika et al. [10] proposed a model that uses image retrieval to detect malignancy at early stages of breast cancer on a database containing 4300 images. Hemdan et al. [11] developed an architecture called COVIDX-Net for detecting COVID-19-positive patients using transfer learned DenseNet [12], VGG19 [5] and Google MobileNet [13]. They tested 7 deep learning architectures on a chest X-ray dataset and obtained a high F1 score of 0.91 for DenseNet [12]. In [14], Sze-To et al. described a CBIR model using DenseNet [12] for the detection of a medical condition called pneumothorax. In that work, the authors used deep learning techniques to extract features from images, which are then used to retrieve images similar to a query image. After retrieving similar images, they classified the query image as having pneumothorax if most of the similar images were labeled as having pneumothorax. In [15], the authors discuss different preprocessing techniques for Electrical Impedance Tomography images, and a conductivity test is also performed to determine their effect on the course of
the current flow. In [16], medical images are preprocessed and features are extracted for the analysis of ECG signals. The recent developments in medical CBIR models discussed above primarily deal with images of the same category (chest X-rays) for the detection of a specific medical condition. However, a general retrieval architecture for retrieving X-ray images of different body parts remains unaddressed in the literature. Although adaptive histogram equalization methods are commonly employed for improving medical images, they have been used sparingly in CBIR research. This motivates the development of a new CBIR model trained on enhanced medical images. The primary objective of this paper is to propose a general CBIR architecture for the retrieval of X-ray images of various body parts with different medical conditions. This paper studies the effectiveness of deep feature representations generated from state-of-the-art deep learning models for radiology image retrieval. To create image representative codes, well-known deep convolutional neural networks such as DenseNet [12], InceptionNet [3] and Inception-ResNet [4] are transfer learned on X-ray images. The major contributions of this work can be summarized as follows:
• To develop a CBIR model for the retrieval of X-ray images of various body parts.
• To investigate the effectiveness of transfer learned features on X-ray image retrieval.
• To study the effectiveness of preprocessing (contrast enhancement) on X-ray image retrieval.
The rest of this paper is organized as follows: the background study is detailed in Sect. 2; Sect. 3 describes the proposed model in detail; the dataset is detailed in Sect. 4; and the results obtained are detailed in Sect. 6. A state-of-the-art comparative analysis is also part of Sect. 6. Finally, the paper concludes with closing remarks in Sect. 7.
2 Background Study
This section describes in detail the state-of-the-art deep learning architectures used for the study. The effectiveness of DenseNet [12], InceptionNet [3] and Inception-ResNet [4] as feature extractors is explored. These models were selected based on a comparison in terms of feature compatibility for CBIR.
2.1 DenseNet
The initial convolution layers of deep convolutional neural network models extract low-level image features such as colors, edges and textures. As the network goes deeper, the convolution layers start to extract higher-level features specific to the image content. Consequently, as the layers go deeper, more information will be extracted
from the image. The major drawback of general CNN architectures like VGG [5] is that as the model goes deeper, the information being learned by the model starts vanishing: by the time the deeper layers are reached, earlier information is lost. This information loss is caused by the vanishing gradient problem, in which the gradient of the cost function diminishes as it propagates through many layers. Huang et al. [12] introduced a deep CNN architecture called DenseNet, in which the vanishing gradient problem is resolved using feature concatenation. The advantage of concatenation is that the layers do not have to relearn redundant feature information. DenseNet [12] is a shallow network compared to ResNet [8], which also solves the vanishing gradient problem, using skip connections. This work uses DenseNet201 [12], which contains 201 convolution layers, an average pooling layer and one fully connected layer.
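The effect of feature concatenation on layer inputs can be illustrated by counting channels inside one dense block. This is a bookkeeping sketch only, with no convolutions; the starting width of 64 channels, the growth rate of 32 and the block length of 6 are illustrative assumptions, and the function name is hypothetical.

```python
def dense_block_channels(initial_channels, growth_rate, num_layers):
    """Track how many input channels each layer of a dense block sees."""
    inputs_per_layer = []
    channels = initial_channels
    for _ in range(num_layers):
        # Each layer receives every earlier feature map, concatenated.
        inputs_per_layer.append(channels)
        # Its own output (growth_rate new maps) is appended for later layers.
        channels += growth_rate
    return inputs_per_layer, channels

inputs, block_output = dense_block_channels(initial_channels=64,
                                            growth_rate=32, num_layers=6)
print(inputs)
print(block_output)
```

Because every layer can read earlier feature maps directly, it only needs to add a small number of new maps instead of re-deriving what is already available.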
2.2 InceptionNet
InceptionNet [3], also known as GoogLeNet, is a deep convolutional neural network with 27 layers. The primary objective behind the InceptionNet [3] model is to reduce the number of trainable parameters and avoid the overfitting that may arise as the model becomes deep. The InceptionNet [3] model applies multiple filters of size 1 × 1, 3 × 3 and 5 × 5 in each Inception module, which reduces the total number of parameters. The 1 × 1 filter is mainly used to decrease the dimensions by minimizing the number of channels, and as a result it reduces the computational complexity.
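A short weight count shows why a 1 × 1 filter placed ahead of a larger filter lowers the parameter total. The channel sizes below (192 input channels, 16 after reduction, 32 output channels) are illustrative assumptions, biases are ignored, and the helper name is hypothetical.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels

# A 5 x 5 convolution applied directly to 192 input channels.
direct = conv_weights(5, 192, 32)

# The same 5 x 5 convolution after a 1 x 1 layer has cut 192 channels to 16.
reduced = conv_weights(1, 192, 16) + conv_weights(5, 16, 32)

print(direct, reduced, round(direct / reduced, 2))
```

With these sizes the reduced path needs roughly a tenth of the weights of the direct path.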
2.3 Inception-ResNet
Inception-ResNet [4] is a hybrid model that combines the InceptionNet [3] and ResNet [8] models. The ResNet [8] component mainly addresses the vanishing gradient problem so that information is extracted from images without loss. The InceptionNet [3] component reduces the computational complexity of the model by using 1 × 1, 3 × 3 and 5 × 5 filters that lower the number of trainable parameters. Inception-ResNet [4] therefore extracts information from images without loss while keeping the demand on computational power moderate.
3 Proposed CBIR Architecture
This section describes the proposed deep feature-based CBIR architecture shown in Fig. 1. The features are extracted from a large collection of X-ray images using
Fig. 1 Proposed CBIR architecture
transfer learned deep convolutional models: DenseNet201 [12], InceptionV3 [3], and Inception-ResNetV2 [4]. Before feature extraction, the X-ray images are preprocessed with the adaptive histogram equalization algorithm to enhance their contrast. The query X-ray image is then passed through these models and its features are extracted. The extracted features are matched against the feature database using different distance metrics: Euclidean distance, Manhattan distance and cosine distance. Similar medical images are retrieved based on the score, and the retrieval accuracy is checked to analyze the performance of each model. Finally, the model with the best retrieval accuracy is chosen.
4 Dataset
Our model has been tested on the publicly available musculoskeletal radiographs (MURA) dataset [17], which consists of 40,561 bone X-ray images of different body parts at varied angles in 7 classes. All the images are grayscale. The radiographic images were taken between 2001 and 2012 from 12,763 patients. All the images were labeled by experts as normal or abnormal. This work uses five categories of images for the experiment: elbow, finger, hand, shoulder and wrist. Figure 2 shows sample images of different body parts from the dataset.
Fig. 2 Sample images from the MURA dataset. a Elbow, b finger, c hand, d shoulder, e wrist
5 Experiment
The normal and abnormal labels of the dataset are removed. Two classes, namely XR_HUMERUS and XR_FOREARM, are eliminated from the experiment since they contain fewer images. 4000 images from each class are selected for the experiment to make a balanced dataset. Initially, the state-of-the-art models DenseNet201 [12], InceptionV3 [3] and Inception-ResNetV2 [4] are transfer learned on the dataset, and the final softmax classification layer is removed. While training, only the weights of the final fully connected layer are updated, and the feature database is generated from the pool of X-ray images. The feature database was created from 33,715 images contained in 5 classes, and a query image was taken randomly from each class for CBIR. The prediction is done without any fine tuning, as the models are already pretrained on natural images. Later, distance measures such as Euclidean distance, Manhattan distance and cosine distance are employed to find similar images in the database.

\[ \text{Euclidean Distance} = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} \tag{1} \]

\[ \text{Manhattan Distance} = \sum_{i=1}^{n} |a_i - b_i| \tag{2} \]

\[ \text{Cosine Distance} = \frac{\sum_{i=1}^{n} a_i \times b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \times \sqrt{\sum_{i=1}^{n} b_i^2}} \tag{3} \]

In Eqs. (1)–(3), n is the dimension and a and b are the vectors being compared. The expression in Eq. (3) is the cosine similarity, which grows as the vectors become more alike; the corresponding distance is one minus this value.
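The three measures and the ranking step can be sketched in plain Python. The tiny three-dimensional vectors and the `retrieve` helper are hypothetical stand-ins for the deep features and the feature database; cosine distance is computed here as one minus the normalized dot product, so that smaller values always mean more similar images.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    # One minus the similarity, so that lower means closer.
    return 1.0 - cosine_similarity(a, b)

def retrieve(query, database, distance, k=3):
    """Return the labels of the k database entries closest to the query."""
    ranked = sorted(database, key=lambda item: distance(query, item[1]))
    return [label for label, _ in ranked[:k]]

# Hypothetical feature database of (label, feature vector) pairs.
database = [("elbow", [1.0, 0.0, 0.0]),
            ("hand", [0.0, 1.0, 0.0]),
            ("wrist", [0.9, 0.1, 0.0])]
query = [1.0, 0.05, 0.0]

for metric in (euclidean_distance, manhattan_distance, cosine_distance):
    print(metric.__name__, retrieve(query, database, metric, k=2))
```

Passing the metric as a function keeps the ranking code identical across the three measures, which mirrors how the experiment compares them on the same feature database.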
5.1 Preprocessing
The X-ray images were enhanced using the adaptive histogram equalization algorithm. Adaptive histogram equalization modifies the intensity of the pixels in an image and thereby enhances its contrast. Figure 3 shows sample images from the MURA dataset [17] and the resulting images obtained after contrast enhancement.
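A minimal sliding-window form of adaptive histogram equalization is sketched below for a grayscale image stored as a list of rows. The paper does not state the implementation or window size used, so the `radius` parameter and the function name are assumptions; library implementations are typically tile-based and often contrast-limited (CLAHE), which this sketch does not attempt.

```python
def adaptive_histogram_equalization(image, radius=1, levels=256):
    """Map each pixel through the cumulative histogram of its neighbourhood."""
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Neighbourhood around the pixel, clipped at the image border.
            window = [image[rr][cc]
                      for rr in range(max(0, r - radius), min(height, r + radius + 1))
                      for cc in range(max(0, c - radius), min(width, c + radius + 1))]
            # Local cumulative count at this pixel's intensity.
            rank = sum(1 for value in window if value <= image[r][c])
            output[r][c] = (levels - 1) * rank // len(window)
    return output

# A low-contrast patch whose intensities span only 100..103.
patch = [[100, 101],
         [102, 103]]
enhanced = adaptive_histogram_equalization(patch, radius=1)
print(enhanced)
```

The narrow input range is stretched over the full intensity scale because each output value depends on the pixel's rank among its neighbours, not on its absolute intensity.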
5.2 Parameters Settings

The X-ray images were preprocessed before feature extraction and resized to a resolution of 224 × 224 to reduce the computational complexity. As the task is feature extraction from X-ray images for CBIR, the softmax layer, which is meant for classification tasks, is removed, and the features are taken directly from the last fully connected layer of each of the aforementioned deep CNN models. Most of the deep CNN models used in the study give an output of 512 features, so an additional dense layer is added to the final fully connected layer of each model for standardization.
6 Results and Analysis

The performance of the DenseNet [12], InceptionNet [3] and Inception-ResNet [4] based CBIR models is evaluated based on the retrieval accuracy, which is defined as follows.

$$\text{Retrieval Accuracy} = \frac{\text{Number of Correct Images Retrieved}}{\text{Number of Total Images Retrieved}} \tag{4}$$
Euclidean distance, manhattan distance and cosine distance are used as the similarity measures for image retrieval. Table 1 compares the retrieval accuracy of three aforementioned deep learning models for image retrieval using three similarity measures on raw images. From the table, it is clear that deep features extracted
Fig. 3 Images before preprocessing and the equivalent histogram (left) and preprocessed images and the equivalent histogram (right) (1st row: elbow, 2nd row: hand, 3rd row: finger, 4th row: shoulder and 5th row: wrist)
Table 1 Performance comparison of each model using average retrieval accuracy (%) on raw images

| Similarity measure | DenseNet201 | InceptionV3 | Inception-ResNetV2 |
|---|---|---|---|
| Euclidean distance | 52 | 74 | 50 |
| Manhattan distance | 52 | 74 | 50 |
| Cosine distance | 60 | 66 | 52 |
from InceptionV3 [3] as the base model using Euclidean and Manhattan distance show better performance than the other models, with an average retrieval accuracy of 74%. Table 2 compares the retrieval accuracy of the deep learning models as feature extractors for image retrieval on enhanced images after histogram equalization.
Table 2 Comparing the performance of each model using retrieval accuracy (%) after preprocessing of the images

| Similarity measure | DenseNet201 | InceptionV3 | Inception-ResNetV2 |
|---|---|---|---|
| Euclidean distance | 64 | 88 | 50 |
| Manhattan distance | 68 | 90 | 50 |
| Cosine distance | 66 | 82 | 40 |
From Tables 1 and 2, it is evident that DenseNet201 [12] and InceptionV3 [3] give better accuracy on the preprocessed images than on the raw images for all three similarity measures. Inception-ResNetV2 [4] is the exception: its accuracy is unchanged at 50% for Euclidean and Manhattan distance, and with cosine distance the raw images produce a better result (52% versus 40%). InceptionV3 [3] on enhanced images using Manhattan distance gave an accuracy of 90%, which is the maximum among the three models. The increased accuracy on enhanced images might be due to the enhancement of low contrast regions, as adaptive histogram equalization works very well for grayscale images. Our dataset contains grayscale X-ray images in which the difference between the foreground and background areas is not easily discernible. This is where the preprocessing helps, as it makes the difference more prominent. It is also observed that a shallow network like InceptionV3 [3] produced much higher retrieval accuracy than deeper networks such as DenseNet [12] and Inception-ResNetV2 [4]. The reason is probably not overfitting but the difficulty of optimizing the parameters of a deep neural network: as the number of convolutional layers increases, the number of model parameters also increases, which in turn makes the parameters difficult to optimize. A network like InceptionV3 [3] reduces the number of parameters and thus gives better accuracy than the other models. Figure 4 shows the retrieval results of the top 10 X-ray images using features from InceptionV3 [3] with Manhattan distance as the similarity metric. For the classes hand, shoulder and wrist, the retrieval accuracy is 100%; for elbow and finger, the accuracy is 90% and 60%, respectively.
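The retrieval accuracy of Eq. (4) and the averaging over classes can be illustrated with a short sketch. The label lists below are hypothetical: they are arranged only to reproduce the per-class figures quoted for Fig. 4, and the "other" label is a placeholder for whichever wrong class was actually retrieved.

```python
def retrieval_accuracy(retrieved_labels, query_label):
    """Eq. (4): correct images retrieved / total images retrieved."""
    correct = sum(1 for label in retrieved_labels if label == query_label)
    return correct / len(retrieved_labels)


# Hypothetical top-10 label lists matching the per-class accuracies in the text.
top10 = {
    "elbow": ["elbow"] * 9 + ["other"],
    "finger": ["finger"] * 6 + ["other"] * 4,
    "hand": ["hand"] * 10,
    "shoulder": ["shoulder"] * 10,
    "wrist": ["wrist"] * 10,
}

per_class = {name: retrieval_accuracy(labels, name) for name, labels in top10.items()}
average = sum(per_class.values()) / len(per_class)
print(per_class)
print(f"average retrieval accuracy: {average:.0%}")
```

Averaging the five per-class values (100%, 100%, 100%, 90% and 60%) gives 90%, which agrees with the InceptionV3 and Manhattan distance entry in Table 2.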
6.1 Effect of Preprocessing on Model Training and Testing

We analyzed the performance of the models on images with and without preprocessing. In most cases, we got better results after preprocessing.

DenseNet: The DenseNet [12] model is trained on X-ray images with and without preprocessing for 20 epochs. We got a maximum training accuracy of around 96% and a training loss of around 0.1 in both cases. Figure 5 shows the model accuracy and loss graphs for raw images (Fig. 5c, d) and for preprocessed images (Fig. 5a, b). From the figures, it is evident that the convergence is better for the DenseNet [12] model when trained on preprocessed images. Convergence up to 5 epochs is mostly similar in both cases, but after that the model trained on preprocessed images reaches the maximum training accuracy slightly faster than the model trained on images
Fig. 4 Ten similar images retrieved for a query image from each class using InceptionV3 with the Manhattan distance similarity metric (the first column contains the query images before preprocessing, the second column contains the query images after preprocessing, and the remaining columns contain the retrieved images; 1st row: elbow, 2nd row: finger, 3rd row: hand, 4th row: shoulder and 5th row: wrist)
Fig. 5 Training model accuracy and model loss of DenseNet201. a, b Preprocessed images. c, d Raw images
without preprocessing. The convergence of the training loss is also slightly better when the model is trained on enhanced images. Figure 6 compares the effect of preprocessing in terms of retrieval accuracy, and it is evident that the model trained on preprocessed images has a better accuracy of 68%, compared to 60% for the model trained on images without preprocessing.

Inception-ResNet: We trained Inception-ResNetV2 [4] on images with and without preprocessing for 20 epochs and got a training accuracy of around 94% and a training loss of around 0.2 in both cases. Figure 7 shows the
Fig. 6 Retrieval accuracy of DenseNet compared across all similarity metrics (left: model trained on preprocessed images, right: model trained on images without preprocessing)
Fig. 7 Training model accuracy and model loss of Inception-ResNetV2. a, b Preprocessed images. c, d Raw images
Fig. 8 Retrieval accuracy of Inception-ResNetV2 compared across all similarity metrics (left: model trained on preprocessed images, right: model trained on images without preprocessing)
model accuracy and loss graphs for raw images (Fig. 7c, d) and for preprocessed images (Fig. 7a, b). It is evident from Fig. 7 that the training loss converges better for the model trained on preprocessed images: a loss of 0.2 is reached at around 12 epochs for the model trained on enhanced images, while the model trained on images without preprocessing reaches the same loss at around 15 epochs. Figure 8 compares the average retrieval accuracy of Inception-ResNetV2 [4] on enhanced images and raw images. It is observed from Fig. 8 that the results for Inception-ResNet [4] are mostly similar for the Euclidean and Manhattan similarity metrics for both models, with a slight advantage for the model trained on images without preprocessing.

InceptionNet: InceptionV3 [3] was trained on images with and without preprocessing for 20 epochs. The training accuracy was around 94% with a training loss of 0.1 in both cases. Figure 9 shows the model accuracy and loss graphs for raw images (Fig. 9c, d) and for enhanced images (Fig. 9a, b). We observed from the figures that, for the model trained on preprocessed images, the convergence of training accuracy and training loss is better than for the DenseNet [12] and Inception-ResNet [4] models. The training accuracy and training loss for the model trained on preprocessed images reach 94% and 0.2, respectively, at 10 epochs, but it took nearly 15 epochs to reach the same training accuracy and training loss for the model trained on images without preprocessing. A considerable difference of about 15 percentage points in retrieval accuracy can be observed in Fig. 10 between the model trained on preprocessed images and the model trained on images without preprocessing. The model trained on preprocessed images obtained a maximum retrieval accuracy of 90% when Manhattan distance was chosen as the similarity metric, while the model trained on images without preprocessing reached 74% retrieval accuracy for the same similarity metric.
Fig. 9 Training model accuracy and model loss of InceptionV3. a, b Preprocessed images. c, d Raw images
Fig. 10 Retrieval accuracy of InceptionV3 compared across all similarity metrics (left: model trained on preprocessed images, right: model trained on images without preprocessing)
7 Conclusion

The paper analyzed the efficiency of features extracted from different transfer learned deep convolutional models, namely DenseNet201 [12], InceptionV3 [3], and Inception-ResNetV2 [4], for the X-ray image retrieval task. The CNN models selected for the study were well suited to image classification, as they are pretrained for classification on a large
database. The effectiveness of deep features on raw X-ray images and on enhanced X-ray images was compared, and deep features from enhanced X-ray images were found to be much more effective than features from raw images. A maximum retrieval accuracy of 90% was obtained from InceptionV3 using enhanced X-ray images with Manhattan distance as the similarity measure.
References

1. O.K. Sikha, K.P. Soman, Dynamic mode decomposition based salient edge/region features for content based image retrieval. Multimedia Tools Appl. 80(10), 15937–15958 (2021)
2. S. Maji, S. Bose, CBIR using features derived by deep learning (2020). arXiv preprint arXiv:2002.07877
3. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 2818–2826
4. C. Szegedy, S. Ioffe, V. Vanhoucke, A. Alemi, Inception-v4, inception-resnet and the impact of residual connections on learning, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31, no. 1 (2017)
5. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition (2014). arXiv preprint arXiv:1409.1556
6. G. Kowshik, R.T. Gandhe, A.N.S. Purushotham, G.V. Reddy, D. Vijayan, Reduction of false positives in identification of masses in mammograms, in 2020 5th International Conference on Communication and Electronics Systems (ICCES) (IEEE, 2020), pp. 1046–1050
7. U. Subbiah, R.V. Kumar, S.A. Panicker, R.A. Bhalaje, S. Padmavathi, An enhanced deep learning architecture for the classification of cancerous lymph node images, in 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA) (IEEE, 2020), pp. 381–386
8. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778
9. M. Owais, M. Arsalan, J. Choi, K.R. Park, Effective diagnosis and treatment through content-based medical image retrieval (CBMIR) by using artificial intelligence. J. Clin. Med. 8(4), 462 (2019)
10. K. Kiruthika, D. Vijayan, R. Lavanya, Retrieval driven classification for mammographic masses, in 2019 International Conference on Communication and Signal Processing (ICCSP) (IEEE, 2019), pp. 0725–0729
11. E.E.D. Hemdan, M.A. Shouman, M.E. Karar, Covidx-net: a framework of deep learning classifiers to diagnose covid-19 in X-ray images (2020). arXiv preprint arXiv:2003.11055
12. G. Huang, Z. Liu, L. Van Der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 4700–4708
13. A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand et al., Mobilenets: efficient convolutional neural networks for mobile vision applications (2017). arXiv preprint arXiv:1704.04861
14. A. Sze-To, H. Tizhoosh, Searching for pneumothorax in half a million chest X-ray images, in International Conference on Artificial Intelligence in Medicine (Springer, Cham, 2020), pp. 453–462
15. E.E.B. Adam, Survey on medical imaging of electrical impedance tomography (EIT) by variable current pattern methods. J. ISMAC 3(02), 82–95 (2021)
16. T. Vijayakumar, R. Vinothkanna, M. Duraipandian, Fusion based feature extraction analysis of ECG signal interpretation: a systematic approach. J. Artif. Intell. 3(01), 1–16 (2021)
17. Bone X-ray deep learning competition. Stanford ML Group. https://stanfordmlgroup.github.io/competitions/mura/
Blockchain-Centered E-voting System Akshat Jain, Sidharth Bhatnagar, and Amrita Jyoti
Abstract Voting is the process by which people with several different opinions come together to collectively agree on a single opinion. Traditionally, voting was done using pen and paper, but as technology has evolved and we have stepped into the digital era, voting has kept pace by adopting new techniques and integrating with blockchain, a distributed technology. The digital era is marked by a technological advancement that integrates various sub-mechanisms into a robust, more secure, and immutable mechanism known as distributed ledger technology. Blockchain, being such a technology, offers a vast range of applications benefiting from shared economies. This paper proposes a solution that uses blockchain as the base technology to eliminate the disadvantages of the conventional voting system, while showcasing the advantages offered by blockchain technology, such as immutability, security, a distributed ledger, faster settlements, and consensus algorithms. This paper also details the implementation of the solution using other technologies such as Solidity and React.js and testing frameworks such as Mocha and Chai, which not only improves security and transparency for the user but also helps decrease the cost of elections.

Keywords Smart contracts · Blockchain technology · Decentralized ledger · Solidity · Front-end coding language · Testing
A. Jain (B) · S. Bhatnagar · A. Jyoti
ABES Engineering College, Ghaziabad, Uttar Pradesh, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_20

1 Introduction

Elections are the sine qua non of democracy and of the free will of people to choose their representatives. Because of their importance in our society, elections are held in such a way that they are secure, reliable, and transparent to all the voters or
participants to ensure the credibility of elections. Therefore, the domain of balloting is an ever-evolving area, driven by the need for security, transparency, and verifiability. Considering all the requirements of elections, e-polling holds a significant spot. Traditional elections were ballot-based, but with the spread of Internet technology (Gobel et al. 2015), voting methods have made exceptional advancements toward the Internet. E-voting systems have been a topic of ongoing analysis for decades, with the aim of reducing the expense of operating elections while fulfilling the promise of making elections secure, transparent, reliable, traceable, and verifiable, and of meeting privacy and compliance requirements. Like every tool and technology, e-voting systems have pros and cons. The security community has seen e-voting systems as vulnerable, primarily in terms of security: anyone with access to a voting machine can attack it, thereby affecting the tallies recorded on the attacked machine.

Enter blockchain technology. Blockchain is chosen because of its properties, which help conduct elections more securely and transparently. A few of these properties are:

• Distributed ledger
• Immutable
• Verifiability
• Consensus algorithm
• Anonymity
• Cryptographically secure.
This technology operates through advanced cryptographic algorithms and techniques, which make it more secure than its predecessors. Hence, several researchers and innovators consider that this technology can be harnessed as a service or tool for making present-day elections a success. This work evaluates the implementation of blockchain-based services for an e-polling system. The initial contributions of this paper are:

• A survey of the currently available technologies suited for building e-polling with blockchain technology.
• A recommendation of a “Permission Blockchain” e-polling service for liquid democracy.
2 Objectives

Our main goal is to solve the problems of elections using blockchain technology. This will also help reduce the rate of fraud and increase the number of vote casters. The voting system should satisfy the following conditions:
• Verifiable to all and transparent.
• Should be tamper-proof and secured.
• Give acknowledgment to the vote casters.
• Only eligible voters should vote.
• No organization with more resources and power should be able to manipulate the elections.
• One voter one vote policy should be followed.
• Identity verification should be present, to confirm that the voter is legitimate.

Also, blockchain technology helps in satisfying the most important requirements for e-voting:

• Authentication
• Anonymity
• Accuracy
• Verifiability.
3 Proposed System

When the user first enters the Web page, the user is required to authenticate himself/herself so that the system can verify that he/she is eligible to cast a vote. After authentication and verification, the user is directed to the voting page, where the user casts the vote. The vote is first collected from the user. When a certain number of votes have been collected, they form a block. These blocks are then sent, one at a time, to every node (i.e., participant) in the network. On receiving a block, miners start validating it with the help of a defined or proposed consensus algorithm. Whichever node or miner validates the block first receives the reward, and the block is added to the blockchain. At the same time, the rest of the miners in the network cross-check the result of the winning miner to verify that it is correct. Once the block is added, the update is distributed across the network so that all the nodes can update their local copy of the blockchain. Finally, the vote is recorded as cast, and a confirmation is shown to the user that his/her vote has been cast.
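The flow of collecting votes, sealing them into blocks and linking the blocks can be sketched as follows. This is a single-process toy under stated assumptions, not the authors' Solidity contract: the block size, field names and `VoteLedger` class are hypothetical, voters are represented by opaque tokens, and mining, rewards and network-wide consensus are omitted; only the hash linking and the cross-check of a chain are shown.

```python
import hashlib
import json

BLOCK_SIZE = 3  # hypothetical number of votes per block


def block_hash(index, previous_hash, votes):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "previous_hash": previous_hash, "votes": votes},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class VoteLedger:
    def __init__(self):
        self.pending = []  # votes collected but not yet in a block
        self.chain = []    # sealed blocks, oldest first

    def cast(self, voter_token, candidate):
        self.pending.append({"voter": voter_token, "candidate": candidate})
        if len(self.pending) == BLOCK_SIZE:
            self._seal_block()

    def _seal_block(self):
        previous_hash = self.chain[-1]["hash"] if self.chain else "0" * 64
        index = len(self.chain)
        votes = self.pending
        self.chain.append({
            "index": index,
            "previous_hash": previous_hash,
            "votes": votes,
            "hash": block_hash(index, previous_hash, votes),
        })
        self.pending = []

    def is_valid(self):
        """Recompute every hash, as a node cross-checking the chain would."""
        previous_hash = "0" * 64
        for block in self.chain:
            if block["previous_hash"] != previous_hash:
                return False
            expected = block_hash(block["index"], block["previous_hash"], block["votes"])
            if block["hash"] != expected:
                return False
            previous_hash = block["hash"]
        return True


ledger = VoteLedger()
for i in range(6):
    ledger.cast(f"token-{i}", "Party A" if i % 2 == 0 else "Party B")
print(len(ledger.chain), ledger.is_valid())
```

Because each block's hash covers the previous block's hash, changing a vote in an earlier block makes `is_valid` fail for that block, which illustrates why the ledger is described as tamper-proof.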
4 Algorithm Working

SHA stands for secure hash algorithm, a family of hash functions developed by the NSA. A SHA algorithm takes an input message and returns a fixed-size alphanumeric string. It consists of mathematical operations run over digital data; by comparing the output, called the “hash,” with a given hash value, one can determine the data’s integrity (Fig. 1).

Fig. 1 SHA-256 algorithm model

• The SHA-256 algorithm takes a variable-length input and produces a 256-bit output.
• The output length does not depend on the size of the input: whether the input is long or short, and whether it consists of numbers, letters, or special characters, the output is always 256 bits long.

The SHA-256 algorithm has the following properties:

1. Deterministic
2. Fast computation
3. Pre-image resistance
4. Drastic change in output upon change in input
5. Puzzle friendly.
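Some of these properties can be observed directly with Python's standard `hashlib` module. The helper names and sample inputs below are illustrative assumptions; the sketch only demonstrates the fixed output length, determinism and sensitivity to small input changes, and says nothing about how hashing is wired into the proposed system.

```python
import hashlib


def sha256_hex(message):
    """Return the SHA-256 digest of a text message as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


short_digest = sha256_hex("A")
long_digest = sha256_hex("vote " * 10_000)
# 64 hex characters x 4 bits = 256 bits, whatever the input length.
print(len(short_digest) * 4, len(long_digest) * 4)

# Deterministic: the same input always gives the same digest.
print(sha256_hex("candidate-1") == sha256_hex("candidate-1"))

# A one-character change in the input changes many of the 256 output bits.
print(differing_bits(sha256_hex("candidate-1"), sha256_hex("candidate-2")))
```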
5 Details of Software and Hardware

Software Requirements

• OS: Windows 7 onwards, macOS.
• Editor: Visual Studio, Sublime Text, or any text editor.
• Server: Localhost.
• Node Package Manager (NPM): Node.js (SDK).
• Web3.js: a collection of libraries that allow the machine to connect to a local or remote Ethereum/blockchain node.
• Truffle Framework: a development environment that provides a large range of features for building, compiling, testing, and running a blockchain application and its environment.
• Solidity Pretty Printer: an extension used for beautifying the written code in order to increase its readability.
• Linter (Solhint): provides linting for code written in Solidity.
• Solidity: the coding language used for writing Ethereum smart contracts.
• Metamask: a browser extension and software cryptocurrency wallet. It also helps the local machine interact with the Ethereum blockchain by acting as an intermediary bridge.
• React.js: a JavaScript library used in Web development to create interactive elements and UI on Web sites.
• HTML: used for making Web pages.
• CSS: used for styling the Web pages.
• Ganache or Ganache CLI: a test RPC client used to create a fast, customizable Ethereum blockchain emulator on the local machine.

Hardware Requirements

• Processor: Pentium dual-core CPU, Intel Core i5 or above; Mac hardware is also suitable.
• HDD: 10+ gigabytes.
• RAM: 4+ gigabytes.
6 Use Case Diagram

The following diagram details the underlying flow of our e-voting system, covering both the front end and the back end (Fig. 2).

Steps

• The voter first needs to register on the Web site.
• The voter visits the election page and submits the OTP that is sent to their device.
• The voter is then redirected to the page where they can cast their vote.
• Once the vote is cast, a notification “voted submitted” is shown to the voter.
• The voter can log out of the Web page.
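The order of the steps above can be expressed as a small state machine. The sketch below is hypothetical: the class name, state names and the way the OTP is supplied are assumptions made for illustration, and it does not reflect the project's actual React.js front end or smart contract.

```python
class VotingSession:
    """Tracks one voter through register -> OTP check -> vote -> log out."""

    def __init__(self, voter_id, expected_otp):
        self.voter_id = voter_id
        self._expected_otp = expected_otp
        self.state = "registered"

    def submit_otp(self, otp):
        if self.state != "registered":
            raise RuntimeError("OTP can only be submitted right after registration")
        if otp != self._expected_otp:
            return False  # wrong OTP: stay on the verification step
        self.state = "verified"
        return True

    def cast_vote(self, candidate):
        if self.state != "verified":
            raise RuntimeError("voter must be verified and may vote only once")
        self.state = "voted"
        return f"vote for {candidate} submitted"

    def log_out(self):
        self.state = "logged_out"


session = VotingSession("voter-001", expected_otp="482913")
session.submit_otp("482913")
print(session.cast_vote("Party A"))
session.log_out()
```

Keeping the state explicit makes it straightforward to enforce the one voter one vote condition from the objectives, since a second call to `cast_vote` is rejected.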
7 Result

The home page lists the different parties (Fig. 3). The login page allows an admin or a voter to log in (Fig. 4). The next page verifies the Aadhaar card (Fig. 5). After the Aadhaar card number is entered, the phone number is verified using an OTP sent to the phone number linked to the Aadhaar card (Fig. 6). An acknowledgement is shown after the vote is cast (Fig. 7).
Fig. 2 Use case diagram
Fig. 3 Home page
Fig. 4 Admin or voter login
Fig. 5 Aadhaar card verification page
8 Discussion and Comparison Analysis

Advantages of Blockchain-based Voting over Traditional Ballot-based Voting (Table 1).

• Transparency: By using blockchain, votes can be counted and stored on a distributed public ledger. This allows all users to see the live records and results while the votes are being cast, helping to increase public trust and the legitimacy of the voting.
Fig. 6 OTP verification page
Fig. 7 Acknowledgement message

Table 1 Data obtained using blockchain-based voting system

| Voters count | Accurate verification | Accurate count of votes | Precision (%) |
|---|---|---|---|
| 3 | 3 | 3 | 100 |
| 5 | 5 | 5 | 100 |
| 10 | 10 | 10 | 100 |
| 30 | 30 | 30 | 100 |
• Security: Since the SHA-256 algorithm is used, the probability of being attacked by a hacker or any anonymous user decreases, which increases the trust in and security of the elections.
• Anonymity: Since only public keys (addresses), not the voters’ identities, are visible in the blockchain ledger, the probability of finding out who cast a given vote is almost negligible. This guarantee of remaining anonymous while voting can help increase the number of voters [1].
• Processing Time: Since the voting count is updated on the blockchain ledger as each vote is cast, the results are available live. This decreases the time needed for counting, updating, and later displaying the results to the public. It also increases authenticity and user trust, as the casting and counting of votes happen live in front of the public.
• Immutable and Permanent: Since the blockchain is an immutable ledger, the record of every vote cast in the election is saved in the ledger permanently. Hence, anyone who wants to verify the election data, whether now or in the future, is able to do so.
9 Conclusion

This paper presented an approach that uses blockchain technology to build an e-voting system with “smart contracts” written in Solidity as back-end code, in order to improve security and cost-efficiency while ensuring the user’s privacy during elections. The paper illustrates the new possibilities that blockchain technology offers to overcome the restrictions and adoption barriers of present e-polling practices, providing protection, traceability, coherence, and a foundation for transparency. Sending hundreds of transactions per second is possible with the help of a public blockchain integrated with Ethereum, which helps in utilizing every aspect of blockchain to the full [2]. Although blockchain voting is a controversial topic in both political and scientific circles, it still proves to be a better and far more advanced approach to conducting elections than traditional ballot elections, which lack scalability and usability, take more time to process, and cost more. Blockchain technology promises a lot, but it currently requires extensive, in-depth research to realize its full potential. Constant effort and a more focused approach toward the core of blockchain technology are needed to evolve it for wider and more complex applications [3].
References

1. A. Ben Ayed, A conceptual secure blockchain based electronic voting system. Int. J. Netw. Secur. Appl. 9(3), 01–09 (2017)
2. P.-Y. Chang, M.-S. Hwang, C.-C. Yang, A Blockchain-Based Traceable Certification System (Springer International Publishing AG, part of Springer Nature, Berlin, 2018)
3. A. Falade, A.A. Adebiyi, C.K. Ayo, M. Adebiyi, O. Okersola, E-voting system: the pathway to free and fair election in Nigeria. Electron. Gov. Int. J. 15(4), 439 (2019)
4. S. Bell, J. Benaloh, M.D. Byrne, D. Debeauvoir, B. Eakin, P. Kortum, N. McBurnett, O. Pereira, P.B. Stark, D.S. Wallach, G. Fisher, J. Montoya, M. Parker, M. Winn, Star-vote: a secure, transparent, auditable, and reliable voting system, in 2013 Electronic Voting Technology Workshop/Workshop on Trustworthy Elections (EVT/WOTE 13) (USENIX Association, Washington, D.C., 2013)
5. K. Dalia, R. Ben, Y.A. Peter, H. Feng, A fair and robust voting system, by broadcast, in 5th International Conference on E-voting (2012)
6. B. Adida, Helios: web-based open-audit voting, in Proceedings of the 17th Conference on Security Symposium, ser. SS’08 (USENIX Association, Berkeley, CA, USA, 2008), pp. 335–348
7. D. Chaum, A. Essex, R. Carback, J. Clark, S. Popoveniuc, A. Sherman, P. Vora, Scantegrity: end-to-end voter-verifiable optical-scan voting. IEEE Secur. Priv. 6(3), 40–46 (2008)
8. J.M. Bohli, J. Muller-Quade, S. Rohrich, Bingo voting: secure and coercion-free voting using a trusted random number generator, in Proceedings of the 1st International Conference on E-voting and Identity, ser. VOTE-ID’07 (Springer, Berlin, Heidelberg, 2007), pp. 111–124
9. B. Adida, R.L. Rivest, Scratch and vote: self-contained paper-based cryptographic voting, in Proceedings of the 5th ACM Workshop on Privacy in Electronic Society, ser. WPES ’06 (ACM, New York, NY, USA, 2006), pp. 29–40
10. D. Chaum, P.Y.A. Ryan, S. Schneider, A practical voter-verifiable election scheme, in Proceedings of the 10th European Conference on Research in Computer Security, ser. ESORICS’05 (Springer, Berlin, Heidelberg, 2005), pp. 118–139
11. D. Chaum, Secret-ballot receipts: true voter-verifiable elections. IEEE Secur. Priv. 2(1), 38–47 (2004)
12. D. Chaum, Untraceable electronic mail, return addresses, and digital pseudonyms. Commun. ACM 24(2), 84–90 (1981)
Project Topic Recommendation by Analyzing User’s Interest Using Intelligent Conversational System Pratik Rathi, Palak Keni, and Jignesh Sisodia
Abstract Finding a project that is germane to a student’s interests and is also on a trending topic is an onerous task. Students spend most of their time researching which project to work on in order to select the best of the topics pertinent to them. The proposed system aims to solve this problem with an intelligent conversational system that captures and analyzes the student’s interests and then recommends project topics by displaying relevant research papers and other key websites. The system has a chatbot that captures the user’s interests and derives keywords from the user’s information. The system then uses the Google Custom Search API, which ranks the results. The final results are displayed by incorporating the importance and weightage of the extracted keywords and user preferences. Thus, the proposed model displays results that align with the user’s interests and predilections, thereby giving accurate results.

Keywords Machine learning · Neuro-linguistic programming · Natural language toolkit · Natural language processing
P. Rathi (B) · P. Keni · J. Sisodia
Sardar Patel Institute of Technology, Bhavans Campus, Old D N Nagar, Munshi Nagar, Mumbai 400058, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_21

1 Introduction

With an exceedingly large number of domains and subdomains available to select and work upon, it becomes a paramount task to select the domain that particularly interests the user and is trending in today’s world. The proposed system aims to simplify this predicament by providing a one-stop, end-to-end application that facilitates the process of finding apt research papers [1, 2] in the student’s domain of interest. This system extracts essential keywords through
conversation with the user and then finds the subdomains the user is interested in. The system also aims to find the applications most suitable to the user’s liking. The paper proposes an approach wherein all the above attributes, namely the domains, their subdomains, and the applications, are used in priority order to find the most suitable papers and project topics according to the user’s needs. The intelligent conversational system refers to the part where the system identifies the important keywords from the user’s conversation with the bot. This is done using NLP and the NLTK libraries, based on noun extraction. Usage of the libraries is explained in detail in the methodology and implementation. The system also offers the user subdomains of each keyword so that the choice of interest can be made quite specific. The user just has to prioritize their interests, and the system will automatically find and recommend [3, 4] the best papers based on them.
1.1 Natural Language Processing (NLP)

NLP is an amalgamation of subfields of linguistics, artificial intelligence, and computer science that helps us process and analyze large amounts of natural language data. It is the process by which software or a machine understands text or speech and performs various operations on it [5]. An analogy for NLP is the way humans communicate, understand each other’s views, and respond with pertinent answers; in NLP, the interaction, comprehension, and response are carried out by a computer rather than a human. This technology can help us accurately derive insights from the extracted information. In our system, NLP is used to extract the answers from the users and process them for further use.
1.2 Parts of Speech (POS) Tagging Tagging is the process of assigning a description token to a word [6]. In parts of speech (POS) tagging, each word in a sentence is assigned a part of speech of the English language; the parts of speech are the description tokens. Familiar parts of speech in English include nouns, pronouns, adjectives, verbs, adverbs, and conjunctions, each with its own subcategories. The algorithms mainly used for POS tagging are stochastic and rule-based.
Project Topic Recommendation by Analyzing User’s Interest …
1.3 NLTK (Natural Language Toolkit) NLTK is an open-source Python suite of libraries and programs for natural language processing [7]. It is one of the most powerful NLP libraries and contains many useful packages that help systems understand human language and reply with a germane response. It includes text processing libraries for tasks such as tokenization, parsing, and POS tagging, which help build our system.
2 Methodology In this research on project topic recommendation, a chatbot is used to obtain the domain of the user's interest, and questionnaires based on the user's inputs to the chatbot are defined to gain a deeper understanding of those interests. The first step is to analyze the input the user provides through the chatbot, that is, their interests and what they would like to learn, in order to identify the domains best suited to them. These domains are in turn used to identify the most suitable papers. Suitable topics are then found using NLP, which involves the steps below.
2.1 Lemmatization/Stemming This step reduces word variants to simpler forms, which can increase the coverage of NLP utilities, e.g., by removing the suffixes “s” and “ing” [8].
2.2 Tokenization In this step, streams of characters are divided into tokens for further processing and comprehension [9]. Depending on the use case, tokens can be numbers, punctuation, identifiers, or words.
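As an illustration of this step, the following minimal sketch splits a sentence into word, number, and punctuation tokens using only Python's standard `re` module. It is not the tokenizer used in the paper (which relies on NLTK); the pattern `TOKEN_PATTERN` and the function name `tokenize` are assumptions made for this example.

```python
import re

# Hypothetical pattern: words (with internal hyphens or apostrophes),
# numbers (with an optional decimal part), or single punctuation marks.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*|\d+(?:\.\d+)?|[^\w\s]")

def tokenize(text):
    """Split a stream of characters into word, number, and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)

tokens = tokenize("My area of interest is machine learning.")
print(tokens)
```

A regular expression is enough to show the idea; a production tokenizer would also have to handle abbreviations, URLs, and language-specific rules.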
2.3 Tagging and Acronym Normalization The same concept can be expressed as an acronym or in its expanded form, e.g., “ML” or “machine learning,” so both are mapped to a single normalized form [10].
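A possible way to implement acronym normalization is a simple lookup table, as sketched below. The table `ACRONYMS` and the function `normalize_acronyms` are hypothetical; the paper does not list the acronyms it handles, and a real system would need a curated, domain-specific table.

```python
import re

# Hypothetical acronym table, for illustration only.
ACRONYMS = {
    "ml": "machine learning",
    "nlp": "natural language processing",
    "ai": "artificial intelligence",
}

# Match any known acronym as a whole word, ignoring case.
ACRONYM_PATTERN = re.compile(r"\b(" + "|".join(ACRONYMS) + r")\b", re.IGNORECASE)

def normalize_acronyms(text):
    """Replace known acronyms with their expanded forms."""
    return ACRONYM_PATTERN.sub(lambda m: ACRONYMS[m.group(0).lower()], text)

print(normalize_acronyms("I like ML and NLP"))
```

The word-boundary anchors keep the replacement from firing inside longer words such as "email".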
Following these steps, the user answers questionnaires to select the subdomains from the domains identified. All the topics of user interest are then looked up using Google Search, and the papers found are ranked according to the user's interest priority using a modified ranking algorithm.
2.4 POS Tagging The crux is to match the split tokens with tags such as adverbs, nouns, or adjectives [11]. The process of classifying words and labeling them is known as POS tagging. Parts of speech are also called word classes or lexical categories. The algorithms are of two types: stochastic and rule-based. The collection of tags used for a task is known as a tag set. Some of the POS tags are listed below [12]:
• CC—Coordinating conjunction
• NN—Noun
• CD—Cardinal digit
• NNS—Noun, plural
• NNPS—Proper noun, plural
• RB—Adverb
• UH—Interjection
• NNP—Proper noun, singular
• VB—Verb.
Figure 1 shows POS tagging with the NLTK library, wherein each word is assigned a POS tag using rule-based POS tagging.
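To make the idea of rule-based tagging concrete, the toy tagger below assigns tags from the list above using a tiny lexicon and a few suffix and capitalization rules. It is only a sketch: the lexicon `LEXICON` and the rules are invented for this example and are far cruder than the tagger shipped with NLTK.

```python
# Toy lexicon, invented for illustration; real taggers use much richer resources.
LEXICON = {"and": "CC", "or": "CC", "but": "CC", "oh": "UH", "wow": "UH", "like": "VB"}

def tag_word(word):
    """Return a Penn-style tag for one token using simple hand-written rules."""
    lower = word.lower()
    if lower in LEXICON:
        return LEXICON[lower]
    if lower.replace(".", "", 1).isdigit():
        return "CD"                                   # cardinal digit
    if lower.endswith("ly"):
        return "RB"                                   # adverb
    if word[0].isupper():
        return "NNPS" if lower.endswith("s") else "NNP"   # proper noun
    if lower.endswith("s"):
        return "NNS"                                  # plural noun
    return "NN"                                       # default: noun

def pos_tag(tokens):
    """Tag every token in a list of tokens."""
    return [(token, tag_word(token)) for token in tokens]

print(pos_tag(["Alice", "and", "Bob", "like", "3", "papers"]))
```

Rules like these mislabel many words (for example, any capitalized sentence-initial word becomes a proper noun), which is one reason stochastic taggers are commonly preferred in practice.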
3 Implementation This section describes the implementation in detail so that the working of the system is transparent.
Fig. 1 POS tagging using NLTK library
Fig. 2 Sequence diagram of the application
The complete workflow is shown in Fig. 2. Each stage, namely user–chatbot interaction, domain extraction, user subdomain identification, application selection, and modified search ranking results, is described briefly in the following subsections.
3.1 User–Chatbot Interaction The first step is to collect the user's inputs through a chatbot in order to identify the user's domain, which is in turn used to identify the most suitable papers [13]. Figure 3 shows the chatbot's interaction with a user.
3.2 NLP—Domain Extraction (Entity Extraction) This is the first phase of the project: extracting the keywords that denote the particular domain in which the user is interested. It is also known as information extraction from the given text. NLP is performed on the answers to the questions asked by the chatbot, and the important keywords are fetched from them. For example, if the answer to the question “What is your area of interest?” is “My area of interest is machine learning,” the system extracts the keyword “machine learning” from the sentence. Library used: NLTK.
Fig. 3 Chatbot question to the user
This process is essentially entity extraction, and the entities are nouns. Consider, for example, finding the relationship between a person and their interests: by extracting the type of each entity (person, interest, etc.), it becomes easy to determine the association between the entities. With this, the sentiment toward an entity across the whole document can also be analyzed. The process involves the following steps:
• Sentence segmentation
• Tokenization
• Part of speech tagging
• Entity extraction.
Sentence segmentation: It divides the whole document into a list of sentences for further processing.
Tokenization: It splits the sentences into small chunks or tokens.
POS tagging: It is the most arduous task in this whole process. The crux is to match the split tokens with tags such as adverbs, nouns, or adjectives.
Figure 4 shows the extraction of the user's domain, wherein “public blockchain” and “machine learning” were extracted as the user's topics of interest from the sentence.
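The final entity extraction step can be sketched as grouping consecutive noun-tagged tokens into phrases. The example below is a simplified stand-in for the code in Fig. 4, which is not reproduced in the text: the tagged input `TAGGED` is hand-written (the tags are assumed, not produced by a tagger here), and the list `GENERIC_NOUNS` of words to ignore is hypothetical.

```python
# Hand-written (word, tag) pairs standing in for POS-tagger output (assumed tags).
TAGGED = [
    ("My", "PRP$"), ("area", "NN"), ("of", "IN"), ("interest", "NN"),
    ("is", "VBZ"), ("machine", "NN"), ("learning", "NN"), (".", "."),
]

# Hypothetical list of generic nouns that should not count as domains.
GENERIC_NOUNS = {"area", "interest", "domain", "field", "topic"}

def extract_noun_phrases(tagged):
    """Group consecutive tokens whose tag starts with NN into lowercase phrases."""
    phrases, current = [], []
    for word, tag in tagged:
        if tag.startswith("NN"):
            current.append(word.lower())
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

def extract_domains(tagged):
    """Keep only the noun phrases that are not generic filler nouns."""
    return [p for p in extract_noun_phrases(tagged) if p not in GENERIC_NOUNS]

print(extract_domains(TAGGED))
```

Under these assumptions the sample sentence yields the single domain keyword "machine learning", matching the example given in Sect. 3.2.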
3.3 User Subdomains Identification The next task is to select the subdomains from the domains identified. For example, if “machine learning” is the keyword extracted from the user's input, the system displays subdomains related to machine learning, and the user assigns a priority value (ranging from 0 to 5) to each subdomain. These priorities are in turn used in our modified search ranking.
Fig. 4 Code for domain extraction
In general, the system lists the subdomains of each identified domain, and the user selects from them. The model displays only a few subdomains, although a comprehensive list could be displayed. The user selects the subdomains of interest and assigns a priority to each; the higher the priority assigned, the greater the importance of that subdomain. Figure 5 shows subdomains with priorities based on the user's domain selection: five subdomains are provided for blockchain, each with an input box for its priority.
Fig. 5 Subdomain selection
Fig. 6 Application selection
3.4 Application Selection Once the user selects the subdomains of his/her choice, the user will be prompted to select the applications corresponding to each domain. The advantage of selecting the applications is that it enhances the accuracy of search and displays more accurate and specific results which are relevant to the user. Figure 6 shows application identification with priority based on the user’s domain selection. We can see five applications related to blockchain with an input box for priority.
3.5 Modified Search Ranking Results The PageRank algorithm is used by Google Search to rank the websites in its search results [14]. In our system, after the user selects the applications for a particular identified domain, the system retrieves the titles and abstracts of the research papers and the links returned by the Google Search engine. Each link in the search results is processed with the Python Beautiful Soup library, which extracts the text from the HTML web pages for further processing. The system then uses NLP to process the text of each link in the following steps:
• Tokenizing: Tokenizing breaks up the website text into words for further processing. It helps the system turn unstructured data from websites into structured data that is convenient for the subsequent processing steps.
• Stop words filtering: Stop words filtering removes the most common words such as “I,” “an,” and “is.” These words do not add significant meaning to the text and occur far more frequently than other words, so they can be ignored in further processing.
• Stemming: Stemming truncates words to their root forms. This allows the system to capture the basic meaning of a word instead of analyzing its different forms separately. For example, the words “mining” and “mined” found on a website are reduced to “mine.” The stemming algorithm used is the Porter stemmer.
• Lemmatizing: Lemmatizing is another process for reducing words to their base form. It converts plurals, comparatives, and superlatives to the base word.
• Frequency distribution: The frequency distribution shows, as a graph, how often each word occurs in the text. It helps to analyze the occurrences of the subdomain and application keywords in the system (Fig. 7).
Based on the priority that the user specified in the subdomain section and on the significance and frequency of the keywords and titles, the system generates a score by multiplying these numbers. The final result is displayed to the user sorted in descending order of score, so that the links are rearranged according to the priority the user assigned to each topic. In this way, current trends as well as user-specific interests are taken into account when displaying the results.
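The scoring step described above can be sketched as follows. This is a minimal illustration under several assumptions, not the authors' implementation: page text is taken as already extracted, keywords are single words, the stop word set `STOP_WORDS` is a small made-up subset, and the per-keyword products of priority and frequency are summed into one page score (the paper states only that the numbers are multiplied).

```python
import re
from collections import Counter

# Small illustrative subset of stop words (assumption).
STOP_WORDS = {"i", "an", "is", "a", "the", "of", "and", "in", "for", "to"}

def keyword_counts(text):
    """Tokenize, lowercase, drop stop words, and count the remaining words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def score_page(text, priorities):
    """Sum of (user priority x keyword frequency) over the user's keywords."""
    counts = keyword_counts(text)
    return sum(priority * counts[keyword] for keyword, priority in priorities.items())

def rank_pages(pages, priorities):
    """Return (url, score) pairs sorted in descending order of score."""
    scored = [(url, score_page(text, priorities)) for url, text in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical pages and priorities (0 to 5), for illustration only.
pages = {
    "https://example.org/a": "Mining in blockchain: mining rewards and consensus",
    "https://example.org/b": "Consensus is the key",
}
priorities = {"mining": 5, "consensus": 2}
print(rank_pages(pages, priorities))
```

A keyword with priority 0 contributes nothing to the score, so pages that mention only low-priority topics sink to the bottom of the list.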
Fig. 7 Frequency distribution diagram for the processed words in the webpage link “https://jfinswufe.springeropen.com/articles/10.1186/s40854-020-00217-x”
Fig. 8 Final paper links recommendation
4 Results Figure 8 shows the final paper links displayed to the user, arranged in descending order of the keyword count score. No accuracy figure is reported for the system, because its results depend on the results returned by Google: the system scrapes the Google result links based on the user's interest keywords and then generates the result shown to the user. Nevertheless, the final results are expected to be more relevant to the user, as the system always sorts the links by the significance of the user's keywords in each link returned by Google.
5 Limitations The results depend heavily on the Google Search engine used to search for the titles, abstracts, and links of the research papers. The results are also modified by taking the user-specific interests into consideration. Given these two points, if Google Search returns no useful information related to the user's interests, the system has no valid results to show.
6 Conclusion The above results indicate that the system can help students develop projects that align well with their interests and with current trends in the industry. They also suggest that the same process could be used to analyze test results submitted by a student in order to derive their interests and priorities. This approach can direct the efforts of students and lessen the difficulty they face in finding project topics.
References
1. C. Pan, W. Li, Research paper recommendation with topic analysis, in 2010 International Conference on Computer Design and Applications (2010), pp. V4-264–V4-268
2. K. Haruna, M. Akmar Ismail, D. Damiasih, J. Sutopo, T. Herawan, A collaborative approach for research paper recommender system. PLOS ONE 12(10), e0184516 (2017)
3. M.C.V. Joe, J.S. Raj, Location-based orientation context dependent recommender system for users. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 14–23 (2021)
4. A. Pushpalatha, S.J. Harish, P.K. Jeya, S. Madhu Bala, Gadget recommendation system using data science, in 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS) (IEEE, 2020), pp. 1003–1005
5. M.C. Surabhi, Natural language processing future, in 2013 International Conference on Optical Imaging Sensor and Security (ICOSS) (2013), pp. 1–3
6. S.G. Kanakaraddi, S.S. Nandyal, Survey on parts of speech tagger techniques, in 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT) (2018), pp. 1–6
7. X. Schmitt, S. Kubler, J. Robert, M. Papadakis, Y. LeTraon, A replicable comparison study of NER software: StanfordNLP, NLTK, OpenNLP, SpaCy, Gate, in 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) (2019), pp. 338–343
8. Stemming and lemmatization. Available: https://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html
9. Tokenization. Available: https://nlp.stanford.edu/IR-book/html/htmledition/tokenization-1.html
10. I. Gupta, N. Joshi, Tweet normalization: a knowledge based approach, in 2017 International Conference on Infocom Technologies and Unmanned Systems (Trends and Future Directions) (ICTUS) (2017), pp. 157–162
11. A.P. Silva, A. Silva, I. Rodrigues, An approach to the POS tagging problem using genetic algorithms, in Computational Intelligence. Studies in Computational Intelligence, vol. 577, ed. by K. Madani, A. Correia, A. Rosa, J. Filipe (Springer, Cham, 2015)
12. Penn part of speech tags. Available: https://cs.nyu.edu/grishman/jet/guide/PennPOS.html
13. M. Nuruzzaman, O.K. Hussain, A survey on chatbot implementation in customer service industry through deep neural networks, in 2018 IEEE 15th International Conference on e-Business Engineering (ICEBE) (2018), pp. 54–61
14. P.S. Yerma, A.K. Majhvar, Updated page rank of dynamically generated research authors’ pages: a new idea, in 2016 IEEE International Conference on Recent Trends in Electronics, Information Communication Technology (RTEICT) (2016), pp. 879–882
Abnormal Behaviour Detection in Smart Home Environments P. V. Bala Suresh and K. Nalinadevi
Abstract The study of user behaviour patterns in activities of daily living is significantly challenging due to the ambiguity in distinguishing usual from unusual activities. Capturing user activities with ambient sensors in a smart home environment serves as a safe and effective way of doing so. Detecting the unusual activities of the residents helps in regularizing their daily tasks. The paper focuses on unsupervised methods for anomaly detection based on clustering. A comparison study is carried out between different unsupervised clustering models: self-organizing map, density-based clustering, and Gaussian mixture models. The unsupervised clustering recognizes the normal user behaviour of each activity and looks for any deviations or abnormalities. The experimental study is done on real-time public datasets, and the silhouette score is used to identify the best abnormality rate for each activity.
Keywords Self-organizing map (SOM) · Gaussian mixture model (GMM) · Recurrent neural network (RNN) · Density-based spatial clustering of applications with noise (DBSCAN) · Hidden Markov model (HMM) · Gated recurrent unit (GRU)
1 Introduction A smart home environment [1] paves the way for monitoring and assisting the daily activities of its residents. It helps ensure the health care [2], safety, comfort, and security of the residents. Low-cost environmental sensors monitor the residents' activities of daily living. Health has always been a major concern in human life, and people are adopting various home automation products to monitor activities in the smart home [3].
P. V. Bala Suresh (B) · K. Nalinadevi Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India e-mail: [email protected] K. Nalinadevi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_22
This paper focuses on the use of sensor data generated in a smart home [4] environment. The captured data is used to find deviating behaviour patterns of the residents living in a smart home; this is called abnormal behaviour detection [5]. Abnormal behaviour is broadly classified into two types: order-based and duration-based. Order-based abnormal behaviour detection identifies deviations and irregularities in the order of the activities carried out by residents. For example, a resident failing to take medicines at the prescribed time is identified from the abnormality in the sequence of activities followed by the resident. Duration-based abnormal behaviour detection determines whether an unusual amount of time is spent on an activity, such as a user spending more time than usual in the toilet during the night. This kind of abnormality detection aims to help elderly people [6] perform their daily routines independently, and it also helps in the early detection of risks related to specific health conditions. The paper discusses duration-based behaviour analysis of residents in a smart home environment. The novelty of the work is to learn the behavioural patterns of the residents from the smart home sensor data for both long- and short-duration activities. The use cases of duration-based abnormal behaviour detection include:
1. In the bed-to-toilet activity, a resident spending more time than usual in the toilet at night may be an indicator of dehydration.
2. In the cooking activity, the stove being on for longer than expected indicates that the resident is busy with another activity.
3. In the work activity, a resident spending more time than the learned model predicts triggers an alert to the user (a message such as a reminder to eat or to take medicine, especially for elderly smart home users).
4. In the leaving and entering home activities, a resident taking more time than usual may indicate that the resident forgot to close the door properly.
5. In the eating activity, a resident taking more time than usual may indicate an immune system problem or a heavy meal.
6. In the sleeping activity, a resident spending more time in bed is associated with an increased risk of diabetes, heart disease, stroke, and death.
7. A new duration pattern observed in any of the activities indicates unusual behaviour, and an alert is sent to the doctors.
Anomaly detection is still a challenging and open problem. Most researchers have worked on sequence pattern anomaly detection with the help of HMMs and sequence pattern mining algorithms [7] such as PrefixSpan and SPADE. This research is useful to smart home residents as a remote health monitoring system [8] for risk reduction and early detection. The proposed approach develops unsupervised, clustering-based methods to detect abnormal behaviour in smart home environments. The first task is to examine the time series data generated by the sensors in the smart home and to make sure that it contains no outliers or faulty data. The approach then compares clustering models, namely the self-organizing map, density-based clustering, and Gaussian mixture models, for detecting abnormal behaviour in each activity. To evaluate the methods, the dataset collected from the Aruba testbed in the CASAS repository is used. The performance metrics are the anomaly rate of each activity and the silhouette coefficient. The paper is structured into the following sections: Sect. 2 Related Works, Sect. 3 Proposed Architecture, Sect. 4 Clustering Algorithms, Sect. 5 Results and Discussion, and Sect. 6 Conclusions.
2 Related Works In recent years, abnormal behaviour in smart home environments has been studied using many methods [9], such as variants of recurrent neural networks (RNNs) (vanilla RNNs, long short-term memory networks, and gated recurrent units), hidden Markov models, conditional random fields, sequential algorithms, and clustering algorithms. Existing works traditionally use unsupervised machine learning algorithms to detect abnormal behaviour. The paper [10] presents activity recognition and behaviour analysis in a smart home, where abnormal behaviour detection is carried out after the activity recognition task; abnormal behaviour is detected using a sequential pattern mining algorithm, PrefixSpan, to find frequent short activities. In [11], the authors mark patterns deviating from the SOM clusters as unusually short or long abnormalities at each location, based on smart home sensor data. The proposed work identifies short and long anomalies over the complete set of sensor readings, covering all activities of varied duration in the entire house. In [12], the authors looked into a sequence-based method for detecting abnormal behaviour in dementia patients based on deviations from usual patterns. In [13], the authors proposed a simple way of leveraging a database to detect sequence abnormality in in-home activities, with a test accuracy of 90.79%. In [14], the authors compared various unsupervised clustering techniques for anomaly detection using ship data from marine engine systems. The proposed work investigates abnormal behaviour detection for each activity with the help of several unsupervised clustering models.
3 Proposed Architecture The process of anomaly detection using unsupervised clustering-based methods is shown in Fig. 1. The time series data collection generates a large amount of sensor data, which is labelled with a sensor ID and annotated with activities. The data is pre-processed so that it can be easily analysed for model construction in order to detect abnormal behaviour of the smart home user.
Fig. 1 Architecture diagram
3.1 Dataset Description The Aruba dataset obtained from the Centre for Advanced Studies in Adaptive Systems (CASAS), Washington State University [15], is used in this experiment. The sample dataset shown in Fig. 2 contains recordings of an elderly woman's household activities over 220 days. The dataset covers eight types of activities: meal preparation, relaxation, eating, working, sleeping, toileting, entering home, and leaving home. Each sample contains a timestamp, sensor name, and sensor status. Annotations mark only the beginning and end of an activity. This experiment mainly focuses on the following attributes (Fig. 3):
• AT, denoting the type of activity performed by the user
• ST, denoting the start time of an activity
• TD, denoting the total duration of an activity.
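The pre-processing that derives AT, ST, and TD from the begin/end annotations could look roughly like the sketch below. The line layout (date, time, sensor name, sensor status, and an optional activity annotation ending in "begin" or "end") and the sample lines in `SAMPLE` are assumptions made for illustration; they are not copied from the dataset. Expressing ST as minutes since midnight and TD in minutes is likewise a choice made here, not one stated in the paper.

```python
from datetime import datetime

# Invented sample lines in an assumed layout:
# date, time, sensor name, sensor status, optional "<Activity> begin|end".
SAMPLE = """\
2010-11-04 00:00:00.000000 M003 ON Sleeping begin
2010-11-04 02:30:00.000000 M003 OFF
2010-11-04 05:30:00.000000 M003 OFF Sleeping end
2010-11-04 05:40:00.500000 M004 ON Bed_to_Toilet begin
2010-11-04 05:43:00.500000 M004 OFF Bed_to_Toilet end
"""

def parse_activities(lines):
    """Pair begin/end annotations and return one AT/ST/TD record per activity instance."""
    open_starts = {}
    records = []
    for line in lines:
        parts = line.split()
        if len(parts) < 6:
            continue  # plain sensor event without an activity annotation
        stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S.%f")
        activity, marker = parts[4], parts[5]
        if marker == "begin":
            open_starts[activity] = stamp
        elif marker == "end" and activity in open_starts:
            start = open_starts.pop(activity)
            records.append({
                "AT": activity,                                   # activity type
                "ST": start.hour * 60 + start.minute,             # start, minutes since midnight
                "TD": (stamp - start).total_seconds() / 60.0,     # duration in minutes
            })
    return records

records = parse_activities(SAMPLE.splitlines())
print(records)
```

The (ST, TD) pairs of each activity type are the kind of two-dimensional observations that the clustering models in Sect. 4 operate on.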
Fig. 2 Sample Aruba testbed dataset
Fig. 3 Pre-processed dataset of relax activity
4 Clustering Algorithms Unsupervised clustering-based methods are used to distinguish normal from abnormal behaviour in the sensor data of each activity. The methods explored for anomaly detection are the self-organizing map, density-based clustering (DBSCAN), and Gaussian mixture models. Clustering is performed on the start time and total duration of the recognized activities using Python libraries.
4.1 Self-organizing Map (SOM) The SOM, or Kohonen map [16], is an artificial neural network for unsupervised learning based on competitive learning. It involves three main components: competition, cooperation, and adaptation. Competition finds the best matching unit on the map using the Euclidean distance; cooperation identifies the neighbouring units on the map that hold similar information; and adaptation adjusts the weights of the winning neuron and its neighbours. The SOM also gives extra information about the nature of an anomaly, namely whether it is short or long for the activity.
4.1.1 Anomaly Detection Using SOM Clustering In the SOM, the observations (start time and duration) of each activity organize themselves by competing with each other. The initial weight vectors [17] are assumed to be random values in [0, 1], and the learning rate is fixed based on the cluster quality. The input observation vectors are selected randomly, and the Euclidean distance is used to find the weight vector that best matches each observation. The best-matching weights and their neighbouring weights are rewarded, and each resulting cluster is analysed.
Fig. 4 Clustering distribution of SOMs for the eat and relax activities: (A) eat activity, (B) relax activity
The distributions of the clusters for the relax and eat activities are shown in Fig. 4, where the clusters are represented as a 10 × 10 lattice of nodes. The number of clusters is estimated using the elbow method. The clustering algorithm is evaluated with various values of the hyperparameters, such as the learning rate and initial weights. In the learned SOM, each estimated cluster is represented by a neuron, and the winning coordinates of each neuron are the centroid of a cluster formed for an activity performed by the residents. The distance from each observed behaviour of an activity to its winning coordinates is then calculated. Observations with small distances are treated as normal behaviour, whereas a distance above the threshold is treated as abnormal behaviour; the threshold is calculated from the whiskers of the boxplot for each activity. The SOM labels an abnormal behaviour as either a long or a short anomaly based on the winning coordinates (start time and duration) of the learned SOM. The behaviour identified by the SOM for the eat and relax activities is shown graphically in Fig. 5.
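The procedure above can be sketched in plain Python as follows. This is a simplified illustration rather than the authors' code: it uses a one-dimensional chain of nodes instead of a 10 × 10 lattice, assumes the (start time, duration) pairs are scaled to [0, 1], takes the upper whisker as Q3 + 1.5 × IQR, and calls an anomaly long when its duration exceeds that of its winning node. All function names are hypothetical.

```python
import math
import random
import statistics

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is nearest to x (Euclidean distance)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def train_som(data, n_nodes=4, epochs=50, lr=0.5, seed=0):
    """Train a 1-D chain of nodes on observations scaled to [0, 1]."""
    rng = random.Random(seed)
    weights = [[rng.random() for _ in data[0]] for _ in range(n_nodes)]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)            # linearly decaying learning rate
        for x in rng.sample(data, len(data)):       # random presentation order
            bmu = best_matching_unit(weights, x)    # competition
            for i in (bmu - 1, bmu, bmu + 1):       # cooperation: BMU and chain neighbours
                if 0 <= i < n_nodes:
                    h = 1.0 if i == bmu else 0.5
                    weights[i] = [w + rate * h * (xi - w)   # adaptation
                                  for w, xi in zip(weights[i], x)]
    return weights

def whisker_threshold(values):
    """Upper boxplot whisker, Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 + 1.5 * (q3 - q1)

def label_observations(data, weights):
    """Label each (start_time, duration) pair as 'normal', 'long', or 'short'."""
    bmus = [best_matching_unit(weights, x) for x in data]
    distances = [math.dist(x, weights[b]) for x, b in zip(data, bmus)]
    threshold = whisker_threshold(distances)
    labels = []
    for x, b, d in zip(data, bmus, distances):
        if d <= threshold:
            labels.append("normal")
        else:
            labels.append("long" if x[1] > weights[b][1] else "short")
    return labels
```

With a trained map, `label_observations(data, train_som(data))` returns one label per observation; the quality of the result depends on the number of nodes and the learning rate, which the paper tunes per activity.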
4.2 DBSCAN DBSCAN [18] stands for density-based spatial clustering of applications with noise. Unlike many other clustering techniques, DBSCAN does not require the number of clusters to be specified in advance. The cluster shape is heavily influenced by the parameters ε (epsilon) and k (min-points). A special feature of the DBSCAN algorithm is its direct recognition of noise points, which are treated here as abnormalities.
Fig. 5 Clusters of SOMs for the eat and relax activities: (A) eat activity, (B) relax activity
4.2.1 Anomaly Detection Using DBSCAN In DBSCAN, the observations (start time and duration) of each activity are fed directly to the model to obtain the abnormal behaviour. As the model is highly dependent on epsilon and min-points, the best epsilon is chosen from the knee graph, and min-points is chosen randomly. If the epsilon value is very high, more observations are merged into each cluster and the number of clusters decreases, and vice versa; it is therefore important to choose a proper epsilon value, as otherwise the model may overfit. Observations that do not belong to any cluster are marked as noise points, that is, abnormal behaviour. The behaviour identified by DBSCAN for the eat and relax activities is shown graphically in Fig. 6.
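For readers who want to see how noise points arise, here is a compact pure-Python version of the DBSCAN procedure. It is a teaching sketch, not the library implementation used for the experiments; the sample points and the values `eps=2` and `min_pts=3` are made up, and in practice the two features would need to be scaled to comparable ranges before clustering.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; points in no cluster get NOISE (-1)."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE              # may be re-labelled later as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:       # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels

# Made-up (start hour, duration in minutes) observations for one activity.
points = [(8.0, 10), (8.1, 11), (7.9, 10.5), (8.2, 9.5),
          (19.0, 40), (19.2, 41), (18.9, 39.5), (19.1, 40.5),
          (13.0, 90)]
labels = dbscan(points, eps=2, min_pts=3)
print(labels)
```

The isolated observation ends up with the label -1, which in the paper's setting corresponds to abnormal behaviour.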
4.3 GMM GMM stands for Gaussian mixture model. It is also an unsupervised clustering algorithm, in which the clusters are called distributions. The parameters of the Gaussian distributions, such as the mean and variance, are used to calculate the probability of each observation of an activity belonging to each cluster.
4.3.1 Anomaly Detection Using GMM The distribution of the Gaussian mixture model depends on the mixing coefficients of the individual clusters; the distribution function is shown in Eq. (1).
Fig. 6 Clusters of DBSCAN for the eat and relax activities: (A) eat activity, (B) relax activity
\[ F(x) = \sum_{d=1}^{D} \pi_d \, \mathcal{N}(x \mid \mu_d, \Sigma_d) \tag{1} \]
where $\pi_d$ are the mixing coefficients, $D$ is the number of Gaussians, and $\mathcal{N}(x \mid \mu_d, \Sigma_d)$ is the multivariate Gaussian distribution with mean $\mu_d$ and covariance matrix $\Sigma_d$. The density depends on the multivariate Gaussian distributions, whose model parameters (means, covariance matrices, and mixing coefficients) are estimated using the expectation maximization (EM) algorithm; the probabilities of normal and abnormal behaviour are thus derived. The E step evaluates the latent variables under the current parameters, and the M step re-estimates the parameters using these latent variables. For each distribution of an activity, behaviour that lies within μ ± 2σ is classified as normal, and behaviour outside this range is marked as abnormal. The behaviour identified by the GMM for the eat and relax activities is shown graphically in Fig. 7.
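A one-dimensional version of the EM procedure and the μ ± 2σ rule can be sketched as below. It is a simplification for illustration: the paper clusters two-dimensional (start time, duration) observations with full covariance matrices, whereas this sketch fits scalar durations only. The deterministic initialisation, the variance floor, and the rule that an observation is abnormal when it falls outside the band of every component are assumptions of this example.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, n_components=2, iterations=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture with EM (n_components >= 2).

    Returns (mixing coefficients, means, variances).
    """
    lo, hi = min(xs), max(xs)
    # Deterministic initialisation: means spread evenly over the data range.
    means = [lo + (hi - lo) * k / (n_components - 1) for k in range(n_components)]
    overall_mean = sum(xs) / len(xs)
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / len(xs)
    variances = [max(overall_var, var_floor)] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(iterations):
        # E step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1.0
            resp.append([pk / total for pk in p])
        # M step: re-estimate mixing coefficients, means, and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                var_floor,
            )
    return weights, means, variances

def is_abnormal(x, means, variances, n_sigma=2.0):
    """Abnormal if x lies outside mean +/- n_sigma * sigma of every component."""
    return all(abs(x - m) > n_sigma * math.sqrt(v) for m, v in zip(means, variances))

# Made-up activity durations in minutes, forming two groups.
durations = [9, 10, 11, 10, 10, 29, 30, 31, 30, 30]
weights, means, variances = fit_gmm_1d(durations)
print(means, is_abnormal(20, means, variances))
```

Widening or narrowing `n_sigma` trades missed anomalies against false alarms, so the value 2 used in the paper is best read as a tunable threshold.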
5 Results and Discussion Anomaly rate and silhouette score are used to analyse the performance of abnormal behaviour detection in smart homes. Anomaly rate is the ratio of the total number of anomalies found and the total number of observations in each activity. Silhouette coefficient or silhouette score [19] is a metric used to calculate the quality of clusters formed from each cluster identified. It is measured based on inter-cluster distance and intra-cluster distance formed for each activity. The range of the metric is between −1 and 1. The anomaly rate of each activity refers to anomalous behaviour found in the user. The lower anomaly
Abnormal Behaviour Detection in Smart Home Environments
A) EAT Activity
297
B) RELAX Activity
Fig. 7 Clusters of GMM for the eat and relax activities
Table 1 Silhouette score for GMM, SOM, and DBSCAN
Activity
DBSCAN
GMM
SOM
Relax
0.017
0.31
0.49
Meal
0.52
0.010
0.61
Eat
0.22
0.67
0.64
Sleep
0.67
0.58
0.83
Work
0.48
0.56
0.50
Bed to toilet
0.48
0.54
0.50
Leave home
0.013
0.014
0.64
Enter home
0.48
0.42
0.566
with a good silhouette score makes the user healthy and risk-free. Table 2 shows the results of anomaly rate for each activity and Table 1 shows the comparison of the cluster quality for SOM, DBSCAN, and GMM models. The silhouette score helps to identify the best model for each activity to detect abnormal behaviour. SOM performs well on identifying abnormal behaviour in activities like relax, meal preparation, sleep, leave home, and enter home. The abnormal behaviour detected by the SOM models will be calculated through deviations caused by the behaviour of the user. For example, the user in this smart home [20, 21] sleeps for about 5–6 h generally between 11:15 PM to 12:00 PM and sometimes 4–5 h between 12:00 AM to 12:30 AM. The anomaly rate for SLEEP activity is 5.26%; 18 anomalies are detected out of 360 observations deviating the learned duration. If the user slept for a longer or shorter duration than the learned threshold, then it is marked as an anomaly. This identifies user health issues like depression or thyroid issues. Similarly, based on the observation in the dataset, the user generally relaxes for 12–20 min between 9:45 AM and 1:00 PM and in the evening for 35–40 min between 8 and 9 PM. The anomaly rate for relax activity is 15.04%; 437 anomalies are detected from 2904 observations. The user usually requires 12–20 min for preparation of a meal in the kitchen between 7–8 PM and 8–14 min in the morning around 8:45
298
P. V. Bala Suresh and K. Nalinadevi
Table 2 Evaluation metrics (anomaly rate) for SOM, DBSCAN, and GMM
Activity | DBSCAN (%) | GMM (%) | SOM (%)
Relax | 22.38 | 3.03 | 15.04 (8.78% long anomaly, 6.26% short anomaly)
Meal | 6.55 | 4.28 | 6.04 (0% short anomaly, 6.04% long anomaly)
Eating | 25.82 | 8.64 | 13.16 (8.63% short anomaly, 4.52% long anomaly)
Sleep | 6.11 | 7.22 | 5.26 (4.8% short anomaly, 0.46% long anomaly)
Work | 22.92 | 7.05 | 12.86 (9.55% long anomaly, 3.30% short anomaly)
Bed to toilet | 24.2 | 5.09 | 5.09 (3.18% long anomaly, 1.90% short anomaly)
Leave home | 6.36 | 3.28 | 6.57 (2.34% long anomaly, 4.22% short anomaly)
Enter home | 11.73 | 9.43 | 6.57 (1.41% long anomaly, 5.03% short anomaly)
AM and in the afternoon around 2:00 PM. The anomaly rate for meal preparation is 6.04%; 96 anomalies are detected from 1593 observations. A shorter-than-usual duration may mean undercooked food, and a longer one may lead to wasted food or pose a danger to the cooking appliance. GMM performs well at identifying abnormal behaviour in the eat, bed-to-toilet, and work activities. The user eats for 10–14 min between 1 and 2 PM and for 5–12 min in the morning around 9:30 AM and in the evening around 6:45 PM. The anomaly rate detected by the GMM model for the eat activity is 8.64%; 21 anomalies are detected from 243 observations. Most of the toilet activity is observed at night after the sleep activity has occurred. The user uses the toilet for 2–3 min between 12 and 5 AM and a few times around 2:30 PM. The anomaly rate for bed-to-toilet is 5.09%; 8 anomalies are detected from 157 observations. If the user uses the bathroom for longer than the duration learned by the model, the observation is marked as an anomaly, which may be caused by digestive issues or haemorrhoids. The DBSCAN clustering algorithm is not able to find the anomalies of the user because of its parameter constraints.
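The model selection and the anomaly-rate arithmetic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scores are transcribed from Table 1, the counts from the text, and the names `SILHOUETTE`, `best_model`, and `anomaly_rate` are hypothetical.

```python
# Silhouette scores transcribed from Table 1 (activity -> model -> score).
SILHOUETTE = {
    "Relax": {"DBSCAN": 0.017, "GMM": 0.31, "SOM": 0.49},
    "Meal": {"DBSCAN": 0.52, "GMM": 0.010, "SOM": 0.61},
    "Eat": {"DBSCAN": 0.22, "GMM": 0.67, "SOM": 0.64},
    "Sleep": {"DBSCAN": 0.67, "GMM": 0.58, "SOM": 0.83},
    "Work": {"DBSCAN": 0.48, "GMM": 0.56, "SOM": 0.50},
    "Bed to toilet": {"DBSCAN": 0.48, "GMM": 0.54, "SOM": 0.50},
    "Leave home": {"DBSCAN": 0.013, "GMM": 0.014, "SOM": 0.64},
    "Enter home": {"DBSCAN": 0.48, "GMM": 0.42, "SOM": 0.566},
}


def best_model(scores):
    """Return the model name with the highest silhouette score."""
    return max(scores, key=scores.get)


def anomaly_rate(n_anomalies, n_observations):
    """Anomaly rate in percent: anomalies divided by observations."""
    return 100.0 * n_anomalies / n_observations


# Best model per activity, chosen purely by silhouette score.
BEST = {activity: best_model(s) for activity, s in SILHOUETTE.items()}

# Anomaly counts and observation counts quoted in the text.
RATES = {
    "Relax (SOM)": anomaly_rate(437, 2904),
    "Meal (SOM)": anomaly_rate(96, 1593),
    "Eat (GMM)": anomaly_rate(21, 243),
    "Bed to toilet (GMM)": anomaly_rate(8, 157),
}

if __name__ == "__main__":
    for activity, model in BEST.items():
        print(f"{activity}: {model}")
    for label, rate in RATES.items():
        print(f"{label}: {rate:.2f}%")
```

Selecting by the highest score gives GMM for the eat, work, and bed-to-toilet activities and SOM for the remaining five, which matches the discussion. The computed rates agree with the reported ones up to rounding in the second decimal place.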
Abnormal Behaviour Detection in Smart Home Environments
6 Conclusion
The abnormal activities of the residents are identified for every activity, based on duration, by finding the best unsupervised learning model. It is observed that SOM and GMM perform better at identifying abnormal activity. The models can be further extended to order-based anomaly detection, which gives insight into the sequence of activities. In future, our aim is to work with multiple users in the smart home and with different novelty detection methods to recognize anomalies in smart homes dedicated to elderly people.
References
1. M. Dilraj, K. Nimmy, S. Sankaran, Towards behavioral profiling based anomaly detection for smart homes, in Proceedings of IEEE Region 10 Conference (TENCON) (2019), pp. 1258–1263. http://doi.org/10.1109/TENCON.2019.8929235
2. J. Hariharakrishnan, N. Bhalaji, Adaptability analysis of 6LoWPAN and RPL for healthcare applications of internet-of-things. J. ISMAC 3(02), 69–81 (2021)
3. Y.B. Hamdan, Smart home environment future challenges and issues: a survey. J. Electron. 3(01), 239–246 (2021)
4. J. Ye, G. Stevenson, S. Dobson, Detecting abnormal events on binary sensors in smart home environments. Pervasive Mob. Comput. 33, 32–49 (2016)
5. A. Lotfi, C. Langensiepen, S.M. Mahmoud, M.J. Akhlaghinia, Smart homes for the elderly dementia sufferers: identification and prediction of abnormal behaviour. J. Ambient Intell. Humanized Comput. 3(3), 205–218 (2012)
6. D.M. Menon, N. Radhika, Anomaly detection in smart grid traffic data for home area network, in 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, India (2016), pp. 1–4. http://doi.org/10.1109/ICCPCT.2016.7530186
7. C.P. Prathibhamol, G.S. Amala, M. Kapadia, Anomaly detection based multi label classification using Association Rule Mining (ADMLCAR), in 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (2016), pp. 2703–2707. http://doi.org/10.1109/ICACCI.2016.7732469
8. A. Grewal, M. Kaur, J. Park, A unified framework for behaviour monitoring and abnormality detection for smart home. Wirel. Commun. Mob. Comput. 1–16 (2019). https://doi.org/10.1155/2019/1734615
9. G. Ranganathan, Real life human movement realization in multimodal group communication using depth map information and machine learning. J. Innov. Image Process. (JIIP) 2(02), 93–101 (2020)
10. E. De-La-Hoz-Franco, P. Ariza-Colpas, J.M. Quero, M. Espinilla, Sensor-based datasets for human activity recognition: a systematic review of literature. IEEE Access 6, 59192–59210 (2018). https://doi.org/10.1109/ACCESS.2018.2873502
11. M. Novak, F. Jakab, L. Lain, Anomaly detection in user daily patterns in smart home environment. J. Sel. Areas Health Inform. 3(6), 1–11 (2013)
12. D. Arifoglu, A. Bouchachia, Activity recognition and abnormal behaviour detection with recurrent neural network. Procedia Comput. Sci. 110, 86–93. http://doi.org/10.1016/j.procs.2017.06.121
13. S.-C. Poh, Y.-F. Tan, S. Cheong, C. Ooi, W.H. Tan, Anomaly detection for home activity based on sequence pattern. Int. J. Technol. 10, 1276 (2019). http://doi.org/10.14716/ijtech.v10i7.3230
14. E. Vanem, A. Brandsæter, Unsupervised anomaly detection based on clustering methods and sensor data on a marine diesel engine. J. Mar. Eng. Technol. http://doi.org/10.1080/20464177.2019.1633223
15. D.J. Cook, Learning setting-generalized activity models for smart spaces. IEEE Intell. Syst. (99), 1 (2010). http://doi.org/10.1109/MIS.2010.112
16. T. Kohonen, The self-organizing map. Proc. IEEE 78, 1464–1480 (1990)
17. H. Haripriya, R. DeviSree, D. Pooja, P. Nedungadi, A comparative performance analysis of self organizing maps on weight initializations using different strategies, pp. 434–438. http://doi.org/10.1109/ICACC.2015.75
18. M. Ester, H.P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in Proceedings of KDD (1996), pp. 226–231
19. F. Wang, H.-H. Franco-Penya, J.D. Kelleher, J. Pugh, R. Ross, An analysis of the application of simplified silhouette to the evaluation of k-means clustering validity, in Lecture Notes in Computer Science (2017). http://doi.org/10.1007/978-3-319-62416-7_21
20. T. Cultice, D. Ionel, H. Thapliyal, Smart home sensor anomaly detection using convolutional autoencoder neural network, in 2020 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS) (2020), pp. 67–70. http://doi.org/10.1109/iSES50453.2020.00026
21. C. Stolojescu-Crisan, C. Crisan, B.-P. Butunoi, An IoT-based smart home automation system. Sensors 21(11), 3784 (2021). https://doi.org/10.3390/s21113784
Mapping UML Activity Diagram into Z Notation
Animesh Halder and Rahul Karmakar
Abstract The unified modeling language (UML) is widely used for modeling a system. It captures different views of the system. However, the semantics of UML is semi-formal and sometimes ambiguous. Z, on the other hand, is a formal specification language based on set theory and predicate logic that is used to prove the required properties of a system mathematically. In this paper, we propose rules that help to convert the semi-formal semantics of the activity diagram, which shows the dynamic aspect of a system, into formal Z notation, and we illustrate them with examples. We also present a case study on an ATM withdrawal system using the proposed rules. We design the system using a UML activity diagram and convert it into Z notation manually. These notations are then verified using the CZT tool support. This approach helps to design a reliable system by moving from a semi-formal specification to a formal specification.
Keywords UML · Activity diagram · Z notation · Formal methods · Translation rules · CZT
A. Halder (B) · R. Karmakar
Department of Computer Science, The University of Burdwan, Burdwan, India
R. Karmakar e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_23
1 Introduction
A model is a simplified view of a complex system or problem. When constructing a model, it is useful to capture only the important aspects of the system; if all aspects were captured in one model, it would be as complicated as the original problem. A model helps to reduce the complexity and design cost of a problem. It also makes tasks related to the problem, such as analysis, design, coding, and testing, easier. Models can be categorized into three types: graphical, textual, and mathematical. We can consider UML a graphical model, although it has textual explanations [1]. UML stands for unified modeling language. As the name suggests, it is a language for creating models using a defined syntax and semantics. The term unified is used
because it incorporates different views of a system through different diagrams. UML is very useful for documenting the design of a system. It gives an explicit view of the relationships among the different objects of a system, which helps when implementing the system in any programming language. UML provides five different views of a system, comprising nine diagrams. The main disadvantage of UML is that it is not formal; we can consider it a semi-formal method [1]. In a formal method, a specification is created using set theory and logic notation to state the requirements clearly; this is called a formal specification. Such a mathematical specification is less ambiguous, more complete, and more consistent than informal representations. Formal methods are very useful for safety-critical systems, where minor mistakes may cost lives or cause huge economic loss. A formal method translates informal facts into a more formal form. There are different formal specification languages, such as OCL and Z [2]. We choose Z for our work. There are many reasons for choosing Z; we discuss only a few of them. A system can be easily modeled in Z notation because it supports inheritance, local variable declarations, and state-change operations. The Z model of a system implies a detailed design of the system. The notation of Z is easy to learn, and Z is supported by tools [3]. The main contribution of this paper is to define mapping rules that map the components of the activity diagram to Z notation and to apply these rules in a case study of an ATM withdrawal system. The rest of this paper is structured as follows. Section 2 presents a systematic literature review. Section 3 gives an overview of the UML activity diagram and Z notation. In Sect. 4, translation rules are defined with examples, followed by a case study in Sect. 5 and the conclusion in Sect. 6.
2 Systematic Literature Review
In this section, we present a systematic literature review. We went through many works in the literature and selected the relevant ones. Some papers are case studies, one concerns only activity diagrams, and the rest concern the formalization of UML diagrams. These works are arranged in Table 1. A considerable amount of work has been done in this area. Still, one can contribute to it, because critical system design needs ever more correctness from the early phases of software development. Our proposed work, mapping the activity diagram into Z, contributes to the present state of the art. Many other formal methods, such as Event-B [17], are used in this domain. Different UML diagrams and graphical notations have been converted into Event-B notation [18]. Event-B and model checking with linear temporal logic (LTL) [19] have been used to design critical systems such as airbag systems [20], automatic pump controllers [21], and smart irrigation systems [22].
Table 1 Systematic literature review
Reference | Objectives and topics | Domain | Type
Abbas et al. [4] | It presents a formal transformation of the UML activity diagram into FoCaLiZe and proposes a functional semantics for this purpose. Zenon, the automatic theorem prover of FoCaLiZe, is used to prove the derived theorems | UML activity diagram | Translation tool
Huang et al. [5] | It identifies mapping rules between the activity diagram and Petri nets. It proposes a transformation algorithm that transforms an activity diagram into its corresponding Petri net, and the equivalence between the two is also shown | UML activity diagram | Component mapping
Khan et al. [6] | Formal modeling of air traffic control (ATC) signals is done using Z notation. A formal specification of the different light signals is given. It helps to achieve minimum flight delay and collision-free landing and takeoff of aircraft | Air traffic control (ATC) signal | Case study
Zafar [7] | It presents the formal specification of a train control system using Z notation. Graph theory is used to design the static components of the system, but defining the entire interlocking system involves a lot of complexity. Using the formal method with Z notation makes the system easier and foolproof | Automated train control system | Case study
Singh et al. [8] | It presents a formal analysis of the road traffic management system using Z notation to make it more accurate, but it analyzes only the traffic police scenario | Road traffic management system | Case study
Kumar and Goel [9] | A conceptual model of the ATM system and its formal specification in Z notation are presented. Verification and authentication of a user are the main focus, in order to make the system unambiguous and reliable | Automated teller machine (ATM) system | Case study
Zafar et al. [10] | A formal procedure is given to transform an NFA to Z. First a string accepter is designed, then a language accepter, and finally an NFA accepting the union of two regular languages. It helps to increase the modeling power for complex systems | NFA | Theoretical
Ansari and Al-shabi [11] | It presents complete modeling of a traffic accident reporting system (TARS) through UML using GIS with a bar chart. A class diagram, a sequence diagram, and an activity diagram of TARS are drawn | Traffic accident reporting system (TARS) | Case study
Evans et al. [12] | It presents a roadmap to formalize UML using Z. The precise UML (PUML) concept is introduced to develop a precise semantic model for UML diagrams that supports formal deduction about them. It presents only a small example of a valid deduction from one class diagram to another | UML class diagram | Theoretical
Borges and Mota [13] | It highlights the transformation of UML class diagrams into the formal specification language OhCircus, although not all elements are formalized. Refinement in UML is also analyzed with the help of the refinement theory of Z in OhCircus | UML class diagram | Component mapping
Sengupta and Bhattacharya [14] | It presents a structured and expandable format for the UML use case diagram, a Z notation schema derived from this format, and an ER diagram derived from this schema, to achieve automated traceability and verification in the design phase | UML use case diagram | Theoretical case study
Dupuy et al. [15] | It presents an approach to produce a formal specification in Z from an annotated UML class diagram. For this purpose, it introduces the RoZ tool, which uses the Rational Rose tool as its basis. This work helps to decrease designer effort | UML class diagram | Tool-based case study
Jamal and Zafar [16] | It presents the formalization of some basic building blocks of the UML 2.5 activity diagram and their structural semantics in Z notation. The resulting formal semantics has been checked using the Z/EVES toolset | UML activity diagram | Theoretical case study
3 Background
The behavioral view of UML shows the dynamic aspect of a system. It describes what happens in a system and shows how objects interact with each other over time [1].
3.1 Activity Diagram
The activity diagram belongs to the behavioral view. It shows the flow of action from one activity to another and represents the various activities and their sequence. We can consider this diagram an advanced version of the flow chart. It is normally used in business process modeling. It helps to understand complex processing activities and to develop interaction diagrams [1].
Basic notation of activity diagram
Initial state or start point: It is a small filled circle. Every activity diagram begins with this notation, shown in Fig. 1.
Activity or action state: It is a rectangle with rounded corners. It represents a non-interruptible action of the object. It is shown in Fig. 2.
Action flow: It is an arrow. It shows the flow of the object from one activity to another. It is represented in Fig. 3.
Decision symbol: It is diamond-shaped, as shown in Fig. 4. It is used as a test condition.
Guard symbol: It is a statement written in square brackets beside an arrow, as in Fig. 5. It states the condition under which the flow takes a given direction.
Fig. 1 Initial state
Fig. 2 Action state
Fig. 3 Action flow
Fig. 4 Decision symbol
Fig. 5 Guard
Fork, join, and synchronization: Fork and join are both drawn as a slightly thicker straight line. A fork node splits a single flow into multiple parallel activities, whereas a join node joins parallel activities into a single flow. When a fork node and a join node are used together, it is called synchronization. This is shown in Fig. 6.
Time event: It is shaped like an hourglass, as shown in Fig. 7. It implies that the flow is stopped for some time.
Merge event: It merges nonconcurrent flows, as shown in Fig. 8.
Sent and received signals: These two signals represent the modification of activities from outside the system. Usually, the two signals appear in pairs. The received-signal activity cannot be completed until a response is received. This is represented in Fig. 9.
Fig. 6 Synchronization
Fig. 7 Time
Fig. 8 Merge
Fig. 9 Sent and received signal
Flow final node: It is a circle with a cross, as shown in Fig. 10. A flow that reaches this node is terminated.
Endpoint symbol: It is a black filled circle with an outer border, as shown in Fig. 11. Every activity diagram ends at this node.
Fig. 10 Flow final
Fig. 11 End point
3.2 Z Notation
Z is a specification language that uses sets, relations, and functions within the context of first-order predicate logic to build schemas. It is a mature, model-based, nonexecutable notation. Z was developed in the late 1970s through collaborative projects between Oxford University and industrial partners including IBM and Inmos. The first reference manual of Z appeared in 1989. Z is not a programming language; it was designed for people, not for machines. A Z specification is mainly organized as a set of schemas. A schema is a box-like structure whose first part is used to declare variables and functions and to inherit other schemas, and whose second part states the predicates over them. Z is supported by tools. We use the CZT tool for our work [2, 3, 8].
General schema format in Z
Set definition: In this section, we define the different sets that will be used in the following parts. We cannot use an undefined set.
Set_name ::= element1 | element2 | …
Figure 12 represents the state-space schema. Here, the invariants are a set of conditions that must not be violated in any situation for this system; we can call them safety conditions. The initial schema, shown in Fig. 13, is needed only for initialization. Figure 14 represents the operational schema.
Fig. 12 State-space schema
Fig. 13 Initial state schema
Fig. 14 Operational schema
Here, the capital delta sign (Δ) indicates an update operation. If only a read operation is needed, then the schema structure remains the same but Δ is replaced by Ξ (capital xi).
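The three kinds of schema can be mirrored in ordinary code to make the roles of the invariant, Δ, and Ξ concrete. The following is only an illustrative sketch under our own assumptions, not Z and not CZT output: the names `CounterState`, `init_state`, `increment`, and `remaining`, and the invariant 0 ≤ value ≤ limit, are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CounterState:
    """Hypothetical state-space schema: declarations plus an invariant."""
    value: int
    limit: int

    def __post_init__(self) -> None:
        # Predicate part: the invariant must hold in every state.
        if not 0 <= self.value <= self.limit:
            raise ValueError("invariant violated: 0 <= value <= limit")


def init_state() -> CounterState:
    """Initial-state schema: only fixes the starting values."""
    return CounterState(value=0, limit=100)


def increment(state: CounterState, amount: int) -> CounterState:
    """Delta-style operation: builds the after-state from the before-state."""
    # replace() re-runs __post_init__, so the invariant is re-checked.
    return replace(state, value=state.value + amount)


def remaining(state: CounterState) -> int:
    """Xi-style operation: reads the state and leaves it unchanged."""
    return state.limit - state.value


if __name__ == "__main__":
    s0 = init_state()
    s1 = increment(s0, 30)
    print(s0, s1, remaining(s1))
```

Making the state immutable keeps the before-state and the after-state distinct, which is the closest everyday analogue of the unprimed and primed variables of a Δ schema.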
4 Translation Rules
In this section, we define the rules that map the meaning of the activity diagram components into Z notation. We explain these rules with examples.
Rule 1: The initial node of an activity diagram can be represented in Z either as a state-space schema, where the invariants are written, or as an initial state schema, where the state variables are initialized. It is represented in Table 2.
Rule 2: We can represent the action state of an activity diagram in Z as an operational schema. An action state can include more than one action; similarly, more than one operation can be performed in an operational schema. It is represented in Table 3.
Rule 3: We consider the action flow of the activity diagram as an implication in Z.
Rule 4: A decision symbol between two action states can be considered a precondition in Z.
Rule 5: The guard symbol helps to state the preconditions.
Rules 3 to 5 can be shown together in one example, which is represented in Table 4.
Rule 6: A merge node can be expressed in Z as: condition1 ∨ condition2 ⇒ postcondition. It is represented in Table 5 with an example.
Rule 7: The function of a fork node can be described in Z as: condition ⇒ event1 ∧ event2.
Rule 8: The function of a join node can be described in Z as: condition1 ∧ condition2 ⇒ event.
When a fork node and a join node are used together, it is called synchronization. An example of synchronization is represented in Table 6.
Rule 9: To show the time event of the activity diagram, we use an integer variable in Z.
Rule 10: The function of the flow final node is automatically supported by Z, since one operation schema does not affect another.
Table 2 Translation of initial node
Component of activity diagram | Corresponding Z notation
[figure] | Value = 0
 | Limit = 100
Table 3 Translation of the operational schema
Component of activity diagram | Corresponding Z notation
[figure] | msg! = user already exist
Table 4 Translation of implication, decision symbol, and guard symbol
Components of activity diagram | Corresponding Z notation
[figure] | user? ∉ username ⇒ msg! = invalid user
Table 5 Translation of precondition and postcondition
Components of activity diagram | Corresponding Z notation
[figure] | user? ∉ username ∨ currentpass ≠ users(user?) ⇒ msg! = invalid user/password
Table 6 Translation of join node
Components of activity diagram | Corresponding Z notation
[figure] | (user? ∉ username) ⇒ username′ = username ∪ {user?} ∧ users′ = users ∪ {user? ↦ pass?} ∧ msg! = user added successfully
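The logical shapes of Rules 3 and 6 to 8, and the predicate of Table 6, can be checked informally with a few lines of code. This is a hedged sketch rather than part of the proposed method: the helper names are hypothetical, and the message returned when the user already exists is borrowed from Table 3 as an assumption, since the Table 6 predicate does not specify that case.

```python
from typing import Dict, Set, Tuple


def implies(p: bool, q: bool) -> bool:
    """Material implication p => q, the reading given to an action flow (Rule 3)."""
    return (not p) or q


def merge_rule(c1: bool, c2: bool, post: bool) -> bool:
    """Rule 6: condition1 or condition2 => postcondition."""
    return implies(c1 or c2, post)


def fork_rule(cond: bool, e1: bool, e2: bool) -> bool:
    """Rule 7: condition => event1 and event2."""
    return implies(cond, e1 and e2)


def join_rule(c1: bool, c2: bool, event: bool) -> bool:
    """Rule 8: condition1 and condition2 => event."""
    return implies(c1 and c2, event)


def add_user(usernames: Set[str], users: Dict[str, str], user: str,
             password: str) -> Tuple[Set[str], Dict[str, str], str]:
    """Operational reading of the Table 6 predicate.

    Returns the after-state (username', users') and the output msg!.
    The else branch is an assumption; Table 6 leaves it unspecified.
    """
    if user not in usernames:
        return (usernames | {user}, {**users, user: password},
                "user added successfully")
    return usernames, users, "user already exist"


if __name__ == "__main__":
    print(fork_rule(True, True, False))   # fork violated: one event missing
    print(add_user(set(), {}, "alice", "secret"))
```

Each rule function returns whether a given combination of truth values satisfies the corresponding predicate, so a fork whose condition holds while only one of its events occurs evaluates to false.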
5 A Case Study on ATM Withdrawal System
5.1 Activity Diagram of ATM Withdrawal System
Figure 15 represents the overall activity diagram of the ATM withdrawal system. All the operations are shown as different activities.
5.2 Z Notation of ATM Withdrawal System
We write Z notations for the activity diagram of the ATM withdrawal system and verify these notations in the CZT tool, as shown in Fig. 16.
Fig. 15 Overall activity diagram of the ATM system
Fig. 16 Proofs in CZT tool
5.3 Explanations
According to Rule 1, the initial node of this activity diagram is represented in Z notation as a state-space schema named Withdrawal. Here, we declare three partial functions and one invariant. The accounts function maps a card number to an account number, the pins function maps a card number to a PIN, and the amounts function maps a card number to the amount held in the account.
Withdrawal
  accounts : ℕ ⇸ ℕ
  pins : ℕ ⇸ ℕ
  amounts : ℕ ⇸ ℕ
  --------
  dom accounts = dom pins = dom amounts
The invariant says that the domains of these three functions are the same, because one card number corresponds to exactly one account, one PIN, and one amount. The next activity is card checking, or validation. For this activity, we use an operational schema named Cardcheck.
Cardcheck
  Withdrawal
  card? : ℕ
  timer : TIMER
  transaction : TRANSACTION
  msg! : MESSAGES
  --------
  card? ∉ dom accounts ⇒ transaction = stop ∧ msg! = cardisinvalid
  card? ∈ dom accounts ⇒ transaction = continue ∧ msg! = enterthepin ∧ timer = on
Here, we inherit the Withdrawal schema. According to our diagram, we check the validity of the card. We take the card as input and check its existence. If it does not belong to the domain of the accounts function, the card is invalid: the message "card is invalid" is printed and the transaction is stopped. If the card is valid, the system asks for a PIN and allows some time for entering it. If the time runs out, the transaction is stopped. To represent this feature, we introduce a variable "timer." The timer being on means that counting has started. If typing starts within the time limit, the timer is switched off. To express these features, we design a Timecount schema, where we use Rule 9.
Timecount
  timer : TIMER
  time : ℕ
  transaction : TRANSACTION
  typing : TYPE
  --------
  timer = on ⇒ time = time + 1
  time = 10 ⇒ timer = off ∧ transaction = stop
  time < 10 ∧ typing = start ⇒ timer = off ∧ transaction = continue
Here, timer = on implies that the variable time is incremented from 0; when it reaches 10, the timer is switched off and the transaction is stopped. If typing starts within this interval, the timer is switched off but the transaction is not stopped. After the PIN is entered, the system checks its validity. For this purpose, we design the Pincheck schema.
Pincheck
  Cardcheck
  pin? : ℕ
  --------
  pin? ≠ pins(card?) ⇒ transaction = stop ∧ msg! = pindoesnotmatch
  pin? = pins(card?) ⇒ transaction = continue ∧ msg! = enteramount ∧ timer = on
In this schema, we inherit the Cardcheck schema. We take the PIN as an input. If the entered PIN is equal to the PIN corresponding to the card number, then it is a valid PIN; otherwise it is not. If the PIN is invalid, the message "pin does not match" is printed and the transaction stops. But if it is valid, then the system asks for the value
of the amount that the user wants to withdraw, and the timer is switched on. The timer has already been discussed above. When the amount is entered, the system checks whether it is greater than the limit. For this, we design the Amountcheck schema.
Amountcheck
  Pincheck
  amount? : ℕ
  --------
  amount? > 10000 ⇒ transaction = stop ∧ msg! = amountcrossthelimit
  amount? ≤ 10000 ⇒ transaction = continue
Here, we set the limit value to 10,000. The amount entered by the user is a natural number. If the amount is less than or equal to 10,000, then the system checks two things at the same time:
1. Is the requested amount present in the account?
2. Is the requested amount present in the ATM?
For parallel activities, we use synchronization. To represent these activities, we design the Withdrawaling schema.
Withdrawaling
  Amountcheck
  atmbalance : BALANCE
  cash : CASH
  --------
  atmbalance = high ∧ amounts(card?) ≥ amount? + 1000 ⇒ cash = dispense
  atmbalance = low ∨ amounts(card?) < amount? + 1000 ⇒ transaction = stop
In this schema, we implement the fork and join nodes using Rule 7 and Rule 8. According to the diagram, cash is dispensed only if both conditions are satisfied; this is implemented in the first predicate. If either condition, or both, is not satisfied, then the transaction is stopped; this is implemented in the second predicate.
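The chain of schemas above can be read operationally as a sequence of checks. The sketch below is an informal illustration under stated assumptions, not the verified Z specification: the timer of the Timecount schema is omitted, the ATM balance is reduced to a boolean, the constant names and the two final messages ("cash dispensed", "withdrawal refused") are hypothetical, and the other messages are spelled out from the schema constants.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

LIMIT = 10_000   # per-withdrawal limit from the Amountcheck schema
MARGIN = 1_000   # the "+ 1000" term in the Withdrawaling schema


@dataclass(frozen=True)
class Withdrawal:
    """State-space schema: three maps keyed by card number."""
    accounts: Dict[int, int]
    pins: Dict[int, int]
    amounts: Dict[int, int]

    def __post_init__(self) -> None:
        # Invariant: dom accounts = dom pins = dom amounts
        same = self.accounts.keys() == self.pins.keys() == self.amounts.keys()
        if not same:
            raise ValueError("invariant violated: domains differ")


def withdraw(state: Withdrawal, card: int, pin: int, amount: int,
             atm_balance_high: bool) -> Tuple[str, str]:
    """Apply the checks in schema order; return (transaction, message)."""
    # Cardcheck: card? must be in dom accounts.
    if card not in state.accounts:
        return "stop", "card is invalid"
    # Pincheck: pin? must equal pins(card?).
    if pin != state.pins[card]:
        return "stop", "pin does not match"
    # Amountcheck: amount? must not exceed the limit.
    if amount > LIMIT:
        return "stop", "amount cross the limit"
    # Withdrawaling: join of the two parallel checks (Rules 7 and 8).
    account_ok = state.amounts[card] >= amount + MARGIN
    if atm_balance_high and account_ok:
        return "continue", "cash dispensed"      # hypothetical message
    return "stop", "withdrawal refused"          # hypothetical message


if __name__ == "__main__":
    atm = Withdrawal(accounts={1: 10}, pins={1: 1234}, amounts={1: 5000})
    print(withdraw(atm, 1, 1234, 4000, True))
    print(withdraw(atm, 1, 1234, 4001, True))
```

Returning as soon as a check fails mirrors the way each schema sets transaction = stop, so that later schemas in the inheritance chain are reached only when every earlier predicate leaves the transaction in the continue state.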
6 Conclusion
The proposed rules map the UML activity diagram components into Z notation. We applied these rules in the ATM withdrawal system case study, manually deriving the Z schemas from the activity diagram. The correctness of the system is verified using the Z tool support CZT. This work will be helpful for converting semi-formal notations into formal ones. It indeed helps to design and verify the
critical software. However, we have not defined rules for every component or symbol of the activity diagram in this paper; this is ongoing work. In the future, we shall propose rules for the remaining components and build tool support that automatically converts an activity diagram into Z notation.
References
1. R. Mall, Fundamentals of Software Engineering, 5th Revised edn. (PHI Learning, Delhi, 2018)
2. R.S. Pressman, Software Engineering: A Practitioner’s Approach, 8th edn. (McGraw-Hill Education, New York, 2015)
3. J. Jacky, The Way of Z: Practical Programming with Formal Methods (Cambridge University Press, Cambridge, 1996)
4. M. Abbas, R. Rioboo, C.-B. Ben-Yelles, C.F. Snook, Formal modeling and verification of UML activity diagrams (UAD) with FoCaLiZe. J. Syst. Archit. 114, 101911 (2021). http://doi.org/10.1016/j.sysarc.2020.101911
5. E. Huang, L.F. McGinnis, S.W. Mitchell, Verifying SysML activity diagrams using formal transformation to Petri nets. Syst. Eng. 23(1), 118–135 (2020). https://doi.org/10.1002/sys.21524
6. N.A. Khan, F. Ahmad, S. Yousaf, S.A. Khan, Formal modeling of ATC signals using Z notation, in International Conference on Open Source Systems and Technologies (2012), p. 4
7. N.A. Zafar, Modeling and formal specification of automated train control system using Z notation, in 2006 IEEE International Multitopic Conference, Islamabad, Pakistan, Dec 2006, pp. 438–443. http://doi.org/10.1109/INMIC.2006.358207
8. M. Singh, A.K. Sharma, R. Saxena, Towards the formalization of road traffic management system for safety critical properties by Z notation, in 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), Greater Noida, Delhi, India, Oct 2015, pp. 1516–1521. http://doi.org/10.1109/ICGCIoT.2015.7380707
9. M.S. Kumar, S. Goel, Specifying safety and critical real-time systems in Z, in 2010 International Conference on Computer and Communication Technology (ICCCT), Allahabad, Uttar Pradesh, India, Sept 2010, pp. 596–602. http://doi.org/10.1109/ICCCT.2010.5640473
10. N. Zafar, N. Sabir, A. Ali, Formal transformation from NFA to Z notation by constructing union of regular languages. Int. J. Math. Models Methods Appl. Sci. 3, 70–75 (2008)
11. D.G.A. Ansari, D.M. Al-shabi, Modeling of traffic accident reporting system through UML using GIS. Int. J. Adv. Comput. Sci. Appl. IJACSA 3(6), 55/12 (2012). http://doi.org/10.14569/IJACSA.2012.030606
12. R. France, A. Evans, K. Lano, B. Rumpe, The UML as a formal modeling notation. Comput. Stand. Interfaces 19(7), 325–334 (1998). https://doi.org/10.1016/S0920-5489(98)00020-8
13. R.M. Borges, A.C. Mota, Integrating UML and formal methods. Electron. Notes Theor. Comput. Sci. 184, 97–112 (2007). https://doi.org/10.1016/j.entcs.2007.03.017
14. S. Sengupta, S. Bhattacharya, Formalization of UML use case diagram: a Z notation based approach, in 2006 International Conference on Computing & Informatics, Kuala Lumpur, Malaysia, June 2006, pp. 1–6. http://doi.org/10.1109/ICOCI.2006.5276507
15. S. Dupuy, Y. Ledru, M. Chabre-Peccoud, An overview of RoZ: a tool for integrating UML and Z specifications, in Active Flow and Combustion Control 2018, vol. 141, ed. by R. King (Springer International Publishing, Cham, 2000), pp. 417–430. http://doi.org/10.1007/3-540-45140-4_28
16. M. Jamal, N.A. Zafar, Formalizing structural semantics of UML 2.5 activity diagram in Z notation, in International Conference on Open Source Systems and Technologies (2016), p. 6
17. R. Karmakar, B. Biman Sarkar, N. Chaki, System modeling using event-B: an insight, in Proceedings of the 2nd International Conference on Information Systems & Management Science (ISMS) 2019, Tripura University, Agartala, Tripura, India. Available at SSRN: https://ssrn.com/abstract=3511455 or http://doi.org/10.2139/ssrn.3511455
18. R. Karmakar, B.B. Sarkar, N. Chaki, Event ordering using graphical notation for event-B models (2020). http://doi.org/10.1007/978-3-030-47679-3_32
19. R. Karmakar, Symbolic model checking: a comprehensive review for critical system design, in Proceedings of International Conference on Data & Information Science (ICDIS2021) held on May 14–15, Accepted for publication in Lecture Notes in Network and Systems (Springer, Berlin) ISSN: 2367-3370
20. S. Guha, A. Nag, R. Karmakar, Formal verification of safety-critical systems: a case-study in airbag system design, in Intelligent Systems Design and Applications (2021), pp. 107–116. http://doi.org/10.1007/978-3-030-71187-0_10
21. R. Karmakar, B.B. Sarkar, N. Chaki, Event-B based formal modeling of a controller: a case study, in Proceedings of International Conference on Frontiers in Computing and Systems, Singapore (2021), pp. 649–658. http://doi.org/10.1007/978-981-15-7834-2_60
22. R. Karmakar, B.B. Sarkar, A prototype modeling of smart irrigation system using event-B. SN Comput. Sci. 2(1), 36 (2021). https://doi.org/10.1007/s42979-020-00412-8
Exploring Sleep Deprivation Reason Prediction
Dhiraj Kumar Azad, Kshitiz Shreyansh, Mihir Adarsh, Amita Kumari, M. B. Nirmala, and A. S. Poornima
Abstract Today, more than half of the world is suffering from depression and mental health problems, and the reason in the majority of cases is low-quality sleep. Sleep deprivation can have many causes, such as fatigue, high blood pressure, and mood swings, and its effects include drowsiness, low performance, and mental health problems. The idea is to develop a solution for an Android smartphone/smartwatch fitness application that uses an individual user’s smartwatch activity data (fitness, BG/BP measurements, sleep data, sleep patterns, and sleep scores for previous days) to predict the reasons for sleep deprivation on the present day using machine learning and deep learning techniques. The proposed Android-based sleep tracker system reports the reasons for sleep deprivation so that the user can improve sleep quality.
Keywords Sleep deprivation · Sleep score prediction · Deep learning · Machine learning
D. K. Azad · K. Shreyansh · M. Adarsh · A. Kumari · M. B. Nirmala (B) · A. S. Poornima
Department of Computer Science Engineering, Siddaganga Institute of Technology, Tumkur, Karnataka, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_24

1 Introduction
Sleep deprivation is a condition in which a person does not get enough quality sleep to support alertness, health, and overall performance. It can vary widely in severity and can become chronic. Acute sleep deprivation occurs when a person sleeps less than normal, or does not sleep properly, for a short period of time. Chronic sleep deprivation is related to the condition called insomnia. Although insomnia and chronic sleep deprivation both reduce the amount and/or quality of sleep and impair waking performance, they differ in the ability to sleep. The articles [1, 2] show that university students are exposed to various risk factors for poor sleep quality, which gradually leads to frustration and stress. Most people fail to get enough quality sleep because of the demands of tasks that require sustained concentration, and this adversely affects their well-being, health, and ability to perform daily work. The papers [3, 4] give a complete analysis of how sleep deprivation affects the performance of an individual. The right amount of sleep differs from one person to another, but the Centers for Disease Control and Prevention (CDC) suggests that adults should get at least 7 h of sleep. It is necessary to consider both the amount and the quality of sleep. If a person sleeps poorly, he feels tired the next day. Fatigue due to sleep deprivation poses a significant risk to workers’ safety, public safety, and productivity at work. Low-quality sleep can include:
• Waking up several times a night
• Respiratory problems, such as sleep apnea
• A very hot, cold, or noisy place
• An uncomfortable bed.
Problems faced by people suffering from sleep deprivation over time may include:
• Increased risk of mental illness and depression
• Increased risk of pneumonia and stroke
• Disorders such as insomnia and sleep apnea
• Insanity.
2 Literature Survey
The authors of [5] discuss predicting sleep quality from wearable data using deep learning. Inadequate sleep impairs emotional, mental, and physical health and gradually leads to many chronic health problems. Sleep is highly correlated with a person’s physical activity, and both can be recorded using wearable medical devices called actigraphy sensors. The study focuses on advanced deep learning methods and finds that deep learning models are capable of predicting sleep quality from the awake periods. The deep learning models outperformed logistic regression. The convolutional neural network (CNN) had the best sensitivity and specificity, with an overall area under the receiver operating characteristic (ROC) curve (AUC) of 0.9449, which is 46% higher than that of conventional logistic regression (0.6463).
The authors of [6] discuss how the stages of sleep are classified. They followed two approaches to machine classification: random forest (RF) classification based on features, and artificial neural networks (ANNs) working with both features and raw data. The methods were tested on healthy young males and on patients suffering from hypersomnia and narcolepsy. Deep learning models such as CNNs and residual networks were also applied. The results showed that classification accuracy increased during the training of the ANNs, and feature-based LSTM networks showed good convergence. The authors used Cohen’s kappa as the metric. Kappa is a number less than or equal to 1 (it may be negative), where 1 implies perfect classification; values greater than 0.80 are regarded as the best.
The authors of [7] state that sleep-wake scoring is the first step in detecting low-quality sleep. They present an end-to-end deep learning architecture based on wrist accelerometry, called Deep-ACTINet, which scores sleep-wake states automatically from noise-cancelled raw activity signals recorded during sleep, without a feature engineering step. Deep-ACTINet is compared with two traditional fixed-model sleep-wake scoring algorithms. The authors collected data from 10 people who wore three-axis accelerometer wristband sensors for 8 h in bed. The sleep recordings were analyzed with Deep-ACTINet and with conventional approaches such as a feature-based convolutional neural network, Naïve Bayes, LSTM, and random forest. Deep-ACTINet obtained the best results, with an average accuracy of 89.65%, precision of 92.09%, and recall of 92.99%. These figures exceed those of the fixed-model methods and the feature-based machine learning algorithms by about 4.05% and 4.74%, respectively.
The authors of [8] reviewed work on sleep apnea, a sleep-related disorder that significantly affects the population. Broadly, the reviewed works use three principal classes of deep neural networks: recurrent neural networks (RNNs), deep vanilla neural networks (DVNNs), and convolutional neural networks (CNNs).
The CNN was the most widely used classifier, in both its CNN1D and CNN2D forms. An RCNN with a spectrogram representation achieved high accuracy. Thus, RNN outperformed CNN.
To understand sleeping patterns among adolescents, the authors of [9] carried out a population-based study. The dataset consisted of 10,220 adolescents aged 15 to 18 years (54% girls). The self-reported sleep measurements comprised sleep duration, sleep efficiency, time in bed, bedtime, rise time, wake after sleep onset, sleep onset latency, the rate, frequency, and duration of difficulties initiating and maintaining sleep, and sleepiness and tiredness. Girls showed longer sleep onset latency and a higher frequency of sleep disorders than boys. The authors concluded that insomnia was common among adolescents.
The authors of [10] describe the short- and long-term health consequences of sleep disruption. Various factors play a significant role in sleep disruption, ranging from environmental factors and lifestyle to sleep disorders and other conditions. Sleep disruption has adverse short- and long-term health consequences. For adolescents, school performance, psychosocial health, and risk-taking behaviors are affected by sleep disruption. In children, sleep disruption is associated with behavioral problems and impaired cognitive functioning. In otherwise healthy adults, the short-term consequences of sleep disruption include emotional distress, increased stress and mood disorders, somatic pain, reduced quality of life, and deficits in cognition, memory, and performance.
Long-term consequences of sleep disruption in otherwise healthy individuals include weight-related issues, type 2 diabetes, hypertension, dyslipidemia, metabolic syndrome, and colorectal cancer.
The paper [11] addresses privacy preservation for collected data, since the large amounts of data being produced may be exploited. Many methods exist for protecting such data, but they are only partially effective. The proposed work provides an effective solution by applying a perturbation algorithm to big data by means of an optimal geometric transformation. The method was tested and examined with five classification algorithms and nine datasets. The analysis indicated that the proposed work performs much better than existing privacy-preservation models.
The article [12] proposes an ML model for the early detection of depression, which can help prevent mental illness and suicide. A good level of accuracy is achieved by combining the Naïve Bayes algorithm with a support vector machine. The classification model contains various cumulative distribution parameters, which have to be identified and classified dynamically. The features are obtained from semantic, writing, and textual content. The performance of various deep learning (DL) methods for early detection is compared. A hybrid method gives good results for early prediction and retains better robustness. The study can also help to develop new ideas for the early prediction of different emotions felt by people.
The main goal of the system in paper [13] is to offer users personalized recommendations based on their preferences. Location and orientation play an important role in determining the user’s preference accurately. The paper incorporates a recommender system whose algorithm takes the context of the user into consideration.
The user’s preference is evaluated with the help of IoT smart devices such as phones and smartwatches.
There are various existing sleep tracking applications. The article [14] describes the working of the Instant app by Emberify. Instant automatically tracks the time a person spends on the phone and in apps, as well as fitness, sleep, etc. The key features of this app are as follows:
• Tracking of phone usage time (screen time) and the number of times the phone is unlocked.
• Integration with Fitbit, S Health, and Google Fit to track your fitness and travel time.
• Widget for quick tracking.
• Automatic sleep tracking, and many more.
The article [15] is on Samsung Health, which is used to track various aspects of daily life that contribute to well-being, such as diet, physical activity, and sleep. The key features of this application are as follows:
• Setting user goals
• Pedometer
• Activity tracking
• Dietary monitoring
• Weight tracking
• Sleep monitoring, and many others.
Finally, a comparison of the accuracy of the different models in papers [5], [7], and [8] is given in Table 1.

Table 1 Comparison of accuracy of various models
Paper  Model used                    Accuracy  Dataset used
[5]    Logistic regression           0.732     1 week data of 92 adolescents
[5]    CNN                           0.926     1 week data of 92 adolescents
[5]    RNN                           0.660     1 week data of 92 adolescents
[5]    LSTM-RNN                      0.787     1 week data of 92 adolescents
[7]    CNN                           0.837     Data collected from 10 healthy volunteers
[7]    Naïve Bayes                   0.809     Data collected from 10 healthy volunteers
[7]    Linear discriminant analysis  0.849     Data collected from 10 healthy volunteers
[8]    CNN                           0.908     SCSMC86 [16]
[8]    RCNN                          0.882     Massachusetts General Hospital data containing 10,000 samples
[8]    LSTM                          0.890     AED [17]
3 Methods
Wearables like smartwatches and fitness wristbands are becoming more and more popular, which has given rise to the cultural phenomenon of the “Quantified Self”. Self-quantification covers measures such as calories burnt, step count, and quality of sleep. These devices are very easy to use and have made tracking data very simple. Article [18] gives a step-by-step process of sleep score analysis. Steps:
1. Understanding sleep score.
2. Gathering data.
3. Data preprocessing.
4. Data analysis.
5. Dividing the data into training and test set.
6. Multiple linear regression (MLR).
7. Random forest regression.
8. Gradient boosting regression.
9. Final model evaluation.
10. TensorFlow and Keras.
3.1 Understanding Sleep Score
The Fitbit overall sleep score is obtained by combining three other scores: sleep duration, sleep depth, and revitalization. Overall night sleep quality is based on the heart rate, restlessness or time spent awake, and the sleep stages (REM, light, and deep). Most people score between 72 and 83. The sleep score values signify:
• Poor: Less than 60
• Fair: 60–79
• Good: 80–89
• Excellent: 90–100.
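The score bands above can be written as a small lookup function. This is only a sketch of the banding rule as listed; the function name is hypothetical and it is not taken from Fitbit or from the authors' application.

```python
def sleep_score_category(score: float) -> str:
    """Map a 0-100 sleep score to the qualitative bands listed above."""
    if not 0 <= score <= 100:
        raise ValueError("sleep score must lie between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 80:
        return "Good"
    if score >= 60:
        return "Fair"
    return "Poor"


# Print the band for a few example scores.
for example_score in (55, 72, 83, 95):
    print(example_score, sleep_score_category(example_score))
```

With this banding, the commonly observed range of 72 to 83 straddles the "Fair" and "Good" bands.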
Sleep score components are as follows:
• Sleep Duration (Duration Score): refers to the length of sleep and compares it with the average bedtime and wake-up goals.
• Sleep Depth (Composition Score): obtained from how much time you spend in deep and REM sleep relative to your benchmarks. Each sleep stage plays an important role in overall health. The three sleep stages are elaborated below:
Light: The stage in which a person generally spends most of the time. It starts when a person first falls asleep but occurs throughout the night.
Deep: The stage in which a person is very relaxed, and often the one responsible for how well-rested he feels on the following day. It helps with memory and learning, and it is also believed to support the immune system.
REM: Dreams happen during this stage of sleep. It plays a major role in learning, memory, and mood regulation.
• Revitalization Score: tracking the oxygen saturation level of the blood during the night helps to detect sleep disturbances; the heart rate during sleep is also compared with the heart rate during waking hours. High variations in blood oxygen are abnormal and may point toward a breathing problem.
• Fitbit has one more interesting concept, the restoration index, which is designed to look for signs of allergies, asthma, and even sleep apnea.
3.2 Gathering Data
Data are the raw information gathered from various users and from an online source [19]. Data were collected for 16 people from the Fitbit app. Currently, 5 months of user data have been recorded, covering over 25 parameters. All the data residing in the different files are combined to produce a single master sheet (CSV file) containing all the 25 parameters. The dataset consists of CSV and JSON files, which are listed in Table 2.

Table 2 Fitbit directory files
File name                   Description
calorie.json                Total calories the person burned in the last minute
Distance.json               Total distance moved per minute (in centimeters)
lightly_active_min.json     Adds up lightly active minutes per day
sedentary_min.json          Adds up sedentary minutes per day
very_active_min.json        Adds up very active minutes per day
moderately_active_min.json  Adds up moderately active minutes per day
resting_heart_rate.json     Resting heart rate per day
sleep_score.csv             Consists of an overall 0–100 score; helps in understanding various sleeping patterns
sleep.json                  Gives a per-sleep breakdown of the sleep into periods of deep, REM, and light sleep, and time awake
step.json                   Steps per minute
wellness.csv                Contains parameters such as fatigue, soreness, stress, and mood scores
3.3 Data Preprocessing
This section describes how to load the data from the comma-separated values (CSV) file into a DataFrame. First, import the required libraries, and then read the CSV files using the built-in function pd.read_csv(). After importing all the data, the next problem is eliminating the NaN values, which can be done by replacing them with the mean or median of the corresponding attribute of the dataset. The article [20] gives brief information on data preprocessing.
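The loading and imputation steps described above might look like the following minimal sketch. The column names and the four inline rows are hypothetical stand-ins for the master sheet, and the median is chosen here for illustration; the text allows either the mean or the median.

```python
import io

import pandas as pd

# Hypothetical miniature of the master sheet; the real file has about 25 columns.
raw_csv = io.StringIO(
    "sleep_score,resting_heart_rate,deep_sleep_in_minutes\n"
    "78,61.5,80\n"
    "81,,95\n"
    "74,63.0,\n"
    "85,60.0,110\n"
)

# Load the CSV into a DataFrame (a file path would be passed in practice).
df = pd.read_csv(raw_csv)
missing_before = int(df.isna().sum().sum())

# Replace each NaN with the median of its own column.
df_clean = df.fillna(df.median(numeric_only=True))
missing_after = int(df_clean.isna().sum().sum())

print(f"missing values: {missing_before} before, {missing_after} after")
print(df_clean)
```

The median is less sensitive than the mean to a few extreme nights, which is one reason it is often preferred for skewed wearable data; swapping in `df.mean(numeric_only=True)` gives the mean-based variant.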
3.4 Data Analysis
This section presents visualizations for a better understanding of the dataset. First, let’s look at the distribution of sleep scores in Fig. 1. The distribution is skewed toward the left, which makes sense, since bad nights of sleep are more likely than extraordinarily good ones for reasons such as staying up late at night or getting up early in the morning. Let’s also look at Fig. 2, a correlation matrix that displays the correlation coefficients between the different variables. A correlation matrix summarizes a large amount of data when the aim is to see patterns. Here, the observable pattern is that all the variables are highly correlated with one another.
Fig. 1 Sleep score distribution
Fig. 2 Correlation matrix
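A correlation matrix of the kind shown in Fig. 2 can be computed directly from a DataFrame. The sketch below uses five invented rows and three hypothetical column names purely to show the call; it does not reproduce the authors' data or figure.

```python
import pandas as pd

# Invented example values; column names are assumptions for illustration.
scores = pd.DataFrame(
    {
        "sleep_score": [72, 78, 81, 85, 90],
        "duration_score": [35, 38, 40, 42, 45],
        "time_awake": [70, 62, 55, 50, 41],
    }
)

# Pearson correlation coefficients between every pair of columns.
corr = scores.corr()
print(corr.round(3))
```

In this toy data the duration score rises with the sleep score while time awake falls, so the first pair is strongly positively correlated and the second strongly negatively correlated.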
3.5 Dividing the Data into Training and Test Set
Before moving on to model creation, we need to split the data into two subsets: a training set and a testing set. This step is very important and is carried out together with the scaling of the data, because it prevents information from the test set from leaking into the training set. We have divided the data (total samples: 2432) in a 60:40 ratio, that is, 60% for the training set and 40% for the testing set. Therefore, the number of samples used for the training set is 1459 and for the testing set is 972.
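The point about leakage can be illustrated with a short sketch: the data are split first, and the scaling statistics are computed from the training rows only. The sketch uses plain NumPy and 1000 synthetic rows as assumptions; in practice, library helpers such as scikit-learn's `train_test_split` and `StandardScaler` are commonly used for the same purpose.

```python
import numpy as np


def split_then_scale(features, train_fraction=0.6, seed=0):
    """Shuffle, split 60:40, then standardize using training statistics only."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_train = int(round(train_fraction * len(features)))
    train_rows = features[order[:n_train]]
    test_rows = features[order[n_train:]]

    # Mean and standard deviation come from the training set alone,
    # so nothing about the test set influences the transformation.
    mean = train_rows.mean(axis=0)
    std = train_rows.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (train_rows - mean) / std, (test_rows - mean) / std


# Synthetic stand-in for the feature matrix (1000 rows, 3 columns).
X = np.random.default_rng(1).normal(loc=70, scale=10, size=(1000, 3))
X_train, X_test = split_then_scale(X)
print(X_train.shape, X_test.shape)
```

The training columns end up with zero mean and unit variance, while the test columns are only approximately standardized, which is the expected behavior when the test set is kept out of the fitting step.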
3.6 MLR: Multiple Linear Regression
MLR is used for evaluating the relationship between two or more independent variables and one dependent variable. In MLR, we assume that the dependent variable depends linearly on the independent variables. The article [21] gives a clear idea of how MLR works. The formula used is given below:
$$y = \beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n + \varepsilon$$
Fig. 3 Feature importance graph of MLR
From Fig. 3, we can say that duration_score, deep_sleep_in_minutes, composition score, and revitalization score play a crucial role in predicting the sleep score.
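As a rough illustration of the MLR formula, the coefficients can be estimated by ordinary least squares. The data below are synthetic and the "true" coefficients are invented for the sketch; this is not the authors' model or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic predictors X_1..X_3 and invented coefficients beta_0..beta_3.
X = rng.uniform(0, 50, size=(200, 3))
true_beta = np.array([5.0, 1.0, 0.5, 0.8])
noise = rng.normal(0, 0.1, size=200)  # the epsilon term
y = true_beta[0] + X @ true_beta[1:] + noise

# Prepend a column of ones so that beta_0 is estimated as the intercept.
design = np.column_stack([np.ones(len(X)), X])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

print("estimated coefficients:", beta_hat.round(3))
```

The estimated coefficients land close to the invented ones because the synthetic data really are linear; the size of each fitted coefficient (on standardized inputs) is one common way to read off a feature importance graph like Fig. 3.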
3.7 Random Forest Regressor
A random forest is an ensemble technique that combines multiple decision trees through bootstrap aggregation, also known as “bagging”. We found that the accuracy of multiple linear regression is far better than that of random forest regression (Fig. 4).
3.8 Gradient Boosting Regression
Gradient boosting is a machine learning method for classification and regression; it builds a prediction model in the form of an ensemble of weak prediction models, typically decision trees. We found that the accuracy of gradient boosting regression is better than that of random forest regression but not as good as that of multiple linear regression (Fig. 5).
Fig. 4 Feature importance graph according to random forest regression
Fig. 5 Feature importance according to gradient boosting regression
3.9 Final Model Evaluation
Figure 6 shows the different performance parameters for the machine learning models. After applying the three models given above, we concluded that linear regression gives the best accuracy, 99.98%, in predicting the sleep score. We will therefore implement the linear regression model in the Android application.
Fig. 6 Model comparison
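The text does not list the performance parameters compared in Fig. 6. Assuming they are the usual regression metrics that the Results section also reports (mean absolute error, mean squared error, and R2), they could be computed as in the sketch below; the function name and the four example values are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, and the coefficient of determination R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    ss_res = np.sum(errors ** 2)                      # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return {
        "mae": float(np.mean(np.abs(errors))),
        "mse": float(np.mean(errors ** 2)),
        "r2": float(1.0 - ss_res / ss_tot),
    }


# Invented true and predicted sleep scores.
metrics = regression_metrics([72, 80, 85, 91], [70, 81, 85, 93])
print(metrics)
```

For these four invented points the errors are 2, -1, 0, and -2, giving an MAE of 1.25 and an MSE of 2.25.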
3.10 TensorFlow and Keras
To predict the reasons for sleep deprivation, we have divided the data into three categories in order to predict the step score, time active score, and mood score. Initially, we assumed the data of an ideal person, whose respective scores are 100%. Then, we calculated the step score, time active score, and mood score for a small part of the data, so that the trained models can be used on newly generated data. To integrate a model into an Android app, we use TensorFlow and Keras to build our model.
Sequential model: this is basically a stack of layers where each layer has exactly one input tensor and one output tensor. The model is trained with the backpropagation algorithm, which has to be run for a specific number of epochs. Backpropagation is a supervised learning method for multilayer feed-forward networks from the field of ANNs. The principle is to model a given function by modifying the internal weights applied to the input signals so as to produce the expected output signal. Figure 7 depicts the working of the backpropagation algorithm.
Random forest regressor model: the random forest algorithm is a supervised learning algorithm. Figure 8 shows the working of the random forest algorithm.
• Prediction of sleep score. Model used: deep learning sequential model with the ReLU activation function in the intermediate layer and a linear activation function in the last layer of the model. Input attributes: composition score, duration score, revitalization score, deep sleep in minutes, resting heart rate, restlessness, sleep efficiency, time awake, time in bed, and sleep duration.
• Prediction of step score. Model used: random forest regressor. Input attributes: distance, step count, and calories.
• Prediction of active score. Model used: deep learning sequential model with the ReLU activation function in the intermediate layer and a linear activation function in the last layer of the model. Input attributes: time sedentary, time lightly active, time moderately active, and time very active.
• Prediction of mood score. Model used: deep learning sequential model with the ReLU activation function in the intermediate layer and a linear activation function in the last layer of the model. Input attributes: mood, stress, fatigue, and soreness.
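To make the described architecture concrete (a ReLU intermediate layer, a linear last layer, training by backpropagation over a fixed number of epochs), here is a minimal NumPy sketch. It is not the authors' Keras model: the data are synthetic, and the layer width, learning rate, and epoch count are assumptions chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: four input attributes scaled to [0, 1] and a target
# score that depends linearly on them (weights invented for the sketch).
X = rng.uniform(0.0, 1.0, size=(256, 4))
y = X @ np.array([[0.4], [0.3], [0.2], [0.1]])

# One intermediate layer with 8 ReLU units, followed by one linear output unit.
W1 = rng.normal(0.0, 0.5, size=(4, 8))
b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, size=(8, 1))
b2 = np.zeros(1)


def forward(inputs):
    hidden_pre = inputs @ W1 + b1
    hidden = np.maximum(hidden_pre, 0.0)          # ReLU activation
    return hidden_pre, hidden, hidden @ W2 + b2   # linear activation at the output


learning_rate, epochs = 0.05, 3000
initial_loss = float(np.mean((forward(X)[2] - y) ** 2))

for _ in range(epochs):
    hidden_pre, hidden, output = forward(X)
    # Backpropagation: gradients of the mean squared error, layer by layer.
    grad_output = 2.0 * (output - y) / len(X)
    grad_W2 = hidden.T @ grad_output
    grad_b2 = grad_output.sum(axis=0)
    grad_hidden = (grad_output @ W2.T) * (hidden_pre > 0)
    grad_W1 = X.T @ grad_hidden
    grad_b1 = grad_hidden.sum(axis=0)
    # Gradient-descent update of the internal weights.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

final_loss = float(np.mean((forward(X)[2] - y) ** 2))
print(f"MSE before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

A Keras Sequential model hides the gradient computation behind `compile` and `fit`, but the forward pass and weight updates it performs for such a two-layer regression network follow the same pattern.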
Fig. 7 Working of backpropagation algorithm
Fig. 8 Working of random forest algorithm
After successfully training the models, they are converted into tflite models using the TensorFlow Lite converter. These tflite files are used to deploy the models in the Android application. The Android application can be synced with the Fitbit application to eliminate the need to input data manually.
4 Results
Let’s first see the evaluation metrics of the models used above to predict the sleep score, step score, mood score, and active score in Table 3. Figure 9 shows the complete flow of the Android application.

Table 3 Model evaluation
Prediction of  Model used               Mean absolute error  Mean squared error  Accuracy  R2
Sleep score    Sequential model         1.7446               0.0036              0.9966    0.9977
Step score     Random forest regressor  0.0670               0.0204              0.9998    1.00
Active score   Sequential model         0.1516               0.0249              0.8337    0.9995
Mood score     Sequential model         0.2119               0.0941              0.9764    0.9992

Fig. 9 Workflow of android application
Fig. 10 Results screenshot

To see whether an individual is sleep deprived, we first check whether the predicted sleep score is below 80. If it is, we then check the predicted physical score, mood score, and time active score; those that are also below 80 are taken as the reasons for sleep deprivation. For example, in Fig. 10 the sleep score of an individual is 76. The physical score, time active score, and mood score are 68, 91, and 75, respectively. So, the reasons for the person’s sleep deprivation are as follows:
• Your physical activity is very low.
• You should try to do more vigorous activity (at least 60 min) and 120 min of moderate activity.
• Your heart rate during sleep is not normal.
From these reasons, we can clearly see that the person’s physical activity is very low; as we know, a person who is very tired falls asleep easily. The heart rate during sleep indicates that the person has breathing disturbances.
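The threshold rule described above can be sketched in a few lines. The function and key names are hypothetical, and the sketch covers only the stated "below 80" check; the detailed messages shown in the screenshot (for example the heart rate remark) are not modeled here.

```python
THRESHOLD = 80  # cut-off stated in the text for the sleep and component scores


def deprivation_reasons(sleep_score, component_scores, threshold=THRESHOLD):
    """Return the component scores that fall below the threshold.

    An empty list is returned when the sleep score itself is not below
    the threshold, i.e. the person is not considered sleep deprived.
    """
    if sleep_score >= threshold:
        return []
    return [name for name, score in component_scores.items() if score < threshold]


# Scores from the worked example in the text.
reasons = deprivation_reasons(76, {"physical": 68, "time_active": 91, "mood": 75})
print(reasons)
```

Applied to the example scores, the rule flags the physical and mood scores, since both are below 80 while the time active score is not.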
5 Conclusion
This study shows the use of various machine learning and deep learning algorithms to analyze the data and derive the required information from it. The use of a deep learning model has eliminated the need for data preprocessing and simplified the overall workflow of sleep deprivation reason prediction. TensorFlow and Keras make it very easy to integrate the model into an Android application. Keras is a neural network library written in Python, and it is very easy to use. We have used the sequential model, since Keras lets us build a model layer by layer. The loss obtained with this model was 0.098, which is very low. The .tflite and .h5 files generated after compiling our model are simply added to the Android application. This Android-based sleep tracker system gathers various information from each individual and applies the deep learning models to predict the physical score, time active score, and mood score; after analyzing these scores, the application predicts the reasons for sleep deprivation.
References
1. F. Wang, É. Bíró, Determinants of sleep quality in college students: a literature review. Explore (2020)
2. Y. Patrick, A. Lee, O. Raha, K. Pillai, S. Gupta, S. Sethi, F. Mukeshimana et al., Effects of sleep deprivation on cognitive and physical performance in university students. Sleep Biol. Rhythms 15(3), 217–225 (2017)
3. J.J. Pilcher, A.I. Huffcutt, Effects of sleep deprivation on performance: a meta-analysis. Sleep 19(4), 318–326 (1996)
4. H. Kang, The effects of sleep deprivation on performance and perceived fatigability. University of British Columbia, Ph.D. diss. (2020)
5. A. Sathyanarayana, S. Joty, L. Fernandez-Luque, F. Ofli, J. Srivastava, A. Elmagarmid, T. Arora, S. Taheri, Sleep quality prediction from wearable data using deep learning. JMIR mHealth uHealth 4(4), e125 (2016)
6. A. Malafeev, D. Laptev, S. Bauer, X. Omlin, A. Wierzbicka, A. Wichniak, W. Jernajczyk, R. Riener, J. Buhmann, P. Achermann, Automatic human sleep stage scoring using deep neural networks. Front. Neurosci. 12, 781 (2018)
7. T. Cho, U. Sunarya, M. Yeo, B. Hwang, Y.S. Koo, C. Park, Deep-ACTINet: end-to-end deep learning architecture for automatic sleep-wake detection using wrist actigraphy. Electronics 8(12), 1461 (2019)
8. S.S. Mostafa, F. Mendonça, A.G. Ravelo-García, F. Morgado-Dias, A systematic review of detecting sleep apnea using deep learning. Sensors 19(22), 4934 (2019)
9. M. Hysing, S. Pallesen, K.M. Stormark, A.J. Lundervold, B. Sivertsen, Sleep patterns and insomnia among adolescents: a population-based study. J. Sleep Res. 22(5), 549–556 (2013)
10. G. Medic, M. Wille, M.E.H. Hemels, Short- and long-term health consequences of sleep disruption. Nat. Sci. Sleep 9, 151 (2017)
11. W. Haoxiang, S. Smys, Big data analysis and perturbation using data mining algorithm. J. Soft Comput. Paradigm (JSCP) 3(01), 19–28 (2021)
12. S. Smys, J.S. Raj, Analysis of deep learning techniques for early detection of depression on social media network: a comparative study. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 24–39 (2021)
13. M.C.V. Joe, J.S. Raj, Location-based orientation context dependent recommender system for users. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 14–23 (2021)
14. https://emberify.com/blog/sleep-tracking-2/
15. https://en.wikipedia.org/wiki/Samsung_Health
16. E. Urtnasan, J.-U. Park, K.-J. Lee, Multiclass classification of obstructive sleep apnea/hypopnea based on a convolutional neural network from a single-lead electrocardiogram. Physiol. Meas. 39, 065003 (2018). https://doi.org/10.1088/1361-6579/aac7b7
17. T. Penzel, G. Moody, R. Mark, A. Goldberger, J. Peter, The apnea-ECG database, in Proceedings of the Computers in Cardiology, Cambridge, MA, USA, 24–27 Sept 2000 (IEEE, Piscataway, NJ, USA, 2000), pp. 255–258
18. https://towardsdatascience.com/how-to-obtain-and-analyse-fitbit-sleep-scores-a739d7c8df85?gi=88647defe9f3
19. https://datasets.simula.no/pmdata/
20. https://medium.com/@yogeshojha/data-preprocessing-75485c7188c4
21. https://medium.com/analytics-vidhya/multiple-linear-regression-with-python-98f4a7f1c26c
E-Irrigation Solutions for Forecasting Soil Moisture and Real-Time Automation of Plants Watering
Md Mijanur Rahman, Sonda Majher, and Tanzin Ara Jannat
Abstract This study deals with the development of an e-irrigation solution for forecasting soil conditions and automating watering in real time in smart agriculture applications. Several factors affect agriculture, including limited water resources, proper usage of pesticides, inaccurate prediction of soil moisture, and inefficient irrigation management. Moreover, traditional irrigation procedures do not manage the watering of plants/crops properly, which wastes water. As smart farming is an emerging concept, this research work aims to use evolving technologies such as the “Internet of Things (IoT)”, “Information and Communication Technology (ICT)”, wireless sensor networks, cloud computing, machine learning, and big data. This paper presents a smart irrigation system for predicting irrigation requirements and performing the watering process automatically with the help of IoT tools, ICT protocols, and machine learning approaches. The e-irrigation system is built on an intelligent irrigation algorithm that considers sensor data and weather forecasting data. The e-irrigation solution has been implemented on a small scale and can predict the soil condition and irrigate plants effectively without human intervention.
Keywords Internet of Things (IoT) · Industry 4.0 · Smart irrigation · Soil moisture prediction · Support vector regression
1 Introduction The e-irrigation system stands for an IoT-based intelligent irrigation system [1] that is being developed based on evolving technologies, such as the “Internet of Things (IoT)”, “Information and Communication Technology (ICT)”, “Artificial Intelligence (AI)”, “Cloud Computing”, “Big Data”, etc. It is needed for expecting a very high M. M. Rahman (B) · S. Majher · T. A. Jannat Department of Computer Science and Engineering, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh 2224, Bangladesh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_25
337
338
M. M. Rahman et al.
Fig. 1 The global share of fresh water usage by different sectors in 2014
efficiency of water usage and adjusting the amount of water for a certain plant. The agriculture sector plays an important role in the world economy. A large volume of fresh water resources (more than 70% fresh water) is being used globally in agriculture applications [2]. The share of fresh water resources by different sectors is graphically illustrated in Fig. 1 [3]. Bangladesh is an agriculture-based country and most traditional irrigation procedures here are performed manually. Nowadays, the growth of population and climate change cause the shortage of water resources, and hence, the overall irrigation procedure is under pressure. The proper utilization of water resources is very important now to augment the share of retained water. There is a need to provide a smart solution to maximize the resources utilization in an intelligent way. In recent years, advanced technologies have been deployed for automated and semi-automated irrigation systems that are being replaced the traditional agricultural mechanisms. The recent innovation Industry 4.0 [4] reveals that computers are connected and communicated with one another to ultimately make decisions automatically without human involvement in a variety of fields, as shown in Fig. 2. The IoT tools make Industry 4.0 possible, and the smart system is now a reality that can handle more data smartly with more efficiently, productively and less wasteful. The IoT is very familiar with its special features in developing office/industry automation. It helps to design an automatic system that can communicate between physical and digital worlds by transferring information with the help of ICT protocols. Thus, an IoT-based system is defined as an automated and interactive process that does not require physical contact either human-to-human or human-to-machine. The smart things/objects connected to this system has unique identifier. It is able to send or receive information over a
E-Irrigation Solutions for Forecasting Soil Moisture …
Fig. 2 Emerging technologies in Industry 4.0
wireless communication network to other things, such as devices, machines, animals, or humans [5]. This study aims to predict soil moisture and automate watering in real time, thereby reducing water consumption in irrigation applications. IoT technology used in agricultural applications can manage the watering of plants/crops intelligently according to soil and weather conditions as well as user requirements. This paper presents a smart irrigation solution that allows the user to utilize resources, such as water, remotely and more efficiently.
1.1 Smart System
As new technologies are adopted day by day and show potential in modern life [6], smart systems make our lives easier and more automated. The term “smart system” is used to describe various technological systems that are autonomous, semi-autonomous or collaborative. These are also hybrid systems that can sense, actuate, and control a given situation by analyzing the real-world environment. The “smartness” of a system is realized by successful automation of an event, which is determined by the effectiveness of functioning, power consumption or energy efficiency, communication/networking capabilities, performance measurements, etc. The literature on smart systems shows the diversity of their activities and of their intelligence in terms of self-confidence, reliability, and adaptability. A variety
M. M. Rahman et al.
of smart systems are being employed by combining various technologies and smart things/objects from different disciplines, such as electrical and mechanical engineering, computer science, biology, chemistry, nanotechnology, or cognitive science. According to the literature, these smart systems are broadly classified into three generations [7]: (i) first-generation smart systems, including object detection devices, monitoring devices, and multifunctional devices for minimally invasive surgery, (ii) second-generation smart systems, including active miniaturized artificial organs, energy management systems, and physical sensor networks, and (iii) third-generation smart systems, which integrate both intelligence and cognitive functions and offer an intelligent interaction between the virtual world and the physical world. A key challenge in developing a smart system is integrating a variety of components/devices built on different evolving technologies. Thus, the design issues of smart systems for specialized applications require particular attention. Smart systems are being increasingly used in various areas, such as transportation, health care, agriculture, energy, safety and security, and industry and manufacturing. These systems can help address environmental, societal, and economic challenges, such as limited resources, climate change, population aging, and globalization, with the help of technology.
1.2 IoT-Based Systems
IoT-based systems comprise web-enabled hardware components, including embedded processors, smart things or sensors, and communication devices that can acquire, actuate, and transmit environmental or physical data among them [8]. The sensed data is sent to the cloud through an IoT gateway or another edge device. The data is then analyzed locally or in the cloud. Any type of IoT system is built by integrating the physical world, the virtual world and a communication network. These are the three basic blocks of an IoT-based system, as shown in the block diagram in Fig. 3. The data flow and the overall working procedure of an IoT-based system are illustrated in Fig. 4.
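As a rough illustration of the sensor-to-gateway-to-cloud flow described above, the following Python sketch serializes one sensor reading as JSON and passes it through a stand-in gateway to an in-memory store. The class names (`SensorReading`, `Gateway`, `CloudStore`), the field names, and the sample values are hypothetical and are not part of the system described in this paper; a real deployment would replace the in-memory store with a network call.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SensorReading:
    """One sample from a (hypothetical) field node."""
    node_id: str
    soil_moisture_pct: float
    air_temp_c: float
    humidity_pct: float


class CloudStore:
    """In-memory stand-in for a cloud data server."""

    def __init__(self):
        self.records = []

    def ingest(self, payload: str) -> None:
        # The cloud side parses the JSON payload back into a dictionary.
        self.records.append(json.loads(payload))


class Gateway:
    """Stand-in for an IoT gateway or edge device that forwards data."""

    def __init__(self, cloud: CloudStore):
        self.cloud = cloud

    def forward(self, reading: SensorReading) -> str:
        # Serialize the reading and hand it to the cloud store.
        payload = json.dumps(asdict(reading))
        self.cloud.ingest(payload)
        return payload


cloud = CloudStore()
gateway = Gateway(cloud)
gateway.forward(SensorReading("node-1", 31.5, 29.0, 64.0))
```

Keeping the gateway separate from the store mirrors the three-block view (physical world, communication network, virtual world) and makes it easy to swap in a different transport.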
Fig. 3 Major components of an IoT framework
Fig. 4 Data flow and working process of an IoT-based system
In developing an IoT-based system for this study, we reviewed a list of IoT applications, including smart home, smart lock, smart security system, smart factory, smart city, smart tracking and monitoring system, and smart mirror [9]. These applications motivated us to carry out this study in the area of Industry 4.0 and IoT technology.
2 Related Work
Many researchers are looking to advanced technology to address and overcome the challenges of existing irrigation systems. With the advancement of IoT and wireless technologies, García et al. [10] presented a survey of the current state of the art in smart irrigation systems. They identified and analyzed the parameters (such as soil conditions, environmental/physical quantities, and weather forecasts) considered in the development of irrigation systems. The solar-powered smart irrigation system [11] is an advanced technology-based farming method. This type of eco-friendly system [12] involves a solar-driven water pump and automatic watering with the help of IoT devices (such as soil moisture sensors and smart devices). A smart system using a Cuckoo search algorithm [13] can provide a better distribution of water resources on a crop farm under any circumstances. This system was equipped with IoT sensors and a wireless communication system, and it analyzed various sensed values, such as soil moisture, temperature, humidity, and pH. This IoT platform stores the sensed data in the cloud using ThingSpeak, and the Cuckoo search algorithm helps to select the appropriate crop land and permit watering. Kaewmard and Saiyod [14] presented a portable technology
which incorporated sensor devices for measuring soil moisture and air humidity. This system was capable of collecting environmental data and controlling the irrigation process via smartphone. Nagothu [15] also proposed an IoT-based irrigation system that utilized sensed data and weather forecast data. The major issue was handling multiple layers or types of soil at a single location with a reading taken for one particular soil type. The authors in [16] proposed a system that controls watering by utilizing soil moisture sensors and other IoT devices. The data acquired from the sensors was sent to a centralized server through a wireless communication module, which assists in controlling and monitoring the water supply. A cloud-based smart multi-level irrigation system using IoT for reduced water consumption was presented in [17]. Each level was provided with a local node (customized to selected crops) that had its own local decision-making system. These local nodes communicated with a centralized cloud server capable of storing and processing the relevant data. In [18], the authors presented a field-specific irrigation system that was operated automatically with the help of sensors and valve controllers. This wireless sensor network-based system was also scalable and self-organizing [19]. Kokkonis et al. [20] proposed a fuzzy-based framework for smart irrigation systems which employed sensors (soil moisture, air temperature, and humidity), actuators, and a microcontroller. All the sensor data were forwarded to a cloud database through the microcontroller for further processing. In this system, a fuzzy computational algorithm was used to decide whether to open the servo valve. In [21], another wireless sensor network was employed for managing agricultural fieldwork. This system provided long-distance communication through WLAN and a server configuration. Besides wireless sensors, smart mobile technology has also been adopted in agricultural applications [22].
Other researchers have conducted studies focusing on specific aspects of irrigation systems and software [23], such as center-pivot irrigation systems [24], irrigation systems for greenhouse plants [25], and water management or automation in agriculture [26]. Lastly, there are some surveys focusing on precision agriculture [27, 28], crop monitoring [29] and the agro-industry [30]. Surveys on smart irrigation systems [31–36] were also analyzed for this study. These studies gave us an overview of current advances in irrigation systems and of the proper utilization of wireless sensor networks, IoT and ICT technologies [37]. However, most of the papers do not provide a full analysis of the technological aspects of e-irrigation in agriculture. This paper addresses this gap in the literature with a survey of IoT-based smart irrigation systems and presents an e-irrigation solution for real-time automation of plant watering in smart agriculture. Before development, it is necessary to study many advanced technologies that must be integrated or embedded in a common platform [11]. These technologies include wireless sensor networks, cloud computing, big data, artificial intelligence and machine learning, smartphone technology, renewable energy (especially solar energy), IoT and ICT tools, and other intelligent interfacing tools.
3 Proposed System
IoT technology is the concept of linking smart devices with the Internet and IoT-enabled big data. IoT-based systems are built of web-enabled devices that can interact with sensed data acquired from the environment using smart sensors, processors and a communication network. The proposed system is designed using Arduino technology to offer diverse connectivity, to perceive environmental parameters, such as soil moisture, weather forecast data and user requirements, and to perform real-time automation of irrigation using appropriate algorithms. The proposed e-irrigation model provides effective irrigation by optimizing the watering process in agricultural applications. The proposed e-irrigation solution has the following key features:
1. The e-irrigation model has the potential to assist existing irrigation systems and enhance crop quality with the help of forecasting data, such as soil moisture, air temperature, humidity and level of watering.
2. Using appropriate forecasting and irrigation planning algorithms, the proposed system informs the wireless sensing unit and related modules to control watering automatically.
3. By providing real-time automation (monitoring and controlling) of watering, it avoids human intervention and reduces water consumption, which leads to energy efficiency as well as cost-effectiveness.
The proposed e-irrigation system has seven components: field data collection, soil moisture prediction, weather data collection, an interface for real-time monitoring and controlling, a data server, a wireless communication network for sensor devices and smart devices. According to their functionality, these components are grouped into three layers (see Fig. 5): (i) data collection and transmission through wireless sensors and smart devices, (ii) data processing, prediction and real-time automation using appropriate algorithms and (iii) applications for irrigation planning and plants/crops watering. This layered architecture and the simplified circuit diagram of the proposed system are illustrated in Figs. 5 and 6, respectively. System design is the first step in the development of a system, and the system architecture defines the components, modules, user interfaces and data management needed for a system to fulfill the specified requirements and objectives. Arduino is used for controlling the whole process and integrating all the modules, such as the GSM module, sensing unit, data management, display unit, smart devices and user interface. All the data perceived through sensing devices and other modules are processed using appropriate algorithms. The GSM module is used for communicating and sending notifications to the user’s smartphone. The LCD unit displays all the information/notifications regarding its operation. The sensing unit, with its smart things or sensors, acquires weather and environmental data, such as soil moisture, air temperature, humidity and level of watering. The forecasting algorithm uses the acquired data to make a prediction of the irrigation requirements
Fig. 5 Layered architecture of the proposed e-irrigation model
and enables the watering operation. The forecast value also helps to control the water pump for watering at the desired location based on the predicted soil moisture value. The major modules of the e-irrigation model are described in the following subsections.
3.1 Wireless Sensor Network and Control Unit
Figure 7 shows a typical block diagram of a smart system integrating the sensor network and control unit with the microcontroller. It consists of smart things or sensors (such as a soil moisture sensor, a temperature and humidity sensor, and a water level sensor), a transceiver, a web data server and a watering control unit. The sensed data is processed by the microcontroller and forwarded to the cloud server via the Internet. The overall control unit comprises a microcontroller which actuates the water pump depending on the sensor data and the predicted soil moisture values. The sensing unit frequently acquires and compares the relevant measurements with the help of the irrigation algorithm, and the appropriate decision is then taken concerning the need for watering. The control decision is made by the irrigation planning algorithm with the help of the soil moisture prediction algorithm, utilizing sensed soil moisture, weather forecast data (air temperature and humidity) and user requirements.
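The start/stop decision described above can be sketched as a small function. This is only a minimal illustration under stated assumptions: the function name `decide_pump`, the two thresholds, and the rule of starting on the drier of the sensed and predicted values are hypothetical choices and are not taken from the paper. The deployed system runs on an Arduino UNO in C/C++; Python is used here purely for readability.

```python
def decide_pump(pump_on: bool, sensed_pct: float, predicted_pct: float,
                start_below: float = 30.0, stop_above: float = 45.0) -> bool:
    """Return the new pump state (True = pump running).

    start_below and stop_above are illustrative soil moisture thresholds
    in percent, not values reported by the authors.
    """
    if not pump_on:
        # Start watering when either the sensed or the predicted soil
        # moisture falls below the lower threshold.
        return min(sensed_pct, predicted_pct) < start_below
    # Once running, keep watering until the sensed value reaches the
    # upper threshold; the gap between thresholds avoids rapid toggling.
    return sensed_pct < stop_above
```

For example, `decide_pump(False, 28.0, 35.0)` starts the pump because the sensed value is below 30, while `decide_pump(True, 46.0, 20.0)` stops it because the sensed value has passed 45.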
Fig. 6 Simplified circuit diagram for designing the proposed system
Fig. 7 Block diagram of the proposed irrigation system, integrating wireless sensors, transceiver, data server and control unit with the microcontroller (Arduino UNO)
Fig. 8 Cloud server with the transceiver and the control program
3.2 Cloud Server (Data Server)
Figure 8 shows the interfacing between the data server and the control program. The data server contains the soil moisture measurements and the weather forecast data (temperature and humidity values for the next 6 h), which are utilized to decide whether to irrigate the plants. The decision passed to the control program is determined by a forecasting algorithm that predicts soil moisture from the sensed data and the weather forecast data. The system then automatically controls the irrigation procedure by receiving the decisions from the data server. The cloud, using the predicted value, is also responsible for stopping the watering process. According to the nature of plants or crops, predicted soil moisture
values can vary, and the duration and frequency of irrigation will differ accordingly.
3.3 Irrigation Planning Algorithm
As discussed earlier, the e-irrigation model has been proposed to manage irrigation effectively by utilizing sensor data (soil moisture, air temperature and relative humidity) from the crop land and weather forecast data (air temperature and humidity). An irrigation scheduling algorithm developed by Goap et al. [38] has been applied to forecast the soil moisture (i.e., to estimate the soil moisture difference) based on field sensor data and weather forecast data using a support vector regression (SVR) model [39] and k-means clustering [40]. The algorithm uses the mean squared error (MSE) to assess the accuracy of the soil moisture prediction. First, the SVR model is trained with the sensor data, and it estimates the soil moisture difference for the upcoming days. This predicted value is then passed to k-means clustering to obtain a more accurate soil moisture value with minimum error. The result is the final predicted soil moisture value, which is used in the e-irrigation planning algorithm to control the irrigation process effectively. The planning algorithm is responsible for monitoring and controlling (starting and stopping) the overall irrigation or plants/crops watering process. It also provides irrigation suggestions to save water as well as power. The flow diagram of the proposed e-irrigation planning procedure along with the soil moisture prediction algorithm is shown in Fig. 9.
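The two evaluation and refinement steps named above can be sketched in plain Python. This is one plausible reading under stated assumptions, not the procedure of [38]: the regression stage is not shown (its outputs are hard-coded, hypothetical values), and the refinement simply replaces each predicted value with the centroid of its k-means cluster. In practice a library such as scikit-learn (`sklearn.svm.SVR`, `sklearn.cluster.KMeans`) could fill both roles.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the centroids over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def refine_with_clusters(predictions, k=2):
    """Replace each prediction by the centroid of its cluster (assumed rule)."""
    centroids, labels = kmeans_1d(predictions, k)
    return [centroids[lab] for lab in labels]


# Hypothetical regression outputs: predicted soil moisture differences.
svr_predictions = [-2.1, -1.9, -2.0, 0.4, 0.6, 0.5]
refined = refine_with_clusters(svr_predictions, k=2)
```

Calling `mse(observed, refined)` against held-out observations would then show whether the clustering step actually lowers the error for a given data set, which the sketch itself does not guarantee.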
4 Results and Discussion
The proposed e-irrigation model has been implemented in both hardware and software. The system involves an Arduino UNO, IoT-based sensors, smart devices and a wireless communication module, and it operates in three phases: (i) sensing data (soil moisture, air temperature and humidity) and collecting weather forecast data (temperature and humidity) [41], (ii) predicting soil moisture using the irrigation algorithm and (iii) monitoring and controlling irrigation as well as giving notifications to other modules or the user. The overall arrangement of the designed system is shown in Fig. 10. All the components have been integrated with the Arduino UNO, and the program was written in C/C++ in the Arduino IDE. Along with the collection of weather forecast data, smart sensors acquire physical data (soil moisture, temperature and humidity) from the environment (i.e., the plant or crop farm). The system is able to quantify the volumetric water content of the soil [42]. As this study aimed to develop a smart irrigation system, it effectively performed plant watering according to the decision taken by the irrigation scheduling algorithm with the help of the prediction algorithm. The developed system also sends notifications to other devices (LCD display) and to users through smart devices, as shown in Figs. 11 and 12. It can display the status of the initial stage, soil monitoring and the watering process. Thus, the user can learn about the state of the irrigation system from anywhere at any time through messages. This information confirms to the user that the irrigation system is working properly. The major outcomes of the developed system include optimization of water usage, automation of irrigation planning, time savings and cost-effectiveness, as well as support for smart agriculture.
Fig. 9 Flowchart of the irrigation planning procedure with soil moisture prediction algorithm
Fig. 10 Overall arrangement of the developed system
Fig. 11 LCD scenario of the status of water motor
Fig. 12 Messaging through mobile device
5 Conclusion
This study aimed at designing and developing an e-irrigation system for forecasting soil moisture and real-time automation of plant watering based on IoT and wireless sensor network technology. The solution combines various smart systems that could be applied in smart agriculture. The experiments showed that this automated system can manage plants/crops watering more effectively and help to optimize water usage on the farm. The solution has been implemented at small scale and is applicable as a stand-alone application. It can be further extended to large scale and customized to irrigate any harvest field. Future work may employ a machine learning approach to water management and crop field analysis for saving water as well as other resources.
Acknowledgements The authors wish to express their heartfelt gratitude to the Research & Extension Cell and the Department of Computer Science and Engineering of Jatiya Kabi Kazi Nazrul Islam University (JKKNIU, Bangladesh) for their support and cooperation.
References
1. M.N. Rajkumar, S. Abinaya, V.V. Kumar, Intelligent irrigation system: an IoT based approach, in 2017 International Conference on Innovations in Green Energy and Healthcare Technologies (IGEHT) (IEEE, 2017), pp. 1–5
2. C. Edgerton, A. Estrada, K. Fairchok, M.T. Parker, A. Jezak, C. Pavelka, H. Lee, L. Doyle, A. Feldmeth, Addressing water insecurity with a greywater hydroponics system in South Africa, in IEEE Global Humanitarian Technology Conference (GHTC) (IEEE, 2020), pp. 1–4
3. T. Khokhar, Chart: globally, 70% of freshwater is used for agriculture. World Bank Blogs. https://blogs.worldbank.org/opendata/chart-globally-70-freshwater-used-agriculture (2017). Accessed 26 Aug 2020
4. M. Rüßmann, M. Lorenz, P. Gerbert, M. Waldner, J. Justus, P. Engel, M. Harnisch, Industry 4.0: the future of productivity and growth in manufacturing industries. Boston Consult. Group 9(1), 54–89 (2015)
5. C. McClelland, IoT explained-how does an IoT system actually work. https://medium.com/iotforall/iot-explained-how-does-an-iot-system-actually-work-e90e2c435fe7 (2017). Accessed 26 Aug 2020
6. K. Andersson, M.S. Hossain, Heterogeneous wireless sensor networks for flood prediction decision support systems, in 2015 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) (IEEE, 2015), pp. 133–137
7. The Iavishkar Project, Smart components and smart system integration. https://iavishkar.com/smart-components-and-smart-system-integration/ (2016). Accessed 26 Aug 2020
8. Z. Abedin, A.S. Chowdhury, M.S. Hossain, K. Andersson, R. Karim, An interoperable IP based WSN for smart irrigation system, in 2017 14th IEEE Annual Consumer Communications & Networking Conference (CCNC) (IEEE, 2017), pp. 1–5
9. R.U. Islam, K. Andersson, M.S. Hossain, Heterogeneous wireless sensor networks using COAP and SMS to predict natural disasters, in 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) (IEEE, 2017), pp. 30–35
10. L. García, L. Parra, J.M. Jimenez, J. Lloret, P. Lorenz, IoT-based smart irrigation systems: an overview on the recent trends on sensors and IoT systems for irrigation in precision agriculture. Sensors 20(4), 1042 (2020)
11. A. Raheman, M.K. Rao, B.V. Reddy, T.R. Kumar, IoT based self-tracking solar powered smart irrigation system. Int. J. Eng. Technol. 7, 390–393 (2018)
12. C. Subramani, S. Usha, V. Patil, D. Mohanty, P. Gupta, A.K. Srivastava, Y. Dashetwar, IoT-Based Smart Irrigation System (Springer, 2020), pp. 357–363
13. A. Pathak, M. AmazUddin, M.J. Abedin, K. Andersson, R. Mustafa, M.S. Hossain, IoT based smart system to support agricultural parameters: a case study. Procedia Comput. Sci. 155, 648–653 (2019)
14. N. Kaewmard, S. Saiyod, Sensor data collection and irrigation control on vegetable crop using smart phone and wireless sensor networks for smart farm, in 2014 IEEE Conference on Wireless Sensors (ICWiSE) (IEEE, 2014), pp. 106–112
15. S.K. Nagothu, Weather based smart watering system using soil sensor and GSM, in 2016 World Conference on Futuristic Trends in Research and Innovation for Social Welfare (Startup Conclave) (IEEE, 2016), pp. 1–3
16. A. Kumar, K. Kamal, M.O. Arshad, S. Mathavan, T. Vadamala, Smart irrigation using low-cost moisture sensors and xbee-based communication, in IEEE Global Humanitarian Technology Conference (GHTC 2014) (IEEE, 2014), pp. 333–337
17. S. Salvi, S.F. Jain, H. Sanjay, T. Harshita, M. Farhana, N. Jain, M. Suhas, Cloud based data analysis and monitoring of smart multi-level irrigation system using IoT, in 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (IEEE, 2017), pp. 752–757
18. R.W. Wall, B.A. King, Incorporating plug and play technology into measurement and control systems for irrigation management, in 2004 ASAE Annual Meeting (American Society of Agricultural and Biological Engineers, 2004), p. 1
19. Y. Wang, L. Huang, J. Wu, H. Xu, Wireless sensor networks for intensive irrigated agriculture, in 2007 4th IEEE Consumer Communications and Networking Conference (IEEE, 2007), pp. 197–201
20. G. Kokkonis, S. Kontogiannis, D. Tomtsis, A smart IoT fuzzy irrigation system. Power 100(63), 25 (2017)
21. K. Konstantinos, X. Apostolos, K. Panagiotis, S. George, Topology optimization in wireless sensor networks for precision agriculture applications, in 2007 International Conference on Sensor Technologies and Applications (SENSORCOMM 2007) (IEEE, 2007), pp. 526–530
22. K. Masuki, R. Kamugisha, J. Mowo, J. Tanui, J. Tukahirwa, J. Mogoi, E. Adera, Role of mobile phones in improving communication and information delivery for agricultural development: lessons from south western Uganda, in Workshop at Makerere University, Uganda, 2010, pp. 22–23
23. M. Brajović, S. Vujović, S. Dukanović, An overview of smart irrigation software, in 2015 4th Mediterranean Conference on Embedded Computing (MECO) (IEEE, 2015), pp. 353–356
24. O. Debauche, M. El Moulat, S. Mahmoudi, P. Manneback, F. Lebeau, Irrigation pivot-center connected at low cost for the reduction of crop water requirements, in 2018 International Conference on Advanced Communication Technologies and Networking (CommNet) (IEEE, 2018), pp. 1–9
25. G. Nikolaou, D. Neocleous, N. Katsoulas, C. Kittas, Irrigation of greenhouse crops. Horticulturae 5(1), 7 (2019)
26. K. Jha, A. Doshi, P. Patel, M. Shah, A comprehensive review on automation in agriculture using artificial intelligence. Artif. Intell. Agric. 2, 1–12 (2019)
27. A. Mohapatra, B. Keswani, S. Lenka, ICT specific technological changes in precision agriculture environment. Int. J. Comput. Sci. Mob. Appl. 6, 1–16 (2018)
28. U. Shafi, R. Mumtaz, J. García-Nieto, S.A. Hassan, S.A.R. Zaidi, N. Iqbal, Precision agriculture techniques and practices: from considerations to applications. Sensors 19(17), 3796 (2019)
29. D. Sreekantha, A. Kavya, Agricultural crop monitoring using IoT: a study, in 2017 11th International Conference on Intelligent Systems and Control (ISCO) (IEEE, 2017), pp. 134–139
30. J.M. Talavera, L.E. Tobón, J.A. Gómez, M.A. Culman, J.M. Aranda, D.T. Parra, L.A. Quiroz, A. Hoyos, L.E. Garreta, Review of IoT applications in agro-industrial and environmental fields. Comput. Electron. Agric. 142, 283–297 (2017)
31. S. Jain, K. Vani, A survey of the automated irrigation systems and the proposal to make the irrigation system intelligent. Int. J. Comput. Sci. Eng. 6, 357–360 (2018)
32. A. Joshi, L. Ali, A detailed survey on auto irrigation system, in 2017 Conference on Emerging Devices and Smart Systems (ICEDSS) (IEEE, 2017), pp. 90–95
33. K. Kansara, V. Zaveri, S. Shah, S. Delwadkar, K. Jani, Sensor based automated irrigation system with IoT: a technical review. Int. J. Comput. Sci. Inf. Technol. 6(6), 5331–5333 (2015)
34. P.B. Yahide, S. Jain, M. Giri, Survey on web based intelligent irrigation system in wireless sensor network. Multidiscip. J. Res. Eng. Technol. 2, 375–385 (2015)
35. M.H.J.D. Koresh, Analysis of soil nutrients based on potential productivity tests with balanced minerals for maize-chickpea crop. J. Electron. 3(01), 23–35 (2021)
36. J.I.-Z. Chen, L.-T. Yeh, Greenhouse protection against frost conditions in smart farming using IoT enabled artificial neural networks. J. Electron. 2(04), 228–232 (2020)
37. A. Tzounis, N. Katsoulas, T. Bartzanas, C. Kittas, Internet of things in agriculture, recent advances and future challenges. Biosyst. Eng. 164, 31–48 (2017)
38. A. Goap, D. Sharma, A. Shukla, C.R. Krishna, An IoT based smart irrigation management system using machine learning and open source technologies. Comput. Electron. Agric. 155, 41–49 (2018)
39. H. Drucker, C.J. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector regression machines. Adv. Neural Inf. Process. Syst. 9, 155–161 (1997)
40. T. Kanungo, D.M. Mount, N.S. Netanyahu, C.D. Piatko, R. Silverman, A.Y. Wu, An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 881–892 (2002)
41. Weather Atlas, Weather forecast Bangladesh. https://www.weather-atlas.com/en/bangladesh (2020). Accessed Aug 2020
42. S.L. Su, D. Singh, M.S. Baghini, A critical review of soil moisture measurement. Measurement 54, 92–105 (2014)
Implementing OpenCV and Dlib Open-Source Library for Detection of Driver’s Fatigue
R. Kavitha, P. Subha, R. Srinivasan, and M. Kavitha
Abstract Detecting a driver’s drowsiness has become one of the most effective ways of determining driver weariness. When a driver falls asleep while driving, the car goes out of control, which frequently leads to a collision with another vehicle or a stationary object. Many vehicle-based indicators are constantly monitored to check that the driver is not asleep. This paper tries to highlight the issue of drowsiness in drivers. Driving may induce drowsiness; therefore, it is important to catch it early. The Dlib open-source library and OpenCV are used for face recognition, and the concept is implemented in Python. A camera is used to monitor the driver’s behavior, such as eye closure, yawning, eye movements, and head position, and the driver is warned if any of these signs of sleepiness is observed. The proposed approach based on the Dlib open-source library and OpenCV is compared with a pre-existing algorithm for detecting driver fatigue. The objective of this work is to create a Real-Time Driver Drowsiness Alert system. This technology monitors the driver’s eyes and sounds an alarm if the driver becomes tired.
Keywords Automobile · Drowsiness · Driver fatigue · Eye movement · Face detection · OpenCV
R. Kavitha (B) · R. Srinivasan · M. Kavitha
Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, India
P. Subha
Sri Sai Ram Institute of Technology, Chennai 600044, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_26
1 Introduction
A drowsy person is slow and low on energy. Drowsiness is a state of being on the verge of falling asleep, with a strong desire to do so. It refers to the inability to keep the eyes open or the sense of being fatigued. Sleepiness is also known as excessive languor. It may cause recklessness or nodding off at inopportune times. It is commonly associated with weakness, lethargy, and a lack of mental acuity. Disturbed rest has also been linked to despair, discomfort, and stress. Driver sleepiness is currently a major issue. It is usually not related to a medical problem, but rather to long hours of travel by a skilled driver. This may induce drowsiness; therefore, it is important to catch it early [1]. The word “drowsy” is equivalent to “sleepy,” which merely refers to a desire to sleep. Awake, non-rapid eye movement (NREM) sleep, and rapid eye movement (REM) sleep are the three stages of sleep. NREM, the second stage, can be further divided into three phases: first, the transition from awake to sleepy; second, light sleep; and third, deep sleep. Crashes that occur as a result of driver sleepiness share several features: they happen late at night (between 0:00 and 7:00 a.m.) or in the afternoon (between 2:00 and 4:00 p.m.); they involve a single car going off the lane; they happen on high-speed highways; the driver is usually a young male between the ages of 16 and 25; and there are no skid marks or signs of braking. To examine driver tiredness, researchers have largely investigated Stage I of NREM sleep. Existing detection strategies have been thoroughly examined, and the benefits and drawbacks of each have been highlighted. However, the capabilities of the multiple approaches must be incorporated into one framework to build an effective Real-Time Driver’s Drowsiness Alert system.
The present study captures the driver’s facial behavior with a camera facing the driver, in order to detect eye blinking, yawning, head shaking, etc. The duration and frequency of all these occurrences are measured. Through the fatigue indicator, the driver can be alerted to this condition. This research paper is structured as follows: Sect. 2 discusses related work, Sect. 3 presents the proposed method, Sect. 4 covers implementation and testing, Sect. 5 presents the results and discussion, and Sect. 6 gives the conclusion and future scope.
2 Literature Review Aidman et al. [1] A novel method for key aspects of the face detection problem, based on analog CNN algorithms, is presented in this research. The CNN-based techniques effectively locate [2] and help to normalize facial features, and the time required is a quarter of that required by earlier methods. Driver tiredness is reported as the cause of the majority of vehicle-related incidents. The system begins by detecting heads in color images through color and structural differences between
Implementing OpenCV and Dlib Open-Source Library for Detection …
the background and the human face. All faces must be brought into the same position and orientation by normalizing the scale and location of the reference points. Khan et al. [3] The eyes act as the reference for normalization. Another CNN method searches the image pixels for key characteristics of the eyes and eyelids to detect the eyes. Tests on a standard database reveal that the approach is both fast and dependable. Galarza et al. [2] examined face detection. To train the classifier, the system initially requires a large number of positive and negative images. After that, features must be extracted from them. Haar features are employed for this. These features are similar to convolutional kernels. Each feature is a single value calculated by subtracting the sum of the pixels beneath the white rectangle from the sum of the pixels beneath the black rectangle. Gupta et al. [4] Morphology and color image analysis were used for eye detection. Many applications, such as iris detection, face identification, auto-stereoscopic displays, face recognition, video conferencing, and eye-gaze tracking, require eye detection. Using morphological and color image analysis, this research provides a novel methodology for eye detection. Compared with other portions of the face, eye areas in a picture are characterized by low illumination, high contrast, and high edge density. Wong et al. [5] The proposed method depends on the premise that a full-frontal facial image is available. To begin, a color-based method and a six-sigma approach are used to identify the skin area in the NTSC, RGB, and HSV color spaces. The subsequent analysis uses morphological analysis with boundary region detection and detection of the light source reflected by the eye, generally referred to as an eye dot. This results in a finite set of eye candidates from which noise can be eliminated later. Liu et al.
[6] For detecting eyes in frontal face photos, this methodology has been shown to be highly accurate and reliable. A robust eye detection technique for gray-level images is presented in this research. The goal of the approach is to combine the benefits of two existing strategies, the template-based method and the feature-based method, and to overcome their shortcomings. To begin, after the location of the facial region has been determined, a feature-based method is used to identify two rough regions containing the eyes. Then, within these two rough regions, the iris centers are located accurately using a template-based method. Experiments on people who do not wear spectacles reveal that the proposed method is not only reliable but also highly effective. The authors present ongoing research on real-time face detection in gray-level images using edge orientation information. Gromer et al. [7] The authors illustrate how edge orientation is a useful local image feature for modeling and detecting objects such as faces. They show a quick and easy method for template matching and object modeling based purely on edge orientation information. They also illustrate how to use a series of training images to create an optimal face model in the edge orientation domain. Unlike many other
approaches that model the face’s grayscale appearance, this one is computationally very fast. A 320 × 240 image is analyzed in less than 0.08 s on a 500 MHz Pentium II using a multi-resolution search with six levels of resolution. Savaş et al. [8] On an image set comprising 17,000 photos taken from over 2900 distinct persons, the researchers demonstrate the capability of their detection system. Data from sensing techniques, such as electroencephalography (EEG), electrooculography (EOG), and electrocardiography (ECG), is used in the second class of approaches. EEG signals provide data about the activity of the brain. Alpha, theta, and delta signals are the three basic signals used to assess a driver’s sleepiness. When a driver is fatigued, delta and theta signals surge, whereas alpha signals rise modestly. Mohammed et al. [9] Features extracted from the face are used in computer vision approaches. These rely on behaviors such as facial expression or gaze, head movement, yawning duration, and eye closure. One approach used the distance between the eyelids to evaluate three levels of sleepiness. This analysis took into account the number of blinks per minute, which is assumed to rise as the driver becomes drowsier. Many authors use mouth and yawning behaviors to evaluate drowsiness. For mouth and face detection, the enhanced Viola–Jones object detection technique was used.
3 Proposed System This work proposes a Real-Time Driver’s Drowsiness Alert system based on face detection, with the goal of a reliable and robust system [10]. The proposed architecture is implemented with the Dlib open-source library and OpenCV, which perform the face detection. Python is the programming language used to implement the concept. A camera is employed to determine the driver’s facial landmarks and eye movement in real time. The detected face is cropped to isolate a single eye [11]. After the face is first detected, the lower and upper eyelids are extracted by estimating the eye region in each frame. The system monitors eye blinks, that is, the closing and opening of the driver’s eyes, by tracking the eyelids. When a driver’s eyelids remain closed for a certain amount of time, the driver is assumed to be drowsy, and an audible alert is sounded to rouse the driver [12]. When the driver yawns, the mouth area is likewise extracted, and the mouth aspect ratio is determined. The driver receives an alert if the mouth aspect ratio exceeds the threshold value.
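The alert rule described in this section can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the class name `DrowsinessMonitor` and all three threshold values are assumptions chosen for the example, and the eye and mouth aspect ratios are taken as already computed for each frame.

```python
from dataclasses import dataclass


@dataclass
class DrowsinessMonitor:
    """Tracks eye closure and yawning across frames (illustrative only)."""
    ear_threshold: float = 0.25      # assumed: eye treated as closed below this ratio
    mar_threshold: float = 0.60      # assumed: mouth treated as yawning above this ratio
    closed_frames_limit: int = 20    # assumed: consecutive closed frames before alerting
    closed_frames: int = 0

    def update(self, ear: float, mar: float) -> bool:
        """Process one frame's ratios; return True when an alert should sound."""
        if ear < self.ear_threshold:
            self.closed_frames += 1
        else:
            self.closed_frames = 0   # the eye reopened, so reset the counter
        eyes_closed_too_long = self.closed_frames >= self.closed_frames_limit
        yawning = mar > self.mar_threshold
        return eyes_closed_too_long or yawning
```

Counting consecutive frames, rather than reacting to a single low value, is what separates an ordinary blink from prolonged eye closure in this sketch.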
3.1 Existing System Several techniques for detecting driver fatigue have been explored in earlier studies. One is a robust real-time integrated system that detects the driver’s loss of attention in both night and day driving conditions. The percentage of eye closure (PERCLOS),
was measured as the proportion of time the eyelids remain closed in one minute and is used to measure attentiveness. In this method, the face is detected via Haar-like features and tracked by a Kalman filter. During the day, principal component analysis is used to recognize the eyes, but at night, block local binary pattern features are used [13]. Finally, an SVM is used to classify the eye state as closed or open. In terms of both accuracy and speed, the system was found to be highly reliable. The detection of eyes obscured by glasses and in low-light conditions is an area where more research is needed.
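The percentage-of-eye-closure measure described above can be sketched as a rolling window over per-frame eye states. This is a hedged illustration only: the class name `Perclos` and the idea of sizing the window as frame rate times 60 s are assumptions, and the per-frame closed/open decision is taken as given (for example, from the SVM mentioned above).

```python
from collections import deque


class Perclos:
    """Rolling percentage of frames in which the eye was classified as closed."""

    def __init__(self, window_frames: int):
        # Assumed sizing: frames per second * 60 gives a one-minute window.
        self.states = deque(maxlen=window_frames)

    def update(self, eye_closed: bool) -> float:
        """Record one frame and return the closure percentage over the window."""
        self.states.append(bool(eye_closed))
        return 100.0 * sum(self.states) / len(self.states)
```

Because the deque has a fixed maximum length, the oldest frame drops out automatically once the window is full.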
3.2 Camera and Frame Acquisition The camera is set up first, and its video is fed into the system. Frame acquisition [14], that is, the partitioning of the video feed into individual frames at a given number of frames per second, is then performed. The driver’s face is detected in each frame of the footage.
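A frame acquisition loop of this kind might look like the sketch below. It assumes only that the capture object exposes an OpenCV-style `read()` method returning a success flag and a frame; the names `iter_frames` and `FakeCapture` are hypothetical, and `FakeCapture` exists solely so the sketch can run without a camera.

```python
from typing import Any, Iterator, Optional


def iter_frames(capture: Any, max_frames: Optional[int] = None) -> Iterator[Any]:
    """Yield frames from an object exposing an OpenCV-style read() -> (ok, frame)."""
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()
        if not ok:          # camera disconnected or end of the video
            break
        yield frame
        count += 1


class FakeCapture:
    """Stand-in for a camera so the sketch runs without hardware (hypothetical)."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None
```

With OpenCV installed, an object returned by `cv2.VideoCapture(0)` provides the same `read()` interface and could be passed to `iter_frames` in place of `FakeCapture`.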
3.3 Face Detection After the face is detected, we predict the shape of the face. The iBUG 300-W dataset is used to train our custom Dlib shape predictor. The purpose of training on iBUG 300-W is to obtain a shape predictor that can locate each facial feature, such as the jawline, eyebrows, eyes, mouth, and nose. Each annotation in the dataset is made up of 68 pairs of numeric values that represent the (x, y)-coordinates of facial landmarks. A predictor trained on iBUG 300-W can estimate the position of each of these 68 (x, y)-coordinate pairs and, as a result, can localize each of the facial features. Figure 1 illustrates the face detection process.
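To make the 68-point layout concrete, the sketch below groups a list of 68 (x, y) pairs by facial component. The index ranges follow the ordering commonly used with Dlib’s 68-point predictor (with left and right taken from the subject’s point of view); the function name `split_landmarks` is hypothetical, and the landmark list is assumed to have been obtained already.

```python
# Index ranges of the 68-point annotation scheme (0-based, end exclusive).
LANDMARK_GROUPS = {
    "jaw": (0, 17),
    "right_eyebrow": (17, 22),
    "left_eyebrow": (22, 27),
    "nose": (27, 36),
    "right_eye": (36, 42),
    "left_eye": (42, 48),
    "mouth": (48, 68),
}


def split_landmarks(points):
    """Group a list of 68 (x, y) tuples by facial component."""
    if len(points) != 68:
        raise ValueError("expected 68 landmark points")
    return {name: points[start:end] for name, (start, end) in LANDMARK_GROUPS.items()}
```

Each eye receives six points, which is the count required by the eye aspect ratio, and the mouth receives twenty (outer and inner lip contours).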
3.4 Eye and Mouth Detection
Eye Aspect Ratio (EAR)
$$\mathrm{EAR} = \frac{\lVert p_2 - p_6 \rVert + \lVert p_3 - p_5 \rVert}{2\,\lVert p_1 - p_4 \rVert} \tag{1}$$
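Equation (1) translates directly into a few lines of Python. The sketch assumes the six eye landmarks are supplied as (x, y) tuples in the order p1 to p6, with p1 and p4 at the eye corners; the function name `eye_aspect_ratio` is chosen for illustration.

```python
from math import dist


def eye_aspect_ratio(eye):
    """Compute the EAR of Eq. (1) from six (x, y) eye landmarks ordered p1..p6."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)   # two upper-to-lower eyelid distances
    horizontal = dist(p1, p4)                # corner-to-corner eye width
    return vertical / (2.0 * horizontal)
```

For a fully closed eye the two vertical distances shrink toward zero, so the ratio approaches zero regardless of how large the eye appears in the image.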
Now the eye regions and the mouth region are extracted along with their coordinates. Following extraction, the eye aspect ratio is computed for the eyes and the mouth aspect ratio for the mouth area, as described in this section. Blinks can be detected by using the pre-trained predictor and detector functions of Dlib and OpenCV to measure the EAR [8]. Equation (1) can be used to calculate the EAR from the eye landmark positions provided by
Fig. 1 Face detection
OpenCV. An abrupt fall of the EAR value below a predefined threshold can be used for microsleep and blink detection. Figure 2 shows the eye landmarks. In the mouth aspect ratio, (p1, p2, p3, p4, p5, p6) are the facial landmarks of the driver’s mouth. The distance between vertical mouth landmarks is computed in the numerator, while the distance between horizontal mouth landmarks is computed in the denominator, with the denominator weighted appropriately because there is only one pair of horizontal points but three pairs of vertical points. When the mouth is closed, the mouth aspect ratio is almost zero, and when the mouth is open, the mouth aspect ratio increases slightly
Fig. 2 Eye landmarks for open and closed eye
Fig. 3 Mouth aspect ratio (MAR)
Fig. 4 MAR detection
[15]. When the mouth aspect ratio is significantly higher, the mouth is clearly wide open, most probably for yawning, as shown in Fig. 3.
Advantages
1. Informs the driver about tiredness and the possibility of microsleep.
2. Paying attention to the alert sounds can help drivers avoid fatigue-related collisions.
Figure 4 shows the MAR detection process.
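The text gives no explicit formula for the mouth aspect ratio, so the sketch below is one plausible reading of the description (three vertical pairs over one horizontal pair, with the denominator weighted by three), not the authors’ definition. It assumes eight inner-lip points ordered as in Dlib’s inner-lip indices 60 to 67; the function name and the point ordering are assumptions.

```python
from math import dist


def mouth_aspect_ratio(mouth):
    """MAR from eight inner-lip (x, y) points ordered around the mouth.

    Assumed ordering (hypothetical, mirrors Dlib inner-lip indices 60-67):
    m[0] left corner, m[1..3] upper lip, m[4] right corner, m[5..7] lower lip.
    """
    m = mouth
    vertical = dist(m[1], m[7]) + dist(m[2], m[6]) + dist(m[3], m[5])
    horizontal = dist(m[0], m[4])
    return vertical / (3.0 * horizontal)   # three vertical pairs, one horizontal pair
```

Under this definition a closed mouth gives a value near zero, and the value grows as the lips part, matching the behavior described above.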
3.5 General Architecture Because of the extremely high complexity involved, the available statistics cannot completely account for deaths caused by tiredness; accidents induced by driver fatigue may therefore be more damaging than the numbers reflect. Robust systems that identify driver fatigue and inform the driver are thus required to mitigate these kinds of accidents (Fig. 5).
Fig. 5 Architecture diagram
3.6 Design Phase This phase describes the sequence of steps needed for successful execution. The video is divided into frames using OpenCV functions, and Dlib’s machine learning face detection algorithm is applied to each frame [16]. Then eye and mouth detection, based on the iBUG 300-W dataset, is carried out. Finally, when the driver is in a sleepy condition, an alarm is raised by the detection process, based on the decision of the pre-trained model.
3.7 Module Description Modules involved in the system are as follows.
3.7.1 Object Detection
Object detection is a technique for discovering and detecting the presence of objects belonging to a specific class. It can also be thought of as an image analysis method for detecting an object in a set of images. There are various methods for classifying and locating objects in a frame. One approach could be based on color recognition. However, because multiple different-sized objects of a similar hue could be present,
this is not an effective means of detecting the object [17]. As a result, Viola and Jones created Haar-like features, which were based on a proposal by Papageorgiou et al. in 1998. Haar-like features are digital image features employed in object detection. In other words, they are rectangle-shaped light and dark patches whose pattern resembles the features of a face. The cascade classifier is made up of several stages, each of which contains a large number of weak classifiers. The system finds objects by sliding a window across the full image and combining the weak classifiers into a strong classifier. Each stage’s result is labeled as negative or positive, with positive indicating that the desired object was found and negative indicating that it was not.
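A Haar-like feature of the kind described above can be evaluated cheaply with an integral image (summed-area table). The sketch below is a simplified pure-Python illustration with hypothetical function names, using a two-rectangle feature whose upper half plays the role of the black rectangle and whose lower half plays the role of the white one; it is not the OpenCV implementation.

```python
def integral_image(img):
    """Return a summed-area table with one extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Sum under the upper ("black") half minus the sum under the lower ("white") half."""
    half = h // 2
    black = rect_sum(ii, x, y, w, half)
    white = rect_sum(ii, x, y + half, w, half)
    return black - white
```

Once the table is built, every rectangle sum costs four lookups, which is why a detector can afford to evaluate many such features per window.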
3.7.2 Face Detection
A face is one type of object. As a result, face detection can be treated as a special case of object detection. The first step is to determine where the objects in the image of interest are located and what size they are, in order to decide whether they belong to a specific class. Research on face detection algorithms has mostly focused on detecting the frontal face. However, newer algorithms concentrate on more general cases. In our situation, these could be a tilted face or any other view of the face, and they also consider the possibility of multiple faces [18]. This may be handled by accounting for rotation about the vertical axis. In this newer type of algorithm, the video or image is regarded as a variable, which means that changing factors in it, such as color contrast, can alter the result. The amount of light in the room can also have an impact. The output can also be affected by the position of the input [5]. In many algorithms, the face detection problem is recast as a two-class pattern classification task. That is, the content of the image of interest is repeatedly converted into features, on which a classifier trained on standard faces decides whether a given region is a face or some other object. If a face is detected, the procedure moves on to the next stage; otherwise, the program keeps capturing images until a face candidate is identified. The Viola–Jones technique is the key algorithm employed in this process. The cascade classifier of OpenCV is used to obtain the output. The OpenCV cascade file comprises 24 stages and 2913 weak classifiers. Its window is initially 24 × 24 pixels in size. The initial scale was fixed to 1.0, and the step of each scale was fixed to 1.0, as was the location step size. The total number of scales used is 32, resulting in a total of over 1.8 million candidate windows. Because the cascade was trained with OpenCV, it is simple to use [19].
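The stage-by-stage rejection that makes a cascade practical over so many windows can be sketched as below. This is a simplified illustration of the idea, not OpenCV’s internal implementation: the function name, the data layout of the stages, and the use of a plain weighted vote are all assumptions.

```python
def cascade_accepts(window_features, stages):
    """Return True only if the window passes every stage of the cascade.

    window_features: dict mapping a feature name to its value for this window.
    stages: list of (weak_classifiers, stage_threshold), where each weak
            classifier is a tuple (feature_name, feature_threshold, weight).
    """
    for weak_classifiers, stage_threshold in stages:
        score = sum(
            weight
            for name, feature_threshold, weight in weak_classifiers
            if window_features[name] > feature_threshold
        )
        if score < stage_threshold:
            return False      # rejected early; later stages are never evaluated
    return True
```

Most windows contain no face and are discarded by the first few stages, so only a small fraction of the candidate windows ever reaches the later, more expensive stages.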
3.7.3 Eye Detection
Poor illumination of the eye region causes a number of complications for detection. Once the face has been successfully detected, the eye must be located for further
analysis. In our technique, the eye is the deciding factor for determining the driver’s condition. Though detecting the eye does not appear to be difficult, the process is rather involved. Here, image features are used to detect the eye within the defined region. In most cases, the eigen approach is used in this procedure, and it is a lengthy one. When eye detection is completed, the result is compared with a threshold or reference value to determine the driver’s status. There are two types of eye detection: eye contour detection and eye position detection. The detection of eyes is based on the assumption that the eyes are darker than the rest of the face. As a result, Haar-like features may be moved around the upper half of the face to match the appearance of the eye, leading to the eye’s location. The non-skin regions inside the face region are considered potential eye locations. The eyes should lie inside the face region, and a skin detector does not label the eyes as skin. As a result, eye-like candidates must be found among a small number of possible eye regions. Many eye detection techniques have been developed in recent times. One of the most common approaches for detecting the human eye is to use a deformable template. In this procedure, an eye model is created first, and then an iterative approach is used to determine the eye location. However, this technique relies heavily on the initial position of the eye, which must be close to the real position. As for template matching, the proposed technique uses neural networks and eigenfeatures to extract eyes from gray-level feature vectors via rectangular fitting. This approach, which uses a sliding window and eigenfeatures, does not require a large number of training pictures. However, if the user wears glasses or has a beard, this method will fail.
It is known that employing Haar features with AdaBoost improves computing efficiency and accuracy compared to other face detection techniques. However, the Haar feature has a drawback: limited discriminant power. Even though Haar features come in a variety of sizes, patterns, and placements, they can only represent rectangular patterns. In eye detection, however, both the eye and the iris are round. Eyes can therefore be better represented by learning discriminative features that characterize eye patterns. As a result, using a probabilistic classifier to distinguish between non-eyes and eyes is a far better alternative in terms of accuracy and robustness.
3.7.4 State of Eye
At this point, the true eye state is determined: open, semi-closed, or closed. Determining the state of the eyes is the most significant prerequisite. It is accomplished using an algorithm that will be explained in further detail in the next sections. If the eyes are semi-open, or open only up to a certain threshold value, the device sends out an alert message. If the system recognizes that the eyes are wide open, the procedure continues until the system identifies a closed eye.
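One simple way to obtain the three eye states named above is to threshold the eye aspect ratio at two levels. The sketch below assumes this approach; the function name and both threshold values are placeholders chosen for illustration and would need tuning on real data.

```python
def eye_state(ear, closed_below=0.20, open_above=0.30):
    """Map an eye aspect ratio to 'closed', 'semi-closed', or 'open'.

    Both thresholds are placeholder values chosen for illustration.
    """
    if ear < closed_below:
        return "closed"
    if ear < open_above:
        return "semi-closed"
    return "open"
```

The returned label could then feed an alert rule that fires when the state stays at "closed" or "semi-closed" for too long.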
4 Implementation and Testing The input to the project is video of the driver, in which sleep and drowsiness are detected from the aspect ratios of the driver’s facial features. The output is an alarm sound to wake the driver and an on-screen alert when the driver is sleeping. Figure 6 describes the input data.
4.1 Input Design Figure 7 shows the drowsy face of the driver. With input data as shown in Fig. 6, the drowsiness of the driver will be detected.
5 Results and Discussions To detect the drowsiness of drivers, the most important element is a reliable system that monitors the driver and determines whether he or she is drowsy. Even though drowsiness is a concept understood by anyone, quantifying it is a very complex task. This work proposes a drowsiness monitoring system that uses face detection to reach the goal of a reliable and robust system. The proposed approach uses the Dlib open-source library and OpenCV to perform face detection. Python is the programming language used to implement the concept. A camera is employed in this system to detect the driver’s facial landmarks and eye movement in real time. The captured face is cropped to isolate a single eye. After the face is first detected, the lower and upper eyelids are extracted by estimating the eye region in each frame. The system recognizes eye blinks, that is, the closing and opening of the driver’s eyes, using the driver’s eyelids. When a driver’s eyelids remain closed for a certain amount of time, the driver is assumed to be drowsy, and an audible alert is sounded to rouse the driver. When the driver yawns, the mouth area is extracted and the mouth aspect ratio is determined. The driver receives a warning if the mouth aspect ratio exceeds the threshold value.
5.1 Comparison of Existing and Proposed System In this section, the proposed approach based on the Dlib open-source library and OpenCV is compared with pre-existing algorithms for detecting driver fatigue, to assess whether the proposed model detects facial motion correctly. The comparison is based on the source images, through which the real-time performance of the proposed system
Fig. 6 Input when the driver is awake
Fig. 7 Face detection (drowsy face)
Table 1 Comparative analysis of fatigue detection models
Algorithm        Speed (ms/f)   Accuracy (%)
Algorithm 1      34.62          91.42
Algorithm 2      45.19          92.47
Proposed model   49.45          94.38
can be analyzed. The results show that the proposed algorithm achieves high accuracy together with high speed. The accuracy of the proposed model depends on the parameters that describe the driver’s drowsiness and fatigue state. The proposed model is compared with convolutional neural networks combined with the DF-long short-term memory algorithm (Algorithm 1). The second algorithm used for the comparative analysis is convolutional neural networks combined with AdaBoost (Algorithm 2). Compared to the existing algorithms for driver fatigue detection, the proposed approach based on the Dlib open-source library and OpenCV is more accurate, has better real-time performance, and meets the demands of fatigue detection, as shown in Table 1.
6 Conclusion and Future Enhancements Drowsiness during driving is a common problem that results in thousands of deadly accidents each year. One of the most pressing needs of the day is a way
to avoid deaths and accidents. Others have built complicated systems for identifying tiredness in drivers; this study demonstrates a simpler, yet highly effective technique for doing so. In this project, a drowsy driver warning method is modeled using Python and the Dlib framework. The positions of the facial landmarks in the input video are mapped using Dlib’s shape predictor, and the driver’s tiredness is recognized by tracking the aspect ratios of the mouth and eyes. The proposed functionality was evaluated on videos from a typical public database as well as on real-time video. This mechanism operates based on the driver’s current state. That is, the motorist must pay attention to the detection system and the alert signal before slowing down or returning to proper driving. Alternatively, a model might be created that slows down the car once the driver’s drowsiness is detected. The input parameters would then include not only sleepiness but also exhaustion. The stress level, psychological condition, and cardiovascular parameters would all contribute to this measure of exhaustion. Once these findings are integrated to produce a binary yes-or-no answer to the question of whether the driver is fit to drive, the car’s hydraulic mechanisms will adapt instantly. The proposed model yields an accuracy of 94.38%, higher than that of the other pre-existing algorithms, and the speed of the model was found to be 49.45 ms/f.
6.1 Future Scope Another approach, which avoids the overhead of an embedded system, is to create a mobile application that can be mounted on the dashboard of the car. It would perform the same task by taking pictures with the phone’s front camera, much as the present combination of hardware and code does. The benefit of doing so is that the app will be held accountable for its energy use and performance. It is more difficult to create advanced technology than it is to create smart hardware.
References
1. E. Aidman, C. Chadunow, K. Johnson, J. Reece, Real-time driver drowsiness feedback improves driver alertness and self-reported driving performance. Accid. Anal. Prev. 81, 8–13 (2015)
2. E.E. Galarza, F.D. Egas, F.M. Silva, P.M. Velasco, E.D. Galarza, Real time driver drowsiness detection based on driver’s face image behavior using a system of human computer interaction implemented in a smartphone, in International Conference on Information Technology and Systems (Springer, Cham, 2018), pp. 563–572
3. A. Khan, B. Rinner, A. Cavallaro, Cooperative robots to observe moving targets. IEEE Trans. Cybernet. 48(1), 187–198 (2016)
4. I. Gupta, N. Garg, A. Aggarwal, N. Nepalia, B. Verma, Real-time driver’s drowsiness monitoring based on dynamically varying threshold, in 2018 Eleventh International Conference on Contemporary Computing (IC3) (IEEE, 2018), pp. 1–6
5. J.Y. Wong, P.Y. Lau, Real-time driver alert system using raspberry Pi. ECTI Trans. Electr. Eng. Electron. Commun. 17(2), 193–203 (2019)
6. Y. Liu, T. Zhang, Z. Li, 3DCNN-based real-time driver fatigue behavior detection in urban rail transit. IEEE Access 7, 144648–144662 (2019)
7. M. Gromer, D. Salb, T. Walzer, N.M. Madrid, R. Seepold, ECG sensor for detection of driver’s drowsiness. Proc. Comput. Sci. 159, 1938–1946 (2019)
8. B.K. Savaş, Y. Becerikli, Real time driver fatigue detection system based on multi-task ConNN. IEEE Access 8, 12491–12498 (2020)
9. A.Z. Mohammed, E.A. Mohammed, A.M. Aaref, Real-time driver awareness detection system. IOP Conf. Ser. Mater. Sci. Eng. 745(1), 012053 (2020)
10. A. Majumder, L. Behera, V.K. Subramanian, Automatic facial expression recognition system using deep network-based data fusion. IEEE Trans. Cybernet. 48(1), 103–114 (2016)
11. S. Mehta, S. Dadhich, S. Gumber, A. Jadhav Bhatt, Real-time driver drowsiness detection system using eye aspect ratio and eye closure ratio, in Proceedings of International Conference on Sustainable Computing in Science, Technology and Management (SUSCOM) (Amity University Rajasthan, Jaipur, India, 2019)
12. T. Mordan, N. Thome, G. Henaff, M. Cord, End-to-end learning of latent deformable part-based representations for object detection. Int. J. Comput. Vision 127(11), 1659–1679 (2019)
13. C.B.S. Maior, M.J. das Chagas Moura, J.M.M. Santana, I.D. Lins, Real-time classification for autonomous drowsiness detection using eye aspect ratio. Exp. Syst. Appl. 158, 113505 (2020)
14. P. Pattarapongsin, B. Neupane, J. Vorawan, H. Sutthikulsombat, T. Horanont, Real-time drowsiness and distraction detection using computer vision and deep learning, in Proceedings of the 11th International Conference on Advances in Information Technology (2020), pp. 1–6
15. M.C. Shin, W.Y. Lee, A driver’s condition warning system using eye aspect ratio. J. Korea Inst. Electron. Commun. Sci. 15(2), 349–356 (2020)
16. B. Suri, M. Verma, K. Thapliyal, A. Manchanda, A. Saini, DDYDAS: driver drowsiness, yawn detection and alert system, in Proceedings of 3rd International Conference on Computing Informatics and Networks: ICCIN 2020 (Springer Singapore, 2021), pp. 221–231
17. H. Wang, A. Dragomir, N.I. Abbasi, J. Li, N.V. Thakor, A. Bezerianos, A novel real-time driving fatigue detection system based on wireless dry EEG. Cogn. Neurodyn. 12(4), 365–376 (2018)
18. J.S. Wijnands, J. Thompson, K.A. Nice, G.D. Aschwanden, M. Stevenson, Real-time monitoring of driver drowsiness on mobile platforms using 3D neural networks. Neural Comput. Appl. 1–13 (2019)
19. M. García-García, A. Caplier, M. Rombaut, Sleep deprivation detection for real-time driver monitoring using deep learning, in International Conference Image Analysis and Recognition (Springer, Cham, 2018), pp. 435–442
20. M.Y. Hossain, F.P. George, IOT based real-time drowsy driving detection system for the prevention of road accidents, in 2018 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), vol. 3 (IEEE, 2018), pp. 190–195
21. R. Jabbar, K. Al-Khalifa, M. Kharbeche, W. Alhajyaseen, M. Jafari, S. Jiang, Real-time driver drowsiness detection for android application using deep neural networks techniques. Proc. Comput. Sci. 130, 400–407 (2018)
22. R. Jabbar, M. Shinoy, M. Kharbeche, K. Al-Khalifa, M. Krichen, K. Barkaoui, Driver drowsiness detection model using convolutional neural networks techniques for android application, in 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT) (IEEE, 2020), pp. 237–242
23. A. Kumar, R. Patra, Driver drowsiness monitoring system using visual behaviour and machine learning, in 2018 IEEE Symposium on Computer Applications and Industrial Electronics (ISCAIE) (IEEE, 2018), pp. 339–344
24. E.E.B. Adam, Evaluation of fingerprint liveness detection by machine learning approach—a systematic view. J. ISMAC 3(01), 16–30 (2021)
25. G. Ranganathan, Real life human movement realization in multimodal group communication using depth map information and machine learning. J. Innov. Image Process. (JIIP) 2(02), 93–101 (2020)
Comparative Analysis of Contemporary Network Simulators Agampreet Kaur Walia, Amit Chhabra, and Dakshraj Sharma
Abstract Network simulations are a popular methodology for the testing and study of the behavior of network systems under specific situations and with specific inputs without having to put the actual corresponding system at risk. These network simulations, carried out through simulation software, tend to be much more realistic and accurate in their outcomes compared to basic mathematical or analytical models. Today, there are several competitive as well as feature-rich network simulators available in the market for researchers, each with its own strengths and merits. When deciding on a simulator to utilize to satisfy their purpose, researchers may be led to compare the various positives and negatives of one network simulator over another. Our work intends to aid researchers in such a comparison, by providing an overview of the various features of some of the most popular network simulators available for use in the research community. Keywords QualNet · PSIM · GrooveNet · PEERSIM · MATLAB · MININET
1 Introduction Network simulation techniques utilize calculations and estimates of the various interactions that occur over a network in real time to achieve quantifiable analyses of the network’s performance under any set of circumstances and conditions, without needing to expose the network itself to any potential risks. A good number of powerful network simulators exist in the market today to aid research in networking and the testing of computer networks. Many of these are capable of modeling not only traditional wired networks but also cutting-edge technologies such as Wi-Fi and 5G networks.
A. K. Walia (B) · A. Chhabra · D. Sharma Chandigarh College of Engineering and Technology, Sector-26, Chandigarh, India A. Chhabra e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_27
A. K. Walia et al.
Some of these simulators are open-source software, while others are proprietary, that is, commercial licenses need to be purchased to use them legally [1]. Whether network simulator software is open-source or proprietary is important for several reasons, because each type of software has a number of advantages over the other. Open-source simulators have source code which is available for the community to use and modify. This means that anyone with the ability to program, and with suitable knowledge in the domain, can contribute to the software by adding new functionality, fixing existing issues and bugs, or simply improving the available documentation. Thus, network simulators developed by the open-source community tend to implement the latest features developed by contributors and are widely adopted by the research community overall. This is seen for popular open-source software like NS-3. However, the absence of firm oversight can sometimes lead to flawed or incomplete documentation, lack of support for issues and bugs, etc., which may be fine for casual users but are important factors to consider when software is to be used in commercial settings with strict deadlines and goals. For such settings, commercial software like the QualNet network simulator may sometimes be preferred. Commercial software tends to have better support response times and more thorough documentation owing to the commercial organization(s) that oversee the creation and maintenance of the software. However, because licensing fees keep their user bases relatively small, these packages may sometimes lag behind open-source alternatives in support for the latest cutting-edge technologies. Further, the commercial organization maintains complete control over how the software may be used and what may be considered a breach of license. Such guidelines may sometimes be considered unwarranted by some users.
2 Study Outline

The content in this work is laid out as follows. The section ‘Related Works’ provides a literature overview of past and recent works in the same domain that cover subjects similar to our work. The section ‘Overview of Popular Network Simulators’ describes powerful and important features of some of the most popular network simulator software available for use in the industry. In the next section, ‘Comparison between Network Simulators’, we attempt to objectively compare these network simulators based on several parameters such as cross-OS compatibility, ease of use, and scalability. The section ‘Conclusion’ concludes the work.
3 Related Works

A number of surveys and research works have aimed to help scientists choose the network simulation tool that best fits their needs. In Sarkar et al. [1], different network simulators are studied on the basis of their implementation mode,
routing protocols supported, and network impairments. Different aspects of simulators are discussed, and a comparison based on their relative popularity and support is presented. This work also discusses some software that has not yet become established in the community or is still in the relatively early stages of development. Toor and Jain [2] analyze some of the popular wireless network simulators available today. The authors also provide an overview of the taxonomy of simulations. They survey various simulation tools, including open-source ones, along with their key features and limitations, and present a comparative study. Gayathri et al. [3] provide a comparative analysis of some of the popular network simulation tools based on their applications relative to other tools in the same domain. The study focuses on discussing a limited number of network simulators in depth, rather than providing a broad overview of the many choices available to researchers today.
4 Overview of Popular Network Simulators

This section gives a brief overview of the most important and distinct features found in some of the more popular network simulators available today.

OPNET

Optimized network engineering tools (OPNET) [2] is a commercial and efficient C++-based network simulator which is available free of cost to researchers in universities and college programs. OPNET is notable for its power and versatility, which allow it to be used to study communication networks, devices, protocols, and applications. OPNET operates using a hierarchical modeling environment, arranging its structure into network, node, and process domains. Powerful GUI tools provide a good user experience and some advanced functionality, such as the graphical editor, which can be used for setting up various test setups in terms of topologies and networking layers, and GUI-based debugging and analysis. It can be used to model software as well as hardware technologies and supports wired as well as wireless network models. OPNET provides a sizable library of protocols for use in simulations. However, it is limited in these protocols’ customizability and does not allow users to implement custom protocols. OPNET’s powerful tools provide a highly capable source code editing environment, as well as editors for modeling networks, nodes, processes, and packet formats. Powerful simulation abilities and analysis tools along with a GUI-friendly animation viewer make OPNET a very capable tool.

GloMoSim

Global mobile information system simulation library (GloMoSim) is another discrete event simulation software. It is written in an extension of C, called parallel simulation environment for complex systems (PARSEC), to gain advantages in parallel computing [4]. The overall software architecture is designed as a set of library modules, each of which simulates particular protocols in the protocol stack. While the
software was created with the vision of supporting the simulation of both wired and wireless frameworks and technologies, it is currently largely limited to the modeling of wireless components. It has good scalability and is, thus, popular for simulations that analyze the behavior of large-scale hybrid networks. This simulator can also be used for mesh and opportunistic networks. Some of the unique points of the software are as follows:

• It can support different routing protocols over multihop wireless scenarios [4].
• It can be used in parallel environments and has parallel discrete event simulation capacity owing to its PARSEC implementation [5].
• It offers great scalability that can support simulation and modeling of thousands of nodes.
• It offers standard APIs for communication with library modules.
• It also has a user-friendly visualization tool.
• User-defined libraries can be added.

QualNet

QualNet is one of the leading software packages for the design, simulation, and analysis of large-scale, complex networks. It is a commercial product that has all the features of its predecessor, GloMoSim. The QualNet platform comprises the QualNet scenario designer, an animator for visualization and analysis, the QualNet protocol designer for custom protocol implementations, an analyzer for statistical analysis, and the packet tracer for generating trace files related to packet transfers. QualNet models are made up of nodes and the links connecting them [1]. Nodes represent network endpoints, and connecting links may be modeled as wireless or wired networks. QualNet is available in two main versions, QualNet developer and QualNet runtime, with varying features. Notable features available in the former are the visualize mode, the designer mode, and the analyzer tool. This array of tools makes QualNet a viable choice for nearly every use case.
As an industry leader, QualNet is robust for wired, wireless, and hybrid networks and is widely used in applications where high-fidelity networks need to be modeled for analysis, as well as in large and heterogeneous environments.

NetSim

The NetSim simulator can model and simulate different scenarios such as Ethernet and wireless LAN, as well as a good number of network protocols and devices such as WLANs, Ethernet, and asynchronous transfer mode switches [4]. It is a stochastic, discrete event network simulation tool available in both commercial and academic versions. The main characteristic of this simulator is that its packages can be run on various operating systems. NetSim is, however, mostly used in academic environments. It supports an integrated development environment and analyzes network performance metrics at various levels of abstraction depending on the needs of the user. Some of the distinguishing features of NetSim are as follows:
• It can be used to simulate a corporate network topology and to help with troubleshooting without using network devices.
• It also has a GUI which enables drag-and-drop functions for devices and links.
• NetSim’s internal animator is used to visualize data packet and control packet flow.
• An advanced analysis framework allows for informative comparisons between the performance of various protocols.

MATLAB

MATLAB, or Matrix Laboratory, is a commercial product provided by MathWorks. Rather than being merely a network simulator, MATLAB is a more general-purpose software package which is also widely used for network simulation. It provides an extremely powerful and interactive software environment for programming. The software is built with the C programming language. It also provides an integrated CLI as well as several domain-specific tools which can be used for measuring, analyzing, and visualizing data [2]. It is highly extensible through ‘toolboxes’ aimed at specific uses such as digital image processing, machine learning, and deep learning. It supports a number of technical computing domains. MATLAB is highly popular among the research community. Some of the distinguishing features of MATLAB are enumerated below:

• It includes built-in graphics to help in tasks such as data visualization.
• It has an interactive environment for designing different scenarios.
• Libraries written for .NET and Java can be easily integrated. It provides numerical computation and visualization of data.
• It provides tools for improving code quality and maximizing performance.
• It provides a coding environment that allows custom implementations of algorithms or any other techniques.
• MATLAB can be used for executing experiments with a real-time network using a Linux router [6].
• MATLAB has a predictive maintenance toolbox that lets you manage sensor data, design condition indicators, and estimate the remaining useful life (RUL) of a machine [7].
NS-2

NS-2 refers to network simulator version 2, which was developed at the Lawrence Berkeley laboratory at the University of California as part of a virtual inter-network testbed project [2]. Over the years, it became one of the most widely adopted network simulation tools owing to its flexibility as well as its modular approach to extensibility. NS-2 is built with C++ and the object-oriented tool command language (OTcl). The simulator core is composed of C++ objects, whereas OTcl is used by users for controlling simulations and events. NS-2 is cross-platform and supports modeling of a wide array of wired and wireless network components, as well as a number of networking protocols
such as TCP and UDP. Being completely open source and free of cost makes it a very appealing choice for researchers. Some of the more important features of NS-2 are as follows:

• It can be used to design parallel as well as distributed simulations, and it produces useful output for simulation design [2].
• It provides an event scheduler which tracks the occupancy and release of events during scheduling.
• It supports traffic distribution modeling.
• NS-2 supports animation and plotting of simulation results, which can be very helpful in visualizing results and comparisons [4].
• The open-source code of NS-2 is modifiable.
• It has been used to evaluate improvements in the lifetime of a network and in the packet delivery rate [8].

NS-3

NS-3 is another popular network simulation software used primarily for education and business purposes. The simulator is built using C++ with optional Python bindings. NS-3 provides a complex and rich library for the simulation of wireless and wired technologies. NS-3 is the successor to NS-2 and has several additional features and improvements over its predecessor. It can also act as an emulator. Note that NS-3 is not backward compatible with NS-2, so existing NS-2 scripts must be ported rather than reused directly. NS-3 has a modular and well-documented core. NS-3 is generally more efficient to use and has optimized computation and memory requirements for models [9]. Some more of its well-appreciated features are as follows:

• It supports integration with other open-source software, thus reducing the need to rewrite simulation models. It has a rich library of extensions and models developed by the community which users can make use of.
• It can be used to run and analyze large-scale network simulations very efficiently. The protocol models are designed to closely reflect realistic scenarios, computations, and devices.

OMNET++

Publicly available since 1997, OMNET++ is an extensible, modular, discrete event simulation software.
Although it can successfully model complex IT systems, multiprocessors, and distributed hardware architectures, it is more often used for simulating computer networks, both wireless and wired [10]. OMNET++ is highly popular among the teaching community and is commonly used for educational and research purposes [2]. Components and modules are programmed in C++ and then assembled into larger components and models using a high-level language (NED). Owing to its highly modular architecture and working principles, OMNET++ supports high reusability of models. This simulator has first-class
GUI support. OMNET++ is, however, relatively slow and resource-hungry and has also been criticized for being somewhat difficult to use. Yet, it remains popular in the community. Some of its major features are listed below:

• Useful GUI with tools for data plotting and analysis.
• Well-structured, modular architecture such that modules can be reused and combined in different ways.
• Object inspectors help inspect the state of network components throughout the duration of experiments.
• Can accurately model the characteristics of a wide variety of physical hardware phenomena.
• It supports ad-hoc networks and wireless sensor networks with implementations of different routing protocols.

J-Sim

J-Sim is an open-source Java-based network simulator in which different components communicate with each other by transferring data through ports. It is mainly used for modeling of quantitative concepts. J-Sim allows us to design, implement, and test single components individually. Because it is written in Java, it works across multiple operating systems. Its component-based design means that components can be added or exchanged easily for existing ones. This network simulator’s models can support integrals and differential equations as required, hence providing a great amount of flexibility to researchers. J-Sim’s model compiler also offers advanced implicit mathematical support. Some of the appealing features of J-Sim are:

• J-Sim provides a reusable and platform-independent environment [11].
• Support for a powerful scripting interface that allows executing scripts in Python, Perl, etc.
• Can support Web-based simulation.
• Offers a dual-language environment through which we can do online monitoring or auto-configuration remotely.
• Powerful pluggable component architecture supports component manipulation even during execution of simulations.
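The component-and-port idea described for J-Sim can be illustrated with a minimal sketch. The code below is written in Python for brevity and is not J-Sim's actual API: the class names `Port`, `Source`, and `Sink` are hypothetical and chosen only for illustration. It shows how components that interact solely through ports can be tested individually and swapped without changing the components they talk to.

```python
from typing import Any, Callable, List


class Port:
    """Hypothetical port: data sent on it reaches every connected peer port."""

    def __init__(self, name: str, on_receive: Callable[[Any], None]) -> None:
        self.name = name
        self._on_receive = on_receive      # callback owned by the component
        self._peers: List["Port"] = []

    def connect(self, other: "Port") -> None:
        # Wire the two ports together in both directions.
        self._peers.append(other)
        other._peers.append(self)

    def send(self, data: Any) -> None:
        # Hand the data to the component behind each connected port.
        for peer in self._peers:
            peer._on_receive(data)


class Source:
    """Component that only emits data through its output port."""

    def __init__(self) -> None:
        self.out = Port("out", on_receive=lambda data: None)  # ignores input

    def emit(self, payload: str) -> None:
        self.out.send(payload)


class Sink:
    """Component that records whatever arrives on its input port."""

    def __init__(self) -> None:
        self.received: List[str] = []
        self.inp = Port("in", on_receive=self.received.append)


source, sink = Source(), Sink()
source.out.connect(sink.inp)
source.emit("packet-1")
source.emit("packet-2")
print(sink.received)  # ['packet-1', 'packet-2']
```

Because `Source` knows nothing about `Sink` beyond the port it is wired to, a different receiving component could be connected in its place without modifying `Source`, which is the kind of exchangeability the component-based design aims for.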
GrooveNet

GrooveNet is an open-source hybrid simulator that combines a mobility generator and a network simulator and in which simulated and real vehicles can communicate with each other in a hybrid network [12]. It can support vehicular ad-hoc networks with a large number of vehicular nodes [13]. The communication can be of different categories, i.e., vehicle to infrastructure (V2I) and vehicle to vehicle (V2V). It has a powerful graphical interface that enables auto-generation of simulation scenarios. Although GrooveNet attempts to fit a niche category of network simulations, it is a powerful tool
for researchers if it fits their use case. Some of the important features of GrooveNet are:

• Great support for hybrid simulations at scale.
• It aspires to minimize handshaking between sending and receiving parties of neighboring vehicles.
• The simulator supports multiple rebroadcast policies.
• Easily extensible via the addition of new network models and protocols.

TraNS

Traffic and network simulation environment (TraNS) attempts to integrate traffic and network simulators, generating life-like simulations for vehicular ad-hoc networks [4]. The simulator focuses on producing simulations that deviate as little as possible from real-life conditions. The advanced design of TraNS allows complicated simulations, such as allowing signals from networks to affect the movement and state of the vehicles themselves. TraNS also supports real mobility models to change the behavior of the traffic simulator based on the communication between vehicles. The powerful and intuitive GUI allows users to quickly set up modeling scenarios and required parameters. Some special features of TraNS are highlighted below:

• Support for Google Earth visualization along with map cropping and speed rescaling to allow realistic simulations without much complexity for the end user. TraNS is an ideal tool for the development of frameworks for vehicular ad-hoc network (VANET) applications.
• Supports automated generation of vehicle routes, both random and flow-based.

SSFNet

Scalable simulation framework network (SSFNet) is an event-based network simulator which implements the SSF model in C++ and Java. A unique domain modeling language (DML) is used for modeling simulation networks. The use of multithreading allows SSFNet to support parallelization of queues and simulation tasks, providing appreciable performance at scale.
Different physical entities in a network scenario, such as hosts, routers, and links, can be represented by Java classes or Java packages [14]. Some of the key features of SSFNet are as follows:

• It provides a fully integrated network environment in which different components operate in integrated mode to generate simulation results [15].
• Through its domain modeling language, SSFNet provides a simplified syntax for high-level model description.
• This simulator can make efficient use of powerful multiprocessor machines to boost performance.
• It is also used in the cybersecurity domain [16].
TOSSIM

TOSSIM is a simulator for TinyOS sensor networks which can generate different scenarios for network interfacing. It is highly extensible and can scale to simulate networks of thousands of nodes. The GUIs provide detailed visualization and actuation in real time for running simulations, making it good for analysis. Different applications may connect over TCP to modify and update the network simulations and topologies in real time [17]. TOSSIM supports models that can be compiled directly from TinyOS code. Features of TOSSIM:

• It compiles TinyOS code directly.
• Allows complex simulations through the ability to modify the network and its components in real time during simulation.
• It enables cost-effective design and implementations.

DRMSim

The dynamic routing model simulator, or DRMSim, is a Java-based simulator used mainly for large-scale networks. The design philosophy of DRMSim was aimed at maximizing performance and extensibility. It offers in-depth insights into various routing schemes. It incorporates different routing models such as source routing, distance vector routing, and other internet-based routing techniques [18]. It allows the user to work on these routing schemes and to compare the simulation parameters in different scenarios on the basis of these techniques. Some of the main features of DRMSim are as follows:

• Supports storing and loading routing tables to allow simulations to be executed based on actual real-world conditions.
• Cross-platform support.
• It represents simulation operations as sequences of chronological events.

MININET

MININET is an open-source network emulator in which we can create nodes, switches, and links. The software is designed with the research and education community in mind. It allows researchers to simulate different components of the network in a real scenario. It provides users with a realistic virtual environment, so that what they develop can be deployed on real hardware [19].
It also provides a simple, cost-effective test bed for the development of OpenFlow applications, enabling concurrent yet independent development on the same topology. Some of the notable features of MININET are:

• Quick to load, cost- and resource-effective, and easily reconfigurable.
• MININET provides a user-friendly Python application programming interface for designing scenarios which may be integrated into custom solutions.
• Can perform testing without creating any physical network.
PEERSIM

PEERSIM is a peer-to-peer simulator which provides excellent support for running dynamic simulations. It supports both an event-based model and a cycle-based model. The event-based model performs scheduling of messages between peer nodes. The cycle-based model selects nodes in random order and, at each cycle, invokes the protocol of each node [20]. Peer-to-peer systems can be very large and include millions of nodes; those nodes join and leave the network continuously, and these characteristics are difficult to handle [21]. Since it is Java-based, PEERSIM can run across various platforms. It is a command-line simulator which provides class packages for generating statistical results. Some of the main features offered by PEERSIM are:

• Through graph abstraction, it provides the ability to treat overlay networks as graphs and can provide various initializers.
• Developers can make use of existing PEERSIM modules by tweaking them to their needs through simply writing a few lines of configuration.
• PEERSIM can be configured to use trace-based datasets when using the event-based engine, which allows for more realistic simulations [22].
• PEERSIM allows good portability of models by allowing user-defined entities to replace almost all predefined entities.

PSIM

The PSIM network simulation tool is one of the fastest simulators for power electronics. PSIM has been noted for its highly accurate simulations as well as its scalability. PSIM also supports multi-cycle simulation. Extension modules allow PSIM models to support niche-specific or particular needs such as motor drives, renewable energy, digital control, or DSP and FPGA support [23]. Through the module system, users have the flexibility to tailor and modify PSIM models according to their own needs. Users can study the results of a simulation displayed in Simview. Multiple screens and visualizations make the GUI very useful for research purposes.
Some of the main advantages offered by PSIM are:

• Ease of use.
• Efficient functioning and fast simulation at scale.
• PSIM provides a flexible control representation.
• GUI interface allows efficient creation of virtual network models as well as visualization of simulation results.
• Module system allows good extensibility and allows it to support dynamic scenarios like churn and other failure models if required.
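Several of the simulators surveyed in this section (for example GloMoSim, NetSim, OMNET++, and DRMSim) are described as discrete event simulators, in which a run is a chronological sequence of events. The following is a minimal sketch of that principle in Python, not the scheduler of any particular tool: the `Simulator` class, the `send_packet` helper, and the link delays are all hypothetical and chosen only for illustration.

```python
import heapq
import itertools
from typing import Callable, List, Tuple


class Simulator:
    """Hypothetical discrete event core: a clock plus a time-ordered event queue."""

    def __init__(self) -> None:
        self.now = 0.0
        self._queue: List[Tuple[float, int, Callable[[], None]]] = []
        self._counter = itertools.count()  # tie-breaker for events at equal times

    def schedule(self, delay: float, action: Callable[[], None]) -> None:
        if delay < 0:
            raise ValueError("events cannot be scheduled in the past")
        heapq.heappush(self._queue, (self.now + delay, next(self._counter), action))

    def run(self) -> None:
        # Repeatedly jump the clock to the earliest pending event and execute it.
        while self._queue:
            time, _, action = heapq.heappop(self._queue)
            self.now = time
            action()


log: List[Tuple[float, str]] = []
sim = Simulator()


def send_packet(packet_id: int, link_delay: float) -> None:
    # Sending schedules the matching receive event after the link delay.
    log.append((sim.now, f"send {packet_id}"))
    sim.schedule(link_delay, lambda: log.append((sim.now, f"recv {packet_id}")))


sim.schedule(0.0, lambda: send_packet(1, 2.0))  # slow link
sim.schedule(1.0, lambda: send_packet(2, 0.5))  # fast link
sim.run()

for timestamp, event in log:
    print(f"t={timestamp:.1f}  {event}")
```

In this sketch the second packet is received before the first even though it was sent later, because the clock jumps from event to event rather than advancing in fixed steps. Production simulators add much more on top of this core, such as protocol models, statistics collection, and in some cases parallel execution.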
5 Comparison Between Network Simulators

Table 1 attempts to capture a comparison between the various network simulators that have been discussed in the above section. It may be argued that, owing to the particular strengths of some of these software packages, and because some of them attempt to support very particular use cases only, it is not appropriate to attempt a direct comparison between them. However, we may try to compare these simulators objectively in order to help researchers narrow down the simulators that fit their needs; from there, the researchers may look further into those fewer choices to decide which product best supports their use cases.

Table 1 Comparison between network simulators

| S. No. | Name | Type | Language | Platform | Scalability | Limitations |
|---|---|---|---|---|---|---|
| 1 | OPNET | Free software license | C and C++ | Windows | Medium | (i) GUI system is complex (ii) Result accuracy is limited (iii) Inefficient simulation in case of long delays [4] |
| 2 | GloMoSim | Free software license | C | Windows, Linux, Sun SPARC Solaris | Large | (i) Documentation is poor (ii) PARSEC support for Redhat 7.2 is outdated (iii) Updating of this simulator is not regular [24] |
| 3 | QualNet | Commercial | C++ | Windows, MAC, Linux | Very large | (i) The Java-based user interface is slow (ii) Installation process in LINUX is slow [2] |
| 4 | NetSim | Commercial | C, C++, and Java | Windows and Linux | Small | (i) It is a single process discrete event simulator (ii) Free version is unavailable [24] |
| 5 | MATLAB | Commercial | C++ | Windows and Linux | Very large | (i) Files larger than 256 MB cannot be uploaded on MATLAB online (ii) The graphical interface to the profiler is not supported (iii) MATLAB online cannot interact with most other hardware, including instrument control |
| 6 | NS-2 | Free software license | C++ and TCL script | Windows and Linux | Small | (i) Complex to model a real system (ii) Excess nodes slow down simulation (iii) Needs recompilation every time user code is changed [2] |
| 7 | NS-3 | Free software license | C++ and Python | Windows, Linux | Large | (i) Lack of credibility (ii) Needs a lot of specialized maintainers [2] |
| 8 | OMNET++ | Free software license | C++ | Linux, Windows, and Mac OS | Large | (i) Variety of protocols is poor (ii) Relatively weak analysis capabilities (iii) The mobility extension is incomplete [2] |
| 9 | J-Sim | Free software license | Java | Linux, Windows, and MAC OS | Small | (i) Security restriction (ii) The graphical model has limited functionality [11] |
| 10 | GROOVENET | Free software license | C++ | Linux | Large | (i) Different vehicles can have different mobility patterns which can be difficult to program |
| 11 | TraNS | Free software license | C++, Java | Linux, Windows (trace-generation mode) | Large | (i) Its development is suspended (ii) Proper documentation is not available [4] |
| 12 | SSFNet | Free software license | C++, Java | Windows, Linux, and Solaris | Very large | (i) Slow convergence may occur (ii) It does not provide clients with any supplementary tools for designing of any scenarios for visualization of simulation results [15] |
| 13 | TOSSIM | Free software license | C++, Python | Linux, Windows | Medium | (i) Can only emulate homogeneous applications (ii) It makes several assumptions (iii) It does not perform energy measurements [25] (iv) TOSSIM cannot provide accurate information on the calculation of the CPU’s energy consumption [26] |
| 14 | DRMSIM | Free software license | Java | Mac OS, Linux | Large | (i) DRMSIM does not use distributed simulation methods (ii) Simulation can only be performed for routing protocols [4] |
| 15 | MININET | Free software license | Python | Linux | Large | (i) Cannot run non-Linux compatible applications [19] |
| 16 | PEERSIM | Free software license | Java | Windows, Linux | Large | (i) PEERSIM does not support distributed simulation [20] |
| 17 | PSIM | Free software license | Java | MAC OS X, Windows, Linux | Large | (i) In-depth knowledge of computer and power electronics is required [23] |
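The narrowing-down step described in this section can be sketched as a simple filter over the comparison criteria. The Python snippet below is only an illustration: the `SimulatorEntry` record and the `shortlist` function are hypothetical names, the catalog holds just a handful of rows as we read them from Table 1, and the ordering of scalability levels (Small < Medium < Large < Very large) is an assumption of this sketch. Readers should check the values against the table before relying on them.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed ordering of the scalability labels used in Table 1.
SCALE_ORDER = ["Small", "Medium", "Large", "Very large"]


@dataclass(frozen=True)
class SimulatorEntry:
    """Hypothetical record holding a few columns of the comparison table."""
    name: str
    license_type: str
    languages: Tuple[str, ...]
    platforms: Tuple[str, ...]
    scalability: str


# Illustrative subset of rows, transcribed from our reading of Table 1.
CATALOG = [
    SimulatorEntry("QualNet", "Commercial", ("C++",), ("Windows", "Mac", "Linux"), "Very large"),
    SimulatorEntry("NS-2", "Free", ("C++", "TCL"), ("Windows", "Linux"), "Small"),
    SimulatorEntry("NS-3", "Free", ("C++", "Python"), ("Windows", "Linux"), "Large"),
    SimulatorEntry("J-Sim", "Free", ("Java",), ("Linux", "Windows", "Mac"), "Small"),
    SimulatorEntry("MININET", "Free", ("Python",), ("Linux",), "Large"),
]


def shortlist(catalog: List[SimulatorEntry], *, platform: str,
              min_scalability: str, free_only: bool = False) -> List[str]:
    """Return names of simulators that satisfy all of the given requirements."""
    floor = SCALE_ORDER.index(min_scalability)
    return [
        entry.name
        for entry in catalog
        if platform in entry.platforms
        and SCALE_ORDER.index(entry.scalability) >= floor
        and (not free_only or entry.license_type == "Free")
    ]


print(shortlist(CATALOG, platform="Linux", min_scalability="Large", free_only=True))
# ['NS-3', 'MININET']
```

A filter like this only produces a shortlist. The limitations column and the feature descriptions in the previous section still need to be weighed by hand for the remaining candidates.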
6 Conclusion

This work covered the important aspects which affect the use and adaptability of some of the more popular network simulators in the industry (OPNET, QualNet, NetSim, MATLAB, NS-2, NS-3, GloMoSim, GROOVENET, TraNS, SSFNet, OMNET++, J-Sim, TOSSIM, DRMSIM, MININET, PEERSIM, and PSIM). These network simulators have been compared in terms of their individual features, scalability, language support, and limitations in order to help researchers and scholars choose the software which best fits their individual needs. When simulating application scenarios involving wired or wireless networks or VANETs, researchers can thus make an informed decision about the choice of network simulator. This work aids researchers in such a comparison by providing an overview of the various features of some of the most popular network simulators available to the research community. By looking through the study with a clear understanding of their needs, users can identify the right simulator for themselves.
References

1. N.I. Sarkar, S.A. Halim, A review of simulation of telecommunication networks: simulators, classification, comparison, methodologies, and recommendations, pp. 10–17 (2011)
2. A.S. Toor, A.K. Jain, A survey on wireless network simulators. Bull. Electr. Eng. Inf. 62–69 (2017)
3. C. Gayathri, R. Vadivel, An Overview: Basic Concept of Network Simulation Tools (2017)
4. M.H. Kabir, S. Islam, M.J. Hossain, S. Hossain, Detail comparison of network simulators. Int. J. Sci. Eng. Res. 203–218 (2014)
5. R. Khan, S.M. Bilal, M. Othman, A performance comparison of network simulators for wireless networks. arXiv e-prints (2013)
6. I.J. Jacob, P.E. Darney, Artificial bee colony optimization algorithm for enhancing routing in wireless networks. J. Artif. Intell. 62–71 (2021)
7. A. Jablonski, Condition Monitoring Algorithms in MATLAB (Springer Nature, 2021)
8. J.I. Chen, Z. Iong, Optimal multipath conveyance with improved survivability for WSN’s in challenging location. J. ISMAC 73–82 (2020)
9. J. Pan, R. Jain, A survey of network simulation tools: current status and future developments. Washington University in St. Louis. Technical report. Department of Computer Science and Engineering (2008)
10. C. Bouras, A. Gkamas, G. Diles, Z. Andreas, A comparative study of 4G and 5G network simulators. Int. J. Adv. Netw. Serv. (2020)
11. A. Sobeih, J.C. Hou, L.C. Kung, N. Li, H. Zhang, W.P. Chen, H.Y. Tyan, H. Lim, J-Sim: a simulation and emulation environment for wireless sensor networks. IEEE Wirel. Commun. 104–119 (2006)
12. I.A. Aljabry, G.A. Al-Suhail, A survey on network simulators for vehicular ad-hoc networks (VANETS). Int. J. Comput. Appl. (2021)
13. R. Mangharam, D. Weller, R. Rajkumar, P. Mudalige, F. Bai, Groovenet: a hybrid simulator for vehicle-to-vehicle networks, in 2006 Third Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services, pp. 1–8 (2006)
14. B.I. Bakare, J.D. Enoch, A review of simulation techniques for some wireless communication system. Int. J. Electron. Commun. Comput. Eng. (2019)
15. S. Yoon, Y.B. Kim, A design of network simulation environment using SSFNet, in First International Conference on Advances in System Simulation, pp. 73–78 (2009)
16. D.Y. McBride, An analysis of cybersecurity curriculum designs, workforce readiness skills, and applied learning effectiveness (Doctoral dissertation, Capitol Technology University, 2021)
17. P. Levis, N. Lee, M. Welsh, D. Culler, TOSSIM: accurate and scalable simulation of entire TinyOS applications, in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, pp. 126–137 (2003)
18. A. Lancin, D. Papadimitriou, DRMSim: a routing-model simulator for large-scale networks (2013)
19. T. Mininet, Mininet: an instant virtual network on your laptop (or other pc)-mininet. Mininet.org (2017)
20. M. Ebrahim, S. Khan, S.S. Mohani, Peer-to-peer network simulators: an analytical review. arXiv preprint (2014)
21. W.A. Habeeb, A. Assalem, Improve the performance of peer-to-peer networks within publish/subscribe systems by using PeerSim simulator within eclipse environment (2021)
22. A. Montresor, M. Jelasity, PeerSim: a scalable P2P simulator, in 2009 IEEE Ninth International Conference on Peer-to-Peer Computing, pp. 99–100 (2009)
23. H. Mehar, The case study of simulation of power converter circuits using Psim software in teaching. Am. J. Educ. Res. 137–142 (2013)
24. D.L. Raja, Study of various network simulators. Int. Res. J. Eng. Technol. (2018)
25. A. Al-Roubaiey, H. Al-Jamimi, Online power Tossim simulator for wireless sensor networks, in 2019 11th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), pp. 1–5 (2019)
26. H. Hayouni, M. Hamdi, A novel energy-efficient encryption algorithm for secure data in WSNs. J. Supercomput. (2021)
Using Deep Neural Networks for Predicting Diseased Cotton Plants and Leafs

Dhatrika Bhagyalaxmi and B. Sekhar Babu
Abstract Complex deep learning models have proved successful in identifying plant and leaf diseases with reasonable performance. However, due to the complexity of the algorithms and the vanishing gradient problem, training such models takes a lot of time and requires huge computational resources. In this paper, we analyze the performance of deep neural networks in predicting fresh and diseased cotton plants and leaves. We compare three different models: VGG16, ResNet50, and MobileNet. In our analysis, we found that the VGG16 and MobileNet models give the best results on the training, validation, and test sets. The models are analyzed by considering the following metrics: accuracy, loss, and number of correct predictions made by the model.

Keywords Neural networks · Transfer learning · VGG16 · ResNet50 · MobileNet · Data augmentation
1 Introduction
The quantity and quality of the yield are reduced when plant diseases are not identified in a timely manner. Plant diseases and other pests account for 20–40% of global annual productivity losses [1]. Lower yields also lead to increased consumer prices and lower earnings for crop producers. Accurate and timely identification of plant diseases is therefore important for safeguarding maximum yield and is helpful for farmers. Plant diseases trigger outbreaks on a regular basis, resulting in large-scale death and a significant economic effect. These issues must be addressed early on in order to save people’s lives and resources.
D. Bhagyalaxmi (B) Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India B. S. Babu Department of CSE, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_28
Deep learning has been shown to have a significant effect on well-defined perception and classification problems. The neural networks in deep learning models need a lot of data to learn a task, such as millions of images, so a large amount of training data is critical to their success. In recent years, cutting-edge models have shown remarkable performance in recognition tasks, for example in voice assistants such as Alexa, Google Home and Baidu’s voice assistant. Those models demonstrate deep learning’s maturity, ensuring widespread acceptance. Achieving excellent results with small data sets, however, remains a challenge. For such tasks, people often fall back on shallow models, which waste time and give poor results due to excessive overfitting. Models trained on one task can, however, be reused for different problems in the same domain with proper modifications and achieve good performance without extreme overfitting. So, in this paper, we apply complex deep learning models to a small data set and examine their performance in terms of accuracy and loss.
2 Related Work
In the literature, we found that a lot of research has been done on plant disease detection using machine learning (ML), deep learning (DL) and computer vision (CV) techniques. Traditional ML classification algorithms such as Support Vector Machines and K-Nearest Neighbors have been used for predicting diseases in plants. DL approaches such as Convolutional Neural Networks (CNN) and Deep Neural Networks (DNN), including AlexNet, VGGNet and residual networks, have also been applied in this area. Figure 1 depicts a timeline of deep learning successes in the ImageNet image classification challenge, together with the classification error rate attained by the algorithms when tested on a large number of images. The winning solutions from 2010 and 2011, as presented in Fig. 1, were not built using deep learning. Deep learning models have dominated the ImageNet image classification challenge since the 2012 edition. In 2015, the winning ResNet model outperformed its competitors and became the first to beat the average human classification error rate of 5%, and by 2017, the algorithms had achieved a classification error rate of 2.28%, producing less than half as many classification errors as humans. This has shown a path for the next generation of image classifiers [2]. Different approaches to identifying and quantifying plant disease are in use, one of them being leaf image-based identification of plant disease [3–10]. It is by far the simplest method for automatically detecting plant disease and can be applied to a variety of diseases [4]. Patil and Kumar [11] presented a Content-Based Image Retrieval (CBIR) approach to extract color, shape and texture-based features. Sandika et al. [12] proposed a feature-based approach for disease identification on grape leaves. Sagar and Jacob [13] addressed a multi-class classification problem by comparing five models, DenseNet169, ResNet50, VGG16, InceptionResNet and Inception V3, for plant disease detection using transfer learning.
They used the Plant Village dataset with 38 classes, and ResNet50 gave good results compared with the other models. Ahmed et al. [14] presented a disease identification system based on visual classification. The proposed system is able to distinguish healthy leaves from diseased leaves and to identify the type of disease. The steps followed in the proposal are pre-processing, segmentation of the diseased leaf, feature calculation based on the Grey Level Co-occurrence Matrix (GLCM), feature selection and classification. A Support Vector Machine is used for classification with tenfold cross validation, reaching an accuracy of 98.79%. Shahidur Harun Rumy et al. [15] studied three rice-plant diseases: Brown Spot, Hispa and Leaf Blast. Features are extracted from the images after preprocessing. Using these features, they built a classification model with the Random Forest algorithm and obtained 97.5% accuracy. Lijo [16] analyzed InceptionV3, DenseNet169 and ResNet50 transfer-learning techniques with and without augmentation. The metrics used to analyze algorithm efficiency were accuracy, precision and recall, and the study obtained 98.2% accuracy with augmentation and 97.3% without augmentation. Chakraborty et al. [17] proposed a lightweight deep learning architecture using DenseNet-121 to classify leaf images of the “PlantDoc” dataset of 28 classes with 1874 images and obtained an accuracy of 92.5%.
Fig. 1 From 2010 to 2017, improvement of the top-5 error rate in the ImageNet classification competition
3 Feature Engineering
3.1 Dataset
The dataset used in our study consists of 2293 images of diseased and healthy cotton plants and leaves. The images are classified into four classes, with the class labels diseased cotton leaf, diseased cotton plant, fresh cotton leaf and fresh cotton plant. The data is split into three sets:
• Training set: 1951 images used to train the model
• Validation set: 324 images used to refine the model
• Testing set: 18 images used to test the model
If we do not split our data, we might test our model with the same data that we use to train it. Sample images of the four classes are presented in Fig. 2.
3.2 Data Augmentation
Data augmentation is the process of synthesizing new images from existing images by applying different transformations. It is applied to deal with limited data and to handle the class imbalance problem. Computer vision tasks such as image recognition, object detection and segmentation are among the most common deep learning applications, and in these cases data augmentation can be used to train deep learning models effectively. The most common transformations applied to an image are geometric transformations and color space transformations. Flipping, rotation, translation, cropping and scaling are geometric transformations, while color casting, varying brightness and noise injection are color space transformations. In this study, we applied the following transformations to images in the train set: rescale, shear_range, zoom_range and horizontal_flip. Images after applying the data augmentation transformations are shown in Fig. 3.
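The four train-set transformations named above coincide with argument names of Keras's `ImageDataGenerator`, although the paper does not state which library was used. As a minimal, library-independent sketch, the NumPy code below implements three of them (rescale, horizontal flip and a simplified centre zoom). Shear is omitted, and the zoom range of 1.0 to 1.2, the flip probability of 0.5 and the 8 × 8 toy image are assumptions chosen only for illustration.

```python
import numpy as np

def rescale(image, factor=1.0 / 255):
    """Map 8-bit pixel values to the range [0, 1]."""
    return image.astype(np.float32) * factor

def horizontal_flip(image):
    """Mirror the image left to right (axis 1 is the width axis)."""
    return image[:, ::-1, :]

def center_zoom(image, zoom=1.2):
    """Zoom in by cropping the centre and resizing back (nearest neighbour)."""
    h, w = image.shape[:2]
    ch, cw = int(round(h / zoom)), int(round(w / zoom))
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = image[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h   # source row for every output row
    cols = np.arange(w) * cw // w   # source column for every output column
    return crop[rows][:, cols]

def augment(image, rng):
    """Apply rescale, a random flip and a random zoom to one image."""
    out = rescale(image)
    if rng.random() < 0.5:          # assumed flip probability
        out = horizontal_flip(out)
    return center_zoom(out, zoom=rng.uniform(1.0, 1.2))  # assumed zoom range

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)  # toy RGB image
augmented = augment(image, rng)
print(augmented.shape, augmented.dtype)
```

Each call to `augment` yields a slightly different variant of the same image, which is how augmentation enlarges the effective training set without collecting new photographs.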
3.3 Transfer Learning
Transfer learning is the most promising attempt at combining data efficiency with deep learning. A common approach to transfer learning in deep learning is to train a network on ImageNet and then apply it to other tasks [18–21]. When ImageNet-trained networks are used for other tasks, the results are excellent. The concept of transfer learning is inspired by the idea that humans, including children, can intelligently apply previously learned information to solve new problems efficiently. The need for lifelong machine learning methods that retain and reuse previously learned knowledge was highlighted at the NIPS-95 workshop on “Learning to Learn”. Since 1995, transfer learning research has received a lot of attention. In contrast to conventional machine learning techniques, which try to learn each task from scratch, transfer learning techniques try to transfer information from previous tasks to a target task when the latter has little high-quality training data.
Fig. 2 Sample images of four classes labeled as diseased cotton leaf, diseased cotton plant, fresh cotton leaf, fresh cotton plant
4 Convolutional Neural Network Models
Since the 1990s, convolutional networks have been used in a variety of applications, including automated check and zip code reading [22], but they failed to succeed at more challenging tasks. This changed when Krizhevsky et al. [23] won the ImageNet Large Scale Visual Recognition Challenge in 2012. The challenging task consisted of classifying images into 1000 classes using a training set of 1.2 million images. Krizhevsky et al. [23] used a deep convolutional network with five convolutional layers, two fully connected layers and a Softmax classifier. The deep CNN outperformed the competition significantly, lowering the top-5 error rate from 25.2 to 16.4%. Since 2012, deeper and more complex convolutional networks have won the ImageNet Challenge. Figure 1 shows how the challenge has driven a gradual decrease in the error rate. Compared with non-deep methods, the reduction in error rate is impressive. These excellent results have led to a significant increase in neural network research. Deep learning lets models that are composed of multiple layers learn from data [24]. AlexNet [23], VGG net [25], GoogleNet [26], ResNet [27], ResNeXt [28], RCNN (Region Based CNN) [29] and YOLO (You Only Look Once) [30] are advanced deep learning models.
Fig. 3 Images after data augmentation. a Rotation b Flip c Shift d Zoom e Rescale
Fig. 4 VGG 16 architecture
4.1 VGG16
VGG stands for the Visual Geometry Group at Oxford University. In a neural network context, VGG refers to the models from [25], which were the Visual Geometry Group’s submission to the 2014 ImageNet competition. Though VGG came in second place in the classification challenge, the models were successful due to the simplicity of their network architecture. The VGG models are considered plain for two reasons. The first is the network’s extensive use of 3 × 3 convolutions. The second is that the number of feature maps is doubled after a 2 × 2 max pooling layer, and 2 × 2 max pooling with stride 2 is used throughout the network. These two factors eliminate the need to fine-tune convolutional filter sizes and individual layer sizes. The VGG16 architecture presented in Fig. 4 is a stack of 13 convolutional layers followed by three dense layers. Features extracted by the VGG16 model at the block1_conv1 and block3_pool layers are presented in Fig. 5.
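The parameter count reported for VGG16 in Table 1 can be cross-checked against the architecture described above. The sketch below counts the weights and biases of the 13 convolutional layers (3 × 3 kernels, standard VGG16 channel widths). Because the paper does not describe its classification head, the code then assumes that the three original dense layers were replaced by a single four-class dense layer on the flattened 7 × 7 × 512 feature map of a 224 × 224 input. Under that assumption the total agrees with the reported 14,815,044.

```python
# Output channels of the 13 conv layers in the standard VGG16 configuration.
VGG16_CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256,
                       512, 512, 512, 512, 512, 512]

def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one kernel x kernel convolution layer."""
    return kernel * kernel * c_in * c_out + c_out

def vgg16_conv_base_params(in_channels=3):
    """Total parameters of the 13-layer convolutional stack."""
    total, c_in = 0, in_channels
    for c_out in VGG16_CONV_CHANNELS:
        total += conv_params(c_in, c_out)
        c_in = c_out
    return total

# Assumed head (not stated in the paper): Flatten followed by Dense(4).
# Five 2x2 poolings shrink a 224x224 input to 7x7, with 512 feature maps.
flattened = 7 * 7 * 512
head_params = flattened * 4 + 4      # weights plus four biases

base_params = vgg16_conv_base_params()
print(base_params)                   # 14714688
print(base_params + head_params)     # 14815044
```

Almost all of the parameters sit in the pretrained convolutional stack; the assumed four-class head contributes only about 0.1 million of them.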
4.2 ResNet50
Residual networks (ResNets) are a classic family of neural networks used to solve computer vision problems. ResNet was the winner of the ImageNet challenge in 2015, and it can successfully train deep neural networks with more than 150 layers. The residual learning framework was presented in [27] to ease the training of networks that are substantially deeper than those used previously. The ResNet-50 model shown in Fig. 6 comprises 5 stages, each with a convolution block and an identity block. A convolution block consists of 3 convolution layers, and each identity block also has 3 convolution layers. ResNet-50 has over 23 million trainable parameters. It uses the concept of a skip connection, which adds the output of a previous layer to the output of a later layer. This helps to overcome the vanishing gradient problem. Features extracted by convolutional layers 1 and 5 of ResNet50 are presented in Fig. 7.
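The skip connection described above computes y = F(x) + x, so the block's input always has a direct path to its output. A minimal NumPy sketch follows. It uses small dense matrices as a stand-in for the convolution layers of a real ResNet block and leaves out batch normalization, so it illustrates the idea rather than the ResNet-50 block itself; all sizes and weights are made up.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is a small two-layer transformation."""
    fx = w2 @ relu(w1 @ x)   # residual branch F(x)
    return relu(fx + x)      # skip connection adds the block input back

rng = np.random.default_rng(0)
x = rng.random(4)                     # non-negative toy input vector
w1 = rng.normal(size=(4, 4)) * 0.1    # toy weights of the first layer
w2 = np.zeros((4, 4))                 # residual branch switched off

y = residual_block(x, w1, w2)
print(np.allclose(y, x))              # the block reduces to the identity
```

When the residual branch contributes nothing, the block passes its input through unchanged. Because of the added identity term, the derivative of F(x) + x with respect to x contains an identity component, which is the usual explanation for why gradients survive through many stacked blocks.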
Fig. 5 Features extracted by convolutional layers 1 and pooling layer 3
Fig. 6 ResNet50 architecture
Fig. 7 Features extracted by convolutional layers 1 and 5
4.3 MobileNet
MobileNet is a lightweight deep neural network with fewer parameters than most other deep neural networks and comparable classification accuracy. It is a CNN architecture that is both efficient and scalable, and it is used in real-world applications. To construct lighter models, MobileNet primarily uses depth-wise separable convolutions instead of the regular convolutions used in previous architectures. This reduces parameters and computation compared with the VGG-16 network, while MobileNet’s classification accuracy on the ImageNet data set drops by only 1%. MobileNet introduced two global hyperparameters, the width multiplier and the resolution multiplier [31], which let model developers trade accuracy for lower latency and smaller size, depending on their needs. The architecture shown in Fig. 8 contains a standalone convolution layer followed by 13 depth-wise separable convolutional layers without any pooling layers. In Fig. 9, we can observe the features extracted by the model in the conv_1 and conv_pw_4 layers.
Fig. 8 MobileNet architecture
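The saving from the depth-wise separable convolutions described above can be made concrete with a short weight count. A standard k × k convolution from M input channels to N output channels has k·k·M·N weights, whereas the separable version has k·k·M weights for the depth-wise step plus M·N weights for the 1 × 1 point-wise step, a ratio of 1/N + 1/k². The layer size used below (3 × 3 kernel, 512 channels in and out) is only an illustrative choice, and biases are ignored.

```python
def standard_conv_weights(k, c_in, c_out):
    """Weights of a regular k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_weights(k, c_in, c_out):
    """One k x k filter per input channel, then a 1 x 1 point-wise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

k, c_in, c_out = 3, 512, 512           # illustrative layer size
std = standard_conv_weights(k, c_in, c_out)
sep = depthwise_separable_weights(k, c_in, c_out)
ratio = sep / std

print(std)    # 2359296
print(sep)    # 266752
print(f"separable / standard = {ratio:.4f}")
```

For 3 × 3 kernels the ratio approaches 1/9 as the number of output channels grows, which is where most of MobileNet's reduction in parameters comes from.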
Fig. 9 Features extracted by conv_1 and conv_pw_4 layers
Table 1 Results given by the models
Model                 VGG16        ResNet50     MobileNet
No. of parameters     14,815,044   23,989,124   3,429,572
Train accuracy        0.9959       0.7406       0.9902
Train loss            0.0261       0.7015       0.0872
Validation accuracy   0.9660       0.7253       0.9629
Validation loss       0.1099       0.9972       0.4781
Test accuracy         0.9444       0.7778       1.0000
Test loss             0.0820       0.8889       0.0005
5 Results
In this section, we present our observations from implementing the three models on the dataset. Table 1 shows the important findings for the three models. Figure 10 shows the plots of loss versus epochs and accuracy versus epochs for the train and validation sets. An important observation from this figure is that ResNet50 fluctuates more in accuracy and loss than the other two models. The results in Table 1 show that VGG16 and MobileNet reach 99% accuracy on the train set and 96% on the validation set. On the test set, MobileNet and VGG16 reach 100 and 94% accuracy, respectively. ResNet50 reaches 74, 73 and 78% accuracy on the train, validation and test sets, which is very low compared with the other two models. The predictions made by these models for the test set are shown in Figs. 11, 12 and 13, with wrong predictions indicated by a red class label and correct predictions by a black class label.
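Because the test set contains only 18 images, the test accuracies in Table 1 correspond to whole numbers of correctly classified images. The short sketch below converts the accuracies back into counts; the dictionary simply restates the Table 1 values, and the variable names are illustrative.

```python
TEST_SET_SIZE = 18

# Test accuracies as reported in Table 1.
test_accuracy = {"VGG16": 0.9444, "ResNet50": 0.7778, "MobileNet": 1.0000}

# Number of correctly classified test images implied by each accuracy.
correct = {model: round(acc * TEST_SET_SIZE)
           for model, acc in test_accuracy.items()}

for model, n in correct.items():
    print(f"{model}: {n}/{TEST_SET_SIZE} correct")

# One image changes the test accuracy by this many percentage points.
step = 100 / TEST_SET_SIZE
print(f"{step:.1f}")   # 5.6
```

A single image therefore shifts the test accuracy by about 5.6 percentage points, so differences between models on this test set should be read with that granularity in mind.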
6 Conclusion
The objective of this paper is to demonstrate how deep learning models such as VGG16, ResNet50 and MobileNet can be used on a very small data set of 2293 images with four classes, partitioned into 1951 training images, 324 validation images and 18 testing images, without overfitting and with excellent results. The models are applied using transfer learning together with data augmentation techniques. VGG16 and MobileNet gave high accuracy and low loss, whereas ResNet50 performed noticeably worse (Table 1). These results indicate that deep learning models can be fitted even to very small datasets with proper modifications and data augmentation, but it is critical to select an appropriate deep learning model for such limited datasets. Overall, the experiments showed that deep learning models can be fitted to very small datasets with minimal overfitting if properly modified.
Fig. 10 Training and validation loss, train accuracy and validation accuracy for a VGG16 b ResNet50 c MobileNet
Fig. 11 Predictions made by VGG16
Fig. 12 Predictions made by ResNet50
Fig. 13 Predictions made by MobileNet
References
1. S. Savary, A. Ficke, J.-N. Aubertot, C. Hollier, Crop losses due to diseases and their implications for global food production losses and food security. Food Secur. 4 (2012). https://doi.org/10.1007/s12571-012-0200-5
2. J. Hu, L. Shen, S. Albanie, G. Sun, E. Wu, Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 42(8), 2011–2023 (2020). https://doi.org/10.1109/TPAMI.2019.2913372
3. J. Barbedo, Plant disease identification from individual lesions and spots using deep learning. Biosys. Eng. 180, 96–107 (2019). https://doi.org/10.1016/j.biosystemseng.2019.02.002
4. K. Ferentinos, Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 145, 311–318 (2018). https://doi.org/10.1016/j.compag.2018.01.009
5. I. Khandelwal, S. Raman, Analysis of transfer and residual learning for detecting plant diseases using images of leaves, in Computational Intelligence: Theories, Applications and Future Directions, vol. II (Springer, 2019), pp. 295–306
6. S.P. Mohanty, D.P. Hughes, M. Salathé, Using deep learning for image-based plant disease detection. Front. Plant Sci. 7, 1419 (2016)
7. S. Zhang et al., Plant diseased leaf segmentation and recognition by fusion of superpixel, K-means and PHOG. Optik Int. J. Light Electron Opt. 157, 866–872 (2018)
8. P. Goncharov et al., Architecture and Basic Principles of the Multifunctional Platform for Plant Disease Detection (2018)
9. J. Li, J. Jia, D. Xu, Unsupervised representation learning of image-based plant disease with deep convolutional generative adversarial networks, in 2018 37th Chinese Control Conference (CCC) (IEEE, 2018)
10. M.K.R. Asif, M.A. Rahman, M.H. Hena, CNN based disease detection approach on potato leaves, in 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS) (2020), pp. 428–432. https://doi.org/10.1109/ICISS49785.2020.9316021
11. J.K. Patil, R. Kumar, Analysis of content based image retrieval for plant leaf diseases using color, shape and texture features. Eng. Agric. Environ. Food 10(2), 69–78 (2017)
12. B. Sandika et al., Random forest based classification of diseases in grapes from images captured in uncontrolled environments, in 2016 IEEE 13th International Conference on Signal Processing (ICSP) (IEEE, 2016)
13. A. Sagar, D. Jacob, On Using Transfer Learning for Plant Disease Detection (2020). https://doi.org/10.13140/RG.2.2.12224.15360/1
14. N. Ahmed, H. Asif, G. Saleem, Leaf Image-Based Plant Disease Identification Using Color and Texture Features (2021)
15. S.M. Shahidur Harun Rumy, M.I. Arefin Hossain, F. Jahan, T. Tanvin, An IoT based system with edge intelligence for rice leaf disease detection using machine learning, in 2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS) (2021), pp. 1–6. https://doi.org/10.1109/IEMTRONICS52119.2021.9422499
16. J. Lijo, Analysis of effectiveness of augmentation in plant disease prediction using deep learning, in 2021 5th International Conference on Computing Methodologies and Communication (ICCMC) (2021), pp. 1654–1659. https://doi.org/10.1109/ICCMC51019.2021.9418266
17. A. Chakraborty, D. Kumer, K. Deeba, Plant leaf disease recognition using fastai image classification, in 2021 5th International Conference on Computing Methodologies and Communication (ICCMC) (2021), pp. 1624–1630. https://doi.org/10.1109/ICCMC51019.2021.9418042
18. H. Azizpour, A. Razavian, J. Sullivan, A. Maki, S. Carlsson, Factors of transferability for a generic ConvNet representation. IEEE Trans. Pattern Anal. Mach. Intell. 38 (2015). https://doi.org/10.1109/TPAMI.2015.2500224
19. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, T. Darrell, DeCAF: a deep convolutional activation feature for generic visual recognition. arXiv preprint 32 (2013)
20. A. Razavian, H. Azizpour, J. Sullivan, S. Carlsson, CNN features off-the-shelf: an astounding baseline for recognition. arXiv (2014)
21. J. Yosinski, J. Clune, Y. Bengio, H. Lipson, How transferable are features in deep neural networks? pp. 3320–3328 (2014)
22. Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324, 17–21 (1998)
23. A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems (2012), pp. 1097–1105
24. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436 (2015)
25. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
26. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 2818–2826
27. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778
28. S. Xie, R. Girshick, P. Dollar, Z. Tu, K. He, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 1492–1500
29. S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: towards real-time object detection with region proposal networks, in Advances in Neural Information Processing Systems (2015), pp. 91–99
30. J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: unified, real-time object detection, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 779–788
31. A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, MobileNets: efficient convolutional neural networks for mobile vision applications. CoRR abs/1704.04861 (2017)
Document Cluster Analysis Based on Parameter Tuning of Spectral Graphs Remya R. K. Menon, Astha Ashok, and S. Arya
Abstract Partitioning a set of objects in a particular domain into different subsets or clusters can be done using unsupervised learning methods. The algorithms compete to increase the intra-cluster similarity, with the objective of finding top-quality clusters in a limited time frame. High-dimensional data are relevant to a wide range of areas, but to make them easier to analyse and understand, we need to reduce them to a low-dimensional representation. Here, we implement a spectral clustering algorithm for document data sets, with which we are able to cluster semantically similar documents with respect to the presence of semantically similar terms. Some documents will be similar in a particular subspace but dissimilar in others. The documents that belong to a union of low-dimensional subspaces are found from a collection. We get a cluster map which has weighted links between the nodes indicating the strength of similarity between clusters in different subspaces. We apply principal component analysis (PCA) as a feature extraction step. No assumptions are made about the structure or shape of the clusters. We analyse the cluster formation using homogeneity, inertia and Silhouette score for varying parameters of spectral clustering (graph type: epsilon-neighbourhood, K-nearest neighbour or fully connected; epsilon value; number of clusters).
Keywords High-dimensional data · Spectral clustering · Affinity matrix · Laplacian · Silhouette score · Homogeneity score
R. R. K. Menon (B) · A. Ashok · S. Arya Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, India e-mail: [email protected] A. Ashok e-mail: [email protected] S. Arya e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_29
1 Introduction
A major characteristic of any machine learning algorithm is its capability to handle high-dimensional data in the form of images, videos, text, etc. [1]. An image is represented using many pixels, a video using many frames and a document using many terms. Other applications include speeding up the search process in a database by clustering the database index content [2]. The key to the success of any algorithm is its ability to represent high-dimensional data. We try to represent data belonging to many categories in a lower-dimensional space. Data clustering is adapted so that the dissimilarity of high-dimensional data becomes clearer in a low-dimensional setting [1, 3]. Several algorithms exist for clustering high-dimensional data; here we use spectral clustering for high-dimensional data analysis. Clustering algorithms can be divided into different categories [3]: iterative, algebraic, statistical and spectral methods. Spectral clustering is a fast-developing approach that performs better than many traditional clustering algorithms. It gathers affinity, neighbour, symmetry and distance information for each data point to construct a similarity matrix [1, 4]. The spectral clustering algorithm is then applied to the similarity matrix to obtain the data partitions. It is still difficult, however, to identify the cluster of points that lie very close to two or more subspaces. Another strength of spectral clustering is that it makes no assumption about the shapes of clusters, and there is no iterative process to find local minima as in K-means. Each data point is considered a node of a graph, and connected nodes form a cluster [3, 5]. K-means is a clustering technique which assumes that each cluster has a spherical structure and that data points are placed around the cluster centre. This assumption may not be relevant in all cases.
So, we use spectral clustering, which helps to build more accurate clusters. It performs faster for large data sets that are sparse. Spectral clustering uses a connectivity approach to clustering. The connection of each document to every other document is identified from the graph. The nodes are then represented in a lower-dimensional space where clusters can be identified easily. Spectral clustering takes information from the eigenvalues (also known as the spectrum) of matrices such as the affinity, degree and Laplacian matrices generated from the graph. Clusters are formed from points that are connected, even if they are far apart in distance [4]. The K-means clustering algorithm is the most popular method for document clustering. K-means requires a distance metric to perform clustering. The algorithm divides n elements into k clusters, where n > k. The first step of K-means is to initialize k centroids; data points are randomly chosen as the initial centroids. The second step is to find the distance from every data point to each centroid. Each data point is then assigned to the closest centroid, and the mean of the values in each cluster becomes the new centroid. This is why the resulting clusters may vary with the selection of the initial centroids. The previous two steps are repeated until the clustering converges (i.e. the centroids do not change). Currently, spectral clustering alone is computationally expensive for large data sets because of the computation of eigenvalues and eigenvectors, and the time complexity increases further for large, dense data sets. Principal component analysis, when applied as a pre-processing technique, helps to reduce the number of dimensions of big data sets, especially in the document domain [6]. All document clustering algorithms share the problem of ignoring the tagging of real entities [7] and semantics such as word ordering, which can be dealt with by integrating deep learning algorithms [8]. Also, ranking the clusters by similarity score to a given query can be effective only if the clusters formed are semantically correct [9].
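The K-means procedure described in the introduction (random initial centroids, assignment to the closest centroid, centroid update, repeat until the centroids stop moving) can be sketched in a few lines of NumPy. The function name, the fixed random seed and the six toy points are assumptions made for illustration; this is not the implementation used in the paper.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means on an (n, d) array of points."""
    rng = np.random.default_rng(seed)
    # Step 1: choose k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)   # index of the closest centroid
        # Step 3: each centroid moves to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):   # converged
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated toy groups of 2-D points.
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, centroids = kmeans(points, k=2)
print(labels)
```

On data like this, any choice of initial centroids ends in the same partition, but with overlapping or non-spherical clusters the result can depend on the initialization, as noted in the text.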
2 Related Work
There are a few existing works in which spectral clustering is performed on document data. According to [10], a local scaling parameter $\mu_i$ is used for every data point $p_i$ instead of a single scaling factor for the entire cluster. The distance from $p_i$ to $p_j$ as seen from $p_i$ is $d(p_i, p_j)/\mu_i$, and the converse is $d(p_j, p_i)/\mu_j$. Hence, the squared distance is $d(p_i, p_j)\,d(p_j, p_i)/(\mu_i \mu_j) = d^2(p_i, p_j)/(\mu_i \mu_j)$. Each point thus has a specific scaling parameter that allows the point-to-point distances to be self-tuned according to the local statistics of the neighbourhoods of $p_i$ and $p_j$. $\mu_i$ is selected by studying the local statistics of the neighbourhood of point $p_i$: $\mu_i = d(p_i, p_K)$, where $p_K$ is the $K$th neighbour of point $p_i$. The value of $K$ does not depend on scale. Local scaling thus automatically identifies the two scales: within clusters there is high affinity, and across clusters there is low affinity. Manoharan [11] uses a modified capsule network algorithm for representing latent information such as the semantics of documents and local word order. Another work presents an algorithm that uses sparse representation techniques to cluster a large set of data points represented in a lower-dimensional subspace. The algorithm states that every data vector can be represented as a linear combination of some basis vectors [3] from its own subspace. This is a sparse optimization problem whose solution is used for spectral clustering. Using sparse subspaces, spectral clustering aims at finding clusters using points that exist even in the intersection of subspaces. The sparse optimization chooses a few points from a given data set that belong to the same subspace [3]. Dos Santos et al. [12] show that spectral clustering and community detection in complex networks are complementary to each other in the context of document clustering.
Affine subspace spectral curvature clustering (SCC) uses a similarity measure based on the curvature of data points. The main advantage of SCC is that it can handle noisy data; its drawback is that the number of dimensions of the subspace must be known. SCC assumes that the subspaces have the same dimension. A similarity graph is obtained from the solution of a global optimization algorithm, from which the segmentation of the data is obtained [13]. Günnemann [5] describes an algorithm called SSCG based on spectral clustering. It extends spectral clustering by considering irrelevant attributes when clustering graphs. In [14], a PCA-based similarity score is used to evaluate normalized and unnormalized Laplacian graphs for image processing. Indulekha et al. [15] describe a technique to find the priority of candidate genes in a PPI network based on community detection graphs. Clustering methods face issues such as deciding a suitable number of clusters and assessing cluster strength. Benson-Putnins [1] addresses these issues by building a consensus matrix, which can be interpreted as a graph, using two clustering techniques: the Fiedler and the MinMaxCut methods.
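The local scaling idea attributed to [10] earlier in this section can be sketched in NumPy as follows. The text above gives only the scaled squared distance d²(p_i, p_j)/(μ_i μ_j); turning it into an affinity through exp(−·) follows the usual self-tuning formulation and is an assumption here, as are the function name, the choice K = 2 and the toy points.

```python
import numpy as np

def local_scaling_affinity(points, K=2):
    """Affinity exp(-d^2(p_i, p_j) / (mu_i * mu_j)) with per-point scales."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)          # pairwise d(p_i, p_j)
    # mu_i is the distance from p_i to its K-th nearest neighbour.
    # Column 0 of each sorted row is the point itself (distance 0).
    mu = np.sort(dist, axis=1)[:, K]
    affinity = np.exp(-dist ** 2 / np.outer(mu, mu))
    np.fill_diagonal(affinity, 0.0)              # no self-loops
    return affinity

# Two toy groups far apart from each other.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
A = local_scaling_affinity(points, K=2)
print(A[0, 1] > A[0, 3])   # within-group affinity exceeds across-group affinity
```

The sketch assumes no duplicate points; if the K-th neighbour of a point were at distance zero, the corresponding scale μ_i would be zero and the division would fail.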
3 Solution Approach and Methodology
Spectral clustering reduces high-dimensional data sets into clusters in low dimensions. It uses a graph-based approach to clustering. The data points that are connected or close to each other can be identified in a graph. Clusters are identified by representing these nodes in a lower dimension. Comparing K-means clustering with spectral clustering, K-means focuses on distance, whereas spectral clustering is more about connectivity since it is semi-convex. K-means is ideal for discovering globular clusters; spectral clustering is more general and powerful. If we have P data points each with N dimensions/features, the input matrix to K-means would be N by P, while the input matrix to spectral clustering would be P by P. Spectral clustering does not depend on the number of features. Here we implement the spectral clustering algorithm by tuning its various parameters for document clustering after performing feature extraction through principal component analysis (PCA), which is a well-known method for dimension reduction. PCA transforms the data to extract uncorrelated features. Experiments with and without PCA have shown that applying PCA is the better pre-processing choice prior to clustering the document data set with spectral clustering. To cluster the document data set using the spectral clustering algorithm, we represent the N features of the document data set in matrix format. A nearest-neighbours graph is then constructed using the N by N matrix that we obtain. A graph is formed by a set of nodes and the edges that connect them. The edges can be undirected and can have weights associated with them.
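The PCA step mentioned above can be sketched with a singular value decomposition. This is a generic NumPy illustration rather than the authors' code: the 5 × 4 matrix stands in for a small TF-IDF document-term matrix with made-up values, and the function name and number of components are assumptions.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)                       # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # principal directions
    explained_var = S[:n_components] ** 2 / (len(X) - 1)
    return Xc @ components.T, explained_var

# Toy stand-in for a TF-IDF matrix: 5 documents x 4 terms (made-up values).
X = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.8, 0.2, 0.1, 0.0],
              [0.0, 0.1, 0.9, 0.7],
              [0.1, 0.0, 0.8, 0.9],
              [0.5, 0.5, 0.4, 0.3]])

Z, var = pca(X, n_components=2)
print(Z.shape)   # (5, 2)
print(var)       # variance captured by each retained component
```

The reduced matrix `Z` has one row per document and would then feed the similarity computation. The retained components are uncorrelated, which matches the description of PCA in the text.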
Algorithm 1 Modified spectral clustering algorithm
1: Input: Medline text document collection
2: Construct term-document matrix T by computing TF-IDF
3: Perform PCA on T
4: Compute similarity matrix $S \in \mathbb{R}^{n \times n}$, k clusters.
5: Compute the weighted adjacency matrix based on the type of similarity graphs
   if fully_connect then: W = exp(−A/(2 ∗ sigma))
   else if eps_neighbor then: W = (Adjmat …
… 0. This partitions our graph into two clusters. Zero eigenvalues indicate connected components and eigenvalues near zero indicate that there is nearly a separation of two components. Here we have one edge that we can see in Fig. 2. If it is absent, we will have two separate components.
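The role of the second eigenvector (the Fiedler vector) can be checked on a toy graph. The sketch below is an illustration under assumptions, not the graph of Fig. 2: it uses a hypothetical six-node graph made of two triangles joined by a single edge, and the unnormalized Laplacian L = D − W computed with NumPy.

```python
import numpy as np

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6


def laplacian(edge_list, n_nodes):
    """Unnormalized graph Laplacian L = D - W for an undirected graph."""
    W = np.zeros((n_nodes, n_nodes))
    for i, j in edge_list:
        W[i, j] = W[j, i] = 1.0
    return np.diag(W.sum(axis=1)) - W


# eigh returns eigenvalues in ascending order for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(laplacian(edges, n))
fiedler = eigvecs[:, 1]                # eigenvector of the second eigenvalue
labels = (fiedler > 0).astype(int)     # sign of each entry gives the partition

# Removing the bridging edge leaves two connected components,
# so the eigenvalue 0 then appears twice.
eigvals_split, _ = np.linalg.eigh(laplacian(edges[:-1], n))

print("eigenvalues with bridge:   ", np.round(eigvals, 3))
print("partition from Fiedler vec:", labels)
print("eigenvalues without bridge:", np.round(eigvals_split, 3))
```

The overall sign of an eigenvector is arbitrary, so which triangle receives label 0 or 1 may differ between runs or platforms; only the grouping is meaningful.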
Fig. 2 Fiedler vector
Fig. 3 Generation of four clusters
The first eigenvalue is 0 because there is one connected component, and the second eigenvalue is close to 0 because the two groups are joined by only one edge. The eigenvector associated with that second eigenvalue tells us how to separate the nodes into those nearly disconnected components. In general, the first large gap between consecutive eigenvalues indicates the number of clusters. Here there is a large gap between the fourth and fifth eigenvalues, and the four eigenvalues before the gap indicate that there are four clusters. K-means is then run on the eigenvectors, as shown in Fig. 3. With nodes 2 and 6, the graph was divided into four quadrants, where the data points belong to any connected set. To summarize, after performing PCA on the normalized data matrix, we built an adjacency matrix from our graph and calculated the graph Laplacian as the difference between the degree matrix and the adjacency matrix. The eigendecomposition of the Laplacian indicates that there are four clusters. The eigenvalues specify the number of clusters, and K-means clustering is applied to the eigenvectors to derive the actual clusters.
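The eigengap reasoning summarized above can be sketched as follows. This is a minimal illustration under assumptions: the data are four synthetic, well-separated groups (not the document collection), the similarity graph is fully connected with the standard Gaussian kernel and sigma = 1, and only the first ten eigenvalues are inspected. NumPy and scikit-learn are assumed to be available.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: four tight groups of 25 points each (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [0.0, 8.0], [8.0, 0.0], [8.0, 8.0]])
X = np.vstack([c + 0.2 * rng.standard_normal((25, 2)) for c in centers])

# Fully connected similarity graph with a Gaussian kernel (sigma = 1).
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-sq_dists / 2.0)
np.fill_diagonal(W, 0.0)

# Unnormalized Laplacian: degree matrix minus adjacency matrix.
L = np.diag(W.sum(axis=1)) - W
eigvals, eigvecs = np.linalg.eigh(L)   # ascending eigenvalues

# Eigengap heuristic: the largest gap among the leading eigenvalues
# marks the number of clusters.
gaps = np.diff(eigvals[:10])
k = int(np.argmax(gaps)) + 1

# K-means on the first k eigenvectors yields the actual cluster labels.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(eigvecs[:, :k])

print("leading eigenvalues:", np.round(eigvals[:6], 3))
print("estimated number of clusters:", k)
```

On real document data the gap is usually less pronounced than in this synthetic case, so the heuristic is a guide rather than a guarantee.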
4 Experimental Results and Analysis
We perform spectral clustering on several data sets: the Medline, TREC and Newsgroup data sets. Each data set is divided into a training set and a test set. A validation set is also maintained to validate the results obtained with different models, such as SVD or PCA, in the preprocessing step. First we convert the text into vectors of numerical values for statistical analysis. This can be done using the functions provided by the various classes in sklearn. The feature extraction method used is principal component analysis (PCA), which reduces the dimensionality of the document collection without losing much information from the larger set. PCA, which we use for dimensionality reduction, is a lossy method. The smaller data sets generated by PCA are easier to visualize and analyse. Through our
experiment, it is clear that without PCA the spectral clustering output does not yield accurate results when visualized.
Analysis
With an adjacency matrix based on Euclidean distance, the random-walk normalized Laplacian, and k = 2 clusters, we obtain the clusters shown in Fig. 4. With an adjacency matrix based on cosine distance, the unnormalized Laplacian, and k = 3 clusters, we obtain the clusters shown in Fig. 5. In the similarity graph computation, when the sigma value of the fully connected graph is set to 1 and the k value for KNN is set to 8, the result is as shown in Fig. 6.
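The three similarity graphs whose parameters are varied in this analysis (fully connected with sigma, epsilon neighbourhood with epsilon, and KNN with k) can be sketched in NumPy as below. The function names are hypothetical, the fully connected weights follow the form W = exp(−A/(2 ∗ sigma)) given in Algorithm 1, and the thresholding rule for the epsilon graph and the symmetrization of the KNN graph are assumptions made for illustration. The input matrix A is taken to hold pairwise squared Euclidean distances.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)


def fully_connected(A, sigma):
    """Gaussian weights in the form W = exp(-A / (2 * sigma)) of Algorithm 1."""
    W = np.exp(-A / (2.0 * sigma))
    np.fill_diagonal(W, 0.0)
    return W


def eps_neighborhood(A, eps):
    """Connect two points when their entry in A is below eps (assumed rule)."""
    W = (A < eps).astype(float)
    np.fill_diagonal(W, 0.0)
    return W


def knn_graph(A, k):
    """Connect each point to its k nearest neighbours, then symmetrize."""
    dist = A.copy()
    np.fill_diagonal(dist, np.inf)             # a point is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest points
    W = np.zeros_like(A)
    rows = np.repeat(np.arange(A.shape[0]), k)
    W[rows, nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                  # undirected graph


# Hypothetical stand-in for PCA-reduced document vectors.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
A = pairwise_sq_dists(X)

W_full = fully_connected(A, sigma=1.0)
W_eps = eps_neighborhood(A, eps=2.0)
W_knn = knn_graph(A, k=8)
print("edges (eps graph):", int(W_eps.sum() // 2))
print("edges (kNN graph):", int(W_knn.sum() // 2))
```

A smaller sigma makes the fully connected weights decay faster with distance, while a larger epsilon or k adds edges; this is the sense in which these parameters control how connected the graph is.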
Fig. 4 Clusters formed when the adjacency matrix uses Euclidean distance, the random-walk normalized Laplacian is used, and k = 2
Fig. 5 Three clusters formed when the adjacency matrix uses cosine distance and the unnormalized Laplacian is used
Fig. 6 Clusters formed when the adjacency matrix uses cosine distance, the unnormalized Laplacian is used, and k = 2
Fig. 7 Result with fully connected graph
If we change the sigma value of the fully connected graph to 0.01 and the value of k to 10 for KNN, we obtain the result shown in Fig. 7. Using an epsilon neighbourhood graph with epsilon = 0.5 instead of the fully connected graph gives the result shown in Fig. 8. With an epsilon neighbourhood graph and an epsilon value of 0.01 or 0.5, we get the result shown in Fig. 9. Changing the sigma value of the fully connected graph with Euclidean distance from 0.5 to 1 produces no change in the quality of clustering (Silhouette score: 0.68884 and 0.68884, respectively) (Table 2). When the k value of the KNN graph with cosine distance is changed from 8 to 10, the Silhouette score rises from 0.7585 to 0.7843. When the k value is changed from 4 to 6, the quality of clustering also increases (Silhouette
Fig. 8 Result with Epsilon neighbourhood graph
Fig. 9 Epsilon neighbourhood graph with epsilon 1 and 0.5
Table 1 KNN graph with number of clusters 2: Silhouette score
k-value    Silhouette score
4          0.7018
6          0.7258
8          0.7585
10         0.7843
Table 2 Fully connected graph with number of clusters 2: Silhouette score
Sigma-value    Silhouette score
0.5            0.68884
1              0.68884
1.5            0.68885
2              0.688906
Table 3 Epsilon Neighbour graph with number of clusters 2: Silhouette score
Epsilon value    Silhouette score
0.3              0.68894
0.4              0.68901
0.5              0.68898
0.8              0.68635
Table 4 Different similarity graph with number of clusters 3: Silhouette score
Similarity graph     Value    Silhouette score
KNN graph            8        0.7585
Epsilon Neighbour    0.4      0.58901
Fully connected      2        0.5889
score: 0.7018 and 0.7258, respectively). Comparing the k values in the table, however, we obtain an accurate clustering at k = 8 (Table 1). When the sigma value of the fully connected graph with cosine distance is changed from 0.5 to 2, the quality of clustering increases slightly (Silhouette score: 0.68884 and 0.688906, respectively) (Table 2). Compared with the fully connected graph, the KNN graph forms higher-quality clusters (Table 4). For the epsilon neighbourhood graph with cosine distance, the quality of clustering decreases when the epsilon value is changed from 0.5 to 0.8 (Silhouette score: 0.68898 and 0.68635, respectively), and it increases when the epsilon value is changed from 0.3 to 0.4 (Silhouette score: 0.68894 and 0.68901, respectively). From the table, it is clear that the highest-quality clustering is obtained at an epsilon value of 0.4 (Table 3).
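A parameter sweep of the kind reported in Table 1 can be sketched with scikit-learn as follows. This is only an illustration under assumptions: it uses synthetic two-cluster data from `make_blobs` rather than the document collections, Euclidean rather than cosine distance for the score, and scikit-learn's built-in nearest-neighbour affinity, so the scores it prints are not expected to match the tables.

```python
import warnings

from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for PCA-reduced document vectors (illustrative only).
X, _ = make_blobs(
    n_samples=200, centers=[[0.0, 0.0], [6.0, 6.0]], cluster_std=1.0, random_state=0
)

scores = {}
for k in (4, 6, 8, 10):  # the k values listed in Table 1
    with warnings.catch_warnings():
        # A sparse kNN graph may be disconnected; ignore that warning here.
        warnings.simplefilter("ignore")
        labels = SpectralClustering(
            n_clusters=2,
            affinity="nearest_neighbors",
            n_neighbors=k,
            random_state=0,
        ).fit_predict(X)
    # Silhouette score lies in [-1, 1]; higher means tighter, better separated clusters.
    scores[k] = silhouette_score(X, labels)

for k, score in scores.items():
    print(f"k = {k:2d}  Silhouette score = {score:.4f}")
```

The same loop structure applies to sigma or epsilon sweeps once the corresponding similarity graph is passed in as a precomputed affinity.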
5 Conclusion
The spectral clustering algorithm clusters a collection of documents treated as data points. Our review indicates that this clustering approach has not been extensively used in the
document scenario. By applying spectral clustering to a document collection after performing matrix approximation using SVD and feature extraction using PCA, we have improved the Silhouette score by 50%. Applying spectral clustering alone to the data set reduced the homogeneity score as well. Future work includes introducing spectral clustering over subspaces.
References
1. Benson-Putnins, Spectral clustering and visualization: a novel clustering of Fisher’s Iris data set. SIAM Undergrad. Res. Online 4, 1–15 (2011)
2. B.S.S.K. Chaitanya, D.A.K. Reddy, B.P.S.E. Chandra, A.B. Krishna, R.R. Menon, Full-text search using database index, in 2019 5th International Conference on Computing, Communication, Control and Automation (ICCUBEA) (IEEE, 2019)
3. E. Elhamifar, R. Vidal, Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2765–2781 (2013)
4. U. Von Luxburg, A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)
5. S. Günnemann et al., Spectral subspace clustering for graphs with feature vectors, in 2013 IEEE 13th International Conference on Data Mining (IEEE, 2013)
6. R. Ramachandran, G. Ravichandran, A. Raveendran, Evaluation of dimensionality reduction techniques for big data, in 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC) (IEEE, 2020)
7. G. Veena, D. Gupta, S. Lakshmi, J.T. Jacob, Named entity recognition in text documents using a modified conditional random field. Recent Findings in Intelligent Computing Techniques (Springer, Singapore, 2018), pp. 31–41
8. R.R.K. Menon, S.G. Bhattathiri, An insight into the relevance of word ordering for text data analysis, in 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC) (IEEE, 2020)
9. R.R.K. Menon et al., Improving ranking in document based search systems, in 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI) (48184) (IEEE, 2020)
10. G. Wen, Robust self-tuning spectral clustering. Neurocomputing 391, 243–248 (2020)
11. J.S. Manoharan, Capsule network algorithm for performance optimization of text classification. J. Soft Comput. Paradigm (JSCP) 3(01), 1–9 (2021)
12. C.K. Dos Santos, A.G. Evsukoff, B.S.L.P. de Lima, Spectral clustering and community detection in document networks. WIT Trans. Inf. Commun. Technol. 42, 41–50 (2009)
13. L. Zelnik-Manor, P. Perona, Self-tuning spectral clustering. Adv. Neural Inf. Process. Syst. (2005)
14. K.R. Kavitha, S. Sandeep, P.R. Praveen, Improved spectral clustering using PCA based similarity measure on different Laplacian graphs, in 2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) (IEEE, 2016)
15. T.S. Indulekha, G.S. Aswathy, P. Sudhakaran, A graph based algorithm for clustering and ranking proteins for identifying disease causing genes, in 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (IEEE, 2018)
Distributed Denial of Service Attack Detection and Prevention in Local Area Network
Somnath Sinha and N. Mahadev Prasad
Abstract DDoS attack detection has become more challenging due to the exponential growth and diversity of network traffic on the Internet. The main goal of this research work is to simulate a computer network and investigate its behavior under various conditions, including the execution of a DDoS attack whose strength is determined by the presence of a traffic-filtering defense mechanism. After an overview of this form of attack and a survey of the problem, the work proceeds with the network implementation using the OMNeT++ simulator. Data is then gathered from numerous simulations, each distinguished by the modification of a few key factors, and examined with a focus on several assessment indicators. Finally, observations on the findings, the limitations encountered, and potential future advances are provided.
Keywords Blocker · Drop quality · Throughput · DDoS attack · OMNeT++ simulator
S. Sinha (B) · N. Mahadev Prasad
Amrita School of Arts and Sciences, Amrita Vishwa Vidyapeetham, Mysore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_30
1 Introduction
DDoS attacks have recently become one of the most common forms of cyber-attack, owing to their ease of implementation and the difficulty of detecting and managing them. The main aim of these attacks is to saturate a Web service's resources, causing legitimate clients to experience delays or be unable to access the service. A generic DDoS attack is carried out in the following way: the attacker infects a group of machines and installs back doors so that the attacker can control them without the owners' knowledge; the computers that make up this specific form of network, known as a botnet, are referred to as zombies. At this point, the attacker orders the zombies to simultaneously submit a high number of requests to a single target. The size of the botnet and the number of requests it can send, as well as any security measures implemented by the victim, all contribute to the success of a DDoS attack. DDoS attacks
come in a variety of forms, each of which takes advantage of various aspects of Web services and the network itself. Simulations of this form of attack benefit both the attacker and the defender, who is in charge of testing various defensive mechanisms, estimating the effect of the attack on the victim system, and analyzing its characteristics.
2 Related Work
The authors of [1] examine various forms of MANET security attacks, with a focus on the Sybil attack, which is one of the most dangerous, and introduce a new method based on clustering and resource testing for detecting Sybil attacks. In [2], the authors briefly discuss the dangers of Sybil attacks in various MANET environments, give a taxonomy of defense mechanisms, and present a novel trust-based strategy. The works in [3, 4] note that simulators and emulators can be used to conduct network testing, but that their performance is insufficient: a comparison study found that simulators and emulators fail to produce reliable results when compared with actual networks. The work in [5] is concerned with WSN security. Its key part is detecting a fake node in the network: the authors detect an attack and a malicious node using typical scenarios and calculate the packet drop, throughput, and residual energy. In [6], researchers develop a robust and efficient AIDS using fuzzy and neural network (NN)-based tools. Since the proposed framework is lightweight and incurs little overhead, it can be implemented in each node, where it can independently track and classify the behavior of local nodes. In [7], researchers propose a DDoS flooding attack detection system for local area networks that is independent of attack features. In [8], researchers propose a DDoS attack detection method for SDN that uses two levels of protection: Snort is used to detect signature-based attacks, and machine learning algorithms are employed to detect anomaly-based attacks. The article [9] discusses the current state of anomaly detection methods for network intrusion detection in traditional wired environments. In [10], researchers propose a reCAPTCHA controller mechanism that prevents automated attacks such as those launched by botnet zombies; most automated DDoS attacks are verified and prevented using the reCAPTCHA controller. In [11], researchers observe that data transmission in a wired network follows traditional physical routing. The data stream routing of wireless networks, on the other hand, is focused
on radio signals, which have a number of evolution problems. The statistical variance of the traffic volume is detected by shift detection, and the proposed system's results are compared with current attack detection systems using the propagation delay metric. The authors of [12] begin by extending the flow entry in the OpenFlow table with a copy of the packet number counter. Based on the flow-based nature of SDN, they create a flow statistics approach in the switch, and then offer an entropy-based lightweight DDoS flooding attack detection model in the OF edge switch. In [13], researchers study flooding attacks in wireless sensor networks and show how to create this attack and what its destructive consequences are for the network. In [14], researchers examine the flaws in the AODV routing protocol as well as its potential vulnerability to a black hole attack, and demonstrate some simple and effective detection and prevention methods with low overhead and a high true positive value. In [15], researchers use a stack-based technique to locate malicious nodes. In [16], every node maintains a reputation table based on the trust value given to nearby nodes, and a mechanism is suggested to maintain the safest route via the neighbors; any node must check the trust value before transmitting data packets. In [17], researchers implement a WSN using NS2 and use the AODV routing protocol to test the communication path. They use different graphs to assess how well AODV performs, and explain how to create a faulty node in a WSN and how to locate the theoretically specified problematic node.
In [18], researchers improve localization accuracy using iterative methods in order to achieve the highest feasible level of precision. The method starts with a rough base value by estimating the location with RSSI-based weighted approaches. The study in [19] presents an innovative hybrid hazard prevention framework with a reconnaissance framework that uses IoT technologies to minimize the large proportion of human communications. In [20], researchers improve localization accuracy by using two-step techniques. They use range-based location identification algorithms, with the received signal strength indication acting as the primary criterion.
Flow of Work
Fig. 1 Flow of work in proposed system
The flowchart in Fig. 1 represents the proposed architecture, which helps to detect and analyze the impact of a DDoS attack. The network is set up with attackers. The method is primarily intended for ICMP flood attacks. In the learning mode, traffic is monitored at fixed intervals, and the greatest recorded peak is shown in red. In the monitor mode, which is the second phase, the traffic is continually inspected with specific objectives. In the blocker mode, packets that exceed their allotted limit are classified as malicious, and their source is added to a blacklist based on an IP table.
Algorithm
Steps involved in analyzing the impact of a DDoS attack:
Step 1: Initialize the network with attackers (zombie nodes).
Step 2: The attackers send packets simultaneously. They use the Ping App application, whose parameters allow us to customize the attack's characteristics.
Step 3: The blocker, which is a defense mechanism, monitors the traffic. The intervals used are 0.03 s, with the highest observed peak highlighted in red.
Step 4: The ICMP flood traffic is analyzed to figure out which IP addresses should be blocked.
Step 5: Handle message updates the blacklist and schedules the next update.
Step 6: Update IP table takes an IP address as input and adds it to the IP table.
Step 7: Incoming traffic monitor takes each packet obtained by the defender as input and performs all of the previously mentioned tests. The IP address table is reset when the traffic stabilizes.
3 Proposed Work
The goal of this simulation is to set up a network with a server (the victim), some legitimate clients that send requests to the server, some zombies (attackers) that send a huge number of ping requests to the server (carrying out an ICMP flood), a succession of intermediate nodes, and finally a blocker, which is a server-side component responsible for screening zombie requests by analyzing the incoming data. Once the system has been modeled, the simulation is used to determine the impact of the attack on the performance of the victim server and the intermediate nodes, depending on the attack's severity and the network capacity.
3.1 Simulation Model
Server: A Standard Host object that uses the TCP Sink App and Ping App applications to receive data from TCP connections and respond to ping queries, respectively.
Client: A Standard Host object that uses the TCP Session App, which delivers data using the
Fig. 2 DDOS attack simulation using OMNET++
TCP protocol.
Zombies: Standard Host objects that ping the victim using the Ping App program.
Defender: An extension of the Node Base object, which we developed to analyze and filter the incoming traffic.
As shown in Fig. 2, the topology consists of four groups of client and zombie computers, each connected to an intermediate router. Each intermediate router is connected to a central router, which is connected to a group consisting only of clients and to the victim server. This configuration was chosen to send all of the network's requests to the server via a single channel, allowing the defender to correctly monitor incoming traffic and assess the attack's effects on the network infrastructure that connects the server to the “external” network.
3.2 Attack Model
To implement the DDoS, we chose to mimic an ICMP flood attack, which consists of sending a huge number of ping queries at once.
Zombie: The main aim is to use an attack bandwidth greater than the target's in order to saturate the server machine's physical resources and, more importantly, its input bandwidth. This causes a significant slowdown in communication between legitimate clients and the server, and in the event of very heavy attacks the network buffers of intermediate nodes are often exhausted. In addition, when the channel is busy, packets that cannot be queued are discarded, preventing valid requests from reaching their destination. The zombies in our model use the Ping App program, which has parameters that enable us to define the attack's characteristics.
Send Interval: The interval between sending one ping and the next; in our case, it characterizes the intensity of the attack.
Start Time/Stop Time: The start and end times for sending pings; they characterize the duration of the attack.
Destination Address: The destination of the pings (the victim server).
3.3 Defense Mechanism
The protection mechanism monitors the traffic entering the server using the defender, an object we developed specifically for this purpose. The defender is an extension of the INET object Node Base and is built as a generic network node with the interfaces required to link to a physical network. It functions similarly to a sniffer, analyzing all packets passing from the external network to the server's internal network. Its control is organized into two distinct stages, possibly followed by the defensive process.
Learning mode: The main aim of this stage is to examine how the service is used under normal traffic conditions, that is, in the absence of any form of attack. The server's traffic is tracked at regular intervals, and the maximum peak is recorded. In Fig. 3, the intervals used are 0.03 s, and the maximum recorded peak is indicated in red.
Monitor mode: In this second stage, the traffic is continually monitored with two separate goals.
Comparison: The incoming data is compared with the data obtained in the learning mode. The analysis is divided into time windows, and a counter keeps track of the number of consecutive windows that reach the maximum peak. A warning state is reached while the counter is incrementing, and all active IP addresses are saved in a table that accumulates their respective traffic intensity. In this step, a threshold is defined by calculating the difference between the minimum
Fig. 3 Learning mode scenario
peak and the average value of the traffic of all IP addresses to determine which IP addresses should be blocked. Figure 4 shows this threshold configuration policy graphically; the threshold is eventually used in the defense mode. In our case, traffic intensity is measured by the number of packets received rather than the size in bytes, as this is more effective for determining which IP addresses should be blocked during an ICMP flood attack. When the counter exceeds a particular level in terms of analyzed time windows, the high intensity of requests is regarded as abnormal and as an indication of a possible DDoS attack; in this case, an alarm state is reached and the defense mechanism is engaged. Figure 5 shows an example of the comparison: before the protection is triggered, we have a cumulative peak of 2.5 and a threshold of successive warning windows. Each warning window increases the counter, which is indicated in red in the figure; however, if there are several windows in which the traffic falls below the critical value, the counter progressively decreases. In this example, the alarm is triggered by the last comparison window, which also activates the protection mechanism.
Filtering: The source IP address of each incoming packet is checked; if the address is on the blacklist, the packet is immediately dropped. As a result, the server is protected and receives only valid requests.
Blocker mode: In this third step, the IP table is queried in order to add to the blacklist the addresses that are considered malicious based on the previously established threshold. A waiting time is calculated for each
Fig. 4 Setting of threshold from simulation
Fig. 5 Alert phase and subsequent alarm phase, when the counter reaches the value 10
IP added to the blacklist based on the excess traffic generated in comparison with the threshold. Each IP will have to wait a length of time equivalent to the percentage of extra traffic it generated before being able to send requests to the server again. Any legitimate IP, blacklisted as a result of a random spike in requests, will likely have a shorter timeout than zombies.
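The three stages described in this section can be summarized in a short sketch. It is not the OMNeT++ module used in the simulation: the function names, the example IP addresses, the threshold value of 50 packets, and the `base_timeout` scaling factor are all hypothetical, and the rule that the waiting time equals the fraction of traffic above the threshold is one possible reading of the text. The alarm level of 10 follows the caption of Fig. 5.

```python
ALARM_COUNT = 10  # Fig. 5: the alarm is raised when the counter reaches 10


def learn_peak(window_counts):
    """Learning mode: largest packet count seen in any sampling window."""
    return max(window_counts)


def update_counter(counter, window_total, peak):
    """Monitor mode: a warning window raises the counter, a calm one lowers it."""
    if window_total > peak:
        return counter + 1
    return max(0, counter - 1)


def waiting_times(per_ip_packets, threshold, base_timeout=1.0):
    """Blocker mode: timeout proportional to the traffic in excess of the threshold.

    Assumption: excess is measured relative to the threshold itself.
    """
    return {
        ip: base_timeout * (count - threshold) / threshold
        for ip, count in per_ip_packets.items()
        if count > threshold
    }


# Learning phase on hypothetical attack-free windows.
peak = learn_peak([3, 5, 4, 6, 5])

# Monitor phase: ten consecutive windows well above the learned peak.
counter = 0
for window_total in [40] * 10:
    counter = update_counter(counter, window_total, peak)
alarm = counter >= ALARM_COUNT

# Blocker phase: per-IP packet counts accumulated during the warning state.
per_ip = {"10.0.0.1": 300, "10.0.0.2": 280, "10.0.0.9": 60, "10.0.0.8": 20}
timeouts = waiting_times(per_ip, threshold=50) if alarm else {}
print("alarm:", alarm)
print("timeouts:", timeouts)
```

In this example the client at the hypothetical address 10.0.0.9, which only slightly exceeds the threshold, receives a much shorter timeout than the two zombies, and the address below the threshold is not blacklisted at all.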
3.4 Implementation of Blocker
This part describes some details of the blocker module's implementation. During both the learning process and traffic control, the sample time is the time interval used for bit/packet counting.
Update blacklist: Checks for any IP whose waiting period has expired and removes it from the blacklist.
Insert blacklist: Given an IP address, adds it to the blacklist and sets an expiration period based on the number of packets in transit compared with the threshold set in monitoring mode.
Update IP table: Starts by determining the monitoring method (number of packets or bits). Then, for a given IP, it updates the generated-traffic counter and adds the IP to the IP table (if not already present). Finally, the variable threshold is updated.
Incoming traffic monitor: This is the primary function, which is in charge of traffic monitoring and defense mechanism management. It takes each packet obtained by the defender as input and performs all of the previously mentioned tests. Furthermore, the active IP address table is reset when traffic stabilizes, i.e., no
warning windows are observed for a predetermined period of time (in our case 300 control windows).
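The bookkeeping performed by these functions can be sketched as two small classes. This is a hedged illustration rather than the module's actual code: the class and method names are hypothetical, time is passed in explicitly instead of being read from the simulator, and only packet counting (not bit counting) is shown. The reset after 300 calm windows follows the value given in the text.

```python
class Blacklist:
    """Blocked IP addresses with the time at which each block expires."""

    def __init__(self):
        self._expiry = {}  # ip -> time at which the block ends

    def insert(self, ip, now, timeout):
        """Insert blacklist: block ip until now + timeout."""
        self._expiry[ip] = now + timeout

    def update(self, now):
        """Update blacklist: drop entries whose waiting period has expired."""
        expired = [ip for ip, end in self._expiry.items() if end <= now]
        for ip in expired:
            del self._expiry[ip]
        return expired

    def allows(self, ip, now):
        """Filtering: a packet passes only if its source is not blocked."""
        self.update(now)
        return ip not in self._expiry


class IpTable:
    """Per-IP packet counters, cleared after a run of calm windows."""

    RESET_AFTER = 300  # control windows without a warning, as in the text

    def __init__(self):
        self.counts = {}
        self.calm_windows = 0

    def record(self, ip):
        """Update IP table: count one more packet for ip, adding it if new."""
        self.counts[ip] = self.counts.get(ip, 0) + 1

    def end_window(self, warning):
        """Track consecutive calm windows and reset the table when stable."""
        self.calm_windows = 0 if warning else self.calm_windows + 1
        if self.calm_windows >= self.RESET_AFTER:
            self.counts.clear()
            self.calm_windows = 0


# Usage with hypothetical addresses and times (in seconds).
blacklist = Blacklist()
blacklist.insert("10.0.0.1", now=0.0, timeout=5.0)
blocked_early = not blacklist.allows("10.0.0.1", now=1.0)
allowed_later = blacklist.allows("10.0.0.1", now=5.0)

table = IpTable()
table.record("10.0.0.1")
for _ in range(IpTable.RESET_AFTER):
    table.end_window(warning=False)
print(blocked_early, allowed_later, table.counts)
```

Keeping the expiry time in the blacklist entry itself means that a single pass in `update` is enough to release addresses, which mirrors the scheduled update described for the handle-message step.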
4 Result Analysis and Discussion
Clients produce network traffic by sending a randomly generated number of bytes to the server over a finite number of TCP transfers scheduled at 0.3 s intervals. At the start of the simulation, the server receives the total number of bytes of the client requests (at the time of their generation); it then counts the number of bytes received and records the completion time of the requests when the clients finish transferring. The server accepts TCP transfers from clients and responds to pings immediately. The attack was simulated by configuring the Ping App zombies to send ping requests to the victim at rates of 100, 250, 500, and 1000 pings per second. Each ping has a 20% chance of failing to be sent, which introduces a random error representing network or zombie malfunctions. The experiments vary the strength of the attack as well as the buffer size of the network interfaces, using queue capacities of 50, 100, and 150 packets. Each configuration is run 30 times, each run corresponding to a sequence of 20 client transfers of a random number of bytes. The following metrics are considered.
Drop Quality: The percentage of discarded packets that came from zombies and were identified by the defender.
Server Throughput: The number of bits forwarded to the server after being processed and filtered by the defender.
Connection Utilization: The percentage of the channel entering the victim that is saturated.
Client Transmission Deadline: The time it takes for a client to transmit its data, which varies based on the size of the network.
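The first three metrics can be expressed as simple ratios. The sketch below shows one possible reading of the definitions above; the function names, the per-source interpretation of drop quality, and all numeric inputs are hypothetical and are not taken from the simulation output.

```python
def drop_percentage(dropped, received):
    """Share of packets from one source class (zombies or clients) that were dropped."""
    if received == 0:
        return 0.0
    return 100.0 * dropped / received


def throughput_bps(bits_forwarded, duration_s):
    """Bits forwarded to the server per second after filtering."""
    return bits_forwarded / duration_s


def utilization_pct(bits_on_channel, duration_s, capacity_bps):
    """Percentage of the channel capacity in use over the interval."""
    return 100.0 * bits_on_channel / (duration_s * capacity_bps)


# Hypothetical counts for a single run.
zombie_drop = drop_percentage(dropped=995, received=1000)
client_drop = drop_percentage(dropped=2, received=400)
server_rate = throughput_bps(bits_forwarded=8000, duration_s=2.0)
channel_use = utilization_pct(
    bits_on_channel=5_000_000, duration_s=1.0, capacity_bps=10_000_000
)
print(zombie_drop, client_drop, server_rate, channel_use)
```

Averaging these per-run values over the 30 runs of each configuration would give figures comparable in kind to those discussed below.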
A first consideration, based on the results obtained for the configurations of the experiments mentioned above, is the extent of the ongoing attack, which is conditioned by the frequency with which the zombies send pings to the server: for “low” frequency requests (100 ping/s), the communication channels and intermediate nodes are not overloaded enough to cause delays or changes in the traffic generated by the clients. Furthermore, throughout the attack, the defender and hence the server do not receive enough traffic to trigger the alarm.
Fig. 6 Drop quality on various intensities of attack (chart title: Packet Drop Ratio; y-axis: Drop Ratio; x-axis categories: ping 250, ping 500, ping 1000; legend entries: Ping Drop Zombies, Drop client)
Figure 6 shows the results related to drop quality, calculated as the average over the runs of each configuration. The percentage of dropped packets from zombies is generally very high (over 99%) and increases as more pings are sent. This happens because the number of packets sent by individual zombies is proportionally much higher than that generated by client requests: the increase in attack intensity makes it easier to identify addresses that are transmitting an excessive number of requests. As for the clients, if the number of packets they send during the alarm state is similar to that of the zombies (e.g., due to a peak of requests), they may be mistakenly inserted in the blacklist, as can be seen in the cases where the attack frequency is lower. This situation of uncertainty would be more frequent in a real case, where the number of zombies is typically higher and the frequency of sending pings is usually lower, and this would negatively affect the rejection accuracy. Figure 7 shows the throughput arriving at the server before and after the block filter was applied in the two most extreme cases (500 and 1000 ping/s; the x-axis represents ping and the y-axis represents seconds) for a single run. With the defense mechanism described above, only requests from addresses deemed “legitimate” are forwarded to the server, which avoids overburdening the server's computational resources and neutralizes the possible DDoS attack. Beginning at the end of the learning mode, the input and filtered throughput coincide for a period of time, as seen in the graphical representation. These values begin to diverge once the attack begins, when the defender reaches the warning state and the protection mechanism is initiated (from second 3 onward).
As the filtered IP addresses become available again, the throughput arriving at the server increases. If the attack is still ongoing, this triggers a new warning period and, as a result, the filter is reapplied to the incoming traffic when the new alarm is raised. Figure 8 indicates the percentage of time the channel was used before and after the defender filter in the two scenarios of intense attack (500 and 1000 ping/s), referring to a single run. In fact, when the zombie traffic is higher, the incoming channel to the server is more saturated before the defender intervenes; when this saturation reaches its maximum level, there could be a considerable slowdown (or even rejection) of client requests. As shown in Fig. 8, the actual use of the channel from the blocker to the server is much lower when the filter is applied to packets coming from IP addresses considered malicious. The average saturation of the incoming channel between the blocker and the server and the 95% confidence
Distributed Denial of Service Attack Detection and Prevention …
Fig. 7 Comparison of throughput measured with and without the blocker
intervals are calculated for each simulation second, starting from the beginning of the ICMP flood attack. As shown, as a result of the defender’s filtering, channel occupancy is more variable for the intensity of 500 ping/s than for 1000 ping/s; this is due to greater uncertainty in the filtering policy in the first case, which causes a much greater variation in the use of the channel.
Deadline for Client Transmission
Figure 9 depicts the completion time of client transmissions in three scenarios, each characterized by a different network interface buffer size: 50, 100, and 150 packets, respectively. To compare the duration of the transfers fairly across the three situations, the traffic generated by the clients was kept fixed. The duration of the transfers decreases as the buffer size increases, because smaller buffers are overloaded by the massive volume of ping requests, so packets that cannot be queued are dropped. As a result, client transfers are significantly delayed, and clients are forced to resend part of their requests.
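The relation between buffer size and dropped packets can be sketched with a simple drop-tail queue. Only the three buffer sizes (50, 100, and 150 packets) come from the text; the bursty arrival pattern, the service rate, and the function name are hypothetical values chosen to keep the arithmetic easy to follow, and the sketch does not reproduce the OMNeT++ model.

```python
def simulate_drop_tail(buffer_size, arrivals, service_per_tick):
    """Return (accepted, dropped) packet counts for a drop-tail queue.

    `arrivals` lists the number of packets arriving in each tick.
    """
    queued = accepted = dropped = 0
    for arriving in arrivals:
        # Packets are queued until the buffer is full; the rest are dropped.
        taken = min(arriving, buffer_size - queued)
        queued += taken
        accepted += taken
        dropped += arriving - taken
        # The interface then forwards up to `service_per_tick` packets.
        queued -= min(queued, service_per_tick)
    return accepted, dropped


# Hypothetical traffic: a burst of 200 packets every other tick, five bursts.
bursts = [200, 0] * 5
results = {
    size: simulate_drop_tail(size, bursts, service_per_tick=100)
    for size in (50, 100, 150)
}
for size, (accepted, dropped) in results.items():
    print(size, accepted, dropped)
```

Under these assumptions the number of dropped packets falls as the buffer grows, so fewer client packets would need to be resent, which is consistent with the shorter transfer times reported for larger buffers.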
Fig. 8 Comparison with and without blocker, in the two cases of intensity 500 ping/s and 1000 ping/s
Fig. 9 Client transmission time comparison depending on the router buffer, with intensity fixed at 500 ping/s
Simulation Tool
This project was built with version 4.3 of the OMNeT++ simulation system. OMNeT++ is a C++ simulation toolkit and framework that is largely used to create network simulators. It is extensible, modular, and component-based, and the INET library is used to model a real network with the applications and protocols commonly used by Web services. The elements of the proposed model are based on INET objects.
5 Conclusion
The defense mechanism that has been introduced is effective and precise. The approach is most effective when different traffic patterns, corresponding to the usage of the Web service in different time slots, are considered; however, the protection mechanism was created specifically for ICMP-based DDoS attacks. As the experimental results show, the proposed work efficiently detects and analyzes the DDoS attack. The configurations with and without the blocker were compared on performance metrics such as packet drop ratio, throughput ratio, and intensity. This research work concludes that the proposed blocker method gives effective results, such that most of the anomalous traffic can be efficiently blocked.
Secure Credential Derivation for Paperless Travel
Tarun Tanmay Bhatta and P. N. Kumar
Abstract The international travel spectrum is a challenging environment, in which participants have aims that are often at odds with one another. The traveller wants to be able to plan his schedule ahead of time and travel conveniently, avoiding long lines and unexpected issues. In this paper, we discuss a solution that meets the demands of stakeholders and service providers, who strive to detect security concerns, innovate, improve their service quality, and use the resources at their disposal, while respecting the traveller’s need for comfort and privacy. In the current travel space, travel depends on physical documents, biometric verification using face and fingerprint data, and long immigration and departure lines. Instead, we propose a solution based on an electronic travel document stored on the traveller’s phone in the form of a QR code and protected by cryptographic procedures, without compromising privacy. We also discuss the flaw in existing solutions and propose a novel solution that uses a set of algorithms and a digital certificate to address it. The user can also use this method to access multiple other facilities from service providers at the destination without further verification or authorization. A prototype has been created and tested, demonstrating the expected benefits for all the stakeholders involved. The PACE algorithm used in this paper improves on the BAC algorithm usually employed in chips because it is cryptographically more secure: it produces session keys with strong entropy that do not depend on the entropy of the password.
Keywords Biometrics · E-passport · QR code · PACE · Cryptography · CAN · Derived credentials · Security
T. T. Bhatta (B) · P. N. Kumar
Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_31
T. T. Bhatta and P. N. Kumar
1 Introduction
The present system of passport verification, and the subsequent use of passports and credentials on foreign soil, is prone to many risks. The first is the manual verification of identity documents, and the second is the misuse of sensitive personal documents that the traveller might submit to local service providers. This has led to a gradual shift towards digital documents and e-passports. E-passports are widely used in a variety of countries; others are on the verge of replacing their conventional passports with e-passports because of the increased protection they offer. An e-passport can be identified by the emblem assigned to it by the International Civil Aviation Organization (ICAO). The embedded chip contains all of the individual’s information in encrypted form. Whenever a traveller lands on foreign soil, the first thing that needs to be authenticated is his identity, which the present e-passport systems can handle. After a careful literature survey, however, we have identified loopholes in the present system that make it vulnerable to eavesdropping and possibly to manipulation of the data. We aim to improve on that front by using a combination of different algorithms specified in Sect. 3. After accomplishing this, we can also extend the method to identify the user to various service providers at the destination without the need to furnish more documents. To guarantee the service provider the authenticity of the credential during authentication, the credential must be sent to the user securely (i.e., the device must be authenticated to the user) after the user’s identity has been verified using the chip embedded in the passport. Figure 1 is an example of the use of derived credentials by the services. This paper describes the process of securely verifying the authenticity of the e-passport remotely using the traveller’s smartphone, and its implementation.
The end-to-end process is described in the methodology section, which covers all the stakeholders involved and the protocol stack used. We have used a combination of encryption and authentication algorithms, which are described in detail in the implementation section. The BAC algorithm is generally employed in chips due to its simplicity, but it has its drawbacks. Over time, it was shown to be less secure because of the limited randomness and entropy in its key generation. We have used the PACE algorithm, which has much more reliable key generation. As a result, the suggested system presents a viable architecture that may be used in today’s environment to improve convenience, privacy, efficiency, dependability, security and accountability.
Fig. 1 Usage of derived credential by various services
2 Related Work
The introduction of near-field communication (NFC) technology has opened several new arenas for technological use cases. NFC differs from other short-range wireless systems, such as Bluetooth, in that it does not require “pairing” or any device configuration. When a pair of NFC-capable phones or gadgets are brought into proximity, they can automatically detect their approved mode of communication and the rate at which data is shared [1, 2]. NFC is similar to Bluetooth in that it allows data to be sent and received wirelessly, although some main differences exist. Bluetooth can transmit data over a comparatively long distance (10–100 m), whereas NFC transmits data only over a short distance (4–10 cm). NFC has the advantages of being more stable, using fewer resources, and being more user-friendly [3]. Because of its short range, NFC is less exposed to data interception, which can happen more easily with Bluetooth. Many encryption algorithms are available to keep sensitive data secure. The sender’s data is kept in a text file; therefore, a strong encryption algorithm is required to protect it. The software employs the advanced encryption standard (AES) algorithm, a symmetric encryption/decryption algorithm that supports key lengths of 128, 192, and 256 bits [4]. AES is a public encryption algorithm that is both stable and commonly used. It outperforms the data encryption standard (DES) algorithm in terms of speed and security. Since it is a symmetric algorithm, it needs only a single key for both encryption and decryption of data. AES uses ten rounds for 128-bit keys, twelve rounds for 192-bit keys, and fourteen rounds for 256-bit keys during the encryption/decryption process. In each round, AES applies a round key to the array of data, along with other mathematical operations, to obtain the final ciphertext [5]. Aldya et al. [6] compare ECDSA and RSA digital signatures in terms of key length and time, as well as their benefits and drawbacks. They conclude that, in comparison to RSA, ECDSA offers the same degree of protection with smaller key sizes, and they also discuss ECDSA’s security and vulnerabilities. Kiran et al. [7] conclude that ECDSA outperforms the RSA digital signature algorithm in terms of key generation and signature generation, but that RSA outperforms ECDSA in terms of signature verification. As a result, where signature verification is more frequent than key generation and signature generation, RSA is a better option. Furthermore, when compared to RSA, ECDSA needs a smaller key size and less energy for the same level of protection, making it a better choice for low-resource applications like smart cards and embedded systems.
3 Methodology
The protocol we discuss addresses the weak entropy of the Basic Access Control (BAC) algorithm, which the existing methods discussed in [8, 9] use to read data from the passport. It defines a novel way to derive a credential from an e-passport using NFC, allowing public authorities to authenticate the passport remotely and store the credential in the form of a QR code. Below, we discuss the existing methods and our proposed solution in detail.
3.1 Existing Travel Process
In the existing travel and identification environment, ICAO-compliant e-passports augmented with NFC-enabled chips are widely deployed. The e-passport includes printed pages as well as data saved on a chip. The data stored in the chip includes the bio-data printed in the passport, biometric information such as iris, facial image and fingerprint, and crucial information required to check the authenticity and integrity of the chip. For the storage of data components, the ICAO developed a standardised data format called the logical data structure (LDS). This was done to ensure global compatibility between e-passport tags and readers (Table 1). The existing models defined in [8, 9] use basic access control (BAC) to authenticate the user and read data from the passport. The protocol defined in [8] describes the use of e-passports for registering individuals for subscription to a particular service provider, as e-passports guarantee a strong binding between the e-passport and the individual to whom it was issued, with the help of biometric technology. The model described in [9] defines a protocol for using derived credentials for travellers: it uses BAC to authenticate and to read data from the passport, uses passive authentication (PA) to verify the integrity of the data stored in the individual’s passport, and stores the derived credential on his smartphone using cryptographic algorithms.
Table 1 LDS file name and data elements stored in them
Data groups (DG)    Data elements
EF.DG1              Bio-data and MRZ data
EF.DG2              Encoded facial image
EF.DG3              Encoded fingerprint
EF.DG14             Public key for EAC
EF.DG15             Public key for AA
EF.SOD              Hash of data groups
3.2 Drawbacks of the Existing Travel Process
For most passengers, the present travel procedure works reasonably well, although it has several restrictions or downsides that could be addressed. A few of them are listed below.
• In the current travel system, a traveller has to be physically present to obtain travel credentials such as a visa. The manual verification of the credentials leads to delays and errors, causing frustration for public authorities and the traveller.
• The transaction logs and records of travellers are processed manually, making them vulnerable to misplacement and unauthorized changes.
• There is no clear policy or strategy for the storage and disposal of personal data in the airport kiosk after verification, leading to non-compliance with legal provisions such as the “right to be forgotten”.
• The use of outdated algorithms like basic access control (BAC) in [8, 9] to authenticate and read data from passports makes the user data vulnerable to threats like eavesdropping.
In our proposed solution, we address the above drawbacks and the pain points of the traveller and public authorities.
3.3 Proposed Solution and Architecture
In our proposed system, a smartphone application reads an e-passport using NFC, collects data from the user, such as the password needed to access the traveller information on the chip, and sends the data to the public authority server. The server checks the data against databases such as those of blacklisted and expired documents and verifies the integrity and authenticity of the passport. On successful authentication, it stores the basic user bio-data and the facial image in an encrypted format. After digitally signing the data with the issuer’s private key, it creates a QR code from the digital signature and the document number and sends it to the user’s phone (see Fig. 2). This QR code is then presented to a travel kiosk at the airport of arrival. The kiosk is where biometric verification is done in a private manner and authorization to enter the nation is determined. The kiosk reads the QR code and fetches the details from the database using the document number stored in the QR code. The kiosk is a trusted device of the issuer; hence, it can decrypt the information stored in the database. Using the public key of the issuer, it verifies the digital signature in the QR code, compares the stored facial image with the traveller, and decides whether the traveller should be allowed to enter the country or go for further checks with the public authority in case of doubt. The storage of the traveller’s bio-data empowers the public authorities to take quick action against fugitives and economic offenders, as they can blacklist a traveller even after a derived credential has been issued. It also avoids the extra hassle of rescanning the passport at the destination airport.
Fig. 2 Proposed system design
3.4 Cryptographic Protocols
To read and authenticate the e-passport, we use various cryptographic protocols defined by ICAO. The protocols are described in detail below.
3.4.1 PACE
The protocol allows the e-passport chip to confirm that the inspection system has permission to access data stored on the device. Because of the BAC protocol’s flaws, a new, more secure method of reading the e-passport has been created. The Password Authenticated Connection Establishment (PACE) protocol uses asymmetric cryptography based on the Diffie-Hellman (DH) algorithm to solve the issue of poor entropy in the basic access control protocol. PACE uses the password to generate strong session keys. This password can be derived from the document number, date of expiry and date of birth, as in BAC, or it can be the Card Access Number (CAN). In the proposed system the password is the CAN, an arbitrary number printed inside the passport. The clear advantage of the CAN is that it is shorter than the MRZ and therefore simpler to enter manually. The detailed algorithm is as follows:
1. The reader reads the EF.CardAccess file in the chip to retrieve the parameters for the PACE protocol.
2. The reader uses a Key Derivation Function (KDF) to derive a key (Kπ) from the CAN or MRZ information.
3. The reader sends the Manage Security Environment command to the chip to select one of the supported cipher suites stored in the EF.CardAccess file.
4. The chip generates a 16-byte nonce (s) and encrypts it with Kπ, using the symmetric cipher defined in the EF.CardAccess file, to obtain z.
5. Using the General Authenticate command, the reader requests the encrypted nonce (z) and the static domain parameters (Dicc).
6. After the reader has decrypted z to recover s, the chip and the reader both calculate the ephemeral domain parameters from Dicc and s.
7. Both the reader and the chip generate a public key and a secret key based on the ephemeral domain parameters (PKicc, SKicc, PKifd, SKifd) and exchange public keys.
8. After exchanging public keys, the chip and the reader calculate the key seed (K) and derive keys such as the encryption key (Kenc) and the MAC key (Kmac).
9. To verify the successful execution of PACE, the reader uses the Mutual Authentication command to send a MAC calculated with Kmac over PKicc.
10. The chip responds to the reader with a MAC calculated with Kmac over PKifd.
11. If both the chip and the reader are able to verify the MAC, the PACE protocol is said to be successfully executed (Fig. 3).
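The PACE steps listed above can be traced in a minimal Python sketch. It is a toy model under several assumptions that are not taken from the paper: the Diffie-Hellman group is a small Mersenne prime rather than a standardized domain parameter set, the nonce is protected with an XOR mask instead of the cipher named in EF.CardAccess, HMAC-SHA-256 stands in for the cipher-suite MAC, the generator is mapped as g^s · h (the generic mapping), and the KDF counters (1 for Kenc, 2 for Kmac, 3 for Kπ) are assumed to follow the usual ICAO convention. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy prime modulus (assumption), far too small for real use
G = 3           # toy generator (assumption)


def kdf(secret: bytes, counter: int) -> bytes:
    """Derive a 16-byte key as SHA-256(secret || 32-bit counter), truncated."""
    return hashlib.sha256(secret + counter.to_bytes(4, "big")).digest()[:16]


def mask(data: bytes, key: bytes) -> bytes:
    """XOR stand-in for the symmetric cipher that protects the nonce."""
    pad = hashlib.sha256(key).digest()[: len(data)]
    return bytes(a ^ b for a, b in zip(data, pad))


def encode(value: int) -> bytes:
    """Fixed-length big-endian encoding of a group element."""
    return value.to_bytes(16, "big")


def private_key() -> int:
    return secrets.randbelow(P - 2) + 1


def tag(key: bytes, public_key: int) -> bytes:
    return hmac.new(key, encode(public_key), hashlib.sha256).digest()


def run_pace(chip_password: str, reader_password: str) -> bool:
    """Run the toy exchange and report whether both MACs verify."""
    # Steps 2 and 4: both sides derive K_pi; the chip protects a fresh nonce s.
    k_pi_chip = kdf(chip_password.encode(), 3)
    k_pi_reader = kdf(reader_password.encode(), 3)
    s_chip = secrets.token_bytes(16)
    z = mask(s_chip, k_pi_chip)
    # Step 5: the reader recovers s, correctly only if the passwords match.
    s_reader = mask(z, k_pi_reader)

    # Step 6: a first DH exchange gives a shared value h; each side then
    # maps the static generator to an ephemeral generator g^s * h.
    x_chip, x_reader = private_key(), private_key()
    h_chip = pow(pow(G, x_reader, P), x_chip, P)
    h_reader = pow(pow(G, x_chip, P), x_reader, P)
    g_chip = pow(G, int.from_bytes(s_chip, "big"), P) * h_chip % P
    g_reader = pow(G, int.from_bytes(s_reader, "big"), P) * h_reader % P

    # Step 7: ephemeral key pairs on the mapped generator, public keys exchanged.
    sk_icc, sk_ifd = private_key(), private_key()
    pk_icc, pk_ifd = pow(g_chip, sk_icc, P), pow(g_reader, sk_ifd, P)

    # Step 8: key seed K and the MAC key (K_enc would be kdf(K, 1)).
    k_mac_chip = kdf(encode(pow(pk_ifd, sk_icc, P)), 2)
    k_mac_reader = kdf(encode(pow(pk_icc, sk_ifd, P)), 2)

    # Steps 9 to 11: each side sends a MAC over the other side's public key.
    chip_accepts = hmac.compare_digest(tag(k_mac_reader, pk_icc), tag(k_mac_chip, pk_icc))
    reader_accepts = hmac.compare_digest(tag(k_mac_chip, pk_ifd), tag(k_mac_reader, pk_ifd))
    return chip_accepts and reader_accepts


print(run_pace("123456", "123456"), run_pace("123456", "654321"))
```

With matching passwords both sides recover the same nonce, arrive at the same ephemeral generator and key seed, and the MACs verify. With a wrong CAN the reader maps to a different generator, so the derived keys differ and the exchange fails without revealing the password.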
Fig. 3 PACE protocol
3.4.2 Active Authentication
This protocol protects the chip against cloning. To prove the authenticity of the chip, a challenge-response algorithm is used. The detailed algorithm is as follows:
1. The inspection system generates an 8-byte random nonce (RND.IFD) and sends it to the chip using the Internal Authenticate command.
2. The chip generates a 106-byte nonce (M1); the length of the nonce (Lm1) is calculated from the key length (k) and the length of the hash (Lh).
3. The chip takes RND.IFD as M2 and concatenates M1 and M2 to build M.
4. The chip uses a hash function (indicated by the trailer BC) to calculate the hash (H) of M and creates the message representative (F) by concatenating 6A, M1, H and BC.
5. Finally, the chip encrypts F with its private key to produce the signature and sends it to the inspection system.
The inspection system verifies the received message using the public key stored in the chip’s DG15 file. It recovers the message representative (F) and extracts M1 and the hash (H) from it. It concatenates M1 with the known M2, calculates the hash using the same hash function, and compares the result with the extracted hash H. The following figure is the message flow for the Active Authentication protocol.
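The challenge-response exchange described above can be sketched as follows. This is a simplified illustration, not the card's implementation: the RSA key is a toy key built from two known Mersenne primes (insecure, chosen only so that the example runs with the standard library), SHA-1 is assumed as the hash associated with the trailer BC, and the function names are hypothetical. The length of M1 follows from the key length minus the hash length and the two framing bytes; with a 1024-bit key and SHA-1 the same formula gives 128 − 20 − 2 = 106 bytes, which matches the length quoted in step 2.

```python
import hashlib
import hmac
import secrets

# Toy RSA key (assumption): two known Mersenne primes, for illustration only.
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)
K_BYTES = (N.bit_length() + 7) // 8      # key length k in bytes
HASH_LEN = hashlib.sha1().digest_size    # L_h, 20 bytes for SHA-1
M1_LEN = K_BYTES - HASH_LEN - 2          # room left after header 6A and trailer BC


def chip_sign(rnd_ifd: bytes) -> bytes:
    """Chip side: build F = 6A || M1 || H || BC and sign it with the private key."""
    m1 = secrets.token_bytes(M1_LEN)                 # step 2: chip nonce M1
    h = hashlib.sha1(m1 + rnd_ifd).digest()          # steps 3 and 4: H = hash(M1 || M2)
    f = b"\x6a" + m1 + h + b"\xbc"
    return pow(int.from_bytes(f, "big"), D, N).to_bytes(K_BYTES, "big")


def inspection_verify(signature: bytes, rnd_ifd: bytes) -> bool:
    """Inspection side: recover F with the public key and recompute the hash."""
    f = pow(int.from_bytes(signature, "big"), E, N).to_bytes(K_BYTES, "big")
    if f[0] != 0x6A or f[-1] != 0xBC:
        return False
    m1, h = f[1:1 + M1_LEN], f[1 + M1_LEN:-1]
    return hmac.compare_digest(h, hashlib.sha1(m1 + rnd_ifd).digest())


challenge = secrets.token_bytes(8)       # step 1: RND.IFD
response = chip_sign(challenge)
print(inspection_verify(response, challenge))
```

Because the hash covers the fresh challenge, a response recorded for one challenge does not verify against another, which is what prevents a cloned chip without the private key from replaying old responses.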
3.4.3 Passive Authentication
This protocol is used to protect the integrity and authenticity of the data in the chip. During the creation of the chip, the data contained in the chip is digitally signed. A so-called document signer certificate is used for this, which is signed with the originating country’s CSCA (Country Signing Certification Authority) certificate and is only available to legally commissioned document issuers. The detailed algorithm is as follows:
1. The inspection system reads the EF.SOD file in the chip, which contains the hash values of the data groups and the digital signature of the hash values.
2. The inspection system extracts the algorithm used to hash the files from EF.SOD. It calculates the hash of the DG files using this algorithm and compares the results with the hashes stored in the EF.SOD file.
3. The inspection system builds and validates a certification path from a Trust Anchor to the Document Signer Certificate that was used to sign the Document Security Object.
4. It verifies the signature of the EF.SOD file using the Document Signer Public Key (KPuds).
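The hash comparison in step 2 can be sketched in a few lines of Python. The sketch assumes that the hash algorithm name and the per-data-group hashes have already been parsed out of EF.SOD into a string and a dictionary; the function name, the dictionary layout, and the sample contents are hypothetical. Certificate path validation and signature verification (steps 3 and 4) are not covered.

```python
import hashlib
import hmac


def mismatched_data_groups(sod_algorithm, sod_hashes, data_groups):
    """Return the names of data groups whose hash differs from the SOD entry."""
    mismatched = []
    for name, expected in sod_hashes.items():
        content = data_groups.get(name)
        if content is None:
            mismatched.append(name)          # file listed in the SOD but not supplied
            continue
        actual = hashlib.new(sod_algorithm, content).digest()
        if not hmac.compare_digest(actual, expected):
            mismatched.append(name)
    return mismatched


# Hypothetical data group contents as read from the chip.
data_groups = {"DG1": b"MRZ and bio-data", "DG2": b"encoded facial image"}
sod_hashes = {name: hashlib.sha256(content).digest() for name, content in data_groups.items()}

untouched = mismatched_data_groups("sha256", sod_hashes, data_groups)
tampered = mismatched_data_groups("sha256", sod_hashes, {**data_groups, "DG2": b"another face"})
print(untouched, tampered)
```

An empty result means every data group matches the hash recorded at issuance; any replaced file, such as a substituted facial image, is reported by name.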
3.5 Derivation of Credentials
In this process, shown in Fig. 4, the traveller uses his NFC-enabled phone to read the data from the e-passport issued to him and sends the relevant information to the backend server (immigration), which remotely authenticates the e-passport. On successful authentication, the server issues a digital signature using its private key and sends it to the user’s mobile phone in the form of a QR code. In the first step, the user, using a smartphone and the Android application developed as a prototype for this project, enters the CAN or the password as part of the proof of being the owner of the passport. As the next step, the traveller places the smartphone on top of his passport. The application is designed to use NFC technology to read the content of the passport. The application performs the PACE protocol and reads the data. In the next step, the application connects to a remote server, deployed by the immigration officials, for remote authentication of e-passports. The connection to the server is established using the Transport Layer Security (TLS) protocol. On successful connection with the server, the smartphone sends the data of the DG1, DG2, DG14, DG15 and EF.SOD files.
Fig. 4 Proposed system architecture for the derivation of credentials
On receiving these files, the server performs a couple of checks, such as checking the blacklist database and the stolen-eID database. Then, to confirm the authenticity of the chip, the server creates 8 bytes of random challenge data and sends it to the chip, with the Android application acting as a proxy. In response, the chip signs the challenge with its private key and sends it to the server via the smartphone. The server verifies the signature. This protocol is called Active Authentication. On successful Active Authentication, the server starts the process of verifying the integrity of the data stored in the chip. It uses the files sent by the smartphone at the start of the process, extracts the hashes stored in the EF.SOD file, computes the hashes of the data group files using the algorithm specified in EF.SOD, and compares them with the stored hashes. It then verifies the signature stored in the EF.SOD file, which confirms that the files in the chip have not been altered. Finally, the server calculates a digital signature over the bio-data and facial image using its private key and creates an entry in the database, where it stores only the traveller’s bio-data, facial image and digital signature. This digital signature is then embedded into a QR code together with the document number and sent to the traveller’s smartphone, to be stored and used at the destination airport. Since the traveller’s personal information is sensitive, it is stored in the database protected with strong encryption algorithms. The airport kiosk terminals are trusted devices of the public authorities, so they have the keys for connecting to the database and for fetching and decrypting the information stored in it.
3.6 Verification of Credentials
At the kiosk in the airport, the traveller presents his QR code for verification. Figure 5 shows the proposed architecture. The flow starts with scanning the QR code stored on the traveller’s smartphone.
Fig. 5 Proposed system architecture for the verification of user credentials
The kiosk terminal, upon scanning the QR code on the traveller’s phone, connects to the database in which the traveller’s information was stored during the credential derivation phase. It fetches the encrypted document using the document number embedded in the QR code. The kiosk terminal, being a trusted device of the public authorities, has the keys to decrypt the contents. The kiosk terminal compares the signature in the QR code with the signature stored in the database and also verifies the signature using the public key of the server (immigration). Then, as the last step, the facial image stored in the database is physically compared with the traveller. In case of any discrepancy or lack of trust, the verifying authority can ask the traveller to go to a human officer (public authority) for further checks.
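The lookup and signature comparison performed by the kiosk can be sketched as follows. The sketch makes several assumptions: the QR code carries a JSON object with the hypothetical field names `documentNumber` and `signature` (Base64), an in-memory dictionary stands in for the database, and the record is assumed to be already decrypted. Verification of the signature with the server's public key and the facial comparison are outside the sketch.

```python
import base64
import hmac
import json


def lookup_credential(qr_text, database):
    """Return the stored record if the QR signature matches it, otherwise None."""
    payload = json.loads(qr_text)
    record = database.get(payload["documentNumber"])
    if record is None:
        return None                                   # unknown document number
    qr_signature = base64.b64decode(payload["signature"])
    if not hmac.compare_digest(qr_signature, record["signature"]):
        return None                                   # QR code does not match the entry
    return record


# Hypothetical database entry created during credential derivation.
stored_signature = bytes(range(64))
database = {
    "X1234567": {"bio_data": "...", "facial_image": b"...", "signature": stored_signature}
}


def make_qr(document_number, signature):
    return json.dumps(
        {"documentNumber": document_number, "signature": base64.b64encode(signature).decode("ascii")}
    )


valid = lookup_credential(make_qr("X1234567", stored_signature), database)
forged = lookup_credential(make_qr("X1234567", bytes(64)), database)
unknown = lookup_credential(make_qr("Z0000000", stored_signature), database)
print(valid is not None, forged, unknown)
```

Only when the record is found and the signatures agree does the kiosk proceed to the public-key verification and the facial comparison; in every other case the traveller would be referred for further checks.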
4 Implementation
We have created an Android application to collect user data for the application request and to read passport data. This software also connects to a credential derivation server to generate and store a mobile travel credential. We also built kiosk software to scan the QR code, retrieve the traveller’s mobile travel credential from the traveller’s phone, and connect to backend servers for the verification process when needed.
4.1 Derivation of Credentials
4.1.1 Chip Card
The chip card used here is the Infineon Technologies Secora eID. It is personalized using JCIDE (Java Card IDE). The card uses the SHA-256 hash algorithm and then RSA-2048 to sign the challenge sent by the server for Active Authentication. The document hashes are calculated using the SHA-256 algorithm and are signed with the RSA-2048 algorithm.
4.1.2 The Android Smartphone
We have used the Android platform in our implementation. The application we have designed and developed uses NFC to communicate with the chip card. The communication details of the Android application are as follows:
• Smartphone to Chip: On the success of PACE, the communication between the chip and the smartphone happens through a secured channel. During the communication with the chip, the smartphone sends commands to the chip with extended APDUs (Application Protocol Data Units).
• Smartphone to Server: The smartphone establishes secure communication with the server using SSL (Secure Sockets Layer). To send the data group files, the binary content of each file is converted to Base64 and sent in JSON.
The server is implemented in Spring Boot. We have used Spring Security to authenticate the users and secure the API endpoints. Below are the various technologies and algorithms used to ensure that the entire transaction takes place in a secure environment.
API endpoints:
/ePassport/authenticate: This endpoint grants access to users based on their username and password. It authenticates the user and generates a JWT (JSON Web Token) for the username. It uses HMAC-SHA-256 to sign the token. The passwords are stored in the database in a secure manner using PBKDF2 (Password-Based Key Derivation Function 2).
/ePassport/submit: This endpoint accepts LDS files with media type JSON. Using the information from these files, the server checks the document against the blacklist database. It then uses a CSPRNG (cryptographically secure pseudo-random number generator), which ensures sufficient randomness and unpredictability, to generate an 8-byte challenge that is sent to the smartphone as part of the Active Authentication protocol.
/ePassport/verifyresponse: This endpoint accepts all the data group files required for Active Authentication, Passive Authentication and issuance of the QR code. On successful Active Authentication and Passive Authentication, the server creates a JSON data structure with the following information:
• Document Number
• Digital Signature
The ZXing library is used to create the QR code.
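The primitives named for the endpoints above (PBKDF2 password storage, an HMAC-SHA-256 signed JWT, and a CSPRNG challenge) can be sketched with the Python standard library. The server described in the paper is a Spring Boot application, so this is a language-neutral illustration rather than its code; the function names, the iteration count, the `sub` claim, and the sample key are assumptions.

```python
import base64
import hashlib
import hmac
import json
import secrets


def b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """PBKDF2-HMAC-SHA-256 hash to store instead of the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)


def issue_jwt(username: str, key: bytes) -> str:
    """Build a token of the form header.payload.signature signed with HS256."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"sub": username}).encode())
    signature = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def verify_jwt(token: str, key: bytes) -> bool:
    """Recompute the HS256 signature and compare it in constant time."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    header, payload, signature = parts
    expected = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected).encode(), signature.encode())


def new_challenge() -> bytes:
    """8-byte Active Authentication challenge from the operating system CSPRNG."""
    return secrets.token_bytes(8)


server_key = secrets.token_bytes(32)      # hypothetical server-side signing key
token = issue_jwt("traveller01", server_key)
salt = secrets.token_bytes(16)
stored_hash = hash_password("correct horse", salt)
print(verify_jwt(token, server_key), len(new_challenge()))
```

A production service would also put an expiry claim in the token and check it on every request; that is omitted here to keep the sketch short.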
The server signs the DG1 data (biodata) and the DG2 data (facial image) with 256-bit ECDSA [10]. A 256-bit ECDSA key is considerably smaller than an RSA key of similar strength; in terms of security, it is comparable with RSA 3072 [3]. This reduces the storage required, so the signature can be stored in a QR code, which has limited capacity. Database: The following information is stored in the database for reference by the public authorities: DG1 and DG2 of the relevant chip, and the signature. The primary key is the document number. The information is encrypted using AES (Advanced Encryption Standard) and can only be decrypted by trusted public authority terminals or devices. The database is a NoSQL database hosted on a cloud server.
Secure Credential Derivation for Paperless Travel
4.2 Verification of Credentials
On visiting the kiosk, the traveller presents the QR code to be scanned, which initiates the verification of his or her credentials.
4.2.1 Scanning
The ZXing library is used to extract information from the QR code.
4.2.2 Data Retrieval
The kiosk connects to the database over TLS. The document number from the QR code is used as the primary key to retrieve the document. The decryption key is available at the kiosk, as it is a trusted device.
5 Result and Inference
The proposed system is implemented on an Android device, and the server for credential issuance and verification is built with the Spring Boot framework. Below are excerpts of the work done and inferences with respect to the current and proposed methodologies (Fig. 6).
5.1 Threat Vector Analysis
5.1.1 Skimming
This is an online attack in which the e-passport is read without the approval of the document holder and without physical access to the e-passport. To mitigate it, ICAO has specified protocols that require the MRZ information (bio-data) or the CAN to be entered. This information is used as a key to access the e-passport. Thus, the inspection system verifies that the data printed on the passport and stored on the chip belong to the individual presenting the document, who has willingly handed it over for verification. The BAC protocol requires the MRZ information, while the PACE protocol offers a choice between the MRZ and the CAN. This choice is the highlight of our proposed system, which worked seamlessly in the test phase.
Fig. 6 Image 1: the user enters a username and password to obtain the JWT, and the CAN to read data from the passport. Image 2: the derived credentials and the QR code sent to the traveller's phone after the immigration server has successfully authenticated the user data. Image 3: an excerpt from the database after a successful credential derivation, with "id" as the document number, showing the entry of the traveller's information to be used during credential verification
5.1.2 Eavesdropping
This is an offline attack, where the attacker uses methods to record the data transferred between the e-passport and the reading device. To mitigate this attack, the inspection system establishes encrypted communication with the e-passport. The PACE protocol employs more secure communication using cryptographic techniques and is a better choice than BAC, which is vulnerable to this attack due to the weak entropy of the key.
5.1.3 Cloning
This involves creating a cloned passport using the document holder's bio-data. This attack is possible if the private key generated during the production of the chip is compromised. This private key is stored in the secure memory of the chip and cannot be read or written from outside the chip. The Active Authentication protocol, which most current systems do not use, is used here to check whether a chip is cloned.
Table 2 Different threat vectors and solution
Threat | Solution (protocols)
Skimming | BAC, PACE
Eavesdropping | PACE
Cloning | Active authentication
Forging | Passive authentication
5.1.4 Forging
This involves changing the content of the chip. During personalization of the e-passport, the hash of each data group file is calculated and stored in the Document Security Object (EF.SOD). The Passive Authentication protocol is used to verify whether the files have been changed, and thus this attack can be mitigated (Table 2).
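The hash comparison at the core of this check can be sketched in Python as follows. This is a simplified illustration under stated assumptions: SHA-256 is assumed as the hash algorithm, the function names are hypothetical, and the verification of the issuer's signature over the stored hashes, which Passive Authentication also requires, is deliberately left out.

```python
import hashlib
import hmac


def data_group_hashes(data_groups):
    """Hash every data group file (SHA-256 is an assumption for this sketch)."""
    return {name: hashlib.sha256(content).digest()
            for name, content in data_groups.items()}


def find_modified_groups(data_groups, stored_hashes):
    """Return the names of data groups that are missing or whose hash differs."""
    modified = []
    for name, expected in stored_hashes.items():
        content = data_groups.get(name)
        if content is None:
            modified.append(name)
            continue
        actual = hashlib.sha256(content).digest()
        if not hmac.compare_digest(actual, expected):
            modified.append(name)
    return sorted(modified)


if __name__ == "__main__":
    chip = {"DG1": b"MRZ data", "DG2": b"facial image bytes"}
    stored = data_group_hashes(chip)           # written at personalization time
    forged = dict(chip, DG2=b"another face")   # attacker swaps the image
    print(find_modified_groups(chip, stored), find_modified_groups(forged, stored))
```

Any data group reported by `find_modified_groups` would cause the server to reject the document before a credential is issued.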
5.2 Comparison of the Existing and Proposed Methodology
Based on the threats to a smartcard discussed in Sect. 5.1, Table 3 compares the existing methodologies with the proposed methodology. We verified in our test phase that the methodology described in this paper, with this specific set of protocols and algorithms, helps to solve the problems listed in Table 3.
Table 3 Comparison of existing methodology and proposed methodology
Methodology | Prone to forgery | Prone to skimming | Prone to eavesdropping | Prone to cloning
Securely derived identity credentials on smart phones via self-enrolment, 2016 [8] | NO | NO | YES | NO
Mobile travel credentials, 2019 [9] | NO | NO | YES | YES
Proposed method | NO | NO | NO | NO
6 Conclusion
The methodology defined in this paper was able to verify the authenticity of the passport remotely. We carried out extensive research on the existing protocols and cryptographic techniques for implementing this application. Which solution is best for which situation is determined by a number of factors, such as cost, effort, and the willingness of different parties to cooperate. We successfully implemented the entire methodology described and produced results efficiently in terms of time, cost, bandwidth and complexity with the highest level of authenticity, as described in the results and inference. We have outlined and also shown that the combination of the algorithms used here is the best in the industry for authenticating the traveller while following all the guidelines set by ICAO. We have also shown that the digital certificate issued through this method is secure enough to one day be used in place of the passport system.
Acknowledgements This work was carried out as a proof of concept at Infineon Technologies Pvt. Ltd., Bangalore, India. As a student trainee, I was allowed to do research and develop a prototype on their Secora smartcard. We are much obliged to the Department of CSE and Infineon India for having facilitated a seamless research environment even during the adverse circumstances caused by the COVID-19 pandemic.
References
1. Information technology - telecommunications and information exchange between systems - near field communication - interface and protocol-2 (NFCIP-2). ISO/IEC Std. 21481 (2005)
2. Information technology - telecommunications and information exchange between systems - near field communication - interface and protocol (NFCIP-1). ISO/IEC Std. 18092 (2004)
3. V. Coskun, B. Ozdenizci, K. Ok, A survey on near field communication (NFC) technology. Wirel. Pers. Commun. 71(3), 2259–2294 (2012)
4. G. Singh, S. Supriya, A study of encryption algorithms (RSA, DES, 3DES and AES) for information security. Int. J. Comput. Appl. 67(19), 33–38 (2013)
5. R. Bhanot, R. Hans, A review and comparative analysis of various encryption algorithms. Int. J. Secur. Appl. 9(4), 289–306 (2015)
6. A.P. Aldya, A. Rahmatulloh, M.N. Arifin, Stateless authentication with JSON web tokens using RSA-512 algorithm. J. Infotel 11(2), 36–42 (2019)
7. K.V.V.N.S. Kiran, N. Harini, Evaluating efficiency of HMAC and digital signatures to enhance security in IoT. Int. J. Pure Appl. Math. 119, 13991–13996 (2018)
8. F. van den Broek, B. Hampiholi, B. Jacobs, Securely derived identity credentials on smart phones via self-enrolment. Institute for Computing and Information Sciences, Radboud University, Nijmegen, The Netherlands, 17 Sept 2016
9. D. Bissessar, M. Hezaveh, F. Alshammari, C. Adams, Mobile Travel Credentials (Springer Science and Business Media LLC, 2019) (Chapter 4)
10. D. Toradmalle, R. Singh, H. Shastri, N. Naik, V. Panchidi, Prominence of ECDSA over RSA digital signature algorithm, in 2018 2nd International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (IEEE, 2018), pp. 253–257
Performance Analysis of Spectrum Sensing in CRN Using an Energy Detector
Ch. Sateesh, K. Janardhana Gupta, G. Jayanth, Ch. Sireesha, K. Dinesh, and K. Ramesh Chandra
Abstract The primary objective of cognitive radio is to identify empty or unused spectrum bands and use them to provide better wireless communication. Cognitive radio (CR) enables dynamic spectrum access (DSA) and allows unlicensed or secondary users (SU) to access unused spectrum allocated to primary users (PU), thereby substantially increasing spectrum utilization. Cognitive radio requires efficient digital signal processing capabilities to sense the presence of vacant bands, which are known as spectral holes. With a single SU, local sensing is used, whereas with multiple SUs, cooperative sensing is used. Among the sensing techniques, energy detection is the standalone sensing technique used in this work. In a cooperative sensing cognitive radio network, multiple secondary users communicate their sensing decisions to a fusion node. The fusion centre (FC) makes the final decision based on fusion rules: the AND rule, the OR rule, and the L-sensor rule. We also extend our work by considering a MIMO channel. We simulated various spectrum sensing scenarios using Python and MATLAB.
Keywords Cognitive device · Sensing node · Energy detection · SISO fading channel · MIMO channel and cooperative sensing
Ch. Sateesh · K. Janardhana Gupta · G. Jayanth · Ch. Sireesha · K. Dinesh · K. Ramesh Chandra (B) Vishnu Institute of Technology, Bhimavaram, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_32
1 Introduction
A cognitive radio is an intelligent radio that can be programmed dynamically. A cognitive transceiver [1] is designed to detect the available radio channels. In response to operator commands, the cognitive engine can adjust radio system characteristics. In the communication environment, it functions as a self-contained unit, exchanging environmental data with the networks it links to and with other cognitive radio devices, and continuously monitoring its own performance in addition to reading radio outputs. This information is then used to adapt the radio configuration [2] to provide the required quality of service, subject to a suitable combination of user requirements, operational constraints, and regulatory constraints. CR is widely regarded as a future technology for 5G wireless communication systems because it addresses the allocation of resources, which is a major concern in 5G. 5G increases both the quality of service and the data rates by interconnecting a wide range of wireless communication systems. The primary users are allocated sub-bands as shown in Fig. 1. The spectrum occupancy varies with the allocation of bands to users for their specific applications, and frequency and power are allocated accordingly.
Fig. 1 Primary users sub-bands
The remainder of the paper is organized as follows: the literature survey is presented in Sect. 2, SISO fading channel spectrum sensing is discussed in Sect. 3, MIMO channel spectrum sensing is explained in Sect. 4, cooperative spectrum sensing is discussed in Sect. 5, and Sect. 6 concludes the paper.
2 Literature Survey
Mansi Subhedar and Gajanan Birajdar gave an overview of spectrum sensing techniques, stating that spectrum sensing aids in the detection of spectrum gaps, allowing for high spectral resolution [10]. Simon Haykin and David J. Thomson developed sensing algorithms for cognitive radio, with the goals of performing space–time processing and time–frequency analysis, using a tutorial exposition of the multitaper method and cyclostationarity for signal detection and classification [11]. Ajay Singh and Manav R. Bhatnagar discussed the performance of cooperative spectrum sensing with a large number of antennas at each cognitive radio, which uses selection combining of decision statistics to improve sensing reliability at very low signal-to-noise ratios [12]. In [13], Timur Duzenli and Olcay Akay introduced a cumulative sum-based weighted energy detector (Cus-WED) for detecting randomly modelled primary users. It improves primary-user detection performance at low probability-of-false-alarm values.
3 SISO Fading Channel Spectrum Sensing
In single-node spectrum sensing, the energy of the signal present at a given time is calculated, and the presence or absence of the primary user is decided based on this energy. Here, we use an energy detector to calculate the energy of the signal, which helps in determining the spectrum holes (Fig. 2). The received samples are modelled as

$$
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_m \end{bmatrix}}_{s}
+
\underbrace{\begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_m \end{bmatrix}}_{n}
$$

$n_1, n_2, n_3, \ldots, n_m$: complex Gaussian with mean 0 and variance $\sigma^2$; $s_1, s_2, s_3, \ldots, s_m$: complex Gaussian with mean 0 and power $p$.
Fig. 2 SISO fading channel
3.1 Energy Detection
Energy detection is one of the most often used approaches since it is simple to implement and does not require any prior knowledge of the PU signal [3]. However, the energy detection technique [16] is very susceptible to noise, and hence it cannot differentiate signal from noise when the signal-to-noise ratio (SNR) is low. The process of energy detection is shown in Fig. 3. The received energy is calculated by squaring the magnitude of the frequency-domain representation of the signal; averaging over the number of samples gives the average signal energy [4]. The energy is given by

$$ s(\omega) = \frac{1}{N}\left|\sum_{k=1}^{N} x(k)\,e^{-j\omega k}\right|^{2} $$

Fig. 3 Energy detector
In the block diagram of Fig. 3, the presence of the primary user is detected using a hypothesis test, also called a binary hypothesis test. It is stated as follows:

$$ H_0: y(k) = w(k) $$
$$ H_1: y(k) = s(k) + w(k) $$

where $s(k)$ denotes the transmitted symbols and $w(k)$ the added noise.
Binary Hypothesis Testing Problem:

$$
H_0:\;
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_m \end{bmatrix}}_{n}
$$

$$
H_1:\;
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_m \end{bmatrix}}_{s}
+
\underbrace{\begin{bmatrix} n_1 \\ n_2 \\ \vdots \\ n_m \end{bmatrix}}_{n}
$$
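Energy Detector

$$ \|y\|^2 = |y_1|^2 + |y_2|^2 + |y_3|^2 + \cdots + |y_m|^2 $$

Decide $H_0$ if $\|y\|^2 \le \gamma$ and $H_1$ if $\|y\|^2 > \gamma$, where $\gamma$ is the detection threshold and $\|y\|^2$ is the test statistic.
Performance Analysis: Probability of False Alarm ($P_{FA}$)
Given hypothesis $H_0$, what is the probability that the decision is $H_1$? The signal is absent but declared present:

$$ \|y\|^2 > \gamma \;\Rightarrow\; |n_1|^2 + |n_2|^2 + |n_3|^2 + \cdots + |n_m|^2 > \gamma $$

$$ P_{FA} = \Pr\left(|n_1|^2 + |n_2|^2 + \cdots + |n_m|^2 > \gamma\right) = \Pr\left(n_{1,r}^2 + n_{1,i}^2 + n_{2,r}^2 + n_{2,i}^2 + \cdots + n_{m,r}^2 + n_{m,i}^2 > \gamma\right) $$

$$ = \Pr\left( \underbrace{\left(\frac{n_{1,r}}{\sigma/\sqrt{2}}\right)^{2} + \left(\frac{n_{1,i}}{\sigma/\sqrt{2}}\right)^{2} + \cdots + \left(\frac{n_{m,r}}{\sigma/\sqrt{2}}\right)^{2} + \left(\frac{n_{m,i}}{\sigma/\sqrt{2}}\right)^{2}}_{\chi^2_{2m}} > \frac{\gamma}{\sigma^2/2} \right) $$

This is a Chi-Squared distribution with $2m$ degrees of freedom, so

$$ P_{FA} = Q_{\chi^2_{2m}}\left(\frac{\gamma}{\sigma^2/2}\right) $$

where $Q_{\chi^2_{2m}}$ is the complementary cumulative distribution function (CCDF) of the Chi-Squared distribution.
Probability of Detection ($P_D$)
Given $H_1$, what is the probability that the decision is $H_1$?

$$ P_D = \Pr\left(\|y\|^2 > \gamma\right) $$

$$ y_i = s_i + n_i \sim \mathcal{CN}\left(0,\; p + \sigma^2\right), \qquad y_{i,r} \sim \mathcal{N}\left(0,\; \frac{p+\sigma^2}{2}\right), \qquad y_{i,i} \sim \mathcal{N}\left(0,\; \frac{p+\sigma^2}{2}\right) $$

$$ P_D = \Pr\left( \frac{y_{1,r}^2 + y_{1,i}^2}{(p+\sigma^2)/2} + \cdots + \frac{y_{m,r}^2 + y_{m,i}^2}{(p+\sigma^2)/2} > \frac{\gamma}{(p+\sigma^2)/2} \right) = Q_{\chi^2_{2m}}\left(\frac{\gamma}{(p+\sigma^2)/2}\right) $$

The threshold follows from the target false-alarm probability:

$$ P_{FA} = Q_{\chi^2_{2m}}\left(\frac{\gamma}{\sigma^2/2}\right) \;\Rightarrow\; \gamma = \frac{\sigma^2}{2}\, Q^{-1}_{\chi^2_{2m}}(P_{FA}) $$

$$ P_D = Q_{\chi^2_{2m}}\left( \frac{\dfrac{\sigma^2}{2}\, Q^{-1}_{\chi^2_{2m}}(P_{FA})}{\dfrac{p+\sigma^2}{2}} \right) $$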
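As a worked illustration of the expressions for $P_{FA}$, $\gamma$ and $P_D$, the following Python sketch evaluates the threshold and the detection probability for a target false-alarm probability. It relies on the standard closed form of the Chi-Squared CCDF for an even number of degrees of freedom, $Q_{\chi^2_{2m}}(x) = e^{-x/2}\sum_{k=0}^{m-1} (x/2)^k / k!$, and inverts it by bisection. The function names and example parameter values are illustrative assumptions, not the authors' simulation code.

```python
import math


def chi2_ccdf_even(x, m):
    """CCDF of a Chi-Squared variable with 2*m degrees of freedom, for x >= 0.

    Uses the finite-sum closed form; intended for moderate m (very large m
    would need a numerically safer evaluation).
    """
    half = x / 2.0
    term = 1.0      # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, m):
        term *= half / k
        total += term
    return math.exp(-half) * total


def chi2_ccdf_inv_even(prob, m, tol=1e-10):
    """Invert the CCDF by bisection for 0 < prob < 1 (the CCDF is decreasing)."""
    lo, hi = 0.0, 1.0
    while chi2_ccdf_even(hi, m) > prob:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_ccdf_even(mid, m) > prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def threshold_for_pfa(pfa, m, sigma2):
    """gamma = (sigma^2 / 2) * Q^{-1}(P_FA)."""
    return 0.5 * sigma2 * chi2_ccdf_inv_even(pfa, m)


def detection_probability(pfa, m, sigma2, p):
    """P_D = Q(gamma / ((p + sigma^2) / 2)) at the threshold fixed by P_FA."""
    gamma = threshold_for_pfa(pfa, m, sigma2)
    return chi2_ccdf_even(gamma / ((p + sigma2) / 2.0), m)


if __name__ == "__main__":
    # Illustrative ROC points: m = 5 samples, unit noise variance, signal power 1
    for pfa in (0.01, 0.05, 0.1, 0.2, 0.5):
        print(pfa, round(detection_probability(pfa, 5, 1.0, 1.0), 4))
```

For $m = 1$ the expressions reduce to $P_D = P_{FA}^{\sigma^2/(p+\sigma^2)}$, which gives a quick sanity check of the implementation.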
ROC
The plot of $P_D$ versus $P_{FA}$ is the receiver operating characteristic (ROC). The probability of false alarm and the probability of detection are used to measure the energy detector's performance [5]. The detection probability ($P_D$) is the probability that the CR detects the primary user when it is actually present. The probability of false alarm ($P_{FA}$) is the probability that the CR declares the primary user present when it is not. The relationship between $P_{FA}$ and $P_D$ is shown in Fig. 4. The optimum threshold value is chosen based on the ROC of the system. Here, we obtained the ROC using Python.
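An ROC point can also be estimated empirically by simulating the two hypotheses directly. The sketch below is a minimal Monte Carlo version in Python using only the standard library. The function names, seed and trial count are assumptions made for illustration, and the code is not the authors' script.

```python
import math
import random


def complex_gaussian(rng, variance):
    """One draw from CN(0, variance): real and imaginary parts are N(0, variance/2)."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def energy(samples):
    """Test statistic ||y||^2 = sum of |y_i|^2."""
    return sum(abs(v) ** 2 for v in samples)


def estimate_rates(gamma, m, sigma2, p, trials=20000, seed=0):
    """Estimate (P_FA, P_D) of the energy detector for threshold gamma."""
    rng = random.Random(seed)
    false_alarms = 0
    detections = 0
    for _ in range(trials):
        # H0: noise only
        noise_only = [complex_gaussian(rng, sigma2) for _ in range(m)]
        if energy(noise_only) > gamma:
            false_alarms += 1
        # H1: Gaussian signal of power p plus independent noise
        received = [complex_gaussian(rng, p) + complex_gaussian(rng, sigma2)
                    for _ in range(m)]
        if energy(received) > gamma:
            detections += 1
    return false_alarms / trials, detections / trials


if __name__ == "__main__":
    # Sweep the threshold to trace an empirical ROC (illustrative values)
    for gamma in (4.0, 6.0, 8.0, 10.0):
        print(gamma, estimate_rates(gamma, m=5, sigma2=1.0, p=1.0, trials=5000))
```

Sweeping `gamma` and plotting the resulting pairs gives an empirical ROC that can be compared with the closed-form expressions for $P_{FA}$ and $P_D$.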
4 MIMO Channel Spectrum Sensing
Here, multiple antennas transmit and multiple antennas receive, and the same energy detection technique is used. The basic model of a MIMO system is shown in Fig. 5. The block diagram represents an R × T MIMO system, that is, a system with R receive antennas and T transmit antennas.
Fig. 4 ROC fading channel
Fig. 5 MIMO fading channel
When the primary signal $x(t)$ is transmitted, it reaches the ith receive antenna over the channel $h_i$, where additive noise $n(t)$ is added. The received signal at the ith antenna is

$$ y_i(t) = h_i\,x(t) + n(t) $$

The received signal $y_i(t)$ is processed to obtain the signal $s(t)$, which is the input to the energy detector; $E$ denotes the energy of the signal [6]. The primary user status is then determined based on the $H_0$ and $H_1$ hypotheses. The channel matrix of the MIMO system (Fig. 6) is

$$ h = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix} $$

where
$h_{11}$ = channel coefficient between 1st receiver and 1st transmitter;
$h_{12}$ = channel coefficient between 1st receiver and 2nd transmitter;
$h_{21}$ = channel coefficient between 2nd receiver and 1st transmitter;
$h_{22}$ = channel coefficient between 2nd receiver and 2nd transmitter.
Fig. 6 MIMO energy detection
The signal transmitted by transmit antenna 1 is received by both receive antenna 1 and receive antenna 2, and the same holds for the signal transmitted by transmit antenna 2. The received signal with the best response is taken as the maximum signal and given as input to the energy detector, after which the sensing decision is taken based on its energy. The performance of the MIMO system can be presented as $P_{FA}$ versus $P_D$. The optimum threshold value is chosen based on the receiver operating characteristic (ROC), and the ROC of the MIMO channel is obtained using Python. The ROC of the MIMO channel is shown in Fig. 7.
Fig. 7 ROC MIMO channel
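One way to read the selection step described above is as selection combining: compute the energy received on every antenna and pass the strongest branch to the energy detector. The Python sketch below follows that reading. It assumes the model $y_i(k) = \sum_j h_{ij} x_j(k) + n_i(k)$ with complex Gaussian noise. The function names are hypothetical, and the authors' processing step may differ in detail.

```python
import math
import random


def branch_energies(H, symbols, sigma2, rng):
    """Energy received on each antenna.

    H is a list of rows (one per receive antenna, one entry per transmit
    antenna); symbols is a list of tuples holding one symbol per transmit
    antenna for each time instant.
    """
    std = math.sqrt(sigma2 / 2.0)   # std of the real and imaginary noise parts
    energies = []
    for row in H:
        total = 0.0
        for x in symbols:
            noise = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
            y = sum(h * xj for h, xj in zip(row, x)) + noise
            total += abs(y) ** 2
        energies.append(total)
    return energies


def selection_decision(H, symbols, sigma2, gamma, rng):
    """Pick the strongest branch and compare its energy with the threshold."""
    energies = branch_energies(H, symbols, sigma2, rng)
    best = max(range(len(energies)), key=energies.__getitem__)
    return energies[best] > gamma, best, energies


if __name__ == "__main__":
    rng = random.Random(7)
    # Illustrative 2 x 2 channel and four time instants of unit symbols
    H = [[0.8 + 0.1j, 0.2 - 0.3j],
         [0.1 + 0.4j, 1.1 + 0.0j]]
    symbols = [(1 + 0j, 1 + 0j)] * 4
    print(selection_decision(H, symbols, sigma2=0.5, gamma=3.0, rng=rng))
```

Selecting the strongest branch raises the detection probability, but it also raises the false-alarm probability for a given threshold, so the threshold is normally recomputed for the combined statistic.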
5 Cooperative Spectrum Sensing
When multiple nodes sense the energy and share their decisions with a fusion centre, the scheme is called cooperative sensing [7]. The cooperating nodes form a cognitive radio network. Each node reaches its own conclusion about frequency availability and shares what it has learned with the others. The fusion node determines the availability of the spectrum based on the information from all nodes (Fig. 8).
Fig. 8 Cooperative spectrum sensing
5.1 Fusion Centre Cooperative Spectrum Sensing
Centralized cooperative sensing is a three-step process controlled by a fusion node called the fusion centre (FC) [8]. In the first step, the FC selects the frequency band to be sensed and asks all CRs to perform local sensing. In the second step, the FC receives all of the local sensing decisions. In the final step, the FC decides on the presence or absence of the primary user (PU) on the basis of the local sensing decisions [9]. The FC takes its decision based on predefined rules, which are also called hard computing-based methods. Some of these rules are listed below.
• AND rule
• OR rule
• L-sensor rule
In the following, $K$ denotes the number of cooperating sensors, $P_D$ and $P_{FA}$ the probabilities of detection and false alarm of each sensor, and $P_D^{FC}$ and $P_{FA}^{FC}$ the probabilities of detection and false alarm at the fusion centre.
In the AND rule, the fusion centre decides $H_1$ only if all $K$ sensors report $H_1$; otherwise it decides $H_0$. The overall probability of false alarm is

$$ P_{FA}^{FC} = (P_{FA})^{K} $$

The overall probability of detection is

$$ P_{D}^{FC} = (P_{D})^{K} $$

In the OR rule, the fusion centre decides $H_1$ if at least one of the sensors reports $H_1$; otherwise it decides $H_0$. Here, the fusion centre takes the logical OR of all decisions. The overall probability of detection is

$$ P_{D}^{FC} = 1 - (1 - P_{D})^{K} $$

The overall probability of false alarm is

$$ P_{FA}^{FC} = 1 - (1 - P_{FA})^{K} $$

In the L-sensor rule, the fusion centre decides $H_1$ if at least $L$ sensors report $H_1$; otherwise it decides $H_0$. The overall probability of detection is

$$ P_{D}^{FC} = \sum_{k=L}^{K} \binom{K}{k} (P_{D})^{k} (1 - P_{D})^{K-k} $$

The overall probability of false alarm is

$$ P_{FA}^{FC} = \sum_{k=L}^{K} \binom{K}{k} (P_{FA})^{k} (1 - P_{FA})^{K-k} $$
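The three fusion rules can be compared numerically with a few lines of Python. The sketch below assumes that all sensors are independent and share the same per-sensor probability, as the formulas above do. The function names and example values are illustrative.

```python
from math import comb


def and_rule(prob, sensors):
    """FC decides H1 only if all sensors report H1."""
    return prob ** sensors


def or_rule(prob, sensors):
    """FC decides H1 if at least one sensor reports H1."""
    return 1.0 - (1.0 - prob) ** sensors


def l_sensor_rule(prob, sensors, l):
    """FC decides H1 if at least l of the sensors report H1."""
    return sum(comb(sensors, k) * prob ** k * (1.0 - prob) ** (sensors - k)
               for k in range(l, sensors + 1))


if __name__ == "__main__":
    pd, pfa, sensors = 0.9, 0.1, 5   # illustrative per-sensor values
    for name, rule in (("AND", and_rule), ("OR", or_rule)):
        print(name, rule(pd, sensors), rule(pfa, sensors))
    print("3-of-5", l_sensor_rule(pd, sensors, 3), l_sensor_rule(pfa, sensors, 3))
```

Passing the per-sensor $P_D$ gives the fusion-centre detection probability, and passing $P_{FA}$ gives the fusion-centre false-alarm probability. With $L = K$ the L-sensor rule coincides with the AND rule, and with $L = 1$ it coincides with the OR rule.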
5.2 Distributed Cooperative Spectrum Sensing
In distributed cooperative sensing, the decision-making process does not rely on a fusion centre. Instead, all CR users communicate with one another and iterate their way to a decision. Each CR user broadcasts its local sensing decision [17] to the other users, who combine it with their own sensing data and determine whether or not the primary user is present. If the decision criteria are not met, the CR users send their combined sensing data to other users again, and the process is repeated until the algorithm converges and a final decision is made.
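The paper does not specify the iteration rule, so the following Python sketch uses one common choice, average consensus, purely as an assumption. Every node repeatedly moves its local energy measurement towards those of its neighbours until all nodes hold approximately the network average, and then each node compares that value with the threshold on its own. The topology, step size and values are illustrative.

```python
def consensus_average(values, neighbours, step, iterations):
    """Average-consensus iteration x_i <- x_i + step * sum_j (x_j - x_i).

    neighbours maps each node index to the indices it can hear. For a
    connected, symmetric topology and step < 1 / (maximum node degree),
    all states converge to the mean of the initial values.
    """
    state = list(values)
    for _ in range(iterations):
        state = [x + step * sum(state[j] - x for j in neighbours[i])
                 for i, x in enumerate(state)]
    return state


def local_decisions(state, gamma):
    """Each node decides H1 when its agreed statistic exceeds the threshold."""
    return [value > gamma for value in state]


if __name__ == "__main__":
    # Four CR users on a ring; initial values are their local energy measurements
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    energies = [1.0, 2.0, 3.0, 6.0]
    agreed = consensus_average(energies, ring, step=0.3, iterations=100)
    print(agreed, local_decisions(agreed, gamma=2.5))
```

Because every node ends with the same statistic, all nodes reach the same decision without a fusion centre.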
Fig. 9 Periodogram of the original signal (panel title: Original Signal - Test; axes: power/frequency in dB/Hz versus frequency in MHz)
5.3 Relay-Assisted Cooperative Spectrum Sensing
If a decision has to reach the ith node, the nodes that precede the ith node act as relays. Both centralized and distributed spectrum sensing are single-hop cooperative sensing [18] approaches, whereas relay-assisted cooperative spectrum sensing is multi-hop. We simulated centralized cooperative spectrum sensing using MATLAB, and the results are shown in Figs. 9, 10 and 11.
6 Conclusion
With cognitive radio, the vacant spectrum bands, or spectrum holes, are used by unlicensed users without interfering with licensed users. This increases the efficiency of spectrum usage. The same principle of energy detection applies to multiple-input multiple-output systems. The performance characteristics (ROC, receiver operating characteristics) of both systems were obtained using Python. The cooperative sensing technique and its classification were explained, and cooperative spectrum sensing was simulated in MATLAB.
Fig. 10 Randomly generated positions for nodes (panel title: All Nodes Position; axes: y-position versus x-position)
Fig. 11 Periodogram at different nodes (axes: power/frequency in dB/Hz versus frequency in MHz). Panels: node 1, dis 6.300060e+01, vel 1i+0j; node 2, dis 6.875105e+01, vel 0i+0j; node 3, dis 5.513277e+01, vel 1i+0j; node 4, dis 1.023709e+02, vel 0i+1j; node 5, dis 5.589249e+01, vel 0i+2j; node 6, dis 9.106061e+01, vel 0i+0j; node 7, dis 7.930336e+01, vel 2i+1j; node 8, dis 3.062884e+01, vel 0i+0j; node 9, dis 6.211723e+01, vel 0i+0j; node 10, dis 7.189592e+01, vel 0i+0j
References
1. A. Bagwari, S. Tuteja, J. Bagwari, A. Samarah, Spectrum sensing techniques for cognitive radio: a re-examination, in 2020 9th IEEE International Conference on Communication Systems and Network Technologies, 2020
2. P. Pandya, A. Durvesh, N. Parekh, Energy detection-based spectrum sensing for cognitive radio, in 2015 Fifth International Conference on Communication Systems and Network Technologies, 2015
3. A. Bagwari, G.S. Tomar, Multiple energy detection vs cyclostationary feature detection spectrum sensing technique, in 2014 Fourth International Conference on Communication Systems and Network, 2014
4. J.S. Gamit, S.D. Trapasiya, Cognitive radio energy based spectrum sensing using MIMO, in International Conference on Communication and Signal Processing, 3–5 April 2013, India
5. A. Bagwari, J. Kanti, G.S. Tomar, Novel spectrum detector for IEEE 802.22 wireless regional area network. Int. J. Smart Device Appl. (IJSDA) 3(2), 9–25 (2015). ISSN: 2288-8977
6. J. Kanti, G.S. Tomar, A. Bagwari, Quality analysis of cognitive radio networks based on modulation techniques, in 2015 IEEE International Conference on Computational Intelligence and Communication Networks, Dec 2015, pp. 566–569
7. S. Maleki, G. Leus, S. Chatzinotas, B. Ottersten, To AND or To OR: how shall the fusion center rule in energy-constrained cognitive radio networks? in IEEE ICC 2014, Cognitive Radio and Networks Symposium, 2014
8. I.E. Igbinosa, O.O. Oyerinde, V.M. Srivastava, S. Mneney, Spectrum sensing methodologies for cognitive radio systems: a review. Int. J. Adv. Comput. Sci. Appl. 6(12) (2015)
9. N. Muchandi, N. Muchandi, Cognitive radio spectrum sensing: a survey, in International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), 2016
10. M. Subhedar, G. Birajdar, Spectrum sensing techniques in cognitive radio networks: a survey. Int. J. Next-Gener. Netw. 3 (2011). https://doi.org/10.5121/ijngn.2011.3203
11. S. Haykin, D.J. Thomson, J.H. Reed, Spectrum sensing for cognitive radio. Proc. IEEE 97(5), 849–877 (2009). https://doi.org/10.1109/JPROC.2009.2015711
12. A. Singh, M.R. Bhatnagar, R.K. Mallik, Cooperative spectrum sensing in multiple antenna based cognitive radio network using an improved energy detector. IEEE Commun. Lett. 16(1), 64–67 (2012). https://doi.org/10.1109/LCOMM.2011.103111.111884
13. T. Duzenli, O. Akay, A new method of spectrum sensing in cognitive radio for dynamic and randomly modelled primary users. IETE J. Res. (2019). https://doi.org/10.1080/03772063.2019.1628668
14. T. Wang, Y. Chen, E.L. Hines, B. Zhao, Analysis of effect of primary user traffic on spectrum sensing performance, in Proceedings of the IEEE International Conference on Communications and Networking in China (ChinaCOM2009), China, 2009, pp. 1–5
15. J.-K. Choi, S.-J. Yoo, Undetectable primary user transmissions in cognitive radio networks. IEEE Commun. Lett. 17(2), 277–280 (2013). https://doi.org/10.1109/LCOMM.2013.010313.121699
16. J.S. Raj, Improved response time and energy management for mobile cloud computing using computational offloading. J. ISMAC 2(01), 38–49 (2020)
17. V. Suma, W. Haoxiang, Optimal key handover management for enhancing security in mobile network. J. Trends Comput. Sci. Smart Technol. (TCSST) 2(04), 181–187 (2020)
18. T. Vijayakumar, R. Vinothkanna, Efficient energy load distribution model using modified particle swarm optimization algorithm. J. Artif. Intell. 2(04), 226–231 (2020)
Forecasting Crime Event Rate with a CNN-LSTM Model
M. Muthamizharasan and R. Ponnusamy
Abstract In India, the crime rate is growing day by day. It is essential to develop modern tools and techniques to predict the rate and time of crime events in advance. Such predictions enable the police to improve monitoring and vigilance and to strengthen intelligence in a particular district in order to prevent crime events. Several spatiotemporal statistical methods have been used in the past to predict such events. Forecasting the crime event rate is central to setting a prevention approach and taking suitable, timely action to reduce the crime rate. Long short-term memory (LSTM) networks can capture the relationships in long-term data. Therefore, in this work, we forecast the crime rate using a CNN-LSTM model. We use a crime dataset from the NCRB covering three years and choose four features: murder, rape, theft, and offenses against property. We first use a CNN to extract features from the dataset and then use an LSTM to forecast the crime rate. In our experiments, we found that the CNN combined with the LSTM provides a reliable crime forecasting method with high forecast accuracy. This method offers a new research direction for crime rate forecasting as well as a good prediction technique.
Keywords Convolutional neural network · Crime rates · Long short-term memory · Crime time series · Crime events · NCRB · Prediction
M. Muthamizharasan Department of Computer Science, AVC College (Autonomous), Mannampandal 609305, India; Department of Computer Science, Periyar University, Periyar Palkalai Nagar, Salem, Tamil Nadu 636011, India
R. Ponnusamy (B) Center for Artificial Intelligence & Research, Chennai Institute of Technology, Kundrathur 600069, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_33
1 Introduction
Urbanization is one of the critical issues in modern India, and it massively increases the urban population. This growth calls for advanced solutions to manage the problems arising in urban administration, such as infrastructure creation, traffic management, and crime control. Crime monitoring and control is one of the most challenging of these issues. It is vital to recognize the factors that influence crime and the relationships among crime occurrences in order to find effective ways to decrease crime rates. It is essential to use all modern tools to aid crime prediction and monitoring. A good prediction method enables faster analysis of the crime dataset, helps to forecast the correct level of the crime rate, and supports keeping track of the factors that affect crime. Criminal District Prediction (CDP), which the police need in order to monitor and control crime events, is therefore a challenging task. Several statistical and machine learning methods are available for such prediction, but accurate prediction in a realistic environment remains questionable. This work proposes a Convolutional Neural Network Long Short-Term Memory network (CNN-LSTM) for the prediction of crime event rates. The system will enable the police department to obtain accurate information about the crime level and act accordingly. The experiments and evaluation are performed with the NCRB [1] dataset.
The paper is organized as follows. Section 2 provides an overview of the key methods in the crime prediction literature and the important works in this area. Section 3 presents the CNN-LSTM crime forecast algorithm and explains its steps in detail. Section 4 focuses on the experimental evaluation. Section 5 provides the results and their interpretation. Finally, Sect. 6 concludes the paper by discussing avenues for future research.
2 Background Literature
Spatiotemporal forecasting has been widely applied globally, and crime prediction is a typical spatiotemporal forecasting problem. Given its importance, a vast amount of work has been carried out worldwide to predict the crime rate using different algorithms and modern tools. Several datasets are available on the web, which has led to the development of spatial-temporal data mining. This section discusses works related to spatiotemporal prediction.
1. Anchal Rani and Rajasree S. in 2014 [2] used Mahalanobis distance and the dynamic time warping technique to predict and analyze crime trends.
2. In 2013, Wang and his colleagues [3] employed Geospatial Discriminative Patterns (GDPatterns) to detect significant differences in a geospatial dataset between two groups (hotspots and common areas). They also developed a new model, the Hotspot Optimization Tool (HOT), which uses GDPatterns to improve the identification of crime hotspots.
3. Umair Muneer Butt and his team in 2020 [4] carried out a detailed survey of prediction methods and technologies and suggested four approaches for further exploration: LSTM, ARIMA, transfer learning, and DBSCAN.
4. Umair Muneer Butt and his colleagues investigated crime forecasting in smart cities in 2021 [5]. They first used Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to identify hotspots with a higher risk of crime. Second, in each dense crime location, the Seasonal Auto-Regressive Integrated Moving Average (SARIMA) model was used to predict crime.
5. S. Sivaranjani and her team in 2016 [6] used clustering approaches to forecast the crime rate in Tamil Nadu.
6. Xing Han and his team in 2020 [7] addressed the daily crime prediction problem by combining LSTM and ST-GCN to identify the high-risk areas of a city automatically. The model estimates the number of crimes in each area.
7. Yong Zhuang and his team in 2017 [8] at the University of Massachusetts Boston used a spatiotemporal neural network for prediction. They used five years of data to predict crime hot spots in Portland.
8. Ginger Saltos and Mihaela Cocea in 2017 [9] also used the idea of hotspots. Their study examines the prediction of crimes of various types per month and per LSOA code (Lower Layer Super Output Area), an administrative division used by the UK police.
9. Anneleen Rummens [10], in 2017, used statistical predictive modeling to forecast crimes such as home burglary, street robbery, and battery at a particular location. The method divides the area into a grid, and the prediction is grid-based.
10. Bao Wang and his team in 2017 [11] at UCLA used spatial-temporal signal enhancement techniques to boost crime forecasting accuracy. These techniques also address the deficiency of CNNs on sparse data that results from weight sharing. More specifically, in the temporal dimension the work computes the cumulative diurnal crime per grid spatial region, and in the spatial dimension it uses bilinear interpolation super-resolution. They adapted ST-ResNet for crime prediction.
11. Dejiang Kong and Fei Wu of Zhejiang University [12] in 2018 predicted locations using ST-LSTM and HST-LSTM and reported that these methods outperform the compared methods by a large margin.
12. Charlie Catlett et al. [13] in 2018 used prototypes to detect high-risk crime areas in urban regions automatically and to forecast crime trends in each region reliably. The experimental evaluation, performed on real-world data collected in a large area of Chicago, shows that the proposed approach achieves good accuracy in spatial and temporal crime forecasting over rolling time horizons.
13. Hicham Ait El Bour and co-authors in 2018 [14] built crime prediction models using a ready-made seasonal time series dataset. In this experiment, both RNN and LSTM methods were used for prediction.
14. Romika Yadav and Savita Kumari Sheoran in 2018 [15] used a Generalized Linear Model for crime site selection and analyzed crime events using a Modified ARIMA (Auto-Regressive Integrated Moving Average) model with big data technologies. This method increases the accuracy compared to the standard methods.
15. Shraddha Ramdas Bandekar and her team [16] in 2020 predicted the crime rate with machine learning, using several different algorithms. A comparison is carried out among three different approaches. The work also suggests extending the study with deep learning approaches.
16. Peixiao Wang and his team in 2019 [17] used a Markov-LSTM model for indoor location prediction in location-based services, particularly in the retail industry. They also showed that Markov-LSTM outperforms five baseline methods.
17. Charlie Catlett and his team in 2019 [18] predicted high-risk crime areas in Chicago and New York City using auto-regressive models.
18. Wenjie Lu and his team in 2020 [19] carried out reliable stock price prediction using CNN-LSTM and reported that it gave the best prediction accuracy among the compared methods. This prediction method offers a new idea for stock price prediction.
The CNN-LSTM model extracts features and automatically learns the dynamic temporal and spatial dependencies in the crime dataset, focusing on specific crimes at the district level.
3 System Modeling
The CNN-LSTM system is explained in detail through the following model.
3.1 CNN-LSTM Model
CNN extracts the most informative features from the given dataset, and LSTM is suited to time series prediction problems. In this work, the two methods are combined in order to obtain the most accurate crime prediction. The system design model is given in Fig. 1. It consists of an input layer, a one-dimensional convolution layer, a pooling layer, an LSTM hidden layer, and a fully connected layer.
Fig. 1 CNN-LSTM design diagram
3.2 CNN
Lecun et al. designed the CNN model in 1998 [20]. It is a type of feed-forward neural network that performs well in image processing and natural language processing [21]. It can also be applied to time series forecasting. The local perception and weight sharing of CNN can greatly reduce the number of parameters, thereby improving the efficiency of model learning [22]. A CNN uses two main kinds of layers: convolution layers and pooling layers. The convolution layer contains a number of convolution kernels, and its main task is to extract the important features that have a major impact on the system. After the convolution layer, a pooling layer is added to reduce the dimension of the extracted features. The convolution output can be written as $l_t = \tanh(x_t * k_t + b_t)$, where $l_t$ is the output value after convolution, tanh is the activation function, $x_t$ is the input vector, $k_t$ is the weight of the convolution kernel, and $b_t$ is the bias of the convolution kernel.
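The convolution and pooling steps described above can be illustrated with a minimal pure-Python sketch. It is not the authors' Keras implementation: the kernel values, the bias, the pooling size, the use of max pooling, and the input series are all hypothetical choices made only to show the computation.

```python
import math

def conv1d_tanh(x, kernel, bias):
    """Valid 1D convolution (sliding dot product) followed by tanh."""
    k = len(kernel)
    return [
        math.tanh(sum(x[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(len(x) - k + 1)
    ]

def max_pool1d(x, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

# Hypothetical standardized yearly values for one district (assumption).
series = [0.1, 0.4, 0.3, 0.8, 0.5, 0.9]
features = conv1d_tanh(series, kernel=[0.5, -0.25, 0.5], bias=0.0)
pooled = max_pool1d(features, size=2)
print(features, pooled)
```

With a kernel of width 3, the six input values yield four convolution outputs, which pooling with size 2 reduces to two features. In a trained network the kernel weights and bias are learned rather than fixed by hand.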
3.3 LSTM
Long Short-Term Memory (LSTM) networks are a modified version of recurrent neural networks (RNNs) that make it easier to remember past data. They address the vanishing gradient problem of RNNs. LSTM is well suited to classifying, processing, and predicting time series given time lags of unknown duration. The model is trained using back-propagation. An LSTM network has three gates, which are shown in Fig. 2.
Fig. 2 LSTM gates
1. Input gate: determines which values from the input are used to modify the memory. The sigmoid function decides which values to let through (between 0 and 1). The tanh function weights the values that are passed, assigning their importance in the range −1 to 1.
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
2. Forget gate: determines what is discarded from the block. The sigmoid function decides this. It examines the previous state ($h_{t-1}$) and the current input ($x_t$) and outputs a value between 0 (omit this) and 1 (keep as is) for each number in the cell state $C_{t-1}$.
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
3. Output gate: the input and the block memory are used to define the output. The sigmoid function decides which values to let through (between 0 and 1). The tanh function weights the values that are passed, assigning their importance in the range −1 to 1, and the result is multiplied by the output of the sigmoid.
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$h_t = o_t * \tanh(C_t)$
Here, $i_t$ represents the input gate, $f_t$ the forget gate, $o_t$ the output gate, $\sigma$ the sigmoid function, $W_x$ the weights of the neurons of the respective gate ($x$), $h_{t-1}$ the output of the previous LSTM block (at timestamp $t-1$), $x_t$ the input at the current timestamp, and $b_x$ the biases of the respective gate ($x$). In the standard LSTM formulation, the cell state used in the output equation is updated as $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$.
Long short-term memory is a type of recurrent neural network. In an RNN, the output of the previous step is used as input to the current step. Hochreiter and Schmidhuber created the LSTM. It addresses the long-term dependency problem of RNNs, in which an RNN cannot make use of information from far back in the sequence but can make more accurate predictions based on recent data. An RNN does not deliver efficient performance as the gap length rises. By default, the LSTM can retain information for a long time. It is used for time series data processing, prediction, and classification. The LSTM features a chain structure with four neural networks and various memory blocks known as cells.
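The gate equations can be traced in a few lines of code. The following is a minimal sketch of a single LSTM time step for a scalar input and a scalar hidden state; the parameter layout, the weight values, and the input sequence are hypothetical, and the cell state update follows the standard LSTM formulation rather than anything specific to this paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step for a scalar input and a scalar hidden state.

    p maps each gate name to (w_h, w_x, b), standing in for W . [h_{t-1}, x_t] + b.
    """
    def pre(name):
        w_h, w_x, b = p[name]
        return w_h * h_prev + w_x * x_t + b

    i_t = sigmoid(pre("i"))            # input gate
    f_t = sigmoid(pre("f"))            # forget gate
    o_t = sigmoid(pre("o"))            # output gate
    c_hat = math.tanh(pre("c"))        # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat   # standard cell state update
    h_t = o_t * math.tanh(c_t)         # new hidden state
    return h_t, c_t

# Hypothetical weights shared by all gates, for illustration only.
params = {name: (0.5, 0.5, 0.0) for name in ("i", "f", "o", "c")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

In practice each gate has its own weight matrix and bias vector, and the state is a vector rather than a scalar; the scalar form is used here only to keep the arithmetic visible.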
4 CNN-LSTM Model for Crime Prediction Algorithm
The following algorithm is used to obtain the simulation results from the proposed computational model. The major steps of the procedure are given below:
1. Input the dataset for training the CNN-LSTM model.
2. Standardize the data.
3. Initialize the network.
4. CNN layer computation.
5. LSTM layer computation.
6. Output layer computation.
7. Calculate the error.
8. If the error is within the limit, continue; otherwise go to step 4.
9. Save the model.
10. Give the test data as input.
11. Standardize the data.
12. Prediction.
13. Restore the data from standardization.
14. Output the result.
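The data-handling steps of the procedure (standardization in steps 2 and 11, and its restoration in step 13) can be sketched as follows. This is an illustration under assumptions: the paper does not state which standardization is used, so z-score scaling is assumed here, and the sliding-window helper, the window width, and the yearly counts are hypothetical.

```python
from statistics import mean, pstdev

def fit_standardizer(values):
    """Return the (mean, std) pair used for z-score standardization."""
    mu, sigma = mean(values), pstdev(values)
    return mu, (sigma if sigma > 0 else 1.0)

def standardize(values, mu, sigma):
    return [(v - mu) / sigma for v in values]

def restore(values, mu, sigma):
    """Invert the standardization so predictions are in original units."""
    return [v * sigma + mu for v in values]

def make_windows(series, width):
    """Split a series into (input window, next value) training pairs."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]

counts = [120, 135, 128, 150, 162, 171]   # hypothetical yearly crime counts
mu, sigma = fit_standardizer(counts)
z = standardize(counts, mu, sigma)
pairs = make_windows(z, width=3)
recovered = restore(z, mu, sigma)
print(pairs[0], recovered)
```

The same mean and standard deviation fitted on the training data would be reused for the test data and for restoring the model's predictions, so that training and prediction share one scale.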
5 Results and Discussions
The dataset used for prediction was obtained from the National Crime Records Bureau (NCRB) [1]. It is very hard to identify crime-prone areas using conventional techniques. From the dataset, four essential attributes are selected for prediction: murder, rape, theft, and offenses against property.
5.1 Experiment Setup
A Python program was developed with the Keras package to implement the CNN-LSTM model and predict the crime rates. The prototype works well with the most recent data in the database. The program supports prediction with the CNN-LSTM model only.
5.2 Results
The main objective of this work is to predict the crime rate in a particular region or district using a time series dataset. Simple offenses such as assault are not taken into account; only the most serious crimes are considered for prediction. In this work, forty-two districts in Tamil Nadu are considered for analysis. The highest number of crimes was recorded in Chennai, so the initial experiment forecasts these crimes for Chennai alone. The crime values forecast by the CNN-LSTM model are compared with the actual values for certain crimes in Chennai city (Fig. 3).
Fig. 3 CNN-LSTM model predicted value and real value for Chennai district
The dataset given to the system covers the years up to 2019, and the model forecasts until 2021. Further, to assess the prediction accuracy in this experiment, the R-square and MAE values are computed. The R-square score is 0.99 and the MAE is 0.0027, which indicates a good prediction.
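For reference, the two reported metrics can be computed as in the sketch below. The sample values are hypothetical and are not taken from the NCRB data; the scale on which the paper's MAE was computed is not stated, so no attempt is made to reproduce the reported figures.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [0.10, 0.25, 0.40, 0.55]      # hypothetical values
predicted = [0.12, 0.24, 0.41, 0.53]   # hypothetical model outputs
print(mae(actual, predicted), r_squared(actual, predicted))
```

An R-square close to 1 means the predictions explain almost all of the variance in the actual values, while MAE gives the average size of the error in the units of the data being compared.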
6 Conclusion
Accurate crime rate prediction and forecasting is an essential task, and it is directly connected to the economic development of the nation. This paper proposes a CNN-LSTM model for district-wise crime prediction, which introduces spatial-temporal influences into the gate mechanism to mitigate data insufficiency. Because the data are available only on a yearly basis, yearly data are used for processing. If the data were available on a monthly, quarterly, or seasonal basis, the results could be predicted more accurately and quickly. The experimental results demonstrate the effectiveness of the suggested technique for Tamil Nadu.
References
1. https://ncrb.gov.in/en/
2. A. Rani, S. Rajasree, Crime trend analysis and prediction using mahanolobis distance and dynamic time warping technique. (IJCSIT) Int. J. Comput. Sci. Inf. Technol. 5(3), 4131–4135 (2014)
3. D. Wang, W. Ding, H. Lo, M. Morabito, P. Chen, J. Salazar, T. Stepinski, Understanding the spatial distribution of crime based on its related variables using geospatial discriminative patterns. Comput. Environ. Urban Syst. 39, 93–106 (2013). https://doi.org/10.1016/j.compenvurbsys.2013.01.008
4. U.M. Butt, S. Letchmunan, F.H. Hassan, M. Ali, A. Baqir, H.H. Sherazi, Spatio-temporal crime hotspot detection and prediction: a systematic literature review. IEEE Access 8, 166553–166574 (2020). https://doi.org/10.1109/access.2020.3022808
5. U.M. Butt, S. Letchmunan, F.H. Hassan, M. Ali, A. Baqir, T.W. Koh, H.H. Sherazi, Spatiotemporal crime predictions by leveraging artificial intelligence for citizens’ security in smart cities (2021). https://doi.org/10.20944/preprints202102.0172.v1
6. S. Sivaranjani, S. Sivakumari, M. Aasha, Crime prediction and forecasting in Tamilnadu using clustering approaches, in 2016 International Conference on Emerging Technological Trends (ICETT), 2016. https://doi.org/10.1109/icett.2016.7873764
7. X. Han, X. Hu, H. Wu, B. Shen, J. Wu, Risk prediction of theft crimes in urban communities: an integrated model of LSTM and ST-GCN. IEEE Access 8, 217222–217230 (2020). https://doi.org/10.1109/access.2020.3041924
8. Y. Zhuang, M. Almeida, M. Morabito, W. Ding, Crime hot spot forecasting: a recurrent model with spatial and temporal information, in 2017 IEEE International Conference on Big Knowledge (ICBK), 2017. https://doi.org/10.1109/icbk.2017.3
9. G. Saltos, M. Cocea, An exploration of crime prediction using data mining on open data. Int. J. Inf. Technol. Decis. Mak. 16(05), 1155–1181 (2017). https://doi.org/10.1142/s0219622017500250
10. A. Rummens, W. Hardyns, L. Pauwels, The use of predictive analysis in spatiotemporal crime forecasting: building and testing a model in an urban context. Appl. Geogr. 86, 255–261 (2017). https://doi.org/10.1016/j.apgeog.2017.06.011
11. B. Wang, P. Yin, A.L. Bertozzi, P.J. Brantingham, S.J. Osher, J. Xin, Deep learning for real-time crime forecasting and its ternarization. Chin. Ann. Math. Ser. B 40(6), 949–966 (2019). https://doi.org/10.1007/s11401-019-0168-y
12. D. Kong, F. Wu, HST-LSTM: a hierarchical spatial-temporal long-short term memory network for location prediction, in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018. https://doi.org/10.24963/ijcai.2018/324
13. C. Catlett, E. Cesario, D. Talia, A. Vinci, A data-driven approach for Spatio-temporal crime predictions in smart cities, in 2018 IEEE International Conference on Smart Computing (SMARTCOMP), 2018. https://doi.org/10.1109/smartcomp.2018.00069
14. H. Ait El Bour, S. Ounacer, Y. Elghomari, H. Jihal, M. Azzouazi, A crime prediction model based on spatial and temporal data. Period. Eng. Nat. Sci. (PEN) 6(2), 360 (2018). https://doi.org/10.21533/pen.v6i2.524
15. R. Yadav, S. Kumari Sheoran, Modified ARIMA model for improving certainty in spatiotemporal crime event prediction, in 2018 3rd International Conference and Workshops on Recent Advances and Innovations in Engineering (ICRAIE), 2018. https://doi.org/10.1109/icraie.2018.8710398
16. S.R. Bandekar, C. Vijayalakshmi, Design and analysis of machine learning algorithms for the reduction of crime rates in India. Procedia Comput. Sci. 172, 122–127 (2020). https://doi.org/10.1016/j.procs.2020.05.018
17. P. Wang, H. Wang, H. Zhang, F. Lu, S. Wu, A hybrid Markov and LSTM model for indoor location prediction. IEEE Access 7, 185928–185940 (2019). https://doi.org/10.1109/access.2019.2961559
18. C. Catlett, E. Cesario, D. Talia, A. Vinci, Spatio-temporal crime predictions in smart cities: a data-driven approach and experiments. Pervasive Mob. Comput. 53, 62–74 (2019). https://doi.org/10.1016/j.pmcj.2019.01.003
19. W. Lu, J. Li, Y. Li, A. Sun, J. Wang, A CNN-LSTM-based model to forecast stock prices. Complexity 2020, 1–10 (2020). https://doi.org/10.1155/2020/6622927
20. Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
21. B.S. Kim, T.G. Kim, Cooperation of simulation and data model for performance analysis of complex systems. Int. J. Simul. Model. 18(4), 608–619 (2019)
22. L. Qin, N. Yu, D. Zhao, Applying the convolutional neural network deep learning technology to behavioral recognition in intelligent video. Tehnicki Vjesnik-Technical Gazette 25(2), 528–535 (2018)
LegalLedger–Blockchain in Judicial System Soumya Haridas, Shalu Saroj, Sairam Tushar Maddala, and M. Kiruthika
Abstract One of the major concerns around the globe is access to justice and with the advancement of technology, there is tremendous scope to make these judicial systems much more efficient and swift. There have been various cases in which a courthouse reopened a case after several years, and it was found that some evidence was either missing or had been tampered with. This is a big issue in countries like India where there is very minimal digitization of such documents, and it has been a great obstruction in strengthening our judicial systems. We thereby propose a system where court proceedings and case-related documents can be stored digitally on a decentralized storage with every record stored on the Ethereum blockchain. Moreover, using blockchain in the legal system will serve as a completely tamperproof solution with easy accessibility and storage for all its users. The evidence files will be stored encrypted in the decentralized storage, so even if the storage is compromised an attacker can never know the content. The private key of the case is also encrypted with an authorized user’s public key and stored on the smart contract, so an authenticated user can view the case files. Hence, the objective of this paper is to propose a tamperproof solution for the digitization of court proceedings in a secure storage leveraged by blockchain technology and to also provide accessibility in a secure manner. Keywords Digitization · Ethereum · Blockchain · Decentralized
S. Haridas (B) · S. Saroj · S. T. Maddala · M. Kiruthika Department of Computer Engineering, Fr. C. Rodrigues Institute of Technology, Vashi, Navi Mumbai, India M. Kiruthika e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_34
1 Introduction
The Indian judicial system has a backlog of more than 33 million cases, which is extremely alarming for the country. Cases pile up on a daily basis, but they are not cleared at the same pace. Even if 100 cases were disposed of every hour, around the clock, it would take more than 35 years to catch up. Moreover, when cases are reopened after a very long time, it is often observed that the evidence files are either missing or have been tampered with. Therefore, law enforcement and justice departments are gradually moving towards virtual and digital solutions in order to deliver justice more efficiently. The chain of custody (CoC) documents are very important, as they help keep the evidence free from fabrication and make it usable in a court of law. Blockchain can carry out safe transactions and give authenticated access while remaining immutable, which makes it well suited to the safe storage of digital evidence and to maintaining its authenticity. Also, according to NITI Aayog CEO Amitabh Kant, only blockchain can enable the system to find solutions to the huge logjams in courts. Moreover, in the current scenario of the COVID-19 pandemic, it is not feasible to conduct court proceedings offline, so a blockchain-powered system would be a more effective, convenient, and contactless approach. Hence, in order to curtail legal misconduct and subjectivity, this work proposes to introduce blockchain into the judicial system so as to maintain the integrity of digital evidence while simultaneously preserving the digital chain of custody. The organization of this paper is as follows: Section 2 deals with the related work. In Sect. 3, the architecture of the proposed system is presented. Section 4 deals with the results, and the conclusion is presented in Sect. 5.
2 Related Work
Chopade et al. [1] proposed a digital forensic model that aims to create transparency in the chain of custody in order to maintain the security of evidence while data is transferred from one person to another, using the Base64 encoding algorithm. Chain codes were used to allow users to create transactions in Hyperledger Fabric. Wanhua et al. [2] proposed to build a system for transportation administrative law enforcement using blockchain. With the use of a hash function, the law enforcement data is digested and signed, and the results are transmitted to the law enforcement platform. Singh et al. [3] came up with an idea to identify the cases that need time-bound solutions and proposed a roster and cause list preparation algorithm based on the complexity and priority of cases, so that cases can be solved faster and the performance of the judicial system can be improved. It follows the hierarchical structure of the courts. Sindhuja et al. [4] put forward a simple web application that does not address security and integrity. To enhance the efficiency and availability of justice in the judicial system, it classifies court cases and automates the system. Tasnim et al. [5] proposed CRAB: Blockchain-Based Criminal Record Management System, a secure and integrated system that ensures that stored information cannot be accessed or altered by attackers. It introduces blockchain technology to store data transaction logs and also encrypts the data so that it cannot be altered. The platform employs the elliptic curve cryptography (ECC) encryption scheme to encrypt criminal data in order to ensure the authenticity and integrity of the blockchain ledger. Forensic Chain: A Blockchain-based Digital Forensics Chain of Custody was presented by Lone et al. [6]. They provided a prototype of the forensic chain model built with Hyperledger Composer and assessed its performance. In addition, the creation of an end-to-end integrated framework for storing digital evidence and maintaining the chain of custody, supported by IPFS and Hyperledger blockchain, was proposed. Chen et al. [7] put forward IPFS, a peer-to-peer version-controlled file system that synthesizes learnings from many previous successful systems. Shah et al. [8] discussed how a proof-of-concept tool with limited functionality can be used to evaluate the authenticity of the chain of custody. They came up with the concept of a ring approach and consequently used smart cards to store the digital credentials. Montasari et al. [9] proposed a standardized data acquisition process model (SDAPM) for forensic investigations conducted in a digital format. The paper tried to enforce a uniform and generic approach to obtaining any kind of proof in digital format. The SDAPM was presented and described using a proven formal notation, the unified modelling language activity diagram, that can assist courts. Oliveira et al. [10] analysed the security and privacy challenges that will be faced in future. The work brought in the concept of ubiquitous computing and focused on determining the relevant sources of leakage and employing privacy-preserving protocols to manipulate sensitive data. The need for a distributed and decentralized mechanism to overcome the challenges of privacy and security was also emphasized. Sivaganesan et al. [11] put forward a DDTMB model-based trust assessment which makes sure that health specialists receive sensed data in a timely and suitable manner, allowing particular measures to be taken. WSNs are used to gather and disseminate data across a large area. Bhalaji et al. [12] analysed the performance of 6LoWPAN and RPL IoT for healthcare applications, with the aim of improving the monitoring of an athlete’s thermoregulation. McAbee et al. [13] sought to understand the barriers to the adoption of this technology by military intelligence in order to facilitate wider adoption as more applications are developed. The work examines guidelines for assessing adaptability and viability for potential use cases, proposes a prototype decision aid from the perspective of military intelligence, and offers techniques for abstracting the blockchain-based process for preliminary feasibility analysis. Short et al. [14] examined various ways of making federated learning more secure by incorporating blockchain concepts, and emphasized its resistance to poisoning attacks. Shen et al. [15] proposed a privacy protection mechanism for location-based services using blockchain technology. Various simulations were carried out to show that this mechanism can effectively maintain the confidentiality of the user’s location information.
Existing systems nevertheless have loopholes and drawbacks that can lead to vulnerabilities, failures, and risks of malicious activity. One limitation of the existing work is that blockchain has been used for record management systems and not for storing evidence, which is addressed in this work.
3 Proposed System
From the literature review, it is understood that the automation of the judicial system can be explored further, because there is rising demand for digitizing the legal system with authenticity and integrity. The proposed system aims to address this using blockchain technology. Figure 1 consists of five stages, namely user registration, case registration, smart contract creation, assignment, and access.
Fig. 1 Block diagram of proposed system
In the user registration phase, there are three types of users for this application. The user can be an admin, a judge, or a lawyer. Once a user is registered, a unique ID and a private key are generated, which can be saved locally. In this stage, the application automatically inherits the Ethereum address from the MetaMask account, which acts as a security layer for the application by preventing unauthorized access. The case registration phase covers the registration or creation of a new case, which is done by the admin. Once the admin creates the case, a unique case ID and a private key are generated for the case. In the next phase, the smart contracts are created. They contain two main identities, one for the lawyer or judge and one for the case. The lawyer or judge’s smart contract consists of details such as their public key (generated during registration), their Ethereum address (for authentication), and an array of all the cases assigned to them. The case contract consists of the public key (generated during registration), the Ethereum addresses of the lawyers and judge assigned, and the evidence hash. The decentralized storage used here is the InterPlanetary File System (IPFS). This smart contract makes the system credible without using any centralized technology that could make the application vulnerable to attacks. In the assignment stage, after creating a new case, the admin assigns two lawyers and a judge to the case. The key arrays of the lawyers and the judge are populated with the case key. Since the users of the application have been assigned a unique ID during registration, the cases are assigned directly to them, which ensures that the privacy of all the case files is maintained and that only authorized users have access to them. When an admin adds new evidence to the case, it is stored in IPFS, and the hash returned is then pushed into the smart contract. In the access stage, evidence added to the case by the admin can be accessed by authorized users (the lawyers and judge of the case) by decrypting the IPFS hash value. Because of this encryption, unauthorized users do not have access to the case files, and the case files remain tamperproof and protected against cyberattacks.
Figure 2 shows the system architecture of the proposed system. Here, the admin creates a new case, and a case ID (token) is generated whenever a new case is added to the system. This case ID is then assigned to the preregistered users of the system: Lawyer 1, Lawyer 2, and the Judge. In order to add evidence of any format to the application, the lawyer has to hand over the evidence description, the digital evidence, and the case ID to the admin. Once the admin confirms that the evidence is genuine, it is pushed into the application using Base64 encoding, which converts files of various formats into encoded text and thereby helps to reduce loss of data over the network. The encoded text can be transported easily over networks with little chance of data loss, and each addition is recorded so that the evidence’s lifetime can be traced back. The resulting hash value is then stored in the smart contract in a decentralized, trustworthy, and distributed format. This sums up the evidence generation module of the software solution. If a preregistered lawyer wants to view the digital evidence, they have to provide the respective case ID as input to the application. After a lookup in the smart contract on the basis of this case ID, all the evidence that has been added to that case is visible in the lawyer’s dashboard. In the evidence display module, the preregistered judge has access to all the case-related documents after entering the case ID, so the court proceedings can be conducted smoothly via this application.
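The evidence generation and evidence display flow can be illustrated with a small in-memory sketch. It is not the authors' Solidity contract or an IPFS client: the `CaseRegistry` class, its method names, and the example addresses are hypothetical, a SHA-256 hex digest stands in for the content hash that IPFS would return, and the encryption of evidence and case keys described above is omitted for brevity.

```python
import base64
import hashlib

class CaseRegistry:
    """In-memory stand-in for the case smart contract (illustrative only)."""

    def __init__(self):
        self.cases = {}     # case_id -> {"members": set, "evidence": [hashes]}
        self.storage = {}   # content hash -> Base64 text (stand-in for IPFS)

    def create_case(self, case_id, lawyer_1, lawyer_2, judge):
        self.cases[case_id] = {"members": {lawyer_1, lawyer_2, judge},
                               "evidence": []}

    def add_evidence(self, case_id, raw_bytes):
        # Encode the file as text, store it, and record only its hash.
        encoded = base64.b64encode(raw_bytes).decode("ascii")
        digest = hashlib.sha256(encoded.encode("ascii")).hexdigest()
        self.storage[digest] = encoded
        self.cases[case_id]["evidence"].append(digest)
        return digest

    def view_evidence(self, case_id, address):
        case = self.cases[case_id]
        if address not in case["members"]:
            raise PermissionError("address is not assigned to this case")
        return [base64.b64decode(self.storage[h]) for h in case["evidence"]]

registry = CaseRegistry()
registry.create_case("CASE-1", "0xLawyerA", "0xLawyerB", "0xJudge")
digest = registry.add_evidence("CASE-1", b"scanned affidavit")
print(registry.view_evidence("CASE-1", "0xJudge"))
```

Because only the hash is recorded against the case, any change to the stored text would produce a different digest, which is how a content-addressed store makes tampering detectable.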
Fig. 2 Diagram of system architecture
4 Result and Discussion 4.1 Implementation Figure 3 showcases the use of Ganache to set up a personal Ethereum blockchain for testing the solidity contracts. Ganache is a personal blockchain used for the development of distributed Ethereum/Corda applications. It can be used through the entire development phase and also during deployment and testing of the application. It helps in testing this software solution locally before deploying it in the public domain. Figures 4 and 5 show the user registration stage of the application. Details such as name, email ID and contact number of the lawyer or judge are stored onto the smart contract along with the Ethereum addresses of the accounts used to register. Once a user is registered, a unique ID and a private key are generated which can be saved locally. These users—lawyers and judges—will then have access to view the files pertaining to a case which includes documents like Vakalatnama, Roznama and various other evidence pertaining to the case. Figure 6 shows the case registration stage where a new case is added to the application by the admin. Further, the admin will assign two lawyers and a judge through the unique lawyer ID/judge ID assigned during registration for the proceedings to take place. Apart from that other details belonging to the case such as the parties
Fig. 3 Setting up a local blockchain network
Fig. 4 Lawyer registration
contesting the dispute and further details regarding the case are stated by the admin during new case creation. Once done, these details are stored in the smart contract. To add evidence of any format to the application, as shown in Fig. 7, the lawyer has to hand over the evidence description, the digital evidence and the case ID to the admin. Once the admin confirms that the evidence is genuine, it is pushed into decentralized storage (IPFS), and the hash address returned from IPFS is stored in the smart contract. The same procedure is followed for the other case-related documents. Figure 8 shows the access stage of the application. Evidence and other case-related documents added to the case by the admin can be accessed by authorized users (the lawyers and the judge of the case) by decrypting the hash addresses stored in the smart contract. Because of this encryption, unauthorized users cannot access the case files, and the case files remain tamperproof.
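The evidence flow described above (store the file, record the returned hash against the case ID, look it up later) can be sketched in a few lines. This is only an illustrative in-memory model, not the authors' Solidity contract or the real IPFS API: the class `EvidenceRegistry`, its method names, and the use of a plain SHA-256 hex digest in place of an IPFS content identifier are all assumptions made for the example.

```python
import hashlib


class EvidenceRegistry:
    """Hypothetical in-memory stand-in for IPFS plus the smart contract."""

    def __init__(self):
        self._storage = {}  # stand-in for decentralized storage: digest -> bytes
        self._cases = {}    # stand-in for contract state: case ID -> records

    def add_evidence(self, case_id, description, data):
        """Store the file and record its content hash under the case ID."""
        # Assumption: a SHA-256 hex digest plays the role of the IPFS hash.
        digest = hashlib.sha256(data).hexdigest()
        self._storage[digest] = data
        record = {"description": description, "hash": digest}
        self._cases.setdefault(case_id, []).append(record)
        return digest

    def evidence_for_case(self, case_id):
        """Look up every evidence record registered for a case ID."""
        return list(self._cases.get(case_id, []))

    def is_intact(self, digest):
        """Re-hash the stored bytes and compare with the recorded hash."""
        data = self._storage.get(digest)
        return data is not None and hashlib.sha256(data).hexdigest() == digest


registry = EvidenceRegistry()
evidence_hash = registry.add_evidence("CASE-001", "Signed agreement", b"scanned document bytes")
print(registry.evidence_for_case("CASE-001"))
print(registry.is_intact(evidence_hash))
```

Because the stored hash is derived from the file contents, any later modification of the stored bytes changes the recomputed digest, which is the property that makes tampering detectable in a content-addressed design.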
Fig. 5 Judge registration
4.2 Testing
Table 1 describes the testing report of this application. Out of 118 total test cases, the application passed 107 successfully.
5 Conclusion
Through this decentralized solution, court proceedings were virtualized effectively, and a platform was successfully created for storing court proceedings and case-related documents for civil cases. Since the files are stored in a decentralized manner with encryption at both levels (i.e., the IPFS level and the application level), they are considerably more secure than files in centralized storage, which is highly susceptible to penetration and fabrication. This also achieves the purpose of building a decentralized application, which is to remove the central authority that governs all the data associated with the application, making it even more secure and trustworthy. Also, based on the testing done and the statistics mentioned above, the
Fig. 6 Adding a new case
Fig. 7 Adding new documents related to the case
Fig. 8 Viewing the case-related documents
Table 1 Testing report
Functionality | Test cases executed | Test cases passed | Remark
Launching local blockchain | 10 | 9 | Pass
Metamask connection | 10 | 9 | Pass
Lawyer, judge registration | 10 | 10 | Pass
Lawyer, judge login | 9 | 8 | Pass
New case registration | 15 | 13 | Pass
New case login | 13 | 11 | Pass
Case access security | 11 | 11 | Pass
Adding new evidence to IPFS | 20 | 18 | Pass
Storing hash in blockchain | 20 | 18 | Pass
application passed 107 of the 118 test cases (a success rate of approximately 90.7%). Thus, the objective of having a decentralized courthouse with secure storage and access mechanisms has been accomplished.
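The overall figures quoted above can be re-derived from the per-functionality rows of Table 1. The short sketch below simply sums the two numeric columns and computes the pass rate; the variable and function names are illustrative choices, while the numbers are copied from the table.

```python
# Rows of Table 1: (functionality, test cases executed, test cases passed)
TEST_REPORT = [
    ("Launching local blockchain", 10, 9),
    ("Metamask connection", 10, 9),
    ("Lawyer, judge registration", 10, 10),
    ("Lawyer, judge login", 9, 8),
    ("New case registration", 15, 13),
    ("New case login", 13, 11),
    ("Case access security", 11, 11),
    ("Adding new evidence to IPFS", 20, 18),
    ("Storing hash in blockchain", 20, 18),
]


def summarize(report):
    """Return total executed, total passed, and the pass rate in percent."""
    executed = sum(row[1] for row in report)
    passed = sum(row[2] for row in report)
    return executed, passed, 100.0 * passed / executed


executed, passed, rate = summarize(TEST_REPORT)
print(f"{passed} of {executed} test cases passed ({rate:.1f}%)")
```

The totals come to 107 passed out of 118 executed, which is about 90.7% when rounded to one decimal place.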
Arduino-Based Smart Walker Support for the Elderly
Pavithra Namboodiri, R. Rahul Adhithya, B. Siddharth Bhat, R. Thilipkumar, and M. E. Harikumar
Abstract Science and technology have become an indispensable part of our lives, and their intelligent use can bring a new vision to every step of life. This paper proposes an idea to enhance the living standard of elderly people who need a walking aid to perform their daily activities, and to let a caretaker monitor their health status. This is done by developing a model that monitors critical parameters such as temperature, heart rate and location using various sensors embedded in the walking aid. The main focus of this paper is to design a module that includes a mobile application through which caretakers and doctors can monitor the patient while the patient performs their daily routine. A Node MCU and an Arduino UNO control the sensors, an accelerometer is included to detect any fall of the elderly person, and a GSM module is used to initiate calls in case of emergency. This avoids the need for regular in-person checkups with the doctor.
Keywords Arduino · GSM module · Heart rate sensor · Accelerometer · Temperature sensor · Infrared sensor
P. Namboodiri · R. Rahul Adhithya · B. Siddharth Bhat · R. Thilipkumar · M. E. Harikumar (B)
Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_35

1 Introduction
A smart walking stick is very useful for differently abled and elderly people, since it lets them perform daily activities independently and stay involved in community life. The Smart Walker is embedded with sensors for measuring health-related parameters (blood pressure, the position of obstacles, and heart rate). Among the technologies developing around us, one of the most rapidly emerging is the Internet of Things (IoT). It is being applied in almost every field, including healthcare. Haidar et al. [1] give an idea of how IoT is applied in healthcare and the enhancements it can offer in that field. They also explain the economic benefits and the time a person can save by limiting periodic
P. Namboodiri et al.
checkups. Reference [2] discusses how virtual monitoring can be a feasible option when medical facilities are scarce and how wearable sensors have helped in this process. The objective of this paper is to provide an enhanced and smarter walking stick for the elderly. This walking stick helps the elderly person perform daily routines with confidence. The entire system (sensors and microcontrollers) is integrated on a walker, thus making it the Smart Walker [3]. Raykar and Shet [4] deal with the design of a smart assistive walker device for visually impaired people using IoT, in which the location of the patient is updated with the help of the Global Positioning System (GPS). That project is structured into two parts: the local system and the remotely controlled system. The local system detects obstacles and notifies the blind user, while the remotely controlled system handles heart rate measurement and video streaming. All the sensors and components required for both parts are mounted on the walker, and the remotely controlled part can be monitored using Wi-Fi through a router [3]. The walking stick in [5] aims at building a system that detects any obstacle in front of it. When holding the walker support, the user must keep a small distance between the ground and the bottom tip of the stick. Once an obstacle is detected, an indication is given to the holder, which reduces the chance of tripping. That project focuses on designing an improved navigation system and a fall detection system [5]. The proposed work incorporates the main features of the papers discussed above and adds a few more. It includes an Android application through which the caretaker can view the health status of the patient, even when the caretaker is not with the patient. The caretaker can also get the exact location of the patient through the application when an emergency occurs.
A provision for alerting nearby people (through a buzzer) is also included to attract immediate care for the patient. Parihar et al. [6] present a clear idea of the elements needed when building an IoT application related to healthcare. The paper also discusses the parameters and the sensors required for monitoring a patient's health. Heartbeat and temperature are given first preference when examining the patient's health status. Heart rate refers to the number of times the heart beats per minute [7]. In general, the heart rate sensor works on the principle of light modulation; it has a transmitter and a receiver. Once the fingertip is placed on the sensor, light is emitted by the transmitter, and the receiver senses the amount of light transmitted through the finger. When blood is pumped by the heart, the amount of light changes, and this change is sensed by the receiver. The output is connected to the Arduino UNO to compute the heart rate in beats per minute (BPM) [8]. Normal heart rate lies in the range of 70–120 BPM. Any value deviating from this range indicates an unhealthy condition. The temperature sensor produces an analog output voltage corresponding to the change in temperature. The temperature sensor requires an analog-to-digital converter so that the reading can be converted to digital form [7]. The sensor imitates the behavior of a thermometer when connected to the microcontroller. The GPS module can obtain the geographical coordinates of the location where the person is present (in
terms of latitude and longitude). The GPS module calculates the coordinates of the current location on the surface of the earth with respect to the equator. The controller converts the information provided by the GPS module into a readable format [9]. The Global System for Mobile Communications (GSM) is used for communication (SMS or call). In [8], an SMS alert is provided in case of emergency by connecting the GSM module to the microcontroller. This GSM module has an RS232 interface for serial communication with external peripherals. Data communication with the microcontroller is carried out at a baud rate of 9600. An accelerometer is used to measure static acceleration such as gravity or tilt and dynamic acceleration such as vibration, shock, or motion. Usually, the sensor has three axes (x, y, and z). The amount of acceleration or deceleration along these axes is used to detect a fall. The output of this sensor is three analog values which are given to two analog pins of the Arduino UNO microcontroller. It operates at 3.3 V, which is provided by the microcontroller board itself [10]. This work uses the accelerometer to detect any sudden change in acceleration due to a fall or tilt. Shankar Balu et al. [11] describe an application for navigating visually impaired people using obstacle detection and fall detection mechanisms. An infrared sensor works on the principle of reflection of infrared waves from an obstacle. In [12, 13], an infrared sensor is used as a safety measure, since it detects obstacles or humans at close distance and so helps avoid slipping or colliding with passers-by. In this work, it is used to detect obstacles in front of the walking stick to help the patient walk safely. This work is mainly based on IoT and its applications. Tamilselvi et al. [14] discuss building a smart grid, which shows the big-picture model of an IoT application.
The paper describes the important elements required to build an IoT application, such as the dependence on application programming interfaces and elements like a cloud database. Being a real-time database, Firebase is a good platform for storing ordered data. Its real-time nature lets users receive data updates quickly. When users go offline, the real-time database uses a local cache on the device to serve and store changes. The Firebase real-time database also includes Firebase authentication and a security model, and hence the data is properly protected. In [15], access to the database is first set up, along with rules according to the user's database plan. In [16], Firebase is used to fetch values from the microcontroller, and these values are instantly shared with the user via SMS. A mobile application lets the caretaker monitor the patient anytime, anywhere. This application was built according to the patient's requirements and provides a user-friendly environment that makes monitoring more comfortable for the caretaker. MIT App Inventor, an open-source software tool, was used to develop the application. It uses a graphical interface that allows users to drag and drop graphical blocks to generate an application that runs on mobile devices. This application acts as an interface between the microcontroller and the database. When the data in Firebase is updated, the change is reflected in the application as well [10].
Fig. 1 Overall schematic of the working model
Figure 1 illustrates the overall schematic of the proposed work. The main concept of this paper is the Smart Walker used by the patient (the elderly person). The Smart Walker carries the microcontroller together with a set of sensors and modules. The microcontroller processes the data from the sensors and sends it to Firebase. The caretaker has an Android application, which receives and displays the data from Firebase. The walker also has a fall detection mechanism: whenever a fall is detected, a call is initiated to the caretaker's mobile number and the GPS coordinates are shared.
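One way to picture the fall detection mechanism is as a comparison of consecutive accelerometer readings, flagging a fall when the change on any axis is large. The sketch below is a minimal illustration of that idea, not the authors' Arduino firmware: the function name `detect_fall`, the sample values, the units, and the threshold of 0.5 are all hypothetical and would need calibration on real hardware.

```python
def detect_fall(samples, threshold):
    """Return the index of the first sample that differs sharply from the
    previous one on any axis, or None if no such jump is found.

    samples   -- sequence of (x, y, z) accelerometer readings
    threshold -- largest per-axis change treated as normal movement
                 (hypothetical value; must be tuned for a real sensor)
    """
    for index in range(1, len(samples)):
        previous, current = samples[index - 1], samples[index]
        largest_change = max(abs(c - p) for p, c in zip(previous, current))
        if largest_change > threshold:
            return index
    return None


# Illustrative readings: the stick is upright, then tips over abruptly.
readings = [(0.00, 0.00, 1.00), (0.05, 0.00, 1.00), (0.90, 0.20, 0.10)]
fall_index = detect_fall(readings, threshold=0.5)
if fall_index is not None:
    print(f"Fall suspected at sample {fall_index}: alert the caretaker")
```

A per-axis comparison keeps the logic simple enough for a small microcontroller; a real deployment would likely also need debouncing so that setting the stick down firmly is not reported as a fall.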
2 Methodology
The proposed system consists of two parts: the microcontroller unit embedded on the walker, and an application that acts as an interface between the walker and the users. It revolves around synchronizing the predesigned real-time database with the microcontroller and the application. The system is designed with four sensors: an IR sensor, an accelerometer, a pulse sensor, and a temperature sensor [17]. The proposed plan also includes GPS and GSM modules. Readings are sent from the microcontroller to Firebase and can then be viewed through the application.
2.1 Arduino UNO
The Arduino is used for reading the inputs from the various sensors and modules. These inputs are processed, and the resulting values are sent to the Node MCU. The Arduino UNO is one of the most popular Arduino boards and is based on the ATmega328 microcontroller [18, 19]. It has 14 digital I/O pins, six analog inputs, a 16 MHz crystal oscillator, a USB connection, a power jack, and a reset button. The Arduino UNO was chosen because of its simple and clear programming environment.
2.2 Node MCU
The Node MCU is a microcontroller unit built around the ESP8266, a highly integrated Wi-Fi chip with an operating voltage of 3.3 V. It has a built-in TCP/IP protocol stack that allows the microcontroller to connect to a Wi-Fi network [20]. It is a development board with open-source firmware that helps in building IoT products. It runs on the ESP8266 Wi-Fi SoC, with hardware based on the ESP-12 module. The Node MCU was chosen because it can be programmed in the same environment as the Arduino UNO.
2.3 Firebase
The values from the microcontroller need to be stored in a database so that they can be retrieved easily. Important parameters such as the heart rate, temperature, and location of the patient are sent to the database. The main advantage of Google Firebase is that it allows any number of users to access the database as long as they have valid database credentials, and it can be used to build not only Android apps but also iOS and Web apps.
2.4 MIT App Inventor
The application is built using MIT App Inventor. It provides a graphical interface that allows users to assemble graphical blocks into an application that works on Android devices. It offers an easy environment for creating an application and adding features to it. It also supports the use of cloud data via an experimental Firebase component.
2.5 Sensors and Other Modules
Values from the temperature sensor are read from an analog pin, and the analog values are multiplied by a factor of 0.4887. The heartbeat sensor produces analog data, which is read using another analog pin. With the help of the PulseSensor Playground library, the values taken from the analog pin are processed and the heart rate is obtained. Two of the other analog pins of the microcontroller are used for the accelerometer. The accelerometer gives three coordinates along the X, Y and Z axes, which denote the inclination of the stick [21]. The previous and current values of the accelerometer are compared to detect any fall. When a fall occurs, there will be a big change
between the present and the past values, and the microcontroller will sense the fall. The output from the infrared sensor is connected to one of the digital pins of the microcontroller. The infrared sensor sends a signal at a digital level of '0' if it detects an obstacle in front of it, or a signal at a digital level of '1' if there is no obstacle. The GPS module is connected to the microcontroller using the Tx and Rx pins. The values received from the GPS module are processed using its library, and the respective coordinates are calculated and sent to Firebase. The GSM module is connected to the microcontroller using two of the digital pins. A SIM card is inserted into the GSM module to enable calls and texts to the caretaker when the patient experiences an emergency. These calls and texts are initiated using certain AT commands [22]. A buzzer is also connected to one of the digital pins. The buzzer is used to indicate obstacles and to alert people nearby in case of emergency, with different levels of buzzing. Table 1 shows the specifications of the components used in the implementation of the walking stick. The cost and the efficiency of the components were considered during implementation. The total cost of implementing the walking stick is around $51.19. The miscellaneous components include the wiring needed to connect all the sensors and the materials needed for attaching the components to the walking stick.

Table 1 Components used and their specifications along with their cost
Component | Component specifications | Cost of each component (in dollars)
Pulse sensor | SEN-11574 | 2.00
Temperature sensor | LM35 | 1.00
Accelerometer sensor | ADXL-345 | 2.15
Infrared sensor | LM358 | 0.50
GPS module | NEO-6M | 9.39
GSM module | SIM 800 | 8.03
Microcontroller | Arduino UNO | 7.84
Microcontroller | Node MCU | 4.02
Battery | 10,000 mAh power bank | 10.71
Walking stick | Single stemmed and three legged base | 5.05
Miscellaneous | | 0.5
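The factor of 0.4887 applied to the temperature reading in Sect. 2.5 is consistent with the LM35's output of 10 mV per degree Celsius when it is read through a 10-bit analog-to-digital converter with a 5 V reference. The sketch below shows that arithmetic; the 5 V reference and the 1023 full-scale count are assumptions about the board configuration that the text does not state explicitly, and the function names are illustrative.

```python
ADC_REFERENCE_VOLTS = 5.0     # assumed ADC reference voltage
ADC_MAX_COUNT = 1023          # assumed full-scale count of a 10-bit converter
LM35_VOLTS_PER_DEG_C = 0.010  # LM35 sensitivity: 10 mV per degree Celsius


def scale_factor():
    """Degrees Celsius represented by one ADC count."""
    volts_per_count = ADC_REFERENCE_VOLTS / ADC_MAX_COUNT
    return volts_per_count / LM35_VOLTS_PER_DEG_C


def adc_to_celsius(raw_count):
    """Convert a raw ADC reading from the LM35 into degrees Celsius."""
    return raw_count * scale_factor()


print(f"scale factor: {scale_factor():.4f} degrees C per count")
print(f"raw reading 75 -> {adc_to_celsius(75):.2f} degrees C")
```

Under these assumptions the factor evaluates to roughly 0.4888, which agrees with the value used in the paper to within rounding.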
3 Implementation
The rough model describing the positioning of the various sensors and modules, along with the prototype, is shown in Fig. 2. First, the pulse sensor and the temperature sensor are placed on the handle of the walking stick. The temperature sensor is placed at the bottom of the handle and the pulse sensor at the top. Sweat from the patient's fingers usually affects the readings taken from the sensor, so the position of the temperature sensor was chosen to minimize this error. The accelerometer is placed at the center of the stick so that it can detect even the slightest movements of the patient. GPS works more efficiently when it is exposed to open surroundings; hence, it is placed on the outer side of the microcontroller kit, which consists of the GSM module, Arduino UNO, Node MCU, and the battery. Finally, since the purpose of the IR sensor is to detect obstacles, it is attached to one of the three legs. Next, the database is initialized. The Firebase cloud provides a unique Firebase URL for sending data from the microcontroller to Firebase and receiving data back. It also has a unique database secret (a secret key) to prevent unauthorized access to the database. The code running on the microcontroller creates a bridge between the application and Firebase. The Firebase URL and the database secret are provided in the code so that the microcontroller can send values to the database. Figure 3 shows the real-time values stored in the database through Google's Firebase. The values sent by the microcontroller are stored under individual tags so that they can be retrieved accordingly.
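To make the role of the Firebase URL, the database secret, and the tags more concrete, the sketch below builds (but does not send) a write request in the style of the Firebase Realtime Database REST interface, where a value is written to a path ending in `.json` and a credential is passed in the `auth` query parameter. This is an assumption-laden illustration rather than the authors' microcontroller code: the project URL, the secret, and the tag name `heart_rate` are hypothetical placeholders, and the firmware itself would more likely use an Arduino-side Firebase library.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_update_request(base_url, secret, tag, value):
    """Build a PUT request that would store `value` under `tag`.

    base_url -- database URL (hypothetical placeholder in the example below)
    secret   -- database secret passed as the `auth` query parameter
    tag      -- path under which the value is stored, e.g. "heart_rate"
    """
    query = urlencode({"auth": secret})
    url = f"{base_url.rstrip('/')}/{quote(tag)}.json?{query}"
    body = json.dumps(value).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, method="PUT", headers=headers)


# Hypothetical project URL, secret, and tag; nothing is sent over the network.
request = build_update_request(
    "https://example-walker.firebaseio.com/", "DATABASE_SECRET", "heart_rate", 82
)
print(request.get_method(), request.full_url)
print(request.data)
```

Storing each reading under its own tag mirrors the layout described above, in which the application later asks for one tag at a time.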
Fig. 2 Prototype of the walking stick
Fig. 3 User datalog in the cloud
The bridge between Firebase and the user also needs to be created. This can be done using MIT App Inventor, where the code can be generated in a simple drag-and-drop manner. MIT App Inventor has an option for linking Firebase with the application. The Firebase credentials (the Firebase URL and the database secret) are entered while creating the link between Firebase and the application. After the credentials are given, the tag value needs to be stated to specify which part of the database is to be extracted. The application is then created, and a login page is set up so that only authorized members, who are given a username and a password, have access. Figure 4a shows the code for setting up the username and password in MIT App Inventor, which allows only the caretaker and authorized people to access the data. Figure 4b shows the code that retrieves values and displays them to the user. It also shows how the map is set up and how a marker is created to point to the exact location of the patient. Figure 5a shows the login page and the welcome page of the application made using MIT App Inventor. The caretaker has to log in with the username and password set up initially. If a wrong entry is made, a screen pops up saying that the credentials are invalid. Figure 5b shows the display in the application, where the current values are shown and refreshed as the Firebase values are updated. Values from the temperature sensor, heart rate sensor, and GPS module are updated in the caretaker's application according to the timer set initially. Figure 6 shows the location sharing protocol: once the live location is requested by the caretaker in case of emergency, the patient's live location is shared immediately. The coordinates of the patient's location are also pinpointed on a map, making it easy for the user to track the patient.
Fig. 4 a Code for the first page of the mobile application, programmed in MIT App Inventor. b Code for receiving values from Firebase, programmed in MIT App Inventor
Fig. 5 a Log in screen of the mobile application. b Welcome screen of the mobile application
Figure 7 shows the actual walking stick before and after embedding all the sensors. The embedded walking stick is light enough that elderly people can carry it without difficulty. Figure 8 shows the caretaker's mobile receiving a call: when an emergency occurs and a fall is detected, a call is initiated to the caretaker's mobile with the help of the GSM module. Figure 9 shows the walking stick held by an elderly person and illustrates how easily it can be held and used. Hence, elderly people can actively take part in their daily activities without depending on others.
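The emergency call and text described above are driven by AT commands sent to the GSM module over a serial link. The sketch below only assembles the command strings and does not open a serial port. It uses the widely supported commands `AT+CMGF=1` (select SMS text mode), `AT+CMGS` (send a message, terminated by Ctrl+Z) and `ATD<number>;` (dial a voice call); the exact commands used by the authors are not given in the text, so the sequence, the message wording, and the phone number are illustrative assumptions.

```python
CTRL_Z = "\x1a"  # terminates the SMS body in text mode


def call_commands(number):
    """Commands that dial a voice call to the given number."""
    return [f"ATD{number};\r"]


def sms_commands(number, text):
    """Commands that send `text` as an SMS in text mode."""
    return ["AT+CMGF=1\r", f'AT+CMGS="{number}"\r', text + CTRL_Z]


def emergency_sequence(number, latitude, longitude):
    """Text the GPS coordinates to the caretaker, then place a call."""
    # Message wording is a hypothetical example, not taken from the paper.
    message = f"Fall detected. Location: {latitude:.6f},{longitude:.6f}"
    return sms_commands(number, message) + call_commands(number)


# Hypothetical caretaker number and coordinates, printed instead of sent.
for command in emergency_sequence("+10000000000", 11.0, 76.9):
    print(repr(command))
```

On the device, each string would be written to the serial port connected to the GSM module, waiting for the module's response between commands.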
4 Conclusion
In this paper, an Arduino-based Smart Walker support for the elderly has been developed. The main aim was to build a stick that measures the heart rate, temperature, and location in real time and sends them to Firebase. A mobile application was developed to view the current data as well as previous readings. Access is also secured, since a unique username and password are given to the users. The mobile application can also show the location of a person on a map. The stick was also embedded with an infrared sensor which detects objects very close to the stick to avoid any fall. An
Fig. 6 Real-time values and the live location as shown in application
accelerometer and a buzzer were embedded to alert people nearby in case of a fall. The caretaker is also alerted by a call when the patient's health status deteriorates. All the sensors were implemented successfully, and the results were recorded. The main purpose of this work is to help the elderly take care of their own needs without depending on others. It can also be used by differently abled people or by people who need to be monitored regularly. A doctor can also be given access to the data so that the doctor can suggest any steps needed to ensure the proper health of the patient. This walker can also be used by visually impaired people, since it has an obstacle detecting mechanism.
Fig. 7 Walking stick before and after fixing all sensor modules
Fig. 8 Caretaker’s mobile status before and after receiving an emergency
Fig. 9 Smart walking stick held by an elderly person
References
1. G.A. Haidar, R. Achkar, R. Maalouf, B. McHiek, H. Allam, A.R. Moussa, Smart walker, in 2014 International Conference on Future Internet of Things and Cloud, FiCloud 2014, 2014, pp. 415–419
2. M. Aljahdali, R. Abokhamees, A. Bensenouci, T. Brahimi, M.A. Bensenouci, IoT based assistive walker device for frail & visually impaired people, in 2018 15th Learning and Technology Conference (L&T), 2018, pp. 171–177
3. K.T. Kadhim, A.M. Alsahlany, S.M. Wadi, H.T. Kadhum, An overview of patient's health status monitoring system based on Internet of Things (IoT). Wirel. Pers. Commun. 114(3), 2235–2262 (2020)
4. S.S. Raykar, V.N. Shet, Design of healthcare system using IoT enabled application. Mater. Today Proc. (2019). https://doi.org/10.1016/j.matpr.2019.06.649
5. S. Srinivasan, M. Rajesh, Smart walking stick, in Proceedings of the International Conference on Trends in Electronics and Informatics, ICOEI 2019, 2019, pp. 576–579. https://doi.org/10.1109/ICOEI.2019.8862753
6. V.R. Parihar, A.Y. Tonge, P.D. Ganorkar, Heartbeat and temperature monitoring system for remote patients using Arduino. Int. J. Adv. Eng. Res. Sci. 4(5), 55–58 (2017). https://doi.org/10.22161/ijaers.4.5.10
7. P.W. Digarse, S.L. Patil, Arduino UNO and GSM based wireless health monitoring system for patients, in 2017 International Conference on Intelligent Computing and Control Systems, ICICCS 2017, 2017, pp. 583–588. https://doi.org/10.1109/ICCONS.2017.8250529
8. L. Boppana, V. Jain, R. Kishore, Smart stick for elderly, in Proceedings - 2019 IEEE International Congress on Cybermatics; 12th IEEE International Conference on Cyber, Physical and Social Computing; 15th IEEE International Conference on Green Computing and Communications; 12th IEEE International Conference on Internet of Things, 2019, pp. 259–266. https://doi.org/10.1109/iThings/GreenCom/CPSCom/SmartData.2019.00064
9. S. Suryanarayanan, N. Rakesh, Emergency human collapse detection and tracking system, in Proceedings of the 2017 International Conference on Smart Technology for Smart Nation, SmartTechCon 2017, 2017, pp. 324–329. https://doi.org/10.1109/SmartTechCon.2017.8358390
10. R. Alkhatib, A. Swaidan, J. Marzouk, M. Sabbah, S. Berjaoui, M.O. Diab, Smart autonomous wheelchair, in 2019 3rd International Conference on Bio-engineering for Smart Technologies, 2019, pp. 1–5
11. T.M.B. Shankar Balu, R.S. Raghav, K. Aravinth, M. Vamshi, M.E. Harikumar, J. Rolant Gini, Arduino based automated domestic waste segregator, in Proceedings of the 5th International Conference on Communication and Electronics Systems (ICCES 2020), 2020, pp. 906–909. https://doi.org/10.1109/ICCES48766.2020.09137977
12. W. Li, C. Yen, Y. Lin, S. Tung, S. Huang, JustIoT Internet of Things based on the Firebase Realtime Database, in 2018 IEEE International Conference on Smart Manufacturing, Industrial & Logistics Engineering (SMILE), 2018, pp. 43–47
13. L. Goswami, Power line transmission through GOOGLE Firebase database, in 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI), 2020, pp. 415–420
14. V. Tamilselvi, S. Sribalaji, P. Vigneshwaran, P. Vinu, J. Geetha Ramani, IoT Based Health Monitoring System, in 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), 2020, pp. 386–389. https://doi.org/10.1109/ICACCS48705.2020.9074192
15. M.E. Harikumar, M. Reguram, P. Nayar, Low cost traffic control system for emergency vehicles using ZigBee, in Proceedings of the 3rd International Conference on Communication and Electronics Systems, ICCES 2018, 2018, pp. 308–311. https://doi.org/10.1109/CESYS.2018.8724035
16. C. Nave, Smart walker based IoT physical rehabilitation system, in 2018 International Symposium in Sensing and Instrumentation in IoT Era, 2018, pp. 1–6
17. S. Preethi et al., IoT based healthcare monitoring and intravenous flow control, in 2020 International Conference on Computer Communication and Informatics, ICCCI 2020, 2020, pp. 20–25. https://doi.org/10.1109/ICCCI48352.2020.9104119
18. P. Dey, M. Hasan, S. Mostofa, A.I. Rana, Smart wheelchair integrating head gesture navigation, in 2019 International Conference on Robotics, Electrical and Signal Processing Techniques, 2019, pp. 329–334
19. R. Akhil, M.S. Gokul, S. Sanal, V.K.S. Menon, Enhanced navigation cane for visually impaired, in Ambient Communications and Computer Systems, 2018, pp. 103–115
Arduino-Based Smart Walker Support for the Elderly
497
20. S. Suryanarayanan, A.V. Vidyapeetham, Design and development of real time patient monitoring system with GSM technology. J. Cases Inf. Technol. 19(4), 22–36 (2017). https://doi. org/10.4018/JCIT.2017100103 21. S. Mugunthan, T. Vijayakumar, Review on IoT based smart grid architecture implementations. Electr. Eng. Autom. 1(1), 12–20 (2019) 22. S.B. Baker, W. Xiang, I. Atkinson, Internet of things for smart healthcare: technologies, challenges, and opportunities. IEEE Access 5, 26521–26544 (2017). https://doi.org/10.1109/ACC ESS.2017.2775180
Enhancing the Framework for E-Healthcare Privacy and Security: the Case of Addis Ababa

Selamu Shirtawi and Sanjiv Rao Godla
Abstract This study designs a data security and privacy framework for healthcare centers in Addis Ababa city, covering both public and private health centers. Most Ethiopian healthcare centers keep their confidential records manually. Patient records, healthcare center records and the letters that are shared with other centers all need privacy and security. However, there is currently no clear framework for sharing health information among healthcare centers, which is the gap the current study addresses. The study follows the design science research process. Problems were identified through surveys conducted by distributing questionnaires to the health institutions. The collected data were analyzed and interpreted to define the problem and motivate the need for a framework. This stage led the researcher to define the system requirements and design the system framework. The prototype is implemented in the Java programming language and needs to be integrated into the e-health system currently used by the healthcare centers.

Keywords Healthcare centers · Security and privacy framework · E-health information system · Encryption protocols
Supported by Bule Hora University, Ethiopia.

S. Shirtawi
Department of Software Engineering, College of Informatics, Bule Hora University, Oromia, Ethiopia
e-mail: [email protected]

S. R. Godla (B)
College of Informatics, Bule Hora University, Oromia, Ethiopia
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_36
1 Introduction

A Health Information System (HIS) supports patients as well as doctors, nurses, paramedical staff and other healthcare providers in diagnosing, treating and supporting patients [1]. Health care is a matter of life and death. In such a serious matter, patients have to trust healthcare providers, and both patients and healthcare providers depend on the trustworthiness of the information systems used. Privacy and security requirements are frequently expressed in vague, contradictory and complex laws and regulations [1], so new approaches to systems design are needed. A Health Information System should provide effective, high-quality support for delivering the best care to patients without compromising their privacy and security [1]. Trust is built by ensuring security and privacy, which is required to realize the potential benefits of electronic health information exchange, especially for achieving the security goals. However, developing a secure and scalable framework for a healthcare information system is a difficult task because of the high complexity of the healthcare environment [2].

Privacy and security are basic issues in healthcare centers around the world [3], and they are major challenges for developing countries like Ethiopia. Many scholars have discussed the challenges facing healthcare centers [3]. These include the tension between data growth and analytics on the one hand and data minimization on the other, handling connected devices and mobile apps, creating effective cross-functional privacy and security teams, the performance of employees, the financial resources needed to adopt the latest technology, and effective, tiered vendor management [2, 4]. This study first analyzes the main obstacles to keeping information private and secure in healthcare centers in Addis Ababa. It then designs an enhancement framework to minimize these issues and improve the current security and privacy situation in the healthcare centers.
1.1 Statement of Problem

Big public and private hospitals such as Black Lion Hospital and Korean Hospital are always crowded with patients. In addition, most HIV/AIDS patients do not feel free to visit hospitals regularly for fear of social discrimination. Such problems can be addressed by remotely monitoring patients with chronic diseases such as HIV using pervasive healthcare systems. However, such systems need security and privacy protection mechanisms. In countries like Ethiopia, therefore, patients' privacy concerns can hinder the acceptance of pervasive healthcare systems. In Ethiopia, because of the existing manual medical records management system, patients' information is exposed to different security threats. Currently, most healthcare centers keep health data in local database storage in order to secure it. However, keeping health data in this way cannot keep it safe and secure without a clear framework. The healthcare centers are not using any framework for handling the security and privacy of health data. Some papers tried to overcome these problems by proposing open source software such as OpenMRS. None of them investigated the organizations' working culture, that is, how the healthcare centers handle their employees' level of awareness of the security and privacy of health data. As a result, privacy and security remain the crucial point in health data and information management. The main aim of this study is, therefore, to explore and identify the security and privacy issues faced by healthcare providers. Our research addresses the following research questions.

• What are the requirements to enhance privacy and security in the current HMIS in Addis Ababa?
• How are security and privacy ensured when patient information is shared with other healthcare centers?
• To what extent does the designed framework enable handling the privacy and security of e-health?
1.2 Objective of the Study

The general objective of this study is to design a framework for handling and resolving privacy and security issues in e-health information systems, so as to minimize the risk of losing health data in healthcare centers. The following specific objectives are formulated.

• To review literature related to security and privacy issues in health data.
• To gather and analyze data for defining the requirements of the health information system.
• To design a prototype and framework that enhance the security and privacy of health data.
• To simulate the designed framework.
• To evaluate the performance of the prototype and conduct user acceptance testing of the framework.
1.3 Scope and Limitation of the Study

The scope of the study is the design of a privacy and security framework for EHRS in Addis Ababa healthcare centers. The study also assesses and investigates the challenges of implementing HIMS in the healthcare centers. It covers the healthcare centers in Addis Ababa that currently use an Electronic Health Information System, and it is intended to address the security and privacy of their health data. Most of the healthcare centers focus only on storing data and accessing it from the database, and they responded that they have never been concerned with health data security and privacy. The centers simply keep data offline for security and privacy reasons, even though they are expected to share patient data for fast and timely service provision. Therefore, there should be a framework that allows them to keep health data electronically and share it with each other. This study is mainly concerned with enhancing e-health privacy and security when sharing patient data. Hence, the study attempts to design a framework and implement a prototype that will enhance the security and privacy level of the health data.

This study does not cover all the healthcare centers in Addis Ababa city because HIS is applied inconsistently across them. The study is also limited to implementing a specific part of the framework because of time constraints and the lack of related articles in the area.
1.4 Significance of the Study

The main purpose of the study is to protect the health records of the healthcare centers from unauthorized users such as hackers and crackers and from information theft. It is significant for the following beneficiaries.

• Healthcare centers can manage their health data easily and efficiently. This study will allow them to keep health data, including organization information, safe and secure.
• Patients are among the main stakeholders in the security and privacy of health data. There is much information that they need to hide from everyone other than doctors or nurses, so they need their data to be confidential. Once this security and privacy framework is in place, they will have greater trust in the healthcare centers.
• The Federal Ministry of Health (FMoH) manages different information regarding healthcare centers. This study will give insight into the security and privacy of health-related information.
• Researchers are further beneficiaries of this study. They can learn how the healthcare centers deal with the security and privacy of health data and use this as input for further study.
• Policymakers are another stakeholder. Policy is identified as one of the security and privacy challenges, so policymakers can start to consider security and privacy whenever they deal with health data.

The healthcare system is one of the major issues for developing countries, and information technology is becoming progressively more important in it. The result of this thesis will therefore help establish greater public trust in HIS and in the hosting hospitals. It is applicable to:

• Keep the healthcare center's information secure and safe.
• Minimize fear of stigma and discrimination.
• Identify patients accurately.
• Protect against any reasonably expected uses or disclosures of such information that are not permitted under the privacy regulations.
2 Related Work

Several scholars have studied the security and privacy of health data with the aim of enhancing the information security and privacy of Health Information Systems, in particular in the case of OpenMRS. These works are presented as follows.

Kidanu [5] took a broad approach to the existing information security and privacy of the healthcare domain, with a practical focus on Black Lion Hospital and Korean Hospital. The main theme of the thesis was that paradigm shifts, or the adoption of new technologies, require the security and privacy aspects and solutions to be reconsidered. The work focused on only two healthcare centers, namely Black Lion Hospital and Korean Hospital. It also proposed a prototype that enhances the information security and privacy of HIS using OpenMRS. The prototype's main security and privacy features include confidentiality on the server side, ensured by a carefully placed access control mechanism; encryption that protects confidentiality during the transfer of the data and in storage; anonymization of the patient medical record; and the use of log files. The author concluded that the implemented prototype can overcome the issues raised at the time in the two hospitals, and that employees have little awareness of the security and privacy of health data [6]. The study recommended that healthcare centers adopt biometric technology (such as fingerprint or face recognition) to improve the security and privacy of health data.

Bashiri et al. [7] highlight OpenMRS's high potential for lowering the cost of implementing and developing electronic health record systems in developing countries, as well as how it can be used to manage patient information and improve the quality of health care. The authors also conclude that creating OpenMRS systems will result in more cost savings for physicians and other healthcare providers by lowering the cost of installing, maintaining and updating electronic health record systems. Such systems have the potential to improve patient information management as well as the quality and efficiency of healthcare services.

Liu et al. [8] focus on secure medical managerial strategies that can be applied to the network environment of a medical organization's information system to avoid external or internal information security events, allowing the medical system to run smoothly and safely, which benefits not only patients but also doctors and promotes overall medical health.

Farida et al. [9] examined current privacy by design frameworks and identified their major drawbacks. The study, based on a systematic literature review, looked at seven contemporary privacy by design frameworks in depth. The result, which is aimed at the healthcare industry, should provide a high level of protection against data breaches in the personal information domain.
2.1 Privacy and Security for Analytics on Healthcare Data

The main goal of the study in [10] was to investigate the status of the security and privacy aspects of an architecture for collaborative data analytics. The study aimed to provide an architecture for safe access to patient data for data mining purposes. To that end, the paper looks at the means currently available for protecting privacy in a standard healthcare organization (e.g., a hospital) and then examines mechanisms to preserve data protection in a collaborative analytics framework [10]. A related study indicates that data should be labeled in order to keep it secure and private, so that security and privacy measures are applied according to the label. It mainly followed a risk analysis approach to the security and privacy of health data and proposed that addressing re-identification risk could reduce security and privacy risks [11]. Another study describes an architecture for a secure healthcare data lake that uses the security policies authored by the medical data sources, including patient consent, to give data analysts limited access to the data for secondary use. The paper shows how these source policies can be enforced on analytical data [12].
2.2 Privacy and Security of Health Information

Privacy, as distinct from confidentiality, is viewed as the right of the individual client or patient to be let alone and to make decisions about how personal information is shared. Even though the U.S. Constitution does not specify a "right to privacy", privacy rights with respect to individual healthcare decisions and health information have been outlined in court decisions, in federal and state statutes, accrediting organization guidelines and professional codes of ethics [13]. Security refers directly to protection, and specifically to the means used to protect the privacy of health information and support professionals in holding that information in confidence. The concept of security has long applied to health records in paper form; locked file cabinets are a simple example. As use of electronic health record systems grew, transmission of health data to support billing became the norm, and the need for regulatory guidelines specific to electronic health information became more apparent [13].
2.3 Importance of Privacy and Security in Healthcare

Privacy of health data

There are a variety of reasons for placing a high value on protecting the privacy, confidentiality and security of health information [14]. Some theorists depict privacy as a basic human good or right with intrinsic value (NRC [15]). They see privacy as being objectively valuable in itself, as an essential component of human well-being. They believe that respecting privacy (and autonomy) is a form of recognition of the attributes that give humans their moral uniqueness. The more common view is that privacy is valuable because it facilitates or promotes other fundamental values, including ideals [16] such as:

• Personal autonomy (the ability to make personal decisions)
• Individuality
• Respect
• Dignity and worth as human beings.
Security of health data

Protecting the security of data in health research is important because health research requires the collection, storage and use of large amounts of personally identifiable health information, much of which may be sensitive and potentially embarrassing. If security is breached, the individuals whose health information was inappropriately accessed face a number of potential harms. The disclosure of personal information may cause intrinsic harm simply because that private information is known by others. Another potential danger is economic harm. Individuals could lose their job, health insurance or housing if the wrong type of information becomes public knowledge. Individuals could also experience social or psychological harm. For example, the disclosure that an individual is infected with HIV or another type of sexually transmitted infection can cause social isolation and/or other psychologically harmful results. The HIPAA Security Rule provided the first national standards for the protection of health information. Addressing technical and administrative safeguards, the HIPAA Security Rule's stated goal is to protect individually identifiable information in electronic form (a subset of the information covered by the Privacy Rule) while allowing healthcare providers appropriate access to information and flexibility in the adoption of technology.
2.4 Summary of Research Gaps

In general, the researcher has identified gaps in how the listed papers address the privacy and security of health data. Almost none of the papers considered the level of awareness of healthcare center employees, and they did not identify how to improve the perception of health data security and privacy in the centers. Most of the papers, especially the local ones, did not consider devices found in the healthcare centers, such as biometric devices. Some of the papers [17, 18] did not address all the security and privacy issues and picked only one or two of them. Selemawit [6] selected only two healthcare centers, Black Lion and Korean Hospitals, to investigate the security and privacy situation. However, the researcher found that the implementation of HIS differs between healthcare centers, so more than two centers should have been investigated. Those researchers did not identify the privacy and security issues that need priority attention, such as integrity, confidentiality and access control. Nor do the papers consider the sharing of health data among the healthcare centers.
3 Our Contribution

This study follows the design science research process as explained in [19, 20]. The first task in the process is identifying the problems that healthcare centers have with the security and privacy of health data. Problems were identified by investigating how the centers keep health data, using several methods: data were gathered through questionnaires, interviews and observation. In addition, the literature review played a vital role in identifying problems.

The population was determined from the statistics of the Addis Ababa office of healthcare centers. There are 11 governmental and 35 private hospitals and more than 100 healthcare centers other than hospitals. In this study, we consider 5 governmental hospitals, namely Black Lion Hospital, Menilik Hospital, Zewditu Hospital, Yekatit 12 Hospital and Torhailoch Hospital, and 10 private hospitals, namely Addis Hiwot General Hospital, St. Gabriel Hospital, Legehar General Hospital, Bethel General Hospital, Land mark Hospital, Korean Hospital, CMC General Hospital, Yere General Hospital, Amen General Hospital and Hayat Hospital. In addition, four other healthcare centers are considered, namely Jaleloya higher clinic, Bole health center, Betsegah higher clinic and Yeka health center. In each healthcare center, the IT officers were asked to complete the questionnaires and, for additional organizational information, doctors, nurses and healthcare center managers were interviewed. The investigation result is presented below.

The researcher used purposive sampling to select health institutions for the survey, because the target population consists of institutions that have an Electronic Health Information System. In addition, the researcher targeted employees, especially the IT officers in each healthcare center, for the questionnaires, and other workers, including healthcare center managers, were interviewed on the state of health data security and privacy. Figure 1 depicts the need for and relationships among the components of a safe e-healthcare system, with security and privacy serving as a superset for all healthcare systems.

After the problems were identified, a conceptual framework was designed to identify the factors and variables that could be involved in finding a solution. In addition, as shown in Fig. 2, related work was examined in order to build a conceptual framework for a secure e-healthcare system. The suggested framework is intended to allow people to share their health information without worrying about security or privacy. Figure 2 shows the conceptual framework for secure e-healthcare systems, which consists of three layers (e-health information, e-health software and infrastructure), with privacy and security treated as a separate component.
Fig. 1 Relationship between secure e-healthcare system
Fig. 2 Conceptual framework for secure e-healthcare system
4 Discussion of the Findings

This section explores the situation of the healthcare centers in the Addis Ababa city administration. The security and privacy of health data, data management, staff awareness, the organizational work culture toward the privacy and security of health data, and technology adoption were investigated.
For the privacy and security component, the analysis of the collected data is presented according to the defined framework and the elements identified in Fig. 2.

• Technological factors
• Reliability

Reliability is a technological factor in the security and privacy of health data. In this study, respondents were asked about their concern for reliability, the reliability rate, how reliability can be ensured and why it should be ensured. Every respondent answered the questions on this factor. According to the survey, 75% of the respondents say that reliability is a high concern for the security and privacy of health data, which means reliability should be considered when the health centers keep health data. Although the respondents regard reliability as a basic concern, the healthcare centers do not give it the expected attention: Table 1 shows that 40% of the respondents rate reliability as only medium. Reliability can be ensured by continuous training of data entry staff and by using reliable data storage: 40% of the respondents said that reliability can be ensured by training the data entry staff, and 35% said that it can be ensured by using reliable data storage. There are many reasons to ensure the reliability of health data. According to the respondents, reliability makes the health data easy to use and access.
Table 1 Reliability concern

Reliability concern
Valid                                     Frequency   Percent   Cumulative percent
Highly                                    15          75.0      75.0
Partially                                 5           25.0      100.0
Total                                     20          100.0

Reliability rate
Valid                                     Frequency   Percent   Cumulative percent
Very high                                 3           15.0      15.0
High                                      8           40.0      55.0
Medium                                    8           40.0      95.0
Low                                       1           5.0       100.0
Total                                     20          100.0

Ensure reliability
Valid                                     Frequency   Percent
Using reliable data storage               7           35.0
Good data capture method                  3           15.0
Update data periodically                  2           10.0
Training data entry staff                 8           40.0
Total                                     20          100.0*

Why ensure reliability
Valid                                     Frequency   Percent   Cumulative percent
It enables easy use of data               12          60.0      60.0
It takes less budget                      2           10.0      70.0
Allows the user to access data easily     6           30.0      100.0
Total                                     20          100

4.1 Access Control

The other variable identified as a technological factor in healthcare data is access control [21]. The researcher used the following variables to assess the strength of access control in the selected healthcare centers. Table 2 shows how access control is performed in the healthcare centers in Addis Ababa. The respondents were asked which of the access control methods listed in Table 2 they use. They responded that the healthcare centers use a local administrative account to keep out unauthorized users. However, a local administrator account is not secure and safe enough for keeping health data. The reason they do not use the other access control methods is that they have no knowledge of public and private keys. Most of the respondents said they do not use another method because they fear for the security and safety of the health data and have never been made aware of data security and privacy. In addition, the employees' level of knowledge about technology is the main reason for the access control problems. According to the respondents, lack of appropriate technology is the main cause of the access control problems listed in Table 2. Besides this main cause, lack of professional staff and lack of clear policies together account for 50% of the causes. The survey also indicates the access control methods proposed by the respondents and the security and privacy methods that are being used in some of the healthcare centers in Addis Ababa. The aim of this factor is to assess the methods that have been used in the centers. However, the centers do not use any dedicated method for the security and privacy of health data beyond manual measures such as a local administrative account.
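Two of the enhancement methods proposed by the respondents are limiting data access and identifying and securing sensitive data. The paper does not specify how these would be implemented, so the following is only a minimal sketch of a role-based, deny-by-default check. The class name, the roles, the two sensitivity levels and the policy itself are all hypothetical and chosen for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; roles, sensitivity levels and rules are illustrative only. */
final class AccessControlSketch {
    enum Role { DOCTOR, NURSE, IT_OFFICER, RECEPTIONIST }
    enum Sensitivity { ROUTINE, SENSITIVE }

    // Roles allowed to read routine records (assumed policy).
    private static final Set<Role> ROUTINE_READERS =
            EnumSet.of(Role.DOCTOR, Role.NURSE, Role.RECEPTIONIST);
    // Sensitive records (for example an HIV status) are limited to clinical staff.
    private static final Set<Role> SENSITIVE_READERS =
            EnumSet.of(Role.DOCTOR, Role.NURSE);

    /** Returns true only if the role is explicitly allowed; everything else is denied. */
    static boolean canRead(Role role, Sensitivity sensitivity) {
        if (role == null || sensitivity == null) {
            return false; // deny by default
        }
        return sensitivity == Sensitivity.SENSITIVE
                ? SENSITIVE_READERS.contains(role)
                : ROUTINE_READERS.contains(role);
    }

    public static void main(String[] args) {
        System.out.println(canRead(Role.DOCTOR, Sensitivity.SENSITIVE));
        System.out.println(canRead(Role.RECEPTIONIST, Sensitivity.SENSITIVE));
    }
}
```

Unlike a single shared administrative account, a check of this kind ties every read to a role, which also makes it possible to log who accessed which record.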
Table 2 How access control took place in the healthcare centers

Valid                                           Frequency   Percent   Cumulative percent

Access control in healthcare centers
Administrative account                          16          80.0      80.0
Public key                                      1           5.0       85.0
Private key                                     3           15.0      100.0
Total                                           20          100.0

Problems occurred when applying access control
Information security                            6           30.0      30.0
Level of security is not defined                1           5.0       35.0
Level of employee knowledge about technology    7           35.0      70.0
Level of user knowledge about technology        6           30.0      100.0
Total                                           20          100.0

Cause of the problems
Lack of appropriate technology                  10          50.0      50.0
Lack of professional staff                      5           25.0      75.0
Lack of clear policy on health data             5           25.0      100.0
Total                                           20          100.0

Access control enhancement method
By limiting data access                         8           40.0      40.0
By identifying sensitive data and securing      6           30.0      70.0
By using strong and different password          4           20.0      90.0
Disallowing any access without administrator    2           10.0      100.0
Total                                           20          100.0

4.2 Organizational Factors

Information sharing is key to the day-to-day activity of any organization [22]. The following statistics show the habits and ways of information sharing across the healthcare centers. Three variables were selected to compute the factor, and the respondents answered all of them. This variable investigates the structure of the organization with respect to the security and privacy of health data in each healthcare center. Table 3 illustrates the situation of the healthcare centers' organizational structure with respect to the security and privacy of health data.

Organizations improve their employees' knowledge in different ways. Healthcare centers are among the organizations that need continuous improvement in current technology. Table 3 shows how Addis Ababa healthcare centers work on improving employee knowledge, especially on health data security and privacy. The centers mainly use continuous training: 55% of the respondents replied that they improve their employees' knowledge through training, but this has not been very fruitful [23]. The survey also shows the linkage between employees and the habit of securing health data and keeping it private. Unfortunately, the linkage is medium or even low: 40% of the respondents said the linkage is medium, and 30% replied that it is low. This indicates that the linkage between employees and security and privacy needs enhancement. Employees should have a tight linkage with the security and privacy of health data.

Every organization has different departments, and linkage between the departments is key for effective communication and organized work. The IT department in any organization plays a vital role in the communication between the departments. The following statistics show the linkage and role of the IT department in the healthcare centers.

Table 3 Ways of improving security and privacy knowledge

Valid                                           Frequency   Percent   Cumulative percent

Ways of improving security and privacy knowledge
Continuous training                             11          55.0      55.0
Adopting new technology                         2           10.0      65.0
Appreciating performer                          1           5.0       70.0
Experience with similar organization            6           30.0      100.0
Total                                           20          100.0

The linkage between employees and data security and privacy
Partially                                       5           25.0      25.0
Low                                             6           30.0      30.0
Total                                           20          100.0     100.0

Who is responsible for health data security and privacy
IT department                                   12          60.0      60.0
Each department                                 6           30.0      90.0
Manager                                         2           10.0      100.0
Total                                           20          100.0

Linkage with IT and another department in the healthcare centers
Strongly                                        9           45.0      45.0
Partially                                       5           25.0      25.0
Low                                             6           30.0      30.0
Total                                           20          100.0     100.0
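The Percent and Cumulative percent columns of the frequency tables in this section follow directly from the response frequencies. As a minimal sketch, the code below recomputes them for the "Reliability rate" block of Table 1 (frequencies 3, 8, 8 and 1 out of 20 respondents). The class and method names are hypothetical and are not part of the paper's prototype.

```java
import java.util.Locale;

/** Hypothetical helper for reproducing the columns of a frequency table. */
final class FrequencyTableSketch {

    /** Percent of the total for each response category. */
    static double[] percent(int[] frequency) {
        int total = 0;
        for (int f : frequency) {
            total += f;
        }
        double[] result = new double[frequency.length];
        for (int i = 0; i < frequency.length; i++) {
            result[i] = 100.0 * frequency[i] / total;
        }
        return result;
    }

    /** Running sum of the percent column. */
    static double[] cumulative(double[] percent) {
        double[] result = new double[percent.length];
        double running = 0.0;
        for (int i = 0; i < percent.length; i++) {
            running += percent[i];
            result[i] = running;
        }
        return result;
    }

    public static void main(String[] args) {
        String[] labels = {"Very high", "High", "Medium", "Low"};
        int[] frequency = {3, 8, 8, 1}; // "Reliability rate" block of Table 1
        double[] pct = percent(frequency);
        double[] cum = cumulative(pct);
        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%-10s %3d %6.1f %6.1f%n",
                    labels[i], frequency[i], pct[i], cum[i]);
        }
    }
}
```

For this block the computed values are 15.0, 40.0, 40.0 and 5.0 percent, with cumulative values 15.0, 55.0, 95.0 and 100.0, which match the figures reported in Table 1.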
5 A Prototype Demonstration to Enhance the Level of Security and Privacy of Health Data

In this section, the researcher presents the prototype demonstration, which improves the security and privacy of healthcare centers. The prototype is mainly concerned with securing the communication between nurses and doctors about the details of a patient. Most healthcare centers in Addis Ababa communicate patient details only through messaging, without any security and privacy measures. The researcher has identified this conversation as a big threat to the security and privacy of patient information. Therefore, the Advanced Encryption Standard (AES) algorithm is demonstrated on the Java platform to secure the conversation between doctors and nurses. Encryption is applied at both ends of the conversation: the sender encrypts the message before it is sent, the message arrives at the receiver encrypted, and the receiver reads it by decrypting it using the provided platform.

The conversation is illustrated in Fig. 3, in which the researcher takes the doctor's computer as the server and the nurses as the clients. All the nurses can communicate with a single doctor using only the name or the address of the doctor's computer. First, the doctor makes himself or herself active by starting the server. The nurses can then write the details of the patients to the respective doctor in the chat box provided. While writing the details of the patient, they do not need to worry about the security and privacy of those details. In this scenario, the nurse receives information from the patient and sends it to the respective doctor. The doctor decrypts the patient information using the platform on his or her device. After the doctor analyzes the patient information, the information is sent securely to the next process.
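The paper does not list the prototype's code or state which AES mode, key size or key distribution it uses. The following is therefore only a minimal sketch of encrypting and decrypting one chat message, assuming AES in GCM mode with a 128-bit key that the nurse and the doctor already share. The class name AesMessageSketch and its methods are hypothetical, and the sample message is invented.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical helper; class and method names are not from the paper. */
final class AesMessageSketch {
    private static final int IV_BYTES = 12;   // IV length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh 128-bit AES key (assumed to be shared out of band). */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Nurse side: encrypts a message and returns Base64(IV || ciphertext || tag). */
    static String encrypt(SecretKey key, String plainText) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a new random IV for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] cipherText = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + cipherText.length)
                    .put(iv).put(cipherText).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Doctor side: decrypts a message; fails if the message was modified in transit. */
    static String decrypt(SecretKey key, String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String sent = encrypt(key, "Patient 17: temperature 38.5 C");
        System.out.println("On the wire: " + sent);
        System.out.println("At the doctor: " + decrypt(key, sent));
    }
}
```

GCM is chosen in this sketch because it protects integrity as well as confidentiality, so a message altered between the nurse's client and the doctor's server is rejected instead of being decrypted into misleading patient details.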
5.1 Discussion
The document analysis and the interviews revealed the current practice for handling electronic and non-electronic health data and showed that it lacks a data security and privacy architecture. As a result, sensitive information is disclosed to unauthorized users, the risk of fraud increases, and patient privacy is violated. However, security and privacy are a major
Enhancing the Framework for E-Healthcare Privacy …
Fig. 3 Communication between doctors and client for secure e-healthcare system
concern of both the patient and the hospital. Most healthcare centers in Addis Ababa currently face serious challenges in handling patient information. The proposed prototype implements encrypted communication among health professionals, which secures the health data. In addition, it implements data sharing between healthcare centers using a shared key. Compared with current data handling, the proposed prototype enhances the level of security and privacy of health data. The proposed prototype therefore appears capable of enhancing information security and privacy in the health information system.
6 Conclusion
The security and privacy of data are becoming a main concern for every organization. Healthcare centers are among the organizations that need to take health data security and privacy seriously. Ensuring the security and privacy of health data could improve the service provision of the centers, and the centers can earn patient trust when patients are assured that their records are safe. Currently, the healthcare centers in Addis Ababa, whether governmental or private, largely neglect security and privacy. This thesis investigated how they handle the security and privacy of their data, including patient records, and how they share data with other centers. The survey indicates that they give little attention to the sharing of health data within the centers. Most of the healthcare centers are not concerned with the security and privacy of the health data
S. Shirtawi and S. R. Godla
because they keep records using a traditional file-based data management approach; in addition, they do not share any information with other centers. When the researcher asked why they do not share health data, they answered that there is no clear security and privacy platform for sharing data with other healthcare centers. User requirements were defined based on the survey, and the requirements indicated what had to be implemented. The main aim of this thesis was to ensure the security and privacy of health data by enhancing the security and privacy of the current system. The enhancement was made by developing a prototype based on the user requirements identified during the assessment. The prototype encrypts the conversations between doctors and nurses and shares health data among the healthcare centers using public key encryption. The prototype is implemented in Java, and the public key encryption was set up using the Kleopatra platform. Kleopatra allows users to publish their public keys in a key directory so that registered healthcare centers can obtain the respective public key to encrypt the shared data. System testing was carried out using the IEEE testing standard. The testing process consisted of asking the users how they received the designed prototype with respect to the security and privacy of health data. Several requirements must be met to enhance the level of security and privacy of the current HMIS. Most of the healthcare centers do not have any platform to handle the security and privacy of health data. Three things must be ensured to enhance the security and privacy level of health data. First, the full platform should be applied in all centers. This platform will allow all users to communicate through encrypted conversations.
Therefore, every healthcare center should install the platform so that it need not be concerned about the security and privacy of health data in its three phases (i.e., initial, in motion, and at rest). Second, the employees of the healthcare centers must be aware of the platform and know how to use the security system. They should know how much risk health data is exposed to when it is recorded and shared with other healthcare centers. Third, the users, including patients, must have trust in the service provided by the healthcare centers. The healthcare centers currently share patient information manually, so there is no clear platform for sharing patient information between them. Most of the healthcare centers responded that they consider the security and privacy of patient information when sharing data through HMIS. In addition, the healthcare centers think sharing would be easy if there were a secure and safe way of sharing health data through HMIS. This thesis designs a prototype for sharing health data with security and privacy in mind. The designed framework allows healthcare centers to share health data using public keys. The public key of each healthcare center is published in the public key directory, as illustrated in Fig. 3. Because the data is encrypted with the receiver's key from the directory, the receiver is able to access the sent data using its own private key. This study designs a framework that enhances the level of security and privacy of health data. The designed framework includes many factors (such as access control, encryption, and reliability) that were identified as the main factors for improving the security level of health data. The factors were identified from different points of view, as presented in Chap. 3. The factors were identified based on gathered data; this helps
the study stay aligned with the problem identification. Most of the respondents stated that there is no clear platform that keeps health data secure and safe. This thesis designs an enhancement framework covering several aspects (such as access control, reliability, and backup of sensitive data). The main contribution of this thesis is a practical approach to the security and privacy of health data for healthcare centers. The study shows the main gaps among healthcare centers regarding security and privacy. This thesis identifies the challenges and provides a solution to them by applying best practices for keeping health data secure and safe. This study also contributes to researchers and scholars working on further problems in the health industry regarding the security and privacy of health data, and it could be used as a basis for further research.
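The public key directory described in this conclusion can be illustrated with a small Java sketch. It is only an in-memory stand-in written under stated assumptions: the prototype relies on Kleopatra (an OpenPGP tool), whereas this example uses plain RSA-OAEP from the Java standard library, and the names `KeyDirectorySketch`, `register`, and `encryptFor` are hypothetical. Direct RSA encryption also only fits short payloads, so a real system would typically combine it with a symmetric cipher.

```java
import javax.crypto.Cipher;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a public key directory (not the Kleopatra API). */
final class KeyDirectorySketch {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private final Map<String, PublicKey> directory = new ConcurrentHashMap<>();

    /** A center generates a key pair, publishes the public half, and keeps the private half. */
    PrivateKey register(String centerName) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair pair = generator.generateKeyPair();
            directory.put(centerName, pair.getPublic());
            return pair.getPrivate();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }

    /** The sender encrypts a short record with the receiver's public key from the directory. */
    byte[] encryptFor(String receiver, String record) {
        PublicKey key = directory.get(receiver);
        if (key == null) {
            throw new IllegalArgumentException("No public key registered for " + receiver);
        }
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return cipher.doFinal(record.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    /** Only the holder of the matching private key can recover the record. */
    static String decrypt(PrivateKey privateKey, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, privateKey);
            return new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }

    public static void main(String[] args) {
        KeyDirectorySketch directory = new KeyDirectorySketch();
        PrivateKey centerB = directory.register("center-b");   // hypothetical center name
        byte[] sealed = directory.encryptFor("center-b", "example shared record");
        System.out.println(decrypt(centerB, sealed));
    }
}
```

The sketch makes the division of roles explicit: any registered center can encrypt for another center using the published key, while decryption requires the private key that never leaves the receiving center.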
6.1 Future Work
Since the advancement of HMIS increases the vulnerability of the privacy and security of health data, especially sensitive health data, which might have a great impact on the health service, the following future work is proposed as a continuation of this work:
• The designed system could be improved if the levels of security and privacy are identified. Therefore, identifying and separating the levels of security and privacy of health data in healthcare centers needs further investigation.
• All healthcare centers should use the same HIS to better improve the security and privacy of health data.
• Online access to such records by the patient is another important aspect of the service that could be added to the proposed system.
Acknowledgements Supported by Bule Hora University, Ethiopia. The authors would like to thank Almighty God, the College of Informatics, and particularly the staff of the Software Engineering department and the authorities of Bule Hora University for their constant support and cooperation.
Socket Programming-Based RMI Application for Amazon Web Services in Distributed Cloud Computing
Sanjiv Rao Godla, Getahun Fikadu, and Abinet Adema
Abstract The purpose of this paper is to explain how the Remote Method Invocation (RMI) model can be used in a distributed cloud environment. It describes the conceptual design and implementation of an RMI-based language translator. The research uses Java RMI-based socket programming to implement distributed RMI services that are located on a remote central server. We set up five instances. Two of them are virtual servers in different data centers, and the server-side program runs on them. The other three instances are virtual machines set up as clients, and the client-side program runs on them. If a single server carries the whole data load, it may be busy with various client requests, which makes clients wait for a long time and unbalances the load on a single data center. Taking this into account, we configure the servers as virtual servers, and a client can access the nearest data center if the previous server is unavailable. The paper also explains the steps for deploying the application on Amazon Web Services and discusses how virtualization is applied to the application after the server-side and client-side programs are installed in different regions. The on-demand infrastructure of Amazon Web Services provides security features that alleviate concerns about the security (confidentiality, integrity, and authenticity) of the services, and the instances are accessed over the SSH protocol to secure data. In general, the paper provides a conceptual design for language translators, converts the designed concept into a solution using Java RMI-based socket programming, and then shows how to deploy it on Amazon Web Services (AWS) offerings such as Elastic Compute Cloud (EC2).
Keywords Cloud computing · Socket programming · RMI services · Amazon Web Services
Supported by Bule Hora University, Ethiopia.
S. R. 
Godla (B) College Informatics, Bule Hora University, Bule Hora, Ethiopia e-mail: [email protected] URL: http://www.springer.com/gp/computer-science/lncs G. Fikadu · A. Adema Department of Software Engineering, College Informatics, Bule Hora University, Bule Hora, Ethiopia © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_37
S. R. Godla et al.
1 Introduction
Java RMI enables the creation of distributed resources that can be shared from a remote location. Building distributed Java applications is common with today's distributed technologies. Among these technologies, RMI is the first one employed by distributed application developers [1]. Other technologies, such as the Common Object Request Broker Architecture and the Distributed Component Object Model, are also becoming more popular [2, 3]. In this article, we have designed a language translator program using Java RMI, which will help readers apply this technology to network-based applications of their own. We use only RMI because it is compatible with socket development, and it allows an object to be accessed from another Java Virtual Machine. Client–server interaction is a term used in distributed cloud computing and network-based application development to describe the interaction between the server and the client [4]. The increasing applicability of virtualization and containerization to cloud-based applications boosts network-based application development [5]. These applications reduce the access latency of the services. This study also explains how virtualization can be used in a cloud context. In general, the paper shows how to create an RMI-based language translator program in Java, how to connect remote servers and clients, and how to deploy the application in a cloud environment such as AWS (Amazon Web Services). We have created five virtual machines, two of which are configured as servers that run the server-side programs. Three more virtual machines are used as clients.
We have tried to address load balancing and access latency by using virtualization, that is, by making services dynamic through virtual servers. We used CodeDeploy to simplify and authenticate code deployment to any of the listed instances. CodeDeploy also automates software deployments, which eliminates error-prone manual operations, makes it easier to release new features rapidly, helps avoid downtime during application deployment, and reduces the complexity of updating our applications [6]. When a service needs to be scaled with the provided infrastructure, CodeDeploy can easily deploy to one instance or to many instances. After deployment, the program was converted to jar files. In addition, the paper analyzes network throughput by recording the current time and determining how long it takes to respond to a client request. We used the method currentTimeMillis() to measure the response time [6].
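The idea that a client falls back to another virtual server when the preferred one is unavailable can be sketched on the client side as follows. This is an illustrative assumption about how such fallback could be written, not the paper's implementation; the class name `FailoverSketch`, the method `firstAvailable`, and the host names are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical client-side helper: try the servers in order of preference. */
final class FailoverSketch {
    private FailoverSketch() {}

    /**
     * Calls each endpoint in turn (nearest first) and returns the first successful reply.
     * Any RuntimeException thrown by the call, including the one raised for a null reply,
     * is treated as "this server is unavailable".
     */
    static <T> Optional<T> firstAvailable(List<String> endpoints, Function<String, T> call) {
        for (String endpoint : endpoints) {
            try {
                return Optional.of(call.apply(endpoint));
            } catch (RuntimeException unavailable) {
                // Fall through to the next data center in the list.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Placeholder host names; a real client would list its EC2 endpoints here.
        List<String> endpoints = List.of("server-a.example", "server-b.example");
        Optional<String> reply = firstAvailable(endpoints, host -> {
            if (host.startsWith("server-a")) {
                throw new IllegalStateException("simulated outage");
            }
            return "reply from " + host;
        });
        System.out.println(reply.orElse("no server reachable"));
    }
}
```

In an RMI client, the `call` function would wrap the registry lookup and the remote method invocation, converting `RemoteException` into an unchecked exception so that the loop can move on to the next server.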
Socket Programming-Based RMI Application for Amazon Web Services …
2 Literature Review
Cloud computing is a significant area in which most researchers have focused their efforts on challenges such as client–server interaction and the rules that must be followed to ensure secure communication between agents. We reviewed the research publications discussed below as the starting point for this article.
Client–Server Communication
Client–server communication is the interaction, across a network, between a server that delivers services and a client that requests them. To accept a connection request issued directly by a client, the server must already be started. After accepting a connection, the server must provide the available services in response to the client's request [7, 8]. In this paper, the clients can request language translator operations concurrently from the central server. The server must handle incoming client connections and requests and eventually reply to each client.
Communication Protocol
TCP/IP and UDP, with stream and datagram sockets respectively, are the two protocols used in Java, according to the IDG Communications Java Forum conducted in July 2017. TCP/IP communication is connection-oriented, whereas UDP communication is connectionless. We employed the connection-oriented communication protocol in this paper to support RMI communication between virtual clients and servers.
Sockets and Socket Programming
Socket programming (with a stub and a skeleton) is used to create an RMI-based client–server application that interfaces with the distributed cloud environment via sockets. Sockets allow the server and the clients to communicate in both directions [9, 10]. This paper adopts these concepts and uses two virtual servers and three clients built on instances provided by EC2. The servers are deployed virtually, and client requests are shared between these virtual servers, which reduces the workload on any single server.
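The connection-oriented, two-way socket communication described above, with a server that must be running before clients connect and that serves several clients concurrently, can be sketched in Java as follows. The class `LineServerSketch`, the one-thread-per-client design, and the `ECHO` reply format are assumptions made for illustration; the paper's application layers RMI on top of this kind of TCP exchange.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical line-based TCP server: one thread per connected client. */
final class LineServerSketch implements AutoCloseable {
    private final ServerSocket serverSocket;

    /** Binds to the given port (0 picks any free port) and starts accepting clients. */
    LineServerSketch(int port) {
        try {
            serverSocket = new ServerSocket(port);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try {
                Socket client = serverSocket.accept();
                Thread worker = new Thread(() -> serve(client));
                worker.setDaemon(true);
                worker.start();
            } catch (IOException e) {
                // accept() fails once the server socket is closed; the loop then ends.
            }
        }
    }

    /** Reads request lines from one client and answers each of them. */
    private static void serve(Socket client) {
        try (Socket socket = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("ECHO " + line);
            }
        } catch (IOException e) {
            // The client disconnected abruptly; nothing more to do for this connection.
        }
    }

    /** Client side: connects, sends one line, and returns the server's one-line reply. */
    static String request(String host, int port, String message) {
        try (Socket socket = new Socket(host, port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            serverSocket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A thread per client is the simplest way to let several clients be served at once; a thread pool would be the usual next step when the number of clients grows.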
Here, AWS provides EC2 as Infrastructure as a Service (IaaS); using this infrastructure, we can create virtual servers and clients that interact with each other. “IaaS provides us with the highest level of flexibility and management control over your IT resources and is most similar to existing IT resources that many IT departments and developers are familiar with today” [6]. Cloud computing provides several services for developers, and Amazon Web Services (AWS) is one of the biggest cloud service providers today. The survey by Subhadra Bose Shaw [12] defines the cloud as a parallel and distributed system that consists of many virtualized, interconnected computers and has three components: clients, data centers, and distributed servers. Many researchers, such as Abdulrahman [13] and Xue and Zhu [14], explain how requests and responses are passed between distributed components using socket programming and how remote components can share information using Remote Procedure Call (RPC), Remote Method Invocation (RMI), and Common Object Request Broker Architecture (CORBA). However, they did not consider deploying such applications on the cloud or the use of cloud computing. In this paper, the applicability of cloud computing and virtualization
technology is studied. AWS provides virtually distributed servers (two virtual servers), on which we have deployed the server-side programs. We have assigned three instances as clients and deployed the client-side program on them. The clients and servers communicate using RMI. Raj [15] proposed “Improved Response Time and Energy Management for Mobile Cloud Computing Using Computational Offloading”, a computational offloading strategy that brings down response time latency, energy consumption, and computation cost. It employs the fuzzy k-nearest neighbor method to identify the proper services, the hidden Markov model to predict the future location of mobile devices from their past locations, and ant colony optimization to identify the shortest path between the allotted resources and the mobile devices. The proposed system adopts these concepts and applies them to language translator applications.
3 Our Contribution
Conceptual design and description
Java RMI is used to access remote objects. To call a remote object, RMI provides a library for building a server and a remote client. On the one hand, this library makes servers remotely accessible; on the other, it makes writing remote clients easier. Each computer has its own Java Virtual Machine (JVM). In this paper, the RemoteService class is built as a service that can be accessed via a remote interface. The client can use the remote server class to perform operations such as searching a dictionary, adding a new word to the dictionary, logging previous requests, and retrieving statistics. By connecting to the server over TCP/IP with a port number, the client uses the server's naming service. The port number used is 1099. The client uses this naming service to obtain a reference to the stub, which represents the remote object on the client side. The remote server also includes a skeleton, which refers to the remote server interface. When a client calls a method of a remote object, it appears to the client as if it were calling the method directly on the object. In fact, the stub marshals the call and all its parameters into a stream and delivers them to the skeleton on the remote server. The server's skeleton then converts the stream back into a method call with the specified parameters. Finally, the skeleton calls the method, which is implemented on the remote server. Because the requested search, log, and statistics operations have return values, the procedure is reversed to obtain and use the value returned by the RemoteServer: the skeleton serializes the returned value on the server, and the stub deserializes it on the client. The RemoteServer class has a public access modifier, which makes it accessible from other classes and packages.
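The marshaling step described above, where the stub turns a call and its parameters into a stream and the skeleton turns the stream back into a call, can be illustrated with plain Java object serialization. This is a simplified sketch rather than the actual RMI wire protocol: the `Call` class and the `marshal` and `unmarshal` helpers are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical illustration of marshaling; real RMI performs this inside its library. */
final class MarshalSketch {
    private MarshalSketch() {}

    /** A request as it might travel from stub to skeleton: method name plus arguments. */
    static final class Call implements Serializable {
        private static final long serialVersionUID = 1L;
        final String method;
        final Object[] args;

        Call(String method, Object... args) {
            this.method = method;
            this.args = args;
        }
    }

    /** Stub side: serialize the call into bytes that can be written to a socket. */
    static byte[] marshal(Call call) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(call);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Skeleton side: rebuild the call object from the received bytes. */
    static Call unmarshal(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Call) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Call class is missing on the receiving side", e);
        }
    }

    public static void main(String[] args) {
        Call received = unmarshal(marshal(new Call("search", "sample-word")));
        System.out.println(received.method + "(" + received.args[0] + ")");
    }
}
```

The return value travels the same way in the opposite direction, which is why every parameter and return type of a remote method must be serializable or itself a remote object.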
The RemoteClient class, which allows clients to request tasks, is an external class that accesses the remote object of the server class. Skeleton objects can be used to allow remote access to the server. All low-level networking operations are handled by this object, which is implemented in the RMI library. In general, every access is routed through the remote
method invocation interface skeleton object. On the client side, the RMI library also generates an interface stub object, which provides access to RemoteServer. Each stub appears to implement a specific remote interface. Instead of implementing the interface itself, each stub object contacts a remote skeleton to route all method calls to the server. When a client calls a stub object method, the stub establishes a connection with the skeleton and passes the method name and arguments. As with the skeleton, the stub's user does not need to perform any network I/O explicitly; this is handled entirely by the stub object and implemented within the RMI library. Figure 1 depicts the conceptual architecture for RMI-based communication on the cloud. The remote server in this case is on-demand infrastructure provided by Amazon Web Services, and the server-side program is installed on it. Every client communicates and accesses services through a virtual machine, which is an EC2 instance. The service here is language translation, and we have used supervised machine learning to add, delete, and search for words. A privileged user adds words with their meanings, and the words are saved in the EC2 registry. Searching for a word has two possible outcomes: the server returns the meaning if the word exists, and otherwise it sends the message “word not found”. The two virtual servers provided by AWS help reduce the workload by sharing tasks between them.
Fig. 1 Conceptual design for RMI-based language translator and their implementation
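The design in Fig. 1 (a remote interface, a server object bound in a registry, and a client that obtains a stub by name) can be sketched with the standard `java.rmi` API as follows. The interface `Translator`, the binding name, and the in-memory map are assumptions for illustration and do not reproduce the paper's RemoteServer and RemoteClient classes; the dictionary entry is a placeholder. For a local demonstration, server and client run in one JVM and the stub is pointed at the loopback address.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical remote interface; every remote method declares RemoteException. */
interface Translator extends Remote {
    boolean add(String word, String meaning) throws RemoteException;
    String search(String word) throws RemoteException;
}

/** Server-side implementation backed by an in-memory map. */
final class TranslatorImpl implements Translator {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    @Override
    public boolean add(String word, String meaning) {
        return entries.putIfAbsent(word, meaning) == null; // false if the word already exists
    }

    @Override
    public String search(String word) {
        return entries.getOrDefault(word, "word not found");
    }
}

final class RmiTranslatorSketch {
    static final String NAME = "translator"; // hypothetical binding name

    private RmiTranslatorSketch() {}

    /** Starts a registry, binds the service, calls it through the stub, then shuts down. */
    static String roundTrip(int port, String word, String meaning) {
        // Local demo only: make the exported stub point at the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        TranslatorImpl impl = new TranslatorImpl();
        Registry registry = null;
        try {
            registry = LocateRegistry.createRegistry(port);
            Translator stub = (Translator) UnicastRemoteObject.exportObject(impl, 0);
            registry.rebind(NAME, stub);

            // Client side: look the stub up by name and call it like a local object.
            Translator remote = (Translator) LocateRegistry.getRegistry("127.0.0.1", port).lookup(NAME);
            remote.add(word, meaning);
            return remote.search(word);
        } catch (RemoteException | NotBoundException e) {
            throw new IllegalStateException("RMI call failed", e);
        } finally {
            // Unexport so that the RMI threads stop and the JVM can exit.
            quietUnexport(impl);
            quietUnexport(registry);
        }
    }

    private static void quietUnexport(Remote object) {
        try {
            if (object != null) {
                UnicastRemoteObject.unexportObject(object, true);
            }
        } catch (NoSuchObjectException ignored) {
            // The object was never exported, so there is nothing to clean up.
        }
    }

    public static void main(String[] args) {
        // 1099 is the registry port named in the text; the entry is a placeholder.
        System.out.println(roundTrip(1099, "sample-word", "sample meaning"));
    }
}
```

In a deployment across EC2 instances, the server part would run on the server instance and the client would pass that instance's address to `LocateRegistry.getRegistry`; the security group would also need to allow the registry port and the port of the exported object.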
Virtualization
Adding data storage to a server (data center) keeps the server busy, which affects the availability of services and might result in excessive latency when accessing them. These and similar problems led researchers to the idea of virtualization. The goal of virtualization is to minimize the workload on servers (data centers) and reduce latency so that customers can access services from anywhere. According to certain studies, the access latency problem can be overcome by locating the application server close to the end users [11]. However, how to use cloud infrastructure as a service and how to deploy applications on specific web services is not explained well. In this paper, we have used the Amazon Web Services offering Elastic Compute Cloud (EC2). EC2 provides secure, scalable computing capacity in the cloud. It provides a self-contained virtual machine OS that helps developers deploy and run their code [5]. Since the virtual machines are distributed across the globe, you can change the data center and create an instance in a different country to use the services. We can create the virtual machines on EC2. In this paper, we have created four virtual machines, assigning one as a server and the other three as clients; clients in a different region can access the server. Creating a virtual machine on EC2 follows several steps. As Fig. 2 shows, an AMI, which contains an OS, an application server, and other applications, is required to launch an instance [16]. The developer selects an instance type from the various instances provided by EC2 based on the application area. An instance is a virtual server provided by EC2 to run a network-based application. The selected instance needs some configuration and a storage device specification to store applications and services. Another advantage of using Amazon Web Services is the security it provides for application developers.
EC2 offers security groups that, for example, control traffic and restrict access to the selected protocols. Once the instance is created, accessing it also takes several steps. The actions taken by cloud application developers during deployment are depicted in Fig. 3. The developer locates an instance after creating an account with Amazon Web Services. The PuTTY client is used to connect to the virtual machines behind these instances, and it employs the SSH protocol to secure the information. Following the establishment of the connection, the developer deploys the code to specific virtual machines.
Fig. 2 Setup a virtual machine on Amazon Web Services EC2
Fig. 3 Accessing instances and deploying the code on VM
Fig. 4 Network throughput analysis, average service time in ms for every requested operation
In this paper, we have analyzed the network throughput by calculating the execution time delay. To calculate each request's execution delay, we run each request multiple times and report the average result, as shown in Fig. 4. To accomplish this, we create a timestamp before the request is sent to the server and another timestamp when the data arrives at the client. We then obtain the latency by taking the time difference, as reported in Fig. 4. As shown in this figure, the most expensive operations are search and statistics. Searching involves comparing the user input with all keys in the dataset of the language translator server and hence takes more time than the other operations. Likewise, computing statistics involves grouping and counting all words by their parts of speech and hence incurs relatively higher latency. In contrast, the log and add operations are the fastest. However, we consider that the latency of statistics can be further reduced by caching frequently requested statistics in memory or in separate files. Likewise, the latency of the search operation can be optimized by logging each client's request pattern.
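The measurement procedure described above (timestamp before the request, timestamp when the reply arrives, repeated several times and averaged) can be sketched as follows. The paper uses `currentTimeMillis()`; this sketch swaps in `System.nanoTime()`, which is better suited to measuring short elapsed intervals, and the names `LatencySketch` and `averageMillis` are hypothetical.

```java
/** Hypothetical helper for measuring the average service time of a request. */
final class LatencySketch {
    private LatencySketch() {}

    /** Runs the request {@code runs} times and returns the mean elapsed time in milliseconds. */
    static double averageMillis(Runnable request, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();          // timestamp before the request is sent
            request.run();                           // blocks until the reply has arrived
            totalNanos += System.nanoTime() - start; // elapsed time for this run
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) {
        // Stand-in for a remote call; a real client would invoke the remote search here.
        double average = averageMillis(() -> Math.sqrt(12345.678), 1000);
        System.out.println("average service time (ms): " + average);
    }
}
```

Averaging over several runs smooths out one-off effects such as connection setup or JVM warm-up, which would otherwise dominate a single measurement.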
Translator application results on adding words and sentences to the server dataset
This feature lets the user add words and sentences of a specific language together with their meanings. If the submitted word already exists, a warning message is returned to the client; otherwise, a success message is shown. Here, we translate the Walaita language into English. Users may be in different regions, and they all access the centralized dataset on the central server. Figure 5 shows a user with the add-word privilege adding new words to the cloud registry. Since the process is supervised, the user must insert each word together with its meaning. Because the word may later be searched for by another user, an expert is needed to prepare translated entries for the repository. The application checks whether an inserted word already exists to avoid data redundancy, and if it does, the application responds that the word exists. In Fig. 6, the user looks up the meaning of a word; the virtual server nearest to the user may serve this request and respond with the meaning. The application can also display the network throughput. Searching may require comparing the input with each word, so a search takes O(n) time, where n is the number of words in the dataset. According to Fig. 7, when searching for words, the client can choose whether or not to look for existing words, and if the client looks for an existing word, the server responds with a message. Figure 8 depicts the log information. The log files record the IP address of the machine, the activity the user performed, and the timestamp.
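The add and search behavior described above, together with the suggestion made in the throughput analysis to cache statistics, can be sketched as a small in-memory dictionary. This is an illustrative alternative rather than the application's code: it uses a hash map, so a lookup takes expected constant time instead of the O(n) scan described in the text, and the names `DictionarySketch` and `Entry` as well as the sample entries are placeholders.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical in-memory dictionary with duplicate checks and cached statistics. */
final class DictionarySketch {
    /** One entry: the English meaning plus a part-of-speech tag. */
    static final class Entry {
        final String meaning;
        final String partOfSpeech;

        Entry(String meaning, String partOfSpeech) {
            this.meaning = meaning;
            this.partOfSpeech = partOfSpeech;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private Map<String, Integer> cachedStatistics; // null means "needs recomputing"

    /** Adds a word; returns false and changes nothing if the word already exists. */
    synchronized boolean add(String word, String meaning, String partOfSpeech) {
        if (entries.containsKey(word)) {
            return false;
        }
        entries.put(word, new Entry(meaning, partOfSpeech));
        cachedStatistics = null; // the counts are now stale
        return true;
    }

    /** Hash lookup: expected constant time instead of comparing against all n words. */
    synchronized Optional<String> search(String word) {
        Entry entry = entries.get(word);
        return entry == null ? Optional.empty() : Optional.of(entry.meaning);
    }

    /** Counts words per part of speech; the result is reused until the next successful add. */
    synchronized Map<String, Integer> statistics() {
        if (cachedStatistics == null) {
            Map<String, Integer> counts = new TreeMap<>();
            for (Entry entry : entries.values()) {
                counts.merge(entry.partOfSpeech, 1, Integer::sum);
            }
            cachedStatistics = Collections.unmodifiableMap(counts);
        }
        return cachedStatistics;
    }

    public static void main(String[] args) {
        DictionarySketch dictionary = new DictionarySketch();
        dictionary.add("word-1", "meaning-1", "noun");   // placeholder entries
        dictionary.add("word-2", "meaning-2", "verb");
        System.out.println(dictionary.search("word-1").orElse("word not found"));
        System.out.println(dictionary.statistics());
    }
}
```

Invalidating the cached counts on every successful add keeps the statistics correct while sparing repeated recounts between updates, which is the case where the text reports the highest latency.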
Fig. 5 Adding word or sentence with the meaning
Fig. 6 Searching existed sentence or words
Fig. 7 Searching existed word or sentence
Fig. 8 Log information for each requests
4 Conclusion
This paper provides an implementation-based explanation of how to apply Java RMI to distributed cloud computation on virtual servers. It helps readers become familiar with the Remote Method Invocation (RMI) model for communication among distributed processes and with the deployment of these processes on different virtual machines. For simplicity and for the availability of built-in classes and libraries, the researchers used the Java programming language with the Eclipse IDE. The reader also learns how virtualization is used to address problems related to resource availability and access latency. In general, the paper provides the know-how to develop cloud-based applications and shows how remote objects can interact with each other.
Recommendation
Although the paper deals with the applicability of Java RMI to cloud computing, it is applied to a centralized and limited number of server and client instances in a cloud environment. The researchers therefore recommend that the study be expanded and made more decentralized by adding multiple virtual servers and clients, to provide a solution for service availability and problems
related to quality of service (QoS). The dataset prepared for the language translator application is also smaller than intended, which makes the performance evaluation of the application inconclusive. A larger dataset should therefore be prepared and kept on the server to evaluate service migration between virtual servers as a way of reducing the workload of a single server.
Design and Verification of H.264 Advanced Video Decoder Gowri Revanoor and P. Prabhavathi
Abstract Video compression and decompression are an integral part of multimedia applications. There is a need for effective algorithms that reduce bandwidth while maintaining the perceptual quality of the images. H.264 is a video coding standard by the ITU-T and ISO/IEC groups that fits this purpose. An H.264 decoder is designed as an IP using the Xilinx ISE design suite, and its functional verification is carried out in the ISE simulator. The physical design of the decoder is carried out using the OpenLANE synthesis and place-and-route software with the sky130nm PDK. The IP core dimensions are 148.57 mm length and 146.88 mm width. The final IP meets the timing constraints and is free from DRC and LVS errors. Keywords H.264 decoder · Intra-prediction · Inter-prediction · Physical design
G. Revanoor (B) · P. Prabhavathi, Department of ECE, BNMIT, 12th Main Road, 27th Cross, Banashankari Stage II, Banashankari, Bengaluru, Karnataka 560070, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_38
1 Introduction Analog-to-digital converters have made the digitization of audio and video information possible, and compression algorithms help to store and manage the resulting data. Video compression started with compact discs. Compression algorithms can be designed for different applications such as real-time video conferencing, high-speed video streaming, video editing and video archives. As the volume of video content increased with high-resolution cameras, new compression algorithms were proposed with compression ratios of more than 100:1. The first video compression standard developed was H.261; it used the discrete cosine transform to convert the pixel information into a stream of bits. The next advancement was H.262, which was used in SD televisions. The next version was H.263, which was developed in 1999. All these codecs (compression and decompression) are lossy in nature, computationally intensive and highly configurable. They can provide compression ratios of up to 1000:1. H.264, developed in 2003, is one of the most widely used standards for Internet services, web software and HDTV applications. Compared to the previous versions, it is considered the best because it halves the required data rate without compromising picture quality.
2 Related Work 2.1 Literature Survey The international standard specification for video compression, H.261, was described and implemented by Tan et al. [1]. A performance comparison of the advanced video coding standard H.264 with the H.263 and H.263+ standards was presented by Raja and Mirza [2]. The H.264 standard is widely applied to still images and video, and the transform coding techniques it builds on are also used by standards such as JPEG, MPEG-1/2 and MPEG-4. Both of these references give a good picture of the standards that existed before H.264. The history of the evolution of H.264, starting with the most basic components of video and leading to the advanced H.264 standard, is explained by Sullivan and Wiegand [3]. An overview of the standard and its evolution into a network-friendly technique is given by Wiegand et al. [4]; the authors also claim that, when used properly, it produces a 50% bit rate saving over its previous versions. The quality of H.264 in terms of subjective and objective parameters was evaluated by Vranjes et al. [5]. Their evaluation shows that H.264 achieves higher quality at bit rates below 640 kb/s. For videos with rates greater than 640 kb/s, the subjective quality of sequences coded using MPEG-4 is almost the same as that of H.264 coded video; these results are confirmed by SSIM and local PSNR. An implementation of CAVLC for H.264/AVC is elaborated by Mukherjee et al. [6]; this paper details the method of operation of the CAVLC encoder and aims at increasing the operating frequency without affecting the area. One of the attractive features of H.264, the fast intra-prediction algorithm, is explained by Shi et al. [7]. The proposed algorithm [7] also aims at reducing the overhead required in regular intra-prediction algorithms, making it suitable for applications like accelerator hardware modules in a real-time HDTV H.264 encoder. The implementation of a CAVLC decoder for H.264 video decoding is explained by Guanghua et al. [8]. The authors compare the traditional decoding method with CAVLC decoding; the results show that CAVLC decoding is more efficient than traditional decoding and is well suited for real-time applications. Wireless environments are more challenging than wired ones: losses are higher, and sometimes extra layers are needed to adapt the standard to the wireless environment. Stockhammer et al. [9] describe the challenges faced in implementing H.264 in wireless environments; these challenges led to the introduction of a new layer that helps to maintain the network-friendly design. The extension of H.264 to stereoscopic applications is explained by Adikari et al. [10]. The standard was originally designed for monoscopic applications, but as humans are more familiar with stereoscopic vision, video decoded with this extension appears more realistic than video decoded monoscopically.
2.2 Advantages of H.264
• The standard produces better bandwidth savings while maintaining video quality, which was not possible with its previous versions.
• Since many applications require video streaming, the standard can contribute considerably to bandwidth savings.
• It is an IP that can be used for further optimization at the technology and physical design levels.
• It can be used with different technologies and for different types of communication.
• It is designed so that version changes can be easily incorporated.
3 Proposed Work 3.1 H.264 Encoding and Decoding The block diagram is shown in Fig. 1. The scope of the standard is mainly the decoder; hence, it is the decoding process that is standardized. Encoders are usually designed to mirror the steps of the decoder. The encoding process involves prediction, transformation and encoding.
• Prediction: It is the process of generating predicted values so that only the residual values need to be represented.
• Transformation: It operates on the values produced by the prediction stage and generates a new set of samples by combining the input samples, often by a linear combination. It is also known as sub-band decomposition.
• Encoding: It deals with forming the stream from the transformed samples. Encoders can compress, encrypt and alter the image pixels in numerous ways before writing them to the stream.
Fig. 1 Scope of the standard [1]
The data bits thus encoded can be stored or transmitted and are decoded at the receiving end by inverse transformation and reconstruction.
• Decode: It is the process of decoding each syntax element from the received compressed, quantized bitstream and extracting the quantized information and the prediction information.
• Inverse transformation: It reverses the effects of the transformation performed on the transmitter side. The quantized transform coefficients need to be rescaled: each coefficient is multiplied by an integer value to restore its original scale. The rescaled values are combined to form residual blocks, and these blocks are combined to form the residual macroblock.
• Reconstruction: The decoder forms a prediction for every macroblock and adds the prediction to the decoded residue to reconstruct the video frames at the video output.
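The rescale, inverse transform and reconstruct steps listed above can be sketched for a single 4 × 4 block as follows. This is a simplified illustration, not the paper's Verilog design: the function names are hypothetical, a single integer `scale` stands in for the position-dependent and quantization-dependent factors used in practice, and the butterfly is modelled on the integer 4 × 4 inverse core transform commonly described for H.264.

```python
def inverse_transform_4x4(coeffs):
    """Butterfly form of a 4x4 inverse integer transform with final rounding."""
    def butterfly(w0, w1, w2, w3):
        e0 = w0 + w2
        e1 = w0 - w2
        e2 = (w1 >> 1) - w3
        e3 = w1 + (w3 >> 1)
        return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]

    # Horizontal pass over each row of coefficients
    rows = [butterfly(*row) for row in coeffs]
    # Vertical pass over each column of the intermediate result
    cols = [butterfly(*[rows[r][c] for r in range(4)]) for c in range(4)]
    # cols[c][r] is the sample at row r, column c; round and divide by 64
    return [[(cols[c][r] + 32) >> 6 for c in range(4)] for r in range(4)]


def reconstruct_block(quantized, scale, prediction):
    """Rescale coefficients, inverse transform, add prediction, clip to 8 bits."""
    # Assumption: one integer scale for all positions (simplified rescaling)
    rescaled = [[q * scale for q in row] for row in quantized]
    residual = inverse_transform_4x4(rescaled)
    return [[max(0, min(255, prediction[r][c] + residual[r][c]))
             for c in range(4)] for r in range(4)]
```

A block that contains only a DC coefficient yields a flat residual, which is a convenient sanity check for the butterfly ordering.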
3.2 H.264 Decoder Design Blocks Figure 2 illustrates the blocks of the H.264 decoder. External memory: It is the memory used to store the compressed bitstream after encoding or before decoding. Circular buffer: It is the interface between the external memory and the decoder unit. In each clock cycle, 16 bits of encoded data are fed into the buffer from the memory. After 4 clock cycles, when the buffer is half full (that is, when 64 bits of data are present in the buffer), the data is taken for the decoding process. A controller takes care of fetching new data and passing it to the decoder unit. Leading one detector: It is a unit that detects the position of the first one in the bitstream; this position is used by the decoding methods adopted here, namely CAVLC decoding, exponential Golomb decoding and fixed length decoding.
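A behavioural model of the circular buffer described above might look like the following sketch. It assumes a total capacity of 128 bits, which is inferred from the statement that the buffer is half full at 64 bits; the class and method names are hypothetical and the model ignores the controller handshake.

```python
class CircularBitBuffer:
    """Behavioural model of the bitstream buffer (names are hypothetical)."""

    WORD_BITS = 16
    CAPACITY_BITS = 128  # assumption: half full at 64 bits implies 128 in total

    def __init__(self):
        self._bits = [0] * self.CAPACITY_BITS
        self._write = 0   # next write position
        self._read = 0    # next read position
        self._count = 0   # number of unread bits

    def clock_in(self, word):
        """Store one 16-bit word, MSB first; models one clock cycle."""
        if self._count + self.WORD_BITS > self.CAPACITY_BITS:
            raise OverflowError("buffer full")
        for i in range(self.WORD_BITS - 1, -1, -1):
            self._bits[self._write] = (word >> i) & 1
            self._write = (self._write + 1) % self.CAPACITY_BITS
        self._count += self.WORD_BITS

    @property
    def half_full(self):
        """True once at least 64 unread bits are available for decoding."""
        return self._count >= self.CAPACITY_BITS // 2

    def read_bits(self, n):
        """Remove and return the next n bits as an integer."""
        if n > self._count:
            raise ValueError("not enough bits in buffer")
        value = 0
        for _ in range(n):
            value = (value << 1) | self._bits[self._read]
            self._read = (self._read + 1) % self.CAPACITY_BITS
        self._count -= n
        return value
```

With this model, four calls to `clock_in` raise the `half_full` flag, matching the four clock cycles mentioned in the text.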
[Fig. 2 block labels: From transmitter, External memory, Circular buffer (buffer half full), Leading one detector, CAVLC decoding, Expo-Golomb decoding, Variable length decoding, IQIT, Inter pred decoder, Intra pred decoder, Sum, Display]
Fig. 2 FSM of decoder
Three different types of decoders are used here:
• CAVLC decoding is used for decoding the quantized transform residue.
• Exp-Golomb codes are used for decoding other syntax elements.
• Variable length decoding is used for entropy decoding of residues.
Exponential Golomb decoding: Table 1 shows how the position of the leading one can be used to determine the code number using the exponential Golomb method.
Table 1 Exponential Golomb decoder using code word
Code num.   Code word   Code word (general form)
0           1           1
1           010         01 X0
2           011         01 X0
3           00100       001 X1 X0
4           00101       001 X1 X0
5           00110       001 X1 X0
6           00111       001 X1 X0
7           0001000     0001 X2 X1 X0
8           0001001     0001 X2 X1 X0
9           0001010     0001 X2 X1 X0
…           …           …
Variable length decoding: This decoding technique operates on the buffer output and performs entropy decoding of the residues. All these blocks together form the video decoder and are governed by the controller. CAVLC decoding: It is performed in five stages: coefficient token, trailing ones sign flag, level prefix and level suffix, total zeros, and run before. IQIT: IQIT stands for inverse quantization and inverse transformation. This block inverse quantizes and inverse transforms the received bitstream. After the bitstream is decoded, the quantized transform coefficients are rescaled. To convert the spatial frequencies back to pixel intensity values, the inverse discrete cosine transform (IDCT) is used. Reconstruction: The attractive feature of H.264 is its ability to represent large amounts of data with less data for transmission; for this, it uses concepts like intra-frame and inter-frame prediction. Frames are predicted, and the difference between the actual frame and the predicted frame is known as the residual frame. Since the residual frame carries less data than the actual frame, less data needs to be transmitted, and decoding is also simpler with this technique. Picture frames can be divided into three types:
• I: Intra-coded pictures offer the least compression and do not require any other frames to decode.
• P: Predictive pictures, also known as delta frames, can use data from previous frames to decode and can be compressed more than I frames.
• B: Bidirectional predictive pictures use both previous and next frames to decode and have the highest compression of the three.
Intra-frame decoding: Spatial redundancy is exploited by calculating the prediction values from already decoded pixels. Only the closest pixels are considered for decoding. Inter-frame decoding: Like P and B picture frames, it uses other frames to decode; this type of decoding uses one or more neighbouring frames. Sum: The inverse transformed and inverse quantized 4 × 4 matrix, the intra-predicted frames and the inter-predicted frames are added in the sum block to reconstruct the pictures. Clipping: Clipping restricts the reconstructed sample values to the valid pixel range. Deblocking filter: It removes the sharp edges formed between macroblocks, which increases the quality of the video. Display: It displays the decoded values as images. The design blocks described above are designed in accordance with the standard defined by ITU-T [11]. The functional verification of these blocks is performed as defined in Fig. 2 in the Xilinx ISE suite. The ASIC design methodology is followed with the open-source OpenLANE tool on Linux OS. OpenLANE is used to perform logic synthesis, gate-level analysis and physical design. It contains other tools like Yosys, Magic and OpenSTA. Magic is useful in generating the layout for the Verilog design; owing to its corner-stitched geometry, it is widely accepted in the research community.
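The way the leading one detector drives exponential Golomb decoding (Table 1) can be sketched as below. This is a software illustration under stated assumptions, not the hardware design: the bitstream is modelled as a string of '0' and '1' characters, and the function names are hypothetical.

```python
def leading_one_position(bits):
    """Index of the first '1' in a bit string, or -1 if there is none."""
    return bits.find("1")


def decode_exp_golomb(bits, pos=0):
    """Decode one unsigned Exp-Golomb code word starting at pos.

    Returns (code_num, next_pos), where next_pos is the index of the
    first bit after the decoded code word.
    """
    zeros = leading_one_position(bits[pos:])   # number of leading zeros
    if zeros < 0:
        raise ValueError("no leading one found")
    end = pos + 2 * zeros + 1                  # zeros + the one + info bits
    if end > len(bits):
        raise ValueError("truncated code word")
    info = bits[pos + zeros + 1:end]           # the X bits of Table 1
    code_num = (1 << zeros) - 1 + (int(info, 2) if info else 0)
    return code_num, end
```

The position of the leading one gives the number of information bits that follow it, so a code word with k leading zeros is always 2k + 1 bits long. Consecutive code words can be decoded by feeding the returned `next_pos` back in as `pos`.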
4 Results The different functional blocks, along with the FSM, were designed using behavioural modelling in Verilog HDL. Encoded data in multiple bitstreams is decoded and reconstructed. Figure 3 shows the following operations:
• The encoded data taken from the external memory for decoding is represented by the values in external_ram_address and given as input to the circular buffer.
• The decoded and reconstructed data sent to the display memory is represented by ext_frame_RAM_address.
• The decoded values are represented by display_din.
Fig. 3 Functional verification of H.264
Physical design flow is the process of generating the layout from a circuit description. The steps up to RTL generation are performed in Xilinx ISE, with the results depicted in Fig. 3; floorplanning, placement and routing are performed in the open-source tool OpenLANE. OpenLANE supports the ASIC implementation steps of the RTL to GDSII flow for design exploration and optimization. It includes OpenSTA, Yosys and Magic. Using the OpenLANE software on a Linux platform, the gate-level netlist, constraints and Sky130nm technology library were taken through the various stages of physical design. The results of floorplanning, placement and routing are shown in the following figures.
Floorplan
Floorplanning is the process of allocating positions to the macros and blocks of the design and creating routing areas around them that are used later during the routing phase. It also determines the die area and creates tracks that assist the placement tool in placing the standard cells. Figure 4 shows the floorplan layout of the decoder; this layout also contains the power distribution network strategy. The Magic tool is used to generate the floorplan of H.264.
Placement
Placement is the step in which the elements of the design are fixed in the area decided by the floorplan. It follows a calculated strategy in which the positions of the elements yield an advantage: for example, elements that interact more are placed together, and elements with more I/O are placed at the edges to avoid long routing lines. Figure 5 displays the placement strategy of the decoder. The Magic tool is used to generate the placement of H.264.
Fig. 4 Floorplan of H.264
Fig. 5 Placement of H.264
Routing
Routing is concerned with the wiring of the elements of the design. It commonly causes bottlenecks due to crowded channels, which affect performance; hence, it is an iterative process in which multiple routing possibilities are explored and the best routing strategy that still meets all the constraints is chosen. Figure 6 shows the routing layout of the decoder. The Magic tool is used to generate the routing of H.264.
Fig. 6 Routing of H.264
Layout verification
Since the layout is formed by a combination of materials, it is necessary to verify that the produced layout is in accordance with the desired schematic and free from any unwanted connections caused by the interaction of materials in the layout.
• Layout versus Schematic check: Figure 7 shows that the decoder is free from mismatches and that the error count is zero.
Fig. 7 Result of LVS check
• Design Rule Check: Figure 8 shows that the design is free from DRC errors.
Fig. 8 Result of DRC check
• Antenna check: Figure 9 shows that there are no antenna effects.
Fig. 9 Antenna check
5 Conclusion A hard IP has been designed for the H.264 decoder standard, which can be used in end devices like digital televisions and mobile platforms. It uses concepts like inter-frame decoding and intra-frame decoding, which halve the required bandwidth compared to previous versions while maintaining the perceptual quality of the videos. Functional verification of the design is carried out in the ISE simulator. The same behavioural code was then synthesized in OpenLANE using the sky130nm PDK, and physical design was carried out. The IP core dimensions are 148.57 mm length and 146.88 mm width. All timing constraints were met, and the final IP was found to be free from DRC and LVS errors.
References
1. J.A.Y. Tan, P.R.A. Lai, S.K.T. Que, E.R. Lapira, R.T. Ricardos, K.H. Torrefranca, A modified video coding algorithm based on the H.261 standard, in TENCON 2003. Conference on Convergent Technologies for Asia-Pacific Region, Bangalore, India, Vol. 3 (2003), pp. 913–917. https://doi.org/10.1109/TENCON.2003.1273380
2. G. Raja, M.J. Mirza, Performance comparison of advanced video coding H.264 standard with baseline H.263 and H.263+ standards, in IEEE International Symposium on Communications and Information Technology (ISCIT 2004), Sapporo, Japan, Vol. 2 (2004), pp. 743–746. https://doi.org/10.1109/ISCIT.2004.1413814
3. G.J. Sullivan, T. Wiegand, Video compression - from concepts to the H.264/AVC standard. Proc. IEEE 93(1), 18–31 (2005). https://doi.org/10.1109/JPROC.2004.839617
4. T. Wiegand, G.J. Sullivan, G. Bjontegaard, A. Luthra, Overview of the H.264/AVC video coding standard. IEEE Trans. Circ. Syst. Video Technol. 13(7), 560–576 (2003). https://doi.org/10.1109/TCSVT.2003.815165
5. M. Vranjes, S. Rimac-Drlje, D. Zagar, Subjective and objective quality evaluation of the H.264/AVC coded video, in 2008 15th International Conference on Systems, Signals and Image Processing, Bratislava (2008), pp. 287–290. https://doi.org/10.1109/IWSSIP.2008.4604423
6. R. Mukherjee, I. Chakrabarti, S. Sengupta, FPGA based architectural implementation of Context-based adaptive variable length coding (CAVLC) for H.264/AVC, in IET International Conference on Information Science and Control Engineering 2012 (ICISCE 2012) (2012), pp. 1–4. https://doi.org/10.1049/cp.2012.2417
7. Y. Shi, K. Tokumitsu, N. Togawa, M. Yanagisawa, T. Ohtsuki, VLSI implementation of a fast intra prediction algorithm for H.264/AVC encoding, in 2010 IEEE Asia Pacific Conference on Circuits and Systems (2010), pp. 1139–1142. https://doi.org/10.1109/APCCAS.2010.5774925
8. C. Guanghua, W. Fenfang, M. Shiwei, VLSI implementation of CAVLC decoder for H.264/AVC video decoding, in 2007 International Symposium on High Density packaging and Microsystem Integration (2007), pp. 1–3. https://doi.org/10.1109/HDP.2007.4283628
9. T. Stockhammer, M.M. Hannuksela, T. Wiegand, H.264/AVC in wireless environments. IEEE Trans. Circ. Syst. Video Technol. 13(7), 657–673 (2003). https://doi.org/10.1109/TCSVT.2003.815167
10. B.B. Adikari, W.A.C. Fernando, H.K. Arachchi, K. Loo, A H.264 compliant stereoscopic video codec, in Canadian Conference on Electrical and Computer Engineering, Saskatoon, Sask (2005), pp. 1614–1617. https://doi.org/10.1109/CCECE.2005.1557292
11. Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC), in Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT G050 (2016)
Analyzing Student Reviews on Teacher Performance Using Long Short-Term Memory Shiva Shankar Reddy, Mahesh Gadiraju, and V. V. R. Maheswara Rao
Abstract Educational institutions collect feedback from students and analyze it to provide an effective teaching experience and knowledge to students. The laborious traditional process of collecting, summarizing and abstracting feedback manually is overcome by collecting it online. Many online methods exist, but finding an effective one is the main problem. In this paper, a sentimental analysis model using long short-term memory is proposed for reviewing students' feedback. The proposed model is compared with the Naive Bayes, decision tree and random forest methods. The results obtained in this work show that the proposed method is more accurate than the other machine learning methods considered. Keywords Feedback · Sentimental analysis (SA) · Recurrent neural networks (RNN) · Long short-term memory (LSTM) · Machine learning
S. S. Reddy (B) · M. Gadiraju, Department of CSE, S.R.K.R. Engineering College, Bhimavaram, India
V. V. R. Maheswara Rao, Department of CSE, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_39
1 Introduction Feedback is information given to an individual or a group about current and previous behavior so that performance can be improved to achieve the desired outcome. It is a process that helps students to improve their confidence and enthusiasm toward learning, and helps the organization to evaluate and monitor the overall working environment and to improve the teaching and learning experience. It helps in adopting new knowledge and preventing repetitive mistakes in education and learning criteria [1]. Educational institutions collect feedback from students to review the courses and facilities provided by the institution and to enhance the quality of education. Previously, feedback was collected manually by the educational institutions and then analyzed, which takes a lot of time and effort. Moreover, students have to be present when submitting the feedback. This kind of process is therefore slow and difficult to analyze [2]. At present, the analysis of student feedback has developed gradually. Feedback is collected from students through an online process in which students can submit their feedback wherever they are. A grading technique is employed in this process: a few questions are given to the students, and they have to respond to every question on a point scale [3]. The entire process is evaluated by the course coordinator on themes such as lecture organization, presentation, instructor punctuality and student counseling. Here, only a grade is provided by the students; the textual review is not captured and is not analyzed. This grading technique does not reveal the exact sentiment of the students, whereas textual reports provide an opportunity to highlight specific aspects that can be improved. To overcome these issues, in this project, feedback is collected online through Google Forms, including textual feedback. The textual feedback is analyzed using sentimental analysis with machine learning and is classified into positive and negative reviews. A textual review of the feedback is thus achieved. Hence, the difficulty is reduced, and the analysis of feedback is done quickly and efficiently. Opinion mining, also called sentimental analysis, identifies the positive or negative class depending on the orientation of the textual content.
This task assigns polarity labels such as positive, negative or neutral depending on the comments given by the students [4]. In general, there are three approaches in sentimental analysis. Machine learning models have been used to categorize a sentence as positive or negative. In the lexicon-based approach, the sentiment polarity expressed in the textual content is determined using a sentiment lexicon. There are many supervised learning approaches for analyzing the data, and LSTM is one among them. In this work, the data is analyzed using this neural network model, which obtained better accuracy and results.
2 Literature Survey Leung et al. [5] analyzed reviews submitted by customers using sentiment analysis. They used a multipoint rating scale, an approach called rating inference, and developed a model that classifies reviews as either positive or negative. Yang et al. [6] studied the sentiment polarity of specific targets and implemented LSTM, a type of neural network, to extract more effective sentiment features. They further developed a MemNet network that helps in improving the sentiment classification result. Wu et al. [7] considered a rule-based approach to address the OTE and ATE tasks. Opinion target extraction (OTE) and aspect term extraction (ATE) are important tasks in sentiment analysis. They obtained pseudo-labeled data using chunk-level linguistic rules and trained a deep gated recurrent unit network for ATE and OTE. This unsupervised method thus helps in opinion target extraction. Rajput et al. [8] performed sentiment analysis on Twitter datasets using a novel metaheuristic method based on k-means and hybrid cuckoo search. The experimental results show that it performs better than the existing methods. Fortuny et al. [9] proposed a custom-built expert system that analyzed 68,000 published news articles. They developed an opinion mining method that detects the sentiment of an article automatically, within a framework that introduces a generic text mining method to analyze media coverage of politics. Zhang et al. [10] introduced an entity-level SA technique to analyze tweets. The training dataset is labeled with sentiment polarities using an opinion lexicon. A binary classifier is trained on the labeled dataset and used to calculate sentiment polarity on the evaluation dataset. Their results showed that the proposed method achieves a better F-score and recall. Hu and Liu [11] proposed a classic text summarization method that selects original sentences from the reviews to capture their important points and predicts the semantic orientation using a lexicon-based method. In this way, they summarized customer reviews. Altrabsheh et al. [12] compared several machine learning algorithms for the analysis of students' feedback. The algorithms used to train on students' feedback were Naive Bayes, SVM and complement Naive Bayes (CNB). They classified students' moods, i.e., confusion and boredom, using these methods and found that SVM had the highest accuracy. Appel et al. [13] developed a hybrid approach that relies on a sentiment lexicon expanded using fuzzy sets and SentiWordNet to decide the sentiment polarity of a sentence. The method is applied to the datasets, and the results are compared with those of the Naive Bayes and maximum entropy techniques in terms of accuracy and precision. Turney [14] proposed an approach to classify reviews as recommended or not recommended. Reviews are classified using the average semantic orientation: a phrase with good associations has a positive semantic orientation, whereas a phrase with bad associations has a negative semantic orientation. This algorithm achieves an average accuracy of 74%. Rajput et al. [15] analyzed the sentiment polarity of student reviews using a lexicon approach. The polarity in the academic domain is determined by a modified general-purpose sentiment dictionary. The domain-specific sentiment lexicon achieved better results than the general-purpose sentiment lexicon. The lexicon approach is further classified into corpus-based and dictionary-based methods for determining the polarity of the sentiment. Manoharan [16] worked on hierarchical multi-label classification for text data and proposed a capsule network algorithm. The author applied different machine learning methods to the data along with the proposed model, compared them, and found that the proposed model performed better than its counterparts.
S. S. Reddy et al.
Shriram and Kumar [17] applied a bidirectional recurrent neural network (BRNN) to reviews for sentiment extraction. The BRNN is mainly applied to Indian regional languages to obtain high accuracy. The method is also applied to Twitter data, and the results show that the BRNN achieves higher accuracy than other existing methods. Kumar et al. [18] considered a hybrid recommendation system that combines collaborative filtering and content-based filtering. Their system uses weighted score fusion to improve recommendations, relying mainly on sentiment analysis of previous reviews given by viewers. Experimental results showed that the proposed model gives better recommendations than the other models. Se et al. [19] classified reviews as positive, negative or neutral using SVM, a MaxEnt classifier, Naive Bayes and a decision tree. SVM achieved an accuracy of 75.9%, which is better than the other algorithms. Soelistio et al. [20] applied text mining to digital newspapers to perform political sentiment analysis using machine learning. Political articles are classified as either positive or negative using a Naive Bayes classifier.
3 System Architecture
As shown in Fig. 1, a dataset containing text and sentiment labels is taken. The dataset is preprocessed using methods such as data reduction and data cleaning. The training dataset is then used to train models based on Naive Bayes, random forest, decision tree and LSTM. The model with the highest accuracy is chosen and applied to the student feedback dataset, and the results classify the data into positive or negative reviews. Here, the LSTM model obtained the highest accuracy, and it is chosen.
Fig. 1 System architecture
Analyzing Student Reviews on Teacher Performance …
4 Implementation
4.1 Dataset
The dataset considered is “sentimental analysis on student feedback” from kaggle.com. The text dataset contains two columns: a sentence and the sentiment assigned to it, which is either 0 (negative) or 1 (positive). The labeled dataset contains 5200 rows in total, of which 2600 are positive sentences and 2600 are negative sentences.
4.2 Data Preprocessing
The preprocessing is done by the following modules:
• Tokenization of sentences
• Removal of stop words.
Tokenization of Sentences. Once the text is given as input, the sentences have to be tokenized. Tokenization splits a text into a list of tokens. Two libraries are imported: “re” (regular expression operations), which is used for substitution of text, and “nltk” (Natural Language Toolkit), which is used for tokenization. From nltk.tokenize, word_tokenize is imported to tokenize the sentences, and Tokenizer is imported from the keras.preprocessing.text package to assign integer values to the tokens for processing. The stop word and punctuation resources are then downloaded from nltk so that stop words and punctuation can be removed from a given sentence. After that, the text is converted to lower case using the lower function, and non-ASCII characters, emojis and other special characters are removed.
Removal of stop words. In NLP, words that carry little meaning are called “stop words.” Commonly occurring examples are “a,” “an,” “the” and “in.” The stop word lists are stored in the corpora folder of the nltk_data directory. These words are removed because they do not help in finding the true meaning of the sentence.
The preprocessed dataset containing 5200 records is split into two datasets using the train_test_split function. About two-thirds of the data is used to train the model, and the remaining one-third is used to test it. The train set contains 3484 elements, and the test set contains 1716 elements. The training data is used to train models using the Naive Bayes classifier, decision tree, random forest and LSTM. The test data is used for evaluating the algorithms.
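The preprocessing steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the standard library instead of nltk, the short stop word set is a hypothetical stand-in for the NLTK English list, and the helper names preprocess and split_sizes do not come from the chapter.

```python
import re

# Tiny illustrative stop-word set (an assumption); the chapter uses the NLTK list.
STOP_WORDS = {"a", "an", "the", "in", "is", "at", "he", "she", "and", "of", "to"}


def preprocess(sentence):
    """Lower-case, drop non-ASCII characters, tokenize and remove stop words."""
    text = sentence.lower()
    # Encoding to ASCII with errors="ignore" discards emojis and other non-ASCII symbols.
    text = text.encode("ascii", errors="ignore").decode("ascii")
    tokens = re.findall(r"[a-z]+", text)  # keep alphabetic tokens only
    return [t for t in tokens if t not in STOP_WORDS]


def split_sizes(n_rows, test_percent=33):
    """Row counts (train, test) for a hold-out split with the given test share."""
    n_test = n_rows * test_percent // 100
    return n_rows - n_test, n_test


print(preprocess("He is GOOD at teaching 😀!"))
print(split_sizes(5200))
```

With a 33% test share, the split sizes match the 3484 and 1716 elements reported above; the exact test_size argument passed to train_test_split is not stated in the text, so this value is an inference from those counts.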
5 Algorithms
5.1 Naive Bayes Classifier
It classifies data based on event probabilities. It is commonly used in text mining, and it performs well in many text mining problems. As with other machine learning models, a training dataset is needed for each class. In this work, sentence classification is considered, i.e., a sentence is classified as positive or negative, represented by 1 or 0. With the training dataset, the model is trained to categorize a sentence using the probability given in Eq. (1).

$$P(A_1 \mid A_2) = \frac{P(A_2 \mid A_1)\, P(A_1)}{P(A_2)} \tag{1}$$

Consider the example sentence “He is good at teaching.” The probability of class “1 (positive)” and the probability of class “0 (negative)” have to be determined for the given sentence, and the class with the highest probability is chosen, i.e., P(Positive | He is good at teaching) or P(Negative | He is good at teaching). Bayes’ theorem is used to compute these probabilities:

$$P(\text{Positive} \mid \text{He is good at teaching}) = \frac{P(\text{He is good at teaching} \mid \text{Positive})\, P(\text{Positive})}{P(\text{He is good at teaching})}$$

$$P(\text{Negative} \mid \text{He is good at teaching}) = \frac{P(\text{He is good at teaching} \mid \text{Negative})\, P(\text{Negative})}{P(\text{He is good at teaching})}$$

If the exact sentence does not occur in either class of the training dataset, the estimated probability becomes zero, which makes the estimate useless. To overcome this problem, the sentence is split into words, so that individual words are considered instead of entire sentences. (Here, the positive class is represented by T and the negative class by N.)

$$P(\text{He is good at teaching} \mid T) = P(\text{He} \mid T)\, P(\text{is} \mid T)\, P(\text{good} \mid T)\, P(\text{at} \mid T)\, P(\text{teaching} \mid T)$$

$$P(\text{He is good at teaching} \mid N) = P(\text{He} \mid N)\, P(\text{is} \mid N)\, P(\text{good} \mid N)\, P(\text{at} \mid N)\, P(\text{teaching} \mid N)$$
Now, compute the probabilities in the above equations:

$$P(\text{word} \mid \text{class}) = \frac{\text{No. of times the word appears in the class}}{\text{Total no. of words in the class}}$$

To count how many times a word appears in a class, CountVectorizer from sklearn can be used. A term-document matrix (TDM) is built for each class using the count vectorizer. A TDM lists the frequencies of the words occurring in a set of documents. The TDM is computed for the positive and the negative class, and each cell of the matrix holds the frequency of a word in a sentence. Next, the frequency of each word in each class is obtained from the matrix, and from it the probability of each word in a given class is calculated. A problem arises if a word of the considered sentence does not appear in a class of the training dataset, because the overall probability then becomes zero. Laplace smoothing is used to solve this problem: it assigns a non-zero probability to words that do not exist in a class of the training dataset. The above equation is evaluated for the positive and the negative class, and the two values are compared. The sentence is assigned to the class with the higher value. By applying the Naive Bayes algorithm, an accuracy of about 56.06% is obtained.
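A minimal sketch of the word-level Naive Bayes computation with Laplace smoothing follows. It is not the authors' implementation: the toy sentences and labels are hypothetical, the function names are illustrative, and the usual add-one form of smoothing (alpha = 1, normalized by the vocabulary size) is assumed because the text does not give the exact formula. Log-probabilities are summed instead of multiplying probabilities, which is numerically safer and selects the same class.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Count words per class; docs are token lists, labels are 0 or 1."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_nb(tokens, word_counts, class_counts, vocab, alpha=1.0):
    """Return the class with the highest smoothed log-posterior."""
    total_docs = sum(class_counts.values())
    scores = {}
    for c in class_counts:
        n_words = sum(word_counts[c].values())
        score = math.log(class_counts[c] / total_docs)  # log prior P(class)
        for w in tokens:
            # Laplace smoothing: an unseen word gets a small non-zero probability.
            p = (word_counts[c][w] + alpha) / (n_words + alpha * len(vocab))
            score += math.log(p)
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data (1 = positive, 0 = negative).
docs = [["good", "teaching"], ["very", "good", "explanation"],
        ["not", "good"], ["boring", "lecture"]]
labels = [1, 1, 0, 0]
model = train_nb(docs, labels)
print(predict_nb(["good", "teaching"], *model))
```

The denominator P(sentence) from Eq. (1) is omitted because it is the same for both classes and does not change which class wins.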
5.2 Decision Tree
This algorithm is used for classification and creates a tree structure. The tree is developed by breaking the dataset into smaller and smaller subsets, and the result consists of decision nodes, internal nodes and leaf nodes. A decision node may have two or more branches, and it can handle both categorical and numerical values. An internal node denotes a test on an attribute and is also called a test node, each branch denotes an outcome of the test, and a leaf node represents the classification, i.e., positive or negative. After the model is trained with the training dataset, it is evaluated using the testing data, and the evaluation parameters are obtained. The trained model produces a decision tree for the feedback data. First, the whole training dataset is considered as the root. Then, the attribute to test is identified for the root and at every subsequent level. Based on the value of that attribute, the records are distributed, and a decision tree is constructed for the whole dataset. Attributes can be selected using two measures, information gain and the Gini index. Here, the decision tree is constructed using information gain. Take the example “He is good at explanation.” Information gain measures the change in entropy, so the entropy has to be computed first. After tokenization and stop word removal, the sentence becomes [He, good, explanation]. The number of instances for each word is counted, and the entropy is computed from these counts. Using information gain, the tree is built by choosing the attributes that label the training instances associated with the root node. From these instances, the sub-trees and paths are constructed. If only positive training instances remain at a node, the node is represented as “1,” whereas if only negative training instances remain, the node is represented as “0.” By applying the decision tree algorithm, an accuracy of 78% is obtained.
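The entropy and information gain used to pick an attribute can be illustrated as follows. This is a sketch under assumptions: each attribute is taken to be the presence or absence of a word in a sentence, the toy data is hypothetical, and the function names are not from the chapter.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, word):
    """Entropy reduction from splitting on whether `word` occurs in a sentence."""
    with_word = [y for tokens, y in zip(docs, labels) if word in tokens]
    without = [y for tokens, y in zip(docs, labels) if word not in tokens]
    n = len(labels)
    # Weighted entropy of the two subsets; empty subsets contribute nothing.
    remainder = sum(len(part) / n * entropy(part)
                    for part in (with_word, without) if part)
    return entropy(labels) - remainder


# Hypothetical toy data (1 = positive, 0 = negative).
docs = [["good"], ["good", "teaching"], ["bad"], ["boring"]]
labels = [1, 1, 0, 0]
best_word = max(["good", "teaching", "bad"],
                key=lambda w: information_gain(docs, labels, w))
print(best_word)
```

The word with the largest gain would be tested at the root; the same selection is then repeated on each subset to grow the sub-trees.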
5.3 Random Forest
Random forest is based on bootstrap sampling: it contains “n” decision trees, and an output is obtained from each decision tree. A prediction is made by aggregating the outputs of the decision trees, using a majority vote for classification and the average of the outputs for regression. Each decision tree is trained on a random sample of the training data that is smaller than the full dataset. To classify a sentence from the test data, the sentence is sent to all “n” decision trees. Each decision tree gives an output based on the sentiment of the words, and the majority of the outputs obtained from the decision trees decides whether the review is classified as negative or positive. Suppose six sentences are given as listed below.
0: He is good at teaching.
1: He is good at explanation but he tells stories instead of lessons.
2: She is good and punctual.
3: Very good teaching.
4: Her way of teaching is good.
5: His lecture delivery is not so good.
A random forest is formed from these sentences. First, “n” decision trees are constructed as explained above, each from two or more sentences. Each decision tree gives a result, and the sentence is classified as positive or negative based on the majority of the results. By applying the random forest algorithm, an accuracy of 80% is obtained.
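The two ingredients described above, bootstrap sampling and majority voting, can be sketched as follows. To keep the example short, each “tree” is a depth-1 stump that tests a single randomly chosen word; this is a simplification for illustration, not the model used in the chapter, and the toy data, labels and function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels, default=1):
    """Most common label, or `default` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(docs, labels, rng):
    """Depth-1 stand-in for a tree: test for one randomly chosen word."""
    word = rng.choice(sorted({w for tokens in docs for w in tokens}))
    present = majority([y for d, y in zip(docs, labels) if word in d])
    absent = majority([y for d, y in zip(docs, labels) if word not in d])
    return lambda tokens: present if word in tokens else absent


def fit_forest(docs, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw row indices with replacement.
        idx = [rng.randrange(len(docs)) for _ in docs]
        forest.append(fit_stump([docs[i] for i in idx],
                                [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, tokens):
    """Aggregate the outputs of all trees by majority vote."""
    return majority([tree(tokens) for tree in forest])


# Hypothetical toy data (1 = positive, 0 = negative).
docs = [["good", "teaching"], ["good", "punctual"], ["very", "good"],
        ["not", "regular"], ["boring", "lecture"], ["not", "punctual"]]
labels = [1, 1, 1, 0, 0, 0]
forest = fit_forest(docs, labels)
print(predict_forest(forest, ["good", "teaching"]))
```

In practice a library implementation such as sklearn's RandomForestClassifier grows full trees and also samples features at every split; the sketch only shows how sampling and voting fit together.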
5.4 The Proposed Model
The proposed model contains the following layers.
Embedding layer. The embedding layer is the first layer in the neural network. It holds one vector per word and converts a sequence of word indices into a sequence of vectors. The vectors are trainable, and after training, words with similar meaning often have similar vectors. Simply put, word embeddings represent text in the form of numerical data.
LSTM Cell. LSTM stands for long short-term memory. It is a type of RNN architecture designed to retain previous information, which is why its unit is called a memory cell. An LSTM unit comprises a memory cell, a hidden state, an input gate, a forget gate and an output gate. In the LSTM model, two LSTM neural networks run in parallel, one on the input sequence and the other on the reverse of the input sequence. A series of vectors and parameters is input to the LSTM model. One network models the text in one direction and the other models it in the opposite direction to accomplish text feature learning and thereby produce the result for sentiment classification.
Flatten. The flatten layer flattens the output of the previous layer into a one-dimensional sequence and feeds it to the next layer.
Dense Layer. A dense layer is a densely connected neural network layer in which each neuron receives input from the neurons of the previous layer. After flattening, the output is connected to the dense layers: a dense layer with the “relu” activation function followed by a dense layer with the “sigmoid” activation function. To execute the model, the Adam optimizer and the binary_crossentropy loss function are used. The model is evaluated for accuracy, and the summary of the model is printed using the code shown in Table 1.
Table 1 Code block for the considered algorithms
    from keras import optimizers
    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, Flatten, Dense

    # max_vocab_size, max_words and learning_rate are assumed to be defined beforehand
    model = Sequential()
    model.add(Embedding(max_vocab_size, 128, input_length=max_words))
    model.add(LSTM(400, dropout=0.5))  # the LSTM layer has to be added to the model
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    adam = optimizers.Adam(learning_rate=learning_rate)
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    model.summary()
Testing model. For testing the model, the model.predict() function is used to predict the sentiments of the test set. An accuracy of 83.83% is obtained. Among the above models, the LSTM model obtained the highest accuracy, and it is chosen as the best model.
6 Result Analysis
The trained models are evaluated on the testing data to obtain accuracy, precision, recall and F1 score. The best model is then selected from the considered models based on accuracy.
Confusion Matrix. The confusion matrix holds the true positive (T1), false positive (F1), true negative (T2) and false negative (F2) counts, from which all the evaluation metrics can be computed.
Accuracy (A). Accuracy is the ratio of the number of correct predictions to the total number of input records. Equation (2) is used for computing accuracy.

$$A = \frac{T1 + T2}{T1 + T2 + F2 + F1} \tag{2}$$

Precision (P). Equation (3) is used for computing precision.

$$P = \frac{T1}{T1 + F1} \tag{3}$$

Recall (R). The sensitivity, recall or true positive rate of an algorithm is obtained using Eq. (4).

$$R = \frac{T1}{T1 + F2} \tag{4}$$

F1 score (F). It is the harmonic mean of the precision and recall. Equation (5) is used for computing the F1 score.

$$F = \frac{2\,T1}{2\,T1 + F1 + F2} \tag{5}$$
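Equations (2) to (5) translate directly into code. The sketch below keeps the chapter's notation as argument names; the function name and the counts in the usage line are hypothetical and chosen only for illustration.

```python
def classification_metrics(t1, f1, t2, f2):
    """Metrics from confusion-matrix counts, in the chapter's notation:
    t1 = true positives, f1 = false positives,
    t2 = true negatives, f2 = false negatives."""
    accuracy = (t1 + t2) / (t1 + t2 + f2 + f1)   # Eq. (2)
    precision = t1 / (t1 + f1)                   # Eq. (3)
    recall = t1 / (t1 + f2)                      # Eq. (4)
    f_score = 2 * t1 / (2 * t1 + f1 + f2)        # Eq. (5)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f_score}


# Hypothetical counts, not taken from the figures.
metrics = classification_metrics(t1=8, f1=2, t2=6, f2=4)
print(metrics)
```

Equation (5) is algebraically the same as the harmonic mean 2PR/(P + R), which can be checked on the values returned above.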
6.1 Confusion Matrix Obtained for the Considered Algorithms
In Fig. 2, true positive value is 736, false negative is 116, false positive is 719 and true negative is 145. In Fig. 3, true positive value is 704, false negative is 148, false positive is 226 and true negative is 638. In Fig. 4, true positive value is 773, false negative is 79, false positive is 248 and true negative is 616. In Fig. 5, true positive value is 754, false negative is 96, false positive is 206 and true negative is 658.
Fig. 2 Naive Bayes
Fig. 3 Decision tree
Fig. 4 Random forest
Fig. 5 LSTM
6.2 Results
Table 2 shows the evaluation metrics of the various algorithms. In Fig. 6, the evaluation metrics accuracy, precision, recall and F1 score are used for comparison: accuracy is plotted in red, precision in green, recall in blue and F1 score in yellow. The model is implemented using several algorithms, the accuracies of these models are compared, and the best model is chosen. First, the Naive Bayes algorithm is used, and an accuracy of 51% is obtained. NB has a lower accuracy because it assumes that the features are independent, whereas the data in the considered dataset depend on one another. Therefore, further algorithms are applied to create a model with better accuracy. The decision tree obtains 78% accuracy, and the random forest, which is a group of several decision trees, obtains 81% accuracy. In addition to these algorithms, a long short-term memory network containing several layers, i.e., embedding layer, LSTM cell, flatten layer and dense layer, is used and obtains 83.3% accuracy. As LSTM is better suited to sentiment analysis, it is proposed, and it gives better results. After comparing the accuracies of all these models, LSTM has the highest accuracy.

Table 2 Comparative results of considered algorithms
| Algorithms | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%) |
|---|---|---|---|---|
| Naive Bayes | 51 | 56 | 16.7 | 25.7 |
| Decision tree | 78 | 80.8 | 73.7 | 77.8 |
| Random forest | 81 | 88.6 | 71.2 | 79 |
| LSTM | 83.3 | 85.4 | 78.2 | 82.2 |

Fig. 6 Graph to compare different scores
6.3 Demonstrating Proposed Algorithm Using Testcases
In addition to the dataset used in training and testing, real-time student feedback is used for demonstrating the LSTM model and is loaded as shown in Fig. 7. There are 744 reviews in total in this feedback set. The proposed model is applied to this student feedback dataset and classifies the 744 reviews into 425 positive reviews and 319 negative reviews. This is illustrated in Figs. 8 and 9; Fig. 9 shows the numbers of positive and negative reviews as a bar graph.
Test case. The sample test case code is shown in Fig. 10, and its output is shown in Fig. 11.
Input. “not regular to class and he is not punctual.”
Expected Output. Negative review.
Fig. 7 Code for loading dataset
Fig. 8 Code for displaying output of model
Fig. 9 Bar graph for representing no. of positive and negative reviews
Fig. 10 Test case code
Fig. 11 Test case output
7 Conclusion
A novel model is proposed in this work for effortless and efficient analysis of students’ feedback in an educational organization. The feedback was also analyzed using several trained models, and the long short-term memory network model achieved better accuracy than its counterparts. The developed model achieved not only the best accuracy but also the best recall and F1 score among the considered models (Table 2). Sentiment analysis of students’ feedback helps teachers to build on their strengths and to address weaknesses in particular areas. As future work, more datasets containing more reviews could be considered to obtain a still better model.
References
1. Z. Nasim, Q. Rajput, S. Haider, Sentiment analysis of student feedback using machine learning and lexicon based approaches, in 2017 International Conference on Research and Innovation in Information Systems (ICRIIS). IEEE (2017), pp. 1–6
2. M.A. Ullah, Sentiment analysis of students feedback: a study towards optimal tools, in International Workshop on Computational Intelligence. IEEE (2016), pp. 175–180
3. K.S. Krishnaveni, R.R. Pai, V. Iyer, Faculty rating system based on student feedbacks using sentimental analysis, in 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI). IEEE (2017), pp. 1648–1653
4. X. Chen, Y. Rao, H. Xie, F.L. Wang, Y. Zhao, J. Yin, Sentiment classification using negative and intensive sentiment supplement information. Data Sci. Eng. 4(2), 109–118 (2019)
5. C.W. Leung, Sentiment analysis of product reviews, in Encyclopedia of Data Warehousing and Mining, 2nd ed. (IGI Global, 2009), pp. 1794–1799
6. C. Yang, H. Zhang, B. Jiang, K. Li, Aspect-based sentiment analysis with alternating coattention networks. Inform. Process. Manage. 56(3), 463–478 (2019)
7. C. Wu, F. Wu, S. Wu, Z. Yuan, Y. Huang, A hybrid unsupervised method for aspect term and opinion target extraction. Knowl.-Based Syst. 148, 66–73 (2018)
8. A.C. Pandey, D.S. Rajpoot, M. Saraswat, Twitter sentiment analysis using hybrid cuckoo search method. Inform. Process. Manage. 53(4), 764–779 (2017)
9. E.J. de Fortuny, T. De Smedt, D. Martens, W. Daelemans, Media coverage in times of political crisis: a text mining approach. Expert Syst. Appl. 39(14), 11616–11622 (2012)
10. L. Zhang, R. Ghosh, M. Dekhil, M. Hsu, B. Liu, Combining lexicon-based and learning-based methods for Twitter sentiment analysis. Technical Report, HP Laboratories HPL-2011 (2011), p. 89
11. M. Hu, B. Liu, Mining and summarizing customer reviews, in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, 2004), pp. 168–177
12. N. Altrabsheh, M. Cocea, S. Fallahkhair, Learning sentiment from students’ feedback for realtime interventions in classrooms, in Adaptive and Intelligent Systems (ICAIS 2014). LNCS, vol 8779, ed. by A. Bouchachia (Springer, Cham, 2014), pp. 40–49
13. O. Appel, F. Chiclana, J. Carter, H. Fujita, A hybrid approach to the sentiment analysis problem at the sentence level. Knowl.-Based Syst. 108, 110–124 (2016)
14. P.D. Turney, Thumbs up or thumbs down? semantic orientation applied to unsupervised classification of reviews, in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (Association for Computational Linguistics, USA, 2002), pp. 417–424
15. Q. Rajput, S. Haider, S. Ghani, Lexicon-based sentiment analysis of teachers’ evaluation. Appl. Comput. Intell. Soft Comput. (2016)
16. J.S. Manoharan, Capsule Network Algorithm for performance optimization of text classification. J. Soft Comput. Paradigm (JSCP) 3(01), 1–9 (2021)
17. R.G. Kumar, R. Shriram, Sentiment analysis using bi-directional recurrent neural network for Telugu movies. Int J. Innov. Technol. Explor. Eng. 9(2), 241–245 (2019)
18. S. Kumar, K. De, P.P. Roy, Movie recommendation system using sentiment analysis from microblogging data. IEEE Trans. Comput. Social Syst. 7(4), 915–923 (2020)
19. S. Se, R. Vinayakumar, M.A. Kumar, K.P. Soman, Predicting the sentimental reviews in tamil movie using machine learning algorithms. Indian J. Sci. Technol. 9(45), 1–5 (2016)
20. Y.E. Soelistio, M.R.S. Surendra, Simple text mining for sentiment analysis of political figure using Naive Bayes classifier method (2015). arXiv preprint arXiv:1508.05163
A Novel Ensemble Method for Underwater Mines Classification
G. Divyabarathi, S. Shailesh, M. V. Judy, and R. Krishnakumar
Abstract Naval and military research pays significant attention to targets within the seabed. Identification of hidden mines is a crucial research problem, and sonar images aid the process by providing signal features. Previous studies on underwater mines classification have used standalone classification algorithms, which lack the ability to generalize and are prone to errors. Driven by this challenge, an ensemble approach that stacks machine learning classifiers is presented in this research work. In addition, the lack of sufficiently dense data has been overcome by generating synthetic data from the data available in the repository. The proposed approach is tested with synthetic as well as real-time datasets and reaches a classification accuracy and F1-score of 91%. Observations reveal that the proposed model performs better than individual classifiers on the Mines versus Rocks data.
Keywords Classification · Ensemble · Mines and rocks · Sonar · Synthetic data
1 Introduction
Underwater environments are complex to explore because of characteristic features such as canyons, seamounts, hydrothermal vents and methane seeps [1]. In the water medium, acoustic sonar systems reach the deep and wide seafloor better than optical systems. Exploring targets lying on the seabed helps the navy to identify mines and other artifacts. Detecting mines on the seafloor is one of the significant tasks in the military field. Active sonars are used to assist the defense system in finding underwater mines, which are harmful to underwater habitats.
G. Divyabarathi (B) · S. Shailesh · M. V. Judy
Department of Computer Applications, Cochin University of Science and Technology, Kochi, Kerala, India
R. Krishnakumar
School of Computing, SASTRA University, Thanjavur, Tamil Nadu 613401, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_40
Fig. 1 Diagrammatic representation of seabed scanned using active sonar
The active sonar signals sent from the surface are reflected back from the seabed and recorded, as shown in Fig. 1. With the signal attributes as data for machine learning algorithms, mines and rocks can be classified effectively for exploration. Ensemble learning is a method of using a combination of machine learning models to classify the data better [2]. In ensemble learning, accurate and diverse prediction is achieved by combining the predictions of the individual models. Sonar signals collected from underwater studies are classified as mines or rocks based on their signal attributes using machine learning ensemble models. We have blended logistic regression, support vector machine, K-nearest neighbor and Naive Bayes models as base learners, with random forest as the meta-classifier, which receives the same input as the base classifiers in addition to their outputs, to customize sonar classification with high precision. The previous studies indicate that the data available for machine learning models is limited and that individual ML classifier models attain lower precision. These challenges motivated the authors to propose the method for mines classification. The key contributions of this paper are as follows: (i) synthetic data has been generated for the experiment, since the underwater acoustic data available in public repositories is inadequate for research purposes; (ii) the classifier has been modified with the aim of increasing the precision with which mines are distinguished from rocks.
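One way to express the stacking arrangement described above is scikit-learn's StackingClassifier. The sketch below is an assumption about how such an ensemble could be assembled, not the authors' code: the hyperparameters are illustrative defaults, passthrough=True is used to model the meta-classifier receiving the original input alongside the base predictions, and the data is a generated stand-in with 60 numeric features rather than the sonar dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in data with 60 numeric features (not the real sonar measurements).
X, y = make_classification(n_samples=400, n_features=60,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Base learners named in the text; the settings are assumptions.
base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("nb", GaussianNB()),
]

stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    passthrough=True,  # meta-classifier also sees the original features
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"hold-out accuracy on stand-in data: {accuracy:.3f}")
```

StackingClassifier trains the meta-classifier on cross-validated predictions of the base learners, which limits leakage from the training labels into the second stage.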
1.1 Related Works
Ensemble methods are used for classification to capture diversity in the dataset, such as data diversity, structural diversity and parameter diversity. Diversity explains and justifies the improved performance of an ensemble classifier over the individual classifiers in it [2]. In general, research on combining models with ensemble techniques has produced outcomes with lower error rates for classification tasks [3–7]. In [3], recognition performance improved by 35% when an ensemble of boosting and mixture-of-experts methods was used on a large-vocabulary continuous speech dataset. Aziz et al. found that cascading ensemble classifiers work best for datasets containing numerical predictors. Among the datasets tested with cascaded ensemble classifiers, the Sonar dataset achieved a win rate of 87.5% using a cascade based on predictor variables [4]. In [5], a probabilistic framework for combining ensemble models was evaluated for minimum error on 73 benchmark datasets. The authors found that the framework is not suitable for every dataset and that Naive Bayes-based ensemble combinations performed optimally. They further observed that Naive Bayes-based schemes can work well with balanced datasets [5]. Abdelaal et al. classified Arabic tweets based on linguistic characteristics; their results show that the ensemble model improved accuracy as the number of base classifiers stacked in the ensemble increased [6]. In [7], synthesized sonar data with four targets was classified more effectively with ensemble models than with single classifiers. The literature [4–7] reveals the opportunity to build our sonar ensemble model for better performance and efficiency. The rest of this paper is organized as follows. Section 2 presents the required preliminaries, and Section 3 describes the materials and methods used for the classification. Section 4 discusses the outcomes of the experiment in comparison with the other models. Finally, Section 5 concludes with the future scope.
2 Preliminaries
2.1 Support Vector Machine
Support Vector Machine (SVM) is a supervised learning method and a discriminative classifier [8] that defines a separating hyperplane between the data points. In two-dimensional space [9], the hyperplane is a line dividing the plane into two parts, with each class lying on either side. In [10], an SVM combination model based on non-linearity was a promising forecasting tool for a meteorological application, where it helped to predict rainfall with quality. Because of its low complexity and its efficiency in binary classification, SVM is conventionally combined with other classifiers in experiments [11]. Similarly, a combination of SVM and pre-processing was used to detect bone cancer [12].
2.2 Logistic Regression
Logistic regression [13] is a supervised statistical technique for finding the probability of a dependent variable. The model uses the logit function, which helps derive a relationship between the dependent variable and the independent variables by predicting the probability of occurrence. The logistic function, also known as the sigmoid function, maps the model output to a probability, which is then converted into a binary value for prediction. A study [14] suggested and demonstrated logistic regression as a benchmark base classifier and a popular statistical model used in an ensemble that predicted credit scoring on a large, unbalanced dataset better than decision tree and lasso-logistic regression.
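The relation between the logit, the sigmoid function and the final binary prediction can be sketched as follows. The weights, bias and threshold are hypothetical values chosen only for illustration; in practice they are learned from the training data.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function: maps a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Probability of the positive class for one feature vector."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))  # the logit (log-odds)
    return sigmoid(z)


def predict_label(x, weights, bias, threshold=0.5):
    """Convert the probability into a binary prediction."""
    return 1 if predict_proba(x, weights, bias) >= threshold else 0


# Hypothetical parameters for a two-feature model.
weights, bias = [1.0, -0.5], 0.0
print(predict_label([2.0, 1.0], weights, bias))
print(predict_label([0.0, 4.0], weights, bias))
```

Taking the log of p / (1 − p) for a predicted probability p recovers the linear score z, which is why the logit links the probability to the independent variables.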
2.3 K-Nearest Neighbor
K-nearest neighbors (K-NN) is one of the simplest machine learning algorithms [15], yet it is capable of complex classification tasks. A K-NN model takes a data point and looks at the ‘k’ closest labelled data points [16]. The data point is then assigned the label held by the majority of these ‘k’ closest points. A K-NN-based ensemble model for identifying plant disease improved the precision and recall of the disease classification [17]. A combination of K-NN with the base classifier adabag produced a 10% higher AUC than individual K-NN for flood susceptibility mapping [18].
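The majority-vote rule described above fits in a few lines. This is a minimal sketch assuming Euclidean distance and Python 3.8 or later (for math.dist); the two-dimensional points and their labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` with the majority label of its k nearest training points."""
    distances = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical two-dimensional points: 'R' for rock, 'M' for mine.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["R", "R", "R", "M", "M", "M"]
print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

Choosing an odd k avoids ties in a two-class problem such as mines versus rocks.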
2.4 Naive Bayes Classification
A Naive Bayes classifier is a probabilistic model that discriminates between different objects based on Bayes’ theorem. It is an easy and useful way to predict the class of a high-dimensional test dataset [19] by computing the maximum probability. Abdelaal et al. found that the Naive Bayes classifier improved classification accuracy by 1.6% when used with an ensemble method such as bagging for tweet classification [6].
2.5 Random Forest
A random forest [20] is an ensemble model that makes accurate predictions by combining decision trees. Breiman [15] defined this model based on creating randomized trees on the training set. Data samples are selected randomly, decision trees are generated from them, and a prediction is obtained from each decision tree. The final solution for the task is then obtained by voting. In [21], random forest integrated with weight of evidence improved generalization performance for landslide susceptibility over previous work with a single classifier. Also, in comparison with other classifiers for prediction, random forest gave the best outcome [22].
3 Materials and Methods
Multiple signal attributes of the sonar are given to classify mines from rocks. Mines and rocks belong to the target and non-target classes, respectively. The task is to identify an ensemble of machine learning algorithms that can predict the target with high accuracy and precision. With the benchmark datasets in [5], it is observed that no combination is dominant for all data, because attributes and classes can be numerical, categorical or both. Hence, this experiment seeks to derive a novel ensemble classifier that also takes the characteristics of the sonar dataset into account.
3.1 Dataset
The sonar dataset was obtained from a public repository, the UCI Machine Learning Repository, where it was contributed by Gorman and Sejnowski [23] under the name Connectionist Bench Sonar. The transmitted sonar signal is a frequency-modulated chirp of rising frequency, and the returns were obtained from different aspect angles, spanning 90 degrees for the cylinder and 180 degrees for the rock. Each attribute takes a value in the range 0.0–1.0 and represents the energy within a frequency band over a period of observation [23]. The dataset contains 60 signal columns and one class label column. Each of the 208 instances carries the class label ‘R’ for rock or ‘M’ for mine; 111 patterns are characterized as mines and 97 as rocks. Because the publicly available data is too small for the model to learn from, new data was generated in this work from a Gaussian (normal) distribution. The synthetic dataset was generated as shown in Fig. 2 to train the model with sufficient data. As a first step, the dataset is split into the rock class and the mine class. The mean and covariance of each class are then calculated, and new random values for rocks and mines are drawn from those parameters using the multivariate normal method of the NumPy package. The data for the two classes is then concatenated to form a single dataset for the classification task. In this way the machine learning models can be trained on a larger dataset, which improves their performance. Synthetic data was generated with 1000 instances per class. Training and testing with both the original and the synthetic data improves model prediction. Table 1 lists the characteristics of both datasets.
Fig. 2 Synthetic data generation
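The generation procedure of Sect. 3.1 (per-class mean and covariance, then sampling from a multivariate normal) might look roughly like the sketch below. It is not the authors' code: the helper name `synthesize_class`, the random seed and the random stand-in arrays used in place of the real UCI rows are all assumptions made for illustration.

```python
import numpy as np

def synthesize_class(samples, n_new, rng):
    """Draw n_new rows from a Gaussian fitted to one class's samples."""
    mean = samples.mean(axis=0)
    # rowvar=False: rows are observations, columns are attributes.
    cov = np.cov(samples, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_new)

rng = np.random.default_rng(seed=0)

# Stand-in for the real data: random values in [0, 1) with the sonar shapes
# (97 rocks, 111 mines, 60 attributes). Replace with the UCI rows in practice.
rocks = rng.random((97, 60))
mines = rng.random((111, 60))

synthetic_rocks = synthesize_class(rocks, 1000, rng)
synthetic_mines = synthesize_class(mines, 1000, rng)

# Concatenate both classes into one labelled dataset.
X = np.vstack([synthetic_rocks, synthetic_mines])
y = np.array(["R"] * 1000 + ["M"] * 1000)
```

Note that a Gaussian is unbounded, so sampled values can fall slightly outside the 0.0–1.0 range of the original attributes; whether to clip them is a design choice the paper does not discuss.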
Table 1 Dataset characteristics
Dataset | Total observations | No. of mines | No. of rocks | No. of numerical attributes | No. of class labels
Sonar | 208 | 111 | 97 | 60 | 2
Synthetic sonar | 2000 | 1000 | 1000 | 60 | 2
3.2 Proposed Method
Blending is an ensemble technique in which several different machine learning algorithms, selected as base learners and a metaclassifier, are combined to make the prediction. Here, the base classifiers and the metaclassifier are stacked to form the ensemble model. To make accurate predictions on unseen data, logistic regression, SVM, KNN and Naive Bayes models are trained as base classifiers; their predictions are appended to the dataset and fed into a metaclassifier for the final categorization, as shown in Fig. 3.
Fig. 3 Proposed ensemble model for sonar classification
A divide-and-conquer approach is used to improve performance [24]. Individually weak learners are stacked together to form a strong learner for the sonar data. The proposed model is an ensemble learning technique that improves the quality of prediction even when weak single classifiers are combined. To create diversity in the output space, a technique called output smearing [25] is used, by which multiple outputs are created from the base predictors. After the synthetic data is generated, it is fed into the base learners for prediction, and their outputs form a new feature vector that is appended to the data. The concatenated data and feature vectors are then given as input to the chosen metaclassifier, here a random forest, as shown in Fig. 3. The base classifiers and the metaclassifier were chosen as the best-performing combination among those tried. Appending the original data to the input of the random forest allows it to train on the features together with the results produced by the base classifier module. This model improves the precision with which the selected targets are classified.
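One way to approximate the architecture described above is scikit-learn's `StackingClassifier` with `passthrough=True`, which appends the original attributes to the base learners' predictions before they reach the final estimator. This is a hedged sketch, not the authors' implementation: the paper does not say which library or hyperparameters were used, output smearing is not reproduced here, and the data below is a synthetic stand-in produced by `make_classification`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in data with 60 numeric attributes and two classes.
X, y = make_classification(n_samples=400, n_features=60, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The four base classifiers named in Sect. 3.2 (default settings assumed).
base_learners = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("svm", SVC()),
    ("knn", KNeighborsClassifier()),
    ("nb", GaussianNB()),
]

# passthrough=True appends the original attributes to the base predictions,
# so the random forest metaclassifier sees both.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    passthrough=True,
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
```

`StackingClassifier` builds the metaclassifier's training inputs from cross-validated base predictions, whereas classical blending uses a single held-out split; either choice keeps the metaclassifier from training on predictions the base learners made for data they had already seen.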
4 Results and Discussion
This section discusses the implementation of the classifiers on the Connectionist dataset from the UCI repository, with the individual classifiers stacked as an ensemble. Connectionist Bench Sonar (Mines vs. Rocks) is the sonar dataset described in the previous section. The proposed model is built to discriminate between mines and rocks based on sonar signals. The models blended for the classification are logistic regression, K-nearest neighbors, Gaussian Naive Bayes, support vector machines and random forest. We assess the models with binary classification metrics to find the best-performing model for sonar data. The metrics used for evaluation are the confusion matrix, accuracy, precision and recall. Of the 2208 rows in total, 1111 are patterns of metal cylinders and 1097 are patterns of rocks [23], obtained both by observation and by synthetic generation. We split the n = 2208 instances into about 70% for training and 30% for testing, and feed this split into the classifiers. The best-performing model is selected by comparing the metrics of each model in Table 2. The models achieve excellent performance and generalization on both the testing and training sets, which contain 663 and 1545 samples, respectively. Figure 4 shows the confusion matrix for the proposed model; the precision for targets is higher. A confusion matrix summarizes the counts of accurate and inaccurate predictions for evaluating a classifier's performance. When the predicted and actual values are both positive, the result is a true positive. Conversely, when both are negative, it is a true negative. A false positive occurs when the predicted output is positive but the actual value is negative. Similarly, a false negative occurs when the prediction is negative but the actual value is positive [26]. An ensemble learning method learns collectively from various models by combining them.
Overall accuracy of the model is increased because the ensemble classifier corrects the errors of any individual classifier used. Table 2 lists the results of the experiment. The performance of the single and ensemble models has been evaluated using accuracy, precision, recall and F1-score, all calculated from the confusion matrix obtained. From the results listed in Table 2 and Fig. 5, the proposed model has the highest performance on accuracy, precision, recall and F1-score. Random forest and KNN predicted with accuracies of 87% and 86%, respectively. SVM and Naïve Bayes both predicted with 84% accuracy, with SVM having 1 point better recall and F1-score than Naïve Bayes. Random forest performed second best among the individual models, but the proposed model increased accuracy to 91% and the precision for mines to 93% when classifying mines from rocks. Along with accuracy, the recall and F1-score also improved. Thus, it is evident that the proposed model gives better solutions for sonar classification than the individual classifiers. The algorithm stack built in this experiment can be customized for target classification on any dataset, based on the idea of creating diversity and accuracy within the ensemble model [11].
Table 2 Evaluation metrics for the models
Model | Accuracy (%) | Precision, rock (%) | Precision, mine (%) | Recall (%) | F1-score (%)
Logistic regression | 83 | 83 | 84 | 83.5 | 83
K-nearest neighbor | 86 | 92 | 82 | 86 | 86
SVM | 84 | 84 | 86 | 85 | 85
Naïve Bayes | 84 | 83 | 86 | 84 | 84
Random forest | 87 | 84 | 90 | 87.5 | 87
Proposed model | 91 | 89 | 93 | 91.5 | 91
Fig. 4 Confusion matrix for proposed classifier
Fig. 5 ROC of the proposed model
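The metrics used in this section follow directly from the four confusion-matrix counts. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not the values behind Fig. 4 or Table 2.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 for the positive class, from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for illustration only; they are not the values in Fig. 4.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

Per-class precision, as reported separately for rocks and mines in Table 2, is obtained by treating each class in turn as the positive class.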
5 Conclusion
Sonar target detection using an ensemble classifier has been tested on both the original and the synthetic dataset. This work showed that generating synthetic data and classifying sonar targets with the proposed model achieves a best accuracy of 91%. The proposed model improved accuracy and the precision for the mine targets compared with the individual classifiers SVM, K-nearest neighbor, Naive Bayes, logistic regression and random forest. The proposed classifier was tailored to the dataset by combining parametric and non-parametric models. The ensemble of models thus produced improved and reliable predictions. Feature selection could be used to further improve the accuracy attained by the model. Further, applying deep learning techniques to the data may increase the accuracy of predicting mine and rock targets.
Acknowledgements This work is supported by the Department of Science and Technology (DST) Funded Project, DST/ICPS/DIGITAL POOMPUHAR/2017 under Interdisciplinary Cyber Physical Systems in the Department of Computer Applications, Cochin University of Science and Technology, Kochi, Kerala.
References
1. Ocean floor Features, https://www.noaa.gov/education/resource-collections/ocean-coasts/ocean-floor-features
2. Y. Ren, L. Zhang, P.N. Suganthan, Ensemble classification and regression-recent developments, applications and future directions. IEEE Comput. Intell. Mag. 11(1), 41–53 (2016)
3. G.D. Cook, S.R. Waterhouse, A.J. Robinson, Ensemble methods for connectionist acoustic modelling, in Fifth European Conference on Speech Communication and Technology (1997)
4. A.A. Aziz, B. Sartono, Improving prediction accuracy of classification model using cascading ensemble classifiers, in IOP Conference Series: Earth and Environmental Science, vol. 299, no. 1 (IOP Publishing, 2019), p. 012025
5. L.I. Kuncheva, J.J. Rodríguez, A weighted voting framework for classifiers ensembles. Knowl. Inf. Syst. 38(2), 259–275 (2014)
6. H.M. Abdelaal, A.N. Elmahdy, A.A. Halawa, H.A. Youness, Improve the automatic classification accuracy for Arabic tweets using ensemble methods. J. Electr. Syst. Inform. Technol. 5(3), 363–370 (2018)
7. J. Seok, Active sonar target classification using classifier ensembles. Int. J. Eng. Res. Technol. 11, 2125–2133 (2018). ISSN 0974-3154
8. J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, V. Vapnik, Feature selection for SVMs. Adv. Neural Inform. Process. Syst. 13 (2000)
9. SVM - Understanding the math, the optimal hyperplane, https://www.svm-tutorial.com/2015/06/svm-understanding-math-part-3/
10. K. Lu, L. Wang, A novel nonlinear combination model based on support vector machine for rainfall prediction, in Proceedings of IEEE International Joint Conference on Computational Sciences and Optimization (CSO’11) (2011), pp. 1343–1346
11. A. Chandra, X. Yao, Evolving hybrid ensembles of learning machines for better generalisation. Neurocomputing 69(7), 686–700 (2006)
12. B. Jabber, M. Shankar, P. Venkateswara Rao, A. Krishna, C.Z. Basha, SVM model based computerized bone cancer detection, in 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) (IEEE, 2020), pp. 407–411
13. J. Friedman, T. Hastie, R. Tibshirani, Additive logistic regression: a statistical view of boosting. Ann. Stat. 28(2), 337–407 (2000)
14. H. Wang, Q. Xu, L. Zhou, Large unbalanced credit scoring using lasso-logistic regression ensemble. PLoS ONE 10(2), e0117844 (2015). https://doi.org/10.1371/journal.pone.0117844
15. Y. Lin, Y. Jeon, Random forests and adaptive nearest neighbors. J. Am. Stat. Assoc. 101(474), 578–590 (2006)
16. L.E. Peterson, K-nearest neighbor. Scholarpedia 4(2), 1883 (2009)
17. A. Yousuf, U. Khan, Ensemble Classifier for Plant Disease Detection (2021)
18. P. Prasad, V.J. Loveson, B. Das, M. Kotha, Novel ensemble machine learning models in flood susceptibility mapping. Geocarto Int. 1–23 (2021)
19. P.K. Jain, W. Quamer, R. Pamula, Sports result prediction using data mining techniques in comparison with base line model. Opsearch 58, 54–70 (2021). https://doi.org/10.1007/s12597-020-00470-9
20. L. Breiman, Random forests. Mach. Learn. 45(1), 5–32 (2001)
21. W. Chen, Z. Sun, J. Han, Landslide susceptibility modeling using integrated ensemble weights of evidence with logistic regression and random forest models. Appl. Sci. 9(1), 171 (2019)
22. H. Singh, N. Hooda, Prediction of underwater surface target through SONAR: a case study of machine learning, in Microservices in Big Data Analytics (Springer, Singapore, 2020), pp. 111–117
23. R.P. Gorman, T.J. Sejnowski, UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
24. L. Rokach, Ensemble-based classifiers. Artif. Intell. Rev. 33(1–2), 1–39 (2010)
25. L. Breiman, Randomizing outputs to increase prediction accuracy. Mach. Learn. 40(3), 229–242 (2000)
26. P. Zhu, J. Isaacs, B. Fu, S. Ferrari, Deep learning feature extraction for target recognition and classification in underwater sonar images, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC) (IEEE, 2017), pp. 2724–2731
Design of Blockchain Technology for Banking Applications H. M. Anitha, K. Rajeshwari, and S. Preetha
Abstract The emergence of bitcoin has enabled financial transactions across the world. Bitcoin is a successful currency used in business transactions. Financial transactions raise major security issues, and bitcoin offers a secure, decentralized mode of payment. Its main advantages are decentralization and anonymity, which attract many users to adopt it for business-related tasks. This paper presents the benefits of Blockchain, its advantages over traditional banking, and a Blockchain implemented using ISECoins. The proposed work is faster than traditional services in creating wallets.
Keywords Blockchain · Bitcoin · Crypto currency · Block · Client · Validator · ISECoins
1 Introduction
Currency plays a vital role in offline and online transactions for purchasing goods or commodities. Commerce on the Internet has come to depend almost entirely on financial institutions serving as trusted third parties to process electronic payments. This framework works well enough, yet it has certain shortcomings because a tremendous amount of trust is placed in those third parties. Bitcoin was introduced to overcome the problems of centralized money management. Bitcoins are digital coins that are exchanged directly between individuals, without going through a financial institution. Bitcoin is a peer-to-peer version of electronic cash.
H. M. Anitha (B) · K. Rajeshwari · S. Preetha, B.M.S College of Engineering, Affiliated To VTU, Bengaluru, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_41
Since online money is used far more widely than in past years, we need secure channels and exchange methods to ensure that money is sent and received only by legitimate users. This opens the way for substantial improvement in the online money transfer process; our main aim is therefore to contribute a bitcoin exchange application and to improve on the Digital India plan. The decentralized nature of bitcoin [1] is achieved through the blockchain. Data is stored in a distributed manner across the network, so the blockchain avoids the risks of central data storage. However, the distributed storage of replicated data in the blockchain opens up opportunities for fraudulent activities such as double spending, in which a user issues the same coin to two receivers in different transactions. Double spending is avoided by validating only one of the transactions; the second is treated as invalid. In the proposed work, the bitcoin-like currency implemented to enable financial transactions is termed ISECoin. The benefits of bitcoin [2] are listed below.
• Lower fraud risks for buyers
• Users can preserve the coins without inflation risk
• Fees incurred on transactions can be decreased
• Ease of use
• Dependency on third parties is minimal
• Payments can be made faster
• Secure system
• Smart contracts
• Asset distribution
• Wallet-building technology.
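The double-spending rule described before the list (accept the first spend of a coin, reject any later spend of the same coin) can be illustrated with a small sketch. The record layout, field names and function name below are hypothetical; a real blockchain tracks unspent outputs rather than named coins and resolves which transaction counts as "first" through consensus.

```python
def validate_transactions(transactions):
    """Accept the first spend of each coin and reject any later spend of the same coin."""
    spent_coins = set()
    accepted, rejected = [], []
    for tx in transactions:
        if tx["coin_id"] in spent_coins:
            rejected.append(tx)      # coin already spent: double-spend attempt
        else:
            spent_coins.add(tx["coin_id"])
            accepted.append(tx)
    return accepted, rejected

# Hypothetical transactions: Alice tries to pay both Bob and Carol with coin "c1".
txs = [
    {"coin_id": "c1", "sender": "alice", "receiver": "bob"},
    {"coin_id": "c1", "sender": "alice", "receiver": "carol"},
    {"coin_id": "c2", "sender": "alice", "receiver": "carol"},
]
accepted, rejected = validate_transactions(txs)
```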
Blockchain is an emerging technology with features that can improve business using bitcoins. Buyers and sellers both benefit from the decentralized blockchain [3]. Consequently, early adoption of this technology at negligible cost is an attractive move for organizations and consumers alike.
Contribution of the paper
The contributions of the proposed work are:
• Tradeoffs between traditional banking and Blockchain transactions
• Implementation of the Blockchain and bitcoin concept in the banking system to validate the results.
Traditional Banking
Retail banking has evolved over the past few years. Before traditional banking, individuals stored money at temples for safekeeping. Some of the first financial intermediaries were “money changers”, merchants who would take money in one currency and exchange it for another at a profit. Banking as it is known today appeared in the fourteenth century with money order bills and now continues with ATMs. The combination of an outdated institution such as banking and a three-thousand-year-old monetary system is not compatible with the borderless, transparent, digital, democratic norms of today's fast-moving new economy. Banking has undergone numerous changes since its beginning, the most recent being the 2008–2009 financial crisis. When markets collapsed, millions lost their homes and their jobs, causing financial misery throughout the world. Amid this crisis, large banks eliminated their competition and increased their wealth by wiping out smaller financial institutions. Today, banks have not changed much in their services. Banking is cumbersome and painfully slow to serve customers, yet quick to take their money. Banking services are increasingly unaffordable for economically weaker people. Banks levy a large number of fees, and customers are often unaware of them. Some of the known fees include foreign transaction fees, returned deposit fees, early account closure fees, returned item fees, overdraft protection transfer fees, paper statement fees, returned mail fees and human teller fees. The rest of the paper is organized as follows: Sect. 2 presents the literature survey, Sect. 3 describes the proposed approach, Sect. 4 discusses the experimental setup, Sect. 5 presents results and discussions, and Sect. 6 concludes the paper.
2 Literature Survey
Nakamoto described how electronic cash can be sent directly from sender to receiver without going through a financial institution [4]. The proposed system solves the double-spending problem in a distributed manner by hashing and timestamping transactions. The resulting record cannot be changed without redoing the proof of work. Messages are broadcast freely in the network. Van Saberhagen et al. [5] explained that bitcoin does not provide untraceability and unlinkability, since funds that were sent can be traced, and anyone can learn who sent what amount to whom by searching the public database. Several researchers have also found that a great deal of information can be discovered by analysing a public chain. A technology called CryptoNote was proposed as a solution. Bitcoin's distributed nature is inflexible, preventing the implementation of new features until almost all of the network's users update their clients. Some critical flaws that cannot be fixed quickly hold back bitcoin's widespread adoption. Privacy and anonymity are the most important aspects of electronic cash. While several consensus algorithms exist for the Byzantine Generals Problem, specifically as it pertains to distributed payment systems, many suffer from high latency induced by the requirement that all nodes within the network communicate synchronously. A novel consensus algorithm that circumvents this requirement by using collectively trusted subnetworks within the larger network is presented in [6]. The “trust” required of these subnetworks is in fact minimal and can be further reduced with a principled choice of the member nodes. Minimal connectivity is required to maintain agreement throughout the whole network. The outcome is a low-latency consensus algorithm that still maintains robustness in the face of Byzantine failures. David Mazieres presented a new model for consensus called Federated Byzantine Agreement (FBA). FBA achieves robustness through quorum slices: individual trust decisions made by each node that together determine system-level quorums [7]. The work also presents the Stellar Consensus Protocol (SCP), a construction for FBA. Like all Byzantine agreement protocols, SCP makes no assumptions about the rational behaviour of attackers. Unlike prior Byzantine agreement models, which presuppose a unanimously accepted membership list, SCP enjoys open membership that promotes organic network growth. Important participants do not agree to a transaction until the participants they consider important agree as well. Eventually, enough of the network accepts a transaction that it becomes infeasible for an attacker to roll it back. FBA's consensus can ensure the integrity of a financial network, and its decentralized control can spur organic growth. Miers et al. [8] presented Zerocoin, which builds on the principles of bitcoin. Like other e-cash protocols, Zerocoin uses zero-knowledge proofs to prevent transaction graph analyses. Unlike earlier practical e-cash protocols, Zerocoin does not depend on digital signatures to validate coins, nor does it require a central bank to prevent double spending. Instead, Zerocoin authenticates coins by proving, in zero knowledge, that they belong to a public list of valid coins. However, rather than a full-fledged anonymous currency, Zerocoin is a decentralized mix in which users may periodically “wash” their bitcoins via the Zerocoin protocol. Redeeming zerocoins requires double-discrete-logarithm proofs of knowledge, which exceed 45 kB in size and take 450 ms to verify. These proofs must be broadcast through the network, verified by every node and stored in the ledger. The costs involved are orders of magnitude higher than those in bitcoin. Ben-Sasson et al. [9] observed that, although payments are conducted between individuals, bitcoin cannot offer strong privacy: payment transactions are recorded in a public decentralized ledger, from which much information can be deduced. Zerocoin tackles some of these privacy issues by unlinking transactions from the payment's origin. Vorick et al. present Simple Decentralized Storage (Sia) [10], a platform for decentralized storage. Sia enables the formation of storage contracts between peers. Contracts are agreements between a storage provider and its client, defining what data will be stored and at what price. Contracts are publicly auditable in blockchains. Sia will initially be implemented as an altcoin and later financially connected to bitcoin by means of a two-way peg. Sia is a decentralized cloud storage platform that aims to compete with existing storage solutions at both the P2P and the enterprise level. Rather than renting storage from a centralized provider, peers on Sia rent storage from one another. A data-driven trust method using Blockchain is discussed in [11] to investigate internal attacks on Internet of Things (IoT) sensor nodes. Because of the decentralized nature of the Blockchain, faults and failures are minimized. Encryption methods, anonymity and decentralized operation provide advantages, and secured communication is the benefit of the model.
3 Proposed Approach
The idea of bitcoin is hard to grasp in theory, yet it can be implemented, and it can prove to be a robust form of online money. Bitcoin has advantages over conventional bank transactions. It is a decentralized form of money, and there is no specific standard governing it [12]. Bitcoin is the most successful cryptocurrency [13] used for transactions. Forging digital currency is practically impossible because of the cryptography and encryption algorithms involved and the vast amount of computational power it would require. Cryptocurrency has since gained legitimacy with a large population across the world, as well as with governments [14]. It is now accepted by a large worldwide network as a platform for exchanging value. The problems are both fundamental and technical. Proof-of-work computations are very costly, so scaling has become a major issue in terms of both finances and resources. Absolute anonymity has hindered cryptocurrency's acceptance as legal tender in many countries, since it can be used for tax avoidance, tax evasion and other financial fraud. The project aims to build a native Blockchain from scratch so that it is scalable, and to integrate it with government databases to provide pseudonymity while at the same time making it taxable. Public key cryptography is an essential part of bitcoin for ensuring the integrity of messages. Wallets are created mainly by means of public key cryptography. Hashing is applied to the public key to create the public address used for transactions. The private key is used to sign a transaction so that it can be validated. The system architecture shown in Fig. 1 has three components: Blockchain, Blockchain Network and Blocks. ISECoin is the decentralized cryptocurrency that is transacted over the network, without any monitoring or control by a centralized authority.
A user can own any number of such coins. Transactions allow clients of the ISECoin network to spend them. A record of each transaction is maintained, and each transaction is associated with a unique ID. The ISECoin network is a global network in which a registered client can transact from anywhere at any time. It is a peer-to-peer network that operates on a cryptographic protocol in which every transaction is encrypted to ensure security. Mining is responsible for validating completed transactions. Each miner maintains its own database. On successful validation of a transaction, the miner is responsible for adding the block to the Blockchain. The third component, the Blockchain, is a ledger similar to those used in banks to keep a record of each transaction. Each user of the ISECoin network has a copy of the Blockchain. When a new transaction is initiated and approved as valid, a copy of it is stored in every user's Blockchain, so all users are aware of updates to the Blockchain.
Fig. 1 System architecture
Use Case Diagram
A Unified Modelling Language (UML) use case diagram depicts the actors and their actions, as shown in Fig. 2.
Actors:
• Client: A registered client using the ISECoin network for transactions. Any client in the network can send or receive coins.
• Validator/Miner: Responsible for validating and updating transactions.
Use cases:
• Send ISECoin: Sending ISECoin to another client on the ISECoin network.
• Receive ISECoin: Receiving ISECoin from another client of the network.
• Transaction: Sending ISECoin from the ISECoin wallet of one client to that of another client in the network.
• Updating and validation: Validating the transaction and then adding it to the Blockchain.
Fig. 2 Interaction between users: use case diagram
When a client has to make an ISECoin payment, the network receives the payment instruction. The hosts in the network verify this instruction and forward it to other hosts. Later, the payment transaction is included in one of the block updates, and this block is added to the ISECoin Blockchain in the network. Figure 3 depicts a sequence diagram illustrating the interactions of the system with its actors. The process of updating the Blockchain on every computer in the network is known as mining. This is often described as “solving complex mathematical puzzles to win ISECoin”. Mining is the procedure of finding the solution to a mathematical puzzle. A host wins the chance to add the block to the network if it produces the solution before every other host. Validation is then performed quickly by the hosts in the network.
Fig. 3 Sequence diagram
Data Flow Diagram
The Data Flow Diagram (DFD) shown in Fig. 4 is the Level-0 DFD. It is the context-level diagram, which depicts the communication between the system and the user. It shows the interaction of the whole system with external agents, which are usually sources of data. The context diagram portrays the entire system as a single process, with no additional information about the internal operation of its sub-modules. Clients A and B transact using the Blockchain.
Fig. 4 Data flow diagram: level 0 for blockchain
Figure 5 depicts the Level-1 DFD: transactions are initiated, processed, mined and put into a block. The Level-1 diagram is an extended diagram that depicts the internal sub-modules of the system and how they interact with each other to produce the output. Three sub-modules, namely core, webapp and services, are present within the main module. Figure 6 depicts the Level-2 diagram for the Blockchain WebApp module. It shows the data flow between web services, AADHAAR validation and the initiation of transactions.
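The mining step described earlier in this section (hosts race to solve a puzzle, and the solved block is appended to the chain) can be sketched as a simple hash puzzle. Everything in this sketch is an assumption made for illustration: the paper does not specify the ISECoin puzzle, so a Bitcoin-style leading-zeros rule on a SHA-256 digest is used, and the block fields and function names are hypothetical.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 digest of a block's contents, serialized deterministically."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def mine_block(index, transactions, previous_hash, difficulty=3):
    """Search for a nonce that makes the block hash start with `difficulty` zeros."""
    nonce = 0
    while True:
        block = {
            "index": index,
            "transactions": transactions,
            "previous_hash": previous_hash,
            "nonce": nonce,
        }
        if block_hash(block).startswith("0" * difficulty):
            return block
        nonce += 1

def chain_is_valid(chain, difficulty=3):
    """Check that every block meets the puzzle and links to its predecessor's hash."""
    for position, block in enumerate(chain):
        if not block_hash(block).startswith("0" * difficulty):
            return False
        if position > 0 and block["previous_hash"] != block_hash(chain[position - 1]):
            return False
    return True

# Build a two-block chain: a genesis block and one block with a sample transaction.
genesis = mine_block(0, [], previous_hash="0" * 64)
block1 = mine_block(1, [{"from": "A", "to": "B", "amount": 5}], block_hash(genesis))
chain = [genesis, block1]
```

Because each block stores the hash of its predecessor, altering an earlier block changes its hash and breaks the link from the next block, which is what lets other hosts validate a chain quickly without redoing the search.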
4 Experimental Setup
Many agile techniques focus on delivering the Minimum Viable Product (MVP) first, followed by the rest of the requirements. The implementation is organized into three modules.
Core Module: This module handles the core programming logic and computation of the Blockchain, which includes the structure of blocks, wallets, transactions and the Blockchain itself.
WebApp Module: This module contains the views and the front-end logic of the application. It has been developed with a collection of web programming languages and frameworks.
Services Module: This module connects the core module with the WebApp module using sockets, RESTful APIs and asynchronous requests and responses.
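As an illustration of the wallet logic the core module is said to contain, the sketch below derives an address by hashing a public key, following the description in Sect. 3. It is a simplification under stated assumptions: the function name, the choice of SHA-256 alone and the 40-character truncation are hypothetical, and random bytes stand in for a real public key, which would be derived from a private key with an elliptic-curve library.

```python
import hashlib
import secrets

def address_from_public_key(public_key: bytes, length: int = 40) -> str:
    """Derive a short hex address by hashing the public key with SHA-256."""
    return hashlib.sha256(public_key).hexdigest()[:length]

# Placeholder for a real public key: 33 random bytes. A real wallet would
# derive this from a private key with an elliptic-curve library.
placeholder_public_key = secrets.token_bytes(33)
address = address_from_public_key(placeholder_public_key)
```

The derivation is deterministic, so anyone holding the public key can recompute and check the address, while the address alone does not reveal the key.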
5 Results and Discussions
The proposed work can transform the financial system through the use of comprehensive cryptographic algorithms. It helps eliminate third parties from the process and makes the system robust and trustworthy. The computation time needed to process transactions is minimized. Figure 7 shows the dashboard of the ISECoin network, which lets users navigate and send money. Figure 8 shows where the user enters credentials and creates a wallet, and Fig. 9 shows how the user sends money to beneficiaries. In the tests conducted, the average time to create a wallet was around 282 ms and the average transaction confirmation time was 1.2 s, significantly faster than NEFT, which may take up to 3 days to complete.
Future of Blockchain
Bitcoin and Blockchain use a distributed ledger. As discussed in [15], this technology is being adopted in finance, with some standardization approaches under consideration. Security aspects remain open for further research.
Inferences
The system is decentralized, so failures in the network do not affect the functioning of the application. The system is more secure because no third parties are involved and hashing is used. Hence the proposed system is secure and safe.
Fig. 5 Data flow diagram: level 1 blockchain
Fig. 6 Data flow diagram—level 2 for blockchain WebApp
Fig. 7 Dashboard details
H. M. Anitha et al.
Fig. 8 Dashboard and wallet creation
Fig. 9 Send money to beneficiaries
Blockchain is useful for a wide range of applications [16] because of its traceability, speed, robustness and safety. The technology can be used in supply chain management, trading and settlement of amounts. Some problems remain, such as acceptance of the technology by different categories of users and the risk that bitcoins cannot be refunded.
6 Conclusion
The proposed work has been successfully implemented using a native Blockchain method tailored specifically to the Indian financial system. The work brings the benefits of cryptocurrencies such as Bitcoin to the financial sector: protection from inflation, reduced transaction fees, easier cross-border transactions and anonymity. User accountability, money laundering and tax evasion are addressed by linking the system with government databases. The prototype implements a parallel currency that helps counter money laundering and tax evasion. The decentralized approach provides security in financial transactions without the involvement of third parties.
References
1. H. Vranken, Sustainability of bitcoin and blockchains. Curr. Opin. Environ. Sustain. 28, 1–9 (2017)
2. “Benefits of Bitcoin”. https://www.bitcoin.com/get-started/the-benefits-of-bitcoin/. Accessed February 15, 2021
3. M.N.M. Bhutta, A.A. Khwaja, A. Nadeem, H.F. Ahmad, M.K. Khan, M.A. Hanif et al., A survey on blockchain technology: evolution, architecture and security. IEEE Access 9, 61048–61073 (2021)
4. S. Nakamoto, Bitcoin: a peer-to-peer electronic cash system. Bitcoin (2008). URL: https://bitcoin.org/bitcoin.pdf
5. N. Van Saberhagen, CryptoNote v 2.0 (2013)
6. D. Schwartz, N. Youngs, A. Britto, The ripple protocol consensus algorithm. Ripple Labs Inc White Paper 5(8), 151 (2014)
7. D. Mazieres, The stellar consensus protocol: a federated model for internet-level consensus. Stellar Develop. Found. 32 (2015)
8. I. Miers, C. Garman, M. Green, A.D. Rubin, Zerocoin: anonymous distributed e-cash from bitcoin, in 2013 IEEE Symposium on Security and Privacy (IEEE, 2013)
9. E.B. Sasson, A. Chiesa, C. Garman, M. Green, I. Miers, E. Tromer, M. Virza, Zerocash: decentralized anonymous payments from bitcoin, in 2014 IEEE Symposium on Security and Privacy (IEEE, 2014)
10. D. Vorick, L. Champine, Sia: simple decentralized storage. Nebulous Inc (2014)
11. D. Sivaganesan, A data driven trust mechanism based on blockchain in IoT sensor networks for detection and mitigation of attacks. J. Trends Computer Sci. Smart Technol. (TCSST) 3(01), 59–69 (2021)
12. G. Huberman, J.D. Leshno, C. Moallemi, An economist’s perspective on the bitcoin payment system. AEA Papers Proc. 109 (2019)
13. P.D. DeVries, An analysis of cryptocurrency, bitcoin, and the future. Int. J. Bus. Manage. Comm. 1(2), 1–9 (2016)
14. M.L. Di Silvestre, P. Gallo, J.M. Guerrero, R. Musca, E.R. Sanseverino, G. Sciumè et al., Blockchain for power systems: current trends and future applications. Renew. Sustain. Energy Rev. 119, 109585 (2020)
15. M. Xu, X. Chen, G. Kou, A systematic review of blockchain. Financial Innov. 5(1), 1–14 (2019)
16. U. Bodkhe, S. Tanwar, K. Parekh, P. Khanpara, S. Tyagi, N. Kumar, M. Alazab, Blockchain for industry 4.0: a comprehensive review. IEEE Access 8, 79764–79800 (2020)
A Brief Survey of Cloud Data Auditing Mechanism Yash Anand, Bhargavi Sirmour, Sk Mahafuz Zaman, Brijesh kumar Markandey, Anurag Kumar, and Soumyadev Maity
Abstract In today’s digital world, the demand for cloud storage is increasing. The cloud has attracted a vast number of users because it delivers a wide variety of services. Users can access data from the cloud server whenever they need it, and data owners can store any type of data on it. It provides a cost-effective solution for small and medium businesses, releasing them from the burdens of data storage. The term “cloud storage” refers to data storage on a centralized server that is part of another organization’s infrastructure. That organization owns and operates the storage, while the user is charged for the storage space used. Since the data is held on a cloud server, the correctness of the stored data is at risk. Users have a low level of trust in cloud service providers because providers can at times be deceptive. Outsourced storage therefore raises a critical concern, and under newly introduced regulations such as GDPR, organizations can face legal and financial consequences. Since the cloud infrastructure is controlled by the cloud service provider, a separate operating entity, data privacy is critical. As a result, several auditing schemes have been proposed to protect the integrity and confidentiality of cloud records effectively. In this article, we review a variety of existing data integrity auditing schemes and their outcomes.
Keywords Data integrity · Cloud storage · Third-party auditors (TPA) · Cloud service provider (CSP) · Merkle Tree
Y. Anand (B) · B. Sirmour · S. M. Zaman · B. Markandey · A. Kumar · S. Maity
Department of Information Technology, Indian Institute of Information Technology Allahabad, Prayagraj, Uttar Pradesh 211012, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_42
1 Introduction
Cloud computing is a booming technology because of its elasticity, scalability, and pay-as-you-use pricing. In simple terms, cloud computing is a service that uses the Internet to deliver data storage services to everyone from organizations to individuals. It is also popular because it offers a cost-effective solution to small and medium-size organizations and frees enterprises from the pressures of data storage. Since storing data in the cloud is more convenient, the consumer does not need to be concerned about the difficulties of direct hardware management [7]. Once data is stored in the cloud, it can be accessed from anywhere through the Internet. Information stored in the cloud is usually held in a virtual network alongside other users’ data. Cloud service providers (CSPs) typically offer infrastructure that supports service-level agreements and authentication capabilities [14]. In conventional data management systems, users store their data locally, but in cloud storage they do not physically own it. As a result, data integrity becomes a significant user concern. According to Forrester’s research, “The global cloud computing market is going to reach 304.9 Billion Dollar till the end of 2021 as that of 40.7 Billion Dollar in 2010 when it started” [1] (Fig. 1). In cloud computing, performance and security are essential factors. CSPs are confronted with several security issues that could jeopardize data protection, integrity, and availability. Some CSPs conceal data breaches to protect their reputation, or free up space by removing rarely accessed data [3, 7]. Data auditing is needed to ensure data security and integrity and to guard against such risks. Cloud users (CUs) generally store large volumes of data with a CSP, and auditing this data is burdensome because downloading all of it is not feasible. Moreover, a CU typically deletes the local copy after uploading the data to the cloud.
This paper presents challenges in auditing cloud data and discusses the existing solutions and frameworks that address them. We also analyze the capability of these frameworks and suggest best practices for a cloud data auditing system. The paper consists of six sections. Section 2 discusses application areas of cloud storage systems and the challenges they face. Section 3 presents existing auditing techniques that audit cloud data without downloading all of it, together with their shortcomings. The contributions and limitations of these techniques are discussed in Sect. 4. The performance of the different techniques and the conclusions are presented in Sects. 5 and 6, respectively.
Fig. 1 Architecture of cloud storage
2 Application Area and Challenges
Data backup, data sharing, and resource services are a few of the uses of cloud computing. Many network services also benefit from simplified interfaces. Cloud storage without cloud auditing has a significant drawback. Cloud data auditing is relevant to every organization that stores its data with a third-party cloud storage provider. It also helps the organization with compliance, for example in obtaining certifications such as ISO 27001. Several attacks on the cloud can compromise data integrity and data confidentiality.
Co-resident attacks: In a CSP, several virtual machines (VMs) are placed on the same physical resources and therefore share the same CPU and memory. The attacker builds various types of side channels to the targeted VM and uses them to extract sensitive information from the CSP [6].
DDoS attack: The attacker sends a flood of requests to the targeted CSP by launching a coordinated DoS attack from zombie machines or bots. This overloads the CSP’s resources, and the CSP responds by allocating more. The cycle continues until the CSP runs out of resources or the owner runs out of credit [5].
Other attacks can also damage the integrity of the cloud, such as SQL injection, guest hopping, cookie poisoning, backdoor and debug options, cloud browser security flaws, cloud malware injection, ARP poisoning, domain hijacking, and IP spoofing [16].
In a conventional IT security audit, auditors enumerate, examine, and test an organization’s systems, processes, and operations to evaluate whether the systems safeguard information assets, maintain data integrity, and work efficiently to achieve the organization’s business goals or objectives. IT security auditors require data from both internal and external sources to support these goals.
Furthermore, cloud computing has its own set of security issues. A cloud infrastructure is the result of a continuous three-way negotiation between service organizations, cloud service providers, and end users to ensure productivity while maintaining a reasonable level of security. Cloud security audits must establish whether CSP customers have access to security-relevant data. Transparency helps organizations discover potential security threats and attacks more rapidly, and develop suitable responses. Inadequate data transparency might result in a loss of control over internal business resources. For example, an undetected back door into a vital business system might cause catastrophic damage to the organization. Transparency is even more critical in cloud security auditing, since security-relevant data is harder to obtain: CSPs, not cloud service users (CSUs), hold the bulk of it. Cloud security evaluations also require a complete understanding of CSP asset data, data location, and data security standards. It is dangerous to keep vital data unencrypted anywhere, especially outside one’s own IT infrastructure. If a cloud is breached, the attackers have immediate access to the data stored in it. To prevent this, a client can encrypt all of its data before sending it to the cloud provider, but this method increases the risk of system administrators misusing their rights. Traditional IT systems, too, face a host of encryption challenges. Which is more critical, data encryption or data access? How can an organization query the data quickly and efficiently if the whole data pool is encrypted at rest? Because of its high computational requirements, encryption may not always be the most efficient option. In addition, the complexity of the systems grows with their scale and scope. Cloud auditors should account for this complexity by devoting more time and resources than in a typical IT auditing procedure [2].
According to the Information Systems Audit and Control Association (ISACA), cloud auditors should cover the following for a successful cloud audit:
• Provide an organization with a security policy and control over the services that are hosted on a cloud.
• Provide a way to identify insufficiencies and inadequacies in the client data hosted on a service provider’s cloud.
• Provide audit stakeholders with capacity and quality evaluation criteria and reports so that they may be confident in the service provider’s certification and accreditation, as well as its internal controls [12].
3 Survey and Existing Work
While cloud storage provides many benefits to customers, it does not guarantee the privacy and security of outsourced data. A user can generally check data integrity in two ways. In the first, the customer retrieves all of the outsourced data and checks its correctness. In the second, the user checks the correctness of the data each time it is accessed. Both have drawbacks: the first is inefficient because network input–output operations are costly, and the second gives no guarantee for data that is never or only infrequently accessed.
Manoharan considered the problem of user-level security and proposed an Arnold transform-based biometric authentication model that performs robustly and protects cloud data from unauthorized access [11]. For reliable data transmission in the cloud and other IoT systems, N. Bhalaji recommended a strong routing approach between mesh users, the gateway, and routers. The approach assesses the performance of the existing link in order to improve throughput and connectivity, reduces the likelihood of duplicated data packet transmission (which increases communication overhead), and creates a trustworthy, one-additional-hop-based data transfer that improves data security by using asymmetric encryption and decryption based on a cryptographic protocol [4].
Although the benefits of cloud storage are more attractive than ever, it also exposes outsourced data to new and complex security threats. Because cloud service providers (CSPs) are independent operating organizations, outsourcing data means that owners give up ultimate control over the fate of their data. As a consequence, the correctness of data in the cloud is put at risk. Wang et al. therefore proposed a publicly auditable cloud storage system in which data owners can rely on a third-party auditor (TPA) to check outsourced data when needed. Auditing by a third party is a straightforward and cost-effective way to build trust between the data owner and the cloud server. Based on the TPA’s audit result, the published audit report helps owners evaluate the risk of the cloud data services they subscribe to and helps the cloud service provider improve its cloud-based service platform. However, this is still not enough for a publicly auditable secure cloud storage system. What if the TPA and the data owner are both untrustworthy? In this situation, the auditing outcome should establish not only whether the data is correct but also which party (the owner, the TPA, or the cloud server) is responsible for any problems that arise [9].
To overcome these problems, many researchers have contributed work over the years. After reviewing these articles, we group them into three categories as follows:
3.1 Integrity Checking Using Cryptographic Functions
Hiremath et al. proposed an efficient privacy-preserving public auditing system in which the TPA can conduct auditing tasks without retrieving a copy of the cloud user’s data, thus protecting privacy. Before uploading data to cloud storage, the cloud user divides the file into blocks, which are then encrypted using the AES algorithm, and a message digest is generated using the Secure Hash Algorithm. During verification, the TPA compares the hash it generates with the hash provided by the data user. If the two match, the data is unchanged; otherwise it has been tampered with [7]. In this scheme, however, the CSP can keep only the hash value of a rarely used file and remove the file itself. In addition, if the file is very large, recomputing the hash of every block is an overhead that also consumes resources.
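The digest comparison described above can be illustrated with a minimal sketch. It assumes SHA-256 as the Secure Hash Algorithm variant and leaves out the AES encryption step, since Python’s standard library does not provide AES. The function names (`split_blocks`, `block_digests`, `audit`) and the tiny block size are hypothetical choices made for illustration, not part of the surveyed scheme.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a file's bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_digests(blocks: list[bytes]) -> list[str]:
    """Return one SHA-256 hex digest per block."""
    return [hashlib.sha256(b).hexdigest() for b in blocks]


def audit(expected: list[str], stored_blocks: list[bytes]) -> list[int]:
    """Return the indices of blocks whose digest differs from the expected one."""
    actual = block_digests(stored_blocks)
    if len(actual) != len(expected):
        # A different block count means the file as a whole cannot be trusted.
        return list(range(max(len(actual), len(expected))))
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
```

In this sketch the data user would keep (or hand to the TPA) the list returned by `block_digests`, and `audit` returns an empty list when every stored block still matches. Note that the whole block list is rehashed on every call, which is exactly the recomputation overhead mentioned above for large files.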
Mohanty et al. proposed a public auditing scheme in which the TPA validates the data stored in the cloud without data privacy being lost to the TPA. Before uploading, the cloud user (CU) pre-processes the data to generate a key pair $(K_{pri}, K_{pub})$ and a secret key $K_s$. The CU then encrypts the data together with the secret key and sends it to the CSP signed with the private key $K_{pri}$, and the CSP verifies it with the public key $K_{pub}$. During verification, the TPA also generates a key pair $(K_{Tpri}, K_{Tpub})$ and sends the encrypted $K_s$ given by the CU; the CSP then verifies $K_s$ and validates the TPA. The CSP next sends the data hashed with HMAC-MD5 as a response $H_{CSP}$ to the TPA. If the CSP response $H_{CSP}$ matches the hash $H_{CU}$ given by the CU to the TPA, the data is unchanged; otherwise it has been tampered with [14]. This scheme incurs overhead in storing keys. A malfunction on the server side or client side could cause the keys to be lost, and with them all the data held in the cloud.
Agarkhed and Ashalatha note that many auditing frameworks have been designed for cloud storage systems. The key aim is to provide a public auditing scheme for cloud data management that protects privacy. Infrastructure security auditing and data security auditing are the two concerns in the auditing process. IT security is achieved through infrastructure auditing standards. The compliance auditing scheme for shared data management in the cloud has a privacy-preserving auditing protocol. With public auditability, the TPA can check the correctness of cloud data without having to download it, and it verifies consumer data while still protecting privacy because it has no access to the content. For security, the scheme employs RSA-based storage security (RSASS) with public auditing of remote data.
RSASS is a public-key cryptography technique that relies on checking data storage correctness and enables complex data operations while reducing server computation. The public auditing algorithm has four phases: the initialization phase, audit phase, integrity phase, and hacker phase [1].
Jegadeesan and Sahithi propose and design a new model called data integrity auditing without private key storage. The system model includes the user, the cloud, and the TPA. The biometric data of a user who wants to use the cloud storage service is collected during user registration. The data owner then uses a signing key to build authenticators for the data blocks, which are uploaded to the cloud. Using a challenge-response protocol, the TPA verifies whether the cloud is keeping the user’s data intact. A fuzzy key, derived from the user’s biometric data, and a security parameter K are used to generate a public parameter (PP); from PP, a public key (PK) and a verification key (VK) are generated, and the proofGen and proofVerify algorithms are then used so that the TPA can verify that the proof P is legitimate. In this approach, however, most of the security depends on the TPA, and TPAs are often seen to act as adversaries. Still, the scheme essentially minimizes the harm caused by exposure of a client’s key in cloud storage auditing [8].
To address the problem of presenting data to the TPA securely while ensuring data storage correctness, D. N. Rewadkar and Suchita Y. Ghatage proposed a solution for the auditing process. Thanks to ElGamal homomorphic encryption, the TPA cannot learn anything about the data content. They first analyzed MAC-based and HLA-based solutions, which have weaknesses: a file can only be audited a limited number of times, and complex data handling is inefficient. In their solution, the cloud user uploads the data with a verification code to the cloud server, and the TPA conducts audits at the customer’s request. For an audit, the TPA obtains the verification code from the customer and submits a challenge to the cloud server based on that information. The TPA then verifies the response sent by the cloud server. The proposed system should be extended to allow batch auditing, so that audits for multiple customers can be performed together [17].
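The keyed-hash comparison at the heart of the Mohanty et al. scheme described earlier in this subsection, where the TPA checks that $H_{CSP}$ equals $H_{CU}$, can be sketched as follows. The key exchange between CU, CSP and TPA is not modelled; the shared secret is simply passed in. The function names are hypothetical, and HMAC-MD5 is used only because the surveyed scheme names it; a current design would more likely pick a stronger digest such as SHA-256.

```python
import hashlib
import hmac


def keyed_digest(secret_key: bytes, data: bytes) -> str:
    """HMAC-MD5 of the data under the shared secret key, as a hex string."""
    return hmac.new(secret_key, data, hashlib.md5).hexdigest()


def tpa_verify(h_cu: str, h_csp: str) -> bool:
    """TPA-side check: constant-time comparison of the two digests."""
    return hmac.compare_digest(h_cu, h_csp)
```

Here the cloud user would compute `keyed_digest` once before upload and give the result to the TPA, while the CSP computes it over the stored copy when challenged. Because the digest is keyed, a party without the secret cannot produce a matching value for modified data.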
3.2 By Using Provable Data Possession (PDP)
Data integrity comes into play when clients upload data to untrustworthy cloud storage and need to check the authenticity of their data from time to time. The simplest approach is to download the data and check its integrity locally, but this is feasible only when the data is small; for very large data sets, downloading is impractical. One of the most effective ways to ensure the integrity of data hosted in the cloud is Provable Data Possession (PDP). In this technique, before sending data to the cloud, the user computes metadata over the data and stores that metadata locally. Whenever the client requests an integrity check, the metadata is computed in the cloud and compared with the locally stored metadata; if the two match, the data stored in the cloud is intact.
Natu and Pachouly proposed a timer-based integrity checking system. At set intervals, the client side initiates PDP-based integrity verification. The results are stored in a log that auditors can analyze further. This is helpful in enterprise environments, where integrity verification can be tracked for future improvements. The disadvantage of PDP-based integrity checking is that the data block may reveal the data to the third-party auditor, leaking personally identifiable information [15].
Wu et al. propose a framework for privacy-preserving public auditing for secure cloud storage that combines data deduplication with dynamic data operations. The mechanism has the following properties:
(1) Public auditability: The TPA can verify the authenticity of data sent to a cloud service without retrieving any data from the cloud.
(2) Combined data deduplication and dynamic data operations: The availability of outsourced data shared by a small number of users is ensured after deduplication and dynamic data operations.
(3) Storage correctness: Once the TPA’s audit is passed, the user’s data is guaranteed to be stored correctly on the cloud server.
(4) Data privacy: Information about the outsourced data is protected from being leaked to the TPA.
The authors argue that cloud service providers can improve storage efficiency by reducing data volumes, which lowers cloud storage costs and energy consumption for large cloud storage systems. In deduplication, convergent encryption ensures data secrecy. Two models serve as the basis for ensuring the integrity of outsourced data in the cloud: Provable Data Possession (PDP) and Proof of Retrievability (POR). With these, the user’s goal of ensuring data privacy in a cloud storage environment is achieved. With further refinement, the framework can be improved for cloud storage systems that use data deduplication to save storage space and upload bandwidth [19].
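The metadata comparison that opens this subsection can be illustrated with a simplified spot check. This is a sketch under stated assumptions, not a PDP construction from the literature: published PDP schemes typically use homomorphic tags so that the server returns a compact proof, whereas here the server simply returns the challenged blocks and the client recomputes per-block HMAC tags that it kept locally. All names (`make_tags`, `make_challenge`, `server_respond`, `verify_response`) are hypothetical.

```python
import hashlib
import hmac
import random


def _tag(key: bytes, index: int, block: bytes) -> bytes:
    """HMAC-SHA-256 over the block position and content."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def make_tags(key: bytes, blocks: list[bytes]) -> list[bytes]:
    """Client side, before upload: the locally stored metadata, one tag per block."""
    return [_tag(key, i, b) for i, b in enumerate(blocks)]


def make_challenge(n_blocks: int, sample: int, rng: random.Random) -> list[int]:
    """Pick distinct random block indices to check."""
    return rng.sample(range(n_blocks), min(sample, n_blocks))


def server_respond(stored: list[bytes], challenge: list[int]) -> list[bytes]:
    """Cloud side: return the challenged blocks."""
    return [stored[i] for i in challenge]


def verify_response(key: bytes, tags: list[bytes],
                    challenge: list[int], response: list[bytes]) -> bool:
    """Client side: recompute tags for the returned blocks and compare."""
    if len(response) != len(challenge):
        return False
    return all(
        hmac.compare_digest(tags[i], _tag(key, i, block))
        for i, block in zip(challenge, response)
    )
```

Sampling only some blocks keeps each check cheap regardless of file size, at the cost of detecting corruption only with some probability per round. Binding the block index into the tag prevents the server from answering a challenge for one position with a block from another.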
3.3 By Using Blockchain and Merkle Tree
As discussed earlier, a TPA could itself act as an adversary. To overcome this problem, Sumathi et al. discussed blockchain technology, which allows safe data access and data privacy without jeopardizing confidentiality. They also addressed a problem that often arises when permissioned networks, rather than permissionless networks, provide customers’ personal information with anonymity and integrity: cloud service providers sometimes block registered consumers’ access to cloud services. In this system, the data owner divides the data into several blocks and uses the double SHA-256 algorithm to compute a signature for each block. The hashed data, along with the signature, is then sent to the cloud server. For authentication and authorization, the user must first register with the system. When a user needs data, the user sends a request to the database server. Data can be shared safely using encryption and decryption keys. The only disadvantage of blockchain technology noted is its storage space, which is limited to 1 MB per block. As the scale of the data grows, the efficiency of a blockchain technique degrades [18].
Earlier, we discussed the PDP technique, which is a public verification technique. Here the data owner hires third-party auditors to ensure data integrity. These auditors check data integrity at fixed time intervals and inform the data owner if a check fails. An irresponsible auditor may always produce a favorable integrity report without checking, or may not check the data within the given time interval. To get full confirmation of data integrity, the user may hire another auditor who audits the first auditor.
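The per-block double SHA-256 step described above for the scheme in [18] can be sketched as follows. The sketch treats the "signature" of a block as its double hash, which is an assumption about the surveyed scheme; a plain hash is not a digital signature in the cryptographic sense because anyone can recompute it. The function names and the block size parameter are hypothetical.

```python
import hashlib


def double_sha256(block: bytes) -> str:
    """SHA-256 applied twice: hash the block, then hash the 32-byte digest."""
    return hashlib.sha256(hashlib.sha256(block).digest()).hexdigest()


def sign_blocks(data: bytes, block_size: int) -> list[tuple[bytes, str]]:
    """Split data into blocks and pair each block with its double-SHA-256 value."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [(b, double_sha256(b)) for b in blocks]
```

A verifier holding the list of hash values can later recompute `double_sha256` over a retrieved block and compare the result to detect modification of that block.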
Arun Prasad Mohan, Mohamed Asfak R., and Angelin Gladston proposed a Merkle tree-based cloud audit system that is integrated with a blockchain-based audit system to record audit logs. The cloud auditing system can be divided into three parts based on its users: the cloud user, whose data is stored in the cloud; the cloud server, where the data is stored; and the third-party auditor (TPA). A blockchain is a linear data structure whose smallest unit of data is called a block; the blocks are linked one after another to form a linked list. Each block contains the cryptographic hash of the previous block, which makes it almost impossible to modify an earlier block. Whenever a new valid block is added to the blockchain, a timestamp is added to the block. This feature is used to record the result of an audit: whenever the auditor performs a data verification, a new block with a timestamp is added to the blockchain. A Merkle tree is a hash tree in which each leaf node contains the hash of a data block, and every non-leaf node is labeled with the cryptographic hash of the labels of its child nodes. The main advantage of using a Merkle tree with a blockchain is that the time required to check that a block belongs to the Merkle tree is reduced to O(log n). The authors conclude that Merkle tree verification saves an average of 0.25 ms [13].
Lekshmi et al. proposed a method that eliminates the TPA by using a smart contract. It is a decentralized data auditing scheme in which a smart contract checks data integrity. The data owner encrypts the data and splits it into blocks, and a hash value is generated before the data is uploaded to the cloud. An Ethereum smart contract is used for auditing on a decentralized platform. It stores the information provided by the owner, including file details such as name, size, and hash, and the data owner’s details. The CSP then verifies the data integrity. A blockchain is a continuously growing chain; each newly generated block contains the hash of the previous block, so the data in a block verifies the content of the preceding block. If any old block is tampered with, its hash value changes and no longer matches the successive blocks [10]. A basic comparison of all the articles is given in Table 1.
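The logarithmic membership check attributed to the Merkle tree above can be made concrete with a small sketch. It assumes SHA-256 and pairs the last node of an odd-sized level with itself; both are illustrative choices rather than details taken from [13], and the function names are hypothetical. The sketch is not hardened (for example, it does not separate leaf hashes from inner-node hashes).

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Return all tree levels, leaves first; the last level holds only the root."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [_h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: pair the last node with itself
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag says whether the sibling is on the left."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):          # unpaired last node: its sibling is itself
            sibling = index
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify_proof(block: bytes, path: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one block and its sibling path."""
    node = _h(block)
    for sibling, sibling_is_left in path:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

The auditor needs only the root and one sibling hash per tree level, so checking a single block among n touches about log2(n) hashes instead of rehashing all n blocks.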
4 Discussion
In Sect. 3, we described research on different techniques for cloud data auditing. Cloud service providers (CSPs) are independent operating organizations, so the correctness of data is at risk. To address this, third-party auditors (TPAs) are used to audit the data in the cloud. Third-party auditors conduct an independent and objective audit of an organization’s management system to determine whether it meets the requirements of a specific standard; if successful, the audit provides the organization with certification or registration of conformity with that standard. However, the difficulty with TPAs is that they are also independent organizations, so they cannot be trusted blindly. Integrity can be checked with cryptographic algorithms and hashing techniques, which achieve data confidentiality and data integrity, but if the file is very large, recalculating the hash of every block is an overhead that is equally resource-intensive. In addition, there are costs associated with storing keys: a server-side or client-side fault could cause keys to be lost, resulting in the loss of all data stored in the cloud. To overcome the problem of large files, we suggest using provable data possession (PDP), in which metadata of the original data is stored locally; to check the integrity of the data, the metadata computed in the cloud is compared with the locally stored metadata. The limitation of PDP is that the data block may reveal the original data to the TPA. To address this PDP problem, data deduplication techniques are used.
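The storage saving from data deduplication mentioned above can be illustrated with a toy content-addressed store, in which a block is kept only once no matter how many files contain it. This is a sketch of the deduplication idea only: convergent encryption, which derives the encryption key from the content so that identical plaintext blocks also yield identical ciphertext, is deliberately left out, and the class and method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical blocks are kept once."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}      # digest -> block bytes
        self._files: dict[str, list[str]] = {}   # file name -> ordered digests

    def put(self, name: str, data: bytes, block_size: int = 4) -> int:
        """Store a file; return how many previously unseen blocks were added."""
        new_blocks = 0
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self._blocks:
                self._blocks[digest] = block
                new_blocks += 1
            digests.append(digest)
        self._files[name] = digests
        return new_blocks

    def get(self, name: str) -> bytes:
        """Reassemble a file from its block digests."""
        return b"".join(self._blocks[d] for d in self._files[name])

    def stored_blocks(self) -> int:
        """Number of distinct blocks physically held."""
        return len(self._blocks)
```

Because each block is addressed by its digest, the same digest list doubles as integrity metadata: a block whose content no longer hashes to its key has been altered.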
Table 1 Overview of existing surveys on cloud data auditing

| Surveys | Year | Journal | Major contribution | Limitation |
|---|---|---|---|---|
| Wang et al. [9] | 2010 | IEEE Network | Propose that public auditability can help secure cloud data storage | Emphasizes cloud data storage security |
| Pooja et al. [15] | 2014 | IJCSIT | Automates the PDP process with result logging features | No notification when data verification fails |
| Rewadkar et al. [17] | 2014 | ICCICCT | Provides a secure way to introduce data to the TPA | Limited to auditing one customer at a time |
| Hiremath et al. [7] | 2017 | ICEECCOT | Privacy-preserving public auditing system that does not retrieve a copy of the cloud user’s data | CSP can store the hash value of a least-used file and remove the file itself |
| Agarkhed et al. [1] | 2017 | ICCPC | Provides security for storing data efficiently | Does not cover the security issues and concerns in detail |
| Wu et al. [19] | 2017 | IEEE International Conference on CSE and EUC | Proposes a mechanism that combines data deduplication with dynamic data operations in auditing for secure cloud storage | Increased complexity from adding data deduplication |
| Mohanty et al. [14] | 2018 | RICE | Validates cloud data without losing data privacy to the TPA | Key storage overhead; a server-side or client-side malfunction may cause the keys to be lost |
| Mohan et al. [13] | 2020 | IJCAC | Provides a Merkle tree-based blockchain system that is faster than the regular blockchain model | Cannot be used for auditing sensitive data |
| Lekshmi et al. [10] | 2020 | ICSSIT | Decentralized data auditing scheme that uses a smart contract to check data integrity | The cost of gas increases as the number of blocks increases |
| Jegadeesan et al. [8] | 2021 | International Journal of Research | Provides a new paradigm called data integrity auditing without private key storage, using biometric data | User needs to be online for the integrity checking process |
| Sumathi et al. [18] | 2021 | TURCOMAT | Provides safe data access and data privacy | Limited by the blockchain restriction that a block should be smaller than 1 MB |
A Brief Survey of Cloud Data Auditing Mechanism
Table 2 Summarized comparison of auditing scheme
Method used | PA (a) | DI (b) | DC (c)
Using third-party auditor (TPA) [9]
HMAC-MD5 and RSA [14]
AES and HASH [7]
RSA-based storage security [1]
Data integrity auditing without private key storage [8]
SHA256 and ElGamal algorithm [17]
Double SHA 256 algorithm and blockchain [18]
Data deduplication [19]
Provable data possession schemes [15]
Merkle tree with blockchain [13]
Smart contract and blockchain [10]
YES
NO
NO
YES
YES
YES
YES YES
YES YES
YES YES
YES
YES
NO
NO
YES
YES
YES
YES
YES
YES
YES
YES
YES
YES
NO
YES
YES
NO
YES
YES
YES
(a) PA: Public auditing; (b) DI: Preserve data integrity; (c) DC: Preserve data confidentiality
As discussed for TPAs, auditors are sometimes irresponsible and may give a false report on data integrity. To tackle this problem, blockchain technology is used, and a Merkle tree is used to reduce the verification time.
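Why a Merkle tree shortens verification can be seen in a small Python sketch. It is a generic textbook construction, not the specific scheme of any surveyed paper: one block is verified against the root using a proof whose length grows with the logarithm of the number of blocks, rather than by rehashing all blocks. The helper names and the rule of pairing an odd node with itself are assumptions made for the example.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def merkle_levels(blocks):
    """Return all tree levels, leaves first; an odd node is paired with itself."""
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect (sibling_hash, sibling_is_right) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        index //= 2
    return proof


def verify(block, proof, root):
    """Recompute the path to the root from one block and its proof."""
    node = h(block)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


blocks = [("block-%d" % i).encode() for i in range(5)]
levels = merkle_levels(blocks)
root = levels[-1][0]
proof = merkle_proof(levels, 4)
print(len(proof), verify(blocks[4], proof, root))
```

For five blocks the proof holds three hashes, so the verifier performs four hash computations for the checked block instead of rebuilding the whole tree.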
5 Result and Analysis Table 2 compares the technologies discussed above in terms of public auditing (PA), data integrity (DI), and data confidentiality (DC).
6 Conclusion Cloud computing has been hailed as the next-generation enterprise IT infrastructure. Unlike traditional business IT solutions, which keep IT resources under strict physical, logical, and staffing controls, cloud computing transfers applications and databases to servers in vast storage centers on the Internet, where data and service processing is not always reliable. This creates many additional security problems, including device and data security, recovery, and privacy, as well as legal concerns such as regulatory enforcement and auditing. In this article, we have given an overview of cloud storage data auditing techniques. We first discussed the third-party auditor (TPA) technique for auditing data, and then SHA, MD5, and other cryptographic algorithms for checking data integrity. To resolve their limitations, we discussed PDP, data deduplication, and blockchain. Data deduplication saves storage space and upload bandwidth, whereas a Merkle tree reduces the verification time. Almost all of these papers emphasize the need to create optimization techniques that can be used to accelerate the data owner's fixed operation.
References
1. J. Agarkhed, R. Ashalatha, An efficient auditing scheme for data storage security in cloud (2017)
2. W. Aiken, S. Rizvi, Cloud Security Auditing: Challenges and Emerging Approaches (2015). https://www.infoq.com/articles/cloud-security-auditing-challenges-and-emergingapproaches/
3. T.K. Babu, C. Guruprakash, A systematic review of the third party auditing in cloud security: security analysis, computation overhead and performance evaluation, in 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC). IEEE (2019), pp. 86–91
4. N. Bhalaji, Reliable data transmission with heightened confidentiality and integrity in IoT empowered mobile networks. J. ISMAC 2(02), 106–117 (2020)
5. K. Bhushan, et al., Ddos attack mitigation and resource provisioning in cloud using fog computing, in 2017 International Conference on Smart Technologies for Smart Nation (SmartTechCon). IEEE (2017), pp. 308–313
6. M.M. Hasan, M.A. Rahman, Protection by detection: a signaling game approach to mitigate co-resident attacks in cloud, in 2017 IEEE 10th International Conference on Cloud Computing (CLOUD). IEEE (2017), pp. 552–559
7. S. Hiremath, S. Kunte, A novel data auditing approach to achieve data privacy and data integrity in cloud computing, in 2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT). IEEE (2017), pp. 306–310
8. R. Jegadeesan, C. Sahithi, A scalable mechanism of cloud storage for data integrity auditing without private key storage (2021)
9. C.W. Jin Li, W.L. Kui Ren, Toward publicly auditable secure cloud data storage services (2010)
10. M. Lekshmi, N. Subramanian, Data auditing in cloud storage using smart contract, in 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT). IEEE (2020), pp. 999–1002
11. J.S. Manoharan, A novel user layer cloud security model based on chaotic Arnold transformation using fingerprint biometric traits. J. Innov. Image Process. (JIIP) 3(01), 36–51 (2021)
12. M. Moghadasi, S. Majid, G. Fazekas, Cloud computing auditing. Int. J. Adv. Computer Sci. Appl. 9(01) (2018). https://doi.org/10.14569/IJACSA.2018.091265
13. A.P. Mohan, A. Gladston et al., Merkle tree and blockchain-based cloud data auditing. Int. J. Cloud Appl. Comput. (IJCAC) 10(3), 54–66 (2020)
14. S. Mohanty, P.K. Pattnaik, R. Kumar, Confidentiality preserving auditing for cloud computing environment, in 2018 International Conference on Research in Intelligent and Computing in Engineering (RICE) (2018), pp. 1–4. https://doi.org/10.1109/RICE.2018.8509052
15. P. Natu, S. Pachouly, A comparative analysis of provable data possession schemes in cloud
16. B. Nt, 14 most common cloud security attacks and counter measures (2019). https://roboticsbiz.com/14-most-common-critical-cloud-security-attacks-and-countermeasures/ [Online]. Last accessed 9 May 2021
17. D. Rewadkar, S.Y. Ghatage, Cloud storage system enabling secure privacy preserving third party audit, in 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT). IEEE (2014), pp. 695–699
18. M. Sumathi et al., Secure blockchain based data storage and integrity auditing in cloud. Turk. J. Computer Math. Educ. (TURCOMAT) 12(9), 159–165 (2021)
19. Y. Wu, et al., Dynamic data operations with deduplication privacy-preserving public auditing for secure cloud storage (2017)
Detecting Sybil Node in Intelligent Transport System K. Akshaya and T. V. Sarath
Abstract Dynamic traffic light control is one of the most important applications of vehicular ad hoc networks (VANETs). A Sybil attack creates multiple fake identities for the same vehicle. In this paper, we propose a method to detect Sybil vehicles using machine learning algorithms. The network and vehicular demand were created using Simulation of Urban MObility (SUMO) and OpenStreetMap (OSM). The dataset and features for training the machine learning models were collected by conducting a series of simulations in NetSim. Machine learning algorithms, namely support vector machine (SVM), logistic regression, and random forest, were trained on the data packets received by the roadside unit, using features such as packet status, source identity, destination identity, battery status, and geographical values. The Sybil node classification results show that logistic regression and random forest give the highest accuracy: 99.62% with logistic regression and 99.67% with random forest. Keywords ITS · Machine learning · NetSim · SUMO · Sybil attack · VANET
K. Akshaya (B) · T. V. Sarath, Department of Electrical and Electronics Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_43
1 Introduction An intelligent transportation system (ITS) helps to improve the quality of transportation systems by providing driver assistance, safety, and traffic control. Wireless communication is indispensable for all of these applications. A vehicular ad hoc network (VANET) is a subdivision of mobile ad hoc networks (MANETs) in which the vehicle acts as a wireless communication node. A VANET is made up of components such as on-board units (OBUs), which are installed in the vehicles to provide wireless communication capability, and roadside units (RSUs), installed in junctions on the
roadside, which acts as a gateway between the OBUs and the communication infrastructure. Communication in VANETs falls into two categories: vehicle-to-vehicle (V2V) communication and vehicle-to-infrastructure (V2I) communication. V2V is short-range communication in which data collected from different sensors is exchanged between the OBUs of the vehicles. V2I is long-range communication between a vehicle and an RSU [1]. VANETs are vulnerable to network attacks, including denial of service (DoS), distributed denial of service (DDoS), black hole, illusion, timing, man-in-the-middle, and Sybil attacks [2]. In a DoS attack, the attacker vehicle jams the entire network so that network services are no longer available to authentic users. A DDoS attack is a distributed attack in which the attacker launches the attack on a network from different locations. In a black hole attack, the attacker nodes circulate false routing information to other vehicles in the network. In an illusion attack, the attacker manipulates the vehicle's data and readings and circulates the incorrect readings to nearby vehicles. In a timing attack, a malicious node that receives an emergency message adds a delay before forwarding it, thus denying other vehicles the real-time data. In a man-in-the-middle attack, the attacker node intrudes on the communication between two vehicles and accesses the data. In a Sybil attack, the attacker node creates multiple fake identities and creates a false traffic condition [2]. Currently, most traffic lights use pre-set timings to control traffic, which leads to congestion. The timing is pre-set according to the location and to peak and off-peak traffic.
Dynamic traffic light control is implemented in VANETs: the RSU receives messages from the OBUs, counts the number of vehicles based on the packets received, and optimizes the traffic light timings, which develops decision-making and computational abilities in the physical world according to the unified theory of acceptance and use of technology (UTAUT) [3, 4]. When the RSU receives packets from vehicles, it extracts the vehicle identity, and based on the number of vehicle identities received, it calculates the traffic density [5–8]. However, the chances of a dynamic traffic light system being attacked by a malicious node are high. One such dangerous attack model is the Sybil attack, in which the attacker vehicle creates multiple fake identities and inflates the apparent vehicle density, thus disrupting the dynamic traffic light control system. An intrusion detection system (IDS) implemented on the RSU monitors the network and identifies suspicious activities in it. IDSs identify intrusions in different ways. A network-based IDS is deployed in the network, so it can analyze the network's inbound and outbound traffic. A host-based IDS helps to identify malicious activity on a host caused by malware. A signature-based IDS stores the signatures of known intrusions and identifies an intrusion by matching it against an existing signature; antivirus software is an example of a signature-based IDS. An anomaly-based IDS stores data from different networks, including bandwidth, ports, protocols, and other devices, and uses machine learning models [9]. In the proposed method, we use machine learning to identify the malicious node. The main component of an intrusion detection system using machine learning is the dataset obtained from the network nodes, and the machine learning model is implemented
Fig. 1 Sybil attack
on the receiver. As shown in Fig. 1, the malicious vehicle A creates two fake identities, B and D. Vehicles C and E are normal vehicles that communicate with the RSU. When the RSU calculates the vehicle density, the fake nodes are also counted, so the RSU calculates the number of vehicles as five. In this work, we focus on detecting Sybil attacks in intelligent transport systems using machine learning models. Different malicious and non-malicious traffic scenarios are simulated to generate the dataset for machine learning. SUMO is used to create the vehicular demand. Different vehicular networks were implemented in NetSim. Features such as geographical data, packet status, packet count, and battery status of the vehicle are collected from the NetSim packet trace. Machine learning algorithms such as random forest, logistic regression, and SVM are trained to identify the Sybil attacker.
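The effect shown in Fig. 1 can be reproduced with a few lines of Python. This is only an illustrative sketch: the Packet record and its physical_sender field are hypothetical, and the ground-truth sender is something only a simulation knows, since a real RSU sees just the claimed source identity.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    source_id: str        # identity claimed in the packet
    physical_sender: str  # ground truth, available only in simulation


def estimated_density(packets):
    """Density as a naive RSU would compute it: distinct claimed identities."""
    return len({p.source_id for p in packets})


def true_density(packets):
    """Number of vehicles that physically transmitted packets."""
    return len({p.physical_sender for p in packets})


# Vehicle A also transmits under the fake identities B and D, as in Fig. 1.
packets = [
    Packet("A", "A"), Packet("B", "A"), Packet("D", "A"),
    Packet("C", "C"), Packet("E", "E"), Packet("C", "C"),
]

print(estimated_density(packets), true_density(packets))
```

The RSU-side estimate is five vehicles although only three are physically present, which is the inflated density that would mislead a dynamic traffic light controller.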
2 Literature Survey Under current traffic conditions, a dynamic traffic light control system is necessary. Srivastava et al. [6] implemented an intelligent traffic light control system by collecting data such as vehicle count, speed, and density from simulated network scenarios. This data is used to optimize the average waiting time at the traffic light. Maslekar et al. [5] introduced an adaptive traffic light control system in which a VANET was used to develop the dynamic traffic light control. These works did not discuss the network attacks that can occur in VANET communication. Sumra et al. [10] discuss different types of network attacks in VANETs, such as DoS, DDoS, and Sybil attacks, and note that the Sybil attack is the most dangerous. An IDS identifies unauthorized or malicious activities in a network. IDSs can be broadly classified into two types: signature-based and anomaly-based [10]. A signature-based intrusion detection system (SIDS) is a
pattern-based intrusion detection system. Each known intrusion is labeled with a signature. If a new intrusion matches a stored signature, the node is identified as malicious. An anomaly-based intrusion detection system (AIDS) is a machine learning-based intrusion detection system. Using deep learning in intrusion detection helps with automatic feature extraction and the selection of adaptive activation functions [11]. In an AIDS, the models are trained using data from both normal and intruded networks, and malicious nodes are classified and identified in the testing phase [12]. A time-series approach to identifying Sybil attacks is discussed in [13], which assumes that two or more vehicles cannot pass the RSU at exactly the same time. Messages carrying adjacent time stamps were therefore attributed to a Sybil attack node. In [14], the attacker node is identified by the received signal strength. The neighboring vehicle calculates the distance to the vehicle from which the signal originated, and the Sybil node is identified using a pre-defined threshold for signal strength. Grover et al. [15] identify the Sybil node from the neighboring vehicles. Network scenarios were simulated such that neighboring nodes have the same set of neighbors; if a node remains in the neighbor lists of other nodes for more than a defined time, that node is categorized as a Sybil node. In [16], the authors identify the Sybil node through V2V communication. The neighboring node calculates the accurate speed of the vehicle and determines whether the node is a Sybil node or not. The mobility scenario in SUMO is used to simulate the network for that work. The works in [13–16] use neighboring nodes to identify Sybil attacks in VANETs. The accuracy of these methods might decrease due to the dynamic nature of traffic, and the complexity increases when all the nodes broadcast messages [17].
In [18], the authors use the existing advanced driver assistance system (ADAS) followed by deep learning to analyze traffic images and identify the fake identity nodes. In [19], Grover et al. identify misbehavior instances in VANETs with machine learning models, using classification algorithms such as random forest, naive Bayes, and AdaBoost. Different traffic scenarios were simulated, and features from the packet trace, such as geographical position, speed deviation, acceptance range, and signal strength, were used as the dataset for training the machine learning algorithms. That work concentrates on identifying whether a Sybil attack has happened, but not on identifying which vehicle is the Sybil vehicle. In [20], the Sybil nodes are identified by training an SVM model on the driving patterns of the vehicles; only the Sybil node showed erratic driving patterns. In the works discussed above [18–20], the machine learning models are implemented in the RSU, thus decreasing the complexity in the OBU. Features such as geographical position, speed deviation, acceptance range, and signal strength are used to train the machine learning models to identify the occurrence of Sybil instances, and image datasets and driving patterns are used to train the models to identify the Sybil node. In this work, machine learning models are implemented to identify the Sybil node in an intelligent transport system. Features such as packet count from the Sybil node, packet status, source identity, destination identity, and battery status of the vehicles are used to train the machine learning model.
Along with these features, the geographical position is also included, following the insight of [21], to train the machine learning model. In VANET simulations, the number of Sybil nodes is smaller than the number of non-Sybil nodes. In detection and classification problems with image datasets, such class imbalance is neutralized by augmenting the images. Another technique is to give each class equal weight when computing the error and backpropagating. For structured data, the common methods are resampling the minority data so that the dataset is balanced, preselection, and dynamically adjusting the hyperparameters while ensuring that useful information is not lost [22]. In [23], the authors introduce SMOTE with a simple genetic algorithm as a solution for undersampling in datasets. Different datasets with undersampling were selected for the work, and synthetic instances were created and used for training. Health-related datasets tend to be unbalanced; Li et al. [22] implemented SMOTE without uneven distribution as a solution for unbalanced datasets. Srinilta et al. [24] collected the abalone, E. coli, ozone level, balance scale, mammographic mass, and wine quality datasets, which have minority and majority classes, and used SMOTE with nearest neighbors as a preprocessing step to balance them. In [25], the unbalanced E. coli, Page Block, and Pima datasets were collected, and SMOTE with reverse neighbors was applied to obtain balanced data. In the works discussed above [22–25], the SMOTE algorithm is used on different datasets to overcome the problem of imbalance.
3 System Overview The data for machine learning is generated by simulating different traffic scenarios in a VANET. The VANET simulation has two parts. The first covers urban mobility and road traffic simulation, implemented using SUMO. The second covers network communication, implemented using NetSim. Figure 2 shows the system overview of the VANET simulation. SUMO is an open-source traffic simulation package in which users can model roads, traffic signals, vehicles, etc. [26]. In this work, OpenStreetMap (OSM) is used to generate the digital map. The digital map downloaded from OSM WebWizard includes a network file, a routes file, and a SUMO configuration file. The network file describes the roads with their lanes, traffic lights, and intersections. The routes file includes the
Fig. 2 System overview
information on vehicular demand, including all the edges that the vehicles pass through. The configuration file comprises all the information about the generated traffic scenario, including the paths to the routes and network files and the step length of the simulation. NetSim takes a SUMO configuration file as input. Different simulation scenarios can be designed in NetSim, specifying the Sybil and non-Sybil nodes, the payload of the messages to be sent, and the sender and receiver vehicles for the messages. In NetSim, a vehicle can be configured as a static node and set as an RSU. NetSim provides different evaluation outputs while simulating the traffic scenario, including a packet trace and an event trace. These traces include details of each packet, such as destination identity, source identity, packet status, gateway, and next hop. All these details can be downloaded, preprocessed, and used as data for machine learning.
4 Proposed Methodology The proposed methodology uses features of the packets received by the RSU to train the machine learning models. Figure 3 describes the Sybil node detection method. Networks and vehicles are simulated to obtain the required input to the machine learning model. The network and the vehicle demand are created using the open-source tools SUMO and OSM. Thirty-five vehicle nodes and an RSU are simulated, and every vehicle sends a unicast message to the RSU. The messages sent from the vehicles to the RSU are basic safety messages, and the communication is based on wireless access in vehicular environments (WAVE). Features such as packet status, sender identity, receiver identity, geographical data, and battery status are extracted from the packet trace of the simulation model. Label encoding of features such as source identity, destination identity, and geographical values is applied in the initial stage of data preprocessing. Label encoding gives a structured dataset as input to a
Fig. 3 A proposed design for detecting Sybil node
machine learning algorithm. The synthetic minority oversampling technique (SMOTE) is applied during data preprocessing to overcome the imbalance in the data. In a real traffic scenario, the number of malicious nodes is smaller than the number of non-malicious nodes, which creates an imbalance between the number of samples for the two classes. To resolve this class imbalance, SMOTE is used. SMOTE oversamples the minority class by generating synthetic samples from the existing minority samples until the data is balanced [27]. Data preprocessing was followed by model training, in which SVM, logistic regression, and random forest were used to classify Sybil vehicles.
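The two preprocessing steps named above can be sketched in plain Python. This is a simplified illustration, not the authors' pipeline: the encoder numbers values in order of first appearance, and the oversampler follows the standard SMOTE idea of placing each synthetic point on the line segment between a minority sample and one of its nearest minority neighbours. The function names, the toy points, and the parameter k are assumptions made for the example.

```python
import random


def label_encode(values):
    """Map each distinct value to an integer in order of first appearance."""
    mapping = {}
    for value in values:
        mapping.setdefault(value, len(mapping))
    return [mapping[value] for value in values], mapping


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying between a minority sample
    and one of its k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


codes, mapping = label_encode(["V1", "V2", "V1", "RSU"])
sybil_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy minority class
new_points = smote(sybil_samples, n_new=4)
print(codes, len(new_points))
```

Because every synthetic point is an interpolation of existing minority samples, oversampling adds no new information about the minority class; it only changes the class proportions seen by the classifier.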
4.1 Data Generation For dataset generation, a digital map of an area in Bangalore, India, is selected using OSM WebWizard, as shown in Fig. 4. The selected area includes traffic lights, two-lane roads, single-lane roads, and junctions. The selected map is converted to an equivalent road network file in OSM WebWizard, which also allows different types of vehicles and pedestrians to be simulated. In this work, we simulated 36 vehicles using OSM WebWizard. Figure 5 shows the road network of the selected map area. The generated scenario is given as input to NetSim VANET, where the application for each vehicle is configured. Figure 6 shows the VANET model simulated in NetSim, which includes 35 vehicles and one RSU. The RSU is simulated in NetSim by making one vehicle a stationary unit. All the vehicles in the network send unicast messages to the RSU, which avoids the network complexity caused by broadcasting. Three VANET models were created. The first has no attacks. The second has one Sybil attack vehicle that creates twenty
Fig. 4 Selecting map area in OSM
Fig. 5 Network topology of the selected map in SUMO
Fig. 6 VANET simulation in NETSIM
fake identities. The third has two Sybil attack vehicles: the first creates ten fake identities and the second creates twenty-five. In this implementation, we assume that the RSU is fault-free and that only 5–10% of the vehicles are malicious. A series of experiments was carried out in NetSim, and features such as source identity, destination identity, packet status, number of packets, geographical values, and battery status were obtained. Source identity is the feature that the Sybil node manipulates to create multiple identities, and the dynamic traffic light control unit calculates the density from the number of source identities received. Destination identity is the identity of the RSU; each RSU has a different identity. The geographical value of the vehicle is extracted for Sybil node classification: messages from the attacker carry different source identities, but the geographical value is the same for all of them. Similarly, the battery status of the Sybil node is the same for all its messages, whatever identity they carry. Another important feature is the number of packets: although the Sybil node sends messages with different source identities, the number of packets generated per identity is lower than for non-malicious vehicles.
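The reasoning behind the geographical and battery features can be made concrete with a short Python sketch. It is a rule-based illustration of why these features are informative, not the trained classifiers used in this work, and the field names and values are hypothetical rather than taken from the NetSim packet trace format.

```python
from collections import defaultdict


def suspicious_groups(packets):
    """Group source identities by (position, battery) and keep the groups
    in which more than one identity reports exactly the same values."""
    groups = defaultdict(set)
    for packet in packets:
        groups[(packet["position"], packet["battery"])].add(packet["source_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical trace rows: V7 also transmits as the fake identities F1 and F2.
packets = [
    {"source_id": "V7", "position": (12.97, 77.59), "battery": 81},
    {"source_id": "F1", "position": (12.97, 77.59), "battery": 81},
    {"source_id": "F2", "position": (12.97, 77.59), "battery": 81},
    {"source_id": "V3", "position": (12.95, 77.61), "battery": 64},
]

flagged = suspicious_groups(packets)
print(flagged)
```

A learned model can weigh these signals together with packet counts instead of relying on an exact-match rule, which would be brittle if positions or battery readings varied slightly between packets.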
5 Results and Analysis To detect the Sybil vehicle in dynamic traffic light control, different features were collected from the VANET simulation. The packet trace data obtained from the simulation is used as input to the machine learning models. Table 1 shows the dataset description. The first VANET model, simulated without any Sybil attack vehicle, has a data size of 476,506 × 7, with thirty-five vehicle identities and one RSU. The second, with one Sybil attacker node, has a data size of 476,506 × 7, with fifty-five vehicle identities and one RSU. The third, with two Sybil attackers, has a data size of 476,506 × 7, with seventy-one vehicle identities and one RSU. The RSU is considered non-malicious in all scenarios.
Table 1 Dataset description
VANET model | Data size | Number of Sybil vehicles | Number of vehicle identities
Model 1 | 476,506 × 7 | 0 | 35
Model 2 | 476,506 × 7 | 1 | 55
Model 3 | 476,506 × 7 | 2 | 71
In data preprocessing, Sybil vehicles are label encoded as 1 and non-Sybil vehicles as 0. Sybil and non-Sybil vehicles are classified using three different classification methods. SVM fits a hyperplane between the data points of Sybil and non-Sybil vehicles. Sybil vehicles are treated as the positive class, and non-Sybil vehicles are treated
as the negative class. During testing, data points falling on the positive side of the hyperplane are predicted as Sybil vehicles, and data points falling on the negative side are predicted as non-Sybil vehicles. The accuracy obtained after testing the SVM model is 97.32%. Logistic regression statistically analyzes the dataset and predicts whether a vehicle is Sybil or not; the logistic model achieves an accuracy of 99.62%. Random forest classifies Sybil and non-Sybil vehicles by ensemble learning, with 100 estimators used for the classification; it gives an accuracy of 99.67%. The tested models are evaluated using the confusion matrix. True positive (TP) is the number of non-Sybil vehicles correctly identified as non-Sybil vehicles. False positive (FP) is the number of Sybil vehicles incorrectly identified as non-Sybil vehicles. True negative (TN) is the number of Sybil vehicles correctly identified as Sybil vehicles. False negative (FN) is the number of non-Sybil vehicles incorrectly identified as Sybil vehicles.
True Positive Rate (TPR): the fraction of non-Sybil vehicles correctly identified as non-Sybil vehicles, calculated by
TPR = TP/(TP + FN)    (1)
False Positive Rate (FPR): the fraction of Sybil vehicles incorrectly identified as non-Sybil vehicles, calculated by
FPR = FP/(FP + TN)    (2)
True Negative Rate (TNR): the fraction of Sybil vehicles correctly identified as Sybil vehicles, calculated by
TNR = TN/(TN + FP)    (3)
False Negative Rate (FNR): the fraction of non-Sybil vehicles incorrectly identified as Sybil vehicles, calculated by
FNR = FN/(FN + TP)    (4)
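Equations (1)–(4) can be computed directly from lists of true and predicted labels, as in the following minimal Python sketch. It follows the convention stated in the text, where the positive class is the non-Sybil vehicle (label 0) and the Sybil vehicle is label 1; the function name and the toy labels are assumptions made for the example.

```python
def rates(y_true, y_pred, positive=0):
    """Return TPR, FPR, TNR and FNR as defined in Eqs. (1)-(4).

    `positive` is the label treated as the positive class (0 = non-Sybil).
    Both classes must occur in y_true, otherwise a denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "TNR": tn / (tn + fp),
        "FNR": fn / (fn + tp),
    }


y_true = [0, 0, 0, 0, 1, 1]  # four non-Sybil vehicles, two Sybil vehicles
y_pred = [0, 0, 0, 1, 1, 0]  # one error in each class
result = rates(y_true, y_pred)
print(result)
```

By construction TPR + FNR = 1 and FPR + TNR = 1, which gives a quick consistency check for any reported set of rates.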
SMOTE was also used to create a balanced dataset, but the accuracy decreased, because the lower number of packets for fake identities is itself an important feature for classifying Sybil nodes. With SMOTE, the accuracy is 83.7% for SVM, 84.7% for logistic regression, and 97.8% for random forest. As shown in Table 2, logistic regression, a statistical binary classifier, classified Sybil and non-Sybil nodes more accurately than SVM, and the bagging property of random forest helps it achieve the highest accuracy.
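To show how a binary classifier can separate the two classes from features of this kind, the following Python sketch trains a logistic regression model by batch gradient descent. It is a toy illustration under stated assumptions, not the authors' implementation or dataset: the two features (a scaled packet count per identity and a flag for sharing position and battery values with another identity), the sample values, and the hyperparameters are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (Sybil) if the estimated probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy rows: [scaled packet count, shares position and battery with another id]
X = [[0.9, 0.0], [0.8, 0.0], [1.0, 0.0], [0.2, 1.0], [0.1, 1.0], [0.3, 1.0]]
y = [0, 0, 0, 1, 1, 1]  # 1 = Sybil identity, 0 = non-Sybil vehicle

w, b = train_logistic(X, y)
print([predict(w, b, x) for x in X])
```

On this toy data the learned weight for the packet count is negative and the weight for the sharing flag is positive, which mirrors the observation that fake identities send fewer packets and share values with their creator.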
Table 2 Classification accuracy using analytic metrics
Testing model | TPR | FPR | TNR | FNR | Accuracy (%)
SVM | 0.96 | 0.015 | 0.98 | 0.311 | 97.32
Logistic regression | 0.99 | 0.005 | 0.99 | 0.001 | 99.62
Random forest | 0.99 | 0.005 | 0.99 | 0.001 | 99.67
6 Conclusion A Sybil attack on a dynamic traffic light control system affects the whole functionality of the system. This paper implements machine learning models for the classification of malicious nodes. Along with detecting Sybil instances in traffic, this paper also identifies which vehicle is the Sybil vehicle causing the network attack. The dataset was obtained by simulating VANET models in NetSim. Machine learning models, namely SVM, logistic regression, and random forest, were implemented. SMOTE was applied to balance the data, but the accuracy decreased, because the lower number of packet samples from the Sybil node is an important feature for classification and oversampling the minority class adds redundant information. The results were validated using the confusion matrix, and logistic regression and random forest were observed to give the highest accuracy, making them well suited to separating Sybil nodes from non-malicious nodes. Once all real-world vehicles are converted into VANET nodes, the proposed system can be implemented to identify the attacker vehicle and take the necessary action against it. As future work, machine learning can be applied to the classification of more network attacks.
References
1. P.S. Gautham, R. Shanmughasundaram, Detection and isolation of Black Hole in VANET, in 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT) (IEEE, 2017), pp. 1534–1539
2. T. Zaidi, F. Syed, An overview: various attacks in VANET, in 2018 4th International Conference on Computing Communication and Automation (ICCCA) (IEEE, 2018), pp. 1–6
3. V. Suma, Wearable IoT based distributed framework for ubiquitous computing. J. Ubiquitous Comput. Commun. Technol. (UCCT) 03, 22–23 (2021)
4. D. Thando, R. Van Eck, Z. Tranos, Review of technology adoption models and theories to measure readiness and acceptable use of technology in a business organization. J. Inform. Technol. Digital World 02, 207–212 (2020)
5. N. Maslekar, M. Boussedjra, J. Mouzna, H. Labiod, VANET Based adaptive traffic signal control, in 2011 IEEE 73rd Vehicular Technology Conference (VTC Spring) (IEEE, 2011), pp. 1–5
6. J.R. Srivastava, T.S.B. Sudarshan, Intelligent traffic management with wireless sensor networks, in 2013 ACS International Conference on Computer Systems and Applications (AICCSA) (IEEE, 2013), pp. 1–4
606
K. Akshaya and T. V. Sarath
7. Y. Chen, K.-P. Chen, P.-A. Hsiungy, Dynamic traffic light optimization and control system using model-predictive control method, in 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC) (IEEE, 2016), pp. 2366–2371 8. K. Dar, M. Bakhouya, J. Gaber, M. Wack, P. Lorenz, Wireless communication technologies for ITS applications [Topics in Automotive Networking], in IEEE Communications Magazine (IEEE, 2010), pp. 156–162 9. Search security techtarget, https://searchsecurity.techtarget.com/definition/intrusion-detect ion-system 10. I.A. Sumra, I. Ahmad, H. Hasbullah, J. bin Ab Manan, Classes of attacks in VANET, in 2011 Saudi International Electronics, Communications and Photonics Conference (SIECPC) (IEEE, 2011), pp. 1–5 11. H. Wang, S. Smys, Overview of configuring adaptive activation functions for deep neural networks—a comparative study. J. Ubiquitous Comput. Commun. Technol. (UCCT) 03, 10–12 (2021) 12. A. Khraisat., I. Gondal., P. Vamplew, J. Kamruzzaman, Survey of intrusion detection systems: techniques, datasets and challenges, in Cybersecurity 2 (Springer, 2019), pp. 1–22 13. S. Park, B. Aslam, D. Turgut, C.C. Zou, Defense against Sybil attack in vehicular ad hoc network based on roadside unit support, in MIL-COM 2009-2009 IEEE Military Communications Conference (IEEE, 2009), pp. 1–7 14. R. Shrestha, S. Djuraev, S.Y. Nam, Sybil attack detection in vehicular network based on received signal strength, in 2014 International Conference on Connected Vehicles and Expo (ICCVE) (IEEE, 2014), pp. 745–746 15. J. Grover, G. Manoj Singh, L. Vijay, P. Nitesh Kumar, A Sybil attack detection approach using neighboring vehicles in VANET, in Proceedings of the 4th International Conference on Security of Information and Networks (2011), pp. 151–158 16. M. Ayaida, N. Messai, S. Najeh, K.B. Ndjore, A macroscopic traffic model-based approach for Sybil attack detection in VANETs, in Ad Hoc Networks, Vol. 90 (2019) 17. S. Pareek, R. 
Shanmughasundaram, Implementation of broadcasting protocol for emergency notification in vehicular ad hoc network (vanet), in 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS) (IEEE, 2018), pp. 1032–1037 18. K. Lim, T. Islam, H. Kim, J. Joung, A Sybil attack detection scheme based on ADAS sensors for vehicular networks, in 2020 IEEE 17th Annual Consumer Com- munications Networking Conference (CCNC) (IEEE, 2020), pp. 1–5 19. J. Grover, P. Nitesh Kumar, L. Vijay, G. Manoj Singh, Machine learning approach for multiple misbehavior detection in VANET, in International Conference on Advances in Computing and Communications (Springer, 2011), pp. 644–653 20. P. Gu, R. Khartoum, Y. Begriche, S. Ahmed, Support vector machine (svm) based Sybil attack detection in vehicular networks, in 2017 IEEE Wireless Communications and Networking Conference (WCNC) (IEEE, 2017), pp. 1–6 21. P. Harshita, R. Dharmendra Singh, G. Thippa Reddy, I. Celestine, K.B. Ali, J. Ohyun, A review on classification of imbalanced data for wireless sensor networks. Int. J. Distrib. Sens. Networks 16 (2020) 22. T.E. Tallo, M. Aina, The implementation of genetic algorithm in SMOTE (Synthetic Minority Oversampling Technique) for handling im- balanced dataset problem, in 2018 4th International Conference on Science and Technology (ICST) (IEEE, 2018), pp. 1–4 23. L. Chen, D. Ping, S. Wei, Z. Yan, Improving classification of imbalanced datasets based on km++ smote algorithm, in 2019 2nd International Conference on Safety Produce Informatization (IICSPI) (IEEE, 2019), pp. 300–306 24. C. Srinilta, K. Sivakorn, Application of natural neighborbased algorithm on oversampling SMOTE algorithms, in 2021 7th International Conference on Engineering, Applied Sciences and Technology (ICEAST) (IEEE, 2021), pp. 217–220 25. R. Das, S.Kr. Biswas, D. Devi, B. 
Sarma, An oversampling technique by ıntegrating reverse nearest neighbor in SMOTE: reverse-SMOTE, in 2020 International Conference on Smart Electronics and Communication (ICOSEC) (IEEE, 2020), pp. 1239–1244
Detecting Sybil Node in Intelligent Transport System
607
26. P.A. Lopez, B. Michael, B.W. Laura, E. Jakob, F. Yun Pang, H. Robert, L. Leonhard, R. Johannes, W. Peter, W. Evamarie, Microscopic traffic simulation using sumo. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 2575–2582 (IEEE, 2018) 27. N.R. Sujit, C. Santhosh Kumar, C.B. Rajesh, Improving the performance of cardiac abnormality detection from PCG signal, in AIP Conference Proceedings, vol. 1715, no. 1 (AIP, 2016), p. 020053
Applying Lowe’s “Small-System” Result to Prove the Security of UPI Protocols Sreekanth Malladi
Abstract Unified Payment Interface (UPI) is a system in India to make electronic payments using mobile phones. Here, a group of UPI servers acts as intermediaries between customers, merchants, and banks. Customers can register themselves to UPI servers using registration protocols and make payments using payment protocols. Though used by millions of people all over the country, the security of these protocols was never proven formally. In this paper, Lowe’s classical “small-system” result is applied to prove that two of the three UPI registration protocols are indeed secure in the presence of an unbounded number of participants. Keywords Cryptographic protocols · Completeness results · Financial security · Formal methods · Model checking · UPI
1 Introduction Electronic transactions in India between customers having bank accounts initially started as peer-to-peer transactions using protocols such as RTGS, NEFT, and IMPS. However, the National Payments Corporation of India (NPCI), with support from the Government of India, has introduced the Unified Payments Interface (UPI), an approach that streamlines transactions by employing intermediaries between banks, customers, and merchants. Typically, both customers and merchants register themselves with UPI servers that are placed all over the country. When a customer wishes to pay a merchant, the customer sends the payment information to the nearby UPI server, which in turn forwards it to the recipient’s bank. This process obviates the need for the customer to enter all the details of the merchant’s bank account each time a fresh transaction is initiated. Instead, it is only necessary to include the unique UPI ID of the merchant or the recipient.
S. Malladi (B) Department of Computer Science and Engineering, CMR Institute of Technology, Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_44
UPI has become very popular in India, with millions of registered users transferring billions of rupees every day [1, 2]. Two versions of UPI have been released so far, namely UPI 1.0 and UPI 2.0. UPI 1.0 had three registration protocols: the default protocol, alternate protocol I, and alternate protocol II. The default protocol is used under normal circumstances, when the user’s phone is able to send SMS messages to the server so that the server may retrieve the user’s phone number from them. Alternate protocol I is used when the default protocol does not work, either because the SMS fails to reach the server or because it could not be decoded properly. In that case, the user enters the phone number manually and sends it to the server. Alternate protocol II is used when the user wishes to change the phone number in a UPI account that has already been registered. The default protocol and alternate protocol II were retained in UPI 2.0, but alternate protocol I was not, due to a weakness in that protocol. These protocols are similar to cryptographic protocols such as SSL/TLS or Kerberos, except for a few features that are new due to the use of both the cellular network and the Internet to send and receive messages. Such features were not included even in other protocols for mobile transactions such as [3]. Hence, UPI protocols have to be analyzed slightly differently. In [4], Kumar et al. analyzed these protocols and found an attack on alternate protocol I. However, their analysis was informal and focused purely on the practical implementation of the protocols. In particular, they did not consider attacks that might exist when multiple parties run those protocols in parallel. This problem was partly addressed in [5], which outlined an approach for formally analyzing multiple runs of the protocols. Furthermore, this approach has recently been implemented using the constraint solver tool under several scenarios with multiple agents running the protocols simultaneously. One known attack on alternate protocol I was found, but none were found on the remaining two protocols. However, this does not mean that they are secure. To quote Edsger Dijkstra, “Program testing can be used to show the presence of bugs, but never to show their absence!”
This is the problem considered in this paper: Lowe’s classical “small-system” completeness result for protocol security [6] is applied to the UPI registration protocols to prove that the default protocol and alternate protocol II remain secure even when multiple participants run the protocols simultaneously (in parallel). This result should help boost the confidence of the millions of users in India who use UPI every day for transactions involving billions of rupees. Organization. In Sect. 2, we explain the UPI protocols in detail. In Sect. 3, we present Lowe’s completeness result, including its assumptions and restrictions. In Sect. 4, we explain the application of Lowe’s result to the UPI protocols to conclude their security. In Sect. 5, we sum up with a discussion of future work.
2 Background Electronic payments in India before UPI were made directly between payer and payee following protocols such as IMPS, RTGS, and NEFT. The payer would enter the details of the payee in the web or mobile application, including the account number of the payee, the IFSC code of the bank, and the amount. The information is sent to the payer’s bank, which then forwards it to the payee’s bank. The payee’s bank finally sends a message to the payee that the money has been received into his/her account. This process is cumbersome. To simplify it, the Government of India, with the help of the National Payments Corporation of India (NPCI), introduced the Unified Payments Interface (UPI). UPI works by placing servers all over the country that act as intermediaries between customers, merchants, and banks. Customers and merchants can register themselves with UPI using applications such as Paytm, PhonePe, GPay, and BHIM by following registration protocols. They can also use the same applications to make payments using payment protocols. Both kinds of protocols are executed by encrypting all the messages with a key that was exchanged using SSL. The details of the protocols are below, starting with the default protocol (Fig. 1): In this protocol, the customer sends her device details in the first message, such as the make and model of the phone or device with which she would conduct transactions. She receives a registration token (a hash of the device details and a nonce) in the second message from the server, which she sends back in the third message as an SMS. The server tries to extract the phone number from the SMS and sends a confirmation in the fourth message if it was successful. In the fifth message, the customer checks with the server whether the device was registered successfully, to which the server responds in the sixth message. The customer then selects a passcode and sends a hash of it along with the phone number in the seventh message. The server stores both of them and responds in the eighth message with a login token to confirm that the profile has been set up in the server. The server also contacts all national banks, seeking
Fig. 1 UPI default protocol
1. A→B : Device Details
2. B→A : Registration Token, RA
3. A→B : RA as SMS
4. B→A : Conf Msg 3 Recd
5. A→B : Is my device registered?, RA
6. B→A : You, your device verified
7. A→B : h(passcode), PhNo
8. B→A : Login Token (confirms profile setup)
9. A→B : Selected BankID
10. B→A : Bank account details
information on all accounts held by the customer under his/her phone number. After receiving information from the banks, the server sends the customer a list of the banks in which he/she has accounts. In the ninth message, the customer selects a bank and transmits the ID of the bank to the server. The server then sends (masked) details of the accounts held by the customer in the selected bank. If message 3 in the default protocol does not reach the server properly or is corrupted, then alternate protocol I is executed, wherein a user may manually enter the phone number (Fig. 2). In this protocol, the server does not receive the SMS from the user in message 3. Hence, it informs the user of this in message 4. The user then manually enters the phone number, which is sent in message 5 along with the device hard binding request and the registration token. To confirm that the user has indeed entered the correct phone number, the server sends an OTP (One-Time Password) to the phone number. The user’s device is supposed to send that OTP along with a hash of the user’s chosen passcode and the phone number in message 7. The rest of the protocol is the same as the default protocol. If a user wants to change his device, then the following alternate protocol II is executed (Fig. 3). In this protocol, when the user sends the request to hard bind his device in message 5, the server responds in message 6 that an account already exists with the same phone number but with another device registered. The user is then supposed to respond with the hash of the passcode along with the new phone number, after which the device details are updated. The rest of the protocol is the same as the default protocol.
Fig. 2 UPI alternate protocol I
1. A→B : Device Details
2. B→A : Registration Token, RA
3. A→B : RA as SMS
4. B→A : SMS Not Received
5. A→B : Device hard binding req, PhNo, RA
6. B→A : Verification Status, OTP, CustID, RA
7. A→B : OTP, h(passcode), PhNo
Fig. 3 UPI alternate protocol II
1. A→B : Device Details
2. B→A : Registration Token, RA
3. A→B : RA as SMS
4. B→A : Conf Msg 3 Recd
5. A→B : Device hard binding req SMS, RA
6. B→A : Account already exists
7. A→B : h(passcode), PhNo
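The message flow of the default protocol can be written down as plain data, which makes it easier to check properties such as strict alternation between customer and server. The following is only a minimal sketch: the ten steps are transcribed from Fig. 1, while the names DEFAULT_PROTOCOL and registration_token, the use of SHA-256, and the simple concatenation of device details and nonce are illustrative assumptions and not part of the UPI specification.

```python
import hashlib
import secrets

# Message flow of the default registration protocol, transcribed from Fig. 1.
# "A" is the customer's device and "B" is the UPI server.
DEFAULT_PROTOCOL = [
    ("A", "B", "Device Details"),
    ("B", "A", "Registration Token, RA"),
    ("A", "B", "RA as SMS"),
    ("B", "A", "Conf Msg 3 Recd"),
    ("A", "B", "Is my device registered?, RA"),
    ("B", "A", "You, your device verified"),
    ("A", "B", "h(passcode), PhNo"),
    ("B", "A", "Login Token (confirms profile setup)"),
    ("A", "B", "Selected BankID"),
    ("B", "A", "Bank account details"),
]


def registration_token(device_details: str, nonce: bytes) -> str:
    """Hash of the device details and a nonce.

    SHA-256 and plain concatenation are assumptions made for this sketch.
    """
    return hashlib.sha256(device_details.encode("utf-8") + nonce).hexdigest()


def format_run(protocol):
    """Render the protocol in the usual 'n. X -> Y : payload' notation."""
    return [
        f"{step}. {sender} -> {receiver} : {payload}"
        for step, (sender, receiver, payload) in enumerate(protocol, start=1)
    ]


if __name__ == "__main__":
    for line in format_run(DEFAULT_PROTOCOL):
        print(line)
    # Hypothetical device description, used only to exercise the helper.
    print(registration_token("make=ExamplePhone;model=X1", secrets.token_bytes(16)))
```

The alternate protocols of Figs. 2 and 3 could be encoded in the same way by replacing the payloads of messages 4 to 7.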
As can be seen, there are certain features in these protocols that are unconventional. In particular, protocol messages are sent both on the Internet (using mobile data) and as SMS messages. Further, the phone number is not sent as plain-text, but is supposed to be derived from the SMS sent by the phone. A Dolev-Yao (DY) attacker is not powerful enough to capture all the possible actions by an attacker in this environment. For instance, an OTP sent as an SMS cannot be sniffed from the data connection. These features make it hard to apply existing protocol analysis techniques directly on these protocols. Hence, we have made some changes so that the protocols can be described in the conventional style of cryptographic protocols and also analyzed with existing tools without requiring any modifications to the tools themselves. In particular, we model SMS as private-key signatures. Also, even though OTPs are sent over the phone network, since malware can always capture those and replay them, we consider them as being sent over the regular Internet only. We believe that both these modifications are fail-safe in the sense that any attacks that may exist with separate channels of communication will also exist in the simplified model with only one channel.
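To make the single-channel simplification concrete, the sketch below models messages as symbolic terms and computes what a Dolev-Yao style attacker can derive. It is an illustration under stated assumptions, not the model used by the constraint solver tool: the tuple encoding of terms, the constructor names "pair", "hash", and "sig", and the key name "sk_A" are all hypothetical. The SMS of message 3 is represented as a signature under the user’s private key, following the modelling choice described above.

```python
def analyze(knowledge):
    """Close a set of terms under decomposition.

    Pairs can be projected and the content of a signature can be read.
    Hashes are one-way, so they are never decomposed.
    """
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            parts = ()
            if isinstance(term, tuple) and term[0] == "pair":
                parts = term[1:]
            elif isinstance(term, tuple) and term[0] == "sig":
                parts = (term[2],)  # signatures do not hide the signed message
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return known


def can_derive(term, knowledge):
    """Check whether the attacker can build `term` from `knowledge`."""
    known = analyze(knowledge)

    def synth(t):
        if t in known:
            return True
        if isinstance(t, tuple):
            if t[0] == "pair":
                return all(synth(x) for x in t[1:])
            if t[0] == "hash":
                return synth(t[1])
            if t[0] == "sig":
                # Signing requires the private key as well as the message.
                return synth(t[1]) and synth(t[2])
        return False

    return synth(term)


# RA = h(DD, NA); message 3 is RA signed with the user's private key.
r_a = ("hash", ("pair", "DD", "NA"))
sms_msg = ("sig", "sk_A", r_a)
attacker = {"DD", "PhNo", r_a}

print(can_derive(sms_msg, attacker))              # False: sk_A is unknown
print(can_derive(sms_msg, attacker | {sms_msg}))  # True: an observed SMS can be replayed
print(can_derive("NA", attacker))                 # False: the hash is not invertible
```

The second call reflects the point made in the text about captured OTPs and SMS messages: once a value has been observed, it can be replayed on the single modelled channel even though it cannot be forged.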
3 Lowe’s Small-System Result We will now recall the main concepts behind Lowe’s classical completeness result for protocol security, also called the “small-system” result in [6]. The result basically states that, under certain assumptions and restrictions on protocol design, if a protocol is secure when exactly one agent plays each role in the protocol, then it is secure when any number of agents play the roles of the protocol. The main assumptions in Lowe’s result are as follows:
Free message algebra. Messages of the protocol are not constructed using operators with algebraic properties such as Exclusive-OR. This rules out equations such as a = a ⊕ b ⊕ b, in which two syntactically different messages are equal in practice.
No long-term shared-keys inside messages. Long-term shared-keys are never included in messages of the protocol (i.e., in readable positions). They can only be used for encryption.
Acceptance of values for variables. Honest agents executing protocols will accept any values for variables in messages that they receive, if they did not already introduce a value for those variables. This rules out protocols that use fields such as time-stamps, which agents need to check for validity.
No blind-forwarding. Agents must be able to fully decrypt every message. In particular, they should not accept encryptions that they cannot decrypt and forward them to other agents.
No temporary secrets. The intruder always possesses all the values used by agents in place of non-secret variables of the protocol.
No type-flaws in messages. Agents do not accept values of one type in place of variables of another type. This can be ensured by using constant fields inside encryptions that clearly identify the type of those fields. This technique was proven to prevent type-flaw attacks in [7, 8].
In addition to the above assumptions, the following restrictions must be observed during protocol design:
1. Identities inferrable: Every role of the protocol must exist somewhere as a free variable inside every encryption;
2. Textually distinct encryptions: Encryptions and hashes must be textually distinct.
Note that Lowe’s protocol model did not include hash functions. However, as stated in [6], it is straightforward to extend the result to include those as well. Note also that though Lowe’s result applies when security of protocols is defined as security against the property of secrecy, it actually applies to some flavors of authentication as well, since secrecy defined in [6] includes authenticated secrecy, which is broader than some others (e.g., [9]).
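The free message algebra assumption can be illustrated with a few lines of code. The sketch below is only an illustration with arbitrary byte values: it shows that the terms a and a ⊕ b ⊕ b are syntactically different yet evaluate to the same message, which is exactly the kind of equation the assumption excludes. The tuple encoding of terms and the helper name xor are assumptions made for this example.

```python
def xor(x: bytes, y: bytes) -> bytes:
    """Bytewise Exclusive-OR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(x, y))


# Arbitrary example values for the atoms a and b.
a = b"\x13\x37"
b = b"\xca\xfe"

# Symbolic view: the two terms are different syntax trees.
term_plain = ("a",)
term_xored = ("xor", ("xor", "a", "b"), "b")
print(term_plain == term_xored)   # False

# Concrete view: both terms denote the same bit string.
print(xor(xor(a, b), b) == a)     # True
```

In a free algebra, two messages are equal only if they are built in the same way, so a symbolic analysis that compares syntax trees is sound. With Exclusive-OR this no longer holds, which is why such operators are outside the scope of the result.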
4 Applying Lowe’s Result to UPI Protocols We will now apply Lowe’s result to UPI protocols. In order to apply the result, we need to first check whether there is a small-system attack on each of the protocols (i.e., an attack in a single run when all participants are honest). It turns out that alternate protocol I does in fact have an attack on a small system of the protocol [4, 5]. Hence, we cannot apply Lowe’s result to that protocol. Using the constraint solver tool [10], we have formally verified that there is no small-system attack on the other two protocols, namely the default protocol and alternate protocol II. Hence, they are candidates for applying the result. We will first reason as to why all the required assumptions hold:
Free message algebra. Neither the default protocol nor alternate protocol II employs operators with algebraic properties to construct messages. The only operators are concatenation, hash, and asymmetric encryption, all of which are free.
No long-term shared-keys inside messages. The only long-term key used in the protocols is an asymmetric key, namely the private key of the user. Even this does not exist in the original protocols. It replaces the SMS in message 3 only in our modified protocols.
Acceptance of values for variables. This assumption is actually violated in messages 8, 9, and 10 of the UPI protocols, since some processing of the messages occurs in those steps. However, these messages are quite separate from the rest of the protocols. Hence, we remove them from consideration of the security of the protocols. There are some security issues with those messages from an operational point of view that were discussed in [4]. In particular, the server seems to send a list of all the bank accounts owned by a user without any authentication. However, that is a separate issue that needs to be treated independently. We will argue the security of the protocols only until message 7.
No blind-forwarding. This assumption is not violated. The only steps that resemble forwarding are messages 2 and 3. However, the value involved is only a hash (RA). Furthermore, it is not forwarded to another agent but only sent back to the server.
No temporary secrets. The intruder always possesses all the values used by agents in place of non-secret variables of the protocols. The secret variables in the protocol are NA and passcode. The non-secret variables are Device_Details and PhNo. It is reasonable to assume that the attacker would possess the values that a user substitutes for these.
No type-flaws in messages. There is only one encrypted message in the protocol (message 3). Following the suggestion in [7], we can simply insert a distinct constant inside the encryption (or signature) to conclude that protocol runs with type-flaws will not be successful. Hence, any successful completion of the protocols will be correctly typed.
In addition to the above assumptions, the following restrictions must be observed during protocol design:
1. Identities inferrable: Every role of the protocol must exist somewhere as a free variable inside every encryption. We can satisfy this restriction by including the identity of the server inside the signature in message 3, which is the only encryption in the protocol. We could also include the identities of both the user and the server inside the hashes (RA and h(passcode));
2. Textually distinct encryptions: Encryptions and hashes must be textually distinct. We can ensure this by adding distinct tags inside the encryption and the hashes in the protocol. This has the added benefit that such tagging prevents type-flaw attacks [7], satisfying the assumption on correctly typed messages above.
After applying all the above modifications, the default protocol is as shown in Fig. 4. In Fig. 4, sms is a one-way function like hashing, except that an attacker cannot compute it. As with hash functions, it is easy to extend Lowe’s result in the presence of this function. We can similarly modify alternate protocol II and conclude that both protocols will be secure even when an unbounded number of participants execute them in parallel.
Fig. 4 UPI default protocol, modified
1. A→B : DD (Device Details)
2. B→A : RA (h(1, DD, NA))
3. A→B : sms(2, B, RA)
4. B→A : recd
5. A→B : RA
6. B→A : device verified
7. A→B : h(3, passcode), PhNo
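The effect of the tags 1, 2, and 3 in the modified protocol of Fig. 4 can be sketched in code. This is a hedged illustration only: SHA-256, the length-prefixed field encoding, and the use of HMAC as a stand-in for the sms function (a one-way function that the attacker cannot compute) are assumptions made here, as are the helper names h and sms and the sample values. The paper itself models the SMS as a private-key signature and does not prescribe any concrete primitive.

```python
import hashlib
import hmac
import secrets


def h(tag: int, *fields: bytes) -> bytes:
    """Tagged hash h(tag, f1, f2, ...).

    Each field is length-prefixed so that different field lists cannot
    collapse to the same input (an encoding chosen for this sketch).
    """
    digest = hashlib.sha256()
    digest.update(tag.to_bytes(1, "big"))
    for field in fields:
        digest.update(len(field).to_bytes(4, "big"))
        digest.update(field)
    return digest.digest()


def sms(user_secret: bytes, tag: int, server_id: bytes, token: bytes) -> bytes:
    """Stand-in for sms(2, B, RA): only the holder of user_secret can compute it."""
    message = (
        tag.to_bytes(1, "big")
        + len(server_id).to_bytes(4, "big")
        + server_id
        + token
    )
    return hmac.new(user_secret, message, hashlib.sha256).digest()


# Hypothetical values used only to walk through messages 2, 3, and 7 of Fig. 4.
device_details = b"make=ExamplePhone;model=X1"
nonce = secrets.token_bytes(16)        # NA, chosen by the server
user_secret = secrets.token_bytes(32)  # models the user's long-term key

r_a = h(1, device_details, nonce)          # message 2: RA = h(1, DD, NA)
msg3 = sms(user_secret, 2, b"B", r_a)      # message 3: sms(2, B, RA)
msg7 = (h(3, b"1234"), b"+91-0000000000")  # message 7: h(3, passcode), PhNo

# The same input hashed under different tags yields different values, so a
# hash produced for one step cannot be passed off as the hash of another step.
print(h(1, b"1234") != h(3, b"1234"))
```

Because every hash and signature carries its own tag, the terms are textually distinct in the sense of restriction 2, and a value of one type cannot be accepted in place of another.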
5 Conclusion In this paper, we have considered the registration protocols to register for the Unified Payments Interface (UPI), a digital payment system in India. In particular, we have made minor modifications to the protocols so that they fit the pattern of conventional cryptographic protocols and then applied Lowe’s small-system completeness result in [6] to conclude that two of the registration protocols can be deemed secure even when executed by an unbounded number of participants. This is the first-ever result to the best of our knowledge that presents a guarantee of the security of UPI protocols. The result should hopefully instill confidence in the security of UPI and contribute to the increase of digital transactions in the country. As future work, we plan to extend the result to UPI 2.0 protocols, analyze them with tools such as CPSA [11], Maude-NPA [12], and also verify them using tools such as ProVerif [13] and Tamarin [14]. Finally, an important direction of research is to explore defense mechanisms to ensure that the protocols are correct, perhaps along the lines of [15, 16]. However, an even better approach might be to revamp and redesign the entire operational structure of UPI from the ground up so as to improve both its security and usability.
References
1. S. Kalra, Insights into the use of UPI payment applications by management students in India. Int. J. Adv. Sci. Technol. 29(7), 4081–4091 (2020)
2. M. Bhuvaneswari, S. Kamalasaravanan, V. Kanimozhi, A study on consumer behaviour towards UPI (unified payment interface) payment application based in Nilgiris district. Int. J. Adv. Res. Ideas Innov. Technol. 7(3) (2021)
3. V. Cortier, A. Filipiak, J. Florent, S. Gharout, J. Traoré, Designing and proving an EMV-compliant payment protocol for mobile devices, in 2nd IEEE European Symposium on Security and Privacy (EuroSP’17) (2017), pp. 467–480
4. R. Kumar, S. Kishore, H. Lu, A. Prakash, Security analysis of unified payments interface and payment apps in India, in USENIX Security Symposium (2020)
5. S. Malladi, Towards automatic analysis of UPI protocols, in Proceedings of the 3rd International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV 2021) (IEEE Computer Society, 2021)
6. G. Lowe, Towards a completeness result for model checking of security protocols. J. Computer Secur. 7(2–3), 89–146 (1999)
7. J. Heather, G. Lowe, S. Schneider, How to prevent type flaw attacks on security protocols. J. Computer Secur. 11(2), 217–244 (2003)
8. S. Malladi, P. Lafourcade, How to prevent type-flaw attacks under algebraic properties, in Security and Rewriting Techniques. Affiliated to CSF09 (July 2009)
9. J. Millen, V. Shmatikov, Constraint solving for bounded-process cryptographic protocol analysis, in Proceedings of ACM Conference on Computer and Communication Security (ACM Press, 2001), pp. 166–175
10. S. Malladi, J. Millen, Adapting constraint solving to automatically analyze UPI protocols, in Joshua Guttman’s Festschrift Conference (2021)
11. J.D. Guttman, Establishing and preserving protocol security goals. J. Computer Secur. 22(2), 203–267 (2014)
12. S. Escobar, C. Meadows, J. Meseguer, Equational cryptographic reasoning in the Maude-NRL protocol analyzer. Electr. Notes Theor. Comput. Sci. 171(4), 23–36 (2007)
13. B. Blanchet, Modeling and verifying security protocols with the applied pi calculus and ProVerif. Foundations and Trends in Privacy and Security 1(1–2), 1–135 (2016)
14. S. Meier, B. Schmidt, C. Cremers, D. Basin, The TAMARIN prover for the symbolic analysis of security protocols, in Computer Aided Verification, ed. by N. Sharygina, H. Veith (Springer, Berlin, 2013), pp. 696–701
15. J.S. Manoharan, A novel user layer cloud security model based on chaotic Arnold transformation using fingerprint biometric traits. J. Innov. Image Process. (JIIP) 3(1), 36–51 (2021)
16. D. Sivaganesan, A data driven trust mechanism based on blockchain in IoT sensor networks for detection and mitigation of attacks. J. Trends Computer Sci. Smart Technol. (TCSST) 3(1), 59–69 (2021)
Spark-Based Scalable Algorithm for Link Prediction K. Saketh, N. Raja Rajeswari, M. Krishna Keerthana, and Fathimabi Shaik
Abstract Given an instance of graph data such as a social network, can we identify the connections among nodes that are likely to form over the course of time? Graph-based social networks such as Twitter, Facebook, LinkedIn, and Instagram attract heavy use because they let end-users upload and circulate content of any form (hyperlinks, images, videos), share their views and perceptions, and widen their community by making new friends and forming groups. Such networks offer end-users new friend recommendations based not only on the explicit connections that already exist in the friendship network, but also on the interests and likings that emerge from their association with the network they eventually construct. Because of the huge volumes of data acquired nowadays, flexible and extensible techniques are needed for this problem. The goal is to experiment with Big Data technology using different approaches, namely machine learning, node-based methods, and Spark CNGF, to predict links in a network of academic papers. In addition, this paper shows the difference between running Spark on a single node and on multiple nodes for different link prediction methods and identifies the best method for link prediction. Keywords Link prediction · Graphs · Online social networks · Big data · Apache spark · Machine learning · Single node · Multi-node cluster · Spark CNGF
K. Saketh (B) · N. Raja Rajeswari · M. Krishna Keerthana · F. Shaik Department of Information Technology, Velagapudi Ramakrishna Siddhartha Engineering College, Vijayawada, AP, India e-mail: [email protected] N. Raja Rajeswari e-mail: [email protected] M. Krishna Keerthana e-mail: [email protected] F. Shaik e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_45
1 Introduction Link prediction draws on statistics, network science, machine learning, and data mining, and its importance is growing rapidly across these research communities. In statistics, generative random graph models (such as the stochastic block model) provide a way to establish connections between nodes in a random graph. The data mining and machine learning communities have proposed several statistical models for predicting connections. Consider the network G = (V, E), where V represents the set of nodes (the objects in the network) and E ⊆ V × V represents the set of “real” connections between network objects. Apache Spark is a fast, general-purpose, scalable computing platform for clusters. It is designed for high availability and offers simple APIs in Java, Python, Scala, and SQL, along with a wide range of integrated libraries. It is also tightly integrated with other big data tools. It has a clear, layered architecture in which the components and layers are loosely coupled and integrated with various extensions and libraries. Spark is widely used for graph processing and speeds up data processing for machine learning tasks. Spark is reported to be up to 100 times faster than MapReduce [1]. In this paper, we mainly focus on using Apache Spark, a Big Data framework [2], to solve the link prediction problem. The experiment uses a multi-node cluster built on the concepts of the Spark architecture. This Spark multi-node cluster allows the computational power of different nodes to be combined to increase the performance of the system. The link prediction problem is approached from a machine learning point of view using Spark’s machine learning library. This technique is intended to process large graph networks and to improve the efficiency of the work.
Spark is also used to run traditional graph algorithms, including node-based and path-based methods, since it is well suited to processing graph datasets. Novel graph-based algorithms such as the capability node guidance formula (CNGF) are implemented on top of the Spark architecture to improve the quality of the output and reduce the time needed to process graph networks. Link prediction has many applications [3]. Some of them are mentioned below:
• Epidemiology.
• Planning political campaigns.
• Transportation planning.
• Recommendation systems [4].
The contributions of the paper are summarized below:
• We propose a Spark-based CNGF algorithm (SCNGF) for link prediction in a huge social network.
• We present an experimental analysis of ML algorithms and CNGF on a single node and on a multi-node cluster.
• We show that Spark-based CNGF performs better than node-based methods and traditional ML algorithms.
Spark-Based Scalable Algorithm for Link Prediction
The rest of the paper is organized as follows. Section 2 reviews the literature, Sect. 3 describes the existing and proposed methodologies, Sect. 4 describes the dataset, Sect. 5 presents the results, and Sect. 6 outlines the conclusion and future work.
2 Related Work
Link prediction plays a vital role in social network analysis, and a wide array of methods exists. Traditional link prediction methods can be categorized into node-based and path-based methods. Methods based on node information are known as node-based methods [5–8]. Node-based methods such as the Common Neighbors index, Adamic/Adar (proposed by Lada Adamic and Eytan Adar), Preferential Attachment, and Jaccard Similarity determine the similarity between any two unconnected nodes in a graph. Methods based on paths are known as path-based methods [5, 6]. Unlike node-based methods, path-based methods consider the whole network structure. Examples include the Katz status index, introduced by Leo Katz, and the shortest path between two unconnected nodes, computed with Dijkstra's algorithm. These techniques can give accurate results when the graph is very small. On a very large social graph, they become complicated to compute and may be less accurate. Paper [9] sorts nodes and represents them in matrix form, but concentrates only on the conceptual approach. Papers [5, 10] use machine learning algorithms such as the Logistic Regression classifier, the Random Forest classifier, and other techniques to deal with large-scale graphs. Such algorithms can also be used to predict the productivity of a company during the current crisis, in which employees work from home, as mentioned in paper [11] (Table 1). Paper [12] describes the theoretical approach of the Capability Node Guidance Formula, which is based on common neighbors, to find the similarity between two unconnected nodes in a graph. Papers [13, 14] discuss heuristic methods and the SEAL framework, which is based on graph neural networks.
Papers [5, 15] concentrate on Apache Spark, a big data framework for cluster computing, and deal with cluster formation and management. Our paper applies node-based methods, machine learning algorithms, and the Capability Node Guidance Formula on a multi-node cluster on top of Apache Spark to determine the most likely link between two unconnected nodes, and identifies an efficient approach to the link prediction problem in social networks.
K. Saketh et al.
Table 1 Summary and limitations of existing approaches
Paper name | Techniques used | Limitations
Link Prediction in large scale graphs using Apache Spark | Node-based methods, path-based methods, machine learning algorithms with Apache Spark, hashing such as min hashing and LSH (locality-sensitive hashing) | Did not use a cluster-based technique, which is better for link prediction; the runtime of the brute-force approach is also quite long
Link Prediction based on graph neural networks | Heuristic methods, common neighbors, Katz index, the SEAL framework | Theoretical justification is given by using the SEAL framework
New perspectives and methods in link prediction | Supervised methods: feature extraction, classification methods, ensemble methods (random forests, bagging, sparse trees) | Only a supervised approach is discussed
The algorithm of link prediction on social network | Common neighbors, CNGF algorithm, KatzGF algorithm | No implementation details; different techniques are discussed
The Link Prediction problem for social networks | Common neighbors, Jaccard's coefficient and Adamic/Adar, Katz, low-rank approximation, clustering | The raw performance of the predictors is relatively low
3 Proposed Method
3.1 Preliminaries
The preliminary link prediction techniques include four node-based methods for finding the similarity between two nodes.
Common Neighbors index
The Common Neighbors index is a straightforward method: it counts the neighbors shared by any two unconnected nodes in a graph. The underlying idea is that two strangers are likely to be introduced to each other by a common friend. For the example graph in Fig. 1:
Degree of A = 2 (B, C)
Degree of D = 2 (B, C)
Degree of E = 1 (B)
$|\Gamma(A) \cap \Gamma(D)| = 2$
$|\Gamma(A) \cap \Gamma(E)| = 1$
$$\mathrm{score}(V_x, V_y) = |\Gamma(V_x) \cap \Gamma(V_y)| \tag{1}$$
where $\Gamma(V)$ denotes the set of neighbors of node $V$.
Fig. 1 Example graph for common neighbors
Therefore, nodes A and D are closer to each other and are more likely to be connected by a link in the future.
Jaccard Similarity
The Jaccard coefficient is a comparison measure used in various fields, such as information retrieval. It measures the probability that a neighbor of either of two unconnected nodes is a neighbor of both. Jaccard Similarity for Fig. 1 is calculated as follows:
Degree of A = 2
Degree of D = 2
Degree of E = 1
$|\Gamma(A) \cap \Gamma(D)| / |\Gamma(A) \cup \Gamma(D)| = 2/2 = 1$
$|\Gamma(A) \cap \Gamma(E)| / |\Gamma(A) \cup \Gamma(E)| = 1/2 = 0.5$
$$\mathrm{score}(V_x, V_y) = \frac{|\Gamma(V_x) \cap \Gamma(V_y)|}{|\Gamma(V_x) \cup \Gamma(V_y)|} \tag{2}$$
Here $\Gamma(V)$ is the set of neighbors of node $V$.
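As a minimal sketch of Eqs. (1) and (2), the Python snippet below computes both scores on a small undirected graph. The edge list is an assumption: it reproduces the neighbor sets quoted for A, D, and E, but the full graph of Fig. 1 cannot be recovered from the text, and the helper names are hypothetical.

```python
from collections import defaultdict

# Assumed edge list: chosen only to match the neighbor sets quoted in the text.
EDGES = [("A", "B"), ("A", "C"), ("D", "B"), ("D", "C"), ("E", "B")]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors(adj, x, y):
    """Eq. (1): number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])


def jaccard(adj, x, y):
    """Eq. (2): shared neighbors divided by all neighbors of x or y."""
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0


adj = build_adjacency(EDGES)
for pair in [("A", "D"), ("A", "E")]:
    print(pair, common_neighbors(adj, *pair), jaccard(adj, *pair))
```

On this assumed graph the pair (A, D) scores higher than (A, E) on both measures, in line with the worked values above.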
Hence, the most likely link is between nodes A and D.
Adamic/Adar
Adamic/Adar was introduced in 2003 by Lada Adamic and Eytan Adar [16] and extends the Common Neighbors index. Instead of simply counting the common neighbors of two unconnected nodes, Adamic/Adar sums the inverse logarithm of the degree of each common neighbor. Adamic/Adar for Fig. 1 is calculated as follows (logarithms are base 10):
For nodes A and D: common nodes (z) = B, C
$1/\log(\mathrm{degree}(B)) + 1/\log(\mathrm{degree}(C)) = 1/\log(4) + 1/\log(3) = 1/0.602 + 1/0.477 = 1.66 + 2.096 = 3.756$
For nodes A and E: common node (z) = B
$1/\log(\mathrm{degree}(B)) = 1/\log(4) = 1.66$
Since the Adamic/Adar score for nodes A, D is greater than that for A, E, the most likely link is between nodes A and D.
$$\mathrm{score}(V_x, V_y) = \sum_{z \in \Gamma(V_x) \cap \Gamma(V_y)} \frac{1}{\log |\Gamma(z)|} \tag{3}$$
where $|\Gamma(z)|$ is the degree of the common neighbor $z$.
Preferential attachment
Preferential attachment is a concept familiar to network scientists. It was popularized by Albert-László Barabási and Réka Albert through their work on large-scale networks. The idea is that a node with more connections is more likely to gain new ones.
$$\mathrm{score}(V_x, V_y) = |\Gamma(V_x)| \cdot |\Gamma(V_y)| \tag{4}$$
where $|\Gamma(V)|$ is the degree of node $V$.
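The two degree-based scores, Eqs. (3) and (4), can be sketched in the same way. The graph below is again an assumption: an edge B–C is added so that degree(B) = 4 and degree(C) = 3, the degrees used in the Adamic/Adar calculation, and the base-10 logarithm is chosen because it reproduces the quoted values 0.477 and 0.602. Function names are hypothetical.

```python
import math
from collections import defaultdict

# Assumed edges; B-C is added so that degree(B) = 4 and degree(C) = 3.
EDGES = [("A", "B"), ("A", "C"), ("D", "B"), ("D", "C"), ("E", "B"), ("B", "C")]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def adamic_adar(adj, x, y):
    """Eq. (3): sum of 1 / log10(degree) over the common neighbors.

    A common neighbor touches both x and y, so its degree is at least 2
    and the logarithm is strictly positive.
    """
    return sum(1.0 / math.log10(len(adj[z])) for z in adj[x] & adj[y])


def preferential_attachment(adj, x, y):
    """Eq. (4): product of the two node degrees."""
    return len(adj[x]) * len(adj[y])


adj = build_adjacency(EDGES)
for pair in [("A", "D"), ("A", "E")]:
    print(pair, adamic_adar(adj, *pair), preferential_attachment(adj, *pair))
```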
Path-based methods such as the Katz status index are considerably more complicated than CNGF and do not scale well; therefore, they are not suitable for big data technologies.
3.2 Proposed Method
This paper concentrates on node-based methods, the CNGF algorithm, and ML algorithms for link prediction using Spark [17]. We propose a Spark-based CNGF algorithm. This section describes the Spark architecture and the proposed architecture.
Apache Spark Architecture
Spark is tightly integrated with other big data tools. Specifically, Spark can run on a Hadoop cluster and access any of its data sources: files stored in the Hadoop Distributed File System (HDFS) or in any other storage system that supports the Hadoop API (including the local file system), Amazon S3, Cassandra, Hive, HBase, etc. Apache Spark follows a master–slave architecture with master daemons and a cluster manager. Figure 2 shows the basic structure of the Spark architecture.
Fig. 2 Spark architecture diagram
Role of the driver
The driver is the central entry point of the Spark shell. The driver program runs the main function of the application and creates a SparkContext. It contains various components that are responsible for converting user code into Spark jobs that run on the cluster.
Role of the executor
The executor is a distributed agent in charge of performing tasks. Every Spark application has its own executor processes, which generally run for the whole life cycle of the application. Users can also choose dynamic allocation, in which executors are added or removed dynamically to match the overall workload.
Role of the cluster manager
The cluster manager is an external service responsible for acquiring resources on the Spark cluster and assigning them to Spark jobs. Spark applications can use three types of cluster managers to allocate and share physical resources: a simple standalone Spark cluster manager that can run locally or in the cloud, Hadoop YARN, and Apache Mesos.
Proposed Architecture
The initial step is to load and explore the graph dataset. Because we use Spark together with Hadoop, the dataset is stored in HDFS for faster access and processing. The feature engineering step takes the scores of the traditional link prediction techniques as input features for machine learning model development. All machine learning algorithms are run with the Apache Spark framework, which provides the SparkML library. Along with machine learning, novel graph algorithms such as the Capability Node Guidance Formula (CNGF) are also implemented using Spark to improve the quality of the output and the performance of the algorithm.
The Spark architecture, following the master–slave concept, manages the submitted Spark application and processes its jobs. The proposed methods are compared using the evaluation metrics, and the results are analyzed to identify the best and most efficient approach to the link prediction problem (Fig. 3).
Capability Node Guidance Formula (CNGF) using Spark
Input: A graph G = (V, E), node x, node y.
Output: Similarity value of node x and node y.
Algorithm description:
(1) Find all the common neighbors of the node pair x and y and store them in a common neighbor set.
(2) Extract the sub-graph that contains the tested node pair and their common neighbors.
(3) While (the common neighbor set is not empty) {
(4) Take a node a from the common neighbor set and calculate its degree, adegree.
(5) Calculate the degree of node a in the sub-graph extracted in Step 2, a1degree.
(6) Calculate the guidance capability of node a, Guid(a) = a1degree/log(adegree).
(7) Update the similarity of node x and node y: Similarityxy += Guid(a). }
Fig. 3 Architecture diagram for the proposed methodology
Fig. 4 Example Graph and extracted graph for CNGF
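The steps above can be sketched in plain Python, without Spark, to make the arithmetic concrete. This is an illustration under stated assumptions, not the authors' Spark implementation: the graph is an adjacency dictionary, the function name `cngf_similarity` and the toy graph are hypothetical, and the natural logarithm is assumed because it is consistent with values such as 1.442695 (equal to 2/ln 4) reported in Table 3.

```python
import math


def cngf_similarity(adj, x, y):
    """CNGF similarity of nodes x and y on an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    # Step 1: common neighbors of x and y.
    common = adj[x] & adj[y]
    # Step 2: node set of the sub-graph (the pair plus its common neighbors).
    sub_nodes = common | {x, y}
    similarity = 0.0
    # Steps 3-7: accumulate the guidance capability of each common neighbor.
    for a in common:
        degree = len(adj[a])                  # degree in the full graph (>= 2)
        sub_degree = len(adj[a] & sub_nodes)  # degree inside the sub-graph
        similarity += sub_degree / math.log(degree)  # natural log: assumption
    return similarity


# Hypothetical toy graph: node 3 is the only common neighbor of 1 and 2.
toy = {1: {3}, 2: {3}, 3: {1, 2, 4, 5}, 4: {3}, 5: {3}}
print(cngf_similarity(toy, 1, 2))
```

Here node 3 has degree 4 in the full graph and degree 2 in the sub-graph, so the similarity is 2/ln 4. A common neighbor always has degree at least 2, so the logarithm never reaches zero.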
The proposed algorithm is implemented on the Spark architecture to take advantage of its benefits. Processing graph datasets [18] is a laborious and time-consuming task, and Apache Spark can reduce the time required. CNGF is a novel graph algorithm compared to traditional link prediction methods, and running it on Spark increases the efficiency and computation power of the system. The algorithm is built on a Spark session in a multi-node cluster environment. The Spark application reads the graph dataset from HDFS, and all node processing, including extracting sub-graphs and finding the degrees of common neighbors, is done on top of the Spark framework. Using Spark increases the computation capacity and thereby reduces the execution time. Spark also supports libraries such as GraphFrames that can process graph datasets. Because resources are shared and managed efficiently in the Spark architecture, the CNGF algorithm solves the link prediction problem efficiently (Fig. 4).
4 Experimental Results
4.1 Experimental Data
To understand our method and the way we work with the data, we first describe its format and representation. The task is modeled as a graph in which the nodes (vertices) are papers with their attributes, and an edge (link) exists between two nodes if one of the two papers cites the other, irrespective of direction. Table 2 lists the attributes of each paper with a brief description [5].
Table 2 Dataset
Attribute | Description
node_id | A unique id for the paper
Year | The publication year of the paper
Title | The title of the paper
Authors | The authors of the paper
Journal | The journal of the paper
Abstract | The abstract of the paper
Each paper has six attributes that can be used in the feature engineering process. The paper data is stored as a CSV (comma-separated values) file in which each row represents a node. If there are multiple authors, they are separated by commas. A second file, the edge list, is in .txt format. It contains two columns, each holding a node id; each row indicates that there is a link between the two papers denoted by those node_ids. The ids are tab-separated, or can be customized to be space-separated. This file is essential and is treated as the ground truth.
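A minimal sketch of reading the two files with the Python standard library is shown below. The sample rows, the presence of a header line in the CSV, the lowercase column names, and the function names are all assumptions made for illustration; only the six attributes and the tab-separated edge list come from the description above.

```python
import csv
import io

# Hypothetical sample data in the described layout (real files would be opened from disk).
NODES_CSV = '''node_id,year,title,authors,journal,abstract
1000,2001,"Graph mining","A. Author, B. Author",J. Graphs,"Short abstract"
1006,2003,"Link analysis","C. Author",J. Links,"Another abstract"
'''
EDGES_TXT = "1000\t1006\n"


def read_nodes(handle):
    """Return {node_id: row}, one entry per paper (assumes a header row)."""
    return {int(row["node_id"]): row for row in csv.DictReader(handle)}


def read_edges(handle, sep="\t"):
    """Return the ground-truth links as a set of unordered node-id pairs."""
    edges = set()
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        u, v = line.split(sep)
        edges.add(frozenset((int(u), int(v))))
    return edges


nodes = read_nodes(io.StringIO(NODES_CSV))
edges = read_edges(io.StringIO(EDGES_TXT))
print(nodes[1000]["authors"].split(", "), frozenset((1006, 1000)) in edges)
```

Storing each edge as a `frozenset` reflects the statement that links are treated irrespective of direction; passing `sep=" "` covers the space-separated variant.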
4.2 Experimental Setup
The operating system is Ubuntu 19, with all the required software installed: Hadoop, Apache Spark, Java, Python, Scala, SSH, etc. The whole application was built with Apache Spark 2.7. The Java JDK version is 1.8, also known as Java 8. Graph datasets are processed with the additional GraphFrames package, which is available for Spark; the version used is graphframes:0.5.0-spark2.1-s_2.11. The Spark multi-node cluster is created by configuring all the necessary parameters on both the master and the slave nodes. After successful creation, the cluster is started with the start-all.sh command. The cluster can be verified with the jps command, which displays the daemons running on the master and on the slaves, as shown in Fig. 5.
4.3 Execution Process
A Spark application is submitted with the spark-submit command, which starts the application on the Apache Spark framework. The command takes further parameters or arguments such as the file path, file name, required packages, deploy mode, and other job-specific Spark configurations. Because the aim is to deploy the link prediction application on the multi-node cluster, the deploy mode passed to spark-submit is cluster. This submits the job to Spark, and the job starts executing on the workers managed by the master, with data read from HDFS. To run the Spark application in cluster mode, we also add parameters to the Spark session through .master() and .config():
val spark = SparkSession.builder().appName("SparkApp").master("").config("spark.submit.deployMode", "cluster")
Spark submit command:
spark-submit --packages graphframes:graphframes:0.5.0-spark2.1-s_2.11 cngf.py "example.txt" " "
Fig. 5 Daemons on the master node and worker node
5 Result and Analysis
5.1 Results
Node-based methods without using Spark
Node-based methods such as the Common Neighbors index and Preferential Attachment were first implemented without Spark. These algorithms can also be implemented with Spark, which reduces the execution time and increases the computation speed and capacity. The output of the algorithms run without Spark is shown in Fig. 6.
Node-based methods using Spark
Adamic/Adar
Adamic/Adar sums the inverse logarithm of the degree of each common neighbor of two unconnected nodes in a graph. Figure 7 shows unconnected source and destination nodes together with their Adamic/Adar scores.
Jaccard Similarity
Jaccard Similarity measures the fraction of the neighbors of two unconnected nodes that are common to both. In Fig. 7, the Jaccard coefficient is calculated for pairs of unconnected nodes, shown as Source_node and Destination_node.
Fig. 6 Output for common neighbors and preferential attachment without using Spark
Fig. 7 Output for Adamic/Adar and Jaccard similarity using spark
Common Neighbors index Common Neighbors index calculates the number of common neighbors between two unconnected nodes in a graph. Figure 8 shows the common neighbors index of every two unconnected nodes which are represented as Source_node and Destination_node.
Fig. 8 Output for common neighbors index and preferential attachment using spark
Preferential attachment
Preferential attachment is based on the number of connections of each node: a node with more connections is more likely to form future links. Figure 8 shows the Preferential Attachment score of pairs of unconnected nodes, shown as source and destination nodes, respectively.
CNGF Algorithm
The similarity-based CNGF algorithm is run for every pair of unconnected nodes in the graph. Its output is the similarity score between the source node and each of the other destination nodes; in other words, it scores the possible links between the nodes in the graph. The results are displayed on the command line or the terminal. The similarity of unconnected nodes is shown in Table 3. Because the similarity between nodes 1000 and 1006 is larger than the others, the best possible link is between nodes 1000 and 1006. In the same way, the similarity is calculated for each node with all its unconnected nodes, and the pair with the highest similarity becomes the most probable link.
Table 3 Similarities given by CNGF algorithm
Node with possible connection to the node 1006 | The similarity of the nodes with the node 1006
1000 | 6.398526504755877
1012 | 2.485339738238447
1010 | 2.270466553858725
1011 | 2.270466553858725
1018 | 1.442695040888963
1016 | 1.442695040888963
1015 | 1.442695040888963
1017 | 1.442695040888963
1014 | 1.242669869119223
1013 | 1.242669869119223
1009 | 1.027796684739501
1008 | 1.027796684739501
1007 | 1.027796684739501
ML Classifiers
Accuracy and F1 score, the evaluation metrics of the machine learning algorithms, are calculated to assess the efficiency of the model. These metrics are displayed on the terminal as shown in Fig. 9. Figure 9 also shows the difference in execution time for the machine learning algorithms when executed on a single node and on a multi-node cluster of the Apache Spark framework. The time taken for execution on the multi-node cluster is lower than on a single node.
Fig. 9 The output of ML algorithm on a multi-node and single node cluster
5.2 Observations from the Work
Link prediction is demonstrated on the paper citation dataset using three approaches: node-based methods, machine learning algorithms, and the CNGF algorithm. The CNGF algorithm gives the best link prediction compared to node-based methods and traditional machine learning algorithms, because it extracts a sub-graph, which yields a better prediction score (similarity value). The execution times of the algorithms with and without Spark differ greatly. Without Spark, the execution time for the node-based algorithms was 3245 s, whereas on the Spark multi-node cluster it was 154 s. From this comparison, we conclude that Spark provides extra computing power and speed and reduces execution time.
Comparison of the Machine Learning Classifiers
The fastest classifier was Logistic Regression and the slowest was Naïve Bayes, as shown in Table 4. Although the execution time of the Logistic Regression classifier is the lowest, its accuracy (79%) is also the lowest. The classifier that combined a short execution time with the best accuracy is Random Forest, with an execution time of 154 s on the multi-node cluster and an accuracy of 94%. This makes the Random Forest classifier the best machine learning model for predicting links in the Co-Authorship Network, as shown in Fig. 10. Comparing Random Forest and CNGF, CNGF is more efficient because of its sub-graph extraction, which is a vital feature.
Table 4 Accuracy & execution time per classifier
Classifier | Accuracy (%) | Execution time (s), single node | Execution time (s), multi-node | Comparison with the best classifier (Random forest)
Logistic regression | 79 | 5489.93 | 148.42 | Random forest can discover more complex dependencies
Naïve Bayes | 82 | 6215.74 | 165.32 | Random forest produces a more precise classifier
Decision tree | 88 | 6017.89 | 160.45 | Random forest is a combination of multiple decision trees
Random forest | 94 | 5528.31 | 154.16 | Best classifier
[Figure: bar chart titled "Accuracy & F1 scores"; y-axis: scores from 0 to 1; bars for Accuracy and F1 Score for Logistic Regression, Naïve Bayes, Decision Tree, and Random Forest]
Fig. 10 Bar chart of accuracy & F1 score per classifier
Comparison of CNGF on Spark Cluster
Increasing the number of nodes in the cluster increases the computation capacity and reduces the execution time. Table 5 is obtained by keeping the size of the dataset constant and increasing the number of nodes. Each node has 4 cores and 8 GB of memory. We used a cluster of 5 nodes, of which 4 are workers and 1 is the master. Based on these configurations and constraints, the execution time of the CNGF algorithm on the cluster is given in Table 5.
Table 5 Execution time based on node characteristics
Nodes | Cores | Memory (GB) | Execution time (s)
1 | 4 | 8 | 15,172
2 | 8 | 16 | 10,542
3 | 12 | 24 | 7564
4 | 16 | 32 | 5642
5 | 20 | 40 | 4125
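To make the scaling in Table 5 easier to read, the short Python sketch below derives the speedup relative to one node and the parallel efficiency (speedup divided by node count). The execution times are taken from Table 5; the two derived quantities and the function name are our own illustration and are not reported by the authors.

```python
# Execution times of CNGF from Table 5 (seconds), keyed by number of nodes.
TIMES = {1: 15172, 2: 10542, 3: 7564, 4: 5642, 5: 4125}


def speedup_table(times):
    """Return (nodes, speedup, efficiency) rows relative to the 1-node run."""
    base = times[1]
    rows = []
    for nodes in sorted(times):
        speedup = base / times[nodes]
        rows.append((nodes, speedup, speedup / nodes))
    return rows


rows = speedup_table(TIMES)
for nodes, speedup, efficiency in rows:
    print(f"{nodes} nodes: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

On these figures, five nodes run the algorithm about 3.68 times faster than one node, so the gain is sub-linear in the number of nodes.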
6 Conclusion and Future Work
Consider a database of a million documents and a new document given as input; the task is to find which of the million documents is most similar to the input. Here, the aim is to predict novel links between documents based on similarity thresholds. The proposed technique works efficiently because it extracts a sub-graph to compute the similarity, which is an efficient way to obtain better results. This similarity metric is then used to predict links. Node-based methods, which are part of the link prediction techniques, and traditional machine learning algorithms were also tested, but node guidance methods such as CNGF gave better results owing to sub-graph extraction. Future research will address the link prediction problem from a different perspective by re-examining a similar network with cluster-based techniques. Such a method groups similar nodes into clusters, on the premise that nodes from the same cluster tend to exhibit similar connection patterns. More precisely, an initial threshold θ is set, and all edges of the network graph with an edge weight > θ are removed. Each connected component of the resulting graph is then considered one cluster. Two similar nodes lie in the same connected component if an edge or a path exists between them.
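The threshold-based clustering outlined above can be sketched as follows. This is only an illustration of the idea as stated in the text (edges with weight > θ are removed, and each connected component becomes a cluster); the function name, the toy weights, and the use of breadth-first search are assumptions.

```python
from collections import defaultdict, deque


def threshold_clusters(nodes, weighted_edges, theta):
    """Drop edges with weight > theta, then return the connected components."""
    adj = defaultdict(set)
    for u, v, weight in weighted_edges:
        if weight <= theta:  # edges with weight > theta are removed, as in the text
            adj[u].add(v)
            adj[v].add(u)

    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(component)
    return clusters


# Hypothetical toy input: the middle edge exceeds the threshold and is removed.
toy_edges = [("a", "b", 0.2), ("b", "c", 0.9), ("c", "d", 0.1)]
clusters = threshold_clusters(["a", "b", "c", "d"], toy_edges, theta=0.5)
print(clusters)
```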
References
1. S. Fathimabi, E. Jangam, A. Srisaila, MapReduce based heart disease prediction system, in 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom) (IEEE, 2021), pp. 281–286
2. W. Haoxiang, S. Smys, Big data analysis and perturbation using data mining algorithm. J. Soft Comput. Paradigm (JSCP) 3(01), 19–28 (2021)
3. F. Tang, Link-prediction and its application in online social networks. PhD diss., Victoria University (2017)
4. E.A. Mohamed, N. Zaki, M. Marjan, Current trends and challenges in link prediction methods in dynamic social networks: a literature
5. A. Θεοδοσίου, Distributed Link Prediction in Large Scale Graphs using Apache Spark. No. GRI-2019-24591. Aristotle University of Thessaloniki (2019)
6. D. Liben-Nowell, J. Kleinberg, The link-prediction problem for social networks. J. Am. Soc. Inform. Sci. Technol. 58(7), 1019–1031 (2007)
7. A. Zareie, R. Sakellariou, Similarity-based link prediction in social networks using latent relationships between the users. Sci. Rep. 10(1), 1–11 (2020)
8. R. Dharavath, N.S. Arora, Spark's GraphX-based link prediction for social communication using triangle counting. Soc. Netw. Anal. Min. 9(1), 1–12 (2019)
9. W. Wang, L. Wu, Y. Huang, H. Wang, R. Zhu, Link prediction based on deep convolutional neural network. Information 10(5), 172 (2019)
10. R.N. Lichtenwalter, J.T. Lussier, N.V. Chawla, New perspectives and methods in link prediction, in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 243–252 (2010)
11. A. Sungheetha, R. Sharma, A comparative machine learning study on IT sector edge nearer to working from home (WFH) contract category for improving productivity. J. Artif. Intell. 2(04), 217–225 (2020)
12. L. Dong, Y. Li, H. Yin, H. Le, M. Rui, The algorithm of link prediction on social network. Math. Probl. Eng. 2013 (2013)
13. M. Zhang, P. Li, Y. Xia, K. Wang, L. Jin, Revisiting graph neural networks for link prediction. arXiv preprint arXiv:2010.16103 (2020)
14. M. Zhang, Y. Chen, Link prediction based on graph neural networks. Adv. Neural. Inf. Process. Syst. 31, 5165–5175 (2018)
15. G. Ranganathan, Real time anomaly detection techniques using pyspark frame work. J. Artif. Intell. 2(01), 20–30 (2020)
16. L.A. Adamic, E. Adar, Friends and neighbors on the web. Soc. Netw. 25(3), 211–230 (2003)
17. M. Al Hasan, V. Chaoji, S. Salem, M. Zaki, Link prediction using supervised learning, in SDM06: Workshop on Link Analysis, Counter-Terrorism and Security, vol. 30, pp. 798–805 (2006)
18. S. Fathimabi, R.B.V. Subramanyam, D.V.L.N. Somayajulu, MSP: multiple sub-graph query processing using structure-based graph partitioning strategy and map-reduce. J. King Saud Univ.-Comput. Inf. Sci. 31(1), 22–34 (2019)
Obstacle Detection by Power Transmission Line Inspection Robot Ravipati Jhansi, P. A. Ashwin Kumar, Sai Keerthana, Sai Pavan, Revant, and Subhasri Duttagupta
Abstract
Inspection of a power transmission line (PTL) is crucial for fault detection and maintenance of the line cable, but it is a job that carries a high occupational hazard. PTL inspection robots are developed to minimize risks to humans during maintenance work on PTL cables. These robots can traverse power line cables with minimal manual intervention, provided they detect the obstacles present on the line and are able to overcome them. This paper discusses a technique that uses image texture analysis with the popular gray-level co-occurrence matrix (GLCM) algorithm for obstacle detection, followed by a standard machine learning approach to classify the obstacles. Our results indicate that this approach can be used effectively by an autonomous armed robot traversing a PTL cable.
Keywords PTL traversal robot · Machine learning · Image texture analysis · Obstacle detection · GLCM
R. Jhansi (B) · P. A. Ashwin Kumar · S. Keerthana · S. Pavan · Revant · S. Duttagupta
Department of Computer Science and Engineering, Amrita Viswa Vidyapeetham, Amritapuri, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_46
R. Jhansi et al.
1 Introduction
Autonomous robots that travel along electrical transmission lines to perform inspection and repair work can improve the efficiency of line maintenance, reduce labor costs, and are expected to reduce the risk of injury to maintenance personnel. Power transmission line (PTL) robots carry a simple camera that collects visual data, which is transferred to a workstation below. Usually, a human controller maneuvers the robot according to the obstacle encountered. In our research work, we incorporate machine learning techniques so that the robot can recognize obstacles on its own and then maneuver by itself according to the obstacles encountered. In this project, path planning is a rather easy step. However, even though the obstacles present on the line are well documented, the real challenge is to continue along the linear pathway. In the existing literature, we have not come across papers in which robots automatically detect different types of obstacles and overcome them with techniques appropriate to the object. Existing robots usually follow a single obstacle-avoidance strategy. The scope of this paper, however, is limited to the strategy for object detection and classification, which is undoubtedly the first step in designing such a robot. In the following sections, we elaborate on our strategy for transmission line object detection and classification. The paper contributes datasets for five different PTL objects and demonstrates the efficacy of a machine learning algorithm in classifying them with high accuracy. Even though the techniques used in the paper are popular in the literature, the combination of image texture analysis by the GLCM algorithm followed by a Random Forest classifier for detecting PTL objects is the unique contribution of our work.
2 Related Work
A number of research works [1, 2] address high-voltage transmission line robots, but papers using machine learning or deep learning techniques for this problem have been published only in recent years. In this section, we discuss some of the important related works. Paper [3] discusses a number of algorithms that are able to separate obstacles from the background. Classification is done using machine learning algorithms, and the authors introduce a binary classifier called OTLClassifier with an extra marker module. This classifier can identify images with foreign objects and achieves a 95% recall rate and a 10.7% error rate. Another obstacle detection method [4] is based on feature fusion and structural constraints. In this method, the regions are identified using a bounding box algorithm. For classifying and recognizing obstacles, a particle swarm-optimized support vector machine (PSO-SVM) is used, which achieves a detection accuracy of 86.2%.
Paper [5] introduces a new deep learning network called RCNN4SPTL (RCNN-based Foreign Object Detection for Securing Power Transmission Lines). This technique focuses primarily on detecting foreign objects such as kites and birds' nests on transmission lines. As in our technique, image enhancement techniques are used to extend the dataset. The technique consists of three parts: a shared convolutional neural network (SPTL-Net), a region proposal generation network (RPN), and a classification regression network. It offers faster detection and higher accuracy in identifying foreign objects on transmission lines than earlier image processing techniques. A similar object classification approach based on a deep neural network is discussed in [6]. Another method for line fault detection using R-CNN is discussed in [7]. The authors perform tests under various conditions and report accurate results. They use a drone to gather images and a CNN to distinguish transmission line objects from each other [8]. Muhamed Jishad et al. [9] introduced a robot that can travel on transmission lines and overcome obstacles on its way. The robot is controlled remotely and protects itself from colliding with objects in front of it. It has three arms and flexible joints with which it can overcome suspension clamps, spacers, and other objects present on the transmission line. A counterweight mechanism adjusts the center of mass of the robot body while it avoids or overcomes obstacles. Two orientation sensors, an accelerometer and a gyroscope, are also used. Transmission line fault detection and classification using an AI approach is addressed in [10, 11].
3 Proposed Solution
Most of the existing approaches adopted for image recognition by PTL robots use R-CNN-based solutions. In contrast, our proposed solution gathers texture details of the image and classifies them using a supervised learning technique. We use a random forest classifier to classify the gathered data.
Setup required for our analysis
Since the project is software oriented, the requirements are software, mostly Python libraries. The Python libraries used here are skimage.feature, which provides the functions required to calculate the GLCM of an image and analyze texture features, and sklearn, which provides preprocessing functions as well as the random forest classifier model.
3.1 Preparation of Datasets
For any machine learning algorithm, a training dataset is necessary. Considering the lack of datasets for PTL objects, we made a dataset by combining web search images
R. Jhansi et al.
Fig. 1 Dataset images after loading: (a) bird diverters, (b) clamp, (c) insulators, (d) spacers, (e) signalling sphere
and available datasets [12]. The prepared datasets cover five different PTL objects: insulators, bird diverters, signaling spheres, suspension clamps and line spacers. Datasets for these line objects are not readily available. Except for insulators, the objects lack a proper dataset of considerable size. Hence, the other datasets comprise manually compiled images from the web and augmented images. However, such images are scarce on the Internet as well. The collected images are resized to a resolution of 250 × 250 before being passed to the next step. The datasets are labeled by their directories and loaded accordingly, so the directory names serve as class identifiers. The images are read in grayscale to simplify the next step (Fig. 1).
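The directory-based labelling described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name label_by_directory and the folder names are hypothetical, and image decoding and resizing are left out so that the sketch needs only the standard library.

```python
from pathlib import Path
import tempfile


def label_by_directory(root):
    """Treat each subdirectory of root as one class; return paths, integer labels, class names."""
    root = Path(root)
    class_names = sorted(d.name for d in root.iterdir() if d.is_dir())
    paths, labels = [], []
    for index, name in enumerate(class_names):
        for image_path in sorted((root / name).iterdir()):
            if image_path.is_file():
                paths.append(image_path)
                labels.append(index)  # the directory position serves as the class label
    return paths, labels, class_names


# Hypothetical layout with two classes, built in a temporary folder for the demo.
with tempfile.TemporaryDirectory() as tmp:
    for name, count in [("insulators", 2), ("spacers", 1)]:
        (Path(tmp) / name).mkdir()
        for i in range(count):
            (Path(tmp) / name / f"img_{i}.png").touch()
    paths, labels, class_names = label_by_directory(tmp)
```

Sorting the directory names keeps the integer labels stable across runs, which matters when the training and test sets are loaded separately.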
3.2 Texture Analysis
This section deals with texture analysis of the objects. Since the objects we deal with have distinct shapes, we infer that texture analysis makes it possible to detect them easily. Texture analysis represents the underlying image characteristics. For texture analysis, we use gray level co-occurrence matrices (GLCM). A GLCM tabulates how often pairs of grayscale values co-occur for a patch of pixels at a given distance and angle over an image. Five features are then calculated from the co-occurrences between the pixels of the patch and the pixels at the given offset: energy, homogeneity, contrast, correlation and dissimilarity. These values are stored in an array for each image. In our model, GLCM is implemented in a feature extraction function that takes an image dataset and returns an array with the aforementioned features. This array serves as input for the classification algorithm. The training and test datasets are passed through the same function.
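To make the five features concrete, the following pure-Python sketch builds a symmetric, normalised GLCM for a single offset and derives the features from it using their usual definitions. It is an illustration only: the paper relies on skimage.feature for this step, and the function names, the 4 × 4 test image and the choice of four gray levels are assumptions made here.

```python
import math


def glcm(image, dx, dy, levels):
    """Symmetric, normalised co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0  # count the pair (i, j)
                counts[j][i] += 1.0  # and its mirror, so the matrix is symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, dissimilarity, homogeneity, energy and correlation of a normalised GLCM."""
    idx = range(len(p))
    cells = [(i, j, p[i][j]) for i in idx for j in idx]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mean_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mean_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
    spread = math.sqrt(var_i * var_j)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "energy": math.sqrt(sum(v * v for _, _, v in cells)),
        "correlation": cov / spread if spread > 0 else 1.0,
    }


# Small 4-level test image; offset (dx=1, dy=0) pairs each pixel with its right neighbour.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
P = glcm(IMAGE, dx=1, dy=0, levels=4)
features = texture_features(P)
```

For a 250 × 250 grayscale image the same computation would run over 256 gray levels (or a quantised subset), and one feature vector per image would be passed to the classifier.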
3.3 Classification Model
We use a random forest classifier to classify each image as one of the selected PTL objects based on the gathered texture details. Random forest is a technique in which a large number of individual decision trees operate independently. Each decision tree provides a class prediction, and the class with the most votes is chosen as the model prediction. Of all the models we tried, the random forest classifier gave the highest accuracy. The array returned by the feature extraction function is reshaped into an array of the form (image, feature). The training array is passed into the random forest model, and the accuracy of the model's predictions is tested.
Algorithm 1: Object detection algorithm
Result: Detect PTL objects and classify them accordingly
Declare Img_data, Img_lbl, X_for_ML;
Load dataset, directory-wise;
for Images in dataset do
  Load as grayscale image and resize to 250*250;
  Append the image and image details into Img_data and Img_lbl;
  Do it for both training and test datasets;
end
Img_lbl is converted into integers for ease of processing;
Pass Img_data into feature extractor(dataset);
X_for_ML will contain the returned dataframe;
Reshape the dataframe;
Define Random Forest model;
Pass X_for_ML as the training data;
Evaluate the model;
feature extractor(dataset)
for Image in dataset do
  Evaluate the GLCM features energy, homogeneity, contrast, correlation and dissimilarity;
  Store in a dataframe;
end
Return the dataframe;
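The voting step described above can be illustrated with a few lines of Python. The three "trees" below are hypothetical hand-written threshold rules standing in for trained decision trees, and the feature values are invented for the example; library implementations may combine the trees differently, for instance by averaging per-class probabilities.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the class predicted by the largest number of trees (hard majority vote)."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained trees: each one thresholds a single texture feature.
trees = [
    lambda s: "insulator" if s["contrast"] > 0.5 else "spacer",
    lambda s: "insulator" if s["energy"] < 0.4 else "spacer",
    lambda s: "spacer" if s["homogeneity"] > 0.9 else "insulator",
]

sample = {"contrast": 0.7, "energy": 0.6, "homogeneity": 0.5}
prediction = forest_predict(trees, sample)  # two of the three trees vote "insulator"
```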
Table 1 Classification report
Class | Precision | Recall | F1-score | Support
Bird diverter (0) | 1.00 | 0.83 | 0.91 | 372
Insulators (1) | 0.88 | 0.96 | 0.92 | 487
Signalling sphere (2) | 0.71 | 1.00 | 0.83 | 207
Spacer (3) | 1.00 | 0.80 | 0.89 | 328
Clamps (4) | 1.00 | 1.00 | 1.00 | 241
Accuracy | | | 0.91 | 1635
Macro avg. | 0.92 | 0.92 | 0.91 | 1635
Weighted avg. | 0.93 | 0.91 | 0.91 | 1635
4 Result and Analysis
This section presents the results obtained from the obstacle classification algorithm. Using the random forest algorithm, we obtain an accuracy of 90.89%. Table 1 shows the precision and recall for each object. The results indicate that texture analysis provides reliable information about the images, which makes classification more reliable. Although images of some datasets are augmented, the insulator dataset consists entirely of individual images; augmentation was not required since the size of the dataset was satisfactory. Overall, our approach achieves reliable results.
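The summary rows of Table 1 follow from the per-class values. The sketch below recomputes the F1-score, the macro average and the support-weighted average from the rounded per-class precision, recall and support reported in the table; because the inputs are already rounded to two decimals, the results agree with the table only to that precision. The variable names are chosen for this example.

```python
precision = {"bird diverter": 1.00, "insulators": 0.88, "signalling sphere": 0.71,
             "spacer": 1.00, "clamps": 1.00}
recall = {"bird diverter": 0.83, "insulators": 0.96, "signalling sphere": 1.00,
          "spacer": 0.80, "clamps": 1.00}
support = {"bird diverter": 372, "insulators": 487, "signalling sphere": 207,
           "spacer": 328, "clamps": 241}


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro(values):
    """Unweighted mean over classes."""
    return sum(values.values()) / len(values)


def weighted(values):
    """Mean over classes weighted by the number of test images per class."""
    total = sum(support.values())
    return sum(values[name] * support[name] for name in values) / total


f1_scores = {name: f1(precision[name], recall[name]) for name in precision}
macro_precision = macro(precision)
weighted_precision = weighted(precision)
weighted_recall = weighted(recall)  # equals overall accuracy up to rounding
```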
(a) Confusion matrix, (b) Normalised confusion matrix
5 Conclusion
In this paper, we discuss the first step in building a transmission line monitoring robot, namely PTL object detection. We show how the GLCM algorithm can be used to perform texture analysis, after which a suitable classification algorithm identifies the specific object. Our results show that the proposed technique classifies objects on PTL lines with high accuracy. Hence, a robot whose arms carry a camera can take images of transmission line objects, and machine learning algorithms can then identify the different objects. As future work, we intend to develop an obstacle overcoming strategy based on the shape and size of the identified object.
References
1. X. Liu, X. Miao, H. Jiang, Review of data analysis in vision inspection of power lines with an in-depth discussion of deep learning technology. Annu. Rev. Control 50, 253–277 (2020)
2. K.-H. Seok, Y.S. Kim, A state of the art of power transmission line maintenance robots. J. Electr. Eng. Technol. 11(5), 1412–1422 (2016). 1975-0102 (pISSN), 2093-7423 (eISSN)
3. F. Zhang, Y. Fan, T. Cai, W. Liu, Z. Hu, N. Wang, M. Wu, OTL-classifier: towards imaging processing for future unmanned overhead transmission line maintenance. Electronics 8, 1270 (2019)
4. X. Ye, D. Wang, D. Zhang, X. Hu, Transmission line obstacle detection based on structural constraint and feature fusion. Symmetry 12, 452 (2020)
5. W. Zhang, X. Liu, J. Yuan, L. Xu, H. Sun, J. Zhou, X. Liu, RCNN-based foreign object detection for securing power transmission lines (RCNN4SPTL). Proc. Comput. Sci. 147 (2019)
6. A. Neena, M. Geetha, Image classification using an ensemble-based deep CNN, in Recent Findings in Intelligent Computing Techniques (Springer, Singapore, 2018), pp. 445–456
7. H. Liang, C. Zuo, W. Wei, Detection and evaluation method of transmission line defects based on deep learning. IEEE Access 8, 38448–38458 (2020). https://doi.org/10.1109/ACCESS.2020.2974798
8. Z.A. Siddiqui, U. Park, A drone based transmission line components inspection system with deep learning technique. Energies 13(13), 3348 (2020)
9. T.K. Muhamed Jishad, S. Ashok, Obstacle avoidance mechanism for transmission line inspection robot, in International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT) 2019, vol. 364 (2019)
10. V. Vanitha, D. Kavitha, R. Resmi, M.S. Pranav, Fault classification and location in three phase transmission lines using discrete wavelet transform, in 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT) (IEEE, Coimbatore, India, 2019)
11. A.S. Neethu, T.S. Angel, Smart fault location and fault classification in transmission line, in 2017 IEEE International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials (ICSTM) (2017)
12. M. Tomaszewski, B. Ruszczak, P. Michalski, The collection of images of an insulator taken outdoors in varying lighting conditions with additional laser spots. Data Brief (2018)
Performance Analysis of Logistic Regression, KNN, SVM, Naïve Bayes Classifier for Healthcare Application During COVID-19 Mausumi Goswami and Nikhil John Sebastian
Abstract Heart disease is one of the main causes of mortality in India and the USA. According to statistics, a person dies of heart-related disease every 36 s. COVID-19 has introduced several problems that have intensified the issue, resulting in increased deaths associated with heart disease and diabetes. The entire world is searching for new technology to address these challenges. Artificial intelligence (AI) and machine learning (ML) are considered technologies capable of bringing a remarkable change to the lives of common people. Health care is a domain expected to benefit from them, with a positive effect on the lives of common people and on society at large. Previous pandemics have given enough evidence that AI-ML algorithms are an effective tool to fight and control a pandemic. The present epidemic, which is caused by SARS-CoV-2, has created several challenges that necessitate the rapid use of cutting-edge technology and healthcare domain expertise in order to save lives. During the pandemic, AI-ML is used for various tasks such as tracing contacts, managing healthcare-related emergencies, automatic bed allocation, recommending nearby hospitals and vaccine centers, sharing drug-related information, and recommending locations based on users' mobile location. Prediction techniques are used because early detection helps to save lives. One of the conditions that can make a person suffering from COVID-19 extremely sick is heart disease. In this research, four distinct machine learning algorithms are used to try to detect heart disease earlier. Many lives can be saved if heart disease can be predicted earlier.
Keywords COVID-19 · Heart disease · Coronavirus · Naïve Bayes · SVM · Pandemic
M. Goswami (B) · N. J. Sebastian CHRIST (Deemed to be University), Bangalore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_47
M. Goswami and N. J. Sebastian
1 Introduction
The COVID-19 pandemic has resulted in a dramatic loss of human life worldwide and poses a serious threat to public health, food security, and the world of work. The economic and social disruption caused by the pandemic is devastating: a large number of people are at imminent risk of starvation, while the number of undernourished people is estimated at about 600 million [1]. Over the last 12 months, the epidemic has disproportionately affected the vulnerable, and it continues to push millions more into poverty. The vast majority of countries have slowed their production [2]. The epidemic has affected many companies and sectors, including the chemical, solar, tourism, information, and electronics industries [3]. The virus has a huge knock-on impact on our everyday lives as well as on the global economy. After decades of gradual success in reducing the number of people surviving on less than $1.90/day, COVID-19 is expected to cause this year the first reversal in the battle against global poverty in a century [4]. In addition to the risk of death from illness, the outbreak has brought intolerable psychological pressure [5]. One study found that 0.9% of respondents experienced severe anxiety, 2.7% moderate anxiety, and 21.3% mild anxiety. In that study, the protective factors against anxiety were living in urban areas, stable family income, and living with parents [4, 5]. The COVID-19 epidemic triggered the worst global recession since 1930, and the economy suffered severely at the macro level. In the first quarter, the GDP of China fell by 6.8% compared with the same period a year earlier, and several countries suffered major corporate bankruptcies and job losses [6]. Many people died during the COVID-19 pandemic, and many of the deaths were linked to heart-related disease [7–11].
2 Related Works
The COVID-19 pandemic [1–6, 12–55] has challenged researchers from several perspectives. To tackle the problems, several types of applications have been created to adapt to the demands of the situation. In this section, a few applications are reviewed from different perspectives: the category of algorithm used to solve the problem, the theme of the application, and the description of the dataset (Table 1). In [44], the method comprises (1) data collection using the public streaming Twitter application programming interface (API) for COVID-19-related keywords; and (2) data cleaning, processing, and analysis of tweets using an unsupervised machine learning approach. Heart-related disease [7–10] is one of the main reasons for high fatality rates related to COVID-19 [50–60].
Table 1 Multidimensional review of COVID-19-related applications
Theme of the application | Category of the algorithm | Dataset
ML method for image-based diagnosis of COVID-19 [26] | Random forest | GitHub and images extracted from 43 different publications
New ML method for image-based diagnosis of COVID-19 [27] | Dimensionality reduction and predictive analysis | Dataset initially included country, state, and number of deaths and confirmed cases for North America
Using ML to estimate unobserved COVID-19 infections in North America [28] | Support vector machines (SVM) | Predicts the number of infection cases for the 12 most-affected countries
COVID-19 epidemic analysis using ML and deep learning algorithms [29] | Epidemic analysis and forecasting | Daily case reports and daily time series summary tables
Detection of COVID-19 infection from routine blood tests with ML: a feasibility study [30] | Decision tree, SVM, random forest, Naïve Bayes, extremely randomized trees, K-nearest neighbors, logistic regression | IRCCS Ospedale San Raffaele
COVID-19 pandemic prediction for Hungary; a hybrid ML model [31] | ANFIS, MLP-ICA | Data from Hungary
An ML bibliometric analysis [32] | Bibliometric methodology | International Institute of Anticancer Research
A reactive public policy analysis using ML-based topic modelling [33] | Latent Dirichlet Allocation (LDA) algorithm | Data gathered from PIB
Prediction of criticality in patients with severe COVID-19 infection using three clinical features: an ML-based prognostic model [34] | XGBoost algorithm | Data provided by Tongji Hospital
COVID-19 future forecasting using supervised ML models [35] | Linear regression, ridge regression, Lasso, and support vector machine (SVM) | Dataset provided by Johns Hopkins
Assessing countries' performances against COVID-19 via WSIDEA and ML algorithms [36] | RF algorithms and decision tree | Data Envelopment Analysis
COVID-19 classification using CT scan images by ML methods [37] | SVM, DWT, GLCM, LDP, GLRLM, GLSZM | World Health Organization (WHO)
ML analysis of chest CT scan images as a complementary digital test of COVID-19 patients [38] | FFT-Gabor scheme | GitHub repository
COVID-19 vaccine design using reverse vaccinology and ML [39] | SVM, K-nearest neighbors, logistic regression, random forest, XGB | ClinicalTrials.gov database and PubMed literature
COVID-19 public sentiment insights and ML for tweet classification [40] | Logistic regression, Naïve Bayes, and K-nearest neighbor | Numerous R packages were used in the cleaning process to produce a clean dataset for analysis
Association between weather data and the COVID-19 pandemic for predicting mortality rate: ML approaches [41] | Regression machine learning algorithm | Datasets related to weather and evaluation features
A combination of deep learning models with ML methods for face mask detection in the era of the COVID-19 pandemic [42] | Ensemble algorithm, ResNet50, SVM, and decision trees | Three datasets: RMFD, SMFD, and LFW
Predicting the growth and trend of the COVID-19 pandemic using ML and cloud computing [43] | Generalized Inverse Weibull distribution | Our World in Data dataset, available on GitHub
ML to detect self-reporting of symptoms, testing access, and recovery associated with COVID-19 on Twitter [44] | Biterm topic model (BTM) | Twitter data
Role of biological data mining and ML techniques in detecting and diagnosing the novel coronavirus (COVID-19): a systematic review [45] | Regression and prediction algorithm | Patients within the same geographical area, obtained by communicating directly with dedicated hospitals and health organizations
3 Methodology
This section discusses the flow of the working model, which compares four algorithms: logistic regression, KNN, SVM, and Naïve Bayes. Decision trees and random forests are also good choices as machine learning algorithms, but they are outside the scope of this paper (Fig. 1).
Fig. 1 Flowchart of existing model
4 SVM
SVM is a set of supervised learning methods used for classification, regression, and outlier detection, all of which are common tasks in machine learning. A simple linear SVM classifier works by drawing a straight line between two classes: all the data points on one side of the line represent one class, and the data points on the other side are placed in a different class. This implies that there can be an infinite number of lines to choose from. What makes the linear SVM algorithm better than some other algorithms, such as k-nearest neighbors, is that it selects the best line to classify the data points: the line that separates the data and is as far as possible from the closest data points.
$$y = m x^{T} + c \qquad (1)$$
where m is the slope and c is the intercept.
The stages for SVM are
1. Import the dataset
2. Explore the data to find out what it looks like
3. Pre-process the data
4. Split the data into attributes and labels
5. Divide the data into training and testing sets
6. Train the SVM algorithm
7. Make some predictions
8. Evaluate the results of the algorithm.
A 2-D example helps to make the terminology concrete. There are a few data points on a grid, and the aim is to separate them by the class they belong to without placing any point in the wrong class. That means we are looking for the line between the two closest points of the two classes that keeps the other data points separated (Fig. 2).
Fig. 2 SVM algorithm flowchart
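The idea of preferring the line that stays farthest from the closest points can be illustrated numerically. The sketch below does not train an SVM; it only scores a few hypothetical candidate lines by their geometric margin on invented 2-D data and keeps the best one. All names, points and candidate lines are assumptions made for the example.

```python
import math


def margin(w, b, points, labels):
    """Smallest signed distance from any point to the line w.x + b = 0.

    The value is negative if at least one point lies on the wrong side.
    """
    norm = math.hypot(*w)
    return min(
        y * (w[0] * x[0] + w[1] * x[1] + b) / norm
        for x, y in zip(points, labels)
    )


# Invented data: class -1 in the lower left, class +1 in the upper right.
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
labels = [-1, -1, 1, 1]

# Hypothetical candidate lines, each given as (w, b) for w.x + b = 0.
candidates = {
    "x1 = 3": ((1.0, 0.0), -3.0),
    "x1 + x2 = 5.5": ((1.0, 1.0), -5.5),
    "x2 = 1.5": ((0.0, 1.0), -1.5),
}

best = max(candidates, key=lambda name: margin(*candidates[name], points, labels))
```

All three candidates separate the two classes, but they differ in how close they pass to the nearest point; an SVM solver searches over all lines rather than a fixed list.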
5 Logistic Regression
Logistic regression is the appropriate regression analysis to conduct when the dependent variable is binary. Like all regression analyses, logistic regression is a predictive analysis. It is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval- or ratio-level independent variables. Equation for logistic regression:
$$\operatorname{logit}(b) = \ln\left(\frac{b}{1 - b}\right) = a_0 + a_1 X_1 + a_2 X_2 + a_3 X_3 + \cdots + a_k X_k \qquad (2)$$
where b = probability of the occurrence of the feature; X_1, X_2, …, X_k = set of input features of X; a_0 = intercept; and a_1, a_2, …, a_k = parameter values to be estimated in the logistic regression formula.
The sigmoid function (logistic regression model) is used to map the predicted values to probabilities. The sigmoid function traces an S-shaped curve when plotted on a graph, with values in the range between 0 and 1. The values are pushed toward the margins at the top and bottom of the Y-axis, which are labelled 1 and 0. Based on these values, the target variable can be classified into either of the two classes. The equation for the sigmoid function is given as:
$$y = \frac{1}{1 + e^{-x}} \qquad (3)$$
where x is the predicted value, i.e., the right-hand side of Eq. (2), and e is the exponential constant with a value of 2.718281828.
Steps to build a logistic regression model
1. Import the libraries
2. Load the dataset
3. Split the data into training and test sets
4. Create a classifier instance and fit it to the training data
5. Make predictions on the test data
6. Check the performance of the model through the confusion matrix.
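The relationship between the logit and the sigmoid can be checked with a short standard-library sketch. It uses the standard sigmoid 1/(1 + e^(-x)), which is the inverse of the logit; the coefficients and feature values are invented for illustration and are not fitted to any data.

```python
import math


def sigmoid(x):
    """Map any real value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(b):
    """Log-odds of a probability b; the inverse of the sigmoid."""
    return math.log(b / (1.0 - b))


def predict_probability(coefficients, intercept, features):
    """Apply the sigmoid to the linear combination a0 + a1*X1 + ... + ak*Xk."""
    linear = intercept + sum(a * x for a, x in zip(coefficients, features))
    return sigmoid(linear)


# Hypothetical coefficients and one hypothetical patient record with two features.
probability = predict_probability([0.8, -0.5], 0.1, [2.0, 1.0])
predicted_class = 1 if probability >= 0.5 else 0
```

The 0.5 threshold used to turn the probability into a class is a common default and can be moved when false negatives are more costly than false positives.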
6 KNN
K-nearest neighbor (KNN) is one of the simplest algorithms used in machine learning for solving regression and classification problems. KNN algorithms use the stored data to classify new data points based on similarity measures (e.g., a distance function). Classification is done by a majority vote of the neighbors: the data point is assigned to the class that is most common among its nearest neighbors. As the number of nearest neighbors, i.e., the value of k, increases, the accuracy may increase.
Steps to build a KNN model
1. Import libraries
2. Load dataset
3. Split the data into training and test data
4. Calculate the distance between the test data and each row of the training data using the Euclidean distance. According to the Euclidean distance, the distance between two points in the plane with coordinates (e, f) and (c, d) is given by
$$\operatorname{dist}((e, f), (c, d)) = \sqrt{(e - c)^2 + (f - d)^2} \qquad (3)$$
5. Based on the distance values, sort the rows in ascending order
6. Pick the top K rows from the sorted array
7. Assign a class to the test point based on the most frequent class among these rows
8. Compute the accuracy
9. Check the performance of the model through the confusion matrix.
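The distance, sorting and voting steps above can be sketched in a few lines of standard-library Python. The training points, labels and the choice k = 3 are invented for the example; a real experiment would use the heart disease features instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify query by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training row, sorted ascending.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    top_k = [label for _, label in distances[:k]]
    return Counter(top_k).most_common(1)[0][0]


# Invented 2-D training data with two well-separated classes.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = [0, 0, 0, 1, 1, 1]

near_first_cluster = knn_predict(train_points, train_labels, (2, 2), k=3)
near_second_cluster = knn_predict(train_points, train_labels, (6, 5), k=3)
```

An odd k avoids ties in a two-class problem; with features on different scales, the distances are usually computed after scaling.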
7 Naive Bayes
The Naïve Bayes algorithm is a supervised learning algorithm that is based on Bayes' theorem and is used for solving classification problems. It is mainly used in text classification, which involves high-dimensional datasets. The Naïve Bayes classifier is one of the simplest and most effective classification algorithms, and it helps in building fast machine learning models that can make quick predictions. It is a probabilistic classifier, which means that it predicts on the basis of the probability of an object. Some well-known applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and classifying articles. Before explaining Naive Bayes, we should first discuss Bayes' theorem, which is used to find the probability of a hypothesis given the evidence:
$$P(A|B) = \frac{P(A)\,P(B|A)}{P(B)} \qquad (4)$$
A is the hypothesis and B is the evidence. P(B|A) is the probability of B given that A is true. P(A) and P(B) are the independent probabilities of A and B.
Steps to build a Naive Bayes model
1. Import packages
2. Load the dataset
3. Split the data into training and test data
4. Apply feature scaling, create a classifier instance, and fit it to the training data
5. Predict the test results
6. Compute the accuracy
7. Check the performance of the model through the confusion matrix.
Table 2 Confusion matrix of the models
Algorithm | True positive | True negative | False positive | False negative
Logistic Regression | 22 | 3 | 4 | 32
KNN | 21 | 4 | 9 | 27
SVM | 23 | 2 | 5 | 31
Naive Bayes | 24 | 1 | 22 | 14
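Bayes' theorem from Section 7 can be checked with a small numeric sketch. The probabilities below are invented for illustration, and P(B) is obtained from the law of total probability, which is an assumption about how the evidence term would be computed rather than something stated in the text.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """P(A|B) = P(A) * P(B|A) / P(B), with P(B) from the law of total probability."""
    evidence = (prior_a * likelihood_b_given_a
                + (1.0 - prior_a) * likelihood_b_given_not_a)
    return prior_a * likelihood_b_given_a / evidence


# Invented numbers: A = "patient has heart disease", B = "a given symptom is present".
p_disease_given_symptom = posterior(
    prior_a=0.3,                   # P(A)
    likelihood_b_given_a=0.8,      # P(B|A)
    likelihood_b_given_not_a=0.2,  # P(B|not A)
)
```

A Naive Bayes classifier applies the same rule with several features at once, multiplying the per-feature likelihoods under the assumption that the features are independent given the class.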
8 Experimental Results
This section presents the experimental results. The experiments are conducted using an Intel Core i5 8th generation processor, a 1 TB hard disk, and a 256 GB SSD. Table 2 and Fig. 3 show a comparison of the TP, TN, FP, and FN values. The Naïve Bayes classifier gives the highest number of true positives, i.e., instances in which the patient actually has heart disease and the prediction also says that the patient has heart disease. Considering the severity of the disease, Naïve Bayes may therefore be one of the best options when immediate action must be taken to care for such a patient. It is also observed that the false negative value is lowest for the Naïve Bayes classifier. This indicates a minimal chance of error in diagnosing the condition if the patient is actually suffering from heart disease, which is one of the requirements for selecting the correct machine learning algorithm for predicting heart disease. The false negative value is highest for logistic regression, followed by SVM. Failing to identify more people who are at risk could increase the fatalities due to heart-related conditions associated with COVID-19 (Fig. 4 and Table 3).
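The accuracy figures can be related to the counts in Table 2 with a short calculation. The sketch below assumes, and this is a reading adopted here rather than something the text states, that the four counts of each classifier form a 2 × 2 confusion matrix in row-major order with the correct predictions on the diagonal (first and fourth counts). Under that assumption the computed values match the reported accuracies for logistic regression, SVM, and Naive Bayes.

```python
def accuracy(matrix):
    """Share of correct predictions in a 2 x 2 confusion matrix (diagonal over total)."""
    (a, b), (c, d) = matrix
    return (a + d) / (a + b + c + d)


# Counts from Table 2, arranged row-major under the assumption stated in the lead-in.
table2 = {
    "Logistic Regression": [[22, 3], [4, 32]],
    "SVM": [[23, 2], [5, 31]],
    "Naive Bayes": [[24, 1], [22, 14]],
}

accuracy_percent = {name: 100 * accuracy(m) for name, m in table2.items()}
```

Each matrix sums to 61 test instances, so a single misclassified patient changes the accuracy by about 1.6 percentage points.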
Fig. 3 Comparison of TP, TN, FP, FN (bar chart of the true positive, true negative, false positive, and false negative counts for logistic regression, KNN, SVM, and Naive Bayes)
Fig. 4 Comparison of the accuracy values among the four different classifiers (logistic regression 88.52%, KNN 77.05%, SVM 88.52%, Naive Bayes 62.30%)
Table 3 Results of the algorithms in terms of accuracy
Algorithm | Accuracy (%)
Logistic regression | 88.524590163934
KNN | 77.049180327869
SVM | 88.524590163934
Naive Bayes | 62.295081967213
9 Conclusion
Artificial intelligence and machine learning are considered the most promising technologies in health care. During the first and second waves of COVID-19, many applications based on machine learning were developed. Heart disease is considered a fatal precondition in case of COVID-19 infection. In this work, this precondition is studied using different machine learning algorithms. The Naïve Bayes classifier shows the lowest accuracy, but it also has the highest true positive value. This work has studied four different machine learning algorithms and applied them to identifying a precondition related to heart disease, which could help to avoid a fatal situation in case of COVID-19. This work may support further investigations of other preconditions that are fatal in nature and were linked to many fatalities during the first and second waves of COVID-19.
References 1. X. Xiang, X. Lu, A. Halavanau, J. Xue, Y. Sun, P.H.L. Lai, Z. Wu, Modern senicide in the face of a pandemic: an examination of public discourse and sentiment about older adults and COVID-19 using machine learning. J. Gerontol. B 76(4), e190–e200 (2021) 2. J. Wu, P. Zhang, L. Zhang, W. Meng, J. Li, C. Tong, Y. Li, Y. Cai, Z. Yang, J. Zhu, M. Zhao, H. Huang, X. Xie, S. Li, Rapid and accurate identification of COVID-19 infection through machine learning based on clinical available blood test results (2020). MedRxiv 3. J. Xiong, O. Lipsitz, F. Nasri, L.M. Lui, H. Gill, L. Phan, D. Chen-Li, M. Iacobucci, R. Ho, A. Majeed, R.S. McIntyre, (2020). Impact of COVID-19 pandemic on mental health in the general population: a systematic review. J. Affective Disorders 4. P.K. Ozili, T. Arun, Spillover of COVID-19: impact on the Global Economy (2020). Available at SSRN 3562570 5. W. Cao, Z. Fang, G. Hou, M. Han, X. Xu, J. Dong, J. Zheng, The psychological impact of the COVID-19 epidemic on college students in China. Psychiatry Res. 287, 112934 (2020) 6. H. Shen, M. Fu, H. Pan, Z. Yu, Y. Chen, The impact of the COVID-19 pandemic on firm performance. Emerg. Mark. Financ. Trade 56(10), 2213–2230 (2020) 7. WHO, UNICEF, C. Mathers, Global strategy for women’s, children’s and adolescents’ health (2016-2030). Organization201, 4–103 (2016) 8. S. Mohan, C. Thirumalai, G. Srivastava, Effective heart disease prediction using hybrid machine learning techniques. IEEE Access 7, 81542–81554 (2019) 9. F. Ali, S. El-Sappagh, S.R. Islam, D. Kwak, A. Ali, M. Imran, K.S. Kwak, A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf. Fusion 63, 208–222 (2020) 10. M.A. Khan, An IoT framework for heart disease prediction based on MDCNN classifier. IEEE Access 8, 34717–34727 (2020) 11. D. Shah, S. Patel, S.K. Bharti, Heart disease prediction using machine learning techniques. SN Comput. Sci. 1(6), 1–6 (2020) 12. M. Yadav, M. 
Perumal, M. Srinivas, Analysis on novel coronavirus (COVID-19) using machine learning methods. Chaos, Solitons & Fractals 139, 110050 (2020) 13. A. Ahmad, S. Garhwal, S.K. Ray, G. Kumar, S.J. Malebary, O.M. Barukab, The number of confirmed cases of covid-19 by using machine learning: methods and challenges. Arch. Comput. Methods Eng. 28(4), 2645–2653 (2021) 14. P. Wang, X. Zheng, J. Li, B. Zhu, Prediction of epidemic trends in COVID-19 with logistic model and machine learning technics. Chaos, Solitons & Fractals 139, 110058 (2020) 15. S. Lalmuanawma, J. Hussain, L. Chhakchhuak, Applications of machine learning and artificial intelligence for Covid-19 (SARS-CoV-2) pandemic: a review. Chaos, Solitons & Fractals 139, 110059 (2020) 16. M. Nemati, J. Ansary, N. Nemati, Machine-learning approaches in COVID-19 survival analysis and discharge-time likelihood prediction using clinical data. Patterns 1(5), 100074 (2020) 17. Y. Gao, G.Y. Cai, W. Fang, H.Y. Li, S.Y. Wang, L. Chen, Y. Yu, D. Liu, S. Xu, P.-F. Cui, S.-Q. Zeng, X.-X. Feng, R.-D. Yu, Y. Wang, Y. Yuan, X.-F. Jiao, J.-H. Chi, J.-H. Liu, R.-Y. Li, X. Zheng, C.-Y. Song, N. Jin, W.-J. Gong, X.-Y. Liu, L. Huang, X. Tian, L. Li, H. Xing, D. Ma, C.-R. Li, F. Ye, Q.L. Gao, Machine learning based early warning system enables accurate mortality risk prediction for COVID-19. Nat. Commun. 11(1), 1–10 (2020) 18. A. Alimadadi, S. Aryal, I. Manandhar, P.B. Munroe, B. Joe, X. Cheng, Artificial intelligence and machine learning to fight COVID-19. Physiol. Genom. (2020). https://doi.org/10.1152/ physiolgenomics.00029.2020 19. A. Di Castelnuovo, M. Bonaccio, S. Costanzo, A. Gialluisi, A. Antinori, N. Berselli, L. Blandi, R. Bruno, R. Cauda, G. Guaraldi, I. My, L. Menicanti, G. Parruti, G. Patti, S. Perlini, F. Santilli, C. Signorelli, G.G. Stefanini, A. Vergori, A. Abdeddaim, W. Ageno, A. Agodi, P. Agostoni, L. Aiello, S. Al Moghazi, F. Aucella, G. Barbieri, A. Bartoloni, C. Bologna, P. Bonfanti, S. Brancati, F. Cacciatore, L. 
Caiano, F. Cannata, L. Carrozzi, A. Cascio, A. Cingolani, F.
656
20. 21.
22.
23.
24. 25.
26. 27. 28. 29. 30.
31. 32. 33. 34.
35.
36.
Recognition and Detection of Human Judgment About Influential Pairs Using Machine Learning Techniques G. Charles Babu, P. Gopala Krishna, B. Sankara Babu, and Rokesh Kumar Yarava
Abstract
The widespread adoption of online media as a means of communicating, sharing, soliciting opinions, and gauging sentiment has increasingly drawn the interest of organizations that provide or sell products and services. Social media gives peer recommendations a larger role in adoption and purchase decisions. This paper deals with the identification of influencers on the social network Twitter using scikit-learn tools. The existing system uses the follower-count benchmark method, which relies only on the number of followers and achieves an accuracy of 70.2%. This paper presents a data mining modeling approach to predict the human judgment about who is more influential for pairs of persons. Model performance is reported using accuracy, classification report, ROC curve, cross-validation accuracy, and AUC. We use machine learning algorithms such as logistic regression, naive Bayes, random forest, multilayer perceptron, and decision tree, and apply stacking to those classifiers. Of all these classifiers, random forest gives the best results based on AUC and ROC curves. The proposed approach improves on the existing system in terms of accuracy.
Keywords Influencers · Logistic regression · Naïve Bayes · Random forest · Multilayer perceptron · Decision tree · Stacking
G. Charles Babu (B) · B. Sankara Babu Department of CSE, Gokaraju Rangaraju Institute of Engineering and Technology (Autonomous), Bachupally, Telangana, India P. Gopala Krishna Department of IT, Gokaraju Rangaraju Institute of Engineering and Technology (Autonomous), Hyderabad, Telangana, India R. K. Yarava Department of CSE, Chalapathi Institute of Engineering and Technology (Autonomous), Lam, Guntur, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_48
1 Introduction
Social networking sites have become such a crucial part of life that it is difficult to imagine life without them. Many people have grown so attached to them that they spend a large share of their day on platforms such as Facebook and Instagram. A main reason for this popularity is that these sites offer instant validation and gratification. People upload pictures and details of their personal lives and receive immediate likes and comments, which makes them feel good and encourages them to keep participating. This craving for instant appreciation can grow until their lives revolve around making their profiles as attractive as possible, so that an imperfect life appears happy and perfect. This pressures them to present a false image and live in an illusory world. On the other hand, social networks have several advantages when they are used for enjoyment rather than out of addiction. They allow people in different parts of the world to voice their opinions and display their talents and skills. They help introverted and lonely people to interact, express themselves, and make friends. They are also a source of entertainment and a quick break when one feels under pressure. For society as a whole, social networks connect people more closely and maximize the use of social information resources. At the same time, information can be spread and disseminated widely through social networks. It can also be more open and transparent, which favors the equitable development, fairness, and openness of society as a whole.
However, because of the currently immature system, inadequate laws and regulations, lack of supervision, and other factors, social networks may also lead to the disclosure of personal information and become a distribution center for rumors. Moreover, as more and more people become dependent on social networks, society as a whole may be caught in an awkward situation in which everybody is a friend online, feelings grow apathetic, and communication in real life is no longer smooth. It is therefore important to analyze who is more susceptible to social networks. If we can determine which groups of people are likely to be affected, we can take specific precautions, such as pop-up anti-addiction alerts or changes to the displayed content that guide user interests and habits. This is why we chose this kind of data for a data mining study, in which we try to identify the factors by which social networks affect people. This paper focuses on the identification of influencers using machine learning techniques with the scikit-learn tools. The problem statement concerns the identification of influencers using machine learning techniques based on human judgment. In this paper, we train a machine learning model on data and predict who is more influential for pairs of individuals.
2 Review of Literature
Many methods have been presented for the identification of influencers, each employing a different strategy. A review of some prominent solutions follows. Zengin Alp and Gunduz Oguducu [1] present an approach known as 'Personalized PageRank', which integrates the information obtained from the Twitter network with user activities. The method aims to identify topic-specific influencers who are considered experts on the selected topic. The features used to model the approach are the number of tweets by the user, the number of tweets by the user on the selected topic, the number of retweeted tweets of the user, the number of retweeted tweets of the user on the selected topic, the number of retweets by the user, the number of retweets by the user on the selected topic, the total number of days, the number of days on which the user posted on the selected topic, and the time elapsed before the first retweet of a post. Cataldi and Aufaure [2] propose an approach that analyzes the many paths that information follows through the network and provides a technique for estimating the influence among users by calculating the relationships between them. They model the Twitter network for a selected topic using a directed graph in which the directed edges represent a retweet action between the nodes (users). They also consider the authority of users when calculating influence. Kwak et al. [3] compare different influence measures based on parameters such as the number of retweets and the number of followers. They also propose 'retweet trees' but do not include this feature as part of the influence measurement. Cha et al. [4] present an in-depth comparison of three measures of influence, namely in-degree, retweets, and mentions. Based on these three measures, they investigate influence across time and topics. They report the finding that the most followed users are not necessarily influential in terms of retweets or mentions gained. Liu et al. [5] propose a generative graphical model that exploits both heterogeneous link information and the textual content associated with each user in the network to measure topic-level influence. They apply the approach to four different genres of social networks, including Twitter, to demonstrate its effectiveness. TwitterRank [6] is an extension of the PageRank technique proposed to measure the influence of Twitter users. It uses a directed graph D(V, E), in which V (vertices) represents the Twitter users and E (edges) represents the following relationships among them. Edges are directed from follower to friend. The Web Ecology Project study [7] measures influence as the ratio of the attention (namely retweets, replies, and mentions) a user receives to the total number of tweets the user has posted. Overall, the great majority of the studies in the literature are subject-based and require long-term observation. However, an important feature of Twitter is that it is a platform where day-to-day issues [8, 9] are discussed over a short period. From this point of view, we have concentrated on the problem of identifying influential individuals on a day-to-day topic.
3 Proposed Model
Recognizing human activities from a video stream is a challenging task. Over the past decade, human action recognition has received significant attention from researchers in the computer vision community. Analyzing a human action involves not only describing the patterns of movement of different parts of the body but also describing human intention, emotions, and thoughts. Human behavior analysis and understanding are essential for many applications such as human–computer interaction, surveillance, sports, elderly health care, training, and entertainment. In general, human activity recognition systems follow a hierarchical approach [1]. At the lower level, human objects are segmented from the video frame. This is followed by the extraction of features of the human objects such as color, shape, silhouette, body motion, and pose. The human action recognition module operates at the mid level, followed by reasoning engines at the high level that interpret the context of the actions as either normal or abnormal.
4 Methodology
Our work, the identification of influencers using machine learning techniques, falls under supervised machine learning. Supervised learning uses classification and regression methods to develop predictive models. The methods include logistic regression, neural networks, decision trees, random forest, and naive Bayes. The objective, identifying influencers, is framed as a binary classification problem. Our contribution is to describe and evaluate a model, built through the stages of data mining, for pairs of individuals (Fig. 1).
4.1 Data Collection
The data we selected is from a hackathon on Kaggle. Data Science London and the UK Windows Azure Users Group, in association with Microsoft and PeerIndex, organized a competition around a dataset used to predict human judgments about who is more influential on social media. The dataset, provided by PeerIndex, comprises a standard pair-wise preference learning task. Each data point describes two individuals, A and B. For each individual, 11 pre-computed, non-negative numeric features based on Twitter activity (such as volume of interactions and number of followers) are provided. This is a standard Kaggle dataset. The binary label in the dataset represents a human judgment about which of the two individuals is more influential. A label of '1' means that user A is more influential than B; '0' means that user B is more influential than A. The goal is to train a machine learning model that predicts, with high accuracy, the human judgment on which of A and B is more influential. The data comprise 11 different parameters such as followers, retweets, mentions, and number of posts.
Fig. 1 Construction of the proposed system
4.2 Data Exploration
Exploring the data involves a basic analysis of the real-world dataset.
4.3 Dataset Description
The dataset has 22 features. Every data point represents two users, 'A' and 'B'. Each user has the same 11 features: follower-count, following-count, listed-count, mentions-received, retweets-received, mentions-sent, retweets-sent, posts, network-feature-1, network-feature-2, and network-feature-3. This is a classic binary classification problem of predicting the human judgment on who is more influential, 'A' or 'B'. A class label of 1 means that A is more influential than B, and 0 means that B is more influential than A. The training set consists of 5500 data points, and the test set contains 5952 data points.
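To make the pair-wise layout concrete, the following minimal sketch stores each data point as a dictionary holding features of users A and B plus the label, and applies the follower-count benchmark mentioned in the abstract. The column names (`A_follower_count`, `B_follower_count`, `Choice`) and the toy values are assumptions made for illustration only; they are not taken from the dataset.

```python
# Hypothetical rows in the pair-wise layout: features for A and B plus a label.
# Column names and values are illustrative assumptions.
rows = [
    {"A_follower_count": 1200, "B_follower_count": 300, "Choice": 1},
    {"A_follower_count": 50, "B_follower_count": 900, "Choice": 0},
    {"A_follower_count": 400, "B_follower_count": 700, "Choice": 1},
    {"A_follower_count": 10, "B_follower_count": 20, "Choice": 0},
]


def follower_count_baseline(row):
    """Predict 1 (A more influential) when A has more followers than B, else 0."""
    return 1 if row["A_follower_count"] > row["B_follower_count"] else 0


def accuracy(rows, predict):
    """Fraction of rows where the prediction equals the human-judgment label."""
    correct = sum(1 for r in rows if predict(r) == r["Choice"])
    return correct / len(rows)


baseline_accuracy = accuracy(rows, follower_count_baseline)
```

The third toy row shows why a follower-only rule can disagree with human judgment: A has fewer followers than B but is still labeled as more influential.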
4.4 Data Distribution
The data distribution is examined to analyze each feature more carefully and decide whether it is suitable for training or should be dropped. Here, we used seaborn, Python's advanced visualization library, to plot univariate histograms, a correlation map, and pair plots. Histograms are generally used to analyze the distribution of numerical data. A correlation map plots a color-encoded matrix that shows the correlation between each pair of features as a heatmap. The color gradient shows the degree of correlation: the darker the color, the stronger the relationship between features. The pair plot function plots pair-wise relationships in a dataset. It creates a matrix of axes such that each variable in the data is shared on the y-axis across a single row and on the x-axis across a single column.
Class Analysis
The dataset contains two classes. Counting both classes gives 2802 samples for class '0' versus 2698 for class '1'. The class distribution is 49% (class 1) versus 51% (class 0), which means there is no class imbalance in the training data. The training data is therefore suitable for training.
4.5 Finding Missing Values and Abnormal Data in the Dataset
Missingno provides a small toolset of flexible and easy-to-use missing-data visualizations and utilities that give a quick visual summary of the completeness (or lack thereof) of the dataset. Its display is data-dense and lets you quickly pick out patterns in data completion. In the density diagram, there are no missing values, and all the variables have complete information.
Modeling
Step 1: Split the data
We divided the dataset into training and test sets in the ratio 70:30. In the training dataset, the class distribution is 51% for class 0 and 49% for class 1. In the test dataset, the class distribution is 50% for class 0 and 50% for class 1.
4.6 Training Model Function
Since we want to optimize several different classifiers, we built a model function to avoid duplicated code. The model is created based on the requirement, and every classifier calls this model function for predictions. The model is trained by passing it the training dataset. Model performance is assessed using several metrics: accuracy, classification report, ROC curve, cross-validation accuracy, and AUC (Table 1).
Table 1 Confusion matrix
             Predicted positives     Predicted negatives
Positives    True positives (TP)     False negatives (FN)
Negatives    False positives (FP)    True negatives (TN)
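A shared model function of this kind could look roughly like the sketch below, which uses scikit-learn. The function name `run_model`, the synthetic data from `make_classification` (standing in for the 22-feature training file), and the hyperparameters are all assumptions for illustration; the paper does not show its own code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 22-feature training data (an assumption).
X, y = make_classification(n_samples=500, n_features=22, random_state=0)

# 70:30 split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def run_model(clf):
    """Fit a classifier on the training split and score it on the test split."""
    clf.fit(X_train, y_train)
    predictions = clf.predict(X_test)
    scores = clf.predict_proba(X_test)[:, 1]  # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, predictions),
        "auc": roc_auc_score(y_test, scores),
    }


# Every classifier goes through the same function, which avoids duplicated code.
results = {
    "random forest": run_model(RandomForestClassifier(n_estimators=50, random_state=0)),
    "logistic regression": run_model(LogisticRegression(max_iter=1000)),
}
```

Passing the estimator as an argument keeps the evaluation identical across classifiers, so differences in the reported metrics come from the models rather than from the evaluation code.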
4.7 Accuracy
Accuracy is the fraction of predictions our model got right. For binary classification, the accuracy score is given by
Accuracy = (TP + TN)/(TP + FP + TN + FN)
4.8 Classification Report
A classification report is used to measure the quality of the predictions made by a classification method. It gives the precision, recall, F1-score, support, macro-average, and weighted average.
Precision: the ability of a classifier not to label as positive a sample that is negative. For each class, it is defined as the ratio of true positives to the sum of true and false positives. It is calculated by
Precision = TP/(TP + FP)
Recall: the ability of a classifier to find all positive samples. For each class, it is defined as the ratio of true positives to the sum of true positives and false negatives. It is calculated by
Recall = TP/(TP + FN)
F1-score: the harmonic mean of precision and recall. A perfect model achieves an F1-score of 1. It is calculated by
F1-score = (2 ∗ precision ∗ recall)/(precision + recall)
Support: the number of samples of each class over which the metrics are computed. It is the same for every classifier.
Macro-average: the unweighted average of precision, recall, and F1-score over the classes.
Weighted average: the average of precision, recall, and F1-score weighted by the support of each class, so it favors the majority class.
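The formulas above can be checked on a small worked example. The sketch below computes the confusion-matrix counts, accuracy, and the per-class entries of a classification report in plain Python; the toy labels are invented for illustration and are not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) treating `positive` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, tn, fn


def class_report(y_true, y_pred, positive):
    """Precision, recall, F1-score, and support for one class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}


# Toy labels (illustrative only): 1 means A is more influential, 0 means B.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)

per_class = {c: class_report(y_true, y_pred, c) for c in (0, 1)}
total = sum(r["support"] for r in per_class.values())
macro_f1 = sum(r["f1"] for r in per_class.values()) / len(per_class)
weighted_f1 = sum(r["f1"] * r["support"] for r in per_class.values()) / total
```

In this toy case the weighted F1-score is pulled toward class 0, which has the larger support, while the macro-average treats both classes equally.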
4.9 ROC Curve
A receiver operating characteristic curve (ROC curve) is a graph showing the performance of a classification model at all classification thresholds. The curve plots two parameters, the true positive rate (TPR) and the false positive rate (FPR), showing TPR versus FPR at different classification thresholds. Lowering the classification threshold classifies more items as positive, thus increasing both false positives and true positives.
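The threshold sweep described above can be sketched in a few lines of plain Python. The labels and scores are invented toy values, and the trapezoidal rule is one common way to obtain the area under the curve; this is not the code used in the paper.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the threshold over scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy labels and predicted probabilities of class 1 (illustrative values).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
auc = auc_trapezoid(points)
```

Each step down in the threshold adds at least one more predicted positive, so the curve moves up (a new true positive) or right (a new false positive) until it reaches the point (1, 1).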
4.10 Cross-Validation Analysis
Cross-validation tests the ability of the model to predict new data that was not used in estimating it, in order to flag problems such as overfitting or selection bias and to give insight into how the model will generalize to an independent dataset (i.e., an unknown dataset, for example from a real problem). Here we used the scoring parameter of the scikit-learn API. Cross-validation is performed by computing the scores for accuracy and AUC.
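As a rough illustration of the scoring parameter, the sketch below runs scikit-learn's `cross_val_score` twice, once per metric. The synthetic data, the choice of five folds, and the forest size are assumptions; the paper does not state the number of folds it used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the training data (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=22, random_state=0)

clf = RandomForestClassifier(n_estimators=30, random_state=0)

# The `scoring` parameter selects the metric computed on each held-out fold.
cv_accuracy = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
cv_auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")

mean_accuracy = cv_accuracy.mean()
mean_auc = cv_auc.mean()
```

Reporting the mean over folds, rather than a single split, gives a steadier estimate of how the classifier behaves on data it has not seen.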
4.11 Random Forest Classifier
Random forests, or random decision forests, are an ensemble learning method for classification, regression, and other tasks that operates by constructing a large number of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the mean or average prediction of the individual trees is returned.
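The two aggregation rules can be illustrated without training any trees. In the sketch below the per-tree outputs are hypothetical values, and the helper names `forest_classify` and `forest_regress` are invented for this example.

```python
from collections import Counter


def forest_classify(tree_votes):
    """Return the class chosen by most trees (majority vote)."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Return the mean of the individual tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical outputs of five trees for one pair of users.
votes = [1, 0, 1, 1, 0]
predicted_class = forest_classify(votes)  # class 1 has three of the five votes
mean_prediction = forest_regress([0.2, 0.4, 0.9])
```

Note that scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than by counting hard votes, so the sketch shows the textbook rule rather than that library's exact behavior.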
4.12 Logistic Classifier
Logistic regression, despite its name, is a linear model for classification rather than regression. It is also known in the literature as logit regression, maximum-entropy classification (MaxEnt), or the log-linear classifier. In this model, the probabilities describing the possible outcomes of a single trial are modeled using a logistic function.
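The role of the logistic function can be seen in a minimal sketch: a linear score is squashed into a probability, which is then thresholded to give a class. The weights, bias, and feature values below are hypothetical and are not fitted to any data.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability that user A is more influential, given a linear score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical weights for two features (not fitted to any real data).
weights = [0.8, -0.5]
bias = 0.1
p = predict_proba([1.0, 2.0], weights, bias)  # z = 0.1 + 0.8 - 1.0 = -0.1
label = 1 if p >= 0.5 else 0
```

Because the linear score here is slightly negative, the probability falls just below 0.5 and the default threshold assigns class 0.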
4.13 Naïve Bayes Classifier
Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes' theorem with the 'naive' assumption of conditional independence between every pair of features given the value of the class variable. In spite of their over-simplified assumptions, naive Bayes classifiers have worked well in many real-world situations, famously document classification and spam filtering. They require only a small amount of training data to estimate the necessary parameters. Naive Bayes learners and classifiers can be extremely fast compared with more sophisticated methods. The decoupling of the class-conditional feature distributions means that each distribution can be estimated independently as a one-dimensional distribution. This in turn helps to alleviate problems stemming from the curse of dimensionality.
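The independence assumption means that each feature contributes its own one-dimensional term to the class score. The following plain-Python sketch of a Gaussian naive Bayes classifier illustrates this; the Gaussian form, the helper names, and the toy data are assumptions for illustration and not the paper's implementation.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature means and variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log-posterior under independence."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            # Each feature contributes its own one-dimensional Gaussian term.
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy two-feature data (illustrative values only).
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [3.0, 0.5], [3.2, 0.7], [2.8, 0.3]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
```

Fitting only needs one mean and one variance per feature and class, which is why the method is fast and works with little training data.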
4.14 Multilayer Perceptron Classifier
A multilayer perceptron (MLP) is a class of feedforward artificial neural network (ANN). An MLP is trained with a supervised learning technique called backpropagation. Its multiple layers and non-linear activations distinguish an MLP from a linear perceptron.
4.15 Decision Tree Classifier
Decision trees (DTs) are a non-parametric supervised learning technique for classification and regression. The objective is to learn simple decision rules from the data attributes in order to build a model that predicts the value of a target variable. In general, the deeper the tree, the more complex the decision rules and the more closely the model fits the training data.
4.16 Stacking
Multiple classifiers are used to achieve higher predictive performance than each of the constituent classifiers could provide alone. The stacking approach was utilized. Stacking is an ensemble learning strategy that uses a meta-classifier to merge numerous classification models.
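A stacked ensemble of this kind can be sketched with scikit-learn's `StackingClassifier`. The selection of base models, the use of logistic regression as the meta-classifier, the five-fold setting, and the synthetic data are assumptions for illustration; the paper does not name its meta-classifier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data (an assumption for illustration).
X, y = make_classification(n_samples=400, n_features=22, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Base classifiers whose predictions become inputs to the meta-classifier.
base_models = [
    ("random_forest", RandomForestClassifier(n_estimators=30, random_state=0)),
    ("decision_tree", DecisionTreeClassifier(random_state=0)),
    ("naive_bayes", GaussianNB()),
]

# Logistic regression as meta-classifier is an assumption; the source does not name one.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
stack_accuracy = stack.score(X_test, y_test)
```

The meta-classifier is trained on cross-validated predictions of the base models, which limits the risk that it simply learns to trust a base model that has overfitted the training data.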
4.17 Developing GUI
The GUI is developed using Tkinter, a Python library. It consists of a label that displays the title of the work, 'identification of influencers using machine learning techniques', and buttons named upload dataset, random forest, decision tree, multilayer perceptron, logistic regression, naïve Bayes, and stacking for displaying predictions. When one of these buttons is clicked, the results are displayed in a message box.
5 Results and Discussion
The output is presented through the user interface, which provides a button to upload the dataset and shows the accuracy score and the prediction of the more influential person. A sample of the results is shown in Fig. 2. After the data are uploaded, the random forest technique identifies the influencer with an accuracy of 0.7. Next, the decision tree technique also reaches an accuracy of 0.7 (Fig. 3). The following step is the stacking prediction (Fig. 4). Figure 5 shows the classification report of the random forest algorithm, which lists precision, recall, and F1-score values between 0 and 1 together with the accuracy. The classification reports of the other classifiers follow, beginning with Fig. 6.
Fig. 2 Random forest classifier predictions
Fig. 3 Decision tree classifier predictions
Fig. 4 Stacking predictions
Figure 6 shows the classification report of the logistic regression algorithm, which lists precision, recall, and F1-score values between 0 and 1 together with the accuracy. Figure 7 shows the corresponding classification report of the Gaussian naive Bayes algorithm.
Fig. 5 Random forest classification report
Fig. 6 Logistic regression classification report
Fig. 7 Naive Bayes classification report
Fig. 8 MLP classification report
Figure 8 shows the classification report of the MLP algorithm, which lists precision, recall, and F1-score values between 0 and 1 together with the accuracy. Figure 9 shows the corresponding classification report of the decision tree algorithm, and Fig. 10 shows that of the stacking algorithm.
Fig. 9 Decision tree classification report
Fig. 10 Stacking classification report
We evaluated five distinct classifiers: random forest, decision tree, logistic regression, multilayer perceptron, and naive Bayes. The task is a binary classification that predicts who is more influential. In terms of accuracy, random forest and stacking give 79%, the highest among all classifiers (Figs. 11, 12, and 13), followed by decision tree and logistic regression; naive Bayes gives poor results. The highest value obtained from the ROC curves is 86%, from random forest. The other classifiers, in descending order, are decision tree, stacking, multilayer perceptron, logistic regression, and naive Bayes (Fig. 14). The highest cross-validation accuracy is 79%, from random forest and stacking. In descending order of cross-validation accuracy, the classifiers are random forest, stacking, multilayer perceptron, decision tree, logistic regression, and naive Bayes (Figs. 15 and 16). The highest AUC is 86%, from random forest. The other classifiers, in descending order, are decision tree, stacking, multilayer perceptron, logistic regression, and naive Bayes (Fig. 17).
Fig. 11 Random forest ROC curve
Fig. 12 Logistic regression ROC curve
Fig. 13 Naïve Bayes ROC curve
Difference between AUC and Accuracy:
Fig. 14 MLP ROC curve
Fig. 15 Decision tree ROC curve
Accuracy and AUC differ in that accuracy varies with the classification threshold. The threshold is chosen according to the circumstances; for instance, a regression on medical data that predicts the chance of disease from symptoms uses a lower threshold, which increases the sensitivity of the analysis. AUC, on the other hand, does not depend on any single threshold. It is obtained by integrating the true positive rate over the false positive rate as the threshold sweeps across all values from 0 to 1.
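The contrast can be made concrete with a small sketch: accuracy is recomputed at several thresholds, while AUC is computed once from how the scores rank positives against negatives. The labels, scores, and thresholds are invented toy values, and the pairwise-ranking formula is an equivalent way of obtaining the area under the ROC curve.

```python
def accuracy_at(y_true, scores, threshold):
    """Accuracy when scores at or above `threshold` are labeled 1."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for t, p in zip(y_true, predictions) if t == p)
    return correct / len(y_true)


def auc_by_ranking(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy labels and predicted probabilities (illustrative values).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

accuracy_by_threshold = {t: accuracy_at(y_true, scores, t) for t in (0.1, 0.5, 0.65)}
auc = auc_by_ranking(y_true, scores)
```

The same scores give three different accuracies at the three thresholds but a single AUC, because AUC depends only on the ordering of the scores.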
Fig. 16 Stacking ROC curve
Fig. 17 Comparison of classifiers
We have applied different classifiers and stacking on the dataset. Even though we applied stacking, random forest shows better results in performance metrics like ROC curve and AUC. Based on all performance metrics, random forest is the winner for this dataset.
6 Conclusion
From the above results, we conclude that the best accuracy score achieved among the implemented models is about 77%, using random forest. We also implemented model stacking, which reduces the bias of a model on a particular dataset so that its predictions become less biased and more accurate; even so, random forest performed best. The random forest model gives the best predictions and can be trusted to identify the most influential person among different influencers. Precise influencer prediction is essential for accurate classification in a company's influencer marketing, which helps improve the company's brand and gain the trust of customers.
7 Future Enhancement
In the future, we will take into account the nature of the content shared by influencers on the networking platform, which also plays a main role in shaping the opinions of users. We also need to consider that individuals who were significant in the past, and who have the credibility to attract many followers, are more likely to be significant in the future. Overall performance could also be improved and customized by considering more features in the dataset and the daily activities of users.
Design and Implementation of High-Speed Energy-Efficient Carry Select Adder for Image Processing Applications
K. N. VijeyaKumar, M. Lakshmanan, K. Sakthisudhan, N. Saravanakumar, R. Mythili, and V. KamatchiKannan
Abstract The design of a high-speed and energy-efficient Carry Select Adder (CSLA) for image processing applications reduces the number of transistors, leading to less delay and lower power dissipation. Adders are the basic building blocks of every processing unit. The novelty of this paper is a new CSLA that comprises a Final Selection Unit (FSU), a Primary Carry Unit (PCU), and Wave Carry Units (WCU1, WCU2), which are split into blocks of suitable bit width. These blocks are integrated with functional blocks using random logic whose inputs and outputs include only the carry function. Carry estimation is skipped in the first stage of every bit-slice block. The design is synthesized using 180 nm CMOS technology for different input bit widths. The proposed CSLA shows a significant reduction in delay and area compared to existing adders.
Keywords Threshold logic · Wave carry unit · Final selection unit · Carry save adder
K. N. VijeyaKumar · N. Saravanakumar (B) · R. Mythili
Department of ECE, Dr. Mahalingam College of Engineering and Technology, Pollachi, India
M. Lakshmanan
Department of EEE, CMR Institute of Technology, Bengaluru, India
K. Sakthisudhan
Department of ECE, Dr. N.G.P Institute of Technology, Coimbatore, India
V. KamatchiKannan
Department of EEE, Bannari Amman Institute of Technology, Sathyamangalam, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_49
1 Introduction
Addition is the primary function of the arithmetic units used in image processing applications. Full adders used for addition in VLSI implementations have high propagation delay and high power consumption, but digital image applications usually require low propagation delay and low power consumption for fast processing. For n-bit addition
K. N. VijeyaKumar et al.
in VLSI implementation, ripple carry adders are used, in which delay and power increase. This lowers the speed and increases the number of gates, resulting in more power dissipation. To overcome these issues, a new Carry Select Adder (CSLA) is proposed. The conventional CSLA structure consists of Ripple Carry Adder (RCA) and Binary to Excess-1 Converter (BEC) units. The proposed CSLA has no RCA units, which reduces the propagation delay. The number of gates is also reduced compared to the conventional CSLA. Pradhan [1] proposed a high-speed digital adder with selected sum inputs and multiple-radix carry using Boolean expressions, which reduces the carry propagation time and hence the delay of the addition process. Neil and Eshraghian [2] proposed low-power threshold logic gates based on capacitive input and gate, which results in low power dissipation and high operating speed. Datta et al. [3] proposed a Carry Select Adder with a simple and efficient gate-level modification that significantly reduces area and power. Mahalakshmi and Sasilatha [4] reduced area and delay by calculating the final sum before the Carry Select (CS) operation. An area-efficient Carry Select Adder that uses a common Boolean logic term to reduce delay, power consumption, and power-delay product is proposed in [5]. A highly area-efficient CMOS Carry Save Adder (CSA) with a regular and iterative shared-transistor structure, which has both static and compact multi-output carry look-ahead circuits to improve speed, is proposed in [6]. The concept of the CSA, known to be the fastest adder, is proposed in [7]. A fault-tolerant carry save adder using threshold logic with reduced delay is proposed in [8]. A high-speed hardware-efficient Carry Select Adder is proposed in [9]. A comparative analysis of different 32-bit adder topologies with a multiplexer-based full adder is presented in [10].
2 Overview of CSLA
2.1 CSLA
The most frequently used RCA is a series connection of Full Adders (FA). Full adders used for addition in VLSI implementations have high propagation delay and power dissipation, but digital image applications usually require low propagation delay and low power consumption for rapid processing. For an n-bit addition in VLSI implementation, ripple carry adders are used, in which delay and power increase because each adder in the chain has to hold its output for the input of the next adder. This work proposes a new CSLA to overcome these issues in VLSI implementation. The previous CSLA structure consists of RCA and BEC units. Here, random logic is used instead of the RCA units. This reduces the propagation delay and the number of gates compared to previous CSLAs.
Design and Implementation of High-Speed Energy …
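2.2 5-Bit Single Block CSLA
The 5-bit single block CSLA consists of combinational logic in place of the typical RCAs and MUXs and focuses on applying the carry generation logic.

$$\mathrm{Sum} = (A \oplus B) \oplus C_{in}, \qquad \because\; \mathrm{Sum} = \begin{cases} A \oplus B & \text{if } C_{in} = 0 \\ A \odot B & \text{if } C_{in} = 1 \end{cases} \tag{1}$$

$$C_{out} = C_{in} \cdot (A + B) + A \cdot B \tag{2}$$

$$\because\; C_{out} = A \cdot B \quad \text{if } C_{in} = 0 \tag{3}$$

$$C_{out} = A + B \quad \text{if } C_{in} = 1 \tag{4}$$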
The 5-bit single block CSLA is designed using three main parts: a PCU, a WCU, and an FSU. The input stage is the Primary Carry Unit, which contains XOR and NAND logic and produces the primary carry. The Wave Carry Unit performs carry propagation and functions similarly to an RCA. This unit includes WCU0 and WCU1, which operate for C_in = '0' and C_in = '1', respectively. The Final Selection Unit determines the final outputs, Sum and C_out, of the adder by evaluating the outputs of the PCU and WCU. The number of gates in this design is large, which leads to increased delay and high power dissipation.
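The select principle behind Eqs. (1) to (4) can be sketched behaviourally. The code below is only a software model under stated assumptions, not the gate-level PCU/WCU/FSU design of the paper: the function names `carry_chain` and `carry_select_block` are hypothetical, and the two carry paths are rippled bit by bit for clarity rather than built from random logic.

```python
def carry_chain(a_bits, b_bits, cin):
    """Propagate the carry through one block for a fixed carry-in value."""
    sums, carry = [], cin
    for a, b in zip(a_bits, b_bits):          # least significant bit first
        sums.append(a ^ b ^ carry)            # Eq. (1): Sum = (A xor B) xor Cin
        carry = (carry & (a | b)) | (a & b)   # Eq. (2): Cout = Cin(A + B) + AB
    return sums, carry


def carry_select_block(a, b, cin, width=5):
    """Add two width-bit integers by precomputing both carry cases, then selecting."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    sum0, cout0 = carry_chain(a_bits, b_bits, 0)   # path that assumes Cin = 0
    sum1, cout1 = carry_chain(a_bits, b_bits, 1)   # path that assumes Cin = 1
    sum_bits, cout = (sum1, cout1) if cin else (sum0, cout0)  # final selection
    total = sum(bit << i for i, bit in enumerate(sum_bits))
    return total, cout


s, c = carry_select_block(0b10010, 0b10101, 1)
print(format(s, "05b"), c)   # prints "01000 1"
```

Because both carry cases are computed before the real carry-in is known, the final step reduces to a selection, which is the property a carry select adder exploits to shorten the critical path.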
3 Proposed System
3.1 Single Block CSLA with 5-Bit
The proposed 5-bit single block CSLA is shown in Fig. 2. The propagation delay and power consumption are reduced by changing the direction of the sum path and replacing four-input gates with three-input gates. Applying this logic throughout increases the speed of the architecture. The proposed 5-bit CSLA cascades bit-slice blocks divided according to the corresponding bit width. All intermediate nodes in the bit-slice blocks are assigned carry propagation functions and are incorporated into combinational logic. Each bit-slice block includes three units: PCU, WCU, and FSU. The PCU is the initial stage, which generates the sources for carry propagation from the inputs. Carry propagation is performed in the second stage by the WCU, and the final Sum and C_out are decided by the FSU. Simulation results verify that the modified CSLA and the conventional CSLA are functionally the same. Power dissipation and delay are then calculated for both architectures.
Fig. 1 5-bit single block CSLA architecture
Fig. 2 Proposed 5-bit single block CSLA
Table 1 Truth table of proposed 5-bit single block CSLA
I/P 1 | I/P 2 | I/P 3 (CIN) | S | C
10010 | 10101 | 1 | 01000 | 1
10101 | 11100 | 1 | 10010 | 1
00101 | 10001 | 1 | 10111 | 0
10101 | 11100 | 0 | 10001 | 1
10010 | 10101 | 1 | 01000 | 1
Fig. 3 Simulation result of proposed 5-bit single block CSLA
4 Results and Discussion
4.1 Single Block CSLA with 5-Bit
The truth table and simulation results of the 5-bit single block CSLA are shown in Table 1 and Fig. 3. This architecture has a power dissipation of 22.93 µW and a delay of 1.4 ns.
4.2 Implementation
The 5-bit single block CSLA is implemented in a mean filter. Power dissipation and delay are calculated for the conventional mean filter and the proposed mean filter, and the results are compared. The proposed CSLA is inserted in the adder section of the mean filter, and the architecture is shown in Fig. 4 (Fig. 5). The simulation inputs and outputs of the proposed mean filter are shown in Fig. 6. The conventional mean filter has a power dissipation of 490 µW and a delay of 4.14 ns. The proposed mean filter has a power dissipation of 388 µW and a delay of 3.476 ns, that is, 20% less power dissipation and 16% less delay than the conventional mean filter. The input and output images of the conventional and proposed mean filters are shown in Fig. 7. SSIM, PSNR, and MSE are calculated between the conventional and proposed mean filter outputs to estimate the performance of the proposed mean filter; the values are shown in Table 2.
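The percentages quoted above follow directly from the reported figures, and MSE and PSNR have simple closed forms. The sketch below checks the arithmetic and shows the two image metrics on a toy example. The helper names and the 2 x 2 images are hypothetical, and the peak value of 255 assumes 8-bit grayscale images, which the paper does not state; SSIM is omitted because it needs windowed statistics.

```python
import math


def percent_reduction(baseline, improved):
    """Relative saving of `improved` over `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


def mse(image_a, image_b):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    diffs = [(p - q) ** 2
             for row_a, row_b in zip(image_a, image_b)
             for p, q in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def psnr(mse_value, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    return 10.0 * math.log10(peak ** 2 / mse_value)


print(round(percent_reduction(490, 388), 1))     # power saving from the reported values
print(round(percent_reduction(4.14, 3.476), 1))  # delay saving from the reported values

# Hypothetical 2x2 images, for illustration only.
reference = [[10, 20], [30, 40]]
filtered = [[12, 20], [30, 38]]
error = mse(reference, filtered)
print(error, round(psnr(error), 2))
```

The first two lines reproduce the roughly 20% power and 16% delay savings reported for the proposed mean filter.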
Fig. 4 Mean filter architecture
Fig. 5 Simulation results of conventional mean filter (panels: input image, output image)
Fig. 6 Simulation results of proposed mean filter (panels: input image, output image)
Fig. 7 Input and output image of conventional mean filter and proposed mean filter (rows: conventional mean filter, proposed mean filter; columns: input image, output image)
Table 2 Parameters analysis
MSE | PSNR | SSIM
1.9174 | 45.1830 | 0.9508
5 Conclusion
This paper proposes an energy-efficient and high-speed CSLA based on carry generation logic. The proposed adder is implemented for different input bit widths using 180 nm CMOS technology. The proposed CSLA reduces delay and power dissipation. It is then implemented in a mean filter, which results in 20% less power dissipation and 16% less delay compared to the conventional mean filter. Image processing parameters such as SSIM, PSNR, and MSE are then analyzed to estimate the performance of the proposed adder.
References
1. D.K. Pradhan, Fault-tolerant carry-save adders. IEEE Trans. Comput. 21(11), 1320–1322 (1974)
2. H.E.W. Neil, K. Eshraghian, Principles of CMOS VLSI Design: a Systems Perspective (Addison-Wesley, 1985)
3. R. Datta, J.A. Abraham, R. Montoye, W. Belluomini, H. Ngo, C. McDowell, A low latency and low power dynamic carry save adder. IEEE Int. Symp. Circ. Syst. 2, II477–II480 (2004)
4. R. Mahalakshmi, T. Sasilatha, A power efficient carry save adder and modified carry save adder using CMOS technology, in IEEE International Conference on Computational Intelligence and Computing Research (2013)
5. R.A. Javali, R.J. Nayak, A.M. Mhetar, M.C. Lakkannavar, Design of high speed carry save adder using carry lookahead adder, in IEEE International Conference on Circuits, Communication, Control and Computing, pp. 33–36 (2014)
6. F.C. Cheng, S.H. Unger, M. Theobald, W.C. Cho, Delay-insensitive carry-look ahead adders, in IEEE International Conference on VLSI Design, pp. 322–328 (1997)
7. Y.T. Pai, Y.K. Chen, The fastest carry lookahead adder, in Second IEEE International Workshop on Electronic Design, Test and Applications, pp. 434–436 (2004)
8. P. Celinski, J.F. Lopez, S. Al-Sarawi, D. Abbott, Low depth, low power carry lookahead adders using threshold logic. Microelectron. J. 33(12), 1071–1077 (2002)
9. Saravanakumar, Vijeyakumar, Sakthisudhan, FPGA implementation of high speed hardware efficient carry select adder. Int. J. Reconfig. Embedded Syst. 7(1), 43–47 (2018)
10. D. Mohanapriya, N. Saravanakumar, A comparative analysis of different 32 bit Adder topologies with multiplexer based full adder. Int. J. Eng. Sci. Comput. 6(5), 4850–4854 (2016)
Mitigating Poisoning Attacks in Federated Learning
Romit Ganjoo, Mehak Ganjoo, and Madhura Patil
Abstract With the world transitioning into the era of artificial intelligence, a prodigious amount of data is being gathered and preserved in different silos. The excessive collection of data has resulted in various threats and privacy issues. This has called into question the conventional approaches of Artificial Intelligence and led to the development of a more robust approach to processing and using data. Federated Learning, which trains an algorithm across many decentralized servers that store data without exchanging it, has shown promising results under this new reality. However, the existing Federated Learning mechanism has been shown to possess vulnerabilities that can be exploited to compromise data privacy, and Federated Learning schemes have been subject to poisoning attacks. It is therefore extremely important to expose the flaws in the current design. This paper discusses the unique characteristics and challenges of Federated Learning, reviews the topic by highlighting two major poisoning attacks in Federated Learning, and provides a method to mitigate these poisoning attacks.
Keywords Machine learning · Federated learning · Cyber security
R. Ganjoo (B) · M. Patil
MIT-WPU, Pune, India
M. Ganjoo
Barclays GSC, Pune, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_50
1 Introduction
As the presence of technology increases day by day, so does the amount of data that is generated. Keeping large and varied data in central storage is difficult, as concerns about data privacy and user privacy arise [1]. It is not practical to store sensitive information from different users in a central place, which may lead to potential fraud [1–3]. As countries became more aware of where their data is being used, European countries created a legal framework, the General Data Protection Regulation (GDPR), which ensures the protection of data and the
R. Ganjoo et al.
user's confidentiality. Federated learning comes in where the centralized machine learning technique is not feasible. It plays a major role in training models on remote devices using their own data [2, 4–6]. How does federated learning help? It eliminates the issue of central data storage, as the models are trained on localised or siloed data. With federated learning, however, concerns about optimized training output and data privacy preservation arise, because large, complex networks are involved in training models on remote devices. This paper mainly discusses the poisoning attacks that happen in federated learning and measures to prevent them. The advantage of federated learning is that it yields a trained model with optimized accuracy by iterating the training process over different remote devices. In this process, it also secures private training data and can handle different types of data. However, federated learning is still vulnerable to poisoning attacks, which could be very dangerous, as malicious users could hijack personal sensitive data. For example, an attacker who invades a hospital's data and injects wrong patient records may cause severe complications. Therefore, measures are needed to prevent such attacks on federated machine learning. Some research addresses this problem. Bhagoji et al. [4] explore the threat of model poisoning attacks on federated learning and propose an alternating minimization method that optimizes the training loss and the adversarial objective. Xingchen et al. [7] perform a systematic investigation of such threats and propose a novel optimization-based model poisoning attack. This paper experiments with the concept of relative weight update, which helps in detecting malicious agents.
An encoding is generated from each agent's images, and the distance between each pair of agents is computed and compared. The results are then plotted, and any pair whose distance is larger than the threshold value is flagged as malicious. Figure 1 shows the system model of Federated Learning. The experiment is conducted in Sect. 8; before that, the paper explains the threats to federated learning and how data poisoning and model poisoning attacks occur on the model, and finally the results of the experiment are summarized.
2 Federated Learning
2.1 Types of Federated Learning
Federated learning techniques are classified on the basis of data distribution into horizontal FL, vertical FL, and federated transfer learning. Horizontal FL (HFL): this type of federated learning takes place only on horizontally partitioned data and is used for supervised machine learning tasks. The datasets under HFL share a similar feature space but contain samples from different users. That is, two users A and B may have the same type
Mitigating Poisoning Attacks in Federated Learning
Fig. 1 FL system model
of data, but they are different users. Vertical FL (VFL): VFL is applicable only when two datasets share the same sample space but differ in feature space. Federated transfer learning (FTL): data federation in FTL allows knowledge to be shared without compromising user privacy and enables complementary knowledge to be transferred in the network. The major difference between horizontal FL and vertical FL is that horizontal FL uses the same features of the dataset to train the global model, whereas vertical FL uses different features of the dataset and combines them to train the global model. Furthermore, federated learning can be classified into cross-device and cross-silo federated learning. In cross-device FL, only a few devices are available at a time to compute, and the data partition is fixed, i.e., horizontal. In cross-silo FL, all users are available most of the time, and the data partition can be either vertical or horizontal.
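The difference between the horizontal and vertical partitions can be made concrete with a toy dataset. Everything in the sketch below is hypothetical and only illustrative: the user ids, the feature names, and the helper functions `horizontal_split` and `vertical_split` are not from the paper.

```python
# Hypothetical toy dataset: one record per user, illustrative feature names.
records = {
    "user1": {"age": 34, "income": 52000, "purchases": 12},
    "user2": {"age": 27, "income": 48000, "purchases": 3},
    "user3": {"age": 45, "income": 61000, "purchases": 7},
    "user4": {"age": 52, "income": 75000, "purchases": 20},
}


def horizontal_split(data, users_a):
    """Horizontal FL: both parties hold the same features for different users."""
    party_a = {u: row for u, row in data.items() if u in users_a}
    party_b = {u: row for u, row in data.items() if u not in users_a}
    return party_a, party_b


def vertical_split(data, features_a):
    """Vertical FL: both parties hold the same users but different features."""
    party_a = {u: {f: v for f, v in row.items() if f in features_a}
               for u, row in data.items()}
    party_b = {u: {f: v for f, v in row.items() if f not in features_a}
               for u, row in data.items()}
    return party_a, party_b


h_a, h_b = horizontal_split(records, {"user1", "user2"})
v_a, v_b = vertical_split(records, {"age"})
print(sorted(h_a), sorted(h_b))                      # disjoint users, same features
print(sorted(v_a["user1"]), sorted(v_b["user1"]))    # same user, disjoint features
```

In the horizontal case each party could train the same model architecture locally, while in the vertical case the parties must combine their feature sets for the shared users before a joint model can be trained.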
2.2 Threats to Federated Learning
Federated Learning trains machine learning models over isolated devices while keeping the data localised [8]. The model is trained on real-time data held in decentralised datasets across the globe [9, 10]. Despite remote data centres and data localisation, numerous threats prevail in the Federated Learning model. The frequent bi-directional exchange of information opens the door to various issues, among them data leakage and malicious input to the FL model. The decentralized edge devices transfer machine learning parameters as gradients to the central server [11]. This opens up the possibility of attackers manipulating the gradients and thus tampering with the FL model [12–14]. Vulnerability lies at both ends: the server may be malicious, or the siloed data centres may behave in a biased way. The server
could be attacked to tamper with the gradients received during updates over time, which results in discrepancies in the training process, or to control the participants' view of the global parameters [15]. Remote devices, being an integral computation entity in this topology, are prone to multiple attacks. Malicious users can control the parameter uploads: either the inputs to the server are altered to tamper with the FL model training, or the participant simply leverages the model without actually contributing to the training dataset. Such attacks pose significant threats to FL, because in centralized learning only the server can violate participants' privacy, whereas in FL any participant may violate the privacy of peer participants, even without involving the server. Federated Learning systems face two types of threats, insider or outsider, based on their origin. Insider attacks are initiated by components of the FL network topology, either by the FL server or by the remote participants in the network. Gradient exchange in FL systems can result in data leakage, which opens the door to outsider attacks. These attacks are triggered by external agents eavesdropping on the communication. They can get hold of unintended features of the participants' training data and thus introduce disparity in the FL model. Figure 2 shows the various threat models of Federated Learning, while Fig. 3 shows the classification of Federated Learning attacks.
Fig. 2 Classification of threat models in federated learning
Fig. 3 Hierarchy of attacks in federated learning
2.3 Poisoning Attacks
A poisoning attack happens when bad data is injected into the model's training pool to compromise the learning process, to the point that the model eventually becomes useless. Poisoning attacks can be classified as (a) random attacks and (b) targeted attacks, depending on the attacker's objective. Random attacks aim to reduce the efficiency of the FL model, whereas targeted attacks induce the FL model to produce output labels according to the malicious input, which constructs a backdoor that the attacker can use in the future [16]. Since targeted attacks are mounted with a specific aim to hamper the FL system, they tend to be more harmful. At a high level, poisoning attacks in general attempt to change the behaviour of the target model in some undesirable way. Because a large amount of data is available, the model is susceptible to Trojan attacks as well [17]. Trojan attacks are achieved by introducing a small amount of poisoned data into the whole model [18]. If adversaries manage to compromise the FL server, they can easily perform both targeted and untargeted poisoning attacks on the trained model.
2.3.1 Data Poisoning
Manipulation of training data by an FL participant is referred to as data poisoning. A malicious participant introduces poisoned data instances or deliberately changes existing instances, leading to adversarial training of the FL model. These attacks are primarily divided into two categories: (1) clean label and (2) dirty label. In a clean label poisoning attack, the adversary cannot change the labels of the training data, because there is a process by which valid data is certified to be genuine. Here, the poisoning of data samples needs to be indiscernible. In a dirty label poisoning attack, by contrast, the attacker mislabels the training data by adding malicious instances to the training dataset. The label flipping attack is an example of a dirty label poisoning attack in which the labels of genuine training instances are flipped while the features of the data remain unaltered. For instance, the adversarial participant in the FL system tampers with the dataset by flipping all 0s into 1s. When the attack succeeds, the
model is unable to correctly identify the original 0s and accepts the altered values as true. The backdoor attack is another realistic example of a dirty label attack. Here, the attacker modifies small patches or features of the original training dataset to create backdoors in the model [19]. The performance of the model on clean input is unaffected, but the attacker can manipulate the FL model whenever the input contains the backdoor features. These attacks are therefore harder to detect.
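A label flipping attack of the kind described above can be sketched in a few lines. The dataset and the helper name `flip_labels` are hypothetical and serve only to show that the features stay untouched while the labels change.

```python
def flip_labels(dataset, source=0, target=1):
    """Return a copy of `dataset` in which every `source` label becomes `target`.

    Features are left unchanged, as in a label flipping attack.
    """
    return [(features, target if label == source else label)
            for features, label in dataset]


# Hypothetical local dataset of (features, label) pairs held by one participant.
clean = [([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.3, 0.2], 0)]
poisoned = flip_labels(clean)

changed = sum(c[1] != p[1] for c, p in zip(clean, poisoned))
print(changed, "of", len(clean), "labels flipped")   # prints "2 of 3 labels flipped"
```

A model trained on `poisoned` never sees a sample labelled 0, which is why it cannot learn to recognise the original class.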
2.3.2 Model Poisoning
A model poisoning attack refers to damaging the local model updates before they are sent to the server, or inserting hidden backdoors into the FL model. The attacker's main aim in targeted model poisoning is to make the FL model classify valid data as discrepant. Here, the datasets at test time are not modified; rather, the training process is corrupted so that it results in an inefficient FL model. FL systems are vulnerable to two types of model poisoning attacks: (1) untargeted poisoning attacks and (2) targeted poisoning attacks. An untargeted poisoning attack prevents the FL model from attaining high accuracy or even convergence. It produces a high error rate indiscriminately across testing datasets. In a targeted poisoning attack, the malicious agent manipulates the FL model at training time so that it produces attacker-chosen outputs at evaluation time. This type of attack affects only a subset of the training data while maintaining the overall accuracy of the model.
3 Adversarial Goal
The goal of the adversary is generally to make the classifier learned at the server misclassify a set of auxiliary data. Such data is available in large amounts, and the adversary exploits this abundance to remain unnoticed. The data consists of samples $\{x_i\}_{i=1}^{r}$ with labels $\{y_i\}_{i=1}^{r}$ that are to be classified as the desired target classes $\{\mu_i\}_{i=1}^{r}$. Thus, the objective of the adversary is:

$$\mathcal{A}(d_m \cup d_{aux}, w_G^t) = \max \sum_{i=1}^{r} \mathbb{1}\left[f(x_i; w_G^t) = \mu_i\right] \tag{1}$$

where
$d_m$: training data
$d_{aux}$: auxiliary data
$w_G^t$: global parameter vector
$f$: classifier.

At the same time, the adversary aims to keep the model performing well on the test dataset, so that the attack is not noticed. To tackle this hurdle, the system must take some preventive measures
to detect such anomalies in the system. So, for the detection of these anomalies, we have provided some control measures in the next section.
4 Relative Weight Distance Computation
Weight update statistics play an important role in flagging abnormal agents. The relative distance between an update and the remaining updates indicates how much that update differs from the others. The pairwise distance range between a particular update $\delta_m^t$ and the remaining updates is calculated using a distance metric $d$:

$$R_m = \left[\min_{i \in [k] \setminus m} d(\delta_m^t, \delta_i^t),\; \max_{i \in [k] \setminus m} d(\delta_m^t, \delta_i^t)\right] \tag{2}$$

Choosing absolute distance over relative distance makes the model prone to attacks [3]. For an agent to be marked as abnormal, its distance range has to differ from those of the others. A time-dependent threshold value, $\alpha_t$, is defined by the server. For a malicious agent $m$, let $R^l_{\min,[k] \setminus m}$ and $R^u_{\max,[k] \setminus m}$ be the minimum lower bound and the maximum upper bound of the distance ranges of the non-malicious agents among themselves. Then, for the model to stop the entry of the malicious agent and label it as abnormal, we need that:

$$\max\left\{\left|R_m^u - R^l_{\min,[k] \setminus m}\right|,\; \left|R_m^l - R^u_{\max,[k] \setminus m}\right|\right\} > \alpha_t \tag{3}$$

This condition checks how different the distance ranges are for malicious and non-malicious agents.
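The distance range of Eq. (2) can be computed directly once the updates are available as vectors. The sketch below assumes the Euclidean distance for the metric d and uses hypothetical two-dimensional updates; the function name `distance_ranges` is introduced only for this illustration, and the threshold test of Eq. (3) is deliberately left out.

```python
import math


def distance_ranges(updates):
    """Map each agent index m to (min, max) distance from its update to all others."""
    ranges = {}
    for m, update_m in enumerate(updates):
        distances = [math.dist(update_m, update_i)
                     for i, update_i in enumerate(updates) if i != m]
        ranges[m] = (min(distances), max(distances))
    return ranges


# Hypothetical 2-D weight updates; the last agent lies far from the others.
updates = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
for agent, (lower, upper) in distance_ranges(updates).items():
    print(agent, round(lower, 3), round(upper, 3))
```

In this toy example the lower bound of the outlying agent is much larger than the lower bounds of the other three agents, which is the kind of gap the threshold in Eq. (3) is meant to expose.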
5 Experimental Setup
Object detection is among the most useful applications of computer vision [20, 21]. To demonstrate the effect of the poisoning control measures, we have implemented a face recognition model. Ten agents are considered, including both malicious and non-malicious agents: eight agents are non-malicious, while agents 3 and 6 are malicious. The challenges faced by the model, apart from poisoning attacks, include training on images with varying lighting and contrast conditions [22, 23]. This experiment tests the two poisoning preventive measures on these agents. The effectiveness of the measures is judged by whether the model is able to recognize the malicious agents and flag them.
5.1 Generating Update
Agent updates preserve the data privacy of the users: users send their updates in the form of encodings. In the experiment, each agent is trained on face images (Fig. 4), and an encoding is generated. The encodings are arrays holding values of negative and positive polarity [24, 25]. They are the feature vectors, also known as gradient vectors [24, 26]. Figure 5 shows one third of the feature vector of agent 1. Once generated, the feature vectors are stored in a .py file. After all agents have generated their encoded files separately, the files are sent to the central model as updates. The central model then applies the attack measures to these updates to identify the malicious agents and prevent them from reducing model performance.
Fig. 4 Training images for 10 agents with agent 3 and 6 as malicious agent
Fig. 5 Feature vector of agent 1
5.2 Applying Control Measure
After the central model receives the updates from each agent, it applies the poisoning preventive measure described in Sect. 4 to them. The algorithm for measuring the relative distance between the updates is presented in Sect. 5.3.
5.3 Algorithm
Encode each agent's data to generate a feature vector.
Send the local update of each agent's feature vector to the FL model.
Create an Agent Pool containing all agent combinations of size 2.
for each combination in Agent Pool:
    Calculate the Euclidean distance using the feature vectors of the agents in the combination:

    $$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \tag{4}$$

    where
    p, q: concerned agents
    p_i, q_i: ith value of the feature vectors
    n: dimension of the feature vector.
    Set Threshold value for agent classification.
    if (d(p, q) > Threshold):
        Add agents into Suspicious Pool.
    else:
        Add agents into Non-Malicious Pool.
for each agent in Suspicious Pool:
    if agent present in Non-Malicious Pool:
        Label agent as 'B'.
    else:
        Label agent as 'M'.
where
B: Benign
M: Malicious.
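The steps of the algorithm can be written as a short runnable sketch. It is a minimal reading of the pseudocode under stated assumptions, not the authors' implementation: the function name `label_agents`, the two-dimensional feature vectors, and the threshold of 1.0 are all hypothetical, and agents that never enter the suspicious pool are labelled benign.

```python
import math
from itertools import combinations


def label_agents(feature_vectors, threshold):
    """Label each agent 'B' (benign) or 'M' (malicious) from pairwise distances.

    `feature_vectors` maps an agent id to its update vector.
    """
    suspicious, non_malicious = set(), set()
    for p, q in combinations(feature_vectors, 2):          # agent pool of pairs
        distance = math.dist(feature_vectors[p], feature_vectors[q])  # Eq. (4)
        pool = suspicious if distance > threshold else non_malicious
        pool.update((p, q))
    # An agent is benign if at least one of its pairs stayed within the threshold.
    return {agent: "M" if agent in suspicious and agent not in non_malicious else "B"
            for agent in feature_vectors}


# Hypothetical 2-D feature vectors; agents 3 and 6 are placed far from the rest.
vectors = {1: (0.0, 0.0), 2: (0.1, 0.0), 3: (5.0, 5.0),
           4: (0.0, 0.1), 5: (0.1, 0.1), 6: (-5.0, 5.0)}
print(label_agents(vectors, threshold=1.0))
```

One consequence of this rule is worth noting: two malicious agents that submit nearly identical updates would place each other in the non-malicious pool, so the sketch, like the pseudocode it follows, relies on malicious updates being far from every other update.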
5.4 Analysis
To analyse the proposed algorithm, consider a Federated Learning system with 10 agents, of which two, Agent 3 and Agent 6, are malicious. A suspicion table is created for this scenario to identify attacking agents. The suspicion table T (n × n) is defined as:

$$T = [\mathrm{Label}_{ij}], \qquad \mathrm{Label}_{ij} = \begin{cases} 1, & \text{if at least one of the agents } i, j \text{ is malicious} \\ 0, & \text{if agents } i, j \text{ are both non-malicious} \end{cases}$$

where i, j denote the agents placed in the row and column, respectively, n is the number of agents, and X marks a pair for which no comparison is made. With the help of Table 1, malicious agent(s) can be identified using the procedure described below.
1. As per the algorithm, the Agent Pool is created, and the Euclidean distance of each pair is compared against the threshold.
2. For an agent pair that lies well within the threshold, the value in the Suspicion Table is set to 0.
3. For an agent pair that lies beyond the threshold, the value in the Suspicion Table is set to 1.
4. The Suspicion Table is updated for all pairs in the Agent Pool.
5. A row and column in which the count of 1s is (n − 1) indicates that the concerned agent is malicious.
Table 1 Suspicion table for detecting malicious agents
Agent   1   2   3   4   5   6   7   8   9   10
1       X   0   1   0   0   1   0   0   0   0
2       0   X   1   0   0   1   0   0   0   0
3       1   1   X   1   1   1   1   1   1   1
4       0   0   1   X   0   1   0   0   0   0
5       0   0   1   0   X   1   0   0   0   0
6       1   1   1   1   1   X   1   1   1   1
7       0   0   1   0   0   1   X   0   0   0
8       0   0   1   0   0   1   0   X   0   0
9       0   0   1   0   0   1   0   0   X   0
10      0   0   1   0   0   1   0   0   0   X
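The row-count rule of step 5 can be illustrated with a short Python sketch that rebuilds Table 1 and reads the malicious agents off it. The helper names `build_suspicion_table` and `malicious_agents` are hypothetical, and the list of suspicious pairs is written down directly from Table 1 instead of being computed from real distances.

```python
def build_suspicion_table(n, suspicious_pairs):
    """Return an n x n table: 'X' on the diagonal, 1 for suspicious pairs, else 0.

    Agents are numbered 1..n; suspicious_pairs holds the (i, j) tuples whose
    distance exceeded the threshold.
    """
    flagged = {frozenset(pair) for pair in suspicious_pairs}
    table = []
    for i in range(1, n + 1):
        row = []
        for j in range(1, n + 1):
            if i == j:
                row.append("X")  # no comparison of an agent with itself
            else:
                row.append(1 if frozenset((i, j)) in flagged else 0)
        table.append(row)
    return table


def malicious_agents(table):
    """Agents whose row holds a 1 in all (n - 1) compared cells."""
    n = len(table)
    return [i + 1 for i, row in enumerate(table) if row.count(1) == n - 1]


# Pairs as they appear in Table 1: every pair that involves agent 3 or agent 6.
pairs = [(i, j) for i in range(1, 11) for j in range(i + 1, 11) if {i, j} & {3, 6}]
table = build_suspicion_table(10, pairs)
print(malicious_agents(table))
```

Because the table is symmetric, counting the 1s in a row is equivalent to counting them in the matching column.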
6 Results
To showcase the results, updates from different combination pairs of agents are taken, as shown in Fig. 6. The combinations are as follows: Benign versus Benign, Benign versus Malicious and Malicious versus Malicious. Table 2 shows the average distance value of the updates for these combinations. From the table, it can be verified that for combinations in which both agents are benign, the distance value (0.31) is far lower than for combinations in which at least one agent is malicious (0.72 and 0.74). All the pairs exceeding the threshold value are considered suspicious. The model was able to detect both malicious agents using the relative weight distance measure. Figure 7 shows the distance value for all the pair combinations.
Fig. 6 Relative weight distance plot for agents with different combination of malicious and benign agents
Table 2 Output table
Type                          Distance
Benign versus Benign          0.31
Benign versus Malicious       0.72
Malicious versus Malicious    0.74
Fig. 7 Agent classification based on relative distance of agent pairs for malicious and benign agents
7 Conclusion and Future Scope
As the quantity and exchange of data increase rapidly, the privacy and protection of data have become a major concern. The current threat protection model computes the variation among all the updates received from local models and classifies the highly varied updates as malicious. Many more such methods can be implemented for better protection. One option is to compare the accuracy of the model after each update is applied: if the resulting model has a validation accuracy lower than the model obtained by aggregating all the other updates, the server can flag that update and consider it malicious. Another preventive measure could be the Byzantine Isolation Model, which can help prevent data poisoning. Tang et al. [17] developed a mechanism for the detection of Trojan attacks; that approach can be used in Federated Learning as well to avoid data poisoning.
References
1. M. Abadi et al., Deep learning with differential privacy, in CCS (2016), pp. 308–318
2. L. Lyu, H. Yu, J. Zhao, Q. Yang, Threats to Federated Learning, in Federated Learning: Privacy and Incentive
3. Y. Aono, T. Hayashi, L. Wang, S. Moriai et al., Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 13(5), 1333–1345 (2018)
4. A.N. Bhagoji, S. Chakraborty, P. Mittal, S. Calo, Analyzing federated learning through an adversarial lens
5. A. Bhowmick, J. Duchi, J. Freudiger, G. Kapoor, R. Rogers, Protection against reconstruction and its applications in private federated learning. CoRR (2018). arXiv:1812.00984
6. Q. Yang, Y. Liu, Y. Cheng, Y. Kang, T. Chen, H. Yu, Federated Learning (Morgan & Claypool Publishers, San Rafael, 2019)
7. X. Zhou, M. Xu, Y. Wu, N. Zheng, Deep model poisoning attack on federated learning. Future Internet 13, 73 (2021). https://doi.org/10.3390/fi13030073
8. H.B. McMahan, E. Moore, D. Ramage, B.A. Arcas, Federated learning of deep networks using model averaging. CoRR (2016). arXiv:1602.05629
9. B. McMahan, E. Moore, D. Ramage, S. Hampson, B.A. Arcas, Communication-efficient learning of deep networks from decentralized data, in Artificial Intelligence and Statistics (2017), pp. 1273–1282
10. H.B. McMahan, D. Ramage, K. Talwar, L. Zhang, Learning differentially private recurrent language models, in ICLR (2018)
11. Q. Yang, Y. Liu, T. Chen, Y. Tong, Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 10(2), 1–19 (2019)
12. M. Kantarcioglu, C. Clifton, Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16(9), 1026–1037 (2004)
13. D. Cao, S. Chang, Z. Lin, G. Liu, D. Sun, Understanding distributed poisoning attack in federated learning, in 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS) (2019)
14. P. Kairouz et al., Advances and open problems in federated learning. CoRR, arXiv:1912.04977 (2019)
15. K. Bonawitz et al., Practical secure aggregation for privacy-preserving machine learning, in CCS (2017), pp. 1175–1191
16. T. Li, A.K. Sahu, A. Talwalkar, V. Smith, Federated learning: challenges, methods, and future directions. CoRR, arXiv:1908.07873 (2019)
17. R. Tang, M. Du, N. Liu, F. Yang, X. Hu, An embarrassingly simple approach for Trojan attack in deep neural networks
18. R. Wang, G. Zhang, S. Liu, P.-Y. Chen, J. Xiong, M. Wang, Practical detection of Trojan neural networks: data-limited and data-free cases
19. T.D. Nguyen, P. Rieger, M. Miettinen, A.-R. Sadeghi, Poisoning attacks on federated learning-based IoT intrusion detection system, in Workshop on Decentralized IoT Systems and Security (DISS) 2020, San Diego, CA, USA, 23–26 Feb 2020
20. Y. Liu et al., Fedvision: an online visual object detection platform powered by federated learning, in IAAI (2020)
21. R. Szeliski, Computer Vision: Algorithms and Applications (September 3, 2010)
22. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'2005), San Diego, CA (2005), pp. 886–893
23. Z. Lachiri, A. Adouani, W.M. Ben Henia, Comparison of Haar-like, HOG and LBP approaches for face detection in video sequences, in 16th International Multi-Conference on Systems, Signals & Devices (2019)
24. R. Ganjoo, A. Purohit, Anti-spoofing door lock using face recognition and blink detection, in 2021 6th International Conference on Inventive Computation Technologies (ICICT) (2021), pp. 1090–1096. https://doi.org/10.1109/ICICT50816.2021.9358795
25. M. Ghorbani, A. Tavakoli Targhi, M. Mahdi Dehshibi, HOG and LBP: towards a robust face recognition system, in The Tenth International Conference on Digital Information Management (ICDIM 2015)
26. M.A. Hearst, S.T. Dumais, E. Osuna, J. Platt, B. Scholkopf, Support vector machines. IEEE Intell. Syst. Their Appl. (1998)
An Audit Trail Framework for Saas Service Providers and Outcomes Comparison in Cloud Environment
M. N. V. Kiranbabu, A. Francis Saviour Devaraj, Yugandhar Garapati, and Kotte Sandeep
Abstract The features of cloud computing are evaluated with different functionalities and techniques. In service-oriented architecture, assessing a service provider's efficiency, economy and effectiveness is a persistent challenge. Consumer preferences in a SaaS environment vary with how effectively the resources delivered by service providers are utilized. Disparate broker-based approaches address these issues, but with uncertainty. We address them with audit trails based on provenance data. To enhance the quality of service, we introduce a delegate auditor service that assesses the service provider's service against consumer preferences, based on provenance data, for SaaS provisioning in a cloud environment.
Keywords Cloud auditor · QoS economics · Audit trail · Provenance data · Provider efficiency approach · Performance assessment
M. N. V. Kiranbabu (B)
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India
A. Francis Saviour Devaraj
Kalasalingam Academy of Research and Education, Krishnakoil, Tamil Nadu, India
Y. Garapati
GITAM Deemed to be University, Hyderabad, India
K. Sandeep
Dhanekula Institute of Engineering and Technology, Vijayawada, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_51

1 Introduction
Large data computations in present-day service-oriented architectures (SOA) rely prominently on cloud computing. SOA faces many issues, such as Quality of Service (QoS), data integrity parameters, resource utilization, compliance monitoring, trustworthiness and accountability. The stakeholders in the cloud paradigm integrate provenance data when examining integrity, resource utilization and data logs. Provenance data helps in QoS audits, service workflows and error detection [1]. Tracking workflows from the provenance data yields policy monitoring solutions for optimal resource utilization [2]. Most IT-based enterprises adopt cloud computing for effective computation workflows, and the associated workflow problems can be addressed by sensing provenance data [3]. Provenance falls into several categories that differ in the level at which data is collected and acquired. A few of these categories are scenario based, maximum provenance, minimum provenance and no provenance, and their levels of collection are generalized as fine grained and coarse grained [4]. A provider's services and their workflows are recorded in a provenance database, which helps to analyse the activity log of provider and consumer; based on this log, the auditor can assess economic, effective and efficient performance [5]. The importance of broker-based approaches, negotiation issues and authentication issues is well documented and acts as a baseline for the development of our framework [6–8]. We identified a need for greater focus on issues such as resource allocation and assessment [9, 10], consideration of Quality of Service parameters while ranking the provider and VM instance [11–13], secure data transaction [14–17], the emergence of service-oriented architecture in response to service orchestration [18], and provenance-based cloud data integrity together with the study of various anti-forensics and insider attacks [19–21].
This paper is organized into seven parts. The introduction discloses the importance of provenance data for the audit trail. The review of literature explores the research works that are useful for our framework description. The framework description presents the conceptual diagram with procedural guidelines. The next parts model the problem and present the results, which show the providers' policy assessment. The result analysis and comparison set our framework against other approaches. The conclusion closes with the future scope.
2 Review of Literature
The demand for SaaS resource utilization, in both distributed and centralized forms, is growing in cloud computing. Big data volumes with complex computations can be handled in cloud computing. Provenance takes a commanding role because the data and the actions on it are recorded for each service, which supports the audit trail. Provenance data is treated as a priority for different specific applications in order to deliver an optimal outcome [3]. The workflows of different services raise the issue of dependability and exhibit the workflow states at a given instant. This work demonstrates the tangible value of provenance for dependability issues; provenance has mostly been used in SOA architectures. The failure states of workflows and fault-tolerance schemes are presented in a provenance-aware algorithm that reports results on recoverability and dependability [22].
An auditing service embedded with provenance data is a pivotal element in evaluating SaaS providers in a cloud environment. This work focuses on protocols, standards of data processing methods and implementations. Provenance creation was automated with a BEEP tool, and provenance data was assessed with the help of Trov, DBNotes and orchestration tools. The challenges of parsing provenance data and accessing it from the database are well discussed [23]. Security features comprising integrity, confidentiality and availability are taken as primary concerns and compared with data correlations of provenance usage. These issues are investigated with a data forensic approach that covers authentication and authorization and evaluates the auditing features. Resource monitoring and tracking with proactive forensic techniques are well discussed [24]. The specific goal of a service is a composition of processes and objectives, which can be collected from provenance. Integrating these services may become a complex issue. A Markov decision approach was applied to the composition of services to show service quality in a workflow environment. Most services are endorsed with non-functional properties that are judged for service discovery and QoS issues. An optimal objective function with the Markov decision approach was used to inspect the policy [25]. PROV was used for monitoring provenance, where the framework takes small datasets to justify service object properties and logical and physical service representations. Rule-based message translators are used for checking events in provenance, which capture the history of events and the states of service models [1]. The prominent role of the SLA in service selection and QoS monitoring is examined, and the pros and cons of that specific issue are identified. A seven-step process was used to determine QoS through differentiation of classes, prioritization of class properties, etc. [26]. Secure administration based on an authorized database monitored with references was also addressed. The proposed provenance approach predicts entity events, entity roles, entity messages and states, and targets the consumer's profile history, roles and behaviour regarding resource utilization at specific instances [2]. Elasticity has been adopted widely in mobility services. The smart mobility for all (SMALL) architecture was presented for secured payment gateways; the work also addresses scheduling, routing, data maintenance and data trustworthiness, with accountability, scalability and customization [27]. The SaaS service model creates a multi-customer, multi-provider, multi-service provisioning environment. This work studies performance efficiency in sharing cached data. Its framework integrates metadata with a caching mechanism used for communication between data objects, and a check-pointing method was used along with PUSH and PULL algorithms to obtain an optimal outcome [28].
3 Framework Description
The literature study above revealed many research gaps, which motivated the framework proposed here (Fig. 1). Figure 1 shows the conceptual view of the audit trail framework for SaaS provisioning in a cloud environment. The top level of the conceptual diagram shows the stakeholders of the framework. The middle level shows the auditor management operations on SaaS services, which are driven by the inputs of the top-level stakeholders. The bottom level targets the performance audit in three respects: economy, effectiveness and efficiency. Separate mechanisms are initiated for each of these performance audit segments and are finally integrated to achieve an improved performance assessment of the service provider [17, 29]. The performance audit trail framework proceeds in the following sequence.
• The auditor acquires the provenance data from the cloud service provider and consumer, who are mutually tagged based on a scenario.
• The audit trail assesses the economic constraints of the services offered by the service provider and computes their feasibility with respect to consumer preferences.
• The audit trail then targets effective policy monitoring based on the SLA values in the acquired provenance data and compares the outcomes of each service provider's service tangibility for consumer preferences.
Fig. 1 Conceptual view of Audit Trail Framework
• Finally, the audit trail computes the efficiency of services for the consumer preferences in detail, with respect to the utilization percentage of services.
• Integrating the above methods yields the improved provenance-based performance audit trail for SaaS provisioning in a cloud environment.
4 Modelling the Problem
Based on the research gaps identified, our problem statement is that an improved and integrated audit trail is needed for performance assessment: the earlier mechanisms are each tied to a specific application, and their outcomes are isolated and ambiguous. Most of the mechanisms are broker based, which leads to domination by the intermediate delegate role. We take a dataset [Table 1] from a broker-based SaaS provider assessment approach, treat it as provenance data and apply our audit trails to obtain optimal outcomes transparently, without favouring any of the participating stakeholders.

Table 1 SaaS service providers
SaaS_ID   Availability   Reliability   Cost ($)   Response time (ms)
SP2       0.99968        0.99953       0.762      0.2
SP4       0.99988        0.99964       0.804      0.3
SP6       0.99963        0.99958       0.444      0.6
SP8       0.99918        0.99975       0.506      0.2
SP10      0.99958        0.99956       0.484      0.3
SP12      0.99981        0.99976       0.134      0.7
SP14      0.99924        0.99983       0.234      0.5
SP16      0.99948        0.99973       0.454      0.3
SP18      0.99999        0.99962       0.478      0.2
SP20      0.99943        0.99972       0.414      0.5
SP22      0.99959        0.99977       0.544      0.2
SP24      0.99999        0.99992       0.508      0.5

The above dataset [Table 1] is normalized for consistent computations. The performance audit trail framework proceeds in the sequence [B], [C], [D].

[B] The auditor implements the economic assessment of services based on QoS through the following computations on the above data table.
Quality-Driven Economic Performance Audit Trail (EPAT) algorithm
Step 1: Classify the Quality of Service parameters into two levels: utility level and cost level.
Step 2: The utility level is computed for availability and reliability using

$$\frac{SP(\text{Availability}) - \mathrm{Min}(\text{Availability})}{\text{Difference Value}(\text{Availability})}, \qquad \frac{SP(\text{Reliability}) - \mathrm{Min}(\text{Reliability})}{\text{Difference Value}(\text{Reliability})}$$

Step 3: The cost level is computed for cost and response time using

$$\frac{\mathrm{Max}(\text{Cost}) - SP(\text{Cost})}{\text{Difference Value}(\text{Cost})}, \qquad \frac{\mathrm{Max}(\text{Response Time}) - SP(\text{Response Time})}{\text{Difference Value}(\text{Response Time})}$$

where Difference Value(x) = Max(x) − Min(x) over all service providers.
Step 4: Utility Score = Sum(Utility Level, Cost Level).

[C] The auditor implements effective policy monitoring based on the SLA through the following computations on the above data table. While assessing effectiveness, the auditor also considers the sensitivity value of the consumer towards the provider.
Policy Monitoring Performance Audit Trail (PPAT) algorithm
Step 1: Classify the policy parameters into two levels: utility level (UL) and cost level (CL).
Step 2: UL1 = Utility level (Sensitivity Value).
Step 3: SU1 = SLA value for utility level * UL1.
Step 4: UL = SU1 * UL1 / Average Availability of SaaS_ID.
Step 5: CL1 = Cost level (Sensitivity Value).
Step 6: SC1 = SLA value for cost level * CL1.
Step 7: CL = SC1 * CL1 / Average Response Time of SaaS_ID.
Step 8: Compare the UL and CL values of the service providers of Table 1 with the benchmark value of the proposed SLA.

[D] The auditor implements the efficiency assessment of service providers by calculating their utilization percentage based on Table 2, which is derived from the data of Table 1 with a few additional assumptions (Tables 3, 4).
Service Efficiency Performance Audit Trail (EFPAT) algorithm
Step 1: Based on the service accessible time and due time of service given in Table 1, the auditor computes the service completion time.
Service completion time = Current service accessible time + previous service completion time.
Step 2: The auditor computes the total service accessible time (ms).

$$\text{Total service accessible time} = \sum_{i=1}^{n} \text{Service}(i)\ \text{accessible time}$$

Step 3: The auditor then computes the total service completion time.

$$\text{Total service completion time} = \sum_{i=1}^{n} \text{Service}(i)\ \text{completion time}$$

Step 4: Compute the average number of services provided by the provider.

$$\text{Average number of services} = \frac{\sum_{i=1}^{n} \text{Service}(i)\ \text{completion time}}{\sum_{i=1}^{n} \text{Service}(i)\ \text{accessible time}}$$

Step 5: Finally, the auditor computes the utilization (%) of services.
Utilization (%) = (1 / Average number of services) * 100.
Step 6: The auditor chooses the best service provider based on the highest utilization %.

Table 2 SaaS service providers' utility score
Rank   Service providers for our algorithm rank   Utility score for EPAT algorithm
1      SP24                                       2.84179104
2      SP18                                       2.71733639
3      SP22                                       2.50961716
4      SP12                                       2.36752137
5      SP16                                       2.20557894
6      SP14                                       2.09405111
7      SP8                                        2.00887868
8      SP4                                        1.94624881
9      SP10                                       1.84836218
10     SP20                                       1.77791101
11     SP2                                        1.67997052
12     SP6                                        1.42107412

Table 3 SaaS service providers' policy monitoring assessment
SaaS_ID   PPAT algorithm availability policy values   PPAT algorithm response time policy values
SP2       0.989739                                    0.021333
SP4       0.990135                                    0.048
SP6       0.98964                                     0.192
SP8       0.988749                                    0.021333
SP10      0.989541                                    0.048
SP12      0.989997                                    0.261333
SP14      0.988868                                    0.133333
SP16      0.989343                                    0.048
SP18      0.990353                                    0.021333
SP20      0.989244                                    0.133333
SP22      0.989561                                    0.021333
SP24      0.990353                                    0.133333
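The EPAT computation of step [B] can be checked with a short Python sketch. It assumes that "Difference Value" means the maximum minus the minimum of a parameter over all providers (min-max normalization), a reading that reproduces the scores of Table 2 from the data of Table 1. The names `PROVIDERS` and `epat_scores` are illustrative only.

```python
# Table 1 rows: SaaS_ID -> (availability, reliability, cost, response time)
PROVIDERS = {
    "SP2": (0.99968, 0.99953, 0.762, 0.2),
    "SP4": (0.99988, 0.99964, 0.804, 0.3),
    "SP6": (0.99963, 0.99958, 0.444, 0.6),
    "SP8": (0.99918, 0.99975, 0.506, 0.2),
    "SP10": (0.99958, 0.99956, 0.484, 0.3),
    "SP12": (0.99981, 0.99976, 0.134, 0.7),
    "SP14": (0.99924, 0.99983, 0.234, 0.5),
    "SP16": (0.99948, 0.99973, 0.454, 0.3),
    "SP18": (0.99999, 0.99962, 0.478, 0.2),
    "SP20": (0.99943, 0.99972, 0.414, 0.5),
    "SP22": (0.99959, 0.99977, 0.544, 0.2),
    "SP24": (0.99999, 0.99992, 0.508, 0.5),
}

UTILITY_COLUMNS = (0, 1)  # availability, reliability: higher is better
COST_COLUMNS = (2, 3)     # cost, response time: lower is better


def epat_scores(providers):
    """Sum of the min-max normalized utility and cost levels per provider."""
    columns = list(zip(*providers.values()))
    lowest = [min(column) for column in columns]
    highest = [max(column) for column in columns]
    scores = {}
    for name, row in providers.items():
        total = 0.0
        for k in UTILITY_COLUMNS:  # Step 2
            total += (row[k] - lowest[k]) / (highest[k] - lowest[k])
        for k in COST_COLUMNS:     # Step 3
            total += (highest[k] - row[k]) / (highest[k] - lowest[k])
        scores[name] = total       # Step 4
    return scores


scores = epat_scores(PROVIDERS)
ranking = sorted(scores, key=scores.get, reverse=True)
for rank, name in enumerate(ranking, start=1):
    print(rank, name, round(scores[name], 8))
```

Each normalized term lies between 0 and 1, so the utility score is bounded by the number of QoS parameters, here 4.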
5 Results
The computations performed in the first step [B] of modelling the problem, using the SaaS service providers of Table 1, give the utility scores shown in Table 2 and Fig. 2 [Note: SP represents service provider]. The computations performed in the second step [C], again using Table 1, give the policy values shown in Table 3 and Fig. 3. Finally, the computations performed in the third step [D] give the service utilization percentages shown in Table 4 and Fig. 4.
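The policy values of step [C] can be sketched in the same way. The text does not state the SLA values or how the sensitivity value is obtained, so the sketch below rests on two assumptions: the sensitivity value of a provider is taken to be its own parameter value from Table 1, and the SLA values are set to 0.99 for availability and 0.2 for response time. These choices were inferred only because they reproduce Table 3; they are not given by the authors. The names `QOS` and `ppat_policy` are hypothetical.

```python
# Table 1 columns used by PPAT: SaaS_ID -> (availability, response time in ms)
QOS = {
    "SP2": (0.99968, 0.2), "SP4": (0.99988, 0.3), "SP6": (0.99963, 0.6),
    "SP8": (0.99918, 0.2), "SP10": (0.99958, 0.3), "SP12": (0.99981, 0.7),
    "SP14": (0.99924, 0.5), "SP16": (0.99948, 0.3), "SP18": (0.99999, 0.2),
    "SP20": (0.99943, 0.5), "SP22": (0.99959, 0.2), "SP24": (0.99999, 0.5),
}


def ppat_policy(values, sla_value):
    """Apply PPAT steps 2-4 (or 5-7) to one parameter.

    Assumption: the sensitivity value of a provider equals its own
    parameter value, so the level is simply values[name].
    """
    average = sum(values.values()) / len(values)
    policy = {}
    for name, level in values.items():
        sla_weighted = sla_value * level               # Step 3 / Step 6
        policy[name] = sla_weighted * level / average  # Step 4 / Step 7
    return policy


availability = {name: row[0] for name, row in QOS.items()}
response_time = {name: row[1] for name, row in QOS.items()}

# Assumed SLA values (not stated in the text); they reproduce Table 3.
ul = ppat_policy(availability, sla_value=0.99)
cl = ppat_policy(response_time, sla_value=0.2)

for name in QOS:
    print(name, round(ul[name], 6), round(cl[name], 6))
```

Under these assumptions each policy value grows with the square of the provider's own parameter value, which is why providers with equal response times share the same response time policy value in Table 3.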
Table 4 SaaS service providers' utilization percentage
SaaS_ID   Service providers   Service accessible time (ms)   Due time of the service (ms)   Service completion time   EFPAT algorithm utilization (%) of services to customer
SP2       A                   2                              2                              2                         66.6666
          B                   2                              2                              4
SP4       C                   4                              4                              4                         77.7777
          D                   10                             17                             14
SP6       E                   5                              15                             5                         77.2727
          F                   12                             18                             17
SP8       G                   3                              6                              3                         80
          H                   9                              15                             12
SP10      I                   5                              5                              5                         76.1904
          J                   11                             16                             16
SP12      K                   5                              14                             5                         78.2608
          L                   13                             19                             18
SP14      M                   4                              5                              4                         75
          N                   8                              13                             12
SP16      O                   6                              6                              6                         76.9230
          P                   14                             15                             20
SP18      Q                   3                              5                              3                         76.92307692
          R                   7                              7                              10
SP20      S                   10                             13                             10                        64.28571429
          T                   8                              14                             18
SP22      U                   12                             20                             12                        64.70588235
          V                   10                             10                             22
SP24      W                   2                              2                              2                         83.33333333
          X                   8                              8                              10
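The utilization column of Table 4 follows from the EFPAT steps, as the sketch below illustrates. It assumes that each provider serves the two services listed for it in Table 4 in order; the names `ACCESSIBLE_TIME` and `efpat_utilization` are illustrative.

```python
# Service accessible times (ms) per provider, in the order listed in Table 4.
ACCESSIBLE_TIME = {
    "SP2": [2, 2], "SP4": [4, 10], "SP6": [5, 12], "SP8": [3, 9],
    "SP10": [5, 11], "SP12": [5, 13], "SP14": [4, 8], "SP16": [6, 14],
    "SP18": [3, 7], "SP20": [10, 8], "SP22": [12, 10], "SP24": [2, 8],
}


def efpat_utilization(accessible_times):
    """Utilization (%) of one provider from its service accessible times."""
    completion_times, elapsed = [], 0
    for t in accessible_times:
        elapsed += t  # Step 1: accessible time + previous completion time
        completion_times.append(elapsed)
    total_accessible = sum(accessible_times)                # Step 2
    total_completion = sum(completion_times)                # Step 3
    average_services = total_completion / total_accessible  # Step 4
    return 100.0 / average_services                         # Step 5


utilization = {name: efpat_utilization(times) for name, times in ACCESSIBLE_TIME.items()}
best = max(utilization, key=utilization.get)  # Step 6
for name, value in utilization.items():
    print(name, round(value, 4))
print("best provider:", best)
```

The due time of a service does not enter the utilization formula, so it is left out of the sketch.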
6 Comparison of Results
To compare the results, we use the coefficient of mean deviation about the median, which measures the data distortion rate of the outcomes in the result tables relative to the earlier approaches (Figs. 5, 6).
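The text does not spell out the formula for this statistic. The sketch below uses the standard textbook definition, the mean absolute deviation about the median divided by the median, and applies it to a small made-up sample. It is meant only to show the calculation and is not claimed to reproduce the values reported in Table 6.

```python
from statistics import median


def coefficient_of_mean_deviation(values):
    """Mean absolute deviation about the median, divided by the median."""
    m = median(values)
    mean_deviation = sum(abs(v - m) for v in values) / len(values)
    return mean_deviation / m


# Hypothetical sample: median 3, absolute deviations 2, 1, 0, 1, 2.
sample = [1, 2, 3, 4, 5]
print(coefficient_of_mean_deviation(sample))
```

A lower coefficient means the scores are clustered more tightly around their median.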
Fig. 2 Utility score of service providers. X-axis list of service providers with rank and Y-axis utility score scale from 0 to 3
Fig. 3 Utility score of service providers. X-axis list of service providers’ availability and response time policy values and Y-axis policy values variation rate
Fig. 4 Utilization score of services by the customer. X-axis list of service providers utilization score of the services and Y-axis utilization of services with related to time
Fig. 5 Utility score of service providers' availability for different algorithms with coefficient of mean deviation from median. X-axis: list of service providers' availability coefficient of mean deviation from median; Y-axis: coefficient of mean deviation scale
Fig. 6 Utility score of service providers' response time for different algorithms with coefficient of mean deviation from median. X-axis: list of service providers' response time coefficient of mean deviation from median; Y-axis: coefficient of mean deviation scale
7 Conclusion
In presenting this provenance-based performance audit trail framework, we evaluated the economy of service providers based on QoS parameters, the effectiveness of their policy monitoring and their efficiency, using explicit mathematical computations (Tables 5, 6, 7, 8). The proposed model integrates these evaluations and compares its results with those of the earlier approaches. The framework highlights the importance of provenance data as an input for the audit trail. In future, this approach opens the door to addressing the integrity issues of provenance data in the cloud (Figs. 7, 8).
Table 5 Outcome comparisons with utility score of service providers
SaaS_ID   Broker-based utility score   Topsis method utility score   EPAT algorithm utility score
SP2       22.9                         48.2                          1.679970518
SP4       35.59                        55.29                         1.946248813
SP6       33.69                        40.47                         1.421074117
SP8       27.08                        47.94                         2.008878684
SP10      29.51                        47.71                         1.848362178
SP12      60.49                        57.62                         2.367521368
SP14      42.24                        48.81                         2.094051112
SP16      37.74                        54.66                         2.205578943
SP18      47.71                        65.05                         2.717336395
SP20      36.54                        45.03                         1.777911015
SP22      41.45                        62.23                         2.509617156
SP24      65.9                         70.6                          2.841791045
Table 6 Outcome comparisons with utility score of service providers for different algorithms with coefficient of mean deviation from median values
Algorithms            Availability   Response time
Broker based method   0.479835       5.623002213
EPAT algorithm        0.466084       5.348708237
Table 7 Outcome comparisons with policy values of service providers' availability towards consumer
SaaS_ID   Availability   Broker-based SLA value comparisons (availability)   PPAT algorithm comparisons (availability)
SP2       0.99968        1.99933                                             0.989739
SP4       0.99988        1.99973                                             0.990135
SP6       0.99963        1.99923                                             0.98964
SP8       0.99918        1.998331                                            0.988749
SP10      0.99958        1.99913                                             0.989541
SP12      0.99981        1.99959                                             0.989997
SP14      0.99924        1.998451                                            0.988868
SP16      0.99948        1.99893                                             0.989343
SP18      0.99999        1.99995                                             0.990353
SP20      0.99943        1.99883                                             0.989244
SP22      0.99959        1.99915                                             0.989561
SP24      0.99999        1.99995                                             0.990353
Table 8 Outcome comparisons with policy values of service providers' response time towards consumer
SaaS_ID   Response time (ms)   Broker-based SLA value comparisons (response time)   PPAT algorithm comparisons (response time)
SP2       0.2                  0.024                                                0.021333
SP4       0.3                  0.054                                                0.048
SP6       0.6                  0.216                                                0.192
SP8       0.2                  0.024                                                0.021333
SP10      0.3                  0.054                                                0.048
SP12      0.7                  0.294                                                0.261333
SP14      0.5                  0.15                                                 0.133333
SP16      0.3                  0.054                                                0.048
SP18      0.2                  0.024                                                0.021333
SP20      0.5                  0.15                                                 0.133333
SP22      0.2                  0.024                                                0.021333
SP24      0.5                  0.15                                                 0.133333
Fig. 7 Policy assessment of broker-based and PPAT algorithm working comparison. X-axis list of service providers with policy of different algorithms and Y-axis policy scale
Fig. 8 Comparison graph of response time policy for broker-based and PPAT algorithm. X-axis list of service providers’ policy of different algorithm and Y-axis policy scale
References
1. H. Rafat, C-S. Wu, Provenance as a service: a data-centric approach for real-time monitoring, in 2014 IEEE International Congress on Big Data. IEEE (2014)
2. M. Ramane, B. Vasudevan, S. Allaphan, A provenance-policy based access control model for data usage validation in cloud. arXiv preprint arXiv:1411.1933 (2014)
3. G.T. Lakshmanan et al., Guest editors' introduction: provenance in web applications. IEEE Int Comput. 15(1), 17–21 (2010)
4. W.-T. Tsai et al., A new soa data-provenance framework, in Eighth International Symposium on Autonomous Decentralized Systems (ISADS'07). IEEE, 2007
5. Y. Feng, W. Cai, Provenance provisioning in mobile agent-based distributed job workflow execution, in International Conference on Computational Science. Springer, Berlin, Heidelberg, 2007
6. M.N.V. Kiranbabu, JARDCS special issue on signal and image processing technique. J. Adv. Res. Dynam. Control Syst 9(12), 1822–1832 (2017)
7. E. Badidi, A broker-based framework for integrated sla-aware saas provisioning. arXiv preprint arXiv:1605.02432 (2016)
8. V. Reddy, Y. S. Krishna, K. Thirupathi Rao, Distributed authentication for federated clouds in secure cloud data storage. Indian J. Sci. Technol, 9.19, 1–7 (2016)
9. S. Praveen, K. T. Rao, An effective multi-faceted cost model for auto-scaling of servers in cloud (2019)
10. S.P. Praveen, K.T. Rao, B. Janakiramaiah, Effective allocation of resources and task scheduling in cloud environment using social group optimization. Arab. J. Sci. Eng 43, 4265–4272 (2018). https://doi.org/10.1007/s13369-017-2926-z
11. S. Phani Praveen, K. Thirupathi Rao, An optimized rendering solution for ranking heterogeneous VM instances, in Intelligent Engineering Informatics, eds. by V. Bhateja, C. Coello Coello, S. Satapathy, P. Pattnaik. Advances in Intelligent Systems and Computing, vol 695 (Springer, Singapore, 2018). https://doi.org/10.1007/978-981-10-7566-7_17
12. N. Shinde, P. Sai Kiran, A survey of cloud auction mechanisms & decision making in cloud market to achieve highest resource & cost efficiency, in 2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT). IEEE, 2016
13. S. A. Patil, K. Thirupathi Rao, S. Patil, Survey on features and classification techniques in music genre classification. HELIX 8.5, 3833–3837 (2018)
14. K.T. Rao, S. Saidhbi, Data security mechanism in private cloud – a case study. J. Adv. Res. Dyn. Control Syst. 9, 2060–2067 (2017)
15. T. Burramukku, V. Manjusha, G. Pavani, Y. Sainath, Proof of ownership scheme for deduplication using yes-no bloom filter. J. Adv. Res. Dynam. Control Syst. 9, 858–868 (2017)
16. S. B. Rathod, V. Krishna Reddy, Dynamic framework for secure VM migration over cloud computing. J. Inform. Process. Syst. 13.3 (2017)
17. I. Lee, Pricing and profit management models for SaaS providers and IaaS providers. J. Theor. Appl. Electron. Commerce Res. 16(4), 859–873 (2021)
18. S. Saidhbi, K. Thirupathi Rao, A modern approach in cloud computing storage by using compression and crypto mechanism. Int. J. Appl. Eng. Res. 12.9, 1815–1818 (2017)
19. D. Vadlamudi, T.R. Komati, S. Bodempudi, L. Kadulla, A framework for data integrity through lineage tracking in cloud. Int. J. Eng. Technol 7, 477–480 (2018). https://doi.org/10.14419/ijet.v7i2.32.16272
20. D. Vadlamudi et al., Analysis on digital forensics challenges and anti-forensics techniques in cloud computing. Int. J. Eng. Technol, 7.2.7, 1072–1075 (2018)
21. T. Gunasekhar, A framework for prevention and detection of insider attacks in cloud infrastructure (2017)
22. P. Townend, P. Groth, J. Xu, A provenance-aware weighted fault tolerance scheme for service-based applications, in Eighth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC'05). IEEE, 2005
23. A. Bates et al., Transparent web service auditing via network provenance functions, in Proceedings of the 26th International Conference on World Wide Web (2017)
24. G. Meera, G. Geethakumari, A provenance auditing framework for cloud computing systems, in 2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES). IEEE, 2015
25. M. Naseri, A. L. Simone, Automatic service composition using POMDP and provenance data, in 2013 IEEE Symposium on Computational Intelligence and Data Mining (CIDM). IEEE, 2013
26. G. Dobson, A. Sanchez-Macian, Towards unified QoS/SLA ontologies, in 2006 IEEE Services Computing Workshops. IEEE, 2006
27. F. Callegati et al., Data security issues in maas-enabling platforms, in 2016 IEEE 2nd International Forum on Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI). IEEE, 2016
28. R. Yang et al., D^2PS: a dependable data provisioning service in multi-tenant cloud environment, in 2016 IEEE 17th International Symposium on High Assurance Systems Engineering (HASE). IEEE, 2016
29. R. Kumar, M. F. Hassan, M. A. M. H. M. Adnan, Intelligent negotiation agent architecture for SLA negotiation process in cloud computing, in Proceedings of International Conference on Machine Intelligence and Data Science Applications (Springer, Singapore, 2021)
30. N. Chandrakala, B. Thirumala Rao, Migration of virtual machine to improve the security in cloud computing. Int. J. Electric. Comput. Eng. 8.1, 2088–8708 (2018)
Predicting Entrepreneurship Skills of Tertiary-Level Students Using Machine Learning Algorithms

Abdullah Al Amin, Shakawat Hossen, Md. Mehedi Hasan Refat, Proma Ghosh, and Ahmed Al Marouf
Abstract Entrepreneur and entrepreneurship are two words that are profoundly involved in every sector of the world. In the modern economy, entrepreneurs are the main driving force of global growth. Several kinds of entrepreneurial ventures are emerging with the help of technological support. With the help of information technology and access to new-generation customers, these ventures are outperforming old-fashioned companies. Startup companies nowadays are founded by young students and professionals. In this paper, we focus on tertiary-level students who are currently studying and thinking about starting a startup company. We collected self-report inventories to determine personality traits and entrepreneurship skill level, using the Big Five personality model and an entrepreneurship self-assessment survey. With machine learning models, we have tried to correlate personality traits and entrepreneurship skills and thereby find out the actual skills of entrepreneurs. Around one hundred participants took part in this research. For the machine learning classification, we applied tree-based algorithms, namely decision tree (J48), random forest (RF), decision stump, Hoeffding tree, random tree, logistic model tree (LMT) and REP tree. The classifiers are evaluated on three classes: outstanding ability to be an entrepreneur, satisfactory ability to be an entrepreneur and inappropriate for entrepreneurship. Among the classifiers applied, the Hoeffding tree performed best in terms of accuracy (88%).

Keywords Big Five personality traits · Entrepreneurial skill · Machine learning techniques

A. Al Amin · S. Hossen · Md. Mehedi Hasan Refat · P. Ghosh · A. A. Marouf (B)
Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_52
1 Introduction

Business is one of the most diversified fields, and entrepreneurship can be described as the art of initiating a business, an activity full of creativity. For the betterment of the economy, an entrepreneur generally invents or creates something new for the firm or company. People have always been drawn to inventing new things, and this constant drive for innovation has made the present world competitive and has moved it from the primitive age to the modern age. In this competitive battle, emerging as a successful entrepreneur is more challenging than merely surviving. This is why an entrepreneur should have personality traits, along with some qualitative self-assessment features, that can influence or predict his or her organizational behavior. To be a successful entrepreneur, it is very important to have managerial skills, strong team-building abilities and strong leadership attributes. Researchers have long been working on finding predictors for understanding human personality. Without appropriate justification and repeated experimentation, it is very difficult to define human personality. According to the Big Five personality model [1], openness to experience (O), conscientiousness (C), extraversion (E), agreeableness (A) and neuroticism (N) are the five traits that identify the overall personality of an individual. The model is also widely known as the OCEAN model. Each trait has a defined set of attributes that must be determined to establish the overall personality of an individual, and the item set of the OCEAN model contains a separate IPIP questionnaire for those attributes. The self-assessment measurement, in turn, is based on some common characteristics of an entrepreneur.
The different entrepreneurial ability levels, namely outstanding, satisfactory, inappropriate and avoid entrepreneurship, are calculated using the self-assessment survey. The items are assumed to be answered honestly so that the entrepreneurship skill level can be predicted. It is very difficult to determine the correlation and mapping between an undergraduate student’s personality traits and entrepreneurial skills. It should be noted that tertiary-level students in Bangladesh usually look for jobs after graduation, but the available jobs are inadequate compared with the number of graduates. As a result, many educated young people remain unemployed or do not find employment that matches their qualifications. Entrepreneurship is therefore important for our country: it can improve standards of living and reduce poverty, because entrepreneurs create jobs when they create new products and services.

In this paper, the traditional supervised learning method has been adopted, and classification algorithms, specifically tree-based classifiers, are utilized. The decision tree, decision stump, Hoeffding tree, logistic model tree (LMT), random forest, random tree and REP tree have been applied. We have used these machine learning algorithms to map undergraduate students’ general personality traits to entrepreneurship skills. The rest of the paper is organized as follows. Related works are presented in Sect. 2. The machine learning algorithms are briefly described in Sect. 3, the data collection procedure is given in Sect. 4, and the experimental setup for applying the models is presented in Sect. 5. The results of the classification models are reported in Sect. 6, and some of the pitfalls and possible scope for improvement are discussed in Sect. 7.
2 Related Works

Entrepreneurship is one of the most important research topics for all countries, because entrepreneurs can improve economic growth and reduce unemployment. Therefore, much research has been done to determine the effect of entrepreneurial personality traits. In [2], data from the China Family Panel Studies were used to associate the entrepreneurial probability of self-employed individuals with their personality traits. The agricultural and non-agricultural sectors were explored to analyze the context. The results show that personality traits significantly affect entrepreneurial probability, whereas no significant effects on non-agricultural employment were found. Personality traits such as conscientiousness (C) and agreeableness (A) are positively linked to the entrepreneurial probability of farmers in rural areas. In [3], the authors conducted detailed research on risk propensity and on distinguishing between high- and low-growth entrepreneurs from the perspective of managers and entrepreneurs. That paper presents a list of personality traits that are conducive to entrepreneurial or business success. In [4], the authors verified the relationship between locus of control, need for achievement and the entrepreneurial intention of youth, and examined how entrepreneurial education consolidates entrepreneurial skills. In [5], the authors combined linguistic features and social network features of each user profile to understand and predict the Big Five personality traits. Traditional machine learning techniques were applied to the resulting feature selection-based prediction task. Researchers have also worked on single personality traits, such as predicting negative traits (neuroticism) using stylometric and linguistic feature analysis [6].
Some interesting outcomes were reported in [7] on the relationships between the perceived stress scale and the personality of undergraduate students. In this paper, we apply machine learning classification algorithms to predict the personality traits and entrepreneurship skill level.
3 Machine Learning Techniques

In this paper, seven machine learning algorithms were used to train and test on our dataset, and each achieved a different accuracy. The algorithms applied are decision tree (J48), Hoeffding tree, logistic model tree (LMT), random tree, REP tree, decision stump and random forest. Machine learning and different data mining algorithms have been utilized in [8–11] for solving similar problems.

Decision tree [12] is the first algorithm we applied. A decision tree is a predictive tool used for decision-making in data mining classification. Decision tree classifiers have been used successfully in many areas such as character recognition, radar signal classification, treatment diagnosis, remote sensing, speech recognition and expert systems. It is one of the most popular machine learning algorithms owing to its intelligibility and simplicity. Decision trees have proven effective for classifying numerical and categorical data when the predicted outcome is a class (Table 1).

Decision stump is a machine learning classifier that makes its decision by generating a one-level decision tree [12]. It is a decision tree-based classifier in which the root (an internal node) is connected directly to the leaves (terminal nodes). It makes a prediction based on the value of a single input feature. Decision stumps are sometimes also known as 1-rules [13] (Table 2).

Table 1 Confusion matrix of the decision tree (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 48 | 10 | 1 |
| Outstanding | 6 | 30 | 0 |
| Inappropriate | 5 | 0 | 0 |

Table 2 Confusion matrix of the decision stump (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 56 | 3 | 0 |
| Outstanding | 11 | 25 | 0 |
| Inappropriate | 5 | 0 | 0 |
Table 3 Confusion matrix of the Hoeffding tree (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 57 | 2 | 0 |
| Outstanding | 6 | 30 | 0 |
| Inappropriate | 4 | 0 | 1 |
The Hoeffding tree [14] is a well-known algorithm for stream data classification. It is an incremental, decision tree-based algorithm that learns by decision tree induction and is capable of learning from a large data stream (Table 3).

Logistic model tree [15] combines logistic regression models with tree induction: it has the structure of a standard decision tree, with logistic regression functions at the leaves. It is a popular technique for classification problems and works well with nominal and numerical features (Table 4).

Random forest [16] can be used for both classification and regression, although it is mainly used for classification problems. It is a decision tree-based algorithm that builds a large number of trees on the dataset during training and combines them. The final output is the mode of the classes predicted by the individual trees (classification) or the mean of their predictions (regression) (Table 5).

Random tree [17] is a supervised classifier that handles both classification and regression problems. Random trees combine two machine learning ideas: single model trees are combined with random forest ideas (Table 6).

REP tree [18] is a fast decision tree learner that creates multiple decision/regression trees based on information gain or variance reduction and prunes them using reduced error pruning. It then picks the best tree from all generated trees (Table 7).

Table 4 Confusion matrix of the logistic model tree (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 56 | 3 | 0 |
| Outstanding | 7 | 29 | 0 |
| Inappropriate | 5 | 0 | 0 |

Table 5 Confusion matrix of the random forest (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 57 | 2 | 0 |
| Outstanding | 9 | 27 | 0 |
| Inappropriate | 5 | 0 | 0 |

Table 6 Confusion matrix of the random tree (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 47 | 9 | 3 |
| Outstanding | 9 | 27 | 0 |
| Inappropriate | 3 | 0 | 2 |

Table 7 Confusion matrix of the REP tree (rows: observed class; columns: predicted class)

| Observed | Satisfactory | Outstanding | Inappropriate |
|---|---|---|---|
| Satisfactory | 56 | 3 | 0 |
| Outstanding | 12 | 24 | 0 |
| Inappropriate | 5 | 0 | 0 |
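To make the one-level tree behind the decision stump concrete, the following is a minimal sketch in Python. It is not the Weka implementation used in the paper: the split criterion here is a plain count of training errors (an assumption made for brevity), and the feature values and the helper names `fit_stump` and `predict_stump` are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Pick the single (feature, threshold) split with the fewest training errors.

    Returns (feature_index, threshold, left_label, right_label), where the left
    leaf covers rows with row[feature_index] <= threshold.
    """
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send instances to both leaves
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        raise ValueError("no valid split found")
    return best[1:]


def predict_stump(stump, row):
    """Classify one row using only the single feature chosen by the stump."""
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


# Hypothetical trait scores (e.g. conscientiousness, extraversion) for four students.
rows = [(20, 35), (25, 30), (38, 33), (42, 31)]
labels = ["satisfactory", "satisfactory", "outstanding", "outstanding"]

stump = fit_stump(rows, labels)
prediction = predict_stump(stump, (40, 30))
print(stump, prediction)
```

On this toy data the stump splits on the first feature only, which illustrates why a stump can never use more than one trait when predicting a class.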
4 Data Collection

4.1 Participants

For this paper, 100 participants were selected as a control group, and we collected data from them. Two data collection procedures were used. Some data were collected in person by visiting classrooms and asking students to fill in printed forms. The remaining data, which form the majority, were collected through a Google Form. The participants cooperated in filling in both the Google Form and the printed copies, and they answered the questions willingly and honestly. Among the participants, 77 were male and 23 were female, and all were between 18 and 25 years of age. Every participant is a Bangladeshi undergraduate-level student.
4.2 Questionnaires

Two questionnaires were used for the experimentation. The first collects personality trait data, and the second collects the entrepreneurial self-assessment. We obtained the numerical score of each personality trait using the 50-item IPIP question set of the Big Five model. For the entrepreneurship skill scoring, we used an entrepreneurial self-assessment survey, which is based on personal information. Each personality trait and entrepreneurial self-assessment question was answered on a scale from 1 to 5, where 5 = strongly agree, 4 = agree, 3 = neutral, 2 = disagree and 1 = strongly disagree.

Table 8 Dataset properties

| Properties | Values |
|---|---|
| Total instances | 100 |
| No. of instances in outstanding class | 36 |
| No. of instances in satisfactory class | 59 |
| No. of instances in inappropriate class | 5 |
| No. of male instances | 77 |
| No. of female instances | 23 |
4.3 Data Credentials

We conducted the survey on 100 participants. All the participants are Bangladesh-born undergraduate-level students from the CSE, BBA, EEE and law departments, with most of them from the CSE and BBA departments. The participants duly signed a consent letter agreeing that the data may be shared publicly without disclosing personal information. Throughout the experimentation, the privacy and security of the data were strictly maintained. We converted the collected survey data into a formal machine learning dataset containing the five personality trait values and the entrepreneurial class, ready for use with a machine learning algorithm (Table 8).
4.4 Data Cleaning and Data Preprocessing

In the dataset, the personality traits and entrepreneurship skill levels are considered as attributes. The scores of these traits are determined using the formulas set by the Big Five personality model, and no data cleaning or preprocessing, such as normalization or making the data uniform, is performed on the data.
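The paper does not list the scoring formulas, so the following is only a hedged sketch of how a trait score is commonly derived from 1 to 5 Likert answers: positively keyed items count as answered, and negatively keyed items are reversed as 6 minus the answer. The item numbers, the reverse-keyed set and the function name `score_trait` are hypothetical and are not taken from the authors' questionnaire.

```python
def score_trait(responses, reverse_keyed):
    """Sum Likert answers (1-5) for one trait, reversing negatively keyed items.

    responses: dict mapping item number -> answer in 1..5
    reverse_keyed: set of item numbers scored as 6 - answer (assumed keying)
    """
    total = 0
    for item, answer in responses.items():
        if not 1 <= answer <= 5:
            raise ValueError(f"item {item}: answer {answer} is outside 1..5")
        total += (6 - answer) if item in reverse_keyed else answer
    return total


# Hypothetical answers to three items of one trait; item 6 is assumed reverse-keyed.
answers = {1: 5, 6: 2, 11: 4}
trait_score = score_trait(answers, reverse_keyed={6})
print(trait_score)
```

Here "disagree" (2) on the reverse-keyed item contributes 4 points, so agreement with a negatively worded statement lowers the trait score instead of raising it.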
5 Experimental Method

In this section, we describe how the tree-based machine learning algorithms were applied. The algorithms themselves are briefly described, together with their confusion matrices, in Sect. 3. We used tenfold cross-validation for each of the classifiers. The experimental model is shown in Fig. 1.
[Figure: block diagram with the blocks "Dataset", "Data Cleaning and Pre-processing", "Applying Tree-based classifiers" and "Mapping personality traits and entrepreneurship skills"]

Fig. 1 Experimental method for mapping personality traits to entrepreneurship skill
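As an illustration of the tenfold cross-validation mentioned in this section, the sketch below splits 100 labeled instances into ten folds while spreading each class across the folds. It is a simplified round-robin scheme written for this example, not the randomized stratification performed by Weka; the class counts follow the dataset properties reported in the paper, and the function name `stratified_folds` is hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign instance indices to k folds, distributing every class round-robin."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        for j, idx in enumerate(by_class[label]):
            folds[(offset + j) % k].append(idx)
        offset += len(by_class[label])  # continue where the previous class stopped
    return folds


# Class sizes as reported for the dataset: 59 satisfactory, 36 outstanding, 5 inappropriate.
labels = ["satisfactory"] * 59 + ["outstanding"] * 36 + ["inappropriate"] * 5
folds = stratified_folds(labels, k=10)

for i, test_idx in enumerate(folds):
    # Each fold serves once as the test set; the other nine folds form the training set.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    n_rare = sum(labels[j] == "inappropriate" for j in test_idx)
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)} inappropriate in test={n_rare}")

folds_with_rare = sum(
    any(labels[j] == "inappropriate" for j in fold) for fold in folds
)
```

With only five instances in the inappropriate class, at most five of the ten test folds can contain one, which is worth keeping in mind when reading the per-class results.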
6 Experimental Result and Analysis

In this paper, the classifiers are applied using the ARFF file format in the Waikato Environment for Knowledge Analysis (Weka) [19], developed by the University of Waikato, and we used the tool to apply the ML classifiers directly to the collected data. F-measure, precision, recall and accuracy are reported as performance metrics. They are calculated using Eqs. (1), (2), (3) and (4).

\[ \text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{1} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{3} \]

\[ \text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{4} \]

Here, TP, FP, TN and FN are true positive, false positive, true negative and false negative, respectively (Table 9).

Table 9 Performance metrics of different tree-based classifiers

| Classifier name | Precision | Recall | F-measure | Accuracy (%) |
|---|---|---|---|---|
| Decision tree (J48) | 0.750 | 0.780 | 0.764 | 78 |
| Decision stump | 0.8355 | 0.810 | 0.818 | 81 |
| Hoeffding tree | 0.889 | 0.880 | 0.868 | **88** |
| LMT | 0.865 | 0.850 | 0.8675 | 85 |
| Random forest | 0.867 | 0.840 | 0.854 | 84 |
| Random tree | 0.760 | 0.760 | 0.760 | 76 |
| REP tree | 0.828 | 0.800 | 0.805 | 80 |

Highest accuracy is kept in bold
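The metrics in Eqs. (1) to (4) are defined per class in terms of TP, FP and FN, so a three-class problem needs an aggregation step. The sketch below computes each metric per class from a confusion matrix and averages them weighted by the number of observed instances in each class. The weighting scheme is an assumption on our part (the paper does not state it), and the function name `weighted_metrics` is hypothetical. The input is the Hoeffding tree confusion matrix from Table 3.

```python
def weighted_metrics(matrix):
    """Accuracy plus support-weighted precision, recall and F-measure.

    matrix[i][j] = number of instances of observed class i predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(n_classes))

    precision = recall = f_measure = 0.0
    for c in range(n_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                               # observed c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n_classes)) - tp  # predicted c, observed otherwise
        p = tp / (tp + fp) if tp + fp else 0.0  # a class that is never predicted scores 0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = sum(matrix[c]) / total         # share of observed instances in class c
        precision += weight * p
        recall += weight * r
        f_measure += weight * f

    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Rows and columns ordered as satisfactory, outstanding, inappropriate (Table 3).
hoeffding = [[57, 2, 0], [6, 30, 0], [4, 0, 1]]
metrics = weighted_metrics(hoeffding)
print({name: round(value, 3) for name, value in metrics.items()})
```

Under this weighting the sketch yields an accuracy of 0.88, precision of about 0.889, recall of 0.88 and F-measure of about 0.868 for the Hoeffding tree, which agrees with the corresponding row of Table 9.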
7 Conclusion and Future Work

The purpose of our paper is to find potential entrepreneurs through the application of machine learning algorithms. Seven machine learning classifiers were applied. Among them, the Hoeffding tree achieved the highest accuracy, 88%. This research will benefit students because it lets them learn about their own skills. It can also help our country develop by creating new entrepreneurs. Using this concept, students at the secondary and higher secondary levels who qualify through the mapping of personality traits can be encouraged to practice and study to become entrepreneurs.
References

1. S. Rothmann, E.P. Coetzer, The big five personality dimensions and job performance. SA J. Ind. Psychol. 29. http://doi.org/10.4102/sajip.v29i1.88. Retrieved 27 June 2013
2. J. Yang, D. Ai, Effect of the big five personality traits on entrepreneurial probability: influence of China’s household registration system. J. Labor Res. 40, 487–503 (2019). https://doi.org/10.1007/s12122-019-09294-z
3. J.B. Miner, N.S. Raju, Risk propensity differences between managers and entrepreneurs and between low- and high-growth entrepreneurs: a reply in a more conservative vein. J. Appl. Psychol. 89, 3–13 (2004)
4. A.I. Voda, N. Florea, Impact of personality traits and entrepreneurship education on entrepreneurial intentions of business and engineering students. Published: 23 Feb 2019
5. A.A. Marouf, M.K. Hasan, H. Mahmud, Comparative analysis of feature selection algorithms for computational personality prediction from social media. IEEE Trans. Comput. Soc. Syst. 7(3), 587–599 (2020). https://doi.org/10.1109/TCSS.2020.2966910
6. A.A. Marouf, M.K. Hasan, H. Mahmud, Identifying neuroticism from user generated content of social media based on psycholinguistic cues, in 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE) (2019), pp. 1–5. http://doi.org/10.1109/ECACE.2019.8679505
7. A. Marouf, A.F. Ashrafi, T. Ahmed, T. Emon, A machine learning based approach for mapping personality traits and perceived stress scale of undergraduate students. Int. J. Mod. Educ. Comput. Sci. 11, 42–47 (2019)
8. S. Smys, A. Basar, H. Wang, Artificial neural network based power management for smart street lighting systems. J. Artif. Intell. 2(01), 42–52 (2020)
9. J.I.Z. Chen, S. Smys, Social multimedia security and suspicious activity detection in SDN using hybrid deep learning technique. J. Inf. Technol. 2(02), 108–115 (2020)
10. V. Suma, H. Wang, Optimal key handover management for enhancing security in mobile network. J. Trends Comput. Sci. Smart Technol. (TCSST) 2(04), 181–187 (2020)
11. H. Wang, S. Smys, Big data analysis and perturbation using data mining algorithm. J. Soft Comput. Paradigm (JSCP) 3(01), 19–28 (2021)
12. S.R. Safavian, D. Landgrebe, A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 660–674 (1991)
13. W. Iba, P. Langley, Induction of one-level decision trees, in ML92: Proceedings of the Ninth International Conference on Machine Learning, Aberdeen, Scotland, 1–3 July 1992 (Morgan Kaufmann, San Francisco, CA, 1992), pp. 233–240
14. P.K. Srimani, M.M. Patil, Performance analysis of Hoeffding trees in data streams by using massive online analysis framework. Int. J. Data Min. Model. Manag. 7(4) (2015)
15. N. Landwehr, M. Hall, E. Frank, Logistic model trees. Mach. Learn. 59, 161–205 (2005)
16. L. Breiman, Random forests. Mach. Learn. 45, 5–32 (2001)
17. D. Aldous, The continuum random tree. I. Ann. Probab. 1–28 (1991)
18. B. Srinivasan, P. Mekala, Mining social networking data for classification using Reptree. Int. J. Adv. Res. Comput. Sci. Manag. Stud. 2(10) (2014). ISSN: 2321-7782 (Online)
19. G. Holmes, A. Donkin, I.H. Witten, Weka: a machine learning workbench, in Proceedings of Second Australia and New Zealand Conference on Intelligent Information Systems, Brisbane, Australia
Smart Pet Insights System Based on IoT and ML

G. Harshika, Umme Haani, P. Bhuvaneshwari, and Krishna Rao Venkatesh
Abstract Organizations like People for the Ethical Treatment of Animals (PETA) and the International Society for Animal Rights (ISAR) work toward increasing awareness and providing shelter to abandoned and abused animals, yet fostering and adopting animals have proven to be difficult. This work aims to create an Internet of things (IoT)-based device that helps pet owners remotely monitor their pet’s location, activity, and health. The system is connected to the local network with an M5 Core 2 development kit that contains an ESP32 microcontroller, which keeps the device compact. The speed, heart rate, and location are sent by their respective sensors to the microcontroller, and the activity is predicted by a custom machine learning (ML) algorithm. This information is then displayed on a user-friendly website along with breed-specific information. Such health indicators help users keep track of their pet’s health and understand every animal’s innate behaviors, increasing the likelihood of adoption.

Keywords IoT-based device · Remote monitor pet’s location · ML · Heart rate · Speed
G. Harshika · U. Haani · K. R. Venkatesh
Department of CSE, Nitte Meenakshi Institute of Technology, Bangalore, India
P. Bhuvaneshwari (B)
MVJ College of Engineering, Bangalore, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_53

1 Introduction

With the rapid rate at which technology is being adopted into our daily lives, IoT devices are becoming extremely popular and commercially available. Their applications are wide and varied, from smart homes and smart cities to remote surveillance, tracking, and security systems [1]. They are revolutionizing the way we approach health, safety, and convenience in industries such as security systems, agriculture, healthcare, manufacturing, transportation, mining, construction, and even public utilities [2]. In India, the number of pet owners rose steeply from approximately 11 million in 2014 to 16 million in 2018 [3]. This has also spurred rapid growth of the pet market. Experts consider the pet industry one of the most stable industries because, even during an economic crisis such as the recent COVID-19 pandemic, there is always demand for pets and their companionship. However, many pet owners struggle to stay actively involved in caring for their pets owing to their work-life imbalance. Thus, there is a high demand for pet care products that can help alleviate the stress of pet care [3].

In this work, we combine the power of data analytics using machine learning with IoT to gain a better understanding of the behavior of pets. An enormous amount of data is available everywhere these days. It is therefore important to analyze this data to extract useful information and to develop an algorithm based on that analysis. This can be accomplished through data mining and machine learning. Machine learning, an integral part of artificial intelligence, is used to analyze data in order to recognize patterns and relationships within it. It is used in many fields, for example, bioinformatics, information retrieval, game playing, malware detection, image deconvolution, marketing, and business analysis [4]. The main aim of this work is to create a remote pet monitoring system that helps a user track their pet’s health, location, and real-time activity. The information collected on the pet’s behavior can also help identify its characteristics, which can be used to improve fostering in future.

This paper is structured as follows. Section 2 describes the background research, motivation, and disadvantages of the previous work that led to the conception of this work. Section 3 describes the solution and the components that have been used. Section 4 explains the implementation of the work step by step. Section 5 outlines the test cases and their results. Finally, Sect. 6 concludes the paper and points toward possible future applications of the work.
2 Related Work and Motivation

2.1 Background

The pet care industry is booming, opening doors to a wide range of tech-based solutions for simplifying pet care. Most of these systems, however, are geared toward feeding and litter cleaning mechanisms. Many pets face health issues similar to those experienced by humans and need constant care and attention [5]. This can prove to be a challenging task, and many of these pets end up being given up for adoption to an already saturated fostering system. IoT-based systems already provide such solutions for humans, in particular inpatient monitoring systems [6, 7] and real-time tracking with GPS and GSM, but the same has not been implemented for animals beyond tracking and motion sensing devices [8]. There are some implementations for specific health characteristics, but these are limited to farm animals and are used not for the welfare of the animal but to increase profitability for farmers [9, 10]. Such devices have also been used at population scale in wildlife reserves and sanctuaries [11]. Earlier systems for automatic food dispensing and litter box management for pets have proven to be great successes, with large companies now manufacturing them at scale [12]. This clearly shows that there is a huge untapped market for pet-based technology and that there are several opportunities in the sector.
2.2 Advantages

1. Pre-existing technology: All the technology that is required to implement an effective and real-time pet monitoring system already exists in the market [13–15].
2. Market demand: There is a clear demand for pet products even during times of economic distress.
3. Successful partial implementations: There have been some partial implementations like real-time location tracking for pets and food dispensaries [16].
4. Animal welfare: Pet technology has proven to vastly improve the living conditions of pets in the past [17].
2.3 Disadvantages

1. Low-level systems: Most of the current implementations are extremely simple integrations that provide little to no insight into the pet.
2. Industry-motivated innovation: The devices that are currently in use are geared toward increasing profitability from the animals and will do this even at the risk of their overall well-being [16].
3 Proposed Solution

Description: The pet monitoring system aims to monitor your pet’s location, heart rate, and daily activities using ML and IoT. This data can also be consolidated for use in future medical emergencies, enabling better health care for your pets.

Product Perspective: The fundamental idea of the product is to discover what your pet is doing each day. The models have been trained to determine whether your pet is walking, running, or standing, which helps you understand the amount of physical activity it has had during the day. The goal is to gain insight into, and a better understanding of, our pets’ lives.
3.1 Hardware Requirements

• M5Stack Core2 ESP32 development kit board [18]
• GY-61 accelerometer
• Neo 6M GPS module
• Pulse sensor
• Memory card
• Type C to Type A cable
• Dog collar or body strap.
3.2 Software Requirements

• M5 Core 2 library
• TensorFlow
• Keras
• Flask
• Pandas
• NumPy
• Scikit-learn.
3.3 Functional Requirements

• It must display all information, i.e., heart rate, location, and speed of the pet on a single dashboard with cumulative charts for analysis.
• It must display breed-specific information on the dashboard.
• ESP32 can interface with other systems to provide Wi-Fi and Bluetooth functionality through its SPI/SDIO or I2C/UART interfaces.
• It must connect with the Web to get live updates, else must have an attached memory card to store all the data.
3.4 Nonfunctional Requirements

• Maintainability: The system is maintainable; however, it requires maintenance from the user and regular software updates to ensure that the hardware drivers do not malfunction.
• Reliability: The collar must be reliable, without faults or bugs. Every parameter of the device mentioned above should be monitored properly.
• Scalability: As the device is modeled on IoT, the product can be scaled according to requirements and user base, and its functionality can also be extended.
• Performance and efficiency: The device shows effective results when used on dogs and can be easily expanded to include other pets.
4 Implementation Our system aims at providing an insight into a pet’s life. We apply the IoT technology and ML for data analytics to implement an integrated system that helps owners constantly monitor, track, and analyze their pets. The system is connected to the local network with M5 Core 2 which contains the ESP32 microcontroller. This is compact and runs on a battery making it easy to deploy for real-time applications. The process of building the pet insight system is described below. Figures 1 and 2 depict the prototype and its live implementation. The architecture of how the prototype has been implemented is demonstrated in Fig. 3.
Fig. 1 Prototype of the pet insight system
Fig. 2 Implementation of the prototype
Fig. 3 Architecture of prototype
4.1 System Design
The following steps convert the raw data obtained from the device into user-readable information on the Website:
1. Setting up the band: All the sensors are first connected to the microcontroller, and each sensor and the board are set up. The band is strapped to the dog collar, making sure the sensors are placed properly. The board is then turned on and initializes the LCD screen, touch buttons, baud rate, and GPS module.
2. Data collection: Raw accelerometer data are collected using the analog pins on the board; these raw data are then mapped and converted into acceleration values. Next, pulse data are collected from the analog pins using a pulse sensor. The pulse sensor data are stored, and each time a beat occurs, the beat count is incremented. Beats are counted for 5 s, and the count is then used to calculate the heart rate in beats per minute. Next, the UART pins are checked for a serial signal from the GPS module; if a signal is found, the data are decoded using the TinyGPS++ library, and the latitude, longitude, speed, date, and time values are extracted.
3. Data storage: The board checks whether it is connected to Wi-Fi. If Wi-Fi is not connected, the collected data are logged to the SD card. If Wi-Fi is connected and data are found on the SD card, the current data along with the data on the SD card are sent to the ThingSpeak API [19]. If Wi-Fi is connected and no data are found on the SD card, only the current data are sent to the ThingSpeak API. Then, the
board is set in an idle state for 20 ms. All the tasks then repeat from the beginning in a loop until the board is turned off.
4. Data analysis: The accelerometer data are used to predict the activity of the dog with a deep learning model. The model was trained on a custom data set created by recording accelerometer sensor data from a Golden Retriever. The trained model is stored and used by the Flask server to predict the dog’s activity when required. All these data are then sent to the Website by the Flask server.
5. Custom ML model (classification): Data from the CSV file are loaded into a pandas DataFrame. The x, y, and z values obtained from the accelerometer are normalized by dividing each value by the maximum possible value so that the values lie within 0–1. The activities column is encoded using a one-hot encoder so that the string data are converted into a matrix. The data are then split into training and testing data sets with a ratio of 80:20. The data are converted into time-series data by considering four time steps. The classifier model is built with three hidden layers containing 100 neurons each that use the ReLU activation function [20], and the output layer consists of three neurons that use the softmax activation function. The model is trained on the training data set with a batch size of 75 for 50 epochs. Predictions are made for the test data using the trained model and are then used to create a confusion matrix to evaluate the model. The layers of the ML model are shown in Fig. 4 and work as follows:
(a) reshape_input: InputLayer. It passes data to the model and accepts 1D input data of size 12.
(b) reshape: Reshape. It converts the input from 1D data of size 12 to 2D data of size (4, 3).
(c) dense, dense_1, dense_2, dense_3: Dense. The dense layer is the regular densely connected neural network layer and the most frequently used one. It performs the following operation on its input and returns the output: output = activation(dot(input, kernel) + bias)
(d) flatten: Flatten. It converts the output of the previous layer from multidimensional to 1D; here, the input to the layer is converted from (4, 100) to 400.
6. Server and front end: Whenever a user checks the data for their dog, a request is sent to the Flask server. The server then fetches the data from the ThingSpeak API. All the data associated with the pulse sensor, accelerometer, and GPS module are fetched and stored in a pandas DataFrame, which is then used to display the data on the Website. Simple database management systems and languages such as HTML, CSS, and JS, along with SQL and PHP, are used to develop the front-end dashboard.
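The arithmetic in steps 2 and 5 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the full-scale value of 4095, the sliding-window step, and the activity labels are hypothetical. Only the 5 s counting window, the division by the maximum possible value, the four time steps, and the one-hot encoding come from the description.

```python
from typing import List, Sequence

WINDOW_SECONDS = 5   # beats are counted for 5 s (from the text)
TIME_STEPS = 4       # four time steps per model input (from the text)
ADC_MAX = 4095       # assumed full-scale reading; the text only says "maximum possible value"
ACTIVITIES = ["sitting", "standing", "other"]  # hypothetical labels for the three output classes


def beats_per_minute(beat_count: int, window_s: float = WINDOW_SECONDS) -> float:
    """Scale a beat count taken over window_s seconds to beats per minute."""
    return beat_count * 60.0 / window_s


def normalize(sample: Sequence[float], max_value: float = ADC_MAX) -> List[float]:
    """Divide each axis value by the maximum possible value so it lies in 0..1."""
    return [v / max_value for v in sample]


def make_windows(samples: Sequence[Sequence[float]],
                 time_steps: int = TIME_STEPS, step: int = 1) -> List[List[float]]:
    """Flatten each run of time_steps consecutive (x, y, z) samples into one
    vector of length time_steps * 3 (12 here), the stated model input size.
    Using a sliding window with the given step is an assumption."""
    windows = []
    for start in range(0, len(samples) - time_steps + 1, step):
        flat: List[float] = []
        for sample in samples[start:start + time_steps]:
            flat.extend(sample)
        windows.append(flat)
    return windows


def one_hot(label: str, classes: Sequence[str] = ACTIVITIES) -> List[int]:
    """Encode an activity label as a 0/1 vector with a single 1."""
    return [1 if label == c else 0 for c in classes]
```

For instance, 8 beats counted in the 5 s window scale to 96 beats per minute, and five consecutive accelerometer samples yield two overlapping 12-value windows.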
Fig. 4 Layers of custom ML model
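As a rough check on the layer stack described for Fig. 4, the output shape and parameter count of each layer can be traced by hand. The sketch below assumes the order reshape_input, reshape, dense, dense_1, dense_2, flatten, dense_3, which is inferred from the stated flatten input of (4, 100) rather than confirmed by the text. It relies on the fact that a dense layer applied to a 2D input acts on the last axis only. The helper names are hypothetical.

```python
def dense_params(in_features: int, units: int) -> int:
    """Kernel of shape (in_features, units) plus one bias per unit."""
    return in_features * units + units


def trace_model():
    """Return (layer name, output shape, parameter count) for each layer."""
    layers = []
    shape = (12,)                                   # reshape_input: 1D input of size 12
    layers.append(("reshape_input", shape, 0))
    shape = (4, 3)                                  # reshape: 12 -> (4, 3)
    layers.append(("reshape", shape, 0))
    features = shape[-1]
    for name in ("dense", "dense_1", "dense_2"):    # three hidden layers, 100 units each
        params = dense_params(features, 100)
        shape = (shape[0], 100)                     # time-step axis is left unchanged
        features = 100
        layers.append((name, shape, params))
    flat = shape[0] * shape[1]                      # flatten: (4, 100) -> 400
    layers.append(("flatten", (flat,), 0))
    layers.append(("dense_3", (3,), dense_params(flat, 3)))  # three-class softmax output
    return layers


if __name__ == "__main__":
    for name, shape, params in trace_model():
        print(f"{name:14s} {str(shape):10s} {params}")
    print("total parameters:", sum(p for _, _, p in trace_model()))
```

Under these assumptions the three hidden layers hold 400, 10,100, and 10,100 parameters, and the output layer holds 1,203, for 21,803 in total.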
4.2 UML Diagrams
The data flow in the prototype is depicted in Fig. 5. The data are collected from the pulse sensor, GPS module, and accelerometer, decoded into a readable format, and sent to the ESP32 microcontroller. When there is an active Internet connection, the data are readily accessible through the ThingSpeak API; if not, they are stored on the SD card as a backup. The Flask server receives requests from the client and accesses the ThingSpeak API to display the information on the Website. The ML model is also integrated into the server; it uses the acceleration data to predict the activity (sitting or standing) of the pet and displays the result to the user.
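The connected/offline branch of this data flow amounts to a store-and-forward rule. The following minimal sketch models it in Python with a list standing in for the SD card log and a callable standing in for the ThingSpeak upload. The function and parameter names are hypothetical, and clearing the backlog after a successful upload is an assumption that the text does not state.

```python
from typing import Callable, Dict, List

Reading = Dict[str, float]


def handle_reading(reading: Reading,
                   wifi_connected: bool,
                   sd_backlog: List[Reading],
                   send: Callable[[Reading], None]) -> List[Reading]:
    """Apply the storage rule to one reading and return what was uploaded.

    Offline: the reading is appended to the backlog (the SD card log).
    Online: any backlog is uploaded first, followed by the current reading.
    """
    if not wifi_connected:
        sd_backlog.append(reading)
        return []
    batch = sd_backlog + [reading]   # stored data first, then the current data
    for item in batch:
        send(item)
    sd_backlog.clear()               # assumption: uploaded entries are removed from the log
    return batch


if __name__ == "__main__":
    backlog: List[Reading] = []
    uploaded: List[Reading] = []
    handle_reading({"bpm": 90.0}, False, backlog, uploaded.append)
    handle_reading({"bpm": 96.0}, True, backlog, uploaded.append)
    print(uploaded, backlog)
```

Passing the upload function as a parameter keeps the rule testable without a network connection or an SD card.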
Fig. 5 Data flow diagram
5 Test Cases and Result
Table 1 outlines the test cases that were carried out to ensure the prototype works as expected. Our system was easily deployed on pets, and the ML model had a validation accuracy of 75% and a training accuracy of 90%. Figure 6 displays the main dashboard of the user interface with all the important data and links to other information. Figure 7 displays the real-time activity for heart rate and speed on ThingSpeak, which users can navigate to from the main dashboard. Figure 8 displays the real-time GPS coordinates along with a Google Maps pin for better user interaction. Figure 9 displays the important breed-specific information, which can be customized by the user as required.
Table 1 Test cases
Intents | Action | Result
Pet is moving | The device’s accelerometer, GPS, and pulse sensor are activated and collect data and transfer the data to the server | The accelerometer tracks the speed
Pet is in a remote location | The GPS module collects the geographical location information and updates it to the server | The coordinates are displayed on a map for user convenience
Pet is sleeping | The device’s pulse sensor detects the speed of the pet and if it is lower than usual, it means it is sleeping/resting | Lower heart rate than usual
Pet is running | The device’s pulse sensor and accelerometer detect the speed and heart rate of the pet and if it is higher than usual, it means it is running | Higher heartbeat rate and accelerometer rate than usual
Pet is eating | The device does not show any significant changes or variance for any of the outputs that can be measured | No results obtained
GPS is turned on | The GPS module gathers the coordinates (latitude and longitude) and updates it to the server | Coordinates are displayed on a map
GPS is disconnected | The GPS module does not gather any information | Display the last stored value on the map
Fig. 6 Main dashboard for the system
Fig. 7 Real-time activity dashboard
Fig. 8 Real-time location of the pet on google maps
Fig. 9 Breed-specific details displayed on the main dashboards
6 Conclusion and Future Work
Studies and statistics show that the pet care industry is growing rapidly. With the growing population in metropolitan areas, investment in pets and their health care is increasing. However, this market is relatively untapped in the Indian economy, which leaves a large potential market for our work. Our work aims to create a viable pet monitoring system that is portable and easy to use. The system also provides a user-friendly interface that helps monitor the pet’s location, speed, and heart rate. We apply IoT technology and ML for data analytics to implement an integrated system that helps owners constantly monitor, track, and analyze their pets. Apart from this, the work also helps to create a better bond between the pet and the owner and helps the owner actively track their pet’s location remotely, so that a pet that gets lost can be tracked. This system improves upon similar implementations by integrating the most important aspects of pet care with an ML classifier, which reached a training accuracy of 90% and a validation accuracy of 75%. The device is also oriented toward pets rather than farm animals, making it a targeted product for this specific market. So far, almost no data have been collected from pets for this specific implementation. In the future, behavioral data collected from the device can also be used to assess an animal’s potential for roles such as therapy pet, guard dog, or seeing-eye dog. This has the potential to increase the rate of fostering. We can improve our IoT system by accounting for noise interference using technology such as the F-RAN architecture [21]. The security of the cloud systems can be implemented using fingerprint biometric traits [22]. A key feature of the system is to alert the owner in case of an emergency; here, response time plays an important role, and computational offloading can be used to reduce it [23]. By implementing these techniques, we can offer an overall efficient, safe, and reliable system. Once the work is successfully implemented on pets, it can be scaled to include different animals as well, leading to industry-level use cases.
Acknowledgments We are extremely grateful for the assistance provided by Thribhuvan Gupta S during the testing phase.
References
1. S. Kumar, P. Tiwari, M. Zymbler, Internet of things is a revolutionary approach for future technology enhancement: a review. J. Big Data 6, 1–21 (2019). https://doi.org/10.1186/s40537-019-0268-2
2. A.R. Sfar, Z. Chtourou, Y. Challal, A systemic and cognitive vision for IoT security: a case study of military live simulation and security challenges, in 2017 International Conference on Smart, Monitored and Controlled Cities, SM2C 2017 (Institute of Electrical and Electronics Engineers Inc., Piscataway, 2017), pp. 101–105
3. India: households with pet dogs and cats 2018 | Statista. https://www.statista.com/statistics/1061322/india-households-with-pet-dogs-and-cats/. Accessed 4 July 2021
4. L. Nobrega, A. Tavares, A. Cardoso, P. Goncalves, Animal monitoring based on IoT technologies, in 2018 IoT Vertical and Topical Summit on Agriculture - Tuscany, IOT Tuscany 2018 (Institute of Electrical and Electronics Engineers Inc., Piscataway, 2018), pp. 1–5
5. C.-M. Own, H.-Y. Shin, C.-Y. Teng, The study and application of the IoT in pet systems. Adv. Internet Things 03, 1–8 (2013). https://doi.org/10.4236/ait.2013.31001
6. K. Aziz, S. Tarapiah, S.H. Ismail, S. Atalla, Smart real-time healthcare monitoring and tracking system using GSM/GPS technologies, in 2016 3rd MEC International Conference on Big Data and Smart City, ICBDSC 2016 (Institute of Electrical and Electronics Engineers Inc., Piscataway, 2016), pp. 357–363
7. A. Hisham, Health care monitoring system using GPS & GSM technologies. J. Adv. Commun. Syst. 1, 1–11 (2018)
8. M. Prajakta, S. Manakapure, A.V. Shah, Movement monitoring of pet animal using internet of things. Int. Res. J. Eng. Technol. (2018)
9. J. Astill, R.A. Dara, E.D.G. Fraser et al., Smart poultry management: smart sensors, big data, and the internet of things. Comput. Electron. Agric. 170, 105291 (2020)
10. M.H. Memon, W. Kumar, A. Memon et al., Internet of Things (IoT) enabled smart animal farm, in Proceedings of the 10th INDIACom; 2016 3rd International Conference on Computing for Sustainable Global Development, INDIACom 2016 (2016), pp. 2067–2072
11. V.V. Joshi, S.S. Bhange, S.S. Chopade, Wildlife animal location detection and health monitoring system. Int. J. Eng. Res. Technol. 211 (2014)
12. Y. Chen, M. Elshakankiri, Implementation of an IoT-based pet care system, in 2020 5th International Conference on Fog and Mobile Edge Computing, FMEC 2020 (Institute of Electrical and Electronics Engineers Inc., Piscataway, 2020), pp. 256–262
13. P. Sikka, P. Corke, L. Overs, Wireless sensor devices for animal tracking and control, in Proceedings - Conference on Local Computer Networks, LCN (2004), pp. 446–454
14. N. Izzatul, N. Binti, M. Razif et al., Automatic cat feeder and location tracker. J. Comput. Technol. Creat. Content 5, 27–32 (2020)
15. A.A.A. Luayon, G.F.Z. Tolentino, V.K.B. Almazan et al., PetCare: a smart pet care IoT mobile application, in ACM International Conference Proceeding Series (Association for Computing Machinery, 2019), pp. 427–431
16. J.R. Priya, M. Nandhini, Evolving opportunities and trends in the pet industry: an analytical study on pet products and services. J. Appl. Sci. Comput. 5 (2018)
17. M. Caria, J. Schudrowitz, A. Jukan, N. Kemper, Smart farm computing systems for animal welfare monitoring, in 2017 40th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2017 - Proceedings (Institute of Electrical and Electronics Engineers Inc., Piscataway, 2017), pp. 152–157
18. M5Stack, https://m5stack.com/. Accessed 5 July 2021
19. IoT analytics: ThingSpeak internet of things. https://thingspeak.com/. Accessed 5 July 2021
20. K. Eckle, J. Schmidt-Hieber, A comparison of deep networks with ReLU activation function and linear spline-type methods. Neural Netw. 110, 232–242 (2019). https://doi.org/10.1016/j.neunet.2018.11.005
21. D.S. Shakya, IoT based F-RAN architecture using cloud and edge detection system. J. ISMAC 3(1), 31–39 (2021). http://doi.org/10.36548/jismac.2021.1.003
22. J.S. Manoharan, A novel user layer cloud security model based on chaotic Arnold transformation using fingerprint biometric traits. J. Innov. Image Process. 3(1), 36–51 (2021). http://doi.org/10.36548/jiip.2021.1.004
23. J.S. Raj, Improved response time and energy management for mobile cloud computing using computational offloading. J. ISMAC 2(1), 38–49 (2020). http://doi.org/10.36548/jismac.2020.1.004
Design and Implementation of a Monitoring System for COVID-19-Free Working Environment
Attar Tarannum, Pathan Safrulla, Lalith Kishore, and S. Kalaivani
Abstract The proposed work aims to design a monitoring system that ensures a safe, COVID-19-free working environment. The proposed system has two phases. The first phase monitors the vital parameters, relevant to COVID-19 screening, of each person entering the workplace. The second phase monitors indoor safety measures to maintain a COVID-19-free working space. It is an inexpensive solution that aims to increase COVID-19 indoor safety, covering contactless temperature sensing, heart rate monitoring, mask detection, social distancing checks, and monitoring of the air quality, temperature, and humidity of the room. Contactless temperature sensing and pulse checking are carried out using an IR sensor and a heart rate sensor interfaced with an Arduino Uno. Mask detection and the social distancing check are performed using OpenCV techniques on a Raspberry Pi equipped with a camera. The monitoring system helps ensure a safe, COVID-19-free working environment.
Keywords Mask detection · Social distancing · Temperature · Humidity · Air quality · Viola–Jones algorithm · YOLO V3 · OpenCV
A. Tarannum (✉) · P. Safrulla · L. Kishore · S. Kalaivani
Department of ECE, B. S. A. Crescent Institute of Science and Technology, Vandalur, Chennai 600048, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_54

1 Introduction
On March 11, 2020, the World Health Organization declared coronavirus disease 2019 (COVID-19) a pandemic, mainly because of the speed and scale of its transmission. The COVID-19 virus spreads through the air primarily when an infected person coughs or sneezes, and it can also spread when an infected person speaks. An individual can also become infected by touching the mouth, nose, or eyes, or a contaminated surface, without first washing their hands. Transmission occurs through direct contact with the respiratory droplets of a person who is already infected. The Indian government initially responded to the pandemic with thermal screening of passengers arriving from China or from other countries where COVID-19 cases were present. As the pandemic progressed all over the world, the government of India started issuing recommendations on social distancing measures and also initiated travel and entry restrictions. Since the end of 2019, the world has faced many challenges from COVID-19 and may need to battle the pandemic with precautionary measures until vaccines are developed. These precautionary measures include cleaning hands regularly and thoroughly with an alcohol-based sanitizer, or washing them with soap and water, which can eliminate germs, including viruses, from the hands, and maintaining social distance without touching others. Once the hands are contaminated, the virus can be transferred to the mouth, nose, or eyes, from where it enters the body and causes infection. The nose and mouth should be covered when sneezing or coughing with a mask or tissue, or at least with the bent elbow. Used surgical masks and tissues must be disposed of in a closed bin immediately, and the hands must then be washed or sanitized. Following these steps supports good respiratory hygiene and protects against viruses such as those causing cold, flu, and COVID-19. Frequently used surfaces such as door handles, mobile screens, and faucets should be cleaned and disinfected regularly. It is better to stay home and seek medical care if unwell with symptoms such as fever, cold, cough, and breathing difficulties [1, 2].
2 Existing System
The paper [3] introduced a combined fog–cloud IoT platform for COVID-19 prevention and control that implements COVID-19 symptom diagnosis, quarantine monitoring, contact tracing, and social distancing. It provides a solution that aims to increase COVID-19 indoor safety, covering several aspects such as mask detection and social distancing. Microcontroller simulation software within the Proteus Design Suite was adopted in the paper [4] in place of physical PIC-based development boards so that remote exercise sessions would be possible. A LoRa-based COVID-19 patient monitoring device was proposed in the paper [5]; it helps medical professionals remotely monitor the health of infected patients. The proposed system focuses on developing a simple, low-cost, and effective monitoring system that ensures a safe working environment through COVID-19 symptom diagnosis and monitoring of COVID-19 safety measures.
3 Proposed System
The proposed system model is simple and user-friendly. It has two stages. The first is placed at the entrance of the room/workplace and monitors the vital parameters that reflect the symptoms of COVID-19, such as the body temperature and heartbeat of a person. Once these parameters are acquired from the person using the respective sensors, they are displayed on the LCD. The acquired values are classified as normal or abnormal and indicated using an LED and a buzzer. A normal value is indicated by a green light on the LED. If a person is identified with symptoms, the red LED and the buzzer alert the person, who is stopped from entering the room. Only a person with no symptoms, indicated by the green light, is allowed to enter the working room. The second-stage module is placed inside the working environment and monitors the room temperature, humidity, air quality, mask wearing, and social distancing. Different sensors are used to obtain the room parameters, and a camera is used to monitor mask wearing and social distancing. A buzzer indicates if any abnormal value is acquired by the respective sensors and also if a person violates the rules of wearing a mask or maintaining social distance in the workplace. The block diagram of Stage-I, which performs human parameter monitoring at the workplace, is shown in Fig. 1. This diagram represents the interfacing of the temperature sensor, heartrate sensor, LED, LCD, and buzzer with the Arduino board. The block diagram of the indoor setup used for indoor safety monitoring of the workplace is shown in Fig. 2. It monitors room temperature, humidity, air quality, mask wearing, and social distancing. This block diagram represents the interfacing of the humidity sensor, air quality sensor, IR sensor, camera, and buzzer with the Raspberry Pi board. A PC is used as a monitor to view the images/videos captured by the camera.
Fig. 1 Block diagram of the Stage-I
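The Stage-I entry decision described above reduces to a threshold check that drives the LEDs and the buzzer. The sketch below is illustrative only. The text does not state the limits that separate normal from abnormal readings, so the 37.5 °C temperature limit, the 60–100 bpm range, and all names are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GateResult:
    green_led: bool
    red_led: bool
    buzzer: bool
    allow_entry: bool


def screen_person(body_temp_c: float,
                  heart_rate_bpm: float,
                  max_temp_c: float = 37.5,                      # hypothetical limit
                  bpm_range: Tuple[float, float] = (60.0, 100.0)  # hypothetical range
                  ) -> GateResult:
    """Green LED and entry when both readings are normal; red LED and buzzer otherwise."""
    temp_ok = body_temp_c <= max_temp_c
    pulse_ok = bpm_range[0] <= heart_rate_bpm <= bpm_range[1]
    normal = temp_ok and pulse_ok
    return GateResult(green_led=normal, red_led=not normal,
                      buzzer=not normal, allow_entry=normal)


if __name__ == "__main__":
    print(screen_person(36.6, 72))   # normal reading
    print(screen_person(38.4, 72))   # elevated temperature
```

Keeping the limits as parameters makes it easy to substitute whatever thresholds a deployment actually adopts.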
Fig. 2 Block diagram of the Stage-II
4 Hardware Description
4.1 Hardware Description for Stage-I
The architectural design of Stage-I is shown in Fig. 3.
Fig. 3 Architectural design of the Stage-I
Arduino
The ATmega328 on the Arduino board is an 8-bit AVR microcontroller. It has 32 KB of built-in flash memory, 1 KB of electrically erasable programmable read-only memory (EEPROM), and 2 KB of SRAM. Because the EEPROM is non-volatile, data stored in it are retained when the power supply is turned off and are available again once power is restored. The ATmega328 has a wide range of features that make it a popular device, including an advanced RISC architecture, good performance, low power consumption, a real-time counter with a separate oscillator, and 6 PWM pins.
Temperature sensor
The MLX90614 is a contactless temperature sensor. It measures temperature using IR radiation instead of physical contact and communicates with the microcontroller using the I2C protocol [6, 7].
Heartrate sensor
The KY-039 is a low-cost heartbeat-sensing module. It uses a bright infrared (IR) LED and a phototransistor to detect the pulse in the finger; for every pulse in the finger, the red LED flashes. It is very important to shield the phototransistor from stray light. The module measures the heart rate in beats per minute (bpm) using an optical LED light source and an LED light sensor. The light shines through the skin, and the sensor measures the reflected light. Those reflections are counted as heartbeats [8].
LCD
A liquid crystal display (LCD) is a thin, flat electronic visual display that uses the light-modulating properties of liquid crystals (LCs), which do not emit light directly. LCDs are widely used in computer monitors, televisions, instrument panels, aircraft cockpit displays, etc. Their advantages include being compact, lightweight, portable, low cost, and reliable.
LED
Light-emitting diodes (LEDs) are used widely throughout the world. LEDs are available in different shapes and sizes, which gives designers the flexibility to tailor them to their product. Their low power consumption and small size make them a good choice for many different products, as they can be worked into a design seamlessly.
Buzzer
The buzzer is used as an indicator. If the sensor outputs match the thresholds set, the buzzer remains off; otherwise, the buzzer turns on. This completes the human parameter monitoring stage, which is used outside the workplace to provide a safe working environment.
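For the contactless temperature sensor, the value read over I2C is a 16-bit word rather than a temperature. A minimal conversion sketch follows, assuming the MLX90614's documented scale of 0.02 K per count and a low-byte-first word. The bus read itself is omitted because it depends on the host library, and the function name is hypothetical.

```python
KELVIN_PER_COUNT = 0.02      # MLX90614 temperature registers count in 0.02 K steps
KELVIN_OFFSET = 273.15       # 0 degrees Celsius expressed in kelvin


def mlx90614_raw_to_celsius(low_byte: int, high_byte: int) -> float:
    """Combine the two data bytes (low byte first) and convert to degrees Celsius."""
    raw = (high_byte << 8) | low_byte
    return raw * KELVIN_PER_COUNT - KELVIN_OFFSET


if __name__ == "__main__":
    # 0x3C80 = 15488 counts, i.e. 309.76 K
    print(round(mlx90614_raw_to_celsius(0x80, 0x3C), 2))
```

The resulting Celsius value is what a Stage-I threshold check would compare against its temperature limit.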
4.2 Hardware Description for Stage-II
Stage-II consists of the following seven components (as shown in Fig. 4).
Fig. 4 Architectural design of the Stage-II
Raspberry Pi
The Raspberry Pi 3 Model B+ is the latest product in the Raspberry Pi 3 range, with a 64-bit quad-core processor running at 1.4 GHz, dual-band 2.4 GHz and 5 GHz wireless LAN, Bluetooth 4.2/BLE, and faster Ethernet [9–11]. The dual-band wireless LAN comes with modular compliance certification, allowing the board to be designed into end products with reduced wireless LAN compliance testing, which improves both cost and time to market. The Raspberry Pi 3 Model B+ maintains the same mechanical footprint as both the Raspberry Pi 2 Model B and the Raspberry Pi 3 Model B.
DHT11 Sensor
The DHT11 humidity sensor is a small sensor that provides digital humidity readings. It is very easy to set up and needs only one wire for the data signal. These sensors are mostly used in remote weather stations, soil monitoring, and home automation systems [12].
Temperature sensor
The LM35 is an IC sensor that measures temperature with an electrical output proportional to the temperature (in °C). It measures temperature more accurately than a thermistor. The sensor circuitry is sealed and is not exposed to oxidation. The LM35 generates a higher output voltage than thermocouples.
Gas sensor
The gas sensor used is the MQ-135. In an environment containing polluting gas, the conductivity of the sensor rises as the concentration of the polluting gas increases. It detects smoke and other harmful gases well and is particularly sensitive to ammonia, sulfide, and benzene vapor. Its ability to detect various harmful gases and its low cost make it a good choice among gas sensors [13].
The Pi Camera v2
The Pi Camera v2 is the newest version, released in April 2016. It remains compatible with all existing Raspberry Pi models. It connects through a flat cable to the dedicated connector on the upper side of the Raspberry Pi board. It is used to periodically capture images of the workplace for mask detection and the social distancing check.
PC
A PC is used for installing the software required for the proposed work and for programming. It is also used to display the output of the sensors in the setup and the images captured by the Pi camera for mask detection and social distancing.
Buzzer
The buzzer is used as an indicator. If the output of the sensors and the Pi camera matches the thresholds/expectations set, the buzzer remains off; otherwise, the buzzer turns on.
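Because the LM35 output is an analog voltage, the room temperature has to be derived from an ADC reading. The sketch below uses the LM35's nominal scale of 10 mV per °C. Since the Raspberry Pi has no analog inputs, an external ADC is needed; the 10-bit resolution and 3.3 V reference used here are assumptions about that ADC, and the function name is hypothetical.

```python
LM35_VOLTS_PER_DEG_C = 0.010   # LM35 nominal scale factor: 10 mV per degree Celsius


def lm35_celsius(adc_count: int, adc_bits: int = 10, v_ref: float = 3.3) -> float:
    """Convert an ADC count to degrees Celsius.

    adc_bits and v_ref describe an assumed external ADC; change them to
    match the converter actually wired to the sensor.
    """
    full_scale = 2 ** adc_bits - 1
    volts = adc_count / full_scale * v_ref
    return volts / LM35_VOLTS_PER_DEG_C


if __name__ == "__main__":
    print(round(lm35_celsius(78), 2))   # about 0.25 V at the ADC input
```

With these settings one ADC count corresponds to roughly 0.32 °C, which bounds the resolution of the room temperature reading.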
5 Implementation and Results The experimental setup and the results obtained from Stage-I and Stage-II of the proposed system are discussed in this section with the respective figures taken during experimentation.
5.1 Stage-I Implementation The experimental setup of Stage-I of the proposed system is shown in Fig. 5. The experimental results obtained from Stage-I are discussed below.
Fig. 5 Experimental setup for Stage-I
5.1.1 Results for Stage-I
The Stage-I module was tested, and the result in Fig. 6 shows that the measured body temperature is normal, so the LED glows green and the buzzer stays off.
Fig. 6 Normal body temperature
Fig. 7 Abnormal body temperature
The result in Fig. 7 shows that the measured body temperature is not normal, so the LED glows red and the buzzer turns on. The Stage-I module was also tested for heartrate monitoring. The result in Fig. 8 shows that the measured heart rate is normal, so the LED glows green and the buzzer stays off. The result in Fig. 9 shows that the measured heart rate is abnormal, so the LED glows red and the buzzer turns on.
Fig. 8 Normal heart rate
Fig. 9 Abnormal heart rate
5.2 Stage-II Implementation
The experimental setup of Stage-II of the proposed system is shown in Fig. 10. The experimental results obtained from Stage-II are discussed below.
Fig. 10 Experimental setup for Stage-II
5.2.1 Results for Stage-II
The Viola–Jones algorithm is used for mask detection. It was developed by Paul Viola and Michael Jones in 2001 and allows the detection of image features in real time. The haarcascade_frontalface_default classifier is used to detect the human face from the front. Viola–Jones was designed for frontal faces, so it detects frontal faces best, rather than faces turned sideways, upward, or downward. Before a face is detected, the image is converted into grayscale, since it is easier to work with. The algorithm first detects the face in the grayscale image and then locates it in the color image [14–16].
Social distancing check algorithm
As in the mask detection algorithm, every camera frame is converted into a grayscale image. OpenCV’s haarcascade_fullbody classifier is used to detect human bodies in the captured image. If more than one person is detected, the distance between them is calculated. The calculated distance must be normalized according to the characteristics of the camera. If the distance between two people is less than the set threshold distance, the buzzer turns on [7, 17]. The results of the social distancing check are shown in Figs. 11 and 12. The threshold distance is set to 1 m.
Fig. 11 Social distance violated
Fig. 12 Social distance violated
Mask Detection
First, the camera frame is converted into a grayscale image, which is used for the detection of a face, as required by OpenCV’s Haar cascade classifier. Then, the frontal faces are detected, and the system shows whether a mask is detected or not. The algorithm works in both single-person and multi-person modes. If a person is detected without a mask, the buzzer turns on. Fig. 13 shows that the person is wearing a mask, so a green box appears with a text stating that the mask is on [8].
Fig. 13 A person with mask
Room temperature, gas level, and humidity outputs
The parameters that are required to be normal for a safe working environment are room temperature, CO2 level, and humidity. A safe range of gas level and humidity shows that there are limited people in the workplace. The room temperature in degrees Celsius, gas level in percentage, and humidity (in ppm) are shown in Fig. 14.
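The distance test at the core of the social distancing check can be illustrated without a camera. The sketch below takes bounding boxes in the (x, y, w, h) form that OpenCV cascade detectors return, measures the distance between box centers, and compares it with the 1 m threshold. The pixels_per_meter calibration constant stands in for the camera-dependent normalization mentioned above, and it and the function names are hypothetical.

```python
from itertools import combinations
from math import hypot
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]   # (x, y, w, h) in pixels


def centroid(box: Box) -> Tuple[float, float]:
    """Center point of a bounding box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def violations(boxes: Sequence[Box],
               pixels_per_meter: float,
               min_distance_m: float = 1.0) -> List[Tuple[int, int]]:
    """Return index pairs of detected people closer than min_distance_m."""
    centers = [centroid(b) for b in boxes]
    close = []
    for (i, a), (j, b) in combinations(enumerate(centers), 2):
        distance_m = hypot(a[0] - b[0], a[1] - b[1]) / pixels_per_meter
        if distance_m < min_distance_m:
            close.append((i, j))
    return close


def buzzer_on(boxes: Sequence[Box], pixels_per_meter: float) -> bool:
    """The buzzer sounds if any pair violates the threshold."""
    return bool(violations(boxes, pixels_per_meter))


if __name__ == "__main__":
    people = [(0, 0, 50, 100), (60, 0, 50, 100), (400, 0, 50, 100)]
    print(violations(people, pixels_per_meter=200.0))
```

A single scale factor ignores perspective, so this is only reasonable when people stand at a similar distance from the camera; a real deployment would need a proper calibration.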
Fig. 14 Room parameter results
6 Conclusion
The proposed system measures body temperature and heart rate to check that a person is fit to enter the working environment. It also monitors room temperature, air quality, and humidity, and detects whether each person is wearing a mask and whether people are maintaining a social distance of 1 m, to ensure safety within the working room. The proposed work was implemented in real time, and the results were verified. The results indicate that the proposed solution is useful for its purpose of increasing COVID-19 indoor safety. As future work, emerging technologies such as IoT and AI could be combined with the proposed work to obtain a smart monitoring system [18, 19].
References
1. D. Lewis, Why indoor spaces are still prime COVID hotspots. Nature 592(7852), 22–25 (2021). https://doi.org/10.1038/d41586-021-00810-9
2. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public
3. Y. Dong, Y.-D. Yao, IoT platform for COVID-19 prevention and control: a survey. IEEE Access 9 (2021)
A. Tarannum et al.
4. N. Petrovic, Prototyping PIC16-based COVID-19 indoor safety solutions within microcomputer systems course. IEEESTEC (2020)
5. C. Young, COVID-19 monitoring devices with Arduino boards and Maxim ICs. Article in Maxim Integrated (2020)
6. J.H. Jo, B.W. Jo, J.H. Kim, S.J. Kim, W.Y. Han, Development of an IoT-based indoor air quality monitoring platform. J. Sensors (2020)
7. S. Neelavathy Pari, B. Vasu, A.V. Geetha, V.K. Jeevitha, Monitoring social distancing by smart phone App in the effect of COVID-19. Int. J. Eng. Res. Technol. (IJERT) 09(09) (2020)
8. S. Dunn, Safer indoor air systems help prevent COVID-19. Article from Lab Manager.com (2020)
9. H. Mukhtar, S. Rubaiee, M. Krichen, R. Alroobaea, An IoT framework for screening of COVID-19 using real-time data from wearable sensors. Int. J. Environ. Res. Public Health 18(8), 4022 (2021). https://doi.org/10.3390/ijerph18084022
10. A.W. Bartik et al., The impact of COVID-19 on small business outcomes and expectations. Proc. Nat. Acad. Sci. USA 117(30), 17656–17666 (2020). https://doi.org/10.1073/pnas.2006991117
11. A. Mridula, A hefty price tag for small businesses complying with NSW health COVID-19 restrictions. Article from abc.net (2020)
12. S. Madhura, IoT based monitoring and control system using sensors. J. IoT Soc. Mobile, Analyt. Cloud 3(2), 111–120
13. M. Rezaei, M. Azarmi, Deep SOCIAL: social distancing monitoring and infection risk assessment in COVID-19 pandemic. Appl. Sci. 10(21), 7514 (2020). https://doi.org/10.3390/app10217514
14. N. Petrovic, D. Kocic, IoT based system for COVID-19 indoor safety monitoring. J. icTRAN (2020)
15. A. Bashir, U. Izhar, C. Jones, in IoT-Based COVID-19 SOP compliance and monitoring system for businesses and public offices. 7th International Electronic Conference on Sensors and Applications (2020)
16. S.S. Vedaei et al., COVID-SAFE: an IoT-based system for automated health monitoring and surveillance in post-pandemic life. IEEE Access 8, 188538–188551 (2020). https://doi.org/10.1109/ACCESS.2020.3030194
17. M.M. Islam, A. Rahaman, M.R. Islam, Development of smart healthcare monitoring system in IoT environment. SN Comput. Sci. 1, 185 (2020)
18. J. Hariharakrishnan, N. Balaji, Adaptability analysis of 6LoWPAN and RPL for healthcare applications of Internet of Things. J. ISMAC 3(02), 69–81 (2021)
19. A. Sungheetha, R. Sharma, 3D image processing using machine learning based input processing for man-machine interaction. J. Innovative Image Process. (JIIP) 3(01), 1–6 (2021)
20. K. Kumar, N. Kumar, R. Shah, Role of IoT to avoid spreading of COVID-19. Int. J. Intell. Netw. (2020)
Statistical Measurement of Software Reliability Using Meta-Heuristic Algorithms for Parameter Estimation
Rajani, Naresh Kumar, and Kuldeep Singh Kaswan
Abstract Meta-heuristic algorithms are now widely available in the software field. Techniques for estimating the parameters of reliability models are prevalent in the software reliability evaluation literature, and a few well-known algorithms have been employed across many disciplines to optimize problem solutions. This research work evaluates the different features of system reliability models for parameter estimation. It deals with estimating the parameters of software reliability models with the help of optimization algorithms, and it compares the results of the proposed parameter estimation approach with other algorithms such as ABC, GA, and PSO.
Keywords Meta-heuristic · Statistical measurement · Accuracy · Reliability · Parameter · Estimation
Rajani (B) · N. Kumar · K. Singh Kaswan
School of Computing Science and Engineering, Galgotias University, Greater Noida, Uttar Pradesh, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_55

1 Introduction
Nature-inspired algorithms fall into two classes: single-solution-based and population-based. In the former, the search process starts with a single candidate solution that is improved over a number of iterations. The latter uses a set of candidate solutions that start from multiple random solutions, and these solutions improve over the course of the iterations [1]. These two classes are further divided into four groups: algorithms based on natural evolutionary principles, algorithms based on physical phenomena, algorithms based on swarm intelligence behavior, and algorithms derived from human intelligence behavior. Evolutionary principle-based algorithms are inspired by the natural evolution process and rely mainly on the survival-of-the-fittest principle to form the next generation of individuals; examples include the genetic algorithm, genetic programming, and differential evolution [2]. The second group follows the laws of physics and mainly includes well-known algorithms such as simulated annealing, the gravitational search algorithm, black hole, charged system search, and galaxy-based search optimization. Swarm intelligence behavior-based algorithms form the third group and imitate the social activities of animals, birds, and amphibians. This group contains the largest variety of nature-inspired algorithms, such as particle swarm optimization, artificial bee colony, the flower pollination algorithm, and ant colony optimization [3]. The figure demonstrates all such swarm intelligence algorithms. The last group comprises human intelligence-driven optimization algorithms, which also include a range of popular algorithms such as teaching-learning-based optimization, harmony search, tabu search, and the fireworks algorithm [4].

Among the available models, only a few are useful, and others are found to be wrong. This low proportion has two causes: (1) very few public experimental SRE datasets exist; and (2) long-term clusters are usually required to generate software reliability data through experiments. Computing software is used in many different fields, including air traffic management, nuclear plants, aviation networks, real-time sensors, industrial process control, vehicle mechanics and safety regulation, and patient health [5]. The reliability of applications can be estimated with reliability models and other techniques. Because of the increasing importance of computing in today's systems, researchers have concentrated primarily on the evaluation of software reliability [6].
2 Literature Review
Software reliability modeling has been in use for almost four decades. In the last few decades, with the growth of the digital world, more than a hundred software reliability models have been developed under various categories for measuring and enhancing software reliability. This section attempts to present the essential literature in a structured format. The articles discussed here are helpful for the present research work on software reliability model development. Software reliability modeling includes a broad range of probabilistic models that represent the occurrence and removal of faults as probabilistic events. The review covers the most influential articles in the key categories of software reliability modeling. Failure rate-based models and non-homogeneous Poisson process (NHPP)-based models are among the major categories of software reliability models. These models are an increasingly used approach to software reliability assessment.

Ali Khodam suggested a model in 2021 [7] that proposes a global decoupling approach to address the shortcomings of sequential optimization and reliability assessment (SORA) and improve its results. In this approach, SORA is combined with an improved differential evolution to form the enhanced ESORA. A modified chaos control (MCC) is used in the evaluation of the reliability constraints in order to increase the efficiency of reliability-based design optimization (RBDO) for strongly nonlinear performance functions with non-normal random parameters. The suggested system uses MCC to locate the MPTP within the decoupling technique. A shifting vector based on the reliability information is then used to correct the constraints for the next step of deterministic optimization.

According to Athraa Ali Kadhem in 2017 [1], Monte Carlo simulation (MCS) is a highly versatile way to evaluate the reliability of power systems. It can be divided into sequential and non-sequential simulation, both of which imitate the random behavior of system components. In contrast to analytical approaches, MCS can be used in the most complex system cases to predict system reliability. The main disadvantage of MCS is the computation time, and hence the computational effort, that it needs for the reliability index values to converge with adequate precision. The article presents useful ways to accelerate MCS in the reliability evaluation of the power grid. A variance reduction technique (VRT) manipulates the way in which each MCS sample is defined so as to preserve the randomness of the system while reducing the estimation variance. VRTs can thus reduce the number of samples required by the MCS process; their main purpose is to reduce the estimation variance of the reliability indices.

Schick–Wolverton (SW) in 2013 proposed a model [3] that assumes the failure rate is proportional to both the number of remaining defects and the time elapsed since the last failure. This concept was developed as a modification of the simple JM model. Turkish L.I.A. and E.G. Alsolami developed an SW model in which the debugging time between fault occurrences follows a Weibull distribution. The SW model was updated in 1978 [4] by adding the cumulative number of errors at a given moment. This model provides the ideal debugging mechanism when making model creation decisions. Littlewood in 1979 [4] proposed a Markov property-based model that assumes an exponential distribution of faults, with each fault having an unpredictable effect on the program's average failure rate.
This work criticizes some of the assumptions in various software reliability models and makes suggestions for overcoming the difficulties in the existing models. Moranda in 1979 proposed a model based on the JM model that assumes the initial number of defects follows a Poisson distribution that decreases geometrically at each failure time. It is a logarithmic Poisson time model used for calculating software reliability. In 1979, Amrit L. Goel and Kazu Okumoto [5] proposed an imperfect debugging model, which incorporates the imperfect debugging phenomenon through the probability of error removal and is based on failure data obtained from numerous projects. A Bayesian modification of the JM model was suggested by Littlewood and Sofer [4], and it improved the estimation of program reliability. In 1985, William S. Jewell developed a model that extends the Bayesian treatment of the Jelinski and Moranda software reliability model given by Reinhold and Singpurwalla. The stopping rule, the prior distribution of the number of faults, and the uncertainty about the failure rate are all extended significantly. Joe and Reid in 1985 proposed a model that provides an alternative formulation of the JM and Littlewood models, expressed in terms of the failure rate rather than the inter-failure time.
Luo, Cao, Tang, and Wu [8] in 2011 proposed a modification of the Jelinski–Moranda software reliability model based on cloud model theory. Their paper proposes a novel approach to modifying the popular Jelinski–Moranda model to account for the pervasive uncertainty in software reliability. Chang and Liu proposed a simpler model by extending the JM model with a truncated distribution and a self-exciting point process.
3 Research Gaps Based on Literature Review
Based on the literature review, the identified research gaps are listed below:
• Classical reliability-based design optimization should be examined so as to give greater emphasis to EA-based search and optimization methods, and vice versa, leading to the evaluation of more hybrid and classical RBDO approaches.
• Various software reliability models and various optimization techniques involving meta-heuristics exist, but how they fit together is not clearly defined.
• A software reliability model that predicts the reliability of system software needs to be identified.
4 Statistical Measurement of Software Reliability
The reliability $R_e(t_1)$ of a system is the probability that the system does not fail during the time interval from 0 to $t_1$ under the given operating conditions. It may be defined as:

$$R_e(t_1) = P(K > t_1), \quad t_1 \ge 0 \tag{1}$$

where $K$ is the time at which the system fails. If the time to failure is a random variable with density function $f(t)$, then

$$R_e(t_1) = \int_{t_1}^{\infty} f(t)\,dt \tag{2}$$

where $f(t)$ is the failure density function, or equivalently

$$f(t) = -\frac{d[R_e(t)]}{dt} \tag{3}$$

The probability density function can be interpreted mathematically in terms of $K$ as:

$$f(t_1) = \lim_{\Delta t \to 0} \frac{P(t_1 < K \le t_1 + \Delta t)}{\Delta t} \tag{4}$$
This expresses the probability that the failure time $K$ falls between $t_1$ and $t_1 + \Delta t$, where $t_1$ is the operating time and $\Delta t$ is the length of the next time interval. If a number of errors remain in the system, its reliability cannot be measured directly [9]. To calculate software reliability, it is therefore also important to estimate the failure data. Several metrics are available for reliability evaluation; some of them are described below:

1. Mean time to failure (MTTF) [10]: the expected time until a failure occurs.

$$\mathrm{MTTF} = \int_{0}^{\infty} t\,f(t)\,dt \tag{5}$$

This shows that the MTTF can be used only if the failure density function is given.

2. Failure rate function (FRF): the probability that the system fails in a time interval $[t_i, t_j]$ [11].

$$\mathrm{FRF} = \int_{t_i}^{\infty} f(t)\,dt - \int_{t_j}^{\infty} f(t)\,dt = R_e(t_i) - R_e(t_j) \tag{6}$$

3. Mean time to repair (MTTR): the average time taken to repair the system.

$$\mathrm{MTTR} = \int_{0}^{\infty} t\,g(t)\,dt \tag{7}$$

where $g(t)$ is the density function of the system repair time.

4. Mean time between failures (MTBF): the sum of the mean time to failure and the mean time to repair.

$$\mathrm{MTBF} = \mathrm{MTTF} + \mathrm{MTTR} \tag{8}$$

5. Availability $A(t_1)$: a performance measure for repairable systems at time $t_1$.

$$A(t_1) = \frac{\text{System up time (MTTF)}}{\text{System up time (MTTF)} + \text{System down time (MTTR)}} \tag{9}$$

6. Probability of failure on demand (POFOD): the probability that the system fails when a service request is made. A POFOD of 0.2 means that, on average, two out of 10 service requests fail [12]. It is considered an essential measure for safety-critical systems.
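The metrics above can be checked numerically on a simple case. The following sketch assumes, purely for illustration, an exponential failure density $f(t) = \lambda e^{-\lambda t}$ (the source does not fix a distribution); the function names and the trapezoidal integration are hypothetical choices, not part of the original work.

```python
import math


def reliability(t, lam):
    """R_e(t) = P(K > t) for the assumed exponential failure density."""
    return math.exp(-lam * t)


def density(t, lam):
    """f(t) = -dR_e/dt for the assumed exponential model."""
    return lam * math.exp(-lam * t)


def interval_failure_probability(t_i, t_j, lam):
    """Probability of failure in [t_i, t_j], i.e. R_e(t_i) - R_e(t_j)."""
    return reliability(t_i, lam) - reliability(t_j, lam)


def mttf_numeric(lam, steps=100_000):
    """Approximate the integral of t * f(t) over [0, inf) by the trapezoidal rule.

    The upper limit is truncated at 50 / lam, where the integrand is negligible.
    """
    upper = 50.0 / lam
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * t * density(t, lam)
    return total * h


def availability(mttf, mttr):
    """Steady-state availability: up time over up time plus down time."""
    return mttf / (mttf + mttr)


if __name__ == "__main__":
    lam = 0.5  # assumed failure rate, failures per unit time
    print(reliability(2.0, lam))
    print(interval_failure_probability(0.0, 2.0, lam))
    print(mttf_numeric(lam))       # close to 1 / lam for this density
    print(availability(2.0, 0.5))
```

For the exponential case the MTTF has the closed form $1/\lambda$, which gives a quick sanity check on the numerical integral.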
Fig. 1 Process of software reliability modeling
The general framework for calculating software reliability includes a fault forecasting approach, which is applied with the growth models of software reliability [13]. The flowchart represented in Fig. 1 illustrates how a software reliability assessment model can be created.
5 Accuracy Estimation Techniques
Quantitative techniques are required to assess how accurately SRGMs predict software reliability. A collection of accuracy evaluation criteria is discussed here; these criteria can be used to compare model accuracy quantitatively and so determine the efficacy of software reliability growth models [14]. Researchers in the area of software reliability evaluation use twelve main criteria to assess or compare software reliability models. The criteria for comparison are provided below [15].
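Accuracy estimation method | Measurement
Mean square error (MSE) | $\mathrm{MSE} = \dfrac{\sum_{i=1}^{k} (m_i - \hat{m}(t_i))^2}{k - p}$
Mean absolute error (MAE) | $\mathrm{MAE} = \dfrac{\sum_{i=1}^{k} \lvert m_i - \hat{m}(t_i) \rvert}{k - p}$
Sum of square error (SSE) | $\mathrm{SSE} = \sum_{i=1}^{k} \lvert m_i - \hat{m}(t_i) \rvert^2$
Bias | $\mathrm{Bias} = \dfrac{\sum_{i=1}^{k} (m_i - \hat{m}(t_i))}{k}$
Mean error of prediction (MEOP) | $\mathrm{MEOP} = \dfrac{\sum_{i=1}^{k} \lvert m_i - \hat{m}(t_i) \rvert}{k - p + 1}$
Accuracy of estimation (AE) | $\mathrm{AE} = \left\lvert \dfrac{M_a - a}{M_a} \right\rvert$, where $M_a$ and $a$ are the true and predicted numbers of errors found
Predictive ratio risk (PRR) | $\mathrm{PRR} = \sum_{i=1}^{k} \dfrac{\hat{m}(t_i) - m_i}{\hat{m}(t_i)}$
Root mean square prediction error (RMSPE) | $\mathrm{RMSPE} = \sqrt{\dfrac{1}{k-1} \sum_{i=1}^{k} (m_i - \hat{m}(t_i) - \mathrm{Bias})^2 + \mathrm{Bias}^2}$
Theil statistics (TS) | $\mathrm{TS} = \sqrt{\dfrac{\sum_{i=1}^{k} (\hat{m}(t_i) - m_i)^2}{\sum_{i=1}^{k} m_i^2}} \times 100\%$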
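The criteria in the table above translate directly into code. The sketch below is a hedged illustration rather than the authors' implementation: it assumes the usual reading of the symbols, namely that $m_i$ is the observed cumulative number of faults at time $t_i$, $\hat{m}(t_i)$ is the model estimate, $k$ is the number of observations, and $p$ is the number of model parameters. The function name `accuracy_measures` is hypothetical.

```python
import math


def accuracy_measures(observed, predicted, n_params):
    """Compute several SRGM accuracy criteria from observed and predicted fault counts.

    observed:  list of m_i (assumed: observed cumulative faults)
    predicted: list of m_hat(t_i) (assumed: model estimates at the same times)
    n_params:  p, the number of model parameters
    """
    k = len(observed)
    residuals = [m - m_hat for m, m_hat in zip(observed, predicted)]
    abs_sum = sum(abs(r) for r in residuals)

    sse = sum(r * r for r in residuals)
    bias = sum(residuals) / k
    # Sample variance of the residuals around the bias, as used in RMSPE.
    variance = sum((r - bias) ** 2 for r in residuals) / (k - 1)

    return {
        "SSE": sse,
        "MSE": sse / (k - n_params),
        "MAE": abs_sum / (k - n_params),
        "Bias": bias,
        "MEOP": abs_sum / (k - n_params + 1),
        "PRR": sum((m_hat - m) / m_hat for m, m_hat in zip(observed, predicted)),
        "RMSPE": math.sqrt(variance + bias ** 2),
        "TS": math.sqrt(sse / sum(m * m for m in observed)) * 100.0,
    }


if __name__ == "__main__":
    results = accuracy_measures([10, 20, 30, 40], [12, 18, 33, 39], n_params=2)
    for name, value in results.items():
        print(f"{name}: {value:.4f}")
```

Lower values indicate a better fit for every criterion computed here, which is what makes them usable as objective functions for the parameter estimation discussed later.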
6 Failure Rate-Based Model
The Jelinski–Moranda (JM) model is formulated as a Markov process. It is the earliest model used to predict software reliability from failure data under certain assumptions about the nature of the failure process. The fundamental assumptions of this model are [16]:
1. When a failure is discovered, the fault is removed immediately.
2. No new code is added during testing.
3. Failures are reported only by the testing group.
4. Each unit of time is equivalent.
5. Testing is representative of operational use.
6. Faults are independent.
7. The software is tested in isolation using black-box testing.
8. The organization does not change, and the number of initial program faults is an unknown but fixed constant.
9. The time intervals between failure occurrences are independent [17].

The program failure rate is constant over each failure interval and proportional to the number of faults remaining in the program. The program failure rate at the ith failure interval is given as
$$\lambda(t_i) = \phi\,[N - (i - 1)], \quad i = 1, 2, \ldots, N$$

where $\phi$ is a proportionality constant and $N$ is the number of initial faults in the program.

Suppose that the failure data set $\{t_1, t_2, \ldots, t_n\}$ is given and that $\phi$ is known. Using the MLE method, the likelihood function is obtained as:

$$L(N) = \prod_{i=1}^{n} f(t_i) = \prod_{i=1}^{n} \phi\,[N - (i-1)]\,e^{-\phi [N - (i-1)] t_i} \tag{10}$$

$$L(N) = \phi^{n} \prod_{i=1}^{n} [N - (i-1)]\; e^{-\phi \sum_{i=1}^{n} [N - (i-1)] t_i} \tag{11}$$

and the log of the likelihood function is

$$\ln L(N) = n \ln \phi + \sum_{i=1}^{n} \ln[N - (i-1)] - \phi \sum_{i=1}^{n} [N - (i-1)]\,t_i \tag{12}$$

Taking the first derivative of the above function with respect to $N$, we get

$$\frac{\partial \ln L}{\partial N} = \sum_{i=1}^{n} \frac{1}{N - (i-1)} - \phi \sum_{i=1}^{n} t_i \tag{13}$$

Setting $\dfrac{\partial \ln L(N)}{\partial N} = 0$ gives

$$\sum_{i=1}^{n} \frac{1}{N - (i-1)} = \phi \sum_{i=1}^{n} t_i \tag{14}$$

The MLE of $N$ can then be obtained by solving the following equation:

$$\sum_{i=1}^{n} \frac{1}{N - (i-1)} = \phi \sum_{i=1}^{n} t_i$$

In many applications the parameter $\phi$ is not known. In this case, both $N$ and $\phi$ must be estimated [18]. The log-likelihood function is again

$$\ln L(N, \phi) = n \ln \phi + \sum_{i=1}^{n} \ln[N - (i-1)] - \phi \sum_{i=1}^{n} [N - (i-1)]\,t_i \tag{15}$$

Taking the derivatives with respect to $N$ and $\phi$ gives

$$\frac{\partial}{\partial N} [\ln L(N, \phi)] = \sum_{i=1}^{n} \frac{1}{N - (i-1)} - \phi \sum_{i=1}^{n} t_i = 0 \tag{16}$$

and

$$\frac{\partial}{\partial \phi} [\ln L(N, \phi)] = \frac{n}{\phi} - \sum_{i=1}^{n} [N - (i-1)]\,t_i = 0 \tag{17}$$

From the two equations above, we obtain

$$\phi = \frac{\sum_{i=1}^{n} \frac{1}{N - (i-1)}}{\sum_{i=1}^{n} t_i}$$

and

$$n \sum_{i=1}^{n} t_i = \left( \sum_{i=1}^{n} [N - (i-1)]\,t_i \right) \left( \sum_{i=1}^{n} \frac{1}{N - (i-1)} \right) \tag{18}$$
Jelinski–Moranda model: The JM model is formulated as a Markov process. It is the earliest model used to predict software reliability from failure data under certain assumptions about the nature of the failure process [19]. The fundamental assumptions of this model are:
i. When a failure is detected, the fault is removed immediately.
ii. No new code is added during testing.
iii. Failures are recorded only by the testing group.
iv. Each time unit is equivalent.
v. Testing is representative of operational use.
vi. Faults are independent.
vii. The software is tested in isolation.
viii. Black-box testing is used.
ix. The organization does not change.
x. The number of initial program faults is an unknown but fixed constant.
xi. The time intervals between failure occurrences are independent.
xii. The program failure rate over a failure interval is constant and proportional to the number of faults remaining in the program.

The program failure rate at the ith failure interval is

$$\lambda(t_i) = \phi\,[N - (i - 1)], \quad i = 1, 2, \ldots, N$$

where $\phi$ is a proportionality constant and $N$ is the number of initial faults in the program.

Suppose that the failure data set $\{t_1, t_2, \ldots, t_n\}$ is given and that $\phi$ is known. Using the MLE method, the likelihood function is obtained as:

$$L(N) = \prod_{i=1}^{n} f(t_i) = \prod_{i=1}^{n} \phi\,[N - (i-1)]\,e^{-\phi [N - (i-1)] t_i} = \phi^{n} \prod_{i=1}^{n} [N - (i-1)]\; e^{-\phi \sum_{i=1}^{n} [N - (i-1)] t_i} \tag{19}$$

and the log of the likelihood function is

$$\ln L(N) = n \ln \phi + \sum_{i=1}^{n} \ln[N - (i-1)] - \phi \sum_{i=1}^{n} [N - (i-1)]\,t_i \tag{20}$$

Taking the first derivative of the above function with respect to $N$, we get

$$\frac{\partial \ln L}{\partial N} = \sum_{i=1}^{n} \frac{1}{N - (i-1)} - \phi \sum_{i=1}^{n} t_i \tag{21}$$

Setting $\dfrac{\partial \ln L(N)}{\partial N} = 0$ gives

$$\sum_{i=1}^{n} \frac{1}{N - (i-1)} = \phi \sum_{i=1}^{n} t_i \tag{22}$$

The MLE of $N$ can then be obtained by solving

$$\sum_{i=1}^{n} \frac{1}{N - (i-1)} = \phi \sum_{i=1}^{n} t_i \tag{23}$$

In many applications the parameter $\phi$ is not known. In this case, both $N$ and $\phi$ must be estimated. The log-likelihood function is again

$$\ln L(N, \phi) = n \ln \phi + \sum_{i=1}^{n} \ln[N - (i-1)] - \phi \sum_{i=1}^{n} [N - (i-1)]\,t_i \tag{24}$$

Taking the derivatives with respect to $N$ and $\phi$ gives

$$\frac{\partial}{\partial N} [\ln L(N, \phi)] = \sum_{i=1}^{n} \frac{1}{N - (i-1)} - \phi \sum_{i=1}^{n} t_i = 0 \tag{25}$$

and

$$\frac{\partial}{\partial \phi} [\ln L(N, \phi)] = \frac{n}{\phi} - \sum_{i=1}^{n} [N - (i-1)]\,t_i = 0$$

From the two equations above, we obtain

$$\phi = \frac{\sum_{i=1}^{n} \frac{1}{N - (i-1)}}{\sum_{i=1}^{n} t_i}$$

and

$$n \sum_{i=1}^{n} t_i = \left( \sum_{i=1}^{n} [N - (i-1)]\,t_i \right) \left( \sum_{i=1}^{n} \frac{1}{N - (i-1)} \right) \tag{26}$$
7 Experimental Result
The smart and successful hunting capability of the otter is utilized to estimate the parameters of software reliability growth models [20]. The proposed algorithm is implemented on a machine with 4 GB of RAM, a 64-bit Windows architecture, and an x64-based Intel(R) Core(TM) i5 (5th gen)-62,000 CPU at 2.40 GHz. The proposed algorithm is applied to widely used reliability estimation models. The experimental models used here are:

Model | Mean value function
Goel–Okumoto NHPP model [21] | $m(t) = a(1 - e^{-bt})$, $a(t) = a$, $b(t) = b$
Inflection S-shaped model [22] | $m(t) = \dfrac{a(1 - e^{-bt})}{1 + \beta e^{-bt}}$, $a(t) = a$, $b(t) = \dfrac{b}{1 + \beta e^{-bt}}$
Zhang–Teng–Pham model [23] | $m(t) = \dfrac{a}{p - \beta} \left[ 1 - \left( \dfrac{(1 + \alpha) e^{-bt}}{1 + \alpha e^{-bt}} \right)^{\frac{c}{b}(p - \beta)} \right]$, $b(t) = \dfrac{c}{1 + \alpha e^{-bt}}$, $\beta(t) = \beta$
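The mean value functions listed above can be written as short Python functions, which makes it easy to evaluate them for candidate parameter values. This is a minimal sketch with hypothetical function names; the parameter values in the usage example are arbitrary and not taken from the paper.

```python
import math


def goel_okumoto(t, a, b):
    """Goel-Okumoto NHPP mean value function: m(t) = a * (1 - exp(-b t))."""
    return a * (1.0 - math.exp(-b * t))


def inflection_s_shaped(t, a, b, beta):
    """Inflection S-shaped mean value function."""
    decay = math.exp(-b * t)
    return a * (1.0 - decay) / (1.0 + beta * decay)


def zhang_teng_pham(t, a, b, c, alpha, beta, p):
    """Zhang-Teng-Pham mean value function (requires p != beta and b != 0)."""
    decay = math.exp(-b * t)
    ratio = (1.0 + alpha) * decay / (1.0 + alpha * decay)
    return a / (p - beta) * (1.0 - ratio ** ((c / b) * (p - beta)))


if __name__ == "__main__":
    # Arbitrary illustrative parameters.
    for t in (0, 5, 10, 50):
        print(t,
              goel_okumoto(t, a=100, b=0.1),
              inflection_s_shaped(t, a=100, b=0.1, beta=2.0),
              zhang_teng_pham(t, a=100, b=0.1, c=0.1, alpha=1.0, beta=0.1, p=0.9))
```

All three functions start at zero faults at $t = 0$, and with $\beta = 0$ the inflection S-shaped model reduces to the Goel–Okumoto model, which is a convenient consistency check.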
The results of the proposed algorithm are compared with those of other meta-heuristic algorithms, for example PSO, ABC, GA, and PSOGSA [24]. Four benchmark datasets from Pham [1] are used for the experimental analysis; they are listed in Table 1. The performance of the proposed algorithm is calculated and analyzed using statistical results such as the estimated model parameter values, the sum of squared errors, the mean square error, and the time in seconds taken by each algorithm [25]. Each algorithm is run for more than 1000 iterations. The result analysis is carried out on various examples using the three best-known software reliability models and the four datasets.
Table 1 Datasets used for analysis
Dataset | Number of faults | Time (sec/hours/days) | Type of application
DS1 | 136 | 25 h | Real-time control system
DS2 | 100 | 10,000 h | Tandem Computers software projects
DS3 | 34 | 849 days | US Naval Tactical Data System
DS4 | 136 | 88,682 s | Real-time control system
Table 2 Result analysis using GO Model on DS1
Sr. no | Algorithm | Estimated parameter value A | Estimated parameter value B | SSE | MSE | Elapsed time (s)
1 | ABC | 142.4072 | 0.18173 | 2.81E+05 | 7.76E+03 | 27.43
2 | GA | 138.9734 | 0.16789 | 4.03E+05 | 2.10E+03 | 69.6842
3 | PSO | 139.4080 | 0.18173 | 3.17E+06 | 1.27E+04 | 4.25219
4 | Otter | 138.8168 | 0.187335 | 1.94E+05 | 7.16E+02 | 16.69301
7.1 GO Model and DS1 Statistical Analysis
In this example, three meta-heuristic algorithms are analyzed and compared with the proposed otter-based algorithm using DS1 and the Goel–Okumoto model. Table 2 shows the results in terms of the parameter estimates and the performance of each algorithm with respect to SSE, MSE, and elapsed time. Table 3 shows the results [26]. The GO model parameter values estimated for DS2 by the proposed algorithm are very close to the actual values of the dataset. In terms of SSE and MSE, the proposed algorithm outperforms the other algorithms, with a lower error than any of them. The table above compares the meta-heuristic algorithms through their estimated parameter values and elapsed times.
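To make the comparison above more concrete, the following sketch shows how a meta-heuristic can estimate the GO model parameters by minimizing the SSE. It is not the proposed otter-based algorithm, whose details are not given here; a plain particle swarm optimizer (one of the baselines named in the text) is swapped in as a stand-in. The function names, the inertia and acceleration constants, the search bounds, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def go_mean(t, a, b):
    """Goel-Okumoto mean value function m(t) = a * (1 - exp(-b t))."""
    return a * (1.0 - math.exp(-b * t))


def sse(params, times, faults):
    """Sum of squared errors between observed faults and the GO model."""
    a, b = params
    return sum((m - go_mean(t, a, b)) ** 2 for t, m in zip(times, faults))


def pso_fit(times, faults, bounds, n_particles=30, n_iter=200, seed=0):
    """Minimal PSO (assumed constants w=0.7, c1=c2=1.5) minimizing the SSE."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [sse(p, times, faults) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Clamp each coordinate to its search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = sse(pos[i], times, faults)
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


if __name__ == "__main__":
    # Synthetic data generated from a = 100, b = 0.1 (not one of DS1-DS4).
    times = list(range(1, 21))
    faults = [go_mean(t, 100.0, 0.1) for t in times]
    params, error = pso_fit(times, faults, bounds=[(1.0, 500.0), (0.001, 1.0)])
    print(params, error)
```

Any of the accuracy criteria from Sect. 5 could replace the SSE as the objective; SSE is used here only because it is one of the measures reported in Table 2.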
8 Conclusion
In this paper, an algorithm based on the intelligent foraging behavior of the smooth-coated otter is used for the parameter estimation of software reliability models. The results are compared with those of several other nature-inspired algorithms. The comparison with ABC, GA, and PSO shows that the proposed algorithm optimizes the parameters significantly better on several measures. In some cases the proposed algorithm may have greater complexity, but in others it is superior. Compared with the other approaches, the algorithm converges in fewer iterations. The proposed algorithm can also be used as a generalized algorithm for further parameter optimization in different fields.
References 1. A.A. Kadhem, N.I.A. Wahab, I. Aris, J. Jasni, A.N. Abdalla, Computational techniques for assessing the reliability and sustainability of electrical power systems: a review. Renew. Sustain. Energy Rev. 80, 1175–1186 (2017) 2. K.S. Kaswan, S. Choudhary, K. Sharma, Software reliability modeling using soft computing techniques: critical review. J. Inform. Tech. Softw. Eng. 5(144), 1–9 (2015). https://doi.org/10. 4172/2165-7866.1000144 3. K. Deb, S. Gupta, D. Daum, J. Branke, A.K. Mall, D. Padmanabhan, Reliability-based optimization using evolutionary algorithms. IEEE Transact. Evolut. Comput. 13(5), 1054–1074 (2009) 4. K. Moslehi, R. Kumar, A reliability perspective of the smart grid. IEEE Transact. Smart Grid 1(1), 57–64 (2010) 5. C.L.T. Borges, An overview of reliability models and methods for distribution systems with renewable energy distributed generation. Renew. Sustain. Energy Rev. 16(6), 4008–4015 (2012) 6. S. Shakya, S. Smys, Reliable automated software testing through hybrid optimization algorithm. J. Ubiquit. Comput. Commun. Technol. 2(03), 126–135 (2020) 7. A. Khodam, P. Mesbahi, M. Shayanfar, B. M. Ayyub, Global decoupling for structural reliability-based optimal design using ımproved differential evolution and chaos control. ASCE-ASME J. Risk Uncertain. Eng. Syst. Part A Civil Eng. 7(1), 04020052 (2021) 8. C. Chen, W. Wu, B. Zhang, C. Singh, An analytical adequacy evaluation method for distribution networks considering protection strategies and distributed generators. IEEE Transact. Power Deliv. 30(3), 1392–1400 (2014) 9. P. Hao, R. Ma, Y. Wang, S. Feng, B. Wang, G. Li, F. Yang, An augmented step size adjustment method for the performance measure approach: toward general structural reliability-based design optimization. Struct. Safety 80, 32–45 (2019) 10. V. Ho-Huu, T.D. Do-Thi, H. Dang-Trung, T. Vo-Duy, T. 
Nguyen-Thoi, Optimization of laminated composite plates for maximizing buckling load using improved differential evolution and smoothed finite element method. Compos. Struct. 146, 132–147 (2016) 11. B. M. Ayyub, U. O. Akpan, T. S. Koko, T. Dunbar, Reliability-based optimal design of steel box structures. I: theory. ASCE-ASME J. Risk Uncertain. Eng. Syst. Part A Civil Eng. 1(3), 04015009 (2015) 12. A. Bigham, S. Gholizadeh, Topology optimization of nonlinear single-layer domes by an improved electro-search algorithm and its performance analysis using statistical tests. Struct. Multidisciplin. Optimiz. 62(4), 1821–1848 (2020) 13. K.S. Kaswan, S. Choudhary, K. Sharma, A new classiffcation and applicability of software reliability models. Int. J. 2(7), 99–104 (2014) 14. Z. Chen, H. Qiu, L. Gao, L. Su, P. Li, An adaptive decoupling approach for reliability-based design optimization. Comput. Struct. 117, 58–66 (2013) 15. S. Gholizadeh, M. Mohammadi, Reliability-based seismic optimization of steel frames by metaheuristics and neural networks. ASCE-ASME J. Risk Uncertain. Eng. Syst. Part A Civil Eng. 3(1), 04016013 (2017) 16. V. Ho-Huu, T. Vo-Duy, T. Luu-Van, L. Le-Anh, T. Nguyen-Thoi, Optimal design of truss structures with frequency constraints using improved differential evolution algorithm based on an adaptive mutation scheme. Automat. Construct. 68, 81–94 (2016) 17. S. Kumar, A. Negi, J. N. Singh, H. Verma, in A deep learning for brain tumor MRI images semantic segmentation using FCN, 2018 4th International Conference on Computing Communication and Automation (ICCCA) (IEEE, 2018, December), pp. 1–4 18. M. Benidris, J. Mitra, in Composite power system reliability assessment using maximum capacity flow and directed binary particle swarm optimization, 2013 North American Power Symposium (NAPS) (IEEE, 2013, September), pp. 1–6 19. X. Xu, T. Wang, L. Mu, J. Mitra, Predictive analysis of microgrid reliability using a probabilistic model of protection system operation. 
IEEE Transact. Power Syst. 32(4), 3176–3184 (2016)
20. J. Mudi, C. K. Shiva, V. Mukherjee, Multi-verse optimization algorithm for LFC of power system with imposed nonlinearities using three-degree-of-freedom PID controller. Iranian J. Sci. Technol. Transact. Electric. Eng. 43(4), 837–856 (2019) 21. S. Kumar, J.N. Singh, N. Kumar, An amalgam method efficient for finding of cancer gene using CSC from micro array data. Int. J. Emerg. Technol. 11(3), 207–211 (2020) 22. A.S. Awad, T.H. El-Fouly, M.M. Salama, Optimal ESS allocation and load shedding for improving distribution system reliability. IEEE Transact. Smart Grid 5(5), 2339–2349 (2014) 23. Z. Meng, G. Li, B.P. Wang, P. Hao, A hybrid chaos control approach of the performance measure functions for reliability-based design optimization. Comput. Struct. 146, 32–43 (2015) 24. S.R. Mugunthan, T. Vijayakumar, Design of ımproved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm 3(02), 70–82 (2021) 25. A. Abraham, R.K. Jatoth, A. Rajasekhar, Hybrid differential artificial bee colony algorithm. J. Comput. Theoret. Nanosci. 9(2), 249–257 (2012) 26. C. J. Hsu, C. Y. Huang, in A study on the applicability of modified genetic algorithms for the parameter estimation of software reliability modeling, 2010 IEEE 34th Annual Computer Software and Applications Conference. IEEE (2010, July), pp. 531–540
Development of Optimized Linguistic Technique Using Similarity Score on BERT Model in Summarizing Hindi Text Documents
S. B. Rajeshwari and Jagadish S. Kallimani
Abstract The growth of natural language processing (NLP) in the recent past has been exponential. Use cases such as chatbots, information retrieval systems, and spam filtering are widely used in industry and are in demand. Technology giants such as Google, Facebook, and Amazon are contributing substantially to the field of NLP and to the state-of-the-art technology used industry-wide. To reach geographically wider markets, industries have started to build multilingual systems that cater to a wider audience. Various solutions are available for document and information retrieval in the English language. In 2018, researchers at Google AI Language open-sourced a new technique for NLP called BERT. It is designed to help computers understand the meaning of ambiguous language in text by using the surrounding text to establish context. Few solutions or models have been proposed that are multilingual and focus mainly on the Hindi language. In this paper, a BERT model is trained on a Hindi Wikipedia dataset in order to create an information retrieval system. The system is capable of understanding Hindi queries and returns results on the basis of a similarity score.
Keywords Natural language processing · Bidirectional encoder representation from transformers (BERT) · Information retrieval · Artificial intelligence · Cross-lingual information retrieval (CLIR) · Machine-readable dictionaries (MRD)
S. B. Rajeshwari · J. S. Kallimani (B)
Department of Computer Science and Engineering, M S Ramaiah Institute of Technology, affiliated to Visvesvaraya Technological University, Belagavi, Bangalore, Karnataka, India
e-mail: [email protected]
S. B. Rajeshwari e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_56

1 Introduction
Information retrieval (IR) systems are capable of handling huge document repositories, retrieval, and evaluation, particularly in textual form. Retrieval is the process of gathering items that are usually recorded in an unstructured form. Generally, huge collections of textual data from various sources are stored together in computers.
For instance, an IR process begins when a user enters a query into the system. The IR framework helps users locate the data they need, but it does not directly return answers to the query. Instead, it reports the presence and location of documents that may contain the needed information. It also supports users in collecting or processing a set of retrieved documents while searching or filtering. Classic information retrieval also assumes that words are statistically independent of each other. Since documents are not bags of words, this assumption is obviously unrealistic, but treating them as bags of words simplifies the engineering of IR systems. Given a query, the IR model selects and ranks the documents that the user needs. Documents and queries are represented in the same way, so that the selection and ranking of documents can be formalized with a matching function that returns the retrieval status value (RSV) for each document in the collection [1]. In many IR systems, document content is represented by a set of descriptors, called terms, belonging to a vocabulary V. The query-document matching function is defined by an IR model according to four key approaches. Figure 1 displays various types of IR models.
2 Related Works
An IR model is described in terms of four components, namely acquisition, representation, file organization, and query. These are defined below. Acquisition refers to the collection of documents and other items from different Web resources consisting of text-based documents. Web crawlers gather the necessary data and store it in the database. Representation consists of indexing, which includes free-text terms, controlled vocabulary, and manual and automated methods [2]. For instance, abstracting requires a summary and a bibliographic description containing the author, title, sources, data, and metadata. File organization offers two methods of organizing data. The sequential type stores records document by document. The inverted type stores, term by term, a list of records for each word. This component usually uses a blend of both. An IR operation begins when a user enters a query into the system. Queries are formal statements of information needs, such as search strings in Web search engines [3]. In information retrieval, a query does not uniquely identify a single item in the collection. Instead, several items may match the query, with varying degrees of relevance (Fig. 2).
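The inverted file organization described above can be illustrated with a minimal sketch. The function names `build_inverted_index` and `candidates`, the whitespace tokenization, and the three toy documents are assumptions made for illustration only; they are not taken from the system described in this paper.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenization; a real system would use a proper tokenizer.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def candidates(index, query):
    """Return ids of documents that contain at least one query term."""
    hits = set()
    for term in query.lower().split():
        hits.update(index.get(term, []))
    return sorted(hits)

# Toy collection (hypothetical documents).
docs = {
    1: "information retrieval with BERT",
    2: "Hindi information retrieval",
    3: "network on chip routing",
}
index = build_inverted_index(docs)
print(index["retrieval"])                  # [1, 2]
print(candidates(index, "Hindi routing"))  # [2, 3]
```

A lookup in the inverted index touches only the posting lists of the query terms, whereas a sequential organization would have to scan every record.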
Development of Optimized Linguistic Technique Using Similarity Score…
Fig. 1 Types of IR models
3 Various Techniques
3.1 Bidirectional Encoder Representation from Transformers (BERT)
Pretraining of language models has been shown to improve many natural language processing tasks, such as emotion classification. The basic idea behind a pretrained language model is to train a word embedding layer on large-scale text so that it gains a strong ability to extract information from contextual text. This is needed because the limited supervised data of downstream tasks alone is not enough to train the different neural architectures that encode contextual representations. There are
Fig. 2 Components of IR models
various language models available for NLP. These include OpenAI's GPT-2 and GPT-3, Facebook's M2M100, and Gensim's fastText implementation [4]. BERT is a pretrained language representation model based on deep learning techniques, introduced in 2018 by the Google AI team. Unlike other language representation models, BERT can produce deep bidirectional representations from unlabeled input text by jointly conditioning on both the left and right context in all layers. BERT has been applied to various NLP tasks, such as text classification and question answering, and has achieved outstanding results. Because of the fine-tuning methodology it adopts, BERT requires no task-specific architecture for downstream NLP tasks. For an intelligent agent, the use of prior human knowledge in model design should be minimized, and such knowledge should be learned only from data [5]. Various information retrieval systems are available in the market. Most of them are based on, or focus on, a single language. Market and industry trends are changing as companies target different geographic locations by considering the factors that affect their business. As a result, multilingual systems are entering the market. Globalization is reducing the importance of national boundaries in commerce and the exchange of information. Hindi is the world's third most commonly spoken language, and Marathi is the most commonly spoken language in Maharashtra. As our nation is linguistically diverse, only 12% of individuals know the English language. Information retrieval is becoming common in the Hindi, Marathi, and English languages. Google now provides transliteration in Hindi, Kannada, Bengali, Nepali, Punjabi, Gujarati, Urdu, Malayalam, and Marathi [6].
Fig. 3 Cross-language information retrieval system
Cross-language information retrieval (CLIR) offers a simple way to address language barriers, enabling users to submit written queries in their own language and retrieve documents in a different language, as shown in Fig. 3.
3.2 Query Translation Approach
In CLIR, the language gap between queries and documents is a major challenge. Query translation is considered the key cross-lingual mechanism in present CLIR systems. CLIR search engines make it possible to retrieve content in a language other than the one used for the query. Compared to other approaches, query translation has the benefit of lower computational cost in terms of time and space. Translation plays a key role in CLIR query processing and can be achieved by a dictionary-based translation approach, a corpora-based translation approach, or a machine translation approach [7].
3.3 Dictionary-Based Translation Approach
In dictionary-based query translation, the query is processed linguistically, and only keywords are translated using machine-readable dictionaries (MRD). MRDs are electronic versions of printed dictionaries, either general-domain or domain-specific. Using established linguistic resources, particularly MRDs, is a natural approach to cross-lingual IR. Translating the query using dictionaries is much easier and simpler than translating documents [8] (Fig. 4).
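As a rough illustration of keyword-only translation with an MRD, consider the sketch below. The dictionary entries, the stop-word list, and the function name `translate_query` are hypothetical and chosen only for this example; a real MRD would be far larger and would need morphological processing of the query.

```python
# Toy machine-readable dictionary (MRD); the entries are illustrative only.
MRD_HI_EN = {
    "भारत": ["india"],
    "राजधानी": ["capital"],
    "नदी": ["river"],
}
# Assumed list of Hindi function words that are not translated.
STOPWORDS_HI = {"की", "का", "के", "है", "में"}

def translate_query(query, mrd, stopwords):
    """Translate only the keywords of a query; keep unknown words as they are."""
    translated = []
    for word in query.split():
        if word in stopwords:
            continue  # function words carry little retrieval value
        # A word may have several dictionary senses; keeping all of them
        # is one source of the ambiguity associated with this approach.
        translated.extend(mrd.get(word, [word]))
    return translated

query = "भारत की राजधानी"
result = translate_query(query, MRD_HI_EN, STOPWORDS_HI)
print(result)  # ['india', 'capital']
```

Words missing from the dictionary are passed through unchanged here, which is one common fallback; other policies (dropping or transliterating them) are equally possible.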
Fig. 4 Dictionary-based translation
3.4 Corpora-Based Translation Approach
Corpora-based query translation uses a single corpus or multiple corpora. A corpus is a structured collection of naturally occurring language content, such as documents, paragraphs, and phrases from one or more languages. In corpus-based methods, queries are translated on the basis of multilingual terms derived from parallel or comparable text collections. Parallel corpora have been used for the translation of a given word since the early 1990s. A parallel corpus is a set of texts translated into one or more languages other than the original language. Parallel corpora are often used to evaluate relationships between words in different languages, such as co-occurrences. The comparable corpus is another significant concept in corpus-based translation research. Comparable corpora contain text in more than one language. Such texts are widely accessible on the Internet for several language pairs and domains. They also contain several pairs of sentences that are reasonably good translations of each other [9].
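To make the idea of co-occurrence in a parallel corpus concrete, the following sketch scores word pairs over a tiny sentence-aligned corpus. The romanized toy sentences, the use of the Dice coefficient as the association score, and the names `dice_table` and `best_translation` are all assumptions made for illustration; the paper does not specify a scoring method.

```python
from collections import Counter

# Toy sentence-aligned parallel corpus (romanized Hindi, English); illustrative only.
parallel = [
    ("paani thanda hai", "water is cold"),
    ("paani garam hai", "water is hot"),
    ("chai garam hai", "tea is hot"),
]

def dice_table(pairs):
    """Score source/target word pairs by the Dice coefficient of co-occurrence."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)   # sentences containing each source word
        tgt_count.update(t_words)   # sentences containing each target word
        joint.update((s, t) for s in s_words for t in t_words)
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }

def best_translation(word, scores):
    """Pick the target word with the highest Dice score for a source word."""
    options = {t: v for (s, t), v in scores.items() if s == word}
    return max(options, key=options.get) if options else None

scores = dice_table(parallel)
print(best_translation("paani", scores))  # water
print(best_translation("garam", scores))  # hot
```

With only three sentence pairs the scores are crude, but the same counting scheme scales to large parallel collections, where frequent co-occurrence becomes a usable translation signal.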
3.5 Machine Translation-Based Approach
Compared with the two approaches above, cross-lingual IR with query translation using machine translation (MT) seems to be an easy alternative. The benefit of using machine translation is that it saves time when large texts are translated. There are four machine translation approaches: the word-for-word approach, the syntactic transfer approach, the semantic transfer approach, and the interlingual approach. MT-based methods seem to be the ideal solution for CLIR. This is primarily because MT systems translate the sentence as a whole, and the translation ambiguity problem is resolved during the analysis of the source sentence. Table 1 outlines the differences between the query translation techniques [10].
3.6 Document Translation Approach
In CLIR, document translation may be the most suitable option if the aim is to allow users to search for documents written in a language other than their own and to get
Table 1 Query translation techniques comparison
Parameters | Dictionary-based translation approach | Corpora-based translation approach | Machine translation-based approach
Ambiguity | High | Low | Low
Offline translation | Possible | Possible | Not possible
Working architecture | Visible as like white box testing | Visible as like white box testing | Works similar black box testing
Development expenses | Less expensive | More expensive than DBT | More expensive
Translation availability | Highly available in many languages | Available only in few languages | Available only in few languages
results back in the language of the users. In this sense, it is a better choice because it requires no passive knowledge of the foreign language on the user's part. In the document translation approach, documents in all target languages are translated into the source language. The function of this translation is twofold. First, in post-translation (also called as-and-when-needed or on-the-fly translation), documents that the user finds in some other language are translated into the user's language at query time. The IR process also uses indexing methods to speed up document search. In post-translation, however, indexing is not possible, so this solution is impractical because of the additional translation time [11] (Fig. 5).
3.7 Dual Translation (Both Query and Document Translation Approach)
In this technique, both queries and documents are translated into a common representation. This approach requires additional storage space for translated documents, but it offers scalability when the same set of documents is needed in several languages. Controlled vocabulary systems are one example of such an approach. These systems describe all documents using a predefined set of language-independent terms and express queries in the same concept space. The size of this concept space determines the granularity or precision of possible searches. The key problem of controlled vocabulary systems is that non-expert users generally need some training, and often vocabulary interfaces, in order to formulate successful queries. Dual translation can also be performed via a pivot language, a method called the hybrid translation approach [12, 13]. Because of the limitations of translation resources, direct translation between two languages may not always be feasible. In that case, a resource or a third language, called a pivot language, is needed between the two languages. Two methods are available in this process: either the query or the document is translated first into the pivot language and then into the target language, or both the document and the query are translated into the pivot language.
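The first pivot-language method (source to pivot, then pivot to target) can be sketched by chaining two dictionaries. The choice of English as the pivot, the romanized toy entries, and the helper name `compose_via_pivot` are assumptions for illustration only.

```python
def compose_via_pivot(src_to_pivot, pivot_to_tgt):
    """Build a source-to-target dictionary by chaining two dictionaries through a pivot."""
    composed = {}
    for src_word, pivot_words in src_to_pivot.items():
        targets = []
        for pivot_word in pivot_words:
            for tgt_word in pivot_to_tgt.get(pivot_word, []):
                if tgt_word not in targets:  # avoid duplicate target senses
                    targets.append(tgt_word)
        if targets:
            composed[src_word] = targets
    return composed

# Toy dictionaries (romanized); the entries are illustrative assumptions.
MARATHI_TO_ENGLISH = {"pani": ["water"], "ghar": ["house", "home"]}
ENGLISH_TO_HINDI = {"water": ["paani"], "house": ["ghar"], "home": ["ghar"]}

marathi_to_hindi = compose_via_pivot(MARATHI_TO_ENGLISH, ENGLISH_TO_HINDI)
print(marathi_to_hindi)  # {'pani': ['paani'], 'ghar': ['ghar']}
```

Each hop can add senses, so ambiguity tends to grow when translating through a pivot; that is the usual trade-off for not needing a direct resource between the two languages.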
Fig. 5 Document translation approach
4 Objectives
A system that accepts queries in Hindi has been developed. The model tokenizes the query and creates a vector, which is fed to our system. The system calculates cosine similarity and returns results ranked by similarity. Figure 6 shows the basic block diagram of the work. The work was developed on an AWS machine for deployment, with Python and Jupyter Notebook as software resources and 12 vCPUs, 40 GB of RAM, and a single V100 GPU (16 GB VRAM) as hardware resources. The key challenges faced during the work were understanding the working and training of transformer-based models, preprocessing Hindi text, and scraping Hindi news data (Fig. 7).
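The ranking step described above can be sketched as follows, assuming that the query and the documents have already been encoded as fixed-length vectors. The three-dimensional vectors, the document ids, and the function names are hypothetical; in the actual system the vectors would come from the BERT model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical embeddings standing in for model output.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query_vec = [1.0, 0.2, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for comparing embeddings of texts of different lengths.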
5 Solution Approach and Architecture
BERT is the first bidirectional language model, which simultaneously uses the left and right word contexts to predict word tokens. It is trained by optimizing two objectives: prediction of masked tokens and prediction of the next sentence. In BERT, the input is a pair of sentences in the same language, in which several tokens in both sentences are replaced by the symbol '[MASK]'. The BERT
Fig. 6 Dual translation (pivot language)
Fig. 7 Work flow diagram
model is trained to predict these masked tokens by capturing the meaning (or context) within or across sentences that is relevant for IR. The second objective is to judge whether or not the two sentences are consecutive. It encourages BERT to model the relationship between two sentences. In BERT, the self-attention mechanism models the local interactions between words in sentence A and words in sentence B, so that pairwise sentence-level or token-level relevance patterns can be learned. The
Fig. 8 BERT pretraining architecture
entire BERT model is pretrained on large-scale text corpora and learns the linguistic patterns of the language (Fig. 8).
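The masked-token objective described in this section can be illustrated with a simplified input-preparation sketch. It is an assumption-laden toy: tokens are split on whitespace rather than with a WordPiece tokenizer, every selected token is replaced by '[MASK]' (the original BERT recipe also sometimes substitutes a random token or keeps the token unchanged), and the names `build_pair` and `mask_tokens` as well as the romanized sentences are hypothetical.

```python
import random

MASK = "[MASK]"
SPECIAL = ("[CLS]", "[SEP]")

def build_pair(sentence_a, sentence_b):
    """Join two sentences into one input sequence with BERT-style markers."""
    return ["[CLS]"] + sentence_a.split() + ["[SEP]"] + sentence_b.split() + ["[SEP]"]

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of tokens with [MASK]; return inputs and labels."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    masked, labels = [], []
    for token in tokens:
        if token not in SPECIAL and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(token)  # the model must recover this token
        else:
            masked.append(token)
            labels.append(None)   # position is not scored
    return masked, labels

# Romanized toy sentences; sentence B follows sentence A (a positive next-sentence pair).
pair = build_pair("dilli bharat ki rajdhani hai", "yah yamuna ke kinare hai")
masked, labels = mask_tokens(pair, mask_prob=0.15)
print(masked)
print(labels)
```

For the next-sentence objective, the same `build_pair` input would be labeled as consecutive or not, with negative examples built by pairing sentence A with a randomly chosen sentence.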
6 Obtained Results
This section presents screenshots of the work, covering the dataset, the preprocessing steps, and the results. The data consists of 44,069 documents, each with its title. Snippets of the data are shown in Figs. 9 and 10. In data preprocessing, sentence tokenization generates the output shown in Fig. 11. The list of
Fig. 9 Data example
Fig. 10 Data information
Fig. 11 Data after sentence tokenization
tokens identified after the tokenization step is shown in Fig. 12. Figures 13, 14, 15, 16, 17, and 18 show the sample outputs for various test input texts in Hindi. It is evident that, in the majority of cases, the queries are answered successfully.
7 Conclusions and Future Scope
Cross-lingual IR offers a new way of finding documents across the many languages of the globe and can be the basis for searching not just between two languages but also across multiple languages. Today, most cross-lingual studies involve only a few common languages, such as English, Hindi, Spanish, Chinese, and French. Language research has contributed to the nation's growth. As the world becomes more connected through technology, cross-language IR is needed in every language. CLIR is a multidisciplinary field in which the scientific community has gradually taken more interest. There are still many areas to be explored, despite
Fig. 12 Data after word-level tokenization
Fig. 13 Sample output 1
Fig. 14 Sample output 2
Fig. 15 Sample output 3
Fig. 16 Sample output 4
Fig. 17 Sample output 5
Fig. 18 Sample output 6
recent advancements and new technologies. Cross-language information retrieval frameworks can play a major role in the Indian context, which is one of the hotspots of linguistic diversity in the world and where the primary language of one region may be a minority language in another. Such frameworks enable people to go through the documents and literature of other languages, thereby removing the language barrier. The self-attention-based architecture models the interactions of query words with words in the foreign-language sentence. The relevance model is initialized with the pretrained multilingual BERT model and then fine-tuned with home-made CLIR training data generated from parallel data. The results of the CLIR BERT model look good, but because little Hindi text data is available, data scraping, labeling, and preprocessing become tedious tasks. The more easily data becomes available, the more powerful the model will be, and thus the better its performance. Further data scraping is in progress and will include more current affairs data. This data will help improve the performance of our system. More data will also ensure that all dimensions of knowledge are covered, yielding more relevant document results.
References
1. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, in Attention is all you need, 31st Conference on Neural Information Processing Systems (NIPS, Long Beach, CA, USA, 2017)
2. S. Tan, Z. Duan, S. Zhao et al., Improved reviewer assignment based on both word and semantic features. Inf. Retriev. J. 24, 175–204 (2021). https://doi.org/10.1007/s10791-021-09390-8
3. N.D. Sidiropoulos, E. Tsakonas, Signal processing and optimization tools for conference review and session assignment. IEEE Signal Process. Magaz. 32(3), 141–155 (2015)
4. J. Protasiewicz, W. Pedrycz, M. Kozlowski, S. Dadas, T. Stanislawek, A. Kopacz, M. Galezewska, A recommender system of reviewers and experts in reviewing problems. Knowl. Based Syst. 106, 164–178 (2016)
5. R. Yu, R. Tang, M. Rokicki et al., Topic-independent modeling of user knowledge in informational search sessions. Inf. Retriev. J. 24, 240–268 (2021). https://doi.org/10.1007/s10791-021-09391-7
6. C. Xu, P. Zhao, Y. Liu, J. Xu, V. S. S. S. Sheng, Z. Cui, X. Zhou, H. Xiong, in Recurrent convolutional neural network for sequential recommendation, The world wide web conference (2019), pp. 3398–3404
7. X. Zhang, M. Cole, N. Belkin, in Predicting users’ domain knowledge from search behaviors, Proceedings of the 34th international ACM SIGIR conference on Research and development in information retrieval (ACM, 2011), pp. 1225–1226
8. R. Kalyani, U. Gadiraju, in Understanding user search behavior across varying cognitive levels, Proceedings of the 30th ACM conference on hypertext and social media (2019), pp. 123–132
9. U. Gadiraju, R. Yu, S. Dietze, P. Holtz, in Analyzing knowledge gain of users in informational search sessions on the web, 2018 ACM on Conference on Human Information Interaction and Retrieval (CHIIR) (ACM, 2018)
10. U. Gadiraju, J. Yang, A. Bozzon, in Clarity is a worthwhile quality—on the role of task clarity in microtask crowdsourcing, Proceedings of the 28th ACM Conference on Hypertext and Social Media (ACM, 2017), pp. 5–14
11. N. Bhattacharya, J. Gwizdka, in Measuring learning during search: differences in interactions, eye-gaze, and semantic similarity to expert knowledge, Proceedings of the 2019 Conference on Human Information Interaction and Retrieval (ACM, 2019), pp. 63–71
12. D. Sivaganesan, Novel influence maximization algorithm for social network behavior management. J. ISMAC 3(01), 60–68 (2021)
13. R. Valanarasu, Comparative analysis for personality prediction by digital footprints in social media. J. Inform. Technol. 3(02), 77–91 (2021)
Faulty Node Detection and Correction of Route in Network-On-Chip (NoC) E. G. Satish and A. C. Ramachandra
Abstract With current advances in transistor technology and integrated circuit fabrication, the packing of transistors in ICs is also increasing in order to accommodate multiple processing elements on a single chip. Network-on-chip is the standard communication infrastructure that connects the different processing elements on the chip. On the other hand, reducing the size of ICs has increased the probability of faults during runtime. Therefore, fault detection and correction play a vital role in the field of network-on-chip. This paper presents an efficient technique to isolate and bypass a faulty node so that data can be routed from one end to the other. The approach proposes a robust faulty node detection technique and a rerouting method. The results confirm an overhead in area, power, and performance, which is the cost of guaranteed delivery of data to the destination.
Keywords Network-on-chip (NoC) · Processing element · Faulty node · Packet drop · Routing · SoC · Traffic sensor model · Virtual channels
1 Introduction
Advances in semiconductor technology have led to the development of 7 nm processing elements. This in turn increases the density of integrated circuits (ICs). Therefore, sophisticated systems can be embedded in an IC to make a system-on-chip (SoC). The SoC integrates processing elements (PEs), storage, and peripheral components along with the interconnecting circuit [1]. As the number of processing elements increases in a highly dense IC, interconnecting
E. G. Satish (B) Department of Computer Science and Engineering, Nitte Meenakshi Institute of Technology, affiliated to Visvesvaraya Technological University, Belagavi, Bangalore, Karnataka, India e-mail: [email protected] A. C. Ramachandra Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Bangalore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_57
circuits/interconnection units become more complex. The interconnecting unit therefore plays a vital role in delivering high performance and reducing the power consumption of the chip [2]. Network-on-chip (NoC) was proposed to address the issues that arise when interconnecting the processing elements on the chip. NoC has enhanced the performance of multicore processor systems, where applications are dynamically allocated to multiple processing elements and run simultaneously while sharing the network resources [3]. On the other hand, increased density and interconnections between the elements have exposed ICs to different transistor faults, which are intermittent or permanent in nature, and have made ICs more susceptible to environmental influences. Protecting the NoC communication unit from external and internal factors is a critical task, and designing a fault-tolerant routing algorithm is challenging. An extensive body of literature exists, but most of the work focuses on temporary faults that affect the communication links between the processing elements, such as cross-talk and voltage-induced errors. However, hard faults may also occur during runtime, and consequently continuous field testing is required. The main aim of this paper is to detect permanent faulty nodes during runtime and reroute the data to provide fault tolerance on the NoC. The proposed work tries to fill this gap by detecting the packet drops caused by such nodes and rerouting the data through an alternate path. In this article, a diagnosis and isolation technique that excludes a faulty node from routing on the NoC is proposed. Once the faulty node is detected, the routing is reconfigured to dynamically steer packets around the faulty node with the least performance cost. The structure of the paper is as follows: background and related work are discussed in the second section, and a fault detection model for NoC is explained in the third section. Section 4 presents router reconfiguration for fault tolerance, and Sect. 5 evaluates the proposed system. Finally, Sect. 6 concludes the work and suggests enhancements.
2 Related Works
Reliable systems are systems that keep functioning even if a few components fail, owing to their fault tolerance mechanisms. In multicore systems, NoC plays a vital role in communication between the processing elements of the cores. In systems without fault tolerance, a single component failure may halt the entire system, so there is a need for reliable, fault-tolerant NoC-based systems. Monitoring of channel traffic and online testing with a traffic sensor model (TSM) component are discussed in [4]. The channel is tested using a channel tester when the NoC channels are idle. The authors also reported a minimal impact on throughput under varying testing conditions and showed the area overhead compared with a hardware implementation of a NoC router [5, 6].
Shield, a reliable NoC router architecture, has been proposed in [7]. This architecture can tolerate all types of errors in the routing process using spatial redundancy, exploitation of idle cycles, bypassing of faulty resources, and selective hardening. An attempt to handle faults in the router components in order to achieve high performance at low cost is proposed in [8]. There, the router design is based on a generic 2-stage router. The work presented in [9] identifies packet drops in NoC. Packet drop detection is performed at runtime and has to be achieved with the least possible performance cost.
3 Architecture of NoC Router
A system-on-chip embeds multiple processing elements, which are connected through NoC routers and arranged in regular patterns such as mesh, linear, torus, 2D, and 3D topologies. The router plays a vital role in providing fault-tolerant routing. The router for a NoC must be designed to provide fault tolerance in routing and to improve overall efficiency [10]. As the density of the IC increases, the complexity of router design also increases. Figures 1 and 2 show block diagrams of simple NoC router communication links and of the NoC router, respectively. The area occupied on the IC is determined by the underlying circuits and the microarchitecture of the selected router. The data path of the router includes buffers, virtual channels (VCs), and the switching fabric. Arbiters and allocators form the control path of the router [11]. Allocators are used to match resources and allocate VCs. As shown in Fig. 1, handshake signals are exchanged among router units for efficient data transfer. The VC allocator performs the allocation among input flits. At most one flit at an input port is directed to the output port. The remaining flits are stored in virtual channels in order to reduce blocking. All of them are synchronized to the clock cycles [12]. The need for VCs can be avoided by removing input buffers, but this results in blocking and reduced performance. On the other hand, removing buffers also reduces area and power [13]. Elastic buffer (EB) flow control has been proposed to remove the buffer cost.
Fig. 1 Pictorial representation of NoC router communication
Fig. 2 Pictorial view of NoC router
Because of the handshake signals, both senders and receivers may be stalled for an arbitrary amount of time. Buffering should therefore be implemented on both sides of the link to hold the data. EBs, which are simple control-logic channels, can be attached at the sender and receiver routers. An EB implements a dual interface: enqueue and dequeue. Through its internal logic, the enqueue side accepts new data from incoming links. Similarly, the dequeue side supplies data to outgoing links. This is possible when the ready and valid signals are asserted properly. In the same manner, the EB at the receiver end enqueues when it is ready for a flit and passes the flit on to the internal logic. EBs are a very simple and primitive NoC component. Because of their handshake signals and elastic operation, EBs are very easy to integrate into ICs. They have become a simple means of buffering at router inputs and outputs and act as buffered repeaters [14]. A fault in the router is due to a control signal error or an internal buffer malfunction. However, internal errors in the router may go undetected. Consider a case in Fig. 1, where two neighboring routers successfully exchange the handshake
Fig. 3 Packet drop inside the router
signals, but because of an error in the router buffer or control signal, the packet is not saved or forwarded to the next router, as shown in Fig. 3. The neighboring router assumes that the packet was successfully delivered to the router, but it was not. The main focus of this work is to detect such errors and enhance reliability.
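The ready/valid behavior of the elastic buffers described in this section can be modeled in software with a small sketch. This is a behavioral toy, not hardware: the class name `ElasticBuffer`, the two-slot depth, and the flit names are assumptions for illustration, and clocking is ignored.

```python
from collections import deque

class ElasticBuffer:
    """Toy FIFO with a ready/valid handshake on both sides."""

    def __init__(self, depth=2):
        self.depth = depth      # assumed capacity in flits
        self.slots = deque()

    @property
    def ready(self):
        """Upstream may enqueue only while there is free space."""
        return len(self.slots) < self.depth

    @property
    def valid(self):
        """Downstream may dequeue only while a flit is stored."""
        return len(self.slots) > 0

    def enqueue(self, flit):
        if not self.ready:
            return False        # sender must stall and retry later
        self.slots.append(flit)
        return True

    def dequeue(self):
        if not self.valid:
            return None         # nothing to forward
        return self.slots.popleft()

eb = ElasticBuffer(depth=2)
accepted = [eb.enqueue(f) for f in ("flit0", "flit1", "flit2")]
print(accepted)      # [True, True, False]
first = eb.dequeue()
print(first)         # flit0
print(eb.ready)      # True
```

In this model a refused `enqueue` is visible to the sender. The failure case discussed above is different: the handshake succeeds, yet the flit is lost inside the router, so the sender has no local indication of the drop.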
4 Router Reconfiguration Model
An efficient fault-handling mechanism is required to avoid the faulty node. When a faulty node is detected, the surrounding nodes must be updated. This helps the NoC separate the virtual segments, thus forming a shield ring. The neighboring nodes alter the routing paths of packets to keep them away from the faulty node. The change in packet routing depends on the direction from which the packet arrives, as well as on the current node and the destination. The system has to be free from deadlock, so the fault-tolerant routing algorithm must avoid forming cycles in the system. The work concentrates on detecting faults in the router that lead to packet dropping, which makes it inevitable to reconfigure the routers. Packet communication between nodes is important for classifying the types of faults. Neighboring nodes detect a faulty node and report it with a temporary error message, and they also request that the packet be resent. When the error repeats several times, it is treated as a permanent error. In this case, the router is identified as a faulty node because of the dropped packets.
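The two ideas in this section, promoting repeated temporary errors to a permanent fault and then routing around the faulty node, can be sketched as follows. Everything specific here is an assumption for illustration: the threshold of three errors, the 3 x 3 mesh, the function names, and the use of breadth-first search in place of the paper's routing algorithm. In particular, a plain shortest-path search does not by itself guarantee deadlock freedom.

```python
from collections import deque

ERROR_THRESHOLD = 3  # assumed number of repeated errors before a node is marked faulty

def record_error(error_counts, node, threshold=ERROR_THRESHOLD):
    """Count an error reported against `node`; return True once it counts as permanent."""
    error_counts[node] = error_counts.get(node, 0) + 1
    return error_counts[node] >= threshold

def route(src, dst, size, faulty):
    """Shortest path on a size x size mesh that avoids faulty nodes (BFS)."""
    if src in faulty or dst in faulty:
        return None
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            in_mesh = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if in_mesh and nxt not in faulty and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # destination unreachable

errors, faulty = {}, set()
for _ in range(3):  # neighbors report the same node three times
    if record_error(errors, (1, 0)):
        faulty.add((1, 0))

path = route((0, 0), (2, 0), size=3, faulty=faulty)
print(path)  # [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0)]
```

The detour costs two extra hops compared with the fault-free path through (1, 0), which is a small instance of the performance overhead paid for guaranteed delivery.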
5 Experimental Setup and Results
A simulator for a baseline mesh NoC with the proposed reconfigurable routing technique is implemented in SystemC. The simulator is used to perform various analyses over different NoC sizes. These analyses are found to be similar to those reported in previous research works. Parameters such as network traffic, buffer size, and injection rate are considered in the simulation for the evaluation of the router. Based on various buffer sizes and injection rates, an analysis of the performance and overhead of the system is
Fig. 4 Waiting time and buffer sizes
conducted. A uniform random distribution is used to generate the network traffic. The simulation evaluates the worst-case waiting time for different buffer sizes in the routers, in order to validate the effect of buffer size on the waiting time of packets. The worst-case waiting time of the NoC for different buffer sizes is shown in Fig. 4. The analysis shows that buffer size and waiting time are inversely related: the waiting time decreases as the input FIFO buffer size increases. This is due to less congestion in the NoC traffic and faster movement of packets.
6 Conclusion
This article proposes a fault model that can detect faulty nodes. Faulty nodes can cause packet dropping in the NoC, which sometimes goes unnoticed because of unexpected faults in the control path. Here, the consequences of packet loss are monitored and recorded, and the locations of the losses are analyzed over the NoC. For a single faulty node, the rate of packet loss varies with the node's location in the network. A new architecture is proposed for runtime identification. Acknowledgment-based mitigation helps protect against such faulty nodes. The architecture identifies packet loss at runtime and is analyzed with respect to area, power, and performance. Runtime detection and avoidance of faulty nodes could be further pursued as future work by other researchers.
References
1. L. B. Daoud, M. E.-S. Ragab, V. Goulart, Faster processor allocation algorithms for mesh-connected cmps, in 2011 14th Euromicro Conference on Digital system design (DSD) (IEEE, 2011), pp. 805–808
2. L. B. Daoud, M. E.-S. Ragab, V. Goulart, Processor allocation algorithm based on frame combing with memorization for 2D mesh CMPs, in 2012 IEEE Third Latin American Symposium on Circuits and systems (LASCAS) (IEEE, 2012), pp. 1–4
3. L. Daoud, V. Goulart, High performance bitwise or based submesh allocation for 2d mesh-connected cmps, in 2013 Euromicro Conference on Digital system design (DSD) (IEEE, 2013), pp. 73–77
4. P. Poluri, A. Louri, Shield: a reliable network-on-chip router architecture for chip multiprocessors. IEEE Transact. Parallel Distribut. Syst. 27(10), 3058–3070 (2016)
5. S. Smys, C. Vijesh Joe, Metric routing protocol for detecting untrustworthy nodes for packet transmission. J. Inform. Technol. 3(02), 67–76 (2021)
6. J.I.Z. Chen, Optimal multipath conveyance with improved survivability for WSN’s in challenging location. J. ISMAC 2(02), 73–82 (2021)
7. L. Wang, S. Ma, C. Li, W. Chen, Z. Wang, A high performance reliable noc router. Integration 58, 583–592 (2017)
8. L. Daoud, N. Rafla, Analysis of black hole router attack in network-on-chip, in IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS) (IEEE, 2019)
9. L. Daoud, N. Rafla, Routing aware and runtime detection for infected network-on-chip routers, in IEEE 61st International Midwest Symposium on Circuits and Systems (MWSCAS) (IEEE, 2018), pp. 775–778
10. L. Daoud, D. Zydek, H. Selvaraj, A survey of high level synthesis languages, tools, and compilers for reconfigurable high performance computing, in Advances in Systems Science (Springer, 2014), pp. 483–492
11. Xilinx Inc., Vivado Design Suite User Guide: High-Level Synthesis (December, 2018)
12. W.-C. Tsai, D.-Y. Zheng, S.-J. Chen, Y.-H. Hu, A fault-tolerant noc scheme using bidirectional channel, in Proceedings of the 48th Design Automation Conference (ACM, 2011), pp. 918–923
13. S. Shamshiri, A. Ghofrani, K.-T. Cheng, End-to-end error correction and online diagnosis for on-chip networks, in 2011 IEEE International Test Conference (IEEE, 2011), pp. 1–10
14. J. Liu, J. Harkin, Y. Li, L. Maguire, Online traffic-aware fault detection for networks-on-chip. J. Parallel Distribut. Comput. 74(1), 1984–1993 (2014)
Paddy Crop Monitoring System Using IoT and Deep Learning Chalumuru Suresh, M. Ravikanth, G. Nikhil Reddy, K. Balaji Sri Ranga, A. Anurag Rao, and K. Maheshwari
Abstract Crop production is critical to the agriculture sector of the Indian economy. Agriculture relies on environmental factors such as humidity, temperature, soil moisture, wind, light intensity, soil pH, and water levels. Fluctuations in these factors, as well as in meteorological conditions, can result in output losses, so controlling these variables becomes critical. Crop diseases have an equally strong impact on crop yield. To address these issues, a solution based on IoT and deep learning is proposed for improving paddy output. The system collects data through sensors and transfers it to the cloud (an AWS EC2 server) in order to diagnose plant stress caused by soil fertility, environmental imbalance, and crop diseases. The proposed system is implemented in three stages that follow the stages of production: the start stage (planting to panicle initiation), the middle stage (panicle initiation to flowering), and the end stage (flowering to maturity). To improve productivity, the proposed system employs two components, a suggestion system and a disease detection system, and deep learning is used in both. The suggestion system proposes ways to maintain the parameters (soil moisture, water level) required for healthy crop production and better yield. The disease detection system uses deep learning to classify plant leaves into one of three crop diseases: leaf smut, bacterial leaf blight, and brown spot.
Keywords Soil moisture sensor · Water level sensor · Bluetooth · Wi-Fi · Arduino · Leaf smut · Bacterial leaf blight · Brown spot · EfficientNet-B3 deep learning · IoT technology
C. Suresh · M. Ravikanth · G. Nikhil Reddy (B) · K. Balaji Sri Ranga · A. A. Rao · K. Maheshwari
Department of CSE, VNR VJIET, Hyderabad, Telangana, India
C. Suresh e-mail: [email protected]
M. Ravikanth e-mail: [email protected]
K. Maheshwari e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_58
1 Introduction
India is a country where agriculture is the primary source of income for the majority of its citizens. Environmental factors cause farmers to lose output and productivity, and plant diseases are a key contributor to lower yield. To address these issues, the proposed system combines an IoT-based agricultural production system, which incorporates environmental sensors such as soil moisture and water level sensors, with a disease diagnosis system that employs deep learning and a suggestion system. The system helps maintain soil moisture and water level according to the crop stage, detect diseases affecting the crop, and take appropriate measures based on the suggestions given by the suggestion system. The proposed system is implemented in three stages, and the water level and soil moisture ranges are monitored in each stage. When a sensor value deviates from its optimal level, the farmer is alerted, and the suggestion system recommends appropriate measures to bring the value back into range. Farmers can obtain accurate soil moisture and water level data through a customized Web application. The system uses AWS EC2 as a cloud server to process the data.
1.1 Internet of Things
The Internet of things (IoT) connects everyday physical objects fitted with sensors and actuators so that real-time data can be gathered over the Internet, processed, and acted upon through the actuators. Actuators work in the opposite direction to sensors: a sensor converts a physical quantity into data, whereas an actuator converts a command into a physical action. The IoT connects everyday items such as light bulbs, healthcare items such as medical equipment, wearables, smart devices, and smart cities. It is a collection of physical devices that receive data and transfer it via a wireless network with little or no human interaction, which enables the direct integration of real-world entities into computer-based systems.
1.2 Importance of IoT in Agriculture
In IoT-based agricultural productivity improvement, a system is built to monitor the paddy field using sensors (for soil moisture, humidity, light, temperature, and so on) and to automate the irrigation system. Farmers can monitor their fields from any location. Compared with conventional field monitoring techniques, IoT-based smart farming is considerably more efficient. In agriculture, the IoT offers improved efficiency, higher output, reduced resource use and expenditure, water conservation, real-time data and production insight, decreased pollution, low operation costs, improved production quality, and remote
monitoring. IoT sensors provide information on soil nutrients, meteorological conditions, rainfall, crop diseases, and general crop health.
1.3 Deep Learning
Deep learning is a subset of machine learning, which is itself a branch of artificial intelligence (AI). It learns patterns from the available data in a way that mimics human intelligence in decision-making. A convolutional neural network (CNN/ConvNet) is a class of deep neural network and a well-known type of feed-forward artificial neural network whose connection pattern is inspired by the architecture of the visual cortex. Neurons in the visual cortex are more likely to activate when edges of a specific orientation (for example, horizontal or vertical) are present. CNNs are predominant in various domains and have applications such as image segmentation, natural language processing, speech recognition, brain-computer interfaces, and many more. In the agricultural sector, CNNs are used to detect different diseases. For the paddy crop, EfficientNet B3 is used to detect leaf smut, brown spot, and bacterial leaf blight.
1.4 Importance of Deep Learning in Agriculture
Deep learning is a data analysis and image processing technology that can produce reliable results. It is already employed in a variety of areas, and in recent years it has also been applied in the agriculture domain. Deep learning is typically compared with other existing techniques so that differences in classification or regression performance can be identified. In such comparisons it often gives more accurate results, and its performance can be improved further with image processing techniques.
2 Literature Survey
Bandi et al. [1] discussed a framework to improve crop yield using IoT, which becomes a more powerful tool when integrated with artificial intelligence and data analytics that make decisions on behalf of the user from the acquired knowledge and the evaluated options. Harte [2] developed a model, deployed as a Web application, that is capable of recognizing seven plant diseases. A dataset of 8685 leaf images was used to train and validate the model. A convolutional neural network (CNN) was fine-tuned and validated, achieving an accuracy of 97.2%. The final output was a plant disease detection application.
Ghosal et al. [3] created a small dataset and therefore developed a model using transfer learning. The suggested convolutional neural network (CNN) architecture is based on VGG-16 and is trained and tested on a dataset gathered online. The proposed deep learning architecture achieved an accuracy of 92.4% after training on 1509 images of rice leaves and testing on 647 different images. Liang et al. [4] proposed a CNN model for rice blast disease detection. The dataset consists of images labeled as positive or negative, and the CNN model classifies the images given by the user accordingly. Various experiments were conducted to ensure quality, and the CNN model performed better than most of the traditional methods. Satish et al. [5] proposed a system that monitors crops using various methods to predict diseases and also suggests the amount of pesticide to be used. It uses the C4.5 algorithm for classification on data collected from the sensors. Saradhambal et al. [6] proposed a model to detect leaf diseases and suggested various methods for recovery. The proposed methodology includes data collection, preprocessing, modeling, and testing, and the model is used to predict different leaf diseases. The sample dataset consists of 75 images of different plant diseases. Mekala et al. [7] surveyed applications of IoT sensor monitoring networks in agriculture that use important technologies such as cloud computing. The survey helps in understanding the technologies, including cloud computing, routing protocols, and smart monitoring, that make smart agriculture more sustainable. Bandi et al. [8] discussed a system framework to improve productivity in agriculture using IoT. The framework has four stages: provide an interface for collecting real-time data from wireless sensor networks and the cloud; create algorithms for data analysis, data fusion, and data categorization; create a data analytics approach; and implement and test the required outcomes. Odoi-Lartey et al. [9] proposed a system to monitor crop fields in real time using the Internet of things (IoT). It collects data from various sensors and gives suggestions to increase the yield. Sowmiya et al. [10] discuss applications of IoT in the agricultural sector that improve the irrigation system, help farmers understand their crop conditions, and eventually help improve rice production. The paper describes these applications in terms of IoT-enabled technologies such as cloud computing, embedded systems, wireless sensor networks, communication protocols, and big data analytics. Simon et al. [11] discuss the environmental conditions and economic background of Kuttanad and propose a wireless sensor network (WSN) that monitors the paddy field, including the water supply to the crop field, the irrigation system, and the nutrients supplied to the crops, and that collects data to monitor climatic changes. The WSN used in this paper is a ZigBee network. The three node types of a ZigBee network are a ZigBee Coordinator,
ZigBee Routers, and ZigBee End Devices. The proposed system uses a water level sensor to automate the smart water supply system. Suresh et al. [12] proposed a Smart Water Level Monitoring System based on IoT, a cost-effective water level measurement system for farmers. The system monitors water levels regularly, and the collected data is stored in a database that helps farmers manage water. Farmers can monitor the water levels from any location using a Web application. Shruti et al. [13] proposed an IoT-based agricultural field monitoring system. The system uses various sensing devices that interact with the environment and collect data from the soil and the surroundings. The collected data is sent to the cloud, where the computation is done. If a sensor value is above or below its threshold, the required remedies are to be taken. In this way, the productivity of paddy crops can be enhanced. Parven et al. [14] suggested an approach that uses machine learning and image processing techniques to detect paddy plant leaf diseases. The system classifies four types of disease: blast, brown spot, narrow brown spot, and sheath blight. To identify the crop disease, images of paddy leaves are captured directly from different paddy fields. Preprocessing removes the unnecessary background from each image. The output is then passed to the segmentation step, where K-means clustering is used to separate normal leaves from diseased leaves. Finally, the diseases are classified using the support vector machine (SVM) algorithm. The system achieved an accuracy of 94%. PrajwalGowda et al. [15] proposed a machine learning model using a CNN that detects plant disease from an input image and provides an appropriate remedy, including information on the use of pesticides and insecticides to treat the detected disease. Mangla et al. [16] proposed a model to detect paddy crop disease using machine learning and image segmentation together with other classification techniques. The proposed system detects and classifies three types of paddy crop disease. Mishra et al. [17] survey MQTT, one of the prevalent IoT protocols. The protocol enables applications to delineate, collect, store, analyze, and process data to solve a myriad of problems. In this paper, client libraries and brokers supporting versions 3.1.1 and/or 5.0 are compared. Furthermore, the advantages and limitations of the MQTT protocol and approaches to securing MQTT communication are explained. Ferencz et al. [18] illustrate the advantages of Node-RED, a prevalent tool developed by IBM. In addition, the paper presents the ease of using Node-RED for storing, analyzing, and visualizing data from certain industrial systems.
3 Proposed System
Almost 70% of the population in India depends on agriculture. Agricultural productivity influences economic growth and food security in India. The
most important factors for healthy crop production are water level management, soil properties, and temperature. Another factor that strongly affects crop yield is crop disease. Supplying more water than the crops require is one of the drawbacks of traditional agricultural methods. The use of disease-specific pesticides and insecticides also plays a vital role in improving crop productivity. To overcome these problems, the proposed system helps farmers take adequate steps. Practicing agriculture with the latest technologies gives better yield than traditional agricultural methods. The proposed system is specifically designed to enhance paddy crop productivity. It is implemented in three modules: an IoT system, a convolutional neural network (CNN) model for disease detection, and a suggestion system that helps farmers take the necessary remedies to normalize the sensor values and treat the disease accordingly. The IoT system uses soil moisture, temperature, and water level sensors, and the disease prediction model uses a deep learning-based CNN. The suggestions produced by the system are based on the sensor values.
3.1 IoT System Module
An IoT system consists of three layers: the perception layer, the network layer, and the application layer (Fig. 1).
3.1.1 Perception Layer
This layer is the physical layer of the IoT system and collects the required data. The perception layer consists of sensors, actuators, and other devices that interact with
Fig. 1 Layers of IoT mapped to proposed system
the environment. The perception layer of the proposed system uses a water level sensor, a soil moisture sensor, and a temperature sensor that interact with the soil and collect the data.
Various Sensing Modules of Proposed System (ESP32)
See Fig. 2.
Soil Moisture Sensor
Soil moisture refers to the quantity of water present in the soil, which is influenced by factors such as precipitation, temperature, and soil properties. Soil moisture sensors estimate this quantity from soil characteristics such as neutron interaction, electrical resistance, and dielectric constant, which serve as proxies for moisture content. This system uses a resistive soil moisture sensor, which exploits the relationship between water content and electrical resistance to check the moisture level of the soil. The sensor has two probes and measures the resistance of the soil from the electric current that flows from one probe to the other. If the water content of the soil is high, the electrical conductivity is high, so a lower resistance reading means that soil moisture is high. If the water content is low, the electrical conductivity is poor, so a higher resistance reading indicates low soil moisture. The sensor requires a supply voltage of 5 V.
Water Level Sensor
Water level sensors are often used in the agriculture sector. They measure the amount of water present in the farmland, and they are also
Fig. 2 IoT system architecture
used to detect stream flow. The water level sensor is compact and portable and has low energy consumption. It has three pins: signal (S), ground (GND), and supply voltage (VCC, 3.3–5 V).
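The inverse relationship between resistance and soil moisture described for the resistive sensor can be sketched as a simple calibration function. This is a minimal illustration, not the authors' firmware: the two calibration constants are hypothetical raw readings that would have to be measured for a real sensor, and the code assumes that a drier soil produces a higher raw reading.

```python
# Hypothetical calibration points (NOT from the paper): raw readings taken
# with the probes in completely dry soil and in saturated soil.
ADC_DRY = 3200   # assumed raw reading for dry soil (high resistance)
ADC_WET = 1200   # assumed raw reading for saturated soil (low resistance)


def moisture_percent(raw: int, dry: int = ADC_DRY, wet: int = ADC_WET) -> float:
    """Map a raw resistive-sensor reading to a 0-100 % moisture estimate.

    A higher reading is assumed to mean higher resistance (drier soil),
    so `dry` maps to 0 % and `wet` maps to 100 %.
    """
    if dry == wet:
        raise ValueError("dry and wet calibration points must differ")
    fraction = (dry - raw) / (dry - wet)
    # Clamp so that readings outside the calibration range stay in 0-100 %.
    return max(0.0, min(1.0, fraction)) * 100.0
```

With these assumed constants, a reading halfway between the two calibration points maps to 50 %, and readings beyond either point are clamped.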
3.1.2 Network Layer
The network layer transmits and processes data between different smart objects, servers, and networks. The network modules used in the proposed system are Wi-Fi, Node-RED, and the MQTT protocol over Wi-Fi using the Mosquitto broker. Mosquitto acts as a bridge between the gateway and Node-RED. The proposed system uses Amazon Web Services (AWS) for cloud storage (Fig. 3).
MQTT Protocol
Message Queuing Telemetry Transport (MQTT) is an OASIS standard protocol used for communication in the Internet of things (IoT). It is a lightweight messaging protocol built on top of the TCP/IP
Fig. 3 Gateway
protocol. It is designed mainly for high-latency, low-bandwidth, and unreliable networks, which makes MQTT a good option for sending large numbers of sensor messages to the cloud.
Node-RED
Node-RED is a flow-based development tool, originally developed by IBM, for wiring together hardware devices, APIs, and online services as part of the IoT. It is built on Node.js and provides a user-friendly interface that makes it simple to connect flows of diverse nodes and deploy them in a single click. The proposed system uses AWS EC2 as a server, on which Node-RED and the Mosquitto broker are installed. The Mosquitto broker uses the MQTT protocol to act as a bridge between the gateway and the server, carrying data from the sensors to Node-RED. Node-RED formats the data as JSON and transmits it to the MySQL database.
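The data path described above (sensor reading, JSON payload, database row) can be sketched in a few lines. This is only an illustration under stated assumptions: the topic name, field names, and table layout are hypothetical, the MQTT transport itself is omitted, and Python's built-in `sqlite3` stands in for the MySQL database so that the example runs without a server.

```python
import json
import sqlite3

# Hypothetical topic name; the paper does not specify one.
TOPIC = "paddy/field1/sensors"


def encode_reading(soil_moisture: float, water_level: float, temperature: float) -> str:
    """Serialise one reading as the JSON text a gateway could publish."""
    return json.dumps({
        "soil_moisture": soil_moisture,
        "water_level": water_level,
        "temperature": temperature,
    })


def make_db() -> sqlite3.Connection:
    """Create an in-memory database; sqlite3 stands in for MySQL here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, topic TEXT, "
        "soil_moisture REAL, water_level REAL, temperature REAL)"
    )
    return conn


def store_reading(conn: sqlite3.Connection, topic: str, payload: str) -> None:
    """Parse a received JSON payload and insert it as one row."""
    data = json.loads(payload)
    conn.execute(
        "INSERT INTO readings (topic, soil_moisture, water_level, temperature) "
        "VALUES (?, ?, ?, ?)",
        (topic, data["soil_moisture"], data["water_level"], data["temperature"]),
    )
    conn.commit()
```

In the architecture described in the text, the parsing and insertion step would be performed by a Node-RED flow rather than by Python code; the sketch only makes the message format and the storage step concrete.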
3.1.3 Application Layer
The application layer lets the user interact with the system. The proposed system provides a website with two features: the suggestion system and disease detection. For disease detection, the user uploads an image of the affected leaf to the website, and the system classifies the disease as brown spot, bacterial leaf blight, or leaf smut.
3.2 Suggestion System Module
Farmers should take appropriate measures to counter fluctuations in soil moisture and water level, which depend on the temperature, and should use insecticides and pesticides appropriate to the crop disease detected. To help farmers overcome these plant stresses, the suggestion system provides them with the necessary steps to follow in order to produce a healthy paddy crop (Figs. 4 and 5).
Fig. 4 Optimal water level requirement
Fig. 5 Optimal soil moisture ranges
The system monitors the paddy crop stage by stage against the optimal water level and soil moisture ranges. It helps the farmer supply water only when needed, according to the optimal water level requirement. The water level is monitored in every stage, and the required level for that stage is shown to the user. Soil moisture is considered only in the third stage of the crop. The user is advised to take action whenever the water level or soil moisture value is outside the required range.
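The stage-wise checks described above can be sketched as a small rule function. The numeric ranges below are placeholders for illustration only; the actual optimal ranges are those given in Figs. 4 and 5 and are not reproduced here. The stage names and message wording are likewise assumptions.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]

# Placeholder ranges (NOT the values from Figs. 4 and 5).
WATER_LEVEL_CM: Dict[str, Range] = {
    "start": (2.0, 5.0),
    "middle": (5.0, 10.0),
    "end": (0.0, 3.0),
}
# Soil moisture is checked only in the third (end) stage, as in the text.
SOIL_MOISTURE_PCT: Dict[str, Range] = {"end": (60.0, 80.0)}


def _check(name: str, value: float, limits: Range) -> Optional[str]:
    """Return a suggestion if `value` is outside `limits`, else None."""
    low, high = limits
    if value < low:
        return f"{name} is below the optimal range ({low}-{high}): increase it."
    if value > high:
        return f"{name} is above the optimal range ({low}-{high}): reduce it."
    return None


def suggestions(stage: str, water_level: float, soil_moisture: float) -> List[str]:
    """Compare the readings with the ranges for `stage` and list the advice."""
    if stage not in WATER_LEVEL_CM:
        raise ValueError(f"unknown stage: {stage}")
    messages = [_check("Water level", water_level, WATER_LEVEL_CM[stage])]
    if stage in SOIL_MOISTURE_PCT:
        messages.append(_check("Soil moisture", soil_moisture, SOIL_MOISTURE_PCT[stage]))
    return [m for m in messages if m is not None]
```

Keeping the ranges in dictionaries keyed by stage means that the real values can be substituted without touching the checking logic.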
3.3 Deep Learning Module
Farmers often use pesticides and insecticides without knowing the exact disease. When the measures taken match the disease, the crops can heal and become healthy, which eventually gives better yield and enhanced crop productivity. The system uses the EfficientNet B3 architecture to implement the model for detecting crop diseases. The model detects three types of paddy crop disease: leaf smut, bacterial leaf blight, and brown spot (Figs. 6, 7 and 8).
EfficientNet
EfficientNet is a convolutional neural network architecture and scaling method that scales all three dimensions of a network (width, depth, and resolution) using a compound coefficient. In contrast to the conventional practice of scaling these factors arbitrarily, EfficientNet's scaling method scales the resolution, depth, and width of the network uniformly. To use $2^N$ times more computing resources, the image size is increased by a factor of $a^N$, the network width by $b^N$, and the network depth by $c^N$, where $a$, $b$, and $c$ are constant coefficients determined by a small grid search on the original model. The compound coefficient thus scales all network dimensions together in a principled way. In the Keras API, EfficientNetB3 is a function that returns a Keras image classification model, optionally loaded with weights pre-trained on ImageNet.
Fig. 6 Bacterial leaf blight
Fig. 7 Leaf smut
Fig. 8 Brown spot
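The compound scaling rule can be made concrete with a short calculation. The coefficient values below are those reported in the original EfficientNet paper (Tan and Le, 2019), not values stated in this chapter, and they are named a, b, and c to match the text. The sketch relies on the usual approximation that the compute cost of a CNN grows linearly with depth and quadratically with width and image size.

```python
# Coefficients from the EfficientNet paper (an assumption here; this chapter
# does not list them). Names follow the text: a = image size, b = width,
# c = depth.
A_RESOLUTION = 1.15
B_WIDTH = 1.10
C_DEPTH = 1.20


def compound_scale(n: float) -> dict:
    """Multipliers for image size, width and depth at compound coefficient n."""
    return {
        "resolution": A_RESOLUTION ** n,
        "width": B_WIDTH ** n,
        "depth": C_DEPTH ** n,
    }


def flops_multiplier(n: float) -> float:
    """Approximate compute growth: depth * width^2 * resolution^2."""
    s = compound_scale(n)
    return s["depth"] * s["width"] ** 2 * s["resolution"] ** 2
```

With these coefficients, one scaling step multiplies the approximate compute cost by about 1.92, which is close to the factor of 2 that the rule targets, so N steps cost roughly $2^N$ times more.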
Convolutional Neural Network
A convolutional neural network (CNN/ConvNet) is a deep learning algorithm. A CNN takes an image as input and assigns learnable weights and biases to different features of the image, allowing the model to distinguish between images. A ConvNet reduces the image to a form that is easier to process without losing the features that are important for good prediction. A CNN consists of multiple layers of artificial neurons: an input layer, hidden layers, and an output layer. The hidden layers include convolutional layers, activation layers such as the rectified linear unit, pooling layers, and batch normalization layers. In a ConvNet, each layer generates an activation map, which is passed to the next layer (Fig. 9).
Convolutional layer
The essential building block of a CNN is the convolutional layer. It extracts features from the image using a set of learnable filters. The dot product between the filter and each patch of the input is computed, which allows, for example, edges in the image to be detected. The resulting two-dimensional activation map shows where in the input the filter detects a certain type of feature.
Activation layer
The purpose of the activation function is to introduce nonlinearity into the output of a neuron. Conventionally, the activation layer is applied after the convolutional layer. Given the output of the convolutional layer with the bias added, the activation function determines whether a neuron should be activated.
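The dot-product view of the convolutional layer can be sketched in plain Python. This is an illustration of the operation only, not the implementation used in the paper: it uses no padding and a stride of 1, and the vertical-edge filter is written by hand, whereas a CNN learns its filter values during training.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1).

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped before the dot product.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


# Hand-written vertical-edge filter, for illustration only.
VERTICAL_EDGE = [[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]]
```

Applied to an image whose left half is bright and whose right half is dark, the activation map is non-zero only at the positions where the filter straddles the boundary.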
Fig. 9 CNN architecture
Rectified Linear Unit (ReLU)
One of the most commonly used activation functions is ReLU. It is a piecewise linear function applied to every value: if the input is positive, the output equals the input; if the input is negative, the output is 0.
Pooling
Pooling is used to reduce the spatial dimensions of its input. There are two common pooling techniques, max-pooling and average pooling. A patch of predefined size is moved across the input, and the dimensionality is reduced by keeping only one value per patch. Max-pooling keeps the maximum value in the patch, whereas average pooling keeps the average of all the values in the patch.
Batch Normalization
Batch normalization reduces internal covariate shift, that is, the variation in the distribution of layer activations during training. It promotes stability, speeds up the learning process, and can also help reduce overfitting.
Fully connected layer
The fully connected layer is the last layer in the model and receives the flattened output of the previous layers. It combines all the information extracted by the preceding layers.
In the proposed system, the rice-leaf-diseases dataset obtained from Kaggle is used for detecting paddy crop diseases such as leaf smut, bacterial leaf blight, and brown spot. The images are pre-processed using the Keras preprocessing module, which takes each image as input and resizes and augments it. The output is a NumPy array. Using the Sequential module from Keras, a customized nine-layer model then achieved an accuracy of 92%.
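The ReLU and pooling operations described in this section can be sketched in plain Python. This is a minimal illustration that assumes non-overlapping square patches (stride equal to the patch size); it is not the Keras implementation used by the authors.

```python
from typing import Callable, List

Matrix = List[List[float]]


def relu(x: Matrix) -> Matrix:
    """Keep positive values and replace negative values with 0."""
    return [[max(0.0, v) for v in row] for row in x]


def pool2d(x: Matrix, size: int, reduce: Callable[[List[float]], float]) -> Matrix:
    """Apply `reduce` to each non-overlapping size x size patch."""
    out = []
    for i in range(0, len(x) - size + 1, size):
        row = []
        for j in range(0, len(x[0]) - size + 1, size):
            patch = [x[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append(reduce(patch))
        out.append(row)
    return out


def max_pool(x: Matrix, size: int = 2) -> Matrix:
    """Keep the largest value in each patch."""
    return pool2d(x, size, max)


def avg_pool(x: Matrix, size: int = 2) -> Matrix:
    """Keep the mean of the values in each patch."""
    return pool2d(x, size, lambda p: sum(p) / len(p))
```

A 4 x 4 feature map pooled with a 2 x 2 patch becomes a 2 x 2 map, which is the dimensionality reduction referred to in the text.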
4 Experimental Results for Disease Detection
The deep learning model is used to detect diseases of the paddy crop. It classifies the detected disease as one of three classes: bacterial leaf blight, leaf smut, or brown spot. The disease detection model is implemented in the Python programming language. The input image is provided by the user through the website. The dataset consists of 120 images, with 40 images per class, where each class corresponds to one disease. The data is preprocessed and then split into training and test sets in the ratio 80:20.
Performance metrics of EfficientNet B3
See Fig. 10.
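The 80:20 split of the 120-image dataset can be sketched as follows. The text does not say whether the split was stratified by class, so the per-class split here is an assumption, as are the file names, the class identifiers, and the random seed.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (file name, class label)

CLASSES = ["bacterial_leaf_blight", "brown_spot", "leaf_smut"]


def stratified_split(samples: List[Sample], test_fraction: float = 0.2,
                     seed: int = 42) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into train and test sets, keeping the class balance."""
    by_class: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        by_class[sample[1]].append(sample)
    rng = random.Random(seed)
    train: List[Sample] = []
    test: List[Sample] = []
    for label in sorted(by_class):
        items = by_class[label][:]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Hypothetical file names standing in for the 120 images (40 per class).
dataset = [(f"{label}_{i:02d}.jpg", label) for label in CLASSES for i in range(40)]
train_set, test_set = stratified_split(dataset)
```

Under these assumptions the split yields 96 training images and 24 test images, with 8 test images per class.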
Fig. 10 Training and validation accuracy of EfficientNet B3 (accuracy versus epochs)
Figure 10 shows that the model exhibits comparable performance on the training and validation datasets. Because both datasets show the same level of performance, the model has not yet overfitted the training dataset (Fig. 11).
Performance metrics of CNN
See Figs. 12 and 13. The disease detection model is also implemented using a customized CNN in order to find the model that predicts the disease most accurately. To evaluate the performance of the EfficientNetB3 model, it is compared with the customized CNN model. The dataset and the preprocessing techniques are the same for both models. The EfficientNetB3 model has higher accuracy than the customized CNN model: 91.66% compared with 83%. Figure 14 tabulates the instances that are correctly and incorrectly classified by the EfficientNetB3 and customized CNN models.
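As a consistency check on the reported figures, accuracy is simply the fraction of correctly classified test images. The counts used below are inferred from the reported percentages under the assumption that accuracy was measured on the 24-image test split (20% of 120 images); they are not taken from Fig. 14.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of correctly classified samples."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


TEST_IMAGES = 24  # 20 % of the 120-image dataset (assumed evaluation set)

# Inferred counts, not values from the paper: 22 and 20 correct out of 24.
efficientnet_b3_acc = accuracy(22, TEST_IMAGES)
custom_cnn_acc = accuracy(20, TEST_IMAGES)
```

Under that assumption, 22 correct predictions out of 24 give 91.66% when truncated to two decimals, and 20 out of 24 give about 83%, which matches the two reported accuracies.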
Fig. 11 Training and validation loss of EfficientNet B3
Fig. 12 Training and validation accuracy of customized CNN
Fig. 13 Training and validation loss of customized CNN
Fig. 14 Correct and incorrect classification of customized CNN and EfficientNet B3
5 Conclusion
Farmers encounter several challenges when trying to increase rice crop yield using traditional methods. A farmer requires expert guidance to cope with yield loss and water management difficulties and to take appropriate action on these issues. Hence, the proposed system employs IoT technology to monitor soil conditions and water levels for the paddy crop and so assist farmers in taking the necessary steps. Aside from environmental factors, farmers also experience a significant loss in productivity owing to crop disease. To overcome this, the proposed system uses a deep learning model for paddy disease detection. The accuracy of the disease detection model is 91.66% using EfficientNetB3, whereas it is 83% for the customized CNN. Hence, the EfficientNetB3 model is successfully applied in the disease detection process.
References
1. R. Bandi, S. Swamy, S. Raghav, A framework to improve crop yield in smart agriculture using IoT. Int. J. Recent Technol. Eng. 3(1) (2017)
2. E. Harte, Plant disease detection using CNN (2020)
3. S. Ghosal, K. Sarkar, Rice leaf diseases classification using CNN with transfer learning, in Proceedings of 2020 IEEE Calcutta Conference (CALCON) (2020)
4. W.-J. Liang, H. Zhang, G.-F. Zhang, H.-X. Cao, Rice blast disease recognition using a deep convolutional neural network (2019)
5. T. Satish, T. Bhavani, S. Begum, Agriculture productivity enhancement system using IOT. Int. J. Theor. Appl. Mech. 12(3) (2017). ISSN 0973-6085
6. G. Saradhambal, R. Dhivya, S. Latha, R. Rajesh, Plant disease detection and its solution using image classification. Int. J. Pure Appl. Mathe. 119(14) (2018)
7. M.S. Mekala, P. Viswanathan, Survey: smart agriculture IoT with cloud computing, in International Conference on Microelectronic Devices, Circuits and Systems (ICMDCS) (2017)
8. R. Bandi, S. Swamy, S. Raghav, A framework to improve crop yield in smart agriculture using IoT. Int. J. Res. Sci. Eng. 3 (2017)
9. B. Odoi-Lartey, E.D. Ansong, Improving agricultural production using internet of things (IoT) and open source technologies. Int. J. Comput. Appl. (0975–8887) 179(21) (2018)
10. M. Sowmiya, S. Prabavathi, Smart agriculture using IoT and cloud computing. Int. J. Recent Technol. Eng. (IJRTE) 7(6S3) (2019). ISSN 2277-3878
11. S. Simon, K. Paulose Jacob, Wireless sensor networks for paddy field crop monitoring application in Kuttanad. Int. J. Modern Eng. Res. (IJMER) 2 (2017)
12. N. Suresh, N.V. Hashiyana, N.V.P. Kulula, N.S. Thotappa, Smart water level monitoring system for farmers (2019)
13. Shruti, R. Patil, An approach for agricultural field monitoring and control using IoT. Int. Res. J. Eng. Technol. (IRJET) (2016)
14. N. Parven, M. Rashiduzzaman, N. Sultana, M.T. Rahman, M.I. Jabiullah, Detection and recognition of paddy plant leaf diseases using machine learning technique (Blue Eyes Intelligence Engineering & Sciences Publication, 2020)
15. B.S. PrajwalGowda, M.A. Nisarga, M. Rachana, S. Shashank, B.S. Sahana Raj, Paddy crop disease detection using machine learning. Int. J. Eng. Res. Technol. (IJERT) (2020)
16. N. Mangla, P.B. Raj, S.G. Hegde, R. Pooja, Paddy leaf disease detection using image processing and machine learning (2019)
17. B. Mishra, A. Kertesz, The use of MQTT in M2M and IoT systems: a survey. IEEE Access (2020)
18. K. Ferencz, J. Domokos, Using Node-RED platform in an industrial environment (2020)
Diabetic Retinopathy Disease Classification Using EfficientNet-B3 M. Naveenkumar, S. Srithar, T. Maheswaran, K. Sivapriya, and B. M. Brinda
Abstract Machine learning is now widely used in healthcare in many countries. Our objective is to develop recent AI technology that may have a large impact on the healthcare sector in India. The main aim is to create machine learning models that detect problems before they occur. Imagine being able to detect vision loss before it happens. A large population suffers from diabetic retinopathy, the leading cause of vision loss among working-age adults. Detecting and preventing this disease among people living in rural areas, where medical screening is difficult to conduct, can bring a major improvement to the healthcare sector and to the health of people living in those areas. It will also increase hospitals' capability to identify potential patients. Presently, technicians travel to rural areas to capture images and then rely on doctors to review the images and provide a diagnosis. Our aim is to scale these efforts through technology, gaining the ability to screen images for disease automatically and to provide information on how severe the condition may be. We build a machine learning model to speed up disease identification, working with thousands of images captured in rural areas to help identify diabetic retinopathy automatically. This can not only help prevent lifelong vision loss; such models may also be used to detect other kinds of disease in the future, such as other eye diseases and degeneration.
M. Naveenkumar (B) CSE, KPR Institute of Engineering and Technology, Arasur, India S. Srithar CSE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India e-mail: [email protected] T. Maheswaran ECE, Sri Shakthi Institute of Engineering and Technology, Coimbatore, India K. Sivapriya CSE, Vivekanandha College of Engineering for Women(Autonomous), Elaiyampalayam, India B. M. Brinda CSE, Paavai College of Engineering, Pachal, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_59
M. Naveenkumar et al.
Keywords Artificial intelligence · CNN · EfficientNet
1 Introduction
Diabetic retinopathy (DR), also known as diabetic eye disease, is a medical condition in which damage occurs to the retina because of diabetes. It is a leading cause of blindness. DR affects up to 75% of those who have had diabetes for twenty years or more [1] (Fig. 1). Diabetic retinopathy usually has no early warning symptoms. Retinal photography with manual interpretation is a widely used screening process for DR, with performance that can exceed that of an in-person dilated eye examination [2]. An automated tool for grading the severity of diabetic retinopathy would be extremely helpful for accelerating detection and treatment.
Fig. 1 Diabetic retinopathy
Recently, there have been a number of attempts to use deep learning to diagnose DR and grade diabetic retinopathy automatically. These include a previous competition and work by Google. One deep learning based system has even been approved by the US Food and Drug Administration (FDA) [3]. Automated examination of retinal color images offers advantages such as increased efficiency of screening programs, reduced barriers to access, and earlier detection and treatment. In recent years, automated retinal screening algorithms have been implemented. These algorithms rely on analysis based on image features designed manually by experts. This is a computer vision problem in which we are tasked with detecting diabetic retinopathy by analyzing eye images obtained through fundus photography [4].
Our aim is to build a robust machine learning model that identifies diabetic retinopathy from fundus photographs of patients' eyes. We use a large set of high-resolution retina images taken under a variety of imaging conditions. A clinician has graded the presence of DR in every image on a scale from 0 to 4, as follows: no DR (0), mild (1), moderate (2), severe (3), and proliferative DR (4). Our goal is to build an automated analysis system capable of assigning a score based on this scale [5]. As with any real-world dataset, we encountered noise in both the images and the labels. Images may contain artifacts or be out of focus, underexposed, or overexposed. A major goal of this project is to develop robust algorithms that can function in the presence of noise and variation [6].
2 Literature Survey
Krizhevsky et al. (2012) proposed ImageNet classification with deep convolutional neural networks (DCNN): a large, deep convolutional neural network was trained to classify millions of high-resolution images into 1000 different classes. On the test data, they attained an error rate of 39.7%, which is considerably better than the previous state of the art. To reduce overfitting in the fully connected layers, they used a new regularization method that proved to be very effective.
Michael D. Abramoff et al. (2016) studied automated detection of DR on a publicly available dataset through the integration of deep learning (DL), comparing the performance of a DL-enhanced algorithm for the automated detection of DR. They considered severe and proliferative DR, non-proliferative DR, and macular edema (ME). Sensitivity, specificity, negative predictive value, area under the curve, and their confidence intervals were calculated. The DL-enhanced algorithm achieved improved performance and has the potential to improve the efficiency of DR screening.
Varun Gulshan et al. (2016) developed a DL algorithm to detect DR and diabetic ME in retinal fundus photographs. Referable DR, defined as moderate or worse DR, and referable diabetic ME were determined from the majority decision of a panel of ophthalmologists. The algorithm was evaluated at two operating points selected from the development set, one chosen for high sensitivity and the other for high specificity. Such an algorithm could lead to improved care, and the outcomes were analyzed.
Suvajit Dutta et al. (2018) classified DR images using DL models to identify the key features of DR. The images were used to train three kinds of model: a backpropagation neural network (BPNN), a deep neural network (DNN), and a CNN. In testing, the CPU-trained neural network gave low accuracy because it had only one hidden layer. With more layers, these methods can better estimate the severity level. This approach is helpful for recognizing the severity of DR in images.
D. Jude Hemanth et al. (2018) proposed DR diagnosis from retinal images (RI) using a modified Hopfield neural network (MHNN), whose novelty lies in the training: in a CNN the weights are fixed, whereas they change when MHNN is used. The experiment was carried out on more than 500 images from the Lotus Eye Care Hospital. MHNN achieved an average sensitivity and specificity of 0.99 and an accuracy of 99.25%.
Nour Eldeen M. Khalifa et al. (2019) proposed DL-based gender classification through iris patterns: a robust iris gender identification system using the graph cut segmentation technique. The network consists of sixteen successive layers that extract features with different convolutions, and it achieved a significantly improved testing accuracy of 98.88%. Mamta Mittal et al. (2019) presented a DL-based enhanced tumor segmentation (TS) approach for MR brain images that combines the stationary wavelet transform with a new growing CNN. A quantitative analysis was performed to validate the accuracy of the proposed method, and comparisons were made with a support vector machine and a CNN. Amitojdeep Singh et al. (2021) reported research on uncertainty-aware and explainable diagnosis of retinal disease.
In this work, they analyze a DL model for diagnosing several categories of retinal disease. A threshold is computed from the uncertainty distribution. The features learned by the model, its performance, and its uncertainty and explainability are discussed in terms of their clinical significance. From this survey of the literature [7–13], it is clear that existing methods analyze diabetic retinopathy using techniques such as machine learning, image classification, and pattern recognition. For early detection of diabetic retinopathy, the architecture we propose is close to the state of the art in performance and reduces the loss, yet it is light and efficient enough to be deployed in a Web application. Unlike the previous methods, the model can be accessed through a Web UI. The idea behind this work is an AI model with realistic clinical potential.
3 Proposed Method
The DL ensemble model for diabetic retinopathy using AI has two main stages. The first stage is the model development stage, in which the deep learning model is trained and stored. Multiple model architectures are trained, and finally, the best performing model is chosen for deployment in the Web application [7]. Fig. 4 describes the two-stage architecture of the diabetic retinopathy prediction.
Fig. 2 Sample dataset for retinal images
3.1 Dataset
Training data is the initial set of data used to help a program learn how to apply technologies such as neural networks (NN) and produce refined results. It may be complemented by subsequent sets of data called validation and testing sets (Fig. 3). A machine learning (ML) algorithm requires a lot of data to train. We therefore use a dataset of eye fundus images labelled by experts at the Joint Shantou International Eye Centre (JSIEC), Shantou city, China. The input images were obtained from a Kaggle dataset. The dataset contains 209,494 fundus images in 39 classes, such as normal, large optic cup, DR1, DR2, DR3, optic atrophy, dragged disc, yellow-white spots-flecks, and so on. For each class, we used around 30 images to train the model. The images were captured with different camera models and types, so their quality is mixed; DR is graded on a scale of 0–4 [8, 9] (Fig. 4).
Fig. 3 Sample dataset images
Fig. 4 Architecture diagram for diabetic retinopathy
3.2 Image Preprocessing and Enhancement
Image preprocessing is an essential step for improving the performance of the ML model because the quality of images obtained from a real-world dataset varies. In this module, a preprocessing technique called Ben's preprocessing is used. It emphasizes the regions of the image to which the model has to pay attention in order to classify the image correctly.
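The text names the technique but does not give its steps. The sketch below assumes that "Ben's preprocessing" refers to the commonly used recipe of subtracting a Gaussian-blurred copy of the image (the local average) and re-centering the result at mid-gray. The weight of 4, the offset of 128, the value of sigma, the grayscale list-of-rows image format, and all function names are illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve a sequence with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D grayscale image (list of rows)."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]           # blur along x
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]  # blur along y
    return [list(row) for row in zip(*cols)]                 # transpose back


def ben_preprocess(image, sigma=2.0, weight=4.0, offset=128.0):
    """Subtract the local average and re-center around mid-gray (assumed recipe)."""
    blurred = gaussian_blur(image, sigma)
    result = []
    for row, blurred_row in zip(image, blurred):
        result.append([
            int(min(255, max(0, round(weight * p - weight * b + offset))))
            for p, b in zip(row, blurred_row)
        ])
    return result
```

With these assumptions, a uniform patch maps to mid-gray everywhere, while small bright or dark structures that differ from their surroundings are amplified, which is the sense in which the step "emphasizes" regions of interest.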
3.3 Model Training
Multiple model architectures are trained, and their performance is recorded. Improvement is judged by the model score, which is based on accuracy, AUC score, and quadratic weighted kappa. Finally, the model architecture with the best performance is chosen for deployment in the Web application. The model backbones we tried are EfficientNet-B3, EfficientNet-B4, and EfficientNet-B5 [10]. In the final solution, EfficientNet-B3 was chosen for its efficiency and performance. The advantage of EfficientNet is that, with far fewer parameters, the models in the family are efficient and still deliver good results. The first part of any network is its stem, after which all the experimentation with the architecture begins; the stem is common to all eight models, as are the final layers. The blocks in between contain a varying number of sub-blocks, and this number increases as we move from EfficientNet-B0 to EfficientNet-B7. There are 237 layers in EfficientNet-B0 and 813 layers in EfficientNet-B7, but all these layers can be built from the five modules shown in Fig. 5. Figure 6 illustrates the architecture of EfficientNet-B3, which is deployed in our Web application. Its architecture is the same as that of the other models in the family; the only difference is the number of feature maps, which increases the number of parameters.
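Of the three scores named above, quadratic weighted kappa is the least familiar. A minimal sketch of the standard definition follows, assuming integer grades 0 to 4; the function name and the default of five classes are illustrative choices, and nothing here is taken from the authors' training code.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes=5):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1."""
    n = len(y_true)
    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    numerator = 0.0
    denominator = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Disagreement is penalized by the squared distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            expected = true_hist[i] * pred_hist[j] / n  # chance agreement
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0.0:
        return 1.0  # every label and prediction fall in one single class
    return 1.0 - numerator / denominator
```

The quadratic weights make a prediction of "mild" for a "proliferative" eye far more costly than a prediction of "severe", which suits an ordinal severity scale. In practice, scikit-learn's `cohen_kappa_score` with `weights="quadratic"` computes the same quantity.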
Fig. 5 EfficientNet basic module
Fig. 6 Architecture of EfficientNet-B3
3.4 Save Trained Model
The trained model and its weights are saved so that the results can be reproduced. The model is saved in the standard PyTorch model save format.
3.5 Web Application
The trained machine learning model is exposed to the end user as a service through a user interface and a backend that contains the trained model and the preprocessing module. Once the user uploads an image through the user interface, the image passes through the preprocessing steps before model inference is run to obtain the prediction. The final predicted score is mapped to the appropriate diagnosis value and presented to the user in the UI. The application is containerized using Docker, which eases deployment and removes the need to reinstall the dependencies each time the application is migrated.
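The mapping from predicted score to diagnosis value is not spelled out in the text. One plausible sketch, assuming the model emits a single continuous score that is rounded to the nearest grade, is shown below. The thresholds, function names, and returned dictionary layout are assumptions; only the five grade names come from the scale described in the Introduction.

```python
# Grade names follow the 0-4 scale described in the paper.
GRADE_LABELS = ["No DR", "Mild", "Moderate", "Severe", "Proliferative DR"]


def score_to_grade(score, thresholds=(0.5, 1.5, 2.5, 3.5)):
    """Map a continuous score to an integer grade by counting thresholds passed.

    The threshold values are hypothetical; they simply round to the nearest
    grade and clamp out-of-range scores to 0 or 4.
    """
    return sum(1 for t in thresholds if score >= t)


def diagnosis_for(score):
    """Return the grade and its human-readable diagnosis for the UI."""
    grade = score_to_grade(score)
    return {"grade": grade, "diagnosis": GRADE_LABELS[grade]}
```

For example, `diagnosis_for(2.7)` yields grade 3 with the label "Severe". Keeping the thresholds as a parameter would allow them to be tuned on a validation set rather than fixed at the midpoints.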
4 Results and Discussions
To execute the code successfully, the user must have access to a CUDA-enabled GPU, 16 GB of RAM, and at least a 4-core CPU. The operating system can be Linux or Windows.
Fig. 7 DR severity
4.1 Module Description Understanding the Data
In this research, two datasets of HD retinal color images were used. The presence of DR in every image has been graded on a scale of zero to four, according to the International Clinical Diabetic Retinopathy severity scale (ICDR) 1–4 [11] (Fig. 7). DR images are acquired with different hardware operated by experts with varying levels of experience, so image quality varies widely. Subtle signs of retinopathy at an early stage can easily be masked in a low-contrast or low-resolution image. Analysis of a low-quality image may give an unreliable result, for example when the system labels an input as normal while lesions are present.
4.2 Understanding the Images
The variation in image quality is large, as illustrated in Fig. 8. There are five categories, so the task can be treated either as a multiclass classification problem or as an ordinal regression problem. Because the categories follow an order of magnitude of the quantity we are trying to predict, the problem is better suited to being treated as an ordinal regression problem [12] (Fig. 9). Figures 10 and 11 show the number of epochs completed during model training and the AI Web app, respectively. The performance of the AI Web app is measured using F1 score, precision, recall, and accuracy. The metrics are defined as

$$\text{Precision}\,(P) = \frac{TP}{TP + FP} \tag{1}$$

$$\text{Recall}\,(R) = \frac{TP}{TP + FN} \tag{2}$$

$$\text{F1 Score}\,(FS) = 2 \cdot \frac{P \cdot R}{P + R} \tag{3}$$
Fig. 8 Variation in image quality
Fig. 9 Images after preprocessing
where TP, FP, and FN denote the numbers of true positives, false positives, and false negatives, respectively. Table 1 compares the performance metrics of our proposed work with those of previous models. In the performance graph (Fig. 12), measures 1, 2, 3, and 4 denote F1 score, precision, recall, and accuracy, respectively.
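As a small worked illustration of the precision, recall, and F1 definitions, the sketch below counts TP, FP, and FN for one class treated as "positive" and then applies the three formulas. The function names and the one-versus-rest treatment of a multi-class label list are assumptions made for the example; the paper does not state how the per-class counts were aggregated.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN for one class treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Apply the precision, recall, and F1 formulas; return 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, with true labels `[1, 1, 0, 0, 1]` and predictions `[1, 0, 0, 1, 1]`, class 1 has TP = 2, FP = 1, and FN = 1, so precision, recall, and F1 all equal 2/3.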
Fig. 10 Training output
Fig. 11 Web application screenshot
Table 1 Performance metrics (%)
Type              F1 Score   Precision   Recall   Accuracy
AlexNet           84         78          81.2     81
VGG16             79.4       77          81       78
ResNet            90         89          92.3     94
EfficientNet-B3   92         89.3        93.4     96.8
Fig. 12 Performance measures
5 Conclusion
Machine learning in healthcare has been widely used in several countries. Our objective is to develop cutting-edge AI technology that can have a significant impact on the healthcare sector in India. The main goal is to build a deep learning model for detecting problems before they occur. Imagine being able to detect blindness before it happens. A huge number of individuals suffer from DR, the leading cause of blindness among working-age adults. With 96.8% accuracy, our app can help provide a diagnosis nearly as accurate as that of a trained doctor in the field. We successfully deployed the model in the Web application, thus speeding up disease detection. In the future, the accuracy of the model can be improved by fine-tuning the parameters, and the app will be made flexible enough for use on mobile devices.
References
1. https://www.mayoclinic.org/diseasesconditions/diabetic-retinopathy/symptoms-causes/syc20371611. Mayo Clinic. 2018. Diabetic Retinopathy
2. K.R. Sekar et al., Ranking diabetic mellitus using improved PROMETHEE hesitant fuzzy for healthcare sys. in Intelligent Data Communication Technologies and Internet of Things: Proceedings of ICICI 2020 (Springer, Singapore, 2021), pp. 709–724
3. X. Chen, B.P. Nguyen et al., Re-working multi-label brain tumor segmentation: structured kernel sparse representation. IEEE Syst. Man, Cybern. Mag. 3(2), 18–22 (2017)
4. M.D. Abramoff, Y. Lou, A. Erginay, W. Clarida, R. Amelon, J. Folk, M. Niemeijer, Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Invest. Ophthalmol. Vis. Sci. 57, 5200–5206 (2016)
5. B.P. Nguyen et al., iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks. BMC Bioinformatics 20(Suppl 23) (2019)
6. R.R. Kumar et al., Detection of diabetic retinopathy using deep convolutional neural networks. Comp. Vision Bioinspired Comp. 415–430 (2021)
7. M. Naveenkumar, K. Vishnu Kumar, J. Phys.: Conf. Ser. 1362, 012063 (2019)
8. S. Akey, R. Rajesh Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comp. Sci. Smart tech. (TCSST) 3(02), 81–94 (2021)
9. M. Naveenkumar, S. Srithar, V. Vijayaganth, G. Ramesh Kalyan, Identifying the credentials of agricultural seeds in modern era. Int. J. Adv. Sci. Technol. 29(7), 4458–4468 (2020)
10. M. Naveenkumar et al., J. Phys.: Conf. Ser. 1916, 012009 (2021)
11. A. Bora et al., Predicting the risk of developing diabetic retinopathy using deep learning. The Lancet Digital Health 3(1), e10–e19 (2021)
12. Ramasamy et al., Peer J. Comput. Sci. peerj-cs.45 (2021)
Design Flaws and Suggested Improvement of Secure Medical Data Sharing Scheme Based on Blockchain
Samiulla Itoo, Akber Ali Khan, Vinod Kumar, Srinivas Jangirala, and Musheer Ahmad
Abstract The fast development of information and communication technology includes blockchain technology, artificial intelligence, and cloud computing. Medical cyber systems increasingly need secure data protection. Cloud computing addresses the complexity of data but fails to provide secure data transmission. In blockchain technology, decentralization provides secure authentication and secure data transmission. In this paper, we review the scheme of Cheng et al., published in the Journal of Medical Systems (https://doi.org/10.1007/s10916-019-1468-1). In Cheng et al.'s scheme, we found security weaknesses and several possible attacks: a clock synchronization attack, a DoS attack, an impersonation attack, a failure to protect the session keys, and a stolen verifier attack. Thus, Cheng et al.'s scheme is not satisfactory for secure medical data sharing in medical communication systems. Finally, we suggest some improvements to Cheng et al.'s protocol.
Keywords Blockchain · Medical cyber system · Bilinear mapping · Design flaws · Security and privacy
S. Itoo · A. A. Khan · M. Ahmad Department of Applied Sciences and Humanities, Jamia Millia Islamia, New Delhi 110025, India e-mail: [email protected] V. Kumar (B) Department Mathematics, PGDAV College, University of Delhi, New Delhi 110065, India e-mail: [email protected] S. Jangirala Jindal Global Business School, O. P. Jindal Global University, Haryana 131001, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_60
S. Itoo et al.
1 Introduction
A medical cyber-physical system (MCPS) is a cyber-physical system that uses embedded software and network connectivity to monitor patients independently and to control multiple aspects of a patient's physiological data. Examples of medical cyber communication systems include patient healthcare records, clinic-to-clinic message communication, mediating technologies, and robotic healthcare systems [1]. An MCPS combines embedded systems, distributed computing, and wireless communication networks to monitor and control the dynamic data of patients [2]. This requires the MCPS to have secure authentication [3] and an access control mechanism [4] to confirm the identity and service of every patient. Granting permissions to each device and user requires building the first security barrier within the medical cyber-physical system. Conventional identity authentication technologies have passed through many phases, from single-factor to multi-factor authentication and from static to dynamic authentication. Subsequently, researchers have proposed new authentication protocols that are modifications of earlier ones. Lee pointed out that the public key framework design is not secure and needs improvement [5]. Tu et al. suggested a smart card-based authentication and session key scheme [6]. Xu et al. suggested an elliptic curve-based two-factor mutual authentication and key agreement scheme [7]. Thereafter, a two-factor authentication protocol based on elliptic curve cryptography (ECC) was also suggested [8]. Zhang et al. focused on the problem of authenticating the security of data sources, using a front-end blockchain together with back-end trusted hardware, Intel Software Guard Extensions [9]. Many other authentication protocols are available in the literature [10–12]. Nicolao et al. suggested a secure authentication scheme for a trustworthy management system implemented on blockchain technology [13]. Perera et al. addressed the authentication of multiple users' identities by introducing a verification and identification step in a multi-user system [14]. Lin et al. proposed a novel TCMA framework for communication among multiple participants using graphs, which is most effective for verifying node signatures; the use of a trapdoor hash function effectively improves signer authentication [15]. Fan et al. presented MedBlock, a blockchain-based data management system for handling patient data. In their scheme, a customized access control framework and MedBlock with symmetric encryption provide secure data protection. Consequently, it can play a critical role in data sharing [16]. Li et al. introduced a blockchain Ethernet technology template for medical data protection systems. Their solution provided a reliable storage method to ensure the verifiability of stored data [17].
1.1 Blockchain Technology
Blockchain is essentially a decentralized database for distributed data storage that combines a consensus mechanism, encryption algorithms, and other technologies [18]. A blockchain is defined as a distributed ledger based on cryptography and peer-to-peer communication. The data is saved in blocks in chronological order. The dynamic generation of recorded blocks is based on an automatically executable mechanism.
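The chronological, cryptographically linked storage described above can be illustrated with a minimal hash chain. This is only a sketch of the linking idea: it omits the consensus mechanism and peer-to-peer layer entirely, and the block fields, the use of SHA-256, and all function names are assumptions chosen for the example.

```python
import hashlib
import json


def block_hash(index, timestamp, data, prev_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps([index, timestamp, data, prev_hash]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, timestamp, data):
    """Append a new block that commits to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, timestamp, data, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["timestamp"],
                              block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous hash, altering the data in an earlier block invalidates that block and breaks every later link, which is what makes silent modification of stored records detectable.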
1.2 Our Contribution
We review Cheng et al.'s scheme [19] for medical data sharing based on blockchain and identify the following weaknesses and security attacks:
• Impersonation attack in the registration phase.
• Clock synchronization attack.
• Failure to protect session keys.
• Stolen verifier attack.
• DoS attack.
1.3 Roadmap of This Paper
The remainder of this paper is organized as follows. We give the mathematical definitions in Sect. 2. Section 3 describes Cheng et al.'s scheme. In Sect. 4, we discuss the weaknesses of Cheng et al.'s scheme and some possible attacks on it. The suggested improvements are discussed in Sect. 5. Finally, we give the conclusion of this paper.
2 Mathematical Preliminaries
In this section, we present some useful definitions and notations.
2.1 Notations
The notations used in this paper are listed in Table 1.
Table 1 Notations
Symbols   Illustration
$A_1$     Additive group of order $q$
$A_2$     Multiplicative group of order $q$
$q$       Prime number
$f$       Bilinear mapping
$P$       Generator of group $A_1$
$h$       $h : \{0, 1\}^* \times A_2 \to Z_q^*$
$H$       $H : \{0, 1\}^* \times A_1 \to A_1$
$E(\cdot)$   Encryption algorithm
$O_1$     Hospital $O_1$
$T_c$     Time stamp
$H(M)$    Hash of $O_2$ hospital data
$L$       In cloud storage, $L$ represents the hash of the $O_2$ hospital's data
$Mac$     Medical data summary of $O_2$ hospital
$mn$      Master node
$sn$      Super node
2.2 Bilinear Map
Let $A_1$ be an additive cyclic group and $A_2$ a multiplicative cyclic group of the same prime order $q$. A mapping $f : A_1 \times A_1 \to A_2$ is said to be a bilinear mapping if it satisfies the following properties:
• Bilinear: for all $a_1, a_2 \in A_1$ and all $x, y \in Z_q^*$, $f(x a_1, y a_2) = f(a_1, a_2)^{xy}$.
• Nondegenerate: there exist $a_1, a_2 \in A_1$ such that $f(a_1, a_2) \neq 1$, where 1 is the identity element of $A_2$.
• Computability: for any $a_1, a_2 \in A_1$, $f(a_1, a_2)$ can be computed efficiently.
Bilinear maps are called pairings because they associate pairs of elements of $A_1$ with elements of $A_2$.
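The bilinearity and nondegeneracy properties can be checked numerically on a toy map. The sketch below is an assumption-laden teaching example, not a cryptographic pairing: it takes $A_1$ to be the integers modulo $q$ under addition (generator $P = 1$), takes $A_2$ to be the order-$q$ subgroup of the integers modulo $p$ generated by $g$, and defines $f(a, b) = g^{ab} \bmod p$. The parameters $q = 1019$, $p = 2039$, $g = 4$ and the function name are illustrative, and the construction is insecure because discrete logarithms in this $A_1$ are trivial.

```python
# Toy parameters (assumptions for illustration only):
# Q is prime, P_MOD = 2 * Q + 1 is prime, and G = 4 has order Q modulo P_MOD.
Q = 1019
P_MOD = 2039
G = 4


def pairing(a, b):
    """Toy bilinear map f(a, b) = G^(a*b) mod P_MOD on the additive group Z_Q."""
    return pow(G, (a * b) % Q, P_MOD)


def is_bilinear_for(a, b, x, y):
    """Check f(x*a, y*b) == f(a, b)^(x*y) for one choice of inputs."""
    left = pairing((x * a) % Q, (y * b) % Q)
    right = pow(pairing(a, b), x * y, P_MOD)
    return left == right
```

Real schemes replace this toy with pairings on elliptic curve groups, where computing $x$ from $xP$ is believed to be hard; the algebraic identity being checked is the same.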
2.3 Computational Hard Problem
The following related problems are considered.
• Computational Diffie-Hellman problem: for all $a_1, a_2 \in A_1$ and $x, y \in Z_q^*$ with $a_1 = xP$ and $a_2 = yP$, it is hard to compute $xyP$ in polynomial time [20].
• Discrete logarithm problem: for any $a_1, a_2 \in A_1$ with $a_2 = x a_1$, it is hard to find $x \in Z_q^*$ [21].
3 Cheng et al. Protocol
Cheng et al.'s blockchain-based scheme [19] for secure data sharing is as follows.
3.1 Initialization Phase
In the initialization phase, the following steps are performed to initialize the system parameters:
Step 1. Let $A_1$ and $A_2$ be two groups of order $q$. Given the parameter $l$, the supernode generates $(Q_{sn}, S_{sn})$, where $Q_{sn} = S_{sn} \cdot P$, and issues the attributes $(l, A_1, A_2, q, P, f, H, h, Q_{sn})$.
Step 2. The public-private key pair $(Q_{mn}, S_{mn})$ is generated in the medical consortium chain, where $Q_{mn} = S_{mn} \cdot P$.
3.2 Registration Phase
The details of the registration phase of the mobile user are as follows:
Step 1. Hospital $O_1$ selects an identity $id$ and a random value $R_1$, specifies $R$ as the data request for the data of hospital $O_2$, and forwards $(id, R, R_1)$ to the supernode.
Step 2. The supernode collects $(id, R, R_1)$, evaluates $Q_{id} = H(id \oplus R \oplus R_1)$ and $S_{id} = Q_{id} \cdot S_{sn}$, and forwards $\{S_{id}, Q_{id}\}$ to hospital $O_1$ through a secure channel.
Step 3. Hospital $O_1$ receives the public and private keys $(S_{id}, Q_{id})$ and stores them securely.
3.3 Authentication
The details of the authentication phase are given below:
Step 1. Hospital $O_1$ selects a number $x \in Z_q^*$ and evaluates the parameters $r = H(id \,\|\, R \,\|\, Q_{id} \,\|\, Q_{sn} \,\|\, X \,\|\, X' \,\|\, T_c)$, $Q_{id} = H(id \oplus R \oplus R_1)$, $X = x \cdot P$, $X' = x \cdot Q_{mn}$, $W = E_{S_{id}}(id, R, U)$, $U = S_{id} + x \cdot r \cdot Q_{id}$, and $s = h(X \,\|\, X' \,\|\, T_c \,\|\, R_1)$. Hospital $O_1$ then sends $\{W, X, R_1, T_c\}$ to the master node $mn$.
Step 2. The master node $mn$ receives $\{R_1, T_c, W, X\}$ and validates the time stamp, rejecting the request if it is invalid. If it is valid, the master node evaluates $s = h(X \,\|\, X' \,\|\, T_c \,\|\, R_1)$ and $X' = S_{mn} \cdot X$, and decrypts $(id, R, U)$ with hospital $O_1$'s public key. It verifies $f(U, P) = f(Q_{id}, Q_{sn} + r \cdot xP)$ and rejects if the check fails. Otherwise, it chooses a random number $y \in Z_q^*$, evaluates $t = y \cdot P$, $V = y \cdot X = xy \cdot P$, and $M = E_{Q_{id}}(R \,\|\, L \,\|\, H(M) \,\|\, Mac)$, and creates the authentication value $Au = (W, T_c, X', t, V)$ and the session key $Sk = h(t, X, V, X')$. The master node $mn$ forwards $\{Au, M, t\}$ to hospital $O_1$.
Step 3. Hospital $O_1$ receives $\{Au, M, t\}$, evaluates $h(X, X', t, V)$ and $V = x \cdot t$, and checks the authenticity of $Au = (W, T_c, X', t, V)$. If $Au$ is not authentic, it rejects; otherwise, it decrypts $M$ with $S_{id}$ to access $(R, L, H(M), Mac)$. Finally, hospital $O_1$ and the medical consortium node $mn$ achieve mutual authentication, and $O_1$ can obtain data that includes the location index and the summary information of hospital $O_2$'s data.
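The verification in Step 2 works because $S_{id} = S_{sn} \cdot Q_{id}$, so $U = (S_{sn} + x r) \cdot Q_{id}$ and, by bilinearity, $f(U, P) = f(Q_{id}, P)^{S_{sn} + xr} = f(Q_{id}, Q_{sn} + r \cdot X)$ with $X = x \cdot P$. The sketch below checks this identity numerically. It assumes a toy, insecure pairing over the integers modulo $q$ (generator $P = 1$), SHA-256 as a stand-in for $H$, and integer-valued $id$, $R$, and $R_1$; every name and parameter is hypothetical and none of it is Cheng et al.'s implementation.

```python
import hashlib

# Toy, insecure parameters chosen only to make the algebra checkable.
Q = 1019        # prime group order
P_MOD = 2039    # prime with Q dividing P_MOD - 1
G = 4           # element of order Q modulo P_MOD
P_GEN = 1       # generator of the additive group Z_Q


def pairing(a, b):
    """Toy bilinear map f(a, b) = G^(a*b) mod P_MOD."""
    return pow(G, (a * b) % Q, P_MOD)


def hash_to_zq(*parts):
    """Stand-in for H: hash the parts to a nonzero element of Z_Q."""
    text = "|".join(str(part) for part in parts).encode("utf-8")
    digest = hashlib.sha256(text).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def register(identity, request, nonce, s_sn):
    """Supernode side: Q_id = H(id xor R xor R1), S_id = Q_id * S_sn."""
    q_id = hash_to_zq(identity ^ request ^ nonce)
    s_id = (q_id * s_sn) % Q
    return q_id, s_id


def sign(s_id, q_id, x, r):
    """Hospital side: U = S_id + x * r * Q_id."""
    return (s_id + x * r * q_id) % Q


def verify(u, q_id, q_sn, r, big_x):
    """Master node side: check f(U, P) == f(Q_id, Q_sn + r * X)."""
    return pairing(u, P_GEN) == pairing(q_id, (q_sn + r * big_x) % Q)
```

A value of $U$ produced with the registered $S_{id}$ passes the check, while any other value fails it, since the toy pairing is injective in its first argument when the second is the generator.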
4 Weakness and Cryptanalysis of Cheng et al. Scheme
Cheng et al. [19] proposed a medical data sharing scheme based on blockchain. During the cryptanalysis of their scheme for medical chain data sharing, we found some weaknesses and security attacks.
4.1 Weaknesses of Cheng et al. Scheme
The weaknesses of Cheng et al.'s scheme are as follows:
• In Cheng et al.'s scheme, the user (hospital $O_1$) calculates $Q_{id} = H(id \oplus R \oplus R_1)$ in the registration phase and computes $Q_{id} = H(id \oplus R \oplus R_1)$ again in the authentication phase; this increases the computing cost.
• In the authentication phase, hospital $O_1$ computes $X' = x \cdot Q_{mn}$, whereas the medical chain master node $mn$ computes $X' = S_{mn} \cdot X$, but $X' = x \cdot Q_{mn}$ and $X' = S_{mn} \cdot X$ are different and do not match each other.
• The master node $mn$ computes $s = h(X \,\|\, X' \,\|\, T_c \,\|\, R_1)$, which is not required in any further communication in Cheng et al.'s protocol.
• Cheng et al. did not use login and verification in either the registration or the authentication phase. This allows anyone to access authentication easily. Hence, Cheng et al.'s scheme is insecure.
4.2 DoS Attack
In Cheng et al.'s scheme, there is a possibility of a DoS attack when hospital $O_1$ sends $(W, X, R_1, T_c)$ to the medical consortium node $mn$: the receiver stores $(W, X, R_1, T_c)$ in its database without verifying it. Because of this lack of verification, an attacker can send a large number of random messages to the $mn$ node. The $mn$ node has no check to determine whether the sender is an authenticated user or an attacker; this leads to a denial of service, i.e., a DoS attack.
4.3 Impersonation Attack in Registration Phase
In the registration phase of Cheng et al.'s scheme, if an attacker guesses the id of a hospital, it becomes easy for the attacker to create a request and send it to the medical consortium node. The medical consortium node has no verification method in Cheng et al.'s framework. The attacker therefore easily obtains the public and private keys $(Q_{id}, S_{id})$ and uses them in the authentication phase. This leads to an impersonation attack in the registration phase.
4.4 Clock Synchronization Attack
Cheng et al.'s protocol uses random numbers and timestamps to prevent replay attacks. However, in wireless communication and LANs, timestamps give rise to a time synchronization problem. Cheng et al.'s protocol is therefore vulnerable to a man-in-the-middle attacker who continuously delays or repeats a valid transmission of messages between sender and receiver. Cheng et al.'s scheme thus needs improvement to protect against delay attacks and other data-level and signal spoofing attacks [22].
4.5 Failure to Protect Session Keys
In Cheng et al.'s protocol, hospital $O_1$ selects a random number $x \in Z_q^*$ and checks $V = x \cdot t$, while the $mn$ node openly shares $Au = (W, X, X', t, V, T_c)$. To obtain the session key, hospital $O_1$ needs only $(X, X', t, V)$. It is therefore easy for any attacker to construct $Sk$ from $Au$. Hence, Cheng et al.'s protocol fails to protect the session key.
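The observation in this subsection can be made concrete with a short sketch: if the session key is a hash of values that all travel in the clear, an eavesdropper who records $Au$ derives the same key as the legitimate parties. The sketch assumes SHA-256 as a stand-in for $h$, placeholder integers for the group elements, and the field order of $Au$ given in this subsection; the function names are hypothetical.

```python
import hashlib


def session_key(t, big_x, v, x_prime):
    """Sk = h(t, X, V, X'), with SHA-256 standing in for h (an assumption)."""
    material = "|".join(str(part) for part in (t, big_x, v, x_prime))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def eavesdrop_session_key(au):
    """Derive Sk using only the openly transmitted tuple Au.

    Au is taken as (W, X, X', t, V, Tc), the form stated in Sect. 4.5.
    No secret value (x, y, S_id, S_mn) is needed.
    """
    _w, big_x, x_prime, t, v, _tc = au
    return session_key(t, big_x, v, x_prime)
```

The design issue is that no input to the hash is known only to the two endpoints; a key derivation that mixes in a shared secret which is never transmitted would not be reproducible from the transcript alone.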
4.6 Stolen Verifier Attack
In Cheng et al.'s scheme, an attacker who has stolen $(Q_{id}, S_{id})$ from the registration phase can use these values in the authentication phase as follows:
• The master node $mn$ sends $(Au, t, M)$ to hospital $O_1$, where $Au = (W, X, X', t, V, T_c)$, without encryption, so it becomes easy for the attacker to compute the session key $Sk$ from $(X, X', t, V)$.
• The attacker decrypts the message with the help of the private key $S_{id}$ issued by the supernode and the session key already obtained.
5 Suggested Improvement for Cheng et al. Scheme
We suggest the following improvements to Cheng et al.'s protocol:
• The protocol should include login and authentication phases.
• In the registration phase, a password and biometric-based approach should be used.
• Verification conditions should be used to authenticate the hospital data and user data.
• Password update and biometric update phases should be added.
6 Conclusion and Future Work
In this paper, we have reviewed Cheng et al.'s scheme. We observed that their protocol has some weaknesses and cannot prevent various attacks: it fails to protect session keys, it is exposed to the stolen verifier attack, clock synchronization attack, impersonation attack, and DoS attack, and it lacks a login phase. Thus, Cheng et al.'s scheme is not secure for the medical system. Further, we have suggested some possible improvements to Cheng et al.'s protocol, which is based on blockchain technology for secure data communication in a medical communication system. In the future, we will design an authentication protocol for medical healthcare systems that is more secure and computationally more efficient than existing protocols. Instead of bilinear mapping, we will use elliptic curve cryptography, which is more secure and more lightweight than a bilinear map. This should be highly beneficial for medical healthcare systems with respect to impersonation problems and data theft.
References 1. I. Lee, O. Sokolsky, Medical cyber physical systems, in Design Automation Conference (IEEE, 2010), pp. 743–748 2. A. Haro, M. Flickner, I. Essa, Detecting and tracking eyes by using their physiological properties, dynamics, and appearance, in: Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No. PR00662), vol. 1 (IEEE, 2000), pp. 163–168 3. J.H. Saltzer, M.D. Schroeder, The protection of information in computer systems. Proc. IEEE 63(9), 1278–1308 (1975) 4. A. Ouaddah, H. Mousannif, A.A. Ouahman, Access control models in IoT: the road ahead, in IEEE/ACS 12th International Conference of Computer Systems and Applications (AICCSA) (IEEE, 2015), pp. 1–2 5. E.A. Lee, Cyber physical systems: design challenges, in 11th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC) (IEEE 2008), pp. 363–369 6. H. Tu, N. Kumar, N. Chilamkurti, S. Rho, An improved authentication protocol for session initiation protocol using smart card. Peer Peer Netw. Appl. 8(5), 903–910 (2015) 7. X. Xu, P. Zhu, Q. Wen, Z. Jin, H. Zhang, L. He, A secure and efficient authentication and key agreement scheme based on ECC for telecare medicine information systems. J. Med. Syst. 38(1), 9994 (2014) 8. S.A. Chaudhry, K. Mahmood, H. Naqvi, M.K. Khan, An improved and secure biometric authentication scheme for telecare medicine information systems based on elliptic curve cryptography. J. Med. Syst. 39(11), 175 (2015) 9. F. Zhang, E. Cecchetti, K. Croman, A. Juels, E. Shi, Town crier: an authenticated data feed for smart contracts, in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, (2016), pp. 270–282 10. A. Iqbal, A.A. Khan, V. Kumar, M. Ahmad, A mutual authentication and key agreement protocol for vehicle to grid technology, in Innovations in Electrical and Electronic Engineering (Springer, 2021), pp. 863–875 11. J. Srinivas, A.K. Das, N. Kumar, J.J. 
Rodrigues, Cloud centric authentication for wearable healthcare monitoring system. IEEE Trans. Dependable Secure Comput. 17(5), 942–956 (2018) 12. S. Jangirala, V. Chakravaram, Authenticated and privacy ensured smart governance framework for smart city administration, in ICCCE 2020 (Springer, 2021), pp. 931–942 13. N. Alexopoulos, J. Daubert, M. Mühlhäuser, S.M. Habib, Beyond the hype: on using blockchains in trust management for authentication, in IEEE Trustcom/BigDataSE/ICESS (IEEE, 2017), pp. 546–553 14. P. Perera, V.M. Patel, Face-based multiple user active authentication on mobile devices. IEEE Trans. Inf. For. Secur. 14(5), 1240–1250 (2018) 15. C. Lin, D. He, X. Huang, M.K. Khan, K.-K.R. Choo, A new transitively closed undirected graph authentication scheme for blockchain-based identity management systems. IEEE Access 6, 28203–28212 (2018) 16. K. Fan, S. Wang, Y. Ren, H. Li, Y. Yang, Medblock: efficient and secure medical data sharing via blockchain. J. Med. Syst. 42(8), 136 (2018) 17. H. Li, L. Zhu, M. Shen, F. Gao, X. Tao, S. Liu, Blockchain-based data preservation system for medical data. J. Med. Syst. 42(8), 141 (2018) 18. X. Liang, S. Shetty, D. Tosh, C. Kamhoua, K. Kwiat, L. Njilla, Provchain: a blockchain-based data provenance architecture in cloud environment with enhanced privacy and availability, in 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (IEEE, 2017), pp. 468–477 19. X. Cheng, F. Chen, D. Xie, H. Sun, C. Huang, Design of a secure medical data sharing scheme based on blockchain. J. Med. Syst. 44(2), 52 (2020) 20. A.A. Khan, V. Kumar, M. Ahmad, S. Rana, D. Mishra, Palk: password-based anonymous lightweight key agreement framework for smart grid. Int. J. Electr. Power Energy Syst. 121, 106121 (2020)
21. A.A. Khan, V. Kumar, M. Ahmad, S. Rana, Lakaf: lightweight authentication and key agreement framework for smart grid network. J. Syst. Architect. 116, 102053 (2021) 22. V. Kumar, A.A. Khan, M. Ahmad, Design flaws and cryptanalysis of elliptic curve cryptography-based lightweight authentication scheme for smart grid communication, in Advances in Data Sciences, Security and Applications (Springer, 2020), pp. 169–179
Comparative Analysis of Machine Learning Algorithms for Rainfall Prediction
Rudragoud Patil and Gayatri Bedekar
Abstract India is an agricultural nation, and its economy depends primarily on agricultural productivity and rainfall. Rainfall prediction is needed by farmers in order to estimate crop productivity. Rainfall forecasting is the application of science and technology to estimate the state of the atmosphere. Precipitation should be predicted accurately for the efficient use of water, for crop production and for the advance planning of water systems. Rainfall prediction can be treated as a data mining classification task. The performance of the various techniques depends on the representation of the rainfall data, which includes long-term (monthly) as well as short-term (daily) patterns. Choosing an effective technique for a specific rainfall period is a challenging task. This article focuses on a few prevalent data mining algorithms for rainfall prediction: Naive Bayes, K-Nearest Neighbor and Decision Tree are compared. Weather data for the Delhi NCR region from 1997 to 2016 were collected, and the approaches were evaluated for rainfall forecast accuracy. Experimental results demonstrate that the Decision Tree Classifier (DTC) is effective for rainfall prediction. Applications of the Decision Tree Classifier can provide accurate and timely rainfall prediction to support local emergency response to heavy rain.
Keywords Naive Bayes · K-nearest neighbor algorithm · Decision tree · Rainfall prediction · Data mining · Machine learning
R. Patil
Department of Computer Science and Engineering, Gogte Institute of Technology, Belagavi, Karnataka, India
e-mail: [email protected]
G. Bedekar (B)
Department of Computer Science and Engineering, Shri Vasantrao Potdar Polytechnic, Belagavi, Karnataka, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_61
1 Introduction
Climate change is studied by analyzing weather activity over a certain period of time [1]. Its central characteristic is that its effects are time-dependent [2]. One climate-related task is rainfall prediction, which forecasts rainfall at particular locations using specific features such as humidity and wind. Predicting precipitation is a hard task because it depends on weather-related factors including wind, moisture and temperature. Precipitation prediction is usually done with supervised learning techniques. Because there are several supervised learning methods, different results can be obtained. Rainfall data may also be represented over long periods, such as months, or short periods, such as days. Data mining aims at discovering useful knowledge and making it understandable; this knowledge can then be used in future applications [3]. Many data mining approaches have been suggested for rainfall forecasting in a number of places such as Japan, China, South Africa and India [4–6]. Recent rainfall prediction techniques include Decision Tree, Naive Bayes and K-Nearest Neighbor, among others. Several techniques need to be compared in order to determine which performs best for rainfall prediction. This research therefore examines several supervised learning techniques for rainfall prediction [7] using Indian data. Figure 1 depicts how supervised machine learning algorithms can be used to predict rainfall. Several algorithms are studied in this article, and data mining techniques are shown to be effective in the analysis of rainfall.
Fig. 1 A schematic of supervised machine learning in prediction
2 Related Work
1. ZaheerUllah Khan and Maqsood Hayat used several data mining techniques for weather prediction, including K-Nearest Neighbor, Naive Bayes and Decision Trees. Among these classification algorithms, the Decision Tree gave the most promising results, with a prediction accuracy of 82.62% [8].
2. Rajesh [2] compared the following classification methods: Decision Trees [9], Rule-Based Methods, Neural Networks, Memory-Based Reasoning, Naive Bayes, Bayesian Belief Networks and Support Vector Machines. He listed notable decision tree algorithms such as Iterative Dichotomiser 3 (ID3), C4.5 (the successor of ID3), Classification and Regression Tree (CART), CHi-squared Automatic Interaction Detector (CHAID) and MARS (which extends decision trees to better handle numerical data). Software equipped with a decision tree can be provided to the machine [10].
3. A new preprocessing approach based on moving average and singular spectrum analysis was proposed in [3]. The training data are labeled with classes by transforming the rainfall values into low, medium or high levels. An Artificial Neural Network (ANN) is then trained on these data so that the classes can be predicted for an unseen part of the data (testing). Two daily mean rainfall series from the Zhenshui and Daninghe watersheds were used as the dataset for the study.
4. More recently, [11] presented a comparative study of Classification and Regression Tree (CART), Naive Bayes, K-Nearest Neighbor and Neural Networks. The dataset contains 2245 New Delhi rainfall records for June to September (the yearly rainfall period) from 1996 to 2014, with attributes including average temperature, dew point temperature, humidity, sea-level pressure and wind speed. The Neural Network performed best with 82.1% accuracy, followed by KNN with 80.7%, CART with 80.3% and Naive Bayes with 78.9%.
5. Various other authors have used different techniques. Gokila et al. [12] published a survey of data mining strategies used to support weather forecasting and climate analysis. An adaptive neuro-fuzzy approach for estimating the wake effect of wind turbines was presented by Al-Shammari et al. [13]. Oleg V. Divyankov proposed a technique for constructing a training set of images that represents the actual data and used an ANN to predict weather events [14]. Sawale and Gupta [15] presented a methodology for weather forecasting using an Artificial Neural Network in data mining. In that work, the weather is forecast for a future period using a neural-network-based algorithm, with initial modeling carried out using a Back Propagation Neural Network (BPN). The related work makes extensive use of machine learning techniques for rainfall prediction, most commonly Naive Bayes (NB) and Decision Tree. This research therefore uses these methods, together with K-Nearest Neighbor as an additional prediction technique, to evaluate the efficiency of these prediction methods.
6. The authors of [16] studied 3D image processing using machine learning. In human–robot interaction, 3D objects are identified using the proposed projection-based input processing model: based on layer selection and vector representation, a 3D object is converted into several 2D planes so that a more accurate result is obtained. The firefly algorithm, which can be used for scheduling, is discussed in [17]. Deep learning techniques are studied in [18], which concludes that deep learning algorithms can be trained without any preprocessing techniques and help to improve operational speed and accuracy in several critical areas. In [19], the efficiency of the results is presented clearly using graphs and pictorial methods, which helps the reader to understand the concepts easily.
3 System Architecture
Figure 2 shows the system flow of the prediction study used in the implementation.
Fig. 2 General view of system flow or an algorithm work
3.1 Data Collection and Preprocessing
The climate dataset for Delhi NCR, India, from 1997 to December 2016 was obtained from https:/data.world/datasets/weater. The dataset includes different attributes such as temperature, heat, humidity, rain, snow and the type of attribute. Obtaining clean data is one of the most difficult tasks in the knowledge discovery process. Data collection and preprocessing form the initial stage of the data mining process. Data preprocessing is a crucial step, because only valid data produce accurate results. Although the dataset contained several attributes, we considered only the important ones and ignored the rest. We then transformed the data, converting the non-numerical (string) weather values into numerical values. The preprocessing phase is intended to prepare the data before further analysis. Weather data contain irrelevant information, noise and incomplete cases. Preprocessing such data therefore plays a vital part in enhancing the performance of the prediction process. Two tasks were carried out: cleaning and normalization. The main objective is to establish an effective weather model to predict precipitation.
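The two preprocessing tasks named above, cleaning and normalization, can be sketched as follows. This is a minimal illustration using only the Python standard library; the field names, the sample records and the choice of min-max scaling are assumptions, since the text does not state which normalization was used.

```python
def clean(records, required):
    """Drop records that have a missing value in any required field."""
    return [r for r in records if all(r.get(k) is not None for k in required)]

def min_max_normalize(records, field):
    """Return new records with one numeric field rescaled to [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [
        {**r, field: 0.0 if span == 0 else (r[field] - lo) / span}
        for r in records
    ]

# Hypothetical sample records; the second one is an incomplete case.
raw = [
    {"temperature": 30.0, "humidity": 40.0, "rain": 0},
    {"temperature": None, "humidity": 55.0, "rain": 1},
    {"temperature": 20.0, "humidity": 90.0, "rain": 1},
    {"temperature": 25.0, "humidity": 65.0, "rain": 0},
]

cleaned = clean(raw, required=("temperature", "humidity", "rain"))
normalized = min_max_normalize(min_max_normalize(cleaned, "temperature"), "humidity")

for record in normalized:
    print(record)
```

Rescaling all numeric attributes to a common range matters mostly for distance-based methods such as K-Nearest Neighbor, where an attribute with a large range would otherwise dominate the distance.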
3.2 Segmentation
Segmentation refers to the process of segmenting and refining the data for a given context using a cross-calculation method. In this study, the label encoder LabelEncoder() is used to convert non-numerical (string) values into numerical values. Some features that were not relevant for modeling were deleted using del self.df. The dataset was then reshaped and made ready for testing.
1. Shape before eliminating rows and columns (records, features): (100,990, 20)
2. Shape after eliminating rows and columns (records, features): (86,177, 16)
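The encoding and feature-removal steps described above can be sketched in plain Python. This is a standard-library stand-in that mimics the idea of a label encoder (each distinct string receives an integer code); it is not the scikit-learn implementation, and the column names and sample rows are hypothetical.

```python
def fit_label_encoder(values):
    """Map each distinct string to an integer code, in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}

def encode_column(rows, column, mapping):
    """Return new rows with the given column replaced by its integer code."""
    return [{**row, column: mapping[row[column]]} for row in rows]

def drop_columns(rows, columns):
    """Return new rows without the columns that are irrelevant for modeling."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]

# Hypothetical sample rows with one string-valued feature to encode.
rows = [
    {"conditions": "Smoke", "wind_dir": "NW", "humidity": 40, "rain": 0},
    {"conditions": "Rain", "wind_dir": "E", "humidity": 92, "rain": 1},
    {"conditions": "Haze", "wind_dir": "NW", "humidity": 55, "rain": 0},
]

mapping = fit_label_encoder(r["conditions"] for r in rows)
encoded = encode_column(rows, "conditions", mapping)
reduced = drop_columns(encoded, {"wind_dir"})

# Shape as (records, features), analogous to the shapes reported in the text.
shape = (len(reduced), len(reduced[0]))
print(mapping, shape)
```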
3.3 Training and Testing the System
Now that the dataset is available in the desired format, it is divided randomly into a training set and a testing set in the ratio 75:25. Scikit-learn is used for data import and selection and for splitting the data into training and testing sets for the prediction model. Next, the model is built from sample data called the training set. This trained model is then given a test set in order to predict future events accurately. The historical data of a classification project are usually divided into two sets: one for building the model and the other for testing it. The dataset is therefore divided into two parts.
1. Training set: the model was first trained on the data from 1996 to December 2016, with the rainfall label given as 0 or 1.
2. Score set: data from 1996 to December 2016 were used for the score dataset, which included all fourteen attributes except the rainfall value expected from the model.
The attributes and types remained unchanged in both sets. During model building (training), the classification algorithm establishes relationships between the values of the predictors and the target values. Classification models are evaluated by comparing the predicted values with the known target values in a set of test data, and the knowledge gained is applied to the algorithm.
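The random 75:25 split and the comparison of predicted against known target values can be sketched as below. This is an illustrative standard-library version, not the scikit-learn routine; the synthetic humidity data, the fixed seed and the toy threshold model are assumptions introduced only to make the example runnable.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """Toy model: midpoint between the highest dry and lowest rainy humidity."""
    dry = max(h for h, label in train if label == 0)
    wet = min(h for h, label in train if label == 1)
    return (dry + wet) / 2

def accuracy(predicted, actual):
    """Fraction of predictions that equal the known target values."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Synthetic (humidity, rain) pairs: rain is labeled 1 when humidity exceeds 70.
rows = [(h, int(h > 70)) for h in range(40, 100, 3)]

train, test = train_test_split(rows)
threshold = fit_threshold(train)
predicted = [int(h > threshold) for h, _ in test]
score = accuracy(predicted, [label for _, label in test])
print(len(train), len(test), score)
```

Keeping the test rows out of model fitting is what makes the reported accuracy an estimate of performance on unseen data.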
Fig. 3 Sample DTC, NB, KNN
4 Methods
4.1 Decision Tree Algorithm
A decision tree is a tree-like diagram or model. It looks like an inverted tree, with the root at the top and the sub-trees growing downwards. A decision tree is built to predict the value of a target attribute from the input attributes of the training set. The trained model is then applied to new data. In the case of a nominal attribute, the training set is split into subsets, one for each attribute value. Each edge leads to a sub-tree, which is generated by repeating the algorithm on the corresponding subset. A criterion must be specified for selecting the attribute on which to split; it can be information gain, gain ratio, Gini index or accuracy. A sample decision tree is shown in Fig. 3.
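Two of the splitting criteria mentioned above, information gain and the Gini index, can be computed as in the following sketch. The nominal attributes and the four sample rows are hypothetical and only serve to show how splitting on attribute values produces subsets that are scored.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gini(labels):
    """Gini index of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def split_by(rows, attribute):
    """Group the class labels by the value of a nominal attribute."""
    subsets = {}
    for features, label in rows:
        subsets.setdefault(features[attribute], []).append(label)
    return subsets

def information_gain(rows, attribute):
    """Entropy of the whole set minus the weighted entropy of its subsets."""
    labels = [label for _, label in rows]
    subsets = split_by(rows, attribute)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical training rows: (nominal features, rain label).
rows = [
    ({"sky": "cloudy", "wind": "strong"}, 1),
    ({"sky": "cloudy", "wind": "weak"}, 1),
    ({"sky": "clear", "wind": "strong"}, 0),
    ({"sky": "clear", "wind": "weak"}, 0),
]

best = max(["sky", "wind"], key=lambda a: information_gain(rows, a))
print(best, information_gain(rows, "sky"), information_gain(rows, "wind"))
```

In this toy data the attribute "sky" separates the classes perfectly, so it would be chosen for the first split; the algorithm would then be repeated on each resulting subset.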
4.2 Naive Bayes Algorithm
Bayes' theorem, which uses conditional probability, is the foundation of the Naive Bayes classifier. It is a simple classification algorithm that calculates a set of probabilities by counting the frequencies and combinations of values in a given dataset. The performance of a classification algorithm is generally assessed by the accuracy of the classification. However, since classification is often a fuzzy problem, the correct answer may depend on the user, and the best choice depends on the user's interpretation of the problem. The algorithm performs well with categorical input variables compared to numerical variable(s). The sample Naive Bayes equation is shown in Fig. 3.
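The frequency-counting idea described above can be sketched for categorical inputs as follows. The add-one (Laplace) smoothing, the attribute names and the sample rows are assumptions added for illustration; the text does not say whether smoothing was used.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (attribute, label) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values
    for features, label in rows:
        for attribute, value in features.items():
            value_counts[(attribute, label)][value] += 1
            values_seen[attribute].add(value)
    return class_counts, value_counts, values_seen

def predict(model, features):
    """Return the class with the highest prior times likelihood product."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for attribute, value in features.items():
            # Add-one smoothing avoids zero probabilities for unseen values.
            numerator = value_counts[(attribute, label)][value] + 1
            denominator = count + len(values_seen[attribute])
            score *= numerator / denominator
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical categorical training rows: (features, class label).
rows = [
    ({"sky": "cloudy", "wind": "strong"}, "rain"),
    ({"sky": "cloudy", "wind": "weak"}, "rain"),
    ({"sky": "clear", "wind": "weak"}, "dry"),
    ({"sky": "clear", "wind": "strong"}, "dry"),
    ({"sky": "cloudy", "wind": "weak"}, "rain"),
    ({"sky": "clear", "wind": "weak"}, "dry"),
]

model = train_naive_bayes(rows)
print(predict(model, {"sky": "cloudy", "wind": "weak"}))
print(predict(model, {"sky": "clear", "wind": "strong"}))
```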
Table 1 Correctly classified instances at 75% training
Algorithm            Accuracy (%) with correctly classified instances at 75% training data
Decision tree        99.91
Naive Bayes          92.9
K-Nearest neighbors  98.09
4.3 K-Nearest Neighbors (KNN)
K-nearest neighbor, or KNN, is a simple algorithm that uses the whole dataset in the training stage. When a prediction is required for an unseen data point, the algorithm searches the whole training dataset for the k most similar instances and returns the prediction derived from these most similar instances. KNN is often used in search applications to find items similar to a given one. As KNN can be used for both classification and regression, the study uses it as one of the techniques for making predictions. A sample KNN is given in Fig. 3.
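The search for the k most similar instances can be sketched as below. Euclidean distance, majority voting among the neighbors and the sample points are assumptions made for this illustration; the feature values are taken to be already normalized.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Classify a query point by majority vote among its k nearest neighbors."""
    neighbors = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training points: ((humidity, temperature), rain label).
training = [
    ((0.90, 0.20), 1),
    ((0.80, 0.30), 1),
    ((0.85, 0.25), 1),
    ((0.20, 0.80), 0),
    ((0.30, 0.90), 0),
    ((0.25, 0.70), 0),
]

print(knn_predict(training, (0.82, 0.28)))
print(knn_predict(training, (0.22, 0.85)))
```

Because every prediction scans the whole training set, the cost is paid at query time, which is the expensive real-time execution referred to later in the comparison.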
5 Results and Discussion
5.1 Performance of Different Machine Learning Algorithms
Table 1 displays the performance of the three algorithms selected for the study. The accuracy ranges from about 92% to 99% across the three techniques.
5.2 Sample Outputs
See Fig. 4.
Fig. 4 Output figures
Fig. 5 Comparison of the three techniques
5.3 Comparison of the Three Techniques
The decision tree is well suited to weather prediction, with higher predictive accuracy than the other data mining techniques; this was verified using the software tool MATLAB. The regression approach could not produce an exact predicted value, but it can give an approximate nearest value. It was also observed that, as the dataset size increases, the accuracy first increases and then decreases beyond a certain point. One possible reason is that the model fits the training dataset too closely. The comparison of the three techniques is shown in Fig. 5.
5.4 Discussion of Comparison of Results
This work used the Decision Tree classifier, Naive Bayes and K-Nearest Neighbors algorithms for rainfall prediction and observed the following points.
1. Decision Tree versus Naive Bayes
• Decision tree is a discriminative model, whereas Naive Bayes is a generative model.
• Decision trees are simpler and more versatile.
• Decision tree pruning may neglect some key values in the training data, which can lead to a loss of accuracy.
• Naive Bayes works with a small dataset, in contrast to the Decision Tree, which needs more data.
2. K-Nearest Neighbors versus Naive Bayes
• Naive Bayes is much faster than KNN because of KNN's real-time execution.
• Naive Bayes is parametric, whereas KNN is non-parametric.
3. Decision Tree versus K-Nearest Neighbors
• Both approaches are non-parametric.
• Decision tree supports automatic feature interaction, whereas KNN does not.
• Decision tree is quicker because of the expensive real-time execution of KNN.
6 Conclusion
This research work was conducted on the Delhi NCR weather data collection to forecast rainfall using three classification techniques (Decision Tree, Naive Bayes and K-Nearest Neighbor). Identifying the best rainfall prediction technique is the main aim of the study. A comparative analysis was therefore performed after the three techniques had been applied, in order to determine the technique that was best suited. The experimental findings show that the Decision Tree and K-Nearest Neighbors achieve very good rainfall prediction performance, owing to their capacity to train on small or large amounts of data and to achieve higher prediction rates. We conclude that Decision Tree, Naive Bayes and K-Nearest Neighbors can be used as supervised classification learning algorithms for rainfall prediction, and that their performance depends heavily on what data are available and on what the model is trained on. A limitation of the proposed work is that rainfall is predicted only for the next few days. By adding more parameters to the study, it will be possible to predict rainfall over longer time spans and on an hourly basis.
References
1. H.S. Badr, B.F. Zaitchik, A.K. Dezfuli, A tool for hierarchical climate regionalization. Earth Sci. Inf. 8(4), 949–958 (2015)
2. I. Panel, C. Change, P. Ivonne, Climate change 2013: the physical science basis: working group I contribution to the fifth assessment report of the intergovernmental panel on climate change, in Intergovernmental Panel on Climate Change (Cambridge University Press, Cambridge, 2014)
3. D.J. Hand, H. Mannila, P. Smyth, D.J. H, Principles of Data Mining (Adaptive Computation and Machine Learning) (Bradford Books, Cambridge, MA, 2001)
4. K. Abhishek, A. Kumar, R. Ranjan, S. Kumar, A rainfall prediction model using artificial neural network, in Control and System Graduate Research Colloquium (ICSGRC), 2012, IEEE (2012). Available: https://doi.org/10.1109/ICSGRC.2012.6287140. Accessed 6 Nov 2016
5. R. VenkataRamana, B. Krishna, S.R. Kumar, N.G. Pandey, Monthly rainfall prediction using Wavelet neural network analysis. Water Resour. Manage. 27(10), 3697–3711 (2013)
6. B. Wang et al., Rethinking Indian monsoon rainfall prediction in the context of recent global warming. Nat. Commun. 6, 7154 (2015)
7. V.B. Nikam, B.B. Meshram, Modeling rainfall prediction using data mining method: a Bayesian approach, in 2013 Fifth International Conference on Computational Intelligence, Modeling and Simulation (2013)
8. Z.U. Khan, M. Hayat, Hourly based climate prediction using data mining techniques by comprising entity demean algorithm. Middle-East J. Sci. Res. 21(8), 1295–1300 (2014)
9. A. Geetha, G.M. Nasira, Data mining for meteorological applications: decision trees for modeling rainfall prediction, in 2014 IEEE International Conference on Computational Intelligence and Computing Research, 18–20 Dec 2014, Coimbatore, India
10. R. Kumar, Decision tree for the weather forecasting. Int. J. Comput. Appl. 2, 0975–8887 (2013)
11. D. Gupta, U. Ghose, A comparative study of classification algorithms for forecasting rainfall, in 2015 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions) (2015)
12. G.K. Anand Kumar, A. Bharathi, Clustering and classification in support of climatology to mine weather data: a review. Int. Conf. Comput. Intell. Syst. 04, 1336–1340 (2015), ISSN: 2278-2397
13. E.T. Al-Shammari, M. Amirmoja hedi, S. Shamshirband, D. Petkovic, N.T. Pavlovic, H. Bonakdari, Estimation of wind turbine wake effect by adaptive neuro fuzzy approach. Flow Meas. Instrum. 45, 1–6 (2015)
14. O.V. Diyvankov, V.A. lykov, S.A. Terekhoff, Artificial Neural Networks in Weather Forecasting 829–835 (1992)
15. G.J. Sawale, S.R. Gupta, Use of artificial neural network in data mining for weather forecasting. Int. J. Comput. Sci. Appl. 6(2), 383–387 (2013), ISSN: 0974-1011
16. A. Sungheetha, R. Rajesh Sharma, 3D image processing using machine learning based input processing for man-machine interaction. J. Innovative Image Process. (JIIP) 03(01) (2021), ISSN: 2582-4252
17. M.T. Tapale, R.H. Goudar, M.N. Birje et al., Utility based load balancing using firefly algorithm in cloud. J. Data. Inf. Manag. 2, 215–224 (2020). https://doi.org/10.1007/s42488-020-00022-2
18. G. Ranganathan, A study to find facts behind preprocessing on deep learning algorithms. J. Innovative Image Process. (JIIP) 03(01) (2021)
19. A. Sungheetha, R. Rajesh Sharma, A comparative machine learning study on IT sector edge nearer to working from home (WFH) contract category for improving productivity. J. Artif. Intell. Capsule Netw. 02(04) (2020), ISSN: 2582-2012
Electronic Invoicing Using Image Processing and NLP
Samarth Srivastava, Oshi Varma, and M. Gayathri
Abstract In the world of e-commerce, recording sales plays a significant role. To do so, companies use invoices, and each seller generates an invoice for each transaction. As large numbers of invoices are generated by different sellers every day, each in its own format, it is challenging for the hosting company to keep track of trades and transactions. Information has to be extracted from each invoice manually because the invoices use different templates. We found that almost all well-performing products in this area operate on standard pdf files, which consist of various elements enclosing the information, each of which can be selected to extract information. However, these solutions do not perform well on scanned pdf files, which have no such containers: all the information in such a pdf is held in a single container as an image element. This type of invoice forms a major portion of the total invoices generated, and the products that do support data extraction from scanned pdf files are expensive. We propose an innovative, cost-effective and extensive solution to this problem. It automates the invoicing process and converts all invoices to a standard structured template predefined by the user. The algorithms used detect the required information in a given invoice following a tag-based system, which is programmed internally. After the regions of interest (ROI) have been detected by localizing bounding boxes around them, the data are extracted from each bounding box using OCR techniques and stored in various data structures. The extracted data are then used to generate an Excel file in the defined template automatically. A user-friendly GUI allows annotation of new invoice formats and prepares the dataset for future use. The solution allows the company to redirect the resources currently deployed on this task to other areas.
S. Srivastava (B) · O. Varma · M. Gayathri
SRM IST, Chennai, India
e-mail: [email protected]
O. Varma
e-mail: [email protected]
M. Gayathri
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_62
Keywords Image processing · NLP · OCR · Region of interest (ROI)
1 Introduction
Invoices are used by sellers around the world to record each transaction. Each invoice holds details about the company, the product purchased and the amount. However, as there is no standard format, each seller works with its own invoice format, which causes the location of this information to vary. Figure 1 represents the different invoice formats used by sellers. With the development of the e-commerce industry, the number of transactions has increased exponentially. Hosting e-commerce companies find it difficult to extract information from such a large number of invoices. They have to extract data from each invoice manually to maintain up-to-date records. This manual intervention is tiring and expensive. It can also delay the data entry process, which can have serious repercussions. E-commerce companies are therefore looking for a solution to this issue. Various software products work with a limited number of formats [1]. However, with the increasing number of sellers, e-commerce companies struggle to adapt to a new format every time a seller collaborates with the hosting company. Multiple software products are being developed to provide a solution in this field. A typical invoice data extraction system involves an optical character reader (OCR). It extracts data from an image with high accuracy but lacks structural reference. Natural language processing (NLP) techniques help the user to derive semantic features and arrange the data accordingly. In addition, when there is noise in the image, the error rate of OCR increases, which leads to incorrect data extraction. Multinational companies cannot afford such errors and require a highly reliable solution. In this paper, two widely used techniques, regex and template matching, are combined to form an efficient system. A brief description of each is given below.
Fig. 1 Different formats of invoice
1.1 Regex
A regular expression, also referred to as a rational expression, is a series of characters that represents a pattern. Regular expressions are a common approach, supported by many languages, for matching a series of characters against a pattern. Languages such as C++, Python, JavaScript and Java support regular expressions. One of their popular uses is URL matching in Google Analytics. A detailed table of regex expressions and their descriptions is given below. The invoice system uses regular expressions to filter the data extracted by the OCR before placing it in the Excel sheet (the final output). This helps to ensure that the extracted data are reliable and contain no garbage values.
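The filtering step described above might look like the following sketch, which uses Python's standard `re` module. The field names, the patterns and the sample OCR strings are hypothetical; real invoices would need patterns tuned to each field and format.

```python
import re

# Hypothetical field patterns, introduced only for this illustration.
PATTERNS = {
    "invoice_number": re.compile(r"INV-\d{4,}"),
    "date": re.compile(r"\d{2}/\d{2}/\d{4}"),
    "amount": re.compile(r"\d+(?:,\d{3})*(?:\.\d{2})?"),
}

def validate_fields(extracted):
    """Keep only fields whose OCR text fully matches the expected pattern."""
    return {
        name: text.strip()
        for name, text in extracted.items()
        if name in PATTERNS and PATTERNS[name].fullmatch(text.strip())
    }

# Clean OCR output; customer_id has no pattern defined, so it is dropped.
ocr_output = {
    "invoice_number": " INV-20431 ",
    "date": "12/03/2021",
    "amount": "1,250.00",
    "customer_id": "??~#",
}

# Noisy OCR output in which letters were confused with digits.
noisy_output = {
    "invoice_number": "lNV-2O431",
    "date": "12/O3/2021",
    "amount": "1,250.0O",
}

print(validate_fields(ocr_output))
print(validate_fields(noisy_output))
```

Using a full match rather than a search means a field is accepted only when the whole string has the expected form, which is what keeps partially garbled values out of the output sheet.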
1.2 Template Matching
Template matching is a technique for finding a template image in a larger image. In our case, the template image can be a company logo, a company name or other significant information specific to the type of invoice for which the template is made, and the larger images are the invoice images fed to the software for data extraction. OpenCV provides several comparison methods for this purpose. The technique slides the template image over the input image and compares the template with the input image at each position. It returns a grayscale image in which each pixel shows how closely the neighborhood of that pixel matches the template. Figure 1 shows how template matching was used in our project. Other solutions to this problem use graph convolution networks, computer vision and machine learning algorithms. A detailed study of previous work is given in Sect. 2, followed by the implementation of the proposed work in Sect. 3.
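The sliding comparison described above can be sketched in plain Python on small grayscale arrays. This illustration uses the sum of squared differences as the comparison score, where lower means closer; it is a simplified stand-in, not the OpenCV implementation, and the tiny image and template are made up.

```python
def match_template(image, template):
    """Slide the template over the image and score each position.

    Returns a 2D list of sum-of-squared-differences scores (lower is closer).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    scores = []
    for top in range(ih - th + 1):
        row_scores = []
        for left in range(iw - tw + 1):
            ssd = sum(
                (image[top + dy][left + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            row_scores.append(ssd)
        scores.append(row_scores)
    return scores

def best_match(scores):
    """Return (top, left) of the lowest score, i.e. the closest match."""
    positions = (
        (top, left)
        for top in range(len(scores))
        for left in range(len(scores[0]))
    )
    return min(positions, key=lambda pos: scores[pos[0]][pos[1]])

# Made-up 4x5 grayscale image containing the 2x2 template at row 1, column 2.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]

scores = match_template(image, template)
print(best_match(scores))
```

The score map plays the role of the grayscale result image mentioned in the text: each entry describes how well the template fits at that offset.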
2 State of the Art
This section reviews the available research in this field. Belaïd and Belaïd [2] use NLP morphological tagging for automatic segmentation of words. A similar approach is used in [3, 4]. In [3], each word is extracted along with its position, converted into N-grams and assigned a probability. This step is followed by LSTM and IOB labeling to obtain the final output. Cloudscan is used by Tradeshift for automatic invoice processing. One of the latest studies was published in 2020 by Yingyi Sun, Xianfeng Mao, Sheng Hong, Wenhua Xu and Guan Gui [1]. They use various image processing methods, such as dilation, erosion and edge cutting, followed by a template matching approach for twenty predefined invoice templates. Sidhwa et al. [5] also discuss various image processing techniques applied before the training process. Horizontal and vertical scanning is performed for image segmentation. If a word found after segmentation is meaningful, it is passed to an adaptive classifier. Table 1 provides a detailed overview of the adopted methodologies and the problems with the available technologies, which motivate tackling the issues in existing systems and building a new invoice system. Sun et al. [6] try to narrow the problem by classifying invoice images into taxi invoices, train invoices and value-added invoices. This helps an invoice system to refine its data extraction process with respect to the invoice class. Other studies classify invoices into machine-printed and handwritten. As handwritten and machine-printed images require different processing techniques, this classification strategy helps invoice systems to become more reliable. Our proposed work uses Tesseract's OCR. Rahul et al. [7] discuss the use of OCR in the field of license plate detection. As noise can cause OCR errors, image processing plays a vital role when working with OCR. As stated in the title of our paper, we apply various image processing techniques before processing our invoices for data recognition and extraction. Rahul et al. [7] employ multiple image processing techniques before applying the algorithm to the image in order to obtain maximum accuracy.
A similar approach has been used in our work, where we try to process the images and find the region of interest (ROI). Section 4 discusses the implementation of our model in detail.

Table 1 Overview of the adapted methodology in the field of invoice data extraction

| References | Adapted methodology | Problems |
|---|---|---|
| [2] | Part of speech tagging | Under segmentation |
| [1] | Template matching and OCR | Time consuming and issues due to noise |
| [8] | Graph convolution network and artificial intelligence | Issues due to noise |
| [9] | Case-based reasoning | Increased solving time due to poor database indexing |
| [3] | Recurrent neural network | Requires manual ordering of long-short term memory (LSTM) |
| [5] | Computer vision and OCR | Not suitable for handwritten invoices |
| [10] | Machine learning and OCR | Works only for tabular structure |
| [11] | Bayesian’s structure | Less optimized |
| [4] | Rule-based system | Need fixed weights |
| [12] | Computer vision | Image size affects results |
| [13] | Machine learning | Requires human intervention |
3 Proposed Work

We propose an efficient solution that processes almost all invoices and extracts specific, precise information for fields such as customer name, seller name, and invoice number. A UML diagram of the proposed model is depicted in Fig. 2. A seller sends the invoice to the hosting company in pdf or png format. The hosting company receives millions of invoices daily, so processing each invoice separately would be expensive and time consuming. The hosting company takes the invoices from the invoice database and provides the output location for the processed invoices. The software then produces a structured output. Figure 3 shows the GUI of our model. Our proposed model works with invoices in both pdf and png format. We use many templates to recognize the format of an invoice. If no template matches the invoice, we ask the user to annotate the invoice once. It is then added to the template set, so no later invoice of this format needs to be annotated by the user. After matching the template, we extract the information using the bounding boxes from the template. Finally, the result is exported as a CSV or Excel file, as required by the user. This structured output can be used by the company for maintaining records and analyzing its growth. Figure 3 shows how the model asks the user to specify all the directories. The output directory is the folder where the output will be saved. The source directory is the directory that contains all the input invoice images. The format directory is the directory where all the processed formats are saved. The template directory holds all the processed templates. The destination directory is the directory where all the
Fig. 2 UML diagram of the proposed software
Fig. 3 GUI of the proposed software
processed invoice images will be saved after the process completes. This separates the processed images from the unprocessed ones. All these directories are mandatory. Until they are specified, the ‘START’ button is not visible, so the user cannot proceed. The user needs to create these directories once for the model to run smoothly. After specifying all the folders, the user can simply place all the invoices that need to be processed in the source directory.
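The directory check described above can be sketched as follows. This is a minimal illustration, not the authors' GUI code: the key names in `REQUIRED_DIRS` and the functions `missing_directories` and `can_start` are hypothetical names introduced here.

```python
from pathlib import Path

# Hypothetical keys for the five directories named in the text.
REQUIRED_DIRS = ("output", "source", "format", "template", "destination")


def missing_directories(config):
    """Return the names of required directories that are unset or do not exist."""
    missing = []
    for name in REQUIRED_DIRS:
        value = config.get(name)
        if not value or not Path(value).is_dir():
            missing.append(name)
    return missing


def can_start(config):
    """The START action is only offered once every directory is valid."""
    return not missing_directories(config)
```

A GUI could call `can_start` whenever a directory field changes and show the ‘START’ button only when it returns `True`.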
4 Implementation

In this section, we discuss the implementation of our proposed work. Figure 4 shows the architecture of our model. A user-friendly graphical interface was developed to increase interactivity. We explain each module of the proposed system and list the libraries used to implement it. Each part of the architecture is described briefly below.
4.1 Input Image

The program takes input invoices in ‘.pdf’ and ‘.png’ formats. The user can store all the invoices in these two formats in a directory and give the path of that directory to the software. The software automatically converts each invoice in that directory to the structured output format. The knowledge set of the invoice system is built up over time and becomes more reliable as more formats are encountered.
4.2 Conversion

The software fetches a file from the specified directory. If the file is a png document, it continues to the next stage of the extraction process. If the file is a pdf document, the software first converts it automatically to a png file using:

• ImageMagick
• OpenCV
• Pillow

These libraries convert the pdf invoice into an image internally. After this conversion, the resulting image file is saved in a temporary directory with the same
Fig. 4 Architecture diagram of the proposed software
name as the filename of the invoice in consideration. This image file is then moved to the next stage in the extraction pipeline.
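The dispatch between png and pdf inputs can be sketched as below. This is only an illustration under stated assumptions: `stage_invoice` is a hypothetical helper, and `pdf_to_png` is a caller-supplied function standing in for whichever conversion library is used, since the text does not give the actual calls.

```python
from pathlib import Path


def stage_invoice(invoice_path, tmp_dir, pdf_to_png):
    """Return the path of a PNG that is ready for the next pipeline stage.

    pdf_to_png is a hypothetical callable (src_pdf, dst_png) supplied by the
    caller; it stands in for the conversion library.
    """
    src = Path(invoice_path)
    suffix = src.suffix.lower()
    if suffix == ".png":
        return src  # PNG files go straight to the next stage
    if suffix == ".pdf":
        # The converted image keeps the invoice's file name.
        dst = Path(tmp_dir) / (src.stem + ".png")
        pdf_to_png(src, dst)
        return dst
    raise ValueError(f"unsupported invoice format: {src.suffix}")
```

Passing the converter in as an argument keeps the sketch independent of any one library and makes the dispatch easy to test with a stub.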
4.3 Format Selector

The software maintains a ‘formats’ directory, where it stores all the known invoice formats. These formats are png images in which all the fields required for extraction have been annotated. On receiving the png file from the previous step, the software compares the image with all existing formats using template matching, which yields a similarity score for how well each format matches the image. The best match is selected and returned. In this system, we use the company logo as the template: the system compares the incoming image with the available templates (logos). If the format is present in the knowledge set, the system proceeds to data extraction; otherwise, it asks the user to annotate the unknown format. The score returned by the template matching algorithm expresses the similarity between the incoming image and the template. For our system, a format is considered known only if the score is at least 95%; any lower score is discarded. If no format reaches this threshold, the software asks the user to manually annotate the image once. To do this, the software opens a GUI in which the user annotates all the required fields of the invoice. Once this is done, the image is stored in the formats folder as a template for future matching of invoices of this format. The annotation generates an XML file, which is stored in the XML directory along with the XML files of all other templates. An example of the generated XML file can be seen in Fig. 5. Once the format is selected, the file is sent to the next stage for data extraction.
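The selection rule can be illustrated with a small pure-Python sketch. It is an assumption-laden stand-in: the text does not say which similarity measure is used, so zero-mean normalized cross-correlation is swapped in here, images are plain 2D lists of gray values, and the names `ncc`, `best_match_score`, and `select_format` are hypothetical.

```python
from math import sqrt


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = sqrt(sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t))
    return num / den if den else 0.0


def best_match_score(image, template):
    """Slide the template over the image and return the highest score."""
    th, tw = len(template), len(template[0])
    best = -1.0
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            best = max(best, ncc(patch, template))
    return best


def select_format(image, logos, threshold=0.95):
    """Return the best-matching format name, or None if the format is unknown."""
    scores = {name: best_match_score(image, logo) for name, logo in logos.items()}
    if not scores:
        return None
    name = max(scores, key=scores.get)
    return name if scores[name] >= threshold else None
```

A return value of `None` corresponds to the case where the user is asked to annotate the invoice. A production system would more likely rely on an optimized library routine than on this brute-force loop.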
4.4 Adding Unknown Format

The ability to add unknown formats makes the invoice system versatile. Each newly added format enlarges the knowledge set, so the system handles that format without user input when it appears again. In the format selection stage, if no format matches the image file, the software automatically opens a GUI in which the user manually annotates all the required fields in the image. The user draws rectangles on the image to mark each region of interest (ROI) and names the fields. This is done using LabelImg, which is integrated into the software. When the user saves the file after annotating, an XML file is generated that contains all the marked fields and the locations of their bounding boxes as key-value pairs. These bounding box locations are further used to extract the
Fig. 5 Generated XML file from the image
information once a format is matched with the image. The XML file is stored in the ‘XML’ directory and the png file is stored in the ‘formats’ directory. The XML file can contain as many annotations as the user desires. For example, a user who needs only the date can annotate just that one field for the format. Note that exactly the annotated fields will be extracted from future invoices of this format.
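Reading the bounding boxes back from such a file can be sketched as follows, assuming the Pascal VOC layout that LabelImg writes by default (`object`, `name`, `bndbox`, `xmin` and so on). The function `read_fields`, the sample file name, and the field names are hypothetical and are not taken from Fig. 5.

```python
import xml.etree.ElementTree as ET


def read_fields(xml_text):
    """Map each annotated field name to its (xmin, ymin, xmax, ymax) box."""
    root = ET.fromstring(xml_text)
    fields = {}
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        fields[name] = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
    return fields


# Illustrative annotation with two fields; the values are made up.
SAMPLE = """<annotation>
  <filename>format_01.png</filename>
  <object>
    <name>invoice_number</name>
    <bndbox><xmin>410</xmin><ymin>52</ymin><xmax>560</xmax><ymax>80</ymax></bndbox>
  </object>
  <object>
    <name>date</name>
    <bndbox><xmin>410</xmin><ymin>90</ymin><xmax>560</xmax><ymax>118</ymax></bndbox>
  </object>
</annotation>"""
```

Calling `read_fields(SAMPLE)` yields one entry per annotated field, which is the information the extraction stage needs for cropping.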
4.5 Invoice Status

All processing apart from adding a format is done internally, so the user cannot see what is happening in the background. To keep the user informed, we have created a GUI as shown in Fig. 7. It keeps the user updated about
Fig. 6 Annotating new formats
the processing of the invoices. It lists all the invoices and their current status. An invoice can be in one of three states: Queued, Processing, or Done. ‘Queued’ means that the invoice is waiting in line and has yet to be processed. ‘Processing’ means that the invoice is being converted and a suitable format is being searched for; if none is found, the user is asked to annotate the invoice. ‘Done’ means that the invoice has been processed and its output has been generated.
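The three states can be modeled with a small state tracker. This is a sketch only: the text names the states but not the transitions, so the strictly forward order in `NEXT` is an assumption, and `Status` and `StatusBoard` are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    QUEUED = "Queued"
    PROCESSING = "Processing"
    DONE = "Done"


# Allowed forward transitions; an assumption, the text only names the states.
NEXT = {Status.QUEUED: Status.PROCESSING, Status.PROCESSING: Status.DONE}


class StatusBoard:
    """Tracks the current status of every invoice shown in the status window."""

    def __init__(self, invoice_names):
        self.status = {name: Status.QUEUED for name in invoice_names}

    def advance(self, name):
        """Move one invoice to its next state and return the new state."""
        current = self.status[name]
        if current not in NEXT:
            raise ValueError(f"{name} is already done")
        self.status[name] = NEXT[current]
        return self.status[name]
```

The GUI would then only need to render `board.status` after each call to `advance`.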
4.6 Extracting Information

After the format is selected, the image comes to this stage. The software fetches the XML file of the corresponding format and, for each field, crops the image using the bounding box of that field. This is repeated for all the fields present in the XML file. Each cropped image is fed to the OCR engine to extract its text, and the extracted text is stored in a list. In the end, this list contains all the information extracted using the bounding box coordinates of the various fields. For better results, we also apply regular expressions to ensure that the extracted data is in the correct format. For example, the regex for dates accepts only text that fulfills all the requirements of a date. Dates in the formats dd/mm/yyyy, dd.mm.yyyy, and dd-mm-yyyy are recognized by our software. The list is passed to the next stage of the pipeline to generate the output.
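A date filter of this kind can be sketched with the standard library. The pattern below is an illustration rather than the authors' actual regex; requiring the same separator in both positions and rejecting impossible calendar dates are assumptions added here, and `extract_date` is a hypothetical name.

```python
import re
from datetime import datetime

# Same separator must appear twice: dd/mm/yyyy, dd.mm.yyyy or dd-mm-yyyy.
DATE_RE = re.compile(r"\b(\d{2})([/.\-])(\d{2})\2(\d{4})\b")


def extract_date(ocr_text):
    """Return the first valid date in ocr_text as a datetime.date, else None."""
    for match in DATE_RE.finditer(ocr_text):
        day, _, month, year = match.groups()
        try:
            return datetime(int(year), int(month), int(day)).date()
        except ValueError:
            continue  # e.g. 31/02/2021 looks like a date but is not one
    return None
```

Validating through `datetime` catches OCR output that has the right shape but is not a real date, which a pattern alone would accept.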
Fig. 7 Invoice status
4.7 Writing Information in Files

In this stage, a list containing all the required field values is received. The software writes the data into an Excel file with a predefined format. It can also generate a CSV file or a plain text file, as required by the user. Some of the modules used to generate the output file are as follows:

• openpyxl
• shutil
• os
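The CSV variant of this step can be sketched with Python's standard `csv` module, which is a substitution made here to keep the example dependency-free (the Excel path through openpyxl is not shown). The function `write_rows` and the field names in the usage note are hypothetical.

```python
import csv


def write_rows(rows, fieldnames, stream):
    """Write one CSV line per invoice; fields that were not extracted stay empty."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow({key: row.get(key, "") for key in fieldnames})
```

For example, `write_rows([{"invoice_number": "INV-7"}], ["invoice_number", "date"], open_file)` writes a header line followed by one data line with an empty date column. When writing to a real file, the `csv` documentation recommends opening it with `newline=""`.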
5 Result and Discussions

Our proposed system can work with almost all invoice formats. Image processing techniques such as gray scaling, erosion, and dilation are applied to reduce errors due to noise. To make the system more user friendly, we have created a GUI for the user as shown in Fig. 3, in which the user specifies all the directories. For the data extraction process, each image is classified as a known or an unknown format. In the case of a new format, the user can easily annotate the image as shown in Fig. 6. The bounding box coordinates obtained by annotation are saved in an XML file and can be reused for future invoices of a similar format. When the software encounters a known invoice image, it reads the bounding box coordinates from the XML file. Identification of the ROI, followed by data extraction by the OCR engine, can then be done in minimal time. All these processes happen internally. We have also created a GUI, shown in Fig. 7, that keeps the user updated about the internal processing. While an invoice has not yet been processed, the user sees ‘Queued’ next to it. While it is being processed, the user sees ‘Processing’ next to the invoice name. After the invoice is done and the output is saved in the output directory, the user sees ‘Done’ next to the invoice name. The output is placed in an Excel sheet to give the user a structured format. Figure 8 shows an example of the structured output in an Excel sheet. The structured output can be processed by the hosting company for analysis, marketing, etc. It also helps companies keep their data up to date. The limitations of our work include errors due to pixelated images and OCR. Currently, the software works with pdf and png formats only. It can be extended to work with more formats in a future edition.
Fig. 8 Structured output
6 Conclusion

With our proposed model, e-commerce companies can work with multiple invoice formats with ease. Our user-friendly GUI lets the user extract scattered data from invoice pdfs and images. We have tried to provide a solution to a large-scale problem with minimum time and maximum accuracy. All the processing is done internally, so the user has a smooth experience. Our software also lets users add any new format and thereby improve the database. The output can easily be read by a computer for further processing and analysis. The same model can be extended and used in other fields such as medicine and Web pages. Because its architecture is easy to interpret, the software can be employed in multiple domains. Further work can add other input formats such as jpeg, jpg, and doc. More image processing techniques can also be used to reduce the noise in the image and obtain more accurate data. We can also incorporate an invoice classification system that separates handwritten from machine printed invoices, as the two types require different kinds of processing before data extraction.
References

1. Y. Sun, X.F. Mao, S. Hong, W. Xu, G. Gui, Template matching-based method for intelligent invoice information identification. IEEE Access 7 (2019)
2. Y. Belaïd, A. Belaïd, Morphological tagging approach in document analysis of invoices, in 17th International Conference on Pattern Recognition, vol. 1 (2004)
3. R.B. Palm, O. Winther, F. Laws, CloudScan: a configuration-free invoice analysis system using recurrent neural networks, in International Conference on Document Analysis and Recognition (ICDAR), vol. 1 (2017)
4. D. Schuster, K. Muthmann, D. Esser, A. Schill, M. Berger, C. Weidling, K. Aliyev, A. Hofmeier, Intellix: end-user trained information extraction for document archiving, in 12th International Conference on Document Analysis and Recognition (2013)
5. H. Sidhwa, S. Kulshrestha, S. Malhotra, S. Virmani, Text extraction from bills and invoices, in International Conference on Advances in Computing, Communication Control and Networking (ICACCCN) (2018)
6. Y. Sun, J. Zhang, Y. Meng, J. Yang, G. Gui, Smart phone based intelligent invoice classification method using deep learning. IEEE Access 7 (2019)
7. R.R. Palekar, S.U. Parab, D.P. Parikh, V.N. Kamble, Real time license plate detection using OpenCV and tesseract, in International Conference on Communication and Signal Processing (ICCSP) (2017)
8. J. Blanchard, Y. Belaïd, A. Belaïd, Automatic generation of a custom corpora for invoice analysis and recognition, in International Conference on Document Analysis and Recognition Workshops (ICDARW), vol. 7 (2019)
9. H. Hamza, Y. Belaïd, A. Belaïd, A case-based reasoning approach for invoice structure extraction, in Ninth International Conference on Document Analysis and Recognition (ICDAR), vol. 1 (2007)
10. D. Ming, J. Liu, J. Tian, The design and implementation of a Chinese financial invoice recognition system, in International Symposium on VIPromCom Video/Image Processing and Multimedia Communications (2002)
11. M. Rusiñol, T. Benkhelfallah, V.P. d'Andecy, Field extraction from administrative documents by incremental structural templates, in 12th International Conference on Document Analysis and Recognition (2013)
12. J. Zhang, F. Ren, H. Ni, Z. Zhang, K. Wang, Research on information recognition of VAT invoice based on computer vision, in 6th International Conference on Cloud Computing and Intelligence Systems (CCIS) (2019)
13. A.R. Dengel, B. Klein, SmartFIX: a requirements-driven system for document analysis and understanding, in 5th International Workshop on Document Analysis Systems, vol. 5 (2002), pp. 433–444
BEVDS: A Blockchain Model for Multiparty Authentication of COVID-19 Vaccine Beneficiary Tejaswi Khanna, Parma Nand, and Vikram Bali
Abstract

COVID-19 vaccines have been approved for public immunization while still going through clinical trials. Vaccinating the billion people of India is a huge challenge. Tracing, last mile delivery, data privacy, double spend of dosage, and multiparty authorization pose serious challenges for providing COVID-19 vaccines. Policies have been framed on who gets access to the vaccine. However, a technological solution is needed to create an efficient vaccine distribution or delivery system. This paper proposes a novel blockchain enabled vaccine delivery system in the context of India. A consortium blockchain is proposed with different participants, such as the UIDAI, healthcare facilities, and the pharmaceutical supply chain. These participants are responsible for authenticating the beneficiary of the COVID-19 vaccine. A sequence diagram of how the beneficiary is authorized to access the vaccine is also proposed.

Keywords Aadhaar · Vaccine distribution system · Immunization · Vaccine double spend · COVID-19
T. Khanna (B) · P. Nand
Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, India

V. Bali
Department of Computer Science and Engineering, JSS Academy of Technical Education, Bengaluru, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_63

1 Introduction

In the first week of December 2020, the Government of India announced that COVID-19 vaccination had been approved for the nationwide immunization process. The Serum Institute of India has tied up with UK-based AstraZeneca for the supply of one billion doses, as observed by Roope et al. [1]. This type of vaccine will be delivered at 2 °C, so managing vaccine cold chains is a serious concern. Also, the supply of the vaccine is limited, while the number of beneficiaries will be several times larger. It is important to have a vaccine delivery system that can (a) provide transparency, (b) ensure single dosage spend, and (c) track beneficiaries. COVID-19 has exposed the limitations of intermediaries of trust in healthcare. Khurshid [2] has reviewed the potential of blockchain for addressing this issue of trust. The distributed governance structure of blockchain can be used to create “trustless” systems that help maintain privacy and address public health needs in the fight against COVID-19. This paper proposes a novel vaccine delivery system based on blockchain technology. Blockchain technology is used here because this delivery system can be categorized as a case of multiparty authorization or multiparty computation. Referring to the framework proposed by Wust and Gervais [3], the authors of this paper have determined that the delivery of COVID-19 vaccinations can make use of blockchain. The authentication of beneficiaries requires multiple participants who are not interconnected with each other. In an emergency such as a pandemic, trusting a single third party to manage these authentications is questionable. All the vaccinations provided to the beneficiaries need to be accounted for and verified. Tracking mechanisms also need to be implemented for administering the second shot and for ensuring that there is no double spend of a vaccination dose. A public permissioned blockchain can be utilized in this case. With its potential to combat cyber threats, as showcased by Smys and Haoxiang [4], blockchain technology is considered a better alternative for keeping data privacy and security intact. The remainder of this paper is organized as follows: Sect. 2 sheds light on who gets access to the COVID-19 vaccine and how the Government is setting up guidelines for the same. Related works on blockchain implementations for mitigating coronavirus are discussed in Sect. 3. Next, the blockchain enabled vaccine delivery system is proposed in Sect. 4, together with its architecture diagram and sequence diagram. Section 5 concludes the paper.
2 Background: Who Gets Access to the Vaccine?

Vaccines are temperature sensitive and must be kept at certain temperatures. The cold chain system consists of 85,634 pieces of functional equipment for vaccine storage, including deep freezers, walk-in refrigerators, and coolers. They will be used in the health ministry’s universal immunization campaign (UIP). The current cold chain system can store the additional amount of COVID-19 vaccine required for the first three crore healthcare workers and frontline workers. As a result, there must be a mechanism in place to govern vaccine distribution. The distribution or rollout of vaccination requires an optimized system, as observed in the case of Israel by Mckee and Rajan [5]. The Government of India has formed a National Expert Group on Vaccine Administration for COVID-19 (NEGVAC) [6] for the following purposes:
• Identifying population groups’ precedence
• Vaccine inventory management and tracking
• Monitoring of implementation processes
• Identification of vaccine delivery platforms

Fig. 1 Structure of governance mechanism for COVID-19 vaccination [6]
The electronic vaccine intelligence network (eVIN) stores and monitors vaccine stocks and the cold chain temperature. Figure 1 shows the structure of the governance mechanism for COVID-19 vaccination. The COVID-19 vaccination beneficiary management system (CVBMS) is being created for tracking beneficiaries who have received a COVID-19 vaccine. This tracking ensures that there is no double spending of the vaccine dosage. Creating beneficiary databases within the CVBMS streamlines the tracking required for the subsequent vaccine dose (the second shot). However, these databases need to be integrated in order to authorize a beneficiary to get access to the vaccine. WHO has recommended vaccination priorities for healthcare workers, essential service providers, and high-risk individuals in a pandemic. For healthcare workers (HCW), the CVBMS needs to provide a coordination mechanism between state, district, and medical facilities.
3 Related Work

The COVID-19 pandemic has exposed the limitations of the current healthcare system with regard to timely and efficient mechanisms for handling a public medical emergency. Ahmad et al. [7] are of the view that healthcare systems, being centralized, can see technological disruption in the form of blockchain technology. Researchers [8–10] have prepared comprehensive reviews summarizing various blockchain applications proposed or implemented to mitigate COVID-19 challenges. Nguyen et al. [11] included artificial intelligence as another technological driver that can be used to combat COVID-19. Blockchain enables early detection, whereas AI can be used to identify disease symptoms and to support vaccine manufacturing. A blockchain-based robust supply chain can ensure that the vaccine reaches all the beneficiaries. Along with blockchain, Chamola et al.
[12] observed that allied technologies like IoT, UAVs, and 5G can also become significant in delivering the vaccines in the correct condition. Table 1 gives a summarized view of the blockchain-based studies proposed to combat COVID-19. An extensive study by Kalla et al. [13] provided a panoramic view of blockchain enabled use cases in the fight against COVID-19. These are:

• contact tracing,
• agriculture and food distribution,
• contactless delivery,
• disaster relief and insurance,
• online education,
• e-government,
• patient information sharing,
• immigration and emigration processes,
• supply chain management.
Different information systems can be developed based on blockchain technology. Azim et al. [14] have studied a blockchain-based pandemic health record management system to help fight COVID-19 and similar pandemics. They observed that blockchain can provide accuracy, transparency, and immutability in storing the transactional data of pandemic victims. There is also an opportunity to develop a data tracking system for COVID-19. Marbouh et al. [15] have implemented an Ethereum-based system that tracks data on the number of new cases, deaths, and recovered cases. This system ensures data integrity, security, transparency, and data traceability among stakeholders. Contact tracing is a significant measure for controlling the spread of COVID-19 and similar pandemics. Arifeen et al. [16] have proposed a blockchain-based framework to provide contact tracing information. Rimsan et al. [17] have used blockchain to track COVID-19 infected or tested patients globally. Using blockchain, smart contracts, and Bluetooth, Song et al. [18] have implemented contact tracing services that also protect the user’s privacy. Addressing user privacy, Xu et al. [19] have proposed BeepTrace, a contact tracing application based on blockchain. They have defined the parties involved and explained their roles and interfaces. Immunity certificates and digital medical passports have also been developed using blockchain. Hasan et al., Bansal et al., and Chaudhari et al. [20–22] have explored the possibility of storing the COVID-19 vaccination details of individuals on a blockchain. This allows public verification of vaccination records without sharing any other personal information. Tsoi et al. [23] have applied blockchain to provide COVID-19 vaccine passports with the main aim of preserving patient data privacy. They also considered the potential of blockchain for contact tracing and vaccine efficacy monitoring. Regarding pharmaceutical supply chain management, Khanna et al.
[24] have proposed a permissioned blockchain-based framework for efficient drug trackability. Ramirez Lopez and Beltrán Álvarez [25] have designed a comprehensive blockchain model for the distribution chain of the COVID-19 vaccine. Such a blockchain model can be used to establish the provenance of a drug, which in turn helps combat drug counterfeiting [26].
Table 1 Blockchain-based proposed studies for combating COVID-19

| S. no. | Author, year | Publisher | Proposed study | Challenges faced/solved |
|---|---|---|---|---|
| 1 | Ahmad et al., 2020 | IEEE | Review on applications and challenges | Preservation of user data privacy; development of lightweight blockchain platforms is required |
| 2 | Sharma et al., 2020 | Springer | Review on applications and challenges | Energy consumption; scalability; privacy preservation |
| 3 | Ahir et al., 2020 | IEEE | Review on applications and challenges | Data privacy preservation |
| 4 | Abd-alrazaq et al., 2021 | Elsevier | Review on applications and challenges | Performance of COVID-19 blockchain technologies in terms of transaction cost, scalability, and/or latency |
| 5 | Nguyen et al., 2020 | Preprints | AI and blockchain as a combined force to combat COVID-19 | Lightweight blockchain design in healthcare; blockchain network latency |
| 6 | Chamola et al., 2020 | IEEE | Roles of allied technologies like IoT, UAVs, and 5G for vaccine delivery | Review paper, so no limitations reported |
| 7 | Kalla et al., 2020 | IEEE | Blockchain enabled use cases to fight against COVID-19 | Legal, security, privacy, latency, throughput, scalability, and resource utilization issues |
| 8 | Amirul Azim et al., 2020 | Zenodo | Blockchain-based pandemic health record management system to fight against COVID-19 | Storing, sharing, and accessing healthcare data policies |
| 9 | Marbouh et al., 2020 | Springer | Ethereum-based data tracking system for COVID-19 | Scalability, selfish mining, legal issues, and privacy concerns |
| 10 | Arifeen et al., 2020 | Preprints | Blockchain-based contact tracing | Centralized server problem for contact tracing applications |
| 11 | Rimsan et al., 2020 | IEEE | Blockchain-based contact tracing | Adoption and implementation of blockchain for tracking the corona virus infected patients globally |
| 12 | Song et al., 2020 | arXiv | Blockchain-based contact tracing | None specified |
| 13 | Xu et al., 2021 | IEEE | Blockchain-based contact tracing | Network throughput and scalability, battery drainage and storage optimization, security considerations of private key exchange, economical, and social aspects |
| 14 | Hasan et al., 2020 | IEEE | Blockchain-based immunity certificates and digital medical passports | Data confidentiality and privacy |
| 15 | Bansal et al., 2020 | Springer | Blockchain-based immunity certificates and digital medical passports | Social division between the licensed and unlicensed people |
| 16 | Chaudhari et al., 2021 | arXiv | Blockchain-based immunity certificates and digital medical passports | Choice of suitable blockchain, security analysis |
| 17 | Tsoi et al., 2021 | BMJ journals | Blockchain-based vaccine passports and contact tracing | Data provenance, security, integrity, access control, and interoperability |
| 18 | Khanna et al., 2020 | IGI Global | Blockchain-based drug trackability | Implementation challenges, supply chain integration |
| 19 | Ramirez Lopez and Beltrán Álvarez, 2020 | Sagepub | Blockchain design for COVID-19 vaccine distribution chain | Impersonation and corruption of identity |
| 20 | Musamih et al., 2021 | IEEE | Blockchain-based COVID-19 vaccine traceability | The use of a public permissioned Ethereum blockchain will require the users to spend Ether to execute their functions; scalability; and interoperability issues |
| 21 | Rotbi et al., 2021 | arXiv | Blockchain system to manage the registration, storage, and distribution of the vaccines | Authenticity and ethical behavior of the system participants, carbon emissions of blockchain mining |
| 22 | Bali et al., 2021 | IGI Global | Blockchain-based drug traceability | Drug counterfeiting, data schema relevant for traceability |
| 23 | Awasthi et al., 2021 | arXiv | Drug distribution policy | Identifying the part of the population who needs the vaccine |
| 24 | Shamsi Gamchi et al., 2021 | Springer | Vehicle routing protocol for reducing vaccine cost | Prioritizing the individuals for receiving vaccines |
Blockchain-based vaccine distribution has been addressed by Musamih et al. [27], who developed smart contracts for Ethereum. Their solution automates the traceability of COVID-19 vaccines while ensuring data provenance, transparency, security, and accountability. Rotbi et al. [28] propose a blockchain-based system to manage the registration, storage, and distribution of the vaccines, with the aim of an efficient vaccination campaign. Vaccine distribution has also been discussed by Awasthi et al. [29], who proposed a novel distribution policy model, VacSIM. They used reinforcement learning to address the logistical challenges that need to be taken into consideration while distributing the vaccine, and SEIR-based projections to compute the best possible distribution. The SIR epidemic model has been used by Shamsi Gamchi et al. [30] to model the post-disaster situation and apply vaccination as a control tool to cope with infectious diseases. They propose a novel bi-objective vehicle routing protocol which aims to minimize the social cost of vaccine distribution. The solution is developed using the weighted augmented ε-constraint method, optimal control theory, and dynamic programming.
4 Proposed Blockchain Enabled Vaccine Delivery System

Vaccine delivery can be viewed as a problem of multiparty authentication: the vaccine beneficiary must be authenticated by several different authorities. Building on the issues observed in the related work, this paper proposes a mechanism that uses a permissioned blockchain to strengthen the vaccine supply chain. The blockchain can be developed using Hyperledger Fabric, Ethereum, or R3 Corda, as analyzed by Polge et al. [31]. The mechanism can also be used to manage the identity of the persons who are being vaccinated; each such person is viewed as a beneficiary in the system. With this mechanism, it can be ensured that the same person is not vaccinated twice. The result is a blockchain enabled vaccine delivery system (BEVDS) that leverages blockchain’s immutable records of transactions. When a beneficiary gets access to the vaccine, the event is recorded in the blockchain. This is essential for preventing double access to the vaccine. The architecture diagram is shown in Fig. 2.
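The double-access check can be illustrated with a toy hash-chained log. This is a single-process sketch under strong assumptions, not a consortium blockchain and not the BEVDS implementation: there is no consensus, networking, or permissioning, and the class `DoseLedger`, its methods, and the choice of hashing the beneficiary identifier are all hypothetical.

```python
import hashlib
import json


def _digest(payload):
    """SHA-256 over a canonical JSON encoding of the block body."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


class DoseLedger:
    """Append-only, hash-chained log of administered doses (single node)."""

    def __init__(self):
        self.blocks = []
        self._seen = set()

    def record_dose(self, beneficiary_id, dose_number):
        key = (beneficiary_id, dose_number)
        if key in self._seen:
            raise ValueError("dose already recorded for this beneficiary")
        body = {
            # Store only a hash so the raw identifier is not written to the log.
            "beneficiary": hashlib.sha256(beneficiary_id.encode()).hexdigest(),
            "dose": dose_number,
            "prev": self.blocks[-1]["hash"] if self.blocks else "0" * 64,
        }
        block = dict(body, hash=_digest(body))
        self.blocks.append(block)
        self._seen.add(key)
        return block

    def is_valid(self):
        """Recompute every hash and link to detect tampering."""
        prev = "0" * 64
        for block in self.blocks:
            body = {k: block[k] for k in ("beneficiary", "dose", "prev")}
            if block["prev"] != prev or block["hash"] != _digest(body):
                return False
            prev = block["hash"]
        return True
```

Recording the same dose twice for one identifier raises an error, and editing any stored block makes `is_valid` return `False`, which is the immutability property the text relies on.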
Fig. 2 Architecture diagram of blockchain enabled vaccine delivery system (BEVDS)
BEVDS: A Blockchain Model for Multiparty Authentication …
BEVDS should be implemented at the state level and can be further subdivided into district levels. It enables the government medical department to perform multiparty identification of beneficiaries, since blockchain supports multiparty transactions with high transparency. WHO has recommended vaccination priorities for healthcare workers, essential service providers, and high-risk individuals in a pandemic. For healthcare workers (HCWs), CVBMS needs to provide a coordination mechanism between state, district, and medical facilities; BEVDS can provide this coordination efficiently. The data of HCWs will be collected and compiled at the district level and connected using the blockchain. For essential service providers and high-risk individuals, the data need to be verified by government offices. For instance, in the case of COVID-19, high-risk individuals are those older than 60 years or those with health conditions such as lung or heart disease, diabetes, or conditions that affect the immune system. The age of these individuals needs to be verified using the UIDAI (Aadhaar) database, and their medical conditions can be verified from their electronic healthcare records (EHRs). The architecture of the BEVDS platform is shown in Fig. 2. It consists of a consortium blockchain in which the vaccinating agency gets information about the vaccine from a blockchain enabled vaccine supply chain. A beneficiary requiring vaccination must provide information such as Aadhaar number, age, category (HCW, police personnel, bank employee, etc.), health condition, and whether they have contracted COVID-19. The BEVDS platform is proposed to be a consortium blockchain that will integrate:
• Blockchain enabled pharmaceutical supply chains
• UIDAI database
• Healthcare facilities
• Healthcare records
• Vaccinating agencies
The platform will allow vaccinating agencies to easily verify the identity of the beneficiary. By enforcing nonrepudiation, the blockchain will enable the agencies to make sure that no person is vaccinated twice. The sequence of verifications proposed in this system is shown in Fig. 3. It starts with the beneficiary requesting a vaccine from the vaccinating agency and providing all details to it. The data are then sent to the blockchain, where the age and UID are verified using the UIDAI database. The category is verified from the data provided for healthcare facilities, frontline workers, and essential services. Then, the health condition is verified from the health records, which also shows whether the beneficiary is COVID positive. After all the responsible participants have provided their authorizations, the blockchain-based pharmaceutical supply chain provides the required vaccine data, and the blockchain maps it to the beneficiary.
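The verification sequence described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the paper leaves for future work: the `Request` and `Ledger` classes, the `request_vaccine` function, the dictionary stand-ins for the UIDAI database, category data, and health records, and the batch identifiers are all hypothetical names introduced for illustration. An in-memory list stands in for the consortium blockchain.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Request:
    """Details a beneficiary submits to the vaccinating agency (hypothetical schema)."""
    uid: str
    age: int
    category: str          # e.g. "HCW" or "high_risk"
    health_condition: str
    covid_positive: bool


@dataclass
class Ledger:
    """Append-only list standing in for the consortium blockchain."""
    records: list = field(default_factory=list)

    def has_dose(self, uid: str, dose: int) -> bool:
        return any(u == uid and d == dose for u, d, _ in self.records)


def request_vaccine(req, dose, ledger, uidai, categories, ehr, batch):
    """Run each party's check in order; record the dose only if all pass."""
    checks = [
        ("UIDAI", uidai.get(req.uid) == req.age),
        ("category", categories.get(req.uid) == req.category),
        ("health record",
         ehr.get(req.uid) == (req.health_condition, req.covid_positive)),
        ("ledger", not ledger.has_dose(req.uid, dose)),  # blocks double access
    ]
    for party, ok in checks:
        if not ok:
            return f"rejected by {party}"
    # Map the vaccine batch from the supply chain to the beneficiary.
    ledger.records.append((req.uid, dose, batch))
    return "approved"
```

In this sketch, a second request for the same dose by the same UID is rejected at the ledger check, which mirrors the double-access prevention that the paper attributes to the immutable transaction record.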
Fig. 3 Sequence diagram of the vaccine delivery system
5 Discussion
The Indian health ministry plans to receive and utilize around 500 million doses of COVID-19 vaccines and provide them to up to 250 million people by July 2021. Because immunization is a critical process, a platform needs to be developed that can support around 10 million vaccinations per day, so the efficiency and speed of the system are paramount. With fewer vaccine doses than people requiring them, there is a chance of corruption in providing access to vaccines. In the COVID-19 vaccine distribution system, it is important that vaccinating agencies have full information about which beneficiaries have been vaccinated and that a digital certificate is provided. In the case of India, where a two-dose vaccine is used, a provisional digital certificate will be generated after the first dose, and after the stipulated duration, a reminder can be sent by email or message for that person to receive the second dose. After that, a
final certificate of immunization can be tendered to the beneficiary. Authentication and authorization can be performed using the proposed BEVDS system. The novel vaccine delivery system can be vital for combating pandemics like COVID-19. Multiparty authorization for providing access to beneficiaries is facilitated by blockchain technology. Because of occupational exposure, healthcare workers (HCWs) are at an elevated risk of contracting COVID-19. Using BEVDS, the national COVID-19 vaccine cell can provide a better vaccination procedure for all HCWs and other categories of beneficiaries. Compared to other information systems, this blockchain enabled system can provide authentication in a faster and more efficient way. Once all the beneficiaries' data are integrated, vaccine dissemination should happen in a streamlined manner. The proposed work is accompanied by an architecture diagram and a sequence diagram that specify all the participants involved in multiparty authentication of the vaccine beneficiary. The implementation details, along with the smart contracts required for the authentication to work on the blockchain consortium, are to be researched further. A web or mobile application UI can be used to gather data from the beneficiaries. The blockchain will be responsible for authenticating and authorizing the beneficiaries to access the vaccines. The same application will also be used to track the beneficiaries and notify them about the second dose, so that there is no double spend of vaccine dosage. Using medical sensor nodes along with edge computing [32], the proposed model can be used to distribute vaccines to remote locations, providing last mile delivery of vaccines.
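The two-dose certificate flow discussed in this section can be sketched as a small Python function. This is an illustration under stated assumptions: the function name `certificate_status`, the returned dictionary keys, and the `gap_days` parameter (standing in for the "stipulated duration", whose value the paper does not give) are hypothetical.

```python
from datetime import date, timedelta


def certificate_status(dose_dates, today, gap_days):
    """Derive the certificate state from the dose dates recorded on the ledger.

    dose_dates: dates on which doses were recorded for one beneficiary.
    gap_days:   assumed waiting period between doses (hypothetical parameter).
    """
    if not dose_dates:
        return {"certificate": None, "send_reminder": False}
    if len(dose_dates) == 1:
        due = dose_dates[0] + timedelta(days=gap_days)
        # Provisional certificate; remind once the second dose is due.
        return {"certificate": "provisional", "send_reminder": today >= due}
    return {"certificate": "final", "send_reminder": False}
```

Deriving the certificate state from the recorded doses, instead of storing it separately, keeps the ledger as the single source of truth in this sketch.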
6 Conclusion
A novel vaccine delivery system has been proposed in this paper. A consortium blockchain has been presented that will provide an efficient vaccination mechanism. The system can track beneficiaries until both shots of the COVID-19 vaccine have been administered. Blockchain has been chosen because of its inherent capability to prohibit double spending of vaccine dosage. Multiparty authorization of the beneficiary and patient tracking are also facilitated by the technology. One challenge that this system may encounter is adoption by different state departments: since they have to provide data for multiparty authorization, a relevant data schema must be created. Another set of challenges concerns implementation. Hyperledger Indy or Fabric can be used to implement the system, and Corda is another option; a consortium blockchain can be developed on any of these platforms. However, chain code development for providing real-time access to authorizing data is of paramount importance.
Data privacy is another challenge. Although the inherent design of blockchain promises to keep data secure, privacy policies and government regulations with regard to sharing medical data have yet to be formulated.
References
1. L.S.J. Roope, J. Buckell, F. Becker, P. Candio, M. Violato, J.L. Sindelar, A. Barnett, R. Duch, P.M. Clarke, How should a safe and effective COVID-19 vaccine be allocated? health economists need to be ready to take the baton. Pharmaco. Econ. Open. 4, 557–561 (2020). https://doi.org/10.1007/s41669-020-00228-5
2. A. Khurshid, Applying blockchain technology to address the crisis of trust during the COVID-19 pandemic. JMIR Med. Inform. 8, e20477 (2020). https://doi.org/10.2196/20477
3. K. Wust, A. Gervais, Do you need a blockchain? in 2018 Crypto Valley Conference on Blockchain Technology (CVCBT). IEEE, Zug (2018), pp. 45–54. https://doi.org/10.1109/CVCBT.2018.00011
4. S. Smys, W. Haoxiang, Data Elimination on repetition using a blockchain based cyber threat intelligence. IRO J. Sustain. Wirel. Syst. 2, 149–154 (2021). https://doi.org/10.36548/jsws.2020.4.002
5. M. Mckee, S. Rajan, What can we learn from Israel's rapid roll out of COVID 19 vaccination? Isr. J. Health Policy Res. 10 (2021). https://doi.org/10.1186/s13584-021-00441-5
6. COVID-19 vaccines operational guidelines, https://www.mohfw.gov.in/pdf/COVID19VaccineOG111Chapter16.pdf
7. R.W. Ahmad, K. Salah, R. Jayaraman, I. Yaqoob, S. Ellahham, M. Omar, Blockchain and COVID-19 pandemic: applications and challenges (2020). https://doi.org/10.36227/techrxiv.12936572.v1
8. A. Sharma, S. Bahl, A.K. Bagha, M. Javaid, D.K. Shukla, A. Haleem, Blockchain technology and its applications to combat COVID-19 pandemic. Res. Biomed. Eng. (2020). https://doi.org/10.1007/s42600-020-00106-3
9. A.A. Abd-alrazaq, M. Alajlani, D. Alhuwail, A. Erbad, A. Giannicchi, Z. Shah, M. Hamdi, M. Househ, Blockchain technologies to mitigate COVID-19 challenges: a scoping review. Comput. Methods Programs Biomed. Update. 1, 100001 (2021). https://doi.org/10.1016/j.cmpbup.2020.100001
10. S. Ahir, D. Telavane, R. Thomas, The impact of artificial intelligence, blockchain, big data and evolving technologies in coronavirus disease - 2019 (COVID-19) curtailment. in 2020 International Conference on Smart Electronics and Communication (ICOSEC). IEEE, Trichy, India (2020), pp. 113–120. https://doi.org/10.1109/ICOSEC49089.2020.9215294
11. D.C. Nguyen, M. Dinh, P.N. Pathirana, A. Seneviratne, Blockchain and AI-based solutions to combat coronavirus (COVID-19)-like epidemics: a survey. Med. Pharmacol. (2020). https://doi.org/10.20944/preprints202004.0325.v1
12. V. Chamola, V. Hassija, V. Gupta, M. Guizani, A comprehensive review of the COVID-19 pandemic and the role of IoT, drones, AI, blockchain, and 5G in managing its impact. IEEE Access 8, 90225–90265 (2020). https://doi.org/10.1109/ACCESS.2020.2992341
13. A. Kalla, T. Hewa, R.A. Mishra, M. Ylianttila, M. Liyanage, The role of blockchain to fight against COVID-19. IEEE Eng. Manag. Rev. 48, 85–96 (2020). https://doi.org/10.1109/EMR.2020.3014052
14. A. Azim, M.N. Islam, P.E. Spranger, Blockchain and novel coronavirus: towards preventing COVID-19 and future pandemics (2020). https://doi.org/10.5281/ZENODO.3779244
15. D. Marbouh, T. Abbasi, F. Maasmi, I.A. Omar, M.S. Debe, K. Salah, R. Jayaraman, S. Ellahham, Blockchain for COVID-19: review, opportunities, and a trusted tracking system. Arab. J. Sci. Eng. 45, 9895–9911 (2020). https://doi.org/10.1007/s13369-020-04950-4
16. Md.M. Arifeen, Al A. Mamun, M.S. Kaiser, M. Mahmud, Blockchain-enable contact tracing for preserving user privacy during COVID-19 outbreak. Math. Comput. Sci. (2020). https://doi.org/10.20944/preprints202007.0502.v1
17. M. Rimsan, A.K. Mahmood, M. Umair, F. Hassan, COVID-19: a novel framework to globally track coronavirus infected patients using blockchain. in 2020 International Conference on Computational Intelligence (ICCI). IEEE, Bandar Seri Iskandar, Malaysia (2020), pp. 70–74. https://doi.org/10.1109/ICCI51257.2020.9247659
18. J. Song, T. Gu, X. Feng, Y. Ge, P. Mohapatra, Blockchain meets COVID-19: a framework for contact information sharing and risk notification system. ArXiv200710529 Cs. (2020)
19. H. Xu, L. Zhang, O. Onireti, Y. Fang, W.J. Buchanan, M.A. Imran, BeepTrace: blockchain-enabled privacy-preserving contact tracing for COVID-19 pandemic and beyond. IEEE Internet Things J. 8, 3915–3929 (2021). https://doi.org/10.1109/JIOT.2020.3025953
20. H.R. Hasan, K. Salah, R. Jayaraman, J. Arshad, I. Yaqoob, M. Omar, S. Ellahham, Blockchain-based solution for COVID-19 digital medical passports and immunity certificates. IEEE Access. 8, 222093–222108 (2020). https://doi.org/10.1109/ACCESS.2020.3043350
21. A. Bansal, C. Garg, R.P. Padappayil, Optimizing the implementation of COVID-19 "immunity certificates" using blockchain. J. Med. Syst. 44, 140 (2020). https://doi.org/10.1007/s10916-020-01616-4
22. S. Chaudhari, M. Clear, H. Tewari, Framework for a DLT based COVID-19 passport. ArXiv200801120 Cs (2021)
23. K.K.F. Tsoi, J.J.Y. Sung, H.W.Y. Lee, K.K.L. Yiu, H. Fung, S.Y.S. Wong, The way forward after COVID-19 vaccination: vaccine passports with blockchain to protect personal privacy. BMJ Innov. 7, 337–341 (2021). https://doi.org/10.1136/bmjinnov-2021-000661
24. T. Khanna, P. Nand, V. Bali, Permissioned blockchain model for end-to-end trackability in supply chain management. Int. J. E-Collab. 16, 45–58 (2020). https://doi.org/10.4018/IJeC.2020010104
25. L.J. Ramirez Lopez, N. Beltrán Álvarez, Blockchain application in the distribution chain of the COVID-19 vaccine: a designing understudy (2020). https://doi.org/10.31124/advance.12274844.v1
26. V. Bali, T. Khanna, P. Soni, S. Gupta, S. Gupta, S. Chauhan, Combating drug counterfeiting by tracing ownership transfer using blockchain technology. Int. J. E-Health Med. Commun. IJEHMC 13 (2021)
27. A. Musamih, R. Jayaraman, K. Salah, H.R. Hasan, I. Yaqoob, Y. Al-Hammadi, Blockchain-based solution for distribution and delivery of COVID-19 vaccines. IEEE Access. 9, 71372–71387 (2021). https://doi.org/10.1109/ACCESS.2021.3079197
28. M.F. Rotbi, S. Motahhir, A.E. Ghzizal, Blockchain technology for a safe and transparent Covid-19 vaccination. ArXiv210405428 Cs (2021)
29. R. Awasthi, K.K. Guliani, S.A. Khan, A. Vashishtha, M.S. Gill, A. Bhatt, A. Nagori, A. Gupta, P. Kumaraguru, T. Sethi, VacSIM: learning effective strategies for COVID-19 vaccine distribution using reinforcement learning. ArXiv200906602 Cs (2021)
30. N. Shamsi Gamchi, S.A. Torabi, F. Jolai, A novel vehicle routing problem for vaccine distribution using SIR epidemic model. Spectr. 43, 155–188 (2021). https://doi.org/10.1007/s00291-020-00609-6
31. J. Polge, J. Robert, Y. Le Traon, Permissioned blockchain frameworks in the industry: a comparison. ICT Express 7, 229–233 (2021). https://doi.org/10.1016/j.icte.2020.09.002
32. J.S. Raj, Optimized mobile edge computing framework for IoT based medical sensor network nodes. J. Ubiquitous Comput. Commun. Technol. 3, 33–42 (2021). https://doi.org/10.36548/jucct.2021.1.004
Reinforcement Learning for Security of a LDPC Coded Cognitive Radio
Puneet Lalwani and Rajagopal Anantharaman
Abstract This paper aims to improve the utilization of the available spectrum and the efficiency of cognitive radio, and uses a reinforcement learning model to enhance its security. It incorporates a scheme that detects the presence of licensed primary users in a channel and, where primary users are not present, assigns channels to secondary users automatically without user intervention. An LDPC decoder at the receiver's end allows for error detection and correction in situations where noisy channels corrupt the data. The LDPC decoder and the software portion of the cognitive radio, which is implemented using the energy detection method, are built in LabVIEW, and the reinforcement learning model, which uses the deep Q-learning algorithm, is developed in Python.
Keywords LabVIEW · LDPC decoder · Cognitive radio · Python · Reinforcement learning
P. Lalwani (B) · R. Anantharaman Department of ECE, Dayananda Sagar College of Engineering, Bengaluru, Karnataka, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_64
1 Introduction
The availability of usable spectrum is very limited, and demand in the mobile environment keeps growing; hence, it is crucial to use the spectrum efficiently so that all users can coexist without wasting or misusing it. Cognitive radio (CR) is well suited to this task and allows automatic allocation of the required spectrum to secondary users wherever the licensed primary users are absent. Each block in the system has its own assigned responsibilities that contribute to the working of the CR as a whole. Spectrum sensing is the basic step in assigning channels to users. It uses a specific algorithm (the energy detection algorithm here) to detect the presence of licensed primary users at a point in the channel. Once this is done, the secondary users are allocated the number of channels they require at instances where the primary user is absent. This minimizes interference between users and wastage of channel space. The min-sum algorithm is the decoding algorithm used at the decoder end to detect and correct errors that may have occurred during transmission. These errors can be caused by additive white Gaussian noise (AWGN) or by interference in the channel. The algorithm detects and automatically corrects these errors [1].
2 Low Density Parity Check (LDPC)
LDPC coding is an error control coding technique that corrects bit errors introduced into the received message by noise and interference [2, 3]. Various algorithms are available for implementing LDPC decoding, such as the min-sum algorithm, the sum-product algorithm (SPA), and the log decoding algorithm. The min-sum algorithm is used here because it is easier to construct and, by reducing complexity at the check nodes, requires fewer calculations than SPA for similar performance [4, 5].
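To make the check-node simplification concrete, the following is a minimal Python sketch of a min-sum decoder. It is not the authors' LabVIEW implementation; the function name `min_sum_decode`, the log-likelihood-ratio (LLR) sign convention (positive means bit 0), and the iteration limit are assumptions made for illustration. Each check node sends back the product of the signs and the minimum of the magnitudes of the other incoming messages, which is the step that replaces the costlier SPA update.

```python
def min_sum_decode(H, llr, max_iters=20):
    """Min-sum decoding sketch. H is a parity-check matrix (list of 0/1 rows),
    llr holds channel LLRs (assumed convention: positive means bit 0)."""
    m, n = len(H), len(H[0])
    rows = [[j for j in range(n) if H[i][j]] for i in range(m)]
    # Variable-to-check messages start as the channel LLRs.
    v2c = {(i, j): llr[j] for i in range(m) for j in rows[i]}
    hard = [1 if x < 0 else 0 for x in llr]
    for _ in range(max_iters):
        # Check-node update: sign product and minimum magnitude of the others.
        c2v = {}
        for i in range(m):
            for j in rows[i]:
                others = [v2c[(i, k)] for k in rows[i] if k != j]
                sign = 1
                for x in others:
                    if x < 0:
                        sign = -sign
                c2v[(i, j)] = sign * min(abs(x) for x in others)
        # Variable-node update and hard decision.
        total = [llr[j] + sum(c2v[(i, j)] for i in range(m) if H[i][j])
                 for j in range(n)]
        hard = [1 if t < 0 else 0 for t in total]
        # Stop as soon as every parity check is satisfied.
        if all(sum(hard[j] for j in rows[i]) % 2 == 0 for i in range(m)):
            break
        # Extrinsic messages: exclude what each check itself contributed.
        for (i, j) in v2c:
            v2c[(i, j)] = total[j] - c2v[(i, j)]
    return hard
```

The sketch assumes every check involves at least two bits. A real LDPC matrix is large and sparse; a small Hamming parity-check matrix is enough to exercise the message passing.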
3 Cognitive Radio System
The CR is an integral component of the system. It automatically assigns spectrum to the secondary client based on the occupancy of the PU and on the basic user requirements [4]. CR has been proposed as a potential candidate for performing complete dynamic spectrum allocation (DSA) by utilizing free frequency bands, also known as "spectrum holes" or "white spaces." Because CR is capable of identifying these spectral opportunities, it divides users into two groups: licensed primary users (PUs) and unlicensed secondary users (SUs). While PUs have unlimited access to the spectrum, SUs are constrained by PU activity. In other words, SUs must respect the quality of service of PUs, and harmful interference from SUs to PU transmissions is strictly prohibited. The CR performs the following functions:
1. Spectrum sensing refers to detecting the presence of any signals in the available set of channels, so that a channel carrying a signal can be avoided and traffic congestion on that channel can be reduced.
2. After spectrum sensing, the system identifies the opportunistic points where the unlicensed user's signal can be transmitted while avoiding congestion in the channel.
3. Adapting to the given surroundings is a very important requirement for any radio system. By analyzing the constraints on the spectrum and the amount of traffic being sent on a given channel, the CR system makes the most efficient use of the available conditions.
4. The resources accessible to a cognitive framework must be shared by all the users. A variety of approaches can be used to achieve this. OFDMA, a special case of FDMA, has gained considerable importance for this purpose.
3.1 Spectrum Sensing Algorithm
A variety of algorithms are available for spectrum sensing (matched filter, energy detection, cyclostationary detection, etc.). The approach adopted here uses the energy detection algorithm, as it has the lowest complexity of these methods [6].
3.2 Energy Detection for CR
1. The received signal is sampled at specific intervals of time [7].
2. To measure the energy, the samples obtained in the previous step are squared.
3. The mean of the squared samples is computed and compared to a predefined threshold.
The general flow diagram for the energy detection algorithm is shown in Fig. 1. The received signal X(t) at the CR is
\[
X(t) =
\begin{cases}
n(t), & \text{PU not present} \\
h(t) * s(t) + n(t), & \text{PU present}
\end{cases}
\]
where s(t) is the PU signal, h(t) is the channel response, and n(t) is the noise.
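The three steps above can be sketched in a few lines of Python. This is an illustration, not the authors' LabVIEW block: the function name `energy_detect` and the threshold values in the usage note are hypothetical, and in practice the threshold would be chosen from the noise power and the target false-alarm rate.

```python
def energy_detect(samples, threshold):
    """Return (energy, pu_present) for one block of received samples.

    samples:   sampled values of the received signal X(t)
    threshold: predefined decision threshold (assumed to be given)
    """
    squared = [x * x for x in samples]      # step 2: square each sample
    energy = sum(squared) / len(squared)    # step 3: mean of the squares
    return energy, energy > threshold       # compare with the threshold
```

For example, a block of low-amplitude noise such as `[0.1, -0.1, 0.1, -0.1]` gives a mean energy of about 0.01 and is declared vacant for a threshold of 0.5, while a block with amplitude 1.0 gives an energy of 1.0 and is declared occupied.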
4 Types of Spectrum Sensing Methods: A Comparison
Sensing is a vital part of CR: primary users are detected so that unlicensed secondary users can occupy vacant channels without causing undesirable interference. Hence, choosing the correct sensing technique helps give the system the best performance. As seen in Fig. 2, a variety of spectrum sensing (SS) techniques are available. After a thorough study, energy detection was chosen as the most suitable SS technique, for the reasons stated below. Interference-based sensing deals with accommodating more than one user in a single channel. This is useful only in high-traffic situations and deviates from the goal of a general-purpose cognitive radio, i.e., to occupy vacant spectrum with secondary users (SUs) [8]. In cooperative sensing, all the SUs combine and communicate to jointly discover gaps in the spectrum that they can occupy. The drawback of this method is that cognitive radios usually operate in real time, i.e., secondary users are allocated free spaces on request individually, without depending on other users. Hence, the radio
Fig. 1 Energy detection model for spectrum sensing
need not wait for all secondary users to arrive before allocating channels and can do so on the fly. Cooperative sensing can also be used for application-specific CRs but is not implemented here. The choice is therefore among the three non-cooperative sensing methods, which are the most widely used because they are the most general-purpose methods available [9]. Of these, energy detection is the simplest and most cost-effective method; it uses a simple squaring algorithm to
Fig. 2 Types of spectrum sensing methods
detect the presence of the primary user in the spectrum [10]. A plot of the probability of detection for various SNR values is shown in Fig. 3. Cyclostationary and matched filter techniques are somewhat costlier. The cyclostationary method involves studying PU signal characteristics to observe
Fig. 3 Energy detection for various SNR
its periodicity and separate it from noise, whereas the matched filter requires prior knowledge of the PU to direct the SUs toward vacant spaces using a pilot signal [11]. Since the cyclostationary method relies on the autocorrelation statistics of a signal, it cannot be applied to all types of signals and is hence not used here. The matched filter method is too expensive and complex and is not preferred here. Energy detection, on the other hand, is a simple technique, and its sensitivity to SNR is less of a concern here because the OFDM and LDPC combination targets and fixes errors in the received signal [12]. A matched filter requires prior knowledge of the primary user's waveform, but it still performs better than an energy detector in noisy environments [13, 14]. The major drawback of the energy detector is that it cannot differentiate between sources of received energy, that is, it cannot distinguish between noise and licensed users. This makes it susceptible to uncertainties in background noise power, especially at low SNR. The cyclostationary feature detector is a good technique in noisy environments, as it is able to distinguish between noise energy and signal energy [15]. Figure 4 compares the transmitter detection techniques when the primary user is present under different SNRs.
Fig. 4 Comparison of various spectrum sensing methods
5 Reinforcement Learning Model
Reinforcement learning (RL) is an area of machine learning concerned with taking suitable actions to maximize reward in a particular situation. It differs from supervised learning: in supervised learning, the training data come with an answer key, so the model is trained on the correct answers, whereas in reinforcement learning, there is no answer key and the agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its experience. In many instances, even simple modeling has been shown to enhance system performance considerably. Learning here is unsupervised in the sense that the system runs without a teacher to oversee it: the agent learns the system by itself. With no need for empirical data or pre-derived results, the agent learns about the system while performing its usual functions. Through this process of online learning, RL can enhance channel characteristics with no prior environmental analysis. Figure 5 shows a simplified RL model. To learn the environment, the learning agent actively observes the operating environment at every time instant. Once it has learned the environment, the agent makes a decision and implements an action accordingly [16]. State refers to the factors observed by the agent from the environment that affect the reward.
Fig. 5 Basic reinforcement learning model issues addressed
Action refers to the action implemented by the agent. The reward and state may change as a result of this action. Reward refers to the effect seen in the environment as a consequence of the agent's action at the previous time instant. The rapidly increasing number of wireless devices has made the spectrum very crowded. The majority of the usable spectrum is occupied, yet research and analytics on spectrum usage have demonstrated that, at any point in time, a large portion of the licensed spectrum is unused. Despite this, collisions and errors in spectrum sensing take place in the opportunistic spectrum access (OSA) network [17]. The primary goal of this paper is to use reinforcement learning to create a distributed learning algorithm that actively monitors the spectrum to keep track of vacant spectrum time slots. This is commonly referred to as white space hunting. Q-learning and deep reinforcement learning are RL methods that are crucial in finding good policies for dynamic programming problems. Their most important feature is the capability of gauging the utility of all available actions without any prior information about the system model [18].
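The way Q-learning gauges the utility of actions from observed state, action, and reward alone can be shown with the standard tabular update rule. The sketch below is a simplification: the paper uses a deep Q-learning network, whereas this version keeps a plain table, and the function name `q_update`, the state and action labels, and the default values of the learning rate `alpha` and discount factor `gamma` are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    q is a dict of dicts: q[state][action] -> estimated utility.
    """
    best_next = max(q[next_state].values())   # greedy value of the next state
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error      # move the estimate toward the target
    return q[state][action]
```

No model of the channel is needed: the table is refined only from the rewards the agent actually observes, which is the property the text highlights. A deep Q-network replaces the table with a neural network that approximates the same values.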
5.1 Model Implementation
Python and TensorFlow have been used to design the model used with the CR system. The model has been built in Python and trained using TensorFlow. The requirements that were satisfied in order to create this model are as follows.
• To create the environment: The variable action stores information regarding channel characteristics. If action = [1 0 3 4], this translates to the following: user 1 uses channel 1, channel 2 is unused, channel 3 is used by user 3, and channel 4 is used by user 4. In the subsequent code, these actions are passed to the environment created. Once the action is committed, the channels are acknowledged. The code then returns the immediate reward and the remaining capacity of the channel.
• To generate states: The first step in this process is to create a state for each user with regard to the other users and the environment. To make state generation more efficient, one-hot vectors are used. Every vector represents a user's parameters in the environment. These states are subsequently fed into the deep Q-learning algorithm.
• To generate the clustering algorithm for the WSN: The algorithm works as follows. A random node is first picked from a set. This node is connected to its n closest neighbors. The virtual distance between these nodes is set to infinity. This process is repeated until all nodes are assigned. DFS or BFS graph traversal algorithms can be used to achieve this.
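The environment and state-generation requirements above can be sketched as follows. This is a hedged illustration, not the authors' code: the function names `step` and `one_hot` are hypothetical, and the sketch assumes one plausible reading of the action vector, in which entry i is the channel chosen by user i + 1 and 0 means no transmission (this reading reproduces the action = [1 0 3 4] example). It also assumes that a user earns a reward of 1 only when it is alone on its channel.

```python
def step(action, num_channels):
    """Apply one joint action; return (per-user rewards, free channel count)."""
    counts = [0] * (num_channels + 1)   # counts[0] tallies idle users
    for ch in action:
        counts[ch] += 1
    # A user succeeds only if it transmitted and nobody else chose its channel.
    rewards = [1 if ch != 0 and counts[ch] == 1 else 0 for ch in action]
    used = sum(1 for ch in range(1, num_channels + 1) if counts[ch] > 0)
    return rewards, num_channels - used


def one_hot(index, size):
    """Encode an integer (e.g. a chosen channel) as a one-hot vector."""
    vec = [0] * size
    vec[index] = 1
    return vec
```

With action = [1, 0, 3, 4] and four channels, users 1, 3, and 4 each earn a reward and one channel remains free; when two users pick the same channel, both collide and earn nothing. Under these assumptions, the sum of the per-user rewards at each time slot gives the kind of cumulative figure that a reward plot would track.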
5.2 Training the Model (TensorFlow)
To train the Python model, the following two Python files are used.
• Training File: This file contains the fundamental training structure. First, the required inputs (number of channels, number of time slots, number of users, and ALOHA properties) are set and the required libraries are imported. Next, one-hot vectors are generated; these are used to create actions from observations. The parameters of the training algorithm include the learning rate, the rate of exponential decay, the discount factor, etc. The DQN and the environment are also created and initialized by this file. Actions are generated by observing the activity, status, and states generated; these data are sampled from the users. The next part of the code computes the per-step rewards and the cumulative results. The main loop, which uses the learning algorithm, follows. Finally, graphs are plotted for the results of the learning code and the rewards.
• DRQN File: This file contains the DRQN code that essentially gives the model its self-learning and acting properties [19].
6 Experimental Results
(Note: Here, 0.0 means that no user was using the channels, which were otherwise free; 2.0 means that both channels were being used without collision; and 1.0 means that a user was not sending packets.) The results are shown in Figs. 6, 7, and 8.
Fig. 6 Cumulative collision and reward plots
Fig. 7 Cumulative rewards plot
Fig. 8 Nodes plot with BFS method
7 Conclusions
The usual CR systems under study and in practice achieve significant efficiency but are more complex and harder to maintain. The proposed approach provides a simpler method by incorporating energy detection for spectrum sensing, and it also uses LDPC, which greatly reduces the errors caused by AWGN during transmission and also helps to
reduce the interference. The proposed system is also scalable to large volumes of data in the form of bits or text. Different types of spectrum sensing methods have also been compared to show that energy detection is highly efficient and less complex. The proposed CR system, when used along with the LDPC decoder, was able to overcome many of the problems faced by a usual CR system by considerably reducing bit errors and interference in the signal. The proposed approach not only provides a simpler method by incorporating the concepts of LDPC but also provides the desired security to the modeled system with the aid of deep reinforcement learning.
References
1. S. Seo, T.N. Mudge, Y. Zhu, C. Chaitali, Design and analysis of LDPC decoders for software defined radio (2007). https://doi.org/10.1109/SIPS.2007.4387546
2. N. Mugesh, R.J. Theivadas, S.K. Padmanabhan, LDPC encoder for OFDM based cognitive radio (2014)
3. R. Anantharaman, K. Kwadiki, V. Rao, Hardware implementation analysis of min-sum decoders. Adv. Electr. Electron. Eng. 17 (2019). https://doi.org/10.15598/aeee.v17i2.3042
4. A. Rajagopal, K. Karibasappa, K.S. Vasundara Patel, Hardware implementation of modified SSD LDPC decoder. Int. J. Comput. Aided Eng. Technol. (IJCAET) Indersci. J. 14(3), 426–440. ISSN: 1757–2665
5. A. Rajagopal, K. Karibasappa, K.S. Vasundara Patel, Study of LDPC decoders with quadratic residue sequence for communication system. Int. J. Inf. Comput. Secur. (IJICS) Indersci. J. 13(1), 18–31. ISSN: 1744–1733
6. K.-E. Lee, J.G. Park, S.-J. Yoo, Intelligent cognitive radio ad-hoc network: planning, learning and dynamic configuration. Electronics 10, 254 (2021). https://doi.org/10.3390/electronics10030254
7. F. Salahdine, Spectrum sensing techniques for cognitive radio networks (2017)
8. A. Nasser, H. Al Haj Hassan, J. Abou Chaaya, A. Mansour, K.-C. Yao, Spectrum sensing for cognitive radio: recent advances and future challenge. Sensors 21, 2408 (2021). https://doi.org/10.3390/s21072408
9. S. Dhivya, A. Rajeswari, R. Aswatha, Implementation of energy detection based spectrum sensing in NI USRP 2920 (2017)
10. R. Sowmiya, G. Sangeetha, Energy detection using NI USRP 2920 (2016)
11. M. Subhedar, G. Birajdar, Spectrum sensing techniques in cognitive radio networks: a survey. Int. J. Next-Gener. Netw. 3 (2011). https://doi.org/10.5121/ijngn.2011.3203
12. Evaluation of energy detection technique for spectrum sensing. Daniela Mercedes and Angel Gabriel
13. W. Ejaz, Spectrum sensing in cognitive radio networks. NUST-MS PhD-ComE-01 (2006)
14. C.S. Rawat, G.G. Korde, Comparison between energy detection and cyclostationary detection for transmitter section. Int. J. Electr. Electron. Data Commun. 3, 2320–2084 (2015)
15. J. Chen, A. Gibson, J. Zafar, Cyclostationary spectrum detection in cognitive radios, pp. 1–5 (2008). https://doi.org/10.1049/ic:20080398
16. M. Ling, K.-L. Yau, J. Qadir, G.S. Poh, Q. Ni, Application of reinforcement learning for security enhancement in cognitive radio networks. Appl. Soft Comput. 37 (2015). https://doi.org/10.1016/j.asoc.2015.09.017
17. A. Nasser, H. Al Haj Hassan, J.A. Chaaya, A. Mansour, K.-C. Yao, Spectrum sensing for cognitive radio: recent advances and future challenge. Sensors 21(7), 2408 (2021). https://doi.org/10.3390/s21072408
18. K.-L. Yau, G.S. Poh, S.F. Chien, H. Al-Rawi, Application of reinforcement learning in cognitive radio networks: models and algorithms. Sci. World J. 2014, 209810 (2014). https://doi.org/10.1155/2014/209810
19. F. Obite, A. Usman, E. Okafor, An overview of deep reinforcement learning for spectrum sensing in cognitive radio networks. Digital Sig. Process. 113, 103014 (2021). https://doi.org/10.1016/j.dsp.2021.103014
Secure Forensic Data Using Blockchain and Encryption Technique
B. S. Renuka and S. Kusuma
Abstract Blockchain is a recent technology based on decentralization. Because the system is decentralized, all data are recorded in a chain of blocks, and each block is protected by a cryptographic hash value. Blockchain is therefore a preferable solution for securing and maintaining the integrity of forensic data. Moreover, the data can be recovered if the forensic data have been tampered with by a third party. This paper adds a few security mechanisms for forensic data: an e-mail notification is sent over the SMTP protocol for each record added, and an access key is sent to the authorized person in order to view the recorded forensic evidence. A simple scenario has also been added to recover the forensic data if they are tampered with, in which the hash values of the tampered and untampered reports in the blockchain are compared.
Keywords Blockchain · Integrity · Simple mail transfer protocol (SMTP)
B. S. Renuka (B) · S. Kusuma
Department of Electronics and Communication, JSS Science and Technology University, Mysuru, Karnataka, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_65

1 Introduction
In this digital era, securing forensic evidence, such as forensic reports and medical reports, is even more vital than collecting it, because these reports are created after the evidence has been tested, analyzed and verified. For example, when a fingerprint is found on plexiglass or acrylic, the sample is collected using the appropriate forensic substance (for example, fingerprint powder made up of titanium dioxide and some type of wetting agent) and following the required procedures. Evidence plays a significant part in determining the appropriate penalty for the offender. A blockchain [1] is a technology that stores all the evidence in a chain of blocks. Every block points to the hash of the previous block. The hash value is obtained using a hash function, a mathematical function that produces a fixed number of output bits for an input message of arbitrary length. It is a simple technology for storing information securely. It is more secure since all data acquired by police, physicians, forensic staff and others during the investigation process are stored in encrypted form [2]. Firstly, a third party is unable to break the hash value. Even if they tamper with the data, the data can be recovered by the authorized person who handles them. Every block in the blockchain has a copy of all the transactions or information exchanged, so recovery of the data is possible. Due to this nature of the blockchain, the originality or integrity of the data is not lost. This helps to produce properly documented data in court, so that a legal judgement can be given to the victim.
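The chaining described above, in which every block points to the hash of the previous block, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the block layout, the JSON serialization and the helper names (`block_hash`, `append_block`, `is_valid`) are assumptions made for the example, and the all-zero predecessor hash for the first block is a common convention rather than something the paper specifies.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """SHA-256 over the block contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a block that points to the hash of the current last block."""
    # Assumption: the first block has no predecessor, so a zero hash is used.
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check every back-pointer."""
    for i, block in enumerate(chain):
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


# Hypothetical records, used only to exercise the functions.
chain = []
append_block(chain, "forensic report: fingerprint sample")
append_block(chain, "doctor report: time of death")
append_block(chain, "investigation log: case summary")
```

Altering the `data` field of any block changes its recomputed hash, so `is_valid` returns `False` until the block is restored.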
2 Related Works
A blockchain is a technology used to record details in a chain of blocks. Every block in the blockchain points to the hash value of the previous block. The first block in the blockchain is called the genesis block [1]; it has no previous block to point to. If data in the blockchain are tampered with, they can be recovered. Bitcoin is a cryptocurrency that uses blockchain technology to make transactions transparent [1]. In [3], privacy and security techniques for enhancing existing and future blockchain technology are discussed, including encryption techniques such as homomorphic encryption (HE) and attribute-based encryption (ABE) and mixing techniques such as Mixcoin and CoinJoin. It also analyzes anonymous signatures, i.e., group signatures and ring signatures. Transaction details are secured using blockchain technology in cryptocurrencies like Bitcoin and Ethereum. In [4], the authors discuss how transactions take place in Bitcoin and Ethereum using the SHA-256 and Ethash algorithms, respectively, and how Ethereum overcomes the limitations of Bitcoin. In [5], the authors mainly discuss the classification of blockchain-based forensics, such as cloud forensics, data management forensics, healthcare forensics, IoT forensics and mobile forensics. The CRAB (blockchain-based criminal record management system) architecture in [6] details how data are exchanged between sender and receiver and stored in an SQL database with blockchain technology to secure criminal records. In [7], the authors explain a method of storing forensic data using a chain of custody (CoC), which is the documentation of the record and of proof-of-work maintenance. They also explain the steps involved in CoC, i.e., collection of witness, data upload to the database, hash generation and proof of work. A blockchain-based chain of custody (B-CoC) is proposed in [8] based on a private and permissioned blockchain. This keeps untrusted and unauthorized parties from maintaining the digital evidence.
In [9], the IoT forensic chain (IoTFC) delivers a guarantee of traceability and tracks the provenance of evidence items. The authors designed four modules, i.e., registration, data collection, cryptographic function and blockchain system, for generating appropriate authentication for forensic data in IoT devices. In [10], a novel cyber threat intelligence (CTI) model with blockchain is developed to ensure secure data sharing. The data are also collected efficiently by deleting unnecessary and repetitive data, which avoids the need for a large storage area. In [11], an unmanned aerial vehicle (UAV) is combined with blockchain technology to collect health data from users and store them in the nearest server. This work also discusses how the UAV communicates with the body sensor hives (BSHs). In [12], the proposed model is based on a dual custody chain, i.e., a main chain and a branch chain, for the preservation of forensic data. Attackers need to break both the branch chain and the main chain to get the data, and even if they break the chains, they still need to decrypt the data. In [13], a two-level blockchain system is designed for digital crime evidence management. This model has two blockchains, i.e., a hot blockchain and a cold blockchain, which can only be accessed through a digital certificate and a private key. In [14], the authors designed a model that covers how the forensic investigation takes place at the crime scene and stores the forensic reports and investigation logs in the blockchain. This work also discusses the recovery process for forensic data tampered with by a third party. In [15], it is noted that software-defined networking (SDN) is a centralized form of network management, which makes it easier for attackers to target the forensic data stored in the network; using blockchain in SDN makes the data more secure. In [16], it is observed that the growing number of transactions causes many challenges in blockchain, including scalability, privacy leakage and selfish mining.
3 Proposed Block Diagram
The proposed block diagram describes in detail how the investigation of a particular crime case, i.e., a murder, takes place. Each document in the blockchain is stored in a MySQL server. The process involves six steps, shown in Fig. 1.

Fig. 1 Block diagram of securing the forensic data

Step 1: Application manager adds investigators. The application manager, also called the system manager, manages the whole application and can add the area of crime occurrence, the police station of the particular area, the forensic staff, the doctor and the higher officer who investigate the crime.
Step 2: Crime case registered by the police. Police go to the crime scene and register the case with a few details such as the crime place and the gender and age of the dead person.
Step 3: Forensic staff generate the forensic report. The forensic staff collect evidence at the crime place, such as blood samples and fingerprint samples, and then test it. The test result is written up as the forensic report, which is stored in the blockchain in the MySQL database.
Step 4: Doctor generates the doctor report. The doctor examines the dead body and collects evidence such as drug information, mental health and blood clots. Using this evidence, the doctor generates the doctor report and stores it in the blockchain in the MySQL server.
Step 5: Crime log generated by the police. Using the forensic report and the doctor report, police start the investigation, generate the investigation log and store it in the blockchain.
Step 6: The higher officer and police are notified of the observation and are able to recover the data.
Case 1: The report is tampered with before the police generate the investigation log. Police have the authority to recover the report. Based on the details in the report, police generate the investigation log.
Case 2: The higher officer logs in to the system and recovers the reports and crime log. The higher officer has the authority to view the reports generated by the doctor and forensic staff and the investigation log generated by the police. If the reports and logs are tampered with by a third party, the higher officer can recover them and view their contents.
3.1 Flowchart to Represent the Algorithmic Flow of the Proposed Model
The flowchart for the proposed model is shown in Fig. 2.

Fig. 2 Flowchart for proposed model

After the application manager adds the area, police station, forensic staff, doctor and higher officer, the respective IDs and passwords are generated, sent to the corresponding mail-id and updated in the MySQL database. Using the police station id (PSID), police can register the crime and generate the crime id (CID). The forensic staff log in to the application with the forensic staff id (FSID) and password, and the crime is loaded based on the CID; after adding the investigation details, the forensic staff generate the forensic report (FR). Likewise, the doctor logs in with the doctor id (DID) and password, the crime is loaded based on the CID, and the doctor generates the doctor report (DR). The details of the FR and DR help the police to generate the crime log. These reports and logs are encrypted using the AES-Rijndael algorithm. The encryption process uses a cipher key that is generated by hashing the factors used in the project (CID, PSID, FSID or DID, log date) with the SHA-256 algorithm and converting the first two characters of the hash value to ASCII. The cipher key is expanded and combined with the plaintext (FR, DR and crime log), and four transformations (i.e., sub bytes, shift rows, mix columns and add round key) are applied over ten rounds to get the ciphertext. The encrypted data are stored in the blockchain in the MySQL database. If the data are tampered with by a third party, the higher officer logs in to the application using the higher officer id (HID) and password, recovers the forensic data from the temporary database, and views the data by using the access key and the decryption process of the AES-Rijndael algorithm. If the data in the blockchain are not altered, the higher officer can directly view the reports and logs by using the access key and decrypting the data with the AES-Rijndael algorithm.
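One possible reading of the key-generation step is sketched below, using only SHA-256 from Python's standard library. The separator, the example identifier values and the choice to keep the first 16 bytes of the digest are assumptions for illustration; the paper's own step of converting the first two characters of the hash value to ASCII is not reproduced here because its details are not given. A 16-byte key matches the ten-round variant of AES, which uses 128-bit keys.

```python
import hashlib


def derive_cipher_key(cid, psid, staff_id, log_date, key_bytes=16):
    """Derive a fixed-length key by hashing the case factors with SHA-256.

    Assumptions (not from the paper): the factors are joined with '|' and
    the first `key_bytes` bytes of the 32-byte digest are used as the key.
    """
    material = "|".join([cid, psid, staff_id, log_date]).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return digest[:key_bytes]


# Hypothetical identifiers, used only to exercise the function.
key = derive_cipher_key("CID001", "PSID07", "FSID12", "2021-06-15")
other = derive_cipher_key("CID002", "PSID07", "FSID12", "2021-06-15")
```

The same factors always give the same key, which is what lets an authorized viewer regenerate it for decryption, while changing any single factor gives an unrelated key.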
4 Proposed Work
This section details the implemented secure application.
4.1 Adding Area, Police Station, Doctor, Forensic Staff by Application Manager
As shown in Fig. 3, areas where crimes have occurred can be added by the application manager. If the area already exists, the application shows the pop-up message 'area already exists'; otherwise, it shows the success message 'area added successfully'. The application manager can add more than one area name. For each area, the application manager can add a police station by filling in details such as the name of the police station, e-mail address, ten-digit mobile number and address of the police station, as in Fig. 4. After the station is added, a randomly generated police station id and password are sent to the e-mail address that was specified while adding the police station, and they are also updated in the database. In a similar way, the application manager can add the doctor, forensic staff and higher officer based on roles such as AC and DGP.
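The credential step above could be sketched as follows with Python's standard library. The ID format, password length, subject line and example addresses are hypothetical; the paper states only that a randomly generated id and password are e-mailed over SMTP. The sketch builds the message but does not send it.

```python
import secrets
import string
from email.message import EmailMessage


def generate_credentials(prefix="PS", id_digits=6, password_length=12):
    """Random ID and password drawn from a cryptographically secure source."""
    digits = "".join(secrets.choice(string.digits) for _ in range(id_digits))
    alphabet = string.ascii_letters + string.digits
    password = "".join(secrets.choice(alphabet) for _ in range(password_length))
    return prefix + digits, password


def build_notification(sender, recipient, station_id, password):
    """Compose the e-mail that carries the new credentials."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Police station account created"
    msg.set_content(f"Police station ID: {station_id}\nPassword: {password}\n")
    return msg


station_id, password = generate_credentials()
message = build_notification(
    "manager@example.org", "station@example.org", station_id, password
)
# Sending is omitted. With a reachable mail server it could be done with
# smtplib.SMTP(host, port), then starttls(), login() and send_message(message).
```

Keeping message construction separate from delivery makes the notification easy to check without a mail server.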
Fig. 3 Adding crime areas
Fig. 4 Graphical user interface (GUI) to add the police station to the particular area
4.2 Crime Registered by the Police
After a successful login using the PSID (police station ID) and password, police can enter the description of the crime based on the selected crime name and crime place, as in Fig. 5.
Fig. 5 Crime can be registered by the police
4.3 Forensic Staff Generates the Forensic Report
The forensic staff first log in to the application using the FID and password that were sent by mail and updated in the MySQL server. The forensic staff go to the crime scene and collect the blood sample and fingerprint sample using the proper forensic method. They then test the samples and, based on the evidence, generate the forensic report, as shown in Fig. 6. After the report is generated, the data are encrypted using the AES-Rijndael algorithm with a key generated from the application's own factors and are then stored in the blockchain in the MySQL server.
4.4 Doctor Generates the Doctor Report
The doctor logs in to the application using the DID and password that were generated in Step 1 and sent by e-mail. The doctor generates the report based on a detailed examination of the dead body. The doctor report contains information such as death hour, drug name and livor mortis, as shown in Fig. 7. This report is then encrypted in the same way as the forensic report and stored in the blockchain in the MySQL server.
Fig. 6 Forensic report generated by the forensic staff
Fig. 7 Doctor report with details
Fig. 8 Generated crime log based on a detailed study of the reports by police
4.5 Investigation Log by the Police
Police log in to the application using the PSID and password obtained during Step 1, view the forensic and doctor reports, and generate the investigation log based on the reports by entering the case summary, as in Fig. 8. This log is encrypted and stored in the blockchain for security.
4.6 Higher Officers or Police View the Reports
The higher officer logs in to the application and selects the police station; the crimes registered at that police station are then loaded. The reports (forensic report, doctor report or crime log) are viewed by using the proper access key, which is sent to the authorized mail-id. If the access key is entered incorrectly, the application gives the error message 'authorization failed' and the higher officer cannot view the report. Figure 9 shows the doctor report viewed by the higher officer. Similarly, police can also view the report.

Fig. 9 Doctor report viewed by the police or higher officer

Figure 10 shows the third-party application, which is used to tamper with the reports and logs. The third party selects the police station and the particular crime they want to tamper with and clicks on the tamper button to tamper with the data. Once the report is tampered with, the hash value in the original database changes. Figure 11 shows the hash value (DHV, doctor hash value) of the original database in the blockchain after the report has been tampered with.

Fig. 10 Tamper of the doctor report by tamper application
Fig. 11 Original database hacked by the third-party application
Fig. 12 Temporary database without tamper

Figure 12 shows the temporary database, which holds a copy of the original database. It is used as backup data if the data in the original database are altered by a third person.
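The access-key check described in this subsection might look like the sketch below. The key length and the function names are hypothetical, and the paper does not say how keys are generated or compared; a constant-time comparison from the standard library is used here as one reasonable choice.

```python
import hmac
import secrets


def issue_access_key(n_bytes=16):
    """Create a random, URL-safe access key to e-mail to the authorized person."""
    return secrets.token_urlsafe(n_bytes)


def authorize(entered_key, issued_key):
    """Compare keys in constant time and return the status message."""
    if hmac.compare_digest(entered_key.encode("utf-8"), issued_key.encode("utf-8")):
        return "authorized"
    return "authorization failed"


issued = issue_access_key()
```

Here `authorize(issued, issued)` gives `"authorized"`, while any other string gives the `"authorization failed"` message quoted above.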
Fig. 13 Doctor report is tampered by the third party
When the higher officer or the police log in to the application to view the reports (DR, FR and crime log), the application compares the hash value (DHV, doctor hash value) of the original database with that of the temporary database. If the two hash values differ, the application decides that the data have been tampered with by a third party. Figure 13 shows the status with the tampered image of the report once the report is hacked by the third-party application.
5 Result
When the police or higher officer log in and check the reports, they can recover any reports that have been tampered with by clicking on the recover button in Fig. 13. Once the forensic data are recovered into the original database from the temporary database, the hash value (DHV) in the blockchain in the original database, shown in Fig. 14, matches the temporary database hash value shown in Fig. 12. The application then gives a success message and the status image changes from the tampered image to the success image, as in Fig. 15. After data recovery, the recover button is disabled and the view button is enabled to view the forensic and doctor reports and the investigation log. The higher officer or police then follow Sect. 4.6 to view the reports and logs.
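The detect-and-recover behaviour reported here can be illustrated with two in-memory dictionaries standing in for the original and temporary MySQL databases. The record layout, function names and sample data are assumptions made for the sketch; only the comparison of stored hash values and the copy back from the temporary database follow the text.

```python
import copy
import hashlib


def report_hash(text):
    """SHA-256 hash of a report, as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def store(original_db, temporary_db, crime_id, report):
    """Save a report and its hash in both databases."""
    record = {"report": report, "hash": report_hash(report)}
    original_db[crime_id] = record
    temporary_db[crime_id] = copy.deepcopy(record)


def is_tampered(original_db, temporary_db, crime_id):
    """Tampered if the stored hashes differ between the two databases."""
    return original_db[crime_id]["hash"] != temporary_db[crime_id]["hash"]


def recover(original_db, temporary_db, crime_id):
    """Restore the original record from the temporary copy."""
    original_db[crime_id] = copy.deepcopy(temporary_db[crime_id])


# Hypothetical data, used only to exercise the functions.
original, temporary = {}, {}
store(original, temporary, "CID001", "doctor report: livor mortis present")

# Simulated third-party tampering of the original database.
original["CID001"]["report"] = "doctor report: forged"
original["CID001"]["hash"] = report_hash("doctor report: forged")

tampered_before = is_tampered(original, temporary, "CID001")
recover(original, temporary, "CID001")
tampered_after = is_tampered(original, temporary, "CID001")
```

The scheme relies on the temporary copy staying out of the attacker's reach, since it serves as both the reference for comparison and the source for recovery.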
Fig. 14 Recovered hash value in the blockchain
Fig. 15 Recovered doctor report
6 Limitations
1. Unable to edit the forensic data once they are stored in the blockchain.
2. Requires more time to update the data in the MySQL database.
3. More manual effort is needed to generate the forensic and doctor reports and crime log.
4. A small number of bits is used in the encryption algorithm and hash algorithm.
7 Conclusion
The proposed research work has successfully designed an application that stores forensic data as a blockchain in a MySQL database in order to secure the forensic evidence collected from the forensic staff, doctor and police investigation. Due to the decentralized nature of the blockchain, all information is stored in the original database and a copy of the same data is stored in the temporary database. So, if a third party tampers with the original database, the data can be recovered from the temporary database. Only police and the higher officer can view and recover the tampered data. To enhance the security of the proposed application, the authentication details are available only to police and higher officers. Even the doctor or forensic staff cannot recover or view the reports; they only generate the reports and store them in the blockchain. Once the data are stored in the blockchain, they cannot be altered. Due to this transparent nature of blockchain, original data are not altered or lost, and a document can be recovered at any time with proper authentication. In this way, the proposed application is implemented to protect the forensic data. In the future, an edit option will be provided to modify the saved forensic data if necessary, but only by authorized persons, and only a limited amount of time will be allowed to update the data for security reasons. To provide high-level security for storing forensic evidence, the number of bits employed in the encryption and hash methods will be increased.
8 Novelty
The security of forensic data storage is increased by encrypting the data with a cipher key produced from the model's own factors, such as PSID, FID, DID, log date and CID. Using this sort of cipher key in the encryption approach makes hacking forensic data more challenging. If the forensic reports and logs stored in the blockchain are hacked, only authorized persons (i.e., higher officer and police) can recover and view the data by using the proper access key sent to the authorized person's mail-id. This adds one more layer of internal security to the forensic data storage. For example, forensic staff are unable to view the doctor report, and staff cannot alter the report.
References
1. P. Tasatanattakool, C. Techapanupreeda, Blockchain: challenges and applications, in 2018 International Conference on Information Networking (ICOIN), IEEE (2018), pp. 473–475
2. P. Nivethini, S. Meena, V. Krithikaa, G. Prethija, Data security using blockchain technology. Int. J. Adv. Netw. Appl. (IJANA)
3. R. Zhang, R. Xue, L. Liu, Security and privacy on blockchain. ACM Comput. Surv. 1(1), Article 1 (2019)
4. D. Vujičić, S. Ranđić, Blockchain technology, bitcoin, and Ethereum: a brief overview, in 2018 17th International Symposium INFOTEH-JAHORINA (INFOTEH)
5. T.K. Dasaklis, F. Casino, C. Patsakis, SoK: blockchain solutions for forensics. arXiv:2005.12640v1 [cs.CR] (26 May 2020)
6. A. Al Omar, S. Rahman, Report on the criminal record management system, Conference Paper, December 2018, ResearchGate
7. S. Harihara Gopalan, S. Akila Suba, Forensic chain: blockchain-based digital forensics chain of custody. Dig. Inv. Int. J. Recent Technol. Eng. (IJRTE) 8(2S11) (2019). ISSN: 2277-3878
8. S. Bonomi, M. Casini, C. Ciccotelli, B-CoC: a blockchain-based chain of custody for evidences management in digital forensics. OpenAccess Series Inf. 71 (2020)
9. S. Nelson, K. Ponvasanth, S. Karuppusamy, R. Ezhumalai, Blockchain based digital forensics investigation framework in the internet of things and social systems. Int. J. Eng. Res. Technol. (IJERT), RTICCT-2020 Conference Proceedings. ISSN: 2278-0181, www.ijert.org
10. S. Smys, H. Wang, Data elimination on repetition using a blockchain based cyber threat intelligence. IRO J. Sustain. Wirel. Syst. 2(4), 149–154 (2021)
11. J.S. Raj, Security enhanced blockchain based unmanned aerial vehicle health monitoring system. J. ISMAC 3(02), 121–131 (2021)
12. G. Liu, J. He, X. Xuan, A data preservation method based on blockchain and multidimensional hash for digital forensics. 2021(5536326). https://doi.org/10.1155/2021/5536326
13. D. Kim, S.-Y. Ihm, Y. Son, Two-level blockchain system for digital crime evidence management. Sensors 21, 3051 (2021). https://doi.org/10.3390/s21093051
14. B.S. Renuka, S. Kusuma, Blockchain based digital forensics investigation framework. Int. Res. J. Eng. Technol. 08(06) (2021). e-ISSN: 2395-0056, p-ISSN: 2395-0072, www.irjet.net
15. S. Bhardwaj, R. Swami, M. Dave, Forensic investigation-based framework for SDN using blockchain (2021). https://www.igi-global.com/book/revolutionary-applications-blockchainenabled-privacy/264874
16. Z. Zheng, S. Xie, H.N. Dai, X. Chen, H. Wang, Blockchain challenges and opportunities: a survey. Int. J. Web Grid Serv. 14, 352–375 (2018)
Denoising of Surface Electromyography Signal Using Parametric Wavelet Shrinkage Method for Hand Prosthesis
S. H. Bhagwat, P. A. Mukherji, and S. Paranjape
Abstract sEMG signals are significantly corrupted by noise, making signal analysis challenging. This paper investigates wavelet shrinkage-based parametric and non-parametric techniques for sEMG denoising. The sEMG signal utilized for the analysis includes an extension of the index finger. Three non-parametric thresholding methods (SURE, Universal and Minimax) and a parametric thresholding method (Birge–Massart) are analyzed for removing white Gaussian noise and color noise from sEMG signals. This study considers forty-eight different wavelet functions with soft thresholding and two threshold estimation methods: level dependent and level independent. Signal-to-noise ratio (SNR) and L2-norm ratio are the parameters used to measure denoising performance. The results show that the fifth-order Coiflet wavelet (coif5) performs better than all other wavelet functions. The SNR value (26.52 dB) and L2-norm ratio (average 88.53%) obtained using level-dependent Birge–Massart thresholding are the best of the eight wavelet denoising algorithms under investigation.
Keywords Surface EMG signal · Denoising · Wavelet shrinkage · Threshold estimation
1 Introduction The electrical signal reflecting the neuromuscular activity associated with a contracting muscle is known as surface EMG (sEMG) [1]. It is one of the important biomedical signals, which is widely used in myoelectric controlled prosthetic applications [2]. These signals are collected from remnant or normal muscles either invasively (Intramuscular EMG or iEMG) or non-invasively (sEMG). As compared S. H. Bhagwat (B) Department of Electronics and Telecommunication, VIIT, Pune, India e-mail: [email protected] P. A. Mukherji · S. Paranjape Department of Electronics and Telecommunication, MKSSS’s Cummins COE, Pune, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_66
899
900
S. H. Bhagwat et al.
to iEMG, it is easy to collect sEMG, and it is also painless [3]. The features of the sEMG signal are determined by the subject’s internal body type, which includes muscle anatomy, measurement location, skin layer depth, and so on. These variables cause a variety of disturbances in the sEMG signals, which might interfere with feature extraction and further analysis. While recording sEMG, it gets contaminated by different noises generated by internal and external factors. The internal factors include ECG, eye blinking, and movement artifacts, whereas the external factors include white noise, color noise, electromagnetic interference, etc. The major noise sources are explained in following section. White Noise and Color Noise: White noise is a random signal with a flat power spectral density S(f ). Generally, it follows a Gaussian distribution and gets added to the desired signal, referred as additive white Gaussian noise (AWGN). Unlike AWGN, color noise does not have the flat PSD for all frequencies. For example, flicker noise generated in the resistors of the signal acquisition system has PSD, which is inversely proportional to the frequency [4]. Movement Artifacts: The movement artifacts are generated due to the cable movements and the skin–electrode interface. The movement artifacts are collected by electrodes placed near the muscle. The movement noise has a frequency range of 1–10 Hz and a voltage level that is similar to the amplitude of the sEMG. Recent studies have shown that the moving median filter [4], moving average filter [4], and the adaptive filter [5] have removed the motion artifact from a sEMG signal. Electromagnetic Interference (EMI): Another source of noise present in sEMG is the electromagnetic interference present in the environment that superimposes the required signal and changes its characteristics. The amplitude level of the electromagnetic radiation is generally comparable with the signal of interest. 
The subject body acts as an antenna and continuously emits electromagnetic radiation. It is not possible to avoid EMI with sEMG signal [3]. Also, the radiation from power sources of 60 Hz (or 50 Hz), called power-line interference (PLI) is a major noise source. The recorded artifacts can be removed by performing off-line processing [4]. A high-frequency interference can be removed by the appropriate use of high pass filter. However, PLI interference (50 Hz in India) lies in the dominant range of sEMG (20–500 Hz). While notch filters cannot be employed to remove them as it can remove the important part of sEMG spectrum. Electrocardiographic (ECG) Artifacts: Electrocardiogram signal (ECG) frequently contaminates sEMG signals due to their overlapping spectrum. It is challenging to remove the ECG artifacts from the sEMG signal because of their relative characteristics, such as varied temporal shape and non-stationary nature [6]. All the above-mentioned noises except the additive white Gaussian noise (WGN) and low-frequency noise can be removed using typical filtering methods [7]. Wavelets has ability to reduce these noises without changing signal characteristics [8]. Recent advancements in wavelet denoising using wavelet shrinkage algorithms have received significant consideration in removing random noise from ECG and speech [9]. The application of wavelet shrinkage method for sEMG denoising has been successfully used to remove these noises [10]. The elimination of AWGN and pink noise from
Denoising of Surface Electromyography Signal Using Parametric …
901
sEMG signals using wavelet shrinkage utilizing parametric estimate of noise level using Birge-Massart technique is investigated in this paper. The preprocessing stage based on a wavelet shrinkage algorithm for sEMG needs appropriate selection of mother wavelet, decomposition level, threshold estimation rule, and thresholding technique [11, 12]. Previous studies proposed various methods of wavelet denoising [13]. Jiang-Kuo [5] compared four thresholding and two transformation methods. The experiment was performed using original sEMG signal and simulated signal at 16 dB SNR. The denoising quality of the reconstructed signal was evaluated using signal-to-noise ratio (SNR) estimator. The conclusion of the study was that the denoised sEMG is insensitive to the selection of denoising methods. The researchers from [4] performed same study as [5]; however, they used different real sEMG signal, and their results were also similar to Jiang and Kuo [14]. In [6], the researchers have studied five most standard wavelet functions (db2, db5, sym5, sym8, and coif5) for denoising of the sEMG. They analyzed the processed sEMG by measuring the mean square error (MSE) parameter and showed that the scale level 4 provides a better performance when compared with other scale levels. Angkoon et al. [7] compared four classical non-parametric (Universal, SURE, Minimax and Hybrid) denoising methods and concluded that Universal method outperformed other methods. The current work is inspired by the fact that the existing studies have used fixed wavelet function and the decomposition level. However, it was not sufficient to make the fair comparison among variety of wavelet functions. Additionaly, there were no studies reported considering parametric thresholding methods for sEMG denoising. 
This research work presents a comprehensive comparison of non-parametric (Universal, SURE, and Minimax) and parametric (Birge–Massart) denoising methods used in wavelet shrinkage algorithms for removing white Gaussian and colored noise from sEMG signals. The objective of this study is to identify suitable threshold estimation methods, wavelet functions, and thresholding techniques. The work uses raw sEMG collected from a normal upper limb. All four thresholding methods are evaluated using two threshold estimation methods (level dependent and level independent) and 48 wavelet functions.
2 Materials and Methods
2.1 Dataset
The sEMG dataset [15] used in this study is available for download at www.rami-khushaba.com. It was collected via two Ag/AgCl electrodes attached to the forearms of a normal healthy subject. As shown in Fig. 1, the electrodes are attached to the extensor carpi radialis longus and the flexor carpi ulnaris. These muscles are responsible for hand and finger movements. The sEMG signal is amplified (gain of 1000), filtered (20–450 Hz) using a 10th-order Butterworth filter, and then passed through a notch filter to remove 50 Hz PLI. The signal is sampled at 4000 Hz and digitized with 12-bit resolution [15]. The dataset covers multiple finger movements, such as pinching of the thumb and index fingers, as shown in Fig. 1.
Fig. 1 a Finger movement b First electrode c Second electrode [15]
2.2 Methodology
The noisy sEMG signal y(n) can be modeled using the additive Gaussian noise model as:
$$y(n) = x(n) + n_{\mu,\sigma}(n) \tag{1}$$
That is, the noisy sEMG signal y(n) is composed of a clean sEMG signal x(n) and random noise n(n) with mean value μ and standard deviation σ [16]. The aim of the wavelet denoising algorithm is to remove the noisy part and recover x(n) from y(n). A typical wavelet-based denoising procedure comprises discrete wavelet decomposition, thresholding, and reconstruction steps, as shown in Fig. 2. Wavelet decomposition requires a suitable wavelet function and decomposition level (M) for perfect reconstruction. In this study, a total of 48 wavelet functions, including 7 Symlets (sym2–sym8), 10 Daubechies (db1–db10), 5 Coiflets (coif1–coif5), 15 BiorSplines, and 10 ReverseBior, are investigated for denoising performance. The maximum depth or level of decomposition is given as M_max = log2 N, where N is the sample length [7, 8].
Fig. 2 Typical wavelet-based denoising: Raw (noisy) sEMG → Discrete wavelet transform (DWT) decomposition → Soft thresholding → Inverse DWT reconstruction → Denoised sEMG
2.3 Wavelet Denoising
In 1993, Donoho [5] introduced the wavelet thresholding method for removing noise from a signal. Wavelets are popular for denoising because of the sparsity property of the discrete wavelet transform (DWT). The signal y(n) is passed through a high-pass filter and a low-pass filter to obtain the detail (D) and approximation (A) coefficients, respectively. The DWT decomposes the signal into a few coefficients of high magnitude and a large number of coefficients of small magnitude. In practical cases, the high-frequency components (small-magnitude detail coefficients) represent noise, and the low-frequency components (high-magnitude approximation coefficients) represent the useful signal. In wavelet thresholding, the high-frequency wavelet coefficients are thresholded to suppress noise, and the coefficients corresponding to the low-frequency components are retained for reconstruction [17]. Only the approximation coefficients are filtered again at the next level. At each level of decomposition, the detail coefficients are modified according to the threshold selection rule, and the denoised signal is reconstructed from the modified D coefficients and the A coefficients of the last decomposition level.
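The split into approximation and detail coefficients can be illustrated with a minimal sketch of one DWT level. It assumes the Haar wavelet purely because its filters are two taps long and need no external library; the study itself evaluates 48 other wavelet functions. The function names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import math

# Haar filter gain: 1/sqrt(2) keeps the transform orthonormal.
S = 1.0 / math.sqrt(2.0)


def haar_dwt(signal):
    """One level of the Haar DWT.

    Returns (approximation, detail): the low-pass and high-pass
    outputs, each downsampled by two.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * S for a, b in pairs]   # low-pass branch (A)
    detail = [(a - b) * S for a, b in pairs]   # high-pass branch (D)
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from A and D."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out
```

With unmodified coefficients the inverse reproduces the input, which is the perfect-reconstruction property the denoising procedure relies on; denoising only changes the detail coefficients between the two calls.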
2.4 Thresholding Methods
Three non-parametric threshold estimation methods (Universal, Minimax, and SURE) and one parametric method (Birge–Massart), as applied in this study, are explained below.
2.4.1 Universal Thresholding Method
This is a non-parametric, fixed thresholding method derived by Donoho and Johnstone [5] for removing white Gaussian noise under a mean square error criterion. The fixed threshold λ is calculated using Eq. 2,
$$\lambda = \sigma \sqrt{2 \ln N} \tag{2}$$
where N is the number of samples in the signal and σ is the noise standard deviation calculated using Eq. 4. In the case of level-dependent thresholding, the threshold value at each decomposition level j is calculated as λ_j,
$$\lambda_j = \sigma_j \sqrt{2 \ln N_j} \tag{3}$$
where σ_j and N_j are the noise standard deviation computed for the jth level and the number of detail coefficients in the jth decomposition level, respectively. The median is used to calculate σ as [17]
$$\sigma = \frac{\mathrm{median}(|cD_m|)}{0.6745} \tag{4}$$
where 0.6745 is a normalization factor and cD_m are the detail coefficients at the mth decomposition level [11].
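The median-based noise estimate and the universal threshold can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and the logarithm is assumed to be the natural logarithm, as in Donoho and Johnstone's formulation of the universal threshold.

```python
import math
import statistics


def mad_sigma(detail):
    """Noise standard deviation from detail coefficients:
    median(|cD|) / 0.6745."""
    return statistics.median([abs(c) for c in detail]) / 0.6745


def universal_threshold(detail, n=None):
    """Universal threshold sigma * sqrt(2 ln N).

    n defaults to the number of detail coefficients; pass the signal
    length instead if the threshold should refer to the whole signal.
    The natural logarithm is an assumption of this sketch.
    """
    n = len(detail) if n is None else n
    return mad_sigma(detail) * math.sqrt(2.0 * math.log(n))
```

For unit noise (σ = 1) and N = 1000 coefficients, this gives a threshold of about 3.72.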
2.4.2 SURE Thresholding
It uses Stein's unbiased risk estimate (SURE) to estimate the risk for a candidate threshold λ_SURE and selects the threshold value that minimizes this risk [11].
2.4.3 Minimax Thresholding
It uses a fixed threshold chosen to yield minimax performance for the mean square error against an ideal procedure [11]. Usually, for the same number of samples N, the risk of the universal threshold is larger than that of the SURE and minimax thresholds, and SURE selects the minimum risk [12]. For more details of the thresholding methods, refer to Rashid [12].
2.4.4 The Birge–Massart Thresholding
It is a parametric method based on adaptive functional estimation [13]. The approach depends on the choice of the parameters α and j0. In this method, all coefficients at levels higher than j0 are retained, and at the remaining levels only the largest coefficients are kept. The number of wavelet coefficients K to be retained after thresholding at each level j from 1 to j0 is calculated as
$$K = \frac{m}{(j_0 + 1 - j)^{\alpha}} \tag{5}$$
Here, m is the number of coefficients in the sub-band matrix. Typically, for denoising applications, α = 3 and j0 = j − 1 are used [17].
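The coefficient-count rule can be sketched as below. The sketch assumes the denominator (j0 + 1 − j)^α, so that coarser levels keep more coefficients, which is the usual behavior of the Birge–Massart strategy; this form, the rounding rule, and the function names are assumptions of the sketch rather than details confirmed by the paper.

```python
def bm_keep_counts(m, j0, alpha=3.0):
    """Number of largest-magnitude coefficients kept at levels 1..j0.

    Assumes K_j = m / (j0 + 1 - j) ** alpha, rounded half up and
    capped at m.
    """
    counts = {}
    for j in range(1, j0 + 1):
        k = int(m / (j0 + 1 - j) ** alpha + 0.5)
        counts[j] = min(m, k)
    return counts


def keep_largest(coeffs, k):
    """Zero every coefficient except the k largest in magnitude."""
    order = sorted(range(len(coeffs)),
                   key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:max(k, 0)])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]
```

With m = 1000, j0 = 4, and α = 3, the rule keeps 16, 37, 125, and 1000 coefficients at levels 1 to 4, so the finest level is thresholded most aggressively. A smaller α keeps more coefficients at every level.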
2.5 Thresholding Algorithm
2.5.1 Hard Thresholding
Hard thresholding sets the noise subspace to zero. In this thresholding algorithm, the wavelet coefficients whose magnitude is less than the threshold λ are set to zero:
$$\beta_{TH}(a) = \begin{cases} a, & |a| > \lambda \\ 0, & |a| \le \lambda \end{cases} \tag{6}$$
2.5.2 Soft Thresholding
Soft thresholding shrinks the decomposition coefficients by the specified threshold value. First, all detail coefficients whose absolute values are lower than the threshold are set to zero, and then the other coefficients are shrunk toward zero. The soft thresholding function is defined as
$$\beta_{TH}(a) = \begin{cases} \operatorname{sgn}(a)\,(|a| - \lambda), & |a| > \lambda \\ 0, & |a| \le \lambda \end{cases} \tag{7}$$
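The two thresholding functions can be written directly from their definitions. This is a minimal sketch with hypothetical function names; coefficients whose magnitude equals the threshold are set to zero here.

```python
import math


def hard_threshold(coeffs, lam):
    """Keep coefficients with |a| > lam unchanged, zero the rest."""
    return [a if abs(a) > lam else 0.0 for a in coeffs]


def soft_threshold(coeffs, lam):
    """Zero coefficients with |a| <= lam and shrink the others
    toward zero by lam: sgn(a) * (|a| - lam)."""
    return [math.copysign(abs(a) - lam, a) if abs(a) > lam else 0.0
            for a in coeffs]
```

For the coefficients [-3.0, 0.5, 2.0] and λ = 1.0, hard thresholding returns [-3.0, 0.0, 2.0], whereas soft thresholding returns [-2.0, 0.0, 1.0]. Soft thresholding avoids the discontinuity at ±λ at the cost of reducing the amplitude of the retained coefficients.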
The inverse discrete wavelet transform (IDWT) is applied to obtain the reconstructed denoised signal [16, 17]. The reconstructed signal is computed by wavelet reconstruction from the original approximation coefficients of level J and the modified detail coefficients of levels 1 to J. The noise level is estimated using either level-independent (LID) or level-dependent (LD) thresholding [9]. LD calculates a threshold at each scale of decomposition, whereas LID calculates the threshold only at the finest scale. Either soft or hard thresholding is then applied to the coefficients [2]; both reduce noise by zeroing the coefficients that are smaller than the threshold [2]. In this work, both LD and LID thresholding with soft thresholding are investigated. The fixed estimators (LID) estimate the noise standard deviation σ at the finest scale and therefore do not account for low-frequency or colored noise. This work proposes a novel method for removing colored and white noise from sEMG using the adaptive level-dependent Birge–Massart thresholding algorithm. The proposed method is compared with the existing fixed-threshold Donoho–Johnstone algorithms.
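The difference between the LD and LID strategies can be made concrete with a small sketch. It uses the universal rule with the median-based noise estimate as an example estimator and assumes the natural logarithm; the function names and the argument layout (detail levels ordered from finest to coarsest) are hypothetical.

```python
import math
import statistics


def mad_sigma(detail):
    """Noise standard deviation: median(|cD|) / 0.6745."""
    return statistics.median([abs(c) for c in detail]) / 0.6745


def thresholds(details, n_samples, level_dependent):
    """One threshold per detail level.

    details: list of detail-coefficient lists, finest level first.
    LID: a single threshold from the finest level, reused everywhere.
    LD:  sigma and N are re-estimated at every level.
    """
    if level_dependent:
        return [mad_sigma(d) * math.sqrt(2.0 * math.log(len(d)))
                for d in details]
    lam = mad_sigma(details[0]) * math.sqrt(2.0 * math.log(n_samples))
    return [lam] * len(details)
```

When the noise is colored, its standard deviation differs between sub-bands, so the LD thresholds differ from level to level while the LID threshold stays fixed at the value estimated from the finest scale.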
2.6 Performance Evaluation Parameters
In the proposed work, the quality of the denoised sEMG signal is evaluated using two parameters: signal-to-noise ratio (SNR) in decibels (dB) and percent L2-norm ratio [2]. The SNR in dB is given as
$$\mathrm{SNR}_{dB} = 10 \log_{10} \frac{E\left[\tilde{F}^{2}\right]}{E\left[(F - \tilde{F})^{2}\right]} \tag{8}$$
where F is the collected sEMG signal with noise and F̃ is the denoised sEMG signal. The L2-norm ratio is the ratio of the squared L2-norm of the signal approximation F_C to that of the input signal F, expressed as a percentage. Ideally, this value is 100; for better denoising, the ratio should be closer to 100.
$$L_2\text{-Norm Ratio} = 100 \cdot \frac{\|F_C\|^{2}}{\|F\|^{2}} \tag{9}$$
Table 1 Comparison between threshold levels
Threshold rule  Sample 1  Sample 2  Sample 3
SURE            0.0014    0.00113   0.00113
Minimax         2.0397    2.2166    2.2166
Universal       3.5322    3.7172    3.7172
BM              0.0010    0.00014   0.00014
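The two evaluation measures of Eqs. 8 and 9 can be sketched as follows. The expectation E is taken as the sample mean, and the function names are hypothetical.

```python
import math


def snr_db(noisy, denoised):
    """SNR in dB: power of the denoised signal over the power of the
    removed component (noisy minus denoised)."""
    signal_power = sum(g * g for g in denoised) / len(denoised)
    noise_power = sum((f - g) ** 2
                      for f, g in zip(noisy, denoised)) / len(noisy)
    return 10.0 * math.log10(signal_power / noise_power)


def l2_norm_ratio(approx, original):
    """Percent ratio of squared L2 norms, 100 * ||F_C||^2 / ||F||^2."""
    return 100.0 * sum(a * a for a in approx) / sum(f * f for f in original)
```

For example, if the denoised signal has unit power and the removed component has power 0.01, the SNR is 20 dB. The SNR is undefined when nothing is removed (zero noise power), which a production implementation would need to handle.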
2.7 Experiment
In wavelet shrinkage-based denoising, it is important to determine the wavelet function that gives the best SNR values. The following study was performed on each real sEMG signal individually. The sEMG signals are divided into segments of 2000 samples each, and each segment is decomposed up to level 4. The DWT coefficients from levels 1 to 4 correspond to four frequency sub-bands: D1 (2–4 kHz), D2 (1–2 kHz), D3 (0.5–1 kHz), and D4 (0.25–0.5 kHz). D4 contains the dominant frequency of sEMG. The four threshold selection rules (Universal, SURE, Minimax, and Birge–Massart) are applied one by one to estimate the threshold, at all four decomposition levels in the level-dependent (LD) case and at the finest level in the level-independent (LID) case. All detail coefficients are thresholded using soft thresholding, and the IDWT is applied to reconstruct the denoised sEMG signal. Finally, the SNR values in dB are recorded. The whole process is repeated for each of the 48 wavelets. The SNR in dB of the reconstructed signal is reported in Table 2.
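The per-segment procedure described above can be sketched end to end. To stay free of external dependencies, the sketch assumes the Haar wavelet and the universal threshold (natural logarithm) as stand-ins for the 48 wavelets and four rules that the study compares; all function names are hypothetical, and the segment length must be divisible by 2^level.

```python
import math
import statistics

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar filter gain


def wavedec(x, level):
    """Haar decomposition: returns (A_level, [D_1, ..., D_level])."""
    approx, details = list(x), []
    for _ in range(level):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) * S for a, b in pairs])
        approx = [(a + b) * S for a, b in pairs]
    return approx, details


def waverec(approx, details):
    """Inverse of wavedec, rebuilding from the coarsest level up."""
    x = list(approx)
    for d in reversed(details):
        x = [v for a, c in zip(x, d) for v in ((a + c) * S, (a - c) * S)]
    return x


def soft(c, lam):
    """Soft thresholding of a single coefficient."""
    return math.copysign(abs(c) - lam, c) if abs(c) > lam else 0.0


def sigma_of(d):
    """Median-based noise estimate, median(|cD|) / 0.6745."""
    return statistics.median([abs(c) for c in d]) / 0.6745


def denoise_segment(segment, level=4, level_dependent=True):
    """Decompose, soft-threshold every detail level, reconstruct."""
    approx, details = wavedec(segment, level)
    fine_sigma = sigma_of(details[0])
    shrunk = []
    for d in details:
        sigma = sigma_of(d) if level_dependent else fine_sigma
        n = len(d) if level_dependent else len(segment)
        lam = sigma * math.sqrt(2.0 * math.log(n))
        shrunk.append([soft(c, lam) for c in d])
    return waverec(approx, shrunk)


def segments(signal, size=2000):
    """Split a recording into non-overlapping full-length segments."""
    return [signal[i:i + size]
            for i in range(0, len(signal) - size + 1, size)]
```

A recording would be processed as `[denoise_segment(s) for s in segments(recording)]`, after which the SNR of each reconstructed segment can be recorded. Swapping the wavelet or the threshold rule only changes `wavedec`/`waverec` or the computation of `lam`.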
3 Results
Table 1 shows the estimated threshold level for all the estimators using the level-independent method. For the different sEMG samples, the Universal rule always selects the highest threshold level, and SURE and BM always select the lowest. For three
samples of sEMG signal, the order of the thresholds selected by the estimators is Universal > Minimax > SURE > BM. Figure 3 shows the proposed method for sEMG denoising using BM thresholding. When applying BM, the number of coefficients to be kept, K, is calculated from the number of coefficients in the sub-band matrix m and α (see Eq. 5). As reported in previous studies, the typical value of α and j0 is 3 [17]. It is selected based on the background noise level of the sEMG signal. To evaluate the denoising performance more
Fig. 3 Proposed wavelet thresholding technique for sEMG denoising using BM
critically, the value of α is varied from 0.5 to 4, and the resulting L2-norm ratios are reported in Table 4. Table 2 gives the SNR in dB of the reconstructed sEMG signal for 48 different mother wavelets, four threshold estimation techniques, and two thresholding methods. The experiment considers the fourth-level wavelet scale. According to the SNR values, LD outperforms LID thresholding, and the best denoising performance is obtained with the sym5 (SNR 24.62 dB), sym8 (26.58 dB), coif5 (26.52 dB), and db8 (26.77 dB) mother wavelets. The eighth-order Daubechies (db8) and fifth-order Coiflet (coif5) wavelets are found to be the best for sEMG reconstruction. A better SNR is obtained with the coif5 wavelet because its shape closely matches that of sEMG. Another indicator of recovered signal strength, the L2-norm ratio of the denoised signal to the collected sEMG signal, is given in Table 3. The results show that the best thresholding rule is BM, followed by SURE. The BM threshold rule is the most conservative and is more suitable when the signal details lie near the noise range. Because the magnitudes of the sEMG signal and the noise are comparable, BM is best suited for denoising sEMG. Universal thresholding performs worst among all estimators because it estimates the highest noise level, which causes smaller sEMG signal coefficients to be discarded in the thresholding step. Table 4 gives the L2-norm ratio calculated for three sEMG samples using various values of α in the BM algorithm. The results confirm that α should be equal to 3 for denoising. Figure 4 illustrates the original sEMG signal (a) and its denoised versions using the Birge–Massart (b), Universal (c), SURE (d), and Minimax (e) rules. The original signal contains high-frequency noise and the desired signal. Universal thresholding removes low-amplitude, high-frequency components, but it also flattens the peaks and smooths the sharp edges.
It is worth noting that all the peaks, valleys, corners, and edges are better preserved in the signal denoised using BM than with all other techniques.
4 Discussion
In sEMG denoising with wavelet-based thresholding, non-parametric techniques such as Universal, SURE, and Minimax select a fixed threshold level. However, the sEMG signal is contaminated by different noise sources of varying frequencies and amplitudes. Hence, an adaptive threshold selection method based on the Birge–Massart strategy is proposed for sEMG denoising. The quantitative analysis shows that the Universal rule selects the highest threshold level, and SURE and BM always select the lowest. All the thresholding methods except BM are fixed-level estimators, whereas BM is a more flexible estimator because the thresholds can be easily controlled by changing α. Depending on the background noise level of the sEMG signal, the values of α and j0 can be optimized. In this research work, the value of j0 is 3, and α is optimized as 3 for
Table 2 SNR in dB for wavelet families and thresholding methods (decomposition level = 4)
S. No.  Wavelet   Level independent                           Level dependent
        family    Universal  SURE       Minimax    BM         Universal  SURE       Minimax    BM
1       Sym1      15.48      17.01      16.48      17.73      15.59      17.14      16.57      17.94
2       Sym2      19.42      20.35      19.82      21.07      19.53      20.48      19.91      21.28
3       Sym3      21.01      22.47      21.94      23.19      21.12      22.6       22.03      23.4
4       Sym4      22.08      23.31      22.78      24.03      22.19      23.44      22.87      24.24
5       Sym5      22.56      23.69      23.16      24.41      22.67      23.82      23.25      24.62
6       Sym6      22.57      23.32      22.79      24.04      22.68      23.45      22.88      24.25
7       Sym7      22.84      23.37      22.84      24.09      22.95      23.5       22.93      24.3
8       Sym8      24.51      25.65      25.12      26.37      24.63      25.78      25.21      26.58
9       db1       15.42      15.25      14.72      15.97      15.53      15.38      14.81      16.18
10      db2       18.46      19.01      18.48      19.73      18.57      19.16      18.57      19.94
11      db3       21.02      22.58      22.05      23.3       21.13      22.73      22.14      23.51
12      db4       21.95      23.43      22.9       24.15      22.06      23.58      22.99      24.36
13      db5       22.56      24.05      23.52      24.77      22.67      24.2       23.61      24.98
14      db6       22.57      24.04      23.51      24.76      22.68      24.19      23.6       24.97
15      db7       22.84      24.39      23.86      25.11      22.95      24.54      23.95      25.32
16      db8       24.36      25.84      25.31      26.56      24.47      25.99      25.4       26.77
17      db9       23.3       22.85      22.32      23.57      23.41      23         22.41      23.78
18      db10      22.56      24.05      23.52      24.77      22.67      24.2       23.61      24.98
19      coif1     20.61      22.16      21.63      22.88      20.72      22.31      21.72      23.09
20      coif2     22.97      24.02      23.47      24.74      23.08      24.17      23.56      24.95
21      coif3     23.92      24.58      24.05      25.3       24.03      24.73      24.14      25.51
22      coif4     24.38      25.38      24.85      26.1       24.49      25.53      24.94      26.31
23      coif5     24.66      25.59      25.06      26.31      24.77      25.74      25.15      26.52
24      haar      16.64      17.96      17.43      18.68      16.75      18.11      17.52      18.89
25      Bior1.1   15.48      17.01      16.48      17.73      15.59      17.14      16.57      17.94
26      Bior1.3   19.42      20.35      19.82      21.07      19.53      20.48      19.91      21.28
27      Bior1.5   21.01      22.47      21.94      23.19      21.12      22.6       22.03      23.4
28      Bior1.7   22.08      23.31      22.78      24.03      22.19      23.44      22.87      24.24
29      Bior1.9   22.56      23.69      23.16      24.41      22.67      23.82      23.25      24.62
30      Bior2.2   22.57      23.32      22.79      24.04      22.68      23.45      22.88      24.25
31      Bior2.4   22.84      23.37      22.84      24.09      22.95      23.5       22.93      24.3
32      Bior2.6   22.51      25.65      25.12      24.37      24.63      25.78      25.21      24.58
33      Bior2.8   15.42      15.25      14.72      15.97      15.53      15.38      14.81      16.18
34      Bior3.1   18.46      19.01      18.48      19.73      18.57      19.16      18.57      19.94
35      Bior3.3   21.02      22.58      22.05      23.3       21.13      22.73      22.14      23.51
36      Bior3.5   21.95      23.43      22.9       24.15      22.06      23.58      22.99      24.36
37      Bior3.7   22.56      24.05      23.52      24.77      22.67      24.2       23.61      24.98
38      Bior3.9   24.51      25.65      25.12      26.37      24.63      25.78      25.21      26.58
39      rbior1.1  22.84      24.39      23.86      25.11      22.95      24.54      23.95      25.32
40      rbior1.3  24.36      25.84      25.31      26.56      24.47      25.99      25.4       26.77
41      rbior1.5  23.3       22.85      22.32      23.57      23.41      23         22.41      23.78
42      rbior1.7  22.56      24.05      23.52      24.77      22.67      24.2       23.61      24.98
43      rbior1.9  20.61      22.16      21.63      22.88      20.72      22.31      21.72      23.09
44      rbior2.2  22.97      24.02      23.47      24.74      23.08      24.17      23.56      24.95
45      rbior2.4  23.92      24.58      24.05      25.3       24.03      24.73      24.14      25.51
46      rbior2.6  24.38      25.38      24.85      26.1       24.49      25.53      24.94      26.31
47      rbior2.8  24.66      25.59      25.06      26.31      24.77      25.74      25.15      26.52
48      rbior3.1  16.64      17.96      17.43      18.68      16.75      18.11      17.52      18.89

Table 3 Performance comparison (L2-norm ratio)
Threshold rule  Sample 1  Sample 2  Sample 3
SURE            76.85     78.12     77.43
Minimax         26.61     28.11     29.22
Universal       10.15     15.33     12.51
BM              89.94     90.01     85.34

Table 4 L2-norm ratio for various values of α
α               Sample 1  Sample 2  Sample 3
0.5             82.12     85.43     82.27
1               82.17     86.76     83.34
1.5             83.94     87.78     83.44
2               84.94     88.10     84.45
2.5             86.95     88.61     84.91
3               89.94     90.01     85.34
3.5             89.44     89.71     85.12
4               88.67     89.33     84.14
better denoising performance. The BM threshold rule is the most conservative and more suitable when the signal details lie near the noise range. The quantitative analysis of the different denoising methods shows that level-dependent thresholding outperforms level-independent thresholding for the following reason. The sEMG signal is contaminated by various forms of
Fig. 4 sEMG segment (Thumb-Index) a Original and denoised using b BM c Universal d SURE e Minimax
colored Gaussian and non-colored Gaussian noises. To deal with these noises, level-dependent thresholding is proposed, which removes noise at each decomposition level. This removes the noise from the reconstructed sEMG at every scale (or frequency sub-band) more effectively than level-independent thresholding, which calculates the threshold value only at the finest decomposition level, and therefore achieves better denoising performance. This research work proposes wavelet denoising of sEMG using level-dependent thresholding with the Birge–Massart level-dependent estimation technique. The above results support the use of the BM method as the threshold selection rule with LD soft thresholding for sEMG denoising.
5 Future Scope
This work can support a variety of research, such as sEMG feature extraction and prosthetic arm movement classification, in which denoising is a main preprocessing step. It can help researchers choose a satisfactory sEMG denoising method.
6 Conclusion
This paper has explored the application of parametric Birge–Massart wavelet denoising to the reduction of WGN and colored noise in sEMG signals. The results show that the fifth-order Coiflet wavelet (coif5) and the eighth-order Daubechies wavelet (db8) with decomposition level 4 deliver better performance than the other options. This
is because coif5 and db8 most closely match the shape of the sEMG signal. Level-dependent thresholding with soft thresholding performs best among the wavelet denoising algorithms considered. Among the thresholding methods, the Universal method estimates the maximum threshold level, SURE estimates the minimum threshold level, and Minimax selects a threshold level between the two. The Birge–Massart strategy is more flexible in threshold level selection because it depends on the value of α, which can be set to deliver optimal denoising performance.
References
1. M.S. Hussain, M.B.I. Reaz, F. Mohd-Yasin, M.K. Khaw, Denoising and analysis of surface EMG signals, in 5th WSEAS International Conference on Circuits, Systems, Electronics, Control & Signal Processing (Nov 2006), pp. 306–308
2. Y. Feng, S. Thanagasundram, F.S. Schlindwein, Discrete wavelet-based thresholding study on acoustic emission signals to detect bearing defect on a rotating machine (2006)
3. G. Luo, D. Zhang, D.D. Baleanu, Wavelet denoising. Advances in wavelet theory and their applications in engineering, physics and technology (2012), p. 634
4. D.F. Guo, W.H. Zhu, Z.M. Gao, J.Q. Zhang, A study of wavelet thresholding denoising, in WCC 2000-ICSP 2000, in 2000 5th International Conference on Signal Processing Proceedings, 16th World Computer Congress 2000, vol. 1 (IEEE, 2000), pp. 329–332
5. C.F. Jiang, S.L. Kuo, A comparative study of wavelet denoising of surface electromyographic signals, in 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE, 2007), pp. 1868–1871
6. A.A. Ghanbari, M.N. Kousarrizi, M. Teshnehlab, M. Aliyari, An evolutionary artifact rejection method for brain computer interface using ICA. Int. J. Electric. Comput. Sci. 9(9), 48–53 (2009)
7. A. Phinyomark, C. Limsakul, P. Phukpattaranont, An optimal wavelet function based on wavelet denoising for multifunction myoelectric control, in 2009 6th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, vol. 2 (IEEE, May 2009), pp. 1098–1101
8. A. Phinyomark, C. Limsakul, P. Phukpattaranont, EMG denoising estimation based on adaptive wavelet thresholding for multifunction myoelectric control, in 2009 Innovative Technologies in Intelligent Systems and Industrial Applications (IEEE, July 2009), pp. 171–176
9. M. Khezri, M. Jahed, Surface electromyogram signal estimation based on wavelet thresholding technique, in 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE, Aug 2008), pp. 4752–4755
10. I.M. Johnstone, B.W. Silverman, Wavelet threshold estimators for data with correlated noise. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 59(2), 319–351 (1997)
11. A. Phinyomark, C. Limsakul, P. Phukpattaranont, A comparative study of wavelet denoising for multifunction myoelectric control, in 2009 International Conference on Computer and Automation Engineering (IEEE, Mar 2009), pp. 21–25
12. A. Phinyomark, C. Limsakul, P. Phukpattaranont, EMG signal estimation based on adaptive wavelet shrinkage for multifunction myoelectric control, in ECTI-CON2010: The 2010 ECTI International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (IEEE, May 2010), pp. 322–326
13. V.P. Gopi, M. Pavithran, T. Nishanth, S. Balaji, V. Rajavelu, P. Palanisamy, A novel wavelet based denoising algorithm using level dependent thresholding, in 2014 International Conference on Electronics and Communication Systems (ICECS) (IEEE, Feb 2014), pp. 1–6
14. X. Chen, D. Wu, Y. He, S. Liu, A new signal de-noising algorithm combining improved thresholding and patternsearch algorithm, in 2008 International Conference on Machine Learning and Cybernetics, vol. 5 (IEEE, July 2008), pp. 2729–2733
15. U. Baspinar, V.Y. Senyurek, B. Dogan, H.S. Varol, A comparative study of denoising sEMG signals. Turkish J. Electric. Eng. Comput. Sci. 23(4), 931–944 (2015)
16. I.M. Johnstone, B.W. Silverman, Wavelet threshold estimators for data with correlated noise. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 59(2), 319–351 (1997)
17. M.A. Awal, S.S. Mostafa, M. Ahmad, M.A. Rashid, An adaptive level dependent wavelet thresholding for ECG denoising. Biocybernet. Biomed. Eng. 34(4), 238–249 (2014)
An Efficient Machine Learning Approach to Recognize Dynamic Context and Action Recommendations for Attacks in Enterprise Network
K. B. Swetha and G. C. Banu Prakash
Abstract The size of computer networks and the number of applications built on them grow exponentially with the rapid advancement of modern technology. Meanwhile, a significant increase in cyber-attacks on data networks has been observed. An intrusion detection system (IDS) is the major layer of defense in a data network and thus plays a vital role in detecting or forewarning of any kind of intrusion. This work uses network packet information to identify DoS/DDoS attacks with a machine learning model that classifies network packets before they reach the application. The goal is to use a machine learning/deep reinforcement learning algorithm to detect anomalies in the incoming network traffic.
Keywords TCP-IP · DDoS attacks · Enterprise networks · Intrusion detection system (IDS) · k-nearest neighbor network · Deep reinforcement learning · Confusion model
1 Introduction
Development of an efficient framework using location and sensing parameters is essential. This framework should provide services suited to the user context using machine learning algorithms. DDoS attacks account for a major share of network attacks. Distinguishing between legitimate and malicious users is always a challenging task in a typical enterprise network environment. Testing and implementing DDoS detection approaches is difficult due to many factors. Using machine learning approaches, one can attempt to detect attacks and propose solutions to handle malicious attacks. Still, it is very difficult to settle on a single machine learning approach. This article proposes a method that combines deep learning and reinforcement learning (deep reinforcement learning) to efficiently detect DDoS attacks. It also presents a performance evaluation of the proposed supervised algorithm and concludes with an optimal inference. Detailed analyses of the results are discussed in support of the work [1].
K. B. Swetha (B) Department of Information Science and Engineering, R R Institute of Technology, Bangalore, India; Visvesvaraya Technological University, Belagavi, Karnataka, India
G. C. Banu Prakash Department of Computer Science and Engineering, Sir MVIT, Bangalore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_67
2 Background
Intrusion detection is an important responsibility in any ubiquitous computing environment. In today's challenging situations, an enterprise network is expected to work uninterruptedly and provide continuous services. This requires fast and robust machine learning algorithms capable of detecting and classifying threats. Here, dynamic context and action recommendations are proposed for a typical enterprise network [2, 3]. These types of training datasets are usually used for developing intrusion detection frameworks. The main objective of this methodology is to build an environment that behaves intelligently over the training dataset, generates rewards, and optimizes its behavior with an adversarial objective.
3 Types of DoS Attacks
The DoS HULK tool attacks Web servers by generating unusual volumes of traffic. This traffic bypasses cache engines and hits the resource pool of the Web server directly. Nmap (Network Mapper) is an open-source network discovery tool that is widely used by network administrators. Nmap supports host discovery, port scanning, ping sweeps, operating system detection, and version detection. A DoS attack is launched to deny targeted users access to a machine or network. The attack floods the target with traffic, which causes it to crash. During this event, services and resources are refused to all legitimate users. hping3 is a network tool capable of sending custom TCP/IP packets and displaying the target's replies. It supports fragmentation, arbitrary packet bodies, and file transfer over the supported protocols, and is therefore also called a packet crafting tool. GoldenEye is a well-known HTTP DoS tool that works with cache-control options. By opening a burst of socket connections, it consumes all available sockets on the server [4]. Weak passwords are a well-known vulnerability, and as a result, the majority of corporate networks are vulnerable. Many users use weak passwords that can be brute forced and exposed in plain text. Hashes are created using a one-way mathematical algorithm, which means they cannot be reversed. As a result, the only way to crack them is to use brute force. Password hashes can be brute forced using FTP Patator and Kali Linux.
FTP Patator is a program that can brute force various types of logins and even ZIP passwords. SYN Flood, also known as a half-open attack, is one type of DDoS attack. It makes the server unavailable by consuming all its resources: it sends SYN packets that tie up server resources, which causes indefinite service delays for legitimate users. UDP Flood sends a huge number of user datagram protocol (UDP) packets to the target server. This flooding exhausts the firewall and denies service to legitimate traffic. Similarly, ICMP Flood attacks with ICMP echo-request packets and makes the target inaccessible to normal traffic. HTTP Flood is a type of volumetric attack that sends HTTP requests to servers. When the target becomes saturated with requests, additional requests made by legitimate users are denied. Malicious actors create botnets to maximize the efficiency of such attacks, since an attacker can use many malware-infected devices to create a large volume of traffic. Generally, there are two varieties of HTTP flood attacks, namely the HTTP GET attack and the HTTP POST attack. In a GET attack, many devices send requests for images, files, etc., to the targeted servers. When the target is overwhelmed, service is denied to legitimate traffic sources. In a POST attack, the server is expected to handle the incoming form request and push the data to a database, and the entire process of handling the form and running the necessary database commands is resource intensive. A POST attack can also take the form of a slow POST attack. A slow attack uses very slow traffic that targets server resources. Low and slow traffic typically needs little bandwidth and is very difficult to mitigate. It is generated in such a way that it is difficult to distinguish from normal traffic. For this reason, the attack can remain unnoticed and undetected for a long time. This results in refused or slowed services for legitimate users of the servers. Low and slow attacks can be launched from a single computer, as they do not require many resources. In this respect, they are more efficient than distributed attacks, which use botnets. Slowloris and RUDY are commonly used tools for low and slow attacks.
4 Structure of the Workflow
In packet capturing, data packets are captured from the network interface in real time, and a .PCAP file containing the packet data is generated [5]. In the feature extraction phase, the PCAP file is used to generate a feature file containing the features needed by the algorithms. During the data cleaning phase, the feature file is cleaned and processed to remove redundant values and features so that the algorithms can run inference faster. The cleaned feature file is first used by a binary classification model, which predicts whether there is an attack or not. If the binary classification model predicts an attack, a multi-class model is run in the next phase, as shown in Fig. 1, to identify which of the 14 attacks it was trained on is taking place. Finally, the results are passed on as JSON through the Predict API.
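The cleaning and two-stage prediction phases can be sketched as a small pipeline. Everything here is illustrative: the feature names (`syn_rate`, `udp_rate`), the two model callables, and the JSON field names are hypothetical placeholders, since the paper does not specify its feature schema or the format of the Predict API.

```python
import json


def clean_features(rows):
    """Data cleaning: drop rows with missing values and exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if None in row.values() or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def predict_flow(row, binary_model, multiclass_model):
    """Stage 1 decides attack / no attack; stage 2 runs only on attacks."""
    if not binary_model(row):
        return {"attack": False, "attack_type": None}
    return {"attack": True, "attack_type": multiclass_model(row)}


def run_pipeline(rows, binary_model, multiclass_model):
    """Clean the feature rows, classify each one, return a JSON string."""
    results = [predict_flow(r, binary_model, multiclass_model)
               for r in clean_features(rows)]
    return json.dumps(results)
```

Any trained classifier can be plugged in as `binary_model` and `multiclass_model`, as long as each accepts one feature row. Running the multi-class model only on flows flagged by the binary model keeps the common no-attack path cheap.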
Fig. 1 Integrated workflow
K. B. Swetha and G. C. Banu Prakash
An Efficient Machine Learning Approach to Recognize Dynamic Context…
5 k-Nearest Neighbor Model
kNN is a non-parametric learning algorithm that can be used for both classification and regression. It finds the k training samples closest to a query point and returns their most frequent label for classification or their average label for regression, as shown in Fig. 2. Typically, the following steps are involved:
• Inspect the dataset for an unseen/new point that has to be classified into one of the three groups.
• Calculate the distances between the given point and the other points in the data. Common distances include Euclidean, Hamming, and Manhattan.
• Sort the nearest neighbors (NNs) of the unseen/new point by distance in increasing order, i.e., shortest ones on top.
• Decide which label the point should be assigned: predict the label of the unseen/new example either by choosing the most frequent label among its k nearest neighbors (for classification) or by calculating their mean (for regression).
What is the right value of k? There is no hard and fast rule for selecting the optimal value of k; it varies from dataset to dataset [6]. In general, a data scientist wants to keep k small enough to exclude samples of other classes but large enough to minimize the effect of noise in the data. One way to choose k is to try values from 1 to 10 and use cross-validation to select the value that gives the best accuracy. A large value of k results in high bias, and a small value in low bias but high variance.
The software used comprises Python, Scikit-Learn, Pandas, Matplotlib, and NumPy. The hardware used comprises an Intel(R) Core(TM) i5 CPU, 16 GB RAM, and a suitable GPU, running the Ubuntu 19.10 operating system. The performance evaluation measures used are accuracy, recall, precision, and F1 score.
Fig. 2 Typical clustering through kNN
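The steps listed in this section can be sketched in plain Python. This is an illustrative sketch, not the Scikit-Learn model used in the study: the toy points, the function names, and the use of leave-one-out cross-validation (one simple form of cross-validation) are assumptions made for the example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train: List[Sample], query: Sequence[float], k: int) -> str:
    """Sort by distance, keep the k nearest, return the most frequent label."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def knn_regress(
    train: List[Tuple[Sequence[float], float]], query: Sequence[float], k: int
) -> float:
    """Same neighbour search, but average the neighbours' numeric targets."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    return sum(value for _, value in neighbours) / len(neighbours)


def loo_accuracy(train: List[Sample], k: int) -> float:
    """Leave-one-out cross-validation accuracy for a given k."""
    hits = 0
    for i, (point, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += knn_classify(rest, point, k) == label
    return hits / len(train)


# Toy data: two well-separated groups of three points each.
train = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((8, 8), "B"), ((9, 8), "B"), ((8, 9), "B"),
]
label = knn_classify(train, (2, 2), k=3)
mean_value = knn_regress(
    [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)], (2,), k=3
)
best_k = max(range(1, 6), key=lambda k: loo_accuracy(train, k))
```

On this toy set, k = 5 misclassifies every held-out point, because the five remaining samples always contain more points of the other group; this is a small instance of the high-bias behaviour of large k noted above.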
6 Deep Reinforcement Learning (DRL)
The DRL algorithm combines deep learning and reinforcement learning concepts. Its components are the environment, agent, action, states, and reward. The policy function is used to choose the next action based on the current state, and the environment provides the rewards, as shown in Fig. 3, where the prime denotes the new state or reward. The algorithm optimizes the policy function using these new states and rewards [4]. The main aim of the DRL framework is to produce actions for the agent that maximize the total reward [7]. Here, the environment is not real but simulated. The attack classifier is the main agent. The actions are the predicted labels [8]. Network traffic samples are used as input states. A fast neural-network-based classifier is the basis of the DRL algorithm's predictions and is used to build the policy function. A reinforcement learning model is used to train the policy function, so that the learning process and the behavior of the environment are controlled simultaneously. A supervised learning algorithm is guided by a training dataset, which is used to build a framework for the intrusion detection system. The training dataset encompasses various network features and intrusion labels [9]. This supervised paradigm is blended with a reinforcement learning algorithm in a simulated environment. The environment acts as an agent with intelligent behavior: new samples are generated randomly from the input training dataset [10], and rewards are assigned based on the quality of the classifier's predictions. The simulated environment then adjusts its initial behavior with an adversarial objective, so that it increases the difficulty of the predictions the classifier has to make. Thus, the classifier and the simulated environment work against each other to obtain positive rewards.
In this study, the classifier and the environment are considered the main and secondary agents, respectively. A modified reinforcement learning algorithm, which takes the behavior of the environment into account in the learning process, is proposed to increase performance. The simulated environment is built by integrating network traffic samples and rewards.
Fig. 3 A simple framework of the DRL algorithm
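The opposing objectives of the two agents can be illustrated with a small reward function. This is a sketch under assumptions: the text only states that the classifier and the environment work against each other, so the reward magnitudes of +1 and -1 and the function name `rewards` are hypothetical choices made for the example.

```python
from typing import List, Tuple


def rewards(predicted: str, true_label: str) -> Tuple[int, int]:
    """Return (classifier_reward, environment_reward) for one sample.

    Assumption: +1 / -1 magnitudes; the environment always receives
    the opposite of the classifier's reward.
    """
    classifier_reward = 1 if predicted == true_label else -1
    return classifier_reward, -classifier_reward


# (predicted label, true label) pairs for a toy episode.
episode: List[Tuple[str, str]] = [
    ("DoS", "DoS"),
    ("Normal", "DoS"),
    ("Probe", "Probe"),
]
classifier_total = sum(rewards(p, t)[0] for p, t in episode)
environment_total = sum(rewards(p, t)[1] for p, t in episode)
```

Because the two totals always sum to zero, the environment can only raise its own reward by presenting samples that the classifier gets wrong, which pushes training toward harder examples.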
Samples are selected randomly from pre-recorded datasets. The environment is given a behavior whose objective is to reduce the agent's reward [11]. This is achieved by increasing the number of incorrect predictions made by the classifier, which forces the model to learn from more difficult samples. The agent's reward is positive for a correct prediction and negative otherwise. For the environment, the rewards are the opposite of the agent's: the environment always tries to reduce the agent's reward during training. The algorithm is trained to maximize the sum of the rewards. The data preparation sequence is: Tcpdump → PCAP file capture → feature extraction → pre-processing of the dataset to make it ready for analysis.
To account for new types of attack and zero-day attacks, a feedback loop is introduced in the deep learning model. After identifying the attack type, a security operations center (SOC) analyst labels the data, and the labeled data is then merged with the original training dataset through the feedback/retraining loop [12, 13]. The model retrains itself on this new dataset, producing a new model that is then used for intrusion detection. Thus, the performance of the model improves through the retraining process. Retraining takes place whenever a new type of attack has been identified by the SOC analyst [14]. For the feedback process, a new feature was developed to collect the SOC analyst's feedback, which is then used to retrain the model. The SOC analyst can capture real-time attacks that appear in the logs but were missed or misidentified by the model [15]. The analyst is provided with a screen to search the CSV database (converted PCAP files) by source IP, date, and timestamp, and can label the data, for example as a DoS attack. A weekly batch job collects and processes all the newly labeled data and recreates the model [16]. The feedback features are shown in Fig. 4. The feedback form label is shown in Fig. 5. The feedback form with sample comments is shown in Fig. 6. Both the sample feedback form and the features of the datasets can be viewed in Fig. 7.
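The feedback loop described above can be sketched as a relabel-and-merge step that a weekly batch job might run before retraining. This is an illustration only: the record fields `source_ip`, `timestamp`, and `label`, the sample addresses, and both function names are hypothetical, and the actual retraining call is left out.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Key = Tuple[str, str]  # (source IP, timestamp), as searched by the analyst


def apply_feedback(records: List[Record], feedback: Dict[Key, str]) -> List[Record]:
    """Return relabelled copies of the records an analyst gave feedback on."""
    relabelled = []
    for record in records:
        key = (record["source_ip"], record["timestamp"])
        if key in feedback:
            relabelled.append({**record, "label": feedback[key]})
    return relabelled


def merge_for_retraining(training: List[Record], relabelled: List[Record]) -> List[Record]:
    """Append analyst-labelled records to the original training set."""
    return training + relabelled


training_set = [
    {"source_ip": "10.0.0.5", "timestamp": "2021-03-01T10:00:00", "label": "Normal"},
]
captured = [
    {"source_ip": "10.0.0.9", "timestamp": "2021-03-02T11:30:00", "label": "Normal"},
    {"source_ip": "10.0.0.7", "timestamp": "2021-03-02T11:31:00", "label": "Normal"},
]
# The analyst marks one missed record as a DoS attack.
analyst_feedback = {("10.0.0.9", "2021-03-02T11:30:00"): "DoS"}
weekly_batch = merge_for_retraining(
    training_set, apply_feedback(captured, analyst_feedback)
)
```

Only the records that received feedback are added, and the captured records themselves are left unchanged, so the original logs remain available for later review.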
Fig. 4 Feedback feature introduced in the platform
Fig. 5 Feedback form—label
Fig. 6 Feedback form with comments
7 Training Results
In this section, the behaviors of four classifiers over attacks are discussed. Figure 8 shows the class frequency, classifier types, and various attacks. The distribution of attacks during training is shown in Fig. 9, in which the defense and attack behavior can be identified from the reward and loss per episode. The total rewards are higher for the defender agent than for the attacker, while the loss behaves similarly for both. The performance on the test dataset for different epochs is illustrated in Fig. 10, where the behavior of the different classifiers can be analyzed at intervals of 10 epochs. The total reward recorded is 18,164 for 22,544 samples, which corresponds to an accuracy of 80.57%. The performance measures on the test data are accuracy = 80.57%, F1 = 0.7965, precision = 0.8005, and recall = 0.8057. Along similar lines, Table 1 presents the results of the classifiers used. The performance of the proposed model is compared with that of other models in Fig. 11. The normalized confusion matrix for the different types of classifiers is shown in Fig. 12. The predicted labels of the proposed model are promising. The performance of the developed model before and after retraining is shown in Fig. 13, and the improvement in all metrics after retraining is shown in Fig. 14. The retrained model shows an improved accuracy of 0.9. The difference in the confusion matrix before and after retraining is shown in Fig. 15. Retraining increases the agreement between the predicted and true labels.
8 Conclusions
This article proposes combining deep learning and reinforcement learning methods to recognize dynamic context and recommend actions in a typical ubiquitous environment. Among the many applications available in the computing domain, DDoS attacks in enterprise networks were considered as the case study. Various possible network attacks were studied, and a potential solution is proposed through this research work. The deep reinforcement model promises better results than other machine learning classifiers. A network intrusion detection model based on a deep reinforcement learning algorithm is proposed. In addition, the retraining loop is a novel approach in the field of network intrusion detection. The model with the retraining loop achieves better accuracy and performance than the machine learning algorithms considered. In future, researchers can attempt to validate the performance of the deep reinforcement model for other ubiquitous applications.
Fig. 7 Features of the dataset
Fig. 8 Various classifiers of different attacks
Fig. 9 Distribution of attacks during training
Fig. 10 Performance of the test dataset
Table 1 Results from various classifiers and their score
Class    Estimated  Correct  Total  F1_score (%)
Normal   11,632     9215     9712   86.3475
DoS      6537       6206     7458   88.6888
Probe    2326       1841     2421   77.5648
R2L      1781       855      2753   37.715
U2R      268        47       200    20.0855
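The figures in Table 1 can be cross-checked with a short calculation. The sketch below assumes that "Estimated" is the number of samples predicted as a class, "Correct" the number of those predictions that are right, and "Total" the true number of samples in the class; this reading is an interpretation, although it reproduces the F1 column. The support-weighted averaging is likewise an assumption that happens to match the reported overall scores.

```python
from typing import Dict, Tuple

# class -> (estimated, correct, total), copied from Table 1
table: Dict[str, Tuple[int, int, int]] = {
    "Normal": (11632, 9215, 9712),
    "DoS": (6537, 6206, 7458),
    "Probe": (2326, 1841, 2421),
    "R2L": (1781, 855, 2753),
    "U2R": (268, 47, 200),
}


def prf(estimated: int, correct: int, total: int) -> Tuple[float, float, float]:
    """Per-class precision, recall, and F1 from the three counts."""
    precision = correct / estimated
    recall = correct / total
    f1 = 2 * correct / (estimated + total)  # equals 2PR / (P + R)
    return precision, recall, f1


total_samples = sum(total for _, _, total in table.values())
total_correct = sum(correct for _, correct, _ in table.values())
accuracy = total_correct / total_samples

# Averages weighted by the true class size (support).
weighted_precision = sum(t * prf(e, c, t)[0] for e, c, t in table.values()) / total_samples
weighted_f1 = sum(t * prf(e, c, t)[2] for e, c, t in table.values()) / total_samples

for name, counts in table.items():
    print(f"{name}: F1 = {100 * prf(*counts)[2]:.4f}%")
print(f"accuracy = {100 * accuracy:.2f}%")
```

The correct counts sum to 18,164 and the class totals to 22,544, which agrees with the total reward and sample count reported in the text, and the low F1 values for R2L and U2R show that the overall accuracy is dominated by the Normal and DoS classes.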
Fig. 11 Performance comparison with other models and our proposed model
Fig. 12 Normalized confusion matrix
Fig. 13 Performance of the model before and after the retraining
Fig. 14 Improvement after retraining the model
Fig. 15 Behavior of confusion matrix before and after retraining
References
1. N. Bindra, M. Sood, Detecting DDoS attacks using machine learning techniques and contemporary intrusion detection dataset. Aut. Control Comp. Sci. 53, 419–428 (2019). https://doi.org/10.3103/S0146411619050043
2. S. Islam, S. Hassanzadeh Amin, Prediction of probable backorder scenarios in the supply chain using distributed random forest and gradient boosting machine learning techniques. J. Big Data (2020)
3. N. Anitha, An investigation into the detection and mitigation of denial of service (DoS) attacks (Heidelberg (in press), Monograph. Springer, 2011)
4. G. Caminero, M. Lopez-Martin, B. Carro, Adversarial environment reinforcement learning algorithm for intrusion detection. Comput. Netw. 159, 96–109 (2019). ISSN 1389-1286. https://doi.org/10.1016/j.comnet.2019.05.013
5. V. Basu, An approach to detect DDoS attack with A.I. Towards Data Science, Oct 2020. Weblink
6. R. Robinson, C. Thomas, Ranking of machine learning algorithms based on the performance in classifying DDoS attacks, in Proceedings of the IEEE Recent Advances in Intelligent Computational Systems (RAICS), Trivandrum (2015), pp. 185–190
7. A. Chonka, W. Zhou, J. Singh, Y. Xiang, Detecting and tracing DDoS attacks by intelligent decision prototype, in 2008 Sixth Annual IEEE International Conference on Pervasive Computing and Communications (PerCom), Hong Kong (2008), pp. 578–583
8. S. Saad et al., Detecting P2P botnets through network behavior analysis and machine learning, in 2011 Ninth Annual International Conference on Privacy, Security and Trust, Montreal, QC (2011), pp. 174–180
9. I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion detection dataset and intrusion traffic characterization, in 4th International Conference on Information Systems Security and Privacy (ICISSP), Portugal (2018)
10. M. Almseidin, S. Alzubi, M. Kovacs, Alkasassbeh, Evaluation of machine learning algorithms for intrusion detection system, in 2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) (2017), pp. 277–282
11. A. Gharib, I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, An evaluation framework for intrusion detection dataset. Proc. Int. Conf. Inform. Sci. Secur. (ICISS) 2016, 1–6 (2016)
12. D. Sivaganesan, Novel influence maximization algorithm for social network behavior management. J. ISMAC 3(01), 60–68 (2021)
13. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Parad. (JSCP) 3(02), 70–82 (2021)
14. C.V. Joe, J.S. Raj, Location-based orientation context dependent recommender system for users. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 14–23 (2021)
15. P.H. Meland, S. Tokas, G. Erdogan, K. Bernsmed, A.A. Omerovic, Systematic mapping study on cyber security indicator data. Electronics 10, 1092 (2021). https://doi.org/10.3390/electronics10091092
16. B.S. Rachit, P.R. Ragiri, Security trends in internet of things: a survey. SN Appl. Sci. 3, 121 (2021). https://doi.org/10.1007/s42452-021-04156-9
Convolutional Neural Network Based on Self-Driving Autonomous Vehicle (CNN)
G. Babu Naik, Prerit Ameta, N. Baba Shayeer, B. Rakesh, and S. Kavya Dravida
Abstract The development of artificial intelligence has functioned as a trigger in the technological world. Things that were previously a figment of our imagination are now becoming a reality. A good example is the creation of the self-driving automobile. The days are coming when you can work or even sleep in your car and yet arrive safely at your destination without touching the steering wheel or accelerator. This research presents a practical model of a self-driving car that can travel from one location to another on a variety of tracks, including curved tracks, straight tracks, and straight tracks directly followed by curved tracks. Images from the real environment are sent to a convolutional neural network via a camera module mounted on top of the automobile. The network predicts one of the following directions: right, left, forward, or stop. An Arduino signal is then sent to the remote-controlled car's controller, and the automobile travels to the required destination without human intervention.
G. Babu Naik (B) · P. Ameta · N. Baba Shayeer · B. Rakesh · S. Kavya Dravida
Department of Electrical and Electronics Engineering, BMS Institute of Technology and Management, Bengaluru, Karnataka, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_68
1 Introduction
Nowadays, IT engineers are focussing on technologies such as machine learning, deep learning, and data science, an area that offers a wide range of methods and strategies for creating, building, and, most importantly, innovating. In the last few years, these trends have spawned a slew of novel approaches that provide solutions for a broad range of applications. There are firms that offer solutions that process large amounts of data and forecast or propose future commercial engagements based on that data, and there are also companies that predict and design appropriate medical treatments based on a patient's medical history. While these advances are tremendous, one more field has also made huge strides in the last few years [1–5]: self-driving automobiles. Many large corporations have already begun research in this area, and autonomous driving is expected to be one of the most significant advancements of the coming years. The primary motivation for this study was to examine current trends and to observe how one of the most advanced technologies available today, which is making its way into production vehicles, functions. While working on this project, we considered what advantages this type of technology would have if it were widely used in production vehicles. Some follow logically: fewer road deaths, fewer harmful emissions, improved fuel economy, shorter travel times, and so forth. Every year, more than one million people die in traffic accidents around the world, about 3287 deaths every day [6]. In India, 137,000 people died in road accidents [7]. The primary causes of these accidents are speeding, talking on a mobile phone, drunk driving, and disobeying traffic regulations, and the figures are rising every day, which is a big worry. Accidents continue to happen without notice, regardless of how much we strive to promote awareness of the rules of the road and the significance of maintaining a safe driving environment. Errors caused by humans will never be completely eliminated, but they can be minimised. In this situation, technology has unquestionably saved the day. This technology has been innovated and refined at an incredibly rapid pace, from early radar-based collision detection to today's systems. Self-driving automobiles are one of the most talked-about technologies of the contemporary era. What was once only a dream has become a reality. The term self-driving cars refers to vehicles that carry passengers to their destinations with minimal human intervention while prioritising safety. Many companies throughout the world are working hard to make driving a risk-free and safe experience, and they have even started producing prototypes.
Google, Tesla, Mercedes, and a slew of other businesses have produced working prototypes and aim to sell a finished model in the coming years. The self-driving car is projected to have faster reflexes and make more consistent assessments than humans, thereby eliminating the simple errors that cause accidents in the first place. Besides saving lives, the benefits of this technology include better traffic control, as these automobiles, unlike people, follow traffic laws and drive smoothly, reducing congestion. Self-driving cars can also ease parking-space difficulties by enabling a taxi/pooling service for underused automobiles, which we define as cars that are parked for a few hours while the owner is at work, or parked for an extended period in a garage while the owner is on vacation. As a result, land now used for parking can be put to better use. A front-facing camera, radar [8–10], a digitally controlled brake system, and long-range ultrasonic sensors are at the heart of any autopilot system. Radar detects moving vehicles and other moving objects in the vicinity of the vehicle, while the front-facing camera detects and recognises objects such as other cars, trees along the road, driving lanes, humans, traffic lights, and other critical data [11–13]. All this data is collected in real time and fed to a neural network, which outputs the anticipated car behaviour accordingly [14, 15].
2 Developed Systems
Paper [8] describes a study that attempts to improve driving by developing an assistance system. The programme combines lane detection and vehicle recognition to improve driver safety at night. Lane detection localizes the lane markers and thereby detects the lane, whereas vehicle recognition involves extracting taillights and applying a taillight-pairing technique.
2.1 Levels of Driving Automation
In SAE's automation level standards, ‘driving mode’ is defined as ‘a type of driving scenario with characteristic dynamic driving task needs (e.g. highway merging, high-speed cruising, low-speed traffic jam, closed-campus operations)’.
Stage 0: the automated system issues alerts and may intervene momentarily, but it has no sustained control over the vehicle.
Stage 1 (‘hands-on’): the driver and the automated system share control of the car. Examples are adaptive cruise control (ACC), in which the driver controls the steering while the automated system controls the speed, and parking assistance, in which the steering is automated but the speed is controlled manually. The driver must be ready to take full control at any time. Lane-keeping assistance (LKA) Type II is another example of Stage 1 self-driving.
Stage 2 (‘hands-off’): the car is fully controlled by the automated system. The driver must keep a watchful eye on the road and be prepared to take action at any time if the automated system fails. The shorthand ‘hands-off’ should not be interpreted literally. In fact, during SAE 2 driving, contact between hand and wheel is frequently required to show that the driver is ready to react.
Stage 3 (‘eyes-off’): the driver can safely divert his or her attention away from the driving task, for example by texting or watching a movie. Situations that require immediate action, such as emergency braking, are handled by the car. When the vehicle asks the driver to intervene, the driver must be ready to do so within a certain amount of time, as defined by the manufacturer. An example is a traffic jam pilot feature: when the human driver activates it, the car takes full control of all aspects of driving in slow-moving traffic at speeds up to 60 km/h (37 mph).
Stage 4 (‘mind-off’): as Stage 3, except that paying attention to the road is never required for safety, so the driver can sleep or leave the driver's seat without jeopardising safety. Self-driving is supported only in geofenced areas or under certain conditions, such as traffic jams. Outside of these regions or conditions, if the driver does not reclaim control, the vehicle must be capable of safely terminating the journey, for example by parking.
Stage 5 (steering wheel optional): no human interaction is needed at all. A robotic cab is one example.
2.2 Possible Technological Obstacles for Automated Cars
• In chaotic inner-city contexts, artificial intelligence is still unable to perform properly.
• A car's computer, as well as the communication system between cars, might be hacked.
• The sensing and navigation systems of the car are vulnerable to certain kinds of weather (such as snow) and to malicious intervention, such as jamming and spoofing.
• Large-animal avoidance requires recognition and tracking; Volvo discovered that software designed for caribou, deer, and elk was inadequate when it came to kangaroos.
• To function successfully, autonomous vehicles may require extremely high-quality custom maps. If these maps become outdated, the vehicles must be able to fall back on reasonable behaviour.
• The radio spectrum needed for automobile connectivity is in high demand.
• The field programmability of the systems will require a thorough examination of the component supply chain and of product development.
• For automated cars to work optimally, current road infrastructure may need to be changed.
2.3 Scope of the Project
For as long as cars have existed, road safety has been a concern. Every year, over a million people die in traffic accidents around the world, the majority of which might have been avoided. Constantly increasing traffic has resulted in a significant increase in commute time. This affects not only people's productivity but also the environment. Recent advancements in machine learning and artificial intelligence, together with the ever-increasing performance of current computers, have facilitated the use of these technologies in building self-driving automobiles. These automobiles offer benefits in a number of areas: better road safety, shorter travel times, increased productivity, reduced expenditure, environmental friendliness, a solution to the parking problem, better traffic discipline, and potential for new designs. There will be no need for drivers' licences. No special driver certification is required, because these vehicles can be operated by people of various ages and abilities. As a result, using a car as a mode of transportation will place no additional restrictions on drivers or passengers. The cost of infrastructure will not be prohibitive. Existing highways are capable of handling the arrival of self-driving cars, so there is no need for a big revamp. Sensors, cameras, and radar will be installed at intersections to control traffic flow once automobiles become self-driving. Not only will this prevent collisions, but it will also promote a more fuel-efficient flow of vehicles. Driverless car lanes could replace high-occupancy vehicle (HOV) lanes, which would not only promote autonomous travel but also help driverless cars travel more safely and rapidly.
3 The New Technique Model
The goal of the project is to build a monocular-vision autonomous automobile prototype that runs on a Raspberry Pi processor. An HD camera and an ultrasonic sensor are used to transmit vital data from the real environment to the automobile. The car is capable of arriving at the specified destination safely and intelligently, avoiding the risk of human error. Many known techniques, such as lane detection and obstacle detection, are combined to give the car the control it requires. On a computer, a neural network model produces steering predictions from the input images. An Arduino is then used to control the RC car based on these predictions, as shown in Fig. 1. The proposed approach uses a Pi camera connected to a Raspberry Pi in the car to capture images. The Raspberry Pi and the laptop are connected to the same network; the Raspberry Pi sends the captured images, which are given to the convolutional neural network as input. Before being passed to the neural network, the images are converted to greyscale. The model predicts one of four possible outcomes: right, left, forward, or stop. Once the prediction is obtained, the corresponding Arduino signal is triggered, which moves the automobile in the given direction through its controller. For a better understanding, the flow diagram of the developed model is shown in Fig. 2.
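The control flow described in this section (greyscale conversion, a four-way prediction, and a command for the Arduino) can be sketched as follows. This is a hedged illustration, not the authors' code: the luminance weights are the common ITU-R BT.601 values rather than anything stated in the paper, and the class order in `COMMANDS`, the byte codes in `SERIAL_BYTES`, and the function names are hypothetical. The CNN itself is replaced by a list of scores.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def to_greyscale(image: List[List[Pixel]]) -> List[List[int]]:
    """Convert an RGB image to greyscale using BT.601 luminance weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in image
    ]


# Assumed order of the four network outputs.
COMMANDS: Tuple[str, ...] = ("left", "right", "forward", "stop")

# Hypothetical one-byte codes that a sketch on the Arduino could read.
SERIAL_BYTES: Dict[str, bytes] = {
    "left": b"L",
    "right": b"R",
    "forward": b"F",
    "stop": b"S",
}


def choose_command(scores: Sequence[float]) -> str:
    """Pick the command whose network score is highest."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return COMMANDS[best]


grey = to_greyscale([[(255, 255, 255), (0, 0, 0)]])
command = choose_command([0.10, 0.05, 0.80, 0.05])
payload = SERIAL_BYTES[command]
```

In a real setup the payload would be written to the serial port connected to the Arduino; that step is omitted here because it depends on hardware and on libraries not named in the text.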
Fig. 1 Control flow diagram
Fig. 2 Flow diagram
4 Implementation of Hardware and Software Requirement
A Raspberry Pi, a Pi camera, an Arduino microcontroller, and, of course, a remote-controlled toy automobile are used in this project. Let us take a quick look at each piece of hardware.
4.1 The Working of Raspberry Pi
The Raspberry Pi is a low-cost single-board computer with a processor clock speed ranging from 700 MHz to 1.2 GHz (for the Pi 3). On-board memory ranges from 256 MB to 1 GB. The boards have up to four USB ports and an HDMI connection. They also provide a number of GPIO pins that support protocols such as I2C. Wi-Fi and Bluetooth are also available, making the board highly interoperable with other devices, as shown in Fig. 3.
Fig. 3 Chip of Raspberry Pi
Fig. 4 Pi camera
4.2 Pi Camera Working
The Raspberry Pi camera is an excellent tool for capturing time-lapse, slow-motion, and high-definition video. The camera measures 25 mm × 24 mm × 9 mm and is attached to the Raspberry Pi through a flexible ribbon cable with a serial data interface. The image sensor has a five-megapixel resolution and a fixed-focus lens. The camera, shown in Fig. 4, is also useful for security purposes.
4.3 The Work of Microcontroller Arduino
This microcontroller board is based on the ATmega328P. The board has 14 digital input/output pins, six of which can be used as PWM outputs, and six analogue inputs. A 16 MHz quartz crystal, a USB port, a power jack, an ICSP header, and a reset button are also included. The board weighs roughly 25 g and has 32 KB of flash memory and 2 KB of SRAM. Aside from these advantages, the Arduino IDE is extremely user-friendly and is programmed in a C-based language.
4.4 Arduino IDE
The Arduino IDE is the basic platform on which programmes for Arduino boards are built. It features a compile button for compiling the code and an upload button for uploading the code to the board. Programmes created with the Arduino IDE are called sketches and are saved with the .ino extension. Verifying, saving, uploading, including libraries, and monitoring serial data are just a few of the capabilities available in the editor. In addition, the developers have created simple functions that make coding simple and enjoyable.
4.5 Software Implementation
The PC receives the image provided by the camera in order to classify the animal. A database is built, and the sample photographs are saved in it. Index image, image set, and get image are some of the functions in the programme. The image set is a container for a group of photographs. An image search index is created using the index image function. The retrieve image function and the index image function are used together to search for images. The processing system receives the captured image as a query image. The retrieve image function accepts two arguments: a query image and the image database index. The result is the set of indices of the photographs that are visually similar to the query image. In the image ID output, the indices are ranked from the most similar match to the least similar match. The match score ranges from 0 to 1. If the score is 0, the image is not matched. If it is 1, the query image and the stored image are identical. If the score lies between 0 and 1, the query image belongs to the same category as the stored image, which means that the contents of the query image are similar to those of the stored image. If the image's name matches the regular expression for stored images, the animal is one of our cattle; otherwise, it is an intruding animal. If the score is between 0.1 and 0.9, the image is matched to the previously saved image.
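The ranking and score interpretation described above can be sketched as follows. This is a hypothetical illustration: the function names `rank_matches` and `interpret_score` and the sample image IDs are not from the paper, the retrieval step that produces the scores is not shown, and only the 0, 1, and in-between cases stated in the text are modelled.

```python
from typing import Dict, List, Tuple


def rank_matches(scores: Dict[int, float]) -> List[Tuple[int, float]]:
    """Order (image_id, score) pairs from most to least similar."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def interpret_score(score: float) -> str:
    """Map a similarity score in [0, 1] to the three cases in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score == 0.0:
        return "no match"
    if score == 1.0:
        return "identical"
    return "same category"


# Hypothetical scores for three stored images, keyed by image ID.
ranked = rank_matches({7: 0.42, 3: 1.0, 9: 0.0})
verdicts = [interpret_score(score) for _, score in ranked]
```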
5 The Model Proposed Algorithm and Implementation
5.1 Algorithm Approach
Input data must first be preprocessed in order to generate accurate predictions from deep neural networks. The following preprocessing tasks are commonly used in deep learning and image classification:
1. Subtraction of the mean
2. Scaling by a certain factor
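The two preprocessing tasks listed above can be sketched in plain Python. This is a minimal illustration under assumptions: the per-channel mean values and the scale factor are made-up example numbers, and the function name `preprocess` is hypothetical. OpenCV's `cv2.dnn.blobFromImage` is understood to apply a comparable mean subtraction and scaling through its `mean` and `scalefactor` arguments, but it is not used here so that the example stays self-contained.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, ...]


def preprocess(
    image: List[List[Sequence[float]]],
    mean: Sequence[float],
    scale: float,
) -> List[List[Pixel]]:
    """Subtract the per-channel mean from every pixel, then scale the result."""
    return [
        [tuple((channel - m) * scale for channel, m in zip(pixel, mean)) for pixel in row]
        for row in image
    ]


# Example values only: a 1 x 2 image with three channels per pixel.
channel_mean = (104.0, 117.0, 123.0)
image = [[(104, 117, 123), (204, 117, 23)]]
normalised = preprocess(image, channel_mean, scale=0.5)
```

Subtracting the mean centres each channel around zero, and the scale factor brings the values into a range the network was trained on; both constants have to match whatever was used when the model was trained.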
OpenCV's new deep neural network (dnn) module provides two functions that can be used to preprocess images and prepare them for classification with pre-trained deep learning models. The block diagram of the algorithm is shown in Fig. 5.
• Capturing Phase: to detect motion, live images of the area to be monitored must first be acquired.
Fig. 5 Block diagram of algorithm
938
G. Babu Naik et al.
Fig. 6 Flowchart
• Comparıng Phase: detecting motion by comparing current frames taken with previous frames: to see if there is any motion in live images, it needs to be compared the live images provided with the web cam to each other in order to detect changes in these frames and so forecasts the presence of any motion. Preprocessıng: processing in advance is very reliant on the feature extraction method and the type of input image. Few methods are: Denoising: denoising with a Gaussian or simple box filter. Contrast Enhancement: if the image’s grey level is excessively dark or bright, to boost speed, downsampling is used. Binary image morphological operations. Scaling by a certain amount. In contrast to the relevant work provided in the preceding section, this research proposes a visual approach that does not require any machine learning algorithms. As a result, unlike, say, a HAAR cascade classifier, the system will not require any prior training. A subset of the input test photographs was withheld to tune the various constants required by the various types of algorithms used, such as the Canny threshold flowchart shown in Fig. 6.
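The comparing phase can be illustrated with simple frame differencing. The sketch below is an assumption-laden example, not the authors' code: frames are plain 2-D lists of greyscale intensities, and both thresholds (`pixel_threshold`, `changed_fraction`) are hypothetical constants of the kind that would be tuned on withheld images.

```python
def motion_detected(prev_frame, curr_frame, pixel_threshold=25, changed_fraction=0.01):
    """Report motion when enough pixels differ between two consecutive frames.

    Frames are equally sized 2-D lists of greyscale intensities (0-255).
    """
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for previous, current in zip(prev_row, curr_row):
            total += 1
            # A pixel counts as changed when its intensity moved by more than the threshold.
            if abs(current - previous) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= changed_fraction


# Two 10 x 10 frames: identical background, then two pixels brighten sharply.
background = [[10] * 10 for _ in range(10)]
moved = [row[:] for row in background]
moved[5][5] = 200
moved[5][6] = 200
```

Requiring a minimum fraction of changed pixels, rather than any single changed pixel, keeps sensor noise from being reported as motion.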
5.2 Model Proposed
Furthermore, each interface includes a number of examples that help the user learn more about its functionality. The hardware prototype is shown in Fig. 7. This covers the hardware required to create a working prototype. The project's software was used in conjunction with this hardware. The Raspberry Pi camera interface, Arduino IDE, OpenCV, and Spyder environments are used for creating the machine learning models.
Fig. 7 Prototype car
The automobile was tested on a variety of track configurations, including straight, curved, and combined straight and curved. A total of 24 videos were captured, from which photographs were extracted. In all, 10,868 photos were extracted and organised into folders such as left, right, straight, and stop. A greyscaled example image is shown in Fig. 8. These photographs were scaled to 320 by 240 pixels and used to train the network.
Fig. 8 Track image before and after image processing
The convolutional neural network has an input layer of 128 nodes, two hidden layers of 32 nodes each, and an output layer of four nodes, one for each of the four outputs. A dropout of 0.5 was used to avoid overfitting. The input and hidden layers use the 'ReLU' activation function, whereas the output layer uses the 'softmax' activation function. Training takes 5–6 h in GPU mode with a batch size of 10 and 3 epochs. This is the network configuration used to train the new model. Figures 9 and 10 show the different tracks of the car, Figs. 11 and 12 show signal detection by colour, and Figs. 13 and 14 show the tests in a virtual environment.
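To make the layer sizes and activation functions concrete, here is a hedged, library-free sketch of a forward pass through the fully connected part of such a network. It assumes that the 128 input nodes correspond to a 128-value feature vector, uses random untrained weights, and omits dropout because dropout is only active during training; none of the helper names come from the original work.

```python
import math
import random

LABELS = ["left", "right", "straight", "stop"]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights (assumption): small random values, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, layers):
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))  # ReLU in the hidden layers
    weights, biases = layers[-1]
    probs = softmax(dense(hidden, weights, biases))  # softmax over the four outputs
    return LABELS[probs.index(max(probs))], probs


rng = random.Random(0)
layers = [random_layer(128, 32, rng), random_layer(32, 32, rng), random_layer(32, 4, rng)]
label, probs = predict([rng.random() for _ in range(128)], layers)
```

The softmax output is a probability distribution over the four steering classes, so the predicted command is simply the class with the largest probability.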
Fig. 9 Track
Fig. 10 Right turn
Fig. 11 Light detection with red signal
Fig. 12 Light detection with green signal
Fig. 13 Test in virtual environment 1
Fig. 14 Test in virtual environment 2
5.3 Comparative Study and Results
Unit Testing: a software development technique in which the smallest testable pieces of a programme, referred to as units, are examined separately and independently for proper operation. Although unit testing is frequently automated, it can also be done manually. Its purpose is to isolate each component of a programme and demonstrate that each component meets its requirements and functions properly. The test cases and results are given in the tables.
Integration Testing: a type of software testing in which individual units are combined and tested as a group. It is designed to reveal flaws in the interaction of integrated units and is aided by test drivers and test stubs. Integration testing determines whether the various components of an application work together appropriately. It takes place between unit testing and validation testing. Bottom-up and top-down integration testing are the two approaches. Table 1 presents the unit testing and integration testing cases. Table 2 presents the system and acceptance testing cases.
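The difference between the two test levels can be sketched with Python's `unittest` module. The functions `signal_action`, `obstacle_ahead`, and `vehicle_command`, as well as the 20 cm obstacle limit, are hypothetical stand-ins for the vehicle software and are not taken from the original implementation.

```python
import unittest


def signal_action(colour):
    """Unit: map a detected traffic-light colour to an action."""
    return "stop" if colour == "red" else "go"


def obstacle_ahead(distance_cm, limit_cm=20):
    """Unit: decide whether an ultrasonic reading indicates an obstacle (assumed limit)."""
    return distance_cm < limit_cm


def vehicle_command(distance_cm, colour):
    """Integration point: combines obstacle detection and signal detection."""
    if obstacle_ahead(distance_cm) or signal_action(colour) == "stop":
        return "stop"
    return "go"


class SignalUnitTest(unittest.TestCase):
    """Unit tests exercise one function in isolation."""

    def test_red_stops(self):
        self.assertEqual(signal_action("red"), "stop")

    def test_green_goes(self):
        self.assertEqual(signal_action("green"), "go")


class VehicleIntegrationTest(unittest.TestCase):
    """Integration tests exercise the units working together."""

    def test_obstacle_overrides_green(self):
        self.assertEqual(vehicle_command(10, "green"), "stop")

    def test_clear_road_and_green(self):
        self.assertEqual(vehicle_command(100, "green"), "go")


loader = unittest.defaultTestLoader
suite = unittest.TestSuite(
    loader.loadTestsFromTestCase(case) for case in (SignalUnitTest, VehicleIntegrationTest)
)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The unit tests would still pass if `vehicle_command` combined the two checks incorrectly, which is exactly the kind of interaction flaw that the integration tests are meant to reveal.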
Table 1 Unit testing and integration testing cases
Sl # test case | Name of test | Items being tested | Sample input | Expected output | Actual output | Remarks
UTC-1 | Detection of lane | Tested for uploading different images | Upload sample image | Should detect the lane | Lane detection successful | Pass
UTC-2 | Obstacle detection (ultrasonic) | Detection of obstacles in front | Tested for objects placed at different distance | Should detect the obstacle at all distance | Obstacle detection passed | Pass
UTC-3 | Traffic signal detection | Traffic light LED detection | Different coloured LED input | Should detect the traffic signal and stop the vehicle | Vehicle stops after detecting red colour at signal | Pass
ITC-1 | Detection of lane and warning intimation | Intimation on lane diversion | Click and select image | Should send data to hardware wirelessly | Serial values should be received on crossing of lanes | Pass
ITC-2 | Vehicle control depending on signal condition | Vehicle start and stop at signal | Capture image and send density count to hardware | Control of vehicle movement | Vehicle start and stop operation achieved successfully | Pass

Table 2 System and acceptance testing cases
System testing
Sl # test case | STC-1
Name of test | System testing
Items being tested | Ultrasonic, lane detection, and signal detection should work synchronously
Sample input | Three different inputs at a same time
Expected output | Should operate at multiple inputs
Actual output | Same as expected output
Remarks | Pass
Acceptance testing
Test case ID | System test case 1
Description | Intelligent automated vehicle
Input | Threshold values
Expected output | Functionality should be according to given criteria
Actual result/remarks | Working as expected output
Passed (?) | Yes

6 Conclusion
The various hardware components, as well as the software and neural network design, have been discussed in detail. A working model was constructed with the help of image processing and machine learning, and it performed as expected. Despite its inherent advantages, autonomous vehicle technology faces numerous social challenges. The influence of mental models can stifle technological innovation, just as it did with the first automobiles. However, new regulation is allowing these vehicles to demonstrate their viability. As more states authorise self-driving cars, the social barrier will fall away, allowing for the biggest change in personal mobility since the automobile was invented.
Future Scope: the scope and dependability of these models can be improved. On an actual car, the vehicle swerving slightly off course can be a significant concern if it collides with surrounding objects. A sophisticated method to address this problem would not only make the system more dependable, but would also make the entire design more appealing and accident-free.
Performance Analysis of Machine Learning Algorithms in Detecting and Mitigating Black and Gray Hole Attacks
Mahesh Kurtkoti, B. S. Premananda, and K. Vishwavardhan Reddy
Abstract Black hole and gray hole attacks are major denial of service (DoS) attacks in wireless sensor networks (WSNs). Intrusion detection systems have been proposed to detect such attacks, but they lack analysis of specific attacks. Machine learning (ML) algorithms can detect such attacks with accurate results. This work focuses on detecting and mitigating black hole, gray hole, and flooding attacks. First, the performance of different ML algorithms in detecting these attacks is evaluated; then, an algorithm for mitigating such attacks is proposed. The WSN-DS dataset is obtained by inducing malicious nodes that act as gray hole and black hole attackers. This dataset is given as input to the classification models with nine different training-to-testing ratios (90:10 to 10:90). Performance measures such as accuracy and execution time are analyzed for each classifier and each attack. The results show that the adaboost classifier has the highest average accuracy of 97.97% and 94.48% for black hole attack and flooding, respectively, whereas random forest achieves 98.26% for gray hole attack. In execution time, random forest took the least time of 0.081 µs. The analysis of the ML algorithms is carried out in Jupyter notebook, and the simulation and mitigation of attacks are carried out in network simulator 2 (NS2).
Keywords Wireless sensor networks · Denial of services · Black hole attack · Gray hole attack · Flooding · Machine learning · NS2
M. Kurtkoti · B. S. Premananda · K. Vishwavardhan Reddy
Department of Electronics and Telecommunication Engineering, RV College of Engineering, Bengaluru, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_69
1 Introduction
A Wireless Sensor Network (WSN) consists of hundreds of nodes that sense the physical environment. The sensed information is sent to the base station wirelessly. These nodes have limited power capability as they are powered by batteries; therefore, it is especially important to use energy-efficient algorithms for transmitting the information so that the network has a longer lifetime [1, 2]. As technology improves, the number of internet of things (IoT) devices in use and the volume of data they generate are growing rapidly. Because these IoT devices transmit information wirelessly, they have a high chance of being attacked. In medical and military applications, such attacks can expose very sensitive information and pose a serious threat to large organizations or government bodies. Black hole attacks, gray hole attacks, and flooding are some of the major denial of service (DoS) attacks [3], as shown in Fig. 1. In a black hole attack, the malicious node takes the role of cluster head by sending numerous advertising cluster head packets in the network, then discards all packets received from the other nodes in the cluster. This attack makes an entire cluster inaccessible, so the base station receives no packets from that cluster. In a gray hole attack, the malicious node also assumes the role of cluster head, but it forwards some of the received packets to the base station and discards the rest. This attack causes the base station to receive inaccurate information from the cluster. It is therefore crucial to detect and mitigate these DoS attacks in WSNs. WSNs are used in various applications because of their reliability, ease of deployment, scalability, and power efficiency. In some applications, the feedback must be quick, or action must be taken in real time. In such cases, DoS attacks cause very serious issues and must be detected and mitigated in real time.
Fig. 1 DoS attacks (flooding attack, gray hole attack and black hole attack)
Intrusion Detection Systems (IDS) are designed to monitor the normal performance of the network and detect any anomalies that affect it. An IDS can monitor the network traffic as well as operating system functions. The IDS used in WSNs to tackle DoS attacks are signature-based models. Such an IDS holds a database of network parameters recorded during different DoS attacks and compares it with the current network parameters; if they match an entry in the database, the system raises an alert for a DoS attack. This type of IDS cannot handle unknown attacks that are not present in its database. The machine learning approach is relevant to this problem because it can detect attacks in very little time and with high accuracy. Its only prerequisite is a well-defined dataset with sufficient data samples. The WSN-DS dataset considers a WSN with 100 nodes and 5 clusters using the LEACH algorithm and contains data samples for normal traffic and for gray hole, black hole, flooding, and scheduling attacks [3]. In this work, three major attacks (flooding, black hole attack, and gray hole attack) are identified and classified using different ML algorithms: random forest, random tree, k-nearest neighbor (KNN), J48, stochastic gradient descent classifier (SGD classifier), adaboost classifier, bagging classifier, and Gaussian naïve Bayes (GNB). The accuracy and execution time of each algorithm are analyzed. An algorithm to mitigate DoS attacks is then proposed.
A comparison of ML classification approaches for recognizing flooding, black hole, and gray hole DoS attacks in WSN is carried out [4]. The analysis is done on the WSN-based dataset WSN-DS, split in nine different ratios [3]. From the results, it was observed that random forest achieved the maximum accuracy (99.76%) in detecting attacks. Random forest and J48 were the fastest in detection. Mitigation is carried out by an algorithm that avoids the attack nodes and finds an alternate path between the source node and the destination node. The mitigation algorithm chooses a path with minimum distance and maximum energy. The remaining work is organized as follows. Section 2 gives the background to DoS attacks in WSN and ML algorithms. Section 3 covers related work. Section 4 discusses the methods and materials. Section 5 discusses the results and their analysis. Finally, Sect. 7 presents the conclusion and future work.
2 Background of DoS Attacks
DoS attacks have various causes, such as physical network damage, hardware failures, external attack, power shortage, dead nodes, software issues, and environmental conditions. Any malicious activity that diminishes the normal performance of the wireless sensor network is termed a DoS attack [3]. In this work, three DoS attacks are analyzed: black hole attack, gray hole attack, and flooding attack. These attacks are considered because a dataset containing all of them, WSN-DS [3], is available, and because they are discussed in many research works [1, 3, 4].
2.1 Flooding Attack
Flooding attack is a problem that occurs in both traditional networks and WSNs. Flooding can have various causes. In WSN, it can be caused by exploiting the Low-Energy Adaptive Clustering Hierarchy (LEACH) routing protocol. The malicious node sends a large number of Advertising Cluster Head (ADV CH) messages in the network, which causes it to be selected as Cluster Head (CH). This affects nodes that are multiple hops away from the malicious CH, since these nodes consume a lot of energy to transfer data via the intermediate nodes [3]. Flooding attack can also be caused by exploiting the Ad-Hoc On-Demand Distance Vector (AODV) routing protocol. In this attack, the malicious node sends numerous Route Request (RREQ) messages to reach a non-existing destination node; in the search for this destination, some of the nodes with less energy die and become non-functional.
2.2 Black Hole Attack
A black hole attack occurs when the attacker node advertises itself as the Cluster Head in the initial rounds. The black hole attacker node receives the packets from other nodes and does not forward anything to the base station or sink [3]. When the chosen cluster head is a black hole node, the entire cluster becomes inactive. A black hole attack can also be caused by exploiting the Ad-Hoc On-Demand Distance Vector (AODV) routing protocol. In this attack, the malicious node sends fake Route Reply (RREP) messages in response to the Route Request (RREQ) message sent by the source node. It thus advertises itself as the node nearest to the destination, diverts all data packets from the source node to itself, and drops all of them, so no packets are received by the destination node.
2.3 Gray Hole Attack
In a gray hole attack, the attacker node advertises itself as Cluster Head, as in the black hole attack. The difference lies in forwarding: in a gray hole attack, only part of the packets is forwarded [3]. This attack makes a cluster partially active and its data inaccurate, which creates serious issues in applications where precision sensing is required. A gray hole attack can also be caused by exploiting the AODV protocol, as in the black hole attack, but here some packets are sent to the destination and the rest are dropped selectively or randomly by the malicious node.
3 Related Work
A security mechanism called ABC, combined with an artificial neural network as the ML technique, is proposed in [1] to improve the performance of the network in the presence of malicious nodes, especially black hole and gray hole attackers. The authors achieved a throughput of 89.38 and a delay of 0.071 s; however, the delay must be improved. The review in [2] discusses the various IDS that can be used for IoT; however, it analyzes traditional IDS, which are slower in detecting DoS than ML models. A review of wireless sensors [5] classifies sensors based on their applications. All the existing IDS in WSN are compared in [6] in terms of their design, specifications, requirements, and applications. The IDSs are broadly classified by intruder type, intrusion type, detection methodology, source of the audit data, computing location of the collected data, infrastructure, and usage frequency; however, a quantitative comparison based on practical experiments is lacking. Six commonly used ML techniques and the Kyoto 2006+ data set are used to analyze the performance of each algorithm in [7]. RBF, Ensemble, KNN, NB, SVM, KM, and FCM algorithms were compared for their accuracy and precision, and RBF performed better than all the others. However, the dataset is based on a traditional network rather than a WSN, and the execution time is not analyzed. The work in [8] covers the basics of ML algorithms along with practical advice on applying the tools and techniques in the real world. It discusses data mining approaches and includes information on probabilistic methods and deep learning. The security provided to WSNs by the existing LEACH protocol is compared, together with a mathematical model and simulation results, in [3], which resulted in the dataset WSN-DS. That work considered the LEACH protocol; the data was collected from NS2 and then processed into 19 features.
The different types of DoS at different network layers are discussed in [9], which suggests defenses against attacks such as jamming, tampering, collision, exhaustion, unfairness, neglect and greed, homing, misdirection, black holes, flooding, and desynchronization; however, there is no practical implementation or analysis. The machine learning approach is used in [10] to improve the accuracy and processing speed of recognition in the conversion of a 3-dimensional image to a plane image. The proposed model is analyzed for plane correlation and matching errors. The review in [11] compares ML algorithms in the detection of DoS attacks and shows the accuracy of different algorithms; however, it lacks an analysis of multiple ML algorithms in the detection of multiple DoS attacks. The authors in [12] proposed two methods in Electrical Impedance Tomography, called the adjacent drive method and the cross method. The results show that the proposed methods achieve higher rates in measurement and reconstruction. Random forest algorithm performed best in identifying DoS in terms of accuracy. However, the dataset used is NSL-KDD, which is very old and not tailored for WSN. Besides, only accuracy is analyzed, and execution time is not. The different architectures in WSNs and their various problems are discussed in [13], together with the defense mechanisms against each type of attack and the approaches taken in detection and defense of DoS with respect to the simulation tool. Rathore et al. [14] propose a real-time intrusion detection system for high-speed environments that uses a decision-tree-based classification model called C4.5. The dataset used is KDD99, which has forty-one features, of which only the nine best are selected with the help of the BER and FSR techniques. The Weka tool is used in this analysis. Although the authors obtained good accuracy (less than 0.001% false positives), the dataset considered is not relevant to WSN. The J48 and Naive Bayes algorithms are used to classify emails as spam or genuine in [15]; according to the results, J48 is more efficient in time and more accurate than Naive Bayes. However, the analysis covers only two ML algorithms. Silvio et al. [4] compare ML algorithms such as random forest, Naive Bayes, REP tree, random tree, and J48 in detecting and mitigating gray hole attack, black hole attack, and flooding in terms of computational time and accuracy. The J48 method was recommended for detecting gray hole and black hole attacks.
The random tree method was suggested as the best option for flooding detection. The tool used for ML classifier analysis was WEKA. The accuracy is good for all the attacks (more than 97%), but the computational times are high even though only five features are selected. The literature gaps observed are as follows. The majority of works focus on DoS detection, and very few propose mitigation models; in this work, both detection and mitigation of DoS are carried out. Most works analyze the IDS in a qualitative manner and do not support their results with numerical data; in this work, the analysis of the proposed model is supported by numerical results. In many works using the ML approach, the execution time is not analyzed; in this work, the execution time is considered along with the accuracy. Some ML approaches use datasets that are not specifically meant for WSN; hence, a newer WSN-based dataset called WSN-DS is used in this work.
4 Methods and Materials
NS2 is an open-source tool that supports wired and wireless network simulation with various routing protocols. NS2 is used to simulate black hole attack, gray hole attack, and flooding. The Jupyter notebook is a web-based application mainly used in data science. It supports many libraries of ML algorithms and is used here for the analysis of the ML algorithms.
Fig. 2 Methodology of the project, showing the process of detection and mitigation of DoS attacks
4.1 Methodology
The methodology of the project is shown in Fig. 2. Three types of denial of service attack, namely black hole attack, gray hole attack, and flooding attack, are introduced into the network traffic. The next step is to identify specifically which type of attack has occurred, and then to use the mitigation algorithm to find an alternative path from the source node to the destination node that avoids the attacker nodes.
4.2 Simulation of Attacks
Flooding, black hole attack, and gray hole attack are simulated on a WSN of 50 nodes in an area of 1000 × 1000 m with a maximum of four attack nodes, using the AOMDV protocol in NS 2.35 running on the Linux Ubuntu 16.04 operating system. The other parameters are Routing protocol: AOMDV, MAC protocol: CSMA/TDMA, Simulation time: 100 s, Traffic agent: CBR, Transport agent: UDP, and Data rate: 11 Mb/s. The pseudo code for simulation of attacks:
1. Start simulation
2. Set: S and D
3. Find: SN and DN
4. Make: 3 or 4 nearest SN as BH or GH or FA
   if (CN = BH) drop all packets
   if (CN = GH) drop some packets randomly
   if (CN = FA) send numerous fake RREQ
5. End simulation
Here, S-Source node, D-Destination node, SN-Source neighbors, DN-Destination neighbors, BH-Black hole attacker node, GH-Gray hole attacker node, FA-Flooding attacker node, CN-Current node.
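The three attacker behaviours in step 4 of the pseudo code can be mimicked outside NS2 with a small, self-contained sketch. This is not the NS2 script used in this work: the function names, the 0.5 drop probability for the gray hole node, and the count of 100 fake requests are illustrative assumptions.

```python
import random


def forward_packets(role, packets, rng, drop_probability=0.5):
    """Return the packets a node passes on, given its role.

    role: "BH" (black hole), "GH" (gray hole), or anything else for a normal node.
    """
    if role == "BH":
        return []  # a black hole node drops all packets
    if role == "GH":
        # a gray hole node drops some packets at random (assumed probability)
        return [p for p in packets if rng.random() >= drop_probability]
    return list(packets)  # a normal node forwards everything


def fake_route_requests(role, count=100):
    """A flooding attacker (FA) emits numerous RREQs for a non-existing destination."""
    if role != "FA":
        return []
    return [("RREQ", "nonexistent-destination", seq) for seq in range(count)]


rng = random.Random(1)
packets = list(range(20))
delivered = {role: len(forward_packets(role, packets, rng)) for role in ("normal", "BH", "GH")}
```

Comparing the delivered counts per role shows the signature that separates the attacks: none of the packets survive a black hole node, only a fraction survive a gray hole node, and a flooding node is recognisable by its request volume rather than by packet loss.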
4.3 Detection of Attacks
The detection of black hole attack, gray hole attack, and flooding is done using nine ML algorithms: random forest, random tree, J48, k-nearest neighbor (KNN), Gaussian naïve Bayes, multi-layer perceptron (MLP), SGD classifier, bagging classifier, and adaboost classifier. The accuracy and execution time of each algorithm in classifying the dataset into the different attacks are calculated and tabulated in nine different experiments.
Table 1 WSN-DS split in 90:10 ratio
Classes | Samples | Training set | Testing set
Normal | 340,067 | 306,060 | 34,007
Black hole | 10,049 | 9044 | 1005
Gray hole | 14,596 | 13,136 | 1460
Flooding | 3312 | 2981 | 331
Total | 368,023 | 331,221 | 36,803
The pseudo code for detection of attacks:
1. Import required libraries and functions
2. Import dataset
3. Divide dataset into training data and testing data
4. Train the ML model
   a. Random trees (maximum leaf nodes = 10, random state = 4)
   b. Random forest (default parameters)
   c. J48 (criterion = entropy, maximum depth = 10)
   d. Bagging classifier (maximum samples = 0.5, maximum features = 1.0, number of estimators = 20)
   e. Adaboost classifier (learning rate = 1, number of estimators = 20)
   f. KNN (number of neighbors = 3)
   g. GNB (default parameters)
   h. SGD classifier (loss = hinge, penalty = l2, maximum iterations = 5000)
   i. MLP (max iterations = 500, activation = relu)
5. Compute the confusion matrix
6. Compute accuracy of the model
7. Compute execution time of the model.
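Steps 5 to 7 of the detection pseudo code can be illustrated without any ML library. The sketch below builds a confusion matrix from true and predicted labels, derives the overall accuracy and a per-class figure, and times the computation. Two points are assumptions rather than details from this work: the per-class value is computed as the fraction of each class's samples that is classified correctly, and the toy labels are invented for illustration.

```python
import time

CLASSES = ["Normal", "Black hole", "Gray hole", "Flooding"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def overall_accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly (assumed definition)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Invented toy labels: one black hole sample is mistaken for a gray hole sample.
y_true = ["Normal"] * 6 + ["Black hole"] * 2 + ["Gray hole"] * 2
y_pred = ["Normal"] * 6 + ["Black hole", "Gray hole"] + ["Gray hole"] * 2

start = time.perf_counter()
matrix = confusion_matrix(y_true, y_pred)
elapsed_seconds = time.perf_counter() - start
```

Because the Normal class dominates the dataset, the overall accuracy can stay high even when an attack class is poorly recognised, which is why the per-attack figures are reported alongside it.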
The WSN-DS consists of 368,023 samples in total: 340,067 samples of the Normal class, 10,049 samples of black hole attack, 14,596 samples of gray hole attack, and 3312 samples of flooding [3]. This dataset is obtained by simulating a WSN with 100 nodes and 5 clusters using the LEACH protocol in NS-2 and making some of the nodes behave like black hole, gray hole, flooding, and scheduling attack nodes. In this work, WSN-DS is split into two parts, one for training and one for testing, in nine ratios (from 10:90 to 90:10). An example of the dataset split for the 90:10 ratio is shown in Table 1. The features of WSN-DS are listed in Table 2. All features except id are selected for analysis in all algorithms except GNB and MLP. For GNB and MLP, the features 'Is_CH', 'Dist_To_CH', 'ADV_S', 'SCH_S', 'SCH_R', 'dist_CH_To_BS', and 'send_code' are selected to train the model. Feature selection is done by trial and error to obtain high accuracy.
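The per-class counts of the 90:10 split in Table 1 can be reproduced with integer arithmetic. The sketch assumes that each class is split separately and that the training share is rounded to the nearest sample; the splitting routine actually used in this work is not stated, so `split_counts` is only an illustration that happens to match the table.

```python
# Per-class sample counts as listed for WSN-DS in this work.
SAMPLES = {"Normal": 340_067, "Black hole": 10_049, "Gray hole": 14_596, "Flooding": 3_312}


def split_counts(samples, train_percent):
    """Return {class: (train, test)} with the training share rounded half up."""
    counts = {}
    for name, n in samples.items():
        train = (n * train_percent + 50) // 100  # integer rounding avoids float error
        counts[name] = (train, n - train)
    return counts


counts_90_10 = split_counts(SAMPLES, 90)
train_total = sum(train for train, _ in counts_90_10.values())
test_total = sum(test for _, test in counts_90_10.values())
```

Splitting each class in the same ratio keeps the rare attack classes represented in both the training set and the testing set, which matters here because flooding accounts for fewer than one percent of the samples.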
4.4 Mitigation of Attacks
After detection of the attacks, the next step is their mitigation. Mitigation is done by identifying the compromised malicious nodes in each attack and removing them from the transmission. Instead of passing through the attacker nodes, the data is sent over an alternate path.
Table 2 Features in WSN-DS [3]
Feature | Description
id | Sensor node unique identification
Time | Simulation timestamp
Is_CH | Cluster Head (CH) flag
who CH | Who is CH in the current run
Dist_To_CH | Distance between node and CH
ADV_S | Count of advertise CH messages sent by CH
ADV_R | Count of advertise CH messages received by nodes
JOIN_S | Count of join requests sent by nodes
JOIN_R | Count of join request received by CH
SCH_S | Count of TDMA messages sent by CH
SCH_R | Count of TDMA received by nodes
Rank | Node rank in TDMA scheduling
DATA_S | Count of packets sent from node to CH
DATA_R | Count of received packets from CH
Data_Sent_To_BS | Count of packets sent from CH to Base station (BS)
dist_CH_To_BS | Distance between CH and BS
send_code | Cluster send code
Consumed Energy | Amount of energy consumed in last turn
Pseudo code of mitigation algorithm:
1. Start simulation
2. Set: S and D
3. Find: SN and DN
4. Make: 3 nearest SN as BH or GH or FA
   if (CN = BH) drop all packets
   if (CN = GH) drop some packets randomly
   if (CN = FA) send numerous fake RREQ
5. Sort DN based on distance (increasing order)
6. Sort DN based on energy (decreasing order)
7. Select DNs having minimum distance from S and maximum energy
8. Transmit the data through selected DNs
9. End simulation
Here, S-Source node, D-Destination node, SN-Source neighbors, DN-Destination neighbors, BH-Black hole attacker node, GH-Gray hole attacker node, FA-Flooding attacker node, CN-Current node.
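The relay-selection part of the pseudo code (steps 5 to 8) can be sketched in Python as below. This is an illustrative sketch, not the authors' simulation code: the `Node` class, the `select_relays` function and the sample coordinates are hypothetical, and combining the two sorts into one key (distance first, remaining energy as tie-breaker) is an assumption, since the pseudo code does not say how the two criteria are weighed.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    node_id: int
    x: float
    y: float
    energy: float            # remaining energy, arbitrary units
    malicious: bool = False  # flag set by the detection stage (BH, GH or FA)

def distance(a, b):
    return math.hypot(a.x - b.x, a.y - b.y)

def select_relays(source, neighbors, count=1):
    """Drop flagged attackers, then prefer nodes close to the source and,
    on equal distance, nodes with more remaining energy."""
    trusted = [n for n in neighbors if not n.malicious]
    ranked = sorted(trusted, key=lambda n: (distance(source, n), -n.energy))
    return ranked[:count]

source = Node(0, 0.0, 0.0, 1.0)
neighbors = [
    Node(1, 1.0, 0.0, 0.9, malicious=True),  # nearest node, but flagged as attacker
    Node(2, 2.0, 0.0, 0.5),
    Node(3, 0.0, 2.0, 0.8),                  # same distance as node 2, more energy
    Node(4, 5.0, 5.0, 1.0),
]
relays = select_relays(source, neighbors, count=2)
print([n.node_id for n in relays])  # prints [3, 2]
```

Node 1 is excluded even though it is the nearest, which mirrors the idea of routing around the compromised node rather than through it.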
Performance Analysis of Machine Learning Algorithms…
Table 3 Accuracy values (%) for dataset with train:test ratio 80:20

| Algorithms | Black hole | Gray hole | Flooding | Overall accuracy |
|---|---|---|---|---|
| Random tree | 92.60 | 90.51 | 90.82 | 99.21 |
| Random forest | 98.60 | 98.61 | 94.40 | 99.80 |
| J48 | 96.72 | 98.28 | 92.94 | 99.72 |
| Bagging classifier | 98.45 | 98.17 | 84.56 | 99.75 |
| Adaboost classifier | 98.49 | 98.37 | 95.29 | 99.78 |
| KNN | 89.42 | 88.42 | 77.67 | 98.49 |
| GNB | 64.24 | 50.21 | 88.99 | 95.95 |
| SGD classifier | 64.20 | 51.51 | 92.29 | 96.05 |
| MLP | 64.24 | 61.48 | 92.38 | 96.54 |
5 Results and Analysis
The samples of each class are divided in the same ratio as the dataset is split for training and testing. The accuracy and the execution time of each algorithm are measured for each attack. A total of nine experiments are carried out with training:testing ratios from 10:90 to 90:10, of which the results for the 80:20 and 90:10 ratios are presented in this work. The average values are calculated over all nine experiments.
5.1 Analysis of Accuracy
The eighth experiment uses 80% of the dataset for training and 20% for testing. In this experiment, the highest accuracy for the black hole and gray hole attacks is achieved by random forest, and the highest accuracy for flooding is achieved by the adaboost classifier, as shown in Table 3. The ninth experiment uses 90% of the dataset for training and 10% for testing. In this experiment, the highest accuracy for the black hole and gray hole attacks is achieved jointly by the adaboost classifier and random forest, and the highest accuracy for flooding is achieved by the adaboost classifier, as shown in Table 4.
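The difference between per-attack accuracy and overall accuracy can be illustrated with a short sketch. It assumes that the per-attack accuracy is the share of samples of that class that are classified correctly; the text does not state the exact definition, so this is an assumption, and the function name and toy labels are hypothetical.

```python
from collections import Counter

def per_class_and_overall_accuracy(y_true, y_pred):
    """Return ({class: accuracy in %}, overall accuracy in %)."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    per_class = {c: 100.0 * correct[c] / total[c] for c in total}
    overall = 100.0 * sum(correct.values()) / len(y_true)
    return per_class, overall

# Toy example: the Normal class dominates, as it does in WSN-DS
y_true = ["Normal"] * 6 + ["Blackhole"] * 2 + ["Grayhole"] * 2
y_pred = ["Normal"] * 6 + ["Blackhole", "Grayhole"] + ["Grayhole", "Grayhole"]

per_class, overall = per_class_and_overall_accuracy(y_true, y_pred)
print(per_class["Blackhole"], overall)  # prints 50.0 90.0
```

Under this definition, a classifier can show a high overall accuracy while a single attack class is recognized poorly, because the large Normal class dominates the overall figure. This is consistent with rows such as GNB in Tables 3 and 4.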
Table 4 Accuracy values (%) for dataset with train:test ratio 90:10

| Algorithms | Black hole | Gray hole | Flooding | Overall accuracy |
|---|---|---|---|---|
| Random tree | 98.22 | 91.57 | 89.97 | 99.22 |
| Random forest | 98.61 | 98.29 | 94.70 | 99.80 |
| J48 | 96.98 | 98.26 | 92.65 | 99.70 |
| Bagging classifier | 98.31 | 98.03 | 94.16 | 99.78 |
| Adaboost classifier | 98.61 | 98.29 | 95.98 | 99.80 |
| KNN | 87.78 | 89.18 | 80.61 | 98.55 |
| GNB | 63.24 | 50.31 | 89.32 | 95.91 |
| SGD classifier | 63.41 | 76.15 | 93.12 | 96.70 |
| MLP | 63.24 | 61.85 | 92.18 | 96.51 |

5.2 Execution Time
The execution time for training and testing is measured with the %%timeit cell magic, which reports the execution time of a cell in a Jupyter notebook. The training code and the testing code are placed in separate cells, each with %%timeit, so the measured value is the execution time of the cell (cell time). Since the execution time per sample (sample time) is required, the cell time is divided by the number of samples, as given in Eq. 1. For example, the execution time per sample for experiment 9, with the dataset in the ratio 90:10, is calculated using Eq. 1 and tabulated in Table 5. In experiment 9, the number of samples is 331,221 for training and 36,803 for testing.

$$\text{Execution time per sample} = \frac{\text{Cell time}}{\text{Number of samples}} \qquad (1)$$
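Equation 1 is a single division followed by a unit conversion. The sketch below applies it to the random tree row of Table 5; the function name `time_per_sample_us` is hypothetical, and the cell times and sample counts are the ones reported in the text.

```python
def time_per_sample_us(cell_time_s, n_samples):
    """Eq. 1: cell time divided by number of samples, returned in microseconds."""
    return cell_time_s / n_samples * 1e6

# Random tree, experiment 9 (90:10 split), values from Table 5
train_us = time_per_sample_us(0.699, 331221)    # training cell time 0.699 s
test_us = time_per_sample_us(2.83e-3, 36803)    # testing cell time 2.83 ms

print(f"training: {train_us:.2f} us per sample, testing: {test_us:.4f} us per sample")
```

The results agree with the tabulated sample times of 2.1 µs and 0.076 µs up to rounding in the last digit.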
All the algorithms except KNN are analyzed here; KNN is omitted because of its very slow response to the %%timeit command. All the execution times per sample are recorded in microseconds. A total of nine experiments for execution time analysis are carried out with training:testing ratios from 10:90 to 90:10, of which the results for the 80:20 and 90:10 ratios are presented in this work. The average values are calculated over all nine experiments.

Table 5 Execution time per sample for train:test ratio 90:10

| Algorithms | Training cell time (s) | Training sample time (µs) | Testing cell time (ms) | Testing sample time (µs) |
|---|---|---|---|---|
| Random tree | 0.699 | 2.1 | 2.83 | 0.076 |
| Random forest | 10.9 | 32.9 | 205 | 5.6 |
| J48 | 0.730 | 2.2 | 2.99 | 0.081 |
| Bagging classifier | 4.73 | 14.3 | 131 | 3.5 |
| Adaboost classifier | 30.1 | 90.88 | 149 | 4 |
| GNB | 0.598 | 1.8 | 17.9 | 0.48 |
| SGD classifier | 14.2 | 42.9 | 2.75 | 0.074 |
| MLP | 60 | 181 | 119 | 3.23 |

The execution times per sample for each algorithm with the dataset divided into 80% training and 20% testing are given in Table 6. In Table 6, random tree takes the least testing time per sample (0.078 µs) and the bagging classifier takes the most (6.01 µs), followed by random forest (5.37 µs). The execution times per sample for each algorithm with the dataset divided into 90% training and 10% testing are tabulated in Table 7. In Table 7, the SGD classifier (0.074 µs) and random tree (0.076 µs) take the least testing time per sample, and random forest takes the most (5.6 µs).

Table 6 Execution times per sample (µs) for 80:20 dataset

| Algorithms | Train time | Test time |
|---|---|---|
| Random tree | 2 | 0.078 |
| Random forest | 30 | 5.37 |
| J48 | 2.17 | 0.082 |
| Bagging classifier | 13.70 | 6.01 |
| Adaboost classifier | 83.45 | 3.90 |
| GNB | 1.73 | 0.7 |
| SGD classifier | 42.69 | 0.088 |
| MLP | 485.22 | 2.21 |

Table 7 Execution times per sample (µs) for 90:10 dataset

| Algorithms | Train time | Test time |
|---|---|---|
| Random tree | 2.1 | 0.076 |
| Random forest | 32.9 | 5.6 |
| J48 | 2.2 | 0.081 |
| Bagging classifier | 14.3 | 3.5 |
| Adaboost classifier | 90.88 | 4 |
| GNB | 1.8 | 0.48 |
| SGD classifier | 42.9 | 0.074 |
| MLP | 181 | 3.23 |
5.3 Comparison of Results
The average accuracy and execution time values are tabulated in Table 8. From Table 8, it can be inferred that the highest accuracy for the black hole attack is achieved by the adaboost classifier (97.97%), the highest accuracy for the gray hole attack by random forest (98.26%) and the highest accuracy for the flooding attack by the adaboost classifier, while random tree is the fastest in detection. In execution time, the least testing time per sample is recorded by random tree with 0.081 µs; J48 is also fast with 0.093 µs.

Table 8 Average accuracy (%) and execution time per sample (µs) of the proposed work

| Algorithms | Black hole | Gray hole | Flooding | Overall accuracy | Training time | Testing time |
|---|---|---|---|---|---|---|
| Random tree | 93.58 | 90.50 | 91.00 | 99.18 | 1.92 | 0.081 |
| Random forest | 97.79 | 98.26 | 93.57 | 99.76 | 28.78 | 5.68 |
| J48 | 96.89 | 96.42 | 93.18 | 99.61 | 2.33 | 0.093 |
| Bagging classifier | 97.60 | 97.82 | 92.35 | 99.72 | 11.66 | 6.02 |
| Adaboost classifier | 97.97 | 97.85 | 94.48 | 99.71 | 70.34 | 3.45 |
| KNN | 74.77 | 84.24 | 72.29 | 97.86 | Nil | Nil |
| GNB | 63.69 | 58.56 | 79.55 | 95.92 | 1.69 | 0.58 |
| SGD classifier | 69.11 | 65.54 | 89.41 | 96.23 | 37.5 | 0.091 |
| MLP | 60.29 | 58.95 | 91.26 | 96.46 | 407.58 | 2.65 |

The black hole attack, gray hole attack and flooding were analyzed in a Jupyter notebook with random forest, random tree, J48, adaboost classifier, bagging classifier, SGD classifier, KNN, MLP and GNB. The accuracy of each algorithm in classifying the black hole attack, gray hole attack and flooding is shown in Fig. 3. It can be observed from Fig. 3 that random tree, random forest, J48, bagging classifier and adaboost classifier consistently have an accuracy of more than 90% for each attack, while KNN, GNB, SGD classifier and MLP are comparatively low. The reason for the poor performance of KNN, GNB and SGD classifier in classifying the black hole and gray hole attacks is that the parameter values of the samples of these two attacks are very similar to each other. Unlike these attacks, the flooding attack shows variation in the values of certain parameters, which makes its accuracy higher than that of the black hole and gray hole attacks for these algorithms. In black hole attack detection, the order of accuracy from high to low is adaboost classifier, random forest, bagging classifier, J48, random tree, KNN, SGD classifier, GNB and MLP. In gray hole attack detection, the order from high to low is random forest, adaboost classifier, bagging classifier, J48, random tree, KNN, SGD classifier, MLP and GNB. In flooding attack detection, the order from high to low is adaboost classifier, random forest, J48, bagging classifier, MLP, random tree, SGD classifier, GNB and KNN.
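The orderings stated above can be checked directly from the average accuracies in Table 8. The following sketch only re-sorts the tabulated values; the dictionary layout and the helper name `ranking` are assumptions made for the illustration.

```python
# Average accuracies (%) from Table 8: (black hole, gray hole, flooding)
table8 = {
    "Random tree": (93.58, 90.50, 91.00),
    "Random forest": (97.79, 98.26, 93.57),
    "J48": (96.89, 96.42, 93.18),
    "Bagging classifier": (97.60, 97.82, 92.35),
    "Adaboost classifier": (97.97, 97.85, 94.48),
    "KNN": (74.77, 84.24, 72.29),
    "GNB": (63.69, 58.56, 79.55),
    "SGD classifier": (69.11, 65.54, 89.41),
    "MLP": (60.29, 58.95, 91.26),
}
ATTACKS = ("black hole", "gray hole", "flooding")

def ranking(attack):
    """Algorithms sorted from highest to lowest accuracy for one attack."""
    col = ATTACKS.index(attack)
    return sorted(table8, key=lambda name: table8[name][col], reverse=True)

# Algorithms that exceed 90% accuracy for every attack
above_90 = [name for name, acc in table8.items() if min(acc) > 90.0]

for attack in ATTACKS:
    print(attack, "->", ", ".join(ranking(attack)))
```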
Fig. 3 Average accuracy values of ML algorithms (bar chart of accuracy per ML algorithm for the black hole, gray hole and flooding attacks; the plotted values are those of Table 8)
The accuracy trend for all the attacks is shown in Fig. 4. In terms of execution time for training, the order of time consumption per sample from low to high is GNB, random tree, J48, bagging classifier, random forest, SGD classifier, adaboost classifier and MLP. The KNN algorithm is not analyzed here because of its very slow response to the %%timeit command. Similarly, for testing, the order of time consumption per sample from low to high is random tree, SGD classifier, J48, GNB, MLP, adaboost classifier, random forest and bagging classifier.
Fig. 4 Accuracy values in terms of attacks (accuracy of each ML algorithm grouped by attack: black hole, gray hole and flooding; the plotted values are those of Table 8)
6 Conclusions and Future Work
The performance of machine learning algorithms in detecting DoS attacks is analyzed and an algorithm to mitigate such attacks is proposed. The algorithms used in this work are random forest, random tree, KNN, bagging classifier, adaboost classifier, SGD classifier, MLP, J48 and Gaussian naïve Bayes. Among these, random forest, random tree, J48, bagging classifier and adaboost classifier perform very well in classifying the black hole, gray hole and flooding attacks, with more than 90% accuracy for each attack. The SGD classifier, MLP, GNB and KNN are not satisfactory in individual class classification. In terms of testing time per sample, random tree took 0.081 µs and J48 took 0.093 µs, so these are the faster algorithms in detection. The adaboost classifier took 3.45 µs, random forest took 5.68 µs and the bagging classifier took 6.02 µs, so these algorithms are slower in detection. Mitigation of these attacks is done by removing the malicious node from the transmission route and using other normal neighboring nodes, chosen by shortest path and maximum energy, for transmission of data. In this way, the lifetime of the WSN can be increased. For future work, the detection and mitigation methods for these attacks should be integrated. Furthermore, other DoS attacks should be analyzed with ML algorithms.
References
1. P. Rani, K.S. Verma, N.G. Nguyen, Mitigation of black hole attack using swarm inspired algorithm with artificial neural network. IEEE Access 8(25), 121755–121764 (June 2020)
2. B.B. Zarpelao, R.S. Miani, C.T. Kawakani, S.C. de Alvarenga, A survey of intrusion detection in internet of things. J. Netw. Comput. Appl. 84, 25–35
3. I. Almomani, B. Al-Kasasbeh, M. Al-Akhras, WSN-DS: a dataset for intrusion detection systems in wireless sensor networks. J. Sens. Hindawi Publishing Corp. (Aug 2016), pp. 1–6
4. S.E. Quincozes, J.F. Kazienko, Machine learning methods assessment for denial-of-service detection in wireless sensor networks, in IEEE 6th World Forum on Internet of Things (28 Oct 2020), pp. 1–6
5. M. Ayaz, M. Ammad-uddin, I. Baig, E.M. Aggoune, Wireless sensor’s civil applications, prototypes, and future integration possibilities: a review. IEEE Sens. J. 18(1), 4–30 (1 Jan 2018)
6. I. Butun, S.D. Morgera, R. Sankar, A survey of intrusion detection systems in wireless sensor networks. IEEE Commun. Surv. Tutorials 16(1), 266–282 (Oct 2014)
7. M. Zaman, C. Lung, Evaluation of machine learning techniques for network intrusion detection, in IEEE/IFIP Network Operations and Management Symposium (NOMS) (Apr 2018), pp. 1–5
8. I. Witten, E. Frank, M. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn (Morgan Kaufmann, Elsevier, 2011)
9. D.R. Raymond, S.F. Midkiff, Denial-of-service in wireless sensor networks: attacks and defenses. IEEE Pervasive Comput. 7, 74–81 (March 2008)
10. A. Sungheetha, R. Sharma, 3D image processing using machine learning based input processing for man-machine interaction. J. Innovat. Image Process. (JIIP) 3(01), 1–6 (Feb 2021)
11. S. Gunduz, B. Arslan, M. Demirci, Review of machine learning solutions to denial-of-services attacks in wireless sensor networks, in Proceedings of International Conference on Machine Learning and Applications (IEEE, 2015), pp. 150–155
12. E.E.B. Adam, P. Sathesh, Survey on medical imaging of electrical impedance tomography (EIT) by variable current pattern methods. J. ISMAC 3(02), 82–95 (Apr 2021)
13. O.A. Osanaiye, A.S. Alfa, G.P. Hancke, Denial of service defence for resource availability in wireless sensor networks. J. Magaz. IEEE Access 6, 6975–7004 (2018)
14. A. Radhakrishnan, V. Vaidehi, Email classification using machine learning algorithms, in International Journal of Engineering and Technology, vol. 9, IJERT (May 2017), pp. 335–340
15. M.M. Rathore, F. Saeed, A. Rehman, A. Paul, A. Daniel, Intrusion detection using decision tree model in high-speed environment, in Proceedings of International Conference on SoftComputing and Network Security (IEEE, Feb 2018), pp. 1–4
Effect of Non-linear Co-efficient of a Hexagonal PCF Depending on Effective Area Md. Rawshan Habib, Abhishek Vadher, Ahmed Yousuf Suhan, Md Shahnewaz Tanvir, Tahsina Tashrif Shawmee, Md. Rashedul Arefin, Al-Amin Hossain, and Anamul Haque Sunny
Abstract A photonic crystal fiber (PCF) is an optical fiber that gets the waveguide characteristics from an array of very small and tightly separated air holes that run the length of the fiber rather than from a spatially changing glass structure. These air holes can be created by stacking capillary and/or solid tubes and implanting those into a bigger tube, or even by utilizing a preform containing holes. PCFs have a wide range of characteristics. One of these is the non-linear co-efficient. This property is influenced by factors such as effective area, pitch size, and so on. The overall goal of this study is to develop and improve the optical characteristics of PCFs and to design a hexagonal PCF for wideband near-zero dispersion-flattened features for dispersion managed applications. The non-linear co-efficient of the hexagonal PCF with respect to effective area is also calculated here. Keywords Photonic crystal fiber · Hexagonal PCF · Non-linearity · Fiber optics
Md. R. Habib · A. Vadher: Murdoch University, Murdoch, Australia
A. Yousuf Suhan: Curtin University, Bentley, Australia
Md S. Tanvir (B) · T. Tashrif Shawmee · Md. R. Arefin · A.-A. Hossain · A. Haque Sunny: Ahsanullah University of Science and Technology, Dhaka, Bangladesh
Md S. Tanvir: Technische Universität Chemnitz, Chemnitz, Germany
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_70

1 Introduction
PCF is a novel kind of optical fiber based on the properties of photonic crystals. PCF is finding use in fiber-optic communication, fiber lasers, non-linear devices, high-power transmission, highly sensitive gas sensors and other fields, thanks to its ability to confine light in a hollow core or with confinement properties not feasible in traditional optical fibers. The development of PCF marked a watershed moment in the fiber industry, and the technology has a great deal of potential. In both linear and non-linear regimes, the wider design space and the adaptable and exceptional properties are expected to lead to various scientific and technical applications. PCFs, and the way researchers explore their possibilities for controlling light, are thought to be a foundation of optics. For instance, the medical industry needs lasers with special wavelengths or broadband light sources for diagnostic tests, the telecommunication industry seeks more adaptable amplifiers and economical fibers, and the sensor sector seeks environment-friendly sensors and sensitive gas detectors for both local and remote detection. PCFs, whether hollow core or solid core, may be used as optical components in any of these industries. Their damage tolerance is greater than that of typical fibers, allowing them to deliver high-power beams for laser cutting and welding. They have a lot of potential for various non-linear optical phenomena that are important for environmental monitoring. Furthermore, numerous new uses, including power delivery, very short pulse delivery and pulse compression, necessitate the refinement and development of additional features, including higher bandwidth, near-zero flat chromatic dispersion and non-linearity control. These tasks continue to pose significant technological hurdles. In light of this field’s potential and the aforementioned technological challenges, we have chosen to join the continuing efforts to build smart PCFs for diverse engineering applications. The optical fiber has become an essential part of the information infrastructure in recent years. The development of optical fibers is certainly one of the biggest scientific revolutions of the modern telecom era. Owing to its numerous appealing characteristics, it has progressively replaced copper cable and satellite links in telecom networks.
Optical fiber links enable data transfer over long distances and at higher data rates than conventional communication methods. Fibers in telecommunications infrastructure also have lower attenuation and are less susceptible to electromagnetic interference. The overall objective of the study is to design and refine the optical properties of PCFs. The major objectives of this work are as follows:
• To design a hexagonal PCF for wideband near-zero dispersion-flattened characteristics for dispersion managed applications.
• To design and characterize PCFs for high birefringence and near-zero dispersion for optical sensor applications.
• To calculate the non-linear co-efficient of the hexagonal PCF with respect to effective area.
2 Brief on Photonic Crystal Fiber
PCFs are optical fibers with a micro-structured arrangement of material in a background material of different refractive index. The background material is frequently undoped silica, with air holes along the fiber length creating a low-index region. High-index guiding and low-index guiding fibers are the two usual types of PCF. High-index guiding fibers, like ordinary fibers, use the modified total internal reflection (M-TIR) principle to guide light in a solid core.
The lower effective index of the micro-structured air-filled region causes total internal reflection. Low-index guiding fibers guide light by the photonic bandgap (PBG) effect. Because the PBG effect prevents light from propagating through the micro-structured cladding region, the light is confined to the low-index core. The effective refractive index is strongly wavelength dependent, and the intrinsic design flexibility of PCFs allows for a wide range of unique characteristics. Endlessly single-mode fibers, highly non-linear fibers and fibers with anomalous dispersion in the shorter-wavelength region are examples of such characteristics. In the THz region, a broad range of photonic crystal fiber uses has recently been demonstrated. To obtain high transmission in the THz range, a PCF based on multiple air rings is designed in [1]. The suggested concept incorporates a five-air-ring waveguide with a standard multi-layer PCF in the middle. A finite-difference time-domain technique is used to validate the proposed design. An all-silica PCF with a diameter of 15 µm is designed in [2]; the measured attenuation of the proposed PCF is 2.2 dB/km. A PCF with a standard fiber size is proposed in [3], in which the bending direction has little effect on fiber performance and the fiber tolerates bending radii of approximately 15 cm. A low-cost PCF-based sensor with a simple and distinctive structure is presented in [4] for liquid sensing applications. To reduce fabrication complexity, a hexagonal layout and circular holes are used. A hexagonal PCF is also presented in [5], with an investigation of its propagation characteristics by a numerical full-vector method. In comparison with previous PCFs of complicated design, the suggested hexagonal PCF could be readily manufactured with a standard drawing process to obtain significant non-linearity.
In [6], researchers demonstrate, reportedly for the first time, a photonic crystal fiber with near-zero dispersion that is highly non-linear and polarization maintaining. A novel PCF with a circular shape is developed in [7], where the cladding is made up of four rings of air holes with a large air hole in the middle. The findings show that the fiber is resistant to elliptical bending. All of the characteristics of the suggested PCF indicate that it could play a crucial role in fiber optics technologies in the near future. By introducing a mixture of toluene and chloroform into two holes, a temperature-controlled PCF splitter with various coupling properties is created in [8]. Several splitter functions are combined in a single, stable PCF structure. The proposed splitter has a simpler construction and superior efficiency, and it shows a lot of promise in the area of integrated optoelectronic devices. Another hexagonal PCF is presented in [9] for wave transmission with very low loss. The development and properties of several forms of photonic crystal fibers with polarization-maintaining and polarizing capabilities, as well as the most recent developments in this field, are described in [10]. The finite element method in COMSOL is used in [11] to design a PCF with a heptagonal cladding region. A heterostructured PCF with a dual-lattice-matched photonic crystal core and cladding is presented in [12], whereas a PCF dual-window polarization filter based on surface plasmon resonance (SPR) is suggested in [13], using a square arrangement of air holes that is simple to make.
3 PCF Design Scope and Challenges
Despite their many capabilities, PCFs are associated with application-specific design difficulties. These difficulties must be solved effectively in order to fully exploit the benefits of PCF technology over traditional fibers. Milestones in the development of PCF are listed in Table 1.
3.1 Dispersion Managed Applications
Although PCFs provide a lot of flexibility in design, creating an almost-zero dispersion-flat PCF, which is needed in virtually all applications, remains a difficult task for engineers. This is because, in addition to dispersion-flat properties, many dispersion managed applications usually require a low confinement loss. As a result, designers use either PCFs with several rings of air holes to decrease confinement losses, or PCFs with non-uniform cladding to obtain a dispersion-flat slope and low confinement losses at the same time. The former method is not only inadequate for wideband dispersion control, but also leads to a larger holey cladding region and more manufacturing problems. The latter method, while widely used, is reported to pose a significant manufacturing challenge due to the non-uniform cladding. Improved design factors, such as air-hole modulation, result in non-uniform cladding, which affects fabrication and tolerance. These problems remain to date and must be solved by innovative design approaches.

Table 1 Milestones in the evolution of PCF

| Year | Milestones |
|---|---|
| 1996 | First index guiding (solid core) PCF |
| 1997 | Endlessly single mode PCF |
| 1998 | Large mode area PCF |
| 1999 | Hollow core PCF, dispersion shifted PCF |
| 2000 | Multicore PCF, PM PCF, Er-doped PCF laser and SC |
| 2001 | Polymer PCF, non-linear processes in PCF |
| 2002 | SF6 glass PCF |
| 2003 | Tellurite glass PCF |
| 2004 | FWM twin photon generation in PCF, Ge-doped PCF |
| 2005 | PBGs at 1% index contrast, bismuth PCF |
| 2006 | Gas and liquid filled PCF |
| 2008 | Chalcogenide highly non-linear PCF |
3.2 Sensor Applications
Sensor applications can benefit from highly birefringent PCFs (HB-PCFs). Despite their relative infancy in the sensing area, PCFs have attracted the attention of a number of research groups owing to their unique properties. The most appealing feature of PCFs is that the transmission range, mode shape, non-linear effects, dispersion, air-filling fraction and birefringence, among other properties, can be tailored to achieve qualities that are not possible with traditional optical fibers (OFs) by changing the size and position of the cladding holes or the core. Furthermore, the presence of air holes allows light to propagate in air or, conversely, allows liquids or gases to be inserted into the air holes. This permits a well-controlled interaction between light and sample, enabling novel sensing applications that would not be possible with traditional OFs. Owing to this diversity of characteristics, PCFs offer many new and enhanced uses in the optical fiber sensing area. However, developing HB-PCFs that also provide near-zero dispersion and negligible confinement loss remains difficult, because birefringent fibers are expected to exhibit near-zero dispersion only at the targeted wavelength in practical situations. This is a continuing problem that needs extra design attention.
3.3 Non-linear Optics Applications
Optical parametric amplification, supercontinuum generation, soliton generation and wavelength conversion are all uses for non-linear PCFs. A PCF designed for supercontinuum generation and non-linear wavelength conversion has a distinctive dispersion profile as well as a high non-linear coefficient. Setting the zero-dispersion wavelength around the telecommunications window is a significant obstacle when designing highly non-linear PCFs (HNL-PCFs): a PCF with a narrow pitch and uniformly small air holes tends to shift the zero-dispersion wavelength toward shorter wavelengths, while a PCF with a larger air-hole diameter relative to the pitch limits the single-mode operation bandwidth. As a result, optimizing the air-hole diameter and pitch while keeping the design simple is critical. Furthermore, because HNL-PCFs employ a lower pitch value, confinement loss management and sensitivity to parameter changes become important concerns. A lower pitch causes more confinement loss and increases sensitivity to parameter changes. This problem persists at present and needs to be resolved.
3.4 Telecom Applications
PCF technology was first explored solely for optical devices and not as a data transmission medium, because of the significant optical losses of these fibers. Optical losses have lately been reduced to 0.28 dB/km using advanced and precise design and manufacturing processes. As a result, there is considerable interest in reconsidering PCFs as a transmission medium for future configurable data transmission applications. For such purposes, PCFs with a large mode area are ideal.
4 Design of a Hexagonal PCF
A photonic crystal fiber is an optical fiber whose cladding, made of a photonic crystal, surrounds the core of the fiber. A photonic crystal is a low-loss periodic dielectric medium made up of a regular pattern of tiny air holes that run along the length of the fiber. In an ordinary optical fiber, which is made with a core and cladding of constant refractive index difference, light travels through the core as a consequence of the difference between the refractive indices of the core and cladding. In a PCF, by contrast, light is confined in the core, giving a far better waveguide for photons. In a hexagonal PCF, there are 6 arms. Thus, the first ring contains 6 × 1 = 6 circles and the second ring contains 6 × 2 = 12 circles. The number of circles increases with the ring number and can be found from the following equation:
Number of circles in ring N = N × number of arms
The first ring has six circles, each 60° away from its neighbors. The source circles are the 6 circles in each ring that lie at exactly the same angles as the 6 circles in the first ring. The numbers of circles in the 2nd, 3rd, 4th and 5th rings are 12, 18, 24 and 30, respectively. In every ring N after the first, N − 1 new circles lie between each pair of neighboring source circles. In this study, COMSOL Multiphysics 4.2 software is used for simulating the hexagonal photonic crystal fiber. This is one of the most effective programs for simulating hexagonal PCF designs. The program is particularly useful for establishing model parameters and variables, building the mesh for finite elements, integrating physics and material characteristics, and analyzing and presenting the results.
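The ring-counting rule can be sketched in a few lines of Python. This is an illustrative sketch and not part of the COMSOL model: the function names are hypothetical, and the coordinate generator assumes that the pitch is the center-to-center spacing of neighboring air holes on a triangular lattice.

```python
import math

def holes_in_ring(ring, arms=6):
    """Number of circles in ring N = N x number of arms."""
    return ring * arms

def total_holes(rings, arms=6):
    """Total number of circles in rings 1..rings."""
    return sum(holes_in_ring(n, arms) for n in range(1, rings + 1))

def ring_hole_centers(ring, pitch, arms=6):
    """Centers of the circles in one ring: a source circle on each arm,
    plus ring - 1 circles evenly spaced between neighboring source circles."""
    centers = []
    for k in range(arms):
        a0 = 2 * math.pi * k / arms
        a1 = 2 * math.pi * (k + 1) / arms
        x0, y0 = ring * pitch * math.cos(a0), ring * pitch * math.sin(a0)
        x1, y1 = ring * pitch * math.cos(a1), ring * pitch * math.sin(a1)
        for j in range(ring):
            t = j / ring  # t = 0 is the source circle itself
            centers.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return centers

print([holes_in_ring(n) for n in range(1, 6)])  # prints [6, 12, 18, 24, 30]
print(total_holes(8))                           # prints 216
```

A list of centers of this kind is the sort of input a geometry script would need to place the air holes; under the formula above, eight complete rings contain 216 circles.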
The final output for 8 rings is depicted in Fig. 1, whereas the other simulations are provided in Figs. 2 and 3. Here, the non-linear co-efficient of a hexagonal PCF is measured as a function of effective area. The graphical result for the non-linear co-efficient of a hexagonal PCF depending on effective area is shown in Fig. 4. The 8-ring hexagonal PCF model is built and the non-linear co-efficient is then measured while varying the pitch value. The non-linear co-efficient is shown for the wavelength range 1.25–1.75 µm. At a wavelength of 1.25 µm, the values are as follows:

| Pitch (µm) | Non-linear co-efficient (W⁻¹ km⁻¹) |
|---|---|
| 0.65 | 204 |
| 0.7 | 190 |
| 0.75 | 175 |
| 0.8 | 160 |
| 0.85 | 148 |

Therefore, it can be seen that the non-linear co-efficient decreases as the pitch is increased. Dispersion is proportional to the non-linear co-efficient. Thus, the smaller the pitch value, the smaller the dispersion.

Fig. 1 Final output of 8 ring
Fig. 2 Simulation of a 1st ring, b 2nd ring and c 3rd ring
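The dependence of the non-linear co-efficient on effective area can be illustrated with the standard textbook definition γ = 2π n₂ / (λ A_eff). The text does not state which formula or material constants were used, so both the formula and the non-linear index n₂ = 2.6 × 10⁻²⁰ m²/W for silica are assumptions of this sketch, and the function names are hypothetical.

```python
import math

N2_SILICA = 2.6e-20  # non-linear refractive index in m^2/W (assumed value for silica)

def gamma_per_w_km(a_eff_um2, wavelength_um, n2=N2_SILICA):
    """Non-linear co-efficient in W^-1 km^-1 from effective area (um^2) and wavelength (um)."""
    a_eff_m2 = a_eff_um2 * 1e-12
    wavelength_m = wavelength_um * 1e-6
    gamma_per_w_m = 2 * math.pi * n2 / (wavelength_m * a_eff_m2)
    return gamma_per_w_m * 1e3  # W^-1 m^-1 -> W^-1 km^-1

def effective_area_um2(gamma_w_km, wavelength_um, n2=N2_SILICA):
    """Inverse relation: effective area (um^2) implied by a given non-linear co-efficient."""
    gamma_per_w_m = gamma_w_km * 1e-3
    wavelength_m = wavelength_um * 1e-6
    return 2 * math.pi * n2 / (wavelength_m * gamma_per_w_m) * 1e12

# Effective areas implied by the reported values at 1.25 um, under the assumptions above
for pitch, gamma in [(0.65, 204), (0.7, 190), (0.75, 175), (0.8, 160), (0.85, 148)]:
    print(f"pitch {pitch} um: A_eff = {effective_area_um2(gamma, 1.25):.2f} um^2")
```

Because γ is inversely proportional to A_eff at a fixed wavelength, a larger pitch, which spreads the mode over a larger area, gives a smaller non-linear co-efficient, in line with the trend reported above.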
Fig. 3 Simulation of a 4th ring, b 5th ring, c 6th ring and d 7th ring
Fig. 4 Graphical result on non-linear co-efficient of a hexagonal PCF depending on effective area
5 Conclusion
PCF is a single-material fiber which contains tiny air holes in a silica background. In this study, a hexagonal PCF with wideband near-zero dispersion-flattened characteristics for dispersion managed applications is designed, with the aim of developing and improving the optical characteristics of PCFs. The non-linear co-efficient of the hexagonal PCF with respect to effective area is also computed. In future, photonic crystal fiber can be made more efficient through improvements such as the following:
• More precise and effective approaches for modeling and characterization.
• The chances of sensing systems that rely on such fibers becoming commercially available are promising.
• Opportunities for unique and modified PCFs with stronger or weaker non-linearity and lower optical loss.
• PCFs, including those with a large mode area, may be constructed with low sensitivity to bend losses.
• Filling air cores with liquids and gases in a more efficient manner. At quite high power levels, gas-filled PCFs can be used for optical sensors or non-linear spectral broadening.
References
1. A.K. Vyas, Multiple rings based photonic crystal fiber for terahertz application. Optik 231, 166424 (2021)
2. M.D. Nielsen et al., All-silica photonic crystal fiber with large mode area, in 2002 28th European Conference on Optical Communication, Copenhagen (2002), pp. 1–2
3. J. Wang et al., Design and analysis for large-mode-area photonic crystal fiber with negative-curvature air ring. Opt. Fiber Technol. 62, 102478 (2021)
4. M.J.B.M. Leon, S. Abedin, M.A. Kabir, A photonic crystal fiber for liquid sensing application with high sensitivity, birefringence and low confinement loss. Sens. Int. 2, 100061 (2021)
5. M. Kim, C.G. Lee, S. Kim, Silicon-embedded photonic crystal fiber for high birefringence and nonlinearity. Optik 212, 164657 (2020)
6. K.P. Hansen et al., Highly nonlinear photonic crystal fiber with zero-dispersion at 1.55 µm, in Optical Fiber Communication Conference and Exhibit, Anaheim (2002), pp. FA9–FA9
7. L. Zhang, Y. Meng, Design and analysis of a photonic crystal fiber supporting stable transmission of 30 OAM modes. Opt. Fiber Technol. 61, 102423 (2021)
8. Y. Zhang et al., Temperature-controlled and multi-functional splitter based on dual-core photonic crystal fiber. Res Phys. 19, 103578 (2020)
9. N.K. Arya, M. Imran, A. Kumar, Design and analysis of low loss porous-core photonic crystal fiber for terahertz wave transmission. Mat. Today Proceed. (2021)
10. A. Petersson et al., Polarization properties of photonic crystal fibers, in 2006 Optical Fiber Communication Conference and the National Fiber Optic Engineers Conference, Anaheim (2006)
11. M.S. Hossain et al., Hexahedron core with sensor based photonic crystal fiber: an approach of design and performance analysis. Sens Bio-Sens Res. 32, 100426 (2021)
12. M. Yan, P. Shum, X. Yu, Heterostructured photonic crystal fiber. IEEE Photon. Technol. Lett. 17, 1438–1440 (2005)
13. P. Yu et al., A photonic crystal fiber dual windows polarization filter based on surface plasmon resonance. Optik 244, 167587 (2021)
Analysis of Student Attention in Classroom Using Instance Segmentation
K. Meenakshi, Abirami Vina, A. Shobanadevi, S. Sidhdharth, R. Sai Sasmith Pabbisetty, and K. Geya Chitra
Abstract As young minds evolve rapidly, education systems keep changing in parallel to remain effective. However, today's education system focuses primarily on delivering knowledge and does not ensure that it reaches the student. When student attentiveness is analysed, the course structure and the needs of students can be evaluated more efficiently. This is especially vital because E-learning is the future, but the inattentiveness of students and the lack of feedback are holding back its growth. To overcome these problems, student attention in the classroom is analysed using computer vision. The behavior of students is captured with cameras fixed in the classroom, and instance segmentation is applied to segment each student and determine his/her attentiveness. The analysis reports can be utilized by the course handlers to improve the course material and to handle personal student issues. This feedback system aided by computer vision will take E-learning to the next level.
K. Meenakshi (corresponding author)
Department of Networking & Communications, School of Computing, SRM Institute of Science and Technology, Kattankulathur, India
e-mail: [email protected]
A. Vina
L&T-NxT, Chennai, India
A. Shobanadevi
Department of Data Science and Business Systems, SRM Institute of Science and Technology, Chennai, India
e-mail: [email protected]
S. Sidhdharth
Ajna Labs Pvt. Ltd., Chennai, India
R. Sai Sasmith Pabbisetty
Tiger Analytics LLP, Chennai, India
K. Geya Chitra
Tata Consultancy Services, Chennai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Innovative Data Communication Technologies and Application, Lecture Notes on Data Engineering and Communications Technologies 96, https://doi.org/10.1007/978-981-16-7167-8_71
Keywords Human behavior analysis · E-learning · Computer vision · Attention estimation · Instance segmentation
1 Introduction
Schooling is an important part of any youngster's life, whether they realize it or not. Formal education sets up a child for his/her future. Thanks to the Indian Education System, literacy has gone up to 74.04% (as recorded by the 2011 census). Accordingly, enrollment in higher education has increased steadily over the past decade. However, the Indian Curriculum is often criticized for being rooted in memorization-based learning rather than logical and practical skills. Further, India ranks only 92nd out of 142 countries in the education category of the Legatum Prosperity Index 2015. This is not due to inattention in the system. Rather, the pitfall lies in ineffective feedback systems, rapidly advancing technology, social student dynamics, and outdated processes. The change needs to start with how students are monitored, how they are analyzed, and how it is ensured that they are getting the right education. Analyzing a student's attentiveness in class can accomplish all of this. Nowadays, we have the technology to create vast amounts of data and analyze it. Using computer vision and cameras in classrooms, it is possible to analyze a student's attention level in class. This project aims to create a student attention analysis model to gain insight from video coverage of students attending classes. Through this computer vision model, the student's attention level can be labeled as low, medium, or high with an accuracy of 96.67%.
2 Literature Survey
Various scientific journals were analyzed to further investigate and understand the domains of student attention [1], computer vision, and human behavior, as well as how the three go hand in hand [2]. Many papers offered insights that paralleled the motivation of this project, and existing working systems were studied. In 2017, a model was created to predict students' attention in the classroom from facial and body features [3]. The model used 2D and 3D data gathered from a Kinect One sensor. This data was then utilized to construct a feature set of facial and body properties of a student. Machine learning techniques led to conclusions on the students' attention. Further reading on learning and attention models led to an article from 2014 that described computer vision [4] being used to detect attention in E-learning [5]. Future possibilities of student attention being monitored even without a teacher present in person were discussed.
Having understood that computer vision is feasible for our venture, we set out to understand the methods that could be used in computer vision. In 2018, the analysis of eye gaze in images to determine where students were looking was explored [6]. In the same year, detailed monitoring of a driver's eye status [7] for drowsiness using some key points was researched [8]. This helped us understand what parameters we could select to analyze. Then, we read about extracting and identifying students from video or images [9], because that was where the drawback of the model from 2017 existed. We learned that instance segmentation was found in 2018 to have a higher accuracy than bounding box detection methods [10]. Further, to clarify some technical aspects, we browsed through literature related to eye detection [11] to better understand how the eye can be monitored to gain information about a person's attention toward something. Following that, we read about eye detection used for drowsiness detection [12] and found that a person's eye movement or gaze tracking [13] can be helpful for monitoring attention levels. Having understood eye detection in depth, we searched for other works that might provide further aspects to analyze from our input data. For example, we read papers related to analyzing a person's behavior, such as recognizing whether a person is moving [14, 15], and other related computer vision papers [16]. We were able to gather various methods and formulas to calculate the necessary parameters. We learned how to handle data related to finding out whether someone's eye is closing [17] for eye detection. We also learned how to determine a person's body posture for pose detection [18]. Similarly, we explored how the direction someone is facing [19] can be utilized for head gaze detection. Finally, we understood how to determine whether someone is moving for movement detection.
To evaluate the various parameters as one and create an attention classification model, we required a machine learning method to follow [20]. Reading a few surveys related to neural networks and their applications led us to the decision that we would use neural networks as our model for attention classification.
3 Proposed Work and Design
The detailed implementation of the prediction of student attention level using instance segmentation is split into two modules: the dataset and the model.
3.1 Data Design
The dataset utilized by the model is a collection of footage shot within a classroom using a DSLR camera at 24 frames per second. Students attending class were recorded from the teacher's point of view such that different segments of video have the teacher standing at different points in the front of the classroom. This led to footage at different angles. However, it was ensured that the camera was placed at a distance from the students such that their eyes were still clearly visible in the footage (for analysis purposes). The footage contained both occluded and non-occluded data, which the model would have to handle.
3.2 Model Design
The project's key element was instance segmentation using Mask R-CNN. The overall flow of the model design (given in Fig. 1) was as follows. The raw video data was split frame by frame, and Mask R-CNN was the trained model that detected the different objects in the frames and labeled persons using the COCO model. Then, instance segmentation was applied to each frame to identify each student in the frame separately. Once the instances of students were segmented, they underwent two successive stages of training. The first stage consisted of detecting the various behavior parameters (eye status, head gaze, pose, and movement) using different methods and evaluating them separately (based on some constraints). The second stage combined the evaluations of the parameters to determine the attention level using a classification model. The result of the second stage was a particular student's attention level at that moment in time.
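The two-stage flow described above can be outlined in a few lines of Python. This is only a structural sketch under stated assumptions: the function name `analyse_frame` and its three callable arguments are hypothetical placeholders, standing in for the Mask R-CNN segmentation step, the four parameter detectors, and the attention classifier, none of which are implemented here.

```python
from typing import Any, Callable, Dict, Iterable, List


def analyse_frame(
    frame: Any,
    segment_students: Callable[[Any], Iterable[Any]],
    parameter_detectors: Dict[str, Callable[[Any], Any]],
    attention_classifier: Callable[[Dict[str, Any]], str],
) -> List[str]:
    """Return one attention label per student instance found in a frame.

    All three callables are hypothetical stand-ins supplied by the caller:
    segment_students yields one segmented instance per student,
    parameter_detectors maps a parameter name (e.g. "eye_status") to a
    detector, and attention_classifier turns the collected parameters
    into an attention level.
    """
    labels = []
    for student in segment_students(frame):
        # Stage 1: evaluate every behavior parameter separately.
        features = {
            name: detect(student) for name, detect in parameter_detectors.items()
        }
        # Stage 2: combine the parameter evaluations into one attention level.
        labels.append(attention_classifier(features))
    return labels
```

Passing the stages in as callables keeps the outline independent of any particular segmentation or classification library.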
4 Proposed System Implementation
The initial COCO model was used for the extraction of the raw video input. The raw video input was segmented and extracted as separate video files. They were iteratively fed to each of the four programs that calculate and predict the individual parameters. The output parameters of the four models act as the features and input for the final neural network model.

Fig. 1 System architecture

Fig. 2 Eye aspect ratio points

Table 1 The eye aspect ratio/classification table
EAR                                      Classification
(Left eye & Right eye) < 0.35            Sleeping
(Left eye & Right eye) > 0.55            Awake
Left eye < 0.35 && Right eye > 0.55      Awake
Left eye > 0.35 && Right eye < 0.55      Drowsy
4.1 Eye Status Detection Model
The first step in detecting eye status was to use Dlib and OpenCV to capture the eye coordinates using facial landmarks. After capturing the values, a formula was utilized to calculate the Eye Aspect Ratio (EAR). The formula used for EAR was as follows (the corresponding points can be seen in Fig. 2):

$$\mathrm{EAR} = \frac{|P_2 - P_6| + |P_3 - P_5|}{|P_1 - P_4|} \tag{1}$$
Eye status was classified based on EAR as shown in Table 1 (the sample output can be seen in Fig. 3).
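As an illustration, Eq. (1) and the rules of Table 1 can be sketched in Python as below. This is a minimal sketch, not the implementation used in this work. The function names are hypothetical, and the six landmarks are assumed to be (x, y) pairs ordered P1 to P6 as in Fig. 2. The rows of Table 1 are assumed to be checked from top to bottom, and the "Unclassified" label for EAR pairs that match no row is our own addition.

```python
import math


def eye_aspect_ratio(points):
    """Compute EAR as in Eq. (1) from six (x, y) landmarks P1..P6."""
    p1, p2, p3, p4, p5, p6 = points
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    if horizontal == 0:
        raise ValueError("P1 and P4 coincide; EAR is undefined")
    return vertical / horizontal


def classify_eye_status(left_ear, right_ear, low=0.35, high=0.55):
    """Apply the rows of Table 1 in order; the first matching row wins."""
    if left_ear < low and right_ear < low:
        return "Sleeping"
    if left_ear > high and right_ear > high:
        return "Awake"
    if left_ear < low and right_ear > high:
        return "Awake"
    if left_ear > low and right_ear < high:
        return "Drowsy"
    # Assumption: Table 1 does not cover every pair of values,
    # so the remaining cases are reported explicitly.
    return "Unclassified"


# Synthetic landmarks, not taken from the dataset.
eye = [(0, 0), (1, 1), (3, 1), (4, 0), (3, -1), (1, -1)]
print(eye_aspect_ratio(eye), classify_eye_status(0.4, 0.5))
```

Note that Eq. (1), as given, does not halve the sum of the two vertical distances, and the sketch follows the equation as written.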
4.2 Head Gaze Detection Model
Head gaze detection applied the COCO model. A fixed coordinate of the board was set in the model. The nose, left shoulder, and right shoulder points were extracted from the keypoints. The Euclidean distance between the nose and the left shoulder was calculated as NL, and the Euclidean distance between the nose and the right shoulder was calculated as NR. Based on these distances, the head gaze was classified as shown in Table 2 (the sample output can be seen in Fig. 4).

Fig. 3 Eye status sample output

Table 2 The distances/classification table
Distances      Classification
NL > NR        Looking toward the board
Other          Looking somewhere else
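The rule in Table 2 can be expressed as a short Python sketch. The function name is hypothetical, and the keypoints are assumed to be (x, y) pixel coordinates already extracted by the pose model. The fixed board coordinate mentioned above is not used here, because Table 2 compares only NL and NR.

```python
import math


def classify_head_gaze(nose, left_shoulder, right_shoulder):
    """Classify head gaze from three (x, y) keypoints following Table 2."""
    nl = math.dist(nose, left_shoulder)   # nose to left shoulder (NL)
    nr = math.dist(nose, right_shoulder)  # nose to right shoulder (NR)
    if nl > nr:
        return "Looking toward the board"
    return "Looking somewhere else"


# Synthetic keypoints, not taken from the dataset.
print(classify_head_gaze((6, 0), (0, 0), (10, 0)))
```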
4.3 Pose Detection Model
Pose detection determined the whole body posture in the frame. It also used the COCO model in a manner similar to the head gaze detection model. Using Euclidean distance, the average distance between the base of the neck and the shoulders was calculated as NS, and the distance between the base of the neck and the nose was calculated as NN. The formula for the pose ratio was as follows:

$$\text{Pose Ratio} = \frac{NN}{NS} \tag{2}$$
Pose was classified based on the pose ratio as shown in Table 3 (the sample output can be seen in Fig. 5).
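A minimal Python sketch of Eq. (2) is given below. The function name is hypothetical, and NS is assumed to be the mean of the neck-to-left-shoulder and neck-to-right-shoulder distances, which is one reading of "average distance between the base of the neck and the shoulders". The sketch computes the ratio only and does not apply any classification thresholds.

```python
import math


def pose_ratio(neck, nose, left_shoulder, right_shoulder):
    """Compute the pose ratio NN / NS of Eq. (2) from (x, y) keypoints."""
    # Assumption: NS is the mean of the two neck-to-shoulder distances.
    ns = (math.dist(neck, left_shoulder) + math.dist(neck, right_shoulder)) / 2
    nn = math.dist(neck, nose)  # base of the neck to the nose (NN)
    if ns == 0:
        raise ValueError("Shoulders coincide with the neck; ratio is undefined")
    return nn / ns


# Synthetic keypoints, not taken from the dataset.
print(pose_ratio((0, 0), (0, 3), (-2, 0), (4, 0)))
```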
Fig. 4 Head gaze sample output
Table 3 The pose ratio/classification table
Pose ratio
Classification